A collection of obfuscated versions of a 20B-token sample of the FineWeb Edu dataset.
Daniel Gallagher
DanielGallagherIRE
AI & ML interests
NLP at the Institute for Applied Informatics in Leipzig, Germany.
Obfuscated FineWeb Edu