Commit History

Adding pad_to_multiple_of=16
986ff4e

versae committed on

Changed print to logger
1c5d797

versae committed on

Preparing code for final runs
ea0132b

versae committed on

Improved version of conversion script Flax → PyTorch
346a10a

versae committed on

Fixed widget example
3f4b8d4

versae committed on

Fix config for checkpoint
3950061

versae committed on

Changed and added vocab and tokenizer
29e26bb

versae committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
61f6971

versae committed on

New Flax model
300e533

versae committed on

Fixes to mc4 fork
8bd9e95

versae committed on

Fixes treatment of jsonl
7b22f12

versae committed on

Fix format for filepaths
7d6bbb2

versae committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
13757a8

versae committed on

Adding reading streaming files from local disk
4e4228c

versae committed on

Base model at 105k steps
f7ba030

versae committed on

Fixes and defaults
a5b19d7

versae committed on

Adding Numpy random number generator
f562f06

versae committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
f965ae3

versae committed on

Adding random sampling
60b6f6b

versae committed on

Adding config and models for the hub widget
d75240e

versae committed on

Adding missing import
79555ba

versae committed on

Adding base config and organizing configs
9c5541b

versae committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
36b7dde

versae committed on

Adding sampling to mc4
3f09f56

versae committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
9072c50

versae committed on

New tokenizer
eb4e77c

versae committed on

Adjust batch size for extracting tokens
8b9ba87

versae committed on

Scripts for perplexity sampling and fixes
853cd83

versae committed on

Remove unused imports
d5cede4

edugp committed on

Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
840171b

edugp committed on

Add script to generate dataset of embeddings and perplexities. Add script to generate t-SNE plot for embedding and perplexity visualization.
a81e575

edugp committed on

Adding correct models 10k steps
fe7ff35

versae committed on

Updating run script
a1f93c9

versae committed on

Adding checkpointing, wandb, and new mlm script
d988382

versae committed on

Epoch 1 Flax model
48f8c78

versae committed on

Changed batch size
a95f7b8

versae committed on

Changed execution mode
40f69ff

versae committed on

Initial test with BETO's corpus
2835721

versae committed on

:sparkles: Added test_script and a folder for scripts
2a963f0

Pablo committed on

:see_no_evil: Added .gitignore file
de633ab

Pablo committed on

Update README
e14e482

versae committed on

README
af3cb2c

versae committed on

initial commit
9b1d641

system HF staff committed on