Please check the intermediate models you uploaded carefully!
#267
by
ErikaaWang
- opened
As far as I can tell from my tests, the same model was uploaded for: (1) global steps 10k and 500k of bloom-560m-intermediate; (2) global steps 1k and 10k of bloom-1b7-intermediate; (3) global steps 250k and 300k of bloom-1b7-intermediate. Each pair outputs exactly the same distribution for the same input!
And please, please also check that the other uploaded checkpoints are stored under the correct global step. Some weird performance degradation appears in the later stages of training.
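
For anyone who wants to double-check the duplicates without running the models, here is a minimal sketch that hashes locally downloaded weight files and groups the ones that are byte-identical. It assumes the checkpoints have already been downloaded; the `checkpoints/*/pytorch_model.bin` layout and the function names are hypothetical, not something taken from the repositories. Identical hashes mean the same file was uploaded twice. Different hashes do not by themselves prove the weights differ, since serialization details can vary.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_checkpoints(paths):
    """Group files by content hash and return only groups with more than one file."""
    groups = {}
    for path in paths:
        groups.setdefault(file_sha256(path), []).append(str(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Hypothetical local layout: one weight file per downloaded checkpoint directory.
    checkpoint_files = sorted(Path("checkpoints").glob("*/pytorch_model.bin"))
    for group in find_duplicate_checkpoints(checkpoint_files):
        print("Byte-identical files:", group)
```

If two global steps land in the same group, they are the same upload, which would match the identical output distributions described above.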