- all_dumps_bad
- all_filtering_steps
- c4_filters_hellaswag
- cross_ind_unfiltered_comparison
- custom_filters
- dataset_ablations
- dededup_difference
- dedup_attempts
- duplicates-simul
- edu-100k
- edu-8k
- edu_ablations
- edu_abljtions
- edu_fw_ablations
- filtering_steps
- ind_dedup_better
- minhash_params
- removed_data_dedup
- score_by_dump
- stats
- wet_comparison