Spaces: Running on CPU Upgrade
Are instruction models evaluated with chat template? #1
by alexrs - opened
In the Hugging Face Harness fork it is possible to specify the --apply_chat_template and --fewshot_as_multiturn options for instruction models (https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about#reproducibility). That does not seem to be the case in this leaderboard: the reproducibility instructions do not mention these flags, and although the flag exists in the code (https://github.com/mohamedalhajjar/lm-evaluation-harness-multilingual/blob/64286c9b9a270f9b72a9c4ba05e014b8284108da/lm_eval/__main__.py#L172), when I try it I get the following error:
[rank6]: Traceback (most recent call last):
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[rank6]: return _run_code(code, main_globals, None,
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/runpy.py", line 86, in _run_code
[rank6]: exec(code, run_globals)
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/__main__.py", line 461, in <module>
[rank6]: cli_evaluate()
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/__main__.py", line 382, in cli_evaluate
[rank6]: results = evaluator.simple_evaluate(
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/utils.py", line 397, in _wrapper
[rank6]: return fn(*args, **kwargs)
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/evaluator.py", line 288, in simple_evaluate
[rank6]: evaluation_tracker.general_config_tracker.log_experiment_args(
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/loggers/evaluation_tracker.py", line 97, in log_experiment_args
[rank6]: self.chat_template_sha = hash_string(chat_template) if chat_template else None
[rank6]: File "/opt/conda/envs/openllm/lib/python3.10/site-packages/lm_eval/utils.py", line 36, in hash_string
[rank6]: return hashlib.sha256(string.encode("utf-8")).hexdigest()
[rank6]: AttributeError: 'dict' object has no attribute 'encode'
Thank you for raising this. Could you please add it to the GitHub repo so it can be fixed? Thanks!
malhajar changed discussion status to closed
Added an Issues tab now. Could you check again? Thanks a lot for raising this, I will push a fix soon :)