Dataset

#1
by titan087 - opened

Hey,

What dataset did you use to fine-tune this model? I was looking for one to fine-tune CodeLlama 34B and haven't found one that looked good.

Thanks!

Same here. So I chose a benchmark dataset.
https://huggingface.co/datasets/codeparrot/xlcost-text-to-code

The JavaScript subsection has about 10K rows, which I felt was enough for a fine-tune. Let me know your thoughts as well.
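
For reference, here is a rough sketch of pulling that split with the `datasets` library. The `Javascript-program-level` config name and the `text`/`code` column names are my reading of the dataset card, so double-check them before relying on this:

```python
from datasets import load_dataset

# Config name assumed from the dataset card; there are also *-snippet-level configs.
ds = load_dataset("codeparrot/xlcost-text-to-code", "Javascript-program-level", split="train")

print(len(ds))        # roughly 10K rows for the JavaScript subsection
print(ds[0]["text"])  # natural-language description (column name assumed)
print(ds[0]["code"])  # target JavaScript code (column name assumed)
```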

It's worth a shot. For a basic test I can try training Gemma, Llama 3, or possibly Phi-3 to start with. If that works well enough, then scale it up to one of the 34B coding models.
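
If it helps, here's a minimal fine-tuning sketch along those lines using plain `transformers`, not any particular recipe from this thread. The model name is just a stand-in (swap in Gemma / Llama 3 / Phi-3), and the prompt format plus the dataset config and column names are assumptions:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "microsoft/phi-2"  # stand-in; replace with the base model you want to test
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # causal LMs often ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Config and column names assumed, as in the snippet above.
ds = load_dataset("codeparrot/xlcost-text-to-code", "Javascript-program-level", split="train")

def to_example(row):
    # Pair the natural-language description with the target code in one training string.
    text = f"### Description:\n{row['text']}\n### Code:\n{row['code']}"
    return tokenizer(text, truncation=True, max_length=512)

tokenized = ds.map(to_example, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="xlcost-js-ft",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        fp16=True,  # assumes a GPU; drop on CPU
    ),
    train_dataset=tokenized,
    # mlm=False gives standard causal-LM labels (shifted input_ids)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```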
