🚩 Report: Legal issue(s)
The Spanish derivative dataset based on alpaca & dolly-15k has not been published.
Since it is a derivative work, withholding it is contrary to the licence used by both original datasets:
databricks/databricks-dolly-15k, license cc-by-sa-3.0
https://creativecommons.org/licenses/by-sa/3.0/es/
tatsu-lab/alpaca, license cc-by-sa-4.0
https://creativecommons.org/licenses/by-sa/4.0/
Both licences contain the clause:
ShareAlike - If you remix, transform or build upon the material, you must distribute your contributions under the same licence as the original.
A month has passed and you still haven't replied.
We are still waiting for the dataset that was used. Under the licence of the original datasets, you are required to release it.
Hello @v1ckxy, and sorry for the late response. We've been working on releasing new and more powerful models like Lince Mistral to the community. Sorry for the misunderstanding, but we did not use the mentioned datasets as a baseline, only to analyse their content (i.e. distribution of topics, formats, etc.). We created new datasets from scratch with completely new structures and information, so the license you are referring to for Dolly15k is not applicable, as ours is not a derivative work of that dataset. Again, thank you for pointing it out, and sorry for the inconvenience and the misunderstanding.
Whatever you say.
It must be scary to show the dataset used during training, given the biased answers you get from this specific model.