Smol Community


AI & ML interests

The SmolTuners group is a community dedicated to developing small-scale Large Language Models (LLMs) on consumer-grade GPUs.

Recent Activity

Delta-Vector new activity about 10 hours ago
SmolTuners/README: Gh organization
s3nh new activity 1 day ago
SmolTuners/README: Gh organization
s3nh new activity 1 day ago
SmolTuners/README: Optimizers

SmolTuners's activity

Delta-Vector
in SmolTuners/README about 10 hours ago

Gh organization
#3 opened 1 day ago by s3nh

s3nh
in SmolTuners/README 1 day ago

Optimizers
#2 opened 1 day ago by s3nh

s3nh
in SmolTuners/README 3 days ago

Datasets
#1 opened 5 days ago by s3nh
s3nh
updated a Space 5 days ago
s3nh
posted an update 6 days ago
Welcome back,

Small Language Model enthusiasts and GPU-poor OSS enjoyers, let's connect.
I just created an organization whose main goal is to have fun with smaller models tunable on consumer-range GPUs. Feel free to join and let's have some fun, much love ;3

https://huggingface.co/SmolTuners
Felladrin
posted an update about 1 month ago
I'm curating an AI-powered web search software timeline at https://github.com/felladrin/awesome-ai-web-search

The list covers three main categories:

1. Web Search with LLM summarization and follow-up capabilities
2. LLM chat interfaces with Web Search integration
3. Agent-driven research tools using LLM + Web Search

The timeline helps track the evolution of this space and serves as a reference for anyone looking for alternatives. If you know of any tools that should be included, please contribute by:
- opening a PR to edit the readme: https://github.com/felladrin/awesome-ai-web-search/edit/main/readme.md
- creating an issue in the repository: https://github.com/felladrin/awesome-ai-web-search/issues/new/choose
- or sharing in the comments below.
Felladrin
posted an update 2 months ago
MiniSearch is celebrating its 1st birthday! 🎉

Exactly one year ago, I shared the initial version of this side-project on Hugging Face. Since then, there have been numerous changes under the hood. Nowadays it uses [Web-LLM](https://github.com/mlc-ai/web-llm), [Wllama](https://github.com/ngxson/wllama) and [SearXNG](https://github.com/searxng/searxng). I use it daily as my default search engine and have done my best to make it useful. I hope it's interesting for you too!

HF Space: Felladrin/MiniSearch
Embeddable URL: https://felladrin-minisearch.hf.space
Fizzarolli
posted an update 7 months ago
hi everyone!

i wanted to share an experiment i did recently with upcycling phi-3 mini into an MoE.
while benchmarks are definitely within a margin of error and the models performed similarly, i think it's an interesting base to try to see if you can improve phi's performance! (maybe looking into HuggingFaceFW/fineweb-edu could be interesting; i also left some other notes if anyone with more compute access wants to try it themselves)

check it out! Fizzarolli/phi3-4x4b-v1
Fizzarolli
posted an update 8 months ago
Is anyone looking into some sort of decentralized/federated dataset generation or classification done by humans instead of synthetically?

From my experience with trying models, a *lot* of modern finetunes are trained on what amounts to, in essence, GPT-4-generated slop that makes everything sound like a rip-off of GPT-4 (see, e.g., the Dolphin finetunes). I have a feeling this is a big part of why people haven't been quite as successful as Meta's instruct tunes of Llama 3.
s3nh
posted an update 11 months ago
GPU Poor POV: Burnout

Sometimes we do not have the energy to post about AI and new methods.
And that's totally ok, I guess.
Remember to sleep well and drink a lot of water. Have a great day :D <3
s3nh
posted an update 11 months ago
GPU Poor POV: Quantization

Today I want to share with you my plug-and-play notebook code,
which helped me a lot through my quantization journey.
Hope you'll find it interesting; it could be a good starting point for
GGUF-converting some of your awesome models :)

Have a great day <3

https://s3nh.bearblog.dev/gpu-poor-pov-gguf-snippet/
s3nh
posted an update 11 months ago
GPU Poor POV: Willingness of Customization

I love to use libraries in which you can customize a lot of things. Chromadb is my database of choice when it comes to storing embeddings. The cool feature is that you can define your own embedding function, which is called on every chromadb collection initialisation or creation. It is useful because sometimes we want to use different prompts or different models, and it can easily be written as inheritance from the EmbeddingFunction class.

Edit:

My CustomEmbeddingFunction can be found here:
https://gist.github.com/s3nh/cfbbf43f5e9e3cfe8c3e4e2f0d550b80

and you can use it when initializing or creating a chroma collection.

import os

import chromadb
from your_custom_fn import CustomEmbeddingFunction

class ChromaStorage:
    def __init__(self, config):
        self.config = config
        self.check_config()
        self.client = self.init_client()
        self.embedding_function = CustomEmbeddingFunction()

    def check_config(self):
        # Fail early if the configured storage path does not exist.
        assert os.path.exists(self.config.path), 'Provided path does not exist!'

    def init_client(self):
        return chromadb.PersistentClient(path=self.config.path)

    def init_collection(self, name: str):
        return self.client.get_or_create_collection(name=name, embedding_function=self.embedding_function)
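For intuition, chromadb's embedding function is essentially a callable that maps a list of documents to one vector per document. Here is a minimal self-contained sketch of that shape, with a toy hash-bucket embedding standing in for a real model (the class name and the hashing scheme are illustrative, not chromadb code):

```python
from typing import List

class ToyEmbeddingFunction:
    """Toy stand-in for a custom embedding function: a callable that takes a
    list of documents and returns one fixed-size vector per document.
    A real implementation would call an actual embedding model instead."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        vectors = []
        for text in input:
            vec = [0.0] * self.dim
            # Bucket each token into one of `dim` slots by hash.
            for token in text.lower().split():
                vec[hash(token) % self.dim] += 1.0
            vectors.append(vec)
        return vectors

embed = ToyEmbeddingFunction()
vectors = embed(["hello world", "gpu poor"])
```

Swapping the toy logic for a real model call is all that changes when you plug such a function into a collection.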
s3nh
posted an update 11 months ago
GPU Poor POV: Don't be Afraid :D

Sometimes we don't want to do something because of low self-esteem.
I often hear 'it's too hard for me', 'i am not an expert', 'i do not know how to do it', etc. These words are never the truth; we should not be afraid, and we should try to build something, because there is no added value without a failure.

The same goes for LLMs: there are a lot of fancy words flying around, but what is more important is that there are also people who are constantly building so others can build. Diving into finetuning LLMs is incredibly simple if we use the axolotl library and pretrained models stored on huggingface.

All we need is an idea, our GPU Poor desktop or colab notebooks, and these steps:
git clone https://github.com/OpenAccess-AI-Collective/axolotl
cd axolotl

pip3 install packaging
pip3 install -e '.[flash-attn,deepspeed]'

After the installation process we can go to examples and modify the configs to our own needs.
Let's jump into
axolotl/examples/llama-2/qlora.yml

and change
base_model: NousResearch/Llama-2-7b-hf

to
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

choose a dataset from the huge number of datasets available at hf.co/datasets and tweak additional params like batch_size, number of epochs, how often we want to save our model, and many more (which I won't focus on rn).
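For orientation, the handful of fields mentioned above sit in that qlora.yml roughly like this (a sketch using axolotl's config key names; the dataset and values are illustrative, not recommendations):

```yaml
# Illustrative fragment of examples/llama-2/qlora.yml after the base_model swap.
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

datasets:
  - path: mhenrichsen/alpaca_2k_test   # any dataset from hf.co/datasets
    type: alpaca

micro_batch_size: 2              # per-device batch size
gradient_accumulation_steps: 4
num_epochs: 3
save_steps: 100                  # how often checkpoints are written
output_dir: ./qlora-out          # where the finetuned model is saved
```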
Then,
accelerate launch -m axolotl.cli.train examples/llama-2/qlora.yml

will start the finetuning process on a structure defined strictly by you. After finetuning, the model will be saved in the path provided in the config, and you can check whether it performs better than the base one. You can even put it on the llm Leaderboard to check if we have a new SOTA :)
Have fun and have a great day <3
s3nh
posted an update 11 months ago
GPU Poor POV: My storytelling choices of the week

It's the end of the week, and I decided to summarize my observations of community-based LLMs and mention a few models in a specific area which are very interesting and capable of creating insightful stories despite their relatively lightweight form.

I personally do not use LLMs in my daily routine for tasks like function calling, parsing or assisting with code writing. What I tried to use them for is storytelling, because it always amazes me how differently these models approach different preferred tasks.

How these models are able to generalize the stories and, sometimes, how high a level of creativity they carry.

BlueNipples/DaringLotus-v2-10.7b: its main target is to generate prose. Quoting the author: 'It shares it's good prose, and relatively decent coherency, being a little bit more on the side of prose, and a little bit less on the side of coherency. I like this model for generating great prose if I feel like regening a bit.'

https://huggingface.co/NeuralNovel/Aeryth-7B-v0.1
Great work by @NeuralNovel, I really like how flexible this model is; there is no strict focus on a certain role, so it is definitely worth a try. Would love to hear more about the dataset it was trained on; afaik it is private rn. Best suited for Science Fiction, History & Romance genres due to the training data used.

And the last one for today is FPHam/Sydney_Pirate_Mistral_7b. @FPHam 's work always amazes me with how well the models stick to the provided role. Awesome work as always; I'll for sure use this model to generate some interesting stories.

I know the hype train is going fast, but as I observe, people here on huggingface are creating really creative models which are for sure worth a try. Have a great day <3
s3nh
posted an update 11 months ago
GPU Poor POV: Low Hanging Fruits

Sometimes we have to work with a language other than English (what a surprise!) and it can be problematic because, as you may know, many algorithms are mainly developed for English.
I was involved in building RAG for the Polish language. First, we needed proper embeddings for Polish to feed into a lightweight LLM.
Looking through possible solutions I became aware that the existing models are not accurate enough and worked much worse than their 'english equivalent'.
The first thing that comes to mind is:
Let's become a mad scientist, download all possible data and train a model for months to get the proper one.

But there are a few cons to this:
- It's computationally heavy
- You are not a full-time researcher
- You have potential clients who want to use your solution, and they are really happy to use it (in an optimistic mood).
Here come the low hanging fruits.
We developed an easier, workable solution. Instead of training a new SOTA, we can use a translation module like this one:

Helsinki-NLP/opus-mt-pl-en
translate your knowledge base to English, and use a proper embedding model accurately.
I converted the existing model using ctranslate2,

ct2-transformers-converter --model Helsinki-NLP/opus-mt-pl-en --output_dir opus-mt-pl-en

so inference is not heavy (we observe a 5x speedup compared to the original version).

And by indexing the knowledge base, we can return the answer to the LLM in any language. (The indexes of context found in the English knowledge base are equal to the indexes in the native-language knowledge base.)
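A minimal sketch of that index-mapping trick, assuming the translated and native knowledge bases are parallel lists (the toy word-overlap scorer stands in for the real embedding search, and the example passages are made up):

```python
# Parallel knowledge bases: english_kb[i] is the translation of native_kb[i],
# so an index found by searching the English side maps straight back to the
# native-language passage.
native_kb = ["Procedura A: zgłoś wniosek", "Procedura B: odwołanie od decyzji"]
english_kb = ["Procedure A: submit a request", "Procedure B: appeal a decision"]

def retrieve_native(query_en, english_kb, native_kb, top_k=1):
    # Toy relevance score: number of shared words with the query.
    # A real system would use embedding similarity here.
    query_tokens = set(query_en.lower().split())
    scored = [
        (len(query_tokens & set(doc.lower().split())), i)
        for i, doc in enumerate(english_kb)
    ]
    scored.sort(reverse=True)
    # Return the *native* passages at the matched English indexes.
    return [native_kb[i] for _, i in scored[:top_k]]

hits = retrieve_native("how to appeal a decision", english_kb, native_kb)
```

The search runs entirely on the English side, but the answer handed to the LLM comes from the native-language list at the same index.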

Of course some tweaks are required; we have to validate the accuracy of the translation.

It was a nice episode: the work is done and there are people who can use it, so the added value exists.
Have a great day and I wish you more effective deploys! <3
s3nh
posted an update 11 months ago
GPU Poor POV: Building a RAG which solves a specific task.

Everyone loves benchmarks.
They are great because we have a standardized approach and a competitive feeling. But if you are working in a specific area, trying to implement some LLM/RAG use case, these benchmarks cannot exactly reflect the data that you have to deal with.

I built a RAG system on a bunch of niche procedures/regulations etc., which can finally be deployed as a virtual assistant to minimize the effort of searching through a lot of documentation manually.

I tested a lot of different methods/models/pretrains/finetunes, and what is interesting is that the final solution, which was scored by human feedback, is based on relatively low-param models with multitask ability.
Something like:

BAAI/llm-embedder

Having an LLM summarize the chunked version of the retrieved knowledge does not require a model with a high number of params, because a tradeoff between inference time and accuracy has to be made. Some lightweight models can perform a certain task based on instructions, so e.g. qwen 7b or mistral 7b (not the moe one) realized the task really nicely. And what is more important is that overall we are able to deploy a RAG system for smaller tasks, in a specific area. It can be used by people who need it, and give added value and positive feedback, which IMO is what all of the building process is about.
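Summarizing a "chunked version" of the knowledge base presupposes splitting documents into overlapping chunks before embedding them; a minimal sketch (the chunk size and overlap values are illustrative, not the ones used in the post):

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10):
    """Split text into chunks of at most chunk_size words,
    with `overlap` words shared between consecutive chunks so
    context is not lost at chunk boundaries."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# A 120-word toy document yields three overlapping chunks.
doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(doc, chunk_size=50, overlap=10)
```

Each chunk is then embedded (e.g. with BAAI/llm-embedder) and the retrieved chunks, not whole documents, are what the lightweight LLM summarizes.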

Have a great day and think about the problem which your models have to solve <3