stephantulkens committed on
Commit
9d0f30b
·
verified ·
1 Parent(s): e833733

Update README.md

Files changed (1)
  1. README.md +14 -19
README.md CHANGED
@@ -7,30 +7,25 @@ sdk: static
  pinned: false
  ---

- ## Hi there 🥬
+ ## Hello, we're minish!

- We are the minish lab 🍄! Welcome to our github page. We're a two-person ([@pringled](https://huggingface.co/Pringled) and [@stephantulkens](https://huggingface.co/stephantulkens)) open-source research lab, with a focus on Natural Language Processing.
- Our goal is to provide usable and fun tools that make working with language data easy and fun.
+ We're a two-person ([@pringled](https://github.com/Pringled) and [@stephantul](https://github.com/stephantul)) open-source company, with a focus on Natural Language Processing.

- * We like fast things, so we focus on small models.
- * We like "classical" nlp and machine learning, so no LLM interfaces here.
- * We like cpu-bound work, not everyone has access to GPUs or wants to pay big tech companies for using GPUs.
- * We try to be as multi-lingual as possible: NLP work tends to focus purely on English, to the detriment of other languages.
- * We write in Python, and use a pretty opinionated stack (uv, everything fully typed, everything fully documented, no exceptions).
- * We try to be inclusive: if you'd like to help out, please let us know 🤗.
+ We believe that if you make models fast enough, you unlock new possibilities.

- ### Main goals
+ Using our software, you can:
+ * Ingest the entire English Wikipedia in 5 minutes
+ * Classify tens of thousands of documents per second on CPU
+ * Approximately deduplicate extremely large datasets in minutes
+ * Build the fastest RAG application in the world
+ * Easily evaluate which ANN algorithm works best for your data

- We aim to make software that is:
- * Easy to use ⛓️
- * Fun to use 🥳
- * Opinionated 🤔
- * Open for integration 🧲
- * Original (does not re-invent the wheel) 🤸
- * Fast 🚴
+ Our projects:

- In short, this means we make software packages that do one thing well, and that let you do that specific thing, and integrate it into a use of your choosing.
- We're not going to try and tell you what to do, we'll just show you what you can do, and we'll hope you have fun doing it.
+ * [model2vec](https://github.com/MinishLab/model2vec): make tiny models that are still really really good.
+ * [potion](https://huggingface.co/minishlab/potion-base-8M): the best small model in the world. 100-500x faster than a sentence-transformer, and almost as good.
+ * [vicinity](https://github.com/MinishLab/vicinity): consistent interfaces to many approximate nearest neighbor algorithms.
+ * [semhash](https://github.com/MinishLab/semhash): lightning-fast, super accurate, approximate deduplication for your text datasets.

  You can also find us on:
  🔬 [GitHub](https://github.com/MinishLab)