---
title: RepoSnipy
emoji: 🐍🔫
colorFrom: gray
colorTo: gray
sdk: streamlit
sdk_version: 1.21.0
python_version: 3.11.3
app_file: app.py
pinned: true
license: mit
---
# RepoSnipy 🐍🔫
[RepoSnipy on Hugging Face Spaces](https://huggingface.co/spaces/Lazyhope/RepoSnipy)
Neural search engine for discovering semantically similar Python repositories on GitHub.
## Demo
Searching an indexed repository:

## About
RepoSnipy is a neural search engine built with [streamlit](https://github.com/streamlit/streamlit) and [docarray](https://github.com/docarray/docarray). You can query a public Python repository hosted on GitHub and find popular repositories that are semantically similar to it.
It uses the [RepoSim](https://github.com/RepoAnalysis/RepoSim/) pipeline to create embeddings for Python repositories. We have created a [vector dataset](data/index.bin) (stored as a docarray index) of over 9,700 GitHub Python repositories that had a license and more than 300 stars as of 20 May 2023.
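To make "semantically similar" concrete, here is a minimal sketch of ranking repositories by embedding similarity. It is not the app's actual code: it assumes cosine similarity as the metric, uses plain Python instead of docarray, and the repository names, the toy 3-dimensional vectors, and the `top_k_similar` helper are all hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: list[float], index: dict[str, list[float]], k: int = 2
) -> list[tuple[str, float]]:
    """Return the k indexed repositories most similar to the query embedding."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: real embeddings have far more dimensions.
index = {
    "owner/repo-a": [1.0, 0.0, 0.0],
    "owner/repo-b": [0.0, 1.0, 0.0],
    "owner/repo-c": [1.0, 1.0, 0.0],
}
query = [1.0, 0.1, 0.0]  # stand-in for the embedding of the queried repository

results = top_k_similar(query, index, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is fine for a few thousand vectors; a dedicated vector index mainly matters once the collection grows much larger.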
## Running Locally
Download the repository and install the required packages:
```bash
git clone https://github.com/RepoAnalysis/RepoSnipy
cd RepoSnipy
pip install -r requirements.txt
```
Then run the app on your local machine using:
```bash
streamlit run app.py
```
## License
Distributed under the MIT License. See [LICENSE](LICENSE) for more information.
## Acknowledgments
The model and the fine-tuning dataset used:
* [UniXCoder](https://arxiv.org/abs/2203.03850)
* [AdvTest](https://arxiv.org/abs/1909.09436)