Update README.md

README.md CHANGED
---
title: README
emoji: 🐨
colorFrom: purple
colorTo: blue
sdk: static
pinned: true
license: bsd-3-clause
short_description: Ensemble of experts for cell-type annotation
---

# **popV**

Welcome to the **popV** framework. popV provides state-of-the-art cell-type label transfer using an ensemble-of-experts approach, and we provide pre-trained models here to transfer cell-type labels to your own query dataset. Cell-type annotation is a tedious process; using reference data can accelerate it significantly. Because popV combines several label-transfer tools, it reports a well-calibrated certainty score that flags cell types for which automatic annotation is highly uncertain. We recommend manually checking transferred cell-type labels by plotting marker or differentially expressed genes rather than trusting them blindly. This is an open-science initiative: to let the single-cell community leverage your reference dataset, please ask in our [GitHub repository](https://github.com/YosefLab/popV) to have your dataset added.

---

## **Model Overview**

popV trains up to nine different algorithms for automatic label transfer and computes a consensus score across their predictions, together with an automatic report. To learn how to apply popV to your own dataset, please refer to our [tutorial]().
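As an illustration of the consensus idea (a simple majority-vote sketch, not popV's actual scoring code; the function name and data layout are hypothetical), the fraction of classifiers that agree on the winning label can serve as a certainty score:

```python
from collections import Counter

def consensus(predictions):
    """predictions: one list of per-cell labels per classifier.

    Returns, for each cell, the majority label and the fraction of
    classifiers that voted for it (a simple certainty score)."""
    results = []
    for per_cell in zip(*predictions):  # labels assigned to one cell
        label, votes = Counter(per_cell).most_common(1)[0]
        results.append((label, votes / len(per_cell)))
    return results

# Three toy classifiers annotating the same two cells.
preds = [
    ["T cell", "B cell"],
    ["T cell", "NK cell"],
    ["T cell", "B cell"],
]
for label, score in consensus(preds):
    print(label, round(score, 2))  # T cell 1.0, then B cell 0.67
```

Cells where the agreement fraction is low are exactly the ones worth inspecting manually with marker genes.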

### Algorithms

Currently implemented algorithms are:

- K-nearest neighbor classification after dataset integration with [BBKNN](https://github.com/Teichlab/bbknn)
- K-nearest neighbor classification after dataset integration with [SCANORAMA](https://github.com/brianhie/scanorama)
- K-nearest neighbor classification after dataset integration with [scVI](https://github.com/scverse/scvi-tools)
- K-nearest neighbor classification after dataset integration with [Harmony](https://github.com/lilab-bcb/harmony-pytorch)
- Random forest classification
- Support vector machine classification
- [OnClass](https://github.com/wangshenguiuc/OnClass) cell-type classification
- [scANVI](https://github.com/scverse/scvi-tools) label transfer
- [Celltypist](https://www.celltypist.org) cell-type classification

All algorithms are implemented as classes in [popv/algorithms](popv/algorithms/__init__.py).
To implement a new method, a class has to provide several methods:

- `algorithm.compute_integration`: computes dataset integration to yield an integrated latent space.
- `algorithm.predict`: predicts cell-type labels based on the specific classifier.
- `algorithm.compute_embedding`: computes a UMAP embedding of the previously computed integrated latent space.

Adding a new class with these methods automatically registers it with popV, which will include it among its classifiers and use it as another expert.
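A minimal, self-contained sketch of this interface (the three method names come from the list above; everything else, including using a plain dict in place of the AnnData object that popV actually passes around, is a simplifying assumption for illustration):

```python
class NearestCentroid:
    """Toy expert following the three-method interface sketched above.

    A real popV algorithm would operate on an AnnData object; here a plain
    dict with "X" (features) and "labels" (reference labels, None for
    query cells) stands in for it.
    """

    def compute_integration(self, adata: dict) -> None:
        # Real integration (e.g. scVI, Harmony) would go here; this toy
        # version simply reuses the raw features as the "latent" space.
        adata["latent"] = adata["X"]

    def predict(self, adata: dict) -> None:
        # Compute one centroid per reference label, then assign every
        # cell to the label of its nearest centroid.
        groups = {}
        for x, lab in zip(adata["latent"], adata["labels"]):
            if lab is not None:
                groups.setdefault(lab, []).append(x)
        centroids = {
            lab: [sum(col) / len(col) for col in zip(*pts)]
            for lab, pts in groups.items()
        }

        def dist2(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        adata["predictions"] = [
            min(centroids, key=lambda lab: dist2(x, centroids[lab]))
            for x in adata["latent"]
        ]

    def compute_embedding(self, adata: dict) -> None:
        # A real implementation would run UMAP; keep the first two latent
        # dimensions as a stand-in embedding.
        adata["embedding"] = [x[:2] for x in adata["latent"]]

# Two labeled reference cells and one unlabeled query cell.
adata = {
    "X": [[0.0, 0.0], [10.0, 10.0], [0.5, 0.2]],
    "labels": ["B cell", "T cell", None],
}
alg = NearestCentroid()
alg.compute_integration(adata)
alg.predict(adata)
alg.compute_embedding(adata)
print(adata["predictions"][-1])  # the query cell is nearest the B cell centroid
```

The real classes additionally deal with AnnData fields, batch keys, and loading pretrained models, so treat this only as a shape for the three required methods.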

---

## **Key Applications**

The purpose of these models is to perform cell-type label transfer. We provide models [with CUML support](collection) for large-scale reference mapping and [without CUML support](collection) if no GPU is available. Without a GPU, popV scales well to 100k cells. popV has three levels of prediction complexity:

- `retrain` trains all classifiers from scratch. For 50k cells this takes up to an hour of computing time on a GPU.
- `inference` uses pretrained classifiers to annotate query as well as reference cells and constructs a joint embedding using all of the integration methods above. For 50k cells this takes, in our hands, up to half an hour of computing time on a GPU.
- `fast` uses only methods with pretrained classifiers and annotates only query cells. For 50k cells this takes five minutes without a GPU (without UMAP embedding).
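The work each mode performs can be distilled into a small lookup (an illustration of the bullet points above, not popV's API; in particular, we assume here that `retrain` also annotates reference cells and computes the joint embedding):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModePlan:
    train_from_scratch: bool   # classifiers trained from scratch?
    annotate_reference: bool   # reference cells annotated alongside the query?
    joint_embedding: bool      # joint UMAP embedding computed?

# The three prediction modes described above, most to least expensive.
PLANS = {
    "retrain":   ModePlan(train_from_scratch=True,  annotate_reference=True,  joint_embedding=True),
    "inference": ModePlan(train_from_scratch=False, annotate_reference=True,  joint_embedding=True),
    "fast":      ModePlan(train_from_scratch=False, annotate_reference=False, joint_embedding=False),
}

print(PLANS["fast"])
```

Picking a mode is thus a trade-off between runtime and how much of the pipeline (training, reference annotation, embedding) is redone for your query.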

---

## **Publications**

- **[Original popV paper](https://www.nature.com/articles/s41588-024-01993-3)**:
  - Published in *Nature Genetics*, this paper introduces popV and benchmarks it.

## **Contact**

- GitHub: [https://github.com/YosefLab/popV](https://github.com/YosefLab/popV)
- User questions: [Discourse](https://discourse.scverse.org)

<!---
- **[MultiVI](https://docs.scvi-tools.org/en/stable/user_guide/models/multivi.html)**:
  - A multi-modal model for joint analysis of RNA, ATAC and protein data, enabling integrative insights from diverse omics data.
-->