matq007 committed on
Commit 3ca9bd1 · unverified · 1 Parent(s): afcbd76

release: v1.1 models

Files changed (4)
  1. README.md +40 -56
  2. _scvi_required_metadata.json +3 -3
  3. adata.h5ad +2 -2
  4. model.pt +2 -2
README.md CHANGED
@@ -6,85 +6,69 @@ tags:
6
  - genomics
7
  - single-cell
8
  - model_cls_name:SCANVI
9
- - scvi_version:1.0.0
10
- - anndata_version:0.9.1
11
  - modality:rna
12
  - annotated:True
13
  ---
14
 
15
  # Description
16
 
17
- Mouse scANVI reference model
18
 
19
- # Model properties
20
 
21
- Many model properties are in the model tags. Some more are listed below.
22
 
23
- **model_init_params**:
24
  ```json
25
  {
26
- "n_hidden": 128,
27
- "n_latent": 10,
28
- "n_layers": 2,
29
- "dropout_rate": 0.1,
30
- "dispersion": "gene",
31
- "gene_likelihood": "nb",
32
- "linear_classifier": false,
33
- "latent_distribution": "normal"
34
  }
35
  ```
36
 
37
- **model_setup_anndata_args**:
 
38
  ```json
39
  {
40
  "labels_key": "ct",
41
  "unlabeled_category": "Unknown",
42
  "layer": "counts",
43
  "batch_key": "batch",
44
- "size_factor_key": null,
45
- "categorical_covariate_keys": null,
46
  "continuous_covariate_keys": null
47
  }
48
  ```
49
 
50
- **model_summary_stats**:
51
- |  Summary Stat Key  | Value |
52
- |--------------------------|-------|
53
- |  n_batch  |  18  |
54
- |  n_cells  | 2004  |
55
- | n_extra_categorical_covs |  0  |
56
- | n_extra_continuous_covs  |  0  |
57
- |  n_labels  |  16  |
58
- |  n_vars  | 3000  |
59
-
60
- **model_data_registry**:
61
- | Registry Key |  scvi-tools Location  |
62
- |--------------|---------------------------|
63
- |  X  |  adata.layers['counts']  |
64
- |  batch  | adata.obs['_scvi_batch']  |
65
- |  labels  | adata.obs['_scvi_labels'] |
66
-
67
- **model_parent_module**: https://zenodo.org/records/10669600/files/01_mouse_reprocessed.h5ad?download=1
68
-
69
- **data_is_minified**: False
70
-
71
- # Training data
72
-
73
- This is an optional link to where the training data is stored if it is too large
74
- to host on the huggingface Model hub.
75
-
76
- <!-- If your model is not uploaded with any data (e.g., minified data) on the Model Hub, then make
77
- sure to provide this field if you want users to be able to access your training data. See the scvi-tools
78
- documentation for details. -->
79
-
80
- Training data url: https://github.com/brickmanlab/proks-salehin-et-al
81
-
82
- # Training code
83
-
84
- This is an optional link to the code used to train the model.
85
-
86
- Training code url: N/A
87
-
88
  # References
89
 
90
- Proks, Salehin et al., biorXiv

6
  - genomics
7
  - single-cell
8
  - model_cls_name:SCANVI
 
 
9
  - modality:rna
10
  - annotated:True
11
  ---
12
 
13
  # Description
14
 
15
+ Mouse preimplantation development model spanning early stages of development. The
16
+ model was trained with single-cell ANnotation using Variational Inference
17
+ (scANVI, [Xu et al., 2021]) as implemented in [scvi-tools]. In short, scANVI takes a raw
18
+ single-cell RNA sequencing (scRNA-seq) count matrix (cells by genes), where values
19
+ represent gene expression measured by counting the number of transcribed RNA molecules.
20
 
21
+ # Model Training
22
 
23
+ - [raw dataset](https://zenodo.org/records/13749348/files/01_mouse_reprocessed.h5ad)
24
+ - [notebook analysis](https://github.com/brickmanlab/proks-salehin-et-al/blob/master/notebooks/15_mouse_scANVI_fix.ipynb)
25
+
26
+ # Metrics
27
+
28
+ Cell type (`ct`) prediction
29
+
30
+ | Metric | Score |
31
+ |-------------------|---------------------|
32
+ | Accuracy score | 0.9126746506986028 |
33
+ | Balanced accuracy | 0.9572872718187365 |
34
+ | F1 (micro) | 0.9126746506986028 |
35
+ | F1 (macro) | 0.9201654923575322 |
36
+
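Note on the metrics table above: the accuracy and micro-averaged F1 scores are identical, which is expected for single-label multiclass prediction, where every misclassified cell counts as exactly one false positive and one false negative. The sketch below illustrates the four metrics in plain Python. The toy labels and the function name `classification_metrics` are hypothetical and not taken from the notebook, which may well compute the scores with a library such as scikit-learn.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy, balanced accuracy, micro-F1 and macro-F1 for single-label data."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    n_correct = sum(tp.values())
    accuracy = n_correct / len(y_true)

    # Balanced accuracy: unweighted mean of per-class recall.
    true_classes = sorted(set(y_true))
    recalls = [tp[c] / (tp[c] + fn[c]) for c in true_classes]
    balanced_accuracy = sum(recalls) / len(recalls)

    # Micro-F1 pools the counts over all classes before computing F1.
    f1_micro = 2 * n_correct / (2 * n_correct + sum(fp.values()) + sum(fn.values()))

    # Macro-F1: unweighted mean of per-class F1.
    all_classes = sorted(set(y_true) | set(y_pred))
    per_class_f1 = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in all_classes]
    f1_macro = sum(per_class_f1) / len(per_class_f1)

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1_micro": f1_micro,
        "f1_macro": f1_macro,
    }


# Hypothetical toy labels, for illustration only.
y_true = ["A", "A", "B", "B", "B", "C"]
y_pred = ["A", "B", "B", "B", "B", "C"]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy and macro-F1 weight every class equally, so they can differ from plain accuracy when class sizes are uneven, as they do in the table.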
37
+ # Model parameters
38
+
39
+ Below we provide settings for scANVI setup
40
+
41
+ `lvae.init_params_["non_kwargs"]`
42
 
 
43
  ```json
44
  {
45
+ "n_hidden": 128,
46
+ "n_latent": 10,
47
+ "n_layers": 2,
48
+ "dropout_rate": 0.1,
49
+ "dispersion": "gene",
50
+ "gene_likelihood": "nb",
51
+ "linear_classifier": false
 
52
  }
53
  ```
54
 
55
+ `lvae.adata_manager.registry['setup_args']`
56
+
57
  ```json
58
  {
59
  "labels_key": "ct",
60
  "unlabeled_category": "Unknown",
61
  "layer": "counts",
62
  "batch_key": "batch",
63
+ "size_factor_key": null,
64
+ "categorical_covariate_keys": null,
65
  "continuous_covariate_keys": null
66
  }
67
  ```
68
 
69
  # References
70
 
71
+ Proks, M., Salehin, N. & Brickman, J.M. Deep learning-based models for preimplantation mouse and human embryos based on single-cell RNA sequencing. Nat Methods 22, 207–216 (2025). [https://doi.org/10.1038/s41592-024-02511-3](https://doi.org/10.1038/s41592-024-02511-3)
72
+
73
+ [Xu et al., 2021]: https://www.embopress.org/doi/full/10.15252/msb.20209620
74
+ [scvi-tools]: http://scvi-tools.org
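Note on the README diff above: the two JSON blocks correspond to the keyword arguments of the `SCANVI` constructor and of `SCANVI.setup_anndata` in scvi-tools. The following is a minimal sketch of how they could be passed through, assuming scvi-tools 1.1.x is installed and `adata` is the raw dataset with a `counts` layer and `ct` and `batch` columns in `adata.obs`. `build_scanvi` is a hypothetical helper name. The diff does not show how the notebook actually constructs the model (it may, for example, initialize scANVI from a pretrained scVI model), so this is an illustration rather than the authors' training code.

```python
import json

# Values copied from the README diff above.
INIT_PARAMS = json.loads("""
{
  "n_hidden": 128,
  "n_latent": 10,
  "n_layers": 2,
  "dropout_rate": 0.1,
  "dispersion": "gene",
  "gene_likelihood": "nb",
  "linear_classifier": false
}
""")

SETUP_ARGS = json.loads("""
{
  "labels_key": "ct",
  "unlabeled_category": "Unknown",
  "layer": "counts",
  "batch_key": "batch",
  "size_factor_key": null,
  "categorical_covariate_keys": null,
  "continuous_covariate_keys": null
}
""")


def build_scanvi(adata):
    """Hypothetical helper: register `adata` and return an untrained SCANVI model."""
    # Imported lazily so the dictionaries above can be inspected without scvi-tools.
    import scvi

    scvi.model.SCANVI.setup_anndata(adata, **SETUP_ARGS)
    return scvi.model.SCANVI(adata, **INIT_PARAMS)
```

JSON `null` and `false` become Python `None` and `False` on loading, which matches the defaults that scvi-tools expects for unused covariate keys.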
_scvi_required_metadata.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
- "scvi_version": "1.0.0",
3
- "anndata_version": "0.9.1",
4
  "model_cls_name": "SCANVI",
5
- "training_data_url": null,
6
  "model_parent_module": "scvi.model"
7
  }
 
1
  {
2
+ "scvi_version": "1.1.5",
3
+ "anndata_version": "0.10.8",
4
  "model_cls_name": "SCANVI",
5
+ "training_data_url": "https://zenodo.org/records/13749348/files/01_mouse_reprocessed.h5ad",
6
  "model_parent_module": "scvi.model"
7
  }
adata.h5ad CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5dd15019b2b4a9d5da0eada605fc11747bfc41ab4e439f0b4faec11f32c30299
3
- size 351669774
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e7d8acf5c35914856a94e28e2a8d441ee8b1df5308a29d7a5cd9208feaaee725
3
+ size 367084886
model.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d80c3a8b0afd6400fb3b3ac89a3af98c904b9fc3e5be874a0ec8305337676936
3
- size 8350369
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d076c005b90cb84fc2b96aa63319877ca6dcf5083acaf507f890ff91b1d5f712
3
+ size 8351118
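Note on the `adata.h5ad` and `model.pt` diffs above: both files are tracked with Git LFS, so the diff shows only the pointer files, and the `oid` and `size` fields are the sole visible record of what changed in the binaries. The sketch below parses the pointers and computes the size change. `parse_lfs_pointer` is a hypothetical helper, and the pointer text is copied from the diffs above.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict with an integer 'size'."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])
    return fields


POINTERS = {
    "adata.h5ad": (
        """version https://git-lfs.github.com/spec/v1
oid sha256:5dd15019b2b4a9d5da0eada605fc11747bfc41ab4e439f0b4faec11f32c30299
size 351669774
""",
        """version https://git-lfs.github.com/spec/v1
oid sha256:e7d8acf5c35914856a94e28e2a8d441ee8b1df5308a29d7a5cd9208feaaee725
size 367084886
""",
    ),
    "model.pt": (
        """version https://git-lfs.github.com/spec/v1
oid sha256:d80c3a8b0afd6400fb3b3ac89a3af98c904b9fc3e5be874a0ec8305337676936
size 8350369
""",
        """version https://git-lfs.github.com/spec/v1
oid sha256:d076c005b90cb84fc2b96aa63319877ca6dcf5083acaf507f890ff91b1d5f712
size 8351118
""",
    ),
}

# Size change in bytes for each file (new size minus old size).
size_delta = {}
for name, (old_text, new_text) in POINTERS.items():
    old, new = parse_lfs_pointer(old_text), parse_lfs_pointer(new_text)
    size_delta[name] = new["size"] - old["size"]
    print(f"{name}: {old['size']} -> {new['size']} bytes ({size_delta[name]:+d})")
```

A changed `oid` means the file content differs even when the size barely moves, as with `model.pt` here.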