
	---
	license: cc-by-4.0
	---
	# Survival prediction using PORPOISE (TCGA GBMLGG)

	This model predicts a patient's overall survival from an H&E-stained digital pathology image of glioblastoma or lower-grade glioma (TCGA-GBMLGG). It was trained by Jakub Kaczmarzyk using PORPOISE, as an attempt to reproduce the results of the PORPOISE manuscript.

	Original journal article: https://doi.org/10.1016/j.ccell.2022.07.004

	If you find this model useful, please make sure you cite the original publication.

	Inputs: a bag of patches with 128 µm edge length, each embedded with CTransPath.

	Outputs: logits of the hazards at four timepoints.

	To calculate a risk score (in arbitrary units, where higher means higher risk) from the model output `logits`, use the following:

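	```python
	import torch
	# `logits` is the model output, expected shape (batch, 4).
	hazards = torch.sigmoid(logits)  # hazard at each timepoint
	S = torch.cumprod(1 - hazards, dim=1)  # survival up to each timepoint
	risk = -torch.sum(S, dim=1)  # higher value = higher risk
	```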

	## Data

	TCGA-GBMLGG was used to train the model. The whole slide images were tiled into 128 µm x 128 µm patches, and each patch was encoded using CTransPath, which produces 768-dimensional embeddings.
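
Because the patch size is given in micrometers, the edge length in pixels depends on the slide's resolution. The sketch below shows the arithmetic only. The function name `patch_size_px` and the example microns-per-pixel (MPP) values are illustrative assumptions and are not taken from this model card or from PORPOISE.

```python
def patch_size_px(patch_um: float, mpp: float) -> int:
    """Edge length in pixels of a patch spanning `patch_um` micrometers.

    `mpp` is the slide resolution in micrometers per pixel.
    """
    if mpp <= 0:
        raise ValueError("mpp must be positive")
    return round(patch_um / mpp)


# Hypothetical slide resolutions, for illustration only.
for mpp in (0.25, 0.5):
    print(f"{mpp} um/px -> {patch_size_px(128, mpp)} px")
```

Read the actual MPP from each slide's metadata rather than assuming a fixed value.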

	The training and validation splits were provided by the original PORPOISE code. Here, we report the model from fold 3, because it had the highest c-index among the folds.

	Sample sizes:
	- Train: 810 slides (455 patients)
	- Validation: 201 slides (114 patients)

	## Reusing this model

	To use this model on the command line, see [WSInfer-MIL](https://github.com/kaczmarj/wsinfer-mil).

	Alternatively, you may use PyTorch or ONNX to run the model. First, embed 128 µm x 128 µm patches using CTransPath. Then pass the bag of embeddings to the model.

	```python
	import numpy as np
	import onnxruntime as ort

	# Placeholder bag of 1,000 patch embeddings (768 dimensions each).
	# Replace with real CTransPath embeddings.
	embedding = np.ones((1_000, 768), dtype="float32")
	ort_sess = ort.InferenceSession("model.onnx")
	logits, attention = ort_sess.run(["logits", "attention"], {"input": embedding})
	# To get the risk score, implement the following:
	# hazards = sigmoid(logits)
	# S = cumprod(1 - hazards, dim=1)
	# risk = -sum(S, dim=1)
	```
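
The risk score steps can be written with NumPy alone, which avoids a PyTorch dependency when running the ONNX model. This is a minimal sketch: the helper name `risk_from_logits` is hypothetical, and it assumes `logits` has shape `(n_bags, 4)`, so that `axis=1` matches `dim=1` in the PyTorch snippet. The example logits are made up.

```python
import numpy as np


def risk_from_logits(logits: np.ndarray) -> np.ndarray:
    """Convert hazard logits of shape (n_bags, n_timepoints) to risk scores."""
    logits = np.atleast_2d(np.asarray(logits, dtype=np.float64))
    hazards = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    survival = np.cumprod(1.0 - hazards, axis=1)  # S at each timepoint
    return -survival.sum(axis=1)  # higher value = higher risk


# Made-up logits for a single bag, for illustration only.
example_logits = np.zeros((1, 4), dtype="float32")
example_risk = risk_from_logits(example_logits)
print(example_risk)  # [-0.9375]
```

Each survival term lies strictly between 0 and 1, so with four timepoints the score always falls between -4 and 0.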

	The median risk score was -3.22, and this value was used to split patients into low-risk and high-risk groups.
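
The split can be sketched as a simple threshold on the risk score. The function name `stratify` is hypothetical, the example scores are made up, and assigning scores exactly equal to the threshold to the low-risk group is an assumption, since the model card does not say how ties were handled.

```python
import numpy as np

MEDIAN_RISK = -3.22  # threshold reported in this model card


def stratify(risk_scores, threshold: float = MEDIAN_RISK) -> np.ndarray:
    """Label each risk score 'high' if above the threshold, else 'low'."""
    risk_scores = np.asarray(risk_scores, dtype=float)
    # Assumption: scores equal to the threshold are treated as low risk.
    return np.where(risk_scores > threshold, "high", "low")


# Made-up risk scores, for illustration only.
groups = stratify([-3.80, -3.22, -2.10])
print(groups.tolist())  # ['low', 'low', 'high']
```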

	## Model performance

	The model achieves a c-index of 0.83 in the validation set.
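
For readers unfamiliar with the metric, the c-index measures how often the model ranks a pair of patients in the right order: the patient with the earlier observed event should have the higher risk score. The following is a minimal sketch of Harrell's c-index on toy data. It is not necessarily the implementation used to obtain the number above, and the function name and data are illustrative.

```python
import numpy as np


def concordance_index(times, events, risks) -> float:
    """Harrell's c-index: fraction of comparable pairs ordered correctly.

    A pair (i, j) is comparable when patient i has an observed event
    before patient j's time. It is concordant when risks[i] > risks[j].
    Ties in risk count as half.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risks = np.asarray(risks, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, not from TCGA: survival times, event flags (1 = death observed),
# and risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [-1.0, -2.0, -1.5, -3.5]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
print(toy_c)  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.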

	## Intended uses

	This model is ONLY intended for research purposes.

	This model may not be used for clinical purposes. This model is distributed without warranties, either express or implied.