Commit 1ef071e · Parent: 128988d
Update README.md

README.md CHANGED

Click 'Compute' to predict the class labels for an example abstract or an abstract of your own choice.
The class label 'positive' corresponds to 'positive results only', while 'negative' represents 'mixed and negative results'.

## Using the model for larger data
```python
from transformers import AutoTokenizer, Trainer, AutoModelForSequenceClassification

# 1. Load tokenizer
tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')

# 2. Apply preprocess function to data
# Make sure your text column is named 'text'. Otherwise replace 'text' with the name of your text column.
def preprocess_function(examples):
    return tokenizer(examples["text"],
                     truncation=True,
                     max_length=512,
                     padding='max_length')

# 'dataset' is a Hugging Face datasets object with an 'inference' split containing your abstracts.
tokenized_data = dataset.map(preprocess_function, batched=True)

# 3. Load model
NegativeResultDetector = AutoModelForSequenceClassification.from_pretrained("ClinicalMetaScience/NegativeResultDetector")

# 4. Initialize the trainer with the model and tokenizer
trainer = Trainer(
    model=NegativeResultDetector,
    tokenizer=tokenizer,
)

# 5. Apply NegativeResultDetector for prediction on inference data
predict_test = trainer.predict(tokenized_data["inference"])
```
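`trainer.predict` returns a `PredictionOutput` whose `predictions` field holds raw logits, one row of class scores per abstract. A minimal sketch of turning those scores into class labels (the dummy logits and the 0/1 label mapping below are assumptions for illustration; in practice, read the mapping from `NegativeResultDetector.config.id2label`):

```python
# Dummy logits standing in for predict_test.predictions: one row per abstract.
logits = [[2.1, -1.3],
          [-0.4, 1.7]]

# Argmax: pick the highest-scoring class index for each abstract.
predicted_ids = [row.index(max(row)) for row in logits]

# Map class ids to names. This mapping is an assumption; on the loaded model,
# use NegativeResultDetector.config.id2label instead of hard-coding it.
id2label = {0: "positive", 1: "negative"}
labels = [id2label[i] for i in predicted_ids]
```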

Further information on analyzing your own or our example data can be found in this [script](https://github.com/PsyCapsLock/PubBiasDetect/blob/main/Scripts/Predict_Example_Abstracts_using_NegativeResultDetector.ipynb) from our [GitHub repository](https://github.com/PsyCapsLock/PubBiasDetect).
## Disclaimer
This tool was developed to analyze and predict the prevalence of positive and negative results in scientific abstracts, based on the SciBERT model. While publication bias is a plausible explanation for certain patterns of results observed in the scientific literature, the analyses conducted by this tool do not conclusively establish the presence of publication bias or any other underlying factors. The tool evaluates the data it is given; it does not determine the underlying reasons for the observed trends.