Commit 1ef071e · Parent: 128988d
Update README.md

README.md CHANGED

Click 'Compute' to predict the class labels for an example abstract or an abstract of your own choice.
The class label 'positive' corresponds to 'positive results only', while 'negative' represents 'mixed and negative results'.

## Using the model for larger data
```python
from transformers import AutoTokenizer, Trainer, AutoModelForSequenceClassification

# 1. Load tokenizer
tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')

# 2. Apply preprocess function to data
# Make sure your text column is named 'text'. Otherwise replace 'text' with the name of your text column.
def preprocess_function(examples):
    return tokenizer(examples["text"],
                     truncation=True,
                     max_length=512,
                     padding='max_length')

# 'dataset' is a Hugging Face datasets object with an 'inference' split containing your abstracts.
tokenized_data = dataset.map(preprocess_function, batched=True)

# 3. Load model
NegativeResultDetector = AutoModelForSequenceClassification.from_pretrained("ClinicalMetaScience/NegativeResultDetector")

# 4. Initialize the trainer with the model and tokenizer
trainer = Trainer(
    model=NegativeResultDetector,
    tokenizer=tokenizer,
)

# 5. Apply NegativeResultDetector for prediction on inference data
predict_test = trainer.predict(tokenized_data["inference"])
```
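`trainer.predict` returns a `PredictionOutput` whose `predictions` field holds raw logits, one row of class scores per abstract. A minimal sketch of turning those scores into class labels (the dummy logits and the 0/1 label mapping below are assumptions for illustration; in practice, read the mapping from `NegativeResultDetector.config.id2label`):

```python
# Dummy logits standing in for predict_test.predictions: one row per abstract.
logits = [[2.1, -1.3],
          [-0.4, 1.7]]

# Argmax: pick the highest-scoring class index for each abstract.
predicted_ids = [row.index(max(row)) for row in logits]

# Map class ids to names. This mapping is an assumption; on the loaded model,
# use NegativeResultDetector.config.id2label instead of hard-coding it.
id2label = {0: "positive", 1: "negative"}
labels = [id2label[i] for i in predicted_ids]
```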

Further information on analyzing your own or our example data can be found in this [script](https://github.com/PsyCapsLock/PubBiasDetect/blob/main/Scripts/Predict_Example_Abstracts_using_NegativeResultDetector.ipynb) from our [GitHub repository](https://github.com/PsyCapsLock/PubBiasDetect).
## Disclaimer
This tool was developed to analyze and predict the prevalence of positive and negative results in scientific abstracts, based on the SciBERT model. While publication bias is a plausible explanation for certain patterns of results observed in the scientific literature, the analyses conducted by this tool do not conclusively establish the presence of publication bias or any other underlying factors. The tool evaluates the data it is given; it does not determine the underlying reasons for the observed trends.