---
license: apache-2.0
tags:
- Automated Peer Reviewing
- SFT
---

## Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

Paper Link: https://arxiv.org/abs/2407.12857

Project Page: https://ecnu-sea.github.io/


## 🔥 News
- 🔥🔥🔥 SEA has been accepted by EMNLP 2024!
- 🔥🔥🔥 We have made the SEA series models (7B) public!

## Model Description
The SEA-E model utilizes [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) as its backbone. It is derived by performing supervised fine-tuning (SFT) on a high-quality peer review instruction dataset, standardized through the SEA-S model. **This model can provide comprehensive and insightful review feedback for submitted papers!**
  
## Review Paper With SEA-E

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Build the prompt: the review instruction plus the paper body
# (a Nougat-style .mmd export), truncated before the references.
instruction = system_prompt_dict['instruction_e']
paper = read_txt_file(mmd_file_path)
idx = paper.find("## References")
if idx != -1:
    paper = paper[:idx].strip()

model_name = "/root/sea/"  # local path to the SEA-E checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
chat_model = AutoModelForCausalLM.from_pretrained(model_name)
chat_model.to("cuda:0")

messages = [
    {"role": "system", "content": instruction},
    {"role": "user", "content": paper},
]

# Apply the chat template and generate the review.
encodes = tokenizer.apply_chat_template(messages, return_tensors="pt")
encodes = encodes.to("cuda:0")
len_input = encodes.shape[1]
generated_ids = chat_model.generate(encodes, max_new_tokens=8192, do_sample=True)

# Decode only the newly generated tokens (the review itself), not the prompt.
response = tokenizer.batch_decode(generated_ids[:, len_input:], skip_special_tokens=True)[0]
```
The code provided above is an example. For detailed usage instructions, please refer to https://github.com/ecnu-sea/sea.
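
The example above assumes two objects defined in the SEA repository: a `read_txt_file` helper and a `system_prompt_dict` holding the review instruction. A minimal, hypothetical sketch of what they might look like (the real instruction text ships with the repository; the prompt string below is only a placeholder):

```python
# Hypothetical stand-ins for helpers provided by the SEA repository.

def read_txt_file(path: str) -> str:
    """Read a paper exported to Nougat-style .mmd markdown as plain text."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Placeholder only: the actual review instruction is defined in the SEA repo.
system_prompt_dict = {
    "instruction_e": "You are a peer reviewer. Write a detailed review of the paper below.",
}
```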

## Additional Clauses

The additional clauses for this project are as follows:

- Commercial use is not allowed.
- The SEA-E model is intended solely to provide informative reviews that help authors polish their papers, not to directly recommend acceptance or rejection.
- Currently, the SEA-E model is only applicable within the field of machine learning and does not guarantee insightful comments for other disciplines.


## Citation


If you find our paper or models helpful, please consider citing us as follows:

```bibtex
@inproceedings{yu2024automated,
  title={Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis},
  author={Yu, Jianxiang and Ding, Zichen and Tan, Jiaqi and Luo, Kangyang and Weng, Zhenmin and Gong, Chenghua and Zeng, Long and Cui, RenJing and Han, Chengcheng and Sun, Qiushi and others},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  pages={10164--10184},
  year={2024}
}
```