6cf committed on
Commit 194586e · verified · 1 Parent(s): 0fc19ff

Update README.md

Files changed (1)
  1. README.md +103 -0
README.md CHANGED
@@ -14,3 +14,106 @@ tags:
  - medical
  ---
+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6205fefd3f1dc8a642d70b10/JEZgA_xV6oF8AIsya9dop.jpeg)
+
+
+ # IdeaWhiz Model Card 🧠
+
+ ## Model Summary 🔬
+ IdeaWhiz is a fine-tuned version of QwQ-32B-Preview, optimized for scientific creativity and step-by-step reasoning. It is fine-tuned on the LiveIdeaBench dataset to improve its ability to generate novel scientific ideas and hypotheses.
+
+ ## Key Features 🌟
+ - Base Model: QwQ-32B-Preview 🔋
+ - Training Dataset: LiveIdeaBench 📊
+ - Main Focus: Scientific creativity and idea generation 💡
+ - Reasoning Style: o1-style step-by-step reasoning ⚡
+
+ ## Intended Use 🎯
+ - Scientific hypothesis generation 🧪
+ - Creative problem-solving in research 🔍
+ - Step-by-step scientific reasoning 📝
+ - Research direction brainstorming 🌱
+
+ ## Quickstart 🚀
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "6cf/QwQ-32B-Preview-IdeaWhiz-v1"
+
+ # Load the model and tokenizer. device_map="auto" requires the `accelerate` package.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = """I'll be submitting your next responses to a "Good Scientific Idea" expert review panel. If they consider your idea to be a good one, you'll receive a reward. Your assigned keyword is: "cancer". You may provide background information. The idea MUST be within 100 words (including background information). (Note: good scientific ideas should be novel, verifiable, practically valuable, and able to advance the field.). NOTE: You MUST give your answer after **Final Idea:**
+ """
+
+ messages = [
+     {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
+     {"role": "user", "content": prompt}
+ ]
+ # Render the conversation with the model's chat template and append the assistant turn marker.
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=4096
+ )
+ # Drop the prompt tokens so that only the newly generated tokens are decoded.
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(response)
+ ```
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ # Example Outputs
+
+ ```
+ Alright,
+
+ ...
+
+ **Final Idea:** Engineering exosomes to express checkpoint inhibitors on their surface for targeted delivery to the tumor microenvironment, with the aim of enhancing immune-mediated elimination of cancer cells. This approach combines the natural tropism of exosomes for tumor sites with the immunostimulatory effects of checkpoint inhibitors, potentially leading to more effective and less toxic cancer treatments.
+ ```
+
+
+
+ # Training Dataset
+
+ ## 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
+
+
+ ### Dataset
+
+ [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-yellow)](https://huggingface.co/datasets/6cf/liveideabench)
+
+ ### Paper
+
+ [![arXiv](https://img.shields.io/badge/arXiv-2412.17596-b31b1b.svg)](https://arxiv.org/abs/2412.17596)
+
+
+
+ If you use this model, please cite:
+
+ ```
+ @article{ruan2024liveideabench,
+   title={LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context},
+   author={Ruan, Kai and Wang, Xuan and Hong, Jixiang and Sun, Hao},
+   journal={arXiv preprint arXiv:2412.17596},
+   year={2024}
+ }
+ ```