Li Tan PRO
tanliboy
Recent Activity
liked
a model
2 days ago
deepseek-ai/DeepSeek-R1
new activity
2 days ago
moonshotai/Moonlight-16B-A3B:Thank you!
updated
a model
21 days ago
tanliboy/Qwen2.5-14B-Instruct-1M-AWQ
tanliboy's activity
Thank you!
#2 opened 2 days ago
by
tanliboy

what is your "continuous finetuning"
7
#2 opened 5 months ago
by
MaziyarPanahi

Batch Inference causes degraded performance
3
#43 opened 6 months ago
by
tanliboy

Scorecard on popular benchmarks
2
#2 opened 5 months ago
by
tanliboy

Phi-2-Instruct-APO: aligned with Anchored Preference Optimization
16
#3 opened 5 months ago
by
rasyosef
Preference Alignment
4
#6 opened 5 months ago
by
tanliboy

Text Classification with LLMs
7
#30 opened 7 months ago
by
dss107
IFEVAL drop
#16 opened 5 months ago
by
tanliboy

bfloat16 vs. float32
#34 opened 5 months ago
by
tanliboy

Qwen 2.5 1.5B retrain?
4
#12 opened 5 months ago
by
tomaarsen

GSM8K Evaluation Result: 84.5 vs. 76.95
17
#81 opened 7 months ago
by
tanliboy

Finetuning script using HuggingFace (No llama-factory)
36
#32 opened 6 months ago
by
2U1
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8
#120 opened 6 months ago
by
erildo
Have you deleted your GitHub page?
7
#10 opened 6 months ago
by
xwzy6
Sliding window vs. Global Attention
6
#41 opened 6 months ago
by
tanliboy

Gemma2-2b training uses much more momory!
2
#23 opened 6 months ago
by
bubbleseller
GemmaSdpaAttention vs GemmaAttention
2
#71 opened 6 months ago
by
canqin001
Fix Llama 3.1 Chat Template to Properly Handle add_generation_prompt
9
#26 opened 6 months ago
by
Tostino
🍭 Fine-tuning support for Qwen2-VL-7B-Instruct
5
#1 opened 6 months ago
by
study-hjt

Evaluation Result
1
#15 opened 6 months ago
by
tanliboy
