Li Tan PRO
tanliboy
tanliboy's activity
Scorecard on popular benchmarks
2
#2 opened 30 days ago by tanliboy
Phi-2-Instruct-APO: aligned with Anchored Preference Optimization
9
#3 opened about 1 month ago by rasyosef
Preference Alignment
4
#6 opened 26 days ago by tanliboy
Text Classification with LLMs
7
#30 opened 2 months ago by dss107
IFEVAL drop
#16 opened 27 days ago by tanliboy
bfloat16 vs. float32
#34 opened 28 days ago by tanliboy
Qwen 2.5 1.5B retrain?
4
#12 opened 30 days ago by tomaarsen
GSM8K Evaluation Result: 84.5 vs. 76.95
17
#81 opened 3 months ago by tanliboy
Finetuning script using HuggingFace (No llama-factory)
6
#32 opened about 1 month ago by 2U1
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8
#120 opened about 2 months ago by erildo
Have you deleted your GitHub page?
7
#10 opened about 1 month ago by xwzy6
Sliding window vs. Global Attention
5
#41 opened about 2 months ago by tanliboy
Gemma2-2b training uses much more memory!
1
#23 opened about 2 months ago by bubbleseller
GemmaSdpaAttention vs GemmaAttention
2
#71 opened about 2 months ago by canqin001
Fix Llama 3.1 Chat Template to Properly Handle add_generation_prompt
9
#26 opened about 2 months ago by Tostino
🍭 Fine-tuning support for Qwen2-VL-7B-Instruct
5
#1 opened about 2 months ago by study-hjt
Batch Inference causes degraded performance
1
#43 opened about 2 months ago by tanliboy
Evaluation Result
#15 opened about 2 months ago by tanliboy
How is this dataset supposed to be used to evaluate the model?
4
#1 opened 2 months ago by realdanielbyrne
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
2
#18 opened 3 months ago by lcahill
Llama-3-Instruct with Langchain keeps talking to itself
10
#147 opened 4 months ago by fahim9778
Pruning
7
#24 opened 2 months ago by dhivakarsa
Bad test results using lm-evaluation-harness
4
#68 opened 7 months ago by smart-liu
Is having two BOS token IDs correct?
4
#97 opened 2 months ago by hpsun
Fine-tuning data templates: please help
2
#32 opened 2 months ago by Cagatayd
add_special_tokens=False results in poor generation
3
#80 opened 7 months ago by DMaksimov
Why is "bos_token": null, in tokenizer_config.json?
6
#15 opened 2 months ago by 3Simplex
TTS support?
3
#4 opened 2 months ago by yukiarimo
The base model doesn't generate coherently
4
#9 opened 4 months ago by migtissera
Fine-tuning Hyperparameters
6
#27 opened 3 months ago by tanliboy
Error: size mismatch for model.layers.0.self_attn.q_proj.weight:
2
#6 opened 3 months ago by tanliboy
dtype: float32 in base model vs. dtype: bfloat16 in the instruction fine-tuned model
#32 opened 3 months ago by tanliboy
TypeError: arange() received an invalid combination of arguments
4
#12 opened 4 months ago by darrenbudiman
TypeError: 'NoneType' object cannot be interpreted as an integer
2
#3 opened 4 months ago by tanliboy
Crash in Fine-tuning
4
#14 opened 5 months ago by tanliboy
"bos_token": "<s>" vs. "<|endoftext|>"
1
#20 opened 4 months ago by tanliboy
Difference in chat templates between Phi-3-small-8k-instruct and Phi-3-medium-4k-instruct
1
#4 opened 5 months ago by tanliboy