Shuyue Jia (Bruce)

shuyuej

https://shuyuej.com

SuperBruceJia

AI & ML interests

A Ph.D. student at @vkola-lab, Boston University. Passionate about Large Language Models (LLMs), Multimodal Foundation Models, Generative AI, and Medical AI.

Recent Activity

updated a model about 19 hours ago

shuyuej/Ministral-8B-Instruct-2410-2048-with-grad_norm

updated a model about 19 hours ago

shuyuej/Llama-3.1-8B-Instruct-2048-with-grad_norm

updated a model about 19 hours ago

shuyuej/Differential-Diagnoser-GPTQ-Model

shuyuej's activity

New activity in meta-llama/Llama-3.3-70B-Instruct 7 days ago

What Happens If the Prompt Exceeds 8,196 Tokens? And difference between input limit and context length limit?

#36 opened 9 days ago by

averyyu99

New activity in meta-llama/Llama-3.3-70B-Instruct 11 days ago

quant versions?

#12 opened 19 days ago by

apol

New activity in TheBloke/h2ogpt-research-oasst1-llama-65B-GPTQ 27 days ago

RecursionError: maximum recursion depth exceeded

#1 opened over 1 year ago by

WajihUllahBaig

New activity in shuyuej/e5-mistral-7b-instruct-GPTQ 5 months ago

missing model.safetensors.index.json

#1 opened 5 months ago by

kresimirfijacko

New activity in shuyuej/Mistral-Nemo-Instruct-2407-GPTQ 5 months ago

Can you create gptq 8 bits quants?

#1 opened 5 months ago by

rjmehta

New activity in hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4 5 months ago

Can you provide one model using `group_size=1024` to make the model smaller?

#15 opened 5 months ago by

shuyuej

OOM Error

#13 opened 5 months ago by

shuyuej

Update quantize_config.json

#12 opened 5 months ago by

shuyuej

Update config.json

#11 opened 5 months ago by

shuyuej

Source codes to quantize the LLaMA 3.1 405B model

#10 opened 5 months ago by

shuyuej

New activity in hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 5 months ago

Request for Mistral Large Instruct GPTQ INT4

#2 opened 5 months ago by

sparsh35

New activity in mistralai/Mamba-Codestral-7B-v0.1 5 months ago

Missing config.json

#6 opened 5 months ago by

wxl2001

New activity in openerotica/c4ai-command-r-plus-GPTQ-ERQ 5 months ago

Where can we download `quant.py`?

#1 opened 5 months ago by

shuyuej

New activity in CohereForAI/c4ai-command-r-v01 5 months ago

Learning Rate during pretraining

#58 opened 5 months ago by

shuyuej

New activity in Salesforce/SFR-Embedding-2_R 5 months ago

About the tokenizer - Why use LLaMA tokenizer?

#5 opened 5 months ago by

shuyuej

New activity in dunzhang/stella_en_1.5B_v5 5 months ago

Model max_seq_length

#6 opened 5 months ago by

shuyuej

New activity in Salesforce/SFR-Embedding-2_R 5 months ago

Model max_seq_length

#4 opened 5 months ago by

shuyuej

New activity in openlifescienceai/open_medical_llm_leaderboard 7 months ago

Where can we find `eval_medical_llm.py` and `main.py`

#15 opened 7 months ago by

shuyuej

New activity in google/gemma-7b 7 months ago

Fine-Tune a gemma model for question answering

#62 opened 10 months ago by

Iamexperimenting

New activity in google/gemma-7b 8 months ago

Weird Performance Issue with Gemma-7b compared to Gemma-2b with Qlora

#91 opened 8 months ago by

UserDAN