Update app.py
app.py
CHANGED
@@ -71,7 +71,7 @@ def display_work_experience():
 - Engineered comprehensive evaluation benchmarks for Gemini 3.0 by analyzing reasoning loss patterns and image loss patterns in state-of-the-art Vision-Language Models (VLMs) including o3 and Gemini 2.5 Pro, developing custom datasets across multiple domains (mathematics, finance, chemistry, biology) spanning educational levels from high-school through PhD with statistical validation methods
 - Implemented advanced LLM fine-tuning strategies for Qwen model including Parameter-Efficient Fine-Tuning (PEFT) with LoRA and 2-stage whole model training on multi-GPU clusters, achieving 12% performance improvement across 15+ categories
 - Developed "auto hinter" system to improve LLM reasoning, guiding models towards correct answers based on question complexity, resulting in 8% performance increment on PhD-level questions
-- Built "auto rater" system to assess responses from leading models like Gemini 2.5 Pro and o3 custom builds, scoring across four key dimensions: completeness, coherence, clarity, and
+- Built "auto rater" system to assess responses from leading models like Gemini 2.5 Pro and o3 custom builds, scoring across four key dimensions: completeness, coherence, clarity, correctness, style and formatting
 - Applied advanced model compression techniques including quantization and distillation methods to optimize inference performance while maintaining model accuracy for production-ready LLM deployment
 - Designed robust evaluation pipelines incorporating ROC curve analysis, performance benchmarking, bias mitigation, and RMSE validation to ensure model reliability and efficiency
 
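The PEFT-with-LoRA approach named in the fine-tuning bullet can be sketched in miniature. This is a minimal illustration of the low-rank-adapter idea only, not the actual training code from this Space: all names, matrix sizes, and the pure-Python matrix multiply below are assumptions for the example, and a real setup would use the `peft` library on a Qwen checkpoint.

```python
# LoRA sketch: a frozen weight W is augmented with a trainable low-rank
# update A @ B (rank r << layer width), so fine-tuning touches only A and B.
# Everything here is illustrative; shapes are tiny for readability.

def matmul(X, Y):
    """Plain nested-list matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, scale=1.0):
    """y = x @ (W + scale * (A @ B)): frozen W plus trained low-rank delta."""
    delta = matmul(A, B)  # (d_in x r) @ (r x d_out) low-rank update
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, delta)]
    return matmul(x, W_eff)

# Tiny example: d_in = d_out = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (identity)
A = [[1.0], [0.0]]             # trainable down-projection, d_in x r
B = [[0.0, 2.0]]               # trainable up-projection, r x d_out
x = [[3.0, 4.0]]
print(lora_forward(x, W, A, B))  # → [[3.0, 10.0]]
```

Because only A and B (2·d·r values instead of d²) are updated, the adapter can be trained cheaply and merged back into W for inference, which is the property the bullet's multi-GPU PEFT setup relies on.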