Vietnamese Mistral

Activity Feed

AI & ML interests

Mistral & Mixtral for Vietnamese

Viet-Mistral's activity

Taishi-N324

authored 3 papers 17 days ago

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs

Paper • 2411.08719 • Published Nov 10, 2024

Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs

Paper • 2412.14471 • Published Dec 19, 2024

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Paper • 2503.04412 • Published 18 days ago • 1

huu-ontocord

authored 3 papers 22 days ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 55

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs

Paper • 2502.19413 • Published 26 days ago • 19

JJitsev

authored a paper 24 days ago

Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs

Paper • 2502.19413 • Published 26 days ago • 19

Taishi-N324

authored a paper 25 days ago

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Paper • 2502.19261 • Published 26 days ago • 7

vumichien

authored a paper about 1 month ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 9

PSaiml

authored a paper 2 months ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published Jan 17, 2025 • 8

anoperson

authored a paper 3 months ago

LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models

Paper • 2501.00874 • Published Jan 1, 2025 • 13

PSaiml

authored a paper 3 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

anoperson

authored a paper 4 months ago

Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering

Paper • 2411.09213 • Published Nov 14, 2024 • 7

chiennv

authored a paper 5 months ago

A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25, 2024 • 41

Taishi-N324

authored a paper 5 months ago

Agent Skill Acquisition for Large Language Models via CycleQD

Paper • 2410.14735 • Published Oct 16, 2024 • 2

huy-nh-2000

authored a paper 5 months ago

Taipan: Efficient and Expressive State Space Language Models with Selective Attention

Paper • 2410.18572 • Published Oct 24, 2024 • 18

chiennv

authored a paper 5 months ago

Taipan: Efficient and Expressive State Space Language Models with Selective Attention

Paper • 2410.18572 • Published Oct 24, 2024 • 18

JJitsev

authored 3 papers 5 months ago

A Comparative Study on Generative Models for High Resolution Solar Observation Imaging

Paper • 2304.07169 • Published Apr 14, 2023

DataComp: In search of the next generation of multimodal datasets

Paper • 2304.14108 • Published Apr 27, 2023 • 2

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

Paper • 2111.02114 • Published Nov 3, 2021

Team members 19
