From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging Paper • 2410.01215 • Published 16 days ago • 30
BRAVE Models 🦁 Collection Models mentioned in https://huggingface.co/papers/2404.07204 • 6 items • Updated 18 days ago • 1
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 22 days ago • 251
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16 • 34
Qwen2-VL Collection Vision-language model series based on Qwen2 • 15 items • Updated 30 days ago • 138
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities Paper • 2408.04682 • Published Aug 8 • 14
AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents Paper • 2407.17490 • Published Jul 3 • 30
EVLM: An Efficient Vision-Language Model for Visual Understanding Paper • 2407.14177 • Published Jul 19 • 42
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 127
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
VoCo-LLaMA: Towards Vision Compression with Large Language Models Paper • 2406.12275 • Published Jun 18 • 29
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published Jun 12 • 16
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7 • 55
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6 • 71
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24 • 25
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation Paper • 2405.14598 • Published May 23 • 11
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published May 15 • 18
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 25 items • Updated Aug 21 • 12
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Paper • 2405.07990 • Published May 13 • 16
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated Aug 30 • 169
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 118
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29 • 71
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks Paper • 2404.14723 • Published Apr 23 • 10
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 38
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • Jun 23 • 60
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 78
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19 • 30
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding Paper • 2404.11912 • Published Apr 18 • 16
Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video Paper • 2404.09833 • Published Apr 15 • 29
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11 • 43
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Paper • 2404.05726 • Published Apr 8 • 20
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 62
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 124
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model Paper • 2404.01331 • Published Mar 29 • 24
Aurora-M models Collection Aurora-M models (base, biden-harris redteams and instruct) • 5 items • Updated May 6 • 17
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 47
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Paper • 2403.20331 • Published Mar 29 • 14
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated May 3 • 46
GiT: Towards Generalist Vision Transformer through Universal Language Interface Paper • 2403.09394 • Published Mar 14 • 25
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding Paper • 2403.09626 • Published Mar 14 • 13
BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay Paper • 2402.14194 • Published Feb 22 • 5