12 13 39

Flo Schneider

floschne

https://www.inf.uni-hamburg.de/en/inst/ab/lt/people/florian-schneider.html

floschne

AI & ML interests

Multi Modal Information Retrieval and Representation Learning

Recent Activity

upvoted a paper 4 days ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

upvoted a paper 4 days ago

Qwen2.5 Technical Report

upvoted a paper 4 days ago

Progressive Multimodal Reasoning via Active Retrieval

View all activity

Organizations

floschne's activity

upvoted 3 papers 4 days ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published 7 days ago • 17

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 6 days ago • 327

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published 6 days ago • 66

New activity in maya-multimodal/maya 13 days ago

File missing

#1 opened 13 days ago by

floschne

liked a dataset 15 days ago

HuggingFaceFW/fineweb-2

Viewer • Updated 17 days ago • 13.8B • 86.4k • 361

reacted to merve's post with ❤️ 17 days ago

Post

5505

This week in open-source AI was insane 🤠 A small recap🕺🏻 merve/dec-6-releases-67545caebe9fc4776faac0a3

Multimodal 🖼️
> Google shipped a PaliGemma 2, new iteration of PaliGemma with more sizes: 3B, 10B and 28B, with pre-trained and captioning variants 👏
> OpenGVLab released InternVL2, seven new vision LMs in different sizes, with sota checkpoint with MIT license ✨
> Qwen team at Alibaba released the base models of Qwen2VL models with 2B, 7B and 72B ckpts

LLMs 💬
> Meta released a new iteration of Llama 70B, Llama3.2-70B trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages with Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, multilingual version of MMLU with 42 languages with Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset to train reasoning models
> Dataset: FineWeb2 just landed with multilinguality update! 🔥 nearly 8TB pretraining data in many languages!

Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux

Audio 🔊
> Indic-Parler-TTS is a new text2speech model made by community

New activity in neulab/PangeaBench-xmmmu about 2 months ago

Issues when downloading the dataset

#1 opened about 2 months ago by

floschne

upvoted a paper 2 months ago

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8 • 107

upvoted a collection 3 months ago

LLaVA-Onevision

Collection

LLaVa_Onevision models for single-image, multi-image, and video scenarios • 9 items • Updated Sep 18 • 12

liked a dataset 3 months ago

facebook/belebele

Viewer • Updated Aug 12 • 110k • 7.44k • 99

liked a model 4 months ago

Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • Updated 19 days ago • 2.44M • 980

upvoted an article 4 months ago

Article

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

Aug 22, 2023

• 28

upvoted a paper 4 months ago

M5 -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks

Paper • 2407.03791 • Published Jul 4 • 1

liked a model 4 months ago

royokong/e5-v

Image-Text-to-Text • Updated Oct 31 • 2.67k • 18

liked a dataset 4 months ago

Rocktim/EXAMS-V

Viewer • Updated May 7 • 21.3k • 336 • 7

upvoted a paper 4 months ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6 • 59

liked a dataset 6 months ago

afaji/cvqa

Viewer • Updated 28 days ago • 10.4k • 346 • 23

reacted to mrm8488's post with ❤️ 6 months ago

Post

4679

🚨Exciting news for the Multilingual Synthetic Data Community!🚨

I’ve taken inspiration from the MAGPIE paper on Llama-3-8B-instruct and extended its capabilities. Here’s what’s new!

🗞 The MAGPIE paper showcased that if you use the instruction-tuned version (Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, you can improve even the it-tuned version

🤔 While reading a script by Sebastian Raschka, PhD, I wondered: Could these advancements be replicated in other languages? Specifically, could they benefit non-English datasets?

🎉 And the answer is YES! At least for Spanish. I've successfully adapted the techniques for Spanish, proving the model's flexibility and multilingual capabilities.

👩‍💻 To make this accessible, I created a basic script (heavily inspired by the Sebastian Raschka one) that allows you to generate similar datasets using ollama models (initially phi and llama3) automatically and upload it to the Hugging Face Hub!
[Script](https://gist.github.com/mrm8488/4650a5e3cc45523798a527a3446eb312)

🔍 Explore the datasets 📚 generated using our new script!

- [Llama-3-8B](https://huggingface.co/datasets/mrm8488/dataset_llama3_5000_samples_es_4231_filtered)
- [Phi-3-medium](https://huggingface.co/datasets/mrm8488/dataset_phi3-medium_5000_samples_es_3906_filtered)
- [Phi-3-mini](https://huggingface.co/datasets/mrm8488/dataset_phi3_5000_samples_es_3282_filtered)

Note: These datasets have basic filtering. Apply additional quality filters before using them to fine-tune large language models.

Inspiration and base script:
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/05_dataset-generation/llama3-ollama.ipynb
https://www.linkedin.com/feed/update/urn:li:activity:7210982019751661568/

7 replies

updated a dataset 6 months ago

floschne/wismir3

Viewer • Updated Jul 1 • 301k • 97

liked a model 6 months ago

xtuner/llava-llama-3-8b-v1_1-gguf

Image-to-Text • Updated Apr 30 • 8.43k • 196