Mohamed Sadek Saadi

yohoji

AI & ML interests

None yet

Organizations

None yet

yohoji's activity

reacted to jasoncorkill's post with 🔥👍 about 19 hours ago

Post

1437

At Rapidata, we compared DeepL with LLMs like DeepSeek-R1, Llama, and Mixtral for translation quality using feedback from over 51,000 native speakers. Despite its cost, DeepL's performance makes it a valuable investment, especially in critical applications where translation quality is paramount. Now we can say that Europe does more than impose regulations.

Our dataset, based on these comparisons, is now available on Hugging Face. This might be useful for anyone working on AI translation or language model evaluation.

Rapidata/Translation-deepseek-llama-mixtral-v-deepl

1 reply

liked 2 datasets 8 days ago

Rapidata/text-2-video-human-preferences-veo2

Viewer • Updated 8 days ago • 760 • 341 • 11

Rapidata/text-2-video-human-preferences-wan2.1

Viewer • Updated 8 days ago • 787 • 407 • 15

liked a dataset 12 days ago

Rapidata/Translation-deepseek-llama-mixtral-v-deepl

Viewer • Updated 9 days ago • 845 • 334 • 14

reacted to jasoncorkill's post with 🚀 20 days ago

Post

3830

Has OpenGVLab Lumina Outperformed OpenAI’s Model?

We’ve just released the results from a large-scale human evaluation (400k annotations) of OpenGVLab’s newest text-to-image model, Lumina. Surprisingly, Lumina outperforms OpenAI’s DALL-E 3 in terms of alignment, although it ranks #6 in our overall human preference benchmark.

To support further development in text-to-image models, we’re making our entire human-annotated dataset publicly available. If you’re working on model improvements and need high-quality data, feel free to explore.

We welcome your feedback and look forward to any insights you might share!

Rapidata/OpenGVLab_Lumina_t2i_human_preference

liked a dataset 21 days ago

Rapidata/OpenGVLab_Lumina_t2i_human_preference

Viewer • Updated 21 days ago • 13k • 1.17k • 13

reacted to jasoncorkill's post with 🚀🚀 27 days ago

Post

2509

This dataset was collected in roughly 4 hours using the Rapidata Python API, showcasing how quickly large-scale annotations can be performed with the right tooling!

All that for less than the cost of a single hour of a typical ML engineer's time in Zurich!

The new dataset contains ~22,000 human annotations evaluating AI-generated videos along several dimensions, such as Prompt-Video Alignment, Word for Word Prompt Alignment, Style, Speed of Time Flow, and Quality of Physics.

Rapidata/text-2-video-Rich-Human-Feedback

reacted to jasoncorkill's post with 👀 27 days ago

Post

2849

Integrating human feedback is vital for evolving AI models. Boost quality, scalability, and cost-effectiveness with our crowdsourcing tool!

Or run A/B tests and gather thousands of responses in minutes. Upload two images, ask a question, and watch the insights roll in!

Check it out here and let us know your feedback: https://app.rapidata.ai/compare

reacted to jasoncorkill's post with ❤️ about 1 month ago

Post

4555

Runway Gen-3 Alpha: The Style and Coherence Champion

Runway's latest video generation model, Gen-3 Alpha, is something special. It ranks #3 overall on our text-to-video human preference benchmark, but in terms of style and coherence, it outperforms even OpenAI Sora.

However, it struggles with alignment, making it less predictable for controlled outputs.

We've released a new dataset with human evaluations of Runway Gen-3 Alpha: Rapidata's text-2-video human preferences dataset. If you're working on video generation and want to see how your model compares to the biggest players, we can benchmark it for you.

🚀 DM us if you’re interested!

Dataset: Rapidata/text-2-video-human-preferences-runway-alpha