t.d.a.g. PRO

sequelbox

AI & ML interests

open source, infinite games. (they/them)

Recent Activity

Organizations

Valiant Labs

sequelbox's activity

posted an update 4 days ago
NEW RELEASE: the sequelbox/Tachibana-QVQ dataset is here! Code-reasoning and code-instruct data generated with Qwen/QVQ-72B-Preview

Come check out QVQ's coding skills!

for everyone to use!

more QVQ and Llama 3.1 405b datasets coming soon :)
reacted to DawnC's post with ❤️ 7 days ago
🌟 PawMatchAI: Making Breed Selection More Intuitive! 🐕
Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! I've made significant architectural improvements to enhance breed recognition accuracy and feature detection.

✨ What's New?
Enhanced breed recognition through advanced morphological feature analysis:
- Implemented a sophisticated feature extraction system that analyzes specific characteristics like body proportions, head features, tail structure, fur texture, and color patterns
- Added an intelligent attention mechanism that dynamically focuses on the most relevant features for each image
- Improved multi-dog detection capabilities through enhanced spatial feature analysis
- Achieved better precision in distinguishing subtle breed characteristics

🎯 Key Features:
- Smart breed recognition powered by advanced AI architecture
- Visual matching scores with intuitive color indicators
- Detailed breed comparisons with interactive tooltips
- Lifestyle-based recommendations tailored to your needs

💭 Project Vision
Combining my passion for AI and pets, this project represents another step toward creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.

👉 Try it now: DawnC/PawMatchAI

Your likes ❤️ on this space fuel this project's growth!

#AI #MachineLearning #DeepLearning #Pytorch #ComputerVision #TechForLife
posted an update 12 days ago
reacted to m-ric's post with 👀 22 days ago
๐‡๐ฎ๐ ๐ ๐ข๐ง๐  ๐…๐š๐œ๐ž ๐ซ๐ž๐ฅ๐ž๐š๐ฌ๐ž๐ฌ ๐๐ข๐œ๐จ๐ญ๐ซ๐จ๐ง, ๐š ๐ฆ๐ข๐œ๐ซ๐จ๐ฌ๐œ๐จ๐ฉ๐ข๐œ ๐ฅ๐ข๐› ๐ญ๐ก๐š๐ญ ๐ฌ๐จ๐ฅ๐ฏ๐ž๐ฌ ๐‹๐‹๐Œ ๐ญ๐ซ๐š๐ข๐ง๐ข๐ง๐  ๐Ÿ’๐ƒ ๐ฉ๐š๐ซ๐š๐ฅ๐ฅ๐ž๐ฅ๐ข๐ณ๐š๐ญ๐ข๐จ๐ง ๐Ÿฅณ

๐Ÿ•ฐ๏ธ Llama-3.1-405B took 39 million GPU-hours to train, i.e. about 4.5 thousand years.

๐Ÿ‘ด๐Ÿป If they had needed all this time, we would have GPU stories from the time of Pharaoh ๐“‚€: "Alas, Lord of Two Lands, the shipment of counting-stones arriving from Cathay was lost to pirates, this shall delay the building of your computing temple by many moons "

๐Ÿ› ๏ธ But instead, they just parallelized the training on 24k H100s, which made it take just a few months.
This required parallelizing across 4 dimensions: data, tensor, context, pipeline.
And it is infamously hard to do, making for bloated code repos that hold together only by magic.
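
The arithmetic behind those two figures can be checked in a few lines. This is only a back-of-the-envelope sketch: it takes "24k" as exactly 24,000 GPUs and assumes perfect scaling with no downtime, neither of which the post states.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

gpu_hours = 39_000_000  # training cost quoted in the post
num_gpus = 24_000       # "24k H100s", taken as a round number (assumption)

# Time needed if a single GPU did all the work sequentially.
single_gpu_years = gpu_hours / HOURS_PER_YEAR

# Wall-clock time if the work splits perfectly across all GPUs (assumption).
wall_clock_days = gpu_hours / num_gpus / 24

print(f"{single_gpu_years:,.0f} years on one GPU")
print(f"{wall_clock_days:.0f} days on {num_gpus:,} GPUs")
```

Under these assumptions the single-GPU figure lands near 4,450 years and the parallel run near 68 days, consistent with "about 4.5 thousand years" and "a few months" once real-world overhead is added.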

๐Ÿค ๐—•๐˜‚๐˜ ๐—ป๐—ผ๐˜„ ๐˜„๐—ฒ ๐—ฑ๐—ผ๐—ป'๐˜ ๐—ป๐—ฒ๐—ฒ๐—ฑ ๐—ต๐˜‚๐—ด๐—ฒ ๐—ฟ๐—ฒ๐—ฝ๐—ผ๐˜€ ๐—ฎ๐—ป๐˜†๐—บ๐—ผ๐—ฟ๐—ฒ! Instead of building mega-training codes, Hugging Face colleagues cooked in the other direction, towards tiny 4D parallelism libs. A team has built Nanotron, already widely used in industry.
And now a team releases Picotron, a radical approach to code 4D Parallelism in just a few hundred lines of code, a real engineering prowess, making it much easier to understand what's actually happening!
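
To make the "4 dimensions" concrete, here is a minimal sketch of how a flat GPU rank can be mapped to a coordinate on a data / pipeline / context / tensor grid. It is not taken from Picotron or Nanotron: the `Coord` type, the `rank_to_coord` helper, and the axis ordering (tensor index varying fastest) are all hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    dp: int  # data-parallel index
    pp: int  # pipeline-parallel index
    cp: int  # context-parallel index
    tp: int  # tensor-parallel index


def rank_to_coord(rank: int, dp: int, pp: int, cp: int, tp: int) -> Coord:
    """Map a flat rank to 4D grid coordinates (hypothetical layout)."""
    world_size = dp * pp * cp * tp
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside world of size {world_size}")
    # Peel off the fastest-varying axis first: tp, then cp, then pp, then dp.
    rank, tp_idx = divmod(rank, tp)
    rank, cp_idx = divmod(rank, cp)
    dp_idx, pp_idx = divmod(rank, pp)
    return Coord(dp_idx, pp_idx, cp_idx, tp_idx)


# Example: 8 GPUs split as 2 (data) x 2 (pipeline) x 1 (context) x 2 (tensor).
for r in range(8):
    print(r, rank_to_coord(r, dp=2, pp=2, cp=1, tp=2))
```

Ranks that share every coordinate except one form the communication group for that axis, which is the bookkeeping that large training codebases spend much of their code on.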

โšก ๐—œ๐˜'๐˜€ ๐˜๐—ถ๐—ป๐˜†, ๐˜†๐—ฒ๐˜ ๐—ฝ๐—ผ๐˜„๐—ฒ๐—ฟ๐—ณ๐˜‚๐—น:
Counting in MFU (Model FLOPs Utilization, how much the model actually uses all the compute potential), this lib reaches ~50% on SmolLM-1.7B model with 8 H100 GPUs, which is really close to what huge libs would reach. (Caution: the team is leading further benchmarks to verify this)
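
For readers who want to see what an MFU number is made of, here is a rough sketch. It uses the common approximation of about 6 FLOPs per parameter per token for a forward plus backward pass (attention FLOPs ignored); the function names are hypothetical, and the peak-FLOPs value must come from the datasheet of the actual GPU. This is not how the Picotron team computed their figure.

```python
def mfu(n_params: float, tokens_per_second: float,
        num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Approximate Model FLOPs Utilization as a fraction between 0 and 1."""
    # ~6 FLOPs per parameter per token (forward + backward), an approximation.
    achieved_flops_per_second = 6 * n_params * tokens_per_second
    return achieved_flops_per_second / (num_gpus * peak_flops_per_gpu)


def implied_tokens_per_second(target_mfu: float, n_params: float,
                              num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Throughput that a given MFU would imply under the same approximation."""
    return target_mfu * num_gpus * peak_flops_per_gpu / (6 * n_params)


# Illustrative only: 989e12 is an ASSUMED dense BF16 peak for one H100;
# check the datasheet for your exact part before relying on it.
tps = implied_tokens_per_second(0.5, n_params=1.7e9,
                                num_gpus=8, peak_flops_per_gpu=989e12)
print(f"~{tps:,.0f} tokens/s would correspond to 50% MFU under these assumptions")
```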

Go take a look 👉 https://github.com/huggingface/picotron/tree/main/picotron
reacted to takarajordan's post with ❤️ about 1 month ago
I'm super excited to release my first open-source text dataset:

WorldScenario 20K is a novel dataset of 20,000 synthetically generated multi-stakeholder scenarios designed to simulate real-world decision-making processes. Each scenario explores a unique environmental, societal, or economic issue.

I used the brand-new meta-llama/Llama-3.3-70B-Instruct model to generate this dataset, then post-processed it to clean it and evaluate its diversity.

I'd appreciate some feedback and thoughts on my new release! Thanks!

takarajordan/WorldScenario_20K
posted an update about 1 month ago
NEW RELEASE: Celestia 2!

- Multi-turn science-instruct conversations in the microsoft/orca-agentinstruct-1M-v1 style, generated by meta-llama/Llama-3.1-405B-Instruct
- 100% challenging, multi-turn conversations focused on physics, chemistry, computer science, biology, Earth science, and more!

Celestia 2 will be one of the datasets used for training by the upcoming agent-instruct model, Shining Valiant 3. very excited for this :)

Get it now: sequelbox/Celestia2

do as you will. there is only the sea.
posted an update about 2 months ago
posted an update 2 months ago
NEW RELEASE! Shining Valiant 2 for Llama 3.1 70b is here!

- Trained on high quality science-instruct, complex queries, and general chat data!
- Uses our newest datasets, ALL open-sourced for everyone to use!

GET SV2 70B: ValiantLabs/Llama3.1-70B-ShiningValiant2

- Find the SV datasets here, including the expanded version of our science-instruct dataset:
- sequelbox/Celestia
- sequelbox/Spurline
- sequelbox/Supernova
- SV2 8b and 3b will be updated with the new datasets soon!

Enjoy! :)
posted an update 3 months ago
replied to their post 3 months ago
posted an update 3 months ago
posted an update 3 months ago
posted an update 4 months ago
NEW RELEASE! We've brought Shining Valiant 2 to Llama 3.2!

ValiantLabs/Llama3.2-3B-ShiningValiant2 is trained on high-quality general chat and science-instruct data! Get it now :)

(Enigma's up next for 3b, that'll be out soon!)

Additionally, newly expanded versions of the following datasets are now available:

sequelbox/Supernova is now 178k rows of high-quality synthetic general chat data.
sequelbox/Tachibana is now 104k rows of high-quality synthetic code-instruct data.

for everyone to use :)
more soon
reacted to fdaudens's post with 🚀 4 months ago
🚀 1,000,000 public models milestone achieved on Hugging Face! 🤯

This chart by @cfahlgren1 shows the explosive growth of open-source AI. It's not just about numbers - it's a thriving community combining cutting-edge ML with real-world applications. cfahlgren1/hub-stats

Can't wait to see what's next!
posted an update 4 months ago
today's release: the updated Supernova general chat dataset!

- the new Supernova is 2x the rows, continuing to provide high quality general synthetic data generated with Llama 405b Instruct.

Find it at sequelbox/Supernova

Enjoy! There's also a new version of sequelbox/Llama3.1-8B-MOTH available using the new dataset. (new and better MOTHs for other models will come as well, but the Build Tools and Shining Valiant take priority.)
posted an update 4 months ago
replied to their post 4 months ago

coming next: newest version of Shining Valiant + the new science-instruct dataset that she'll be using as a knowledge base. very excited for this!
after that more build tool releases :)

posted an update 4 months ago
reacted to vilarin's post with 🚀 4 months ago
posted an update 4 months ago