AI & ML interests

GenAI, Diffusion, LLM's and State of the Art Solutions.

Recent Activity

takara-ai's activity

takarajordanΒ 
updated a Space 7 days ago
takarajordanΒ 
posted an update 7 days ago
view post
Post
1045
I made an RSS feed for HuggingFace Daily Papers!! πŸ€—

Just Subscribe here: https://papers.takara.ai/api/feed

It updates every 24 hours, completely written as a serverless go script with a Redis cache (to avoid hitting HF all the time).

I'm open sourcing the code, you can check out my repo and deploy it on Vercel extremely easily!
https://github.com/404missinglink/HF-Daily-Papers-Feeds

thanks to @John6666 @p3nGu1nZz for your early support
takarajordanΒ 
posted an update 14 days ago
view post
Post
2199
I'm super excited to release my first open-source text dataset:

WorldScenario 20K is a novel dataset of 20,000 synthetically generated multi-stakeholder scenarios designed to simulate real-world decision-making processes. Each scenario explores a unique environmental, societal, or economic issue.

I used the brand new meta-llama/Llama-3.3-70B-Instruct model to generate this dataset and I put the dataset through some post processing to clean and evaluate the dataset for diversity.

I'd appreciate some feedback and thoughts on my new release! Thanks!

takarajordan/WorldScenario_20K
Β·
takarajordanΒ 
posted an update 28 days ago
takarajordanΒ 
posted an update about 1 month ago
view post
Post
1120
First post here goes!

takarajordan/CineDiffusion

Super excited to announce CineDiffusionπŸŽ₯, it creates images up to 4.2 Megapixels in Cinematic ultrawide formats like:
- 2.39:1 (Modern Widescreen)
- 2.76:1 (Ultra Panavision 70)
- 3.00:1 (Experimental Ultra-wide)
- 4.00:1 (Polyvision)
- 2.55:1 (CinemaScope)
- 2.20:1 (Todd-AO)

More to come soon!!

Thanks to @John6666 and @Resoldjew for your early support <3

And thanks to the team at ShuttleAI for their brand new Shuttle-3 model, what an amazing job.

shuttleai/shuttle-3-diffusion
TonicΒ 
posted an update about 2 months ago
view post
Post
3391
πŸ™‹πŸ»β€β™‚οΈhey there folks,

periodic reminder : if you are experiencing ⚠️500 errors ⚠️ or ⚠️ abnormal spaces behavior on load or launch ⚠️

we have a thread πŸ‘‰πŸ» https://discord.com/channels/879548962464493619/1295847667515129877

if you can record the problem and share it there , or on the forums in your own post , please dont be shy because i'm not sure but i do think it helps πŸ€—πŸ€—πŸ€—
  • 2 replies
Β·
takarajordanΒ 
updated a Space about 2 months ago
TonicΒ 
posted an update about 2 months ago
view post
Post
1088
boomers still pick zenodo.org instead of huggingface ??? absolutely clownish nonsense , my random datasets have 30x more downloads and views than front page zenodos ... gonna write a comparison blog , but yeah... cringe.
  • 1 reply
Β·
TonicΒ 
posted an update 2 months ago
view post
Post
817
πŸ™‹πŸ»β€β™‚οΈ hey there folks ,

really enjoying sharing cool genomics and protein datasets on the hub these days , check out our cool new org : https://huggingface.co/seq-to-pheno

scroll down for the datasets, still figuring out how to optimize for discoverability , i do think on that part it will be better than zenodo[dot}org , it would be nice to write a tutorial about that and compare : we already have more downloads than most zenodo datasets from famous researchers !
TonicΒ 
posted an update 2 months ago
view post
Post
1446
hey there folks,

twitter is aweful isnt it ? just getting into the habbit of using hf/posts for shares πŸ¦™πŸ¦™

Tonic/on-device-granite-3.0-1b-a400m-instruct

new granite on device instruct model demo , hope you like it πŸš€πŸš€
TonicΒ 
posted an update 2 months ago
TonicΒ 
posted an update 3 months ago
TonicΒ 
posted an update 3 months ago
view post
Post
1853
πŸ™‹πŸ»β€β™‚οΈ Hey there folks ,

🦎Salamandra release by @mvillegas and team
@BSC_CNS https://huggingface.co/BSC-LT is absolutely impressive so far !

perhaps the largest single training dataset of high quality text to date of 7.8 trillion tokens in 35 European languages and code.

the best part : the data was correctly licenced so it's actually future-proof!

the completions model is really creative and instruct fine tuned version is very good also.

now you can use such models for multi-lingual enterprise applications with further finetunes , long response generation, structured outputs (coding) also works.

check out πŸ‘‡πŸ»
the collection : BSC-LT/salamandra-66fc171485944df79469043a
the repo : https://github.com/langtech-bsc/salamandra
7B-Instruct demo : Tonic/Salamandra-7B