maywell (Jeonghwan Park)

reacted to yongchanghao's post with 🔥 3 months ago

Post

3765

We just released a paper (NeuZip) that losslessly compresses what is held in VRAM so that larger models can run. This should be particularly useful when VRAM is insufficient during training or inference. Specifically, we look inside each floating-point number and find that the exponents are highly compressible (as shown in the figure below).

Read more about the work at NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks (2410.20650)
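To get a feel for why exponents compress well, here is a minimal sketch (not NeuZip's implementation) that measures the Shannon entropy of the 8-bit exponent field over synthetic weights. The Gaussian weights, the standard deviation of 0.02, and the function names are assumptions made for illustration; real checkpoints will give different numbers.

```python
import math
import random
import struct
from collections import Counter


def exponent_bits(x: float) -> int:
    """Return the 8-bit biased exponent of x when stored as IEEE 754 float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return (bits >> 23) & 0xFF


def shannon_entropy(symbols) -> float:
    """Entropy in bits per symbol of an iterable of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def exponent_entropy(n: int = 100_000, std: float = 0.02, seed: int = 0) -> float:
    """Entropy of the exponent field over n synthetic Gaussian weights (assumed distribution)."""
    rng = random.Random(seed)
    return shannon_entropy(exponent_bits(rng.gauss(0.0, std)) for _ in range(n))


if __name__ == "__main__":
    # The exponent field occupies 8 bits; an entropy well below 8 means an
    # entropy coder could store it in fewer bits without losing information.
    print(f"exponent entropy: {exponent_entropy():.2f} bits out of 8")
```

bfloat16 uses the same 8-bit exponent layout as float32, so the same measurement applies to bf16 weights.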

reacted to thomwolf's post with ❤️ 4 months ago

Post

4199

Parents in the 1990s: Teach the kids to code
Parents now: Teach the kids to fix the code when it starts walking around 🤖✨

2 replies

reacted to beomi's post with 🔥 4 months ago

Post

5596

# PyTorch == 2.5.0 Breaks Transformers' SDPAttention!

If you encounter "RuntimeError: cuDNN Frontend error: [cudnn_frontend] Error: No execution plans support the graph.", you can use a workaround like this:

torch.backends.cuda.enable_cudnn_sdp(False)

but this reduces the performance gain from PyTorch 2.5.

The issue is addressed in https://github.com/pytorch/pytorch/pull/138587 (not truly "fixed": the default is changed so that cuDNN SDPA is turned off), but that change is not released yet, so you need to install directly from source to get it.

Fastest way for now: pip install "torch<2.5"

Ref: https://github.com/huggingface/diffusers/issues/9704#issuecomment-2422585273
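If you would rather keep PyTorch 2.5.0 installed, the workaround above can be applied conditionally at startup. This is a sketch under assumptions: the helper names are made up for illustration, and the version check only matches the exact 2.5.0 release named in the post. The only PyTorch call used is the one quoted above.

```python
def needs_cudnn_sdp_workaround(torch_version: str) -> bool:
    """True only for the 2.5.0 release; local build tags such as '+cu124' are ignored."""
    base = torch_version.split("+", 1)[0]
    return base == "2.5.0"


def apply_workaround() -> bool:
    """Disable cuDNN SDPA when the affected version is installed.

    Returns True if the workaround was applied, False otherwise
    (including when torch is not installed at all).
    """
    try:
        import torch
    except ImportError:
        return False
    if needs_cudnn_sdp_workaround(torch.__version__):
        # Same call as in the post: fall back to the non-cuDNN SDPA backends.
        torch.backends.cuda.enable_cudnn_sdp(False)
        return True
    return False


if __name__ == "__main__":
    print("workaround applied:", apply_workaround())
```

Call `apply_workaround()` once, before building the model, so the setting is in place before the first attention call.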

reacted to Felladrin's post with ❤️ 4 months ago

Post

3097

MiniSearch is celebrating its 1st birthday! 🎉

Exactly one year ago, I shared the initial version of this side-project on Hugging Face. Since then, there have been numerous changes under the hood. Nowadays it uses [Web-LLM](https://github.com/mlc-ai/web-llm), [Wllama](https://github.com/ngxson/wllama) and [SearXNG](https://github.com/searxng/searxng). I use it daily as my default search engine and have done my best to make it useful. I hope it's interesting for you too!

HF Space: Felladrin/MiniSearch
Embeddable URL: https://felladrin-minisearch.hf.space

1 reply

reacted to Wauplin's post with 🔥 5 months ago

Post

4677

🚀 Exciting News! 🚀

We've just released 𝚑𝚞𝚐𝚐𝚒𝚗𝚐𝚏𝚊𝚌𝚎_𝚑𝚞𝚋 v0.25.0 and it's packed with powerful new features and improvements!

✨ 𝗧𝗼𝗽 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:

• 📁 𝗨𝗽𝗹𝗼𝗮𝗱 𝗹𝗮𝗿𝗴𝗲 𝗳𝗼𝗹𝗱𝗲𝗿𝘀 with ease using huggingface-cli upload-large-folder. Designed for your massive models and datasets. Much recommended if you struggle to upload your Llama 70B fine-tuned model 🤡
• 🔎 𝗦𝗲𝗮𝗿𝗰𝗵 𝗔𝗣𝗜: new search filters (gated status, inference status) and fetch trending score.
• ⚡𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲𝗖𝗹𝗶𝗲𝗻𝘁: major improvements simplifying chat completions and handling async tasks better.

We’ve also introduced tons of bug fixes and quality-of-life improvements - thanks to the awesome contributions from our community! 💪

💡 Check out the release notes: Wauplin/huggingface_hub#8

Want to try it out? Install the release with:

pip install huggingface_hub==0.25.0

1 reply

reacted to do-me's post with 🚀 5 months ago

Post

3377

SemanticFinder now supports WebGPU thanks to @Xenova 's efforts with transformers.js v3!
Expect massive performance gains: inference on a whole book with 46k chunks took under 5 minutes. If your device doesn't support #WebGPU, use the classic Wasm-based version:
- WebGPU: https://do-me.github.io/SemanticFinder/webgpu/
- Wasm: https://do-me.github.io/SemanticFinder/

WebGPU harnesses the full power of your hardware, no longer being restricted to just the CPU. The speedup is significant (4-60x) for all kinds of devices: consumer-grade laptops, heavy Nvidia GPU setups or Apple Silicon. Measure the difference for your device here: Xenova/webgpu-embedding-benchmark
Chrome currently works out of the box, Firefox requires some tweaking.

WebGPU + transformers.js makes it possible to build amazing applications and make them accessible to everyone. E.g. SemanticFinder could become a simple GUI for populating your (vector) DB of choice. See the pre-indexed community texts here: do-me/SemanticFinder
Happy to hear your ideas!

1 reply

reacted to fantos's post with 🚀❤️ 6 months ago

Post

2543

1. **Overview**
"EveryText" is at the forefront of AI image generation, offering a novel "TBF ('Text by Font') Image Model" that enables the representation of all languages globally in AI-generated images without prior training.

2. **Background**
Platforms like MidJourneyV6 and FLUX have advanced AI image generation, typically supporting English text. Alibaba Group expanded this to include Chinese, Japanese, and Korean, signaling a shift towards global language support.

3. **Challenges**
Existing methods faced several challenges including the need for additional editing, dependency on specific training, and substantial resource requirements. These approaches also struggled with limited vocabulary and were primarily effective only for English.

4. **Innovative Solution**
EveryText utilizes "Fonts" as pre-trained models, allowing any text to be visually represented without traditional training. This approach not only enhances diversity and aesthetics by utilizing various fonts but also ensures unlimited expression.

5. **Using the Service**
EveryText is free and easy to use:
- **Prompt**: Describe the image.
- **Text for Image Generation**: Add your text.
- **Text Position and Size**: Customize the text's placement and size.
- **Font Selection**: Optionally select a font.
- **Advanced Settings**: Further refine the image creation.
- Click "START" to generate the image.

6. **Comparative Analysis**
EveryText supports all languages with superior image quality and text legibility, setting it apart from platforms like MidJourneyV6/Flux and AnyText by Alibaba Group.

7. **Conclusion**
EveryText has revolutionized AI-generated imagery by integrating all global languages, broadening the scope for creative and communicative applications. Its future potential is vast and promising.

**Related Links**
- Huggingface Service: https://fantos-EveryText.hf.space
- Email: [email protected]

reacted to maximuspowers's post with 👀 6 months ago

Post

2537

Here's my favorite piece of the summer bias detection research project (paper coming in Sept). We trained BERT for token classification (multi-label), to identify:
- Generalizations
- Unfairness
- Stereotypes

HF Space: maximuspowers/bias-detection-ner
Article on Training: https://huggingface.co/blog/maximuspowers/bias-entity-recognition

Pls reach out with ideas!! Lots more info coming soon; our research group has workshops and a hackathon planned for launching this open source project. Thanks
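For readers new to multi-label token classification: unlike standard NER, each token gets an independent yes/no decision per label, so one token can carry several labels or none. The sketch below shows only that decoding step, assuming per-token logits are already available. The label strings, the 0.5 threshold, and the function names are illustrative assumptions, not taken from the project.

```python
import math

# Hypothetical label names, in the order the classifier head is assumed to emit them.
LABELS = ["generalization", "unfairness", "stereotype"]


def sigmoid(z: float) -> float:
    """Logistic function, applied independently to each label's logit."""
    return 1.0 / (1.0 + math.exp(-z))


def decode(token_logits, threshold: float = 0.5):
    """Map per-token logits (one row per token, one column per label) to label lists."""
    return [
        [label for label, z in zip(LABELS, row) if sigmoid(z) >= threshold]
        for row in token_logits
    ]


if __name__ == "__main__":
    logits = [
        [2.0, -3.0, 1.0],     # token tagged with two labels at once
        [-4.0, -4.0, -4.0],   # token tagged with no label
    ]
    print(decode(logits))
```

A softmax over the labels would force exactly one label per token; the independent sigmoids are what make the setup multi-label.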

reacted to davidberenstein1957's post with ❤️ 6 months ago

Post

1768

📣 Introducing Dataset Viber: your chill repo for data collection, annotation and vibe checks! 🎉

I've cooked up Dataset Viber, a set of cool tools designed to make data preparation for AI models easier, more approachable and enjoyable for standalone AI engineers and enthusiasts.

🔧 What Dataset Viber offers:
- CollectorInterface: Lazily collect model interaction data without human annotation
- AnnotatorInterface: Annotate your data with models in the loop
- BulkInterface: Explore data distribution and annotate in bulk
- Embedder: Efficiently embed data with ONNX-optimized speeds

🎯 Key features:
- Supports various tasks for text, chat, and image modalities
- Runs in .ipynb notebooks
- Logs data to local CSV or directly to Hugging Face Hub
- Easy to install via pip: pip install dataset-viber

It's not designed for team collaboration or production use, but rather as a fun and efficient toolkit for individual projects.

Want to give it a try? Check out the repository link https://github.com/davidberenstein1957/dataset-viber/.

I'm excited to hear your feedback and learn how you vibe with your data. Feel free to open an issue or reach out if you have any questions or suggestions!

Some shoutouts:
- Gradio for the amazing backbone
- Daniel van Strien for some initial presentations I did on vibe checks
- Emily Omier for the workshop on structuring GitHub repo READMEs
- Hamel Husain for repeatedly mentioning that people should look at their data.
- Philipp Schmid for his code for ONNX feature-extractors
- Ben Burtenshaw for the first PR

1 reply

reacted to Jaward's post with 🔥 6 months ago

Post

1780

PyTorch implementation of the Self-Compression & Differentiable Quantization Algorithm introduced in the “Self-Compressing Neural Networks” paper.

The algorithm demonstrates dynamic neural network compression during training, reducing the size of weight and activation tensors and the number of bits required to represent weights.

It’s basically shrinking the neural network size (weights and activations) as it’s being trained without compromising performance - this helps reduce compute and inference cost.

Code: https://github.com/Jaykef/ai-algorithms
Paper: https://arxiv.org/pdf/2301.13142
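To make "bits required to represent weights" concrete, here is a forward-pass sketch of a fixed-point quantizer parameterised by a bit depth and an exponent. Treat it as an illustration of the general idea rather than the paper's or the repository's exact formulation: the function name, the integer bit depth, and the use of Python's built-in rounding are all assumptions.

```python
def quantize(x: float, bits: int, exponent: int) -> float:
    """Quantize x to a signed fixed-point value with `bits` bits and step 2**exponent.

    Assumes bits >= 1. Values outside the representable range are clamped.
    """
    scale = 2.0 ** exponent
    lo = -(2 ** (bits - 1))        # most negative integer code
    hi = 2 ** (bits - 1) - 1       # most positive integer code
    clamped = min(max(x / scale, lo), hi)
    # Python's round() rounds exact halves to the nearest even integer.
    return scale * round(clamped)


if __name__ == "__main__":
    for value in (0.3, 10.0, -10.0):
        print(value, "->", quantize(value, bits=4, exponent=-2))
```

Fewer bits means a narrower range of codes, which is where the size reduction comes from. During training, the rounding step has zero gradient almost everywhere, so a gradient workaround such as a straight-through estimator would be needed; that part is omitted here.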

reacted to davanstrien's post with ❤️ 6 months ago

Post

3160

Is your summer reading list still empty? Curious if an LLM can generate a book blurb you'd enjoy and help build a KTO preference dataset at the same time?

A demo using Hugging Face Spaces and Gradio to collect LLM output preferences: davanstrien/would-you-read-it

1 reply

reacted to fdaudens's post with 🚀 7 months ago

Post

2675

Exciting news for audio AI enthusiasts! 🎙️🌍

The Emilia dataset dropped last week, and it's a cool one:
- 101k+ hours of high-quality audio
- 6 languages: 🇨🇳 🇺🇸 🇯🇵 🇰🇷 🇩🇪 🇫🇷
- Diverse content: talk shows, interviews, debates, sports commentary, audiobooks

This dataset could improve multilingual speech generation and recognition. Opens up many possibilities for global media, language learning, and accessibility!

Explore it: amphion/Emilia

#AIAudio

reacted to lamhieu's post with 😔 7 months ago

Post

4284

🎉 The Ghost 8B Beta model outperforms prominent models such as Llama 3 8B Instruct, GPT 3.5 Turbo in the lc_winrate score. In addition, it also outperforms Claude 3 Opus, Claude 3 Sonnet, GPT-4, and Mistral Large when comparing the winrate score of AlpacaEval 2.0.

Ghost 8B Beta is a large language model developed with goals that include excellent multilingual support, superior knowledge capabilities, and cost-effectiveness. The model comes in two context length versions, 8k and 128k, along with multilingual function tools support by default.
The languages supported are 🇺🇸 English, 🇫🇷 French, 🇮🇹 Italian, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean and 🇨🇳 Chinese.

Explore the Potential:
To learn more about this groundbreaking language model, visit the official website or explore the online demo platforms:
- Ghost 8B Beta (β, 8k) on Spaces: lamhieu/ghost-8b-beta-8k.
- Ghost 8B Beta (β, 128k) on Spaces: lamhieu/ghost-8b-beta-128k
- Official website: https://ghost-x.org/docs/models/ghost-8b-beta

44 replies

reacted to m-ric's post with 👍 8 months ago

Post

3137

💰 𝗚𝗲𝘁 𝘁𝗵𝗲 𝗽𝗿𝗶𝗰𝗲 𝗼𝗳 𝗮𝗻𝘆 𝗟𝗟𝗠 𝗔𝗣𝗜 𝗿𝗲𝗾𝘂𝗲𝘀𝘁 ⇒ 𝘁𝗼𝗸𝗲𝗻𝗰𝗼𝘀𝘁

I've just found out about 𝙰𝚐𝚎𝚗𝚝𝙾𝚙𝚜-𝙰𝙸/𝚝𝚘𝚔𝚎𝚗𝚌𝚘𝚜𝚝 (https://github.com/AgentOps-AI/tokencost).
𝗧𝗵𝗶𝘀 𝗹𝗶𝗯𝗿𝗮𝗿𝘆 𝗴𝗶𝘃𝗲𝘀 𝘆𝗼𝘂 𝘁𝗵𝗲 𝗽𝗿𝗶𝗰𝗲 𝗼𝗳 𝘆𝗼𝘂𝗿 𝗰𝗮𝗹𝗹𝘀 𝘁𝗼 𝗮𝗻𝘆 𝗟𝗟𝗠 𝗔𝗣𝗜: OpenAI, Anthropic, Mistral, AWS or Databricks...

For any model, you can use as input either string prompts or messages, and get as outputs either the price or token count.

Congrats to the AgentOps-AI team: this will be very useful when trying to get a ballpark estimate of a project's price, to compare APIs, or for precise monitoring of usage!

✨ Daily reminder: 𝗿𝘂𝗻𝗻𝗶𝗻𝗴 𝗮𝗻 𝗔𝟭𝟬𝟬 𝗰𝗼𝘀𝘁𝘀 𝘆𝗼𝘂 𝗲𝘅𝗮𝗰𝘁𝗹𝘆 $𝟬.𝟬𝟬/𝗵𝗼𝘂𝗿 (or 0.00€ in current exchange rates) on a HF space with ZeroGPU!
Learn more on ZeroGPU 👉 https://www.datacenterdynamics.com/en/news/hugging-face-launches-zerogpu-project-to-democratize-ai-gives-away-10-million-worth-of-compute/

5 replies

reacted to loubnabnl's post with 🔥 9 months ago

Post

5870

🍷 FineWeb technical report is out and so is 📚 FineWeb-Edu, a 1.3-trillion-token dataset that outperforms all other open web datasets, with remarkable improvements on educational benchmarks such as MMLU, ARC, and OpenBookQA.

Technical report: HuggingFaceFW/blogpost-fineweb-v1
Dataset: HuggingFaceFW/fineweb-edu

We used Llama 3 generations to train an educational quality classifier, filtering the 15 trillion tokens of FineWeb to select only those with high educational value (an approach also used in Llama 3 and Phi-3 training datasets). We're releasing both FineWeb-Edu and the classifier, along with a larger, less heavily filtered version containing 5.4 trillion tokens.

You can find more details about the dataset and the experiments we ran in the FineWeb technical report. It's a 45-minute read but it contains all the secret sauce for building high quality web datasets.

Enjoy!

replied to their post 9 months ago

Hi, just read it. Its merging method with calibration looks interesting. I don't see that their method without calibration has a significant benefit over previous methods.

reacted to mmhamdy's post with ❤️ 10 months ago

Post

1777

⌚ Visiting the past with Time Machine GPT!

We are all familiar with the concept of a suite of models being a series of variants of a certain model that differ mainly in size. For example, Llama-2 7B, Llama-2 13B, and Llama-2 70B.

But this is not always the case. Researchers from The University of Oxford, The Alan Turing Institute, and The University of Manchester introduced TimeMachineGPT (TiMaGPT), a suite of language models that were pretrained on data constrained by a certain period in time. Instead of various sizes of the model, you get the same model but trained on different data coming from different times.

Using a GPT-2 model architecture with 117 million parameters, they trained 12 different models on Wikipedia and WMT News from 2011 to 2022 with each year represented by a model. For example, TiMaGPT-2011, TiMaGPT-2012, ..., TiMaGPT-2022.

🤔 But how could these models be useful?

They can be very useful. For example:

1️⃣ Most language models are static in the sense that they are trapped in the time bubble of their pretraining data: their knowledge is limited by the cut-off date of their training dataset. In order to update their knowledge, Temporal Adaptation can be performed, which means further training on newer data. The TiMaGPT series of models can be used to study the limitations of Temporal Adaptation of language models.

2️⃣ Word meaning can change not only with its context but also with its time of use and there is a large amount of research that focuses on understanding how embeddings shift through time. TiMaGPT will be very helpful in studying this phenomenon.

3️⃣ One more use case, in the context of time-series forecasting and event prediction, is "backtesting": using historical data to evaluate new models for forecasting the future. Models like TiMaGPT (each living in its own time without any knowledge of the future/present) will be great for such a use case.

🤗 All models and datasets are on the hub: https://huggingface.co/Ti-Ma

1 reply

reacted to m-ric's post with 🤯👀 10 months ago

Post

2801

💰❌ 𝐑𝐞𝐬𝐞𝐚𝐫𝐜𝐡 𝐟𝐨𝐫 𝐭𝐡𝐞 𝐯𝐞𝐫𝐲 𝐆𝐏𝐔 𝐏𝐨𝐨𝐫 - 𝐒𝐜𝐚𝐥𝐢𝐧𝐠 𝐥𝐚𝐰𝐬 𝐫𝐞𝐩𝐥𝐢𝐜𝐚𝐭𝐢𝐨𝐧

🎆 Good news: 𝘆𝗼𝘂 𝗰𝗮𝗻 𝗱𝗼 𝗰𝘂𝘁𝘁𝗶𝗻𝗴-𝗲𝗱𝗴𝗲 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝘄𝗶𝘁𝗵 𝗮 𝗰𝗮𝗹𝗰𝘂𝗹𝗮𝘁𝗼𝗿 𝗮𝗻𝗱 𝗠𝗶𝗰𝗿𝗼𝘀𝗼𝗳𝘁 𝗣𝗮𝗶𝗻𝘁 𝟮𝟬𝟬𝟲!

The Chinchilla experiments (by Google DeepMind) ran hundreds of pre-trainings with models >1B parameters (I do not want to imagine how much that cost) to 𝗳𝗶𝗻𝗱 𝘁𝗵𝗲 𝗼𝗽𝘁𝗶𝗺𝗮𝗹 𝗿𝗮𝘁𝗶𝗼 𝗼𝗳 𝗺𝗼𝗱𝗲𝗹 𝘀𝗶𝘇𝗲 𝘃𝘀 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝘁𝗼𝗸𝗲𝗻𝘀. Why is this question so important?
Well, you only ever have access to a fixed compute budget, counted in FLOPs (floating point operations). So if your model is bigger, you will have less compute left to train on many tokens, and if you want to train on more tokens, your model will have to be smaller. When model trainings cost millions, you absolutely need to get this right.

The new paper "Chinchilla Scaling: A replication attempt" by Epoch AI sets out on the ambitious goal of reproducing this.

But since the authors do not have infinite money, they decided to run their computations directly on the data from DeepMind's own experiments! They took the figure from the last experiment (cf slide below), measured point positions, picked color codes, and ended up reconstructing the underlying data.

💥 They then just fit the scaling laws proposed by the Chinchilla authors, but arrived at wildly different results! They find that as a rough rule of thumb, you should use 20 training tokens for each parameter in your model, instead of the 70 obtained in the original paper. They also point out inconsistencies in the paper, and unrealistically narrow confidence intervals.

➡️ This only contradicts the results from the last (out of 3) experiments in the Chinchilla paper. And the model trained at the end of the Chinchilla paper still seems properly scaled.

✅ But it does show that a tiny bit more theoretical work can go a long way, especially given the huge financial costs that such an error can have!
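The compute trade-off described above can be made concrete with a few lines of arithmetic. This sketch assumes the common approximation that training costs about 6 FLOPs per parameter per token (C ≈ 6·N·D), which is not stated in the post, and the function name is made up for illustration. Given a budget C and a tokens-per-parameter ratio r, it solves D = r·N for the model size.

```python
import math


def allocate(compute_flops: float, tokens_per_param: float):
    """Split a FLOP budget into (parameters, tokens), assuming C = 6 * N * D and D = r * N."""
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    budget = 1.2e20  # arbitrary example budget in FLOPs
    for ratio in (20, 70):
        n, d = allocate(budget, ratio)
        print(f"ratio {ratio}: {n:.3g} parameters, {d:.3g} tokens")
```

At the same budget, a ratio of 70 prescribes a model roughly half the size of the one a ratio of 20 prescribes (a factor of √(20/70)), which is why the disagreement between the two fits matters in practice.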
