
Crispin Almodovar

calmodovar

AI & ML interests

NLP, log anomaly detection, cyber intelligence

Recent Activity

reacted to openfree's post with 🚀 1 day ago
upvoted a collection 2 days ago
DeepSeek-R1

Organizations

logfit-project, test012345

calmodovar's activity

reacted to openfree's post with 🚀 1 day ago
🌟 MoneyRadar - AI-Powered Global News Analysis System

💻 Live Demo: https://huggingface.co/spaces/openfree/MoneyRadar

🎯 Core Features
1. 🤖 24/7 Automated News Scanning

Auto-collection of Top 100 trending news
Real-time monitoring across 60 countries
Smart filtering of investment-critical news

2. 🔍 Advanced Custom Search

Unlimited keyword search capability
Country/language-specific search options
Real-time trend-based related keywords

3. 🎨 Smart Analysis & Visualization

AI-powered sentiment analysis
Automated content summarization
Investment decision-supporting insights
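
To make the analysis step above concrete, here is a minimal, hypothetical sketch of the headline-scoring idea using an off-the-shelf transformers pipeline. The model choice and the sample headlines are assumptions for illustration; MoneyRadar's actual internals are not public.

```python
from transformers import pipeline

# Hypothetical stand-in for the "AI-powered sentiment analysis" step:
# score a batch of collected headlines with an off-the-shelf classifier.
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

headlines = [  # sample inputs, not real MoneyRadar output
    "NVIDIA beats earnings expectations on record data-center revenue",
    "Regulators open probe into major cryptocurrency exchange",
]
for headline, result in zip(headlines, sentiment(headlines)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {headline}")
```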

⚡ Automated Information Collection
Key Companies (NVIDIA, Apple, Tesla, etc.)

Earnings/Forecasts
Product/Technology announcements
Market share changes
M&A and major news

Financial Markets & Digital Assets

Macroeconomic indicators
Regulatory changes
Market sentiment analysis
Major exchange updates

📊 Business Applications

Real-time market trend tracking
Competitor movement monitoring
Early investment opportunity detection
Risk early warning system

🌟 Key Differentiators

Full Automation

Zero manual intervention
Real-time data updates
Automated result storage/management


User-Centric Design

Intuitive interface
Customizable alerts
Mobile optimization


Advanced Analytics

News cross-checking
Historical tracking
Trend prediction support



Join Community 💬
"With MoneyRadar, never miss a beat in global market movements!"
reacted to merve's post with 🔥 1 day ago
Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM, the tiniest VLMs yet, coming in 256M and 500M, with its ColSmol retrieval models for multimodal RAG 💗
- UI-TARS is a new model family by ByteDance to unlock agentic GUI control 🤯, in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released MiniMax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark
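
As a quick taste of the smallest of these releases, below is a hedged quick-start for SmolVLM-256M through the standard transformers image-text-to-text interface. The checkpoint id, the local image path, and the exact preprocessing calls are assumptions; check the model card before relying on them.

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image path
messages = [{"role": "user",
             "content": [{"type": "image"},
                         {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```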

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, plus six distilled dense models, on par with o1, with an MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris, similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images
reacted to merve's post with 🔥 25 days ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face to make the LLM write code, do analysis and automate boring stuff!

Here's our blog for you to get started https://huggingface.co/blog/smolagents
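
A minimal sketch of what that looks like, assuming the release-era API names (CodeAgent, HfApiModel, DuckDuckGoSearchTool; verify them against your installed smolagents version):

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A code agent writes and executes Python to solve the task,
# optionally calling the tools it is given.
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # let the agent search the web
    model=HfApiModel(),              # hosted inference model (default)
)

print(agent.run("How many seconds are there in a leap year?"))
```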
reacted to MoritzLaurer's post with 👍 about 1 month ago
Quite excited by the ModernBERT release! 0.15B/0.4B small, 2T tokens of modern pre-training data with a tokenizer that covers code, an 8k context window: a great, efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTaV3 from 2021 :D

Congrats @answerdotai , @LightOnIO and collaborators like @tomaarsen !

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
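
For a feel of the embeddings use case, here is a hedged sketch of mean-pooled sentence embeddings with ModernBERT via the standard AutoModel API (requires a transformers version with ModernBERT support; the pooling choice is an assumption, not an official recipe):

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

texts = ["ModernBERT handles an 8k context window.",
         "DeBERTaV3 was released in 2021."]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state      # (batch, tokens, hidden)

mask = batch["attention_mask"].unsqueeze(-1)       # ignore padding tokens
embeddings = (hidden * mask).sum(1) / mask.sum(1)  # mean pooling
print(torch.cosine_similarity(embeddings[0], embeddings[1], dim=0))
```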
upvoted an article 3 months ago

Visually Multilingual: Introducing mcdse-2b

By marco
reacted to singhsidhukuldeep's post with 👀 3 months ago
While Google's Transformer might have introduced "Attention is all you need," Microsoft and Tsinghua University are here with the DIFF Transformer, stating, "Sparse-Attention is all you need."

The DIFF Transformer outperforms traditional Transformers in scaling properties, requiring only about 65% of the model size or training tokens to achieve comparable performance.

The secret sauce? A differential attention mechanism that amplifies focus on relevant context while canceling out noise, leading to sparser and more effective attention patterns.

How?
- It uses two separate softmax attention maps and subtracts them.
- It employs a learnable scalar λ for balancing the attention maps.
- It implements GroupNorm for each attention head independently.
- It is compatible with FlashAttention for efficient computation.
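
Putting those four points together, here is a rough, hypothetical PyTorch sketch of a single differential-attention layer. It is not the authors' implementation: the paper additionally re-parameterizes λ per layer and rescales the normalized output, both omitted here for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentialAttention(nn.Module):
    """Simplified differential attention: two softmax maps, subtracted."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        # Double-width Q/K projections yield two attention maps per head.
        self.q_proj = nn.Linear(d_model, 2 * d_model, bias=False)
        self.k_proj = nn.Linear(d_model, 2 * d_model, bias=False)
        self.v_proj = nn.Linear(d_model, d_model, bias=False)
        self.out_proj = nn.Linear(d_model, d_model, bias=False)
        self.lam = nn.Parameter(torch.tensor(0.5))  # learnable balance scalar
        # One GroupNorm group per head = independent per-head normalization.
        self.norm = nn.GroupNorm(n_heads, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape

        def heads(t, mult):  # (B, T, mult*d_model) -> (B, heads, T, mult*d_head)
            return t.view(B, T, self.n_heads, mult * self.d_head).transpose(1, 2)

        q1, q2 = heads(self.q_proj(x), 2).chunk(2, dim=-1)
        k1, k2 = heads(self.k_proj(x), 2).chunk(2, dim=-1)
        v = heads(self.v_proj(x), 1)

        scale = self.d_head ** -0.5
        a1 = F.softmax(q1 @ k1.transpose(-2, -1) * scale, dim=-1)
        a2 = F.softmax(q2 @ k2.transpose(-2, -1) * scale, dim=-1)
        attn = a1 - self.lam * a2  # noise common to both maps cancels out

        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        out = self.norm(out.transpose(1, 2)).transpose(1, 2)  # per-head norm
        return self.out_proj(out)
```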

What do you get?
- Superior long-context modeling (up to 64K tokens).
- Enhanced key information retrieval.
- Reduced hallucination in question-answering and summarization tasks.
- More robust in-context learning, less affected by prompt order.
- Mitigation of activation outliers, opening doors for efficient quantization.

Extensive experiments show DIFF Transformer's advantages across various tasks and model sizes, from 830M to 13.1B parameters.

This innovative architecture could be a game-changer for the next generation of LLMs. What are your thoughts on DIFF Transformer's potential impact?
upvoted an article 6 months ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By mlabonne
upvoted an article 8 months ago

Multimodal Augmentation for Documents: Recovering "Comprehension" in "Reading and Comprehension" task

By danaaubakirova