Csaba Kecskemeti PRO

csabakecskemeti

AI & ML interests

None yet

Recent Activity

Organizations

Zillow · DevQuasar · Hugging Face Party @ PyTorch Conference · open/ acc

csabakecskemeti's activity

reacted to MoritzLaurer's post with 👍 4 days ago
Quite excited by the ModernBERT release! Small at 0.15B/0.4B, 2T tokens of modern pre-training data (including code) with a modern tokenizer, an 8k context window: a great, efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTaV3 from 2021 :D

Congrats @answerdotai , @LightOnIO and collaborators like @tomaarsen !

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
replied to luigi12345's post 5 days ago
posted an update 7 days ago
reacted to cutechicken's post with ❤️ 7 days ago
🚀 RAGOndevice: High-Performance Local AI Document Analysis Assistant
💫 Core Value
RAGOndevice is a high-performance AI system running locally without cloud dependency. Using CohereForAI's optimized 7B model, it enables professional-grade document analysis on standard PCs. ✨
🌟 Ondevice AI Advantages
1. 🔋 Efficient Resource Utilization

🎯 Optimized 7B Model: Runs on standard PCs
⚡ Local Processing: Instant response without cloud
💻 Low-Spec Compatible: Performs well on regular GPUs
🔄 Optimized Memory: Ensures stable operation

2. 🛡️ Data Security & Cost Efficiency

🔒 Complete Privacy: No external data transmission
🌐 Offline Operation: No internet required
💰 No Subscription: One-time installation
⚙️ Resource Optimization: Uses existing hardware

🎮 Key Features
1. 📊 Powerful Document Analysis

📁 Multi-Format Support: TXT, CSV, PDF, Parquet
🧠 Intelligent Analysis: Automatic structure recognition
👁️ OCR Support: Advanced PDF text extraction
💬 Real-time Chat: Natural language interaction

2. 🔍 Local RAG System

🎯 Efficient Search: TF-IDF based local search
🧩 Context Understanding: Accurate information retrieval
📚 Wikipedia Integration: Rich background knowledge
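The TF-IDF based local search mentioned above can be sketched in a few lines of standard-library Python. This is purely illustrative of the technique; RAGOndevice's actual retrieval code is not shown in the post and may differ.

```python
import math
from collections import Counter

def tfidf_search(query, docs):
    """Rank docs against a query by cosine similarity of TF-IDF vectors.

    Minimal stdlib sketch of TF-IDF retrieval as used conceptually in
    local RAG; a real system would add stemming, stopwords, caching, etc.
    """
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    # document frequency per term, then smoothed inverse document frequency
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}

    def vec(tokens):
        tf = Counter(tokens)
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    qv = vec(query.lower().split())
    scores = [(cosine(qv, vec(toks)), d) for toks, d in zip(tokenized, docs)]
    return sorted(scores, reverse=True)

docs = [
    "tanks need fuel and ammunition",
    "local rag systems retrieve documents without cloud access",
    "wikipedia offers rich background knowledge",
]
best = tfidf_search("local rag documents", docs)[0][1]
```

Because everything here is plain Python, the search runs fully offline, which is exactly the point of a local RAG pipeline.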

🎯 Use Cases

🏢 Enterprise: Secure confidential document processing
🔬 Personal Research: Private data analysis
📚 Education: Personal learning material analysis
💻 Development: Local codebase analysis

⭐ Differentiators

🏃‍♂️ Independent Operation: Zero cloud dependency
⚡ Instant Response: No network latency
🔐 Complete Security: Full data control
💎 Cost Efficiency: No ongoing costs

🔮 Future Plans

🚀 Enhanced model optimization
📚 Local knowledge base expansion
⚡ Hardware optimization
📝 Extended file support

🌟 RAGOndevice democratizes high-performance AI, providing the optimal local AI solution for security-sensitive environments. 🚀

🔥 Power of Local AI: Experience enterprise-grade AI capabilities right on your device!

VIDraft/RAGOndevice
reacted to nicolay-r's post with 👀 10 days ago
📢 For those who want a quick start with reasoning / CoT applications over rows of tabular data, with minimal dependencies, this post will be valuable.

🔎 I found that the problem is that sending a bulk of Chain-of-Thought (CoT) 🔗 queries to a remotely accessed LLM 🤖 (like openrouter / Replicate / OpenAI) might result in connection loss, which can raise an exception 💥 and make restoring the generated content challenging.

This is where I contribute with bulk-chain.
⭐ https://github.com/nicolay-r/bulk-chain

Currently working on the 0.24.3 version, in which I am happy to announce an API for developing apps based on CoT schema declaration in JSON (details in attached images 📸)

All you have to do is:
✅ 1. Declare the CoT schema in JSON
✅ 2. Declare the model or use a preset
✅ 3. Launch the code

One example uses the ReplicateIO provider:
https://github.com/nicolay-r/bulk-chain/blob/master/ext/replicate.py

Each model wraps its inference call in a try-catch block.
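The try-catch wrapping described above, which keeps a long bulk of CoT queries alive across dropped connections, boils down to a retry loop around the remote call. The sketch below illustrates that idea only; it is not bulk-chain's actual API, and the flaky provider is a stub.

```python
import time

def with_retries(call, attempts=3, delay=0.0):
    """Wrap a remote LLM call so one dropped connection doesn't lose the bulk.

    Illustrative sketch of the try/except wrapping the post describes;
    bulk-chain's real inference wrapper may differ.
    """
    def wrapped(prompt):
        last_err = None
        for _ in range(attempts):
            try:
                return call(prompt)
            except ConnectionError as err:  # connection loss mid-bulk
                last_err = err
                time.sleep(delay)
        raise last_err
    return wrapped

# Stub flaky provider: fails twice, then succeeds.
state = {"calls": 0}
def flaky_llm(prompt):
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("connection reset")
    return f"answer to: {prompt}"

safe_llm = with_retries(flaky_llm)
result = safe_llm("step 1 of the CoT schema")
```

With a wrapper like this, already-generated rows survive a transient network failure instead of being lost with the exception.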
posted an update 10 days ago
reacted to cutechicken's post with 🚀 10 days ago
🎮 Introduction to the World's First 3D Tank Game Created Solely with Generative AI 🚀
The advancement of AI technology is revolutionizing game development paradigms. I embarked on a challenge to create a 3D tank game using "only AI assistance," pushing the boundaries of what's possible in AI-driven game development. 🤖
Following the success of my first 2D tank game (cutechicken/tankwar) 🎯, I ventured into the more challenging realm of 3D FPS game development. Remarkably, using Hugging Face's AI tool (VIDraft/mouse1), the basic game framework was generated in just one minute ⚡. The 3D modeling (ginipick/SORA-3D) and sound effects (fantaxy/Sound-AI-SFX) were also easily created with AI assistance.
The resulting game (cutechicken/TankWar3D) represents arguably the world's first 3D FPS game created primarily with generative AI. 90% was accomplished through AI capabilities, with the remaining 10% comprising my post-processing work. 🎉
Key Technical Features: 🛠️

Complete 3D rendering system using Three.js 🖥️
Real-time physics-based collision detection and handling 💥
Dynamic shadow and lighting system ☀️
Real-time radar and enemy tracking system 🎯
Advanced particle effects system (explosions, smoke, fire) 💫
Dynamic sound system (engine, firing, explosion sounds) 🔊
AI-driven enemy strategy system (pursuit, evasion, combat) 🤖
Terrain-based tank tilt adjustment 🌍
Real-time crosshair targeting system 🎯
Dynamic UI system (health bars, ammo, score) 📊

Technical Implementation: ⚙️

Physics Engine: 🎳
Custom collision detection system
Dynamic obstacle handling
Real-time terrain interaction

AI Systems: 🧠
State-based AI behavior patterns
Dynamic pathfinding
Tactical decision-making system

Graphics: 🎨
PBR-based rendering
Dynamic particle system
Real-time shadow mapping
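Custom collision detection in a game like this typically reduces to axis-aligned bounding-box (AABB) tests. The game itself is Three.js/JavaScript; the sketch below just illustrates the concept in Python and is not taken from the game's code.

```python
from dataclasses import dataclass

@dataclass
class AABB:
    # axis-aligned bounding box: min/max corners in 3D
    min_x: float; min_y: float; min_z: float
    max_x: float; max_y: float; max_z: float

def collides(a: AABB, b: AABB) -> bool:
    """Two boxes overlap iff their intervals overlap on every axis."""
    return (a.min_x <= b.max_x and a.max_x >= b.min_x and
            a.min_y <= b.max_y and a.max_y >= b.min_y and
            a.min_z <= b.max_z and a.max_z >= b.min_z)

tank = AABB(0, 0, 0, 2, 1, 4)
obstacle = AABB(1.5, 0, 3, 3, 2, 6)    # clips the tank's corner
far_wall = AABB(10, 0, 10, 12, 2, 12)  # well out of reach
```

In Three.js the same test is available out of the box via `Box3.intersectsBox`, which is the usual starting point before adding finer-grained checks.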
reacted to AdinaY's post with 🔥 19 days ago
Sailor 2 🚢 open multilingual model for Southeast Asia by Sea AI Lab 🔥
https://huggingface.co/sailor2
sail/Sailor2-20B-Chat

✨ Fully open code & ALL datasets 🙌
✨ 1B/8B/20B base & chat, expanded on Qwen2.5
✨ Apache 2.0
✨ Supports 15 languages including English, Chinese, Burmese, Cebuano, Ilocano, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tagalog, Thai, Vietnamese, and Waray 🇬🇧🇨🇳🇱🇦🇲🇾🇲🇲🇻🇳🇹🇭
replied to their post 24 days ago

self eval results posted soon, partial:

Tasks     Version Filter n-shot Metric     Value       Stderr
hellaswag       1 none        0 acc      ↑ 0.5141 ± 0.0050
                  none        0 acc_norm ↑ 0.6792 ± 0.0047
reacted to julien-c's post with 👍 24 days ago
wow 😮

INTELLECT-1 is the first collaboratively trained 10-billion-parameter language model, trained from scratch on 1 trillion tokens of English text and code.

PrimeIntellect/INTELLECT-1-Instruct
posted an update 25 days ago
reacted to m-ric's post with ❤️ 25 days ago
Single most important thing to do today: go try QwQ on Hugging Chat!

👉 https://huggingface.co/chat/models/Qwen/QwQ-32B-Preview
posted an update 28 days ago
I have this small utility: no_more_typo
It runs in the background and can call an LLM to update the text on the clipboard, which makes it ideal for fixing typos and grammar.
I have just added the option to use custom prompt templates to perform different tasks.

Details, code and executable:
https://github.com/csabakecskemeti/no_more_typo

https://devquasar.com/no-more-typo/
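The custom-prompt-template idea can be sketched as below. This is illustrative only, not no_more_typo's actual code: the clipboard I/O and the LLM call are both stubbed out, and the template strings are made up for the example.

```python
DEFAULT_TEMPLATE = "Fix typos and grammar, reply with the corrected text only:\n{text}"

def process_clipboard(text, llm, template=DEFAULT_TEMPLATE):
    """Fill a prompt template with the clipboard text and send it to an LLM.

    Sketch of the idea only: a real tool would read/write the system
    clipboard and call a real model instead of the stub below.
    """
    return llm(template.format(text=text))

# Stub LLM that just echoes the prompt it was given.
echo_llm = lambda prompt: prompt

# Swapping the template swaps the task, which is the new feature described.
summarize = "Summarize in one sentence:\n{text}"
out = process_clipboard("teh quick brown fox", echo_llm, template=summarize)
```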
replied to their post about 1 month ago

Exactly :D
The overflow fridge will be replaced with a rack :)

reacted to cappuch's post with 😎 about 1 month ago
posted an update about 1 month ago
Repurposed my older AI workstation into a homelab server; it has received 2x V100 + 1x P40.
I can reach a huge 210k-token context size with MegaBeam-Mistral-7B-512k-GGUF at ~70+ tok/s, or run Llama-3.1-Nemotron-70B-Instruct-HF-GGUF with 50k context at ~10 tok/s (V100s only: 40k ctx and 15 tok/s).
Also able to LoRA-finetune with performance similar to an RTX 3090.
It moved to the garage so the family has no complaints about the noise. Will move to a rack soon :D
reacted to fdaudens's post with 😎 about 1 month ago
🚀 @Qwen just dropped 2.5-Turbo!

1M token context (that's the entire "War and Peace"!) + 4.3x faster processing speed. Same price, way more power 🔥

Check out the demo: Qwen/Qwen2.5-Turbo-1M-Demo

#QWEN
posted an update about 1 month ago
Some time ago, I built a predictive LLM router that routes chat requests between small and large LLM models based on prompt classification. It dynamically selects the most suitable model depending on the complexity of the user input, ensuring optimal performance while maintaining conversation context. I also fine-tuned a RoBERTa model to use with the package, but you can plug in any classifier of your choice.

Project's homepage:
https://devquasar.com/llm-predictive-router/
Pypi:
https://pypi.org/project/llm-predictive-router/
Model:
DevQuasar/roberta-prompt_classifier-v0.1
Training data:
DevQuasar/llm_router_dataset-synth
Git:
https://github.com/csabakecskemeti/llm_predictive_router_package

Feel free to check it out, and/or contribute.
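The routing idea (classify the prompt, then dispatch to a small or large model) can be sketched generically. Names, the threshold, and the toy classifier below are illustrative; the real package uses the fine-tuned RoBERTa prompt classifier linked above.

```python
def route(prompt, classify, small_llm, large_llm, threshold=0.5):
    """Send 'easy' prompts to the small model and the rest to the large one.

    `classify` returns the probability that the prompt is complex; in the
    real package this is a fine-tuned RoBERTa classifier, stubbed here.
    """
    if classify(prompt) < threshold:
        return "small", small_llm(prompt)
    return "large", large_llm(prompt)

# Toy classifier: treat long prompts as complex (purely for illustration).
toy_classifier = lambda p: min(len(p.split()) / 20.0, 1.0)
small = lambda p: f"small-model reply to: {p}"
large = lambda p: f"large-model reply to: {p}"

model, _ = route("hi", toy_classifier, small, large)
model2, _ = route(" ".join(["word"] * 30), toy_classifier, small, large)
```

The payoff of this pattern is cost: most chat traffic is simple and can be served by the cheap model, with the large model reserved for genuinely complex prompts.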
reacted to singhsidhukuldeep's post with 👍 about 1 month ago
Good folks at @nvidia and @Tsinghua_Uni have released LLAMA-MESH - A Revolutionary Approach to 3D Content Generation!

This innovative framework enables the direct generation of 3D meshes from natural language prompts while maintaining strong language capabilities.

Here is the Architecture & Implementation!

>> Core Components

Model Foundation
- If you haven't guessed it yet, it's built on the LLaMA-3.1-8B-Instruct base model
- Maintains original language capabilities while adding 3D generation
- Context length is set to 8,000 tokens

3D Representation Strategy
- Uses the OBJ file format for mesh representation
- Quantizes vertex coordinates into 64 discrete bins per axis
- Sorts vertices by z-y-x coordinates, from lowest to highest
- Sorts faces by the lowest vertex indices for consistency

Data Processing Pipeline
- Filters meshes to a maximum of 500 faces for computational efficiency
- Applies random rotations (0ยฐ, 90ยฐ, 180ยฐ, 270ยฐ) for data augmentation
- Generates ~125k mesh variations from 31k base meshes
- Uses Cap3D-generated captions for text descriptions
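The vertex preprocessing described above (64 discrete bins per axis, z-y-x sort from lowest to highest) can be sketched as follows. This is an illustrative reading of the paper's pipeline, assuming coordinates are already normalized to [0, 1].

```python
def quantize_and_sort(vertices, bins=64):
    """Quantize normalized (x, y, z) coords into discrete bins, sort z-y-x.

    Sketch of the preprocessing described for LLAMA-MESH; assumes each
    coordinate lies in [0, 1] so bin indices fall in 0..bins-1.
    """
    quantized = [
        tuple(min(int(c * bins), bins - 1) for c in v)  # clamp 1.0 into top bin
        for v in vertices
    ]
    # sort by z, then y, then x, lowest first
    return sorted(quantized, key=lambda v: (v[2], v[1], v[0]))

verts = [(0.99, 0.5, 0.0), (0.0, 0.0, 0.5), (0.2, 0.1, 0.0)]
ordered = quantize_and_sort(verts)
```

Quantizing and canonically ordering vertices shrinks the token vocabulary the language model has to handle and makes equivalent meshes serialize identically.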

>> Training Framework

Dataset Composition
- 40% Mesh Generation tasks
- 20% Mesh Understanding tasks
- 40% General Conversation (UltraChat dataset)
- 8x training turns for generation, 4x for understanding

Training Configuration
- Deployed on 32 A100 GPUs (for Nvidia, this is literally in-house)
- 21,000 training iterations
- Global batch size: 128
- AdamW optimizer with a 1e-5 learning rate
- 30-step warmup with cosine scheduling
- Total training time: approximately 3 days (based on the paper)

This research opens exciting possibilities for intuitive 3D content creation through natural language interaction. The future of digital design is conversational!
posted an update about 1 month ago
I've built a small open utility pip package called LLM-Forwarder that lets you inject context, such as a private RAG, into existing chat applications by routing the app's traffic through the LLM-Forwarder. In the forwarder server, you can configure custom code that re-processes chat messages and alters the user prompt, for example by adding extra context.

https://pypi.org/project/llm-forwarder/
More details
https://devquasar.com/llmforwarder/
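Conceptually, the forwarder sits between the chat app and the LLM and rewrites the user prompt before passing it on. The sketch below shows that re-processing hook in miniature; all names are illustrative and the retrieval and LLM are stubs, not the package's API.

```python
def make_forwarder(llm, preprocess):
    """Return a chat handler that rewrites messages before forwarding.

    `preprocess` stands in for the user-configured custom code; here it
    injects RAG-style context by prepending retrieved snippets.
    """
    def handle(messages):
        rewritten = list(messages)
        rewritten[-1] = preprocess(rewritten[-1])  # alter the last user prompt
        return llm(rewritten)
    return handle

def add_context(prompt, retrieve=lambda q: ["(private doc snippet)"]):
    # Hypothetical retrieval step; a real hook would query a vector store.
    context = "\n".join(retrieve(prompt))
    return f"Context:\n{context}\n\nQuestion: {prompt}"

# Stub LLM that echoes the final prompt it received.
echo = lambda msgs: msgs[-1]

chat = make_forwarder(echo, add_context)
reply = chat(["hello", "what does the private doc say?"])
```

Because the rewriting happens in the proxy, the chat application itself needs no changes to gain RAG context.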