Dokyoon

leeloolee

AI & ML interests

ai

Recent Activity

Organizations

sionic-ai, MultišŸ¤–Transformers, ģøģŠ¤ķŠøėŸ­ķŠø.ķ•œźµ­, AI Safeguard

leeloolee's activity

published a model 20 days ago
reacted to mitkox's post with šŸ‘€ 24 days ago
Training a model to reason in the continuous latent space, based on Meta's Coconut.
If it all works, I'll apply it to the MiniCPM-o SVD-LR.
The endgame is a multimodal, adaptive, and efficient foundational on-device AI model.
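
Coconut's core idea can be caricatured in a few lines: instead of decoding a discrete token and re-embedding it at each reasoning step, the model's last hidden state is fed straight back in as the next input. A toy sketch with a random linear "model" (illustrative only, not Meta's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "transformer step": one nonlinear recurrence standing in for a decoder pass.
W_h = rng.normal(scale=0.1, size=(8, 8))   # hidden -> hidden
W_out = rng.normal(size=(8, 16))           # hidden -> vocab logits
E = rng.normal(size=(16, 8))               # token embedding table

def step(h):
    """One forward pass returning the new hidden state."""
    return np.tanh(h @ W_h)

h = rng.normal(size=8)

# Ordinary chain-of-thought: decode a discrete token, re-embed it, continue.
tok = int(np.argmax(step(h) @ W_out))
h_discrete = step(E[tok])

# Coconut-style "continuous thought": skip the vocabulary entirely and
# feed the hidden state straight back in as the next input.
h_latent = h
for _ in range(3):   # three latent reasoning steps, no tokens decoded
    h_latent = step(h_latent)

print(h_latent.shape)
```

The interesting property is that the latent loop never quantizes to a token, so no information is lost between reasoning steps.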
upvoted an article about 1 month ago
reacted to singhsidhukuldeep's post with šŸ‘€ about 1 month ago
Exciting breakthrough in e-commerce recommendation systems!
Walmart Global Tech researchers have developed a novel Triple Modality Fusion (TMF) framework that revolutionizes how we make product recommendations.

>> Key Innovation
The framework ingeniously combines three distinct data types:
- Visual data to capture product aesthetics and context
- Textual information for detailed product features
- Graph data to understand complex user-item relationships

>> Technical Architecture
The system leverages a Large Language Model (Llama2-7B) as its backbone and introduces several sophisticated components:

Modality Fusion Module
- All-Modality Self-Attention (AMSA) for unified representation
- Cross-Modality Attention (CMA) mechanism for deep feature integration
- Custom FFN adapters to align different modality embeddings

Advanced Training Strategy
- Curriculum learning approach with three complexity levels
- Parameter-Efficient Fine-Tuning using LoRA
- Special token system for behavior and item representation
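
The post doesn't give the exact equations, but cross-modality attention generally means queries from one modality attending over keys and values from another. A minimal single-head sketch with made-up shapes and no learned projections (an illustration of the mechanism, not the TMF implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats, d):
    """Queries from one modality attend over keys/values from another.

    Shapes: q_feats (n_q, d), kv_feats (n_kv, d).
    """
    scores = q_feats @ kv_feats.T / np.sqrt(d)
    return softmax(scores) @ kv_feats

rng = np.random.default_rng(0)
d = 64
text = rng.normal(size=(5, d))    # textual item features
image = rng.normal(size=(9, d))   # visual patch features

# Text tokens enriched with visual context; swapping the arguments
# would attend in the other direction.
fused = cross_attention(text, image, d)
print(fused.shape)
```

In a real fusion module each modality would first pass through its own projection (the "FFN adapters" mentioned above) so the embedding spaces line up before attention.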

>> Real-World Impact
The results are remarkable:
- 38.25% improvement in Electronics recommendations
- 43.09% boost in Sports category accuracy
- Significantly higher human evaluation scores compared to traditional methods

Currently deployed in Walmart's production environment, this research demonstrates how combining multiple data modalities with advanced LLM architectures can dramatically improve recommendation accuracy and user satisfaction.
reacted to m-ric's post with šŸ‘ about 2 months ago
š‡š®š š š¢š§š  š…šššœšž š«šžš„šžššš¬šžš¬ šš¢šœšØš­š«šØš§, šš š¦š¢šœš«šØš¬šœšØš©š¢šœ š„š¢š› š­š”ššš­ š¬šØš„šÆšžš¬ š‹š‹šŒ š­š«ššš¢š§š¢š§š  šŸ’šƒ š©ššš«ššš„š„šžš„š¢š³ššš­š¢šØš§ šŸ„³

šŸ•°ļø Llama-3.1-405B took 39 million GPU-hours to train, i.e. about 4.5 thousand years.

šŸ‘“šŸ» If they had needed all this time, we would have GPU stories from the time of Pharaoh š“‚€: "Alas, Lord of Two Lands, the shipment of counting-stones arriving from Cathay was lost to pirates, this shall delay the building of your computing temple by many moons "

šŸ› ļø But instead, they just parallelized the training on 24k H100s, which made it take just a few months.
This required parallelizing across 4 dimensions: data, tensor, context, pipeline.
And it is infamously hard to do, making for bloated code repos that hold together only by magic.
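
The back-of-envelope arithmetic behind those numbers:

```python
gpu_hours = 39_000_000        # reported Llama-3.1-405B training compute
n_gpus = 24_000               # H100s used in parallel

# One GPU running alone: thousands of years.
sequential_years = gpu_hours / (24 * 365)

# 24k GPUs in parallel (ignoring efficiency losses): a couple of months.
wall_clock_days = gpu_hours / n_gpus / 24

print(round(sequential_years))   # roughly 4.5 thousand years
print(round(wall_clock_days))    # roughly two months
```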

šŸ¤ š—•š˜‚š˜ š—»š—¼š˜„ š˜„š—² š—±š—¼š—»'š˜ š—»š—²š—²š—± š—µš˜‚š—“š—² š—暝—²š—½š—¼š˜€ š—®š—»š˜†š—ŗš—¼š—暝—²! Instead of building mega-training codes, Hugging Face colleagues cooked in the other direction, towards tiny 4D parallelism libs. A team has built Nanotron, already widely used in industry.
And now a team releases Picotron, a radical approach to code 4D Parallelism in just a few hundred lines of code, a real engineering prowess, making it much easier to understand what's actually happening!

āš” š—œš˜'š˜€ š˜š—¶š—»š˜†, š˜†š—²š˜ š—½š—¼š˜„š—²š—暝—³š˜‚š—¹:
Counting in MFU (Model FLOPs Utilization, i.e. how much of the hardware's compute potential the model actually uses), this lib reaches ~50% on the SmolLM-1.7B model with 8 H100 GPUs, which is really close to what the huge libs reach. (Caution: the team is running further benchmarks to verify this.)
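
MFU itself is simple to compute: achieved training FLOP/s divided by the hardware's aggregate peak. A sketch using the common ~6 Ɨ params FLOPs-per-token estimate for a forward+backward pass; the throughput figure below is illustrative, not a measured number:

```python
def mfu(model_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
    """Model FLOPs Utilization: achieved training FLOP/s over peak hardware FLOP/s.

    Uses the standard ~6 * params FLOPs-per-token approximation for
    one forward+backward pass (attention FLOPs ignored for brevity).
    """
    achieved = 6 * model_params * tokens_per_sec
    return achieved / (n_gpus * peak_flops_per_gpu)

# Illustrative numbers: SmolLM-1.7B on 8 H100s (dense BF16 peak ~989 TFLOP/s);
# tokens_per_sec here is a made-up figure chosen to show what ~50% MFU implies.
u = mfu(model_params=1.7e9, tokens_per_sec=390_000, n_gpus=8,
        peak_flops_per_gpu=989e12)
print(f"{u:.0%}")
```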

Go take a look šŸ‘‰ https://github.com/huggingface/picotron/tree/main/picotron
reacted to alimotahharynia's post with šŸ”„ about 2 months ago
Here's the Space for our new article, which leverages LLMs with reinforcement learning to design high-quality small molecules. Check it out at alimotahharynia/GPT-2-Drug-Generator. You can also access the article here: https://arxiv.org/abs/2411.14157.
I would be happy to receive your feedback.
reacted to cutechicken's post with ā¤ļø about 2 months ago
šŸš€ RAGOndevice: High-Performance Local AI Document Analysis Assistant
šŸ’« Core Value
RAGOndevice is a high-performance AI system that runs entirely locally, with no cloud dependency. Using an optimized 7B model from CohereForAI, it enables professional-grade document analysis on standard PCs. āœØ
šŸŒŸ Ondevice AI Advantages
1. šŸ”‹ Efficient Resource Utilization

šŸŽÆ Optimized 7B Model: Runs on standard PCs
āš” Local Processing: Instant response without cloud
šŸ’» Low-Spec Compatible: Performs well on regular GPUs
šŸ”„ Optimized Memory: Ensures stable operation

2. šŸ›”ļø Data Security & Cost Efficiency

šŸ”’ Complete Privacy: No external data transmission
šŸŒ Offline Operation: No internet required
šŸ’° No Subscription: One-time installation
āš™ļø Resource Optimization: Uses existing hardware

šŸŽ® Key Features
1. šŸ“Š Powerful Document Analysis

šŸ“ Multi-Format Support: TXT, CSV, PDF, Parquet
šŸ§  Intelligent Analysis: Automatic structure recognition
šŸ‘ļø OCR Support: Advanced PDF text extraction
šŸ’¬ Real-time Chat: Natural language interaction

2. šŸ” Local RAG System

šŸŽÆ Efficient Search: TF-IDF based local search
šŸ§© Context Understanding: Accurate information retrieval
šŸ“š Wikipedia Integration: Rich background knowledge
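
A TF-IDF retriever like the one described needs no external services at all; a minimal sketch in pure Python (the toy corpus and whitespace tokenizer are illustrative, not RAGOndevice's actual index):

```python
import math
from collections import Counter

docs = [
    "local ai document analysis without cloud",
    "cloud subscription pricing for ai services",
    "offline rag search over pdf documents",
]

def tfidf(corpus):
    """Build sparse TF-IDF vectors (dicts) for each whitespace-tokenized doc."""
    tokenized = [d.split() for d in corpus]
    n = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = []
    for doc in tokenized:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vecs, idf

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs, idf = tfidf(docs)

def search(query):
    """Return the index of the best-matching document for a query."""
    toks = query.split()
    tf = Counter(toks)
    qv = {t: (c / len(toks)) * idf.get(t, 0.0) for t, c in tf.items()}
    return max(range(len(vecs)), key=lambda i: cosine(qv, vecs[i]))

print(search("offline document search"))
```

`search("offline document search")` picks the third document, since "offline" and "search" only occur there; a real deployment would add stemming and OCR-extracted text, but the retrieval core is this small.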

šŸŽÆ Use Cases

šŸ¢ Enterprise: Secure confidential document processing
šŸ”¬ Personal Research: Private data analysis
šŸ“š Education: Personal learning material analysis
šŸ’» Development: Local codebase analysis

ā­ Differentiators

šŸƒā€ā™‚ļø Independent Operation: Zero cloud dependency
āš” Instant Response: No network latency
šŸ” Complete Security: Full data control
šŸ’Ž Cost Efficiency: No ongoing costs

šŸ”® Future Plans

šŸš€ Enhanced model optimization
šŸ“š Local knowledge base expansion
āš” Hardware optimization
šŸ“ Extended file support


šŸŒŸ RAGOndevice democratizes high-performance AI, providing the optimal local AI solution for security-sensitive environments. šŸš€

šŸ”„ Power of Local AI: Experience enterprise-grade AI capabilities right on your device!

VIDraft/RAGOndevice