Burhan PRO

brhnsbn

burhansebin

AI & ML interests

Open Source, Large Vision Models, Community, Responsible AI

Recent Activity

liked a model about 21 hours ago

deepseek-ai/DeepSeek-R1

liked a model 10 months ago

google/codegemma-7b-it

brhnsbn's activity

liked a model about 21 hours ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 1 day ago • 149k • 3.25k

reacted to clem's post with 🤗 10 months ago

Introducing gretelai/synthetic_text_to_sql by https://huggingface.co/gretelai

It stands as the largest and most diverse synthetic Text-to-SQL dataset available to date.

The dataset includes:

- 105,851 records partitioned into 100,000 train and 5,851 test records
- ~23M total tokens, including ~12M SQL tokens
- Coverage across 100 distinct domains/verticals
- Comprehensive array of SQL tasks: data definition, retrieval, manipulation, analytics & reporting
- Wide range of SQL complexity levels, including subqueries, single joins, multiple joins, aggregations, window functions, set operations
- Database context, including table and view create statements
- Natural language explanations of what the SQL query is doing
- Contextual tags to optimize model training

Blogpost: https://gretel.ai/blog/synthetic-text-to-sql-dataset
Dataset: https://huggingface.co/datasets/gretelai/synthetic_text_to_sql