Spaces:
Running
Running
File size: 1,445 Bytes
6032e5b |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
---
id: "duckdb-summarize"
title: "Histogram"
slug: "duckdb-histogram-query"
description: "Create a histogram for a specific column to visualize the distribution of values."
code: |
from histogram(
table_name,
column_name,
bin_count := 10
)
---
# DuckDB Histogram
This snippet demonstrates how to use the `Histogram` function in DuckDB to calculate aggregate statistics for a dataset. The `histogram` function in DuckDB is used to compute histograms over columns of a dataset. It works for columns of any type and allows for various binning strategies and a custom number of bins.
```sql
from histogram(
table_name,
column_name,
bin_count := 10
)
```
## Parameters
- `table_name`: The name of the table or a subquery result.
- `column_name`: The name of the column for which to create the histogram, you can use different expressions to summarize the data such as length of a string.
- `bin_count`: The number of bins to use in the histogram.
## Histogram of the length of the input persona from the `PersonaHub` dataset
```sql
from histogram(
instruction,
len("input persona"),
bin_count := 5
)
```
<iframe
src="https://huggingface.co/datasets/proj-persona/PersonaHub/embed/viewer/instruction/train?sql_console=true&sql=from+histogram%28%0A++instruction%2C%0A++len%28%22input+persona%22%29%2C%0A++bin_count+%3A%3D+5%0A%29"
frameborder="0"
width="100%"
height="560px"
></iframe> |