While building a feature for a web analytics project, I wanted an api which returns all the events occurred between startdate
and enddate
on daily basis. If there is no event on any day, the value should be 0.
Exploring postgres, I found the use of generate_series
function. It can be used to generate
-
Numeric series
generate_series(start, stop) generate_series(start, stop, step)
start
: The first value in the series (inclusive). Can beinteger
,bigint
, ornumeric
.stop
: The last potential value in the series (inclusive). Must be the same type asstart
.step
(Optional): The increment between values. Defaults to1
if omitted. Must be the same type asstart
andstop
. Can be positive or negative.
select generate_series(1,10) as index; -- returns 10 rows of value from 1 to 10
-
Timestamp series
generate_series(start, stop, step)
start
: The starting timestamp (inclusive). Can betimestamp
,timestamptz
, ordate
(though using intervals withdate
is common).stop
: The ending timestamp (inclusive). Must be compatible withstart
.step
: The interval between timestamps. Must be of typeinterval
.
SELECT generate_series( '2025-04-01'::date, '2025-04-30'::date, interval '1 day' )::date AS report_date;
MSSQL Server also has similar function to generate these type of data, while for other DBMS
RECURSIVE CTE
can be used for such data.
Real use-case
Filling Gaps in Time-Series Data (using left join with actual data)
Imagine you have an events
table logging occurrences with a created_at
timestamp. To analyze trends over time, a common requirement is to aggregate these events daily. Specifically, we want to count how many pagevisits occurred and visitors visited the page within a given date range.
With the help of generate_series and using CTE, we can achieve this easily.
WITH date_series AS (
SELECT generate_series(
'2025-03-20'::date,
'2025-03-27'::date,
interval '1 day'
)::date AS day
)
SELECT
TO_CHAR(ds.day, 'YYYY-MM-DD') AS key,
COUNT(DISTINCT evt.ip) AS visitors,
COUNT(evt.id) AS pagevisits
FROM date_series ds
LEFT JOIN events evt ON DATE(evt.createdat) = ds.day
GROUP BY ds.day
ORDER BY ds.day;
It will return following rows
key | visitors | pagevisits |
---|---|---|
2025-03-20 | 0 | 0 |
2025-03-21 | 2 | 10 |
2025-03-22 | 0 | 0 |
2025-03-23 | 0 | 0 |
2025-03-24 | 3 | 20 |
2025-03-25 | 0 | 0 |
2025-03-26 | 1 | 27 |
2025-03-27 | 0 | 0 |
As demonstrated, generate_series
is a remarkably versatile function in PostgreSQL. Whether you’re filling gaps in time-series data, creating complete periodic reports, or generating test data, it offers a concise and efficient way to create sequential data directly within your SQL queries, simplifying complex reporting and analysis tasks.