Postgres window functions replaced four Python analytics loops in one query
← Back
April 4, 2026Database8 min read

Postgres window functions replaced four Python analytics loops in one query

Published April 4, 20268 min read

I was computing monthly revenue growth in Python: fetch all orders, group by month, iterate to calculate the previous month for each row, compute the percentage change. Four nested loops, 300 lines of code, 45-second runtime on 2M rows. Then I learned OVER and LAG and rewrote the whole thing in a 15-line SQL query that runs in 800ms. Window functions are the most powerful analytics feature in Postgres that most engineers avoid.

The anatomy of a window function

sql
SELECT
    date,
    revenue,
    -- Function OVER (partition and ordering rules)
    SUM(revenue) OVER (ORDER BY date) AS running_total,
    LAG(revenue) OVER (ORDER BY date) AS prev_month_revenue,
    RANK() OVER (ORDER BY revenue DESC) AS revenue_rank
FROM monthly_revenue
ORDER BY date;

The key: the window function computes across rows without collapsing them (unlike GROUP BY). You get one row per input row, with the window computation as an additional column.

Running totals and moving averages

sql
SELECT
    date,
    daily_revenue,
    -- Cumulative sum for the year
    SUM(daily_revenue) OVER (
        PARTITION BY EXTRACT(YEAR FROM date)
        ORDER BY date
    ) AS ytd_revenue,
    
    -- 7-day moving average
    AVG(daily_revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS revenue_7day_avg,
    
    -- 30-day moving average
    AVG(daily_revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS revenue_30day_avg

FROM daily_revenue_view
ORDER BY date;

Month-over-month growth with LAG

sql
WITH monthly_revenue AS (
    SELECT
        DATE_TRUNC('month', created_at) AS month,
        SUM(amount_cents) / 100.0 AS revenue
    FROM orders
    WHERE status = 'completed'
    GROUP BY 1
)
SELECT
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month,
    revenue - LAG(revenue) OVER (ORDER BY month) AS absolute_change,
    ROUND(
        (revenue - LAG(revenue) OVER (ORDER BY month)) 
        / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100,
        1
    ) AS pct_change
FROM monthly_revenue
ORDER BY month;

Ranking users within segments

sql
-- Top spenders per country — without window functions, this requires a subquery per country
SELECT
    user_id,
    country,
    total_spend,
    RANK() OVER (
        PARTITION BY country
        ORDER BY total_spend DESC
    ) AS country_rank,
    NTILE(4) OVER (
        PARTITION BY country
        ORDER BY total_spend DESC
    ) AS spend_quartile  -- 1 = top 25%, 4 = bottom 25%
FROM user_spend_summary
ORDER BY country, country_rank;

Session analysis with LEAD

sql
-- Calculate session duration from event timestamps
WITH events AS (
    SELECT
        user_id,
        event_time,
        LEAD(event_time) OVER (
            PARTITION BY user_id
            ORDER BY event_time
        ) AS next_event_time
    FROM user_events
),
sessions AS (
    SELECT
        user_id,
        event_time AS session_start,
        next_event_time AS session_end,
        EXTRACT(EPOCH FROM (next_event_time - event_time)) / 60.0 AS gap_minutes
    FROM events
)
SELECT
    user_id,
    session_start,
    session_end,
    gap_minutes,
    -- New session if gap > 30 minutes
    CASE WHEN gap_minutes > 30 OR gap_minutes IS NULL THEN 1 ELSE 0 END AS is_session_boundary
FROM sessions
ORDER BY user_id, session_start;

Percentile calculations

sql
-- P50, P90, P99 response times by endpoint
SELECT
    endpoint,
    COUNT(*) AS request_count,
    ROUND(AVG(response_ms)) AS avg_ms,
    PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY response_ms) AS p50_ms,
    PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY response_ms) AS p90_ms,
    PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY response_ms) AS p99_ms
FROM api_requests
WHERE created_at > NOW() - INTERVAL '7 days'
GROUP BY endpoint
ORDER BY p99_ms DESC;

The performance story

Window functions are pushed down into the database execution plan. Postgres can use indexes, avoid materializing intermediate results, and leverage parallel query execution. The application-layer Python loop I replaced fetched 2M rows to the application server, grouped them, and iterated — over and over. The SQL window function kept the data in the database and computed everything in a single scan with a sort.

If you are doing any analytics computation in your application layer by fetching rows and iterating, there is a window function for it. The learning curve is the OVER clause syntax — once that clicks, you will find uses everywhere.

Share this
← All Posts8 min read