Free resource - No login required

    Free PostgreSQL Take-Home Assignment Template for Data Engineer Hiring

    Use this realistic SQL work-sample assignment to screen data engineering, analytics engineering, and SQL-heavy candidates before technical interviews.

    Includes PostgreSQL schema, sample data, candidate instructions, expected outputs, and scoring rubric.

    What's included

    This free PostgreSQL hiring assignment includes everything you need to run a structured technical screen.

    • Candidate instructions
    • PostgreSQL schema
    • Sample data (seed SQL)
    • 3 realistic SQL tasks
    • Expected output tables
    • Reviewer scoring rubric
    • Seniority calibration guidance

    Who this template is for

    This template is useful for teams hiring:

    • Data engineers
    • Analytics engineers
    • BI developers
    • Data analysts
    • Backend engineers working with SQL-heavy systems

    The assignment tests practical SQL reasoning, joins, aggregation, date handling, conditional logic, window functions, and ability to translate business requirements into correct SQL.

    PostgreSQL-specific skills tested

    This assignment tests practical PostgreSQL skills including DATE_TRUNC, interval comparisons, FILTER on aggregate functions, CASE expressions, common table expressions, joins, grouping, rounding, and ROW_NUMBER window functions.

    Assignment scenario

    SaaS support operations analytics

    Candidates are given a small support-ticket dataset for a SaaS company. The dataset contains customers, support agents, and tickets. The goal is to write PostgreSQL queries that help the support team understand SLA breaches, agent performance, and customer ticket burden.

    PostgreSQL schema and sample data

    Run this seed file to create the tables and load the sample dataset.

    postgres_seed.sql
    DROP TABLE IF EXISTS tickets;
    DROP TABLE IF EXISTS agents;
    DROP TABLE IF EXISTS customers;
    
    CREATE TABLE customers (
      customer_id INT PRIMARY KEY,
      company_name TEXT NOT NULL,
      plan TEXT NOT NULL
    );
    
    CREATE TABLE agents (
      agent_id INT PRIMARY KEY,
      agent_name TEXT NOT NULL,
      team TEXT NOT NULL
    );
    
    CREATE TABLE tickets (
      ticket_id INT PRIMARY KEY,
      customer_id INT NOT NULL REFERENCES customers(customer_id),
      agent_id INT NOT NULL REFERENCES agents(agent_id),
      priority TEXT NOT NULL CHECK (priority IN ('critical', 'high', 'normal')),
      created_at TIMESTAMP NOT NULL,
      first_response_at TIMESTAMP NOT NULL,
      resolved_at TIMESTAMP NOT NULL,
      reopened BOOLEAN NOT NULL DEFAULT FALSE
    );
    
    INSERT INTO customers (customer_id, company_name, plan) VALUES
      (1, 'Northwind Analytics', 'Startup'),
      (2, 'BluePeak Logistics', 'Growth'),
      (3, 'Cedar Bank', 'Enterprise'),
      (4, 'Luma Retail', 'Growth');
    
    INSERT INTO agents (agent_id, agent_name, team) VALUES
      (10, 'Ava', 'Platform'),
      (11, 'Mihkel', 'Data'),
      (12, 'Sara', 'Data');
    
    INSERT INTO tickets (
      ticket_id,
      customer_id,
      agent_id,
      priority,
      created_at,
      first_response_at,
      resolved_at,
      reopened
    ) VALUES
      (1001, 1, 10, 'high',     '2025-01-03 09:00', '2025-01-03 13:30', '2025-01-04 10:00', FALSE),
      (1002, 2, 11, 'normal',   '2025-01-05 08:00', '2025-01-06 10:00', '2025-01-07 09:00', TRUE),
      (1003, 3, 12, 'critical', '2025-01-08 14:00', '2025-01-08 19:00', '2025-01-09 14:00', FALSE),
      (1004, 4, 11, 'high',     '2025-01-10 10:00', '2025-01-10 17:00', '2025-01-11 12:00', FALSE),
      (1005, 3, 10, 'normal',   '2025-01-12 09:00', '2025-01-12 20:00', '2025-01-13 09:00', FALSE),
      (1006, 2, 12, 'critical', '2025-01-18 11:00', '2025-01-18 16:30', '2025-01-19 11:00', TRUE),
      (1007, 1, 10, 'normal',   '2025-02-02 09:00', '2025-02-03 08:00', '2025-02-04 09:00', FALSE),
      (1008, 2, 11, 'high',     '2025-02-04 12:00', '2025-02-04 22:00', '2025-02-05 10:00', FALSE),
      (1009, 4, 12, 'normal',   '2025-02-06 09:00', '2025-02-07 12:00', '2025-02-08 09:00', FALSE),
      (1010, 3, 10, 'critical', '2025-02-07 15:00', '2025-02-07 17:00', '2025-02-07 23:00', FALSE),
      (1011, 3, 11, 'high',     '2025-02-12 08:00', '2025-02-12 18:30', '2025-02-13 08:00', TRUE),
      (1012, 2, 12, 'normal',   '2025-02-16 09:00', '2025-02-16 20:00', '2025-02-17 09:00', FALSE);

    Candidate instructions

    You are given a small SaaS support-ticket database.

    Write PostgreSQL queries to answer the tasks below. Use clear, readable SQL. You may use common table expressions, subqueries, aggregate functions, CASE expressions, and window functions.

    Return exactly the columns requested in each task. Follow ordering and tie-breaking rules exactly.

    Task 1

    SLA breach rate by month and plan

    Support leadership wants to understand whether customers on different plans are receiving timely first responses.

    SLA rules:

    • critical tickets: first response within 4 hours
    • high tickets: first response within 8 hours
    • normal tickets: first response within 24 hours

    Write a query that returns the SLA breach rate by month and customer plan.

    Expected columns:

    • month, plan, total_tickets, breached_tickets, breach_rate_pct

    Rules:

    • month should be the first day of the month as a date.
    • breach_rate_pct should be rounded to 1 decimal place.
    • Order by month ascending, then plan ascending.

    Solution query

    postgres_task_1_solution.sql
    SELECT
      DATE_TRUNC('month', t.created_at)::date AS month,
      c.plan,
      COUNT(*) AS total_tickets,
      COUNT(*) FILTER (
        WHERE t.first_response_at - t.created_at >
          CASE t.priority
            WHEN 'critical' THEN INTERVAL '4 hours'
            WHEN 'high' THEN INTERVAL '8 hours'
            WHEN 'normal' THEN INTERVAL '24 hours'
          END
      ) AS breached_tickets,
      ROUND(
        COUNT(*) FILTER (
          WHERE t.first_response_at - t.created_at >
            CASE t.priority
              WHEN 'critical' THEN INTERVAL '4 hours'
              WHEN 'high' THEN INTERVAL '8 hours'
              WHEN 'normal' THEN INTERVAL '24 hours'
            END
        ) * 100.0 / COUNT(*),
        1
      ) AS breach_rate_pct
    FROM tickets t
    JOIN customers c
      ON t.customer_id = c.customer_id
    GROUP BY
      DATE_TRUNC('month', t.created_at)::date,
      c.plan
    ORDER BY
      month,
      c.plan;

    Expected output

    monthplantotal_ticketsbreached_ticketsbreach_rate_pct
    2025-01-01Enterprise2150.0
    2025-01-01Growth3266.7
    2025-01-01Startup100.0
    2025-02-01Enterprise2150.0
    2025-02-01Growth3266.7
    2025-02-01Startup100.0
    Task 2

    Agent resolution performance

    The support manager wants to compare resolution speed for agents who resolved at least two tickets in January 2025.

    Write a query that returns each qualifying agent's number of resolved tickets and average resolution time.

    Expected columns:

    • agent_id, agent_name, resolved_tickets, avg_resolution_hours

    Rules:

    • Include only tickets resolved in January 2025.
    • Include only agents with at least 2 resolved tickets in that period.
    • avg_resolution_hours should be rounded to 1 decimal place.
    • Order by avg_resolution_hours ascending, then agent_id ascending.

    Solution query

    postgres_task_2_solution.sql
    SELECT
      a.agent_id,
      a.agent_name,
      COUNT(*) AS resolved_tickets,
      ROUND(
        AVG(EXTRACT(EPOCH FROM (t.resolved_at - t.created_at)) / 3600.0),
        1
      ) AS avg_resolution_hours
    FROM tickets t
    JOIN agents a
      ON t.agent_id = a.agent_id
    WHERE t.resolved_at >= TIMESTAMP '2025-01-01'
      AND t.resolved_at < TIMESTAMP '2025-02-01'
    GROUP BY
      a.agent_id,
      a.agent_name
    HAVING COUNT(*) >= 2
    ORDER BY
      avg_resolution_hours,
      a.agent_id;

    Expected output

    agent_idagent_nameresolved_ticketsavg_resolution_hours
    12Sara224.0
    10Ava224.5
    11Mihkel237.5
    Task 3

    Highest customer burden by month

    Support leadership wants to know which customer generated the highest support burden in each month.

    Burden scoring:

    • critical ticket = 5 points
    • high ticket = 3 points
    • normal ticket = 1 point

    Write a query that returns the customer with the highest burden score for each month.

    Expected columns:

    • month, customer_id, company_name, burden_score

    Rules:

    • month should be the first day of the month as a date.
    • If two customers have the same burden score in a month, return the customer with the lower customer_id.
    • Order by month ascending.

    Solution query

    postgres_task_3_solution.sql
    WITH customer_monthly_burden AS (
      SELECT
        DATE_TRUNC('month', t.created_at)::date AS month,
        c.customer_id,
        c.company_name,
        SUM(
          CASE t.priority
            WHEN 'critical' THEN 5
            WHEN 'high' THEN 3
            WHEN 'normal' THEN 1
          END
        ) AS burden_score
      FROM tickets t
      JOIN customers c
        ON t.customer_id = c.customer_id
      GROUP BY
        DATE_TRUNC('month', t.created_at)::date,
        c.customer_id,
        c.company_name
    ),
    ranked AS (
      SELECT
        *,
        ROW_NUMBER() OVER (
          PARTITION BY month
          ORDER BY burden_score DESC, customer_id ASC
        ) AS rn
      FROM customer_monthly_burden
    )
    SELECT
      month,
      customer_id,
      company_name,
      burden_score
    FROM ranked
    WHERE rn = 1
    ORDER BY month;

    Expected output

    monthcustomer_idcompany_nameburden_score
    2025-01-012BluePeak Logistics6
    2025-02-013Cedar Bank8

    Reviewer scoring rubric

    Score the assignment out of 30 points, 10 points per task.

    DimensionPointsWhat to check
    Correctness5Returns correct rows and columns, handles joins and grouping correctly, follows ordering rules
    SQL quality2Readable structure, useful aliases, avoids unnecessary complexity, uses PostgreSQL features appropriately
    Edge cases2Handles SLA thresholds, month grouping, and tie-breaking correctly where required
    Communication1Query is easy for reviewers to understand; logic is clear enough to discuss in an interview

    Suggested interpretation

    25-30Strong practical SQL ability
    19-24Good working SQL ability, suitable for follow-up interview
    13-18Partial understanding, needs deeper review
    0-12Insufficient for SQL-heavy role

    How to use this manually

    You can use this template manually by copying the schema and sample data into a PostgreSQL database, sending the task descriptions to candidates, and reviewing their submitted SQL against the expected outputs and rubric.

    This resource is intentionally ungated. You can adapt the dataset, tasks, or scoring rubric to match your own hiring process.

    Validation commands

    psql "$POSTGRES_DATABASE_URL" -f postgres_seed.sql
    psql "$POSTGRES_DATABASE_URL" -f postgres_task_1_solution.sql
    psql "$POSTGRES_DATABASE_URL" -f postgres_task_2_solution.sql
    psql "$POSTGRES_DATABASE_URL" -f postgres_task_3_solution.sql

    Need another SQL dialect?

    This is the PostgreSQL version of the SQL take-home assignment template. You can also use:

    Want to create your own SQL assessment?

    How Datakaru helps

    Datakaru helps hiring teams create custom SQL and Python take-home assignments, send them to candidates, and review submissions with AI-assisted scoring.

    Use this template as a starting point, then build your own assessment in Datakaru based on your role, dataset, and hiring criteria.

    No credit card required - 14-day free trial

    Frequently asked questions

    What is a PostgreSQL take-home assignment?
    A PostgreSQL take-home assignment is a practical hiring task where candidates write PostgreSQL queries against a sample database or realistic dataset. It helps hiring teams evaluate job-relevant SQL skills before a technical interview.
    What should a PostgreSQL hiring test evaluate?
    A good PostgreSQL hiring test should evaluate joins, filtering, aggregation, date handling, conditional logic, tie-breaking, and the candidate's ability to translate business questions into correct SQL.
    How long should a SQL take-home assignment take?
    For early screening, 30 to 60 minutes is usually enough. Longer assignments may create unnecessary candidate burden unless the role is senior or highly specialized.
    Can this template be used for data engineer hiring?
    Yes. This template is suitable for data engineer, analytics engineer, BI developer, and SQL-heavy analyst roles.
    Can I use this assignment in Datakaru?
    You can use this template manually, or use Datakaru to create your own custom SQL and Python take-home assignments for candidates.

    Ready to run better technical screens?

    Use this template as-is, or build your own custom SQL assignment in Datakaru.