SQL for Data Analysts — Complete Guide, Interview Questions & Practice (2026) | Data Analyst Interview

SQL Interview Prep Hub 2026

Master SQL for Data Analyst Interviews

Complete SQL guide covering every topic interviewers test — with real company questions, full theory, worked examples, and a live SQL compiler to practice right here in your browser.

sql_interview_practice.sql

-- Window Function: Rank customers by revenue
SELECT
  customer_id,
  customer_name,
  SUM(order_amount) AS total_revenue,
  RANK() OVER (
    ORDER BY SUM(order_amount) DESC
  ) AS revenue_rank
FROM orders
WHERE order_year = 2024
GROUP BY customer_id, customer_name
HAVING SUM(order_amount) > 10000;

Result (first 3 of 5 rows shown)

customer_name	total_revenue	revenue_rank
Rahul Sharma	₹98,400	1
Priya Mehta	₹76,200	2
Amit Patel	₹64,800	3

Query executed in 0.003s · 5 rows returned

Foundations

What is SQL? — Complete Beginner Guide

SQL (Structured Query Language) is the standard language for managing and querying data stored in relational databases. It is the single most important technical skill for data analysts — required in over 90% of analyst job listings in India and globally.

Why SQL matters: SQL lets you pull, filter, group, join and transform data stored in databases — exactly what data analysts do every single day. Without SQL, you cannot do the job. With SQL, you can answer almost any business question from data.

🗄️

Relational Databases

SQL works with relational databases like MySQL, PostgreSQL, BigQuery, Snowflake, SQL Server — all widely used at companies hiring analysts.

📋

Tables, Rows & Columns

Data is stored in tables. Each row is a record (e.g. one order). Each column is an attribute (e.g. order_amount, customer_id, date).

🔍

Querying Data

You write SQL queries to ask questions: “Which customers spent the most last month?” SQL retrieves exactly the data you need from millions of rows.

⚡

Why It’s Tested

Interviews test SQL because it directly mirrors real analyst work. Every analyst query you write on the job is SQL — interviewers want to see you can do it.

SQL vs other tools — where it fits

Tool	What it does	When analysts use it
SQL	Query & transform data in databases	Daily — pulling any data from any database
Python (Pandas)	Analyse & visualise data in-memory	Complex analysis, ML, automation
Excel	Ad-hoc analysis, small datasets	Quick calculations, sharing with non-tech teams
Power BI / Tableau	Interactive dashboards & charts	Reporting, presenting to stakeholders

Learning Path

All SQL Topics for Data Analyst Interviews

Work through these topics in order. Each one builds on the previous. All topics are tested in data analyst interviews — from beginner roles to senior positions at top companies.

Level	Topic	What it covers
Beginner	SQL Basics & SELECT	SELECT, WHERE, ORDER BY, LIMIT: the foundation of every SQL query.
Beginner	GROUP BY & Aggregations	COUNT, SUM, AVG, MIN, MAX, HAVING: summarise data across groups.
Intermediate	SQL JOINs	INNER, LEFT, RIGHT, FULL OUTER: combine data from multiple tables.
Intermediate	Subqueries & CTEs	Nested queries and WITH clauses for clean, readable complex SQL.
Advanced	Window Functions	RANK, ROW_NUMBER, LAG, LEAD, SUM OVER: the hardest and most tested SQL topic.
Intermediate	Date & String Functions	DATEDIFF, DATE_TRUNC, SUBSTR, CONCAT: work with real-world messy data.
Practice	Real Interview Questions	50+ questions from Google, Amazon, Flipkart, Swiggy, with full solutions.
Practice	Live SQL Compiler	Write and run SQL queries right here in your browser, no setup needed.

How often each topic appears in interviews

Topic	Share of interviews
Window Functions	92%
SQL JOINs	89%
GROUP BY & Aggregations	85%
Subqueries & CTEs	78%
SELECT & Filtering	74%
Date Functions	62%

Topic 1 — Beginner

SELECT, WHERE, ORDER BY — SQL Basics

Every SQL query starts with SELECT. These fundamentals are the building blocks of all SQL — even complex window function queries use SELECT at their core.

1. The basic SELECT statement

SELECT retrieves data from a table. You specify which columns you want and which table to get them from. The asterisk (*) means “all columns.”

SQL

-- Select all columns from orders table
SELECT *
FROM orders;

-- Select only specific columns
SELECT order_id, customer_name, order_amount
FROM orders;

-- Give columns a readable alias
SELECT
  customer_name AS "Customer",
  order_amount AS "Order Value (₹)"
FROM orders;

Key rules: Column names go after SELECT separated by commas. The table name goes after FROM. AS creates a readable alias for any column. Semicolons end the query.

2. WHERE: filtering rows

WHERE filters which rows are returned. It works like a condition — only rows where the condition is TRUE are included in the result.

SQL

-- Single condition
SELECT * FROM orders
WHERE order_amount > 5000;

-- Multiple conditions with AND / OR
SELECT * FROM orders
WHERE order_amount > 5000
  AND region = 'North';

-- IN: match any value in a list
SELECT * FROM customers
WHERE city IN ('Mumbai', 'Delhi', 'Bengaluru');

-- LIKE: pattern matching for text
SELECT * FROM products
WHERE product_name LIKE '%phone%';

-- BETWEEN: range filter (inclusive on both ends)
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';

Interview tip: In a LIKE pattern, % matches any sequence of zero or more characters, so '%phone%' matches “smartphone”, “phone case” and “headphones”. Use IS NULL (not = NULL) to check for missing values; it is very common in data cleaning questions.
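The IS NULL check from the tip above can be sketched as follows. The table demo_contacts and its rows are made up for illustration; the syntax is standard SQL and is expected to run unchanged on PostgreSQL, MySQL and SQLite.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_contacts (
  contact_id   INTEGER PRIMARY KEY,
  contact_name VARCHAR(50),
  email        VARCHAR(100)   -- NULL means the email is missing
);

INSERT INTO demo_contacts (contact_id, contact_name, email) VALUES
  (1, 'Asha',  'asha@example.com'),
  (2, 'Bilal', NULL),
  (3, 'Chen',  'chen@example.com');

-- Correct: IS NULL finds the row with a missing email (Bilal)
SELECT contact_name
FROM demo_contacts
WHERE email IS NULL;

-- Common mistake: "= NULL" is never TRUE, so this returns no rows
SELECT contact_name
FROM demo_contacts
WHERE email = NULL;

-- IS NOT NULL keeps only rows that have a value (Asha and Chen)
SELECT contact_name
FROM demo_contacts
WHERE email IS NOT NULL;
```

The middle query is the one interviewers like to probe: a comparison with NULL evaluates to unknown rather than TRUE, and WHERE keeps only rows where the condition is TRUE.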

3. ORDER BY and LIMIT

ORDER BY sorts results. LIMIT controls how many rows are returned. Together they answer “show me the top N…” questions.

SQL

-- Top 10 orders by value
SELECT customer_name, order_amount
FROM orders
ORDER BY order_amount DESC
LIMIT 10;

-- Sort by multiple columns
SELECT region, product_name, revenue
FROM sales
ORDER BY region ASC, revenue DESC;

ASC = smallest to largest (default). DESC = largest to smallest. Use DESC when you want “top N” — highest revenue, most orders, latest dates.

Topic 3 — Intermediate

SQL JOINs — Complete Guide

JOINs combine data from two or more tables based on a related column. They are among the most commonly tested SQL topics at all levels. Understanding all 4 join types is essential.

INNER JOIN

Returns only rows that have a match in both tables. Most common join. Use when you only want records that exist in both.

INNER JOIN b ON a.id = b.id

LEFT JOIN

Returns all rows from the left table, plus matching rows from the right. Where a left row has no match, the right table’s columns are NULL. Most used in analytics.

LEFT JOIN b ON a.id = b.id

RIGHT JOIN

Returns all rows from the right table, plus matching rows from the left. Rare in practice — analysts usually rewrite as a LEFT JOIN.

RIGHT JOIN b ON a.id = b.id

FULL OUTER JOIN

Returns all rows from both tables. Where a row has no match on the other side, that side’s columns are NULL. Use to find records missing from either table.

FULL OUTER JOIN b ON a.id = b.id

SQL — JOIN Examples

-- INNER JOIN: customers who placed orders
SELECT c.customer_name, o.order_date, o.order_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;

-- LEFT JOIN: ALL customers, even those with no orders
-- (group by customer_id too, so two customers who share a name are not merged)
SELECT c.customer_name, COUNT(o.order_id) AS total_orders
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;

-- Interview classic: customers with NO orders (NULL trick)
SELECT c.customer_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

Interview golden rule: When asked to find records that DON’T exist in another table (e.g. “customers with no orders”), reach for LEFT JOIN + WHERE right_table.id IS NULL, checking a column that is never NULL in matched rows (such as the primary key). This pattern comes up constantly.
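To see how the join types differ on real rows, here is a minimal sketch on two tiny tables. The names demo_customers and demo_orders and all the data are hypothetical; the row counts in the comments follow directly from the three customers and three orders inserted.

```sql
-- Hypothetical sample tables, for illustration only
CREATE TABLE demo_customers (
  customer_id   INTEGER PRIMARY KEY,
  customer_name VARCHAR(50)
);
CREATE TABLE demo_orders (
  order_id     INTEGER PRIMARY KEY,
  customer_id  INTEGER,
  order_amount INTEGER
);

INSERT INTO demo_customers VALUES (1, 'Asha'), (2, 'Bilal'), (3, 'Chen');
INSERT INTO demo_orders VALUES (101, 1, 500), (102, 1, 700), (103, 3, 300);

-- INNER JOIN: 3 rows (Asha twice, Chen once); Bilal is dropped
SELECT c.customer_name, o.order_id
FROM demo_customers c
INNER JOIN demo_orders o ON c.customer_id = o.customer_id;

-- LEFT JOIN: 4 rows; Bilal appears once with order_id = NULL
SELECT c.customer_name, o.order_id
FROM demo_customers c
LEFT JOIN demo_orders o ON c.customer_id = o.customer_id;

-- Anti-join: only Bilal, the customer with no orders
SELECT c.customer_name
FROM demo_customers c
LEFT JOIN demo_orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

-- NOT EXISTS: another way to express the same anti-join
SELECT c.customer_name
FROM demo_customers c
WHERE NOT EXISTS (
  SELECT 1 FROM demo_orders o WHERE o.customer_id = c.customer_id
);
```

Both anti-join forms return the same single row here. Which one performs better depends on the database and its optimiser, so it is worth being able to write either.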

Topic 2 — Beginner/Intermediate

GROUP BY & Aggregate Functions

Aggregate functions summarise many rows into a single value. GROUP BY groups rows with the same value so you can aggregate per group — like “total revenue per region.”

Function	What it does	Example
`COUNT(*)`	Count all rows (including NULLs)	`COUNT(*) AS total_orders`
`COUNT(col)`	Count non-NULL values in a column	`COUNT(order_id)`
`SUM(col)`	Add up all values in a column	`SUM(revenue) AS total_revenue`
`AVG(col)`	Average of all non-NULL values	`AVG(order_amount) AS avg_order`
`MIN(col)`	Smallest value	`MIN(order_date) AS first_order`
`MAX(col)`	Largest value	`MAX(order_amount) AS biggest_order`

SQL — GROUP BY Examples

-- Revenue by region
SELECT region, SUM(order_amount) AS total_revenue
FROM orders
GROUP BY region
ORDER BY total_revenue DESC;

-- HAVING filters AFTER grouping (WHERE filters BEFORE)
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

-- Multi-level grouping: revenue by region AND product category
SELECT region, category, SUM(revenue) AS total
FROM sales
GROUP BY region, category
ORDER BY region, total DESC;

HAVING vs WHERE: WHERE filters individual rows before grouping. HAVING filters groups after grouping. A common interview trick is asking you to find groups where an aggregate condition is true — always use HAVING for that.
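The difference is easiest to see when both clauses appear in one query. The sketch below uses a made-up table, demo_sales, with six hypothetical orders; the thresholds are arbitrary.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_sales (
  sale_id      INTEGER PRIMARY KEY,
  region       VARCHAR(20),
  order_amount INTEGER
);

INSERT INTO demo_sales VALUES
  (1, 'North', 6000),
  (2, 'North', 7000),
  (3, 'North',  500),
  (4, 'South', 8000),
  (5, 'South',  900),
  (6, 'West',  5500);

-- Step 1 (WHERE): drop individual orders of 1000 or less
-- Step 2 (GROUP BY): group the remaining rows by region
-- Step 3 (HAVING): keep only regions with 2 or more such orders
SELECT region,
       COUNT(*)          AS big_orders,
       SUM(order_amount) AS big_order_revenue
FROM demo_sales
WHERE order_amount > 1000
GROUP BY region
HAVING COUNT(*) >= 2;
-- Only North qualifies: 2 orders (6000 and 7000), revenue 13000.
```

Moving the order_amount condition into HAVING would be an error, because order_amount is not an aggregate or a grouped column; moving the COUNT(*) condition into WHERE fails because the groups do not exist yet at that point.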

Topic 4 — Intermediate

Subqueries & CTEs (WITH Clauses)

When a single SELECT isn’t enough, you nest queries inside queries (subqueries) or break them into named steps (CTEs). CTEs are the cleaner, modern approach that interviewers love to see.

1. Subqueries: queries inside queries

A subquery is a SELECT statement nested inside another. For a simple (uncorrelated) subquery like the ones below, you can think of the inner query as running first, with the outer query then using its result.

SQL — Subquery

-- Find orders with above-average order value
SELECT customer_name, order_amount
FROM orders
WHERE order_amount > (
  SELECT AVG(order_amount) FROM orders
);

-- Classic interview Q: 2nd highest salary
SELECT MAX(salary) AS second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

2. CTEs with WITH: cleaner and preferred

CTEs (Common Table Expressions) use the WITH keyword to define named temporary tables. They make complex queries readable and are strongly preferred in interviews over deeply nested subqueries.

SQL — CTEs

-- CTE: find top customers per region
WITH regional_totals AS (
  SELECT region, customer_id,
    SUM(order_amount) AS total_spend
  FROM orders
  GROUP BY region, customer_id
),
ranked AS (
  SELECT *,
    RANK() OVER (
      PARTITION BY region
      ORDER BY total_spend DESC
    ) AS rnk
  FROM regional_totals
)
SELECT * FROM ranked WHERE rnk = 1;

Best practice: Always use CTEs over deeply nested subqueries in interviews. It shows you write maintainable, readable SQL — exactly what companies want in a working analyst.

Topic 5 — Advanced (Most Tested)

Window Functions — RANK, LAG, LEAD & More

Window functions are the single most tested SQL topic in senior data analyst interviews. They perform calculations across a set of rows related to the current row — without collapsing the result like GROUP BY does.

Key concept: Window functions do NOT reduce rows like GROUP BY. You get one result row per input row — the window function just adds extra calculated columns alongside the original data.

PARTITION BY dept — RANK() by salary within each department

emp_name	dept	salary	RANK()	DENSE_RANK()	ROW_NUMBER()
Rahul	Analytics	₹90,000	1	1	1
Priya	Analytics	₹90,000	1	1	2
Amit	Analytics	₹75,000	3	2	3
Neha	Engineering	₹1,20,000	1	1	1
Vikram	Engineering	₹95,000	2	2	2

Function	What it does	Key difference
`RANK()`	Rank rows — ties get same rank, next rank skips	1, 1, 3 (skips 2)
`DENSE_RANK()`	Rank rows — ties get same rank, no skipping	1, 1, 2 (no skip)
`ROW_NUMBER()`	Unique sequential number — no ties ever	1, 2, 3 always
`LAG(col, n)`	Value from n rows before current row	Previous month’s revenue
`LEAD(col, n)`	Value from n rows after current row	Next month’s target
`SUM() OVER()`	Running total or sum per partition	Cumulative revenue
`AVG() OVER()`	Moving average across a window	7-day rolling average
`NTILE(n)`	Divide rows into n equal buckets	Quartiles, deciles
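The three ranking functions can be compared side by side with a short query. This is a sketch on a hypothetical table, demo_employees, loaded with the same five rows as the table above. Window functions need a reasonably recent engine (for example MySQL 8.0+ or SQLite 3.25+).

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_employees (
  emp_name VARCHAR(50),
  dept     VARCHAR(30),
  salary   INTEGER
);

INSERT INTO demo_employees VALUES
  ('Rahul',  'Analytics',    90000),
  ('Priya',  'Analytics',    90000),
  ('Amit',   'Analytics',    75000),
  ('Neha',   'Engineering', 120000),
  ('Vikram', 'Engineering',  95000);

SELECT
  emp_name,
  dept,
  salary,
  RANK()       OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk,
  DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dense_rnk,
  -- emp_name is added as a tie-breaker so the numbering is deterministic
  ROW_NUMBER() OVER (
    PARTITION BY dept ORDER BY salary DESC, emp_name DESC
  ) AS row_num
FROM demo_employees
ORDER BY dept, salary DESC, emp_name DESC;
-- Analytics:   Rahul 1/1/1, Priya 1/1/2, Amit 3/2/3
-- Engineering: Neha 1/1/1, Vikram 2/2/2
```

Without a tie-breaker, ROW_NUMBER() is free to number Rahul and Priya in either order, which is a detail worth mentioning in an interview. The aliases rnk and row_num are used because RANK and ROW_NUMBER are reserved words in some databases.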

SQL — Window Functions

-- Month-over-month revenue change with LAG
SELECT
  month,
  revenue,
  LAG(revenue, 1) OVER (ORDER BY month) AS prev_month,
  revenue - LAG(revenue, 1) OVER (ORDER BY month) AS mom_change
FROM monthly_revenue;

-- Running total (cumulative revenue)
SELECT
  order_date,
  daily_revenue,
  SUM(daily_revenue) OVER (ORDER BY order_date) AS cumulative_revenue
FROM daily_sales;

-- Top 1 customer per region (classic interview question)
WITH ranked AS (
  SELECT region, customer_name, total_spend,
    ROW_NUMBER() OVER (
      PARTITION BY region ORDER BY total_spend DESC
    ) AS rn
  FROM customer_totals
)
SELECT * FROM ranked WHERE rn = 1;

Interview tip: Use ROW_NUMBER() (not RANK) when you need exactly one result per partition — e.g. “the top customer per region.” RANK() would return multiple rows if there’s a tie at the top.

Practice Tool

Live SQL Compiler — Practice Right Here

Write and run real SQL queries in your browser. Pre-loaded with sample datasets matching real interview scenarios. No account or setup needed.

Tables: customers · orders · products · employees

Database schema:

customers(id, customer_name, city, email, total_orders, join_date)

Interview Prep

Real SQL Interview Questions with Answers

These exact questions have been asked at Google, Amazon, Flipkart, Swiggy, Paytm and other top companies. Each question below is followed by the full answer and SQL solution.

Easy

Find the second highest salary from the Employee table without using LIMIT or TOP.

TCS · Infosys · Wipro

Use a subquery to exclude the maximum salary, then find the MAX of what’s left. This is a classic interview pattern for finding Nth highest values.

SELECT MAX(salary) AS second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

Alternative using DENSE_RANK (better for Nth):

WITH ranked AS (
SELECT salary, DENSE_RANK() OVER(ORDER BY salary DESC) AS rnk
FROM employees
)
SELECT DISTINCT salary FROM ranked WHERE rnk = 2;

Easy

Find all customers who placed orders in 2024 but NOT in 2023.

Flipkart · Meesho

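Use NOT IN with a subquery, or LEFT JOIN with IS NULL. The LEFT JOIN approach is generally preferred for performance on large datasets, and it avoids a NOT IN pitfall: if the subquery returns any NULL customer_id, NOT IN matches no rows at all.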

-- Method 1: NOT IN (YEAR() is MySQL / SQL Server syntax)
SELECT DISTINCT customer_id FROM orders
WHERE YEAR(order_date) = 2024
  AND customer_id NOT IN (
    SELECT DISTINCT customer_id FROM orders
    WHERE YEAR(order_date) = 2023
  );
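The answer mentions a LEFT JOIN alternative, which could look like the sketch below. The table demo_orders_by_year and its five rows are hypothetical, and plain date-range comparisons are used instead of YEAR() so the query is not tied to one dialect.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_orders_by_year (
  order_id    INTEGER PRIMARY KEY,
  customer_id INTEGER,
  order_date  DATE
);

INSERT INTO demo_orders_by_year VALUES
  (1, 10, '2023-05-01'),
  (2, 10, '2024-02-10'),
  (3, 20, '2024-03-15'),
  (4, 30, '2023-11-20'),
  (5, 20, '2024-07-04');

-- Method 2: LEFT JOIN + IS NULL
SELECT DISTINCT o24.customer_id
FROM demo_orders_by_year o24
LEFT JOIN demo_orders_by_year o23
  ON  o23.customer_id = o24.customer_id
  AND o23.order_date >= '2023-01-01'   -- the 2023 filter belongs in ON,
  AND o23.order_date <  '2024-01-01'   -- not in WHERE
WHERE o24.order_date >= '2024-01-01'
  AND o24.order_date <  '2025-01-01'
  AND o23.order_id IS NULL;
-- Returns customer_id 20 only: customer 10 also ordered in 2023,
-- and customer 30 never ordered in 2024.
```

Putting the 2023 date filter in the ON clause is the important design choice. If it were moved to WHERE, the unmatched rows (where every o23 column is NULL) would be filtered out and the query would return nothing.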

Medium

For each department, find the employee with the highest salary. Return department, employee name, and salary.

Amazon · Paytm · Razorpay

The classic window function interview question. Use ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) and filter where rn = 1.

WITH ranked AS (
  SELECT
    department,
    employee_name,
    salary,
    ROW_NUMBER() OVER(
      PARTITION BY department
      ORDER BY salary DESC
    ) AS rn
  FROM employees
)
SELECT department, employee_name, salary
FROM ranked
WHERE rn = 1;

Medium

Calculate the month-over-month revenue growth rate for each month in 2024.

Swiggy · Zomato · PhonePe

Use LAG() to get the previous month’s revenue, then calculate the percentage change. This tests both window functions and arithmetic in SQL.

SELECT
  month,
  revenue,
  LAG(revenue) OVER(ORDER BY month) AS prev_revenue,
  ROUND(
    (revenue - LAG(revenue) OVER(ORDER BY month))
    * 100.0
    / NULLIF(LAG(revenue) OVER(ORDER BY month), 0),  -- avoid division by zero
    2
  ) AS mom_growth_pct  -- NULL for the first month returned (no previous row)
FROM monthly_revenue
WHERE year = 2024;
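A worked example makes the arithmetic concrete. The table demo_monthly_revenue and its three values are invented for illustration, and an integer month_no column stands in for the month column used above.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_monthly_revenue (
  month_no INTEGER PRIMARY KEY,  -- 1 = January, 2 = February, ...
  revenue  INTEGER
);

INSERT INTO demo_monthly_revenue VALUES (1, 100), (2, 120), (3, 90);

SELECT
  month_no,
  revenue,
  LAG(revenue) OVER (ORDER BY month_no) AS prev_revenue,
  ROUND(
    (revenue - LAG(revenue) OVER (ORDER BY month_no)) * 100.0
    / NULLIF(LAG(revenue) OVER (ORDER BY month_no), 0),
    2
  ) AS mom_growth_pct
FROM demo_monthly_revenue
ORDER BY month_no;
-- Month 1: prev_revenue is NULL, so mom_growth_pct is NULL
-- Month 2: (120 - 100) * 100 / 100 =  20 percent
-- Month 3: (90 - 120)  * 100 / 120 = -25 percent
```

Multiplying by 100.0 rather than 100 forces decimal arithmetic, which matters in databases that would otherwise do integer division. NULLIF turns a zero previous month into NULL instead of raising a division error.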

Hard

Find users who made purchases on 3 or more consecutive days within the last 30 days.

Google · Meta · Uber

This is a classic consecutive days problem. The trick is to subtract the ROW_NUMBER from the date — rows that are consecutive will have the same “group date.” Count groups with 3+ rows.

-- PostgreSQL-style date arithmetic (date - integer); adapt for other databases
WITH daily_activity AS (
  SELECT DISTINCT user_id, DATE(purchase_date) AS pdate
  FROM purchases
  WHERE purchase_date >= CURRENT_DATE - 30
),
grouped AS (
  SELECT user_id, pdate,
    -- consecutive dates minus an increasing row number give the same value
    pdate - CAST(ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY pdate) AS INTEGER) AS grp
  FROM daily_activity
)
SELECT DISTINCT user_id
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 3;
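The subtraction trick is easier to follow with plain integers. In this sketch a hypothetical day_no column (a day index, such as days since some start date) replaces the real date, which also keeps the query free of dialect-specific date arithmetic. The table name and data are made up, and the table is assumed to hold at most one row per user per day.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_purchase_days (
  user_id INTEGER,
  day_no  INTEGER   -- day index, one row per user per active day
);

INSERT INTO demo_purchase_days VALUES
  (1, 1), (1, 2), (1, 3), (1, 7),
  (2, 1), (2, 3), (2, 5),
  (3, 4), (3, 5), (3, 9), (3, 10), (3, 11);

WITH grouped AS (
  SELECT user_id, day_no,
         -- consecutive days share the same (day_no - row number) value
         day_no - ROW_NUMBER() OVER (
           PARTITION BY user_id ORDER BY day_no
         ) AS grp
  FROM demo_purchase_days
)
SELECT user_id,
       MIN(day_no) AS streak_start,
       COUNT(*)    AS streak_length
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 3
ORDER BY user_id;
-- User 1: days 1,2,3,7       -> grp 0,0,0,3   -> streak starting day 1, length 3
-- User 2: days 1,3,5         -> grp 0,1,2     -> no streak
-- User 3: days 4,5,9,10,11   -> grp 3,3,6,6,6 -> streak starting day 9, length 3
```

Returning streak_start and streak_length instead of only the user_id is optional, but it makes the grouping visible and is handy for checking the result by hand.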

Hard

Calculate the 7-day rolling average of daily active users for each product.

Google · Airbnb · Walmart

Use AVG() as a window function with ROWS BETWEEN 6 PRECEDING AND CURRENT ROW to define a 7-row window, which equals 7 days when the table has exactly one row per product per day with no gaps. PARTITION BY product_id ensures separate rolling averages per product.

SELECT
  product_id,
  activity_date,
  dau,
  AVG(dau) OVER (
    PARTITION BY product_id
    ORDER BY activity_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS rolling_7d_avg
FROM daily_active_users
ORDER BY product_id, activity_date;
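To check the window frame by hand, the same technique can be run on a tiny dataset. This sketch shortens the window to 3 rows (2 PRECEDING) so four rows are enough to see it move; demo_dau and its values are hypothetical.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_dau (
  product_id    INTEGER,
  activity_date DATE,
  dau           INTEGER
);

INSERT INTO demo_dau VALUES
  (1, '2024-01-01', 10),
  (1, '2024-01-02', 20),
  (1, '2024-01-03', 30),
  (1, '2024-01-04', 40);

SELECT
  product_id,
  activity_date,
  dau,
  AVG(dau) OVER (
    PARTITION BY product_id
    ORDER BY activity_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS rolling_3d_avg,
  COUNT(*) OVER (
    PARTITION BY product_id
    ORDER BY activity_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS rows_in_window
FROM demo_dau
ORDER BY product_id, activity_date;
-- Jan 1: average of (10)         = 10, 1 row in window
-- Jan 2: average of (10, 20)     = 15, 2 rows in window
-- Jan 3: average of (10, 20, 30) = 20, 3 rows in window
-- Jan 4: average of (20, 30, 40) = 30, 3 rows in window
```

The rows_in_window column shows that the first rows are averaged over fewer days than the full window. If partial windows should be excluded, one option is to keep only rows where that count equals the window size.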

Quick Reference

SQL Cheat Sheet — Interview Quick Reference

The most important SQL syntax, functions and patterns in one place. Save this page and review before every interview.

Query execution order

SQL is logically evaluated in this order, NOT the order you write it: FROM → JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. This is why you can’t use a SELECT alias in a WHERE clause: SELECT hasn’t run yet when WHERE is evaluated. ORDER BY runs after SELECT, so aliases are allowed there.
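The alias rule can be demonstrated with a short sketch. The table demo_line_items and its rows are hypothetical. The failing query is left commented out so the rest of the script runs; PostgreSQL, MySQL and SQL Server reject it, while some engines are more lenient.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_line_items (
  item_id    INTEGER PRIMARY KEY,
  quantity   INTEGER,
  unit_price INTEGER
);

INSERT INTO demo_line_items VALUES (1, 2, 300), (2, 1, 1500), (3, 5, 400);

-- Rejected by most databases: WHERE is evaluated before SELECT,
-- so the alias line_total does not exist yet.
-- SELECT item_id, quantity * unit_price AS line_total
-- FROM demo_line_items
-- WHERE line_total > 1000;

-- Fix 1: repeat the expression in WHERE
SELECT item_id, quantity * unit_price AS line_total
FROM demo_line_items
WHERE quantity * unit_price > 1000;

-- Fix 2: compute the alias in a CTE, then filter on it
WITH totals AS (
  SELECT item_id, quantity * unit_price AS line_total
  FROM demo_line_items
)
SELECT item_id, line_total
FROM totals
WHERE line_total > 1000;
-- Both fixes return items 2 (1500) and 3 (2000); item 1 (600) is filtered out.
```

Fix 2 is usually the more readable choice when the expression is long or is reused in several places.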

SELECT · FROM · WHERE · GROUP BY · HAVING · ORDER BY · LIMIT · JOIN · LEFT JOIN · INNER JOIN · UNION · DISTINCT · IS NULL · IN · BETWEEN · LIKE · COUNT() · SUM() · AVG() · RANK() · LAG() · LEAD() · OVER() · PARTITION BY · WITH (CTE) · CASE WHEN · COALESCE() · CAST()

Pattern	SQL	Use case
Top N per group	`ROW_NUMBER() OVER(PARTITION BY grp ORDER BY val DESC)`	Top customer per region
Running total	`SUM(val) OVER(ORDER BY date)`	Cumulative revenue
% of total	`val * 100.0 / SUM(val) OVER()`	Product share of revenue
Previous value	`LAG(val, 1) OVER(ORDER BY date)`	MoM growth calculation
Anti-join (NOT IN)	`LEFT JOIN ... WHERE b.id IS NULL`	Customers with no orders
Conditional count	`COUNT(CASE WHEN status='paid' THEN 1 END)`	Count by status in one query
Deduplicate	`ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) = 1`	Keep latest record per id
Pivot (manual)	`SUM(CASE WHEN cat='A' THEN val END) AS cat_a`	Rows to columns
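Several of these patterns are shown as fragments, so here is a minimal sketch of two of them as complete queries. The table demo_payments and its five rows are made up for illustration.

```sql
-- Hypothetical sample table, for illustration only
CREATE TABLE demo_payments (
  payment_id  INTEGER PRIMARY KEY,
  customer_id INTEGER,
  status      VARCHAR(10),
  amount      INTEGER,
  paid_on     DATE
);

INSERT INTO demo_payments VALUES
  (1, 1, 'paid',    100, '2024-01-05'),
  (2, 1, 'failed',   50, '2024-01-09'),
  (3, 2, 'paid',    200, '2024-01-03'),
  (4, 2, 'paid',    300, '2024-01-20'),
  (5, 3, 'pending', 150, '2024-01-11');

-- Conditional count and manual pivot in one pass over the table
SELECT
  COUNT(CASE WHEN status = 'paid'   THEN 1 END)      AS paid_count,
  COUNT(CASE WHEN status = 'failed' THEN 1 END)      AS failed_count,
  SUM(CASE WHEN status = 'paid' THEN amount END)     AS paid_amount
FROM demo_payments;
-- paid_count = 3, failed_count = 1, paid_amount = 600

-- Deduplicate: keep the latest payment per customer
WITH numbered AS (
  SELECT payment_id, customer_id, status, amount, paid_on,
         ROW_NUMBER() OVER (
           PARTITION BY customer_id ORDER BY paid_on DESC
         ) AS rn
  FROM demo_payments
)
SELECT payment_id, customer_id, status, amount, paid_on
FROM numbered
WHERE rn = 1;
-- Keeps payments 2, 4 and 5 (the most recent row for customers 1, 2 and 3)
```

The deduplicate pattern needs the CTE (or a subquery) because a window function cannot be filtered directly in WHERE; the `= 1` in the cheat sheet row is shorthand for this two-step form.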
