Master SQL for
Data Analyst Interviews
Complete SQL guide covering every topic interviewers test β with real company questions, full theory, worked examples, and a live SQL compiler to practice right here in your browser.
| customer_name | total_revenue | rank |
|---|---|---|
| Rahul Sharma | βΉ98,400 | 1 |
| Priya Mehta | βΉ76,200 | 2 |
| Amit Patel | βΉ64,800 | 3 |
What is SQL? β Complete Beginner Guide
SQL (Structured Query Language) is the standard language for managing and querying data stored in relational databases. It is the single most important technical skill for data analysts β required in over 90% of analyst job listings in India and globally.
Why SQL matters: SQL lets you pull, filter, group, join and transform data stored in databases β exactly what data analysts do every single day. Without SQL, you cannot do the job. With SQL, you can answer almost any business question from data.
Relational Databases
SQL works with relational databases like MySQL, PostgreSQL, BigQuery, Snowflake, SQL Server β all widely used at companies hiring analysts.
Tables, Rows & Columns
Data is stored in tables. Each row is a record (e.g. one order). Each column is an attribute (e.g. order_amount, customer_id, date).
Querying Data
You write SQL queries to ask questions: “Which customers spent the most last month?” SQL retrieves exactly the data you need from millions of rows.
Why It’s Tested
Interviews test SQL because it directly mirrors real analyst work. Every analyst query you write on the job is SQL β interviewers want to see you can do it.
SQL vs other tools β where it fits
| Tool | What it does | When analysts use it |
|---|---|---|
| SQL | Query & transform data in databases | Daily β pulling any data from any database |
| Python (Pandas) | Analyse & visualise data in-memory | Complex analysis, ML, automation |
| Excel | Ad-hoc analysis, small datasets | Quick calculations, sharing with non-tech teams |
| Power BI / Tableau | Interactive dashboards & charts | Reporting, presenting to stakeholders |
All SQL Topics for Data Analyst Interviews
Work through these topics in order. Each one builds on the previous. All topics are tested in data analyst interviews β from beginner roles to senior positions at top companies.
SQL Basics & SELECT
SELECT, WHERE, ORDER BY, LIMIT β the foundation of every SQL query.
Start β BeginnerGROUP BY & Aggregations
COUNT, SUM, AVG, MIN, MAX, HAVING β summarise data across groups.
Start β IntermediateSQL JOINs
INNER, LEFT, RIGHT, FULL OUTER β combine data from multiple tables.
Start β IntermediateSubqueries & CTEs
Nested queries and WITH clauses for clean, readable complex SQL.
Start β AdvancedWindow Functions
RANK, ROW_NUMBER, LAG, LEAD, SUM OVER β the hardest and most tested SQL topic.
Start β IntermediateDate & String Functions
DATEDIFF, DATE_TRUNC, SUBSTR, CONCAT β work with real-world messy data.
Start β PracticeReal Interview Questions
50+ questions from Google, Amazon, Flipkart, Swiggy β with full solutions.
Start β PracticeLive SQL Compiler
Write and run SQL queries right here in your browser β no setup needed.
Try now βHow often each topic appears in interviews
SELECT, WHERE, ORDER BY β SQL Basics
Every SQL query starts with SELECT. These fundamentals are the building blocks of all SQL β even complex window function queries use SELECT at their core.
SELECT retrieves data from a table. You specify which columns you want and which table to get them from. The asterisk (*) means “all columns.”
WHERE filters which rows are returned. It works like a condition β only rows where the condition is TRUE are included in the result.
'%phone%' matches “smartphone”, “phone case”, “headphones”. IS NULL checks for missing values β very common in data cleaning questions.ORDER BY sorts results. LIMIT controls how many rows are returned. Together they answer “show me the top N⦔ questions.
SQL JOINs β Complete Guide
JOINs combine data from two or more tables based on a related column. This is the most commonly tested SQL topic at all levels. Understanding all 4 join types is essential.
INNER JOIN
Returns only rows that have a match in both tables. Most common join. Use when you only want records that exist in both.
ON a.id = b.id
LEFT JOIN
Returns all rows from the left table, plus matching rows from the right. Non-matching right rows become NULL. Most used in analytics.
LEFT JOIN b ON a.id = b.id
RIGHT JOIN
Returns all rows from the right table, plus matching rows from the left. Rare in practice β analysts usually rewrite as a LEFT JOIN.
RIGHT JOIN b ON a.id = b.id
FULL OUTER JOIN
Returns all rows from both tables. Non-matching rows from either side become NULL. Use to find records missing from either table.
FULL OUTER JOIN b ON a.id = b.id
GROUP BY & Aggregate Functions
Aggregate functions summarise many rows into a single value. GROUP BY groups rows with the same value so you can aggregate per group β like “total revenue per region.”
| Function | What it does | Example |
|---|---|---|
COUNT(*) | Count all rows (including NULLs) | COUNT(*) AS total_orders |
COUNT(col) | Count non-NULL values in a column | COUNT(order_id) |
SUM(col) | Add up all values in a column | SUM(revenue) AS total_revenue |
AVG(col) | Average of all values | AVG(order_amount) AS avg_order |
MIN(col) | Smallest value | MIN(order_date) AS first_order |
MAX(col) | Largest value | MAX(order_amount) AS biggest_order |
Subqueries & CTEs (WITH Clauses)
When a single SELECT isn’t enough, you nest queries inside queries (subqueries) or break them into named steps (CTEs). CTEs are the cleaner, modern approach that interviewers love to see.
A subquery is a SELECT statement nested inside another. The inner query runs first, then the outer query uses its result.
CTEs (Common Table Expressions) use the WITH keyword to define named temporary tables. They make complex queries readable and are strongly preferred in interviews over deeply nested subqueries.
Window Functions β RANK, LAG, LEAD & More
Window functions are the single most tested SQL topic in senior data analyst interviews. They perform calculations across a set of rows related to the current row β without collapsing the result like GROUP BY does.
Key concept: Window functions do NOT reduce rows like GROUP BY. You get one result row per input row β the window function just adds extra calculated columns alongside the original data.
| emp_name | dept | salary | RANK() | DENSE_RANK() | ROW_NUMBER() |
|---|---|---|---|---|---|
| Rahul | Analytics | βΉ90,000 | 1 | 1 | 1 |
| Priya | Analytics | βΉ90,000 | 1 | 1 | 2 |
| Amit | Analytics | βΉ75,000 | 3 | 2 | 3 |
| Neha | Engineering | βΉ1,20,000 | 1 | 1 | 1 |
| Vikram | Engineering | βΉ95,000 | 2 | 2 | 2 |
| Function | What it does | Key difference |
|---|---|---|
RANK() | Rank rows β ties get same rank, next rank skips | 1, 1, 3 (skips 2) |
DENSE_RANK() | Rank rows β ties get same rank, no skipping | 1, 1, 2 (no skip) |
ROW_NUMBER() | Unique sequential number β no ties ever | 1, 2, 3 always |
LAG(col, n) | Value from n rows before current row | Previous month’s revenue |
LEAD(col, n) | Value from n rows after current row | Next month’s target |
SUM() OVER() | Running total or sum per partition | Cumulative revenue |
AVG() OVER() | Moving average across a window | 7-day rolling average |
NTILE(n) | Divide rows into n equal buckets | Quartiles, deciles |
Live SQL Compiler β Practice Right Here
Write and run real SQL queries in your browser. Pre-loaded with sample datasets matching real interview scenarios. No account or setup needed.
Real SQL Interview Questions with Answers
These exact questions have been asked at Google, Amazon, Flipkart, Swiggy, Paytm and other top companies. Click any question to see the full answer and SQL solution.
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
SELECT salary, DENSE_RANK() OVER(ORDER BY salary DESC) AS rnk
FROM employees
)
SELECT DISTINCT salary FROM ranked WHERE rnk = 2;
SELECT DISTINCT customer_id FROM orders
WHERE YEAR(order_date) = 2024
AND customer_id NOT IN (
SELECT DISTINCT customer_id FROM orders
WHERE YEAR(order_date) = 2023
);
SELECT
department,
employee_name,
salary,
ROW_NUMBER() OVER(
PARTITION BY department
ORDER BY salary DESC
) AS rn
FROM employees
)
SELECT department, employee_name, salary
FROM ranked
WHERE rn = 1;
month,
revenue,
LAG(revenue) OVER(ORDER BY month) AS prev_revenue,
ROUND(
(revenue – LAG(revenue) OVER(ORDER BY month))
* 100.0
/ LAG(revenue) OVER(ORDER BY month),
2
) AS mom_growth_pct
FROM monthly_revenue
WHERE year = 2024;
SELECT DISTINCT user_id, DATE(purchase_date) AS pdate
FROM purchases
WHERE purchase_date >= CURRENT_DATE – 30
),
grouped AS (
SELECT user_id, pdate,
pdate – ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY pdate) AS grp
FROM daily_activity
)
SELECT DISTINCT user_id
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 3;
product_id,
activity_date,
dau,
AVG(dau) OVER (
PARTITION BY product_id
ORDER BY activity_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_7d_avg
FROM daily_active_users
ORDER BY product_id, activity_date;
SQL Cheat Sheet β Interview Quick Reference
The most important SQL syntax, functions and patterns in one place. Save this page and review before every interview.
Query execution order
SQL executes in this order β NOT the order you write it: FROM β JOIN β WHERE β GROUP BY β HAVING β SELECT β ORDER BY β LIMIT. This is why you can’t use a SELECT alias in a WHERE clause β SELECT hasn’t run yet when WHERE executes.
| Pattern | SQL | Use case |
|---|---|---|
| Top N per group | ROW_NUMBER() OVER(PARTITION BY grp ORDER BY val DESC) | Top customer per region |
| Running total | SUM(val) OVER(ORDER BY date) | Cumulative revenue |
| % of total | val * 100.0 / SUM(val) OVER() | Product share of revenue |
| Previous value | LAG(val, 1) OVER(ORDER BY date) | MoM growth calculation |
| Anti-join (NOT IN) | LEFT JOIN ... WHERE b.id IS NULL | Customers with no orders |
| Conditional count | COUNT(CASE WHEN status='paid' THEN 1 END) | Count by status in one query |
| Deduplicate | ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) = 1 | Keep latest record per id |
| Pivot (manual) | SUM(CASE WHEN cat='A' THEN val END) AS cat_a | Rows to columns |
Ready to practise with a real mentor?
Book a free 30-min SQL mock interview. We’ll run through real questions, give you live feedback, and build your personalised prep plan.
Book Free SQL Session