Technical Interview Rounds — BI, Data Modelling & ETL | Data Analyst Interview

Home SQL Python Case Studies Technical Services Blog Book Free Call →

🖥️ Technical Interview Rounds

Master Every Technical Round of the Data Analyst Interview

BI tools, data modelling, ETL concepts, and system thinking. Go into every technical round knowing exactly what to expect — and how to answer with confidence.

Book Mock Technical Round ← Back to Home

Technical round types

50+

Real interview questions

BI & analytics tools covered

95%

Candidate success rate

Round Structure

The 4 technical rounds you will face

Most data analyst interviews beyond SQL and Python include one or more of these technical rounds. Know which ones your target company uses — and prepare accordingly.

Round 01

BI Tools & Dashboarding

Build a dashboard live in Power BI or Tableau, or walk through one you’ve built. Interviewers test whether you can turn raw data into a decision-ready chart with the right filters, KPIs, and drill-downs.

Power BITableauLookerExcelDAX

Explore BI topics →

Round 02

Data Modelling

Design a schema for a given business problem — star schema, fact and dimension tables, normalisation. Common at companies with data warehouses (Redshift, BigQuery, Snowflake).

Star schemaSnowflake schemaERDNormalisation

Explore modelling topics →

Round 03

ETL & Data Pipeline Concepts

Explain how data flows from source to warehouse. You don’t need to write Airflow code, but you must explain ELT vs ETL, staging layers, incremental loads, and common pipeline problems.

ETL vs ELTAirflowdbtIncremental load

Explore ETL topics →

Round 04

System & Analytical Thinking

Design a metrics system for a new product, think through a data quality issue, or propose an analytics architecture. This tests how you reason — not just what you know.

Metric designData qualityAnalytical frameworks

Explore frameworks →

Round 01

BI Tools & Dashboarding

The tools interviewers actually test — and what they look for in each one.

📊Power BI

The most tested BI tool in UK and Indian data analyst interviews. Know how to build a report from scratch in under 20 minutes.

Connecting to CSV, SQL, and Excel
Building visuals — bar, line, card, slicer
DAX — CALCULATE, SUMX, RELATED
Row-level security basics
Publishing to Power BI Service

📈Tableau

Common at product and consulting companies. Interviewers test LOD expressions, calculated fields, and storytelling with data.

Connecting data sources and blending
Dimensions vs measures
LOD expressions (FIXED, INCLUDE, EXCLUDE)
Dual-axis and combined charts
Dashboard actions and filters

🔍Looker / Looker Studio

Increasingly common at tech startups. Looker uses LookML — a modelling layer you’ll be asked to explain conceptually.

Looker Studio — connect and visualise
LookML concepts: explores, views, joins
Dimensions and measures in LookML
Drilling into reports

📋Excel & Google Sheets

Still tested at mid-size companies. Pivot tables, VLOOKUP, and dynamic charts are the big three.

Pivot tables with grouping and % of total
VLOOKUP vs INDEX-MATCH vs XLOOKUP
Conditional formatting for dashboards
Dynamic charts linked to pivot tables

⚡DAX Fundamentals

Power BI’s formula language. Interviewers ask you to write or explain 3–5 DAX measures on the spot.

CALCULATE and filter context
Time intelligence — DATESYTD, SAMEPERIODLASTYEAR
RANKX for ranked measures
DIVIDE for safe division

🎯Dashboard Design

It’s not just about the tool — interviewers evaluate your design thinking: which chart, which KPI, what goes on page 1.

Choosing the right chart type
Designing for the right audience
KPI cards vs trend lines
Colour, contrast, and accessibility

Round 02

Data Modelling

Schema design questions are asked at companies with proper data warehouses. Know the vocabulary and be able to sketch a model from a business problem in 5 minutes.

⭐Star Schema

The default warehouse model — one central fact table surrounded by denormalised dimension tables. Fast for analytics queries.

Fact table — events and measures
Dimension tables — who, what, where, when
Foreign key relationships
Why it’s fast for GROUP BY

❄️Snowflake Schema

A star schema with normalised dimension tables. More space-efficient but adds JOIN complexity — know the trade-offs.

Splitting city → country dimension
Reduced storage vs query complexity
When to use vs star schema

📐Normalisation

1NF, 2NF, 3NF — OLTP databases are normalised. Know why warehouses deliberately denormalise for performance.

1NF — atomic values, no repeating groups
2NF — no partial dependencies
3NF — no transitive dependencies
OLTP vs OLAP trade-off

🔑Keys & Relationships

Primary, foreign, surrogate, and natural keys. Interviewers use these to test whether you understand how tables relate.

Natural vs surrogate keys
Composite keys
One-to-many vs many-to-many
Bridge tables for M2M

🗄️Cloud Warehouses

BigQuery, Snowflake, and Redshift are common in interviews. Understand the architecture even if you haven’t used them daily.

Columnar storage and why it’s fast
Partitioning and clustering
Compute vs storage separation
Materialised views

📊Slowly Changing Dimensions

SCD Type 1, 2, and 3 — how to handle dimension changes over time. A classic senior analyst question.

Type 1 — overwrite (no history)
Type 2 — new row per change (full history)
Type 3 — add column (limited history)
When each type is appropriate

Round 03

ETL & Data Pipeline Concepts

You don’t need to build pipelines to talk about them fluently. These concepts come up in every senior data analyst interview.

🔄ETL vs ELT

ETL transforms before loading (traditional). ELT loads raw first, then transforms in the warehouse (modern cloud approach). Know why the shift happened.

ETL — Informatica, SSIS, Talend
ELT — dbt, BigQuery, Snowflake
Why ELT wins for cloud warehouses

🌊Batch vs Streaming

Batch processes data in scheduled chunks; streaming processes events in near real-time. Know the trade-off for each use case.

Batch — Airflow, cron, Spark
Streaming — Kafka, Kinesis, Flink
Lambda and Kappa architectures

🔁Incremental Loads

Full refresh vs incremental — loading only new or changed rows since the last run. Critical for large tables.

Watermark columns (updated_at)
CDC — Change Data Capture
Upsert patterns (MERGE statement)

🛠️dbt Basics

dbt (data build tool) is the modern standard for ELT transformation. Increasingly mentioned in data analyst job descriptions.

Models — SQL SELECT as a model
ref() for dependency management
Tests — not_null, unique, accepted_values
Materialisation — view vs table

⏰Airflow & Orchestration

Apache Airflow is the most common orchestration tool. Know what a DAG is and how you’d describe the concept in an interview.

DAG — Directed Acyclic Graph
Operators — Python, Bash, SQL
Sensors, dependencies, retries
Monitoring and alerting

✅Data Quality

How do you know your pipeline is producing correct data? Interviewers love this question — most candidates can’t answer it well.

Null checks and row count reconciliation
Great Expectations / dbt tests
Anomaly detection alerts
Data lineage tracking

Round 04

System & Analytical Thinking

These questions have no single right answer — they test how you structure your thinking. Use frameworks and always anchor to business impact.

Metric Design Framework

When asked “how would you measure X?”, follow this structure: define the goal → identify the primary metric → identify guardrail metrics → define the denominator (per user? per session?) → account for segmentation (device, region, user cohort).

North Star MetricCounter metricsSegmentationLeading vs lagging

Root Cause Analysis Framework

For “a metric dropped — diagnose it”: first confirm the data isn’t broken → check if it’s global or segment-specific → rule out external factors → break the metric into its components → form and test hypotheses. Never guess before ruling out data issues.

Data validation firstSegment breakdownFunnel decompositionHypothesis testing

Data Architecture Thinking

For “design an analytics system for X”: identify sources → choose ingestion layer (batch/stream) → define warehouse model (star schema) → define transformation layer (dbt) → define consumption layer (BI tool or API) → define quality checks at each layer.

Ingestion → Storage → Transform → ServeLatency requirementsCost trade-offs

A/B Test Design

Define the hypothesis → choose the primary metric and guardrails → calculate sample size → choose randomisation unit (user, session, device) → define success criteria → plan for novelty effects and leakage → plan the ship or no-ship decision.

HypothesisSample sizeStatistical powerType I/II errors

Stakeholder Communication

A common but underestimated technical question: “how do you present a complex analysis to a non-technical stakeholder?” Lead with the insight, not the method. One chart, one headline. Numbers in context. Recommend a decision.

Insight firstOne chart ruleContext and comparisonDecision recommendation

Technical Round Code Example 1

Designing a metrics layer in SQL

A common take-home task: build a reusable SQL view that feeds a BI dashboard. Interviewers look for clarity, correct grain, and meaningful KPIs.

— Mart: daily revenue KPIs — feeds the Power BI / Tableau dashboard

CREATE OR REPLACE VIEW mart_daily_revenue AS

SELECT

DATE_TRUNC(‘day’, o.created_at) AS order_date,

p.category,

r.region_name,

COUNT(DISTINCT o.order_id) AS orders,

COUNT(DISTINCT o.customer_id) AS unique_customers,

SUM(o.amount) AS gross_revenue,

SUM(o.amount – o.discount) AS net_revenue,

ROUND(SUM(o.amount) / NULLIF(COUNT(DISTINCT o.order_id), 0), 2) AS aov

FROM orders o

JOIN products p ON o.product_id = p.product_id

JOIN regions r ON o.region_id = r.region_id

WHERE o.status = ‘completed’

GROUP BY 1, 2, 3

ORDER BY order_date DESC;

Technical Round Code Example 2

Incremental load pattern with watermark

ETL design question — explain and write the SQL for an incremental load that only processes new or updated rows.

— Incremental load: only insert rows newer than the last run

— Step 1: Get the last loaded timestamp

SET last_run = (

SELECT COALESCE(MAX(updated_at), ‘1970-01-01’)

FROM warehouse.orders

);

— Step 2: Insert only new/changed rows from source

INSERT INTO warehouse.orders

SELECT *

FROM source.orders

WHERE updated_at > last_run;

— In dbt: this becomes a simple incremental model config

— {{ config(materialized=’incremental’, unique_key=’order_id’) }}

— {% if is_incremental() %}

— WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})

— {% endif %}

Real Interview Questions

Technical round questions from top companies in 2026

These questions have been asked at Google, Amazon, Flipkart, Swiggy, and more — grouped by difficulty and topic.

🖥️ BI & Dashboard Questions

Easy

What is the difference between a dimension and a measure in Power BI / Tableau?

Topic: BI fundamentals · Asked at Infosys, Deloitte, EY

Medium

You have daily sales data by product and region. Design a Power BI dashboard for the VP of Sales — describe the pages, visuals, and KPIs you would include.

Topic: Dashboard design · Asked at Amazon, Walmart

Medium

Write a DAX measure to calculate month-to-date revenue compared to the same period last year, expressed as a percentage change.

Topic: DAX / Time intelligence · Asked at Swiggy, Zomato

Hard

Your Power BI dashboard is running very slowly for the finance team. Walk through how you would diagnose and fix the performance issue.

Topic: Performance optimisation · Asked at Razorpay, PhonePe

Hard

Explain what a LOD expression is in Tableau and give an example of when you’d use FIXED vs INCLUDE vs EXCLUDE.

Topic: Tableau LOD · Asked at Google, Flipkart

⭐ Data Modelling Questions

Easy

What is the difference between a fact table and a dimension table? Give a real example from an e-commerce business.

Topic: Star schema · Asked at Meesho, CRED

Medium

Design a data model for a food delivery platform like Swiggy. Identify at least 4 fact and dimension tables, and explain the grain of each fact table.

Topic: Schema design · Asked at Swiggy, Zomato

Medium

What is a Slowly Changing Dimension? Explain Type 1, 2, and 3 with an example using a customer’s address.

Topic: SCD · Asked at TCS, Infosys, Accenture

Hard

When would you use a snowflake schema instead of a star schema? What are the query performance implications?

Topic: Schema trade-offs · Asked at Amazon, Walmart

🔄 ETL & Pipeline Questions

Easy

What is the difference between ETL and ELT? Which is more common in modern cloud data stacks and why?

Topic: ETL fundamentals · Asked at Paytm, Razorpay

Medium

Explain what an incremental load is and how you would implement one. What column would you use to track which rows to load?

Topic: Incremental loading · Asked at Flipkart, Amazon

Medium

Your daily data pipeline failed at 2am and the dashboard shows stale data. Walk me through how you would investigate and fix it.

Topic: Pipeline debugging · Asked at Swiggy, PhonePe

Hard

What is dbt and how does it fit into the modern data stack? Explain models, refs, and tests to someone who hasn’t used it.

Topic: dbt / Modern data stack · Asked at CRED, Zepto, Meesho

Hard

You discover your daily order count in the warehouse is 15% lower than the source system. How do you investigate and fix this data discrepancy?

Topic: Data quality · Asked at Google, Amazon

Tools Overview

Every tool mentioned in technical rounds

You don’t need to master all of these — but you should be able to speak confidently about each one and know when to use it.

📊

Power BI

Microsoft BI. Most common in UK, banking, consulting interviews.

Must Know

📈

Tableau

Popular at product companies and consulting firms globally.

Must Know

🔍

Looker

Google Cloud’s BI tool. Growing in startups and tech companies.

Good to Know

❄️

Snowflake

Cloud warehouse. Star schema, clustering, and compute concepts.

Must Know

🔵

BigQuery

Google’s serverless warehouse. Common at fintech and GCP shops.

Good to Know

🛠️

dbt

SQL transformation tool. Now mentioned in most modern DA job specs.

Must Know

⏰

Airflow

Pipeline orchestration. Know DAGs conceptually — not just for engineers.

Good to Know

🌊

Kafka

Streaming platform. Understand publish-subscribe pattern conceptually.

Awareness

Expert Answers

Model answers to the hardest technical questions

These are the questions candidates struggle with most — with the structured answers that impress interviewers.

How would you design a data model for an e-commerce company from scratch?

Start by identifying the business processes: orders, product browsing, payments, returns. For each, define a fact table at the right grain — e.g. orders fact table at order-line level. Then build dimension tables: customer_dim (who), product_dim (what), date_dim (when), location_dim (where). Connect with foreign keys in a star schema. Add slowly changing dimension logic on customer_dim for address changes (Type 2). The result handles 90% of business questions without complex joins.

A key dashboard metric is showing incorrect numbers. How do you investigate?

First, rule out the dashboard itself — check the underlying query directly against the database. Then trace the pipeline: did the last load succeed? Check row counts at each stage — source → staging → warehouse → BI layer. Compare today’s numbers to yesterday’s to spot the change point. Check for schema changes in the source that may have broken parsing. If counts match but values differ, look for currency/unit changes, filter logic errors, or duplicate rows from a faulty join.

Explain the difference between OLTP and OLAP databases.

OLTP (Online Transaction Processing) databases handle thousands of small, fast reads and writes — your app’s operational database. Highly normalised to avoid data duplication. OLAP (Online Analytical Processing) databases are optimised for complex queries across millions of rows — your data warehouse. Columnar storage, denormalised star schema, and query engines optimised for aggregations rather than row lookups.

What is the grain of a fact table and why does it matter?

Grain is the most atomic level of detail stored in the fact table — one row represents exactly one event. For orders, the grain could be “one order line item” (most granular) or “one order” (aggregate). The grain determines what questions you can answer: order-line grain lets you analyse by product within an order; order grain doesn’t. You should always define the grain before designing the schema — mismatched grain is the root cause of most warehouse modelling bugs.

What is dbt and why has it become the industry standard?

dbt (data build tool) is a transformation framework that lets analysts write SQL SELECT statements as models, and handles everything else — dependency resolution (ref()), testing, documentation, and version control integration. It popularised the ELT pattern and brought software engineering best practices (git, CI/CD, testing) into analytics. Hiring managers now see dbt on job specs because it means your SQL transformations are tested, documented, and reproducible — not just scripts saved on someone’s laptop.

How do you ensure data quality in a production pipeline?

Implement checks at three layers. In ingestion: row count reconciliation, null checks on critical columns, duplicate detection. In transformation: dbt tests (not_null, unique, accepted_values, relationships) run on every model. In consumption: anomaly detection alerts — if daily orders drop more than 20% without a known cause, alert the team before the dashboard misleads a business decision. Document expected ranges and set up Slack or email alerts for SLA breaches.

Quick Reference

Technical rounds terminology cheat sheet

The terms you must be able to define clearly in a technical interview — no hesitation.

Term	What it means	Where it comes up
Star Schema	Central fact table surrounded by denormalised dimension tables	Data modelling round
Grain	The most atomic event represented by one row in a fact table	Data modelling round
SCD Type 2	Track historical changes in a dimension by adding a new row per change	Data modelling round
ETL	Extract, Transform, Load — transform before loading to warehouse	ETL round
ELT	Extract, Load, Transform — load raw, transform inside warehouse	ETL round
Incremental load	Load only new/changed rows since the last pipeline run	ETL round
DAG	Directed Acyclic Graph — Airflow’s model for pipeline task dependencies	ETL round
dbt model	A SQL SELECT statement that dbt materialises as a table or view	ETL round
Columnar storage	Stores data by column rather than row — fast for aggregation queries	Warehouse / modelling
DAX CALCULATE	Power BI formula that changes the filter context of a measure	BI tools round
LOD expression	Tableau formula that controls which dimensions define the aggregation level	BI tools round
Partition pruning	BigQuery/Snowflake skips irrelevant partitions — reduces query cost	Cloud warehouse round

Prep Plan

How to prepare for technical rounds in 2 weeks

A focused 2-week plan to go from “I’ve heard of these tools” to “I can answer any technical round question confidently.”

Days 1–3: BI Tools

Install Power BI Desktop (free). Load a CSV dataset and build a complete report: KPI cards, bar chart by category, line trend, and a slicer. Publish to Power BI Service. Write 3 DAX measures: total revenue, MoM growth, and a RANKX measure.

Power BI DesktopDAX measuresPublish report

Days 4–5: Data Modelling

Draw a star schema for two businesses: an e-commerce platform and a ride-hailing app. Identify fact tables, dimension tables, grain, and foreign keys. Practice explaining SCD Type 2 out loud using a customer address example.

Star schema diagramsSCD Type 2Grain definition

Days 6–8: ETL Concepts

Read the dbt documentation introduction and set up a free dbt Cloud project. Understand ETL vs ELT, incremental load patterns, and what a DAG is in Airflow. Practice explaining each concept in 60 seconds — no jargon.

dbt Cloud free tierETL vs ELTIncremental load SQL

Days 9–11: Practice Questions

Answer all 14 questions from the question bank above — out loud, timed at 3 minutes per answer. Record yourself and review. Focus on questions where you hesitate or use filler words instead of structured answers.

14 questions out loud3 minutes eachSelf-review recording

Days 12–14: Mock Interview

Book a mock technical round with our mentors. We simulate a real 45-minute technical interview — BI, data modelling, and ETL questions — with live feedback on your answers and a written report of exactly what to improve before your actual interview.

45-min mock interviewLive feedbackWritten improvement report

Ready to crack your technical round?

Book a free mock technical interview with our expert mentors — BI, data modelling, ETL, and system thinking all in one session.

Book Free Technical Mock