case study for sql

8 Week SQL Challenge

Start your SQL learning journey today!

  • Case Study #1 - Danny's Diner

Danny Ma · May 1, 2021

Introduction

Danny seriously loves Japanese food so in the beginning of 2021, he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen.

Danny’s Diner is in need of your assistance to help the restaurant stay afloat - the restaurant has captured some very basic data from its few months of operation but has no idea how to use that data to help run the business.

Problem Statement

Danny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.

He plans on using these insights to help him decide whether he should expand the existing customer loyalty program - additionally he needs help to generate some basic datasets so his team can easily inspect the data without needing to use SQL.

Danny has provided you with a sample of his overall customer data due to privacy issues - but he hopes that these examples are enough for you to write fully functioning SQL queries to help him answer his questions!

Danny has shared with you 3 key datasets for this case study:

  • sales
  • menu
  • members

You can inspect the entity relationship diagram and example data below.

Entity Relationship Diagram

Example Datasets

All datasets exist within the dannys_diner database schema - be sure to include this reference within your SQL scripts as you start exploring the data and answering the case study questions.

Table 1: sales

The sales table captures all customer_id level purchases, with the corresponding order_date and product_id showing when and which menu items were ordered.

Table 2: menu

The menu table maps the product_id to the actual product_name and price of each menu item.

Table 3: members

The final members table captures the join_date when a customer_id joined the beta version of the Danny’s Diner loyalty program.

Interactive SQL Session

You can use the embedded DB Fiddle below to easily access these example datasets - this interactive session has everything you need to start solving these questions using SQL.

You can click on the Edit on DB Fiddle link on the top right hand corner of the embedded session below and it will take you to a fully functional SQL editor where you can write your own queries to analyse the data.

Feel free to choose any SQL dialect you’d like to use - the existing Fiddle uses PostgreSQL 13 by default.

Serious SQL students have access to a dedicated SQL script in the 8 Week SQL Challenge section of the course which they can use to generate relevant temporary tables like we’ve done throughout the entire course!

Case Study Questions

Each of the following case study questions can be answered using a single SQL statement:

  • What is the total amount each customer spent at the restaurant?
  • How many days has each customer visited the restaurant?
  • What was the first item from the menu purchased by each customer?
  • What is the most purchased item on the menu and how many times was it purchased by all customers?
  • Which item was the most popular for each customer?
  • Which item was purchased first by the customer after they became a member?
  • Which item was purchased just before the customer became a member?
  • What is the total items and amount spent for each member before they became a member?
  • If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?
  • In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?
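As a starting point for the first and ninth questions, here is a minimal sketch that runs the SQL through Python's built-in sqlite3 module. The rows and prices are toy values invented for illustration, not Danny's real data, and SQLite has no dannys_diner schema prefix, so the table names are used bare. The column names come from the table descriptions above.

```python
import sqlite3

# Toy data only: these rows and prices are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
CREATE TABLE menu  (product_id INTEGER, product_name TEXT, price INTEGER);
INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO sales VALUES
  ('A', '2021-01-01', 1), ('A', '2021-01-01', 2), ('A', '2021-01-07', 3),
  ('B', '2021-01-02', 2), ('B', '2021-01-04', 1);
""")

# Total spend per customer, plus points: $1 = 10 points, sushi earns 2x.
spend_and_points = conn.execute("""
SELECT s.customer_id,
       SUM(m.price) AS total_spent,
       SUM(CASE WHEN m.product_name = 'sushi' THEN m.price * 20
                ELSE m.price * 10 END) AS points
FROM sales AS s
JOIN menu AS m ON m.product_id = s.product_id
GROUP BY s.customer_id
ORDER BY s.customer_id
""").fetchall()

for row in spend_and_points:
    print(row)
```

The same join between sales and menu underlies most of the other questions; only the aggregate or the filter changes.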

Bonus Questions

Join All The Things

The following questions relate to creating basic data tables that Danny and his team can use to quickly derive insights without needing to join the underlying tables using SQL.

Recreate the following table output using the available data:

Rank All The Things

Danny also requires further information about the ranking of customer products, but he purposely does not need the ranking for non-member purchases so he expects null ranking values for the records when customers are not yet part of the loyalty program.
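One possible way to get null rankings for non-member purchases is to wrap a window function in a CASE expression. The sketch below is an assumption-laden illustration, not the official solution: the rows are toy data, the is_member and ranking column names are hypothetical, and it relies on SQLite 3.25 or later for window function support.

```python
import sqlite3

# Toy data only; column names follow the table descriptions above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales   (customer_id TEXT, order_date TEXT, product_id INTEGER);
CREATE TABLE menu    (product_id INTEGER, product_name TEXT, price INTEGER);
CREATE TABLE members (customer_id TEXT, join_date TEXT);
INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO members VALUES ('A', '2021-01-07');
INSERT INTO sales VALUES
  ('A', '2021-01-01', 1), ('A', '2021-01-07', 2), ('A', '2021-01-10', 3),
  ('B', '2021-01-02', 2);
""")

ranked = conn.execute("""
WITH joined AS (
  SELECT s.customer_id, s.order_date, m.product_name,
         -- A NULL join_date (non-member) falls through to 'N'.
         CASE WHEN s.order_date >= mem.join_date THEN 'Y' ELSE 'N' END AS is_member
  FROM sales AS s
  JOIN menu AS m ON m.product_id = s.product_id
  LEFT JOIN members AS mem ON mem.customer_id = s.customer_id
)
SELECT customer_id, order_date, product_name, is_member,
       -- Rank member purchases only; the CASE yields NULL otherwise.
       CASE WHEN is_member = 'Y'
            THEN DENSE_RANK() OVER (PARTITION BY customer_id, is_member
                                    ORDER BY order_date)
       END AS ranking
FROM joined
ORDER BY customer_id, order_date, product_name
""").fetchall()

for row in ranked:
    print(row)
```

Partitioning by both customer and membership flag keeps pre-membership rows from consuming rank numbers.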

It’s highly recommended to save all of your code in a separate IDE or text editor as you are trying to solve the problems in the provided SQL Fiddle instance above!

If you’d like to use this case study for one of your portfolio projects or in a personal blog post - please remember to link back to this URL and also don’t forget to share some LinkedIn updates using the #8WeekSQLChallenge hashtag and remember to tag me!

Ready for the next 8 Week SQL challenge case study? Click on the banner below to get started with case study #2!

I really hope you enjoyed this fun little case study - it definitely was fun for me to create!

Official Solutions

If you’d like to see the official code solutions and explanations for this case study and a whole lot more, please consider joining me for the Serious SQL course - you’ll get access to all course materials and I’m on hand to answer all of your additional SQL questions directly!

Serious SQL is priced at $49USD and $29 for students and includes access to all written course content, community events as well as live and recorded SQL training videos!

Please send an email to [email protected] from your educational email or include your enrolment details or student identification for a speedy response!

The following topics relevant to the Danny’s Diner case study are covered in lots of depth in the Serious SQL course:

  • Common Table Expressions
  • Group By Aggregates
  • Window Functions for ranking
  • Table Joins

Don’t forget to review the comprehensive list of SQL resources I’ve put together for the 8 Week SQL Challenge on the Resources page!

Community Solutions

This section will be updated in the future with any community member solutions with a link to their respective GitHub repos!

Final Thoughts

The 8 Week SQL Challenge is proudly brought to you by me - Danny Ma and the Data With Danny virtual data apprenticeship program.

Students or anyone undertaking further studies are eligible for a $20USD student discount off the price of Serious SQL - please send an email to [email protected] from your education email or include information about your enrolment for a fast response!

We have a large student community active on the official DWD Discord server with regular live events, trainings and workshops available to all Data With Danny students, plus early discounted access to all future paid courses.

There are also opportunities for 1:1 mentoring, resume reviews, interview training and more from myself or others in the DWD Mentor Team.

From your friendly data mentor, Danny :)

All 8 Week SQL Challenge Case Studies

All of the 8 Week SQL Challenge case studies can be found below:

  • Case Study #2 - Pizza Runner
  • Case Study #3 - Foodie-Fi
  • Case Study #4 - Data Bank
  • Case Study #5 - Data Mart
  • Case Study #6 - Clique Bait
  • Case Study #7 - Balanced Tree Clothing Co.
  • Case Study #8 - Fresh Segments

Top SQL Case Study Interview Questions in 2024

What Is a SQL Case Study?

The majority of SQL interview questions are straightforward. You may be asked for definitions, or to write a clearly defined SQL query.

But SQL case study questions are an entirely different beast.

These questions usually start with a hypothetical business or product issue, e.g. unsubscribe rates are rising. Then, you have to define what metrics could be used to investigate the problem, and then write the query to produce those metrics.

One of the best ways to prepare for SQL case study interviews is to walk through solutions step-by-step. This will show you how to think about metrics in hypotheticals, as well as how to walk interviewers through your logic.

We’ve done that here, with two breakdowns of SQL case questions with clear solutions.

Example SQL Case Question: Unsubscribe Rates

Many SQL case study questions will ask you to investigate correlation. In this example SQL case question, we’re looking into this issue: Unsubscribe rates have increased after a new notification system has been introduced.

Twitter wants to roll out more push notifications to users because they think users are missing out on good content. Twitter decides to do this in an A/B test.

Say that after more notifications are released, there is a sudden increase in the total number of unsubscribes.

We’re given two tables: events, where actions are ‘login’, ‘nologin’, and ‘unsubscribe’, and another table called variants, where users are bucketed into the control or variant group of the A/B test.

Given these tables, write a query to display a graph to understand how unsubscribes are affecting login rates over time.

Note: Let’s say that all users are automatically put into the A/B test.

events table

variants table

Step 1: Start Each SQL Case Study by Making Assumptions

This question asks us to compare multiple variables at play here. Specifically, we’re looking at:

  • There is a new notification system.
  • We’re interested in the effect the new notifications are having on unsubscribes.

We’re not sure how unsubscribes are affecting login rates, but we can plot a graph that would help us visualize how the login rates change before and after an unsubscribe from a user.

We can also see how the login rates compare for unsubscribes for each bucket of the A/B test. Given that we want to measure two different changes, we have to eventually do a GROUP BY of two different variables:

  • Day relative to the unsubscribe
  • Bucket variant

Step 2: Develop a Hypothesis for the SQL Case Question

In order to visualize this, we’ll need to plot two lines on a 2D graph.

  • The x-axis represents days until unsubscribing with a range of -30 to 0 to 30, in which -30 is thirty days before unsubscribing and 30 is 30 days after unsubscribing.
  • The y-axis represents the average login rate for each day. We’ll be plotting two lines, one for each of the A/B test variants: control and test.

Now that we have what we’re going to graph, it’s a matter of writing a SQL query to get the dataset for the graph.

We can make sure our dataset looks something like this:

Each column represents a different axis or line for our graph.

Step 3: SQL Coding + Analysis

We know that we have to get every user that has unsubscribed, so we’ll first INNER JOIN the variants table to the events table, where there exists an unsubscribe event. Now we’ve isolated all users that have ever unsubscribed.

Additionally, we have to then get every event in which the user has logged in, and divide it by the total number of users that are eligible within the timeframe.
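The steps above can be sketched as follows, using Python's sqlite3 module. This is one possible shape of the query, not the reference solution: the column names (user_id, created_at, action, variant) are assumptions because the table schemas are not reproduced here, the rows are toy data, and the login rate assumes each user logs exactly one 'login' or 'nologin' event per day.

```python
import sqlite3

# Hypothetical schema and toy rows, labelled as assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events   (user_id INTEGER, created_at TEXT, action TEXT);
CREATE TABLE variants (user_id INTEGER, variant TEXT);
INSERT INTO variants VALUES (1, 'control'), (2, 'test'), (3, 'test');
INSERT INTO events VALUES
  (1, '2021-01-09', 'login'),   (1, '2021-01-10', 'unsubscribe'), (1, '2021-01-11', 'nologin'),
  (2, '2021-01-09', 'login'),   (2, '2021-01-10', 'unsubscribe'), (2, '2021-01-11', 'login'),
  (3, '2021-01-09', 'nologin'), (3, '2021-01-10', 'unsubscribe'), (3, '2021-01-11', 'nologin');
""")

login_rates = conn.execute("""
WITH unsub AS (            -- first unsubscribe date per user
  SELECT user_id, MIN(DATE(created_at)) AS unsub_date
  FROM events
  WHERE action = 'unsubscribe'
  GROUP BY user_id
),
daily AS (                 -- one row per login/nologin event of those users
  SELECT v.variant,
         CAST(julianday(DATE(e.created_at)) - julianday(u.unsub_date) AS INTEGER)
           AS days_since_unsub,
         CASE WHEN e.action = 'login' THEN 1.0 ELSE 0.0 END AS logged_in
  FROM unsub AS u
  JOIN variants AS v ON v.user_id = u.user_id
  JOIN events   AS e ON e.user_id = u.user_id
  WHERE e.action IN ('login', 'nologin')
)
SELECT variant, days_since_unsub, AVG(logged_in) AS login_rate
FROM daily
WHERE days_since_unsub BETWEEN -30 AND 30
GROUP BY variant, days_since_unsub
ORDER BY variant, days_since_unsub
""").fetchall()

for row in login_rates:
    print(row)
```

Each output row is one point on the graph: variant picks the line, days_since_unsub is the x value and login_rate is the y value.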

Example SQL Case Question: LinkedIn Job Titles

Many SQL case questions require creativity to solve. You’re given a hypothesis, but then have to determine how to prove or disprove it with specific metrics. The key here is walking the interviewer through your thought process. This example SQL case question from LinkedIn explores user career paths.

We’re given a table of user experiences representing each person’s past work experiences and timelines.

Specifically, let’s say we’re interested in analyzing the career paths of data scientists. The titles we care about are bucketed into data scientist, senior data scientist, and data science manager.

We’re interested in determining if a data scientist who switches jobs more often ends up getting promoted to a manager role faster than a data scientist that stays at one job for longer.

Write a query to prove or disprove this hypothesis.

user_experiences table

Step 1: Make Assumptions about the SQL Case Question

The hypothesis is that data scientists that end up switching jobs more often get promoted faster.

Therefore, in analyzing this dataset, we can prove this hypothesis by separating the data scientists into specific segments based on how often they shift in their careers.

For example, if we look at the number of job switches for data scientists that have been in their field for five years, we could prove the hypothesis if the number of data science managers increased along with the number of career jumps.

Here’s what that might look like:

  • Never switched jobs: 10% are managers
  • Switched jobs once: 20% are managers
  • Switched jobs twice: 30% are managers
  • Switched jobs three times: 40% are managers

We could look at this over different buckets of time as well to see if the correlation stays consistent after 10 or 15 years in a data science career.

This analysis is reasonable, except that it doesn’t account for the intention of the data scientist. What happens if the data scientist never wanted to become a manager?

Step 2: Come up with a Hypothesis for the SQL Case Question

There’s one flaw in the assumption there. It doesn’t account for the intention of the data scientist. It doesn’t answer the question: What happens if the data scientist didn’t ever want to become a manager?

One way to solve this is to do the analysis backwards.

We can subset all of the existing data science managers and see how often they ended up switching jobs before they got to their first manager position.

Then, for each number of job switches, average the amount of time it took them to reach the manager position. This way, we can end up with a result that looks like this:

  • Job switches: 1 - Average months to promotion: 50
  • Job switches: 2 - Average months to promotion: 46
  • Job switches: 3 - Average months to promotion: 44

But there is a fault with this analysis as well. What about all those data scientists that have switched jobs / not switched jobs but haven’t become managers yet? They could be one month away from being a manager and be subsetted out of our analysis!

We have to then make some assumptions about the distribution of existing data science managers.

Are the years of experience before they became managers normally distributed? If not, then our results might be a bit biased from our hindsight analysis.

Step 3: Write the SQL Case Query

We first make a CTE called manager_promo with all the user_ids that have been promoted to data science managers.

Next, we count the number of job switches before getting promoted as num_jobs_switched.

Then, we calculate the number of months before promotion to the data science manager position as month_to_promo.

Finally, we order by the number of jobs switched.
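The four steps above might look like the sketch below, run through Python's sqlite3 module. It is an illustration under stated assumptions rather than the original query: the user_experiences columns (user_id, title, company, start_date) and the rows are hypothetical, a job switch is counted as a change of company before the first manager role, and months are whole calendar months.

```python
import sqlite3

# Hypothetical columns and toy rows; the real table may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_experiences (user_id INTEGER, title TEXT, company TEXT, start_date TEXT);
INSERT INTO user_experiences VALUES
  (1, 'data scientist',        'Acme',    '2018-01-01'),
  (1, 'senior data scientist', 'Acme',    '2019-01-01'),
  (1, 'data science manager',  'Acme',    '2020-01-01'),
  (2, 'data scientist',        'Acme',    '2018-01-01'),
  (2, 'data scientist',        'Globex',  '2018-07-01'),
  (2, 'data science manager',  'Globex',  '2019-07-01'),
  (3, 'data scientist',        'Initech', '2018-01-01');  -- never promoted
""")

promotions = conn.execute("""
WITH manager_promo AS (      -- users who reached a manager role, and when
  SELECT user_id, MIN(start_date) AS promotion_date
  FROM user_experiences
  WHERE title = 'data science manager'
  GROUP BY user_id
)
SELECT e.user_id,
       COUNT(DISTINCT e.company) - 1 AS num_jobs_switched,
       (CAST(strftime('%Y', mp.promotion_date) AS INTEGER)
          - CAST(strftime('%Y', MIN(e.start_date)) AS INTEGER)) * 12
       + (CAST(strftime('%m', mp.promotion_date) AS INTEGER)
          - CAST(strftime('%m', MIN(e.start_date)) AS INTEGER)) AS month_to_promo
FROM user_experiences AS e
JOIN manager_promo AS mp ON mp.user_id = e.user_id
WHERE e.start_date < mp.promotion_date   -- only roles held before promotion
GROUP BY e.user_id, mp.promotion_date
ORDER BY num_jobs_switched, e.user_id
""").fetchall()

for row in promotions:
    print(row)
```

User 3 never appears in the output, which is exactly the survivorship gap discussed in Step 2.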

Step 4: Perform Analysis and Make Conclusions

Hint: Talk about any conclusions you could draw from your data, but also be prepared to talk about trade-offs and potential flaws.

With the query result, we can draw conclusions about the months it took each distinct user to be promoted to data science manager.

Be warned this solution is not perfect. The edge cases where users never become promoted to data science managers are not considered.

Finally, many adjustments, like creating buckets for different ranges of months (0-20 months to promotion, 20-40 months to promotion, etc.), can present a more digestible, high-level analysis on whether frequent job changes affect promotion opportunities to the data science manager position.

Each bucket would correspond to the average time it took the users in that bucket to be promoted to a data science manager position.

Learn more about SQL questions

This course is designed to help you learn everything you need to know about working with data, from basic concepts to more advanced techniques.

More SQL Resources to Ace Your Interview

If you have an interview coming up, review Interview Query’s data science course, which includes modules in SQL.

SQL interviews are demanding, and the more you practice all types of SQL interview questions and not just case questions, the more confident and efficient you’ll become in answering them.

DEV Community

yaswanthteja

Posted on Oct 12, 2022

8 Week SQL Challenge: Case Study #2 Pizza Runner

Introduction

Danny was scrolling through his Instagram feed when something really caught his eye — “80s Retro Styling 🎸 and Pizza 🍕 Is The Future!”

Danny was sold on the idea, but he knew that pizza alone was not going to help him get seed funding to expand his new Pizza Empire — so he had one more genius idea to combine with it — he was going to Uberize it — and so Pizza Runner was launched!

Danny started by recruiting “runners” to deliver fresh pizza from Pizza Runner Headquarters (otherwise known as Danny’s house) and also maxed out his credit card to pay freelance developers to build a mobile app to accept orders from customers.

Table Relationship

  • customer_orders: Customers’ pizza orders with 1 row each for individual pizza with topping exclusions and extras, and order time.
  • runner_orders: Orders assigned to runners documenting the pickup time, distance and duration from Pizza Runner HQ to customer, and cancellation remark.
  • runners: Runner IDs and registration date
  • pizza_names: Pizza IDs and name
  • pizza_recipes: Pizza IDs and topping names
  • pizza_toppings: Topping IDs and name

Case Study Questions

This case study has LOTS of questions — they are broken up by area of focus including:

  • A. Pizza Metrics
  • B. Runner and Customer Experience
  • C. Ingredient Optimisation
  • D. Pricing and Ratings
  • E. Bonus DML Challenges (DML = Data Manipulation Language)

Data Cleaning and Transformation

Before starting on the solutions, I investigated the data and found that some cleaning and transformation is needed, specifically on the

  • null values and data types in the customer_orders table
  • null values and data types in the runner_orders table
  • Alter data type in pizza_names table

Firstly, to clean up exclusions and extras in the customer_orders — we create TEMP TABLE #customer_orders and use CASE WHEN.

Then, we clean the runner_orders table with CASE WHEN and TRIM and create TEMP TABLE #runner_orders .

In summary,

  • pickup_time — Remove nulls and replace with ‘ ‘
  • distance — Remove ‘km’ and nulls
  • duration — Remove ‘minutes’ and nulls
  • cancellation — Remove NULL and null and replace with ‘ ‘

Then, we alter the columns to their correct data types.

  • pickup_time to DATETIME type
  • distance to FLOAT type
  • duration to INT type
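The runner_orders clean-up described above can be sketched as below. This is a hedged illustration in SQLite via Python's sqlite3 module, not the post's original SQL Server code: the sample rows and the runner_orders_clean, distance_km and duration_min names are hypothetical, and missing numeric values are kept as NULL instead of blank strings so the casts stay valid.

```python
import sqlite3

# Toy rows imitating the messy values described above (assumed, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runner_orders (order_id INTEGER, runner_id INTEGER, pickup_time TEXT,
                            distance TEXT, duration TEXT, cancellation TEXT);
INSERT INTO runner_orders VALUES
  (1, 1, '2020-01-01 18:15:34', '20km',    '32 minutes', ''),
  (2, 1, '2020-01-01 19:10:54', '13.4 km', '20 mins',    NULL),
  (3, 2, 'null',                'null',    'null',       'Restaurant Cancellation'),
  (4, 3, '2020-01-08 21:30:45', '10',      '15minute',   'null');

CREATE TEMP TABLE runner_orders_clean AS
SELECT order_id, runner_id,
       -- The literal string 'null' becomes a real NULL.
       CASE WHEN pickup_time IS NULL OR pickup_time = 'null' THEN NULL
            ELSE pickup_time END AS pickup_time,
       -- Strip the unit, trim spaces, then cast to a number.
       CASE WHEN distance IS NULL OR distance = 'null' THEN NULL
            ELSE CAST(TRIM(REPLACE(distance, 'km', '')) AS REAL) END AS distance_km,
       CASE WHEN duration IS NULL OR duration = 'null' THEN NULL
            ELSE CAST(TRIM(REPLACE(REPLACE(REPLACE(duration,
                 'minutes', ''), 'minute', ''), 'mins', '')) AS INTEGER) END AS duration_min,
       -- Blank string means the order was not cancelled.
       CASE WHEN cancellation IS NULL OR cancellation IN ('null', '') THEN ''
            ELSE cancellation END AS cancellation
FROM runner_orders;
""")

cleaned = conn.execute(
    "SELECT * FROM runner_orders_clean ORDER BY order_id"
).fetchall()

for row in cleaned:
    print(row)
```

Casting inside the SELECT produces typed columns in one pass, so no separate ALTER step is needed in this variant.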

Now that the data has been cleaned and transformed, let’s move on to solving the questions! 😉

A. Pizza Metrics

How many pizzas were ordered?

  • A total of 14 pizzas were ordered.

How many unique customer orders were made?

  • There were 10 unique customer orders.

How many successful orders were delivered by each runner?

  • Runner 1 has 4 successfully delivered orders.
  • Runner 2 has 3 successfully delivered orders.
  • Runner 3 has 1 successfully delivered order.

How many of each type of pizza was delivered?

  • There are 9 delivered Meatlovers pizzas.
  • There are 3 delivered Vegetarian pizzas.

How many Vegetarian and Meatlovers were ordered by each customer?

  • Customer 101 ordered 2 Meatlovers pizzas and 1 Vegetarian pizza.
  • Customer 102 ordered 2 Meatlovers pizzas and 2 Vegetarian pizzas.
  • Customer 103 ordered 3 Meatlovers pizzas and 1 Vegetarian pizza.
  • Customer 104 ordered 1 Meatlovers pizza.
  • Customer 105 ordered 1 Vegetarian pizza.

What was the maximum number of pizzas delivered in a single order?

  • The maximum number of pizzas delivered in a single order is 3.

For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

  • Customers 101 and 102 like their pizzas per the original recipe.
  • Customers 103, 104 and 105 have their own preferences for pizza toppings and requested at least 1 change (extra or exclusion topping) on their pizza.

How many pizzas were delivered that had both exclusions and extras?

  • Only 1 delivered pizza had both extra and exclusion toppings. That’s one fussy customer!

What was the total volume of pizzas ordered for each hour of the day?

  • The highest volume of pizzas ordered is at 13 (1:00 pm), 18 (6:00 pm) and 21 (9:00 pm).
  • The lowest volume of pizzas ordered is at 11 (11:00 am), 19 (7:00 pm) and 23 (11:00 pm).

What was the volume of orders for each day of the week?

  • There are 5 pizzas ordered on each of Friday and Monday.
  • There are 3 pizzas ordered on Saturday.
  • There is 1 pizza ordered on Sunday.
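Since the original query screenshots are not reproduced here, the sketch below shows how the first three Pizza Metrics questions could be answered, using Python's sqlite3 module. The rows are toy data (so the counts differ from the answers above), and it assumes the cleaned tables use a blank cancellation value for successful deliveries.

```python
import sqlite3

# Toy rows; a blank cancellation marks a successful delivery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_orders (order_id INTEGER, customer_id INTEGER, pizza_id INTEGER);
CREATE TABLE runner_orders   (order_id INTEGER, runner_id INTEGER, cancellation TEXT);
INSERT INTO customer_orders VALUES
  (1, 101, 1), (2, 101, 1), (3, 102, 1), (3, 102, 2),
  (4, 103, 1), (4, 103, 1), (4, 103, 2);
INSERT INTO runner_orders VALUES
  (1, 1, ''), (2, 1, ''), (3, 2, ''), (4, 2, 'Restaurant Cancellation');
""")

# One row per pizza, so COUNT(*) is the number of pizzas ordered.
total_pizzas = conn.execute(
    "SELECT COUNT(*) FROM customer_orders"
).fetchone()[0]

# An order can contain several pizzas, hence DISTINCT.
unique_orders = conn.execute(
    "SELECT COUNT(DISTINCT order_id) FROM customer_orders"
).fetchone()[0]

# Successful deliveries per runner.
deliveries = conn.execute("""
SELECT runner_id, COUNT(*) AS successful_orders
FROM runner_orders
WHERE cancellation = ''
GROUP BY runner_id
ORDER BY runner_id
""").fetchall()

print(total_pizzas, unique_orders, deliveries)
```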

B. Runner and Customer Experience

How many runners signed up for each 1 week period? (i.e. week starts 2021-01-01)

  • In Week 1 of Jan 2021, 2 new runners signed up.
  • In each of Weeks 2 and 3 of Jan 2021, 1 new runner signed up.

What was the average time in minutes it took for each runner to arrive at the Pizza Runner HQ to pickup the order?

  • The average time taken by runners to arrive at Pizza Runner HQ to pick up the order is 15 minutes.

Is there any relationship between the number of pizzas and how long the order takes to prepare?

  • On average, a single pizza order takes 12 minutes to prepare.
  • An order with 3 pizzas takes 30 minutes at an average of 10 minutes per pizza.
  • It takes 16 minutes to prepare an order with 2 pizzas, which is 8 minutes per pizza, making 2 pizzas in a single order the most efficient.

What was the average distance travelled for each customer?

(Assuming that distance is calculated from Pizza Runner HQ to customer’s place)

  • Customer 104 lives nearest to Pizza Runner HQ at an average distance of 10km, whereas Customer 105 lives furthest away at 25km.

What was the difference between the longest and shortest delivery times for all orders?

Firstly, let’s see all the durations for the orders.

Then, we find the difference by deducting the shortest (MIN) from the longest (MAX) delivery times.

  • The difference between the longest (40 minutes) and shortest (10 minutes) delivery time for all orders is 30 minutes.

What was the average speed for each runner for each delivery and do you notice any trend for these values?

(Average speed = Distance in km / Duration in hour)

  • Runner 1’s average speed ranges from 37.5km/h to 60km/h.
  • Runner 2’s average speed ranges from 35.1km/h to 93.6km/h. Danny should investigate Runner 2, as the fastest delivery is roughly 2.7 times the speed of the slowest!
  • Runner 3’s average speed is 40km/h.

What is the successful delivery percentage for each runner?

  • Runner 1 has 100% successful delivery.
  • Runner 2 has 75% successful delivery.
  • Runner 3 has 50% successful delivery.

(It’s not right to attribute successful delivery to runners as order cancellations are out of the runner’s control.)
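The speed formula and the delivery percentage above can be sketched as follows with Python's sqlite3 module. The runner_orders_clean table, its distance_km and duration_min columns, and the rows are hypothetical stand-ins for the cleaned data, so the numbers will not match the answers above.

```python
import sqlite3

# Toy cleaned data: numeric distance/duration, blank cancellation = delivered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runner_orders_clean (order_id INTEGER, runner_id INTEGER,
                                  distance_km REAL, duration_min INTEGER,
                                  cancellation TEXT);
INSERT INTO runner_orders_clean VALUES
  (1, 1, 20.0, 30, ''),
  (2, 1, 10.0, 10, ''),
  (3, 2, 25.0, 25, ''),
  (4, 2, NULL, NULL, 'Customer Cancellation');
""")

# Average speed in km/h = distance in km / duration in hours.
speeds = conn.execute("""
SELECT runner_id, order_id,
       ROUND(distance_km * 60.0 / duration_min, 1) AS avg_speed_kmh
FROM runner_orders_clean
WHERE cancellation = ''
ORDER BY runner_id, order_id
""").fetchall()

# Share of each runner's assigned orders that were delivered.
success_rates = conn.execute("""
SELECT runner_id,
       100.0 * SUM(CASE WHEN cancellation = '' THEN 1 ELSE 0 END) / COUNT(*)
         AS success_pct
FROM runner_orders_clean
GROUP BY runner_id
ORDER BY runner_id
""").fetchall()

print(speeds)
print(success_rates)
```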

I will continue with Part A, B and C soon!

C. Ingredient Optimisation

What are the standard ingredients for each pizza?

What was the most commonly added extra?

What was the most common exclusion?

Generate an order item for each record in the customer_orders table in the format of one of the following:

  • Meat Lovers
  • Meat Lovers - Exclude Beef
  • Meat Lovers - Extra Bacon
  • Meat Lovers - Exclude Cheese, Bacon - Extra Mushroom, Peppers

Generate an alphabetically ordered comma separated ingredient list for each pizza order from the customer_orders table and add a 2x in front of any relevant ingredients.

  • For example: "Meat Lovers: 2xBacon, Beef, ... , Salami"

What is the total quantity of each ingredient used in all delivered pizzas sorted by most frequent first?

D. Pricing and Ratings

If a Meat Lovers pizza costs $12 and Vegetarian costs $10 and there were no charges for changes, how much money has Pizza Runner made so far if there are no delivery fees?

What if there was an additional $1 charge for any pizza extras?

  • Add cheese is $1 extra

The Pizza Runner team now wants to add an additional ratings system that allows customers to rate their runner. How would you design an additional table for this new dataset? Generate a schema for this new table and insert your own data for ratings for each successful customer order between 1 to 5.

Using your newly generated table, can you join all of the information together to form a table which has the following information for successful deliveries?

  • customer_id
  • runner_id
  • rating
  • order_time
  • pickup_time
  • Time between order and pickup
  • Delivery duration
  • Average speed
  • Total number of pizzas

If a Meat Lovers pizza was $12 and Vegetarian $10 fixed prices with no cost for extras and each runner is paid $0.30 per kilometre travelled, how much money does Pizza Runner have left over after these deliveries?

E. Bonus Questions

If Danny wants to expand his range of pizzas, how would this impact the existing data design? Write an INSERT statement to demonstrate what would happen if a new Supreme pizza with all the toppings was added to the Pizza Runner menu.

Top comments (2)

aarone4 · Location: Uj · Work: SQL developer at Independent · Joined May 5, 2019

I've not read the whole article (too long!) But your first two code blocks could have been achieved using ISNULL(), REPLACE() and CAST() and avoided the CASE statements and ALTER column types. Cleaner code and less steps.

yaswanthteja · Joined Jan 15, 2022

Hi Aaron, thanks for your suggestion. I've just started with MySQL.

Instacart SQL Data Analytics Case Study

Let's apply all we've learned across the 30+ past SQL lessons to a real-world case study where we'll analyze data from Instacart. While there's no one correct solution to this open-ended Data Analytics problem, we've included a few sample SQL queries to help you get started.

Case Study Background: About Instacart

Instacart is a grocery delivery and pickup service. Users can select items from local grocery stores through the Instacart app or website, and then either have them delivered to their doorstep by a personal shopper or prepared for pickup at the store.

Instacart App Experience

For our non-North American friends working on this case study, Instacart is similar to India's Blinkit, Swiggy Instamart, or Dunzo apps. In Europe, the comparable apps are Getir and Gorillas. In Latin America, Rappi serves a similar use case.

Case Study Background: The Data Analysis Task

You’re a data analyst at Instacart. Aside from the discounted groceries, you also get the benefit of solving interesting data problems.

In your 1-1 call this morning, your manager tells you that leadership wants to analyze the Instacart market data over time, to understand how the business is changing or staying the same.

Unfortunately, data engineering found some logging errors in the pipeline, and there are currently no date fields in the market data tables 🙃.

Your manager has a call with engineering tomorrow to work on fixing this so the team can track changes more closely in the future. But you’re already mid-way through Q3 and the data pipeline can’t be refreshed again until Q4. So for now, you’re stuck with the data you have.

The Task: find a way to understand how Instacart's business changed over time…without using explicit dates!

Before you panic about time-series data analysis without time-series data, take a deep breath, it's all going to be okay. Take a look at the data you have access to... things will start to click!

Case Study Background: Instacart Grocery Orders Data

Here are the schemas for all 5 tables in the Instacart market data. You’ll decide which ones are relevant and how to best use them throughout this case study.

Current orders table: this table specifies which products were purchased in each Instacart order.

  • The 'reordered' field indicates that the customer has a previous order that contains the product.
  • Some orders will have no reordered items
  • None of these fields are unique to this table, but the combination of the order and product identifiers is unique!

Prior orders table: this table contains previous order contents for all customers. This data was collected in Q2, versus the current orders table, which is data from the current quarter (Q3).

Note: the prior orders table has the same exact schema as the current orders table. You'll likely want to compare these two tables!

Products table: info about each item in the Instacart product catalog

Departments table: info about each department

Aisles table: info about each aisle in a grocery store

Before you make any assumptions, explore the tables below and make sure you understand their structures. You can explore the Instacart data here!

Your Turn: Start Analyzing with SQL!

Given the above data and the task mentioned earlier, go to town and come up with a solution. There are no more instructions, rules, or constraints. Go nuts!

Just remember – it's not really about finding the “right” answer; it's about finding some business insights that you can defend with confidence!

Our Solution

Here's how we'd approach this ambiguous data case study. Feel free to follow this approach, or adapt it to derive your own insights!

Our Solution: High-Level Overview

  • Identify the two specific tables you should focus on to understand broad trends over time.
  • Consider a phenomenon in the data that may have changed over time. Choose something practical that you can highlight to your manager.
  • Define in plain ol' English how you plan to investigate the data to solve the data analysis task.
  • Express that approach in SQL.
  • If your observed phenomenon has changed over time, develop 2 or more hypotheses to explain potential causes of that change. If your observed phenomenon has not changed over time, develop 2 or more hypotheses to explain why this phenomenon has remained stagnant.
  • Consider other relevant factors aside from just the Instacart data, such as food trends, seasonality, supply chain, etc.

BONUS: Based on your hypotheses, write a recommendation to leadership explaining how they should either:

  • Support this phenomenon if it’s helpful to Instacart business, or
  • Combat this phenomenon if it’s harmful to Instacart business.

If you want to check your work against our solution, scroll down to check the answer key below.

Remember, a successful case study simply means you developed a coherent process with a data-driven conclusion and defended your method! It doesn’t have to be the same as ours; it just needs to be similarly rigorous!

Our Solution: Analyzing Prior vs. Current Products

The two tables we want to focus on are and .

As a data analyst, one surprising event you might want to investigate using SQL could be a sudden and significant change in the reordering behavior of certain products in the current orders compared to their behavior in the prior orders.

Specifically, you could look for products that were not frequently reordered in the past (based on data from ) but are now being reordered more frequently in the current orders (from ).

We can formulate the following query to investigate:

This query joins the prior and current orders tables to the product, department, and aisles tables. The crux of the query lies in the SUMs, which allow us to compare product reorders across the prior and current orders tables. We select a lot of other fields so that we can use them to aggregate later as needed.

Finally, in the HAVING clause, we filter on products that were previously reordered fewer than 10 times, and are currently reordered 10 or more times. 10 is a nice round number to start with but a different number (or a measure of percent change) would work for this methodology too.
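As a rough illustration of the SUM and HAVING pattern described above, here is a minimal sketch that runs the SQL through Python's built-in sqlite3 module. It is not the original query: the table names (prior_order_products, current_order_products, products), the columns, and every sample row are assumptions, and department and aisle names are stored directly on products to keep the example short. The two order tables are stacked with UNION ALL before joining so that joining both of them to products does not multiply rows and inflate the sums.

```python
import sqlite3

# Hypothetical, simplified schema: table and column names are assumptions,
# and department/aisle names are stored directly on products for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY, product_name TEXT,
    department TEXT, aisle TEXT);
CREATE TABLE prior_order_products (order_id INTEGER, product_id INTEGER, reordered INTEGER);
CREATE TABLE current_order_products (order_id INTEGER, product_id INTEGER, reordered INTEGER);

INSERT INTO products VALUES
    (1, 'Organic Navel Orange', 'produce', 'fresh fruits'),
    (2, 'Whole Milk', 'dairy eggs', 'milk'),
    (3, 'Russet Potato', 'produce', 'fresh vegetables');
-- Made-up sample rows; reordered is 1 when the line item was a reorder.
INSERT INTO prior_order_products VALUES
    (1, 1, 0), (2, 1, 0),
    (1, 2, 1), (2, 2, 1), (3, 2, 1),
    (1, 3, 1), (2, 3, 0);
INSERT INTO current_order_products VALUES
    (10, 1, 1), (11, 1, 1), (12, 1, 1),
    (10, 2, 1), (11, 2, 1), (12, 2, 1),
    (10, 3, 1), (11, 3, 1);
""")

QUERY = """
WITH all_lines AS (
    SELECT 'prior' AS src, product_id, reordered FROM prior_order_products
    UNION ALL
    SELECT 'current' AS src, product_id, reordered FROM current_order_products
)
SELECT p.product_name, p.department, p.aisle,
       SUM(CASE WHEN a.src = 'prior' THEN a.reordered ELSE 0 END) AS prior_reorders,
       SUM(CASE WHEN a.src = 'current' THEN a.reordered ELSE 0 END) AS current_reorders
FROM all_lines AS a
JOIN products AS p ON p.product_id = a.product_id
GROUP BY p.product_id, p.product_name, p.department, p.aisle
HAVING SUM(CASE WHEN a.src = 'prior' THEN a.reordered ELSE 0 END) < :threshold
   AND SUM(CASE WHEN a.src = 'current' THEN a.reordered ELSE 0 END) >= :threshold
ORDER BY current_reorders DESC, p.product_name
"""

# The text uses a threshold of 10; 2 keeps this toy data set small.
rows = conn.execute(QUERY, {"threshold": 2}).fetchall()
for row in rows:
    print(row)
```

Passing the threshold as a named parameter makes it easy to try other cut-offs, which the text suggests as a variation on the same methodology.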

You can execute the query here:

The first few rows of results should look like this:

Our Solution: Analyzing Changes in Reorders By Department, Aisle

Looking at this granular view won’t give us a broad picture of changes over time, so let’s do some summarizing. We can wrap the above query in a CTE (Common Table Expression) called and do some quick aggregation. We can aggregate by department, which shows that the majority of products with increased reorders fall under Produce.

We can aggregate by aisle to see if that data tells the same story:

According to the results, most of these products come from aisles that are related to produce (fresh vegetables, fresh fruit, and packaged produce).
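The CTE-and-aggregate step can be sketched in the same way. This is only an illustration under stated assumptions: product_reorders is a hypothetical table standing in for the product-level totals, the CTE name increased_reorders is invented here, and all sample values other than the Organic Navel Orange jump from 0 to 30 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the product-level totals; all values except the
-- Organic Navel Orange jump from 0 to 30 are made up for illustration.
CREATE TABLE product_reorders (
    product_name TEXT, department TEXT, aisle TEXT,
    prior_reorders INTEGER, current_reorders INTEGER);
INSERT INTO product_reorders VALUES
    ('Organic Navel Orange', 'produce', 'fresh fruits', 0, 30),
    ('Cantaloupe', 'produce', 'fresh fruits', 2, 15),
    ('Russet Potato', 'produce', 'fresh vegetables', 4, 12),
    ('Sparkling Water', 'beverages', 'water seltzer sparkling water', 3, 11),
    ('Whole Milk', 'dairy eggs', 'milk', 50, 60);
""")

# The CTE name increased_reorders is an assumption, not the original name.
CTE = """
WITH increased_reorders AS (
    SELECT product_name, department, aisle
    FROM product_reorders
    WHERE prior_reorders < 10 AND current_reorders >= 10
)
"""

def count_by(column):
    """Count qualifying products per department or aisle."""
    if column not in ("department", "aisle"):  # guard the interpolated name
        raise ValueError(f"unsupported grouping column: {column}")
    sql = CTE + f"""
        SELECT {column}, COUNT(*) AS products
        FROM increased_reorders
        GROUP BY {column}
        ORDER BY products DESC, {column}
    """
    return conn.execute(sql).fetchall()

by_department = count_by("department")
by_aisle = count_by("aisle")
print(by_department)
print(by_aisle)
```

Keeping the filter inside the CTE means both summaries read from the same set of qualifying products, so the department and aisle views cannot drift apart.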

Our Solution: Hypothesis Behind Re-order Change

Now it’s time for the actionable part of our work: figuring out what this data means for the company, and what we'll do because of the data.

We’ll start by developing some hypotheses to explain potential reasons for the increase in reorders of produce items.

First of all, we mentioned that it’s currently Q3, which means we’re somewhere between July and September in this scenario. Fruits and vegetables might become more popular in the summer months due to their freshness and nutritional value. So we’ll call hypothesis #1 seasonality.

It’s possible that limited-time discounts, bundle deals, or loyalty programs could drive higher reorders. We should check in with the marketing department and see if there were any deals or discounts related to produce recently. We’ll call hypothesis #2 deals.

We could also take a more skeptical approach to things and see if these products were even available before, or if a recent increase in supply chain activity has allowed for more reorders. When we see reorders go from 0 to 30 for the Organic Navel Orange, we have to ask if the conditions around those orders have changed at all. We’ll call hypothesis #3 availability.

We could keep going, but with the 3 hypotheses below, we’ve covered step 5 of the strategy framework:

  • Seasonality: Produce is more popular in the summer due to its freshness and nutritional value.
  • Deals: Discounts, bundle deals, or loyalty programs around produce could drive higher reorders.
  • Availability: Recent shifts in supply chain and availability may have influenced consumers’ ability to buy produce.

Our Solution: Our Business Recommendation

Here is the bonus recommendation informed by our hypotheses:

After analyzing our current order data compared to previous order data, we discovered that the reorder rates for produce have significantly increased across both departments and aisles. Increased produce sales are beneficial for the company, so leadership should capitalize on customers’ higher propensity to reorder produce during the summer months.

If not already implemented, marketing should consider promoting bundle deals on produce to incentivize new buyers, who may then become repeat customers for those products. Additionally, team members working with suppliers and grocers should ensure the consistent availability of popular produce items, including Organic Navel Oranges, Russet Potatoes, and Cantaloupes, in order to maintain high reorder rates.

Contains solutions for #8WeekSQLChallenge case studies https://8weeksqlchallenge.com/

sharkawy98/sql-case-studies

8 Weeks SQL Challenge

This repository contains solutions for the #8WeekSQLChallenge: interesting real-world case studies that let you apply and enhance your SQL skills across many use cases. I used Microsoft SQL Server to write the queries that solve these case studies.

Table of Contents

  • Case study 1
  • Case study 2
  • Case study 3
  • Some interesting queries from my solutions

SQL skills gained

  • Data cleaning & transformation
  • Aggregations
  • Ranking (ROW_NUMBER, DENSE_RANK)
  • Analytics (LEAD, LAG)
  • CASE WHEN statements
  • UNION & INTERSECT
  • DATETIME functions
  • Data type conversion
  • TEXT functions, text and string manipulation

Case Study #1 : Danny's Diner

My solutions

Case Study #2 : Pizza Runner

Case Study #3 : Foodie-Fi

Case Study #4 : Data Bank

Some interesting queries

Data With Danny

This is Serious SQL.

Start your guided data apprenticeship today

Serious SQL

Your complete SQL learning experience

  • Health analytics
  • Marketing analytics
  • People analytics
  • Financial markets
  • Fast moving consumer goods
  • Digital marketing

Topics Covered

Cover many core SQL skills and techniques required for data analysis from beginner to advanced levels:

  • Where filters and ordering data
  • Group by aggregates
  • Identifying and dealing with duplicate data
  • Common table expressions and subqueries
  • Summary statistics
  • Exploratory data analysis
  • Complex table joins
  • Entity relationship diagrams
  • SQL reverse engineering
  • Data problem solving techniques
  • Window functions
  • Case When Statements
  • Recursive CTEs
  • Cumulative aggregates
  • Simple, weighted and exponential moving metrics
  • Historical vs Snapshot data analysis techniques
  • Temp tables and views
  • String transformations
  • Regular Expressions
  • Datetime manipulation

This course consists of detailed technical coding tutorials, a step-by-step setup guide, recorded live training videos and access to the datasets for all case studies. Focus on learning fundamental SQL skills and understanding data at a deep level using PostgreSQL 16. Gain hands-on practical experience so you can feel confident solving challenging data problems in any database environment. Get access to our members only Discord community for further support.

Additional Bonus Content

  • Gain familiarity with popular programming tools such as Docker, Markdown, GitHub and the command line interface (CLI)
  • Access to all 8 Week SQL Challenge case studies with further explanations and debugging exercises

For a further $20 student discount, please reach out directly to [email protected] using your student email or share your student details for verification!

Course Curriculum

Introduction

Welcome to Serious SQL

Course Outline

SQL Environment Setup

Data Exploration

Select & Sort Data

Record Counts & Distinct Values

Identifying Duplicate Records

Summary Statistics

Distribution Functions

Health Analytics Mini Case Study

Case Study Quiz

Marketing Analytics Case Study

Case Study Introduction

Case Study Overview

Understanding the Data

SQL Reverse Engineering

Introduction to Table Joins

Joining Multiple Tables

SQL Problem Solving

Window Functions

Final SQL Scripting Solution

Marketing Analytics Quiz

Optional Window Functions Quiz

People Analytics Case Study

Creating Reusable Data Assets

Snapshot and Historic Data

Final Case Study Solution

Quiz 1: Current Employee Analysis

Quiz 2: Employee Churn

Quiz 3: Management Analysis

Additional SQL Techniques

String Transformations

Date & Time Conversions

Serious SQL Live Training

Week 1 - November 20th 2021

Week 2 - November 27th 2021

Week 3 - December 4th 2021

Week 4 - December 11th 2021

Week 5 - January 29th 2022

Week 6 - February 5th 2022

Week 7 - February 12th 2022

Week 8 - February 26th 2022

Week 9 - March 5th 2022

Week 10 - March 20th 2022

8 Week SQL Challenge

Case Study #1 - Danny's Diner

Case Study #2 - Pizza Runner

Case Study #3 - Foodie-Fi

Case Study #4 - Data Bank

Case Study #5 - Data Mart

Case Study #6 - Clique Bait

Case Study #7 - Balanced Tree

Case Study #8 - Fresh Segments

Bonus Content

Linux Command Line Crash Course

GitHub Crash Course

The author has tested the case study in the classroom with thousands of students. While other SQL texts tend to use examples from many different data sets, the author has found that once students get used to one case study, they learn the material at a much faster rate.

The text begins with an introduction to the case study and trains the reader to think like the query processing engine for a relational database management system. Once the reader has a grasp of the case study then SQL programming constructs are introduced with examples from the case study. In order to reinforce concepts, each chapter has several exercises with solutions provided on the book’s website.

SQL by Example is designed both for those who have never worked with SQL as well as those with some experience. It is modular in that each chapter can be approached individually or as part of a sequence, giving the reader flexibility in the way that they learn or refresh concepts. This also makes the book a great reference to refer back to once the reader is honing his or her SQL skills on the job.

Case Studies

  • Etraveli Reduces Database Backup and Restore Times from Days to Hours
  • Educore Boosts Educational Operations Performance 45% with MySQL HeatWave
  • GaP Solutions Optimizes Retail POS Applications with MySQL HeatWave
  • Plax1 Accelerates Online Sports Gaming Sites with High Performance MySQL HeatWave
  • Universidad Andina del Cusco Reduced Costs with MySQL HeatWave
  • Kemana improves eCommerce performance 10x with MySQL HeatWave
  • eD-Online Maximizes Uptime for e-Learning LMS with MySQL HeatWave
  • EatEasy Transforms Food Delivery Services using MySQL HeatWave AI and ML
  • Sinopay, SaaS Payments Gateway Provider, Expands Globally with MySQL HeatWave on Oracle Cloud Infrastructure
  • datasíntese, leading Brazilian FinTech, boosts performance by 50% and slashes costs by migrating to MySQL HeatWave
  • Bibold Revolutionizes BI Solutions, Boosts Competitiveness with MySQL HeatWave
  • Conecta Wireless delivers unified communications as a service (UCaaS) with MySQL HeatWave
  • Rewards4Earth Migrates its Eco-friendly Rewards App to MySQL HeatWave
  • Procom Achieves 25% Performance Boost and 25% Cost Reduction with MySQL HeatWave on OCI
  • Enpointe IO Achieves 80% Better Performance and 45% Cost Savings with MySQL HeatWave
  • LightSwitch API Fortifies its No-code REST API Developer Platform with MySQL HeatWave
  • Intelectivo, Brazilian ISV, Improves its Web Analytical Platform, Plugger BI by Migrating to MySQL HeatWave
  • tilyanPristka Transforms Business Process Outsourcing Excellence with MySQL HeatWave
  • SKYPlay Cuts Costs 50%, Boosts Gaming Performance with MySQL HeatWave
  • Aiwifi Enriches Analytics Solutions for Wi-Fi Marketing and Customer Experience with MySQL HeatWave In-Memory Machine Learning
  • WelcomeNext Ensures Maximum Availability of E-Learning Solutions with MySQL HeatWave
  • ITSP Boosts App Performance by 100X and Reduces Costs by 33% with MySQL HeatWave
  • Broctagon Fintech Group Migrates its Flagship Forex CRM to MySQL HeatWave
  • Grupo DTG Fuels SaaS Business Growth with MySQL HeatWave after Migrating from Amazon RDS
  • MCM Telecom Boosts Customer Satisfaction to 95% by Moving to MySQL HeatWave
  • UBIT Uses MySQL Heatwave to Build Student Management Systems
  • Aspire Systems Boosts Analytics Performance by 10X with MySQL HeatWave
  • Teyuto, Italian SaaS ISV, Boosts Customer Experiences with Recommendation Engines Built on MySQL HeatWave
  • Exchange Speed Migrated to MySQL HeatWave from Amazon RDS and Redshift to Deliver its High-Performance Trading Platform
  • Gieman, Australian SaaS ISV, Boosts Growth with MySQL HeatWave
  • SaaS ISV Fiscontech Reduces Costs by 95% by Migrating from Amazon Aurora to MySQL HeatWave
  • Fragrantica Enhances User Experience for 25 Million Website Visitors with MySQL HeatWave
  • Dr Mais On-Line, Brazilian TeleMedicine SaaS ISV, increases performance by 50% with MySQL HeatWave
  • Aicoll Improves Loan Default Prediction using Machine Learning Models in MySQL HeatWave
  • Gravity Delivers Scalable Personalization Solutions with MySQL HeatWave
  • Tasmania's Northwest Support Services Modernize the National Disability Insurance Agency with MySQL HeatWave
  • Licitapyme.cl Migrates from AWS to MySQL HeatWave for Improved Performance and Faster Analytics
  • FANCOMI accelerates ad analytics by 10X with MySQL HeatWave
  • Red3i speeds insights by 1,000X with MySQL HeatWave
  • Tetris.co speeds real-time insights with MySQL HeatWave
  • Tamara cuts costs, speeds performance with MySQL HeatWave and Oracle Cloud
  • Wavenet Technology runs one million-plus queries in seconds with MySQL HeatWave
  • Estuda.com increases query responses by 300X with MySQL HeatWave
  • Bionime modernizes data and analytics with Oracle MySQL HeatWave on AWS
  • Centroid simplifies and scales data and analytics with MySQL HeatWave on AWS
  • Johnny Bytes boosts data and analytics with Oracle MySQL HeatWave on AWS
  • KBG Services Payroll Application Increases Data Security and Compliance with MySQL HeatWave
  • Wavenet Technology Saves 30% by Migrating from Amazon Redshift to MySQL HeatWave
  • Shanrohi Cuts Server Costs by 50%, Boosts Performance by Moving from AWS to MySQL HeatWave on OCI
  • Toffs Technologies Boosts Efficiencies by 50% with MySQL Database Service on Oracle Cloud
  • Baguio Relaunches Local Tourism Industry and Protects Visitors with MySQL HeatWave on Oracle Cloud
  • Science House Medicals Speeds Medical Testing by 4x with MySQL HeatWave
  • Uangel Launches New Global Mobile Service with MySQL Database Service on Oracle Cloud
  • College of Marshall Islands Improves Application Performance with MySQL Database Service on Oracle Cloud
  • Akna Lifts Performance by 1000x and Cuts Costs by 60%
  • POCT Science House Quadruples Diagnosis Speed with MySQL HeatWave
  • Asahimatsu Foods builds a low-cost Supply and Demand application in the cloud using MySQL HeatWave Database Service
  • Genius Sonority speeds game analytics by 90X with MySQL HeatWave
  • Noorisys Technologies migrates from AWS EC2 to MySQL HeatWave
  • VRGlass Increases Database Performance by 5x over Amazon EC2 with MySQL HeatWave
  • VizSeek: AI-based Visual Search Platform Deployment on MySQL and Oracle Cloud
  • Ecopaynet Shortens Time to Market with MySQL Database Service
  • Custella Boosts its SaaS Application with MySQL Database Service
  • AK Systems Drives Innovation with Life Sciences SaaS Application
  • Pasona Tech Reduced Costs by 75% After Migrating from Amazon RDS
  • QBS System Speeds Performance and Tightens Security with MySQL Database Service on Oracle Cloud
  • IsoEnergy Streamlines Million Dollar Drill Programs with MySQL Database Service
  • The Gold Continent Helps Zambia Transition to a Vibrant Formal Economy with MySQL Database Service
  • Sectona Boosts Cybersecurity Offering with Embedded MySQL Database
  • RTTS’ QuerySurge Automates Data Testing with MySQL Embedded as Powerful Backend
  • appleple uses geometry type of MySQL to create a website using location information
  • YamaReco implemented a map search function for hiking records, using the GIS feature of MySQL

MySQL Enterprise Edition

  • RBL Bank Boosts Security, Compliance and Operational Efficiency with MySQL Enterprise Edition
  • E Connect Solutions Increases Efficiency and Reduces Costs of its ERP Platform with MySQL Enterprise Edition
  • Pinkbyte Inc. and its Subsidiary Mazzzing Inc. Deliver Low-cost, Secure Desktop by Migrating to MySQL Enterprise Edition from Microsoft SQL Server
  • NAVER, Korea’s largest search engine, powers online services with MySQL Enterprise Edition
  • Dream D&S Provides High Security by Using MySQL Enterprise Edition for its Embedded Products, Enabling Customers to Innovate Faster
  • OP-CBS Secures Data with MySQL Enterprise Edition - Unlocking New Business Opportunities
  • Brown University Enhances Campus Database Services with MySQL Enterprise Edition
  • Universidad Complutense de Madrid Maximizes Availability with MySQL Enterprise Edition
  • DGB Capital Powers its Financial Services with MySQL Enterprise Edition
  • Great HealthWorks Improves Reliability by Migrating from MariaDB
  • guard.me Relies on MySQL Enterprise Edition for Enhanced Security and Compliance
  • Toss Bank Delivers Innovative Financial Services with MySQL Enterprise Edition
  • Digital14 Relies on MySQL Enterprise Edition for Enhanced Security
  • ST Engineering's Smart Mobility Rail Business uses MySQL Enterprise Edition
  • UL Solutions Sdn Bhd Delivers its Shop Floor and Inventory Control Application with MySQL Enterprise Edition
  • GCI achieves carrier-grade uptime and slashes IT costs with MySQL Enterprise Edition
  • KAI Improves Railway Efficiencies with IoT Platform using MySQL Enterprise Edition
  • The BBC Ensures World Class Broadcasting Services using MySQL Enterprise Edition
  • Korea Investment & Securities boosts employee productivity with MySQL
  • Meritz Fire Powers Groupware Portal for Improved Collaboration and Cuts TCO with MySQL
  • Itaú Unibanco Boosts Digital Platform with MySQL Enterprise Edition for High Availability and Support
  • SSG Builds Online Shopping Mall using MySQL Enterprise Edition
  • KDDI prevents service downtime with MySQL InnoDB Cluster and reduces failure recovery time by 80%
  • BSE Takes Online Trading from Milliseconds to Microseconds with MySQL Enterprise Edition
  • TMON Builds Korea’s Number One Online Malling Platform on MySQL Enterprise Edition
  • Flash Networks Embeds MySQL Enterprise Edition in Parental Control Solution for Mobile Operators
  • Globo Adopts MySQL Enterprise Edition as Platform for Content Development
  • University of Toronto Empowers Astronomers to Research Dark Matter with Massive Space Image Database
  • SQUARE ENIX reduced database backup time with MySQL Enterprise Edition to 1/6th to offer superb game environment for users worldwide
  • K Bank Delivers High-Quality Customer Services While Reducing TCO by 80% with MySQL Enterprise Edition
  • France Billet Stakes Online Ticket Sales on High Availability of MySQL Enterprise Edition
  • Mobitel Achieves Optimum Price Performance Ratio for 2 Tier Applications
  • Plusnet Supports Rapid Growth of Customer Base by Improving Visibility, Performance, and Scalability
  • NJ India Invest Boosts Financial Transaction Processing with MySQL Enterprise Edition
  • Isibet Achieves 24x7 Uptime with High Availability MySQL Enterprise
  • LINE Enhances Database Availability, Scalability, and Security with MySQL
  • Credorax Delivers NextGen Payment Processing Technology using MySQL
  • WealthObjects Relies on MySQL to Deliver Innovative FinTech Solutions

MySQL NDB Cluster

  • Beezz Helps Solve IoT Security Threats with MySQL Cluster Carrier Grade Edition
  • Certigna Gains Maximum Availability Through Enhanced Performance of MySQL Cluster Carrier Grade Edition
  • BitCash Supports Future Business Growth with MySQL Cluster Carrier Grade Edition
  • DMM.com leveraged Oracle Premier Support for MySQL to upgrade over 100 MySQL Servers in a year

Esat Erkec

A case study of SQL Query tuning in SQL Server

Gaining experience in SQL query tuning can be difficult for database developers and administrators. For this reason, this article works through a case study and tunes its performance step by step, which shows how to approach query performance issues in practice.

Prerequisites

In this article, we will use the Adventureworks2017 sample database. At the same time, we will also use the Create Enlarged AdventureWorks Tables script to obtain an enlarged version of the SalesOrder and SalesOrderDetail tables because the size of this database is not sufficient to perform performance tests. After installation of the Adventureworks2017 database, we can execute the table enlarging script.

Case Study: SQL Query tuning without creating a new index

Imagine that you are employed as a full-time database administrator at a company that still uses SQL Server 2017. You have received an e-mail from the software development team complaining about the performance of the following query.

Your objective is to improve the performance of the above query without creating a new index on the tables but you can re-write the query.

The first step of the SQL query tuning: Identify the problems

Firstly, we will enable the actual execution plan in the SSMS and execute the problematic query. Using the actual execution plan is the best approach to analyze a query because the actual plan includes all accurate statistics and information about a query. However, if a query is taking a long time, we can refer to the estimated execution plan. After this explanation, let’s examine the select operator of the execution plan.

  • Interpreting execution plans of T-SQL queries
  • Main Concepts of SELECT operators in SQL Server execution plans

The ElapsedTime attribute indicates the execution time of a query, and from this value we can see that the query completed in 142 seconds. For this query, we also see the UdfElapsedTime attribute, which indicates how long the database engine spent invoking the user-defined functions in the query. The two elapsed times are very close here, so we can deduce that the user-defined function might be causing the problem.

A select operator details in the execution plan

Another point to take into consideration for this query is parallelism. The Estimated Subtree Cost value exceeds the Cost Threshold for Parallelism setting of the server, but the query optimizer does not generate a parallel execution plan because of the scalar function. Scalar functions prevent the query optimizer from generating a parallel plan.

Why a query does not generate a parallel execution plan?

The last problem with this query is the TempDB spill issue and this problem is indicated with the warning signs in the execution plan.

Analyze an execution plan for SQL query tuning

Outdated statistics, poorly written queries, and ineffective index usage can all cause tempdb spills.

Improve performance of the scalar-function in a query

Scalar functions can be a performance killer for queries, and that is exactly the case for our sample query. SQL Server invokes a scalar function for every row of the result set. Another problem related to scalar functions is the black box problem: the query optimizer has no idea about the code inside the scalar function, so it does not consider the cost impact of the function on the query.

SQL Server 2019 introduced a feature, Scalar UDF Inlining, that can help overcome most of the performance issues associated with scalar functions. On earlier versions of SQL Server, we should fold the scalar function code into the query explicitly where possible. The common method is to transform the scalar function into a subquery and apply it to the query with the CROSS APPLY operator. When we look inside the ufnGetStock function, we can see that it sums the quantity of a product by ProductId, for one specific LocationId only.

Scalar-functions affects SQL query tuning negatively

We can transform and implement the ufnGetStock scalar-function as shown below. In this way, we ensure that our sample query can run in parallel and will be faster than the first version of the query.
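The idea of replacing a per-row scalar function with a set-based subquery can be sketched outside SQL Server. The example below uses Python's sqlite3 module, so it is an analogy rather than the article's T-SQL: SQLite has no CROSS APPLY, so a LEFT JOIN to a grouped subquery stands in for it. The tables, the get_stock function, and location 6 are all assumptions made for illustration, and the sketch says nothing about parallelism, which is specific to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical miniature tables; names only loosely follow AdventureWorks.
CREATE TABLE product_inventory (product_id INTEGER, location_id INTEGER, quantity INTEGER);
CREATE TABLE order_lines (line_id INTEGER PRIMARY KEY, product_id INTEGER);
INSERT INTO product_inventory VALUES (1, 6, 10), (1, 6, 5), (1, 7, 99), (2, 6, 3);
INSERT INTO order_lines (product_id) VALUES (1), (1), (2), (3);
""")

# Stock per product at the assumed location 6, loaded once for the function.
stock_at_location = dict(conn.execute(
    "SELECT product_id, SUM(quantity) FROM product_inventory "
    "WHERE location_id = 6 GROUP BY product_id"))

stats = {"calls": 0}

def get_stock(product_id):
    """Opaque scalar function: the SQL planner cannot see inside it."""
    stats["calls"] += 1
    return stock_at_location.get(product_id)

conn.create_function("get_stock", 1, get_stock)

# Version 1: the scalar function is invoked while each row is produced.
per_row = conn.execute(
    "SELECT line_id, get_stock(product_id) FROM order_lines ORDER BY line_id"
).fetchall()

# Version 2: the same logic expressed as a subquery the planner can see.
set_based = conn.execute("""
    SELECT ol.line_id, s.stock
    FROM order_lines AS ol
    LEFT JOIN (
        SELECT product_id, SUM(quantity) AS stock
        FROM product_inventory
        WHERE location_id = 6
        GROUP BY product_id
    ) AS s ON s.product_id = ol.product_id
    ORDER BY ol.line_id
""").fetchall()

print("scalar function calls:", stats["calls"])
print("same result:", per_row == set_based)
```

Both versions return the same rows, but only the second exposes the aggregation to the query planner instead of hiding it behind a function call.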

This query took 71 seconds to complete, and the execution plan is now parallel. However, the tempdb spill issue persists. This clearly shows that we need to put in more effort and find new methods to overcome the tempdb spill problem.

Tempdb spill issue affects SQL query tuning negatively

Think more creatively about SQL query tuning

To get rid of the tempdb spill issue, we will create a temp table and insert all rows into it. Temporary tables are very flexible, so we can add a computed column to replace the LEN function in the WHERE clause. The insert query will be as below.

When we analyze this query, we can see the TABLOCK hint after the INSERT statement. The purpose of this hint is to enable a parallel insert, which gives us more performance. This can be seen in the execution plan.

SQL query tuning and parallel insert

In this way, we inserted 1,286,520 rows into the temporary table in just one second. However, the temporary table still holds more data than we need, because the insert operation did not filter for CreditCardApprovalCode values longer than 10 characters. At this point, we will use a little trick and delete the rows whose character length is less than or equal to 10. We add the following delete statement after the insert statement so that only the qualifying records remain in the temp table.
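The load-then-trim pattern can be illustrated with a small sketch in Python's sqlite3 module. It is an analogy, not the article's T-SQL: SQLite has no TABLOCK hint or parallel insert, the length is materialized at insert time rather than declared as a computed column, and the table names, column names, and sample codes are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source table with made-up approval codes.
CREATE TABLE sales_orders (order_id INTEGER PRIMARY KEY, approval_code TEXT);
INSERT INTO sales_orders (approval_code) VALUES
    ('1234567890'),        -- length 10, should be removed
    ('12345678901'),       -- length 11, should stay
    ('123456789012345'),   -- length 15, should stay
    ('abc');               -- length 3, should be removed

-- Step 1: copy every row and store the code length alongside it,
-- so later statements filter on a plain column instead of a function call.
CREATE TEMP TABLE tmp_orders AS
SELECT order_id, approval_code, LENGTH(approval_code) AS approval_len
FROM sales_orders;
""")

# Step 2: trim the rows that do not qualify (length 10 or less).
deleted = conn.execute(
    "DELETE FROM tmp_orders WHERE approval_len <= 10").rowcount

remaining = conn.execute(
    "SELECT order_id, approval_len FROM tmp_orders ORDER BY order_id"
).fetchall()

print("deleted rows:", deleted)
print("remaining rows:", remaining)
```

After the delete, the temporary table holds only codes longer than 10 characters, which is the set the later steps work with.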

SQL Query tuning: Using indexes to improve sort performance

When we design an effective index for queries that include an ORDER BY clause, the execution plan does not need to sort the result set because the index returns the rows in the required order. Building on this idea, we can create a non-clustered index that satisfies the sort requirement. The important point about this SQL query tuning practice is that the index must eliminate the sort operator, and its advantage must outweigh its cost. The following index will help eliminate the sort operation in the execution plan.

Now, we execute the following query and then examine the execution plan of the select query.

Improve sort operator performance with an index

As we can see in the execution plan, the database engine used the IX_Sort index to access the records, and it did not need a sort operator because the rows are already in sorted order. In the properties of the index scan operator, we see an attribute named Scan Direction.

Non-clustered index scan direction

The scan direction attribute explains that SQL Server uses the b-tree structure to read the rows from beginning to the end at the leaf levels. At the same time, this index helps us to overcome the tempdb spill issue.
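The effect of an index on a sort step can be observed in other engines too. The sketch below uses SQLite through Python's sqlite3 module, so the plan text differs from a SQL Server execution plan, and the table, its columns, and the sample rows are assumptions; only the index name follows the IX_Sort naming used above. It compares the output of EXPLAIN QUERY PLAN before and after the index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table and made-up rows.
CREATE TABLE tmp_orders (
    order_id INTEGER PRIMARY KEY, order_date TEXT, total_due REAL);
INSERT INTO tmp_orders (order_date, total_due) VALUES
    ('2014-03-01', 20.0), ('2013-07-15', 35.5), ('2014-01-09', 12.25);
""")

QUERY = "SELECT order_date FROM tmp_orders ORDER BY order_date"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def uses_sort(plan_lines):
    """SQLite reports an explicit sort step as a temporary B-tree."""
    return any("TEMP B-TREE" in line for line in plan_lines)

before = plan(QUERY)
conn.execute("CREATE INDEX ix_sort ON tmp_orders (order_date)")
after = plan(QUERY)

print("before:", before, "sort needed:", uses_sort(before))
print("after: ", after, "sort needed:", uses_sort(after))

ordered = [row[0] for row in conn.execute(QUERY)]
print(ordered)
```

Once the index exists, the engine can read the rows in index order, so the separate sort step drops out of the plan while the query result stays the same.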

Non-clustered index structure  and scan direction

Finally, we see that the query execution time was reduced from 220 seconds to 33 seconds.

In this article, we learned practical details about SQL query tuning and these techniques can help when you try to solve a query performance problem. In the case study, the query which has a performance problem contained 3 main problems. These are:

  • Scalar-function problem
  • Using a serial execution plan
  • Tempdb spill issue

First, we transformed the scalar function into a subquery and applied it to the query with the CROSS APPLY operator. In the second step, we eliminated the tempdb spill problem by using a temporary table. As a result, the performance of the query improved significantly.

SQL Tutorial

The SQL CASE Expression

The CASE expression goes through conditions and returns a value when the first condition is met (like an if-then-else statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.

If there is no ELSE part and no conditions are true, it returns NULL.
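The behaviour described above can be checked with a small sketch that runs through Python's sqlite3 module. The order_details table and its rows are made up for illustration. The three CASE expressions show an ELSE fallback, a missing ELSE that yields NULL (None in Python), and overlapping conditions where the first match wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table with made-up quantities.
CREATE TABLE order_details (order_detail_id INTEGER PRIMARY KEY, quantity INTEGER);
INSERT INTO order_details (quantity) VALUES (5), (30), (40);
""")

rows = conn.execute("""
    SELECT quantity,
           -- ELSE supplies the value when no condition is true.
           CASE WHEN quantity > 30 THEN 'greater than 30'
                WHEN quantity = 30 THEN 'exactly 30'
                ELSE 'under 30'
           END AS with_else,
           -- Without ELSE, unmatched rows get NULL.
           CASE WHEN quantity > 30 THEN 'greater than 30'
           END AS without_else,
           -- Conditions are checked in order; the first true one wins,
           -- so the second branch here is never reached.
           CASE WHEN quantity > 10 THEN 'over 10'
                WHEN quantity > 30 THEN 'over 30'
           END AS first_match
    FROM order_details
    ORDER BY order_detail_id
""").fetchall()

for row in rows:
    print(row)
```

The last column is a reminder to order conditions from most specific to least specific, since later branches are skipped once an earlier one is true.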

CASE Syntax

Demo Database

Below is a selection from the "OrderDetails" table in the Northwind sample database:

SQL CASE Examples

The following SQL goes through conditions and returns a value when the first condition is met:

The following SQL will order the customers by City. However, if City is NULL, then order by Country:
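A CASE expression inside ORDER BY can be sketched as follows, again through Python's sqlite3 module. The customers table and its rows are invented for this example and are not the Northwind data. Each row is sorted by its city, or by its country when the city is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical customers; two of them have no city recorded.
CREATE TABLE customers (customer_name TEXT, city TEXT, country TEXT);
INSERT INTO customers VALUES
    ('Ana', 'Berlin', 'Germany'),
    ('Bo',  NULL,     'Austria'),
    ('Cy',  'Madrid', 'Spain'),
    ('Di',  NULL,     'Norway');
""")

rows = conn.execute("""
    SELECT customer_name, city, country
    FROM customers
    ORDER BY CASE WHEN city IS NULL THEN country
                  ELSE city
             END
""").fetchall()

names = [row[0] for row in rows]
print(names)
```

The sort keys here are Austria, Berlin, Madrid, and Norway, so customers without a city are interleaved with the others instead of being grouped at one end.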

COMMENTS

  1. Case Study #1

    Example Datasets. All datasets exist within the dannys_diner database schema - be sure to include this reference within your SQL scripts as you start exploring the data and answering the case study questions.. Table 1: sales. The sales table captures all customer_id level purchases with an corresponding order_date and product_id information for when and what menu items were ordered.

  2. Problem solving with SQL: Case Study #1

    Introduction. Danny seriously loves Japanese food so at the beginning of 2021, he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen. Danny's Diner needs your assistance to help the restaurant stay afloat — the restaurant has captured some very basic data ...

  3. Top SQL Case Study Questions

    Example SQL Case Question: Unsubscribe Rates. Many SQL case study questions will ask you to investigate correlation. In this example SQL case question, we're looking into this issue: Unsubscribe rates have increased after a new notification system has been introduced. Question: Twitter wants to roll out more push notifications to users ...

  4. Case Studies in SQL: Real-World Data Analysis with SQL Queries

    SQL (Structured Query Language) is a powerful tool for working with data, and it's widely used in various industries for data analysis and decision-making. In this guide, we'll explore real-world case studies that demonstrate the practical use of SQL queries to analyze data and derive valuable insights.

  5. Case Studies in SQL: Real-World Data Analysis with SQL Queries

    SQL is a versatile language for data analysis, and these real-world case studies demonstrate its practical application in various domains. By using SQL queries, you can extract valuable insights ...

  6. GitHub

    8-Week SQL Challenges. This repository serves as the solution for the 8 case studies from the #8WeekSQLChallenge. It showcases my ability to tackle various SQL challenges and demonstrates my proficiency in SQL query writing and problem-solving skills. A special thanks to Data with Danny for creating these insightful and engaging SQL case ...

  7. SQL Challenge: Case Study. Step-by-step walkthrough of a SQL…

    The sum of that CASE statement will be the numerator. The denominator will be a COUNT of all elements in the event_type column. CAST(SUM(CASE WHEN event_type IN ('video call received', 'video call sent', 'voice call received', 'voice call sent') THEN 1 ELSE 0 END) AS FLOAT) / COUNT(event_type) Next, we'll wrap the conditional in a GROUP BY ...

  8. 8 Week SQL Challenge: Case Study #1 Danny's Diner

    However, for sushi, each $1 spent earns 20 points. From Day 1 to Day 7 (the first week of membership), each $1 spent for any item earns 20 points. From Day 8 to the last day of January 2021, each ...

  9. tituHere/SQL-Case-Study

    A comprehensive collection of SQL case studies, queries, and solutions for real-world scenarios. This repository provides a hands-on approach to mastering SQL skills through a series of case studies, including table structures, sample data, and SQL queries.

  10. 8 Week SQL Challenge: Case Study #2 Pizza Runner

    GROUP BY pizza_order; Answer: On average, a single pizza order takes 12 minutes to prepare. An order with 3 pizzas takes 30 minutes at an average of 10 minutes per pizza. It takes 16 minutes to prepare an order with 2 pizzas which is 8 minutes per pizza — making 2 pizzas in a single order the ultimate efficiency rate.

  11. Solving Danny Ma's SQL Case Study #1

    Introduction. Danny seriously loves Japanese food so in the beginning of 2021, he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen. Danny's Diner is in need of your assistance to help the restaurant stay afloat — the restaurant has captured some very basic ...

  12. Solving Danny Ma's SQL Case Study #2 Pizza Runner

    Oct 5, 2021. I started Danny Ma's SQL challenge to gain hands-on experience as a beginner in this vast SQL field and it has not disappointed me! So let's get started ...

  13. Instacart SQL Data Analytics Case Study

    For our non-North American friends working on this case study, Instacart is similar to India's Blinkit, Swiggy Instamart, or Dunzo app. In Europe, the comparable apps are Getir and Gorillas. In Latin America, Rappi serves a similar use case. Case Study Background: The Data Analysis Task. You're a data analyst at Instacart.

  14. GitHub

    This repository contains solutions for #8WeekSQLChallenge, they are interesting real-world case studies that will allow you to apply and enhance your SQL skills in many use cases. I used Microsoft SQL Server in writing SQL queries to solve these case studies.

  15. Serious SQL

    Learn SQL best practices by solving multiple case studies using data from health, marketing and HR domains. All-inclusive lifetime access includes a growing library of SQL portfolio projects, interview questions, and exclusive bonus content.

  16. SQL Case Study: Helping a Startup CEO Manage His Data

    In this tutorial, you will learn how to create a table, insert values into it, use and understand some data types, use SELECT statements, UPDATE records, use some aggregate functions, and more. By Ezz El Din Abdullah, Former Data Scientist Intern & Programming Tutor.

  17. SQL By Example

    SQL by Example uses one case study to teach the reader basic structured query language (SQL) skills. The author has tested the case study in the classroom with thousands of students. While other SQL texts tend to use examples from many different data sets, the author has found that once students get used to one case study, they learn the material at a much faster rate.

  18. SQL Practice Case Study with Sample Database

    Great SQL practical case study on Sales Analysis. This case study is an indication of the type of business questions an analyst will get to solve real problems in the industry. The emphasis on understanding how the data is structured before querying the data is well explained. SQL is beautiful.

  19. SQL Case Study: Investigating a Drop in User Engagement

    Indeed, it seems to be the case that the drop in clickthrough rates was attributed specifically to mobile devices and tablets. Weekly digest vs re-engagement emails. So far, I've determined that the lack of engagement is due to a decrease in email clickthrough rates from July to August.

  20. MySQL :: Case Studies

    E Connect Solutions Increases Efficiency and Reduces Costs of its ERP Platform with MySQL Enterprise Edition. Pinkbyte Inc. and its Subsidiary Mazzzing Inc. Deliver Low-cost, Secure Desktop by Migrating to MySQL Enterprise Edition from Microsoft SQL Server. NAVER, Korea's largest search engine, powers online services with MySQL Enterprise ...

  21. A case study of SQL Query tuning in SQL Server

    In this article, we learned practical details about SQL query tuning; these techniques can help when you try to solve a query performance problem. In the case study, the slow query had 3 main problems. These are: Scalar-function problem. Using a serial execution plan.

  22. SQL CASE Expression

    The SQL CASE Expression. The CASE expression goes through conditions and returns a value when the first condition is met (like an if-then-else statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause. If there is no ELSE part and no conditions are true, it returns NULL.
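
The CASE behaviour described in the last entry can be checked with a small, self-contained sketch. It runs the SQL through Python's built-in sqlite3 module so that no database server is needed. The orders table, its columns and its three rows are hypothetical and exist only for illustration; they are not part of any case study listed above.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 5), (2, 30), (3, 60)],
)

query = """
SELECT
    order_id,
    -- Conditions are checked top to bottom and the first true one wins,
    -- so quantity = 60 yields 'large' even though it is also > 20.
    CASE
        WHEN quantity > 50 THEN 'large'
        WHEN quantity > 20 THEN 'medium'
        ELSE 'small'
    END AS size_with_else,
    -- No ELSE branch: rows that match no condition get NULL.
    CASE
        WHEN quantity > 50 THEN 'large'
    END AS size_without_else
FROM orders
ORDER BY order_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the result, the second column always holds a label because of the ELSE branch, while the third column is NULL (None in Python) for every row that fails the single WHEN condition.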