Instacart is a large grocery company that specializes in online orders and delivery services. How can customer sales data be analyzed and used to increase sales even further?
This project analyzed customer purchasing behavior to generate potential business strategies. Customers were classified into groups to create customer profiles that could be targeted via methods such as content-oriented advertising.
Data Wrangling and Subsetting
Combining and Exporting Dataframes
Deriving New Variables
Population Flows
Data Visualization with Python
Python
Tableau
Excel
Customers dataset
Orders dataset
Products dataset
Departments dataset
Customer data covered 2017 only. Weekly and hourly time series analysis was completed, but long-term time series analysis was not possible because the data does not span multiple years.
Fake names were used for customer name data to ensure PII was not leaked.
The busiest days of the week for customers were Saturday, Sunday, and Friday, marked as 0, 1, and 2, respectively.
Customers shopped the most between 10AM and 4PM, with orders declining after 4PM.
Although the graph of average price per hour appears to fluctuate throughout the day, the differences are negligible, deviating by only around 10 cents. On average, customers do not spend more at particular times of day.
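The day, hour, and price findings above can be reproduced with a few pandas aggregations. The following is a minimal sketch on made-up sample data; the column names (`order_day_of_week`, `order_hour_of_day`, `price`) are assumptions and may not match the project's actual dataframes.

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative assumptions.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6],
    "order_day_of_week": [0, 0, 1, 2, 0, 1],   # 0 = Saturday, 1 = Sunday
    "order_hour_of_day": [10, 11, 10, 16, 15, 20],
    "price": [7.8, 7.9, 7.8, 7.7, 7.9, 7.8],
})

# Number of orders per day of week, busiest day first.
orders_per_day = orders["order_day_of_week"].value_counts()

# Number of orders per hour, sorted by hour so it can be plotted as a time series.
orders_per_hour = orders["order_hour_of_day"].value_counts().sort_index()

# Average price per hour, and the gap between the most and least expensive hour.
avg_price_per_hour = orders.groupby("order_hour_of_day")["price"].mean()
price_spread = avg_price_per_hour.max() - avg_price_per_hour.min()

print(orders_per_day)
print(orders_per_hour)
print(f"Spread in average hourly price: {price_spread:.2f}")
```

Reporting the spread as a single number makes it easier to judge whether fluctuations in the hourly price chart are large enough to matter.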
The most popular department was produce, having almost 4 million more orders than the next highest department.
Customers were more likely to shop for food products compared to other miscellaneous products.
Each region's share of orders was very consistent across departments, indicating that ordering habits did not differ by region.
The Southern US region makes up the largest percentage of orders, whereas the Northeast US makes up the smallest.
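One way to compare regions is a normalized cross-tabulation, which gives each region's percentage of orders within every department. This is a sketch on invented data; the `department` and `region` column names and the region labels are assumptions.

```python
import pandas as pd

# Hypothetical sample: one row per order, with assumed column names.
df = pd.DataFrame({
    "department": ["produce", "produce", "produce", "produce",
                   "snacks", "snacks", "snacks", "snacks"],
    "region": ["South", "South", "West", "Northeast",
               "South", "South", "Midwest", "Northeast"],
})

# Row-normalized crosstab: within each department, the percentage of orders per region.
region_share = pd.crosstab(df["department"], df["region"], normalize="index") * 100

# Overall percentage of orders per region, largest first.
overall_share = df["region"].value_counts(normalize=True) * 100

print(region_share.round(1))
print(overall_share.round(1))
```

If the rows of `region_share` look alike, ordering habits are similar across regions, and `overall_share` shows which region contributes the most orders.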
Customers were classified as Old Adults (55+), Middle-Age Adults (35-55), and Young Adults (<35).
The largest number of orders still came from food departments such as produce, dairy/eggs, and snacks.
Although Old Adults placed the most orders overall, each age group's relative share of orders was similar across departments.
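The age classification can be sketched with `pd.cut`. The stated ranges overlap at 55, so this example assumes that a 55-year-old counts as an Old Adult; the `age` and `age_group` column names are also assumptions.

```python
import pandas as pd

# Hypothetical customers; "age" is an assumed column name.
customers = pd.DataFrame({"age": [22, 34, 35, 54, 55, 70]})

# Left-closed bins: [0, 35) Young, [35, 55) Middle-Age, [55, inf) Old.
# Assigning age 55 to "Old Adult" is an assumption, since the source ranges overlap there.
customers["age_group"] = pd.cut(
    customers["age"],
    bins=[0, 35, 55, float("inf")],
    right=False,
    labels=["Young Adult", "Middle-Age Adult", "Old Adult"],
)

print(customers)
```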
Customers were classified into income profiles: "Low income" (<75,000), "Middle income" (between 75,000 and 150,000), and "High income" (>150,000).
The middle income group was the largest, accounting for roughly 50% of customers.
Relative percentages were mostly the same except for slight variations in certain departments.
High income customers bought more alcohol and pet products.
Middle income customers bought slightly more baby products and food products like meat and seafood.
Low income customers purchased more bulk goods and snacks.
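The income profiles and their department mix could be derived along these lines. This is a sketch with invented rows; the column names, department labels, and the use of `.loc` assignments are assumptions rather than the project's actual code.

```python
import pandas as pd

# Hypothetical sample: one row per order, with the customer's income attached.
df = pd.DataFrame({
    "income": [40_000, 75_000, 120_000, 150_000, 150_001, 300_000],
    "department": ["snacks", "babies", "meat seafood", "produce", "alcohol", "pets"],
})

# Start with the middle band (75,000 to 150,000 inclusive), then overwrite the tails.
df["income_profile"] = "Middle income"
df.loc[df["income"] < 75_000, "income_profile"] = "Low income"
df.loc[df["income"] > 150_000, "income_profile"] = "High income"

# Column-normalized crosstab: each department's percentage of a profile's orders.
dept_mix = pd.crosstab(df["department"], df["income_profile"], normalize="columns") * 100

print(df)
print(dept_mix.round(1))
```

Comparing the columns of `dept_mix` side by side highlights departments where one income profile is over-represented relative to the others.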
Customers were also classified into family profiles based on marital status and number of dependents.
Married adults with dependents were the largest group, at around 70% of customers.
Similar to income profiles, there were small variations in department orders based on family profiles.
Married adults with dependents spent slightly more on bulk goods and household products.
Single adults preferred to spend more on alcohol.
Young adults with family or dependents spent slightly more on alcohol and pet products.
Marketing campaigns should focus on Saturdays and Sundays. Promotional sales on the weekend would encourage even more shopping.
Ads should be run during peak hours from 10AM to 4PM and slightly earlier (from 8AM to 10AM) to spur even more shopping.
Expand inventory for food departments such as produce, dairy/eggs, and snacks since these are the most popular departments.
Targeted ads and loyalty discounts should be used for each group. For example, low income customers should be shown ads for bulk goods, whereas middle income customers should receive ads for baby products. Loyalty programs would also help increase customer retention, since new customers (those who placed fewer than 10 orders) make up ~16% of the customer base.
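The share of new customers can be estimated by counting orders per customer and flagging those below the threshold of 10. This sketch uses invented data and assumes one row per order with a `user_id` column.

```python
import pandas as pd

# Hypothetical orders table: one row per order; "user_id" is an assumed column name.
orders = pd.DataFrame({"user_id": [1] * 12 + [2] * 3 + [3] * 10 + [4] * 9})

# Total number of orders placed by each customer.
order_counts = orders.groupby("user_id").size()

# A customer is "new" if they placed fewer than 10 orders.
is_new = order_counts < 10

# Percentage of customers flagged as new.
new_customer_share = is_new.mean() * 100

print(f"New customers: {new_customer_share:.1f}% of the customer base")
```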