This project aims to explore the D-mart Sales data to understand top performing branches and products, sales trend of of different products, customer behaviour. The aims is to study how sales strategies can be improved and optimized.
“In this recruiting competition, job-seekers are provided with historical sales data for 45 D-mart stores located in different regions. Each store contains many departments, and participants must project the sales for each department in each store. To add to the challenge, selected holiday markdown events are included in the dataset. These markdowns are known to affect sales, but it is challenging to predict which departments are affected and the extent of the impact.
The major aim of thie project is to gain insight into the sales data of D-mart to understand the different factors that affect sales of the different branches.
The dataset was obtained from the website. This dataset contains sales transactions from a three different branches of D-mart, respectively located in Mandalay, Yangon and Naypyitaw. The data contains 17 columns and 1000 rows:
Column | Description | Data Type |
---|---|---|
invoice_id | Invoice of the sales made | VARCHAR(30) |
branch | Branch at which sales were made | VARCHAR(5) |
city | The location of the branch | VARCHAR(30) |
customer_type | The type of the customer | VARCHAR(30) |
gender | Gender of the customer making purchase | VARCHAR(10) |
product_line | Product line of the product solf | VARCHAR(100) |
unit_price | The price of each product | DECIMAL(10, 2) |
quantity | The amount of the product sold | INT |
VAT | The amount of tax on the purchase | FLOAT(6, 4) |
total | The total cost of the purchase | DECIMAL(10, 2) |
date | The date on which the purchase was made | DATE |
time | The time at which the purchase was made | TIMESTAMP |
payment_method | The total amount paid | DECIMAL(10, 2) |
cogs | Cost Of Goods sold | DECIMAL(10, 2) |
gross_margin_percentage | Gross margin percentage | FLOAT(11, 9) |
gross_income | Gross Income | DECIMAL(10, 2) |
rating | Rating | FLOAT(2, 1) |
Conduct analysis on the data to understand the different product lines, the products lines performing best and the product lines that need to be improved.
This analysis aims to answer the question of the sales trends of product. The result of this can help use measure the effectiveness of each sales strategy the business applies and what modificatoins are needed to gain more sales.
This analysis aims to uncover the different customers segments, purchase trends and the profitability of each customer segment.
- Build a database
- Create table and insert the data.
- Select columns with null values in them. There are no null values in our database as in creating the tables, we set NOT NULL for each field, hence null values are filtered out.
- Add a new column named
time_of_day
to give insight of sales in the Morning, Afternoon and Evening. This will help answer the question on which part of the day most sales are made.
- Add a new column named
day_name
that contains the extracted days of the week on which the given transaction took place (Mon, Tue, Wed, Thur, Fri). This will help answer the question on which week of the day each branch is busiest.
- Add a new column named
month_name
that contains the extracted months of the year on which the given transaction took place (Jan, Feb, Mar). Help determine which month of the year has the most sales and profit.
Exploratory Data Analysis (EDA): Exploratory data analysis is done to answer the listed questions and aims of this project.
Conclusion:
$ COGS = unitsPrice * quantity $
$ VAT = 5\% * COGS $
$VAT$ is added to the $COGS$ and this is what is billed to the customer.
$ total(gross_sales) = VAT + COGS $
$ grossProfit(grossIncome) = total(gross_sales) - COGS $
Gross Margin is gross profit expressed in percentage of the total(gross profit/revenue)
$ \text{Gross Margin} = \frac{\text{gross income}}{\text{total revenue}} $
**Example with the first row in our DB:**
Data given:
$ COGS = 45.79 * 7 = 320.53 $
$ \text{VAT} = 5\% * COGS\= 5\% 320.53 = 16.0265 $
$ total = VAT + COGS\= 16.0265 + 320.53 = $336.5565$
$ \text{Gross Margin Percentage} = \frac{\text{gross income}}{\text{total revenue}}\=\frac{16.0265}{336.5565} = 0.047619\\approx 4.7619\% $
-- Create database
CREATE DATABASE IF NOT EXISTS walmartSales;
-- Create table
CREATE TABLE IF NOT EXISTS sales(
invoice_id VARCHAR(30) NOT NULL PRIMARY KEY,
branch VARCHAR(5) NOT NULL,
city VARCHAR(30) NOT NULL,
customer_type VARCHAR(30) NOT NULL,
gender VARCHAR(30) NOT NULL,
product_line VARCHAR(100) NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
quantity INT NOT NULL,
tax_pct FLOAT(6,4) NOT NULL,
total DECIMAL(12, 4) NOT NULL,
date DATETIME NOT NULL,
time TIME NOT NULL,
payment VARCHAR(15) NOT NULL,
cogs DECIMAL(10,2) NOT NULL,
gross_margin_pct FLOAT(11,9),
gross_income DECIMAL(12, 4),
rating FLOAT(2, 1)
);