https://www.kaggle.com/zusmani/pakistans-largest-ecommerce-dataset
Special thanks to Dr. Zeeshan Usmani for making this data public.
This is the largest retail e-commerce orders dataset from Pakistan. It contains half a million transaction records from March 2016 to August 2018. The data was collected from various e-commerce merchants as part of a research study. I am releasing this dataset as a capstone project for my data science course at Alnafi (alnafi.com/zusmani). There is a dire need for such a dataset to learn about Pakistan’s emerging e-commerce potential, and I hope it will help many startups in many ways.
Geography: Pakistan
Time period: 03/2016 – 08/2018
Unit of analysis: E-Commerce Orders
The dataset contains detailed information on half a million e-commerce orders in Pakistan from March 2016 to August 2018. It includes item details, shipping method, payment method (credit card, Easy-Paisa, Jazz-Cash, cash-on-delivery), product category (fashion, mobile, electronics, appliances, etc.), date of order, SKU, price, quantity, total, and customer ID. This is the most detailed dataset about e-commerce in Pakistan that you can find in the public domain.
Variables: The dataset contains Item ID, Order Status (Completed, Cancelled, Refund), Date of Order, SKU, Price, Quantity, Grand Total, Category, Payment Method and Customer ID.
Size: 101 MB
File Type: CSV
I would like to thank all the startups who are trying to make their mark in Pakistan despite the unavailability of research data.
I have tried to visualize a series of relationships to identify patterns in the dataset. I have also graphed and documented the notable trends and findings.
I’d like to invite my fellow Kagglers to use machine learning and data science to help me explore these ideas:
- What is the best-selling category?
- Visualize payment method and order status frequency
- Find a correlation between payment method and order status
- Find a correlation between order date and item category
- Find any hidden patterns that are counter-intuitive for a layman
- Can we predict the number of orders, the item category, or the number of customers/amount in advance?
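
As a starting point for the first three ideas, here is a minimal sketch using only the Python standard library. It finds the best-selling category by units in completed orders, counts payment method against order status, and measures their association with Cramér's V, which stands in for "correlation" here because both variables are categorical. The column names (`category`, `payment_method`, `status`, `quantity`) and the inline sample rows are assumptions made up for illustration, not the real file's header or data, so check the downloaded CSV and adjust them.

```python
import csv
import io
import math
from collections import Counter

# Tiny inline sample standing in for the real CSV. The column names and
# the rows are hypothetical; they are not taken from the dataset itself.
SAMPLE_CSV = """category,payment_method,status,quantity
Fashion,cod,Completed,2
Fashion,cod,Cancelled,1
Mobile,credit_card,Completed,1
Mobile,cod,Cancelled,1
Electronics,credit_card,Completed,3
Fashion,credit_card,Completed,1
Mobile,cod,Refund,1
Electronics,cod,Completed,1
"""


def load_rows(source):
    """Read CSV rows from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def units_by_category(rows):
    """Total units per category, counting completed orders only."""
    totals = Counter()
    for row in rows:
        if row["status"] == "Completed":
            totals[row["category"]] += int(row["quantity"])
    return totals


def crosstab(rows, row_key, col_key):
    """Frequency of each (row_key value, col_key value) pair."""
    return Counter((row[row_key], row[col_key]) for row in rows)


def cramers_v(table):
    """Cramér's V for a contingency table given as {(r, c): count}."""
    n = sum(table.values())
    row_totals = Counter()
    col_totals = Counter()
    for (r, c), count in table.items():
        row_totals[r] += count
        col_totals[c] += count
    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            observed = table.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


rows = load_rows(io.StringIO(SAMPLE_CSV))
best_category, best_units = units_by_category(rows).most_common(1)[0]
pay_vs_status = crosstab(rows, "payment_method", "status")
association = cramers_v(pay_vs_status)

print("Best-selling category:", best_category, best_units)
print("Payment method vs. status:", dict(pay_vs_status))
print("Cramér's V:", round(association, 3))
```

To run this on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open handle to the downloaded CSV (opened with `newline=""`) and rename the keys to match its header. Cramér's V ranges from 0 (no association) to 1 (perfect association); on half a million rows, a library such as pandas would likely be more convenient than plain dictionaries.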