This project implements the following concepts to predict the output sales value:
- Linear Regression
- Lasso Regularisation
- Ridge Regularisation
- Heteroscedasticity
- Cross Validation
- Feature Selection
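
To make two of these concepts concrete, here is a minimal sketch of ridge regression on a single feature, scored with k-fold cross validation. It is not the project's actual code. It uses only the Python standard library, and the data is synthetic (generated in the snippet, not taken from the dataset). The function names `fit_ridge_1d`, `rmse`, and `cross_val_rmse` are hypothetical. Setting `alpha` to 0 reduces the fit to ordinary linear regression. Lasso is not shown here.

```python
import random
import statistics


def fit_ridge_1d(xs, ys, alpha):
    """Fit y ~ intercept + slope * x, with an L2 penalty alpha on the slope only."""
    x_mean = statistics.fmean(xs)
    y_mean = statistics.fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)          # alpha > 0 shrinks the slope toward 0
    intercept = y_mean - slope * x_mean  # the intercept is not penalised
    return intercept, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


def cross_val_rmse(xs, ys, alpha, k=5):
    """Average held-out RMSE over k folds (every k-th point forms one fold)."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        intercept, slope = fit_ridge_1d(train_x, train_y, alpha)
        scores.append(rmse(test_x, test_y, intercept, slope))
    return statistics.fmean(scores)


# Synthetic toy data: a price-like feature and a noisy linear target.
rng = random.Random(0)
prices = [rng.uniform(30, 270) for _ in range(200)]
sales = [15.0 * p + rng.gauss(0, 300) for p in prices]

# Compare a few penalty strengths by their cross-validated error.
for alpha in (0.0, 1e3, 1e6):
    print(alpha, round(cross_val_rmse(prices, sales, alpha), 1))
```

In practice, a library such as scikit-learn would handle multiple features and the Lasso penalty. The point of the sketch is only that the penalty strength is chosen by comparing held-out error, not training error.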
About the Dataset:
The data consists of a training set (8,523 rows) and a test set (5,681 rows). The training set contains both the input variables and the output variable. The task is to predict sales for the test set.
Feature Descriptions
- **Item_Identifier**: Unique product ID
- **Item_Weight**: Weight of the product
- **Item_Fat_Content**: Whether the product is low fat or not
- **Item_Visibility**: Percentage of a store's total display area (across all products) allocated to this product
- **Item_Type**: The category to which the product belongs
- **Item_MRP**: Maximum Retail Price (list price) of the product
- **Outlet_Identifier**: Unique store ID
- **Outlet_Establishment_Year**: The year in which the store was established
- **Outlet_Size**: The size of the store in terms of ground area covered
- **Outlet_Location_Type**: The type of city in which the store is located
- **Outlet_Type**: Whether the outlet is a grocery store or some sort of supermarket
- **Item_Outlet_Sales**: Sales of the product in the particular store. This is the outcome variable to be predicted.
- **source**: Whether the row is a training or a test data point
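
The `source` field suggests that the training and test rows are combined into one table for preprocessing and separated again before modelling. A minimal sketch of that split follows. It assumes the field holds the literal values `"train"` and `"test"` and that column names use underscores (for example `Item_Outlet_Sales`). The rows, identifiers, and the helper `split_by_source` are made up for illustration.

```python
# Hypothetical combined rows; values are illustrative, not from the dataset.
combined = [
    {"Item_Identifier": "A1", "Item_MRP": 100.0, "Item_Outlet_Sales": 1500.0, "source": "train"},
    {"Item_Identifier": "B2", "Item_MRP": 50.0, "Item_Outlet_Sales": 700.0, "source": "train"},
    {"Item_Identifier": "C3", "Item_MRP": 80.0, "Item_Outlet_Sales": None, "source": "test"},
]


def split_by_source(rows):
    """Separate combined rows back into training and test rows."""
    train = [row for row in rows if row["source"] == "train"]
    test = [row for row in rows if row["source"] == "test"]
    return train, test


train_rows, test_rows = split_by_source(combined)
print(len(train_rows), len(test_rows))
```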