Classifying products into categories is an important task. It yields insights about each product, and it also helps with product matching when you search for a product on an eCommerce site.
The training data has around 6 lakh (600,000) records and the test data has around 4.5 lakh (450,000).
Column information:
- Store ID: ID of the store from which the information was taken.
- Url: URL of the product that the information describes.
- Attributes: A map of key-value pairs describing the properties of the product, such as its ISBN number and other relevant information.
- Breadcrumbs: The path of the product on an eCommerce site. It is usually shown in the top left corner of the page.
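To make the columns concrete, here is a minimal sketch of how one record could be flattened into a single text string before it is fed to a classifier. The field names (`url`, `attributes`, `breadcrumbs`), the `>` breadcrumb separator, and the example values are all assumptions made for illustration; the actual data format may differ.

```python
from urllib.parse import urlparse


def record_to_text(record):
    """Flatten one product record into a single lowercase string.

    The keys "url", "attributes" and "breadcrumbs" are hypothetical names
    chosen for this sketch, not the dataset's confirmed schema.
    """
    parts = []

    # Breadcrumbs: assumed to be one string with levels separated by '>'.
    breadcrumbs = record.get("breadcrumbs", "")
    parts.extend(level.strip() for level in breadcrumbs.split(">") if level.strip())

    # Attributes: emit "key value" pairs in sorted key order so the
    # output does not depend on the order of the map.
    for key, value in sorted(record.get("attributes", {}).items()):
        parts.append(f"{key} {value}")

    # URL: the path segments often contain product words.
    path = urlparse(record.get("url", "")).path
    parts.extend(seg for seg in path.replace("-", " ").split("/") if seg)

    return " ".join(parts).lower()


# Made-up record, for illustration only.
example = {
    "store_id": 12,
    "url": "http://example.com/books/the-hobbit",
    "attributes": {"ISBN": "9780261102217", "Author": "J.R.R. Tolkien"},
    "breadcrumbs": "Home > Books > Fantasy",
}
text = record_to_text(example)
print(text)
```

A string like this could then be passed to any text vectorizer. The store ID is left out here because it is an identifier rather than descriptive text, which is a design choice of this sketch.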
Problem Statement: Given the product information, we need to predict the label for that product. This is a multi-class classification problem.
Results: accuracy with SVM: 99.9%; accuracy with Random Forest: 98.9%.
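For reference, the sketch below shows how an accuracy figure of this kind can be computed, assuming accuracy means the fraction of products whose predicted label equals the true label. The function names and the toy labels are hypothetical. A per-class breakdown is included because, in a multi-class setting, a high overall accuracy can hide weak classes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its items predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / count for label, count in totals.items()}


# Toy labels, for illustration only.
y_true = ["books", "books", "toys", "music"]
y_pred = ["books", "toys", "toys", "music"]

overall = accuracy(y_true, y_pred)
by_class = per_class_recall(y_true, y_pred)
print(f"accuracy: {overall:.2%}")
print(by_class)
```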