Developed A Machine Learning Based Solution to Categorize Products And Improve Search On The Website for a Leading Distributor


The client had an inventory of 3 million SKUs, of which only ~10% had a product hierarchy attached. Categorizing the remaining SKUs was a slow manual process that led directly to lost revenue and a poor search experience on the website. The task was to automate categorization to reduce revenue loss and improve the user experience.


To achieve this, we had to:

  • Extract data elements, such as product description and vendor name, that influence the category
  • Cleanse the data by removing stop words and special characters and performing lemmatization
  • Use text mining to create a term-document matrix and evaluate model accuracy
  • Add a confidence threshold to label each predicted category as ‘Low’ or ‘High’ confidence
  • Add business rules and improve model accuracy iteratively by analyzing ‘Low’ confidence predictions
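
As a rough illustration, the steps above could be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the solution that was delivered: the case study does not name the classifier, so multinomial Naive Bayes is used here as a stand-in, and the stop-word list, the plural-stripping rule (a crude substitute for real lemmatization), the 0.8 threshold, and the sample products and vendor names are all hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; a real list would be much longer).
STOP_WORDS = {"a", "an", "and", "for", "of", "the", "with"}


def cleanse(text):
    """Lowercase, strip special characters, drop stop words, normalize simple plurals."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    # Crude stand-in for lemmatization: strip one trailing "s" from longer tokens.
    return [t[:-1] if t.endswith("s") and not t.endswith("ss") and len(t) > 3 else t
            for t in tokens if t not in STOP_WORDS]


def term_document_matrix(docs):
    """Return (vocab, rows), where rows[i][j] counts term j in document i."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: j for j, t in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0] * len(vocab)
        for t in d:
            row[index[t]] += 1
        rows.append(row)
    return vocab, rows


def train(rows, labels):
    """Fit multinomial Naive Bayes with add-one smoothing (assumed model choice)."""
    n_terms = len(rows[0])
    class_docs = Counter(labels)
    term_counts = {c: [0] * n_terms for c in class_docs}
    for row, label in zip(rows, labels):
        for j, n in enumerate(row):
            term_counts[label][j] += n
    model = {}
    for c, counts in term_counts.items():
        total = sum(counts) + n_terms
        model[c] = (math.log(class_docs[c] / len(labels)),
                    [math.log((n + 1) / total) for n in counts])
    return model


def predict(model, vocab, text, threshold=0.8):
    """Return (category, probability, 'High' or 'Low') for one product description."""
    index = {t: j for j, t in enumerate(vocab)}
    tokens = [t for t in cleanse(text) if t in index]
    scores = {c: prior + sum(log_probs[index[t]] for t in tokens)
              for c, (prior, log_probs) in model.items()}
    peak = max(scores.values())
    weights = {c: math.exp(s - peak) for c, s in scores.items()}
    best = max(weights, key=weights.get)
    prob = weights[best] / sum(weights.values())
    return best, prob, "High" if prob >= threshold else "Low"


# Hypothetical labelled SKUs (description plus vendor name).
TRAINING = [
    ("Acme cordless drill 18V", "Power Tools"),
    ("Acme hammer drill with case", "Power Tools"),
    ("Bolt Co. hex bolts, zinc plated", "Fasteners"),
    ("Bolt Co. wood screws 2 inch", "Fasteners"),
]
vocab, rows = term_document_matrix([cleanse(text) for text, _ in TRAINING])
model = train(rows, [label for _, label in TRAINING])

print(predict(model, vocab, "Acme drill"))  # clear-cut description
print(predict(model, vocab, "zinc case"))   # terms point to different categories
```

With this toy data, the first description is labelled ‘High’ confidence and the second, whose terms pull toward different categories, falls below the threshold and is labelled ‘Low’. Predictions in that second group are the ones that would go to manual review and inform new business rules.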

Key Benefits

  • The Machine Learning-based algorithm classifies products into the hierarchy with high confidence and lets the client incorporate feedback to improve classification over time
  • Man-hours are now spent only on validating the ‘Low Confidence’ outputs. The corrected output is fed back into the model’s learning algorithm to reduce similar errors in the future.
  • Customers can now search for these classified products on the website, since the products can be indexed after classification and enrichment. This directly improves the search experience, reduces related inbound customer care calls, and generates incremental revenue.
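
The review-and-feedback loop described above might look something like the sketch below. It is an illustration only: the `Prediction` record, the function names, and the sample SKUs are hypothetical, and the retraining step itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    sku: str
    description: str
    category: str
    confidence: str  # "High" or "Low"


def split_for_review(predictions):
    """Auto-accept 'High' predictions; queue 'Low' ones for a human reviewer."""
    accepted = [p for p in predictions if p.confidence == "High"]
    queued = [p for p in predictions if p.confidence == "Low"]
    return accepted, queued


def fold_in_corrections(training, queued, reviewed):
    """Append reviewer-approved labels to the training set for the next retrain.

    `reviewed` maps SKU -> category chosen by the reviewer. SKUs that have not
    been reviewed yet are returned so they stay in the queue.
    """
    still_queued = []
    for p in queued:
        if p.sku in reviewed:
            training.append((p.description, reviewed[p.sku]))
        else:
            still_queued.append(p)
    return still_queued


# Hypothetical model output for three SKUs.
predictions = [
    Prediction("SKU-1", "Acme cordless drill", "Power Tools", "High"),
    Prediction("SKU-2", "zinc case", "Power Tools", "Low"),
    Prediction("SKU-3", "steel washer", "Power Tools", "Low"),
]
training = []
accepted, queued = split_for_review(predictions)
# A reviewer has corrected SKU-2 so far; SKU-3 remains in the queue.
queued = fold_in_corrections(training, queued, {"SKU-2": "Fasteners"})
```

Only the queued items consume reviewer time, and each correction becomes a new labelled example for the next training run.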


30X faster categorization than the existing manual process

99% accuracy for ‘High Confidence’ predictions and overall accuracy of 95%

$250K per year cost reduction in 3rd party expenses for manual categorization

Better search experience on the website and incremental revenue
