How to Build, Structure and Manage a Data Science Team Successfully
Companies increasingly stress the importance of data in their internal decision-making and are investing more in their data teams. Lately there has been a race, especially among big tech companies, to expand data teams to meet this need.
In my career, I have had the opportunity to work for some of those big companies and have taken on different roles in their data teams. I have also had numerous data science interviews with companies of various sizes. From my previous experiences, I wouldn’t be…
Word2Vec to Product2Vec — Product Embeddings Generator
Product I: Similar Items Recommender
Product II: Product Taxonomy Expander
Product III: Advanced Content-Based Similar Items Recommender
Product IV: Personalised Items Recommender
Product V: Listing Category Corrector
Product VI: User Interest Classifier
Conclusion — The Big Picture
This is the story of how we developed a bundle of six data science products for an e-commerce platform (a marketplace, in our case). The essence of the story is that all the products in the bundle stem from a single source: a list of numerical product representations or…
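As a rough illustration of how one shared set of product vectors could feed a product such as the Similar Items Recommender listed above, here is a minimal sketch. It is not the implementation from the article: the product names, the three-dimensional vectors and the helper functions are all hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math

# Hypothetical toy embeddings. Real vectors would come from a trained
# Product2Vec-style model and would have many more dimensions.
EMBEDDINGS = {
    "red_sneakers": [0.9, 0.1, 0.0],
    "blue_sneakers": [0.8, 0.2, 0.1],
    "leather_boots": [0.6, 0.5, 0.1],
    "garden_chair": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def similar_items(product_id, embeddings, top_k=2):
    """Rank all other products by cosine similarity to the given product."""
    query = embeddings[product_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != product_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for item, score in similar_items("red_sneakers", EMBEDDINGS):
        print(f"{item}: {score:.3f}")
```

The same dictionary of vectors could, under the same assumptions, be reused as input features for the other products in the bundle, which is the sense in which they share a single source.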
Product recommendations are arguably the most important component of an e-commerce website or mobile application. Companies can often raise click-through rate (CTR), a crucial business metric, by improving their recommendations. I therefore believe that recommendation is the field where data science teams have the best chance to increase their visibility in a business by making a profound impact on the product.
This is the story of the impact that our new recommender approach has had on the marketplace platforms of eBay Classifieds Group (eCG).
I. Problem Statement:
You have very likely come across the concept of ‘Know Your Audience’. It is a crucial approach that most businesses use to identify different customer groups and their respective needs. Its importance lies in helping a business understand and reach its customers better, which lets it deliver content and messages in a more efficient and personalised way.
This concept gives businesses a chance to move from one-size-fits-all strategies to more customer-centric ones.
Customer Segmentation is a commonly used broad term for applying the ‘Know…
Learn more about an end-to-end journey of a real-life data science project
Get some modelling tips and key takeaways
See a predictive machine learning model in action
Conversion is one of the most crucial metrics for all kinds of e-business. Although its definition varies across the industry, it has always been a vital metric to track when measuring the overall success of an e-platform. For a pure e-commerce website such as eBay, conversion means completing a purchase or a payment. …
Gradient Boosted Trees (GBT) is a commonly used tree-based ML algorithm that works for both regression and classification problems. It has attracted a lot of attention from the data science community because of its success on industrial problems and on data mining competition platforms such as Kaggle. In fact, XGBoost, a popular GBT implementation, has reportedly been used in almost half of the winning solutions in Kaggle contests.
In this article, we will discuss what makes this algorithm so effective compared to others. We will explore the technical details and shed light on the mathematical formulation of the GBT algorithm in…
It is important to learn how a machine learning algorithm works behind the scenes. For a data scientist, it is crucial to be curious about the logic and the math behind these algorithms. However, even with the best parameter configuration, relying on a single ML algorithm can limit your performance on a data mining problem. This is where two important machine learning strategies come in to boost modelling performance: ‘Bagging’ and ‘Boosting’. We should keep these concepts in mind for almost every project that we’ll be…
I don’t know if you’re the kind of person who gets attached to one machine learning algorithm and uses it whenever it is applicable to a problem, but I am that kind of machine learning geek. My favorite algorithm is Random Forest, and I have my own reasons for this preference. First of all, Random Forest is one of the most easy-going of all machine learning algorithms.
Let’s make a list of some advantages of Random Forest:
Currently Amsterdam-based and working at eBay. Senior Data Scientist with an M.Sc. degree in Machine Learning and 7 years of professional experience.