Machine Learning for a Large-Scale Logistics Platform
Sub-project: An online recommendation system based on collaborative filtering for implicit data, using sentiment- and frequency-dependent weighting schemes.
Technical details:
- Implemented a state-of-the-art online collaborative filtering algorithm in NumPy, based on "Fast Matrix Factorization for Online Recommendation with Implicit Feedback" (He et al.).
- Integrated an element-wise Alternating Least Squares (eALS) based incremental update strategy for online learning.
- Developed an autoencoder-based deep recommender algorithm for online collaborative filtering in TensorFlow.
- Used the VADER model in NLTK for sentiment analysis of comments.
- Improved the algorithm's results by using an interaction-count- and sentiment-dependent weighting scheme for the observed data and a frequency-aware weighting scheme for the missing data.
- Built multiple Kafka consumers and a producer to consume real-time interaction data (comments, likes, and views) in parallel and produce online recommendations for users.
- Used Locust to simulate parallel user interactions for testing the recommendation algorithm.
- Used an eventually consistent engagement database (Couchbase) for storing user- and item-based data.
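
The two weighting schemes listed above can be illustrated with a minimal NumPy sketch. The missing-data weights use the popularity-based form from He et al., c_i = c0 * f_i^alpha / sum_j f_j^alpha. The observed-data formula is an assumption made for illustration only: the function name `observed_weight` and its parameters (`base`, `count_scale`, `sentiment_scale`, `floor`) are hypothetical and are not taken from the project.

```python
import numpy as np


def missing_data_weights(item_counts, c0=1.0, alpha=0.5):
    """Frequency-aware weights for unobserved (missing) entries.

    Popular items get a larger weight, so an item the user has not
    interacted with counts as a stronger negative signal when it is popular.
    Assumes at least one item has a non-zero count.
    """
    counts = np.asarray(item_counts, dtype=float)
    powered = counts ** alpha
    return c0 * powered / powered.sum()


def observed_weight(interaction_count, sentiment, base=1.0,
                    count_scale=0.5, sentiment_scale=0.5, floor=0.1):
    """Hypothetical weight for an observed user-item interaction.

    interaction_count: number of interactions (views, likes, comments).
    sentiment: a score in [-1, 1], for example a VADER compound score.
    The weight grows with log(1 + count) and with sentiment, and is
    clipped at `floor` so that it stays positive.
    """
    w = base + count_scale * np.log1p(interaction_count) + sentiment_scale * sentiment
    return np.maximum(w, floor)


if __name__ == "__main__":
    print(missing_data_weights([1, 4, 9]))
    print(observed_weight(5, 0.8), observed_weight(5, -0.8))
```

With these definitions the missing-data weights always sum to `c0`, and a positively rated, frequently visited item receives a higher observed weight than a negatively rated one with the same interaction count.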
Sub-project: Identification and classification of toxic comments.
Technical details:
- Implemented a bidirectional LSTM-based model in Keras for flagging toxic comments based on six metrics.
- Built Kafka consumer and producer data pipelines for recording and processing new comments.
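
Flagging a comment on six metrics amounts to multi-label classification: the model emits one score per metric, and each score is thresholded independently. The sketch below shows only that final step in plain Python. The label names, the `flag_comment` helper, and the 0.5 threshold are assumptions for illustration; the source does not name the six metrics or the decision rule.

```python
# Hypothetical label names; the source does not list the six metrics.
LABELS = ("toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate")


def flag_comment(scores, threshold=0.5):
    """Return the labels whose score reaches the threshold.

    scores: six per-label probabilities in [0, 1], for example the
    sigmoid outputs of a multi-label classifier for one comment.
    """
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return [label for label, score in zip(LABELS, scores) if score >= threshold]


if __name__ == "__main__":
    print(flag_comment([0.9, 0.1, 0.6, 0.0, 0.5, 0.2]))
```

Because each label is thresholded on its own, a comment can be flagged on several metrics at once or on none.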