
Kriti Dutta data.kriti.dutta@gmail.com (+91) 8901316295

LinkedIn | GitHub

EvalueServe Pvt Limited, Gurugram, India March 2021 – Present
Senior Software Engineer, NLP
● Upgraded a process, saving 20 hours per week (~INR 3 lakhs in savings per analyst).
● Worked closely with the Product team on the design, development, debugging, effort estimation, and maintenance of
machine learning models, with 90% productivity.
● Owned the end-to-end process from model development to deployment with minimal bug reports (only 5).
● Used state-of-the-art natural language processing algorithms to analyze and solve complex business problems.
● Collaborated with internal stakeholders to identify and gather analytical requirements for customer, product, and
project needs.

Gemini Solutions Pvt Limited, Gurugram, India January 2018 – March 2021
Machine Learning Engineer
● Increased model accuracy by 10% over the model built by the client's in-house team.
● Leveraged analytics to drive business impact, including customer interests.
● Proposed several proofs of concept for clients to improve business impact.
● Scaled analytic capabilities using big data technologies, evolving analytics to influence investment banking strategic
● Performed customer segmentation using machine learning algorithms to improve marketing campaigns.

Information Extraction for Financial Spreading
● Processed companies' SEC filings in PDF format and extracted important data points from the text.
● Used USE and fastText sentence embeddings to extract sentences and a custom NER model to retrieve
numbers for autofilling financial spreads.
● Optimized the algorithm and reduced prediction time by 80%.

Long-short trading strategy using topic modelling of financial documents

● Segregated 1,000 company financial documents using unsupervised algorithms such as K-means clustering on
LDA-derived document vectors.
● Predicted fundamentals of a company in 20 sectors and 70 subsectors.

Housing customer return prediction

● Used a dataset of customer responses to 400 survey questions to predict whether customers would return.
● Conducted univariate and bivariate analysis for target variable prediction, engineered new features, and built a logistic
regression model on the selected features with a KS statistic of 55.

Financial sentiment analysis using earnings call transcripts

● Deduced and classified the financial sentiment of CEOs and CFOs of different companies from their earnings call
transcripts with 86% accuracy.
● Developed a complete pipeline for counting financial keywords and classifying them into 7 different sentiments.
● Classified the sentences in transcripts with a state-of-the-art BERT model using transfer learning.

Predictive modeling in Carry Trade

● Rebalanced a carry trade portfolio of 100 tickers and predicted daily returns.
● Built a feed-forward network and used cross-validation to select the best parameters for the LSTM model for
different instruments.
● The trading strategy achieved an alpha of 0.2.

Mortgage Prepayment Model conversion to PySpark

● Used historical mortgage data from 1996 to 2018 to predict whether a customer would prepay their mortgage.
● Built features from the base data and predicted mortgage prepayment from those features.
● Designed a complete data pipeline for preprocessing and feature creation at scale using PySpark and HDFS.

Information retrieval system for 10K-10Q filings based on sentence similarity

● Developed a query-based system to extract and compare information from a corpus of financial documents consisting of
10K-10Q filings.
● Computed sentence vectors over whole documents and, based on their similarity to the given query, retrieved multiple
similar documents with similarity greater than 0.86.

BE (Honors) Computer Science and Engineering August 2014 – May 2018
Department of Computer Science, Panjab University, CGPA (8.39/10)

● Programming Languages: Python (NumPy, Scikit-learn, Matplotlib, PyTorch), PySpark
● Machine Learning: Predictive Modelling, Topic Modelling, Clustering, Linear Regression, Logistic Regression,
Decision Trees, Random Forests, SVM, PCA, Boosting and Bagging Techniques, TF-IDF
● Deep Learning: Neural Networks, LSTM, Natural Language Processing, Transfer Learning (BERT), Word2Vec
Model, USE, InferSent, Hugging Face Library, Named Entity Recognition (NER)
● Other relevant skills: Data Visualization, Data Engineering and Analytics, Predictive Analytics, NLTK

● You Make a Difference Award: For saving time in fundamental analysis of a company through information extraction.
● On-the-Spot Award: For working on topic modelling projects.
● Accurate Information Extraction: Achieved 85% accuracy in information extraction.
● Problem Resolution: Worked with design and test development engineers to analyze data and resolved process
● Quality Enhancement: Streamlined the design and development process and increased production code quality.
● Production pipelines: Developed production-ready machine learning pipelines for all the models.
