Kumar Kishalaya
San Francisco, CA 564-***-**** ad667v@r.postjobfree.com LinkedIn GitHub Medium AWS
SUMMARY
Data Scientist with 6+ years of experience in the e-recruitment, digital platforms, and healthcare industries. Master’s degree in Business Analytics from UC Davis. Specialize in A/B testing, product analytics, ML algorithms, personalization, NLP, GenAI, and RAG-based solutions. Proven leadership and collaboration skills.
WORK EXPERIENCE
Data Scientist, UC Davis Health, Sacramento, CA Sep 2023 – Jun 2024
• Developed a RAG-based customer support chatbot to enhance query response accuracy and reduce resolution time by 90%.
• Developed a classification model to predict patient late arrivals and no-shows, achieving an F1 score of ~80%. Devised intervention strategies based on the model's predictions, leading to a 10% increase in provider-patient face time.
• Designed and built 5+ Tableau dashboards to monitor clinic KPIs, saving 8-10 hours of planning and analysis time weekly.
Senior Data Scientist (Team Lead), Internshala, Gurugram, India Aug 2021 – Apr 2023
• Collaborated with functional heads, finance team, and CEO to develop yearly business projections using SQL and time series analysis from 2019 to 2022, achieving projections within 5% of actual data every year.
• Built and maintained 15+ KPI dashboards empowering analytics and marketing teams to track vital metrics like ARPU, churn, and CLTV-CAC, leading to an optimized subscription-based model and a ~20% increase in revenue.
• Designed and conducted 10+ A/B tests and root cause analyses on key metrics, fixing major product leaks and improving user TOFU, increasing engagement by ~30% over a 2-year period.
• Built a recommendation engine to enhance personalization on the ATS. Leveraged a pre-trained word2vec model to extract textual features and combined them with numerical features, resulting in an NDCG score of 0.85 and an 11.5% decrease in hiring time.
• Led and mentored a team of 5+ Data Scientists & Analysts on 15+ analytics projects over 3 years; facilitated cross-industry meetups for knowledge sharing & talent attraction.
Data Scientist, Internshala, Gurugram, India Sep 2018 – Jul 2021
• Engineered a production-grade content-based recommendation model in Python to identify the most relevant candidates for job roles, reducing hiring time by ~15%. Utilized a combination of vector-based similarity and heuristic methods.
• Developed an NLP (smart reply) model to increase chat response rate by 35%, boosting platform-level engagement. Leveraged LDA to generate labels and a CNN+LSTM+DNN model for prediction, achieving a top-3 accuracy of ~80%.
• Implemented an ensemble tree-based text classifier for identifying messages requiring response, achieving a 20%+ improvement in chat initiation rates. Utilized pre-trained self-attention-based BERT embeddings as textual features.
• Built a fraud detection classifier using BOW and TF-IDF, reducing malicious employer chat messages by 50% and achieving superior inference time compared to deep learning models with similar accuracy.
Associate Data Scientist, Internshala, Gurugram, India Jan 2017 – Aug 2018
• Leveraged advanced SQL queries to structure and fulfill key data requirements with a less than 5% error rate, enabling smooth day-to-day operations across multiple departments and driving informed decision-making.
• Performed analysis and projections for the B2B-focused marketing channel, aiding growth and partnerships with 2000+ colleges in India and resulting in ~500K monthly user registrations at its peak.
PROJECTS
• Finetuning LLMs – Next-word prediction, chatbot, Q/A, and translation using RNN, LSTM, finetuned GPT-2, and Llama-2-7b.
• BERT – Analyzed the impact of transfer learning for sequential recommendation on a books dataset using only the book title.
• RAG – Engineered Agentic-RAG for structuring and retrieving research paper abstracts, outperforming naive RAG significantly.
TECHNICAL SKILLS
Languages: Python, SQL, R, SAS, PyTorch, TensorFlow, Keras, Unix/Linux
Database/Tools: Excel, RDB, Tableau, Power BI, Pandas, NumPy, Scikit, NLTK, SpaCy, SQL Server, Git, LangChain, Hugging Face
Statistics: Hypothesis Testing, Causal Inference, Multivariate Testing, Non-Parametric Tests, Survival Analysis, Probability
Machine Learning: Statistical Modeling, Regression, Classification, Clustering, Support Vector Machines, Recommender Systems, PCA, KNN, Decision Trees, Random Forest, GBM, XGBoost, ARIMA, MLOps, Deep Neural Networks
DL/NLP: CBOW, TF-IDF, Word2Vec, CNN, RNN, LSTM, Transformers, Prompt Engineering, Finetuning LLMs, LoRA/QLoRA, RAG
Parallel Processing: ETL, Kinesis, Apache Spark, PySpark, S3, Kafka, GCP, AWS, Redshift, Hadoop, AWS SageMaker
EDUCATION
University of California, Davis, CA, USA Jul 2023 – Jun 2024
Master of Science in Business Analytics
Kurukshetra University, Haryana, India Aug 2012 – Jun 2016
Bachelor of Technology in Mechanical Engineering