
Data Scientist

Location:
San Francisco, CA
Salary:
140000
Posted:
July 12, 2024

Resume:

Kumar Kishalaya

San Francisco, CA 564-***-**** ad667v@r.postjobfree.com LinkedIn Github Medium AWS

SUMMARY

Data Scientist with 6+ years of experience in the E-recruitment, Digital platforms & Healthcare industries. Master’s degree from UC Davis in Business Analytics. Specialize in A/B testing, Product Analytics, ML algorithms, Personalization, NLP, GenAI & RAG-based solutions. Proven leadership & collaboration skills.

WORK EXPERIENCE

Data Scientist, UC Davis Health, Sacramento, CA Sep 2023 – Jun 2024

• Developed RAG-based customer support chatbot to enhance query response accuracy and reduce resolution time by 90%.

• Developed a classification model to predict patient late arrivals and no-shows, achieving ~80% F1 score. Devised intervention strategies based on the model's predictions, leading to a 10% increase in provider-patient facetime.

• Designed & built 5+ Tableau dashboards to monitor clinic KPIs, improving planning & analysis time by 8-10 hours weekly.

Senior Data Scientist (Team Lead), Internshala, Gurugram, India Aug 2021 – Apr 2023

• Collaborated with functional heads, finance team, and CEO to develop yearly business projections using SQL and time series analysis from 2019 to 2022, achieving projections within 5% of actual data every year.

• Built and maintained 15+ KPI dashboards empowering analytics and marketing teams to track vital metrics like ARPU, Churn, and CLTV-CAC, leading to an optimized subscription-based model and a ~20% increase in revenue.

• Designed and conducted 10+ A/B tests & root cause analyses on key metrics, fixing major product leaks and improving user TOFU, resulting in a ~30% increase in engagement over a 2-year period.

• Built a Recommendation Engine to enhance personalization on ATS. Leveraged a pre-trained word2vec model to extract textual features, combined with numerical features, resulting in an NDCG score of 0.85 and an 11.5% decrease in hiring time.

• Led and mentored a team of 5+ Data Scientists & Analysts on 15+ analytics projects over 3 years; facilitated cross-industry meetups for knowledge sharing & talent attraction.

Data Scientist, Internshala, Gurugram, India Sep 2018 – Jul 2021

• Engineered a production-grade content-based recommendation model in Python to identify the most relevant candidates for job roles, reducing hiring time by ~15%. Utilized a combination of vector-based similarity and heuristic methods.

• Developed an NLP (smart reply) model to increase chat response rate by 35%, boosting platform-level engagement. Leveraged LDA to generate labels and a CNN+LSTM+DNN layer for prediction, achieving a top-3 accuracy of ~80%.

• Implemented an ensemble tree-based text classifier for identifying messages requiring response, achieving a 20%+ improvement in chat initiation rates. Utilized pre-trained self-attention-based BERT embeddings as textual features.

• Built a fraud detection classifier using BOW and TF-IDF, reducing malicious employer chat messages by 50% and achieving superior inference time compared to deep learning models with similar accuracy.

Associate Data Scientist, Internshala, Gurugram, India Jan 2017 – Aug 2018

• Leveraged advanced SQL queries to structure and fulfill key data requirements with a less than 5% error rate, enabling smooth day-to-day operations across multiple departments and driving informed decision-making.

• Performed analysis & projection for a B2B-focused marketing channel, aiding growth and partnership with 2000+ colleges in India and resulting in ~500K monthly user registrations at its peak.

PROJECTS

• Finetuning LLMs – Next-word prediction, chatbot, Q/A and translation on RNN, LSTM, finetuned GPT-2, Llama-2-7b

• BERT – Analyzed the impact of transfer-learning for sequential recommendation on books dataset using only the book title.

• RAG – Engineered Agentic-RAG for structuring and retrieving research paper abstracts, outperforming naive RAG significantly.

TECHNICAL SKILLS

Languages: Python, SQL, R, SAS, PyTorch, Tensorflow, Keras, Unix/Linux

Database/Tools: Excel, RDB, Tableau, Power BI, Pandas, Numpy, Scikit, NLTK, SpaCy, SQL Server, Git, LangChain, Hugging Face

Statistics: Hypothesis Testing, Causal Inference, Multivariate Testing, Non-Parametric tests, Survival Analysis, Probability

Machine Learning: Statistical Modeling, Regression, Classification, Clustering, Support Vector Machines, Recommender Systems, PCA, KNN, Decision trees, Random Forest, GBM, XGBoost, ARIMA, ML Ops, Deep Neural Networks

DL/NLP: CBOW, TF-IDF, Word2Vec, CNN, RNN, LSTM, Transformers, Prompt Engineering, Finetuning LLMs, LoRA/QLoRA, RAG

Parallel Processing: ETL, Kinesis, Apache Spark, PySpark, S3, Kafka, GCP, AWS, Redshift, Hadoop, AWS Sagemaker

EDUCATION

University of California, Davis, CA, USA Jul 2023 – Jun 2024 Master of Science in Business Analytics

Kurukshetra University, Haryana, India Aug 2012 – Jun 2016 Bachelor of Technology in Mechanical Engineering


