
Data Scientist Analyst

Location:
East Lansing, MI
Salary:
60000
Posted:
January 30, 2025


Resume:

SAUMYA SHAH

*************@*****.*** • 517-***-**** • https://www.linkedin.com/in/5hah5aumya/ • New York City, NY

Enabling AI transformation to optimize businesses and supply chains and to manage risks by delivering impactful data-driven solutions. Harnessing in-demand data mining, analytics, visualization, modeling, and MLOps skills throughout the product life cycle.

WORK EXPERIENCE

Data Scientist Meijer Inc., East Lansing, MI Jan 2024 – May 2024

Enhanced service levels of 4 distribution facilities (catering to 30 stores) by optimizing safety stocks, on-shelf availability, warehousing, and ordering costs, preventing stock-out losses worth $200K for May 2024 (Referenced paper: King et al.)

Set up a PySpark ETL pipeline to fetch and transform a real-time SAP HANA data stream while carrying out 6 statistical tests (F-test, Chi-Square, ADF, ACF, PACF, Granger causality) using Spark MLlib and RDDs

Forecasted safety stocks for the top 18 revenue-driving, stock-out-vulnerable product-vendor groups by building an ensemble of a Bayesian model (demand uncertainty), a Poisson model (short-term sales peaks), and Facebook Prophet (baseline)

Applied SAS Viya to provide insights and study interaction term effects (weather, fuel price, convenience) on model forecasts

Data Analyst VERN.ai, East Lansing, MI May 2023 – Aug 2023

Integrated a GPT-2 transformer into a PySpark cluster for user profiling and for analyzing customer queries and purchase patterns

Monitored chatbot performance (Tableau) and deployed a feedback mechanism to achieve a 76% utterance-intent-context match

Reduced database (30+ entities) query response time from 2 min to 15 sec by executing PL/SQL triggers and cursors

Research Assistant Michigan State University, East Lansing, MI Sep 2022 – Dec 2022

Collaborated at the Imaging and Deep learning lab with a team of 20+ researchers working on disease detection and structural analysis

Migrated Raman spectral simulation scripts from MATLAB to Python, implementing clustering (GMM) on 1133 compounds

Trained a CNN-based autoencoder (Keras) to de-noise spectral images, leading to an F1-score of 87%

Associate Analyst XcelTec Interactive Private Limited, Ahmedabad, India Apr 2019 – Aug 2022

Generated product recommendations for an e-commerce app using collaborative filtering, association rule mining, and customer review sentiment analysis, achieving a scale of 5000+ products and 80+ categories and increasing revenue by $60K

Automated ML pipelines, populating supply chain KPIs through AWS SageMaker, Step Functions, and Lambda triggers

Developed and maintained merchant app (admin) with HTML, CSS, JavaScript, Django, MongoDB, Nginx, Docker, and AWS EC2

Facilitated communications among stakeholders, sales, and engineering teams using MS Office (Excel, PPT, Word) and Jira

PROJECTS

Property investment management Jun 2024

Participated in a 7-day hackathon organized by Jain Alert group (US) and was selected for the in-person global round in New York City

Scraped property, financial, and market data (climate risk, crime stats, insurance, transit, capital spending, amenities, etc.) for 1500+ households in real time, asynchronously, using Selenium scripts and a Kafka data stream

Staged raw data in a MongoDB instance, then cleaned, transformed, and stored it in a MySQL database for the React UI engine to access. This data abstraction (2-layer system architecture) inherently provided security from injection or backdoor attacks

Ranked neighborhoods based on derived metrics and user data (credit score, job designation, down payment flexibility, age, preferences, liabilities), generating custom investment portfolio/report (financial underwriting) for listings in NJ (070**-*****)

PowerTree Apr 2022

Engaged with PDEU and Indian Meteorological Department in research on efficient forecasting of Solar Irradiance

Pitched a 5-module PoC, facilitating informed decisions, optimizing investment strategies, and ensuring regulatory compliance

Leveraged ArcGIS and satellite imagery to simplify land potential identification and scale up power generation (0.5 MW)

IEEE TRIBES conference publication DOI: 10.1109/TRIBES52498.2021.9751626

TECHNICAL SKILLS

Programming languages: Python, SQL, R, JavaScript, C++, MATLAB

Frameworks and Packages: PySpark, TensorFlow, Hugging Face, Django, FastAPI, MongoDB, Selenium, Kafka

Tools: Prompt engineering (LLMs), AWS, Docker, Hadoop, Tableau, SAP, Git, Jenkins, SAS, LaTeX

Certifications: IBM Data Science, NPTEL Deep Learning, 100 days of ML, Faculty recommendations

EDUCATION

Master of Science in Data Science, Michigan State University

Statistical modeling, Supply Chain management, Data Mining, Deep learning in Finance, NLP, Big Data analytics, Machine learning

Bachelor of Science in Computer Engineering, Pandit Deendayal Energy University

Data Structures and Algorithms, DBMS, Web Development, Insurance theory, Derivative markets, Hedging


