Arpit Kakasaheb Patil
****************@*****.*** https://www.linkedin.com/in/arpit-patil/ +1-385-***-**** https://github.com/arpit2607 arpit2607.github.io Google Scholar
EDUCATION
University of Utah Salt Lake City, UT
Master of Science in Computer Science (GPA: 3.86/4) Aug 2022 - May 2024
Courses: Deep Learning, Machine Learning, Manage Data for ML, Data Mining, Advanced Algorithms, Advanced Operating Systems
Dr. Vishwanath Karad MIT World Peace University Pune, India
Bachelor of Technology in Computer Science and Engineering (CGPA: 9.74/10) Jul 2017 - Jul 2021
Courses: Artificial Intelligence, Object Oriented Programming, Data Structures, Database Management
TECHNICAL SKILLS
Programming Languages and Databases: Python, C, C++, Java, R, PHP, SQL, MySQL, PostgreSQL, MongoDB
Big Data & Cloud: Microsoft Azure, Amazon Web Services, Hadoop, Apache Spark, Snowflake, Data Warehouse, Data Lake
Data Science: Natural Language Processing, Large Language Models, NumPy, Scikit-learn, Keras, PyTorch, SciPy, TensorFlow, Pandas, Matplotlib, ETL, Generative AI, Data Analytics, Advanced Analytics, Statistics, Cloud Computing
Visualization: Tableau, Power BI, ParaView
Web Application Tools: HTML, CSS, JavaScript, D3.js, Angular, Flask, React, Material UI, Chakra UI
Tools: Git Version Control, Postman, ActiveMQ, UiPath Orchestrator, Docker, Agile, DevOps, Jira
Certifications: Microsoft Azure AZ-900
WORK EXPERIENCE
Senior Data Scientist, Los Angeles County Department of Health Services Jan 2025 – Present
• Constructed end-to-end data pipelines in Databricks, integrating Blob Storage for data retrieval and Tableau for real-time visualization of 10+ KPIs, automating ETL processes and enhancing KPI reporting efficiency by 100%
• Developed CHAMP data workflows to fix 1,000+ address issues and processed 50,000+ records using SQL and decision trees, boosting model accuracy by 23%
• Managed the migration of monthly claims processing to Databricks workflows and Unity Catalogs, enabling scalable pipeline orchestration and improving data validation and delivery efficiency across a team of 3 analysts
• Automated monthly Provider Advantage ETL using SAS and Excel, standardizing 20+ fields, mapping 100+ insurance codes, and reducing reporting time by 50% through reusable workflows and SSI-compliant formatting
Software Engineer, TitanPayAI Jul 2024 – Feb 2025
• Spearheaded the front-end development within an Agile sprint framework, creating interfaces using React.js and Material UI
• Enhanced efficiency by connecting React.js with async FastAPI and PostgreSQL back-end, reducing API response times by 40%
• Collaborated with leadership and stakeholders to design intuitive screens with Figma and database architecture in PostgreSQL
• Built the product using Angular and Java, integrating REST APIs, with deployment achieved on AWS Elastic Beanstalk and S3
Graduate Teaching and Research Assistant, University of Utah Jan 2023 – May 2024
• Tutored and led labs for classes of 100+ students in Introduction to Data Science, Deep Learning, and Programming for All
• Formulated a hierarchical taxonomy for numerical reasoning with over 10 reasoning types
• Devised 100+ diverse numerical probes and orchestrated scripts to process a 10,000+ entry JSON file, improving data processing efficiency by 25%
Big Data and Cloud Engineer, Abzooba India Infotech Pvt Ltd Mar 2021 – Jul 2022
• Led a 10-member team in an internal hackathon, developing an award-winning HR query chatbot that improved internal query response accuracy by 52%
• Orchestrated Olive Data Ingestion Framework (ODIF), a cutting-edge tool facilitating efficient data transfer across diverse sources and sinks, employing a cloud-agnostic architecture and achieving a 33% increase in compatibility and deployment speed
• Engineered ODIF by integrating ETL processes and implementing CI/CD pipelines to improve data handling efficiency, eliminating pre-installation cluster requirements and resulting in a 28% reduction in resource footprint Link
• Programmed Python ETL scripts for Snowflake and Amazon Redshift, transformed data utilizing Data Build Tool (dbt), reducing processing time by 40%
• Designed Power BI and Tableau dashboards for real-time data visualization from Amazon Kinesis, enhancing monitoring processes
Software Development Intern, Sisai Technologies Pvt Ltd Feb 2020 – May 2020
• Visualized and managed real-time data from a Vibration Oscillation Machine using MongoDB, SpringBoot, Angular, and D3.js
• Tested REST APIs with Postman and implemented A/B testing, enabling insightful analysis of over 10,000 data points per hour
• Configured ActiveMQ JMS for frontend-backend communication, resulting in a 25% reduction in data latency
RESEARCH PAPERS
‘Exploring Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data’, EMNLP 2023 Link
• Evaluated language models’ numerical reasoning, finding that FlanT5 and GPT-3.5 outperformed other models, with GPT-3.5 showing an average 16.7% accuracy improvement in complex tasks across 10+ reasoning types
‘Garbage Classifying Application Using Deep Learning Techniques’, IEEE RTEICT 2021 Link
• Created an Android application using deep learning models (VGG-16, ResNet50, Simple CNN) for classifying images into garbage or non-garbage categories, with a 96% accuracy rate, and automated location reporting to Firebase for garbage detections Link