Data Engineer Power BI

Location:

Round Rock, TX

Posted:

July 10, 2024

Resume:

Prem Sita Ram Karri Data Engineer

737-***-**** Frisco, Texas LinkedIn

Profile

Data Engineer with around 4 years of experience designing and building scalable data pipelines using technologies such as Hadoop, Spark, and SQL. Proficient in ETL processes, data modeling, database management systems (DBMS), and data warehousing solutions. Skilled in extracting insights from large datasets using advanced analytical tools and techniques.

Skills

Programming Languages: Python, R, Java, Scala, SQL, Unix/Shell Scripting

Big Data Tools: Hadoop, HDFS, MapReduce, Hive, Sqoop, Oozie, EMR, Spark, Pig, lambda functions, Zookeeper

Frameworks & Tools: NumPy, Scikit-learn, Pandas, Seaborn, PySpark, Power BI, Tableau, Git, GitHub, MS Excel, MS Access, Airflow, ETL, Jira, Agile/Scrum, Jenkins, Docker, Confluence

Datastores and Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), MySQL, Oracle, MongoDB, Teradata, PostgreSQL, Netezza, Cassandra, Apache Kafka, Snowflake, Databricks, Azure Data Factory, Azure Synapse, Azure Data Lake

Professional Experience

Data Engineer, IQVIA Oct 2022 – present Remote, USA

•Architected and implemented a scalable data integration platform for a healthcare provider, leveraging Python and SQL for ETL processes to ensure accurate and timely extraction, transformation, and loading of patient data.

•Designed and orchestrated real-time data streaming using Apache Kafka, enabling immediate insights into patient health metrics and clinical outcomes.

•Developed interactive dashboards and reports using Power BI, enabling stakeholders to visualize and analyze healthcare data trends and performance metrics in real time.

Data Engineer, Southwest Airlines Apr 2022 – Sep 2022 Frisco, USA

•Designed and implemented robust data integration pipelines utilizing ETL processes, enabling seamless collection and transformation of aviation data from diverse sources.

•Employed advanced Python (NumPy, Pandas) and SQL techniques for meticulous data cleaning and validation, ensuring the accuracy and reliability of the aviation dataset.

•Conducted large-scale data processing using Apache Spark, Hadoop, and Python, and created visually intuitive dashboards using Tableau, Power BI, and Excel for real-time insights and strategic decision-making.

Data Analyst, L-Com Inc. Sep 2019 – Jul 2021 Bangalore, India

•Developed a Python-centric project aimed at augmenting risk assessment precision, employing meticulous data preprocessing, feature engineering, and model development strategies to deliver a 30% increase in accuracy.

•Orchestrated the integration of cutting-edge Python libraries such as NumPy and Pandas, elevating data preprocessing and manipulation capabilities to optimize project workflows and bolster overall efficiency.

•Spearheaded the automation of data extraction and transformation procedures using Python scripting, revolutionizing analytics workflows and yielding substantial reductions in manual labor.

•Improved risk assessment accuracy by 10% using random forest ensemble learning to capture intricate data relationships, and by a further 15% using neural network models.

Education

Master of Science, Western Illinois University Aug 2021 – Dec 2022 Macomb, USA Computer and Information Science

Bachelor of Technology, C.M.R Institute of Technology Aug 2016 – Sep 2020 Hyderabad, India Computer Science Engineering

Certificates

Microsoft Certified: Azure Data Engineer Associate

ad644e@r.postjobfree.com

Designed and implemented scalable ETL pipelines on Databricks to process and analyze large datasets, improving data processing efficiency by 40%.

Used Databricks to preprocess data, train models, and deploy them into production, improving model deployment speed by 50%.

Designed and implemented ETL pipelines using PySpark and Hadoop to ingest and transform data from various sources.

Tuned and optimized Spark jobs on Databricks, resulting in a 25% reduction in processing time.
