
Data Engineer

Location:
Hilliard, OH, 43026
Posted:
December 05, 2024


Resume:

AISHWARYA CHIRANJEEVI

***************@*****.*** 540-***-**** Irving, TX LINKEDIN

SUMMARY

6+ years of experience in Data Engineering, Data Pipeline Design, Development, and Implementation as a Sr. Data Engineer/Data Developer and Data Modeler.

Strong experience in writing scripts with the Python, PySpark, and Spark APIs for analyzing data.

Extensive experience with Python libraries including PySpark, pytest, PyMongo, cx_Oracle, pyexcel, Boto3, psycopg, embedPy, NumPy, and Beautiful Soup.

Experience with Google Cloud components, Google Container Builder, GCP client libraries, and Cloud SDKs.

Hands-on use of the Spark and Scala APIs to compare the performance of Spark with Hive and SQL, and of Spark SQL to manipulate DataFrames in Scala.

Expertise in Python and Scala, including writing user-defined functions (UDFs) for Hive and Pig in Python (see the sketch below).
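
A minimal sketch of the kind of Python UDF used with Hive's TRANSFORM clause: a streaming script that reads tab-separated rows from stdin and writes transformed rows to stdout. The column layout and the normalization applied are illustrative assumptions, not details from the projects above.

    #!/usr/bin/env python
    # Hive streaming "UDF" sketch: Hive's TRANSFORM clause pipes rows to this
    # script as tab-separated text on stdin; transformed rows go back on stdout.
    # The two-column layout (customer_id, email) is a hypothetical example.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        customer_id, raw_email = fields[0], fields[1]
        # Example transformation: trim and lower-case the email address.
        print("\t".join([customer_id, raw_email.strip().lower()]))

Invoked from Hive with something like: SELECT TRANSFORM(customer_id, email) USING 'python normalize_email.py' AS (customer_id, email) FROM customers;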

Experience in developing MapReduce programs using Apache Hadoop for analyzing big data per requirements.

Experience in working with Flume and NiFi for loading log files into Hadoop. Experience in working with NoSQL databases like HBase and Cassandra.

Strong knowledge and hands-on experience with various GCP services and components, ensuring seamless cloud operations and optimizations.

Experience in implementing and orchestrating data pipelines using Oozie and Airflow (see the sketch below).
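
A minimal Airflow DAG sketch showing how such a pipeline can be orchestrated: three jobs with explicit dependencies on a daily schedule. The task names, script paths, and schedule are assumptions for illustration only.

    # Minimal Airflow DAG sketch: ingest -> transform -> load on a daily schedule.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_sales_pipeline",          # hypothetical pipeline name
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        ingest = BashOperator(task_id="ingest",
                              bash_command="python /opt/jobs/ingest.py")
        transform = BashOperator(task_id="transform",
                                 bash_command="spark-submit /opt/jobs/transform.py")
        load = BashOperator(task_id="load",
                            bash_command="python /opt/jobs/load.py")

        # Declare the dependencies between the jobs.
        ingest >> transform >> load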

Worked with Cloudera and Hortonworks distributions.

Expert in developing SSIS/DTS packages to extract, transform, and load (ETL) data into data warehouses/data marts from heterogeneous sources.

Good working knowledge of the Amazon Web Services (AWS) cloud platform, including EC2, S3, VPC, ELB, IAM, DynamoDB, CloudFront, CloudWatch, Route 53, Elastic Beanstalk, Auto Scaling, Security Groups, EC2 Container Service (ECS), CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Redshift, CloudFormation, CloudTrail, OpsWorks, Kinesis, SQS, SNS, and SES.

Experience in Data Analysis, Data Profiling, Data Integration, Migration, Data Governance and Metadata Management, Master Data Management, and Configuration Management. Experience in developing customized UDFs in Python to extend Hive and Pig Latin functionality.

Expertise in designing complex mappings, performance tuning, and slowly changing dimension and fact tables.

Extensive experience with Teradata utilities such as FastExport and MultiLoad to export and load data to and from different source systems, including flat files.

Experienced in building automated regression scripts in Python to validate ETL processes across multiple databases such as Oracle, SQL Server, Hive, and MongoDB (see the sketch below).
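
A minimal sketch of such a validation script, assuming DB-API connections to the source and target systems and hypothetical table/column names; real checks would cover more metrics than row counts and key sums.

    # Compare row counts and a key-column checksum between source and target.
    def fetch_one(conn, sql):
        cur = conn.cursor()
        cur.execute(sql)
        value = cur.fetchone()[0]
        cur.close()
        return value

    def validate_table(src_conn, tgt_conn, table, key_col):
        checks = {
            "row_count": (fetch_one(src_conn, f"SELECT COUNT(*) FROM {table}"),
                          fetch_one(tgt_conn, f"SELECT COUNT(*) FROM {table}")),
            "key_sum":   (fetch_one(src_conn, f"SELECT SUM({key_col}) FROM {table}"),
                          fetch_one(tgt_conn, f"SELECT SUM({key_col}) FROM {table}")),
        }
        return {name: {"source": s, "target": t, "match": s == t}
                for name, (s, t) in checks.items()}

    # Example (connections are assumptions):
    # report = validate_table(cx_Oracle.connect(...), pyodbc.connect(...), "ORDERS", "ORDER_ID")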

Proficiency in SQL across several dialects, including MySQL, PostgreSQL, Redshift, SQL Server, and Oracle.

Experience in designing star and snowflake schemas for data warehouse and ODS architectures.

Skilled in System Analysis, E-R/Dimensional Data Modeling, Database Design and implementing specific features.

TECHNICAL SKILLS:

Programming Languages: Python, Java, Scala - proficiency in these languages for data manipulation, processing, and automation.

Scripting Languages: Bash, Perl, PowerShell - skills in scripting for automation and managing data pipelines.

DBMS: SQL, MySQL, PostgreSQL, Oracle - expertise in relational database management systems for data storage and retrieval.

Big Data: Hadoop, Spark, Kafka - experience with frameworks and tools for handling and processing large datasets.

NoSQL: MongoDB, Cassandra, Redis - knowledge of non-relational databases for handling unstructured or semi-structured data.

ETL Tools: Apache NiFi, Talend, Informatica - familiarity with tools used for Extract, Transform, and Load processes to move data between systems.

Version Control: Git, SVN - experience with version control systems to manage code changes and collaboration.

Agile: Scrum, Kanban - understanding of Agile methodologies for project management and iterative development.

Cloud: AWS, Google Cloud, Azure - proficiency in cloud platforms for scalable data solutions and infrastructure management.

WORK EXPERIENCE

Sr. Data Engineer
T-Mobile, Atlanta, GA

May 2023 - Present

Implemented partitioning, dynamic partitions, and bucketing in Hive (see the sketch below). Worked extensively with Hive, SQL, Python, Scala, Spark, and shell scripting. Involved in creating ETL pipelines to ingest data into Azure Data Lake. Experienced in designing data-driven solutions.
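
A minimal sketch of Hive partitioning and bucketing driven from PySpark; the database objects, bucket count, and file format are illustrative assumptions.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive_partitioning")
             .enableHiveSupport()
             .getOrCreate())

    # Partitioned, bucketed Hive table (hypothetical schema).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS sales_partitioned (
            order_id BIGINT,
            amount   DOUBLE
        )
        PARTITIONED BY (order_date STRING)
        CLUSTERED BY (order_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # Dynamic partition insert: Hive derives order_date partitions from the data.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE sales_partitioned PARTITION (order_date)
        SELECT order_id, amount, order_date FROM staging_sales
    """)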

Involved in building new data sets and products that help support business initiatives.

Built production-quality ingestion pipelines with automated quality checks to enable the business to access all data sets in one place.

Automated the pipelines using Airflow by creating dependencies between jobs and scheduling them on a daily, weekly, or monthly basis. Performed quality checks based on consistency, accuracy, completeness, orderliness, uniqueness, and timeliness, and used them to determine ETL job status (see the sketch below).
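
A minimal sketch of the kind of quality check described above, written against a PySpark DataFrame; the column names and the 99% completeness threshold are assumptions.

    from pyspark.sql import DataFrame
    from pyspark.sql import functions as F

    def quality_report(df: DataFrame, key_col: str, not_null_cols: list) -> dict:
        total = df.count()
        report = {"row_count": total}
        # Completeness: fraction of non-null values per required column.
        for col in not_null_cols:
            non_null = df.filter(F.col(col).isNotNull()).count()
            report[f"{col}_completeness"] = non_null / total if total else 0.0
        # Uniqueness: the business key should have no duplicates.
        report["key_is_unique"] = df.select(key_col).distinct().count() == total
        return report

    # Example gate before publishing the data set:
    # report = quality_report(df, "order_id", ["order_id", "order_date", "amount"])
    # assert report["order_date_completeness"] >= 0.99 and report["key_is_unique"]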

Developed Spark Core and Spark SQL scripts in Python for faster data processing, and transformed the data using Spark applications for analytics consumption.

Worked on importing data from SQL Server and Oracle into Azure Data Lake using Sqoop.

Created incremental Spark jobs to move data from Azure to Snowflake (see the sketch below).
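
A minimal sketch of an incremental load of this shape using the Spark Snowflake connector; the storage path, watermark column, and all connection options are placeholders, not values from the project.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("adls_to_snowflake").getOrCreate()

    last_loaded_ts = "2024-01-01 00:00:00"   # normally read from a control table

    incremental = (spark.read.format("delta")
                   .load("abfss://curated@mystorage.dfs.core.windows.net/orders")
                   .filter(F.col("updated_at") > F.lit(last_loaded_ts)))

    sf_options = {                            # placeholder credentials
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "LOAD_WH",
        "sfUser": "etl_user",
        "sfPassword": "********",
    }

    (incremental.write
        .format("net.snowflake.spark.snowflake")
        .options(**sf_options)
        .option("dbtable", "ORDERS")
        .mode("append")
        .save())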

Experience working with Delta tables in Azure for loading incremental data.

Experience using Azure Blob Storage for storing various data files. Developed and managed file transfer jobs for data exchange to and from third-party vendors and Optum.

Developed Scala scripts for ingesting flat files, CSV, and JSON into the data lake. Provided production support for developed applications to resolve Critical/Priority-1 issues.

Used Kafka connectors to read CDC feeds from Cosmos DB and MongoDB into Kafka in real time.

Worked on multiple streaming jobs to read data from Kafka, transform it, and write it into Azure Data Lake (see the sketch below).
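
A minimal Structured Streaming sketch of a Kafka-to-data-lake job; the topic, message schema, broker address, and storage paths are assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka_to_adls").getOrCreate()

    schema = (StructType()
              .add("order_id", StringType())
              .add("amount", DoubleType())
              .add("event_time", StringType()))

    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")
              .option("subscribe", "orders_cdc")
              .load()
              .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
              .select("e.*")
              .withColumn("event_time", F.to_timestamp("event_time")))

    query = (events.writeStream.format("parquet")
             .option("path", "abfss://raw@mystorage.dfs.core.windows.net/orders")
             .option("checkpointLocation",
                     "abfss://raw@mystorage.dfs.core.windows.net/_checkpoints/orders")
             .outputMode("append")
             .start())
    query.awaitTermination()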

Supported data scientists by enhancing their modeling jobs to scale across the entire data set.

Experience in productionizing and optimizing data science models.

Experience with the big data ecosystem in Azure using Spark, Kubernetes, Airflow, and Databricks. Coordinated among cross-functional teams to enhance the business and fix issues to improve ROI.

Track production tickets using ticket monitoring tools, perform root cause analysis (RCA) and resolve the tickets.

Contributing to self-organizing teams with minimal supervision, working within the Agile/Scrum project methodology.

Environment: Python, SQL, PL/SQL, Java, R, Unix Shell scripting, Oracle, DB2, Teradata, SQL Server, PostgreSQL, Hadoop, HDFS, Hive, Spark, PySpark, Sqoop, Kafka, MongoDB, Amazon DynamoDB, HBase, AWS Glue, Azure Data Factory, GCP, Airflow, Flume, Apache Kafka, Spark Streaming, BitBucket, Git, GitHub, Jira, Rally, AWS EC2, S3, Lambda, EMR, Azure Data Lake, Azure Blob Storage, Snowflake, Kubernetes, Databricks

Data Engineer

Cognizant, Pune, India

Mar 2020 - May 2022

Worked on an Apache Spark data processing project to process data from RDBMS and several streaming sources, and developed Spark applications using Python on AWS EMR.

Designed and deployed multi-tier applications leveraging AWS services like EC2, Route 53, S3, RDS, and DynamoDB, focusing on high availability, fault tolerance, and auto-scaling, using AWS CloudFormation.

Configured and launched AWS EC2 instances to execute Spark jobs on AWS Elastic MapReduce (EMR).

Automated data storage from streaming sources into AWS data stores such as S3, Redshift, and RDS by configuring AWS Kinesis Data Firehose (see the sketch below).
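
A minimal boto3 sketch of feeding records into an existing Kinesis Data Firehose delivery stream, which then lands the data in S3/Redshift; the stream name, region, and payload are assumptions.

    import json
    import boto3

    firehose = boto3.client("firehose", region_name="us-east-1")

    def send_event(event: dict) -> None:
        # Firehose buffers records and delivers them to the configured destination.
        firehose.put_record(
            DeliveryStreamName="orders-to-s3",        # hypothetical stream name
            Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
        )

    send_event({"order_id": 1001, "amount": 42.50, "source": "web"})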

Performed analytics using real-time integration capabilities of AWS Kinesis (Data Streams) on streamed data.

Created Sqoop incremental imports, landed the data in Parquet format in HDFS, and transformed it to ORC format using PySpark (see the sketch below).
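
A minimal PySpark sketch of the Parquet-to-ORC conversion step; the HDFS paths are placeholders (the upstream Sqoop incremental import would land the Parquet files in the source directory).

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parquet_to_orc").getOrCreate()

    # Read the Sqoop-landed Parquet data and rewrite it as ORC.
    (spark.read.parquet("hdfs:///data/landing/orders_parquet")
          .write.mode("overwrite")
          .orc("hdfs:///data/curated/orders_orc"))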

Used dynamic SQL, cursors, and cursor attributes while developing PL/SQL objects. Created reusable utilities and programs in Python to perform repetitive tasks such as sending emails and comparing data (see the sketch below).
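
A minimal sketch of a reusable email utility of the kind described, built on the Python standard library; the SMTP host and addresses are placeholders.

    import smtplib
    from email.message import EmailMessage

    def send_notification(subject: str, body: str, recipients: list,
                          sender: str = "etl-alerts@example.com",
                          smtp_host: str = "smtp.example.com") -> None:
        msg = EmailMessage()
        msg["Subject"] = subject
        msg["From"] = sender
        msg["To"] = ", ".join(recipients)
        msg.set_content(body)
        with smtplib.SMTP(smtp_host) as server:
            server.send_message(msg)

    # Example: send_notification("ETL load complete", "ORDERS: 1.2M rows loaded",
    #                            ["data-team@example.com"])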

Created and maintained PL/SQL procedures to load data sent in XML files into Oracle tables.

Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using HIVE. Created UNIX shell scripts to load data from flat files into Oracle tables.

Created Hive tables to store the processed results in a tabular format.

Developed Sqoop scripts to ingest data from Oracle, Teradata, and DB2 into HDFS and Hive.

Developed Python and Hive scripts for creating reports from Hive data.

Environment: AWS (Lambda, S3, EC2, Redshift, EMR), Redshift, Teradata 15, Python 3.7, PyCharm, Jupyter Notebooks, Big Data, PySpark, Hadoop, Hive, HDFS, Kafka, Airflow, Snowflake, MongoDB, PostgreSQL, SQL, Tableau, Agile/Scrum, XML, Jira, Slack, Confluence, Docker, GitHub, Git, Oracle 12c, Toad, Unix.

Software Developer

Indegene, Hyderabad, India

Oct 2017 - Feb 2020

Support all phases of the Software development life cycle (SDLC), quality management systems, and project life cycle processes.

Followed HTTP and WSDL standards to design REST/SOAP-based Web APIs using XML, JSON, HTML, and DOM technologies.

Involved in the installation and configuration of Tomcat, Spring Source Tool Suite, Eclipse, and unit testing.

Back-end, server-side coding and development using Java collections (Set, List, Map), exception handling, Vaadin, Spring with dependency injection, the Struts framework, Hibernate, Servlets, Actions, ActionForms, JavaBeans, etc.

Developed Restful APIs to serve several user actions and events, such as generating up-to-date card transaction statements, card usage breakdown reporting, real-time card eligibility, and validation with vendor systems.

Developed ETL processes with change data capture and fed the results into the data warehouse.

Implemented Web API to use OAuth2.0 with JWT (JSON Web Tokens) to secure the Web API Service Layer.
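
The service layer above was secured in the Java/Spring stack; purely for illustration, a minimal Python sketch of the underlying idea (validating a bearer JWT before serving a request) using PyJWT, with a placeholder secret and issuer.

    import jwt  # PyJWT

    SECRET = "shared-signing-secret"          # placeholder signing key

    def authorize(auth_header: str) -> dict:
        """Return the token claims if the bearer JWT is valid, else raise."""
        if not auth_header.startswith("Bearer "):
            raise PermissionError("missing bearer token")
        token = auth_header[len("Bearer "):]
        try:
            return jwt.decode(token, SECRET, algorithms=["HS256"],
                              issuer="card-services")
        except jwt.InvalidTokenError as exc:
            raise PermissionError(f"invalid token: {exc}")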

Implemented the application using established design patterns and object-oriented practices, with a view to future requirements of the payments domain.


Front-end development used HTML5, CSS3, and JavaScript, leveraging the Bootstrap framework with a Java backend.

Used JAXB to convert Java objects into XML and XML content into a Java object.

Web services were built using Spring and CXF, which operate within MuleESB and offer both REST and SOAP interfaces.

Environment: Java J2EE, JSP, JavaScript, Ajax, Swing, Spring 3.2, Eclipse 4.2, TDD, Hibernate 4.1, XML, Tomcat, Oracle 10g, JUnit, JMS, Log4j, Maven, Agile, Git, JDBC, Web service, XML, SOAP, JAX-WS, Unix, AngularJS and Soap UI.

EDUCATION

Master's in Information Technology, Major in Cyber Security
Franklin University
Sept 2022 - May 2024

Bachelor of Electronics and Communication Engineering
St Peters Engineering College
Jul 2013 - Aug 2017


