
Data Warehouse Big

Location:
Brandon, FL
Salary:
$130,000
Posted:
April 15, 2025


Resume:

PUSHKAR TIWARI

Experience Summary

** Years of Experience as ETL Abinitio and Big Data Developer.

•Seasoned data integration professional who has delivered Enterprise Data Warehouse projects.

•Implemented projects using Agile methodology, successfully delivering code to the customer in an iterative manner.

•Actively participated in requirement analysis, design, development, testing, and production deployment, along with production support.

•Worked on proofs of concept across diverse tools and technologies.

•Experienced in mentoring team members during knowledge transfer of application flows; able to work under deadlines, contribute as an active team member, and take on responsibility.

•Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.

•Experience working with big data tools such as Hadoop, Hive, and Spark to implement ELT (Extract, Load & Transform) solutions.

•Strong understanding of the full CI/CD lifecycle.

•Worked with Ab Initio GDE, Unix commands, Shell scripting, SQL and Data Warehouse concepts.

Skills Summary

Domain

Healthcare, Banking & Finance and Logistics

Tools

Abinitio, Express>IT, Hadoop, Sqoop, Hive, Kafka, Spark, PySpark, Spark SQL, HP Quality Center, Autosys, Control Center, CA-7

Programming Languages

SQL, PL/SQL, UNIX shell Scripting

Operating System / ERP Version

Windows, Linux RHEL7

Databases

Oracle, Teradata

Cloud Computing

Amazon S3, EMR and EC2

Hardware Platforms

Intel Series

Professional Certifications/ Trainings

Attended various trainings on Big Data Hadoop, Snowflake cloud computing, and AWS.

Work Experience

Project 1 – Client CITI BANK, LTIMINDTREE, Tampa, Florida, USA

Project Name

RDH

Team Size

8

Start Date

Nov 2024

End Date

Till Date

Project Description

The RDH project focused on loading data to Hive standardization tables using ETL Ab Initio and Spark processes for US regulatory reporting.

Role & Contribution

•Design, development, testing, and production deployment of ETL components.

•Leveraged ETL Ab Initio processes with custom SQL queries against the Hive standardized tables, writing results to JSON files to validate business rules against the data stored in HDFS.

•Leveraged a Spark DQ process to query the Hive standardization tables and validate business-rule match/unmatch and NA scenarios by translating each rule's if-then logic into a SQL CASE statement layered on the SQL query embedded in the JSON file.

•Performed successful integration of ETL Ab Initio with the Hadoop Distributed File System.

•Integrated Spark with Hive to run scalable, parallel queries over very large datasets, leveraging HDFS with optimization techniques such as partitioning.

•Benchmarked the performance of ETL Ab Initio with Hive against Spark with Hive.

•Worked on a POC to unload data from a web API endpoint into the Unix file system using curl with a tokenization technique.

•As part of the POC, optimized data fetch time by running multiple curl commands in parallel within Spark on HDFS, reducing data retrieval from more than an hour to about 6 minutes (a minimal sketch of the approach follows this list).
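The following is a minimal, hypothetical sketch of the parallel-fetch pattern described in the last bullet above: page numbers are distributed across Spark executors and each partition issues its own curl calls. The endpoint URL, token handling, and page count are placeholder assumptions, not the actual project values.

import subprocess
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parallel_api_fetch").getOrCreate()
sc = spark.sparkContext

BASE_URL = "https://api.example.com/v1/records"   # placeholder endpoint
TOKEN = "REPLACE_WITH_TOKEN"                      # obtained via the tokenization step

def fetch_page(page):
    # Run curl on the executor for one page and return (page, raw payload).
    cmd = ["curl", "-s",
           "-H", "Authorization: Bearer " + TOKEN,
           BASE_URL + "?page=" + str(page)]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return page, out.stdout

# Distribute the page numbers so many curl calls run in parallel across executors.
pages = sc.parallelize(range(1, 201), numSlices=20)
results = pages.map(fetch_page).collect()
print("fetched", len(results), "pages")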

Technology & Tools

Abinitio, Hive, HDFS, Spark, Cloudera-Manager, CI/CD

Key Achievements

Significantly reduced the data fetch time from hours to a few minutes using Spark and Hadoop parallelism. Further, performed detailed root cause analysis of discrepancies in the production data used for US regulatory reporting by running SQL queries alongside the Spark code logic to validate the match/unmatch and NA case scenarios derived from business rules, comparing the results, and identifying the flaw in the existing Spark code.


Project 2 – Client CITI BANK, LTIMINDTREE, Tampa, Florida, USA


Project Name

DQP

Team Size

5

Start Date

Dec 2023

End Date

Sept 2024

Project Description

As part of this project, I worked as an Ab Initio Developer on the DQP platform. The project focused on improving data quality based on metrics. Users created a dispositioning system to generate tickets for records that failed data quality checks, generate key metrics on those records, and load them to the target database.

Role & Contribution

•Worked on ETL enhancements using the existing framework to perform data quality checks for varied source systems and generate a scorecard with red, amber, and green metrics for the downstream MicroStrategy team for reporting. Data with red breaks is sent for dispositioning to the DCRM system, remediated, and fed back into the DQP process for further validation until the data quality metrics turn green.

•Fixed technical support issues raised by users on the DQP project.

•Performance tuned the existing process that gathers data from Collibra staging tables using change data capture and loads it to a dimension table, eliminating extraneous ETL components and replacing them with a SQL stored procedure that uses a MERGE statement for updates and inserts; this reduced the runtime from about an hour to 2 minutes (a hedged sketch follows this list).

•Worked alongside the offshore team on day-to-day tasks assigned as part of user stories in Agile methodology.

•Involved in fixing bugs in the existing DQP process.

•Participated in Agile Scrum sprint backlog refinement meetings to prioritize user stories and deliver per business requirements.
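Below is a hedged sketch of the MERGE-based upsert mentioned above, shown as a direct statement executed from Python with cx_Oracle; the actual project wrapped equivalent logic in a SQL stored procedure, and the connection details, table, and column names are illustrative only.

import cx_Oracle  # Oracle client library; connection details are placeholders

conn = cx_Oracle.connect(user="etl_user", password="***", dsn="dbhost/ORCLPDB1")
cur = conn.cursor()

# Upsert changed Collibra staging rows into the dimension table in one pass,
# replacing the row-by-row ETL components described above.
cur.execute("""
    MERGE INTO dq_dim_asset d
    USING collibra_stg_asset s
       ON (d.asset_id = s.asset_id)
    WHEN MATCHED THEN
        UPDATE SET d.asset_name = s.asset_name,
                   d.status     = s.status,
                   d.updt_ts    = SYSTIMESTAMP
    WHEN NOT MATCHED THEN
        INSERT (asset_id, asset_name, status, updt_ts)
        VALUES (s.asset_id, s.asset_name, s.status, SYSTIMESTAMP)
""")
conn.commit()
cur.close()
conn.close()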

Technology & Tools

ETL Abinitio, Oracle, SQL, Unix

Key Achievements

Successfully provided L2 technical support to users for all open incidents.

Performance tuned the existing process with a stored procedure to run efficiently in a shorter timeframe.

Project 3 – Client FedEx, Atos Syntel Pvt Ltd, Pune, Maharashtra

Project Name

Cloud Migration (Colo)

Team Size

6

Start Date

Apr 2023

End Date

Sep 2023

Project Description

As part of the Colo migration project, we migrated the ETL application from on-premises infrastructure to the Dell Colo public cloud server.

Role & Contribution

•Worked on migrating the on-premises Ab Initio ETL application to the GCP cloud.

•Performed end-to-end testing of the application in the Dev and Test environments.

•Leveraged the CI/CD pipeline to migrate code from Dev to higher environments.

•Participated in Agile Scrum sprint backlog refinement meetings to prioritize user stories and deliver per business requirements.

•Strong proficiency in Colo cloud, SQL, and ETL tools, with the ability to develop and maintain data pipelines and ETL processes.

•Worked with data teams to design and implement modern, scalable data solutions using a range of new and emerging technologies on the Dell Colo public cloud server.

•Set up CI/CD pipelines for the three projects, dockerizing the applications and setting up the Kubernetes cluster.

•Built a generic framework that creates an object-level or project-level tag with all associated changed/modified objects, checks for object locks, notifies users by email about the checked-in object changes, creates the tag and save file in the DEV environment, copies it to the higher environment using scp, loads the save file, and archives the existing sandbox before checking out the newly migrated code from the EME. This automated generic process enabled seamless migration to higher environments (a hedged sketch of the orchestration follows this list).

•Worked on JMS (Java Message Service) to publish data to an Ab Initio queue, resolved JAR dependencies, unloaded the data from the queue to the Unix file system, and validated business transformations.
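A hedged sketch of the migration helper described above: copy the DEV save file to the higher environment, archive the existing sandbox there, and notify users by mail. The Ab Initio tag/save creation itself is omitted, and all hosts, paths, and addresses are placeholder assumptions.

import subprocess
import smtplib
from email.message import EmailMessage
from datetime import datetime

SAVE_FILE = "/data/dev/saves/release.save"        # assumed save-file path
TARGET = "etl@uat-host:/data/uat/saves/"          # assumed higher-environment target
SANDBOX = "/data/uat/sandbox/project_x"           # assumed sandbox to archive

# 1. Copy the save file created in DEV to the higher environment.
subprocess.run(["scp", SAVE_FILE, TARGET], check=True)

# 2. Archive the existing sandbox before the newly migrated code is checked out.
stamp = datetime.now().strftime("%Y%m%d%H%M")
subprocess.run(["ssh", "etl@uat-host",
                "tar -czf {0}_{1}.tgz {0}".format(SANDBOX, stamp)], check=True)

# 3. Apprise users that the code has been migrated.
msg = EmailMessage()
msg["Subject"] = "Code migration to higher environment completed"
msg["From"] = "etl@example.com"
msg["To"] = "team@example.com"
msg.set_content("Save file {} copied and sandbox archived at {}.".format(SAVE_FILE, stamp))
with smtplib.SMTP("localhost") as server:   # assumes a local SMTP relay
    server.send_message(msg)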

Technology & Tools

ETL Abinitio, Colo Public Cloud Server, CI/CD, Conduct>IT, JMS

Key Achievements

Developed a generic framework to migrate code to higher environments seamlessly using the Conduct>IT framework for Ab Initio.

Project 4 – Client CIGNA, IBM INDIA PVT LTD, Pune, Maharashtra

Project Name

CRM

Team Size

3

Start Date

Jul 2020

End Date

Mar 2023

Project Description

As an ETL Developer, responsible for analysis, design, development, testing, and implementation of a Big Data ETL solution for our IBM US healthcare client. The project focused on executing user queries passed from the CRM (Clinical Rule Maintenance) front end, built on React JS and Angular JS, through a backend Big Data service plan, and publishing reports to the front end in Excel format. The service plan is built on a Big Data framework using the Spark engine, runs 24x7, executes user queries submitted from the front end in real time, one query at a time, and generates an output report in Excel format for each query.

Role & Contribution

Responsible for architecting the Ab Initio ETL (Extract, Transform, Load) framework: designing, developing, debugging, deploying, and documenting all processes and programs as well as support activities, adhering to the corporate systems architecture and maintaining the framework.

Prepared architectural overview, high-level, and detailed design documents for business and technical requirements.

Performance tuned graphs and plans with robust error handling, data validation, and reconciliation.

Developed, tested, enhanced, debugged, documented, and implemented Ab Initio ETL.

Created detailed design specifications documenting the extraction, transformation, and load processes coded on Big Data Hadoop, for audit and maintainability purposes.

Responsible for developing Ab Initio ETL mapping transformations and aggregations on data to perform load/unload tasks in HDFS.

Responsible for creating shell scripts to process source files of different patterns, validating and extracting them, performing archive and purge tasks, and creating wrapper and initialization scripts.

Enhanced automation capabilities through scripting, assisted with optimization of scripts and tools, and made decisions around key points of enhancement for utilities and methods.

Performed unit testing through the ETL tool and acted as a subject matter expert for QA and business testers throughout the testing lifecycle.

Experience creating worksheets and dashboards using Tableau.

Designed, developed, tested, and maintained Tableau functional reports based on user requirements.

Developed generic frameworks that are configurable and reusable using Big Data tools.

Developed a generic process using ETL/Big Data tools to archive files matching a file pattern for a given business date or date range to a network drive, and to delete files from the network drive after a 40-day purge period (an illustrative sketch follows this list).

Developed code to publish an Excel report with a separate tab for each of twenty query graphs, giving end-to-end details of all input parameters passed from the CRM front end along with the business date on which the user submitted the query; this validates the report output against the user's input parameters.

Participated in Agile Scrum sprint backlog refinement meetings to prioritize user stories and deliver per business requirements.

Created templates for business users in Express>IT.

Developed business rules as transformations to integrate with graphs.
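An illustrative sketch of the archive-and-purge utility described above; the directory paths, file pattern, and dated-folder layout are assumptions, and the real process was built with ETL/Big Data tooling rather than Python.

import shutil
from pathlib import Path
from datetime import datetime, timedelta

LANDING = Path("/data/crm/landing")              # assumed landing area
ARCHIVE = Path("/mnt/network_drive/crm_archive") # assumed network drive
PURGE_DAYS = 40                                  # purge window from the bullet above

def archive(business_date, pattern="report_*.xlsx"):
    # Move the day's files matching the pattern into a dated archive folder.
    target = ARCHIVE / business_date             # business_date as 'YYYY-MM-DD'
    target.mkdir(parents=True, exist_ok=True)
    for f in LANDING.glob(pattern):
        shutil.move(str(f), str(target / f.name))

def purge():
    # Delete archive folders older than the 40-day retention period.
    cutoff = datetime.now() - timedelta(days=PURGE_DAYS)
    for folder in ARCHIVE.iterdir():
        if not folder.is_dir():
            continue
        try:
            folder_date = datetime.strptime(folder.name, "%Y-%m-%d")
        except ValueError:
            continue                             # skip folders that are not dated
        if folder_date < cutoff:
            shutil.rmtree(folder)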

Technology & Tools

Abinitio ETL, Big Data, Hadoop, Hive, Teradata 12, Express>IT

Project 5 – Client BARCLAYS BANK, IBM INDIA PVT LTD, Pune, Maharashtra

Project Name

Sam8 Reporting (AML)

Team Size

10

Start Date

Dec 2018

End Date

May 2020

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our IBM UK banking & financial client. The project focused on publishing the Suspicious Activity Monitoring report (SAM8), which incorporates suspicious transactions.

Role & Contribution

Developed a housekeeping framework to archive files based on a 3-day retention period.

Prepared architectural overview, high-level, and detailed design documents for business and technical requirements.

Developed the Party & Party Account Relation data transformation graph.

Wrote a test data generation utility.

Performed unit testing through Ab Initio and acted as a subject matter expert for QA.

Developed Hive queries on Parquet tables to perform data analysis and meet business requirements (see the sketch after this list).

Developed generic frameworks that are configurable and reusable using the Extract, Transform, and Load tool Ab Initio.

Loaded data from different datasets using the Extract, Transform, and Load tool Ab Initio into the data lake built on the Hadoop Distributed File System, in formats and structures such as Parquet and Avro, to speed up analytics.

Worked closely with business users; interacted with ETL developers, project managers, and members of QA teams.

Designed, developed, tested, and maintained Tableau functional reports based on user requirements.
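A minimal sketch of the kind of analysis query referenced above, run with Spark SQL against a Parquet-backed Hive table; the database, table, columns, business date, and threshold are illustrative placeholders only.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("sam8_txn_analysis")
         .enableHiveSupport()        # needed to read tables from the Hive metastore
         .getOrCreate())

# Aggregate transaction activity per party for one business date and keep
# the parties whose daily total exceeds an illustrative review threshold.
df = spark.sql("""
    SELECT party_id,
           COUNT(*)        AS txn_count,
           SUM(txn_amount) AS total_amount
    FROM   aml_db.transactions          -- Parquet table registered in Hive
    WHERE  business_date = '2020-01-31'
    GROUP  BY party_id
    HAVING SUM(txn_amount) > 10000
""")
df.show(20, truncate=False)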

Technology & Tools

ETL Abinitio and Teradata, HDFS, Hive.

Key Achievements

Successfully implemented the ETL process to publish data for SAM8 reporting, and developed a generic framework to archive and purge files based on a custom configuration file and store audit information in Hive tables.

Project 6 – Client BARCLAYS BANK, IBM INDIA PVT LTD, Pune, Maharashtra

Project Name

9PSDM

Team Size

8

Start Date

APR 2017

End Date

NOV 2018

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our IBM UK banking & financial client. The project focused on decommissioning the BIW 1.0 data warehouse built on Teradata and migrating to the BIW 2.0 data lake built on HDFS (Hadoop Distributed File System), using the 9PSDM framework, which utilizes the Ab Initio product suite: Graphical Development Environment, Enterprise Meta>Environment, and Conduct>It.

Role & Contribution

Developed a generic process to load data feeds (intraday, full snapshot, and delta) from Teradata to the Hadoop Distributed File System using the Extract, Transform, Load tool Ab Initio.

Created an external partitioned table in Hive to load active and inactive records and view the data in the Big Data tool Hue (a hedged DDL sketch follows this list).

Wrote a test data generation utility to capture test results with test data.

Processed structured and semi-structured data using Apache Hive and the Extract, Transform, Load tool Ab Initio and performed end-to-end testing.

Onboarded the Falcon mainframe source system for data processing into BIW 2.0 (HDFS storage) using the generic 9PSDM framework.

Loaded data from different datasets using the Extract, Transform, and Load tool Ab Initio into the data lake built on the Hadoop Distributed File System, in formats and structures such as Parquet and Avro, to speed up analytics.

Responsible for developing Ab Initio psets, DMLs (Data Manipulation Language), and XFRs (transformations) to perform mapping transformations and aggregations on data using the GDE (Graphical Development Environment) for generic processes such as Extract, ANP, Load, and Reconciliation developed as part of the 9PSDM framework.

Enhanced automation capabilities through scripting, assisted with optimization of scripts and tools, and made decisions around key points of enhancement for utilities and methods.

Designed and recommended the best-suited approach for data movement from different sources to the Hadoop Distributed File System using Apache/Confluent Kafka.

Automated the migration of code to higher environments such as QA, UAT, and PROD using plans created under the Conduct>IT framework.

Assisted in configuration, development, and testing of AutoSys JIL and other scripts.

Interacted face-to-face with the customer and participated in discussions to prepare design documentation from the formulated business and system requirements.

Implemented and delivered applications using Agile methodology, working with business analysts, data engineers, and IT architects to gather requirements and meet deliverables.

Attended design review meetings to finalize the technical design; presented the technical design to all project stakeholders using WebEx (a document sharing tool) and obtained their sign-off.
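A hedged sketch of the external partitioned Hive table mentioned above, with an active/inactive record-status partition; the schema, database name, and HDFS location are placeholders for the actual BIW 2.0 layout.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("biw2_ddl")
         .enableHiveSupport()
         .getOrCreate())

# External table so the data files loaded by the graphs stay under HDFS control.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS biw2.party_account (
        party_id      STRING,
        account_id    STRING,
        effective_dt  DATE
    )
    PARTITIONED BY (record_status STRING)   -- 'ACTIVE' / 'INACTIVE'
    STORED AS PARQUET
    LOCATION 'hdfs:///data/biw2/party_account'
""")

# Register partitions written by the load graphs so Hue users can query them.
spark.sql("MSCK REPAIR TABLE biw2.party_account")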

Technology & Tools

Abinitio 3.3.4, HDFS, Apache Hive 2.2.0, Apache Kafka, Autosys.

Key Achievements

Successfully integrated a Kafka consumer with the Ab Initio queue (producer) to publish data to a mem-store for real-time reporting, and leveraged the 9PSDM framework for varied source systems to populate data into the BIW 2.0 Hadoop Distributed File System.

Project 7 – Client UNITED HEALTHCARE GROUP, Exusia Inc, Pune, Maharashtra

Project Name

Claims Integration

Team Size

8

Start Date

Apr 2016

End Date

Mar 2017

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our US healthcare client (UHG). The project focused on re-engineering the DMF framework developed by Ab Initio Corp. into the DXF framework. Worked as an Ab Initio developer in the framework team and also alongside the QA team on defect fixing. The DXF framework is divided into phases: Get, ILM, Map, Load, and Distribute. The Get process extracts the data present in the MapR sandbox, and the Map phase applies mapping transformations. The ILM process generates the record structure for Map from the target table structures. The Load process includes a CDC process that creates master ICFF files comprising all history data, along with insert and update event files containing current records. The insert and update files are used in the consumption process and loaded into the target Netezza database. Late-arriving dimensions are handled using placeholder file concepts.

Role & Contribution

Worked as an Ab Initio developer with Unix and Netezza at the back end. Prepared a strong test strategy and execution plan to meet business expectations.

Worked on Express>It to create GET, MAP, ILM, and LOAD appconfs for the generic DXF framework.

Processed big data sets using the integration framework to capture real-time update and insert records.

Utilized the distribution framework process to load event files to the target Netezza database.

Coordinated the QA offshore call and was actively involved in defect fixing and resolution.

Wrote a shell script to capture records processed per day, average elapsed time, and average CPU time from logs, broken down by source system, to understand the daily record processing of the ETL code (an illustrative sketch follows this list).

Performance tuned ETL jobs to decrease the time it takes to complete the load process.

Collaborated with the infrastructure, network, database, application, and Business Intelligence teams to ensure data quality and availability.

Performed unit test case writing and execution with data validation.
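An illustrative sketch of the daily log-metrics report described above, written here in Python rather than the original shell script; the log directory and the log line format are assumptions.

import re
from collections import defaultdict
from pathlib import Path
from statistics import mean

LOG_DIR = Path("/var/log/dxf")          # assumed log location
# Assumed line format: "<source> records=<n> elapsed=<sec> cpu=<sec>"
PATTERN = re.compile(r"(\w+)\s+records=(\d+)\s+elapsed=([\d.]+)\s+cpu=([\d.]+)")

stats = defaultdict(lambda: {"records": 0, "elapsed": [], "cpu": []})
for log in LOG_DIR.glob("*.log"):
    for line in log.read_text().splitlines():
        m = PATTERN.search(line)
        if m:
            src, rec, elapsed, cpu = m.groups()
            stats[src]["records"] += int(rec)
            stats[src]["elapsed"].append(float(elapsed))
            stats[src]["cpu"].append(float(cpu))

# Print one summary line per source system.
for src, s in stats.items():
    print("{}: {} records, avg elapsed {:.1f}s, avg cpu {:.1f}s".format(
        src, s["records"], mean(s["elapsed"]), mean(s["cpu"])))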

Technology & Tools

Abinitio 3.2.6, Express>It, & Netezza

Key Achievements

Successfully implemented this project.

Project 8 (Client OPTUM Rx), Exusia Inc, Pune, Maharashtra

Project Name

Prior Authorization

Team Size

8

Start Date

Oct 2015

End Date

Mar 2016

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our US healthcare client (UHG). The project focused on improving the prior authorization process. Prior authorization is the process undertaken whenever a doctor prescribes a patient high-order drugs that are not covered under the medical plan issued by the insurance firm; OptumRx (a PBM) provides approval for the usage of such drugs. We developed an end-to-end application that gathered data from various source systems (SharePoint, Excel, etc.), performed business transformations on it, published the data to the data lake, and used Query>It to generate reports.

Role & Contribution

Understood the requirements, then designed and implemented them.

Worked on Express>It to create GET, MAP, ILM, and LOAD appconfs for the generic DMF framework.

Published data from Optum SharePoint to the data lake built on a multifile system using a customized web service graph.

Developed a web service graph to fetch data via the SOAP protocol in XML format and used the xml-to-dml utility to convert the XML into structured data (a minimal sketch of the extract step follows this list).

Developed the loader process with a remote layout so that data is published where the reporting tool Query>IT is installed.

Leveraged the Query>IT tool for data analysis and query purposes.

Worked with dimension and fact tables in the data warehouse. Knowledge of data warehousing concepts and dimensional models such as the star schema and snowflake schema.

Performed unit test case writing and execution with data validation.

Provided test and deployment support; analyzed and fixed plan/graph failure issues found during testing.

Developed ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) solutions using Ab Initio graphs and plans involving transformations for the OL (operational layer), ODS (operational data store), dimension, fact, and aggregate layers of Netezza databases.
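A minimal sketch of the SOAP extract step described above: post a SOAP envelope to a SharePoint list service and walk the XML response. The endpoint, SOAPAction, list name, and element names are placeholder assumptions, and the real graph converted the XML to Ab Initio DML with the xml-to-dml utility.

import requests
import xml.etree.ElementTree as ET

ENDPOINT = "https://sharepoint.example.com/_vti_bin/Lists.asmx"   # placeholder URL
ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetListItems xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <listName>PriorAuthRequests</listName>
    </GetListItems>
  </soap:Body>
</soap:Envelope>"""

resp = requests.post(
    ENDPOINT,
    data=ENVELOPE,
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        # SOAPAction value depends on the actual service; placeholder shown here.
        "SOAPAction": "http://schemas.microsoft.com/sharepoint/soap/GetListItems",
    },
    timeout=60,
)
resp.raise_for_status()

# Walk the response tree; element names depend on the actual service schema.
root = ET.fromstring(resp.content)
for elem in root.iter():
    if elem.tag.endswith("row"):
        print(elem.attrib)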

Technology & Tools

ETL Abinitio, Web Service SOAP, Query>IT and Express>IT

Key Achievements

Successfully loaded data to the RxClaim data mart built on Netezza using the data integration framework, and further distributed update and insert file data for downstream processing.

Project 9 (Client OPTUM Rx), Exusia Inc, Pune, Maharashtra

Project Name

Rxtrack

Team Size

3

Start Date

Apr 2015

End Date

Sept 2015

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our US healthcare client (UHG). The project focused on extracting data from six source systems: RxAuth (mainframe), Pass (Oracle), Avaya (Informix), Stellent (MySQL), Training (static CSV), and Compliance (SharePoint). I was involved in extracting data from the Training and Compliance source systems and designing the ETL process for loading history and incremental feed data to the ODS (Operational Data Store).

Role & Contribution

Developed ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) solutions using Ab Initio graphs and plans involving transformations for the OL (operational layer), ODS (operational data store), dimension, fact, and aggregate layers of the target databases.

Developed a generic SCD Type II (slowly changing dimension) process to capture historical data using point-in-time file concepts.

Developed a generic audit process to validate source and target record counts as part of reconciliation (a hedged sketch follows this list).

Developed ETL code with Unix scripting to publish the audit report in HTML format, showcasing the incremental and history data loads.

Automated the migration of code to higher environments such as QA, UAT, and PROD using plans created under the Conduct>IT framework.

Performed bug fixes and permanent fixes for problems.
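A hedged sketch of the source-to-target reconciliation described above: compare the record count of a source feed with the count loaded to the ODS target and flag a mismatch. The file path, table name, and load-date filter are assumptions.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("rxtrack_recon")
         .enableHiveSupport()
         .getOrCreate())

# Count the records in the incoming feed file.
src_count = (spark.read
             .option("header", "true")
             .csv("/data/rxtrack/landing/training_feed.csv")   # assumed feed path
             .count())

# Count the records loaded to the ODS target for the same business date.
tgt_count = spark.sql(
    "SELECT COUNT(*) AS c FROM ods.training_feed "
    "WHERE load_date = '2015-06-30'"
).collect()[0]["c"]

status = "MATCH" if src_count == tgt_count else "MISMATCH"
print("source={} target={} status={}".format(src_count, tgt_count, status))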

Technology & Tools

ETL Abinitio, Unix, Control Center

Key Achievements

Successfully implemented this project.

Project 10 – Client CITI BANK, Wipro Technologies, Pune, Maharashtra

Project Name

Anti Money Laundering Project

Team Size

10

Start Date

Jan 2013

End Date

Feb 2015

Project Description

As an Ab Initio Developer, responsible for analysis, design, development, testing, and implementation of an ETL solution for our Wipro US banking & financial client. The project focused on publishing the SAR (Suspicious Activity Report) and CTR (Currency Transaction Report) to prevent fraudulent transactions per AML guidelines. The Bank Secrecy Act of 1970 (BSA, also known as the Currency and Foreign Transactions Reporting Act) requires financial institutions in the United States to assist U.S. government agencies in detecting and preventing money laundering, and every bank follows the strict guidelines stated under the AML act. Banks take various measures to prevent money laundering; for instance, a bank generates the CTR to keep track of fraudsters. The CTR must report cash transactions in excess of $10,000 during the same business day, whether in one transaction or a combination of cash transactions. A customer's transactions above this limit are recorded in the CTR report, which is then filed electronically with the Financial Crimes Enforcement Network, which takes the necessary penal action against fraudsters.

Role & Contribution

System analysis and design to migrate the DB2 database from Mainframe system to Teradata using Extract, Transform Load tool Abinitio.

Published the Currency Transaction Report by filtering customers whose cash transactions exceed $10,000 in a day, aggregating the entire transaction amount per day (an illustrative sketch follows this list).

Published alert and non-alert files representing customers with transaction amounts above and below the predefined thresholds for varied organizations (NGOs, etc.), per AML laws; further generated XML reporting of the alert file data for a Java web application.

Translated complex mainframe JCL and COBOL logic into equivalent Ab Initio code using components such as Rollup, Scan, and Reformat.

Involved in performance tuning of graphs to decrease the time it takes to complete the load.

Responsible for code migration to production with the help of the deployment team.

Wrote JIL scripts to schedule jobs in AutoSys on a daily and ad hoc basis.

Developed extrapolation logic to calculate currency exchange rates.

Provided technical assistance for development and maintenance of AutoSys and related scheduling solutions.
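An illustrative sketch of the CTR selection logic described above, expressed in PySpark for brevity; the actual implementation used Ab Initio components on Teradata, and the table, column, and path names here are assumptions. The $10,000 threshold follows the BSA reporting rule described in the project summary.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("ctr_extract")
         .enableHiveSupport()
         .getOrCreate())

txn = spark.table("aml.cash_transactions")   # assumed source table

# Aggregate each customer's cash transactions per business day and keep
# daily totals above the $10,000 CTR reporting threshold.
ctr = (txn.groupBy("customer_id", "business_date")
          .agg(F.sum("txn_amount").alias("daily_total"))
          .filter(F.col("daily_total") > 10000))

ctr.write.mode("overwrite").parquet("/data/aml/ctr_report")   # assumed output path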

Technology & Tools

Abinitio, Autosys, UNIX, Teradata 12

Key Achievements

Successfully migrated mainframe JCL and COBOL code from the DB2 database to Teradata for Banamex Bank.

Education & Credentials

B.E. in Computer Science


