Gathered integration requirements; planned, coordinated, and participated in workshops, solution refinement, and development activities; and supported integration testing
Spearheaded a migration initiative to transition from on-premises Oracle to AWS, utilizing AWS Glue and AWS Lambda for orchestration and data transformation
Extracted metadata from Oracle ETL tools to expedite data mapping processes, enabling faster turnaround times for data integration projects
Developed ETL pipelines in AWS Glue using Connections, Crawlers, and Jobs to extract, transform, and load data from various sources, including Oracle, SFTP, AWS RDS, S3, and Redshift
Designed and developed a new solution to process real-time data using Amazon Kinesis and AWS Glue Streaming ETL with Amazon Redshift for data storage and analysis
Developed Snowflake tasks using PySpark and SnowSQL to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns
Strong understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors and tasks, deployment modes, the execution hierarchy, fault tolerance, and collections
Strong technical understanding of big data platforms, with proficiency in data development software (ETL packages) and programming languages (SQL and Python) to develop and document tools for data extraction, transformation, and loading
Experienced in AWS data lakehouse architecture using the medallion model (Bronze, Silver, and Gold layers)
Utilized Data Vault modelling for staging and both star and snowflake schemas for reporting
Leveraged AWS S3 Lifecycle policies to automate data transitions between storage classes, reducing storage costs by keeping only the most frequently accessed data in high-performance storage
Senior Data Engineer
Resolution Life
05.2021 - 10.2022
Designed and developed data pipelines in AWS Glue, implementing complex ETL processes, data transformations, and orchestration
Created and integrated connections to diverse data sources including legacy assets, on-premises databases, Oracle Database, SAP, Salesforce, and REST APIs, ensuring seamless data integration and optimizing performance
Implemented robust data extraction in AWS Glue, transforming raw files from Amazon S3, EBS, or EFS into optimized Parquet format
Leveraged AWS Glue DataBrew for data preparation tasks and applied PII masking during data loading to staging tables, enhancing downstream analytics and processing efficiency
Leveraged AWS CodePipeline to implement a robust CI/CD pipeline, ensuring reliable and repeatable deployment of data integration workflows
Utilized AWS Glue's comprehensive set of capabilities to perform complex data manipulation tasks, including data type conversions, aggregations, lookups, and conditional logic
Implemented near real-time data streaming pipelines on AWS, leveraging Change Data Capture (CDC) logic with AWS Glue event triggers
Employed watermarking techniques in Databricks on AWS to process only changed records, improving data processing efficiency and ensuring data integrity
Implemented cluster optimization strategies in Databricks, configuring settings for performance and cost efficiency
Utilized Databricks notebooks to execute PySpark scripts, leveraging parallel processing to efficiently handle large-scale data transformations across diverse datasets
Principal Data Consultant
NAB
04.2017 - 04.2021
Architected and deployed a scalable real-time data platform on the AWS stack (S3, Glue, Lambda, RDS), enhancing business decision-making capability by 30%
Migrated legacy data systems (Oracle & SAP) to AWS Redshift
Utilized Glue crawlers to automate the discovery and ingestion of data, ensuring seamless transition while preserving data integrity and logic
Designed and implemented robust ETL pipelines using AWS Glue, reducing processing time by 40%
Designed and introduced an automation script framework using Python and AWS Lambda to streamline data ingestion processes, reducing manual effort by 50%
Designed and executed a disaster recovery plan in AWS for critical data stores, drastically reducing potential downtime
Led a team of 4 data engineers in migrating 50+ TB of data to AWS Redshift, adhering to strict data security and compliance standards, resulting in a secure and scalable data environment
Collaborated closely with cross-functional teams to design and implement migration strategies aligned with organizational objectives and compliance requirements
Senior Consultant
Oracle Corporation
05.2012 - 04.2017
Led the design and implementation of custom ODI interfaces, incorporating complex transformation logic, filters, joins, and lookups to seamlessly migrate data from legacy systems to modern cloud-based environments
Employed ODI's extensive transformation capabilities to ensure data quality and consistency throughout the migration process
Developed and optimized ODI packages and scenarios to orchestrate data integration workflows across heterogeneous data sources, leveraging ODI's graphical interface and declarative design approach to simplify development and maintenance tasks
Implemented best practices and design patterns to enhance scalability, performance, and reusability of ODI artifacts
Collaborated with business stakeholders to gather requirements and translate business logic into ODI workflows and mappings, ensuring alignment with organizational objectives and data governance standards
Provided technical leadership and guidance to junior developers, facilitating knowledge transfer and skill development in ODI development practices
Implemented ODI's change data capture (CDC) feature to minimize data load times
Orchestrated end-to-end migration processes, encompassing data ingestion, integration, and normalization, while adhering to best practices in data management and governance
Led successful migration of customer data from legacy systems to Microsoft Dynamics 365, ensuring seamless integration and data accuracy
Developed custom data migration scripts and validation processes to identify and resolve data inconsistencies, achieving a 98% data accuracy rate post-migration