Karthik Annangi

Melbourne

Overview

12 years of professional experience
1 certification

Work History

Senior Data Engineer

AMP
10.2022 - Current
  • Gathered integration requirements; planned, coordinated, and participated in workshops, solution refinement, and development activities; and supported integration testing
  • Spearheaded a migration initiative to transition from on-premises Oracle to AWS, utilizing AWS Glue and AWS Lambda for orchestration and data transformation
  • Extracted metadata from Oracle ETL tools to expedite data mapping processes, enabling faster turnaround times for data integration projects
  • Developed ETL pipelines in AWS Glue using Connections, Crawlers, and Jobs to extract, transform, and load data from various sources, including Oracle, SFTP, AWS RDS, S3, and Redshift
  • Designed and developed a new solution to process real-time data using Amazon Kinesis and AWS Glue Streaming ETL with Amazon Redshift for data storage and analysis
  • Developed Snowflake tasks using PySpark and SnowSQL to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns
  • Strong understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors and tasks, deployment modes, execution hierarchy, fault tolerance, and collections
  • Strong technical understanding of big data platforms, with proficient use of data development software (ETL packages) and programming languages (SQL and Python) to develop and document tools for data extraction, transformation, and loading
  • Experienced in AWS data lakehouse architecture using Medallion model (Bronze, Silver, Gold tiers)
  • Utilized Data Vault modelling for staging, and both Star and Snowflake schemas for reporting
  • Leveraged AWS S3 Lifecycle policies to automate data transitions between storage classes, reducing storage costs by keeping only the most valuable data in high-performance storage.

Senior Data Engineer

Resolution Life
05.2021 - 10.2022
  • Designed and developed data pipelines in AWS Glue, implementing intricate ETL processes, data transformations, and orchestration
  • Created and integrated connections to diverse data sources including legacy assets, on-premises databases, Oracle Database, SAP, Salesforce, and REST APIs, ensuring seamless data integration and optimizing performance
  • Implemented robust data extraction in AWS Glue, transforming raw files from Amazon S3, EBS, or EFS into optimized Parquet format
  • Leveraged AWS Glue DataBrew for data preparation tasks and applied PII masking during data loading to staging tables, enhancing downstream analytics and processing efficiency
  • Leveraged AWS CodePipeline to implement a robust CI/CD pipeline, ensuring reliable and repeatable deployment of data integration workflows
  • Utilized AWS Glue transformations to perform complex data manipulation tasks, including data type conversions, aggregations, lookups, and conditional logic
  • Implemented near real-time data streaming pipelines on AWS, leveraging Change Data Capture (CDC) logic with AWS Glue event triggers
  • Employed watermarking techniques in Databricks on AWS to process only changed records, improving data processing efficiency and ensuring data integrity
  • Implemented cluster optimization strategies in Databricks, configuring settings for performance and cost efficiency
  • Utilized Databricks notebooks for seamless execution of PySpark scripts, leveraging parallel processing capabilities to efficiently handle large-scale data transformations across diverse datasets.

Principal Data Consultant

NAB
04.2017 - 04.2021
  • Architected and deployed a scalable real-time data platform on AWS (S3, Glue, Lambda, RDS), enhancing business decision-making capability by 30%
  • Migrated legacy data systems (Oracle & SAP) to AWS Redshift
  • Utilized Glue crawlers to automate the discovery and ingestion of data, ensuring seamless transition while preserving data integrity and logic
  • Designed and implemented robust ETL pipelines using AWS Glue, reducing processing time by 40%
  • Designed and introduced an automation scripting framework using Python and AWS Lambda to streamline data ingestion processes, reducing manual effort by 50%
  • Designed and executed a disaster recovery plan in AWS for critical data stores, drastically reducing potential downtime
  • Led a team of 4 data engineers in migrating 50+ TB of data to AWS Redshift, adhering to strict data security and compliance standards, resulting in a secure and scalable data environment
  • Collaborated closely with cross-functional teams to design and implement migration strategies aligned with organizational objectives and compliance requirements.

Senior Consultant

Oracle Corporation
05.2012 - 04.2017
  • Led design and implementation of custom ODI interfaces, incorporating complex transformation logic, filters, joins, and lookups to seamlessly migrate data from legacy systems to modern cloud-based environments
  • Employed ODI's extensive transformation capabilities to ensure data quality and consistency throughout migration process
  • Developed and optimized ODI packages and scenarios to orchestrate data integration workflows across heterogeneous data sources, leveraging ODI's graphical interface and declarative design approach to simplify development and maintenance tasks
  • Implemented best practices and design patterns to enhance scalability, performance, and reusability of ODI artifacts
  • Collaborated with business stakeholders to gather requirements and translate business logic into ODI workflows and mappings, ensuring alignment with organizational objectives and data governance standards
  • Provided technical leadership and guidance to junior developers, facilitating knowledge transfer and skill development in ODI development practices
  • Implemented change data capture (CDC) feature of ODI to minimize data load times
  • Orchestrated end-to-end migration processes, encompassing data ingestion, integration, and normalization, while adhering to best practices in data management and governance
  • Collaborated closely with cross-functional teams to design and implement migration strategies aligned with organizational objectives and compliance requirements
  • Led successful migration of customer data from legacy systems to Microsoft Dynamics 365, ensuring seamless integration and data accuracy
  • Developed custom data migration scripts and validation processes to identify and resolve data inconsistencies, resulting in 98% data accuracy rate post-migration.

Education

Bachelor of Information Technology

JNTU - India
05.2009

Skills

  • Databricks
  • Azure Data Factory
  • PySpark & Spark SQL
  • Python Programming
  • ETL Development
  • Git Version Control
  • Kafka Streaming
  • Data Pipeline Design
  • Data Modeling
  • Advanced SQL
  • Data Migration

Certifications

  • Azure Data Engineer Associate
  • Databricks Certified Data Engineer Associate
  • ODI 12c Certified
  • Smart Communications Certified (CCM Tool)
