Aakanksha Sharma

Senior Data Engineer
Melbourne

Summary

Detail-oriented and results-driven professional with over 6 years of experience in the IT industry, having performed various roles as an IT officer, programmer analyst, data engineer, and data analyst. Able to handle multiple projects simultaneously with a high degree of accuracy. Pursuing a full-time role in data analysis and machine learning.


Overview

9 years of professional experience
6 years of post-secondary education

Work History

Senior Data Engineer

Client - NAB
07.2023 - Current

Role : Senior Data Engineer

Team Size : 8

Environment: Databricks, Apache Spark, Delta Lake, Databricks Notebooks, Workflows

Role and Responsibilities:

  • Maintained scalable ETL pipelines, ensuring efficient data ingestion and transformation
  • Collaborated with cross-functional teams to analyze business requirements and translate them into effective Databricks notebooks
  • Utilized Databricks Jobs for scheduling and orchestrating Spark jobs, improving automation and reducing manual intervention
  • Implemented performance tuning strategies to enhance Spark job execution on Databricks clusters
  • Worked on real-time data processing using Structured Streaming in Databricks (a minimal sketch follows this list)
  • Collaborated with data scientists to deploy machine learning models on Databricks for predictive analytics
  • Implemented and maintained security measures, ensuring data integrity and compliance with industry standards
  • Conducted training sessions for team members on Databricks best practices and optimization techniques
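As an illustration of the streaming and Delta Lake work above, the following is a minimal PySpark Structured Streaming sketch; the source path, schema, and checkpoint location are hypothetical placeholders, not details from the NAB engagement.

```python
# Minimal Structured Streaming sketch (illustrative; paths and schema are assumed).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Read JSON event files as they land in a (hypothetical) raw landing zone.
events = (
    spark.readStream
    .format("json")
    .schema("event_id STRING, customer_id STRING, amount DOUBLE, ts TIMESTAMP")
    .load("/mnt/raw/events/")
)

# Light transformation before persisting: derive a partition date, drop bad rows.
cleaned = events.withColumn("ingest_date", F.to_date("ts")).dropna(subset=["event_id"])

# Append incrementally to a Delta table; the checkpoint gives exactly-once sinks.
# (Delta Lake is built into the Databricks runtime; open-source Spark needs the delta package.)
query = (
    cleaned.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/events/")
    .outputMode("append")
    .start("/mnt/curated/events/")
)
```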

Project Name : CIJ - POD11

Client - Telstra
02.2022 - 06.2023

Role : Technical AWS Lead

Team Size : 10

Environment: AWS – AWS Batch, CloudWatch, S3, EC2, Container, Docker, SQS, SNS, Athena, Redshift, Kinesis Stream, Kinesis Firehose, Lambda, Python

Role and Responsibilities:

  • Telstra AWS project lead, responsible for rolling out large customer-targeted offers such as the Telstra iPhone and Samsung launches
  • Analyzed complex data and identified anomalies, trends, and risks to provide useful insights for improving internal controls
  • Modelled predictions using feature selection algorithms, including customer churn and purchase propensities, which resulted in an 18% customer churn reduction for 2022-2023
  • Prepared documentation and analytics reports, delivering summarized results, analysis, and conclusions
  • Reviewed current analytics implementations and provided recommendations for realignment with customer KPIs and technical best practices, reducing database runtime by 27%
  • Led a team of 10 members, providing technical and functional guidance
  • Adhered to the Data Privacy and Protection policy in CIJ
  • Patch-updated data files (CSV to Parquet conversion) and uploaded them to S3 using PySpark on Jupyter Notebook and AWS EMR (a minimal sketch follows this list)
  • Downloaded data from S3 into Jupyter notebooks using PySpark to visualize the data in Python and R, helping generate reports and understand the attributes behind customer purchase and churn propensity models
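A minimal sketch of the CSV-to-Parquet patch update described above, as it might run in PySpark on an EMR cluster; the bucket and object names are hypothetical.

```python
# CSV-to-Parquet patch update sketch (bucket and paths are assumed placeholders).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

# Read the raw CSV extract from S3 (EMR resolves s3:// URIs via EMRFS).
df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/customers.csv")
)

# Rewrite as Parquet: columnar and compressed, so Athena and Redshift Spectrum
# scan far less data than they would against the original CSV.
df.write.mode("overwrite").parquet("s3://example-bucket/curated/customers/")
```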

Project Name : CIJ - Feature Engineering

Client : Telstra
07.2020 - 01.2022

Role : Sr. Data Engineer
Team Size : 8
Environment: AWS – AWS Batch, S3, EC2, Container, Docker, SQS, SNS, Athena, Redshift, Kinesis, SQL
Role and Responsibilities:

  • Built models for downstream customer campaigns
  • Built recommendation and data science models for campaigns and marketing
  • Created customer onboarding features on top of customer propensity models (illustrated in the sketch after this list)
  • Maintained optimizations in the modelling
  • Performed data analysis and identified data lineages
  • Built solution sheets and interface architectural designs
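A minimal sketch of the kind of feature engineering that feeds a customer propensity model; the table name, columns, and 90-day window are illustrative assumptions rather than details from the Telstra project.

```python
# Propensity-feature sketch (table and column names are assumed).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("churn-features").getOrCreate()

usage = spark.table("telemetry.daily_usage")  # hypothetical per-day usage table

# Aggregate each customer's last 90 days of behaviour into model features.
features = (
    usage
    .where(F.col("event_date") >= F.date_sub(F.current_date(), 90))
    .groupBy("customer_id")
    .agg(
        F.countDistinct("event_date").alias("active_days_90d"),
        F.sum("data_mb").alias("data_mb_90d"),
        F.max("event_date").alias("last_active_date"),
    )
    # Recency is usually the strongest churn signal, so derive it explicitly.
    .withColumn("days_since_active",
                F.datediff(F.current_date(), F.col("last_active_date")))
)

features.write.mode("overwrite").saveAsTable("features.churn_inputs")
```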


Project : Legacy Migration

WorkSafe Victoria
07.2019 - 06.2020

Role : Sr. Data Analyst - Big Data Analytics
Team Size : 18
Environment: Teradata, Netezza, MapR, Scala, Spark, Hive, Spark SQL

Role and Responsibilities:

  • Documented, designed, developed, and automated a bot for testing data migration from various RDBMS databases (Teradata, Netezza) into the Big Data Platform (a minimal validation sketch follows this list)
  • Gathered requirements for all sources, and provided design, technical support, and guidance
  • Analyzed business data and processes as a prerequisite to ingesting data into the legacy architecture, and replicated the pipeline to ingest the same data into the Big Data Platform
  • Transformed raw source data into business-consumable data, based on requirements and logic, for downstream consumption
  • Integrated data with Tableau to create dashboards
  • Automated tools for document verification, which increased legal document validation by 23%
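One check such a migration-testing bot might automate is reconciling a legacy table against its Big Data Platform copy. The sketch below uses PySpark for consistency with the other examples (the project itself ran on Scala/Spark), and the JDBC URL, credentials, and table names are hypothetical.

```python
# Migration-validation sketch (connection details and table names are assumed;
# the Teradata JDBC driver jar must be on the classpath).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-check").enableHiveSupport().getOrCreate()

# Source: the legacy RDBMS table, read over JDBC (Netezza is analogous).
source = (
    spark.read.format("jdbc")
    .option("url", "jdbc:teradata://legacy-host/DATABASE=claims")
    .option("dbtable", "claims.payments")
    .option("user", "svc_user")
    .option("password", "***")  # placeholder; real jobs pull this from a secret store
    .load()
)

# Target: the migrated Hive table on the Big Data Platform.
target = spark.table("bdp_claims.payments")

# Simplest reconciliation: row counts must match. A fuller bot would also
# compare per-column checksums or hashes to catch silent value corruption.
src_count, tgt_count = source.count(), target.count()
assert src_count == tgt_count, f"Row count mismatch: source={src_count}, target={tgt_count}"
```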

Project : Progrexion CRM

Role : Programmer Analyst
10.2015 - 05.2017

Client : Progrexion Services
Team Size : 8
Environment: Salesforce Lightning, Einstein Analytics, SQL Data Loader, Apex, Python

  • Analysed requirements for Progrexion Services, a US-based credit score company, and extended its existing Rev software on the Salesforce platform
  • Worked closely with clients to establish specifications and system designs, including development of new dynamic Salesforce Lightning pages that accommodated legacy system data covering over 21 million customer records
  • Responded to and remedied critical issues to limit downtime
  • Built and debugged custom components, executed technical implementation, integrated with external systems and created business logic using OSGi
  • Produced monthly reports using Salesforce Einstein Analytics

Education

Master of Science - Data Science

Monash University
Melbourne, VIC
08.2017 - 07.2019

Bachelor of Science - Computer Science and Programming

NIIT University
India
08.2011 - 05.2015

Skills

  • Predictive Modeling
  • Data Munging
  • Data Verification and Maintenance
  • Strong Analytical Skills
  • Problem-Solving
  • Leadership Skills
  • Organization and Time Management
  • Cloud Services

Additional Information

  • Over 5 years of experience in data warehousing, maintenance, and the software development life cycle
  • Over 4 years of experience in Big Data and Hadoop
  • Strong knowledge of and experience with the AWS cloud platform
  • Hands-on expertise in building ingestion pipelines in the AWS ecosystem, both batch and streaming
  • Extensive knowledge of integrating AWS services – AWS ECS, WDL, Container, Docker, AWS Batch, EC2, SQS, SNS, S3, CloudWatch, Redshift, Athena, Glue, RDS
  • Experience in building AWS streaming pipelines – Kinesis Stream, Kinesis Firehose, Flink apps, Lambda
  • Implemented and optimized data processing workflows using Databricks, Apache Spark, and Delta Lake
  • Developed and maintained Databricks notebooks for exploratory data analysis, enabling data scientists to derive actionable insights
  • Expertise in programming languages such as Java, Python, Scala, and R
  • Strong problem-solving skills applied to database and communications architecture design, leveraging minimum viable product (MVP) methodology
  • Created high-quality APIs for data accessibility on the Hadoop platform that integrate with the broader environment
  • Hands-on experience with Big Data core components and ecosystem – HDFS, MapReduce (MR1, YARN), Hive, Hue, Impala, Spark, Sqoop, HBase, Tableau
  • Expertise in writing Hive queries, Hue and MapReduce scripts, and loading large volumes of data from the local file system and HDFS into Hive
  • Designed and developed Sqoop queries for loading RDBMS data into the Hadoop cluster
  • Built a Big Data analytics and visualization platform for handling high-volume batch-oriented and real-time data streams
  • Designed and developed multiple frameworks, including ingestion (external RDBMS databases/files into Hive) and transformation and extraction (Hive into RDBMS databases)
  • Knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytics tools and environments
  • Knowledge of cloud computing, including virtualization, hosted services, multi-tenant cloud infrastructures, storage systems, and content delivery networks
  • Effective customer-facing communication and listening skills
  • Experience working with recommendation engines, data pipelines, and distributed machine learning
  • Experience with data analytics, data visualization techniques and software, and deep learning frameworks
  • Experience with data wrangling using libraries such as Pandas, Scikit-learn, and NumPy (see the sketch after this list)
  • Created Big Data clusters and S3 buckets on AWS
  • Worked with different file formats such as LDIF, Avro, and Parquet
  • Real-world experience with production deployments and code fixes; experience in Agile/Scrum environments
  • Experience with query engines such as Hive, Impala, and Spark SQL
  • Worked with the Spark ecosystem – Spark Core, PySpark, Spark SQL
  • Experience with other SQL databases – Teradata, Oracle
  • Knowledge of data warehouses and ETL tools such as Informatica
  • Working experience with business intelligence and visualization tools – Excel, Tableau
  • Experience with Shell, VB, and Linux/Unix scripting
  • Transferred data between HDFS and relational database systems (MySQL, Teradata, and Oracle) using Sqoop
  • Experience patch-fixing and downloading S3 data using PySpark and AWS EMR
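As a small illustration of the data-wrangling item above, here is a minimal Pandas sketch; the file name and columns are hypothetical.

```python
# Data-wrangling sketch with Pandas (input file and columns are assumed).
import pandas as pd

df = pd.read_csv("customers.csv", parse_dates=["signup_date"])

# Typical cleanup: dedupe, normalise text, derive a tenure feature, fill gaps.
df = (
    df.drop_duplicates(subset="customer_id")
      .assign(
          state=lambda d: d["state"].str.strip().str.upper(),
          tenure_days=lambda d: (pd.Timestamp.today() - d["signup_date"]).dt.days,
      )
)
df["plan"] = df["plan"].fillna("unknown")

print(df.describe(include="all"))
```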


Employment

NCS Australia, Melbourne Australia 

Sr. Data Engineer/Data and Analytics 

2023 – Current


COGNIZANT TECH LTD, Melbourne Australia
Sr. Associate/Data Analytics and AI
2021 – 2023


Deloitte, Melbourne Australia
Data Consultant/ Data Analytics and AI
2019 – 2021


COGNIZANT TECH LTD, Bangalore, India

Programmer Analyst/ CRM

2015 – 2017

Technical Skills

  • Cloud Services : AWS Batch, CloudWatch, Redshift, Athena, SQS, SNS, S3, Glue, ECS, Lambda, Docker, EC2, Kinesis, Databricks Notebooks and Workflows
  • Hadoop Ecosystem : HDFS, MapReduce, Sqoop, Hive, Impala, Hue
  • RDBMS Databases : Teradata, Oracle, MySQL, MSSQL
  • Spark Ecosystem : Spark Core, Spark SQL, Spark Streaming
  • Programming Languages : Python, C, C++, Java, Scala, SQL, R
  • Cloud Platforms : AWS, Salesforce, Databricks
  • Reporting Tools : Tableau, Excel, Power BI
  • Scripting Languages : Shell Scripting, Batch Scripting, Perl
  • Hadoop Distributions : Cloudera Manager, MapR
  • Scheduling Tools : CAWA, Control-M, Crontab
  • Coordination Service : ZooKeeper
  • Development Tools : IntelliJ, Eclipse, PyCharm, GitHub
  • ETL and Analytics : Informatica

Personal Profile

  • Full name : Aakanksha Sharma
  • Languages : English, Hindi
  • Marital Status : Single
  • Active Visa : Yes, subclass 190 (Australian Permanent Resident)

Timeline

Senior Data Engineer

Client - NAB
07.2023 - Current

Project Name : CIJ - POD11

Client - Telstra
02.2022 - 06.2023

Project Name : CIJ - Feature Engineering

Client : Telstra
07.2020 - 01.2022

Project : Legacy Migration

WorkSafe Victoria
07.2019 - 06.2020

Master of Science - Data Science

Monash University
08.2017 - 07.2019

Project : Progrexion CRM

Role : Programmer Analyst
10.2015 - 05.2017

Bachelor of Science - Computer Science And Programming

NIIT University
08.2011 - 05.2015