
Sanjay Jagannadhan

Senior Software Engineer | Data Engineer | Data Scientist
Castle Hill

Summary

As a Senior Software Engineer at Tech Mahindra, I specialize in building and optimizing enterprise data lake solutions to ingest and analyze large volumes of data from various sources. I utilize my expertise in Azure Data Factory, Azure Databricks, Delta Lake, SQL, and other cloud technologies to ensure governance, cost management, security, storage, DevOps, and consumption design patterns. Collaborating with a cross-functional team of data scientists, analysts, and engineers, I deliver high-quality data products and insights to our clients.

I have a robust background in data science and engineering, holding a master's degree in data science from Monash University and extensive experience in data warehouse development, ETL processes, Power BI dashboards, and machine learning models. My keen interest in applying advanced analytics and AI solutions to solve real-world problems drives me to continuously seek out data science and data engineering challenges to learn new skills and techniques. I am a diligent, collaborative, and creative professional committed to excellence and innovation.

Overview

9 years of professional experience
6 years of post-secondary education
11 certifications

Work History

Senior Software Engineer

Tech Mahindra
09.2024 - Current

As a Senior Software Engineer, I specialize in the design, development, and maintenance of data architecture, solutions, pipelines, and systems that support the organization's data-driven lake house/warehouse. My responsibilities include troubleshooting data issues, resolving batch job failures, and ensuring the smooth operation of framework scripts and pipelines. I actively perform data migration tasks, facilitating seamless data transfers and transformations between various data sources and destinations. I excel in data engineering tasks in Databricks, harnessing its capabilities for advanced data processing, transformation, and analysis, thus contributing to the organization's data management and analytics initiatives.

My daily tasks encompass system and server monitoring, service management, and stakeholder engagement to ensure the availability, accuracy, and accessibility of data for various business needs. I collaborate closely with data architects, solution designers, developers, testers, and business stakeholders to facilitate a seamless flow of data within the organization. I regularly review solution designs, engage in development and testing activities, provide environment support, and meticulously document technical aspects. My commitment lies in optimizing data processes and enhancing system performance while effectively managing costs.

My extensive skill set includes strong proficiency in Python, PySpark, the Azure Data Factory framework, and Azure SQL, along with expertise in shell scripting, change/incident/problem management, and a solid understanding of continuous integration and deployment pipelines, including branching strategies and Git repositories.

Additionally, I possess valuable knowledge of telecom domain concepts, Scala/Spark, Azure DevOps, and a variety of tools such as Power BI, Tableau, Alteryx, Apache Airflow, and Confluence, all of which enhance my capacity to drive data engineering excellence within the organization.

Software Engineer

SDP SOLUTIONS PTY LTD
09.2023 - 08.2024
  • As a Software Engineer, I specialize in the design, development, and maintenance of data architecture, solutions, pipelines, and systems that support the organization's data-driven lake house/warehouse. In this role, I am responsible for a wide range of critical functions, including troubleshooting data issues, resolving batch job failures, and ensuring the smooth operation of framework scripts and pipelines. I actively perform data migration tasks, facilitating seamless data transfers and transformations between various data sources and destinations. Moreover, I excel at performing data engineering tasks in Databricks, harnessing its capabilities for advanced data processing, transformation, and analysis, thus contributing to the organization's data management and analytics initiatives.
  • My daily tasks encompass system and server monitoring, service management, and stakeholder engagement to ensure the availability, accuracy, and accessibility of data for various business needs. I collaborate closely with data architects, solution designers, developers, testers, and business stakeholders to facilitate a seamless flow of data within the organization. I regularly review solution designs, engage in development and testing activities, provide environment support, and meticulously document technical aspects. My commitment lies in optimizing data processes and enhancing system performance while effectively managing costs.
  • My extensive skill set includes strong proficiency in Python, PySpark, and Azure SQL, along with expertise in shell scripting, change/incident/problem management, and a solid understanding of continuous integration and deployment pipelines, including branching strategies and Git repositories.
  • Additionally, I possess valuable knowledge in telecom domain concepts, Scala/Spark, Azure DevOps, and a variety of tools such as Power BI, Tableau, Alteryx, Apache Airflow, and Confluence, all of which enhance my capacity to drive data engineering excellence within the organization.

Azure Data Engineer

Data-Driven
04.2023 - 07.2023
  • Designing end-to-end data pipelines that efficiently move and transform data from various sources to Azure Data Warehouse. Creating scalable and reliable pipeline architectures that meet the organization's data integration and analytics needs.
  • Developing Extract, Transform, Load (ETL) processes using Databricks and Azure Data Warehouse tools. Implementing data transformations, data enrichment, and data cleansing to ensure data accuracy and consistency.
  • Orchestrating and automating data workflows within Databricks and Azure Data Warehouse. Utilizing tools like Azure Data Factory to schedule and monitor pipeline execution, ensuring timely and accurate data processing.
  • Optimizing pipeline performance to minimize data processing times and resource consumption. Fine-tuning ETL jobs, optimizing SQL queries, and implementing caching mechanisms to enhance data processing efficiency.
  • Implementing data quality checks and validation during the ETL process. Identifying and resolving data quality issues to ensure high data integrity and reliability in the Azure Data Warehouse (see the sketch after the tools list below).
  • Managing the Azure Data Warehouse environment and performing routine maintenance tasks such as index optimization, data purging, and backups to ensure data repository health.


Tools used: Python, PySpark, SQL, Databricks, Azure stack
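
Below is a minimal, illustrative PySpark sketch of the ETL-with-quality-gate pattern described above; the paths, column names, and table layout are hypothetical placeholders, not production code.

    # Minimal ETL sketch: extract raw data, transform it, enforce a
    # simple data quality gate, then load to a curated Delta location.
    # All paths and column names below are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: read a raw source file
    raw = spark.read.option("header", True).csv("/mnt/raw/orders.csv")

    # Transform: deduplicate, cast types, stamp the load date
    clean = (
        raw.dropDuplicates(["order_id"])
           .withColumn("amount", F.col("amount").cast("double"))
           .withColumn("load_date", F.current_date())
    )

    # Quality gate: fail the batch if key fields are null
    bad = clean.filter(F.col("order_id").isNull() | F.col("amount").isNull()).count()
    if bad > 0:
        raise ValueError(f"Quality check failed: {bad} rows with null keys")

    # Load: append the validated batch to the curated zone
    clean.write.format("delta").mode("append").save("/mnt/curated/orders")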

Data Engineer

Retail Insight
08.2022 - 04.2023

Roles and Responsibilities:

  • Gathered requirements for product development and enhancement.
  • Contributed to the next iteration of data pipelines, increasing capability and features so that client data is processed quickly and reliably through the systems.
  • Built and supported traditional and cloud-native technologies to run scalable, manageable, and resilient applications.
  • Implemented ETL (Extract, Transform, Load) pipelines, monitored and maintained data pipeline performance, and implemented analytics solutions.
  • Applied functional programming techniques to data fetching, transformation, and storage (see the sketch below).
  • Developed and enhanced the product using Python, SQL, and Azure Blob Storage.
  • Enhanced existing databases and data pipelines for better data quality.
  • Built new versions of the product based on client requirements.


Tools used: Azure Blob Storage, Azure DB services, Python, PySpark, SQL, GitHub, JFrog, Octopus
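
As a rough illustration of the functional style mentioned above, the sketch below composes small pure functions into a fetch-transform-store pipeline; the record fields and helper names are hypothetical.

    # Functional pipeline sketch: small pure functions composed left to
    # right. Field names ("product_id", "price") are hypothetical.
    from functools import reduce

    def compose(*steps):
        # Chain single-argument functions, applied left to right
        return lambda data: reduce(lambda acc, step: step(acc), steps, data)

    def drop_invalid(rows):
        # Keep only rows that carry a product identifier
        return [r for r in rows if r.get("product_id")]

    def normalise_price(rows):
        # Cast price strings to floats for downstream arithmetic
        return [{**r, "price": float(r["price"])} for r in rows]

    pipeline = compose(drop_invalid, normalise_price)

    rows = [{"product_id": "A1", "price": "9.99"},
            {"product_id": None, "price": "1.50"}]
    print(pipeline(rows))  # [{'product_id': 'A1', 'price': 9.99}]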

Support Engineer

Insight Enterprise
05.2021 - 07.2022
  • Delivered high-quality data solutions as part of product and project engineering at Insight Enterprise.
  • Created and maintained data pipelines, orchestration, data wrangling, and SQL to support existing and new client deliveries, with a growing focus on data engineering.
  • Automated to accelerate: eliminated redundant, manual steps so that every engineer on the team could focus on what engineers do best.
  • Built, configured, and supported data warehouse integrations, and assisted in automating data acquisition, enrichment, and flow to and from core systems.
  • Designed and developed solutions across the Databricks ecosystem: architected robust data management strategies with Delta Lake, used AutoLoader to orchestrate seamless real-time data ingestion, and built comprehensive end-to-end solutions on the Unified Data Analytics Platform (see the ingestion sketch below).
  • Developed and delivered business information solutions.
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.


Tools used: Python, PySpark, SQL, Databricks, Azure stack
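
A minimal sketch of the AutoLoader-to-Delta ingestion pattern referenced above, assuming it runs inside a Databricks workspace (where `spark` is predefined); all storage paths are hypothetical.

    # AutoLoader sketch: incrementally pick up new JSON files landing in
    # cloud storage and stream them into a bronze Delta table.
    # Assumes a Databricks runtime where `spark` already exists.
    stream = (
        spark.readStream
             .format("cloudFiles")                 # AutoLoader source
             .option("cloudFiles.format", "json")  # incoming file format
             .option("cloudFiles.schemaLocation", "/mnt/schemas/events")
             .load("/mnt/landing/events")
    )

    (
        stream.writeStream
              .format("delta")                     # Delta Lake sink
              .option("checkpointLocation", "/mnt/checkpoints/events")
              .outputMode("append")
              .start("/mnt/bronze/events")
    )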

Data Engineer

Department Of Health WA
10.2019 - 05.2021

Full Time - Contract (21-Oct-2019 to 14-May-2021)

1. Data Outputs Team


Data pipeline investigation for our migration project

  • Talked to different stakeholders and drew information flow diagrams.
  • Analyzed complex data using Python and identified trends, risks, and anomalies to provide recommendations for improving our internal systems.


Data Warehouse Migration Process

  • Collaborated with the lead architect to set up our data warehouse pipeline.
  • Integrated data from different sources such as SAS, Postgres, and Oracle.
  • Set up the CI/CD pipeline from scratch.


Automation of parts of Data Warehouse pipeline

  • Modified existing pipelines for better reporting.
  • Set up new pipelines for new data sources.


National Reporting and pipeline management

  • Facilitated new pipelines for national submissions.
  • Created fields with calculated values based on requirements from the analytical team.
  • Created and modified stored procedures to automate loading processes.


2. Data Science Team


Radiology AI Project

  • Built an automated pipeline for fetching DICOM images from our storage, feeding a deep learning model that predicts wrist fractures (see the sketch below).
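
A hypothetical sketch of the image-fetching step, using pydicom to read a series from storage and stack the pixel data into a batch for a model; the folder layout and function name are illustrative only.

    # DICOM-fetching sketch: read every .dcm file in a folder and stack
    # the pixel arrays into one NumPy batch. Paths are placeholders and
    # the images are assumed to share the same dimensions.
    from pathlib import Path

    import numpy as np
    import pydicom

    def load_series(folder):
        # Parse each DICOM file and collect its pixel data as float32
        frames = [pydicom.dcmread(p).pixel_array.astype(np.float32)
                  for p in sorted(Path(folder).glob("*.dcm"))]
        return np.stack(frames)  # shape: (num_images, height, width)

    batch = load_series("/data/dicom/wrist_case_001")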



3. Research Outputs Team

Research Data Extraction

  • Gathered requirements and designed the workflow for data extraction.
  • Performed cohort and control group selection.
  • Retrieved data from multiple sources and performed data quality checks.

Tools used: Python, SSIS, SSAS, Power BI, Tableau, crontab, Unix, SAS, SQL Server, Azure, Jupyter Notebook, TensorFlow, Keras, PyTorch, Seaborn, NLTK, pandas, NumPy

Project Management - Agile

Other skills: presentations, hosting training sessions, and public speaking


Software Engineer

Hexaware Technologies
05.2015 - 03.2017

Full Time - Permanent (12-May-2015 to 23-Mar-2017)

Project: Migration to Outlook

Client: One of the Big 4 financial firms


The purpose of this project was to help our client retire an old technology. To that end, our team created two software tools: one to automate the deactivation, decommissioning, and archiving process, and a second to generate metadata about the databases being archived.

My role was to help develop several features of this software and to perform testing. My weekly responsibilities were to analyze global databases and recommend which databases to deactivate, decommission, or restore. Furthermore, I communicated with teams across the world and generated reports providing insight into the project.


I was also responsible for handling escalations, maintaining the SLA, and providing our clients with reports showing how our progress aligned with it. My major tasks during this project were data wrangling (cleaning and quality improvement), data analysis, database management, software development, testing, reporting, and data archiving. Once we had removed the unwanted data, we analyzed the important databases that needed to be migrated to the new technology to save costs and improve efficiency.


Tools used: SQL, Excel, Python, SharePoint, and Lotus Notes

Education

Master of Science - Data Science

Monash University
Melbourne, VIC, Australia
06.2017 - 10.2019

Bachelor of Science - Information Technology

Anna University
Chennai, TN, India
03.2011 - 03.2015

Skills

PySpark / Databricks

Data Lake Architecture / Data Warehouse Architecture

SQL Server (SSIS, SSRS, SSAS)

Azure (Blob storage, DB Service, Azure Data Factory, Synapse)

Machine Learning Algorithms

Linux / Unix

Azure DevOps (CI/CD pipelines) / GitHub

Tableau / Power BI

Data Mining / Data Analysis / Statistical analysis

Agile framework / Scrum

Optimization Techniques / AutoLoader / Kafka Integration / Delta Live Tables / Cluster Management

Python

Certification

Introduction to Data Engineering using PySpark

Reference

Available on request

Timeline

Senior Software Engineer

Tech Mahindra
09.2024 - Current

Software Engineer

SDP SOLUTIONS PTY LTD
09.2023 - 08.2024

Azure Data Engineer

Data-Driven
04.2023 - 07.2023

Data Engineer

Retail Insight
08.2022 - 04.2023

Support Engineer

Insight Enterprise
05.2021 - 07.2022

Introduction to Data Engineering using PySpark

03-2021

Biomedical Image Analysis

02-2020

Image Processing in Python

02-2020

Python for Data Science

01-2020

Data Engineer

Department Of Health WA
10.2019 - 05.2021

SQL Server DBA

08-2019

Machine Learning Algorithm A-Z

06-2019

Hands on Hadoop

05-2019

Industry Experience

05-2019

R Programming

03-2018

Master of Science - Data Science

Monash University
06.2017 - 10.2019

Python Programming

05-2017

Software Engineer

Hexaware Technologies
05.2015 - 03.2017

.NET Certification

08-2014

Bachelor of Science - Information Technology

Anna University
03.2011 - 03.2015
