NITHIN KRISHNA ENUGURTHI

Adelaide, SA

Summary

A solution-focused data consultant with over 4 years of industry experience, specializing in cloud-based data solutions, data pipelines, and data quality assurance. Demonstrated expertise in Python, R, SQL, PySpark, and AWS, with a track record of designing and implementing end-to-end data transformation solutions that drive business performance. Skilled at collaborating in Agile and Scrum environments, with the ability to liaise effectively with multidisciplinary teams and senior executives.

Overview

  • 7 years of professional experience
  • 1 certification

Work History

Data Engineer

Bailey Abbott
Adelaide, SA
02.2024 - Current

LendFast Data Ingestion and Reporting - Credit Union SA (Client)

  • Seamlessly transitioned reporting services from the Symtrix application to LendFast, focusing on data lineage, reporting development, and comprehensive testing for Credit Union SA (CUSA).
  • Developed an SSIS package to ingest data from flat (CSV) files on the network into tables within the on-premises data warehouse.
  • Designed and implemented an ETL pipeline based on the medallion architecture, incorporating landing and staging phases.
  • Created and maintained a clear audit trail for all ingestion operations.
  • Designed and implemented data integrity checks, including datatype verification, referential integrity, nullable fields, and format validations, as part of the pipeline (illustrated in the sketch after this list).
  • Designed the data model for LendFast, ensuring support for downstream reporting while incorporating Slowly Changing Dimension (SCD) Type 2 changes.
  • Developed and tested scripts for the entire pipeline and data ingestion process.
  • Drafted comprehensive technical documentation, including deployment and maintenance guides.
  • Designed and developed Azure Data Factory pipelines for both full and delta data loads from on-premises to Azure SQL Database.
  • Implemented the entire data pipeline orchestration using a SQL Server Agent job, from ingestion to transformation and loading into Azure SQL DB.
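
For illustration, a minimal sketch of the kind of integrity checks referenced above, written in Python with pandas. The column names and rule set are hypothetical, and the production checks ran inside the SSIS/T-SQL pipeline rather than in pandas.

```python
# Sketch of staging-layer integrity checks (column names and rules are
# hypothetical; the production pipeline implemented these in SSIS/T-SQL).
import pandas as pd

RULES = {
    "loan_id":    {"dtype": "int64",  "nullable": False, "pattern": None},
    "member_id":  {"dtype": "int64",  "nullable": False, "pattern": None},
    "start_date": {"dtype": "object", "nullable": False, "pattern": r"^\d{4}-\d{2}-\d{2}$"},
}

def validate(df: pd.DataFrame, reference_ids: set) -> list[str]:
    """Return human-readable violations for the audit trail."""
    errors = []
    for col, rule in RULES.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
            continue
        if str(df[col].dtype) != rule["dtype"]:            # datatype verification
            errors.append(f"{col}: expected {rule['dtype']}, got {df[col].dtype}")
        if not rule["nullable"] and df[col].isna().any():  # nullable-field check
            errors.append(f"{col}: unexpected NULLs")
        if rule["pattern"]:                                # format validation
            bad = ~df[col].dropna().astype(str).str.match(rule["pattern"])
            if bad.any():
                errors.append(f"{col}: {int(bad.sum())} rows fail format check")
    if "member_id" in df.columns:                          # referential integrity
        orphans = set(df["member_id"]) - reference_ids
        if orphans:
            errors.append(f"member_id: {len(orphans)} orphan keys")
    return errors
```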

Impacts and Achievements

  • Successfully transitioned reporting services, ensuring data integrity and comprehensive testing.
  • Developed a dynamic ETL script that simplified table creation and data population, earning praise from senior leadership.
  • Designed and implemented Azure Data Factory pipelines, enhancing data management efficiency.
  • Recognized by the client for delivering high-quality outputs in a short timeframe, strengthening client relationships.

Hyperspectral Imagery Project for Solar Panel Optimization - RAA (Client)

  • Implemented advanced image processing techniques and machine learning algorithms to analyze multidimensional hyperspectral data, enhancing the decision-making process for solar panel installations with insights on environmental factors, roof material properties, and temporal changes (a simplified index computation is sketched after this list).
  • Contributed to the development and maintenance of a scalable and secure backend infrastructure utilizing a suite of AWS services (API Gateway, Lambda, S3, DynamoDB, Cognito, SageMaker) for efficient data management, model deployment, and user authentication.
  • Engaged in collaborative team efforts including problem-solving sessions, code reviews, and model iteration, driving continuous improvement and ensuring high standards of code quality and project documentation.
  • Supported the documentation process of methodologies, models, and testing outcomes, ensuring the project's approaches and results were replicable, well-understood, and accessible for future reference and scaling.
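
For illustration, a minimal sketch of deriving a normalized-difference index from a hyperspectral cube. The band positions and cube dimensions are hypothetical stand-ins; the project's actual band selection and models are not reproduced here.

```python
# Sketch: per-pixel spectral index from a hyperspectral cube
# (band indices and cube shape are placeholders).
import numpy as np

def nd_index(cube: np.ndarray, nir_band: int = 90, red_band: int = 55) -> np.ndarray:
    """cube has shape (rows, cols, bands); returns a per-pixel index map."""
    nir = cube[:, :, nir_band].astype(np.float64)
    red = cube[:, :, red_band].astype(np.float64)
    return (nir - red) / (nir + red + 1e-9)  # epsilon guards against divide-by-zero

# Stand-in cube: 64x64 pixels with 120 spectral bands.
cube = np.random.rand(64, 64, 120)
print(nd_index(cube).shape)  # (64, 64)
```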

Senior Cloud Data Engineer

PwC Australia
Adelaide, SA
01.2022 - 01.2024

Sustainability Insights Platform:

  • Led a team to architect and build the platform's core engine, transforming activity data into actionable ESG dimensions.
  • Successfully implemented Delta Lake as our data storage solution, bolstering the capabilities of our data lake and enabling efficient Change Data Capture (CDC) handling.
  • Collaborated closely with cross-functional teams, including Data Scientists, Data Engineers, and Business Analysts, to identify data quality requirements and incorporate them into the data pipeline architecture.
  • Devised a flexible data ingestion system compatible with various sources such as databases and SFTP servers, capable of processing both structured and unstructured files.
  • Implemented a robust data governance framework using AWS Lake Formation, enabling precise control and management of granular-level data access and ensuring data security and compliance.
  • Designed & developed a scalable data model, enabling real-time dashboards in AWS QuickSight & Power BI for data-driven decision-making.
  • Integrated AWS Textract for parsing PDFs, enabling the capture of high-grade data.
  • Engineered a configurable data generator tool to generate domain-specific data as per user requirements.
  • Orchestrated the entire data pipeline using Amazon Managed Workflows for Apache Airflow (MWAA), crafting effective DAG scripts for optimized workflow management (a minimal DAG is sketched below).
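
For illustration, a minimal MWAA-style DAG showing an ingest, transform, load ordering. Task names and callables are hypothetical placeholders, not the project's actual DAGs.

```python
# Minimal Airflow DAG sketch: ingest -> transform -> load.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():    print("pull activity data from source systems")
def transform(): print("map activity data to ESG dimensions")
def load():      print("publish curated tables for dashboards")

with DAG(
    dag_id="sustainability_insights",   # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_ingest = PythonOperator(task_id="ingest", python_callable=ingest)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_ingest >> t_transform >> t_load
```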

Impacts & Achievements:

  • Pioneered the use of multi-threading in PDF extraction via Textract, reducing processing time from 15 minutes to 3 minutes for 120 documents (the pattern is sketched below).
  • Recognized for outstanding problem-solving skills and proactive approach in resolving complex data quality issues, positively impacting the efficiency of the data pipeline and downstream processes.
  • Commended by the Engagement Leader for creating a versatile Data Generator tool, serving as a crucial element in building data-driven application POCs.
  • Developed a standalone emission calculator in AWS Lambda, contributing to the understanding and reduction of carbon emissions.
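
For illustration, the thread-pool pattern behind the Textract speed-up: synchronous Textract calls are I/O-bound, so running them concurrently cuts wall-clock time roughly in proportion to pool size. Bucket and key names are hypothetical; multipage PDFs would go through Textract's asynchronous APIs rather than the synchronous call shown here.

```python
# Sketch of multi-threaded Textract extraction (names are placeholders).
from concurrent.futures import ThreadPoolExecutor
import boto3

textract = boto3.client("textract")

def extract_text(key: str) -> str:
    # Synchronous call; suitable for single-page documents stored in S3.
    resp = textract.detect_document_text(
        Document={"S3Object": {"Bucket": "my-docs-bucket", "Name": key}}
    )
    return "\n".join(b["Text"] for b in resp["Blocks"] if b["BlockType"] == "LINE")

keys = [f"pages/doc_{i}.png" for i in range(120)]  # placeholder object keys
with ThreadPoolExecutor(max_workers=8) as pool:    # threads overlap network waits
    texts = list(pool.map(extract_text, keys))
```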

Customer Insights Engine:

  • Designed and developed a metadata-driven transformation pipeline as a PySpark AWS Glue job, with easy customization via a YAML file (see the sketch after this list).
  • Engineered a graph data builder to automate the creation of graph edges and nodes in AWS Neptune from a graph configuration file.
  • Orchestrated the Customer Insights Engine with AWS Step Functions, enabling seamless integration of all components.
  • Utilized PySpark to load data into AWS DynamoDB, feeding an ML model served from an AWS SageMaker endpoint.
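
For illustration, the core of a metadata-driven transformation step: a YAML document declares per-column expressions and the job applies them generically. The config schema and table names are hypothetical; the real engine ran as an AWS Glue PySpark job.

```python
# Sketch of YAML-driven PySpark transformations (config schema is assumed).
import yaml
import pyspark.sql.functions as F
from pyspark.sql import SparkSession

CONFIG = yaml.safe_load("""
source_table: customers_raw
target_table: customers_curated
transforms:
  - column: full_name
    expr: "concat(first_name, ' ', last_name)"
  - column: signup_year
    expr: "year(signup_date)"
""")

spark = SparkSession.builder.appName("metadata_driven_etl").getOrCreate()
df = spark.table(CONFIG["source_table"])
for t in CONFIG["transforms"]:                       # apply each declared expression
    df = df.withColumn(t["column"], F.expr(t["expr"]))
df.write.mode("overwrite").saveAsTable(CONFIG["target_table"])
```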

Impacts:

  • Created standalone, industry-agnostic components, with market potential for the firm.
  • Facilitated diverse application development via a data and config-driven approach.
  • Improved engine execution efficiency through dynamic operations based on FULL or DELTA load.

DocMap:

  • Developed an inference pipeline using S3, AWS Step Functions, Docker services, and Python to classify finance documents in the banking sector.
  • Built preprocessing scripts to gather and clean data, addressing the finance sector's challenges.

Achievements:

  • Recognized by the project Data Scientist for creating a unique ID system enabling easy traceability to raw documents after classification (one possible shape is sketched below).
  • Developed a visualization demonstrating time efficiency from tokenization to classification, acknowledged by the Managing Director.
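
One plausible shape for such a traceability ID, as a minimal sketch: hashing a document's raw bytes yields a deterministic ID that links any classified record back to its source file. The actual scheme used on the project is not detailed here.

```python
# Sketch: deterministic document ID from a content hash (illustrative only;
# the project's real ID scheme may have differed).
import hashlib
from pathlib import Path

def document_id(path: Path) -> str:
    # Identical bytes always produce the same ID, so classified outputs
    # can be traced back to the exact raw document.
    return hashlib.sha256(path.read_bytes()).hexdigest()[:16]
```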

Software Developer

Siemens Technology and Services
Bengaluru, Karnataka
01.2018 - 07.2019

SINAMICS G120X:

  • Developed the Field Device Integration (FDI) package for the SIMATIC PCS7 application, facilitating configuration and commissioning of G120X hardware.
  • Engineered key features of Application Start-up and Control Status Word-I/II in the G120X FDI package v2.2.
  • Created user-friendly interface plugins using HTML and CSS, enhancing product usability.
  • Designed and implemented the DictGenerator tool, consolidating dictionary content from multiple language files into a single, accessible file.

Accomplishments:

  • Conceived and introduced a real-time drive status LED in the Application Start-up feature, recognized by the Product Manager as a significant industrial application use case.

Transformer:

  • Developed a Python-based tool for log analysis, transforming XML log files into concise, informative reports on test case status (a simplified version is sketched after this list).
  • Utilized the xmlschema and regex libraries for data wrangling and output generation in Excel format.
  • Managed daily task allocation within the team, driving optimal productivity.
  • Collaborated within an Agile Development Environment, spanning from requirement analysis to delivery phase.
  • Contributed to test case design using pytest, reinforcing development quality and reliability.
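
For illustration, a minimal version of the log-to-report idea: parse test-case elements out of an XML log and write a status summary to Excel. Tag and attribute names are hypothetical, and pandas' to_excel requires openpyxl to be installed.

```python
# Sketch of XML log summarization (element/attribute names are assumptions).
import xml.etree.ElementTree as ET
import pandas as pd

def summarize(log_path: str, out_path: str) -> None:
    root = ET.parse(log_path).getroot()
    rows = [
        {"test_case": tc.get("name"), "status": tc.get("status")}
        for tc in root.iter("testcase")        # one row per test-case element
    ]
    pd.DataFrame(rows).to_excel(out_path, index=False)
```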

Education

Master of Science - Data Science

Monash University
Melbourne, VIC
07.2021

Bachelor of Science - Computer Science & Technology

B.M.S Institute of Technology
Bangalore, India
07.2018

Skills

  • Programming Languages: Python, R, C#
  • Big Data Processing: PySpark, Kafka
  • Query Language: SQL
  • Business Intelligence Tools: Tableau, Excel, Power BI
  • Web Design: HTML, CSS, JavaScript
  • Cloud Platforms: AWS (S3, CDK, Step Functions, Glue, Lambda, DynamoDB, SageMaker, Airflow), Azure (ADF, Azure SQL DB)
  • Version Control: Git, Bitbucket
  • Software IDEs: PyCharm, VS Code, Jupyter Notebook
  • Data Engineering: Databricks, SQL Server Integration Services (SSIS)
  • Development Practices: Agile, TDD, CI/CD

Certification

  • Certified Scrum Master (CSM)
  • Databricks Data Engineer Associate
  • Fundamentals of the Databricks Lakehouse Platform Accreditation
  • Alteryx Designer Core certification

Projects

RecycleHero:

  • Played a crucial role as a Data Engineer in developing a sustainability-focused website that guides international students on electronic waste disposal in Melbourne.
  • Created Python scripts for automated data fetching, cleaning, and output generation in Excel, ensuring up-to-date information for users.
  • Designed test cases and hosted the website on an AWS cloud server.
  • Won the 'Best Application' award at the Monash University Expo 2021 for 'End Plastic Waste'.

Suicide Cases in India:

  • Independently designed a project analyzing suicide cases in India, performing data accumulation, cleaning, and structuring.
  • Leveraged data visualization for deeper insights and developed a linear regression model to predict future suicide cases (a minimal sketch follows below).
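
For illustration, a minimal sketch of the forecasting idea: fit a linear trend on yearly counts and extrapolate one year ahead. The numbers below are placeholders, not the project's data.

```python
# Sketch: linear-trend extrapolation with scikit-learn (placeholder data).
import numpy as np
from sklearn.linear_model import LinearRegression

years = np.array([[2012], [2013], [2014], [2015], [2016]])
cases = np.array([100.0, 98.5, 97.0, 97.8, 96.1])  # placeholder yearly counts

model = LinearRegression().fit(years, cases)
print(model.predict([[2017]]))  # extrapolated estimate for the next year
```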
