NITHIN KRISHNA ENUGURTHI

Adelaide, SA

Summary

A solution-focused data consultant with over 4 years of industry experience, specializing in cloud-based data solutions, data pipelines, and data quality assurance. Demonstrated expertise in Python, R, SQL, PySpark, and AWS, with a track record of designing and implementing end-to-end data transformation solutions that drive business performance. Skilled at collaborating in Agile and Scrum environments, with the ability to liaise effectively with multidisciplinary teams and senior executives.

Overview

  • 7 years of professional experience
  • 1 certification

Work History

Data Engineer

Bailey Abbott
Adelaide, SA
02.2024 - Current

LendFast Data Ingestion and Reporting - Credit Union SA (Client)

  • Seamlessly transitioned reporting services from the Symtrix application to LendFast, focusing on data lineage, reporting development, and comprehensive testing for Credit Union SA (CUSA).
  • Developed an SSIS package to ingest data from flat (CSV) files on the network into tables within the on-premises data warehouse.
  • Designed and implemented an ETL pipeline based on the medallion architecture, incorporating landing and staging phases.
  • Created and maintained a clear audit trail for all ingestion operations.
  • Designed and implemented data integrity checks, including datatype verification, referential integrity, nullable fields, and format validations, as part of the pipeline (illustrated in the sketch after this list).
  • Designed the data model for LendFast, ensuring support for downstream reporting while incorporating Slowly Changing Dimension (SCD) Type 2 changes.
  • Developed and tested scripts for the entire pipeline and data ingestion process.
  • Drafted comprehensive technical documentation, including deployment and maintenance guides.
  • Designed and developed Azure Data Factory pipelines for both full and delta data loads from on-premises to Azure SQL Database.
  • Implemented the entire data pipeline orchestration using a SQL Server Agent job, from ingestion to transformation and loading into Azure SQL DB.
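
For illustration, a minimal sketch of the kind of integrity checks referenced above, written in Python with pandas. The column names and rule set are hypothetical, and the production checks ran inside the SSIS/T-SQL pipeline rather than in pandas.

```python
# Sketch of staging-layer integrity checks (column names and rules are
# hypothetical; the production pipeline implemented these in SSIS/T-SQL).
import pandas as pd

RULES = {
    "loan_id":    {"dtype": "int64",  "nullable": False, "pattern": None},
    "member_id":  {"dtype": "int64",  "nullable": False, "pattern": None},
    "start_date": {"dtype": "object", "nullable": False, "pattern": r"^\d{4}-\d{2}-\d{2}$"},
}

def validate(df: pd.DataFrame, reference_ids: set) -> list[str]:
    """Return human-readable violations for the audit trail."""
    errors = []
    for col, rule in RULES.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
            continue
        if str(df[col].dtype) != rule["dtype"]:            # datatype verification
            errors.append(f"{col}: expected {rule['dtype']}, got {df[col].dtype}")
        if not rule["nullable"] and df[col].isna().any():  # nullable-field check
            errors.append(f"{col}: unexpected NULLs")
        if rule["pattern"]:                                # format validation
            bad = ~df[col].dropna().astype(str).str.match(rule["pattern"])
            if bad.any():
                errors.append(f"{col}: {int(bad.sum())} rows fail format check")
    if "member_id" in df.columns:                          # referential integrity
        orphans = set(df["member_id"]) - reference_ids
        if orphans:
            errors.append(f"member_id: {len(orphans)} orphan keys")
    return errors
```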

Impacts and Achievements

  • Successfully transitioned reporting services, ensuring data integrity and comprehensive testing.
  • Developed a dynamic ETL script that simplified table creation and data population, earning praise from senior leadership.
  • Designed and implemented Azure Data Factory pipelines, enhancing data management efficiency.
  • Recognized by the client for delivering high-quality outputs in a short timeframe, strengthening client relationships.

Hyperspectral Imagery Project for Solar Panel Optimization - RAA (Client)

  • Implemented advanced image processing techniques and machine learning algorithms to analyze multidimensional hyperspectral data, enhancing the decision-making process for solar panel installations with insights on environmental factors, roof material properties, and temporal changes (a simplified index computation is sketched after this list).
  • Contributed to the development and maintenance of a scalable and secure backend infrastructure utilizing a suite of AWS services (API Gateway, Lambda, S3, DynamoDB, Cognito, SageMaker) for efficient data management, model deployment, and user authentication.
  • Engaged in collaborative team efforts including problem-solving sessions, code reviews, and model iteration, driving continuous improvement and ensuring high standards of code quality and project documentation.
  • Supported the documentation process of methodologies, models, and testing outcomes, ensuring the project's approaches and results were replicable, well-understood, and accessible for future reference and scaling.
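
For illustration, a minimal sketch of deriving a normalized-difference index from a hyperspectral cube. The band positions and cube dimensions are hypothetical stand-ins; the project's actual band selection and models are not reproduced here.

```python
# Sketch: per-pixel spectral index from a hyperspectral cube
# (band indices and cube shape are placeholders).
import numpy as np

def nd_index(cube: np.ndarray, nir_band: int = 90, red_band: int = 55) -> np.ndarray:
    """cube has shape (rows, cols, bands); returns a per-pixel index map."""
    nir = cube[:, :, nir_band].astype(np.float64)
    red = cube[:, :, red_band].astype(np.float64)
    return (nir - red) / (nir + red + 1e-9)  # epsilon guards against divide-by-zero

# Stand-in cube: 64x64 pixels with 120 spectral bands.
cube = np.random.rand(64, 64, 120)
print(nd_index(cube).shape)  # (64, 64)
```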

Senior Cloud Data Engineer

PwC Australia
Adelaide, SA
01.2022 - 01.2024

Sustainability Insights Platform:

  • Led a team to architect and build the platform's core engine, transforming activity data into actionable ESG dimensions.
  • Successfully implemented Delta Lake as our data storage solution, bolstering the capabilities of our data lake and enabling efficient Change Data Capture (CDC) handling.
  • Collaborated closely with cross-functional teams, including Data Scientists, Data Engineers, and Business Analysts, to identify data quality requirements and incorporate them into the data pipeline architecture.
  • Devised a flexible data ingestion system compatible with various sources such as databases and SFTP servers, capable of processing both structured and unstructured files.
  • Implemented a robust data governance framework using AWS Lake Formation, enabling precise control and management of granular-level data access and ensuring data security and compliance.
  • Designed & developed a scalable data model, enabling real-time dashboards in AWS QuickSight & Power BI for data-driven decision-making.
  • Integrated AWS Textract for parsing PDFs, enabling the capture of high-grade data.
  • Engineered a configurable data generator tool to generate domain-specific data as per user requirements.
  • Orchestrated the entire data pipeline using Amazon Managed Workflows for Apache Airflow (MWAA), crafting effective DAG scripts for optimized workflow management (a minimal DAG is sketched below).
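
For illustration, a minimal MWAA-style DAG showing an ingest, transform, load ordering. Task names and callables are hypothetical placeholders, not the project's actual DAGs.

```python
# Minimal Airflow DAG sketch: ingest -> transform -> load.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():    print("pull activity data from source systems")
def transform(): print("map activity data to ESG dimensions")
def load():      print("publish curated tables for dashboards")

with DAG(
    dag_id="sustainability_insights",   # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_ingest = PythonOperator(task_id="ingest", python_callable=ingest)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_ingest >> t_transform >> t_load
```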

Impacts & Achievements:

  • Pioneered the use of multi-threading in PDF extraction via Textract, reducing processing time from 15 minutes to 3 minutes for 120 documents (the pattern is sketched below).
  • Recognized for outstanding problem-solving skills and proactive approach in resolving complex data quality issues, positively impacting the efficiency of the data pipeline and downstream processes.
  • Commended by the Engagement Leader for creating a versatile Data Generator tool, serving as a crucial element in building data-driven application POCs.
  • Developed a standalone emission calculator in AWS Lambda, contributing to the understanding and reduction of carbon emissions.
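
For illustration, the thread-pool pattern behind the Textract speed-up: synchronous Textract calls are I/O-bound, so running them concurrently cuts wall-clock time roughly in proportion to pool size. Bucket and key names are hypothetical; multipage PDFs would go through Textract's asynchronous APIs rather than the synchronous call shown here.

```python
# Sketch of multi-threaded Textract extraction (names are placeholders).
from concurrent.futures import ThreadPoolExecutor
import boto3

textract = boto3.client("textract")

def extract_text(key: str) -> str:
    # Synchronous call; suitable for single-page documents stored in S3.
    resp = textract.detect_document_text(
        Document={"S3Object": {"Bucket": "my-docs-bucket", "Name": key}}
    )
    return "\n".join(b["Text"] for b in resp["Blocks"] if b["BlockType"] == "LINE")

keys = [f"pages/doc_{i}.png" for i in range(120)]  # placeholder object keys
with ThreadPoolExecutor(max_workers=8) as pool:    # threads overlap network waits
    texts = list(pool.map(extract_text, keys))
```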

Customer Insights Engine:

  • Designed and developed a metadata-driven transformation pipeline as a PySpark AWS Glue job, with easy customization via a YAML file (see the sketch after this list).
  • Engineered a graph data builder to automate the creation of graph edges and nodes in AWS Neptune from a graph configuration file.
  • Orchestrated the Customer Insights Engine with AWS Step Functions, enabling seamless integration of all components.
  • Utilized PySpark to load data into AWS DynamoDB, feeding an ML model served from an AWS SageMaker endpoint.
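
For illustration, the core of a metadata-driven transformation step: a YAML document declares per-column expressions and the job applies them generically. The config schema and table names are hypothetical; the real engine ran as an AWS Glue PySpark job.

```python
# Sketch of YAML-driven PySpark transformations (config schema is assumed).
import yaml
import pyspark.sql.functions as F
from pyspark.sql import SparkSession

CONFIG = yaml.safe_load("""
source_table: customers_raw
target_table: customers_curated
transforms:
  - column: full_name
    expr: "concat(first_name, ' ', last_name)"
  - column: signup_year
    expr: "year(signup_date)"
""")

spark = SparkSession.builder.appName("metadata_driven_etl").getOrCreate()
df = spark.table(CONFIG["source_table"])
for t in CONFIG["transforms"]:                       # apply each declared expression
    df = df.withColumn(t["column"], F.expr(t["expr"]))
df.write.mode("overwrite").saveAsTable(CONFIG["target_table"])
```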

Impacts:

  • Created standalone, industry-agnostic components, with market potential for the firm.
  • Facilitated diverse application development via a data and config-driven approach.
  • Improved engine execution efficiency through dynamic operations based on FULL or DELTA load.

DocMap:

  • Developed an inference pipeline using S3, AWS Step Functions, Docker services, and Python to classify finance documents in the banking sector.
  • Built preprocessing scripts to gather and clean data, addressing the finance sector's challenges.

Achievements:

  • Recognized by the project Data Scientist for creating a unique ID system enabling easy traceability to raw documents after classification (one possible shape is sketched below).
  • Developed a visualization demonstrating time efficiency from tokenization to classification, acknowledged by the Managing Director.
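
One plausible shape for such a traceability ID, as a minimal sketch: hashing a document's raw bytes yields a deterministic ID that links any classified record back to its source file. The actual scheme used on the project is not detailed here.

```python
# Sketch: deterministic document ID from a content hash (illustrative only;
# the project's real ID scheme may have differed).
import hashlib
from pathlib import Path

def document_id(path: Path) -> str:
    # Identical bytes always produce the same ID, so classified outputs
    # can be traced back to the exact raw document.
    return hashlib.sha256(path.read_bytes()).hexdigest()[:16]
```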

Software Developer

Siemens Technology and Services
Bengaluru, Karnataka
01.2018 - 07.2019

SINAMICS G120X:

  • Developed the Field Device Integration (FDI) package for the SIMATIC PCS7 application, facilitating configuration and commissioning of G120X hardware.
  • Engineered key features of Application Start-up and Control Status Word-I/II in the G120X FDI package v2.2.
  • Created user-friendly interface plugins using HTML and CSS, enhancing product usability.
  • Designed and implemented the DictGenerator tool, consolidating dictionary content from multiple language files into a single, accessible file.

Accomplishments:

  • Conceived and introduced a real-time drive status LED in the Application Start-up feature, recognized by the Product Manager as a significant industrial application use case.

Transformer:

  • Developed a Python-based tool for log analysis, transforming XML log files into concise, informative reports on test case status (a simplified version is sketched after this list).
  • Utilized the xmlschema and regex libraries for data wrangling and output generation in Excel format.
  • Managed daily task allocation within the team, driving optimal productivity.
  • Collaborated within an Agile Development Environment, spanning from requirement analysis to delivery phase.
  • Contributed to test case design using pytest, reinforcing development quality and reliability.
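
For illustration, a minimal version of the log-to-report idea: parse test-case elements out of an XML log and write a status summary to Excel. Tag and attribute names are hypothetical, and pandas' to_excel requires openpyxl to be installed.

```python
# Sketch of XML log summarization (element/attribute names are assumptions).
import xml.etree.ElementTree as ET
import pandas as pd

def summarize(log_path: str, out_path: str) -> None:
    root = ET.parse(log_path).getroot()
    rows = [
        {"test_case": tc.get("name"), "status": tc.get("status")}
        for tc in root.iter("testcase")        # one row per test-case element
    ]
    pd.DataFrame(rows).to_excel(out_path, index=False)
```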

Education

Master of Science - Data Science

Monash University
Melbourne, VIC
07.2021

Bachelor of Science - Computer Science & Technology

B.M.S Institute of Technology
Bangalore, India
07.2018

Skills

  • Programming Languages: Python, R, C#
  • Big Data Processing: PySpark, Kafka
  • Query Language: SQL
  • Business Intelligence Tools: Tableau, Excel, Power BI
  • Web Design: HTML, CSS, JavaScript
  • Cloud Platforms: AWS (S3, CDK, Step Functions, Glue, Lambda, DynamoDB, SageMaker, Airflow), Azure (ADF, Azure SQL DB)
  • Version Control: Git, Bitbucket
  • Software IDEs: PyCharm, VS Code, Jupyter Notebook
  • Data Engineering: Databricks, SQL Server Integration Services (SSIS)
  • Development Practices: Agile, TDD, CI/CD

Certification

  • Certified Scrum Master (CSM)
  • Databricks Data Engineer Associate
  • Fundamentals of the Databricks Lakehouse Platform Accreditation
  • Alteryx Designer Core certification

Projects

RecycleHero:

  • Played a crucial role as a Data Engineer in developing a sustainability-focused website that guides international students on electronic waste disposal in Melbourne.
  • Created Python scripts for automated data fetching, cleaning, and output generation in Excel, ensuring up-to-date information for users.
  • Designed test cases and hosted the website on an AWS cloud server.
  • Won the 'Best Application' award at the Monash University Expo 2021 for 'End Plastic Waste'.

Suicide Cases in India:

  • Independently designed a project analyzing suicide cases in India, performing data accumulation, cleaning, and structuring.
  • Leveraged data visualization for deeper insights and developed a linear regression model to predict future suicide cases (a minimal sketch follows below).
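
For illustration, a minimal sketch of the forecasting idea: fit a linear trend on yearly counts and extrapolate one year ahead. The numbers below are placeholders, not the project's data.

```python
# Sketch: linear-trend extrapolation with scikit-learn (placeholder data).
import numpy as np
from sklearn.linear_model import LinearRegression

years = np.array([[2012], [2013], [2014], [2015], [2016]])
cases = np.array([100.0, 98.5, 97.0, 97.8, 96.1])  # placeholder yearly counts

model = LinearRegression().fit(years, cases)
print(model.predict([[2017]]))  # extrapolated estimate for the next year
```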
