
Sri Harsha Vunnam

Sydney

Summary

Data enthusiast with 5 years of Information Technology experience and sound knowledge of data engineering, data analytics, cloud, data science, big data, SQL, software development methodologies, data cleaning, data governance, and data management.

Overview

5 years of professional experience

Work History

Senior Data Engineer

IPG Mediabrands: Kinesso
09.2024 - Current
  • Designed, built, and maintained ETL pipelines to extract, transform, and load API data from multiple social and AdTech platforms, including Google, Meta, Snapchat, Pinterest, and Yahoo, into Snowflake.
  • Developed data models and warehouse structures in Snowflake, ensuring efficient storage and optimized query performance for large-scale marketing data.
  • Automated data ingestion workflows using Apache Airflow and Kubernetes, ensuring seamless and timely data updates.
  • Implemented data transformation processes in Python and SQL, standardizing and enriching raw API data for analytics and reporting.
  • Led data quality initiatives, troubleshooting inconsistencies, fixing errors, and ensuring accurate, reliable datasets for marketing performance analysis.
  • Integrated and centralized AdTech and social media marketing data, enabling BI Engineers and Data Scientists to derive insights and optimize campaigns.
  • Collaborated with internal and external stakeholders, translating business requirements into scalable data solutions.
  • Built and optimized CI/CD pipelines using GitHub Actions, Docker, and shell scripting to streamline ETL development and deployment.
  • Leveraged AWS services (S3, Lambda, EC2) to develop cloud-based data solutions, supporting high-volume API data ingestion.
  • Tuned SQL queries and managed schema design to enhance data accessibility and reduce query latency.
  • Provided mentorship to junior engineers, promoting best practices in ETL development and API integrations.
  • Presented data insights, dashboards, and performance reports to clients and senior leadership, helping bridge the gap between data and business strategy.
  • Managed multiple projects simultaneously, ensuring efficient and scalable data pipelines for marketing and analytics teams.

Data Engineer

Australian Payments Plus
10.2023 - Current
  • Built data pipelines to ingest and transform data into Snowflake.
  • Ensured the integrity (accuracy, consistency, and completeness) of the data available on the data platform.
  • Transformed raw data into a structured format suitable for analysis and reporting; developed ETL (Extract, Transform, Load) processes to cleanse, enrich, and transform data.
  • Data modelling and transformation: used dbt (data build tool) to build scripts and models for Snowflake tables, creating efficient data transformation pipelines with dbt's features.
  • Built and maintained integrations between the data lake (AWS S3) and Snowflake for real-time streaming.
  • Developed and maintained Python and SQL scripts to integrate data pipelines and the Snowflake account with the Dynatrace application via REST API calls, enabling efficient alerting and monitoring.
  • Designed and implemented an automated data transfer solution using AWS Lambda and AWS S3 to seamlessly move files from the data lake to an S3 bucket on arrival.
  • Secured data at rest and in transit through encryption.
  • Implemented data validation alerts on pipelines to ensure data integrity and quality, enhancing the reliability of data processing workflows.
  • Improved the BI architecture, emphasizing data security, data quality, timeliness, scalability, and extensibility.

Senior Consultant

Capgemini
01.2021 - 09.2023

Assisted in the development and implementation of various ETL pipelines.
• Conducted data cleaning and preparation tasks while coordinating with multiple teams to develop data pipelines that improve data quality and accessibility.
• Developed an end-to-end data analytics framework using Amazon Redshift, Glue, and Lambda, enabling the business to obtain KPIs faster at reduced cost.
• Transformed raw data into a structured format suitable for analysis and reporting; developed ETL (Extract, Transform, Load) processes to cleanse, enrich, and transform data.
• Worked on the data modelling package, testing SQL and Python scripts in Jupyter notebooks before pushing them to the release artifact repository.
• Performed data ingestion into the S3 bucket; the data was then curated and moved downstream.
• Tested, debugged, diagnosed, and corrected errors and faults in application code within established testing protocols, guidelines, and quality standards to ensure data models and applications perform to specification.
• Curated newly ingested data in the S3 raw bucket, standardising the raw data into Parquet.
• Monitored data pipelines for issues in activity runs after data was ingested and curated.
• Troubleshot and patched infrastructure in production and non-production environments to keep it compliant for security purposes.
• Experienced with Jenkins and Airflow for data ingestion and data curation.
• Worked on AWS Glue tables to create schemas and move data from S3 buckets into Glue.

Analyst Engineer

ITIC - Systems, Education And Research
04.2020 - 01.2021

Used Python to scrape, clean, and analyse large datasets.
• Worked with large, complex datasets.
• Monitored data pipelines for issues in activity runs.
• Implemented various ETL processes and moved data into AWS Redshift.
• Experienced with big data querying tools such as Hive and HBase.
• Used MongoDB and HBase as NoSQL databases.
• Interpreted data, analysed results using statistical techniques, and designed reports, dashboards, and visualisations.
• Wrote SQL queries to extract and analyse complex datasets.
• Performed ETL using Python and built machine learning models.
• Used the NLTK library for data cleaning, stop word and punctuation removal, tokenisation, stemming, and lemmatisation.
• Built TF-IDF features using NLTK and performed sentiment analysis using neural networks.
• Applied pre-trained models such as VGG16 and ResNet, and the XGBoost algorithm, to improve model accuracy.
• Evaluated models using precision, recall, F1 score, AUC, and ROC.
• Performed EDA, feature extraction, and data visualisation, and built a model using a random forest regressor; feature selection was done with the extra trees regressor technique.
• Worked with image data for image recognition using machine learning and deep learning techniques.
• Frequently worked with libraries such as pandas, matplotlib, seaborn, scikit-learn, NumPy, Keras, fastai, and TensorFlow.

Education

Master of Data Science

Macquarie University
12.2020

Skills

  • Languages: Python, PySpark, SQL
  • Database technologies: Snowflake, PostgreSQL, MySQL, Amazon RDS
  • Cloud technologies: AWS, Azure
  • Big data technologies: Hadoop framework (HBase, Hive, Spark)
  • End-to-end experience with Airflow, dbt, and Jenkins
  • Visualisation tools: Power BI, Tableau, Looker
  • Others: GitHub version control, Unix shell scripting, HTML, CSS, MATLAB, CI/CD tools
  • Experience building data systems and pipelines (ETL/ELT procedures)
  • Worked on big data: data ingestion, data transformation, data validation, and data governance
