Kai Zhu

Melbourne, VIC

Summary

  • A data professional with 5 years of experience in data science and data engineering on Hadoop and GCP.
  • A team player with good communication and stakeholder management skills who can deliver fast.
  • A tech leader who breaks down big problems, designs innovative solutions and drives the team forward.

Cloud support professional skilled in managing, deploying, and troubleshooting cloud-based solutions. Strong focus on team collaboration and achieving results, and adaptable to changing needs. Proficient in AWS, Azure, and DevOps practices, with a keen ability to solve complex technical issues. Known for reliability, effective communication, and a results-driven approach.

Overview

8
years of professional experience

Work History

Cloud Support Engineer

AWS
07.2022 - Current
  • Conducted training sessions for junior team members and new hires, fostering a culture of continuous learning and skills development.
  • Assisted in the evaluation of emerging technologies, determining their potential benefits to company operations.
  • Managed multiple projects concurrently, ensuring timely completion within budget constraints.
  • Streamlined deployment processes with the integration of automation tools and best practices.

Machine Learning Engineer

ANZ Bank
01.2022 - 07.2022
  • Expense Prediction: The goal was to train a machine learning model that predicts future expenses from historical transactions, and to serve the model in Dataflow to make daily batch predictions. Key achievements:
  • Redesigned the integration test framework so that testers can define input data and expected outputs. Implemented it with pytest and Docker Compose to support testing both locally and in CI.
  • Designed and implemented new Dataflow steps that extract unique account numbers with predictions and publish them to Pub/Sub to notify the ANZ Plus app to refresh predictions.
  • Managed sprint planning sessions and the release process in the squad.

Data Engineer

ANZ Bank
04.2018 - 01.2022
  • Data Analytics Workbench: The goal was to build a data analytics workbench on GCP to enable data professionals to analyse data and train machine learning models in the cloud. Key achievements:
  • Designed and implemented a solution with Vertex AI Notebooks, BigQuery and GCS, and managed user access to BigQuery and GCS with service accounts and IAM roles. Successfully released the workbench to production, and the platform has been used by several teams.
  • Used Terraform to manage resource creation and destruction. Applied Kubernetes CronJobs with Workload Identity to control notebook instance start, stop and recreation operations.
  • Designed a monitoring and alerting solution for the entire tribe to enable centralised management of alerts and notifications. Created scoping projects to monitor all infra projects using metrics scopes. Logging metrics were defined in the infra repos, while notification channels and alert policies were defined in the monitoring repo.
  • Managed weekly releases, including checking that all required SIT testing was complete, obtaining approvals from product owners, documenting implementation steps and rollback strategies, and conducting technical and business verification to confirm that each release was successful.
  • Trained engineers from different backgrounds in general cloud knowledge, the solution design and the CI/CD process, enabling them to develop on GCP.
  • SING: The goal was to build a data pipeline that reads data from on-prem servers and feeds it to a machine learning app hosted on GKE, which serves as the backend of a feature in the ANZ App. Key achievements:
  • Implemented a data pipeline with GCS, BigQuery and Cloud SQL for storage, Dataflow and Kafka for data processing, and Airflow for orchestration. Deployed the solution to production, where it has been serving millions of ANZ App customers.
  • Led the load testing part of the project. Conducted load testing on the ML app with the Python Locust framework to meet the 100 requests/sec goal. Locust was containerised and deployed to OpenShift in master-worker mode.
  • Tuned the ML app, Kafka and Cloud SQL so that the pipeline can transfer millions of transactions from on-prem to the Cloud SQL Postgres database within one hour. The data transfer pipeline and ML app met performance requirements and have been working in production as expected.
  • Performed functional testing with Robot Framework (Python). Applied Kubernetes CronJobs with Workload Identity on GKE to run automated functional tests.
  • Set up Cloud Build pipelines to enable CI/CD. Created custom-built images and pushed them to GCP Container Registry for later use. Created Cloud Build triggers to enable automated testing and deployment.

Data Scientist

ANZ Bank
10.2017 - 10.2018
  • A-Z Review: The goal was to build a machine learning algorithm to predict which products a customer will choose based on their A-Z Review answers. Key achievements:
  • Analysed differences between system-recommended and banker-recommended products based on customer A-Z Review answers, using PySpark RDDs and DataFrames on Cloudera Data Science Workbench.
  • Trained a tree-based classification algorithm to predict which products a customer will choose based on their A-Z Review answers.
  • Built a demo website using the Django web framework to display A-Z Review questions and recommended products.
  • Sentiment Analysis of Project Progress Comments: The goal was to apply natural language processing techniques on weekly project progress comments to determine the status of a project. Key achievements:
  • Cleaned project progress comments by expanding contractions, deleting non-alphanumeric characters and stop words, and stemming words.
  • Applied part-of-speech tagging and noun phrase extraction to obtain the most frequent words and phrases, and built a customized sentiment lexicon.
  • Implemented sentiment analysis with the customized lexicon on cleaned comments to output a positive/negative indicator of project health.
  • Presented a demo and business use cases on Cloudera Distributed Hadoop to other data-related teams.

Software Engineer

ANZ Bank
10.2018 - 04.2018
  • Reskilling Program: The goal was to train ANZ non-engineering staff in engineering skills. Key achievements:
  • Completed a 3-month full-time Java software development training course at CoderAcademy.
  • Built a web app with HTML/CSS/JavaScript for the front end and Java servlets for the backend.
  • Interacted with a MySQL database using Hibernate (Java).

Education

Ph.D. - Electrical Engineering

Swinburne University of Technology
06.2017

Master - Electrical Engineering

Shanghai Jiao Tong University
03.2012

Bachelor - Electrical Engineering

Shandong University
06.2009

Skills

  • Experienced in designing, developing and monitoring cloud-based machine learning solutions (Google Cloud Certified Professional Data Engineer, Google Cloud Certified Professional Cloud Architect)
  • Experienced in developing, testing (pytest, Locust, Robot) and deploying software with Git and CI/CD tools (Cloud Build, Bamboo, GitHub Actions, Harness)
  • Experienced in containerizing applications using Docker, Kubernetes (Certified Kubernetes Application Developer), Helm, Google Kubernetes Engine and OpenShift
  • Experienced in building cloud infra with Terraform (HashiCorp Certified Terraform Associate)
  • Proficient with Python and Bash scripting
  • Proficient with Airflow, Kafka, Beam/Dataflow, Postman, Machine Learning and Hadoop

Other Experience

  • Data Science Melbourne Datathon 2017 (Apr. 2017 – May 2017)
  • Worked on a data analysis project covering about 60 million transactions of patient drug-purchase history. Cleaned the data and studied seasonal patterns in drug demand, transactions and illnesses. Analysed the abnormal behaviour of the patient with the highest number of transactions.
  • Worked on a machine learning project to predict the probability of getting diabetes in 2016 based on drug purchase history before 2016. Analysed the relationship between diabetes and various features, e.g., gender, year of birth, postcode and ATC level code of drugs. Trained several machine learning models, such as random forests and XGBoost, to predict the probability of getting diabetes.

Languages

English
Full Professional
Chinese
Native or Bilingual
