Yuxin(Mia) Ma

Overview

3 years of professional experience

Work History

Data Engineer

Eratos
04.2024 - Current
  • Designed and optimized complex Entity Relationship Diagrams (ERD) and implemented scalable middleware APIs with Django REST Framework, integrating front-end applications with AWS-hosted PostgreSQL databases through ORM design
  • Optimized and parallelized processing of large-scale, cloud-based datasets using the Python packages Dask, Rasterio, and Xarray, resulting in a 6x reduction in processing time and significant improvements in data accuracy for complex ETL workflows
  • Implemented robust software engineering practices, including a testing framework, Docker containerization, CI/CD pipelines, and version control with GitHub, to ensure scalable and maintainable data science code
  • Collaborated with cross-functional teams and clients, coordinating efforts to implement end-to-end data pipelines, ensuring alignment with project goals and enhancing deliverable quality and code robustness

Research Data Analyst Intern

Walter and Eliza Hall Institute of Medical Research
02.2024 - 06.2024
  • Contributed to the clinical dashboard project: handled streaming patient data using Python and the REDCap database, and implemented an interactive dashboard with R Shiny
  • Implemented a MongoDB instance as an intermediary between the REDCap database and the dashboard, improving loading speed by 300%
  • Embedded geo-map, heat-map statistics, and Kaplan-Meier survival-rate visualisations on the dashboard, facilitating data exploration for administrators and clinicians

Data Science Consultant

Fyto Fire-Fighting AI
02.2023 - 12.2023
  • Gathered and curated data from authoritative government sources, including Daymet, USGS, Cal-Fire, and NASA, ensuring access to reliable and high-quality datasets for analysis
  • Used geospatial techniques to merge, preprocess, and integrate datasets, enabling comprehensive exploratory data analysis and visualisation with Python and ArcGIS
  • Achieved 90% accuracy by using ensemble boosting methods and a CNN to classify fire occurrence within specified timeframes and locations for residents in California
  • Collaborated with a global team of data scientists and domain experts, maintaining effective weekly communication and leveraging diverse perspectives to enhance model performance

Data Engineer

Johnson Controls
02.2022 - 01.2023
  • Implemented predictive maintenance for company-produced chillers using the Weibull distribution and survival analysis on Azure Databricks, achieving significant savings in maintenance resources and improved budget efficiency
  • Supervised the secure document transformation process for internal employees using natural language processing techniques, with the spaCy, Keras, and NLTK packages in Python
  • Managed weekly streaming data and seamlessly connected the data source with the Python script
  • Analysed and forecast time-series alarm data from physical smart sensors with univariate models and boosting methods in Python, to monitor alarm anomalies and potential activity trends
  • Developed an interactive dashboard using Plotly Dash and Power BI for the alarm monitoring system above, to present data-driven insights to business leaders and maintenance workers
  • Extracted, transformed, and loaded industrial data from various sources, including MS SQL, Oracle DB, and Hive on Apache Hadoop, and automated the ETL process with Azure Data Factory pipelines
  • Developed predictive models using Python and R to enhance data-driven decision-making

Education

Master of Data Science

The University of Melbourne
07.2024

Bachelor of Science - Data Science Major

The University of Melbourne
12.2021

Skills

  • Technical Skills: Data Warehousing Management, ETL Development, Data Pipelines, Cloud Data Architecture, Data Analysis and Modelling, Statistical Modelling, Machine Learning, Deep Learning (TensorFlow & PyTorch)
  • Tools: Python, AWS, SQL (Oracle/Hive/Microsoft SQL Server), Git (GitHub), Bash Scripting, Microsoft Power BI, Microsoft Office, Microsoft Azure
  • Languages: English, Mandarin