~/rodrigoalmeida/projects/jua/ git:(main)

Project Overview

At Jua.ai, I worked on building a foundational AI model for the natural world. This involved managing massive geospatial datasets and creating high-resolution weather forecasting pipelines.

Key Contributions

  • Data Engineering: Managed the ingestion of historical weather observation data from more than 30 sources into a common data warehouse (> 500 TB), using Zarr and Parquet for efficient storage and access.
  • Data Quality: Led efforts to assess the quality of weather observation data and cross-validate it across sources.
  • ETL Pipelines: Built and maintained live ETL pipelines for weather data using Prefect, deployed on GCP and AWS.
  • Model Downscaling: Developed a deep learning pipeline, built on Zarr and Dask, that downscales global weather forecasts to 1 × 1 km resolution and runs four times daily.
  • Leadership: Led a team of 2 engineers, bridging the gap between technical implementation and product requirements.

Press & Publications