Collaborated with research scientists and stakeholders to optimize machine learning algorithms for high-performance computing environments, reducing model training time by 20% on multi-core architectures
Improved the deployment efficiency of ML models on GPU clusters by integrating PyTorch's DistributedDataParallel and ONNX Runtime, resulting in a 17% improvement in processing speed for real-time climate predictions
Curated high-volume outputs from 50+ years of CSEM2 climate simulations, covering both historical data and future projections
Crafted ETL pipelines using Dask to handle terabytes of proprietary data, reducing processing time by 60% through effective resource allocation and model parallelism
Applied regression analysis and decision trees to climate prediction models, improving accuracy and reducing the false-positive rate of real-time predictions by 25%