📍 Santiago, Chile | 🤖 AI & Data Engineer | ☁️ Cloud Data Engineer
Engineering scalable Data Architectures & Production-Grade AI Systems. Focused on Cloud Efficiency (FinOps), Governance, and Automating the Boring Stuff.
📉 90% Cost/Time Reduction: Automated the processing of unstructured HR data with GenAI (GPT-4) and Whisper, cutting handling time from 20 minutes to 2.
⚡ 95% Workflow Optimization: Re-engineered legacy manual processes into automated Python/FastAPI ETL pipelines.
🎯 97% Model Accuracy: Developed XGBoost predictive models for fuel consumption, validated with rigorous MLOps practices.
🏦 CL-RiskEngine | Financial & High-Performance Computing
Engineered a distributed Monte Carlo risk engine utilizing Hexagonal Architecture, Ray, and a Medallion Lakehouse for high-performance financial modeling.
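As a rough illustration of the Monte Carlo idea behind a risk engine like this, the sketch below estimates a one-year Value at Risk for a single position. It is not the project's code: the geometric Brownian motion model, the parameter values, and the function names are assumptions made for the example, and it runs on one process with the standard library rather than on Ray.

```python
import math
import random


def simulate_terminal_values(s0, mu, sigma, horizon_years, n_paths, seed=0):
    """Simulate terminal position values under geometric Brownian motion (assumed model)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * horizon_years
    vol = sigma * math.sqrt(horizon_years)
    return [s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) for _ in range(n_paths)]


def value_at_risk(s0, terminal_values, confidence=0.95):
    """Empirical VaR: the loss exceeded in roughly (1 - confidence) of the paths."""
    losses = sorted(s0 - v for v in terminal_values)
    index = min(int(confidence * len(losses)), len(losses) - 1)
    return losses[index]


# Hypothetical inputs: 100-unit position, 5% drift, 20% annual volatility.
values = simulate_terminal_values(s0=100.0, mu=0.05, sigma=0.2, horizon_years=1.0, n_paths=10_000)
var_95 = value_at_risk(100.0, values, confidence=0.95)
var_99 = value_at_risk(100.0, values, confidence=0.99)
print(f"95% VaR: {var_95:.2f} | 99% VaR: {var_99:.2f}")
```

Because each path is independent, the simulation loop is the part that a distributed framework would split across workers.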
🚢 NavOptima | MLOps for Maritime Industry
Deployed an end-to-end MLOps solution for maritime route optimization, integrating real-time predictive modeling with Apache Airflow and MLflow orchestration.
🚲 Modern Data Platform | Data Engineering & IaC
Architected a scalable GCP data platform using Terraform for IaC, optimizing SQL modeling to reduce query latency by 54%.
⚖️ Credit-Model-Benchmark | Supervised Learning Optimization
Implemented a comparative pipeline tuning hyperparameters via GridSearchCV for SVM (RBF kernel), KNN, and LogReg, standardizing performance metrics.
Stack: Scikit-learn, Hyperparameter Tuning, SVM (RBF), GridSearchCV, Model Evaluation
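A minimal sketch of what such a comparative tuning loop can look like with scikit-learn. The synthetic dataset, the parameter grids, and the choice of F1 as the shared metric are assumptions for illustration, not the values used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the credit data (assumption, not the project's dataset).
X, y = make_classification(n_samples=400, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grids; each model is tuned inside the same scaling pipeline.
candidates = {
    "svm_rbf": (SVC(kernel="rbf"), {"model__C": [0.1, 1, 10], "model__gamma": ["scale", 0.01, 0.1]}),
    "knn": (KNeighborsClassifier(), {"model__n_neighbors": [3, 5, 11]}),
    "logreg": (LogisticRegression(max_iter=1000), {"model__C": [0.1, 1, 10]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="f1")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_f1": search.best_score_,
        # With scoring="f1", GridSearchCV.score reports F1 on the held-out split.
        "test_f1": search.score(X_test, y_test),
    }

for name, summary in results.items():
    print(name, summary)
```

Keeping the scaler inside the pipeline means it is refit on each cross-validation fold, so the validation folds never leak into the scaling statistics.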
📉 Regression-Model-Benchmark | ML Engineering Pipelines
Architected a modular training pipeline using Scikit-learn wrappers to automate hyperparameter tuning (RandomizedSearchCV) across SVR and Tree-based models.
Stack: ML Pipelines, Hyperparameter Tuning, RobustScaler, SVR, RandomizedSearchCV
💳 Credit-Risk-Scoring | Financial Modeling
Engineered a statistical approval model utilizing SMOTE for class imbalance and PCA for dimensionality reduction, validated via rigorous hypothesis testing.
Stack: Logistic Regression, Hypothesis Testing, SMOTE, PCA, Statsmodels, Inference
🫁 PM10-Mortality-Analytics | Public Health Data Science
Processed geospatial PM10 telemetry and correlated it with mortality rates to visualize linear trends in environmental health risks.
Stack: Linear Regression, ETL Pipeline, Correlation Analysis, Data Normalization, Geospatial Aggregation
🌊 ENSO-Climate-Forecasting | Climate Predictive Modeling
Developed a predictive system for ENSO cycles by processing temporal oceanographic data and extracting seasonal trend components.
Stack: Time Series Analysis, Forecasting, Data Wrangling, Seasonality Extraction, Statistical Modeling
🍷 Red-Wine-Analysis | Statistical Inference
Validated physicochemical differences between wine strains using hypothesis testing and modeled quality drivers via multiple regression analysis.
Stack: Hypothesis Testing, OLS Regression, Statsmodels, SciPy, Inference
🥂 White-Wine-Quality | Advanced Statistical Modeling
Engineered a robust quality predictor using PCA for dimensionality reduction and GLM (Gamma) for non-normal distributions, validated via Mann-Whitney U tests.
Stack: PCA, GLM, Mann-Whitney U, Statsmodels, Scikit-learn
🏎️ MPG-Predictive-Modelling | Econometrics & Regression
Built a robust econometric model to predict vehicle MPG, utilizing OLS for parameter estimation and rigorous goodness-of-fit validation.
Stack: Multivariate Regression, Statsmodels, OLS, Hypothesis Testing, Data Visualization
🐧 Penguin-PCA-Analysis | Multivariate Analysis
Applied PCA for dimensionality reduction on the Palmer Penguins dataset, identifying key variance drivers through loading analysis and scree plots.
Stack: PCA, Dimensionality Reduction, Feature Extraction, Scikit-learn, Seaborn
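For readers unfamiliar with loading analysis, here is a small NumPy sketch of the computation behind a scree plot and a loadings table. The data is synthetic (three correlated measurements plus one independent one) and stands in for the Palmer Penguins features, so the numbers it prints are not results from the project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for four body measurements (assumption, not the real dataset).
size = rng.normal(size=200)
X = np.column_stack([
    size + 0.3 * rng.normal(size=200),
    size + 0.3 * rng.normal(size=200),
    -size + 0.5 * rng.normal(size=200),
    rng.normal(size=200),
])

# Standardize each feature, then take the SVD of the standardized matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

# Eigenvalues of the correlation matrix; their shares are the scree plot heights.
explained_variance = S ** 2 / (len(Z) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Loadings: eigenvectors scaled by sqrt(eigenvalue). For standardized data each
# entry is the correlation between a feature (row) and a component (column).
loadings = Vt.T * np.sqrt(explained_variance)

print("explained variance ratio:", np.round(explained_ratio, 3))
print("PC1 loadings:", np.round(loadings[:, 0], 3))
```

Features with large absolute loadings on the first component are the ones driving most of the shared variance.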
🧠 Student-Stress-Clustering | Pattern Recognition
Segmented high-dimensional student data into behavioral clusters using K-Means and PCA to analyze stress factor variability.
Stack: Clustering, K-Means, PCA, Dimensionality Reduction, Scikit-learn, Statsmodels
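One common way to combine the two techniques is to scale the features, project them with PCA, and then pick the number of clusters by silhouette score. The sketch below follows that pattern on synthetic blobs; the data, the two-component projection, and the candidate range for k are assumptions for illustration, not the settings used in the project.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic high-dimensional data standing in for the student survey (assumption).
X, _ = make_blobs(n_samples=300, n_features=12, centers=3, random_state=0)

# Scale first so no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# Compare candidate cluster counts by mean silhouette score (higher is better).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_reduced)
    scores[k] = silhouette_score(X_reduced, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(float(s), 3) for k, s in scores.items()})
print("selected k:", best_k)
```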
⚡ Random Facts (The Human Side)
- 🎓 Origin Story: Graduated as a Marine Biologist before becoming a Tech Lead.
- 🏫 Teacher at Heart: Former Science Educator turned Engineer (I love documentation).
- 🐶 Chief Morale Officers: Dog dad to Bombal 🤎 and Negro 🖤.
- 📊 Mantra: Obsessed with "Clean Code" and measurable ROI.

