canischilensis/README.md

Hi, I'm Guillermo 👋

📍 Santiago, Chile | 🤖 AI & Data Engineer | ☁️ Cloud Data Engineer

Python SQL Airflow Pandas GCP Azure FastAPI OpenAI PostgreSQL BigQuery Docker Terraform

Engineering scalable Data Architectures & Production-Grade AI Systems. Focused on Cloud Efficiency (FinOps), Governance, and Automating the Boring Stuff.


🚀 Business Impact & Metrics

📉 90% Cost/Time Reduction: Automated the processing of unstructured HR data with GenAI (GPT-4) & Whisper, cutting processing time from 20 to 2 minutes.
95% Workflow Optimization: Re-engineered legacy manual processes into automated Python/FastAPI ETL pipelines.
🎯 97% Model Accuracy: Developed XGBoost predictive models for fuel consumption, validated with rigorous MLOps practices.
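
As a quick check of the arithmetic behind the first figure, the sketch below computes a relative reduction from a before/after pair. The helper name `percent_reduction` is hypothetical and only illustrates how the 20-to-2-minute change maps to a percentage; it is not taken from any of the projects.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    # Multiply first so that small integer inputs stay exact in floating point.
    return 100.0 * (before - after) / before


# Processing time quoted above: 20 minutes before, 2 minutes after.
reduction = percent_reduction(20, 2)
print(f"{reduction:.0f}% reduction")
```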


📂 Current Projects

🏦 CL-RiskEngine | Financial & High-Performance Computing
Engineered a distributed Monte Carlo risk engine utilizing Hexagonal Architecture, Ray, and a Medallion Lakehouse for high-performance financial modeling.
🚢 NavOptima | MLOps for Maritime Industry
Deployed an end-to-end MLOps solution for maritime route optimization, integrating real-time predictive modeling with Apache Airflow and MLflow orchestration.
🚲 Modern Data Platform | Data Engineering & IaC
Architected a scalable GCP data platform using Terraform for IaC, optimizing SQL modeling to reduce query latency by 54%.


🧪 Data Analysis & Experiments

⚖️ Credit-Model-Benchmark | Supervised Learning Optimization
Implemented a comparative pipeline tuning hyperparameters via GridSearchCV for SVM (RBF kernel), KNN, and LogReg, standardizing performance metrics.
Stack: Scikit-learn Hyperparameter Tuning SVM (RBF) GridSearchCV Model Evaluation
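
The exhaustive search that GridSearchCV automates can be sketched in a few lines of plain Python. The example below is a toy, assuming a one-dimensional k-NN classifier scored by leave-one-out accuracy; the data, the function names, and the grid are hypothetical and are not taken from the Credit-Model-Benchmark notebook.

```python
from itertools import product


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (1-D features)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = [label for _, label in neighbours]
    return max(set(votes), key=votes.count)


def loo_accuracy(data, k):
    """Leave-one-out accuracy of k-NN on `data`."""
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out the i-th point
        hits += knn_predict(train, x, k) == label
    return hits / len(data)


def grid_search(data, grid):
    """Score every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = loo_accuracy(data, **params)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, score
    return best_params, best_score


# Toy data: two well-separated classes on a line.
data = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (1.0, "b"), (1.2, "b"), (1.4, "b")]
best_params, best_score = grid_search(data, {"k": [1, 3, 5]})
print(best_params, best_score)
```

With scikit-learn, the same loop is handled by `GridSearchCV`, which also adds k-fold cross-validation and parallel evaluation.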

📉 Regression-Model-Benchmark | ML Engineering Pipelines
Architected a modular training pipeline using Scikit-learn wrappers to automate hyperparameter tuning (RandomizedSearchCV) across SVR and Tree-based models.
Stack: ML Pipelines Hyperparameter Tuning RobustScaler SVR RandomizedSearchCV

💳 Credit-Risk-Scoring | Financial Modeling
Engineered a statistical approval model utilizing SMOTE for class imbalance and PCA for dimensionality reduction, validated via rigorous hypothesis testing.
Stack: Logistic Regression Hypothesis Testing SMOTE PCA Statsmodels Inference

🫁 PM10-Mortality-Analytics | Public Health Data Science
Processed and correlated geospatial telemetry data with biological mortality rates to visualize linear trends in environmental health risks.
Stack: Linear Regression ETL Pipeline Correlation Analysis Data Normalization Geospatial Aggregation

🌊 ENSO-Climate-Forecasting | Climate Predictive Modeling
Developed a predictive system for ENSO cycles by processing temporal oceanographic data and extracting seasonal trend components.
Stack: Time Series Analysis Forecasting Data Wrangling Seasonality Extraction Statistical Modeling

🍷 Red-Wine-Analysis | Statistical Inference
Validated physicochemical differences between wine strains using hypothesis testing and modeled quality drivers via multiple regression analysis.
Stack: Hypothesis Testing OLS Regression Statsmodels SciPy Inference

🥂 White-Wine-Quality | Advanced Statistical Modeling
Engineered a robust quality predictor using PCA for dimensionality reduction and GLM (Gamma) for non-normal distributions, validated via Mann-Whitney U tests.
Stack: PCA GLM Mann-Whitney U Statsmodels Scikit-learn
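
For readers unfamiliar with the Mann-Whitney U test mentioned above, the statistic itself can be sketched from rank sums. This is a minimal standard-library illustration with hypothetical sample values; it computes only the U statistics (with midranks for ties), not a p-value, and is not the notebook's code.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two samples, using midranks for tied values."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Hypothetical samples: every value in `low` is below every value in `high`.
low, high = [1, 2, 3], [4, 5, 6]
print(mann_whitney_u(low, high))
```

In practice `scipy.stats.mannwhitneyu` computes the statistic together with a p-value; the sketch only shows where the statistic comes from.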

🏎️ MPG-Predictive-Modelling | Econometrics & Regression
Built a robust econometric model to predict vehicle MPG, utilizing OLS for parameter estimation and rigorous goodness-of-fit validation.
Stack: Multivariate Regression Statsmodels OLS Hypothesis Testing Data Visualization

🐧 Penguin-PCA-Analysis | Multivariate Analysis
Applied PCA for dimensionality reduction on the Palmer Penguins dataset, identifying key variance drivers through loading analysis and scree plots.
Stack: PCA Dimensionality Reduction Feature Extraction Scikit-learn Seaborn

🧠 Student-Stress-Clustering | Pattern Recognition
Segmented high-dimensional student data into behavioral clusters using K-Means and PCA to analyze stress factor variability.
Stack: Clustering K-Means PCA Dimensionality Reduction Scikit-learn Statsmodels


💡 Technical Value Analysis

This notebook complements the "Credit-Risk-Scoring" notebook you already had very well.

  • Key difference: The earlier notebook focused on class balancing (SMOTE) and statistics (Statsmodels). This one focuses on black-box algorithms (SVM, KNN) and computational optimization (GridSearch).
  • Narrative: You have the "Statistical Approach" (the other notebook) and the "Pure Machine Learning Approach" (this notebook). That demonstrates versatility.

I look forward to the last notebook to round out your portfolio! Upload it whenever you are ready.


Connect

LinkedIn GitHub Email


⚡ Random Facts (The Human Side)
  • 🎓 Origin Story: Graduated as a Marine Biologist before becoming a Tech Lead.
  • 🏫 Teacher at Heart: Former Science Educator turned Engineer (I love documentation).
  • 🐶 Chief Morale Officers: Dog dad to Bombal 🤎 and Negro 🖤.
  • 📊 Mantra: Obsessed with "Clean Code" and measurable ROI.

Popular repositories

  1. Algorithms Public

    Python 1

  2. Projects_arduino Public

    C++ 1

  3. Bike-SQL-Python Public

    PLpgSQL 1

  4. chile-housing-ops Public

    Python 1

  5. machine-learning-platzi Public

    Forked from JuanPabloMF/machine-learning-platzi

    Notebooks del curso de machine learning aplicado con python de platzi

    HTML

  6. tensorflow Public

    Forked from tensorflow/tensorflow

    An Open Source Machine Learning Framework for Everyone

    C++
