I’m a data-driven developer passionate about building tools that bridge technology and real-world impact. Below are highlights of my work—feel free to explore, collaborate, or reach out!
A Unified Platform for Drug Combination Analysis
- What it does: Aggregates data from PubChem, DrugBank, and PubMed APIs, cross-references academic articles via PubMed Central, and generates evidence-based insights for potential drug synergies.
- Tech Stack: Python, PostgreSQL, Elasticsearch, Streamlit (frontend), and custom ETL pipelines.
- Key Features:
  - Automated extraction of drug interaction data from unstructured PDFs using PyMuPDF and spaCy.
  - Risk-scoring algorithm for combinations, powered by Scikit-learn clustering.
  - Real-time visualization of molecular interactions with RDKit and Plotly.
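To illustrate the clustering-based risk scoring listed above, here is a minimal sketch rather than the project's actual code. It assumes each combination is described by two hypothetical features scaled to 0..1 (an adverse-event signal and an interaction-severity signal), and it swaps Scikit-learn for a tiny pure-Python k-means so it runs with no dependencies. The combination names, feature values, and the rule that ranks clusters by centroid size are all made up for illustration.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, initialised deterministically from the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster where it is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


def risk_tiers(features, tier_names=("low", "moderate", "high")):
    """Cluster combinations, then name clusters from lowest to highest mean centroid value."""
    names = list(features)
    points = [features[name] for name in names]
    labels, centroids = kmeans(points, k=len(tier_names))
    # Assumption: larger feature values mean more risk, so rank clusters by centroid mean.
    order = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]) / len(centroids[c]))
    tier_of_cluster = {cluster: tier_names[rank] for rank, cluster in enumerate(order)}
    return {name: tier_of_cluster[label] for name, label in zip(names, labels)}


# Hypothetical (adverse_event_signal, severity_signal) pairs; not real pharmacological data.
combos = {
    "drug_a+drug_b": (0.05, 0.10),
    "drug_c+drug_d": (0.45, 0.50),
    "drug_b+drug_c": (0.90, 0.85),
    "drug_a+drug_c": (0.10, 0.05),
    "drug_a+drug_d": (0.50, 0.55),
    "drug_b+drug_d": (0.80, 0.95),
}

tiers = risk_tiers(combos)
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

With Scikit-learn installed, the `kmeans` helper could be replaced by `sklearn.cluster.KMeans`. The step that turns cluster labels into tiers stays the same, because cluster indices carry no inherent order.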
❗ Medical Disclaimer:
SynergyMed Explorer is a research tool only. It does NOT provide medical advice. Always consult a licensed healthcare professional before making treatment decisions.
NLP-Driven Sustainability Insights from Twitter (2012–2022)
- What it does: Analyzes 10M+ tweets (Twitter/X) about sustainability, climate change, and ESG trends using NLP to map public sentiment and emerging topics.
- Tech Stack: Python, Tweepy (historical data), Apache Spark (distributed processing), Hugging Face Transformers (BERT for sentiment), and Tableau for dashboards.
- Key Features:
  - Topic modeling with BERTopic to identify shifts in sustainability discourse (e.g., rise of "circular economy" post-2018).
  - Geospatial analysis of tweet origins with GeoPandas and Folium.
  - Interactive timeline showing correlations between global events (e.g., COP summits) and tweet volume.
I believe in collaborative innovation! Here’s where I’ve pitched in:
- PyHealth (v2.3.0): Enhanced FHIR-standard medical data parsers for oncology datasets.
- Pandas: Optimized read_xml() performance for nested XML drug databases.