I’m a data scientist who builds end-to-end ML and data engineering systems that ship in real healthcare and public health environments. I focus on making models usable and trustworthy by pairing strong modeling with reproducible pipelines, validation, and workflow integration.
At the CDC, I worked on surveillance informatics in Palantir Foundry (1CDP). I designed modular ingestion and transformation pipelines, implemented schema validation and data quality checks, used lineage to debug upstream issues, and built tools that reduced manual burden for epidemiologists and state partners. I also developed structured, auditable workflows for semi-automated tasks like schema mapping, with human review, versioned configurations, and clear traceability.
What I work on most:
ML engineering: training and inference pipelines, distributed processing, monitoring, reproducibility
Data engineering: schema management, validation gates, lineage-driven debugging, scalable transforms
Applied healthcare AI: interpretable models, uncertainty-aware decisions, clinical workflow fit
Tools: Python, SQL, PyTorch, Spark/PySpark, Git, containers, Palantir Foundry
I like practical problems where correctness, traceability, and maintainability matter as much as model performance.
Pinned
NEDSS-DataReporting (Public, forked from CDCgov/NEDSS-DataReporting)
Near-real-time data reporting microservices for the Modernized NBS system
TSQL
periop-prediction-framework (Public)
Machine learning models for predicting postoperative delirium from perioperative EHR data, including baseline models, domain-structured ensembles, interpretability, calibration, and decision curve …
Jupyter Notebook 1