Jack Stehn

Machine Learning Engineer | Data Engineer | Data Scientist

San Francisco, California

Summary

High-agency data professional bridging rigorous social science research with production-grade engineering. I specialize in architecting end-to-end data systems, from raw ingestion and warehousing to deploying predictive models that drive revenue. I bring a software engineering mindset to data teams, championing CI/CD, unit testing, and modular design.

Experience

Data Scientist (Lead: ML, Data Engineering, MLOps) - Ed Pioneers Fellow

Caliber Public Schools · Richmond, California

Sep 2024 — Oct 2025
  • Strategic Leadership (Solo Data Lead): Owned the full data lifecycle (DS, DE, ML) as the sole data scientist. Partnered directly with C-suite and department heads to navigate a 'zero-to-one' environment.
  • Predictive ML & Risk Modeling: Developed and deployed explainable ML models to predict staff turnover. Engineered a 'Risk Tolerance' configuration allowing non-technical leadership to adjust precision/recall thresholds.
  • Modern Data Stack Architecture: Architected a scalable platform on GCP. Orchestrated ELT pipelines using Dagster, dbt, and dlt to ingest data from disparate SIS and HR platforms.
  • Engineering Maturity (ROI): Engineered a comprehensive People Team data pipeline, reducing manual consistency checks from months of collective annual work to seconds.

Data Scientist (ML, Data Engineering, MLOps)

SetSail · San Mateo, California

Aug 2021 — Feb 2023
  • Business Impact: Contributed to product enhancements that achieved 33% faster ramp times, 16% higher revenue, and 15x ROI for customers.
  • Production ML (Revenue): Developed and deployed production ML models for Propensity Scoring and Churn Modeling. Leveraged NLP on unstructured email metadata to identify sales signals.
  • Pipeline Architecture (AWS): Led a critical overhaul of the AWS data infrastructure. Implemented 'SQL Push-down' strategies and asynchronous DAGs, reducing data processing latency by 75%.
  • Engineering Best Practices: Championed the adoption of CI/CD pipelines (GitHub Actions), unit testing (pytest), and Agile methodologies.

Data Science Research Team Lead

UC Berkeley School of Public Health · Berkeley, California

Sep 2020 — May 2021
  • Leadership: Led data science components for mixed-methods studies on equity and public health. Managed a team of undergraduates.
  • Unstructured Data: Analyzed diverse unstructured and non-traditional datasets, developing novel data processing approaches to handle them.
  • Geospatial Analysis: Performed geospatial analysis (ArcGIS) to identify and visualize spatial patterns for non-technical stakeholders.

Education

University of California, Berkeley

Bachelor of Arts in Data Science (Domain Emphasis: Quantitative Social Science)

GPA: 4.00/4.00

Highest Distinction (Summa cum laude). Outstanding Data Science Undergraduate Award (Top of Class).

Aug 2019 — May 2021

Skills

Programming & Core Data Skills

Python (Pandas, NumPy, SciPy) · SQL (PostgreSQL, BigQuery) · R · Bash/Shell Scripting · Statistical Analysis · Feature Engineering

Machine Learning - Predictive & Classical

Propensity Scoring · Churn Modeling · Predictive Risk Modeling · Classification & Regression · ML Frameworks (Scikit-learn) · Model Eval & Selection

Data Engineering & Cloud Platforms

AWS Cloud (EC2, S3, Lambda, Athena) · GCP Cloud (BigQuery, GCS) · Data Pipelines (Dagster, dbt, dlt) · Data Warehousing (BigQuery, Athena) · ETL/ELT Design & Implementation · Docker & Containerization

Software Engineering & DevOps Practices

Production Code Quality/Practices · Testing (Unit, Intg, E2E, Pytest) · CI/CD Pipelines (GitHub Actions) · Version Control (Git, GitHub) · Agile Methods (Scrum, Kanban)

Data Visualization & BI Tools

Plotly (Dash for web apps) · Tableau · Looker Studio · Interactive Dashboard Design

Research, Experimentation & Ethics

Explainable ML · Experimental Design (A/B) · Causal Inference Methods · Survey Design & Analysis · Data Privacy & Security

Awards

2020-2021 Outstanding Data Science Undergraduate Award

UC Berkeley

Recognized for excellence in Data Science undergraduate studies, research, and community contributions at UC Berkeley.

May 2021

Volunteer & Community

Impact Fellow (Placement @ Caliber Public Schools)

Education Pioneers

Sep 2024 — Sep 2025

Selected for national fellowship applying leadership/management skills to advance educational equity.

  • Leadership Development: Applied data science and leadership skills to advance educational equity.
  • Capacity Building: Built organizational capacity through strategic data projects at the placement site.

Data Team Lead

San Francisco Gay Men's Chorus

Dec 2023 — Present

Provide data-driven insights for policy-making and organizational growth through survey creation and analysis.

  • Team Leadership: Lead a volunteer team providing data analysis for organizational strategy.
  • Survey Analysis: Design and analyze surveys (qualitative and quantitative) that inform policy and growth.

References

"Chosen from over 50 applicants and 5 finalists, Jack joined our organization at a pivotal moment and has been an invaluable team member ever since. Jack streamlined a survey and analysis process that previously took our team a month, developing a replicable system that now delivers actionable insights in just a few days. Jack is an easy choice for any team seeking a results-driven, collaborative data scientist who elevates both projects and people."

Brian Jimenez (Managed Jack directly at Caliber Public Schools) - Managing Director of People

"Not only is Jack an extremely capable engineer and data scientist, they are also a collaborative team player who elevates everyone around them. Their contributions at SetSail were always valuable to the company—whether it was their huge role in our data pipeline migration, or countless bug fixes and feature implementations. I wholeheartedly recommend Jack for any data science position."

Darrin Gilkerson (Worked with Jack on different teams at SetSail) - Software Engineer at QVT Financial

"Jack worked on a variety of projects that involved teasing out actionable insights from complex data sets, enhancing modeling capabilities through feature development and algorithm development, and building out a data ETL process that transformed the data infrastructure to help SetSail scale for enterprise customer needs. I highly recommend Jack as a Data Scientist and Data Engineer for any organization."

Danny Pan (Managed Jack directly at SetSail) - Data Science

"Jack is a motivated self-starter who loves to accomplish project tasks while developing and implementing smooth processes in their work environments. Jack is an accomplished leader, utilizing problem-solving skills to support their own work and the work of their colleagues and peers. Jack is a leader who uses imagination, experience, and empathy to create sustainable processes."

G. Allen Ratliff (Managed Jack directly at UC Berkeley SPH) - Assistant Professor of Social Work