Jack Stehn
Machine Learning Engineer | Data Engineer | Data Scientist
San Francisco, California • jack@ste.hn
Summary
High-agency Data Professional with a unique background blending rigorous social science research with production-grade engineering. I specialize in bridging the gap between 'notebook data science' and scalable infrastructure. I architect end-to-end data systems—from raw ingestion to deploying predictive models—and bring a software engineering mindset to data teams.
Experience
Data Analytics & Customer Delivery Engineer (Contract)
Innovare · Chicago, Illinois (Remote)
- EdTech Analytics Reliability: Specialize in the stability and scalability of the EdTech analytics layer, ensuring high-stakes dashboards and datasets are accurate, reliable, and actionable for school leadership teams.
- Enterprise Customer Delivery: Manage complex data pipelines, automate manual workflows, and deliver presentation-ready insights for enterprise accounts — translating raw data into decision-ready artifacts for non-technical stakeholders.
- Data Architecture Consulting: Co-design the next-generation data architecture, including a transition plan to a Medallion (bronze/silver/gold) layout on BigQuery, to improve traceability and testability, lower costs, and speed up downstream analytics.
- Modern Stack Proposal: Proposed adopting a modern open-source stack (Terraform for IaC, dlt for ingestion, dbt for transformation, Dagster for orchestration, and Cube.js for the semantic layer) to replace ad-hoc workflows with a maintainable, version-controlled, observable platform.
- Workflow Automation: Automate previously manual customer delivery workflows to reduce turnaround time and free engineering capacity for higher-leverage analytics work.
Data Scientist (Lead: ML, Data Engineering, MLOps) - Ed Pioneers Fellow
Caliber Public Schools · Richmond, California
- Strategic Leadership (Solo Data Lead): Owned the full data lifecycle (DS, DE, ML) as the sole data scientist. Partnered directly with C-suite and department heads to navigate a 'zero-to-one' environment.
- Predictive ML & Risk Modeling: Developed and deployed explainable ML models to predict staff turnover. Engineered a 'Risk Tolerance' configuration allowing non-technical leadership to adjust precision/recall thresholds.
- Modern Data Stack Architecture: Architected a scalable platform on GCP. Orchestrated ELT pipelines using Dagster, dbt, and dlt to ingest data from disparate SIS and HR platforms.
- Engineering Maturity (ROI): Engineered a comprehensive People Team data pipeline, reducing manual consistency checks from months of collective annual work to seconds.
- Survey Design & Enrichment: Enriched staff satisfaction survey results with demographic, work location, role, grade level, and tenure data, enabling segmented, actionable insights.
- Data Democratization: Updated the organization's data security policy and designed data literacy training modules, empowering school leaders to access real-time metrics without technical bottlenecks.
Data Scientist (ML, Data Engineering, MLOps)
SetSail · San Mateo, California
- Business Impact: Contributed to product enhancements that achieved 33% faster ramp times, 16% higher revenue, and 15x ROI for customers.
- Production ML (Revenue): Developed and deployed production ML models for Propensity Scoring and Churn Modeling. Leveraged NLP on unstructured email metadata to identify sales signals.
- Pipeline Architecture (AWS): Led a critical overhaul of the AWS data infrastructure. Implemented 'SQL Push-down' strategies and asynchronous DAGs, reducing data processing latency by 75%.
- Causal Analysis: Performed deep causal inference studies to isolate specific sales behaviors that drive outcomes, influencing the product roadmap to focus on 'high-leverage' user actions.
- Engineering Best Practices: Championed the adoption of CI/CD pipelines (GitHub Actions), unit testing (pytest), and Agile methodologies.
Data Science Research Team Lead
UC Berkeley School of Public Health · Berkeley, California
- Leadership: Led data science components for mixed-methods studies on equity and public health. Managed a team of undergraduates.
- Unstructured Data: Analyzed diverse unstructured and non-traditional datasets requiring the development of novel data processing approaches.
- Geospatial Analysis: Performed geospatial analysis (ArcGIS) to identify and visualize spatial patterns for non-technical stakeholders.
- Visualization: Created interactive dashboards (Tableau, Plotly) to communicate findings to stakeholders.
Education
University of California, Berkeley
Bachelor of Arts in Data Science (Domain Emphasis: Quantitative Social Science)
GPA: 4.00/4.00
Highest Distinction (Summa cum laude). Outstanding Data Science Undergraduate Award (Top of Class).
Skills
Programming & Core Data Skills
Machine Learning - Predictive & Classical
Data Engineering & Cloud Platforms
Software Engineering & DevOps Practices
Data Visualization & BI Tools
Research, Experimentation & Ethics
Awards
2020-2021 Outstanding Data Science Undergraduate Award
UC Berkeley
Recognized for excellence in Data Science undergraduate studies, research, and community contributions at UC Berkeley.
Volunteer & Community
Impact Fellow (Placement @ Caliber Public Schools)
Education Pioneers
Selected for national fellowship applying leadership/management skills to advance educational equity.
- Leadership Development: Applying data science & leadership skills to advance educational equity.
- Capacity Building: Building organizational capacity through strategic data projects at placement site.
Data Team Lead
San Francisco Gay Men's Chorus
Provide data-driven insights for policy-making and organizational growth through survey creation and analysis.
- Team Leadership: Led volunteer team providing data analysis for organizational strategy.
- Survey Analysis: Designed & analyzed surveys (qual/quant) informing policy & growth.
References
"Chosen from over 50 applicants and 5 finalists, Jack joined our organization at a pivotal moment and has been an invaluable team member ever since. Jack streamlined a survey and analysis process that previously took our team a month, developing a replicable system that now delivers actionable insights in just a few days. Jack is an easy choice for any team seeking a results-driven, collaborative data scientist who elevates both projects and people."
Brian Jimenez (Managed Jack directly at Caliber Public Schools) - Managing Director of People
"Not only is Jack an extremely capable engineer and data scientist, they are also a collaborative team player who elevates everyone around them. Their contributions at SetSail were always valuable to the company—whether it was their huge role in our data pipeline migration, or countless bug fixes and feature implementations. I wholeheartedly recommend Jack for any data science position."
Darrin Gilkerson (Worked with Jack on different teams at SetSail) - Software Engineer at QVT Financial
"Jack worked on a variety of projects that involved teasing out actionable insights from complex data sets, enhancing modeling capabilities through feature development and algorithm development, and building out a data ETL process that transformed the data infrastructure to help SetSail scale for enterprise customer needs. I highly recommend Jack as a Data Scientist and Data Engineer for any organization."
Danny Pan (Managed Jack directly at SetSail) - Data Science
"Jack is a motivated self-starter who loves to accomplish project tasks while developing and implementing smooth processes in their work environments. Jack is an accomplished leader, utilizing problem-solving skills to support their own work and the work of their colleagues and peers. Jack is a leader who uses imagination, experience, and empathy to create sustainable processes."
G. Allen Ratliff (Managed Jack directly at UC Berkeley SPH) - Assistant Professor of Social Work