The eScience Institute

Data Science for Social Good at UW eScience

Follow along at http://arokem.github.io/2015-10-20-DSSG/

Two major problems:

How do we enable data-driven approaches in institutions devoted to social good?

How can we provide training for data scientists interested in social good?

Our solution

A ten-week internship program matching student DSSG fellows with project leads from organizations in the Seattle region devoted to social good, for intense joint work focused on providing a specific data-driven solution.

#DSSG2015 @uwescience

Context/Inspiration

Incubator projects

Urban@UW

DSSG programs at the University of Chicago and Georgia Tech

Our recipe

4 projects (with project leads)

of 11 applications

17 DSSG Fellows

of 144 applications

6 High School students (ALVA program)

The eScience infrastructure

  • eScience Data Scientist Mentors
  • Speakers from around UW/Seattle
  • Ethnographer
  • Program managers
  • Data science studio

Training in data science:

Group tutorials
Software Carpentry

Individual mentorship

Peer instruction and collaboration

The projects

Assessing Community Well-being through Open Data

Project Lead: Shelly Farnham, Third Place Technologies
DSSG Fellows: Jordan Bates, Ryan Burns, Jenny Ho, Yue Zhou
ALVA Students: Avery Glass, Jennifer Nino
eScience Data Scientist Mentors: Bernease Herman, Bill Howe

Socrata crime incidence data

Survey data

Data from social networks (Facebook, Twitter, etc.)

Well-being

Rerouting Solutions for King County Paratransit

Project Lead: Anat Caspi, Taskar Center for Accessible Technology
DSSG Fellows: Rohan Aras, Frank Fineis, Kristen Garofali, Kivan Polimis
DREU Fellow: Emily Andrulis, Cornell College
eScience Data Scientist Mentors: Joseph Hellerstein, Valentina Staneva
Optimizing routing to reduce costs and develop tools to aid route planning

Open Sidewalks: route maps for low-mobility citizens

Project Leads: Nick Bolten, Anat Caspi, Taskar Center for Accessible Technology
DSSG Fellows: Amir Amini, Yun Hao, Vaishnavi Ravichandran, Andre Stephens
ALVA Students: Nick Krasnoselsky, Doris Layman
eScience Data Scientist Mentors: Anthony Arendt, Jake Vanderplas
Connecting open sidewalk data through computational geometry
Figures: cleaning, routing
Powered by data from
SDOT/Socrata, Google API

Predictors of Permanent Housing for Homeless Families

Project Leads: Anjana Sundaram, Neil Roche, Bill & Melinda Gates Foundation
DSSG Fellows: Joan Wang, Jason Portenoy, Fabliha Ibnat, Chris Suberlak
ALVA Students: Cameron Holt, Xilalit Sanchez
eScience Data Scientist Mentors: Ariel Rokem, Bryna Hazelton
Family Trajectories through Programs
http://tinyurl.com/dssg-homeless

A few lessons we learned

    It is possible to both:

  • Do interesting things with data, with social good implications
  • Provide highly effective training

  • Trainee diversity poses a challenge in formal settings

  • But might be a strength in the context of project work!

Stakeholder involvement is important
(no projects "thrown over the fence")

In-house expertise (data scientists, program managers) is an important asset

But (hypothesis) DSSG can be translated into other settings

Read our paper