Data Science For Service Change: City and County of San Francisco

Data science for service change

Presented by DataSF | City and County of San Francisco

What is data science?

Data Science: Applying advanced statistical tools to existing data to generate new insights
Service Change: Converting new data insights into (often small) changes to business processes
Smarter Work: More efficient and effective use of staff and resources

What complements (and is really good stuff to do) data science?

Approach: Management
Process: Define, visualize, often using dashboards, and manage to KPIs
Outcome: Meet goals and KPI targets
Examples: SF Scorecard, PublicWorks Stat & Stat starter kit

Approach: Evaluation
Process: Assess a project, program or policy design or results
Outcome: Better investment of resources; Better policy decisions
Examples: Evaluation of transitional-kindergarten in SF

Approach: Policy Analysis
Process: Define and assess alternatives using a broad range of tools
Outcome: Report or memo with policy or program recommendations
Examples: Shape Up SF Policy Analysis

Approach: Open Data
Process: Publish civic data for use by the City and the public
Outcome: Easier data sharing and reporting, new tools or services built on data
Examples: SFPUC Adopt a Drain

Approach: DataScienceSF
Process: Identify insights using advanced statistics tied to a service change
Outcome: Smarter work “on the ground” in real time
Examples: See rest of deck!

What complements (and is really good stuff to do) data science?

• Evaluation
• Policy Analysis
• Open Data

All approaches can lead to service improvement. It’s about choosing the right tool for the job (and sometimes combining them)!

What’s in the DataScienceSF Toolkit?
Statistical Methods, Tools, User Experience Research

Statistical Methods:
• Sentiment analysis
• Time series analysis
• Data mining
• Missing data modeling and imputations
• Classification and pattern recognition
• Survival analysis
• Principal component and factor analysis
• A/B testing
• Machine learning
• Propensity score matching
• Logistic, multinomial and multiple linear regression techniques
• Network analysis

What’s in the DataScienceSF Toolkit?
Statistical Methods, Tools, User Experience Research

Tools:
Languages: Python, R, SQL, JavaScript, NodeJS
Libraries: SciPy, Pandas, Scikit-learn, GPText, OpenNLP, Mahout, +many others
Data Engineering: Profiling, ETL, Job notices, APIs, Optimized data pipelines, Optimized data …
Visualization: D3.js, Gephi, R, Leaflet, PowerBI, ggplot2, shiny

What’s in the DataScienceSF Toolkit?
Statistical Methods, Tools, User Experience Research

User Experience Research:
• Prototyping
• Photo journaling and documenting
• Journey mapping
• Process mapping
• Ethnographic field research and user observation
• Usability testing

What is NOT data science?

This: Service change
Not that: Academic research

This: Small changes
Not that: Major overhauls / service disruptions

This: Use existing data (mostly ;)
Not that: Collecting new data

Data Science
Project Types
Project Type: Find the needle in the haystack

What to target?
• Target areas
• Target categories
• Target individuals

Service Issue: Difficult to identify targets in a population
Data Science Process: Use existing data and predictive modeling to identify targets
Service Change: Engage with target subset of population

Result: Department resources are spent where most needed
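
The idea of using existing data to identify targets can be sketched in a few lines. This is only an illustration under stated assumptions, not the method any city in this deck used: the visit records, the single `block_type` feature, and every name and number below are hypothetical. It estimates the share of past visits that found no alarm for each block type, then ranks unvisited homes by that estimate.

```python
from collections import defaultdict

# Hypothetical past visits: (block_type, home_lacked_alarm)
past_visits = [
    ("older_rental", True), ("older_rental", True), ("older_rental", False),
    ("newer_owner", False), ("newer_owner", False), ("newer_owner", True),
    ("older_owner", True), ("older_owner", False),
]


def rate_by_group(visits):
    """Estimate P(no alarm) per group as the observed share of past visits."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, lacked_alarm in visits:
        totals[group] += 1
        hits[group] += int(lacked_alarm)
    return {group: hits[group] / totals[group] for group in totals}


def prioritized_list(homes, rates, default_rate=0.0):
    """Sort homes so those most likely to lack an alarm come first."""
    return sorted(homes, key=lambda h: rates.get(h[1], default_rate), reverse=True)


# Hypothetical homes not yet visited: (address, block_type)
unvisited = [
    ("12 Oak St", "newer_owner"),
    ("4 Elm St", "older_rental"),
    ("9 Pine St", "older_owner"),
]

rates = rate_by_group(past_visits)
ranked = prioritized_list(unvisited, rates)
for address, group in ranked:
    print(f"{address}: estimated P(no alarm) = {rates[group]:.2f}")
```

A real project would likely use a richer model and more features; the point is that the output is a ranked list that staff can work from the top.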

Examples: Free fire alarms in New Orleans
Service Issue: Fire alarms to homes that have them
Data Science: ID homes with high prob. of no alarm
Service Change: Use list to shape …
Result: 2x increase in hit rate

Examples: Find the needle in the haystack

New Orleans Fire
Service Issue: New Orleans Fire Department (Nola FD) distributes free fire alarms to homes. But many homes they visited already had them, wasting Nola FD’s …
Data Science: Nola’s analytics team used public data to identify homes with a high probability of not having a fire alarm and provided Nola FD with a list.
Service Change: Nola FD used the list to determine where to offer fire alarms.
Result: With no increase in resources or patrols, Nola FD increased the hit rate of homes needing smoke alarms by 2x.

New York City Tax
Service Issue: New York City (NYC) conducts corporate tax audits. They are time consuming and 37% have no findings. They want to increase findings but maintain their number of audits.
Data Science: NYC analyzed historical audit records and identified patterns of businesses. Outliers were flagged as possible audit targets.
Service Change: The audit team targeted the flagged cases for audits.
Result: With the same staff levels, the audit team decreased the percent of cases with no finding from 37 to 22%, leading to increased revenues.

Project Type: Prioritize your backlog

What to prioritize?

Service Issue: Backlog is tackled via first in, first out (FIFO)
Data Science Process: Create a model to categorize and group past and current cases
Service Change: Prioritize cases based on categories in order of risk, need or …

Result: Department addresses high priority cases first
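
The difference between FIFO and a prioritized backlog comes down to the sort key. The sketch below assumes a hypothetical backlog in which each case already carries a risk category; in a real project a model would assign that category, and the case IDs, categories and ranks here are invented for illustration.

```python
# Hypothetical backlog: (case_id, days_waiting, risk_category)
backlog = [
    ("C-101", 90, "low"),
    ("C-102", 75, "high"),
    ("C-103", 60, "medium"),
    ("C-104", 30, "high"),
]

# Assumed ordering of categories; a real project would derive these from a model.
CATEGORY_RANK = {"high": 0, "medium": 1, "low": 2}


def fifo_order(cases):
    """First in, first out: the longest-waiting case is handled first."""
    return sorted(cases, key=lambda case: -case[1])


def priority_order(cases):
    """Highest-risk category first; within a category, longest-waiting first."""
    return sorted(cases, key=lambda case: (CATEGORY_RANK[case[2]], -case[1]))


print("FIFO:    ", [case[0] for case in fifo_order(backlog)])
print("Priority:", [case[0] for case in priority_order(backlog)])
```

Under FIFO the low-risk case C-101 is handled first because it has waited longest; under the priority ordering both high-risk cases move to the front.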

Examples: Blight backlog in New Orleans
Service Issue: Backlog in blight enforcement
Data Science: Use data to grade cases per prior decisions
Service Change: Result created abatement tool
Result: 1500+ case backlog gone in 100 days

Examples: Prioritize your backlog

Service Issue: In Boston, they have a large list of residences with anti-social complaints filed against them.
Data Science: The analytics team pooled data from housing, police, and tax agencies to gauge the nature of complaints and identify the biggest contributors to …
Service Change: The Air Pollution Control Commission expedited enforcement with the biggest contributors.
Result: With no change in resources, Boston saw a 55% reduction in police calls associated with the targeted residences.

New Orleans
Service Issue: New Orleans (Nola) faced a significant backlog in blight enforcement due in part to bottlenecks in the decision making process and missing information.
Data Science: Nola used data on the outcomes of previous blight cases to grade cases in the backlog and to recommend additional data to collect by field teams.
Service Change: The enforcement team used the results as an abatement decision tool to speed the decision-making process of whether to demolish or foreclose a home.
Result: Nola eliminated the 1,500+ case backlog in less than 100 days.

Project Type: Flag “stuff” early

How to detect?

Service Issue: Hard to predict future condition which leads to reactive services
Data Science Process: Use historical and current data to create estimate ranges for potential outcomes
Service Change: Use estimates to change and tailor intervention points

Result: Department provides pro-active early interventions
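
One simple way to build an estimate range from historical data is the mean plus or minus a multiple of the standard deviation. The sketch below is an assumption-laden illustration, not the approach of any project in this deck: the weekly counts are invented, and the width of 2 standard deviations is an arbitrary choice.

```python
import statistics


def estimate_range(history, width=2.0):
    """Return (low, high) as mean +/- width * sample standard deviation."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return mean - width * spread, mean + width * spread


def flag_early(history, current, width=2.0):
    """Flag the current value if it falls outside the expected range."""
    low, high = estimate_range(history, width)
    return current < low or current > high


# Hypothetical weekly counts of some service request
weekly_counts = [20, 22, 19, 21, 23, 20, 22, 21]

low, high = estimate_range(weekly_counts)
print(f"expected range: {low:.1f} to {high:.1f}")
print("flag 21?", flag_early(weekly_counts, 21))
print("flag 35?", flag_early(weekly_counts, 35))
```

A value inside the range needs no action; a value outside it is the trigger for an earlier intervention.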

Examples: Use of force alerts in Charlotte
Service Issue: Excessive force has neg. impact on community
Data Science: Identify patterns to refine early warning
Service Change: Flagged recurring complaints
Result: Accuracy up 20%; False positives down 55%

Examples: Flag “stuff” early

Charlotte Police
Service Issue: Excessive force violations by police officers have huge negative repercussions in the community and for police careers.
Data Science: The analytics team refined an early warning system, identifying patterns that often led to officers having negative interactions with the public.
Service Change: The department flagged recurring complaints against officers and notified supervisors when certain thresholds were reached.
Result: The CMPD system increased accuracy by 15-20% while reducing false positives by 55%.

Lead Poisoning in Chicago
Service Issue: In Chicago, a large number of children are thought to be exposed to lead paint in older houses.
Data Science: The analytics team built a model of exposure using data on homes, history of children’s exposure at that address and conditions of …
Service Change: They conducted targeted inspections and provided remediation funding to homes identified in the model.
Result: Chicago reached the most vulnerable families before severe health effects from lead contamination manifest.

Project Type: A/B test something

Which form? 62% respond vs. 78% respond

Service Issue: Costly outreach methods are not tested before implementation
Data Science Process: Statistical testing on outreach methods to identify which, when, and to whom to send
Service Change: Use statistically validated outreach method

Result: Department increases response rates
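
Whether a gap like 62% vs. 78% is more than chance depends on how many people received each form. A two-proportion z-test is one common way to check this; the sketch below uses it with an assumed sample size of 200 per form, which is not from the slide.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two response rates."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed sample sizes (hypothetical): 200 mailings per form,
# so 62% -> 124 responses and 78% -> 156 responses.
z, p_value = two_proportion_z_test(124, 200, 156, 200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With much smaller samples the same percentages could easily fail to reach significance, which is why the test is run before a form is rolled out.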

Examples: NYC Summons Redesign
Service Issue: 40% cited no-show leading to costly arrest warrants
Data Science: Redesigned and tested summons form
Service Change: Deployed new form and rescheduled timelines
Result: Currently evaluating

Examples: A/B test something

NOLA Community Health Program
Service Issue: In New Orleans, they have a low take up rate of free primary care appointments.
Data Science: The analytics team tested different SMS reminders to those eligible for appointments.
Service Change: The department implemented the most successful SMS text.
Result: 60% increase in clients using free primary care appointments

NYC Summons
Service Issue: 40% of those cited for low-level violations did not take required next steps, leading to issuance of arrest warrants
Data Science: Experiment and test redesign of summons process
Service Change: Reschedule court timelines to facilitate greater access
Result: Evaluating impact on use of costly arrest warrants (Project currently in progress)

Project Type: Optimize your resources

How to distribute?

Service Issue: Difficult to identify where to place or distribute resources to be most effective
Data Science Process: Use geospatial and/or other data to identify optimal distribution of resources
Service Change: Re-allocates resources to optimal distribution

Result: Department decreases response times; increases volume
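
A very small version of the geospatial question is: given where past incidents happened, which candidate site is closest to them overall? The sketch below assumes straight-line distance on an arbitrary grid and made-up coordinates; a real analysis would likely use travel times, traffic and demand patterns.

```python
import math


def total_distance(site, incidents):
    """Sum of straight-line distances from a candidate site to every incident."""
    return sum(math.dist(site, incident) for incident in incidents)


def best_site(candidates, incidents):
    """Pick the candidate with the smallest total distance to past incidents."""
    return min(candidates, key=lambda name: total_distance(candidates[name], incidents))


# Hypothetical coordinates on an arbitrary grid (not real locations)
past_incidents = [(1, 1), (2, 1), (2, 3), (8, 8)]
candidate_sites = {"Station A": (0, 0), "Station B": (2, 2), "Station C": (9, 9)}

for name, location in candidate_sites.items():
    print(f"{name}: total distance {total_distance(location, past_incidents):.1f}")
print("Best site:", best_site(candidate_sites, past_incidents))
```

Minimizing total distance is only one possible objective; minimizing the worst-case distance would favor a different kind of site.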

Examples: Chicago Pest Control
Service Issue: Challenging to predict rodent outbreaks
Data Science: Analyze data associated with outbreaks
Service Change: Proactive targeting of leading indicators
Result: 15% drop in requests for rodent control services

Examples: Optimize your resources

Chicago Pest Control
Service Issue: Chicago’s rodent baiting program finds it challenging to predict rodent outbreaks and locations leading to spikes in 311 …
Data Science: Predicted potential danger of outbreaks by using leading indicators and other data correlated with previous outbreaks.
Service Change: Directed rodent baiting to areas identified by leading indicators, including events, like water main breaks.
Result: Resident requests for rodent control services dropped by 15%

NOLA Ambulance Stand-by Location
Service Issue: In New Orleans, ambulance standby locations are chosen based on dispatcher habits or instincts.
Data Science: Analytics team used city wide analysis of data on accident patterns, traffic patterns, and crew readiness to identify optimal standby locations
Service Change: Ambulances deployed at new optimized locations
Result: Targeting short response times to EMS calls (Project currently in progress)

What was the service change?

From that -> To this

Fire Alarms: Random List -> Prioritized List
Blight: Staff evaluates all cases -> Tool evaluates easy cases
Early Warning: Focus on that set of officers -> Focus on this set of officers
Summons: Send original form -> Send new form
Pest Control: Arrive at location X too late -> Arrive at location X early

Service Change = Small Business Process Change

Summary: The five project types
• Find the needle in the haystack
• Prioritize your backlog
• Flag “stuff” early
• A/B test something
• Optimize your resources

Plus: some combination, or something else…

Cohort 1
ASR: Increase property tax revenues
Service Issue
When a property sells in SF, we either accept the sales price or modify it to collect property taxes. So which sales should you accept and which should you dig into?

Data Science
Our regression model identifies which sale prices are unusual for the location, time and property details

Service Change
The model splits properties into two lists: normal sale prices to enroll directly in tax collection and outlier sales for manual review by appraisers

Expected: Increased revenue, faster time to revenue, reduced backlog, and more consistency in assessments
Project Type: Prioritize your backlog
Full write up at
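
The split into normal and outlier sales can be sketched with a one-variable regression. This is a simplified stand-in and not the ASR model: it uses only square footage where the slide describes location, time and property details, and the sales records, field names and the threshold of 2 standard deviations are all hypothetical.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def split_sales(sales, threshold=2.0):
    """Split sales into (normal, outliers) by size of the regression residual."""
    xs = [sale["sqft"] for sale in sales]
    ys = [sale["price"] for sale in sales]
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = statistics.pstdev(residuals)
    normal, outliers = [], []
    for sale, residual in zip(sales, residuals):
        target = outliers if abs(residual) > threshold * spread else normal
        target.append(sale)
    return normal, outliers


# Hypothetical sales; S5 sold far below what its size would suggest.
sales = [
    {"id": "S1", "sqft": 1000, "price": 800_000},
    {"id": "S2", "sqft": 1100, "price": 880_000},
    {"id": "S3", "sqft": 1200, "price": 960_000},
    {"id": "S4", "sqft": 1300, "price": 1_040_000},
    {"id": "S5", "sqft": 1400, "price": 400_000},
    {"id": "S6", "sqft": 1500, "price": 1_200_000},
    {"id": "S7", "sqft": 1600, "price": 1_280_000},
    {"id": "S8", "sqft": 1700, "price": 1_360_000},
]

normal, outliers = split_sales(sales)
print("Enroll directly:", [sale["id"] for sale in normal])
print("Manual review:  ", [sale["id"] for sale in outliers])
```

The two lists map directly onto the service change: one goes straight to enrollment, the other to appraisers.
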
Evictions: Pro-actively prevent evictions
Service Issue
How can we make eviction prevention more proactive by identifying the most problematic eviction notices in real time?

Data Science
An algorithm combines data sources to identify eviction notice filings that are outside the norm

Service Change
A list of flagged eviction notices is sent to eviction prevention services to proactively review for service …

Expected: Targeted eviction prevention that keeps residents in their homes
Project Type: Find the needle in the haystack; Flag “stuff” early
Full write up at
ENV: Find new clients to help green our City
Service Issue
SF Environment offers financial incentives and technical assistance to help our constituents upgrade their lighting & refrigeration systems. But their list of leads is dwindling - how can they find new leads?

Data Science
Mashed together multiple data sources to identify characteristics of stronger leads

Service Change
New and longer list of property leads with enriched data for targeting marketing campaigns

Expected: New customers and increased uptake of green subsidies
Project Type: Find the needle in the haystack; Optimize your resources
Full write up at
DPH WIC: Help moms and babies stay in nutrition program
Service Issue
Since 2011, DPH has seen an increase in mothers dropping out of their nutrition program. Which moms are most at risk of dropout?

Data Science
Built a predictive model that identified moms and infants who are at greatest risk for dropping out

Service Change
Using the high-risk client profiles to conduct targeted interviews to identify program barriers and make service …

Expected: Reduce the dropout rate of moms, infants and children, leading to healthier outcomes for both
Project Type: Flag “stuff” early
Full write up at
DPH BHS: Improve results and reduce costs in mental health care
Service Issue
A small fraction of mental health patients use a large % of resources. Can we identify high users early to improve their outcomes and reduce costs?

Data Science
Build predictive model to identify clients at greatest risk for becoming high users

Service Change
Expected: Targeted service model to direct high users to more stable and preventative services

Expected: Reduction in high cost clients and use of high cost emergency services
Project Type: Find the needle in the haystack; Flag “stuff” early

TTX: Increase response to tax letter
Service Issue
TTX wanted to use behavioral economics and A/B testing to increase the effectiveness of its collection letter for unsecured personal property (a difficult type to collect on).

Data Science
DataSF helped organize a Behavioral Insights Training (BIT) workshop and provided guidance on the A/B test

Service Change
Use whichever letter gets the best response

Result: Improved response rate by 17%. TTX is continuing to apply BIT principles to other taxpayer communications
Project Type: A/B test something
Full write up at
ART: Preserve City art for the future
Service Issue
The Arts Commission needs to accurately and efficiently project long-term costs to budget for art preservation

Data Science
Revised cost formula and new tool to provide long-term projections and prioritization of conservation projects on …

Service Change
Use tool to model cost scenarios instead of manual, one time process

Expected: Reduction in staff time, more accurate cost estimates, and earlier identification of pieces in need of …
Project Type: Optimize your resources
Full write up at
Overview of Phases

Cohort 2: Jan – June

Phases: Solicitation; Application due; Selection; Notify applicants; Project refining; Analysis & service change; Present
Timeline dates: Oct - Nov; Nov 22 – Dec 13; Nov 27; Dec; Dec 13; January - May; June

Phase: Solicitation
Opportunities to learn more
• Brown bags
• Office hours
• Invited presentations

Dates at

April - Mid
May May June July - November Dec
May May
Phase: Solicitation
How to prepare
• Brainstorm projects using the project types
• Identify possible service changes
• Review data that could help
• Identify key staff members

Learn more at

Phase: Application
Available at

• Brief online form
– Problem statement (200 words max)
– Impact statement (100 words max)
– Service change statement
– Data overview
– Project champion

Phase: Application
Criteria to keep in mind
• Above all else: A viable path to service change
• Question / problem answerable by data science
• Solvable within cohort time frame
• Impact
• Department commitment
• Data readiness
Phase: Selection
• Initial review
– Criteria assessment
– Application scoring
• Department follow-ups, as needed
– Be available for questions (email or in person)
• Estimating 5-10 projects per Cohort

Phase: Winners Announced
And gentle off-ramps for the rest…
Some projects may not be appropriate for data science or for our timeline. We will help identify other opportunities that may be a better fit:
• Civic Bridge – pro bono opportunities via the Mayor’s Office of Civic Innovation
• STIR – startup technology engagements via the Mayor’s Office of Civic Innovation
• DataSF Dashboarding Services
• Controller's Performance Unit
• Data Academy classes
• External Data Science groups or volunteers
• Other technical assistance

Phase: Project refining
During this phase, we will:
• Meet to refine the scope
• Optionally, do initial site visits/interviews
• Prepare data for analysis
• Outputs
– Project charter
– Data exchanges and agreements, as needed

Phase: Analysis and service change
During this phase, we will:
• Conduct site visits, ride-alongs and interviews, as appropriate
• Conduct iterative analysis
• Implementation testing
• Handoff and training

Phase: Analysis and service change

DataSF Brings:
• Statistical Methods
• Tools
• User Experience Research

What You Bring:
• Issue expertise
• A good question & data
• Project champion

Final Product is Algorithm + Tool: Algorithms that are scripted and automated (real time if needed) tied to some service change tool (e.g. list, service, alert), implemented together and maintained by department

Phase: Present (& Disseminate)
During this phase, we will:
• Present and celebrate the results with cohort
• As appropriate, write an article for DataSF Speaks and/or other venues
• Disseminate method and approach (not data) for other departments and cities to learn
• Data Scientist will continue to be available during office hours for continued support

• This powerpoint
• 1 pager
• Sign up for office hours
• Sign up for brown bag
• Apply!
Other Resources: Civic Bridge
@datasf
• Take 5 minutes by yourself
– Brainstorm ideas
– Take your best idea and complete the form
• With your neighbors
– Review each top idea and refine/iterate
• Report out
