Proposed PHD in Data Science

A Proposal for a Program of Graduate Study in

Data Science leading to a degree in

Doctor of Philosophy in Data Science (PhD/DS)

By:
Doctoral Program Committee, HDSI (AY 2019)
Gal Mishne, Virginia De Sa, Yian Ma, Jingbo Shang, Vineet Bafna, Lily Xu, Michael Holst,
Rayan Saab, Armin Schwartzman, George Sugihara, Dimitris Politis.

Doctoral Program Committee, HDSI (AY 2020)


Faculty Council

Contacts:
Academic:
Rajesh K. Gupta
Director, Halıcıoğlu Data Science Institute (HDSI)
(858) 822-4391
rgupta@ucsd.edu

Administrative:
Yvonne Wollmann
Student Affairs Manager
Halıcıoğlu Data Science Institute (HDSI)
(858) 246-5427
ywollmann@ucsd.edu

Version History:
April 12, 2020: Version 1.1 submitted for preliminary review by HDSI faculty.
Oct 6, 2020: Version 2.0 submitted for administrative review.
Oct 12, 2020: Version 2.2 updated with inputs from HDSI faculty council.
Nov 30, 2020: Version 3.1 submitted to Graduate Council for review.
Jan 28, 2021: Version 4.0 revised and updated based on feedback from the Graduate Council.
Online Link: https://bit.ly/HDSI-PHD

A Proposal for a Program of Study in Data
Science leading to a degree in
Doctor of Philosophy in Data Science (PhD/DS)
Executive Summary
The Halıcıoğlu Data Science Institute proposes a doctoral degree program in “Data Science”
(PhD/DS) to serve the need for advanced graduate studies in the area of Data Science, a field in
which HDSI currently offers a well-received Bachelor of Science degree as a part of its academic
mission “to promote a unified campus-wide approach to research and teaching in Data Science.”
The proposed doctoral program will join similar degree programs coming up across the country
as the emerging field continues to define its core intellectual thrusts and its academic community.
The nascent field of Data Science spans mathematical models, computational methods and
analysis tools for navigating and understanding data in a broad range of application domains.
The scientific community in the area is accordingly drawn from many different existing
disciplines, driven in the near term by the immediate demand and limited success of applying
data science methods and tools in application areas such as information technology,
communications, and financial markets. These early successes have led to a demand for data
scientists in a whole range of industries from drug discovery to healthcare management, from
manufacturing to enterprise business processes as well as government organizations with the
expectation to do “data-driven” tasks such as the ability to create mathematical models of data,
identify trends and patterns using suitable algorithms and present the results in an effective
manner. However, there is also a growing realization that scientific knowledge is not enough for
data scientists who must also demonstrate awareness of ethical responsibilities in their work and
outcomes.
The goal of the doctoral program is to teach students the knowledge, skills and awareness
required to perform data-driven tasks and, using this shared background, to lay the foundation for
research that expands the boundaries of knowledge in Data Science. To achieve these goals, the
graduate program is structured as a set of three key requirements related to coursework,
examinations and dissertation compliance. The course preparation consists of breadth and depth
requirements of 48 units taken for letter grade and 4 units of satisfactory completion of
professional preparation courses. After a required preliminary advisory assessment at the end of
the first year, the examination requirements consist of a research qualifying examination and
a dissertation defense examination. The dissertation compliance requirement is an approved thesis
document that specifically meets reproducibility requirements. The implementation plan is
designed to open the program for internal transfers in Fall 2021 with a formal announcement and
new admissions starting Fall 2022.

Ph.D. in Data Science, November 30, 2020 Version 4.1 2 | Page


Table of Contents
Executive Summary
1 Introduction
Program Goals
Historical Development of the Field
Rationale and Justification
Timetable for Development of the Program
Relationship of the Proposed Program to Existing Program on Campus
Contributions to Diversity
Relationship of the Proposed Program with Other UC Campuses
Program Administration and Resource Planning
Plan for Evaluation of the Program
2 Program
Undergraduate Preparation for Admission into the Doctoral Program
Admission Requirements
Foreign Language
Overview of the Proposed Doctoral Program
Plan of Study
Unit Requirements
Structure of the Proposed Graduate Program
Course Numbering Schema
Required and Recommended Courses
Program Structure
Student Advising Information
Field Examinations
Qualifying Examinations
Preliminary Assessment
Research Qualifying Examination, or the UQE
Thesis Requirements
Generalization and Reproducibility Requirements
Final Examination
Explanation of Special Requirements Over and Above Graduate Division Minimum Requirements
Generalizability and Reproducibility Requirements
Rotation Training Program
Relationship of Master’s and Doctoral Programs
Special Preparation for Careers in Teaching
3 Projected Need
Student Demand for the Program
Opportunities for Placement of Graduates
Importance to the Discipline
Ways in which the program will meet the needs of the society
Relationship of the program to research and/or professional interests of the faculty
Program Differentiation
4 Faculty
5. Courses
6. Resource Requirements
7. Graduate Student Support
8. Governance
9. Changes in Senate Regulations
Appendix A: Listing of Research Areas
Appendix B: Letters of Support Solicited
Appendix C: Catalogue Copy Description [Draft]
Data Science (DSC)
The Graduate Program
Data Science Program
Course Requirements
Preliminary Advisory Assessment
Research Qualifying Examination (UQE)
Dissertation Defense Examination
Student with Disabilities
Appendix D: HDSI Bylaws
Appendix E: Faculty Vitae

1. Introduction
1. Program Goals
The goal of the doctoral program is to train graduate students who will advance the field of Data
Science. The doctoral program is an integral part of a greater mandate of the Halıcıoğlu Data
Science Institute (HDSI) to create a community and ecosystem of self-identifying data science
researchers and practitioners. The resulting discipline of Data Science consists of a community of
recognized researchers and a common interest in questions such as: What problems are
considered important? What solution methods are considered legitimate, valid or useful? What
verification regimes are considered essential to assess a proposed solution?

The nascent discipline of Data Science currently draws researchers from diverse fields that share
a quantitative intellectual tradition, from mathematics, computer science, engineering, physical
sciences, and quantitative social sciences. Naturally, the proposed program builds upon
intellectual traditions and training in disciplines that currently constitute the primary drivers of
knowledge advances in Data Science: computer science, mathematics, statistics, and electrical
engineering. While these are the disciplines from which the majority (but not all) of the current
HDSI faculty are drawn, we are keenly aware of the importance of various use cases that are
principally driving adoption of data science advances in practice including engineering,
medicine, governance, journalism, and archeological discovery. More importantly, beyond use
cases, scholarship in Data Science must also demonstrate awareness of ethical responsibilities for
the direct role data has in our social, cultural and personal lives.

This places a significant academic challenge on the HDSI faculty to create a program with a
well-defined intellectual core that invites and cultivates diverse intellectual traditions. Such a
program cannot simply be a collection of diverse existing topics or multiple courses and degrees
in sciences and humanities stacked on an individual, or a specialization of an existing program.
Instead, a streamlined and integrated approach to curriculum is needed that is accessible to
students drawn from different undergraduate degree backgrounds. The HDSI faculty have
addressed the challenge of program accessibility in their Master of Science (MS) program recently
approved by the Graduate Council of the Academic Senate. Building upon the MS/DS program,
the doctoral program is structured to cultivate both a generalist’s penchant for persistence in
results validated by proofs, and robust experimentation as well as a specialist’s view of practical
impact validated by real-world demonstrations, user studies and trials.1

1 David Epstein in “Range: Why generalists triumph in a specialized world” (Riverhead Books, 2019) on
the importance of broad thinking and diverse experiences.

To achieve these goals we outline here a broad and equitably-accessible program of graduate
study that clearly articulates core knowledge and skills to be expected from our graduates while
ensuring success of students drawn from diverse educational backgrounds. We seek to achieve
these via well-articulated pathways through existing and new courses as well as on-ramp courses
with financial support necessary to ensure a diverse talent pool consistent with aspirational goals
of UC San Diego as an academic institution.

The educational objectives of the proposed degree program include knowledge and skills
expected of all Data Science graduates namely: (a) collect raw data from various sources and
convert this raw data into a curated form suitable for computational modeling and analysis (e.g.,
its use in designing experiments); (b) understand learning algorithms and how to appropriately
use them in a given domain by developing effective optimization methods; (c) interpret the
results of these algorithms and iteratively drill down into the data, perform analysis, visualize
results and carry out scientific enquiry appropriate for the targeted domains.2 These educational
goals are to be achieved through required courses structured into breadth, depth and elective
groups. A successful completion of the doctoral degree program will require a demonstrated
advance in the state-of-the-art in data science evidenced through traditional means of academic
research success: peer reviewed publications, software (tools) or system artifacts and evidence of
generalizability and reproducibility documented in a well-written and approved dissertation
document. These requirements are discussed in detail in Section 2.10.

A successful execution of the proposed program also induces imperatives and resource
commitments by the Institute that are discussed later in this document. These include effective
partnership with academic units, institutes and centers for maximum exposure of potential
domain experts to graduate student training including any rotation programs, necessary teaching
capacity by HDSI faculty for timely graduation of students, and essential advising and
counseling services for the students to appropriately guide them towards graduation and into
post-graduation careers.

2 These conclusions have been arrived at through discussions within the HDSI faculty council informed by
national debate on this subject organized in a series of meetings by the National Academy of Sciences,
Division of Engineering and Physical Sciences under the “Roundtable on Data Science Post-Secondary
Education”, 2016-17.

2. Historical Development of the Field

Scientific and engineering advances have given us a better understanding of the physical world,
material and structural properties and their use in accomplishing primarily physical work more
effectively and efficiently. These advances through new measurements and models have also
given us insights into the living world and the processes of life, mind and the intellect. In doing
so, these advances are incorporating human knowledge from social sciences, humanities, natural
and life sciences into a greater understanding of us and the world around us.

Data, collected or synthesized, is the primary means of such knowledge exploration and
integration. Historically, data analysis has been a domain of Statistics, a field that has reached
across multiple centuries. The growth of scientific enquiry especially through the eighteenth
century post-Napoleon period of quantitative scientific discovery relied upon calculus and
probability to understand measurement data. Statistical analysis spread widely to many areas of
human enquiry, in particular to areas of the social sciences such as economics and clinical psychology.
These efforts contributed to significant growth in statistics.

Statistics departments are now common in most universities. At UC San Diego, while there is no
department of Statistics, statistics faculty are part of the Department of Mathematics and HDSI
on the General Campus, and also in the Division of Biostatistics in the School of Medicine. As
Statistics matured with strong foundational results and practical methods influencing many
application domains, more recent advances in computing hardware, software, engineering of
sensory devices, etc. enabled not only volumetric growth in data but also in computational means
to handle such data. Recent advances in algorithmic processing and machine learning have
significantly advanced computational means for data processing. Early efforts in defining
computational means of handling large data sets and streams placed a new field of Data Science
at the intersection of statistics and computer science3 while others characterized it as a growth
area of Statistics with strong applications focus.4,5

While strong footprints of Computer Science, Mathematics and Statistics can be seen in its
origins, Data Science has emerged as a discipline in its own right to define either the core
problems of the sciences and society, or fundamental theories and underlying methods and tools
to solve these problems. Many of these problems concern reasoning, spanning intellect and
knowledge domains that are assisted by computing machines, thus referred to as machine
intelligence or artificial intelligence (AI). An independent and rich tradition in signal processing,
information theory, detection and estimation theory from Electrical Engineering has contributed
significantly to modern automation methods in AI. While AI has caught the imagination of
computer scientists and mathematicians since the early years of computing machines nearly half
a century ago, technological advances have only recently made it possible to realize answers to
some of the pressing questions such as:

● How do we automate routine tasks without violating human autonomy of thought or
conduct?
● How can we incorporate machine intelligence into decision processes that are currently
purely human, and thus transition from purely human decision making to combined
human/machine decision making?

3 David Blei, Padhraic Smyth, “Science and Data Science”, PNAS, August 2017.
4 Bin Yu, “Let us own Data Science”, IMS Presidential Address, October 2014.
5 David Donoho, “50 Years of Data Science,” Journal of Computational and Graphical Statistics, Dec 2017.

As we are beginning to provide answers to such questions, typically in the form of new software
and systems in various application domains such as improved automated diagnostics from
radiological images, we are beginning to face an entirely new set of sophisticated questions such
as:

● Anticipatory Awareness: How do we integrate algorithmic decision making into political,
social, and economic institutions in a way that anticipates how the algorithm itself might
change the incentives/behavior of individuals and cause negative and positive feedback
loops?

● Artificial Sentience: What are the ethical, moral and business considerations when an
algorithm learns by observations and produces new products and services? Who are the
ultimate beneficiaries of these intellectual or material products: for instance in a
healthcare setting, is it the patient or the doctor being observed, the business creating
new services or the machine itself?

The list of such questions is long and touches nearly every area of human
enquiry.6 As an academic institution, fortunately, our focus is limited to how knowledge advances
in the emerging domain will be achieved and how we will create a talent pool for the emerging
area. As mentioned earlier, the academic areas that have made the most early advances in
methods, tools and systems to perform such data analysis are statistics and computer science,
especially in the context of understanding brain and cognition. Such talent is typically found in
the departments of electrical engineering and computer science, cognitive science, as well as
mathematics and statistics in the natural sciences.

New sensing, data-collection and computing devices have also brought together these domains,
enabling practitioners to relax assumptions about the nature of the process that generates the data
and use real-life datasets instead. Thus, the analysis methods can be directly interfaced with
real-life systems to actually capture and analyze real data (and sometimes in real time as well).
Without the axioms underlying data generation processes, the mathematics and statistics required
to arrive at robust answers analytically become exceedingly complex. It is also precisely in these
circumstances where computing steps in and provides us computational models and solutions
that can deliver practical answers. Yet neither mathematical analysis nor computational models
can provide generic answers as problem-solving methods (and tools) that individual
applications can use, because in the absence of a broadly applicable axiomatic framework, the
very success of these methods and tools depends upon the structure, dynamics and meaning of
the actual data. In this environment, to make real progress and demonstrate impact, it is essential
to work closely with the scientists, engineers and social scientists -- the “domain experts” --
before the real problems can be understood and articulated and solutions devised to make an impact.

6 Recently, a number of attempts to define Grand Challenge problems in the Data Science area have
been made. Prominent among these are essays by Jeannette Wing, Bin Yu, and Xuming He & Xihong
Lin. https://hdsr.mitpress.mit.edu/pub/d9j96ne4/release/2

In this context, UC San Diego provides a rich tapestry of domain experts starting with perhaps
one of the most complex of application domains – the human brain and mind – and spread across
the triumvirate of general campus, health and marine sciences. Over the past four years the
Institute leadership has engaged deeply with a broad community of nearly 500 researchers across
the campus through many meetings in small group settings. These efforts yielded a core group of
founding faculty who came together and organized their ideas in data science. There are now
over two hundred faculty affiliated with HDSI drawn from all schools and divisions, and nearly
all departments who participate in various HDSI events, including its weekly Friday seminars
during the academic year. HDSI affiliates are organized into six research clusters shown below
and on the HDSI website under Research:

● Data Science Theory: Researchers in this cluster work on theoretical foundations of
Data Science, design machine and statistical learning algorithms with provable
guarantees, develop methods and tools for the practitioners that are broadly useful in
combating the “deluge” of data caused by ever growing sources of data. Researchers
with core expertise in algorithms, mathematics, and statistics work with domain experts
in areas where there is a perceived benefit to collecting large amounts of data. The
constant interplay between the particulars of a domain and generality of methods is
essential to the advances we seek in algorithmic data sciences.
[https://datascience.ucsd.edu/research/theory-cluster/]
● Enabling Discovery: Researchers in this group are drawn together from the ongoing
Center for Computational Mathematics that administers the campus-wide Computational
Science, Mathematics and Engineering (CSME) graduate program. With the rise of big
data, the CSME area has evolved into Data-enabled Computational Science that seeks to
advance and make available integrated approaches to massively parallel computation –
from architectures to algorithms — as building blocks for scientists and engineers.
[https://datascience.ucsd.edu/research/discovery-cluster/]
● Education and Curricula Design: The goal of the Education cluster is to enable
training of students in methods and tools of Data Science regardless of their majors or
degree program. We seek to enhance skills of our graduates in experimental design,
hypothesis testing and data analysis by offering courses — online and in person — that
provide opportunities for significant hands-on learning experience.
[https://datascience.ucsd.edu/research/education-cluster/]
● Quality of Life: The researchers in this cluster span areas of health sciences that rely on
large data sets such as precision health imaging, pharmaceutical data sciences, cancer
cytogenomics and immunogenomics, cancer biology, and medical and population
genomics. [https://datascience.ucsd.edu/research/life-cluster/]
● Cross-Cutting Areas and Systems: Researchers in this cluster address infrastructural
needs of cloud computing, telecommunication networks, data-driven system design, data
visualization, scientific workflows, data science in art.
[https://datascience.ucsd.edu/research/systems-cluster/]
● Data Science in Society: This focus group aims to develop advanced geospatial tools
and research methods for scalable analysis of satellite data. The group will develop
workflows that combine machine learning, remote sensing and crowdsourcing tools to
map our changing world and to address many of the world’s greatest challenges. The
group will identify multidisciplinary research domains that would utilize remotely
sensed data to address one or more of UCSD’s research themes.
[https://datascience.ucsd.edu/research/society-cluster/]

Each cluster consists of a number of interested groups where the researchers and practitioners
come together for joint research efforts in response to various research funding opportunities,
engagements with the industry, etc. HDSI provides personnel and material support to the entire
community for both proposal preparation as well as industry engagements. A complete list of 44
different research areas covered by the HDSI affiliates is available on the website.

Over the past two years, HDSI has recruited core faculty members and drawn in a number
of existing faculty through partial appointments to build an active governing body, the Faculty
Council. As of this writing the faculty council consists of 11 full-time faculty members, 13
partially-appointed faculty members and 24 formally appointed faculty with no teaching
responsibilities at HDSI (i.e., so-called 0% appointments). A complete list of faculty
members and their specializations is provided in Section 1.8.

Over the past year, the HDSI faculty council has worked to identify broader research
challenges that are central to Data Science as a field. A compilation of these efforts distilled the
core areas of research into the following eight research themes, listed below, which form the scope
of the HDSI academic programs and continue to drive our faculty recruiting strategy.

Core Theme: Brief Description

1. Artificial Intelligence: AI is about automating decision processes and augmenting,
complementing or amplifying the means of decision making. Challenging problem areas are related
to finding and learning good representations of knowledge with connections to human cognition.

2. Machine Learning: Computing machines, from neuromorphic to cloud processing, have energized
algorithms and CS theory that enable computing systems to learn from data and create examples and
counter-examples. ML covers various learning methods (reinforcement, transfer,
resource-constrained, deep learning), architectural acceleration and algorithms for game
theoretic setups, natural language processing, etc.

3. Data Infrastructure: This theme covers the entire gamut of machines and systems that enable us
to curate, organize, visualize and navigate large datasets and identify structure in such data,
as well as the design, deployment and security of systems and their software stack, including new
programming paradigms, languages and methods.

4. Mathematical Foundations: Mathematical foundations span areas of probability theory,
statistics and applied mathematics that are used to address current and pressing challenges of
data science such as causal inference, non-parametric data analysis, compressed sensing, multiple
hypothesis testing and submodular optimization.

5. Digital Humanities: Digital humanities is an umbrella term that in the UCSD context spans the
social sciences, arts and humanities. The research topics include privacy, public policy, ethics,
computational social science, computational linguistics, and philosophy of information.

6. Systems and Applications: A large and heterogeneous group of topics spanning algorithms and
demonstrable systems and their use in specific domains, from medical signal processing, economics
and geospatial data systems to political and economic systems, as well as applications in
cyber-physical systems and robotics.

7. Scientific Discovery: Data-driven scientific advances through new instruments and analytics,
especially for high-throughput biology and chemistry, that enable us to understand the natural
world as well as life processes.

8. Healthcare/Medicine: Research in this theme focuses on how data can be harnessed to improve
human health and well-being. The goal is to develop technical contributions (i.e., theory,
methods, hardware, software) to drive progress in areas such as clinical decision support,
precision medicine, clinical trial design, medical image processing, pharmaceutical development,
bioinformatics, and genomics.
(https://www.mlforhc.org/, https://www.nature.com/npjdigitalmed/)

3. Rationale and Justification


There are three key reasons for offering the new degree program, each of which is ultimately
tied to the mission of HDSI to be the hub for data science at UC San Diego: (a) to develop the
nascent discipline of data science; (b) to meet the demand for specialists in data science areas by
academia and industry; and (c) to catalyze data science research by building an ecosystem of
partnerships in research and teaching across UC San Diego. As a core subject area of the
Halıcıoğlu Data Science Institute, the Data Science doctoral program is a natural evolution of the
undergraduate and (pending) master’s programs in Data Science and a key vehicle for the
academic research conducted by its full and jointly-appointed faculty members. Besides HDSI being
the academic home for Data Science, a practical reason for launching this program now
is to satisfy the growing demand for a graduate program in data science, both
internally and externally. Launched in 2016, the undergraduate Data Science major is
already the 8th largest major at UC San Diego, on par with Electrical Engineering, despite being
under an enrollment cap. In a survey of our students graduating from the Data Science major, roughly a
third indicated interest in a graduate degree in Data Science. In 2019,
nearly half (1571) of the more than 3000 applicants to various graduate degree programs at UC San
Diego who indicated interest in Machine Learning and Artificial Intelligence related topics in
Electrical Engineering, Computer Science and Cognitive Sciences directly indicated their
interest in Data Science programs at HDSI.

In short, the demand for doctoral training in Data Science is high and continues to grow with the
need for future leaders in Data Science both in research and education. Further, the program not
only serves the HDSI mission of educating talent in the area of Data Science, but also serves as a
vehicle for continued engagement and proliferation of Data Science training across various
graduate programs through foundational, core and elective course offerings that engage domain
experts into the field of Data Science (See Section 1.5 on HDSI Strategy for Partnership with
Other Academic Units). The program provides an excellent means to create new educational
opportunities for students, especially for underserved and economically-disadvantaged student
populations who can benefit from graduate scholarships offered by HDSI as a part of its core
foundation-supported activities.

4. Timetable for Development of the Program


The HDSI faculty council kicked off discussion on the graduate program in Summer 2019 with a
formal presentation hosted by Yian Ma and Gal Mishne at the two-day Faculty Retreat on
September 16-17, 2019. The faculty resolved to proceed with the plans for a Ph.D. program and
formed a faculty committee for defining and developing the PhD program in Fall 2019 headed
by Professor Gal Mishne. The discussions matured into a proposal that was revised in view of
the Graduate Council feedback on a concurrent proposal by HDSI for a Master’s program in
Data Science. The two proposals were closely coordinated due to the graduate courses (being
offered and planned for future) that are common to both degree programs.

We plan a two-phase launch of the Ph.D. program with internal transfers (from other degree
programs) beginning Fall 2021 followed by a formal announcement and launch beginning Fall
2022 with a general admission deadline of January 15, 2022. Initial enrollment is estimated to be
5-10 students with approximately 10-15 new students per year that will ramp to 20-25 new
students per year at the steady-state for an average of 4-5 students per faculty FTE. We plan to
conduct a review of the program and its outcomes after the first three years of operation as a part
of our re-assessment of capacity and any enrollment changes.

Needs assessment and faculty discussions in HDSI: Summer’19-Winter’20
Administrative Review and Routing: Fall 2020 (October 2020)
Proposal submitted for UCSD Graduate Council, Reviewed by mid-December 2021: November 30, 2020
Revised proposal submitted: January 6, 2021
UCSD proposal submission to CCGA: March 2021
CCGA approval received: Early Spring 2021 (early May)
UCOP approval received: Spring 2021 (June)
Program open for admissions (internal transfers only): Summer 2021 (July)
Program announced for new admissions: Early Fall 2021 (Application Deadline Jan 15, 2022)
Admission of first class announced: Spring 2022 (April)
Orientation and student advising: Summer 2022
Program offered and courses begin: Fall 2022

5. Relationship of the Proposed Program to Existing Program on Campus
As part of a transdisciplinary field, Data Science courses necessarily intersect with programs in computer
science, electrical engineering, mathematics and cognitive sciences which are among the closest and
founding partners of the HDSI. As attested in letters from chairs of these departments and divisional
deans, these intersections discussed below are seen by the departments and HDSI faculty as a strength that
makes the proposed degree program unique in bringing together the very best UCSD has to offer
educationally. In what follows, we first briefly describe HDSI’s approach to building partnerships with
other academic units on campus and our approach to pedagogy in a resource-optimal way.

How do we partner with other academic units? The more than 200 founding faculty in the HDSI
Faculty Affiliate program are a starting point for deeper engagement with HDSI and its
governance. HDSI's relationships with other academic units are governed by jointly appointed
faculty members on the HDSI Faculty Council with specific roles in keeping campus units
apprised of HDSI plans and progress through its weekly meetings. HDSI programs are overseen
by standing committees of the HDSI Faculty Council. Since before any of our core full-time or
part-time HDSI faculty were hired, it has been the role of the HDSI Faculty Council to define and
develop both the Data Science curriculum and research directions. The HDSI Faculty Council now
consists of faculty drawn from all across UC San Diego: the home departments of Council
members are in Engineering (e.g., Computer Science, Electrical Engineering), Physical
Sciences (e.g., Mathematics, Physics), Arts & Humanities (Philosophy, Visual Arts), Social
Sciences (e.g., Cognitive Science, Communication, Political Science), Medicine (e.g.,
Biostatistics and Bioinformatics, Radiology, Pediatrics), the Scripps Institution of
Oceanography, and the Supercomputer Center. With such a diverse background to draw upon,
the HDSI Faculty Council has managed to create a unifying vision for Data Science, and to steer
the Institute towards a future that is based on interdisciplinary collaboration with all units on
Campus.

Following senate regulations, the Faculty Council has developed a detailed set of Bylaws
[attached with the proposal] to facilitate the governance and growth of HDSI as an academic
unit. The HDSI Faculty Council remains open to new faculty interested in joining HDSI via a
well-defined review, advise and consent process. Using this governance structure, HDSI has
successfully conducted six joint searches in 2019, and four in 2020. The Faculty Council
currently consists of 48 faculty members:

● 11 faculty members with 100% appointments in HDSI (2 Full Professors, 1 Associate Professor,
and 8 Assistant Professors)
● 13 faculty members with joint appointments with another department (Communication,
Computer Science and Engineering, Neurobiology, Bioengineering, Mathematics,
Political Science, Biostatistics, Philosophy)
● 24 faculty members with current or proposed 0% appointments in HDSI. These are
among the original faculty council members who have guided recruiting. All of them
will eventually transition to 0% appointments with HDSI as the ongoing process
completes.

We note that in its proposed three-year hiring plan, HDSI has requested the largest number of
joint searches among all divisions on the general campus. Partners of HDSI can be found in
almost all units on campus.

Program Engagements: The proposed doctoral program lists a number of new core courses that
are also part of our MS program and designed to be broadly accessible. Some of the proposed
courses will be taught by faculty in other departments, and thereby cross-listed with courses
outside HDSI. When and if relevant courses are available in other UCSD departments, students
will be encouraged to enroll in them. Notably, HDSI has partnered with the Computer Science
and Engineering Department to create an Online M.S. program that was recently approved by
the Graduate Council, and that will serve as an excellent on-ramp preparation for a few selected
students into the Ph.D. program. Indeed, HDSI currently offers scholarships to students that
cover the cost of attending online courses taught by HDSI-affiliated faculty.

Among the related programs that partially cover some of the topical areas of the PhD/DS
program are the doctoral programs in Computer Science by CSE, Electrical Engineering by ECE,
and Statistics by Mathematics. More precisely, there are specializations of these programs that
feature elective courses on machine learning, statistical learning and inference. The proposed
program is directly and entirely dedicated to Data Science and differs from existing programs in two
material ways: in the breadth of the student population it serves and in the scope of the
transdisciplinary training it provides as discussed below:

(a) The proposed degree is targeted to students drawn from a wide variety of backgrounds in
their undergraduate education in an effort to serve a diverse group of learners interested in Data
Science. This is in contrast to existing programs that either target a different population of
students or focus on subject areas specific to their domain. For instance, the doctoral program in
Computer Science aims to admit students “with a strong academic background in computer
science and engineering and/or a related field.” Students in that PhD program must select four
courses from ten different breadth areas that include Artificial Intelligence and Robotics.
Similarly, the Machine Learning and Data Science (ML/DS) specialization in the ECE Ph.D.
program is one of 13 specializations, and one of the three “impacted” programs, that is, capacity
controlled areas along with Circuits and Robotics that are restricted to ECE students with a
required Bachelor’s (and optionally MS) degree in Engineering, Sciences or Mathematics.
Mentored by notable researchers in the areas of information and coding theory, statistical signal
processing, robotics and controls, the doctoral specialization provides deep insights into
intellectual underpinnings for data analytics and machine learning and its various application
domains. Students in the Ph.D. program are required to complete 48 units of course work structured
into three sets of courses that cover basic knowledge of programming, linear algebra, probability
and statistics, a set of required courses and another set of technical electives. The nature of these
courses as well as CSE courses, and their coordination with HDSI courses are discussed further
below. Among other programs, the department of mathematics offers Ph.D. degrees in
mathematics with specialization either in Computational Science (CSME) or Statistics.
Admission to the Ph.D. programs in Mathematics requires a B.S. degree in Mathematics or a
strong background in mathematics with demonstrated completion of a full sequence of courses in
calculus, differential equations, linear algebra and a year’s sequence in both abstract algebra and
real analysis. Both specializations require completion of 48 units of courses in the core curriculum
in mathematics and 24 units of specialization in topics related to analysis, probability and
statistics, numerical optimization, and applications of statistical methods to Bioinformatics.
Students are required to pass two written qualifying examinations; typical choices are
Mathematical Statistics and Real Analysis. Finally, the Division of Biostatistics and
Bioinformatics of the department of Family Medicine and Public Health (FMPH) in the Herbert
Wertheim School of Public Health offers a PhD degree in Biostatistics drawing upon courses and
instructors from the departments of Mathematics, Computer Science and FMPH. The program
requires 68 units, a vast majority of which are required in Mathematical Statistics, Biostatistical
Methods, and life science applications with one elective course drawn from Biostatistics, CS or
Mathematics.

(b) The courses offered by the HDSI graduate programs will be available to the students in
related programs, and in fact, will be taught by faculty jointly appointed with other departments
representing domain knowledge. For instance, a faculty member jointly appointed with HDSI
and Bioengineering is planning to offer a graduate course in Biomedical Data Analysis. Such a
course will constitute a core requirement in the Bioengineering graduate program as well as an
HDSI cross-listed elective course for students with background and interest in biology and
engineering. Thus, by serving as a catalyst for creation of new courses and student support,
HDSI seeks to enhance the overall capacity of UC San Diego in serving educational and training
needs of a growing population of students whose interests extend beyond the offerings of the
existing programs.

Impact on Existing Programs: We do not anticipate any adverse material impact on existing
PhD programs in CSE, ECE, Mathematics or Cognitive Science. Their respective specializations
in Artificial Intelligence, Machine Learning/Data Science and Statistics are part of much larger
PhD programs among a dozen or so other specializations and are heavily oversubscribed by
students in their respective departments with class enrollments routinely over 200 students.

On the contrary, we expect a positive impact from increased participation of various academic
units in Data Science related subject areas (such as Computational Biology, Computational
Chemistry, Computational Social Science) that are currently inaccessible to graduate students
from other departments despite their need and demand, an assessment supported by a letter from
the Dean of the Division of Social Sciences. The proposed PhD/DS program expands the pool of
applicants by enabling students from diverse backgrounds such as Economics, Cognitive
Science, Biology, etc. to consider a research career in Data Analytics or its application to their
own domains.

There should also be a positive impact on certain specialization area courses offered by the
partner departments: some of the data science graduate students will increase enrollments in
these classes, most of which are cross-listed with DSC courses and/or taught by HDSI or
jointly appointed HDSI faculty in the Mathematics, CSE and ECE departments. We expect to see a spread
of data science graduate students across half a dozen or so available area specializations.

6. Contributions to Diversity
Our vision for how the proposed program will advance UC’s goals for diversity is informed by
the founding document “HDSI Strategy for Inclusive Excellence.” It is a living document
available online at https://bit.ly/HDSI-Diversity that will be updated with information available from
the Diversity Dashboard and our surveys.

The nascent nature of the HDSI organization provides us with additional flexibility to
incorporate Equity, Diversity and Inclusion (EDI) goals into the DNA of the new institution, that
is, to embed these goals in all our processes and actions ab initio. In particular, the three
core tenets of “access & success, climate and accountability” are vigorously pursued. The
Institute has taken three concrete steps towards EDI goals that will directly impact its programs
-- including the proposed graduate program -- in the coming years. First, HDSI faculty recruiting
is carefully planned, with anti-bias training required of the entire faculty council before any
faculty search is initiated. Second, every faculty member hired into HDSI has been provided with
$30K in support of EDI goals. These funds are held by the Institute and released for specific and
approved activities that advance diversity goals. The faculty members are encouraged to pool
these resources and seek additional matching resources from the Institute to launch substantive
programmatic activities.

Beyond this “access and success” part, the third element of HDSI strategy directly addresses
climate and accountability. HDSI proposed, and eventually succeeded in, using its endowment
resources to identify and recruit a full-time coordinator for broadening participation
(https://bit.ly/HDSI-BPC). While a permanent position has yet to be created, working
with the administration we have been able to recruit Saura Naderi, who has been tasked full-time
with developing measures, establishing metrics and directing the activities related to EDI goals [7].
The personnel and EDI share-pool mentioned earlier empower HDSI faculty to put into action
concrete plans -- including seminars, enrichment activities and additional counseling -- that are
implemented, tracked and accounted for. With increased direct attention to BPC (broadening
participation in computing) plans in research proposals to agencies such as the NSF, HDSI’s
broadening participation plan provides a sustainable institutional mechanism and support for the
HDSI community. Under the leadership of Saura Naderi, HDSI has created a DEI council for the
faculty and staff. The DEI council meets weekly, and has started to work on projects such as
understanding racial influences and bias. It has also created a number of projects that are in the
early stages of planning and execution. Prominent among these are the “Pathways to AI” outreach
program and an educational trial program in Chula Vista Middle School. These and other
projects are discussed and launched by the DEI council, which will be advising faculty on
promoting inclusion and equity. Under the stewardship of the DEI council, we will be creating
efforts in which faculty can participate using their funding.

[7] The https://datascience.ucsd.edu/about/dei/ website provides a starting point for engagement
with HDSI personnel who are dedicated to achieving EDI goals of the Institute.

Beyond the three elements of the HDSI EDI strategy mentioned earlier, the proposed doctoral
program will also provide an excellent vehicle for deploying our fellowship support to encourage
URM participation as well as making the program accessible for academically strong but
economically disadvantaged students to ensure the program provides an affordable pathway for a
broad and diverse student population. Among the programs we have already devised and
launched [8] are scholarships for graduate students (a commitment of $600K for the current year,
and likely to rise in coming years), as well as access to learning outside the classroom through
mechanisms such as EdX micromasters programs where the Institute offers financial support to
all students interested in taking on-ramp classes at their own pace. This on-ramping is a critical
element of our strategy to leverage the existing micromasters program. It enables students from
different backgrounds, who may otherwise be rejected from the doctoral program, to demonstrate
that they can do well in the program at low or no cost through our scholarship to undergraduate
students across the campus.

Beyond strategic decisions and choices to appoint personnel and devote resources, we are cultivating
a climate for faculty to conceive of new ideas that directly contribute to inclusive excellence.
Among the measures that we seek to improve are the participation rates of women and
underrepresented minorities in our classes and degree programs, retention rates, progress towards
graduation and placement results. Institutional role models and mentoring are key means that are
already implemented in the HDSI foundations and shall remain a cornerstone of HDSI faculty
recruiting and leadership advancement.

7. Relationship of the Proposed Program with Other UC Campuses


Given the historical development of the field discussed in Section 1.2, Data Science as a doctoral
subject area is often led by departments of Statistics or Operations Research on campuses where
such departments exist, in cooperation with Computer Science or EECS departments. UC San
Diego’s Ph.D. program in Data Science will be the first such program on a UC campus joining
similar programs nationally.

[8] Please see pages 60-61 of our annual report at http://bit.ly/HDSIfirstyear.

To meet the growing scope and demand for data science, a number of UC campuses are offering
or planning to offer undergraduate and graduate programs in data science while at the same time
building new academic departments and schools of data science to support the nascent field.
Prominent among these is UC Berkeley’s Division of Computing, Data Science and Society
(CDSS) consisting of the departments of Statistics, Electrical Engineering and Computer Science
(which is jointly part of Engineering and CDSS), and the School of Information. CDSS at UC
Berkeley provides specialization in the form of a “Designated Emphasis in Computational and
Data Science and Engineering” to existing Ph.D. programs through curriculum specialization in
the individual Ph.D. programs.

At UC Davis and UC Irvine, the departments of Statistics have taken leadership in data science
as a part of their existing degree programs in Statistics, especially due to organizational
structures of the underlying departments (such as Statistics being part of a School of Information
and Computer Science at UCI). While the current focus in these emerging academic units is on
bachelor’s and master’s programs, new doctoral programs and specializations are beginning to
appear.

To a first order, none of the specializations of existing Ph.D. degree programs on other campuses
prepare students for a research career in Data Science, an important objective of HDSI’s
proposed doctoral program. More importantly, no program provides the wide accessibility to
students from diverse educational backgrounds to Data Science and its applications. Indeed, the
structure of the proposed graduate program, consisting of foundation and core courses, makes it
possible for HDSI to offer a single program that admits students with training as broad as
engineering, sciences, social sciences, business and humanities. The program produces graduates
with a degree in Data Science and well-defined research specializations that depend upon the
application domain of the dissertation research, making it easier for them to pursue different
career paths in targeted domains.

Nationally, New York University (NYU) has offered a Ph.D. in Data Science since 2017 with five
required courses in programming, probability and statistics, big data information and
representation and nine elective courses drawn from the data science areas of machine learning,
artificial intelligence and statistics. Columbia University offers a data science specialization within its
Computer Science, Electrical Engineering, Industrial Engineering & Operations Research and
Statistics doctoral programs. Among other notable national programs are Yale University’s
Statistics and Data Science program, PhD in Data Science and Operations offered by the
Marshall School of Business at the University of Southern California, and PhD in Statistics and
Machine Learning offered by the Department of Statistics and Machine Learning at Carnegie
Mellon University.

8. Program Administration and Resource Planning


The proposed program will be offered by the Halıcıoğlu Data Science Institute (HDSI),
established as an academic unit by the UC Academic Senate in June 2018 under a divisional
budget model on the UC San Diego campus. The Institute also carries a $75M founding
endowment with an annual payout that is expressly dedicated to supporting the mission of HDSI in
training and preparing Data Science talent through the Institute's activities and programs. Due to
these founding commitments, the proposal does not require a separate new infusion of campus
resources.

The Institute faculty, and members of its faculty council, are listed below with their annual
teaching workload in HDSI. The Graduate Admissions and Graduate Program are among the
standing committees of the HDSI Faculty Council. These committees are supported by a
full-time academic coordinator as well as an assistant director of training programs to ensure
program operation and academic advising of the graduate students.

Faculty Members with Teaching Responsibilities in HDSI Programs

Faculty Names and Title                        Appointments                                Nominal Teaching Workload in Data Science

Full Time Faculty in HDSI (11)
Mikhail Belkin, Professor                      HDSI                                        3 courses
Justin Eldridge, Assistant Teaching Professor  HDSI                                        6 courses
Aaron Fraenkel, Assistant Teaching Professor   HDSI                                        6 courses
Yian Ma, Assistant Professor                   HDSI                                        3 courses
Arya Mazumdar, Associate Professor             HDSI                                        3 courses
Gal Mishne, Assistant Professor                HDSI                                        3 courses
Yusu Wang, Professor                           HDSI                                        3 courses
Babak Salimi, Assistant Professor              HDSI                                        3 courses
Zhiting Hu, Assistant Professor                HDSI                                        3 courses
Berk Ustun, Assistant Professor                HDSI                                        3 courses
Lily Weng, Assistant Professor                 HDSI                                        3 courses

Joint Faculty in HDSI (13)
R. Stuart Geiger, Assistant Professor          Communication & HDSI                        1.5 courses
David Danks, Professor                         HDSI and Philosophy                         1.5 courses
Mikio Aoi, Assistant Professor                 HDSI & Neurobiology                         1.5 courses
Jingbo Shang, Assistant Professor              CSE & HDSI                                  1.5 courses
Benjamin Smarr, Assistant Professor            Bioengineering & HDSI                       1.5 courses
Barna Saha, Associate Professor                CSE & HDSI                                  1 course
Arun Kumar, Assistant Professor                CSE & HDSI                                  1 course
Yoav Freund, Professor                         CSE & HDSI                                  1 course
Jelena Bradic, Associate Professor             Mathematics & HDSI                          1 course
Rayan Saab, Associate Professor                Mathematics & HDSI                          1 course
Alex Cloninger, Assistant Professor            Mathematics & HDSI                          1 course
Margaret Roberts, Associate Professor          Political Science & HDSI                    1 course
Armin Schwartzman, Professor                   Biostatistics & HDSI                        1 course

Fellows & Others (5)
Bradley Voytek, Associate Professor            HDSI Fellow, Cognitive Science              2 cross-listed courses
Ilkay Altintas, Chief Data Scientist, SDSC     HDSI Fellow, SDSC
Virginia De Sa, Professor                      HDSI Associate Director, Cognitive Science  1 cross-listed course
Dimitris Politis, Distinguished Professor      HDSI Associate Director, Mathematics        1 cross-listed course
Rajesh K. Gupta, Distinguished Professor       HDSI Director, CSE                          1 cross-listed course

In addition to the 29 faculty members listed above (15.5 FTE), the Institute is also planning to
fill one teaching faculty (LSOE) and one advancing faculty diversity (AFD) position in the
current recruiting season. It anticipates that an additional 1-2 new faculty members will join the Institute
for a total faculty strength of 16-17 FTE including 3 FTE LPSOE and 8 U18 lecturers.

Together, these provide a capacity of 51-52 courses annually by the current ladder-rank faculty in
addition to 5 cross-listed courses as well as teaching by U18 continuing lecturers for a combined
total annual capacity of 58-74 courses. The current Data Science undergraduate program
accounts for 35 courses/sections per year. Conservatively, the Institute has the capacity to offer
6-10 graduate courses per quarter, which enables it to adequately serve the proposed doctoral
program.
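
As a rough cross-check of the capacity figures above, the sketch below subtracts the 35 undergraduate courses/sections from the 58-74 combined annual capacity and spreads the remainder over three academic quarters. The even three-quarter split and the function name are illustrative assumptions, not part of the Institute's planning model.

```python
# Back-of-the-envelope check of the capacity figures quoted in the text.
# Assumption (not from the proposal): the capacity left after the
# undergraduate program is spread evenly over three academic quarters.

def graduate_capacity_per_quarter(total_annual, undergrad_annual=35, quarters=3):
    """Return the courses per quarter left after the undergraduate program."""
    return (total_annual - undergrad_annual) / quarters

low = graduate_capacity_per_quarter(58)   # lower bound of combined annual capacity
high = graduate_capacity_per_quarter(74)  # upper bound of combined annual capacity
print(f"Remaining capacity: {low:.1f} to {high:.1f} courses per quarter")
```

Under these assumptions the remainder is roughly 7.7 to 13 courses per quarter, so the stated 6-10 graduate courses per quarter sits below it, which fits its description as a conservative estimate.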

Financial Support for Doctoral Students in the program will follow the current campus support
model consisting of direct support from the Graduate Division (formerly block grant), Graduate
Student Research (GSR) support from grants and contracts, and teaching assistant (TA)
employment funding. The Graduate Division support typically corresponds to one year of
non-employment based support including a levelized tuition (for resident and non-resident
students). The total amount of this support is based on historical enrollment and a function of
overall contracts and grant activity and is a part of the annual budgeting process. The current TA
support provides for 8-10 TA FTEs who will be drawn from the graduate student pool in Data
Science.

To ease the transition process, for the first five years of the program launch, the Institute will set
aside 20% of our annual graduate student funding liability in the foundation accounts as a
contingency measure to ensure support continuity in the face of any short-term GSR funding
shortfall as the Institute faculty ramp up extramural funding support through competitive grants.
Thus, we are confident that using a combination of resources from the Graduate Division, GSR,
TA and our foundation accounts, we will be able to guarantee five years of funding to
every doctoral student in the program. In the steady state, we plan to deploy our fellowship
support to encourage URM participation as well as make the program accessible for
academically strong but economically disadvantaged students to ensure the program provides an
affordable pathway for a broad and diverse student population.

9. Plan for Evaluation of the Program
The doctoral program will be formally evaluated every 8 years, like all other UC San Diego
graduate programs and in a way that is consistent with senate regulations. This evaluation process will
include an external review and UC San Diego graduate council oversight. In addition, as
mentioned earlier, the HDSI faculty will perform a mid-flight review after three years focusing
on issues such as success in EDI-specific goals, the program's needs, and a comprehensive evaluation of
program placement outcomes. As with the formal evaluation, the internal review process, both
annual and mid-flight, will include student feedback and surveys, teaching evaluations, alumni
and industry feedback.

HDSI's existing faculty members have significant experience in building, launching and
directing graduate programs in the CSE, Bioengineering, Mathematics, Cognitive Science and
Biostatistics units. The academic coordinator position was specifically designed with a view
towards broadening participation, improving student learning experience and career placement
outcomes. The Institute is planning to be physically co-located in the same building with a
segment of the Teaching and Learning Commons (TLC) starting 2021. This colocation will
present us with additional opportunities for engagement with UC San Diego’s expertise in
improving learning experience and outcomes for all our students.

2. Program
1. Undergraduate Preparation for Admission into the Doctoral Program
The HDSI faculty have spent significant time discussing and formulating a plan that enables
maximum participation of interested students into the envisioned graduate program. Arguably,
this is the chief distinction of the campus-wide Data Science program. We are also keenly aware
of our primary obligation to ensure successful and timely completion of the graduate degree
program given the significant level of individual and institutional investment in terms of time
and resources. Balancing these requirements has required us to structure the incoming stream of
students into essentially three broad categories:

1. students who come with preparation in computing and/or information sciences at a level
to master algorithmic programming and cloud computing skills;

2. students with preparation in mathematics and statistics at a level to master probability and
statistical methodology necessary for meaningful data analysis;

3. students who enter the program from other areas of science that rely upon collecting and
analyzing observational or experimental data in order to advance scientific
understanding. These are students with a degree in natural sciences such as physics,
chemistry, biology, environmental sciences, etc. or coming from a social science
background such as communication, economics, political science, psychology, etc.
Application examples may be causal inference in economics, assessing statistical
significance of a pharmaceutical experiment or psychological treatment, the study of
social networks in political science, etc.

We note that these are broad and overlapping categories. Even when students come prepared in
both advanced computing and mathematics/statistics, Data Science research problems challenge
them to apply these skills meaningfully in diverse applications to advance knowledge.

The graduate admissions process will use text analysis methods to automatically sort and bin
admitted students into three pools and thus drive the subsequent advising process, including prior
communication to the students regarding their preparation options using online and other offerings
by UC San Diego and other organizations. HDSI student advising will recruit an advisor
dedicated to the graduate program advising and will develop pathways for newly admitted
students to take specific upper-level undergraduate courses from different areas, in order to
solidify their backgrounds when/if there is some perceived weakness.
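
The proposal does not specify which text analysis method will be used. As a purely illustrative sketch, the keyword-count approach below assigns an applicant's background description to one of the three pools described above. The keyword lists, pool labels, fallback label and function name are all hypothetical.

```python
# Toy keyword-based binning of applicant backgrounds into three pools.
# Everything here (keywords, pool labels, fallback) is a hypothetical
# illustration; the proposal does not describe the actual method.
import re

POOL_KEYWORDS = {
    "computing": {"programming", "algorithms", "software", "computing", "databases"},
    "math_stats": {"probability", "statistics", "analysis", "algebra", "calculus"},
    "domain_science": {"physics", "chemistry", "biology", "economics", "psychology"},
}

def assign_pool(background_text):
    """Return the pool whose keywords occur most often in the text."""
    words = re.findall(r"[a-z]+", background_text.lower())
    counts = {
        pool: sum(1 for word in words if word in keywords)
        for pool, keywords in POOL_KEYWORDS.items()
    }
    # max() keeps the first pool in dictionary order when counts are tied.
    best = max(counts, key=counts.get)
    # With no keyword hits at all, defer to a human reviewer.
    return best if counts[best] > 0 else "needs_manual_review"

print(assign_pool("BS in Biology with coursework in statistics and chemistry"))
```

Because the three categories are described as broad and overlapping, any real implementation would likely need to report more than one pool per student rather than a single best match.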

2. Admission Requirements
A Ph.D. degree in Data Science is an advanced degree that prepares students for leadership in
data science research in academia, industry or civic organizations. To be successful in this
program, the students must have a background in quantitative analysis typically seen in degree
programs with substantial mathematical preparation and programming skills. Course work or
equivalent experience in programming, calculus, probability and statistics is required.

Admissions requirements for the Ph.D. program are:

● Bachelor’s and/or Master’s degree in a quantitative field such as engineering, computer
science, mathematics, statistics, cognitive science, scientific disciplines or quantitative
social sciences such as economics or computational social science. Other degree options
are acceptable with demonstrated course work or experience in programming, calculus,
probability and statistics.
● Undergraduate GPA of at least 3.0 on a 4.0 scale
● College Transcripts
● Optional GRE requirements as per the latest guidance from the Graduate Division.
● Three letters of recommendation.
● Evidence of proficiency for international students: three English proficiency
examinations are accepted for graduate study at UC San Diego:
○ The Test of English as a Foreign Language (TOEFL): The minimum TOEFL
score for admission is 85 for the Internet Based Test and 64 for the Paper Based
Test. Please note the Paper Based Test does not have a speaking component.
TOEFL information and forms are available at the TOEFL website.
○ The International English Language Testing System (IELTS) Academic Training
exam: The minimum IELTS score is Band 7.0. IELTS registration information is
available on the IELTS website.
○ The Pearson Test of English Academic (PTE Academic). The minimum PTE
academic score required for graduate admission is overall score 65. Registration
and test information is available on the Pearson website.
● A statement of purpose that clearly outlines the motivation, background preparation, any
relevant work experience in data science related areas and topical interests for a degree in
Data Science. Prospective students would be asked to identify any faculty members that
they would like to seek as a research advisor.
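
The numeric minimums in the list above can be summarized in a short sketch. Only the cutoffs come from the text (GPA 3.0 on a 4.0 scale, TOEFL 85 Internet Based or 64 Paper Based, IELTS Band 7.0, PTE Academic 65). The field names, the function, and the reading that any one accepted examination suffices are assumptions, and the check says nothing about transcripts, letters or the statement of purpose.

```python
# Minimal sketch of the numeric admission minimums listed above.
# Field names and the function are hypothetical; only the cutoffs are
# taken from the text.

ENGLISH_MINIMUMS = {
    "toefl_ibt": 85,      # TOEFL Internet Based Test
    "toefl_paper": 64,    # TOEFL Paper Based Test
    "ielts": 7.0,         # IELTS band
    "pte_academic": 65,   # PTE Academic overall score
}

def meets_numeric_minimums(gpa, english_scores=None, international=False):
    """Check the GPA minimum and, for international applicants, one English test."""
    if gpa < 3.0:
        return False
    if not international:
        return True
    scores = english_scores or {}
    # Assumption: one accepted examination at or above its minimum is enough.
    return any(
        test in scores and scores[test] >= minimum
        for test, minimum in ENGLISH_MINIMUMS.items()
    )

print(meets_numeric_minimums(3.4, {"ielts": 7.5}, international=True))
```

A check like this would only screen for completeness; the remaining requirements are qualitative and need reading by the admissions committee.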

3. Foreign Language
A demonstrated proficiency in English is expected for international applicants. Foreign language
proficiency is not required for this degree.

4. Overview of the Proposed Doctoral Program


The Ph.D. program consists of the following components consistent with the regulation 715 of
the San Diego Division of the Academic Senate 9:

● Research rotation requirements to be completed by taking research rotation courses at


least in two laboratories in the first two quarters of Ph.D. program;

● Formal coursework requirements representing breadth and depth requirements
consisting of 48 units of courses structured in three groups: foundations, core and depth
areas; as well as 4 units of professional preparation including a 1-unit HDSI Faculty
Research Seminar, 2 units of TA/Tutor training and a 1-unit Research Skills course, to
be completed with a Satisfactory grade;

● Completing a preliminary advisory assessment, in a technical area of the student's choice,
conducted by a committee set by the Graduate Committee (GradCom). This examination is
to be completed before the start of the second year. Preliminary examinations will
normally be scheduled annually in the Spring quarter through Summer quarter of the first
year. The goal of the preliminary assessment is to assess student preparation in
background courses and identify any required courses consistent with the planned
research area. In rare cases, the assessment outcome may include a requirement to retake
the examination. The preliminary assessment must be successfully completed no later
than completion of two years (or six quarters of enrollment) in the Ph.D. program;

9 http://senatestage.ucsd.edu/Operating-Procedures/Senate-Manual/Regulations/715

● Passing a research qualifying examination (UQE) that is conducted by the dissertation
committee consisting of five or more members approved by the graduate division as per
senate regulation 715(D). One senate faculty member must have a primary appointment
in a department outside of HDSI. Faculty with 25% or less partial appointment in HDSI
may be considered for meeting this requirement on an exceptional basis upon approval
from the graduate division.¹⁰ The goal of the UQE is to assess the ability of the candidate to
perform independent critical research as evidenced by a presentation and a written
technical report at the level of a peer-reviewed journal or conference publication. The
research qualifying examination must be completed no later than the fourth year or 12
quarters from the start of the degree program; the UQE is tantamount to the advancement
to PhD candidacy exam;

● Annual review of the progress in the doctoral program by the graduate committee of
HDSI faculty council;

● Teaching requirements including completion of a teacher training course (DSC 599) and
a minimum of one quarter of teaching experience at a half-time (50%) appointment as a
Teaching Assistant over the course of the degree program;

● Successful defense of the dissertation presentation in a final examination to the
doctoral dissertation committee;

● Approved dissertation that must explicitly address the reproducibility requirement.
This requirement can be met by providing supplementary online material consisting of
code, data repositories, any evidence of use by external parties and/or where necessary
through validated proof of results.

Time Limits: Assuming a student has no deficiencies and is enrolled full-time in the program,
the normative time is 3 years pre-candidacy and 2 years in candidacy. Extension of
total time from matriculation to degree beyond six years will require petition and approval from
the graduate division. HDSI has instituted several mechanisms and incentives to ensure
expeditious time-to-degree. These include a full-time graduate student advisor in HDSI
Graduate Affairs;¹¹ a preliminary assessment examination and advisory in the first year; an annual
review of each graduate student in the Ph.D. program led by the student's assigned faculty academic
advisor; graduate scholarships funded by HDSI foundation accounts to cultivate a
culture of excellence in research; and dedicated staff for computing and data curation services to
ensure smooth and easy access to necessary experimental platforms.

10 This exception is stipulated in view of a large number of formally appointed faculty on HDSI faculty
council (at 25% or 0%) drawn from different departments and divisions thus making it impossible for a
student to find an “outside” faculty member in some areas.
11 Academic and career advising are among the highest profile investments by HDSI and stipulated
explicitly as a part of the founding gift agreement for the Institute. We plan to build a portal and services

5. Plan of Study
The program plan will follow Plan A, consistent with Regulation 715 of the San Diego
Division of the Academic Senate.

Before admission to the candidacy for the Ph.D. degree, the student must have passed a
preliminary assessment examination conducted by a committee constituted by the Graduate
Committee (GradCom) of the HDSI Faculty Council. This committee shall not include any
assigned or selected research advisor.

The doctoral dissertation committee, chaired by the academic research advisor, shall be
appointed by the Dean of Graduate Studies under the authority of the Graduate Council of the
Academic Senate. The committee members shall be chosen from at least two departments, and at
least two members shall represent academic specialities that differ from the student’s chosen
specialization. In all cases, the doctoral committee will include one tenured or emeritus UCSD
faculty member from outside the HDSI. In exceptional conditions, a faculty member with home
department outside of HDSI and with 25% or less appointment in HDSI may be petitioned to the
graduate division for meeting this requirement. Additional rules per Regulation 715¹² on the
composition and conduct of the doctoral committee shall apply.

6. Unit Requirements
For the conferral of the Ph.D. degree in Data Science, 48 units (12 courses) must be
taken for a letter grade and 4 units of professional preparation must be taken for a
passing (satisfactory) grade. The professional preparation consists of 1 unit of faculty research
seminar, 2 units of TA/tutor training and 1 unit of survival skills course. Out of the 12 courses, at
least 10 must be graduate-level courses; at most two can be upper-level undergraduate courses.
36 units or 9 courses must be completed within six quarters from the start of the degree program.
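
The unit rules above can be checked mechanically. The following is a minimal sketch, not an official audit tool: the `Course` record, the `unmet_requirements` helper and the placeholder course codes are hypothetical names introduced here for illustration, and the six-quarter pacing rule is deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the unit requirements stated above.
LETTER_UNITS_REQUIRED = 48      # 12 courses at 4 units each
PROF_PREP_UNITS_REQUIRED = 4    # 1 seminar + 2 TA/tutor training + 1 survival skills
MIN_GRADUATE_COURSES = 10
MAX_UNDERGRAD_COURSES = 2


@dataclass(frozen=True)
class Course:
    """Hypothetical record of one completed course."""
    code: str
    units: int
    graduate_level: bool
    letter_graded: bool  # False means taken S/U as professional preparation


def unmet_requirements(courses: list[Course]) -> list[str]:
    """Return a description of each unit requirement that is not yet met."""
    letter = [c for c in courses if c.letter_graded]
    graduate = [c for c in letter if c.graduate_level]
    # At most two upper-level undergraduate courses count toward the 48 units.
    undergrad = [c for c in letter if not c.graduate_level][:MAX_UNDERGRAD_COURSES]
    letter_units = sum(c.units for c in graduate + undergrad)
    prep_units = sum(c.units for c in courses if not c.letter_graded)

    problems = []
    if letter_units < LETTER_UNITS_REQUIRED:
        problems.append(f"letter-graded units: {letter_units}/{LETTER_UNITS_REQUIRED}")
    if len(graduate) < MIN_GRADUATE_COURSES:
        problems.append(f"graduate-level courses: {len(graduate)}/{MIN_GRADUATE_COURSES}")
    if prep_units < PROF_PREP_UNITS_REQUIRED:
        problems.append(f"professional preparation units: {prep_units}/{PROF_PREP_UNITS_REQUIRED}")
    return problems


if __name__ == "__main__":
    # Placeholder codes: ten graduate courses, two undergraduate courses, three S/U courses.
    plan = [Course(f"GRAD-{i}", 4, True, True) for i in range(10)]
    plan += [Course(f"UD-{i}", 4, False, True) for i in range(2)]
    plan += [Course("DSC 293", 1, True, False),
             Course("DSC 599", 2, True, False),
             Course("DSC 295", 1, True, False)]
    print(unmet_requirements(plan))  # an empty list means all three checks pass
```

Returning a list of unmet items rather than a single boolean keeps the sketch useful for advising, since it shows which requirement is outstanding.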

7. Structure of the Proposed Graduate Program


1. Course Numbering Schema
The course numbering scheme in HDSI reflects its fundamental mission as the hub for Data Science
across the campus. Accordingly, course series are structured into groups according to the
intellectual (and corresponding organizational) areas where Data Science as a subject intersects
with existing topical areas. The first digit (from the left) of the three-digit course number reflects
Undergraduate (‘1’) or Graduate (‘2’) course designation. Based on content, prerequisites and
credit policies, some courses can be taken by both undergraduate seniors and beginning graduate
students. These are colloquially referred to as “mezzanine” courses. Number ‘5’ as first digit
refers to teaching related courses, such as tutor/TA training or credit accounting for graduate
teaching activities. The middle digit describes either introductory data science subjects (‘0’),
foundational core subjects (‘1’, ‘4’), advanced topics in data science (‘5’) or data science subjects
related to domain areas such as life sciences (‘2’), computing (‘3’), society and humanities (‘6’),
natural sciences (‘7’) etc. Finally, the last (right) digit reflects a partially ordered sequence of
courses on a topical area, starting with ‘0’ for courses whose prerequisites are lower-division
courses (for undergraduate courses) or a mezzanine course (for graduate courses).

11 (continued)
similar to our undergraduate advising to ensure success of our graduate students.
(https://datascience.ucsd.edu/academics/undergraduate/resources/)
12 http://senatestage.ucsd.edu/Operating-Procedures/Senate-Manual/Regulations/715

Areas Description                                     UD UG series  Grad series       Notes

Data Management & Data Systems, Data Security         DSC 10X       DSC 20X           Introductory & Mezzanine Courses
Computational & Mathematical Foundations              DSC 11X       DSC 21X
Data and Life Science                                 DSC 12X       DSC 22X
Digital Infrastructure, Computing Systems, Cloud,     DSC 13X       DSC 23X
  Cyber-infrastructure, Traditional & non-traditional
  computing systems
Data Science Theoretical Foundations (builds upon     DSC 14X       DSC 24X
  lower division DSC 4X series)
Applied Machine Learning: Data Mining (incl.          DSC 15X       DSC 25X           Multiple domain-specific ML applications.
  Graph mining, time-series mining), recommender
  systems, ML-based vision, Deep learning
  applications. Natural Language Processing
Arts, Humanities, Society, Policy and Social          DSC 16X       DSC 26X
  Sciences
Data and Physical, Environmental Sciences             DSC 17X       DSC 27X
Capstone Project Courses                              DSC 18X       NA
Special Topics                                        DSC 19X       DSC 29X           Topics: 291, Projects: 292, Seminars: 293,
                                                                                      Rotation: 294, Survival Skills: 295.
Directed Research                                     DSC 199       DSC 298, DSC 299  298: Independent Research; 299: teaching credit
TA/Tutor Training                                                   DSC 599
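
The digit-by-digit reading of a course number described above can be sketched as a small decoder. This is an illustration only: `decode_course`, `CourseCode` and the label strings are hypothetical names chosen here, the area labels paraphrase the schema text, and middle digits the text does not spell out (such as ‘8’ and ‘9’) are simply reported as "other".

```python
import re
from typing import NamedTuple

# First digit: course level, as described in the numbering schema.
LEVELS = {"1": "undergraduate", "2": "graduate", "5": "teaching-related"}

# Middle digit: subject area, as described in the numbering schema.
AREAS = {
    "0": "introductory data science",
    "1": "foundational core",
    "2": "life sciences",
    "3": "computing",
    "4": "foundational core",
    "5": "advanced topics in data science",
    "6": "society and humanities",
    "7": "natural sciences",
}


class CourseCode(NamedTuple):
    level: str
    area: str
    sequence: int  # last digit: position in the partially ordered sequence
    suffix: str    # optional letter, e.g. the "A" in DSC 204A


def decode_course(code: str) -> CourseCode:
    """Split a DSC course number into level, area, sequence digit and suffix."""
    match = re.fullmatch(r"\s*DSC\s*(\d)(\d)(\d)([A-Z]?)\s*", code)
    if match is None:
        raise ValueError(f"not a three-digit DSC course number: {code!r}")
    first, middle, last, suffix = match.groups()
    return CourseCode(
        level=LEVELS.get(first, "unknown"),
        area=AREAS.get(middle, "other"),
        sequence=int(last),
        suffix=suffix,
    )


if __name__ == "__main__":
    for code in ("DSC 240", "DSC 204A", "DSC 599"):
        print(code, decode_course(code))
```

For example, `decode_course("DSC 240")` reports a graduate course in a foundational core area with sequence digit 0, which matches the listing of DSC 240 as the first machine learning course.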

2. Required and Recommended Courses


The formal course requirements for the doctoral program build upon the course requirements to
earn a Master’s degree in Data Science with additional requirements related to teaching
experience, professional preparation and research rotation requirements and coverage of both
breadth and depth subject areas necessary for a successful doctorate degree. This structure
rationalizes significant preparation and common knowledge and skills expected of all our
students in data science while preparing our students for leadership careers in data science. It
does so by leveraging the significant effort HDSI faculty spend to teach and prepare students for
core and domain-specialized topics in the master’s program, while preparing them to take
advanced courses in chosen depth areas. It also provides a safe harbor for a small minority of
students who may not qualify or otherwise choose to exit from the doctoral program with a
successfully completed master’s degree without significant additional investment of time in the
graduate program.

3. Program Structure
Courses in the Data Science Graduate Program are structured into three groups: Group A,
Group B and Group C. Group A courses are introductory level courses taught at the level of
undergraduate senior or mezzanine courses. Group B are core graduate level courses with
prerequisites from Group A courses. Group C are advanced, specialized and free-standing
courses, often part of the required courses in the Data Science specialization of graduate
programs in other departments. In all three groups, required courses are indicated as such; they
cannot be substituted by other courses without exception approval from the graduate program
committee.

Group A: Preparatory Knowledge and Skill Areas [Credit for maximum of 3 courses]

We have identified five important knowledge and skill areas necessary for understanding (and
advancing) core data science knowledge. It is, therefore, important that all our entering students
either have background preparation or have courses available in the program to ensure a
successful completion of the stipulated doctoral degree program:

1. Algorithms and Programming skills: ability to efficiently translate algorithmic
knowledge and analysis methods into suitable programming platforms, especially using
cloud computing resources.
2. Data organization methods and skills: ability to cast data from raw sources into
formats (structured or semi-structured) that are amenable to scalable automated analysis,
visualization on various platforms, data wrangling.
3. Numerical Linear Algebra: knowledge of underlying mathematics that supports the
ability of students to conceptualize transformation operations and convert them into
computational algorithms such as Principal Component Analysis (PCA).
4. Multivariate Calculus: the mathematical study of a function of multiple variables as
required in understanding optimization methods such as gradient descent that underlie
much of modern machine learning.
5. Probability and Statistics: understanding randomness in data that is fundamental to
understanding of the processes that generate data and estimation procedures as a basis for
critical thinking and data analysis; quantifying the accuracy in estimation and
model-fitting; performing multiple hypothesis tests; and optimality in estimation and
prediction.

Given the breadth of the applicant pool, it is understandable that among the incoming students
interested in the Data Science graduate program, there may be some lacking the basic
background at the undergraduate level in one or more of the above areas. This would prohibit
them from taking the relevant graduate level courses. Accordingly, we have devised five
foundational knowledge area courses described in the catalogue copy and listed below. A student
can receive credit towards the Ph.D. degree for a maximum of three courses from the list of
courses below. We expect that students graduating from quantitative undergraduate backgrounds
would have taken a majority of these courses (or equivalent). Students with an undergraduate
degree from the Data Science major or a Data Science minor would have taken courses in all five
areas mentioned above, thus obviating the need for background preparation.

1. DSC 200: Data Science Programming [New], 4 units: Computing structures and
programming concepts such as object orientation, data structures such as queues, heaps,
lists, search trees and hash tables. Laboratory skills include Jupyter notebooks, RESTful
interfaces and various software development kits (SDKs). Instructors: Aaron Fraenkel,
Yoav Freund

2. DSC 202: Data Management for Data Science [New], 4 units: Principles of data
management, relational data model, relational algebra, SQL for data science, NoSQL Databases
(document, key–value, graph, column-family), Multidimensional data management (data
warehousing, OLAP Queries, OLAP Cubes, Visualizing multidimensional data). Instructors:
Babak Salimi, Jingbo Shang, Amarnath Gupta

3. DSC 210: Numerical Linear Algebra [New], 4 units: Linear algebraic systems, least
squares problems, orthogonalization methods, ill-conditioned problems, eigenvalue and
singular value decomposition, principal component analysis. Instructors: Rayan Saab,
Alex Cloninger, Gal Mishne

4. DSC 211: Introduction to Optimization [New], 4 units: Continuity and differentiability
of a function of several variables, gradient vector, Hessian matrices, Taylor
approximation, fundamentals of optimization, Lagrange multipliers, convexity, gradient
descent. Instructors: Yian Ma, Rayan Saab, Arya Mazumdar.

5. DSC 212: Probability and Statistics for Data Science [New], 4 units: Probability,
random variables, distributions, central limit theorem, maximum likelihood estimation,
method of moments, confidence intervals, hypothesis testing, Bayesian estimation,
introduction to simulation and the bootstrap. Instructors: Jelena Bradic, Dimitris Politis,
Armin Schwartzman

Group B: Core Knowledge and Skill Areas [Ph.D. students take at least 6 courses]

Building upon the foundation courses in Group A, the graduate program identifies several core
graduate courses. Four core courses are required for all Ph.D. students, including those with a
Bachelors in Data Science. The four required courses are:

1. DSC 240: Machine Learning [New], 4 units: A graduate level course in machine
learning algorithms: decision trees, principal component analysis, k-means, clustering,
logistic regression, random forests, boosting, neural networks, deep learning. Instructors:
Misha Belkin, Yian Ma, Jelena Bradic, Gal Mishne, Virginia de Sa

2. DSC 260: Data Ethics and Fairness [New], 4 units: Ethical considerations regarding
privacy and control of information. Principles of fairness, accountability, and
transparency. Use of metadata to inform algorithms. Algorithmic fairness. Policy
issues such as the Fair Information Practices Principles Act, and laws concerning the
“right to be forgotten.” Instructors: R. Stuart Geiger, David Danks

3. *DSC 241: Statistical Models [New], 4 units: linear/nonlinear models, generalized
linear models, model fitting and model selection (cross-validation, knockoffs, etc.),
regularization and penalization (ridge regression, lasso, etc.), robust methods,
nonparametric regression, conformal prediction, causal inference. Instructors: Ery
Arias-Castro, Jelena Bradic, Dimitris Politis.

4. *DSC 204A: Scalable Data Systems [New], 4 units: Storage/memory hierarchy,
distributed scalable computing (i.e., cluster, cloud, edge) principles. Big Data storage,
management and processing at scale. Dataflow programming systems and programming
models (MapReduce/Hadoop and Spark). [Prerequisite: DSC 202] Instructors: Ilkay
Altintas, Mai Nguyen.

(*) Depending on academic preparation, a Ph.D. student can take an advanced course on Applied
Statistics, such as MATH 282B instead of DSC 241. Similarly, instead of DSC 204A, a student
can take a course on Algorithms, such as CSE 202: Design and Analysis of Algorithms.

In addition, a doctoral student must select at least 2 out of the following 8 core courses.¹³

5. DSC 203: Data Visualization and Scalable Visual Analytics [New], 4 units:
Commonly used algorithms and techniques in data visualization. Interactive reasoning
and exploratory analysis through visual interfaces. Application of data visualization in
various domains including science, engineering, and medicine. Scalable interactive
methods involving exploring with big data and visualization methods. Techniques to
evaluate effectiveness and interpretability of analytical products for diverse users to obtain
insights in support of assessment, planning, and decision making. [Prerequisite: DSC
202] Instructors: Ilkay Altintas, Juergen Schulze

6. DSC 204B: Big Data Analytics & Applications: The goal of this course is to introduce
the student to the methods and methodologies of big data analytics. Methods covered
include: I/O bottleneck and the memory hierarchy, HDFS, Spark, XGBoost and
TensorFlow. Methodologies include: writing Jupyter notebooks that can be understood and
used by people of diverse backgrounds; replicability and statistical significance.
[Prerequisite: DSC 204A] Instructor: Yoav Freund.

7. DSC 242: High-dimensional Probability and Statistics [New], 4 units: Concentration
inequalities, Markov processes and ergodicity, martingale inequalities, empirical
processes, sparse linear models in high dimensions, Principal component analysis in high
dimensions, estimation of large covariance matrices. [This class may be cross-listed with
the Mathematics Department.] Instructors: Jelena Bradic, Rayan Saab

8. DSC 243: Advanced Optimization [New], 4 units: Linear/quadratic programming,
optimization under constraints, gradient descent (deterministic and stochastic),
convergence rate of gradient descent, acceleration phenomena in convex optimization,
stochastic optimization with large data sets, complexity lower bounds for convex
optimization. Instructors: Yian Ma, Rayan Saab

13 HDSI faculty plans to propose two additional course options in the areas of Artificial Intelligence and
Accountability/Trust and Critical Data Studies.

9. DSC 244: Large-Scale Statistical Analysis [New], 4 units: Exploratory data analysis,
diagnostics, bootstrap, large-scale (multiple) hypothesis testing, false discovery rate,
empirical Bayes methods. [This class may be cross-listed with Mathematics and/or
Biostatistics.] Instructor: Armin Schwartzman, Jelena Bradic

10. DSC 245: Introduction to Causal Inference [New], 4 units: Causal versus predictive
inference, potential outcomes and randomized experiments (A/B testing), structural
causal models (interventions, counterfactuals, causal diagram, do-operator, d-separation),
identification of causal effect (back-door and front-door criterion, do-calculus),
estimation of causal effect (matching, propensity score, g-computation, doubly robust
estimation, regression discontinuity and instrumental variables, conditional effects),
structure learning (constraint and score-based algorithms), advanced topics (mediation
and path-specific effects, bounding causal effect, selection bias, external validity and
transportability, processing missing data, causal inference in networks) [Prerequisite:
DSC 212, 240] Instructors: Babak Salimi

11. DSC 250: Advanced Data Mining [New], 4 units: Graph mining and basic text analysis
(including keyphrase extraction and generation), set expansion and taxonomy
construction, graph representation learning, graph convolutional neural networks,
heterogeneous information networks, label propagation, and truth finding. [Prerequisite:
DSC 190A or CSE 158 or equivalent] Instructor: Jingbo Shang

12. DSC 261: Responsible Data Science [New], 4 units: responsible data management,
algorithmic fairness (fairness definitions, impossibility results, causal fairness, building
fair ML models, fairness beyond classification), algorithmic transparency (interpretability
vs explainability, auditing black-box algorithms, algorithmic recourse), privacy and data
protection, sampling bias, reproducibility [Prerequisite: DSC 260, 240, 245] Instructors:
Babak Salimi

Thus, together with Group A and Group C courses, doctoral students are required to take a
minimum of 5 courses for letter-grade credit. At the other extreme, students may have satisfied all
letter-grade course requirements, leaving only the professional preparation courses (teaching,
survival skills and research seminar) to be completed with a satisfactory grade. These students are
expected to enroll in individual research (DSC 298) in a section offered by the faculty advisor to
meet residency requirements and maintain graduate student standing during the period of
dissertation research.
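
The Group B selection rule (four required courses, two of which have a named alternative, plus at least two of the eight listed core options) can be written as a short check. This is a sketch under stated assumptions: `group_b_gaps` is a hypothetical helper, and it accepts only the substitutes the text names explicitly (MATH 282B and CSE 202), although the text says "such as" and other courses may be approved in practice.

```python
from __future__ import annotations

# Each required slot lists the course plus any alternative named in the text.
# Assumption: only the explicitly named substitutes are accepted here.
REQUIRED_SLOTS = [
    {"DSC 240"},
    {"DSC 260"},
    {"DSC 241", "MATH 282B"},
    {"DSC 204A", "CSE 202"},
]

# The eight additional core courses, of which at least two must be taken.
CORE_OPTIONS = {
    "DSC 203", "DSC 204B", "DSC 242", "DSC 243",
    "DSC 244", "DSC 245", "DSC 250", "DSC 261",
}
MIN_CORE_OPTIONS = 2


def group_b_gaps(completed: set[str]) -> list[str]:
    """List what is still missing from the Group B requirement."""
    gaps = []
    for slot in REQUIRED_SLOTS:
        if not slot & completed:
            gaps.append("missing required course: " + " or ".join(sorted(slot)))
    chosen = CORE_OPTIONS & completed
    if len(chosen) < MIN_CORE_OPTIONS:
        gaps.append(f"need {MIN_CORE_OPTIONS - len(chosen)} more core option(s)")
    return gaps


if __name__ == "__main__":
    done = {"DSC 240", "DSC 260", "MATH 282B", "CSE 202", "DSC 243", "DSC 250"}
    print(group_b_gaps(done))  # an empty list means the Group B rule is satisfied
```

Modeling each required course as a set of acceptable alternatives keeps the starred substitutions in one place, so adding an approved substitute is a one-line change.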



Group C: Professional Preparation and Elective Courses [Remaining credits]

Group C courses aim to provide either practical experiences in chosen specialization areas, or
advanced training for students preparing for doctoral programs. The courses include required
professional preparation courses: 2 unit TA/tutor training (DSC 599), 1 unit of academic survival
skills (DSC 295) and 1 unit faculty research seminar (DSC 293), all of which must be completed
with a Satisfactory (S) grade using the S/U option.

Courses in this group also serve as a means to directly engage faculty in departments across the
campus who are directly interested in Data Science related topics and instruction. Consequently,
we make important courses taught by HDSI affiliated faculty visible to the Data Science
graduate students. However, their availability is subject to schedule and enrollment constraints of
the individual departments. Based on written approval from participating departments, courses
available in a given domain in a given year will be announced at the beginning of the academic year
with a pre-registration deadline for capacity planning purposes.

Professional Preparation Courses:


DSC 599: TA/Tutor Training: 2 units (S/U): Expected TA duties, evaluation methods. Rules
governing TA appointment, conduct and evaluation. Practice effective teaching strategies
including communications with students and instructors, conduct of discussion sessions,
formulating learning objectives and implementation of active learning strategies. Prerequisites:
none. Instructors: Teaching Faculty Staff. CSE 599 can be taken for credit to meet this
requirement.

DSC 293: Faculty Research Seminar: 1 unit (S/U): Weekly faculty research seminar.
Individual HDSI colloquia and distinguished lecturers may be included at the discretion of the
instructor. Instructor: HDSI Faculty.

DSC 294: Research Rotation: 4 units (S/U): Special topics research under the direction of an
HDSI faculty member. The research topics may include training in specific research
methodologies consisting of practical laboratory skills, computational skills or proof systems in a
research group/laboratory in which the student may pursue doctoral dissertation research.
Prerequisites: Data Science graduate students and consent of the instructor.

DSC 295: Academia Survival Skills: 1 unit (S/U): Basic skills necessary to succeed as a
researcher in Data Science including scripting, cloud computing skills, fellowship proposal
preparation, CV preparation, writing reviews, preparing posters etc.

General Elective Courses:



These are advanced courses in core Data Science subjects listed under Group B above, or offered
as research topics (DSC 291), or they can be graduate courses in other departments subject to
approval by the student’s HDSI academic advisor.¹⁴ Additional elective courses will be offered
based on faculty interest and availability. Any numbered course (other than DSC 291) must be
offered at least once in three years to stay on the course catalogue. HDSI plans to expand
offerings as a part of its growing engagement with faculty across other departments.

DSC 205: Geometry of Data, Instructors: Gal Mishne, Alex Cloninger
Graph-based data modeling, analysis and representation. Topics include: spectral graph theory,
spectral clustering, kernel-based manifold learning, dimensionality reduction and visualization,
multiway data analysis, multimodal and multiview data representation, graph neural networks.

DSC 213: Topological Data Analysis, Instructor: Yusu Wang
Topological methods provide powerful tools for analyzing complex data. This course introduces
basic concepts and topological structures, as well as recent theoretical and algorithmic
developments, together with examples of applications. Some topics include: basics in topology,
simplicial complexes to model data, persistent homology, discrete Morse theory, topology
inference, the Mapper methodology, hierarchical clustering, and integration of topological
methods with machine learning.

DSC 231: Embedded Sensing and IOT Data Models and Methods: Sensory data and control
are mediated by devices near the edge of sensor networks, referred to as IOT (Internet of Things)
devices. Components of IOT platforms: signal processing, communications/networking, control,
real-time operating systems. Interfaces to cloud computing stack, publish-subscribe protocols
such as MQTT, embedded software/middleware components, metadata schema, metadata
normalization methods, applications in selected CPS (cyber-physical system) domains.
Instructor: Rajesh Gupta

DSC 251: Machine Learning in Control: Estimation of stability and uncertainty, optimal
control, and sequential decision making. Instructor: Yian Ma

DSC 252: Statistical Natural Language Processing, 4 units. A deep dive into the classical NLP
pipeline: tokenization, stemming, lemmatization, part-of-speech tagging, named entity
recognition, parsing, and machine translation. Finite-state transducers, context-free grammars,
Hidden Markov Models (HMM), and Conditional Random Fields (CRF) will be covered in
detail. Instructor: Jingbo Shang

14 Academic advisors are appointed by the HDSI GradCom and are required to be different from
the student's research advisor (or chair of dissertation committee). The primary responsibility of
an academic advisor is to provide an assessment of student progress, and be a spokesperson for
the student welfare to the HDSI faculty.

DSC 253: Advanced Data-driven Text Mining, 4 units: Unsupervised, weakly supervised, and
distantly supervised methods for text mining problems, including information retrieval,
open-domain information extraction, text summarization (both extractive and generative), and
knowledge graph construction. Bootstrapping, comparative analysis, learning from seed words
and existing knowledge bases will be the key methodologies. Instructor: Jingbo Shang

DSC 254: Statistical Signal and Image Analysis. 4 units. A graduate level course on signal and
image analysis spanning three main themes. Statistical signal processing: random processes,
stochasticity, stationarity, Wiener filter, Kalman filter, matched filter; Signal processing:
time-frequency representations, wavelets, signal processing with sparse representation
(dictionary learning); Image processing: registration, image degradation and restoration (noise
models and denoising), image pyramids, random fields. Instructors: Gal Mishne, Armin
Schwartzman

DSC 213: Statistics on Manifolds. 4 units. This is a graduate topics course covering statistics
with manifold constraints. Topics include: Fréchet means and variances, principal geodesic
analysis, directional statistics, random fields on manifolds, statistical distances between
distributions, transport problems, and information geometry. Manifold constraints will be
considered on simplexes, spheres, Stiefel manifold, stratified manifolds, cone of positive definite
matrices, trees, compositional data, and other relevant manifolds. Instructor: Armin
Schwartzman, Alex Cloninger

CSE 234: Data Systems for Machine Learning. 4 units. Data management and systems issues
across the whole lifecycle of ML-based analytics in real-world applications, including: data
sourcing, preparation, and organization for ML; programming models and systems for scalable
ML training, feature engineering, and model selection; systems for ML inference, deployment,
and explanations; and governed ML platforms and feature stores. Instructor: Arun Kumar

DSC 261: Responsible Data Science, 4 units. Computational aspects of responsible data
science. Computational approaches for enforcing fairness in machine learning, interpretability,
explainability, privacy. Prerequisites: DSC 240, DSC 241. Instructor: Babak Salimi.

MATH 281A-B-C: Mathematical Statistics (4-4-4 units). Math 281A consists of statistical
models, sufficiency, efficiency, optimal estimation, least squares and maximum likelihood, large
sample theory. Math 281B continues with hypothesis testing and confidence intervals,
one-sample and two-sample problems, Bayes theory, statistical decision theory, linear models
and regression. Math 281C finishes the sequence with nonparametrics: tests, regression, density
estimation, bootstrap and jackknife. Instructors: Jelena Bradic, Ery Arias-Castro.

MATH 284: Survival Analysis. 4 units. Survival analysis is an important tool in many areas of
applications including biomedicine, economics, engineering. It deals with the analysis of
time-to-event data with censoring. This course discusses the concepts and theories associated with
survival data and censoring, comparing survival distributions, proportional hazards regression,
nonparametric tests, competing risk models, and frailty models. The emphasis is on
semiparametric inference, and material is drawn from recent literature. Instructors: Lily Xu,
Jelena Bradic

MATH 285. Stochastic Processes (4 units). Elements of stochastic processes, Markov chains,
hidden Markov models, martingales, Brownian motion, Gaussian processes. Recommended
preparation: undergraduate probability theory. Instructor: Ruth Williams

MATH 287A. Time Series Analysis (4 units). Discussion of finite parameter schemes in the
Gaussian and non-Gaussian context. Estimation for finite parameter schemes. Linear vs.
nonlinear time series. Stationary processes and their spectral representation. Spectral estimation.
Students who have not taken MATH 282A may enroll with consent of the instructor. Instructor:
Dimitris Politis

MATH 287B: Multivariate Analysis. 4 units. Bivariate and more general multivariate normal
distribution. Study of tests based on Hotelling’s T². Principal components, canonical
correlations, and factor analysis will be discussed as well as some competing nonparametric
methods, such as cluster analysis. Students who have not taken MATH 282A may enroll with
consent of the instructor. Instructor: Ery Arias-Castro

MATH 287D: Statistical Learning Theory. 4 units. Topics include regression methods:
(penalized) linear regression and kernel smoothing; classification methods: logistic regression
and support vector machines; model selection; and mathematical tools and concepts useful for
theoretical results such as VC dimension, concentration of measure, and empirical processes.
Instructor: Jelena Bradic.

COGS 243: Statistical Inference and Data Analysis (4 units). This course provides a rigorous
treatment of hypothesis testing, statistical inference, model fitting, and exploratory data analysis
techniques used in the cognitive and neural sciences. Students will acquire an understanding of
mathematical foundations and hands-on experience in applying these methods using Matlab.
Cognitive science PhD students must enroll for four units and will be required to do assignments
and a final project. All other students can enroll for two units and will be required to complete all
assignments but not a final project (or, by request, a project and no assignments). Instructors:
Angela Yu, Virginia de Sa.

4. Student Advising Information


The following lists important advising information for meeting the course completion
requirements of the proposed doctoral program.
a. Incoming students must meet with their assigned HDSI faculty academic advisor to
customize their individual program of study. Students must demonstrate that they have the
required preparation in the above five areas in order to be exempted from taking some (or
all) of the above courses. For example, having taken MATH 170A (or equivalent) would
indicate that the student does not need to take DSC 210. Having taken MATH 173A (or
equivalent) would show that the student does not need to take DSC 211, and having taken
MATH 181A (or equivalent) would show that the student does not need to take DSC 212.
Similarly, having taken BENG 216 would show that the student does not need to take
DSC 200. Once again, these are only representative examples among a large number of
preparatory courses across different programs.

b. In addition, incoming students can transfer up to two upper-level undergraduate or
graduate courses taken under a different program and/or university, as long as (i) these
courses are related to one of the above five foundational areas, (ii) these courses have not
already been used for credit towards a different degree, and (iii) these courses are
approved by the student’s HDSI faculty advisor to establish their relevance to data
science and avoid course duplication.

c. A student can receive credit towards the M.S. or Ph.D. degree for a maximum of two
courses (8 units) taken at the upper-division undergraduate level, subject to the approval
of the student’s faculty advisor. These two courses can be transferred (as discussed
above), or taken during the course of the graduate program. For example, a student can
make use of the equivalencies discussed in part (a) above.

8. Field Examinations
No field examinations are required.

9. Qualifying Examinations
As discussed in Section 2.4, successful completion of a Ph.D. program in Data Science requires
timely completion of a preliminary assessment, a research qualifying examination, and a final
dissertation defense examination.

1. Preliminary Assessment
The preliminary assessment is an advisory examination. It consists of an oral examination in an
area selected by the student. Its goal is to assess the student's preparation for the proposed
area, including several relevant topics, and to identify any courses that are required or
recommended for the candidate based on the knowledge shown and any critical missing background
revealed. The preliminary examination must be completed before the start of the second year in
the doctoral degree program. The examination dates are announced no later than the start of the
Winter Quarter. A student who fails the preliminary examination will be recommended
for the opportunity to receive an MS in Data Science degree, provided the student meets the degree
requirements in no more than one extra quarter over the standard time for the MS program; here
we refer to the newly proposed degree of MS in Data Science (not its online version). Students
who fail the preliminary examination may file a petition to retake it; if the petition is approved,
they will be allowed to retake it one (and only one) more time.

After a student successfully completes the preliminary assessment examination, in the next
annual review of the student (conducted in the Fall Quarter as a part of the Annual
Faculty Retreat), the GradCom of the HDSI Faculty Council assigns the academic advisor to
provide necessary updates to the GradCom and helps in setting up the doctoral dissertation
committee.

2. Research Qualifying Examination, or the UQE


A research qualifying examination (UQE) is conducted by the dissertation committee
consisting of five or more members approved by the graduate division as per senate regulation
715(D). One senate faculty member must have a primary appointment in a department outside
of HDSI. Faculty with 25% or less partial appointment in HDSI may be considered for meeting
this requirement on an exceptional basis upon approval from the graduate division.15 The goal of
the UQE is to assess the ability of the candidate to perform independent critical research as
evidenced by a presentation and a written technical report at the level of a peer-reviewed journal
or conference publication. The research qualifying examination must be completed no later than
the fourth year, or 12 quarters, from the start of the degree program; the UQE is tantamount to the
advancement to PhD candidacy exam.

15
This exception is stipulated in view of a large number of formally appointed faculty on HDSI
faculty council (at 25% or 0%) drawn from different departments and divisions thus making it
impossible for a student to find an “outside” faculty member in some areas.

10. Thesis Requirements
HDSI PhD program thesis requirements must meet the requirements of Regulation 715(D). Additional
requirements above UC San Diego Graduate Division requirements are explained below.
Specifically, a dissertation in the scope of Data Science is required of every candidate for the
Ph.D. degree. A draft of the dissertation must be submitted to each member of the doctoral
committee at least four weeks before the final examination (also known as doctoral defense
examination, discussed below). The final form of the dissertation document must comply with
guidelines published by the Graduate Division. Two official copies of the approved dissertation
must be submitted to the Registrar for deposit in the University Library.

Generalization and Reproducibility Requirements:

A candidate for a doctoral degree in data science is expected to demonstrate evidence of
generalization skills as well as evidence of reproducibility in research results. Evidence of
generalization skills may be in the form of -- but not limited to -- generalization of results arrived
at across domains, or across applications within a domain, generalization of applicability of
method(s) proposed, or generalization of thesis conclusions rooted in formal or mathematical
proof or quantitative reasoning supported by robust statistical measures. The reproducibility
requirement may be satisfied by supplementary material consisting of code and a data
repository, along with evidence of independent external use or adoption.

11. Final Examination


The final examination requires a successful defense of the dissertation before the doctoral
committee consisting of five or more members approved by the graduate division as per senate
regulation 715(D). One senate faculty member must have a primary appointment in a
department outside of HDSI. As explained earlier, partially appointed faculty in HDSI (at 25% or
less) are acceptable in meeting this outside-department requirement as long as their main (lead)
department is not HDSI.

12. Explanation of Special Requirements Over and Above Graduate Division Minimum Requirements

Generalizability and Reproducibility Requirements


There are no special requirements over and above Graduate Division minimum requirements
related to course work. There are requirements as to the structure of the doctoral dissertation
specifically related to evidence of generalizability and reproducibility as explained in Section
2.10. The primary reason for this additional requirement is the transdisciplinary nature of the
nascent discipline, which places an additional emphasis on identifying the core elements of the research
and dissertation that form a basis for it to be considered primarily in data science.

Rotation Training Program


In many areas of science, research rotations provide the opportunity for first-year PhD students
to obtain research experience under the guidance of HDSI faculty members. Through the
rotations, students can identify a faculty member under whose sponsorship their dissertation
research will be completed.

Given the diversity of background training and intellectual persuasion of entering Ph.D. students,
a subset of the HDSI faculty council felt strongly that a rotation training program would be
essential to providing an informed match for each Ph.D. student. The interdisciplinary nature of
data science makes research rotation experience a desirable aspect of the Ph.D. program while
addressing principally different advising cultures in constituent areas of data science.
Accordingly, HDSI seeks a principled way to address this difference in academic advising
culture. One possibility is to require rotation of all candidates, but allow for exceptions through
individual review of candidates who demonstrate strong background work and inclination to
work with a specific faculty member. The other possibility is to identify a subset of admitted
students who would be good candidates for participation in the first-year rotation program. While
the exact details will be worked out by the Graduate Committee, we plan to offer participation in
the rotation program to all students at the time of admission.

A research rotation is a guided research experience lasting one quarter (10 weeks) obtained by
registering for DSC 294 with an instructor. Ph.D. students will participate in at least two
research rotations during their first year, with at least two different faculty members,
and in as many as four rotations including the summer quarter. A student may rotate twice under the
same faculty member as long as they rotate with at least two faculty members. The goal is to
help the student identify and develop their research interests and to expose students to new
methodological approaches or domain knowledge that may be outside the scope of their eventual
thesis. Research rotations must be completed before the start of the second year, with a signed
commitment form from a faculty advisor.
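
As an illustration only, the numerical rotation rules stated above (at least two rotations, at least two different faculty members, at most four rotations including summer) could be checked with a small script such as the sketch below. The names Rotation and rotation_plan_problems and the faculty labels are hypothetical and not part of this proposal; the detailed guidelines remain to be developed by the GradCom.

```python
from dataclasses import dataclass

# Limits taken from the rotation paragraph above.
MIN_ROTATIONS = 2          # at least two rotations in the first year
MAX_ROTATIONS = 4          # at most four, counting the summer quarter
MIN_DISTINCT_FACULTY = 2   # rotations must span at least two faculty members


@dataclass(frozen=True)
class Rotation:
    quarter: str   # e.g. "Fall", "Winter", "Spring", "Summer"
    faculty: str   # hypothetical faculty identifier


def rotation_plan_problems(rotations):
    """Return a list of rule violations for a first-year plan (empty if none)."""
    problems = []
    if len(rotations) < MIN_ROTATIONS:
        problems.append(f"fewer than {MIN_ROTATIONS} rotations")
    if len(rotations) > MAX_ROTATIONS:
        problems.append(f"more than {MAX_ROTATIONS} rotations")
    if len({r.faculty for r in rotations}) < MIN_DISTINCT_FACULTY:
        problems.append(f"fewer than {MIN_DISTINCT_FACULTY} distinct faculty members")
    return problems


# Rotating twice with the same faculty member is acceptable
# as long as a second faculty member is also involved.
plan = [
    Rotation("Fall", "Faculty A"),
    Rotation("Winter", "Faculty A"),
    Rotation("Spring", "Faculty B"),
]
print(rotation_plan_problems(plan))  # prints []
```

The sketch covers only the counting rules; the signed commitment form and any exception conditions would need separate handling.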

The Graduate Committee (GradCom) will develop detailed guidelines on the selection process
and conduct of the Rotation Program that specifically address questions such as rotation
schedule, whether or not it is arranged by faculty advisor or by the student themselves,
orientation of students into the rotation program, guidelines on academic advising by the rotation
advisors and student evaluations in a rotation program, and exception conditions including
extension of the rotation experience for a student to four quarters and other special
circumstances.

The rotation program will be evaluated for its effectiveness as a part of our first three-year
review program with recommendations for any changes to the program submitted for review and
approval by the Graduate Council.

13. Relationship of Master’s and Doctoral Programs


The proposed doctoral program is closely related to and builds upon the pending Master’s
program in Data Science (MS-DS) by HDSI that has been recently approved by the Graduate
Council of the Academic Senate. In particular, the doctoral program is structured to benefit from
our investment and support for courses that enable students with a broad range of backgrounds
to successfully complete master’s level work in data science. This ‘onboarding’ of entering
students into our Master’s program is equally valuable for our doctoral program coursework even
though the qualifications for entry into the doctoral program are more detailed in terms of
background preparation in computing and mathematical subject areas. In addition, PhD students
along their way towards their degree may fulfill all requirements for the M.S. degree, and
therefore can apply for and receive it before the conferral of the Ph.D. degree; notably, the
Registrar’s Office does not award the M.S. and the Ph.D. degrees in the same quarter.

14. Special Preparation for Careers in Teaching


All graduate students in the doctoral program are required to complete at least one quarter of
experience in the classroom as teaching assistants regardless of their eventual career goals.
Effective communication and the ability to explain deep technical subjects are considered key
measures of a well-rounded doctoral education. Thus, Ph.D. students are also required to take
the 1-unit DSC 295 (Academia Survival Skills) course for a Satisfactory grade.

3. Projected Need
1. Student Demand for the Program
Section 1.1 outlines the programmatic reasons for launching the graduate program as a key
vehicle for advancing knowledge and practice in Data Science. The driver for such a program,
and indeed for HDSI as an institution, is to satisfy the growing demand for a graduate program in
data science, both internally and externally. Almost all our external letters of evaluation
from academic institutions have specifically pointed to “student demand for rigorous PhD-level
training in data science” (Alex Aue, UC Davis), because of “unprecedented increase in demand”
(Larry Wasserman, CMU), a point stated similarly by George Michailidis, University of Florida.
Academic demand for Data Science postdoctoral scholars and faculty is not hard to see given the
rise of academic units (departments and schools) in Data Science across the country.

Further, industry surveys have repeatedly shown a soaring need for data scientists. Chief and
credible among these are reports by McKinsey16, IBM17, and Bloomberg18. As mentioned, in a survey
of our students graduating from the Data Science major, roughly a third of them have indicated their
interest in a graduate degree program in Data Science despite strong placement opportunities in
the industry for our graduates trained in the key data science areas of AI and Machine
Learning. Since our undergraduate major in Data Science covers basic and advanced courses in
these areas, the graduate interest is primarily for a doctoral research degree program. In 2019,
nearly half (1,571) of the more than 3,000 applicants to various graduate degree programs at UC San
Diego who indicated interest in Machine Learning and Artificial Intelligence related topics in
Electrical Engineering, Computer Science and Cognitive Science directly indicated their interest
in Data Science programs at HDSI. HDSI offered scholarships to 10 of these students admitted
into degree programs in Computer Science, Math, Electrical Engineering or Cognitive Sciences.

Thus, the demand for a graduate training in Data Science is high and continues to grow. Further,
the program not only serves the HDSI mission of educating talent in the area of Data Science,
but also serves as a vehicle for continued engagement and proliferation of Data Science training
across various graduate programs through new foundational, core and elective course offerings
that engage domain experts in the field of Data Science. The program provides an excellent
means to create new educational opportunities for students, especially for underserved and
economically-disadvantaged student populations who can benefit from graduate scholarships
offered by HDSI as a part of its core endowment-supported activities that we have mentioned
earlier.

16
https://mck.co/2W4LriY
17
https://bit.ly/3dkiDIS
18
https://bloom.bg/3chfute

Based on the demographic data of students interested in constituent areas of machine learning,
information theory and statistics, one would expect a skewed demographic balance with about
one-fifth domestic students. However, with growth of areas such as computational biology and
computational social science, we expect the international fraction to be closer to the campus average of
60% (7% in Social Sciences, 15% in Arts and Humanities, 23% in Biological Sciences, 32% in Health,
45% in GPS, 50% in Physical Sciences, 70% in JSOE and RSM) and better than the founding areas of
Mathematics (90%), Computer Science (80%), and Electrical Engineering (83%). Indeed, the
demographics of our Undergraduate Program support an expectation of one-third to one-half
domestic students. Following similar reasoning, we expect a better ratio of California resident
applicants than in computer science, where we saw four times as many resident applicants for the
MS program as for the doctoral program. We shall be monitoring and reporting on these numbers
in our annual program review as a part of the HDSI annual report.

2. Opportunities for Placement of Graduates


Data Science as an academic subject area is rapidly emerging, with Schools or Colleges of Data
Sciences (such as at Berkeley, Wisconsin, MIT, Columbia) or Departments (such as at NYU,
Michigan, Yale, Cornell, UC Irvine, Virginia), just to name a few. UC Berkeley recently
reorganized and launched the Division of Computing, Data and Society. Regardless of
institutional home as a department, division, school or college, the faculty demand for Data
Science is expected to rise along with the need for a pipeline of graduate students in the coming
years.

While no survey data is available for the doctoral demand, we have plenty of data and evidence
for placement of graduate students in the industry and civil organizations. As a case study, we
examined the entire class of students who graduated from the Ph.D. program at NYU. We
located 30 graduates from the program (from 2017 starting year) on LinkedIn working as
postdoctoral scholars; as senior data scientists at companies such as Boston Consulting Group,
Walmart Laboratories, Facebook, Uber; as research analysts in venture capital and investment
banking; and as software engineers at Hulu and FreeWheel.

A detailed market analysis in support of the opportunity in ‘Big Data’ and long-term trends in
this domain comes from a recent McKinsey Global Institute report that identified ‘Game
Changer’ opportunities for US growth19. The May 2011 McKinsey Global Institute report, “Big
Data: The next frontier for innovation, competition and productivity”20, predicted the need for
over 500,000 data scientists by 2018. McKinsey projected a shortfall of 1.5 million additional
managers and analysts in the U.S. who can “ask the right questions and consume the results of
the analysis of big data effectively.” These numbers were later analyzed by commercial outfits
such as kdnuggets.com. An analysis of the LinkedIn Workforce Report dated August 2018 states,
“Nationally, we have a shortage of 151,717 people with data science skills”. Kaggle, the largest
community of data scientists and now a part of Google, has over 2 million subscribers. The
LinkedIn profile of Data Scientists lists 132,083 people with Data Scientist titles that are spread
across IT services, Computer Software, Financial Services, Banking, Insurance and Higher
Education.

19
http://www.mckinsey.com/insights/americas/us_game_changers
20
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

In a separate ‘bottom-up’ study, the EMC Corporation, a publicly traded company with 60,000
employees, interviewed 497 data science and business intelligence professionals from around the
world. The results of their study on the need for Data Scientists pointed to some interesting
trends in the computing industry. About two-thirds of the individuals polled believe the demand
for data scientists will outpace supply in the next five years with nearly 30% coming from
professionals in disciplines other than computer science. The study also cited the lack of training
and resources as the biggest obstacles to data science in organizations. These observations
directly support the case for the need for rigorous scientific training for the professionals moving
into the data science field.

Data Scientists constitute a separate category of jobs that are currently posted along with IT and
business analytics positions. As of this writing, Indeed.com posts openings for 5,953 Data
Science jobs, and Glassdoor lists 21,166 Data Science jobs with a salary range from $76,000 to
$148,000 and an average of $118,700. This compares with an average salary of $76,500 for
software engineers, $70,700 for computer engineers and $110,200 for computer scientists.
Indeed, driven by the opportunities available, a whole industry has sprung up on Data Science
placements.21 Data Science graduates will be well qualified for job titles such as data analysts,
business intelligence and predictive analysis professionals. The students are likely to find
employment across many areas including internet companies, banking, insurance, investments,
engineering and healthcare. We will work with Career and Placement Services as well as Alumni
Board to ensure mentoring and placement of graduates from the Data Science program.

3. Importance to the Discipline


Section 1.2 addresses the intellectual underpinning driving the emergence of the discipline of
Data Science, where UC San Diego already has an oversubscribed undergraduate program
(currently the 8th largest major at UC San Diego) as well as an active postdoctoral program offered by
HDSI. A graduate program is crucial to the establishment of Data Science as an academic area.
The proposed doctoral program is a necessary step toward a complete graduate program that
establishes the complete pipeline of talent into academic (teaching and research) careers in the
emerging discipline.

21
https://www.dataquest.io/blog/career-guide-find-data-science-jobs/

4. Ways in which the program will meet the needs of the society
Data Scientists are highly sought after, showing a societal need for individuals with this
professional competency. Furthermore, Data Sciences are already having an impact on many
aspects of society, including e-commerce, financial industries, technology companies, health
care, and academia. There are few aspects of society that will not be affected by Data Science.
The proposed program directly serves the current and growing need for professionals in the area
and its applications.

5. Relationship of the program to research and/or professional interests of the faculty
As of this writing, the Institute has appointed 2 teaching assistant professors, 11 ladder-rank
full-time professors (8 assistant, 1 associate, 2 full) as well as 13 ladder-rank joint faculty (5 at
50% and 8 at 25%) into the Institute, each of whom directly lists Data Science as their main
professional research interest.
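
As a quick check of the joint-appointment figures above, the short sketch below totals the headcount and full-time equivalents (FTE) for 5 appointments at 50% and 8 at 25%. The variable names are illustrative only; the appointment shares and counts are the ones stated in the paragraph above.

```python
from fractions import Fraction

# Joint appointments stated above: appointment share -> number of faculty.
joint_appointments = {Fraction(1, 2): 5, Fraction(1, 4): 8}

# Headcount is the number of people; FTE weights each person by their share.
headcount = sum(joint_appointments.values())
joint_fte = sum(share * count for share, count in joint_appointments.items())

print(headcount, float(joint_fte))  # prints 13 4.5
```

Exact fractions are used so the 4.5 FTE total is computed without floating-point rounding.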

6. Program Differentiation
Sections 1.5 and 1.6 cover in detail related programs at UC San Diego and in the UC system.
The growth in Data Science degree programs is following a middle-out process, starting with a
large number of MS programs (over 100), with emerging data science bachelor’s degree
programs such as those at UC Berkeley and UC Irvine in addition to the BS in Data Science offered by
HDSI, now in its fifth year. New York University has offered a PhD program in Data Science
since 2017, and Columbia, Michigan and many other schools offer specializations in Data Science
within existing PhD degrees. In contrast to emerging programs as specializations of
Statistics or Engineering degrees, the HDSI organization gives us the capability to design
integrated programs in Data Science for a broader and deeper training through a large and
diversified set of electives. With the increasing participation from faculty and departments across
the campus in creating additional electives/specialization courses, we hope to extend the reach
and impact of Data Science as a discipline.

4. Faculty
The HDSI faculty community consists of over 200 faculty affiliates organized into 44 different
research clusters. The core faculty of HDSI consists of 48 faculty who are appointed at various
levels of FTE scale reflecting the extent of teaching responsibilities within Data Science22. The
table below lists the Institute faculty. A two-page CV for each faculty member is provided as a
separate attachment to the proposal.

22
https://datascience.ucsd.edu/about/faculty/hdsi-faculty-council/

HDSI Faculty Council

| Group | Faculty Names and Title | Specialization Areas |
|---|---|---|
| Full Time Faculty in HDSI (11) | Mikhail Belkin, Professor | Machine Learning, Learning Theory, AI: understanding structure in data, analysis and algorithms for non-linear high-dimensional data |
| Full Time Faculty in HDSI (11) | Justin Eldridge, Assistant Teaching Professor | Machine Learning Theory: improving correctness of learning algorithms, clustering, process of learning |
| Full Time Faculty in HDSI (11) | Aaron Fraenkel, Assistant Teaching Professor | Machine Learning, Design of anti-fraud, anti-abuse systems |
| Full Time Faculty in HDSI (11) | Yian Ma, Assistant Professor | Machine Learning Theory: scalable inference methods, time-series data and sequential decision making, Bayesian inference algorithms |
| Full Time Faculty in HDSI (11) | Arya Mazumdar, Associate Professor | Machine Learning, Information Theory: error correcting codes for use in storage systems |
| Full Time Faculty in HDSI (11) | Gal Mishne, Assistant Professor | Signal processing and machine learning for graph-based modeling, processing and analysis of large-scale high-dimensional real-world data; unsupervised data analysis in neuroscience |
| Full Time Faculty in HDSI (11) | Yusu Wang, Professor | Computational Geometry, Topological/geometric methods for data analysis |
| Full Time Faculty in HDSI (11) | Babak Salimi, Assistant Professor | Data Management, Causal Inference, Fairness in Decision Support Systems |
| Full Time Faculty in HDSI (11) | Zhiting Hu, Assistant Professor | Machine Learning and Natural Language Processing with applications in controllable content generation, enterprise AI platforms and healthcare. |
| Full Time Faculty in HDSI (11) | Berk Ustun, Assistant Professor | Interpretability, Fairness, Accountability in Machine Learning, Applications of ML to Medicine, Finance, Justice and Business. |
| Full Time Faculty in HDSI (11) | Tsui-Wei (Lily) Weng, Assistant Professor | Robust, Reliable and Trustworthy AI Systems, Deep Reinforcement Learning, Fairness in machine learning and Robustness against adversarial attacks. |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | David Danks, Professor | Learning & reasoning in humans, Ethics & policy for autonomous systems, machine learning. |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | R. Stuart Geiger, Assistant Professor | Computational Social Science: computational ethnography, socio-technical systems |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Mikio Aoi, Assistant Professor | Computational Biology: large-scale Bayesian nonparametric inference, Bayesian optimization, neuronal analysis |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Jingbo Shang, Assistant Professor | Data mining, natural language processing, and machine learning: mining and constructing structured knowledge from massive text corpora with minimum human effort |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Benjamin Smarr, Assistant Professor | Computational biology, dynamical systems, stochastic processes and Biological Circuits |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Barna Saha, Associate Professor | Theoretical Computer Science: algorithm design and analysis, probabilistic method and large scale data analytics |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Arun Kumar, Assistant Professor | Databases, Data management and software systems, data preparation, model selection, and model deployment, ML/AI-based data analytics |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Yoav Freund, Professor | Machine Learning and its applications in bioinformatics, computer vision, finance, network routing, and high-performance computing. |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Jelena Bradic, Associate Professor | Statistics: causal inference, ensemble learning, robust statistics and survival analysis with applications to gene-knockout experiments, understanding cell cycles, developing new policies or detecting effects of treatments onto survival |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Rayan Saab, Associate Professor | Mathematics: signal processing and analysis, sparse and low-dimensional representations of high dimensional data, compressed sensing |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Alex Cloninger, Assistant Professor | Applied harmonic analysis, machine learning, neural networks, analysis of high-dimensional data |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Margaret Roberts, Associate Professor | Political Science: automated content analysis, political methodology and politics of information |
| Joint Faculty in HDSI (4.5 FTE, 13 headcount) | Armin Schwartzman, Professor | Biostatistics: Signal and image analysis; functional and manifold-valued data; high-dimensional data; modern multivariate statistics; large scale multiple testing; applications in biomedicine and the environment. |
| Fellows & Administrative | Bradley Voytek, Associate Professor, HDSI Fellow | Cognitive neuroscience: neural modeling and simulation, along with large-scale data mining and machine learning techniques, to understand the physiological basis of human cognition and age-related cognitive decline |
| Fellows & Administrative | Ilkay Altintas, Research Scientist, HDSI Fellow | Scientific workflows and solution architectures for data and computational science, eScience applications. |
| Fellows & Administrative | Virginia De Sa, Professor, HDSI Associate Director | Cognitive Science: computational neuroscience, visual perception, EEG analysis, brain-computer interfaces; Machine Learning, multi-view learning, multi-task learning, computer vision applications |
| Fellows & Administrative | Dimitris Politis, Distinguished Professor, HDSI Associate Director | Statistics: time series, bootstrap methods, and nonparametric estimation methods |
| Fellows & Administrative | Rajesh K. Gupta, Distinguished Professor, HDSI Director | Embedded and Cyber-physical Systems: sensor data organization, metadata models and methods. |
| 0% Appointments | Angela Yu | AI (artificial agents): learning and decision making under uncertainty, social cognition. |
| 0% Appointments | Eran Mukamel | Computational neuroscience: modeling and analysis of large-scale data sets to understand complex biological networks of the brain |
| 0% Appointments | Frank Wuerthwein | Physics: experimental particle physics, distributed high-throughput computing with large data volumes |
| 0% Appointments | George Sugihara | Complex system dynamics, methods for forecasting chaotic systems, neurobiology, gene expression in cancer |
| 0% Appointments | Henrik Christensen | Robotics, computer vision, AI: systems-oriented approach to machine perception, robotics and design of intelligent machines |
| 0% Appointments | Julian McAuley, Assistant Professor | Machine Learning: Social Networks, using artificial intelligence in fashion choice, and data science in various applications. |
| 0% Appointments | Larry Smarr, Distinguished Professor | High-performance computing and networking: advanced cyberinfrastructure, experimental systems |
| 0% Appointments | Lucila Ohno-Machado, Professor | Biomedical Informatics: accessible and usable health data and its use in evidence-based health decisions |
| 0% Appointments | Michael Pazzani, Professor Emeritus, HDSI Distinguished Scientist | Machine learning, explainable artificial intelligence, personalization, internet search, and recommendation systems |
| 0% Appointments | Michael Holst, Professor | Mathematical and Computational Physics: biochemistry and biophysics, computational fluid dynamics, computer graphics, materials science, and numerical algorithms relativity |
| 0% Appointments | Robin Knight, Professor | Bioengineering, Cellular and Molecular Biology, Computer Science: Microbial analysis |
| 0% Appointments | Ronghui (Lily) Xu, Professor | Mathematics and Biostatistics: machine learning, statistical inference for complex data-types in the presence of high-dimensional covariates |
| 0% Appointments | Ruth Williams, Distinguished Professor | Mathematics: Probability theory, stochastic models of complex networks (e.g., in internet, systems biology) |
| 0% Appointments | Shankar Subramaniam, Distinguished Professor | Bioengineering and Systems Biology |
| 0% Appointments | Shannon Ellis, Assistant Teaching Professor | Human Genetics, Data Science Education |
| 0% Appointments | Tara Javidi, Professor | Information Theory, Machine Learning: wireless mesh networks. |
| 0% Appointments | Terry Sejnowski, Distinguished Professor | Computational Neurobiology, Neurosciences, Neural Networks |
| 0% Appointments | Vineet Bafna, Professor | Bioinformatics, Computational Biology |
| 0% Appointments | Young-Han Kim, Professor | Information Theory: network information theory. |

In addition to the 48 faculty members listed above (15.5 FTE), the Institute is also planning to
fill one teaching faculty (LSOE) and one advancing faculty diversity (AFD) position in the
current recruiting season. It anticipates that an additional 1-2 new faculty members will join the Institute,
for a total faculty strength of 16-17 FTE, including 3 FTE LPSOE and 8 U18 lecturers.

Together, these provide a capacity of 51-52 courses annually by the current ladder-rank faculty in
addition to 5 cross-listed courses as well as teaching by U18 continuing lecturers for a combined
total annual capacity of 58-74 courses. The current Data Science undergraduate program
accounts for 35 courses/sections per year. Conservatively, the Institute has the capacity to offer
6-10 graduate courses per quarter, which enables it to adequately serve the proposed doctoral
program.

5. Courses
Section 2.7 describes the structure of the program and courses. A description of the
courses and their instructors is attached in Appendix C. We note that the courses have been
devised to ensure the broadest possible access to the Data Science graduate program by motivated
students from diverse educational backgrounds. Accordingly, the program makes provision for
course credit for a maximum of 3 out of 5 courses in Group A (Foundational Areas), which ensure
adequate preparation of students to enable successful completion of the graduate degree. Group
B (Core Areas) specifies a minimum of 5 courses out of 10 that constitute the body of
knowledge and skills in the methods/tools areas of data science. This list includes two required
courses, on Machine Learning and on Data Ethics & Fairness. Finally, elective courses (including
thesis research) seek to specialize data science skills in specific areas or application domains.

6. Resource Requirements
As mentioned earlier, no new or additional resources are expected to be required from the
campus in support of the proposed Ph.D. program. Instead, the Institute’s continuing and planned
expenses in graduate scholarships, faculty recruiting and cyber-infrastructure resources
(including personnel) will be key enablers for the successful operation of the proposed graduate
program. Starting Winter 2021, the Institute has been allocated space on two floors of the
38,000-square-foot Literature Building, which will provide ample space for housing the faculty, students,
and advising staff for the graduate and undergraduate programs. HDSI’s current undergraduate
program has over 1000 students in its majors and minors, making the undergraduate major
the 8th largest major. This provides a funding source for graduate Teaching Assistants, who
will be primarily drawn from the proposed doctoral program. Faculty fully or primarily
appointed in HDSI currently direct research projects worth $5.5M annually that are managed by
HDSI. The sponsored research is expected to grow as additional faculty join starting Fall 2021.
The combined research and teaching activities will be taken into consideration by the Graduate
Division in setting the graduate scholarship support, which is expected to cover one full year of
non-employment-based graduate student support for each entering graduate student. We
point out that this graduate support is made possible by the additional teaching and research activities
(and associated revenues) of the HDSI faculty. HDSI faculty will also continue to fund and supervise
students working on Data Science research projects who are drawn from Ph.D. programs in other
departments.

7. Graduate Student Support


Consistent with the Graduate Division’s instructions, HDSI plans to offer five years of confirmed
financial support, including tuition remission, to all its entering students, consisting of a
combination of research and teaching assistantships. HDSI will guarantee the first year of
non-employment-based support as a part of the rotation program. Our normative expectation is
that Ph.D. students will confirm a research advisor by the end of the first year, who will then be
responsible for providing graduate student research support.

Overall, HDSI’s guarantee of financial support is rooted in four primary sources of funding for
graduate students:

(a) Graduate Division support of graduate students, based on campus policy on the distribution of
scholarship support. Earlier known as “block grants,” this support is allocated to academic units
under campus policies, according to their need and extramurally funded research activities.

(b) Teaching assistant support. Currently, TA support is at 8 TA FTE per year and is expected to
rise with the increase in undergraduate enrollment in our majors and minors (from the current 700
students in the major and 5,000 students in classes annually to 1,000 majors and 10,000 students
in classes annually in three years). Graduate students will be trained and, once determined to be
qualified per university regulations, will be offered TAships.

(c) Extramurally funded research projects, including training grant(s). Extramural funding is
likely to be the largest source of funding for our graduate students, given the extensive growth
and consistent availability of research funded by organizations such as NSF, DARPA, DOE, ARL and
others. Data Science areas are among the most heavily funded areas of research by both public and
private organizations (foundations). Based on the budget analysis provided as a part of 3-year FTE
planning, we anticipate annual research support of $200K per faculty member appointed in the
Institute.

(d) Endowment-supported graduate student scholarships. We have currently budgeted $600K per year
for this program. We expect to grow this program with growing industry contributions and
philanthropic support to the Institute. Financial aid will be available to approximately one
quarter of our best students in the early years. As we scale the program, the share of students
receiving financial aid may drop, but to no less than 15% of the total student population.

In addition, as outlined in our EDI strategy (Section 1.6), the Institute will directly offer
scholarships for URM students.

8. Governance
The program is offered by the Halıcıoğlu Data Science Institute, established as an academic unit
by the Academic Senate on June 6, 2018. The HDSI Faculty Council is the governing body for all
academic programs offered by the Institute. A copy of the Bylaws is attached in the Appendix.

9. Changes in Senate Regulations


No changes to Senate regulations are proposed.

Appendix A: Listing of Research Areas
The following table lists topical areas covered in doctoral research efforts engaging core HDSI
faculty organized by seven core themes of HDSI.
AI: Automated Reasoning, Knowledge Representations, Cognition
  Knowledge Representations, Distributed representations, learning multiple levels of representation or based on composing learned functions
  Multi-agent Systems combined with graph signal processing, network analysis
  Automated decisions, Computer augmented decision making (with applications in geospatial analysis, health)
  Intelligence amplification and application to decision making, Augmented Cognition

Machine Learning: Theory, Algorithms, Systems
  Adversarial ML, ML for security and privacy, algorithmic fairness
  Reinforcement learning, Learning as optimization, multi-task learning, transfer learning, learning to learn
  Algorithms, game-theoretic setups such as GANs, realistic study of the limits of machine learning and applied statistics
  NLP, language technologies, unstructured text analysis
  Accelerated ML Systems: architectures, algorithms, tools and libraries for accelerated ML systems

Data Infrastructure: Data Viz, Programming and DB Systems
  Data Visualization, Visual Analytics, HCI for data science
  Databases/data systems for data systems
  Data mining, data integration from multiple modalities (text, time-series, imaging, etc.)
  Methods and System design to ensure data security and privacy
  Distributed/cloud computing
  IOT and Cyber-Physical Systems, AI sensors
  Software engineering and PL for Machine Learning, ML Systems

Mathematical Foundations of Data Science: Causal Inference, Hypothesis Testing, Optimization Theory
  Causal inference in machine learning, Sequential decision making methods and their statistical analysis
  Non-parametric data analysis
  Multiple hypothesis testing and high dimensional data analysis, false discovery rate
  Applied probability problems for the analysis of data science methods
  Submodular optimization, transport theory (optimal transport/Wasserstein distance, parallel transport, especially on non-Euclidean spaces), optimization theory and algorithms that use large data to reduce computation without compromising statistical validity

Digital Humanities: DS in Society, Ethics/Policy
  Ethics, data science in public interest
  Philosophy of information: how data science allows us to learn about the world, information transfer from data to models, prediction/interpretation tradeoffs
  Privacy and public policy, Accountability measures and methods from Data Science
  Understanding humans, data science in language, literature and arts
  Computational social science, data-driven sociology
  Computational Linguistics, Speech versus intentional language, Conversational design and human behavior

Systems and Applications: CPS/IOT, Architectures, Health, Economics, Robotics
  Brain-inspired Computing Machines, Neuromorphic Architectures, Hyperdimensional Computing
  Medical signal processing, computational medicine, medical data integration challenges (patient records, device records, insurance records, etc.); Causal inference in Medical Informatics
  Data-driven developmental economics, new economic theories based on automated data-driven measurements and methods
  Statistics and economics: statistical game theory (focus on statistical and computational properties of the Nash equilibrium and its implications to fairness), market efficiency (antitrust), and other market design problems
  Geospatial data collection and analysis
  Robust and commonsense learning in robotic systems
  ML augmented organizational workflows, data science applications in organizational behavior, business
  ML methods for dynamic, time, causal inference with implications for Political Science, Economics and/or Healthcare
  IOT for health, autonomous vehicles

eSciences: Real-Time Instrumentation Data, Sustainability
  Environmental Data Sciences
  Data science in ecology and conservation, Sustainability at scale
  Data Science in Precision Imaging Systems: Applied Optical and Electron Microscopy
  Data Science for High-throughput Biology, Sequencing, Mass Spectrometry in support of Bioinformatics, Quantitative Biology

Appendix B: Letters of Support Solicited
As of this writing, the following letters of support have been received and are enclosed. Additional letters
will be provided in time for review at the Graduate Council meeting in January 2021.

Divisional Deans & Directors


1. Al Pisano, Dean of Jacobs School of Engineering
2. Peter Cowhey, Dean of the School of Global Policy and Strategy
3. Carol Padden, Dean of Social Sciences
4. Cristina Della Coletta, Dean of Arts and Humanities
5. Cheryl Anderson, Dean, Herbert Wertheim School of Public Health and Human
Longevity Science.
6. Lisa Ordóñez, Dean, Rady School of Management

Department Chairs
1. James McKernan, Chair, Department of Mathematics
2. Jonathan Cohen, Chair, Department of Philosophy
3. Brian Goldfarb, Chair, Department of Communication
4. Bill Lin, Chair, Department of Electrical and Computer Engineering
5. Kun Zhang, Chair, Department of Bioengineering
6. Sorin Lerner, Chair, Department of Computer Science and Engineering
7. Stefan Leutgeb, Neurobiology Section, Division of Biological Sciences
8. Thad Kousser, Chair, Department of Political Science.

External Reviewers:
1. Professor Alexander Aue, Chair, Department of Statistics, Co-Director, Center for Data
Science and Artificial Intelligence Research, UC Davis
2. Professor Larry Wasserman, Department of Statistics and Data Science, CMU
3. Professor George Michailidis, Founding Director, UF Informatics Institute, University of
Florida
4. Professor Sharad Mehrotra, Information and Computer Science, UC Irvine.

Faculty, Instructors
1. Jorge Cortes, MAE 247
2. James Fowler, POLI 287
3. Trey Ideker, BNFO 286 / MED 283
4. Massimo Franceschetti, ECE 227
5. Vineet Bafna, CSE 283 / BENG 203, CSE 280A
6. Alex Cloninger, DSC 210, Math 170A, Math 277A
7. Arun Kumar, CSE 234, DSC 202, DSC 204
8. Rayan Saab, DSC 210, DSC 211, DSC 242, DSC 243
9. Armin Schwartzman, DSC 244, DSC 212, DSC 241, DSC 242
10. Angela Yu, COGS 243
11. Brad Voytek, COGS 280
12. Ronghui (Lily) Xu, MATH 284
13. Siavash Mirarabbaygi, ECE

ALBERT (“AL”) P. PISANO, DEAN
IRWIN AND JOAN JACOBS SCHOOL OF ENGINEERING
WALTER J. ZABLE PROFESSOR OF ENGINEERING
7313 JACOBS HALL
9500 GILMAN DRIVE
LA JOLLA, CALIFORNIA 92093-0403
TEL: (858) 534-6237 FAX: (858) 822-3904
EMAIL: DeanPisano@eng.ucsd.edu

29 November 2020
TO: Graduate Council
FROM: Albert P. Pisano, Dean of Engineering
RE: Doctor of Philosophy Degree in Data Science (Ph.D./DS)

I am writing to express my strong support for the new, proposed Doctor of Philosophy
Degree in Data Science (Ph.D./DS), to be offered by the Halicioglu Data Science Institute
(HDSI). There already exist significant collaborations between HDSI and the Jacobs School of
Engineering, and this new Ph.D. degree will serve to strengthen and expand that collaboration. I
am pleased to report that there are six jointly appointed faculty between HDSI and the Jacobs
School on which this strong collaboration will be based: Benjamin Smarr, Bioengineering,
Jingbo Shang, CSE, Barna Saha, CSE, Joav Freund, CSE, Arun Kumar, CSE, and Rajesh Gupta,
CSE. In my conversations with faculty colleagues I find there is broad support for the proposed
Ph.D. program. Indeed, because HDSI is a unit that has faculty who conduct research with Ph.D.
students, it seems appropriate that HDSI have the ability to offer the proposed degree. Further,
Engineering is willing to collaborate with HDSI in areas of common research interest, including
Algorithms, Artificial Intelligence, Machine Learning, Data Infrastructure and Systems, as well
as application areas of the research. There are many opportunities for broadening the course
offerings at UCSD, and the course offerings in Data Science will benefit the Ph.D. and MS
students in CSE and ECE. Similarly, a number of the graduate classes in CSE and ECE are sure
to be of interest to Data Science PhD students.

I am confident that HDSI and Engineering will move forward together in a mutually
beneficial way, and I anticipate there will be high demand from students for this program, and
look forward to an exciting new crop of Ph.D. researchers.

Sincerely,

Albert ("Al") P. Pisano


Member, US National Academy of Engineering
Member, US National Academy of Inventors
Walter J. Zable Distinguished Professor & Dean
Irwin and Joan Jacobs School of Engineering
University of California, San Diego

PETER F. COWHEY
Dean, School of Global Policy and Strategy
Qualcomm Chair in Communications and Technology Policy
9500 Gilman Drive #0519
La Jolla, California 92093-0519
T: (858) 534-1946 | F: (858) 534-3939
pcowhey@ucsd.edu | gps.ucsd.edu

To: The Graduate Council

From: Peter Cowhey, Dean, GPS

Re: HDSI proposal for a PhD program in Data Science

I have reviewed the proposal and fully endorse it.

The proposal (and all of the HDSI work) recognizes that the application of data science to
applied problem solving requires partnership with domain experts. HDSI has worked to organize
a Data Science in Society cluster that fulfills this philosophy. It includes a number of faculty
members from GPS. As a result, the PhD program has a roster of pertinent researchers (and
teachers) available already in place whose interests will lead to constructive spillovers to the
teaching and research programs of GPS.

The cluster is more than aspirational. HDSI is already providing support for some of the large
research initiatives at GPS. Two examples are the big data program for the analysis of the
politics and economics of China that is housed at our 21st Century China Center and
the "Big Pixel" program employing satellite imagery analysis in our Center on Global
Transformation. These and other initiatives are leading to a new set of graduate courses on
marrying data science to policy analysis. One of our senior faculty members, Professor John
Ahlquist, has made a multi-year commitment to being the "Sherpa" for this undertaking. We
expect some of these courses will be available to HDSI PhD students.

Finally, it should be noted that GPS has no plan for a PhD program that would conflict with the
proposed HDSI offering.

November 30, 2020

TO: Rajesh Gupta, Director
Halıcıoğlu Data Science Institute

RE: Proposal for a PhD program in Data Science

In another letter to the Graduate Council, we offered strong support for HDSI’s plans for an M.S.
program in Data Science. I am pleased to also support HDSI’s plans for a PhD program in Data
Science. The proposed program shows strong coherence with their degree programs at the
undergraduate and master’s levels. This proposal for a PhD is a natural extension of their
curricular planning to date.

We do not have a surplus of courses and programs about data science. Although areas such as
machine learning and artificial intelligence are also taught in Computer Science and Cognitive Science
(and Mathematics), the teaching emphasis and the cases under study across the divisions are different
enough as not to be redundant or overlapping. Further, because we have a number of faculty in the
Social Sciences (e.g. Voytek, Roberts, Geiger) who participate in the teaching programs of the HDSI,
their perspectives are regularly considered and incorporated into HDSI courses – thus maintaining
cooperative intellectual and research programs across divisions while teaching to the needs and
ambitions of the different PhD programs.

I agree with HDSI that the demand for high-level training in data science is such that having multiple
PhD programs is not a problem. In fact, the areas of machine learning and artificial intelligence will
benefit from being taught in different ways for different populations of students who will bring their
skills to a broad job market that has an acute need for skill in this area. The proposed program has
many interesting and thoughtful elements, and it is clear they have thought very carefully about
pedagogy at this level. I welcome their innovative teaching into our campus community.

Sincerely,

Carol Padden
Dean, Division of Social Sciences

University of California San Diego • 9500 Gilman Drive # 0502 • La Jolla, California 92093-0502
Tel: (858) 534-6073 • Fax: (858) 534-7394 • socialsciences.ucsd.edu

November 29, 2020

To: Rajesh K. Gupta, Director, Halıcıoğlu Data Science Institute (HDSI)
From: Cristina Della Coletta, Dean Arts & Humanities
RE: Proposed PhD Degree in Data Science

Dear Professor Gupta:


I am very pleased to offer my support for the creation of a Doctoral Degree program in “Data
Science” (PhD/DS) at UC San Diego.

The proposal frames the program’s core objectives very clearly around three main competency
areas, namely, to train students to (a) collect raw data for computational modeling and analysis;
(b) appropriately use algorithms in a specific domain by developing effective optimization
methods; and (c) interpret, analyze, and visualize the results of these algorithms to complete
relevant scientific inquiry.

The program is designed around multiple specialization tracks, in order to allow students from
diverse academic backgrounds to both develop shared core competencies and explore domain-
elective courses.

As noted in the proposal, the demand for PhD education in Data Science is growing across
multiple institutions. The structure of the HDSI proposed program is especially nimble and
innovative, as it will allow doctoral students to train across various graduate programs through
foundational, core and elective course offerings, in partnership with other Academic Units. This
feature makes the program especially competitive.

Not only will the PhD program in Data Science create timely transdisciplinary opportunities for
many students; it will also play a crucial role in serving underserved and economically-
disadvantaged student populations, thanks to the graduate scholarships offered by HDSI as a part
of the Institute’s foundation-supported initiatives.

The proposal for the PhD in Data Science is well-argued and meticulously presented. I believe
the PhD degree program in Data Sciences will provide a welcome addition to graduate studies at
UC San Diego. I look forward to seeing this program take off, and to further opportunities of
collaboration between the Division of Arts and Humanities and HDSI.

Sincerely,

Cristina Della Coletta


Dean, Arts & Humanities

Division of Arts and Humanities


University of California San Diego ∙ 9500 Gilman Drive #0406 ∙ La Jolla, California 92093-0406
Tel (858) 534-6270 ∙ Fax (858) 534-0091 ∙ artsandhumanities.ucsd.edu

November 29, 2020

Rajesh K. Gupta, Director
UC San Diego
Halıcıoğlu Data Science Institute

Dear Dr. Gupta,

On behalf of the Herbert Wertheim School of Public Health and Human Longevity Science
(HWSPH), I am pleased to offer support for your proposal for a program of graduate study in data
science leading to a Doctor of Philosophy in Data Science (PhD/DS).

Thank you for the opportunity to review and comment on your proposal. Strengths of this
proposal are that it addresses an emerging field of study for which there is a demand for training,
it is highly relevant to a wide range of industries, uses a campus-wide collaborative approach,
and I see it as complementary to the degrees we offer in the HWSPH. It was also great to have
faculty from the HWSPH’s Biostatistics and Bioinformatics group (Drs. Xu and Schwartzman)
included in the planning process.

I offer my best wishes as you create this important training program, and look forward to
supporting it when it is approved.

Sincerely,

Cheryl A. M. Anderson, PhD, MPH, MS


Professor and Dean
Herbert Wertheim School of Public Health
and Human Longevity Science

Professor and Dean • Herbert Wertheim School of Public Health and Human Longevity Science
UC San Diego • 9500 Gilman Drive # 0628 • La Jolla, California 92093 • Tel: (858) 534-8363

Lisa D. Ordóñez, PhD
Dean
Stanley and Pauline Foster Endowed Chair
Rady School of Management
9500 Gilman Drive # 0553
La Jolla, California 92093-0553
Tel: (858) 822-0830
lordonez@ucsd.edu
rady.ucsd.edu

Dec. 1, 2020

To: Graduate Council


From: Dean Lisa Ordóñez
Re: Doctor of Philosophy in Data Science

Dear Colleagues,

Several faculty members at Rady and I have had an opportunity to review HDSI’s proposed degree program for a
Doctor of Philosophy in Data Science. We are very supportive of the proposal and believe it will be
well-received by other units on campus. This proposal is a timely one and will help address an unmet and growing
need for graduate education in data science.

There is a lot to like in the details of HDSI’s proposal. First, a fundamental goal of the program is to “lay the
foundation for future researchers who can expand the boundaries of knowledge in Data Science itself”. This is an
important aspect that will help produce capable researchers who will become leaders in theory and practice of
data science and advance the emerging field.

We view the PhD in Data Science program as complementary to our degree programs and believe the experience
of graduate students focusing in the areas of data science and its applications will be positively impacted by its
existence. For instance, our students will mutually benefit from some of the new graduate courses that are created
as part of this proposal. While our students typically take their breadth electives within Rady, some students seek
courses in departments across campus. These electives require approval by program directors who assess fit.

In closing, I am supportive of the Doctor of Philosophy in Data Science program presented in this proposal. It is
very well thought out and designed with aspects that make it unique within UC San Diego. I anticipate that the
program will be successful in achieving its goals.

Best regards,

Lisa D. Ordóñez, PhD


Dean, Rady School of Management
Stanley and Pauline Foster Endowed Chair

UNIVERSITY OF CALIFORNIA, SAN DIEGO

BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO • SANTA BARBARA • SANTA CRUZ

Professor James McKernan, Chair
Department of Mathematics
9500 Gilman Drive # 0112
La Jolla, CA 92093–0112
Tel: (858) 534-6347
Fax: (858) 534-5273
Email: jmckernan@math.ucsd.edu
Url: http://www.math.ucsd.edu/~jmckerna/

October 30, 2020

Professor Rajesh Gupta


Director HDSI
UCSD

Dear Professor Gupta,


I am writing to express the Mathematics Department’s support for the new proposed Doctor of Philosophy in
Data Science (PhD/DS) to be offered by the Halıcıoğlu Data Science Institute (HDSI) in collaboration with
various academic units on the UC San Diego campus.

The proposed program will give its students the knowledge, skills and awareness required to perform
data-driven tasks and to do research that will expand the boundaries of Data Science. This training and research will
provide students with the knowledge, skills and research expertise for a career in Data Science, in academia,
industry or the civil service.

As mathematics and statistics are integral components of the interdisciplinary field of Data Science, and
indeed many UCSD Mathematics Department faculty are affiliated with HDSI, the department is pleased to
further cement the connections that this proposal will make with the Mathematics Department. In particular,
many courses (including DSC 205, 210, 211, 212, 213, 240, 241, 242, 243, 244, 281ABC, 284, 285, 287AB)
will often be taught by faculty who have joint appointments with HDSI and Mathematics.

The Mathematics Department looks forward to cooperating with HDSI on this program to further catalyze
connections and collaborations related to data science.

Yours sincerely,

James McKernan, FRS


Department Chair
Charles Lee Powell Endowed Chair in Mathematics

UNIVERSITY OF CALIFORNIA, SAN DIEGO

BERKELEY · DAVIS · IRVINE · LOS ANGELES · RIVERSIDE · SAN DIEGO · SAN FRANCISCO · SANTA BARBARA · SANTA CRUZ

JONATHAN COHEN
PROFESSOR AND CHAIR
DEPARTMENT OF PHILOSOPHY
9500 GILMAN DRIVE, DEPT. 0119
LA JOLLA, CALIFORNIA 92093–0119
(760) 814-1110
FAX: (858) 534-8566
cohen@ucsd.edu
http://aardvark.ucsd.edu

October 8, 2020

UC San Diego Academic Senate


Dear Committee Members:

On behalf of the Department of Philosophy, I write to offer support for the proposal for a PhD program in
Data Science that would be housed within the Halıcıoğlu Data Science Institute (HDSI).

Data science is clearly an important emerging field with connections to many areas of intellectual inquiry
spread across our University. The creation of a PhD program that would capitalize on these resources in a
way that benefits a new generation of scholars is an exciting prospect.

We hope and expect that the establishment of such a program will lead to further cooperation in research
and instruction between Philosophy and HDSI in the areas of causal discovery, machine learning, data ethics,
and more. We look forward to discussing ways in which we might contribute as the shape of the new program
becomes clearer.

We are confident that HDSI has the infrastructure and expertise to run the proposed PhD program. Our
Department does not expect that the program will negatively impact our research or pedagogical missions
at any level, and so endorses the proposal without reservations.

Please feel free to contact me with any additional questions.

Sincerely,

Jonathan Cohen, Professor and Chair


DEPARTMENT OF COMMUNICATION, MC0503
9500 GILMAN DRIVE
LA JOLLA, CALIFORNIA 92093-0503
OFFICE: (858) 534-0234
FAX: (858) 534-7315

To: Rajesh K. Gupta, Director, Halıcıoğlu Data Science Institute (HDSI)

From: Brian Goldfarb, Associate Prof. and Chair
Department of Communication

Subject: Support for proposed doctoral degree program in “Data Science” (PhD/DS)

October 26, 2020

Dear Rajesh,

I am writing to express support from the Department of Communication for the proposed
doctoral degree program in “Data Science” (PhD/DS) to be offered by the Halıcıoğlu Data
Science Institute (HDSI). As a key partner to HDSI in building Data Science, the Communication
Department views the proposed program as an important step in advancing interdisciplinary
cross-fertilization between the two units. Our department has two faculty affiliated with HDSI:
Kelly Gates and Lilly Irani, and one, Stuart Geiger, who holds a joint appointment across the two
units. Since joining the faculty this fall, Prof. Geiger has been working to establish a working
group on Critical Data Studies which promises to build a tighter fabric of connections between
HDSI and Communication. The creation of the proposed PhD promises to set up a platform to
expand the interactions among our faculty around research and graduate advising/mentorship.

The proposal lays out a plan for a program with clear standards for academic rigor. The
curriculum includes a well-considered set of requirements as well as options that balance the
establishment of shared scholarly concerns with cross-fertilization of contributing disciplines.
The proposal also articulates the impressive scope and strengths of faculty who would
participate in the program and establishes a reassuring picture of the adequacy of the facilities
that will be dedicated to research and teaching. Finally, the initial success of the undergraduate
program and the interest from graduate students in affiliated departments signals that HDSI can
anticipate a strong applicant pool, while the growth of the field bodes well for the placement
prospects for its graduates.

In summary, I confirm support of this proposal and look forward to the opportunities for
collaborations between faculty and students in our Department and HDSI.

Sincerely,

Brian Goldfarb, Assoc. Professor and Chair, Department of Communication


PROF. BILL LIN ELECTRICAL & COMPUTER ENGINEERING
TEL: (858) 822-1383 9500 GILMAN DRIVE, MAIL CODE 0407
E-MAIL: billlin@eng.ucsd.edu LA JOLLA, CALIFORNIA 92093-0407

DATE: October 26, 2020

TO: Graduate Council

FROM: Bill Lin, Chair, Department of Electrical and Computer Engineering

RE: Doctoral Degree in Data Science

Dear Colleagues,

It is my pleasure to write this strong letter of support for the newly proposed Doctoral Degree in Data
Science (PhD/DS), to be offered by the Halıcıoğlu Data Science Institute (HDSI). The proposed
doctoral degree will serve the need for advanced graduate studies in the area of Data Science.

Demand for data scientists is clearly exploding in both academia and industry, as data science is
being applied in all aspects of society. The proposed PhD/DS program in HDSI is very timely to
serve this need. The proposed program is very strong in both quality and academic rigor. Further,
HDSI is fully capable of administering this program given the size and expertise of the HDSI faculty
as well as the facilities and budgets available to HDSI. Also, the exploding demand for data scientists
in both academia and industry will ensure a strong applicant pool as well as exceptionally strong
placement prospects for the graduates of the proposed PhD/DS program. In addition, the proposed
HDSI PhD/DS program will facilitate closer engagements and collaborations between the faculty in
HDSI and ECE, as well as other departments across campus.

Overall, the proposed HDSI PhD/DS program will undoubtedly bring much greater visibility to UC
San Diego as the preeminent university for artificial intelligence, machine learning, and data science.
I look forward to cooperating with HDSI on this program as well as other initiatives.

Best regards,

Bill Lin, Chair


Electrical and Computer Engineering Department
University of California, San Diego
KUN ZHANG, PH.D. TELEPHONE (858) 822-7876
PROFESSOR FAX (858) 534-5722
CHAIRMAN E-MAIL: k4zhang@ucsd.edu
DEPARTMENT OF BIOENGINEERING
9500 GILMAN DRIVE 0412
LA JOLLA, CALIFORNIA 92093-0412

November 29, 2020


To: Professor Rajesh Gupta,
Director, HDSI at UCSD

Re: Data Science PhD Program

I am writing on behalf of the Department of Bioengineering to enthusiastically endorse and support the
graduate program Doctor of Philosophy in Data Science.

Given the vast importance of Data Sciences in modern society, it is imperative that we train qualified
professionals who can join the workforce solving problems where big data is the paradigm. I have
reviewed your proposed program and the design of the curriculum is excellent and will be ideal for training
students.

The proposed PhD Program will benefit from the interactions with Bioengineering and our top-notch
Bioengineering graduate program. We anticipate that students in your PhD Program will have the option
to take a number of Bioengineering courses related to systems biology, genomics and imaging. I am also
excited that we have an outstanding joint hire in Ben Smarr, who will serve as a bridge between our
programs. I should also add that my colleague Dr. Shankar Subramaniam is launching a new course on
Biomedical Data Sciences in the academic year 2020-21, which would be valuable to the two recently
proposed MS Programs in BE and HDSI, as well as this PhD program.

Several other courses offered by Bioengineering, including graduate courses in technologies that generate
vast data in biomedicine and complex modeling courses that transform data into knowledge, would be
valuable for our Programs.

I look forward to working with you and helping make the PhD/DS graduate program a harbinger
of the future.

Best regards,

Kun Zhang
Professor and Chair
Department of Bioengineering
DATE: 11/29/2020

FROM: Sorin Lerner, Chair, Department of Computer Science and Engineering

TO: Rajesh K. Gupta, Director, Halıcıoğlu Data Science Institute (HDSI)

Dear Rajesh,

The department of Computer Science and Engineering fully supports the proposed PhD program
in Data Science. Since HDSI is a unit that has faculty who conduct research with PhD students, it
only makes sense that HDSI have its own PhD program. We are happy to work with HDSI on areas of
common interest, including Algorithms, Artificial Intelligence, Machine Learning, Data
Infrastructure and Systems, and application areas of common interest. There are many
opportunities for broadening the course offerings at UCSD, and the course offerings in Data
Science will benefit the PhD and MS students in CSE. Similarly, some of the graduate classes in
CSE, such as the CSE 202 course on Algorithms (but others too), will be of interest to Data
Science PhD students, and we are happy to make seats available to the Data Science PhD
students in those classes.

Sincerely,

Sorin Lerner
UNIVERSITY OF CALIFORNIA, SAN DIEGO UCSD
BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO • SANTA BARBARA • SANTA CRUZ

STEFAN LEUTGEB, Ph.D. PACIFIC HALL


CHAIR AND PROFESSOR ROOM 3225A
NEUROBIOLOGY SECTION 9500 GILMAN DRIVE
DIVISION OF BIOLOGICAL SCIENCES LA JOLLA, CA 92093-0357
TEL: (858) 246-0824
FAX: (858) 534-7309
EMAIL: SLEUTGEB@UCSD.EDU

Nov 2, 2020

Rajesh K. Gupta
Director
Halıcıoğlu Data Science Institute

Dear Rajesh,

I write to express my highest enthusiasm for the proposed doctoral degree program in Data Sciences by
the Halıcıoğlu Data Science Institute (HDSI). As you know, there is a strong connection of Neurobiology
research with many of the core themes of HDSI, including artificial intelligence, machine learning, and
scientific discovery. To take advantage of these shared interests, Neurobiology and HDSI have hired several
faculty at the intersection between our fields over just the past three years, including Gal Mishne and Alex
Cloninger at HDSI and Marcus Benna and Yonatan Aljadeff in Neurobiology. In addition, our first joint
faculty member, Mikio Aoi, has just been hired and arrived on campus.

A PhD program in Data Sciences will fill a major gap in our current offerings of PhD programs, in that
your program will not be merely geared towards prospective students in mathematics, computer science
and engineering but will also attract those with an avid interest in one of the application sciences such as
chemistry and biology. From our perspective, computational neuroscience, in combination with big data
from neural recordings is a discipline that strongly benefits from the integration with data science, and
your PhD program will be unique in attracting students at this intersection between disciplines.
Conversely, many of the foundational ideas for machine learning have at least a loose analogue in circuit
mechanisms that are used by the brain, and there is an enormous potential in further applying findings
from rigorous experimental research to engineering applications. These are new frontiers that can be
effectively approached by the type of PhD applicant that only your program can attract, such as students
with a background in both the life sciences and in computer sciences. Importantly, there is also an
increasing number of prospective employers in both academia and industry who are in need of a
workforce who can lead projects in data analytics in fields that include molecular biology, biochemistry,
and neurobiology.

HDSI has already brought together an impressively interdisciplinary group of faculty who have the
expertise to train a new generation of data scientists. Including trainees at the doctoral level is particularly
valuable, because PhD students do not only contribute to the training of students at the undergraduate and
Masters level but are also invaluable for the research mission and the continued national and international
leadership of faculty at HDSI. The launch of your proposed PhD program is therefore well timed in that
all the necessary expertise across disciplines is now in place so that there will be a rewarding interaction
that will further strengthen the status of HDSI as one of the premier institutions of its kind. The expertise of
the faculty with appointments and joint appointments in HDSI is well suited to teaching the range
of classes and seminars that are proposed. Taken together, the PhD program in Data Sciences will not
only lead to a rigorous education of the students in the program but will also fill a gap that is currently not
covered by more specialized PhD programs in the analytical and application disciplines. By admitting
students who can bridge gaps between these established programs, numerous PhD programs will be
substantially strengthened by the cohort of students that can go between these diverse fields.

Given that close collaborations between HDSI and our department have already been developing among
faculty, I anticipate that your PhD program will only further foster these interactions and become a pillar
for the type of interdisciplinary science that UC San Diego stands for. Such interdisciplinarity will benefit
the entire campus community and particularly students at all levels within your program as well as
beyond your program. I therefore most enthusiastically support the addition of a PhD program in Data
Sciences.

Sincerely,

Stefan Leutgeb, PhD


Professor and Chair
Neurobiology Section, Division of Biological Sciences
Fellow, Kavli Institute for Brain and Mind
University of California, San Diego
UNIVERSITY OF CALIFORNIA, SAN DIEGO UCSD
BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO • SANTA BARBARA • SANTA CRUZ

THAD KOUSSER DEPT. OF POLITICAL SCIENCE 0521


Professor and Department Chair 9500 Gilman Drive
E-MAIL: tkousser@ucsd.edu La Jolla, California 92093-0521
TEL: (858) 534-3239 FAX: (858) 534-7130

October 30, 2020

Dear Academic Senate Members,

As chair of the Department of Political Science, I am writing to voice my department’s strong support for
the Halıcıoğlu Data Science Institute’s Proposal for a Program of Study in Data Science Leading to a
Degree in Doctor of Philosophy in Data Science. We have reviewed this proposal and are excited about
its potential to be both a rigorous and successful program in its own right and to serve as a central force
uniting other campus strengths in bolstering UC San Diego’s emerging leadership in data science
education and research.

We believe that this proposal lays out a rigorous course of study in data science that the faculty
associated with the HDSI – which includes some of our faculty – are highly qualified to deliver. It will
deliver foundational skills early in the program and important applied machine learning skills as students
progress. We are especially encouraged by the inclusion of a course in Arts, Humanities, Society, Policy
and Social Sciences, which will connect students with the diverse disciplinary strengths of our campus.

We believe that students graduating with a Ph.D. in the proposed data science degree would have strong
placement prospects both within and outside of academia. Within academia, there is an increasing demand
across disciplines to hire faculty with data science expertise. The Department of Political Science has
itself hired successfully in data science and is continuously looking to expand the group of political
methodologists with a data science focus. Data scientists are also in high demand in the non-academic
sector, and supply is still limited. The Political Science Department has been very successful in placing its
students with data science expertise in a variety of companies, ranging from Amazon to Facebook to
Google. We therefore believe that students who graduate with a dedicated data science degree would have
very strong non-academic job prospects.

The proposed program also offers potential synergy effects across departments on campus. A data science
PhD program would offer additional courses that would be attractive to PhD students from other
programs. At the same time, Data Science could potentially draw from existing courses that fit very well
in the proposed curriculum. If I can answer any further questions about this matter, please feel free to call
me at 858-246-0721 or to email me at tkousser@ucsd.edu.

Sincerely,

Thad Kousser
October 28, 2020

RE: Support for proposed PhD Program in Data Science

Dear Dr. Gupta:

I am writing in enthusiastic support of the PhD Program in Data Science currently proposed at UC San
Diego. In my capacity as a Co-Director of the UC Davis Center for Data Science and Artificial Intelligence
Research and as chair of our campus-wide Data Science Steering Committee, I have been closely following
the developments at UC San Diego and the HDSI, viewing them as remarkably useful and exemplary for
our own deliberations. I believe UCSD has gotten crucial decisions right in the past and is in the process of
adding another successful piece to its data science portfolio.

The planned introduction of a PhD Program in Data Science is the natural next step for your campus and
completes the educational data science infrastructure, complementing the already existing undergraduate
major and the recently approved MS program. While the latter programs help provide industry,
government and academia with graduates versed in the application of diverse data scientific tools, a
maturing of the field will require mapping out and building the intellectual foundations that make data
science a unique new field, and placing it within the existing disciplinary landscape. This is best done in
conjunction with the development of a strong PhD program that allows for this process to play out in a
coordinated yet flexible fashion. Outside of academia, future PhD graduates will take on leadership
positions in industry and government that require more data science expertise than expected of those
with undergraduate and MS degrees. I imagine the presence of the HDSI will enable a seamless integration
of academia, industry and government interests into a coherent whole. Given the all-encompassing role
data science is expected to play in the future, the PhD program will be of great service to all constituents
at UC San Diego. It will in particular help satisfy student demand for rigorous PhD-level training in data
science.

I specifically like the proposers’ thoughtfulness in defining the aims of the PhD program and devising the
curriculum, clearly bearing in mind the transdisciplinarity and evolving nature of the field, within the UC
San Diego data science ecosystem and beyond. The strategy laid out in the proposal made available to me
is sound and constitutes a broad consensus of the involved parties. It is also laudable that guidelines put
forward by the National Academies have been followed in mapping out the structure of the coursework,
inclusive of important ethical components. The proposed curriculum will serve future PhD students in Data
Science well. Faculty members listed as having teaching responsibilities in HDSI programs include
renowned experts and leaders in their fields and should ensure the highest quality of instruction. I was pleased to
see the on-ramping options that will allow students from diverse backgrounds to enter the program. Once
the program is in existence, I will make sure that undergraduate and MS students in Statistics and other disciplines at UC
Davis are made aware of this exciting new opportunity for graduate education at UC San Diego.

Overall, I view UC San Diego as primed to play a major role in data science research and education on the
national level. The PhD program is the last piece missing to complete the full educational pipeline. The
proposal is well thought out and administered by leading faculty at one of the foremost data science
institutes. The PhD degree will be an outstanding addition to the portfolio of graduate degrees on your
campus, providing your students with a pathway to the high-level jobs in data-intensive fields the US needs
to cultivate in order to ensure a prosperous as well as equitable future. The proposal has my enthusiastic
support.

Please let me know if you should have any further questions.

Yours sincerely,

Alexander Aue
Professor and Chair
Department of Statistics
Co-Director
Center for Data Science and Artificial Intelligence Research
University of California, Davis
+1-530-752-0560
aaue@ucdavis.edu
Carnegie Mellon University
Department of Statistics & Data Science
232 Baker Hall
5000 Forbes Avenue
Pittsburgh, PA 15213

Larry Wasserman
UPMC Professor
(412) 268-8727
larry@stat.cmu.edu
www.stat.cmu.edu/~larry

Rajesh Gupta
HDSI Director
UCSD

October 24, 2020

Dear Professor Gupta:

I am writing in support of the proposal to create a doctoral degree program in HDSI.

Data science is one of the fastest growing areas in academia. The reason is that, with
the flood of information that we now have, data science plays a role in every science
and in understanding societal issues. Every statistics, data science and machine
learning doctoral program that I know of has experienced an unprecedented increase
in demand both in terms of applicants and in demand for graduates.

Increasing the capacity to service more doctoral candidates in data science is thus
critical to the infrastructure of science and public policy. In short, we need more
doctoral programs.

HDSI is well positioned to offer a doctoral program. There is already a B.Sc. and
there is a large pool of talented faculty with an impressive array of research
expertise. I have reviewed the proposal and it is clear that the proposed program has
been carefully thought out. I should add that UCSD is unusual in that it does not have
a statistics department. Having a doctoral program in data science will thus fill a
serious gap.

In summary, the proposal to create a doctoral program in data science in HDSI is
well supported and I strongly support this proposal.

Sincerely,

Larry Wasserman

UF Informatics Institute E251 CISE Bldg
PO Box 118545
Gainesville, FL 32611-8545
352-294-3912

October 26, 2020

Rajesh Gupta, Ph.D.


Distinguished Professor
HDSI Founding Director
UC San Diego

Dear Professor Gupta,

It is my pleasure to write a letter of support for the new Ph.D. program in Data Science proposed by the
Halıcıoğlu Data Science Institute (HDSI) of UC San Diego. The demand for graduates in data science is
very high in all sectors of the economy (even during the pandemic), and the need for Ph.D. level
researchers is consequently becoming evident in industry, as well as academia.

I have been involved with the design of two data science programs during my career. The first is the
Masters in Data Science at the University of Michigan, which was launched in late 2015 while I was on
the faculty there. It is jointly administered by the Departments of Computer Science and Statistics and the School
of Information and provides training in both core methodologies (programming, data structures, data
management, probability, statistical inference, data modeling, machine learning, optimization and
computational methods) and domain expertise through elective coursework. The degree also requires a
capstone course that requires students to do an end-to-end data science project involving
understanding the scientific question of interest, data collection and curation, modeling and
computation and finally communication of the results through a written report and an oral presentation.

The second data science program I have been involved with is the Data Science undergraduate major
at the University of Florida, a program jointly administered by the Departments of Statistics,
Mathematics and Computer Science. Its philosophy is analogous to the previous program and aims to
provide training in core methodologies, but also expose students to additional training in data science
problems in specific domains (e.g. social sciences, natural sciences, public health) through specific
thrusts.

The Ph.D. program proposal developed by HDSI is elaborate and well thought out in terms of
proposed coursework, which covers state-of-the-art technical topics in depth but also provides
exposure to a wide range of topics necessary to produce well-rounded data scientists. I was
particularly impressed by the fact that the Ph.D. program will be open to students from various and
diverse backgrounds and is designed to prepare them for success. To that end, students will attend
(as needed) certain carefully designed preparatory classes on core methods (computing, mathematics
and statistics). Hence, all incoming students will be brought to the same page by the end of the first year,
including students who are admitted with little quantitative background.

UCSD has a large Department of Mathematics, but (surprisingly) it does not have a Department of
Statistics. The Statistics group within the UCSD Mathematics Department is very strong in
Mathematical Statistics. The proposed Ph.D. program in Data Science may thus provide an outlet for
some top-quality, more computationally and applications-oriented work coming out of UCSD.

In summary, I believe this new Ph.D. is carefully designed to accommodate a wide range of students
and thus it represents an exciting development for UCSD. I believe it will be highly successful and I fully
support the proposal.

Please do not hesitate to contact me should you need more information.

Sincerely,

George Michailidis
Founding Director, UF Informatics Institute
U Florida Research Foundation Professor
Professor of Statistics and Computer Science
University of Florida

The Foundation for The Gator Nation


An Equal Opportunity Institution
UNIVERSITY OF CALIFORNIA

BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO • SANTA BARBARA • SANTA CRUZ

The Donald Bren School of Information and Computer Sciences
Department of Computer Science
Irvine, CA 92697-3435
Tel: (949) 824-0016
Fax: (949) 824-4056
www.ics.uci.edu

January 8th, 2021

I would like to commend UC San Diego and HDSI for their initiative to create a Ph.D. program in
Data Sciences. This program, spearheaded by a set of very talented and dedicated faculty, will
undoubtedly continue the meteoric trajectory UC San Diego is on in the important and timely
area of data sciences. As a database faculty member with a keen interest in data sciences, I have been closely
monitoring UCSD's efforts led by HDSI over the past several years.

It is now well recognized that data science is destined to be the catalyst for disruptive innovations
in science and technology, leading to unprecedented changes and improvements in all walks of
modern society. We live in a time of data revolution where machines, sensors, and a variety of data
capture devices enable us to collect and monitor every aspect of our lives, whether they be personal
experiences, health, social interactions, or our interactions with engineered and physical
systems. The ability to automatically and seamlessly monitor social as well as physical worlds at
various spatial and temporal granularities has created unprecedented opportunities leading to major
data-centric innovations, new efficiencies, and new industries. Companies such
as Google and Yahoo! have used such data to provide improved search and better personalized
experiences for individuals on the internet, and have designed novel ways of monetizing and funding new
ideas through the placement of advertisements. Moving beyond internet companies, organizations such
as health care providers, product companies, political activist groups, and news media have
developed tools to monitor public opinion feedback about their goods and services and use such
feedback to launch new product lines or new models. While the above emphasizes the impact
of data-centric approaches on industry, their role in the future of science and technology, and in new
discoveries whether they be in medicine, health sciences, oceanography, or cosmology, will be
even more profound.

UC San Diego was amongst the early schools to realize the central role data science was to play
in the future of education and, now with its proposed Ph.D. program, it is all set to lead the
academic community in the creation of the foundational principles that form the core of data-driven
explorations, as well as to expand the boundaries of knowledge and contribute to tools and
techniques that will expand the nascent field of Data Sciences. While I cannot emphasize enough
the timeliness of creating such a program and the very strong arguments the proposal makes as to
how such a program will help not just UC San Diego but the academic community, what I found
truly exceptional was how well thought out the operational plans for creating such a program are.
In particular, the proposal clearly articulates the important role of multidisciplinary research and
education in such a program and, based on such a realization, it is noteworthy that the leadership
at HDSI systematically approached over 200 faculty (drawn from Engineering, Computer
Science, Physical Sciences, Arts, Humanities, Social Sciences, Medicine, and Health) who are
now part of the affiliates program to create the cohesive, integrative vision of Data Sciences
outlined in the proposal. The approach emphasizes not just principles, algorithms, mathematical
foundations, tools and technologies at the core of data-driven approaches, but provides an
integrative view that incorporates domain sciences to set a path forward for doctoral dissertations
based on interdisciplinary collaborations that open up opportunities for breakthroughs in areas such
as physics, medicine, and the social sciences. Indeed, the architects of the proposal have this view
firmly in mind when they observe two unique aspects of the proposed Ph.D. program compared
to existing data science efforts at UC San Diego and other universities, which are typically part of
Computer Science and Machine Learning. While existing efforts can be expected to advance
algorithmic solutions, machine learning, and data management principles that form the theoretical
underpinning of data science, a truly effective program (such as the one promoted by the proposal)
must seek the involvement of researchers with multidisciplinary backgrounds who embrace an
interdisciplinary curriculum with faculty and students from disciplines interested in exploring data
sciences in order to advance science and technology using data-driven approaches. Indeed,
interactions of specialists and research at the boundaries between disciplines are where the largest
advances in data sciences and benefits of data-driven approaches are expected to be.

In looking through the details of the program articulated in the proposal, it is clear that the proposal
writers have done their homework and tried to strike a balance in terms of courses and requirements
that highlights quality and academic rigor while at the same time ensuring the success of the program
from the very beginning. As is always the case, additional needs will emerge when the
program is launched. The proposal includes mechanisms necessary for such future adaptations
based on emerging needs. With the faculty talent associated with the proposal, both at the
leadership level and among the excellent new hires and faculty affiliated with
HDSI, I have no doubt that once the Ph.D. program is launched, it will be monitored and improved
based on initial lessons learnt, and the progress of the program will set the example for other
universities, including my own, UC Irvine, to follow.

It is for all these reasons that I very enthusiastically support the proposed UCSD effort.
The proposal is very well thought out. It addresses an emerging need and is the logical next step
for HDSI as it establishes itself as a center of excellence and leadership in data sciences.

Please do not hesitate to contact me if I can be of further assistance.

Prof. Sharad Mehrotra


IEEE Fellow
Department of Computer Science
University of California, Irvine, CA 92617
UNIVERSITY OF CALIFORNIA, SAN DIEGO

BERKELEY · DAVIS · IRVINE · LOS ANGELES · MERCED · RIVERSIDE · SAN DIEGO · SAN FRANCISCO · SANTA BARBARA · SANTA CRUZ

MECHANICAL AND AEROSPACE ENGINEERING +1 (858) 534-0708


LA JOLLA, CALIFORNIA 92093

Mechanical and Aerospace Engineering


Jacobs School of Engineering
University of California
9500 Gilman Dr
La Jolla, California 92093
PHONE: +1 (858) 822-7930
EMAIL: cortes@ucsd.edu
April 29, 2020

To: Professor Rajesh Gupta, Director HDSI


Re: MAE247: Cooperative Control of Multi-Agent Systems
From: Jorge Cortes, Professor, Mechanical and Aerospace Engineering

Dear Rajesh,

This is to confirm that I support the listing of the graduate course “MAE247: Cooperative Control of Multi-Agent
Systems” as an elective for the HDSI Masters program under the Networks specialization and warmly welcome
qualified graduate students in the course.

Sincerely,

Jorge Cortés
Professor
UNIVERSITY OF CALIFORNIA, SAN DIEGO UCSD

BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO • SANTA BARBARA • SANTA CRUZ

DEPARTMENT OF POLITICAL SCIENCE, 0521 9500 GILMAN DRIVE


TELEPHONE: (858) 534-6807 LA JOLLA, CALIFORNIA
FAX: (858) 534-7130 92093-0521

April 29, 2020

Dear Colleagues:

To: Professor Rajesh Gupta, Director HDSI


Re: POLI 287: Multidisciplinary Methods in Political Science: Social Networks
From: James Fowler, Professor

This is to confirm that I support the listing of the above course as an elective for the HDSI
Masters program under the Networks specialization and welcome qualified graduate
students in this course.

Sincerely,

James H. Fowler
Professor
University of California, San Diego
fowler@ucsd.edu
UC San Diego Health
Department of Medicine
9500 Gilman Drive MC-0688
La Jolla, CA 92093-0688
T: +1 858.822.4558
F: +1 858.534.4246
tideker@ucsd.edu
idekerlab.ucsd.edu

Trey Ideker, Ph.D.
Professor of Medicine
Adjunct Professor of Bioengineering and Computer Science
Director, NCI Cancer Cell Map Initiative (CCMI)
Director, NIGMS National Resource for Network Biology (NRNB)
Director, NIMH Psychiatric Cell Map Initiative (PCMI)

April 29, 2020

Professor Rajesh Gupta, Director HDSI
UC San Diego
rgupta@ucsd.edu

Re: BNFO 286 / MED 283: Network Biology and Biomedicine

Dear Dr. Gupta,

This is to confirm that I support the listing of the above course as an elective
for the Halıcıoğlu Data Science Institute (HDSI) Masters program under the
Networks specialization and welcome qualified graduate students in this
course.

Sincerely,

Trey Ideker, Ph.D.


UNIVERSITY OF CALIFORNIA, SAN DIEGO UCSD
BERKELEY ⋅ DAVIS ⋅ IRVINE ⋅ LOS ANGELES ⋅ MERCED ⋅ RIVERSIDE ⋅ SAN DIEGO ⋅ SAN FRANCISCO ⋅ SANTA BARBARA ⋅ SANTA CRUZ

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING 9500 GILMAN DRIVE


UNIVERSITY OF CALIFORNIA, SAN DIEGO LA JOLLA, CALIFORNIA 92093-0404
THE IRWIN AND JOAN JACOBS SCHOOL OF ENGINEERING FAX: (858) 534-7029

May 4, 2020
To: Prof. Rajesh Gupta, Director HDSI

This is to confirm that I support the listing of the following courses that I teach as electives
for the HDSI MS program under the Bio specialization and welcome qualified graduate
students in these courses.

1. CSE280A: Algorithms for population genetics


2. CSE283/Beng203

Sincerely,

Vineet Bafna, PhD


Professor, CSE, #4218
UC San Diego
9500 Gilman Drive
La Jolla, CA 92093-0404
vbafna@cs.ucsd.edu
858-822-4978 (O)
858-534-7029 (F)
http://www.cs.ucsd.edu/~vbafna
UNIVERSITY OF CALIFORNIA, SAN DIEGO UCSD

BERKELEY · DAVIS · IRVINE · LOS ANGELES · MERCED · RIVERSIDE · SAN DIEGO · SAN FRANCISCO · SANTA BARBARA · SANTA CRUZ

DEPARTMENT OF MATHEMATICS +1 (858) 534-4889


9500 GILMAN DRIVE, #0112 FAX +1 (858) 534-5273
LA JOLLA, CALIFORNIA 92093-0112 EMAIL acloninger@ucsd.edu

May 3, 2020

Dear Colleagues,
To: Professor Rajesh Gupta, Director HDSI
This is to confirm that I support the listing of the course Math 277A: Topics in Computational and Applied Math:
Diffusion Geometry and Metric Graph Learning as an elective for the HDSI Masters program under the Networks
specialization, and welcome qualified graduate students in the course.

Sincerely yours,

Alexander Cloninger, Ph.D.


Assistant Professor of Mathematics and
Halıcıoğlu Data Science Institute
University of California, San Diego
DEPARTMENT OF COGNITIVE SCIENCE
9500 GILMAN DRIVE, 0515 858-822-3317
LA JOLLA, CALIFORNIA 92093-0515 ajyu@ucsd.edu

MAY 1, 2020 
 
Dear Colleagues: 
 
To: Professor Rajesh Gupta, Director HDSI 
Re: COGS 243: Statistical Inference and Data Analysis (4 units) 
From: Angela Yu, Associate Professor 
 
This is to confirm that I support the listing of the above course as a general elective for the HDSI Masters
program under Group B (Core Knowledge and Skills Areas), and welcome qualified graduate students in this
course.
 
 
Sincerely, 
 

 
 
Angela Yu, PhD 
Associate Professor 
University of California, San Diego 
 
BERKELEY • DAVIS • IRVINE • LOS ANGELES • MERCED • RIVERSIDE • SAN DIEGO • SAN FRANCISCO SANTA BARBARA • SANTA CRUZ

BRADLEY VOYTEK 9500 GILMAN DR.


ASSOCIATE PROFESSOR LA JOLLA, CA
UC SAN DIEGO 92093-0515

2020 May 01

To: Professor Rajesh Gupta, Director, Halıcıoğlu Data Science Institute


Re: COGS 280: Neural Oscillations
From: Bradley Voytek, Associate Professor

This letter confirms my support for including the above class, COGS 280: Neural Oscillations, as an
elective for the Halıcıoğlu Data Science Institute Master's program, Computational Neuroscience
Specialization Area.
I look forward to working with students from HDSI!

Sincerely,

Bradley Voytek, Ph.D.

UC San Diego
Department of Cognitive Science
Halıcıoğlu Data Science Institute
Neurosciences Graduate Program

Cc:
Professor Doug Nitz, Chair, Cognitive Science
Jennifer Morgan, MSO, HDSI
DEPARTMENT OF COGNITIVE SCIENCE, 0515
9500 GILMAN DRIVE
LA JOLLA, CA 92093-0515
Fax: (858) 534-1128

To: Professor Rajesh Gupta, Director, Halıcıoğlu Data Science Institute
Re: COGS 225: Image Recognition
From: Zhuowen Tu, Professor

This letter confirms my support for including the above class, COGS 225: Image Recognition, as an elective for the
Halıcıoğlu Data Science Institute Master's program, Computational Neuroscience Specialization Area.

Sincerely,

Zhuowen Tu

Professor
Department of Cognitive Science,
Department of Computer Science and Engineering (affiliate)
University of California, San Diego
Email: ztu@ucsd.edu
Tel: +1-858-822-0908

Cc:
Professor Doug Nitz, Chair, Cognitive Science
Jennifer Morgan, MSO, HDSI
SIAVASH MIRARABBAYGI PHONE: (858) 822-6245
ASSISTANT PROFESSOR OF ELECTRICAL AND COMPUTER ENGINEERING E-MAIL: smirarabbaygi@ucsd.edu
9500 GILMAN DRIVE, MC 0407
LA JOLLA, CA 92093

May 4, 2020

To: Prof. Rajesh Gupta, Director HDSI


Re: ECE208: Computational Evolutionary Biology
From: Siavash Mirarab, Assistant Professor, ECE

This is to confirm that I support the listing of the above course that I teach for the graduate
program in ECE at the Masters level as an elective for the HDSI Masters program under any
appropriate specialization and welcome qualified graduate students in this course.

With kindest regards

Sincerely,

Siavash Mirarab

MASSIMO FRANCESCHETTI PHONE: (858) 822-2284
PROFESSOR OF ELECTRICAL ENGINEERING FAX: (858) 534-2486
9500 GILMAN DRIVE, MC 0407 E-MAIL: massimo@ece.ucsd.edu
LA JOLLA, CALIFORNIA 92093-0407

April 30, 2020

To: Prof. Rajesh Gupta, Director HDSI


Re: ECE227: Big Network Data
From: Massimo Franceschetti, Professor, ECE

This is to confirm that I support the listing of the above course that I teach for the machine
learning and data science graduate program in ECE at the Masters level as an elective for the
HDSI Masters program under the Networks specialization and welcome qualified graduate
students in this course.
With kindest regards

Massimo Franceschetti

Appendix C: Catalogue Copy Description [Draft]

Data Science (DSC)


All courses, faculty listings, and curricular and degree requirements described herein are
subject to change or deletion without notice. Updates may be found on the Academic Senate
website: http://senate.ucsd.edu/catalog-copy/approved-updates/.

The Graduate Program


The graduate program offers a master of science degree and a doctor of philosophy degree in
data science. To be accepted into either course of study, a student should have a BS or BA
degree in relevant fields or work experience in Data Science, or be able to demonstrate an
equivalent competency.

Admission to the graduate program is done through the Graduate Division, UC San Diego. The
application deadline is in December. Admissions are always effective the following fall quarter.
For admission deadline and requirements, please refer to the departmental web page:
http://datascience.ucsd.edu.

Admission decisions for the MS and PhD programs are made separately. A current MS student
who wishes to enter the PhD program must submit a petition, including a new statement of
purpose and three new letters of recommendation, to the HDSI graduate admissions committee.

Data Science Program


The field of Data Science spans mathematical models, computational methods and analysis
tools for navigating and understanding data and applying these skills to a broad and emerging
range of application domains. A whole range of industries – from drug discovery to healthcare
management, from manufacturing to enterprise business processes as well as government
organizations – are creating demand for data scientists with a skill set that enables them to
create mathematical models of data, identify trends and patterns using suitable algorithms and
present the results in effective manners. The target systems can be, for example, biological
(e.g., clinical data from cancer patients), physical (e.g., transportation networks), social (e.g.,
social networks) or cyber-physical (e.g., smart grids). In all these cases, there is a combination
of core knowledge in information processing coupled with the skills to abstract, build and test
predictive and descriptive models that must be taught and learnt in the context of an application

Ph.D. in Data Science, November 30, 2020 Version 4.1 59 | Page


domain. These application areas are in many domains served by Engineering, Physical
Sciences, Social Sciences, Health & Life Sciences, and Arts & Humanities.

Doctor of Philosophy Program

The goal of the doctoral program is to create leaders in the field of Data Science who will lay the
foundation and expand the boundaries of knowledge in the field.

Course Requirements

There are Foundation, Core, Elective and Research requirements for the graduate
program. These course requirements are intended to ensure that students are exposed to (1)
fundamental concepts and tools (the Foundation requirement), (2) advanced, up-to-date views
in topics central to Data Science for all students (the Core requirement), and (3) a deep,
current view of their research or application area (the Elective requirement). Courses may not
fulfill more than one requirement.

The doctoral program is structured as a total of 52 units in courses grouped into foundational,
core, professional preparation and research experience areas as described below. Successful
completion of the program requires successful and timely completion of three examinations and
completion of a doctoral dissertation. Out of the 52 units, 48 units must be taken for a letter
grade and at least 40 units must come from graduate-level courses. Out of the 12 courses, at
least 10 must be graduate-level courses; at most two can be upper-level undergraduate courses.
36 units or 9 courses must be completed within six quarters from the start of the degree program.

Courses are organized into three groups: Group A, Group B and Group C. Group A courses are
introductory-level courses taught at the level of undergraduate senior or mezzanine courses.
Group B courses are core graduate-level courses with prerequisites from Group A courses.
Group C courses are advanced, specialized and free-standing courses, often part of the required
courses in the Data Science specialization of graduate programs in other departments. In all
three groups, required courses are indicated as such; they cannot be substituted by other
courses without exception approval from the graduate program committee.
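
The unit arithmetic in the course requirements above can be checked mechanically. The following is a minimal sketch, not an official degree audit: the `Course` record, the function name and the placeholder course codes are hypothetical, and it assumes that the 40 graduate-level units are counted among the letter-graded courses and that each regular course carries 4 units (as implied by "36 units or 9 courses").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Course:
    code: str            # course identifier (placeholder codes are hypothetical)
    units: int           # unit value of the course
    graduate: bool       # True for graduate-level, False for upper-level undergraduate
    letter_graded: bool  # True if taken for a letter grade, False if S/U


def check_unit_requirements(courses):
    """Return a list of unmet unit requirements; an empty list means the plan passes."""
    total_units = sum(c.units for c in courses)
    letter_units = sum(c.units for c in courses if c.letter_graded)
    # Assumption: graduate-level units are counted among letter-graded courses only.
    grad_letter_units = sum(c.units for c in courses if c.letter_graded and c.graduate)
    undergrad_courses = sum(1 for c in courses if c.letter_graded and not c.graduate)

    problems = []
    if total_units < 52:
        problems.append(f"total units {total_units} < 52")
    if letter_units < 48:
        problems.append(f"letter-graded units {letter_units} < 48")
    if grad_letter_units < 40:
        problems.append(f"graduate-level units {grad_letter_units} < 40")
    if undergrad_courses > 2:
        problems.append(f"{undergrad_courses} undergraduate courses > 2")
    return problems


# Ten graduate courses, two upper-level undergraduate courses, and the
# professional preparation courses taken S/U (2 + 1 + 1 units).
example_plan = (
    [Course(f"GRAD-{i}", 4, True, True) for i in range(1, 11)]
    + [Course(f"UD-{i}", 4, False, True) for i in range(1, 3)]
    + [
        Course("DSC 599", 2, True, False),
        Course("DSC 293", 1, True, False),
        Course("DSC 295", 1, True, False),
    ]
)
print(check_unit_requirements(example_plan))  # prints []
```

In this example, 12 courses at 4 units each give the 48 letter-graded units, and the three S/U professional preparation courses supply the remaining 4 units of the 52-unit total.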

Group A: Preparatory Courses


There are five important areas of knowledge and skills necessary for understanding (and
advancing) core data science. It is, therefore, important that all our entering students either
have background preparation or have courses available in the program to ensure successful
completion of the stipulated doctoral degree program. A student can receive credit towards the
Ph.D. degree for a maximum of three courses from the list of courses below:

1. DSC 200: Data Science Programming.


2. DSC 202: Data Management for Data Science
3. DSC 210: Numerical Linear Algebra
4. DSC 211: Introduction to Optimization



5. DSC 212: Probability and Statistics for Data Science

Group B: Core Courses


Four core courses are required for all Ph.D. students, including those with a Bachelors in Data
Science. The four required courses are:
1. DSC 240: Machine Learning
2. DSC 260: Data Ethics and Fairness
3. DSC 241: Statistical Models (or MATH 282B)
4. DSC 204A: Scalable Data Systems (or CSE 202)

In addition, a doctoral student must select at least 2 out of the following 8 core courses:
1. DSC 203: Data Visualization and Scalable Visual Analytics
2. DSC 204B: Big Data Analytics and Applications
3. DSC 242: High-dimensional Probability and Statistics
4. DSC 243: Advanced Optimization
5. DSC 244: Large-Scale Statistical Analysis
6. DSC 245: Introduction to Causal Inference
7. DSC 250: Advanced Data Mining
8. DSC 261: Responsible Data Science
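
The Group B rule above (four required courses, two of which have a listed alternative, plus at least two of the eight additional core courses) can be expressed as a small check. This is an illustrative sketch only; the function and variable names are hypothetical and the course lists are copied from the text above.

```python
# Each required slot is satisfied by any one of the listed course codes.
REQUIRED_SLOTS = [
    {"DSC 240"},
    {"DSC 260"},
    {"DSC 241", "MATH 282B"},
    {"DSC 204A", "CSE 202"},
]

# At least two courses must come from this list of eight.
SELECTIVE_CORE = {
    "DSC 203", "DSC 204B", "DSC 242", "DSC 243",
    "DSC 244", "DSC 245", "DSC 250", "DSC 261",
}


def group_b_satisfied(completed):
    """Return True if the completed course codes meet the Group B core rule."""
    completed = set(completed)
    has_required = all(completed & slot for slot in REQUIRED_SLOTS)
    has_selective = len(completed & SELECTIVE_CORE) >= 2
    return has_required and has_selective


print(group_b_satisfied(
    ["DSC 240", "DSC 260", "MATH 282B", "DSC 204A", "DSC 242", "DSC 250"]
))  # prints True
```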

Thus, together with Group A and Group C courses, doctoral students are required to take a
minimum of 5 courses for letter-grade credit. At the other extreme, students may have satisfied
all letter-grade course requirements, leaving only the professional preparation courses
(teaching, survival skills and research seminar), which require satisfactory completion. These
students are expected to enroll in individual research (DSC 298) in a section offered by the
faculty advisor to meet residency requirements and maintain graduate student standing during
the period of dissertation research.

Group C: Professional Preparation and Elective Courses


Group C courses aim to provide either practical experiences in chosen specialization areas, or
advanced training for students preparing for doctoral programs. The courses include required
professional preparation courses: 2 unit TA/tutor training (DSC 599), 1 unit of academic survival
skills (DSC 295) and 1 unit faculty research seminar (DSC 293), all of which must be completed
with a Satisfactory (S) grade using the S/U option.

Professional Preparation Courses


1. DSC 599: TA/Tutor Training
2. DSC 293: Faculty Research Seminar
3. DSC 294: Research Rotation
4. DSC 295: Academia Survival Skills

Elective and Specialization Courses



Students choose a total of three 4-unit courses from the following elective or specialization
courses to complete the course requirements.

DSC 205, DSC 231, DSC 251, DSC 252, DSC 253, DSC 254, DSC 213,
CSE 234, MATH 181 A-B-C, MATH 284, MATH 285, MATH 287A-B, COGS 243.

Preliminary Advisory Assessment


The preliminary assessment is an advisory examination. It consists of an oral examination in an
area selected by the student. Its goal is to assess the student's preparation for the proposed
area, including several relevant topics, and to identify any courses that are required or
recommended for the candidate based on the knowledge shown and any critical missing
background revealed. The preliminary examination must be completed before the start of the
second year in the doctoral degree program. The examination dates are announced no later than
the start of the Winter Quarter. A failing grade in the preliminary examination comes with a
recommendation that the student be given the opportunity to receive an MS in Data Science
degree, provided the student meets the degree requirements in no more than one extra quarter
over the standard time for the MS program; here we refer to the newly proposed degree of MS in
Data Science (not its online version). Students who fail the preliminary examination may file a
petition to retake it; if the petition is approved, they will be allowed to retake it one (and
only one) more time.

After a student successfully completes the preliminary assessment examination, in the next
annual review of the student (conducted annually in the Fall Quarter as a part of the Annual
Faculty Retreat), the GradCom of the HDSI Faculty Council assigns the academic advisor to
provide necessary updates to the GradCom and helps in setting up the doctoral dissertation
committee.

Research Qualifying Examination (UQE)


A research qualifying examination (UQE) is conducted by the dissertation committee, consisting
of five or more members approved by the graduate division as per senate regulation 715(D).
One senate faculty member must have a primary appointment in a department outside of
HDSI. Faculty with a partial appointment of 25% or less in HDSI may be considered for meeting
this requirement on an exceptional basis upon approval from the graduate division. The goal of
the UQE is to assess the ability of the candidate to perform independent critical research as
evidenced by a presentation and a written technical report at the level of a peer-reviewed
journal or conference publication. The research qualifying examination must be completed no
later than the fourth year or 12 quarters from the start of the degree program; the UQE is
tantamount to the advancement to PhD candidacy exam.

Dissertation Defense Examination


Students must successfully complete a final dissertation defense presentation and examination.



Students with Disabilities

In order for the program to respond, a student requiring accommodation for disability may make
a request for accommodation upon submission of the student’s intent to apply to the Graduate
Program. Declaration of any disability information is not part of the admissions review process
and will not be a factor in admissions.

Information concerning accommodation requests is available at: https://disabilities.ucsd.edu/ .


Distance learning sites must confirm their ability to support students with disabilities.



Halicioglu Data Science Institute (HDSI) Bylaws – Preamble

UC San Diego has organized the Halicioglu Data Science Institute (HDSI) as an academic unit
tasked with creation and operation of academic programs related to the field of Data Science,
broadly defined as the study of mathematical models, computational methods and analysis
tools for navigating, securing and understanding data, data-driven systems and decisions and
applying these skills to all areas of human enquiry, creativity and applications in natural, social
and engineered systems. Due to its breadth, Data Science is considered a transdisciplinary
subject, that is, spanning and overlapping many existing disciplines such as Mathematics,
Computer Science, Electrical Engineering as well as topical areas of Machine Learning,
Artificial Intelligence, Data/Cyber Infrastructure and Digital Humanities.

Serving as the hub for data science talent and programs, HDSI builds upon unique strengths of
UC San Diego. In particular, UC San Diego seeks institutional presence – both of UCSD in
Data Science as well as Data Science at UCSD – that benefits all existing units, departments,
programs and schools with potential to contribute to the academic discipline. As an academic
unit, the HDSI mission consists of three core components: (a) train talent in data science at all
levels via courses, degree and professional training programs; (b) catalyze research in data
science via integrative research projects and initiatives; and (c) cultivate an ecosystem of data
science by engaging industry partners, non-profit and civic organizations with potential to
contribute to data science practice.

Institutionally, HDSI is designated an academic unit with responsibilities that include functions
carried out by traditional academic departments and schools. Accordingly, HDSI functions
under a divisional budget model and direct oversight by the Academic Senate and
administration. The Institute is also endowed by a founding gift to ensure that HDSI is the hub for
Data Science and is able to carry out its three-part mission by bringing the campus together.
Halicioglu Data Science Institute (HDSI) Bylaws

1st Draft 14 Oct. 2019; this version May 1, 2020

1. INTRODUCTION

Preamble
The authority for the departments to establish a form of departmental governance is
established by Academic Senate Bylaws (Part I), Title VI, Bylaw 55.A:

“According to the Standing Orders of The Regents, “... the several departments of the
University, with the approval of the President, shall determine their own form of
administrative organization.... No department shall be organized in such a way that would
deny to any of its non-emeritae/i faculty who are voting members of the Academic Senate, as
specified in Standing Order 105.1(a), the right to vote on substantial departmental
questions...”

The source documents, referred to as “Higher Authority” for the Bylaws, are the UCSD
PPM, the UC APM and the UC and UCSD Academic Senate Bylaws. Important, frequently
used policies that are specified in detail in these documents are described and referenced
here, along with Department policies that fill in additional details. In the case of
discrepancies, Higher Authority takes precedence.

Principles
Responsibilities: These bylaws describe procedures for discharging faculty responsibilities in an
academic unit. Some responsibilities are assigned by the UCSD Policies and Procedures
Manual (PPM) specifically to the faculty as a whole, others to the Director, and some are
assigned to both. These bylaws describe responsibilities, the rules and guidelines for
performing a responsibility, and the selection of faculty members who will perform a role that
fulfills a responsibility. A basic principle that has been followed in the case of shared
responsibilities is for the Director to initiate decisions and actions, and for the faculty to
approve them or take remedial action. On all academic matters, including courses, degree
programs and senate faculty appointments, approval by the faculty as a group is required,
without exception.

Delegation of Responsibilities: In order to facilitate effective governance, HDSI may have
Associate and/or Assistant Directors, and permanent committees to which the faculty may
delegate some of their responsibilities and on which the Director may rely for advice. The
Director shall have the authority to create committees, and Associate and/or Assistant
Director positions, subject to relevant university regulations, and to choose the faculty to
serve in those positions. The appointment of Associate Director(s) is done by the Chancellor
in consultation with the Director. Associate Directors may assist the Director in carrying out
the Director’s responsibilities as Director, but the responsibilities themselves may not be
delegated.
Furthermore:
(i) all such entities shall have a written statement of their responsibilities;
(ii) faculty may request a Departmental vote on the creation of new entities or on
significant changes that are made to the responsibilities of those entities if some
or all of these responsibilities are not explicitly allocated to the Director;
(iii) committees, Associate/Assistant Director(s) and the Director shall keep the
faculty informed of all significant decisions or actions;
(iv) a faculty member may request a Departmental vote, and the faculty may, by a
regular approval vote, overturn a committee decision or action over which the
faculty has full or shared jurisdiction.

Structure of the Bylaws

The Bylaws are organized into the following sections.


1. Introduction
2. General Voting Definitions and Procedures
3. Faculty Meetings
4. Directors and Associate Directors
5. Standing Committees
6. Faculty Appointments
7. Faculty Promotions
8. Miscellaneous and Infrequent Responsibilities
9. Bylaw Changes

2. GENERAL VOTING DEFINITIONS AND PROCEDURES

Voting
The HDSI Faculty Council consists of all faculty members who are members of the
Academic Senate, and hold a full or partial (even 0%) appointment with HDSI; HDSI fellows
are also members of HDSI Faculty Council. Faculty members whose HDSI appointment is
solely as adjunct or affiliate are not part of the HDSI Faculty Council.

A faculty member is considered to be in residence for the purposes of a vote if the member is
not on leave from HDSI or the university, nor on sabbatical away from the campus, in the quarter
during which the vote is taken.

The HDSI voting population consists of all members of the HDSI Faculty Council that are in
residence.

The means of taking a vote include: show of hands at a faculty meeting, secret written ballot
conducted at a faculty meeting, and secret ballot circulated by mail, email or fax. Votes
conducted by written ballots, email and/or fax are referred to as mail votes. Written and
non-written votes conducted at meetings are referred to as meeting votes.

A secret ballot vote must be used for a vote if it is requested by at least one member of the
voting population.
For non-secret votes, a fax or email vote may be used at the discretion of a voting faculty
member. For secret ballot votes, fax and email votes will be allowed for faculty members
who are not in residence, or who are unable to be on campus for any reason. They are also
allowed for other members unless two faculty members request that email and fax ballots be
restricted for that secret vote. Email votes are counted by the chief administrative officer of
HDSI.

Proxy voting is not allowed. A faculty member who is eligible to vote during a faculty
meeting, but who will be absent for the meeting, may request a ballot in advance, which can
be submitted to and entered into the voting process by the supervisor for that vote.

All written ballots will allow a simple yes, no, or abstain choice on an issue and provide a
space for remarks. The members of a voting population who are in residence and who do
not vote are reported as abstain. The members of a voting population who are not in
residence and who do not vote are reported as absent. The remarks shall be reported along
with any report of the vote’s results, and must be included in any HDSI letter that
summarizes HDSI’s position.
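
The reporting rule for members who do not vote can be illustrated with a short tally. This is a sketch under stated assumptions, not a prescribed procedure: the function name, the member names and the data layout are hypothetical, and it assumes the report covers every Faculty Council member, whether or not in residence.

```python
def report_vote(council, in_residence, ballots):
    """Tally a vote, classifying non-voters as 'abstain' or 'absent'.

    council:      iterable of all Faculty Council member names
    in_residence: set of member names in residence this quarter
    ballots:      dict mapping member name to 'yes', 'no' or 'abstain'
    """
    tally = {"yes": 0, "no": 0, "abstain": 0, "absent": 0}
    for member in council:
        if member in ballots:
            tally[ballots[member]] += 1   # an explicit ballot is counted as cast
        elif member in in_residence:
            tally["abstain"] += 1         # in residence, did not vote
        else:
            tally["absent"] += 1          # not in residence, did not vote
    return tally


council = ["A", "B", "C", "D", "E"]
in_residence = {"A", "B", "C", "D"}
ballots = {"A": "yes", "B": "no", "E": "yes"}  # E votes by email while away
print(report_vote(council, in_residence, ballots))
# prints {'yes': 2, 'no': 1, 'abstain': 2, 'absent': 0}
```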

The supervisor of a vote will report the results of all votes to the relevant voting population in
a timely manner. A vote shall not be considered completed until it has been reported to the
voting population.

Some issues will specify a required vote. If a required vote is part of some process, it must be
held. Others will involve a requested vote. A requested vote only occurs under certain
circumstances, which include a certain number or percentage of members of a voting
population requesting the vote. When requested, it must be held in a timely fashion. A
requested vote can be either a mail or meeting vote.

Quorums and Levels of Approval


A quorum is defined to be more than half of the members of the HDSI voting population. All
decision-making votes require a quorum in order to be valid. Mail votes are invalid if the
number of people voting does not meet the quorum by the final deadline for the vote. At the
discretion of the supervisor of the vote, an initial deadline for a vote may be extended,
provided the original deadline has not passed. A mail or electronic vote shall be held open for
a minimum of three working days.

Approval requirements for a vote are defined as follows:

Super-approval: the number of yes votes is greater than 2/3 of the size of the HDSI voting
population.

Regular approval: the number of yes votes is greater than 50% of the size of the HDSI voting
population.

Simple approval: the number of yes votes is greater than 50% of the number of votes cast.

As a general rule, super-approval is required for all required and requested votes that
result in the recruitment of faculty members, changes to the bylaws, creation of (or
significant changes to) descriptions of permanent committee responsibilities, revoking
senate committee assignments, and voting on personnel matters outside the default
voting rules. Regular approval is sufficient for the promotion and/or academic review of
HDSI faculty, and for voting on departmental actions required by policy. Simple
approval is typically used for minor issues in the context of a faculty meeting.
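
The quorum and approval thresholds defined above are strict inequalities, which matters at the boundaries (for example, exactly 2/3 of the voting population is not super-approval). The sketch below illustrates the arithmetic; the function name is hypothetical, and it assumes that `votes_cast` is whatever count the supervisor of the vote treats as ballots cast, since the bylaws do not say whether explicit abstentions are included.

```python
def approval_levels(yes, votes_cast, population):
    """Evaluate the quorum and the three approval levels for one vote.

    yes:        number of yes votes
    votes_cast: number of ballots cast (assumption: as counted by the supervisor)
    population: size of the HDSI voting population
    Integer comparisons are used to avoid floating-point rounding at the thresholds.
    """
    return {
        "quorum": 2 * votes_cast > population,   # more than half of the population voted
        "super": 3 * yes > 2 * population,       # yes votes > 2/3 of the population
        "regular": 2 * yes > population,         # yes votes > 50% of the population
        "simple": 2 * yes > votes_cast,          # yes votes > 50% of votes cast
    }


print(approval_levels(yes=7, votes_cast=9, population=12))
# prints {'quorum': True, 'super': False, 'regular': True, 'simple': True}
```

With a voting population of 12, super-approval needs at least 9 yes votes, because 8 yes votes equal exactly 2/3 and the definition requires strictly more.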

3. FACULTY MEETINGS

A faculty meeting, also referred to as faculty council meeting, is used to carry out a number
of different responsibilities. It may be used by the Director and other faculty members to
make announcements, to provide a forum for discussion of issues of importance, and to
facilitate decisions and actions for which there is no specific provision in the bylaws.

Scheduling
(i) Faculty meetings shall occur at least once a quarter, excluding the summer.
(ii) The Director must call a meeting within a reasonable amount of time if any three
members of the HDSI voting population petition the Director to do so on any issue.

Agenda
The Director shall announce the agenda in writing for each faculty meeting at least two
working days in advance. Urgent items can be added to the agenda at the last minute but
the case for the urgency has to be explicitly stated.
(i) Issues that were the reason for scheduling the meeting, as described above, will be
automatically placed on the agenda.
(ii) For all other issues, any two members of the HDSI voting population may request
that any issue be placed on the agenda for a meeting, and the Director may delay
placing that issue on the agenda for at most one meeting.

Motions
Any issue related to an agenda item may be brought to a vote if it is proposed and seconded
by two members of the HDSI voting population. The Director shall allow a reasonable
amount of time in each faculty meeting to consider faculty motions. If the issue is one for
which the Director has sole, specific authority, the result of the vote will be advisory, but
otherwise it will be binding. The level of required approval will depend on the issue.

Operation
The Director presides over faculty meetings, or delegates this duty to an Associate Director,
or another member of the HDSI voting population. If necessary, a meeting shall have both an
open and a closed part. All academic personnel matters shall be discussed only during closed
parts of the meetings that are restricted to relevant voting members of the HDSI voting
population, the Associate and/or Assistant Director(s), and the relevant academic specialists.
A faculty member may request that an item be discussed in a closed session which is
restricted to members of the relevant HDSI voting population.
Minutes
The Director shall ensure that the minutes for each faculty meeting are published within five
working days following the meeting. As a minimum, minutes shall include a record of each
motion voted on and the outcome of the vote. Faculty members have five working days to
submit corrections to the minutes. Minutes shall be stored in a safe place for at least five
years. Access to the minutes for the closed part of a meeting shall be restricted to faculty
members who were eligible to attend the closed part.

4. DIRECTORS AND ASSOCIATE DIRECTOR(S)

Responsibilities
The specific responsibilities for which the Director has an authority that cannot be delegated
are analogous to those of a Department Chair as described in university regulations (PPM
230-1.IV.B).

Department Consultation: The Director is expected to inform the HDSI faculty and seek
advice for major decisions made with respect to the above responsibilities. The Director shall
inform the faculty of staff organization and responsibilities, and seek advice on how these
arrangements can best be used to support faculty duties and responsibilities. The Director
must be receptive to questions and facilitate appropriate remedial procedures as required.

Selection of Director and Associate/Assistant Director(s)


The HDSI Director is appointed by the Chancellor following UCSD procedures for
administrative appointments. The HDSI Director reports to the HDSI Oversight Committee.
The Director must hold the rank of Full Professor.

Associate Directors are appointed by the Chancellor at the recommendation of the Director.
Associate Directors must be tenured members of the Academic Senate.

Assistant Director appointments are made by the Director. Assistant Director(s) can be staff
members.

The Director and Associate/Assistant Director(s) can be re-appointed by the Chancellor for
an unlimited number of consecutive terms.

The appointment to the office of Director is for a period of five years.

Associate Director appointments are for a period of three years, subject to annual review.

If a Director does not wish to be reappointed, then the new appointments procedure specified
in [PPM 230 2 III A] will be followed. The procedure requires that the tenured members of
the HDSI voting population meet to consider their recommendation of a new Director. In the
case where the recommendation is not unanimous, a vote will take place whose results will
be included as part of the recommendation to the Chancellor.
In the case where a Director wishes to be reappointed, the reappointments procedure specified
in [PPM 230 2 III B] will be followed. The procedure requires that the reappointment ad hoc
committee consult with faculty members. In the year of the reappointment, the tenured
members of the HDSI voting population will meet to determine their recommendation, which
will be forwarded to the committee. In the case where the recommendation is not unanimous, a
vote will take place whose results will be included as part of the recommendation to the
Chancellor.

The Director reappointment evaluation procedure may be initiated by the Chancellor at
intermediate stages of a Director’s tenure.

Any two tenured faculty members may request a secret ballot vote of no confidence in the
Director. The faculty will meet to select, by vote, a committee of two HDSI faculty members
who will administer the vote and report the result to the office of the Chancellor via the Dean
and/or Vice Chancellor.

5. STANDING COMMITTEES

Responsibilities
Certain responsibilities will be managed by permanent (standing) committees listed below in
no particular order:
• Space Planning & Collaboratories (SpaceCom)
• Computing and Cyber-Infrastructure (CI)
• Graduate Admissions & Scholarships (GradAdmin)
• Grad and Post-doctoral Programs (GradCom)
• Undergraduate Programs and Scholarships (UGS)
• Colloquia, DLS and Sponsorships (DLS)
• Equality, Diversity and Inclusion (EDI)
• Industry Liaison and Institutional Partnerships (ILIP)
• Recruiting (RecCom): multiple recruiting committees may be appointed to conduct
searches in broadly different areas.

Standing committees have specifically defined areas of significant responsibility and
continue indefinitely, even though their membership may change. Some of the
responsibilities may involve issues that are the sole responsibility of the Director and cannot
be delegated (see Section 4.1). In this case the committees may act in an advisory
role. Other responsibilities are the prerogative of the faculty or are shared responsibilities, and
the committee acts as the representative of the faculty.

Committee Creation, Deletion, and Modification


The Director has the authority to create a new committee, providing a written description
of the committee's role, procedures and policies. Any two faculty may request a faculty vote of
approval. Deletion of any existing committee is also open to a requested vote of approval.
Substantial changes to a committee’s role, procedures or policies must be announced and are
open to a requested vote of approval. Requested votes on standing committee creation and
deletion require super-approval. Requested votes on committee policies and procedures, for
which the faculty has sole or shared responsibility, require regular approval.

Committee Selection
In order to facilitate effective governance, the Director chooses committee chairs and, in
consultation with the committee chair, chooses the members of the committee. All permanent
committee members must be members of the appropriate HDSI voting population. The
Director and/or Associate Directors may be a member or chair of one or more standing
committees.

Committee Procedures and Policies


Each committee chair, in consultation with the Director and/or Associate Director(s), must
prepare a specification of that committee’s responsibilities and policies.

In some cases, such as the Recruiting Committee, the policies and procedures are
specified by a higher authority, such as the UCSD PPM or the UC APM. Some of the
more important procedures and policies for this committee are summarized in the
Recruiting Section of these Bylaws.

In other cases, such as the Undergraduate, Master's and Ph.D. programs, the
significant policies and procedures are the responsibility of the faculty and must be
approved by a faculty vote. Examples include the procedure for selecting admissions to
the Ph.D. program.

Advisory Board
The chair may create an Advisory Board that consists of up to five distinguished
individuals that do not necessarily belong to the HDSI Faculty Council. The Advisory Board
does not have authority to make decisions. Its creation and operation must follow the
procedures specified in Sections 5.2, 5.3 and 5.4 above.

6. FACULTY APPOINTMENTS

UC Academic Bylaws assign the responsibility for faculty appointments to the tenured
members of a Department [VI.55.B.1]. A vote will be held among the tenured members of
the HDSI voting population to extend the responsibility for faculty appointments to all
members of the Faculty Council, tenured or not; this vote will require super-approval.

Responsibilities for faculty appointments include identifying, evaluating and voting on new
HDSI faculty members. All appointments, except part-time lecturers, require a vote with
super-approval. This includes research series and adjunct appointments. The Director has
sole responsibility for the appointment of part-time lecturers, which can be delegated to the
Chair of the suitable committee. Appointments to visiting positions will either be voted on or
the HDSI voting population will delegate this responsibility to the Director or a duly
appointed committee.
Adjunct, visiting and research appointments will be processed individually and require a
vote with regular approval.

The operation of the HDSI faculty as a recruiting committee of the whole is authorized by
the PPM. Some of these responsibilities are delegated by the faculty to the Recruiting
committee. The Recruiting committee and its chair are chosen by the Director. They use the
following procedures.

Hiring Plan
At the beginning of the recruiting season, the Director, in consultation with the HDSI Faculty
Council, will formulate a hiring plan. This plan will be based on the expected number of
positions that will be available, the expected levels of appointments, and targeted areas. Any
specific strategy to be adopted to meet diversity goals will be part of this plan. Examples of
strategy include flexibility in target areas, seniority or any specific target(s) of opportunity
available that season.

Screening
The Recruiting committee, which will be made up from members of the HDSI
Faculty Council, will be responsible for initial screening of applicants for all open positions.
This consists of all positions except for part-time lecturers.

The Recruiting committee will evaluate candidates, solicit letters of reference, and
recommend candidates to be invited for a visit to HDSI. The committee will make every
effort to consider all candidates fairly and to use an appropriate comparison process. The
Recruiting committee will consider both the plan for hiring and the excellence of candidates,
which may result in exceptions to the plan consistent with the strategy specified.

The Recruiting committee shall provide the Director its recommendation concerning which
candidates to invite for a visit. The Director will share the committee’s recommendation
with the HDSI faculty. All members of the appropriate HDSI voting population may
examine the applicant files and suggest to the Committee that specific additional candidates
be added to the committee’s recommendation list. In the case of disagreement, a vote will
be scheduled in a timely manner to allow new candidates to be considered along with the
others. Adding an applicant to the list will require, under the general rules for faculty
meetings, regular approval.

Institutional/Departmental Evaluation
After all candidates in a particular search or search specialization who were approved by the
screening process have completed their interviews with HDSI, the Director will call a meeting
of the relevant HDSI voting population to discuss the candidates. At this meeting, a vote will
be held in which each faculty member may vote yes or no for each candidate. Super-approval
is required for a candidate to be further considered. The result of Institutional Evaluation is a
list of faculty candidates recommended for making an offer of a faculty appointment in HDSI
and/or jointly with another department/school. The list may be unordered or partially ordered.
The actual offer and offer order will be specified by an offer strategy discussed next.

Offer Strategy
In the case of multiple candidates for one or more positions, the Director may formulate a
strategy for scheduling the approved candidates for a formal vote and offer. This will take into
account: the original plan for hiring, the approved candidates, the maximal number of offers to
have out at one time, balance between areas, financial cost considerations and the responsiveness
of the candidates to academic and EDI goals of the Institute and of the hiring plans. The Director
will present the strategy to the faculty for advice and consent. If requested by a member of the
voting population, the strategy will be put to a faculty vote, where it will require majority
approval. If no strategy is proposed, or no proposed strategy is approved, the candidates will be
offered positions in an order determined by the number of votes they received. In the case of ties,
the Director will make the decision.

Documentation
The documentation required for a proposed HDSI appointment is covered in PPM 230.20.IX.
Included in this documentation is the Departmental Recommendation Letter. This letter is
meant to summarize the Department position. The file, including this letter, shall be made
available for inspection by the HDSI voting faculty for a period of not less than five working
days before submission of the file. The Director shall announce to the members of the HDSI
voting population when the letter is available for inspection. If a faculty member objects to the
Department letter, or to the process that was used, the faculty member has the option of
including a letter of dissent, which may be signed by one or more faculty members. Dissenting
faculty members must submit their letter within the five-day inspection period. A file that
has not had a five-day inspection period shall not be submitted to the administration.

If desired, the Director may also include a confidential letter in a file, which can be used to
express the Director’s personal opinion.

If significant additional evidence for a file arrives after the file has been submitted, the
Director has the option of submitting the additional information or recalling the file
for additional faculty consideration and/or processing.

Endowed Chairs
The procedure for the awarding of an endowed Chair, including the required faculty consultation,
is described in [PPM 230 8]. In the case where an endowed chair is to be used in faculty
recruiting, the candidate must satisfy both the procedure for faculty hiring and the endowed chair
appointment.

7. FACULTY PROMOTIONS

The relevant responsibilities in this category include: identification of HDSI faculty members
who may be eligible for normal or accelerated advancement, assembling a promotion file, and
carrying out a faculty vote; this includes HDSI faculty members with partial HDSI
appointments as long as they have a non-zero % HDSI appointment.

Responsibilities with regard to faculty promotions are carried out by the Director, the relevant
voting population subset of the faculty, an ad hoc committee, and the individual faculty
member who is up for promotion.

The Director shall select the chair of an ad hoc committee, which will consist of the chair and
two additional faculty members. The Director shall select the members of the ad hoc
committee in consultation with the ad hoc chair. The committee members must be chosen
from the voting population for the candidate’s promotion.

In the case that a faculty member has a partial (but non-zero) HDSI appointment, HDSI shall
follow the above procedure and arrive at its own recommendation, even if the candidate’s
other department(s) are also conducting a separate review. In the case that there is disagreement
in the assessment of the candidate’s file by the different academic units, the candidate will be
given the option to apply to move his/her FTE to the academic unit of his/her choice, before
the promotion recommendation is filed to the Campus.

Screening
In the academic year preceding a normal or proposed accelerated promotion, the Director
shall determine which faculty members are eligible for a normal merit promotion within
rank, or promotion to the next rank.

Any faculty member may request consideration for an accelerated promotion, either at the
time of what would be a normal promotion or at an intermediate time in a promotion cycle.
A faculty member who would be a member of the voting population for a proposed
accelerated promotion may also propose such a promotion for another faculty member.

Institutional/Departmental Evaluation
Faculty members who are eligible for a normal promotion, or who have been proposed
for an accelerated promotion, will be informed in a timely manner of the procedures for
preparing their promotion files, and the deadlines for submission of materials for which
they are responsible.

The ad hoc committee will, if necessary, choose references and oversee the assembly of a
candidate’s file. The Director has the final authority over the selection of references.

Objections to the choice by a faculty member may be made in a dissenting letter to be added
to the file before it is submitted.

It is required by the PPM that voting members have the opportunity to express their
opinions of the promotion case. There will be a pre-vote meeting scheduled for the voting
faculty that must be held far enough in advance of the vote to allow suggestions related to
the processing of the file to be implemented.

The chair of the ad hoc committee shall prepare a letter to the Director that details the
committee’s recommendation.

The relevant voting population shall vote on all promotions to a new rank, advancements
from Full Professor Step V to Step VI, advancements from Full Professor Step IX to
Above Scale (AS), and all accelerated advancements either within or to a new rank. In the
case of merit advancements that are not accelerated, for which the Director in consultation
with the ad hoc committee recommends approval, no faculty vote is required. In the case
of merit advancements for which the Director in consultation with the ad hoc committee
recommends disapproval, the candidate may request that their file be put forward for a
vote, along with the department's negative recommendation.

Voting Population
The voting populations for promotions of members of the Academic Senate will be based
on the default populations specified in UC Academic Senate Bylaw 55. They are defined in
the following table. All references to Professor in the table, unless designated otherwise,
refer to ladder rank Professor appointments who are members of the HDSI voting
population.

For voting purposes, all cases that involve the removal of the Acting modifier from the
title of a member of the Academic Senate shall be treated as promotions to the rank in
question. The table contains an entry for the voting population for promotion to Assistant
Professor. This may occur as the result of the removal of the "Acting" designation, or a
promotion from Instructor. There is no corresponding case for Lecturer PSOE. An
appointment to Senior Lecturer PSOE is considered to be comparable to that of an initial
appointment as Acting Full Professor, based on the salary restrictions. A subsequent
promotion from Senior Lecturer PSOE to Senior Lecturer SOE is covered by row 5 in the
table.

In the case of promotions for non-academic senate members, these promotions will be
determined by faculty members of the Academic Senate using a voting population that
parallels that for Academic Senate promotions. Such appointments include adjunct and
research series appointments and are referred to as “at a level equal to ...” in the table
below.

Promotion to and within | Eligible voters
Assistant Professor, Assistant Teaching Professor | Full Professors, Associate Professors, Assistant Professors, Full Teaching Professor, Associate Teaching Professors
Teaching Professor (Lecturer SOE) | Full Professors, Associate Professors, Full Teaching Professors, Associate Teaching Professors (Senior Lecturers SOE)
Associate Professor, Associate Teaching Professor, Associate Professor In Residence | Full Professors, Associate Professors, Associate Teaching Professors
Senior Lecturer SOE | Full Professors, Senior Lecturers SOE
Full Professor, Full Professor in Residence, or at a level equal to a Full Professor | Full Professors

Documentation
The clarification and additional documentation details for appointments, as contained in
the Section for recruiting, shall also apply to promotions. In addition, a faculty member
who is a candidate for promotion may, after examination of the redacted promotion file,
include a letter in the file.

8. MISCELLANEOUS AND INFREQUENT RESPONSIBILITIES

These are responsibilities that are not covered in the bylaws. They may be unanticipated,
infrequent, or minor in nature. These responsibilities may be carried out by the Director or
Associate Director(s), by temporary committees, or by the faculty as a whole.

The Director and the faculty shall have the authority to create a temporary committee
and choose its members. The committee is expected to be advisory.

The Director will have primary authority for these responsibilities. In substantial matters for
which the Director and/or faculty has authority, the Director may request a faculty vote.
In matters for which the faculty shares or has authority, the faculty may request a vote.
All votes will be approved by regular approval, except in matters that are covered in the
other sections of the bylaws, for which a higher level of approval is indicated.

The Director should keep the faculty informed of all important issues and decisions
taken with respect to the issues.

9. BYLAW CHANGES

Changes to, additions and deletions from the bylaws are carried out by the HDSI voting
population.

Suggestions for changes to the bylaws, and requests for a vote on the suggested changes,
may be made by any member of the HDSI voting population in accordance with the
regulations for faculty meetings. A vote is required on a suggested change, and such a
vote may be either a meeting vote or a mail vote. All such votes require super-approval.

Ilkay Altintas
San Diego Supercomputer Center Telephone: (858) 822-5453
9500 Gilman Drive Fax: (858) 822-3693
MC 0505 E-mail: altintas@sdsc.edu
La Jolla, CA 92093-0505

Professional Preparation
Middle East Technical University, B.S. Computer Engineering 1999
Ankara, Turkey
Middle East Technical University, M.S. Computer Engineering 2001
Ankara, Turkey
University of Amsterdam, Ph.D. Computational Science 2011
Amsterdam, Netherlands

Appointments
2018-. Fellow, Halicioglu Data Science Institute, UCSD
2016-. Associate Research Scientist, San Diego Supercomputer Center, UCSD
2016-. Faculty Co-Director, Master of Advanced Studies in Data Science and Engineering, UCSD
2015-. Chief Data Science Officer, San Diego Supercomputer Center (SDSC), UCSD
2015-. Division Director, Cyberinfrastructure Research, Education and Development, SDSC,
UCSD
2014-. Founder and Director, Workflows for Data Science Center of Excellence, SDSC, UCSD
2012-. Lecturer, Department of Computer Science and Engineering, UCSD
2012-2016 Assistant Research Scientist, San Diego Supercomputer Center, UCSD
2008-2014 Deputy Coordinator for Research, San Diego Supercomputer Center, UCSD
2004-2014 Founder and Director, Scientific Workflow Automation Technologies Laboratory, SDSC,
UCSD
2005-2007 Assistant Director, National Laboratory for Advanced Data Research (NLADR) - Data,
SDSC, UCSD
2001-2004 Research Programmer (P/A III), SDSC, UCSD
1999-2001 Research Assistant, Middle East Technical University (Ankara, TURKEY)

Products (Out of 100+)


1. I. Altintas, J. Block, R. de Callafon, D. Crawl, C. Cowart, A. Gupta, M. Nguyen, H.W. Braun, J.
Schulze, M. Gollner, A. Trouve, L. Smarr: Towards an Integrated Cyberinfrastructure for Scalable
Data-Driven Monitoring, Dynamic Prediction and Resilience of Wildfires. In Proceedings of the
Workshop on Dynamic Data-Driven Application Systems (DDDAS) at the 15th International
Conference on Computational Science (ICCS 2015), Procedia Computer Science, Volume 51, 2015,
Pages 1633-1642, ISSN 1877-0509, doi:10.1016/j.procs.2015.05.296. (Best Paper Award)
2. Kepler Scientific Workflow System Releases 1.0, 2.0 through 2.4. (Downloaded by 100K+)
3. J. Wang, D. Crawl, I. Altintas, W. Li. Big Data Applications using Workflows for Data Parallel
Computing. Computing in Science & Eng., 16(4), pp. 11-22, July-Aug. 2014, IEEE.

4. J. Wang, P. Korambath, I. Altintas, J. Davis, D. Crawl. Workflow as a Service in the Cloud:
Architecture and Scheduling Algorithms. In Proceedings of International Conference on
Computational Science (ICCS 2014), pages 546-556. DOI: 10.1016/j.procs.2014.05.049
5. B. Ludaescher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger-Frank, M. Jones, E. Lee, J. Tao, Y.
Zhao, Scientific Workflow Management and the Kepler System, Concurrency and Computation:
Practice & Experience, 18(10), pp. 1039-1065, 2006. (Cited by 2124 in October 2019.)
Other Selected Products
6. I. Altintas, M.K. Anand, T. Vuong, S. Bowers, B. Ludaescher, P.M.A. Sloot, “A Data Model for
Analyzing User Collaborations in Workflow-Driven eScience,” The International Journal of
Computers and Their Applications (IJCA), 2011. Vol. 18, No. 3, p.160 – 180, Dec, 2011.
7. I. Altintas, A.W. Lin, J. Chen, C. Churas, M. Gujral, S. Sun, W. Li, R. Manansala, M. Sedova, J.S.
Grethe, and M. Ellisman, “CAMERA 2.0: A Data-centric Metagenomics Community Infrastructure
Driven by Scientific Workflows,” In Proceedings of the SWF 2010 at IEEE SERVICES '10, pp.
352-359, 2010. DOI=10.1109/SERVICES.2010.89
8. A. Goderis, C. Brooks, I. Altintas, E. Lee, and C. Goble, “Heterogeneous composition of models of
computation,” FGCS, vol. 25, no. 5, pp. 552–560, 2009.
9. I. Altintas, O. Barney, E. Jaeger-Frank, Provenance Collection Support in the Kepler Scientific
Workflow System, in Provenance and Annotation of Data, LNCS Volume 4145/2006, pages 118-
132, 2006. (Cited by 361 in October 2019.)
10. I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludaescher, and S. Mock, “Kepler: An extensible
system for design and execution of scientific workflows,” in Intl. Conference on Scientific and
Statistical Database Management (SSDBM), Greece, 2004. (Cited by 1103 in October 2019.)

Recent Synergistic Activities


• Associate Editor, Elsevier Future Generation Computer Systems - Impact Factor: 5.768 (since
2012)
• Massive Open Online Course (MOOC) Instructor, Coursera and edX – over 1 Million students
worldwide (since 2016)
• Member, The National Academies of Sciences, Engineering, and Medicine Committee on
Forecasting Costs for Preserving, Archiving, and Promoting Access to Biomedical Data, 2019-
2020
• Member, The National Academies of Sciences, Engineering, and Medicine Committee on
Realizing Opportunities for Advanced and Automated Workflows in Scientific Research, 2019-
2020
• Advisory Board Member, National Center for Atmospheric Research (NCAR) Computational and
Systems Information Lab (since 2017)

ERY ARIAS-CASTRO

CONTACT INFORMATION
Department of Mathematics Voice: (858) 534-3590
University of California, San Diego Fax: (858) 534-5273
La Jolla, CA 92093-0112 (USA) E-mail: eariascastro@ucsd.edu

EDUCATION
Ph.D. in Statistics, Stanford University 2004
M.S. in Artificial Intelligence and Applied Mathematics, École Normale Supérieure de Cachan and Washington
University in Saint Louis 1998
B.S. in Mathematics, École Normale Supérieure de Cachan 1997

PROFESSIONAL
Professor, Mathematics, University of California, San Diego 2015–present
Associate Professor, Mathematics, University of California, San Diego 2011–2015
Assistant Professor, Mathematics, University of California, San Diego 2005–2011
Postdoctoral Fellow, Mathematical Sciences Research Institute Spring 2005
Postdoctoral Fellow, Institute for Pure and Applied Mathematics Fall 2004

MEMBERSHIPS
Institute of Mathematical Statistics

SERVICE
Committee work within UCSD: Academic Integrity Review Board [2012-13], Faculty mediator [2018-]
Associate Editor: Annals of Statistics [2013-19], Journal of the American Statistical Association [2014-],
Journal of the Royal Statistical Society [2016-], Electronic Journal of Statistics [2015-], ESAIM - Probability and
Statistics [2015-], ALEA [2019-]
Area Chair: Conference on Learning Theory (COLT) [2016], Artificial Intelligence and Statistics (AISTATS)
[2017, 2019, 2020]
Guest Editor: Special Issue on Detection, IEEE Journal of Selected Topics in Signal Processing, 2012
Reviewer: IEEE Transactions on Image Processing, IEEE Transactions on Information Theory, IEEE Transactions
on Signal Processing, IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal of Mathematical
Imaging and Vision, Annals of Statistics, Electronic Journal of Statistics, Journal of Multivariate Analysis,
Journal of the Royal Statistical Society (Series B), Annals of Applied Statistics, Statistical Science, Journal
of Nonparametric Statistics, Solar Energy, The Astrophysical Journal, Bernoulli, Journal of the American
Statistical Association, ESAIM: Probability and Statistics, Journal of Machine Learning Research, Conference on
Learning Theory, Artificial Intelligence and Statistics, International Conference on Machine Learning, etc.
Conference Organization: Math+Stats+X, a conference in honor of David Donoho’s 60th birthday, 2017
(co-organizer); IMS Meeting, 2014 (session organizer); Probability and Statistics Day, 2013 (co-organizer); Meeting
of New Researchers in Statistics and Probability, 2014 (board member); Meeting of New Researchers in Statistics
and Probability, 2013 (board member); Meeting of New Researchers in Statistics and Probability, 2012 (chair
and local chair); Quality and Productivity Research Conference, 2012 (session organizer)
Other: NSF grant panel [2012, 2016]

FIVE RECENT PUBLICATIONS


1. E. Arias-Castro, A. Javanmard, and B. Pelletier, “Perturbation bounds for procrustes, classical scaling,
and trilateration, with applications to manifold learning,” Journal of Machine Learning Research, vol. 21,
pp. 15–1, 2020
2. E. Arias-Castro, S. Bubeck, G. Lugosi, and N. Verzelen, “Detecting Markov random fields hidden in white
noise,” Bernoulli, vol. 24, no. 4B, pp. 3628–3656, 2018
3. E. Arias-Castro, G. Lerman, and T. Zhang, “Spectral clustering based on local PCA,” The Journal of
Machine Learning Research, vol. 18, no. 1, pp. 253–309, 2017
4. E. Arias-Castro, “Some theory for ordinal embedding,” Bernoulli, vol. 23, no. 3, pp. 1663–1693, 2017
5. N. Verzelen and E. Arias-Castro, “Community detection in sparse random networks,” The Annals of Applied
Probability, vol. 25, no. 6, pp. 3465–3510, 2015

1. Name: Mikhail Belkin

2. Education – degree, discipline, institution, year:


Ph.D., mathematics, University of Chicago, 2003.
M.Sc., mathematics, University of Chicago, 1997.
B.Sc., mathematics, University of Toronto, 1995.

3. Academic experience – institution, rank, title (chair, coordinator, etc. if appropriate)


Ohio State University, professor, 2017-present.
Simons Institute for the Theory of Computing, UC Berkeley, visiting faculty, 2017, 2019.
Ohio State University, associate professor, 2012-2017.
Ohio State University, assistant professor, 2005-2012

4. Current membership in professional organizations:


Association for Computing Machinery (ACM)

5. Honors and awards:


NSF Career Award, Google Faculty Research Award, Lumley Research Award.

6. Service activities (within and outside of the institution):

Editorial Board Service: SIAM Journal on Mathematics of Data Science, Associate
Editor, 2020 – present.
The Journal of Machine Learning Research, Action Editor, 2011 – 2020;
IEEE Transactions on Pattern Analysis and Machine Intelligence, Associate Editor,
2011 – 2016.

Recent workshop organizing activities: Steering Committee of Midwest Machine
Learning Symposium, 2018-present; Information Modeling and Control of Complex
Systems Workshop, Ohio State University, 2016, 2017; Simons Institute Workshop on
Spectral Algorithms: From Theory to Practice (co-chair), 2014.

Recent program committee service: Area chair/PC for COLT 2019, NeurIPS 2018, ICML
2018, ICML 2017; AAAI 2017; COLT 2016; AI and Statistics 2015.

7. Briefly list the most important publications and presentations from the past five years –
title, co-authors if any, where published and/or presented, date of publication or
presentation.

• Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal, Reconciling modern
machine learning practice and the bias-variance trade-off, PNAS, 2019, 116 (32).
• Chaoyue Liu, Libin Zhu, Mikhail Belkin, Toward a theory of optimization for
over-parameterized systems of non-linear equations: the lessons of deep learning,
arxiv, 2020.
• Chaoyue Liu, Mikhail Belkin, Accelerating Stochastic Training for Over-
parametrized Learning, ICLR 2020.
• Mikhail Belkin, Daniel Hsu, Ji Xu, Two models of double descent for weak
features, arxiv 2019.
• Mikhail Belkin, Siyuan Ma, Soumik Mandal, To understand deep learning we
need to understand kernel learning, ICML 2018.
• Siyuan Ma, Mikhail Belkin, Kernel machines that adapt to GPUs for effective
large batch training, SysML 2019.
• Mikhail Belkin, Daniel Hsu, Partha Mitra, Overfitting or perfect fitting? Risk
bounds for classification and regression rules that interpolate, Neural Inf. Proc.
Systems (NeurIPS) 2018.
• Siyuan Ma, Raef Bassily, Mikhail Belkin, The power of interpolation:
understanding the effectiveness of SGD in modern over-parametrized learning,
ICML 2018.

Jelena Bradic

Department of Mathematics Office: 858-534-3590
& Halicioglu Data Science Institute Email: jbradic@math.ucsd.edu
University of California, San Diego (UCSD) Homepage: http://www.jelenabradic.net
Applied Physics & Mathematics Building (AP&M)
9500 Gilman Drive # 0112
La Jolla, CA 92093-0112

Education
PhD Operations Research and Financial Engineering Princeton University 2011
Magister Probability and Statistics Belgrade University 2007
BS Mathematics Belgrade University 2004

Academic Experience
Stanford University, Statistics Department,
Visiting Associate Professor, 2019-present
University of California San Diego, Halicioglu Data Science Institute,
Associate Professor (with tenure), 2019-present
University of California San Diego, Mathematics Department,
Associate Professor (with tenure), 2018-present
University of California San Diego, Mathematics Department,
Assistant Professor (on maternity leave 2011/2012), 2011-2018

Professional Associations
Institute of Mathematical Statistics, American Statistical Association, Bernoulli Society

Honors and Awards
Journal of the American Statistical Association Discussion Paper, to be awarded at the JSM meeting, for
the paper Tuning-free Robust Regression with High-Dimensional Heavy-Tailed Data, 2020
Leads Fellow, UC San Diego, 2019
NSF DMS award 17212481, PI (single), Hypothesis Testing in High-Dimensions without Sparsity, 2017
Hellman Fellowship, awarded by Hellman Foundation, 2014
NSF DMS award 1205296, PI (single), Regularization for High-Dimensional Inference and Sparse
Recovery, 2012
Laha Award (superseded by IMS New Researcher Travel Award), awarded by IMS, 2010

Professional Service
Program Chair Elect, section on Statistical Learning and Data Science of the American Statistical
Association, 2020-2021
Task-force on Equity, Diversity and Inclusion in Undergraduate Education, UC San Diego, 2019-2020
Program co-Chair, conference on Statistical Learning and Data Science, 2020
Associate Editor, Journal of the American Statistical Association, 2019-
Associate Editor, Journal of Nonparametric Statistics, 2019-
Associate Editor, Scandinavian Journal of Statistics, 2019-
Treasurer, section on Nonparametric Statistics, of the American Statistical Association, 2018-2019
Task-force on the Status of Women in Physical Sciences, UC San Diego, 2018-2019
NSF DMS Panelist, Statistics section, 2019, 2017, 2016

Publications
Bradic, Jelena and Jianqing Fan and Zhu, Yinchu (2020), Testability of high-dimensional linear
models with non-sparse structures, to appear at the Annals of Statistics
Bradic, Jelena and Claeskens, Gerda and Gueuning, Thomas (2020), Testing fixed effects in
high-dimensional misspecified linear mixed models, Journal of the American Statistical Association:
Theory & Methods, 115 (529), 1-16.
Zhu, Yinchu and Bradic, Jelena (2018), Linear hypothesis testing in dense high-dimensional linear
models, Journal of American Statistical Association: Theory & Methods, 113(524), 1583-1600.
Li, Alexander Hanbo and Bradic, Jelena (2018), Boosting in the presence of outliers: classification
with non-convex loss functions, Journal of American Statistical Association: Theory & Methods, 512
(113), 660-674.
Ryzhov, Ilya and Han, Bin and Bradic, Jelena (2016), Cultivating Disaster Donors: A Case
Application of Scalable Analytics for Big Data, Management Science, 62(3), 849-866.

ALEXANDER C. CLONINGER
acloninger@ucsd.edu

EMPLOYMENT
UNIVERSITY OF CALIFORNIA, SAN DIEGO La Jolla, CA
Assistant Professor July 2017-present
Mathematics Department 2017-present
Halıcıoğlu Data Science Institute 2020-present

YALE UNIVERSITY New Haven, CT


Gibbs Assistant Professor and NSF Postdoctoral Fellow September 2014-June 2017
Applied Mathematics Program

EDUCATION
UNIVERSITY OF MARYLAND College Park, MD
Ph.D. in Applied Math and Scientific Computing Program May 2014
Advisers: Wojciech Czaja and John J. Benedetto
Thesis: Exploiting Data-Dependent Structure for Improving Sensor Acquisition and Integration

WASHINGTON UNIVERSITY St. Louis, MO


B.S. Physics (Second Major in Pure Math) May 2009

WORK EXPERIENCE
INSTITUTE FOR DEFENSE ANALYSES Bowie, MD
Center for Computing Sciences 6/2010 - 9/2013

DEPARTMENT OF DEFENSE Various locations


Summer Programs Summer 2008-Summer 2009

AWARDS AND HONORS


• US Patent 10,613,176 - 2D NMR Relaxometry with Partial Data 2020
• Co-PI Russell Sage Foundation Grant Number 2196 - Economics and Satellite Imagery 2019-2020
• PI on Collaborative Research NSF DMS-1819222 - Generative Models 2018-2021
• Founding Member Halıcıoğlu Data Science Institute 2018
• NSF Mathematical Sciences Postdoctoral Research Fellowship - Data Fusion 2014-2017
• Spotlight on Student Research Prize, University of Maryland Math Department 2014
• Ann G. Wylie Dissertation Fellow 2013
• 3rd Place in IEEE GRSS Data Fusion Best Paper Contest 2013
• Seymour Goldberg Prize for Exposition, University of Maryland Math Department 2012
• Distinguished Teaching Assistant, University of Maryland 2009-2010 and 2010-2011
• Gold Medal in Teaching Excellence, University of Maryland Math Department 2010-2011

SERVICE ACTIVITIES
• NSF Panel Reviewer Apr. 2020
• Organizer of Mini-symposium “Distance Metrics High Dim. Point Clouds”, ICIAM Jul. 2019
• Organizer of Mini-symposium “High Dim. Machine Learning”, DSCO SF Institute Mar. 2019
• Organizer of Panel “AI and DNN in Radiation Oncology”, ASTRO Oct. 2018
• Organizer of Undergraduate Math Colloquium Talks Fall 2018
• Founding Faculty HDSI Institute, UCSD Mar. 2018
• Organizer of Mini-symposium “Laplacians and Applications”, SIAM PDE Conference Dec. 2017
• Organizer of Applied Math Seminar at Yale University 2014-2016
SELECTED PUBLICATIONS
• A Potapov, I Colbert, K Kreutz-Delgado, A Cloninger, S Das. “PT-MMD: A Novel Statistical Framework
for the Evaluation of Generative Systems.” ASILOMAR, 2019.
• X. Cheng, A. Cloninger, and R.R. Coifman. “Two Sample Statistics Based on Anisotropic Kernels.”
Information and Inference, 2019.
• A. Cloninger, B. Roy, C. Riley, and H. Krumholz. “People Mover’s Distance: Class level geometry using
fast pairwise data adaptive transportation costs.” Applied and Computational Harmonic Analysis,
2019.
• G. Mishne, U. Shaham, A. Cloninger, I. Cohen. “Diffusion Nets.” Applied and Computational
Harmonic Analysis, 2018.
• A Cloninger, S Steinerberger. “On the dual geometry of Laplacian eigenfunctions.” Experimental
Mathematics, 2018.
• J. Katzman, U. Shaham, A. Cloninger, J. Bates, T. Jiang, Y. Kluger. “Deep Survival: A Deep Cox
Proportional Hazards Network.” BMC Medical Research Methodology, 2018.
• A. Cloninger. “A Note on Markov Normalized Magnetic Eigenmaps.” Applied and Computational
Harmonic Analysis, 2017.
• A. Cloninger, W. Czaja, and T. Doster. “The Pre-image Problem for Laplacian Eigenmaps Utilizing
L1 Regularization with Applications to Data Fusion.” Inverse Problems, 2017.
• A. Cloninger, S. Steinerberger. “Spectral Echolocation via the Wave Embedding.” Applied and
Computational Harmonic Analysis, 2017.
• U. Shaham, A. Cloninger, R. Coifman. “Provable approximation properties for deep neural networks.”
Applied and Computational Harmonic Analysis, 2017.
• A. Cloninger. “Prediction models for graph-linked data with localized regression.” SPIE: Wavelets
and Sparsity XVII, 2017.
• Nicholas S Downing, Alexander Cloninger, Arjun K Venkatesh, Angela Hsieh, Elizabeth E Drye, Ronald
R Coifman, Harlan M Krumholz. “Describing the performance of US hospitals by applying big data
analytics.” PloS One, 2017.
• A Cloninger, S Steinerberger. “On suprema of autoconvolutions with an application to Sidon sets.”
Proceedings of the American Mathematical Society, 2017.
• A. Cloninger, R. Coifman, N. Downing, H. Krumholz. “Bigeometric Organization with Deep Nets.”
Applied and Computational Harmonic Analysis, 2016.
• A. Cloninger, W. Czaja. “Eigenvector Localization on Data-Dependent Graphs.” SampTA, 2015.
• A. Hafftka, H. Celik, A. Cloninger, W. Czaja, R. Spencer. “2D Sparse Sampling Algorithm for ND
Fredholm Equations with Applications to NMR Relaxometry.” SampTA, 2015.
• R. Bai, P. Basser, A. Cloninger, W. Czaja. “Efficient 2D MRI Relaxometry Using Compressed
Sensing.” Journal of Magnetic Resonance, 2015.
• N Jamil, X Chen, A Cloninger. “Hildreth’s algorithm with applications to soft constraints for user
interface layout.” Journal of Computational and Applied Mathematics, 2015.

SELECTED TALKS
• Kernel approaches in global statistical distances, local measure detection, and active learning.
Colloquium talk, Claremont Graduate University, Claremont, CA, February 5, 2020.
• Dual Geometry of Laplacian Eigenfunctions with Applications to Graph Wavelets, Cuts, and
Visualization. Jubilee of Fourier Analysis and Applications, College Park, MD, September 21, 2019.
• Manifold Learning with Diffusion Variational Autoencoders. Approximation Theory 16, Vanderbilt
University, Nashville, TN, May 21, 2019.
• Fast Detection of Inter-Group Differences in Images. Statistical, Variational, and Learning Techniques
in Image Analysis, Joint Math Meetings, Baltimore, MD, January 19, 2019.
• New Developments in AI/Deep Learning. Artificial Intelligence and Deep Learning Within Radiation
Oncology, 2018 ASTRO Meeting, San Antonio, TX, October 23, 2018.
• Fast Point Cloud Distances and Multi-Sample Testing. Applied Harmonic Analysis and Data
Processing, Mathematisches Forschungsinstitut Oberwolfach, Oberwolfach, Germany, March 29, 2018.
• Deep Learning Function Approximation on Manifolds. Applied Harmonic Analysis, Massive Data Sets,
Machine Learning, and Signal Processing Workshop, Casa Matematica Oaxaca, October 18, 2016.
• Defining Distances Between High-Dimensional Point Clouds. Symposium on Advanced Computational
Methods in Biomedical Imaging, National Institutes of Health, October 6, 2016.
BIOGRAPHICAL SKETCH
NAME: de Sa, Virginia R
EDUCATION                                 DEGREE    Completion Date   FIELD OF STUDY
Queen’s University                        B.Sc.     06/1998           Mathematics and Engineering (Electrical) Engineering
University of Rochester                   PhD       06/1994           Computer Science
University of Toronto                     Postdoc   12/1995           Computer Science
University of California, San Francisco   Postdoc   08/2001           Theoretical Neuroscience

Academic Experience

1994-1995 Postdoctoral Fellow, Computer Science, University of Toronto (mentor: Geoff Hinton)
1996-2001 Postdoctoral Fellow, Physiology, University of California at San Francisco (mentors: Michael
Merzenich, Michael Stryker)
2001-2008 Assistant Professor, Cognitive Science, University of California at San Diego
2008-2018 Associate Professor, Cognitive Science, University of California at San Diego
2018-present Professor, Cognitive Science, University of California at San Diego
2019-present Associate Director, Halıcıoğlu Data Science Institute, University of California at San Diego

Current Membership in Professional Organizations

Founding Member of the BCI Society

Honors and Awards

1988 Medal in Mathematics and Engineering (88), Queen's University [highest (in ME dept) standing]
1988 Professional Engineer's Gold Medal (88), Queen's University [highest standing in final year]
1988 Governor-General's Medal, Queen's University [highest standing throughout 4 years of Eng]
1988-1992 Natural Sciences and Engineering Research Council of Canada (NSERC) 1967 Science and
Engineering Scholarship [one of 47 given to graduating students across Canada]
1994-1995 Natural Sciences and Engineering Research Council of Canada (NSERC) Postdoctoral Fellowship
1996-1998 Sloan Postdoctoral Fellowship
2001-2007 NSF CAREER Award
2003-2004 UCSD Faculty Career Development Program Award
2007-2008 UCSD Chancellor’s Collaboratories award
2012-2013, 2016-2017, 2018-2019 Kavli Innovative Research Award
2016-2017, 2017-2018 UCSD Frontiers of Innovation Scholars Program
2019-2020 Kavli Symposium Inspired Proposal Award

Service Activities (External)

1999, 2000 Advanced Tutor, EU Advanced Course in Computational Neuroscience, Trieste, Italy
2001-2002 Co-chair for the Neural Information Processing Systems workshops
2002 Member of NSF, Knowledge and Cognitive Systems, grant review panel
2003 Member of NSF, Machine Learning, grant review panel
2008 Member of NSF, Robust Intelligence, grant review panel
2009-present Institutional Review Board for Neurosky
2013,2016 Member of NSF, Human Centered Computing, grant review panel
2007,2014,2017,2018,2019,2020 Program Committee (Neural Information Processing Systems (NIPS))
2016,2017,2018,2019,2020 Program Committee (Cognitive Science Conference)
2018,2019 Program Committee (International Conference on Learning Representations (ICLR))
2019 Program Committee (International Joint Conference on Artificial Intelligence (IJCAI))
2020 NSF grant Panel Review CISE

Recent Service Activities (Internal)

2019-present Associate Director of the Halıcıoğlu Data Science Institute


2019-present Served on Capacity-based admissions workgroup
2019-2020 Chaired one and sat on 3 other faculty recruitment committees
2019-present Executive Committee Institute for Neural Computation

Most important publications and presentations

Noh, E., Liao, K., Mollison, M.V., Curran, T., & de Sa, V.R. (2018). Single-trial EEG analysis predicts memory
retrieval and reveals source-dependent differences. Frontiers in Human Neuroscience 12:258. doi:
10.3389/fnhum.2018.00258

Mousavi, M., & de Sa, V.R. (2019). Temporally Adaptive Common Spatial Patterns with Deep Convolutional
Neural Networks. Proceedings of the 41st Annual International Conference of the IEEE EMBS Engineering in
Medicine and Biology Society (EMBC'19)

Xu, X., Huang, J., & de Sa, V.R. (2019). Pain Evaluation in Video using Extended Multitask Learning from
Multidimensional Measurements. Proceedings of Machine Learning Research (Machine Learning for Health
ML4H at NeurIPS 2019).

Liao, K., Mollison, M., Curran, T., and de Sa, V.R. (2018). Single-Trial EEG Predicts Memory Retrieval Using
Leave-One-Subject-Out Classification. First International Workshop on Machine Learning for EEG Signal
Processing (MLESP 2018).

Mousavi, M. & de Sa, V.R. (2019) Spatio-temporal analysis of error-related brain activity in active and passive
brain-computer interfaces. Brain-Computer Interfaces. https://doi.org/10.1080/2326263X.2019.1671040

Mousavi, M., Koerner, A.S., Zhang, Q., Noh, E., & de Sa, V.R. (2017) Improving motor imagery BCI with user
response to feedback. Brain-Computer Interfaces. doi 10.1080/2326263X.2017.1303253

de Sa, V.R. Using insights from cortical architectures for neural networks. Invited presentation at Cell Press
Beijing Conference: AI and the Brain. Nov 6-7, 2019 Sunrise Kempinski Hotel, Beijing China

Tang, S. & de Sa, V.R. (2019). Exploiting Invertible Decoders for Unsupervised Sentence Representation
Learning. ACL 2019.

Most recent professional development activities


UCSD faculty leadership academy (2019-present)
Justin Eldridge
 eldridge@cse.ohio-state.edu
 http://web.cse.ohio-state.edu/~eldridge.48/
 (330) 803-7449

Research Interests
Machine learning and artificial intelligence.
Theoretical foundations of unsupervised learning.
Scientific discovery aided by machine learning.
Pedagogy of computer science and machine learning.

Education
Ph.D. in Computer Science 2017
The Ohio State University
Dissertation: Clustering Consistently
Advisors: Mikhail Belkin & Yusu Wang
M.S. in Computer Science 2016
The Ohio State University
B.S. in Physics 2011
The Ohio State University
Summa cum laude
B.S. in Applied Mathematics 2011
The Ohio State University
Summa cum laude

Academic Positions
Senior Lecturer Spring 2018
CSE 2321: Foundations I (Algorithms), OSU
Post-Doctoral Researcher Spring 2018
The Ohio State University
Presidential Fellow 2017
The Ohio State University
Graduate Visitor Spring 2017
Simons Institute for the Theory of Computing, Berkeley
Graduate Research Assistant 2011-2017
Dept. of Computer Science and Engineering, OSU
Advisors: Mikhail Belkin & Yusu Wang
Graduate Research Assistant 2012
Center for Cognitive Science, OSU
Advisors: Mikhail Belkin, Simon Dennis, Allison Lane
Graduate Student Instructor 2011-2012
CSE 101/105: Intro to Computer-Assisted Problem Solving, OSU
Undergraduate Research Assistant 2010
Dept. of Physics, The Ohio State University
Advisor: Fengyuan Yang
Undergraduate Research Assistant 2009
Dept. of Physics, University of California, Davis
Advisor: Rena Zieve

Awards
Presidential Fellowship, The Ohio State University, 2017.
Most prestigious award bestowed by the OSU graduate school.
2016 Neural Information Processing Systems (NIPS) Travel Award.
Best Student Paper, Conference on Learning Theory (COLT), 2015.
Beyond Hartigan Consistency, with M. Belkin & Y. Wang.
Center for Cognitive Science GRA, The Ohio State University, 2012.
Smith Senior Award, Dept. of Physics, The Ohio State University, 2011.
National Merit Scholar, 2006.

Publications
Conference Papers
Unperturbed: spectral analysis beyond Davis-Kahan.
J. Eldridge, M. Belkin, Y. Wang
Algorithmic Learning Theory (ALT), 2018.

Graphons, mergeons, and so on!
J. Eldridge, M. Belkin, Y. Wang
Neural Information Processing Systems (NIPS), 2016.
Full oral, top ∼2% of submissions.

Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering.
J. Eldridge, M. Belkin, Y. Wang.
Conference on Learning Theory (COLT), 2015.
Mark Fulk Award, best student paper.

Support Vector Machine (SVM) Analysis of Auditory Oddball Event-Related Potentials (ERP)
Classifies Toddlers with and without Early Signs of Autism.
A.E. Lane, J. Eldridge, K. Harpster, S. Dennis, T. Shahin, M. Belkin.
International Meeting for Autism Research (IMFAR), 2012.

Journal Articles
Robust features for the automatic identification of autism spectrum disorder in children.
J. Eldridge, A.E. Lane, M. Belkin, S. Dennis.
Journal of Neurodevelopmental Disorders, 2014.

Workshop Abstracts
Graphons, mergeons, and so on!
J. Eldridge, M. Belkin, Y. Wang.
Abstract, talk. Workshop on Geometry and Machine Learning,
2016.

Technical Reports
Denali: A tool for visualizing scalar functions as landscape metaphors.
J. Eldridge, M. Belkin, Y. Wang.
http://denali.cse.ohio-state.edu/tech_report.pdf

Reviewing
Theoretical Computer Science, special issue on Algorithmic Learning Theory.
IEEE Transactions on Pattern Analysis and Machine Intelligence.

Talks
Invited
Tulane CS Colloquium, November 2017.
Air Force Research Laboratory ATR Summer Seminar, June 2017.
Information Theory and Applications, Graduation Day Talk, Feb. 2017.
Italian Institute of Technology Machine Learning Seminar, Dec. 2016.

Conference
NIPS 2016, full oral. Video: https://youtu.be/en_qtNAtkUs
COLT 2015, best student paper. Video: https://goo.gl/c7M42J

Seminar
Consistent Clustering. AI Seminar, OSU, November 2017.
Graphons, mergeons, and so on! Topology, Geometry, and Data Analysis
(TGDA) seminar, OSU, November 2016.
Graphons, mergeons, and so on! AI Seminar, OSU, November 2016.
What do we seek in a hierarchical clustering?, AI Seminar, OSU, April
2015.

Teaching
Senior Lecturer, The Ohio State University
CSE 2321: Foundations I (Algorithms), Spring 2018

Guest Lecturer, The Ohio State University
CSE 5522: Machine Learning – 3 classes
CSE 2331: Foundations II (Algorithms) – 2 classes

Graduate Instructor, CSE 101/105: Computer-Assisted Problem Solving,
The Ohio State University, 2011-2012.
Invited by students to Faculty Appreciation Lunch.

Software
Denali: Cross-platform, open source interface for visualizing hierarchies
as landscape metaphors. Written in C++ using Qt and VTK.
http://denali.cse.ohio-state.edu

Name: Aaron Fraenkel

Education: Ph.D. Mathematics, UC Berkeley 2011

Academic Experience:
● UCSD, Assistant Teaching Professor, Chair Undergraduate Program, 2018-2020
● Boston College, Visiting Assistant Professor, 2012-2014
● Pennsylvania State University, Chowla Research Assistant Professor, 2011-2012

Non-Academic Experience:
● ID Analytics, Senior Data Scientist, Fraud Modeling and Identity Resolution, 2014-2016
● Amazon.com, Senior Machine Learning Scientist, Security and Abuse, 2016-2018

Service: Chair, DSC Undergraduate Program


Rajesh K. Gupta
Computer Science and Engineering
University of California, San Diego
9500 Gilman Drive, MC 0404
La Jolla, CA 92093-0404 Phone: 858-822-4391 http://mesl.ucsd.edu

EDUCATION
Indian Institute of Technology, Kanpur    Electrical Engineering                       B. Tech, 1984.
UC Berkeley, Berkeley, CA                 Electrical Engineering & Computer Science    M.S., 1986.
Stanford University, Stanford, CA         Electrical Engineering                       Ph. D., 1994.

ACADEMIC APPOINTMENTS
2018-now Distinguished Professor, Computer Science & Engineering, UC San Diego
2018-now Director, Halıcıoğlu Data Science Institute, UC San Diego
2003-2018 Qualcomm Chair Professor, Computer Science & Eng., UC San Diego
2006 Visiting Professor, EPFL, Lausanne, Switzerland
2005 Visiting Professor, Electrical Engineering, Stanford University
2002-2003 Professor of Information and Computer Science, UC Irvine
1998-2002 Associate Professor of Information and Computer Science, UC Irvine
1996-1997 Assistant Professor of Information and Computer Science, UC Irvine
1994-1996 Assistant Professor of Computer Science, U. Illinois, Urbana-Champaign.
1986-1993 Senior Design Engineer, Intel Corporation, Santa Clara, California.

RELEVANT RECENT PUBLICATIONS


1. “Who can Access What, and When? Understanding Minimal Access Requirements of
Building Applications,” J. Koh, D. Hong, S. Nagare, S. Boovaraghavan, Y. Agarwal, R.
K. Gupta, ACM International Conference on Systems for Buildings, Cities, and
Transportation (BuildSys), 2019.
2. “Beyond a House of Sticks: Formalizing Metadata Tags with Brick,” G. Fierro, J. Koh, Y.
Agarwal, R. K. Gupta, D. E. Culler, ACM BuildSys, 2019.
3. “Plaster: An Integration, Benchmark, and Development Framework for Metadata
Normalization Methods,” J. Koh, D. Hong, R. K. Gupta, K. Whitehouse, H. Wang, Y.
Agarwal, ACM BuildSys, 2018.
4. “Scrabble: Transferrable Semi-Automated Semantic Metadata Normalization using
Intermediate Representation,” J. Koh, B. Balaji, D. Sengupta, J. McAuley, R. K. Gupta,
Y. Agarwal, ACM BuildSys, 2018.
5. “Brick: Metadata schema for portable smart building applications,” B. Balaji, A.
Bhattacharya, G. Fierro, J. Gao, J. Gluck, D. Hong, A. Johansen, J. Koh, J. Ploennigs,
Y. Agarwal, M. Berges, D. Culler, R. K. Gupta, Mikkel B. Kjaergaard, M. Srivastava, K.
Whitehouse, Applied Energy, 2018.

OTHER RECENT PUBLICATIONS


1. “Towards verified programming of embedded devices,” J-P Talpin, J-J Marty, S.
Narayana, D. Stefan, R. K. Gupta, Design, Automation & Test in Europe (DATE), 2019
2. “Real Time Principal Component Analysis,” R. R. Chowdhury, M. A. Adnan, R. K.
Gupta, IEEE International Conference on Data Engineering (ICDE), 2019.
3. “Zodiac: Organizing Large Deployment of Sensors to Create Reusable Applications for
Buildings,” B. Balaji, C. Verma, B. Narayanaswamy, Y. Agarwal, ACM BuildSys, 2015.
4. “Sentinel: An Occupancy Based HVAC Actuation System using existing WiFi
Infrastructure in Commercial Buildings”, B. Balaji, J. Xu, R. K. Gupta, Y. Agarwal,
ACM Conference on Embedded Networked Sensor Systems (SenSys 2013), 2013.
5. “Duty-Cycling Buildings Aggressively: The Next Frontier in HVAC Control,” Y.
Agarwal, B. Balaji, S. Dutta, R. K Gupta, T. Weng, IEEE/ACM International
Conference on Information Processing in Sensor Networks: Sensor Platforms, Tools and
Design Methods (IPSN/SPOTS), April 2011.

RECENT AND ONGOING SYNERGISTIC ACTIVITIES


1. Editor-in-Chief, IEEE Trans on CAD, 2018-.
2. General Chair, BuildSys 2018, IPSN 2009, CPSWeek 2009
3. Founding Editor-in-Chief, IEEE Embedded Systems Letters, 2009-13.
4. Founding General Chair, ACM/IEEE Conference on Models and Methods in Codesign
(MEMOCODE)
5. Founding General Co-Chair of ACM/IEEE/IFIP CODES+ISSS Conference.
Arun Kumar

3218 CSE/EBU3b, UC San Diego Email: arunkk@eng.ucsd.edu
9500 Gilman Drive, Mail Code 0404 Phone: (+1) 614-602-9734
La Jolla, CA 92093 Web: http://cseweb.ucsd.edu/~arunkk/

EDUCATION
University of Wisconsin-Madison
Ph.D. in Computer Sciences. 2011–2016
M.S. in Computer Sciences. 2009–2011
Indian Institute of Technology, Madras
B.Tech. in Computer Science and Engineering. 2005–2009

ACADEMIC EXPERIENCE
University of California, San Diego, Assistant Professor
Department of Computer Science and Engineering (CSE) From 2016
Halıcıoğlu Data Science Institute (HDSI) From 2019

NON-ACADEMIC EXPERIENCE
Microsoft Jim Gray Systems Lab
Research Assistant. Fall 2013–Summer 2016
Microsoft Cloud and Information Services Lab
Research Intern. Summer 2013
Oracle Labs
Research Intern. Summer 2012
IBM Research Almaden
Research Intern. Summer 2011

PROFESSIONAL MEMBERSHIPS
Association for Computing Machinery (ACM)
Member since 2010. Professional Member since 2017.
The Institute of Electrical and Electronics Engineers (IEEE)
Member since 2014.

SELECTED HONORS
Google Faculty Research Award 2020, 2017
Invited Paper at ACM Transactions on Database Systems 2020, 2016
Honorable Mention for Best Paper Award at ACM SIGMOD 2019
ACM SIGMOD Distinguished PC Member 2019, 2017
VLDB Distinguished PC Member 2019
Hellman Fellowship 2018
Faculty of the Year from UCSD oSTEM Chapter 2018
UW-Madison CS Graduate Student Research Award for best PhD research 2016
Best Paper Award at ACM SIGMOD 2014
Invited Paper at the Communications of the ACM 2013

MAJOR SERVICE
Organizer: Associate Editor for VLDB’21, XLDB’18, SIGMOD’18 DEEM Workshop,
SIGKDD’18 CMI Workshop, SoCal DB Day 2018
PC Member: SIGMOD ’17–’20, VLDB ’18–’21, MLSys ’19–’20, ICDE’17,
SIGMOD’17 Demo and SRC, HotCloud’16, SIGMOD’16 URC
Reviewer: ACM TODS 2017 and 2015, IEEE TKDE 2014
Proposal Reviewer/Panelist:
NSF SBIR/STTR Phase II 2020, DOE Solar Office 2020, NSF HDR Data Science
Corps 2019
Key Department/University Service:
2019–20: HDSI Faculty Recruiting Committee; 2019–20: CSE Bylaws Committee;
2018: UCSD LGBTQIA+ Undergraduate Scholarships Committee; 2017–20: CSE
MS Committee; 2017: UCSD SDSC Sustainability Committee; 2016–17: CSE PhD
Admissions Committee
Key Contributions to Diversity:
2019–20: Represented UCSD CSE at NSF Workshop on Departmental BPC Plans;
co-authored CSE’s Departmental BPC plan
2019: Organized a panel on LGBTQ+ community resources at UCSD on CSE
Celebration of Diversity Day
2017–18: Represented UCSD and CSE twice at oSTEM National Conference
2017: Co-proposed/created a new UCSD CSE PhD diversity-focused scholarship
2017–: Active member of UCSD CSE DEI Committee
2017–: Actively involved with UCSD LGBT Resource Center and oSTEM activities
(Q & A panels, talks, scholarships, etc.) as an out faculty member

PUBLICATIONS SUMMARY
Full papers at top-tier conferences (SIGMOD, VLDB, etc.): 21
Other peer-reviewed conference and journal papers: 6
Peer-reviewed workshop and demonstration papers: 12
Full papers under submission: 3
Number of citations: 1431 and h-index: 15 (as per Google Scholar in May 2020)
Full list of publications: https://adalabucsd.github.io/publications.html

SELECTED MAJOR PUBLICATIONS
Vista: Declarative Feature Transfer from Deep CNNs at Scale
Supun Nakandala and Arun Kumar
ACM SIGMOD 2020 (To appear)
Incremental and Approximate Inference for Faster Occlusion-based Deep CNN Explanations
Supun Nakandala, Arun Kumar, and Yannis Papakonstantinou
ACM SIGMOD 2019 (Honorable Mention for Best Paper Award)
Data Management in Machine Learning Systems
Matthias Boehm, Arun Kumar, and Jun Yang
Synthesis Lectures on Data Management, Morgan & Claypool Publ. (Book), 2019
Are Key-Foreign Key Joins Safe to Avoid when Learning High Capacity Classifiers?
Vraj Shah, Arun Kumar, and Xiaojin Zhu
VLDB 2018
Towards Linear Algebra over Normalized Data
Lingjiao Chen, Arun Kumar, Jeffrey Naughton, and Jignesh M. Patel
VLDB 2017
Learning Generalized Linear Models Over Normalized Data
Arun Kumar, Jeffrey Naughton, and Jignesh M. Patel
ACM SIGMOD 2015
Materialization Optimizations for Feature Selection Workloads
Ce Zhang, Arun Kumar, and Christopher Ré
ACM SIGMOD 2014 (Best Paper Award; Invited to ACM TODS 2016)
Yian Ma

E-mail: yianma@google.com
Website: https://sites.google.com/view/yianma

EXPERIENCE
Assistant Professor, Halıcıoğlu Data Science Institute, University of California, San Diego July 2020 –

Visiting Faculty, Google Brain Health and Google Research August 2019 – July 2020

Post-doctoral Fellow, Electrical Engineering and Computer Sciences September 2017 – August 2019
University of California, Berkeley, CA, USA
Advisor: Michael I. Jordan

EDUCATION
Ph.D. of Science, Applied Mathematics June 2017
University of Washington, Seattle, WA, USA
Advisors: Emily B. Fox and Hong Qian

Bachelor of Engineering, Computer Science and Engineering (honor thesis) June 2012
Shanghai Jiao Tong University, Shanghai, China

SELECTED PUBLICATIONS
• Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, Michael I. Jordan. Sampling can be faster than
optimization, Proc. Natl. Acad. Sci., 2019.
• Chris Aicher, Yi-An Ma, Nick Foti, Emily B. Fox. Stochastic gradient MCMC methods for state space
models, SIAM J. Math. Data Sci., 2019.
• Yi-An Ma, Emily B. Fox, Tianqi Chen, Lei Wu. Irreversible samplers from jump and continuous
Markov processes, Stat. Comput. (2018).
• Niladri S. Chatterji, Nicolas Flammarion, Yi-An Ma, Peter L. Bartlett, Michael I. Jordan. On
the theory of variance reduction for stochastic gradient Monte Carlo, in Proceedings of International
Conference on Machine Learning 35 (ICML 2018).
• Yi-An Ma, Nick Foti, Emily B. Fox. Stochastic gradient MCMC methods for hidden Markov models,
in Proceedings of International Conference on Machine Learning 34 (ICML 2017).
• Xiaojie Qiu, Andrew Hill, Jonathan Packer, Dejun Lin, Yi-An Ma, Cole Trapnell. Single-cell mRNA
quantification and differential analysis with Census, Nature Methods (2017).
• Yi-An Ma, Tianqi Chen, Emily B. Fox. A complete recipe for stochastic gradient MCMC, in Advances
in Neural Information Processing Systems 28 (NIPS 2015).

SELECTED TALKS
• Bridging MCMC and Optimization
– Statistics Department Seminar, Mathematics Department, University of California, Davis; April 2020.
– Mathematics Department Seminar, Duke University; Sept. 2019.
– Invited Talk at Microsoft Research New England, Boston, MA; Aug. 2019.
– Invited Talk at Google Research, San Francisco, CA; July 2019.
– Statistics Department Seminar, University of Warwick; Feb. 2019.
– Machine Learning Department Seminar, Carnegie Mellon University; Feb. 2019.
– Halıcıoğlu Data Science Institute (HDSI) Seminar, University of California, San Diego; Feb. 2019.
– Statistics Department Seminar, Rutgers University; Feb. 2019.
– Department of Statistics and Data Science Seminar, Yale University; Feb. 2019.
– Statistics Department Seminar, Eberly College of Science, Penn State; Jan. 2019.
– Courant Institute and Center for Data Science Seminar, New York University; Jan. 2019.
– Stewart School of Industrial and Systems Engineering (ISyE) Seminar, Georgia Tech; Jan. 2019.

• When is sampling faster than optimization and how to accelerate it?


– Invited talk at Bayes Comp, University of Florida; Jan. 2020.
– Statistics and Data Science Symposium, Halıcıoğlu Data Science Institute (HDSI), UC San Diego; Jan. 2019.
• Stochastic gradient MCMC for independent and correlated data
– Statistics Department Seminar, University of Minnesota; April 2018.
– SAMSI Workshop on “Trends and advances in Monte Carlo algorithms”, Duke University; Dec. 2017.
– Invited talk at Pacific Northwest National Lab (PNNL), Richland, WA; Sept. 2017.
• A unifying framework for constructing MCMC algorithms from irreversible diffusion processes
– Probability Seminar, University of California, Berkeley; April 2018.
• Scalable and efficient MCMC for complex posteriors
– Statistics Department Seminar, Stanford University; Feb. 2017.
– Invited talk at Los Alamos National Lab (LANL), Los Alamos, NM; Jul. 2016.
– ICERM Workshop on “Stochastic numerical algorithms, multiscale modeling and high-dimensional data analytics”,
Brown University; July, 2016.
– Intractable Likelihood (I-Like) Workshop; Lancaster University, Lancaster, UK; June, 2016.
• Stochastic gradient MCMC methods for hidden Markov models
– 34th International Conference on Machine Learning (ICML); Sydney, Australia; Aug. 2017.
• A complete recipe for stochastic gradient MCMC
– 29th Annual Conference on Neural Information Processing Systems (NIPS); Montréal, Canada; December, 2015.

SELECTED AWARDS
• 2017 Stein fellowship (declined for other opportunities).
• Best undergraduate thesis “Lyapunov functions for oscillatory and chaotic dynamical systems” awarded
by the computer science department, Shanghai Jiao Tong University.

SERVICES
• Journals: reviewer for Journal of the American Statistical Association (JASA), Biometrika, Bernoulli,
Journal of Machine Learning Research (JMLR), Statistics and Computing.
• Conferences: reviewer for Advances in Neural Information Processing Systems (NeurIPS/NIPS), Inter-
national Conference on Machine Learning (ICML), Annual Conference on Learning Theory (COLT);
served on the program committee of AAAI conference on Artificial Intelligence.
• Secretary for the University of Washington Chapter of Society for Industrial and Applied Mathematics
(SIAM), 2015-2016.

PATENT
• Patent: Real Time Supervise Machine against Traffic Law Violation awarded by the State Intellectual
Property of the People’s Republic of China in 2012 (Patent No.: 201120076406.X).
1. Gal Mishne

2. Education –
BSc, Electrical Engineering, 2009
BSc, Physics, 2009
PhD, Electrical Engineering 2017

3. Academic experience –
Yale University, Gibbs Assistant Professor, 2017-2019
UC San Diego, Assistant Professor, 2019-present

4. Non-academic experience –
Rafael Advanced Defense Systems Ltd., Image processing engineer, 2008-2014

5. Awards
AMS-Simons Travel Grant, 2018
SIAM Early Career Travel Award, 2017
Wolf Foundation Award for Ph.D. students, 2016

6. Service activities
• Hiring committee – HDSI/Neurobiology, 2020
• PhD program committee – HDSI, 2020
• Reviewer – Cosyne 2020; ICML 2020; Neural Computation; Involve; IEEE
Transactions on Image Processing; IEEE ICASSP; Elsevier Information Sciences;
Journal of Mathematical Imaging and Vision; Neurons, Behavior, Data analysis,
and Theory; Advances in Computational Mathematics.
• DeepMath 2020 – Co-organizer

7. Briefly list the most important publications from the past five years
• X. Cheng and G. Mishne, “Spectral embedding norm: To look deep into the spectrum of
the graph Laplacian”, accepted to SIAM Imaging Sciences.
• G. C. Linderman, G. Mishne, A. Jaffe, Y. Kluger and S. Steinerberger, “Randomized
nearest neighbor graphs, giant components and applications in data science”, accepted to
Advances in Applied Probability.
• S. Gigante, A. S. Charles, S. Krishnaswamy and G. Mishne, “Visualizing the PHATE of
Neural Networks”, NeurIPS-2019, December 2019.
• G. Mishne, Eric C. Chi and R. R. Coifman, “Co-manifold learning with missing data”,
ICML 2019, June 2019.
• X. Cheng, G. Mishne, and S. Steinerberger, “The geometry of nodal sets and outlier
detection”, Journal of Number Theory, vol. 185, pp. 48-64, 2018.
• G. Mishne, R. Talmon, I. Cohen, Y. Kluger and R. R. Coifman, “Data-driven tree
transforms and metrics”, IEEE Transactions on Signal and Information Processing over
Networks, vol. 4, no. 3, pp. 451-466, Sept. 2018.
• G. Mishne, U. Shaham, A. Cloninger and I. Cohen, “Diffusion Nets”, Applied and
Computational Harmonic Analysis, Aug. 2017.
• G. Mishne, R. Talmon, R. Meir, J. Schiller, M. Lavzin, U. Dubin and R. R. Coifman,
“Hierarchical coupled-geometry analysis for neuronal structure and activity pattern
discovery”, IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 7, pp.
1238-1253, Oct. 2016.

8. Briefly list the most recent professional development activities


DIMITRIS N. POLITIS

Education

1990 Stanford University, Ph.D. in Statistics.


1990 Stanford University, M.S. in Statistics.
1989 Stanford University, M.S. in Mathematics.
1985 Rensselaer Polytechnic Institute, M.S. in Computer and Systems Engineering.
1984 University of Patras, B.S. in Electrical Engineering.

Academic Positions

2018-present Associate Director, Halicioglu Data Science Institute, UCSD.


2016-present Distinguished Professor, Department of Mathematics; also affiliated with the
Department of Economics, University of California at San Diego.
2001-2016 Professor, Department of Mathematics, and Adjunct Professor, Department of
Economics, University of California at San Diego.
2014 (Summer) John-von-Neumann Visiting Professor, Department of Mathematics, Technical
University of Munich, Germany.
1997-2001 Associate Professor, Department of Mathematics, UCSD.
1999 (Fall) Visiting Associate Professor, Athens University of Economics and Business.
1995-1997 Associate Professor, Department of Mathematics and Statistics, University of Cyprus.
1995-1996 Associate Professor, Department of Statistics, Purdue University (on leave).
1990-1995 Assistant Professor, Department of Statistics, Purdue University.

Professional Associations

Institute of Mathematical Statistics (Fellow), American Statistical Association (Fellow), The
Econometric Society, International Society for NonParametric Statistics (Co-founder).

Grants and Awards

2020 Co-Principal investigator, NIH grant T32 MH122376-01 [304118-00001] “Advanced data
analytics training for behavioral and social sciences research”.
2019 Principal investigator, NSF grant DMS 19-14556, “Computer-intensive methods for
nonparametric analysis of dependent data”.
2013 Awarded the Econometric Theory Multa Scripsit Award.
2013 Fellow of the Institute of Advanced Study, Technische Universitaet Muenchen.
2012 Co-Principal investigator, NSF grant DMS 12-23137, “ATD: Detection of Clusters in
Spatial Data and Images”.
2012 Awarded the Tjalling C. Koopmans Econometric Theory Prize 2009-2011 for the paper
"Higher-Order Accurate, Positive Semi-Definite Estimation of Large-Sample Covariance and
Spectral Density Matrices," Econometric Theory, Vol. 27, No. 4, August 2011, pp. 703-744.
2011 Fellowship from the John Simon Guggenheim Memorial Foundation for the project
"Model-free Prediction and Regression".
2011 Elected Fellow of the American Statistical Association (ASA). Citation reads: “For
path-breaking research in nonparametric statistics, for outstanding applications of this methodology
to time series analysis, resampling, subsampling, and function estimation; and for exemplary
leadership and service to the profession, especially for conference organization and prolific
editorial work.”
2004 Elected Fellow of the Institute of Mathematical Statistics (IMS). Citation reads: “Prof.
Politis received the award for innovative methodology in the analysis of time series and models
of spatial dependence, as well as groundbreaking theory in nonparametric statistics”.

Professional Service

• Chair (2018-2019) and Chair-Elect (2017-2018) of the Section on Nonparametric
Statistics of the American Statistical Association.
• Member of the External Review Committee for the Department of Statistics, Purdue
University, Sept. 2016.
• Co-founder (with M. Akritas and S.N. Lahiri) of the International Society for
NonParametric Statistics (ISNPS), and member of the ISNPS Executive Committee
2010-2016.

Editorial Work

• Co-Editor of the Journal of Time Series Analysis, 2013–present.
• Editor of the IMS Bulletin, 2011–2014.
• Founding member of the Editorial Board of the Springer Book Series: Frontiers in
Probability and the Statistical Sciences, 2012–present.
• Editor of the Journal of Nonparametric Statistics, 2008–2011.
• Associate Editor of the journal Econometrics and Statistics, 2016–2018.
• Associate Editor of the journal Bernoulli, 2013–2018.
• Associate Editor of the Electronic Journal of Statistics, 2013–2018.
• Associate Editor of the Journal of the American Statistical Association, 2011–2020.
• Associate Editor of the Journal of the Royal Statistical Society, Series B, 2009–2012.
• Associate Editor of the IMS Collections Series, 2008–2012.
• Associate Editor of the Journal of Time Series Analysis, 2006–2013.
• (Associate) Editor of the Journal of Multivariate Analysis, 2005–2011.
• (Associate) Editor of the Journal of Nonparametric Statistics, 2005–2008.
• Associate Editor of the Journal of Statistical Planning and Inference, 2000–2006.
• Co-Editor of Sankhya, the Indian Journal of Statistics, 1999–2002.

Publications

Co-author of over 100 journal papers (see list at: www.math.ucsd.edu/~politis/) and of the books:
--SUBSAMPLING, D.N. Politis, J.P. Romano, M. Wolf, Springer, New York, 1999.
--MODEL-FREE PREDICTION AND REGRESSION: A TRANSFORMATION-BASED
APPROACH TO INFERENCE, D.N. Politis, Springer, New York, 2015.
--TIME SERIES: A FIRST COURSE WITH BOOTSTRAP STARTER, T.S. McElroy and
D.N. Politis, Chapman and Hall/CRC Press, 2020.
1. Name: Rayan Saab

2. Education:

• The University of British Columbia, Vancouver, Canada; Electrical Engineering;
PhD, 2010

3. Academic experience:
• 2017–present: Associate Professor, Mathematics, The University of California,
San Diego (UCSD), San Diego, CA
• 2013–2017: Assistant Professor, Mathematics, The University of California, San
Diego (UCSD), San Diego, CA
• 2011–2013: Visiting Assistant Professor, Mathematics, Duke University,
Durham, NC
• 2010–2011: Postdoctoral Researcher, Mathematics, The University of British
Columbia, Vancouver, Canada;

4. Non-academic experience: N/A

5. Certifications or professional registrations

6. Member of the AMS

7. Honors and awards:

• August-Wilhelm Scheer Visiting Prof., Technische Universität München (June 2017)
• Hellman Fellowship (July 2015 – June 2016)
• Simons Foundation Collaboration Grant (2015, declined)
• Mercator Fellowship (2014 – 2017)
• Banting Postdoctoral Fellowship (October 2011 – September 2013)
• NSERC Postdoctoral Fellowship (2011, declined)

8. Service activities (within and outside of the institution):


• Within UCSD (last 2 years):
i. 2018/2019 - 3 departmental merit review AdHoc Committees,
Mathematics Department Hiring Committee, Data Science Hiring
Committee, HDSI Faculty Council, Mathematics Department Strategic
Growth Committee, Mathematics Department Space Committee, Faculty
Advisor, Mathematics – Computer Science Major, Math Dept., UCSD.
ii. 2019/2020 – Faculty Advisor, Mathematics – Applied Math Major, Math
Dept., UCSD, Graduate Admissions Committee, Mathematics
Department, HDSI Undergraduate Scholarship Committee, UCSD, HDSI
PhD Program Planning Committee, HDSI Hiring Committee
• Outside UCSD (last 3 years): Special session organizer, Joint Math Meetings,
Atlanta GA, 2017, Session organizer, Information Theory and Applications
Workshop, San Diego CA, February 2017, 2018, 2019, Invited session organizer,
Sampling Theory and Applications (SampTA), July 2017, 2019, IMA Special
Workshop “Phaseless Imaging in Theory and Practice: Realistic Models, Fast
Algorithms, and Recovery Guarantees”, August 14 - 18, 2017; currently
co-organizing the international inter-institutional “One World Mathematics of
Information, Data, and Signals (MINDS)” seminar.

9. Briefly list the most important publications and presentations from the past five years –
title, co-authors if any, where published and/or presented, date of publication or
presentation
• T. Huynh, R. Saab, “Fast binary embeddings, and quantized compressed sensing
with structured matrices”, Communications on Pure and Applied Mathematics,
Vol. 73, no. 1, pp. 110 – 149, 2020.
• D. Needell, R. Saab, T. Woolf, “Simple Classification using Binary Data”, The
Journal of Machine Learning Research, Vol. 19, no. 1, pp. 2487 – 2516, 2018.
• R. Saab, R. Wang, Ö. Yılmaz, “Quantization of compressive samples with stable
and robust recovery”, Applied and Computational Harmonic Analysis, vol. 44,
pages 123–143, 2018.
• K. Knudson, R. Saab, R. Ward, “One-bit compressive sensing with norm
estimation”, IEEE Transactions on Information Theory, vol. 62, no. 5, pages
2748–2758, 2016.
• M. A. Iwen, B. Preskitt, R. Saab, A. Viswanathan, “Phase retrieval from local
measurements: Improved robustness via eigenvector-based angular
synchronization”, Applied and Computational Harmonic Analysis, Vol. 48, no. 1,
pp. 415 – 444, 2020.
• R. Saab, R. Wang and Ö. Yılmaz, “From Compressed Sensing to Compressed
Bit-Streams: Practical Encoders, Tractable Decoders”, IEEE Transactions on
Information Theory, Vol. 64, no. 9, pp. 6098-6114, 2018.

10. Briefly list the most recent professional development activities


N/A
Armin Schwartzman
Education
Technion – Israel Inst. of Tech. Haifa, Israel Electrical Eng. BS 1995
California Inst. of Tech. Pasadena, CA Electrical Eng. MS 1996
Stanford University Stanford, CA Statistics PhD 2006

Academic Appointments (full time)


U. of California, San Diego Biostatistics & HDSI Professor 2019-
U. of California, San Diego Biostatistics Assoc. Prof. 2016-2019
Technion – Israel Inst. of Tech. Industrial Eng. Visiting Prof. 2015-2016
North Carolina State University Statistics Assoc. Prof. 2013-2015
Technion – Israel Inst. of Tech. Electrical Eng. Visiting Prof. 2012-2013
Harvard School of Public Health Biostatistics Assist. Prof. 2007-2012

Professional Experience (full time)


DaimlerChrysler Research & Tech. Palo Alto, CA Research Intern 2013
Biosense Webster (Israel) Ltd. Haifa, Israel Algorithm Developer 1999-2001
Rockwell Semiconductor Systems San Diego, CA Systems Engineer 1996-1998

Professional Memberships: Institute of Mathematical Statistics (lifetime member), American
Statistical Association (lifetime member)

Honors and Awards (excluding research grants)


U. of California, San Diego Hispanic Center of Excellence (HCOE) Fellow 2018
Stanford University Teaching award in statistics 2003
Stanford University W. R. Kimbal and S. Heart Graduate Fellowship 2001
Technion – Israel Inst. of Tech. President's Academic List of Honors 1992, 1994
Technion – Israel Inst. of Tech. Dean’s Academic List of Honors 1993

Service Activities
1. Academic Advising: 9 postdoctoral scholars, 17 PhD students (4 as main thesis adviser + 5
as PhD committee member + 8 other), 4 MSc students, 2 BSc students.
2. University Service:
UC San Diego: HDSI MS Program Cmte. (AY 2019-20); HDSI PhD Program Cmte. (AY
2019-20); HDSI Faculty Council (AY 2019-20); HDSI Advisory Board (AY 2018-19);
Biostatistics Executive Committee (AY 2018-20); Biostatistics PhD Program Admissions
Cmte. (AY 2016-20); Biostatistics PhD Program Education Cmte. (AY 2016-20); BS in
Public Health Steering Cmte. (AY 2018-19); Biostatistics Hiring Cmte. (AY 2016-17).
NC State University: Written Preliminary Exam Committee (AY 2014-15); Hiring
Committee (AY 2014-15); Basic Exam Committee (AY 2013-14).
Harvard SPH: High Dimensional Data Seminar co-Chair (AY 2007-12); Qualifying Exam
Cmte. (AY 2008-11); Newsletter Cmte. (AY 2007-09); Degree Program Cmte. (AY
2007-08); Diversity Cmte. (AY 2007-08); Seminar Cmte. (AY 2006-07).

3. Reviewer services:
a. Associate Editor, Electronic Journal of Statistics (2016-Present); Econometrics and
Statistics, Special issue on Neuroimaging (2018 - 2020)
b. Journal reviewer: 77 times for 17 statistical journals; 29 times for 16 scientific journals.
c. Grant Reviewer: Emerging Imaging Technologies in Neuroscience (EITN) Study
Section, NIH (2020); Dutch Research Council (NWO) (2019); Israeli-Québec
Collaboration in Medical Bio-Imaging (2017); Statistics Program, Division of
Mathematical Sciences, NSF (2017); Biostatistical Methods and Research Design
Study Section (BMRD), NIH (2015); Network for Translational Research: Optical
Imaging, NCI/NIH (2008); In-vivo Cellular and Molecular Imaging Centers,
NCI/NIH (2007).
4. Conference organization: Comp. and Methodological Statistics, Pisa, Italy (2018), London,
UK (2017) and Sevilla, Spain (2016); Joint Statistical Meetings, Montreal, QC (2013) and
Vancouver, BC (2010); International Biometric Society, Fort Collins, CO (2012) and San
Luis Obispo (2011); Harvard Cancer Center (2009); Radcliffe Institute for Advanced Study
(2008).

Publications: 57 original articles (29 in methodological journals, 28 in other scientific journals),
4 invited comments, 5 refereed conference proceedings.

MOST IMPORTANT PUBLICATIONS – LAST FIVE YEARS


1. Schwartzman A, Keeling RF. Achieving atmospheric verification of CO2 emissions. Nature
Climate Change 2020; 10: 416-417.
2. Schwartzman A, Schork A, Zablocki R, Thompson WK. A simple, consistent estimator of
heritability from genome-wide association studies. Ann. Appl. Stat. 2019; 13(4): 2509-2538.
3. Schwartzman A, Telschow F. Peak p-values and false discovery rate inference in
neuroimaging. Neuroimage 2019; 197: 402-413.
4. Sommerfeld M, Sain S, Schwartzman A. Asymptotic Confidence Regions for Spatial
Excursion Sets, with an Application to Climate. J. Amer. Stat. Assoc. 2018; 113:523, 1327-1340.
5. Cheng D, Schwartzman A. Expected Number and Height Distribution of Critical Points of
Smooth Isotropic Gaussian Random Fields. Bernoulli 2018; 24(4B), 3422-3446.
6. Cheng D, Schwartzman A. Multiple Testing of Local Maxima for Detection of Peaks in
Random Fields. Ann. Stat. 2017; 45(2): 529-556.
7. Schwartzman A. Log-Normal Distributions and Geometric Averages of Positive Definite
Matrices. International Statistical Review 2016; 84(3): 456-486.
8. Azriel D, Schwartzman A. The Empirical Distribution of a Large Number of Correlated
Normal Variables. J. Amer. Stat. Assoc. 2015; 110(511): 1217-1228.

Oral Presentations (national and international): 63 invited seminars (Statistics, Biostatistics,
Mathematics and Engineering departments); 41 invited conference presentations; 9 contributed
conference presentations; 11 contributed posters.

Recent Professional Development Activities: Mentoring and teaching workshops as fellow of
the UCSD Hispanic Center of Excellence (2017-2019).


1. Name

Jingbo Shang

2. Education – degree, discipline, institution, year:

PhD, Computer Science, University of Illinois at Urbana-Champaign, 2019

3. Academic experience – institution, rank, title (chair, coordinator, etc. if appropriate),
when (ex. 2002-2007), full time or part time

University of California, San Diego, Assistant Professor, 2020-Present

4. Non-academic experience – company or entity, title, brief description of position, when
(ex. 2008-2012), full time or part time

5. Certifications or professional registrations

6. Current membership in professional organizations

7. Honors and awards

Google PhD Fellowship 2017-2019


World Champions in IEEE Xtreme Competition, 2018 and 2019
The Web Conference Best Poster Award Runner-up, 2018
4th Place in Fake News Challenge, 2017
Grand Prize in Yelp Dataset Challenge, 2015

8. Service activities (within and outside of the institution)

ACM SIGKDD 2020 Workshop Co-Chair.


Program Committee member of ACL and EMNLP since 2019.
Reviewer for the IEEE TKDE, TKDD, and TBD journals.
Tutorial organizer at ACM SIGKDD since 2017.
Chief Judge of ACM-ICPC North America Mid-Central Region since 2019.

9. Briefly list the most important publications and presentations from the past five years –
title, co-authors if any, where published and/or presented, date of publication or
presentation
There are about 40 publications in the past 5 years. The complete list can be found on my
website (https://shangjingbo1226.github.io/publications/) or my Google scholar page
(https://scholar.google.com/citations?user=0SkFI4MAAAAJ&hl=en). Here are a few examples:

1. “Contextualized Weak Supervision for Text Classification,” D. Mekala and J. Shang. Annual
Meeting of the Association for Computational Linguistics (ACL), 2020.
2. “Empower Entity Set Expansion via Language Model Probing,” Y. Zhang, J. Shen, J. Shang
and J. Han. Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
3. “NetTaxo: Automated Topic Taxonomy Construction from Large-Scale Text-Rich Network,”
J. Shang, X. Zhang, L. Liu, S. Li and J. Han. The Web Conference (WWW), 2020.
4. “Integrating Local Context and Global Cohesiveness for Open Information Extraction,” Q.
Zhu, X. Ren, J. Shang, Y. Zhang, A. El-Kishky and J. Han. International Conference on Web
Search and Data Mining (WSDM), 2019.
5. “CrossWeigh: Training Named Entity Tagger from Imperfect Annotations,” Z. Wang, J.
Shang, L. Liu, L. Lu, J. Liu and J. Han. ACL SIGDAT Empirical Methods in Natural
Language Processing (EMNLP), 2019.
6. “Learning Named Entity Tagger using Domain-Specific Dictionary,” J. Shang, L. Liu, X. Gu,
X. Ren, T. Ren and J. Han. ACL SIGDAT Empirical Methods in Natural Language
Processing (EMNLP), 2018
7. “Empower Sequence Labeling with Task-Aware Neural Language Model,” L. Liu, J. Shang,
F. Xu, X. Ren, H. Gui, J. Peng and J. Han. AAAI Conference on Artificial Intelligence
(AAAI), 2018.
8. “Automated Phrase Mining from Massive Text Corpora,” J. Shang, J. Liu, M. Jiang, X. Ren,
C. Voss and J. Han. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
9. “MetaPAD: Meta Pattern Discovery from Massive Text Corpora,” M. Jiang, J. Shang, T.
Cassidy, X. Ren, L. Kaplan, T. Hanratty and J. Han. ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD), 2017.

10. Briefly list the most recent professional development activities


1. NAME. Benjamin Smarr

2. EDUCATION.
Ph.D., Neurobiology and Behavior Program, University of Washington.
B.S., Biological Sciences, University of California at Santa Cruz.

3. POSITION.
Assistant Professor in Bioengineering and the Halıcıoğlu Data Science Institute, University of California, San Diego, 3/2019-
NIH K99 fellow, UC Berkeley, 2017-2019
NIH F32 fellow, UC Berkeley, 2014-2017
Postdoctoral researcher, Kriegsfeld lab, UC Berkeley, 2013

4. CONSULTING AND ADVISING


HALE Sports, 2018 - Scientific advisor
Decoding Superhuman Podcast, 2018 - Scientific consultant
OWaves, 2017 - Scientific advisor
Stephen Auger Lighting / AhhBe, 2016 - Scientific advisor
Invicta, 2016 – Scientific advisor
PrimaTemp, 2016 - Scientific advisor
Oura Ring, 2015 - Scientific consulting
Reverie Sleep Systems, 2015 - Scientific advisor

5. CURRENT MEMBERSHIPS
Society for Behavioral Neuroendocrinology, 2014-
Endocrine Society, 2012-
Society for Research in Biological Rhythms, 2011-
Society for Neuroscience, 2007-
Inaugural student president, Center for Sensorimotor Neural Engineering, UW, 2011-2012
President, Neurobiology and Behavior Community Outreach, UW, 2010-2012
Organizing member of Neurobiology and Behavior Community Outreach, UW, 2006-2012

6. SELECTED AWARDS AND HONORS


2017-2022. Pathway to Independence Award (K99/R00), NIEHS (PI: Smarr).
“Understanding the Impact of Environmental Disruption in Biological Timing Systems Through Signal Processing”.
2014-2017. Postdoctoral Ruth L. Kirschstein National Research Service Award (NRSA), NICHD (PI: Smarr).
“The Role of Circadian Stability During Development in Adult Health and Behavior”.
2012-2018. Research Merit Award, Society for Research in Biological Rhythms
2016. Best in Body Area Networks Award, Body Area Networks
2012. Seed Project Grant, NSF Center for Sensory-motor and Neural Engineering, University of Washington (PI: Smarr).
“Circadian Modulation of Neuromotor Control”.
2012. Education and Outreach-Development Grant, NSF Center for Sensory-motor and Neural Engineering, University of
Washington (PI: Smarr). “Neural Engineering Outreach Development”.
2010. Outreach activities featured in Science with picture, SCIENCE VOL 328 25 JUNE 2010
2009. NSF Doctoral Dissertation Improvement Grant (DDIG) (PI: de la Iglesia and Smarr).
“Circadian Regulation of Female Reproductive Physiology”.

7. SERVICE
Science in Society Outreach
Vocal proponent of data rights in biomedicine, and frequent public speaker / interviewee about the role of technology
advancing public health, and the importance of the individual in this process.
Recent publications include Wired, The Economist, BBC Business Daily, San Francisco Chronicle, Reader’s Digest, and many
others globally.
Education outreach
Over 100 K-12 “What do Brains do” school visits and lab tours;
Special emphasis on schools in lower socioeconomic status neighborhoods;
Organizer, “Brain Days Fair” at University of Washington through the Neurobiology and Behavior Community Outreach;
Created lesson plans for neuroscience classroom activities with over 10,000 downloads;

8. SELECT PUBLICATIONS FROM 2016-2020, AND THEIR IMPORTANCE.


Koskimaki H, Kinnunen H, Salla R, Smarr B. Following the heart – What does variation of resting heart rate tell us about us
as individuals and as population. UbiComp. 2019. Winner, best paper.
Smarr B, Cutler T, Loh D, Kuljis D, Kudo T, Kriegsfeld L, Ghiani C, Colwell C. Circadian dysfunction in the (z)Q175 model of
Huntington’s disease: network analysis. J. Neurosci Res. 2019.
Smarr B, Schirmer A. 3.4 million real-world learning management system logins reveal the majority of students experience
social jetlag correlated with decreased performance. Scientific Reports. 2018 March 29; 8:4793. Featured on BBC, Reddit
front page, The Telegraph, ScienceDaily, Slate, and dozens of other news outlets globally.
Gharibans AA, Smarr B, Kunkel DC, Kriegsfeld LJ, Hayat MM, Coleman TP. Demonstration and Validation of a Noninvasive
System for Measuring Gastric Myoelectric Activity in Ambulatory Subjects. Scientific Reports. 2018 March 22; 8: 5019.
Featured in San Diego Union-Tribune, EurekAlert!, MobiHealthNews, and many national news outlets.
Smarr B, Grant A, Perez L, Zucker I, Kriegsfeld LJ. Maternal and Early-Life Circadian Disruption Have Long Lasting Negative
Consequences on Offspring Development and Adult Behavior in Mice. Scientific Reports. 2017 June 12; 7(1):3326.
Smarr B, Grant A, Zucker I, Prendergast BJ, Kriegsfeld LJ. Sex Differences in Variability Across Time Scales in Mice. Biology
of Sex Differences. 2017 Feb 9; 8:7.
Smarr B, Zucker I, Kriegsfeld LJ. Detection of Successful and Unsuccessful Pregnancies in Mice within Hours of Pairing
Through Frequency Analysis of High Temporal Resolution Core Body Temperature Data. PLoS One. 2016 Jul 28;
11(7):e0160127.
Smarr B, Burnett DC, Mesri SM, Kriegsfeld LJ, Pister KSJ. A Wearable Sensors System with Circadian Rhythm Stability
Estimation for Prototyping Biomedical Studies. IEEE Transactions on Affective Computing. 2016 Jul; 7(3): 220-230.
Smarr B, Kriegsfeld LJ. “Biological Rhythms.” American Psychiatric Association Handbook. 2016; 599-614.

My most important work of the last 5 years has been leveraging time series data into insights about health outcomes. In
animal models these include the ability to detect pregnancy within hours of conception, and to predict pregnancy outcome
within the first day of pregnancy; to develop within-individual tracking of Huntington’s disease progression; to
demonstrate that female animals show less variability than males – not more - across ovulatory cycles when faster
biological rhythms are included in analyses; and that disruptions to circadian rhythms during pregnancy cause autism-like
outcomes in resultant offspring.
I have transitioned this work into human populations, contributing a number of important insights in the last few years.
These include: determining wearable device design for inferring circadian phase from body temperature; inferring internal
hormonal concentrations from time series features of wearable sensor data; prediction of student academic performance
from sleep and circadian metrics; co-discovery of circadian and ultradian rhythms within the gastric system; seasonal
variation in cardiac output across large populations.
My work is now focused on developing early insights into COVID-19 and other illnesses from wearable device data.
Manuscripts in preparation include detection of fever, prediction of illness onset, and classification of illness variants from
physiological time series features. This work is being carried out with participation from 50,000 wearable device users,
and is an important template for expanded capture of “natural experiments” for developing tools for individuals at public-
health scale using distributed hardware infrastructure of personal tracking devices.
Together, these works focus on unlocking the novel potential of continuous physiological data generated from wearable
devices and related technologies, with a focus on modernizing women’s health, education outcomes, and long-term care
and monitoring in illness.

9. RECENT PROFESSIONAL DEVELOPMENT


NSF Big Data in Public Infrastructure workshop, participant
World Economic Forum on Public Health Potential from Wearable Devices, participant and presenter
TemPredict public-private planning sessions, coordination with NIH, DoD, wearable device manufacturers, UCSD @ HDSI
and SDSC
Berk Ustun
Education
Massachusetts Institute of Technology, 2010 – 2017
PhD in Electrical Engineering and Computer Science
SM in Computation for Design and Optimization

University of California, Berkeley, 2005 – 2009
BS in Industrial Engineering and Operations Research, BA in Economics

Academic Positions
University of California, San Diego, 2021 – Present
Assistant Professor, Halıcıoğlu Data Science Institute

Google Research, 2020 – 2021
Visiting Faculty, Google Medical Brain

Harvard University, 2017 – 2020
Postdoctoral Fellow, Center for Research on Computation and Society

Selected Professional Experience
Petal. New York, NY, 2015 – 2020
Co-Founder & Technical Advisor
Spearheaded machine learning strategy to lend responsibly to consumers without credit
history in the US. Petal is a credit card startup with over 100 employees and 50K customers.

Amazon. Seattle, WA, Summer 2013
Research Scientist Intern, IPC Buying Strategy Team
Developed algorithms to identify complementary products. Proposed new inventory and
transportation policies for complementary products that achieved major cost savings.

Selected Honors & Awards
INFORMS Innovative Applications in Analytics Award, 2016 & 2019
INFORMS Computing Society Best Student Paper Prize, 2017
INFORMS Wagner Award for Excellence in Operations Research Practice, Finalist, 2017
MIT Presidential Fellowship, 2012

Academic Service
Conference & Workshop Organization: NeurIPS Workshop Selection Program Committee
(2020), FAT* – Co-Chair for the Computer Science Track (2020), FAT/ML – Workshop
Organizer & Webmaster (2017, 2018), INFORMS Session Organizer (2013-2019)
Grant Reviewing: National Science Foundation Panelist (2019)
Journal Reviewing: Management Science, IEEE Transactions on Signal Processing,
Statistical Analysis & Data Mining, Artificial Intelligence, Information Sciences, Minds &
Machines, Big Data, Epidemiology, Nature Digital Medicine, Artificial Intelligence & Law,
Journal of Quantitative Criminology, IBM Journal of Research & Development.
Conference Program Committee: NeurIPS (2018, 2019, 2020), ICML (2019), ICLR (2020),
FAT* (2018, 2019), AISTATS (2019), AAAI (2019), HCOMP (2019), UAI (2018), ISIT (2018)

Advising
PhD Students
Jennifer Chien, UCSD CSE 2020 – Present
Jamelle Watson-Daniels, Harvard SEAS 2020 – Present
Eric Mibuari, Harvard SEAS 2018 – Present
Hao Wang, Harvard SEAS 2017 – Present

MS Students
Haorang Zhang, University of Toronto 2020 – Present
Vinith Suriyakumar, University of Toronto 2020 – Present
Alexander Spangher, Columbia University 2017 – 2019

Undergraduates
Tynan Seltzer, Harvard SEAS 2018 – Present
Charles Marx, Haverford College Summer 2019
Jiaming Zeng, MIT 2014 – 2016

Selected Papers
1. Predictive Multiplicity in Classification.
Charles Marx, Flavio Calmon, Berk Ustun. International Conference on Machine Learning, 2020
2. Learning Optimized Risk Scores.
Berk Ustun, Cynthia Rudin. Journal of Machine Learning Research, 2019
3. Fairness without Harm: Decoupled Classifiers with Preference Guarantees.
Berk Ustun, Yang Liu, David C. Parkes. International Conference on Machine Learning, 2019
4. Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions.
Hao Wang, Berk Ustun, Flavio Calmon. International Conference on Machine Learning, 2019
5. Actionable Recourse in Linear Classification.
Berk Ustun, Alexander Spangher, Yang Liu. ACM Conference on Fairness, Accountability, and
Transparency, 2019
6. The World Health Organization Adult ADHD Self-Report Screening Scale for DSM-5.
Berk Ustun, Lenard Adler, Cynthia Rudin, Stephen Faraone, Thomas Spencer, Patricia Berglund,
Michael Gruber, Ronald C. Kessler. JAMA Psychiatry, 2017
7. Association of an EEG-Based Risk Score With Seizure Probability in Hospitalized Patients.
Aaron Struck, Berk Ustun, Andres Ruiz, Jong Woo Lee, Suzette LaRoche, Lawrence Hirsch, Emily
Gilmore, Jan Vlachy, Hiba A. Haider, Cynthia Rudin, Brandon Westover. JAMA Neurology, 2017
8. Interpretable Classification Models for Recidivism Prediction.
Jiaming Zeng, Berk Ustun, Cynthia Rudin. JRSS Series A, 2016
9. Supersparse Linear Integer Models for Optimized Medical Scoring Systems.
Berk Ustun, Cynthia Rudin. Machine Learning, 2015

Professional Development
MIT Kaufman Teaching Certification Program, Spring 2014
Completed semester-long course on modern approaches to curriculum design, lecturing,
and assessment.



Bradley Voytek, Ph.D.
Initial faculty appointment: March 01, 2014 (858) 643-0002
University of California, San Diego bvoytek@ucsd.edu
9500 Gilman Dr. http://www.voyteklab.com
La Jolla, CA 92093-0515 http://github.com/voytekresearch
Education
2004 - 2010 Ph.D. Neuroscience, UC Berkeley
1998 - 2002 B.S. Psychology, University of Southern California
Professional
2020 - current Vice-chair: Data Science Major/Minor Steering Committee, UC San Diego
2019 - current Board Member: UC San Diego, Halıcıoğlu Data Science Institute Industry Member Board
2019 - current Co-founder and Board Member: Data Science Alliance - 501(c)(3) non-profit
2019 - current Diversity Representative and Chair: Halıcıoğlu Data Science Institute, UC San Diego
2018 - current Director: Halıcıoğlu Data Science Institute Scholarship Program, UC San Diego
2018 - current Director: Halıcıoğlu Data Science Institute Distinguished Lecture Series, UC San Diego
2018 - current Affiliate: Institute for Practical Ethics, UC San Diego
2018 - current Associate Professor (with tenure), Cognitive Science, UC San Diego
2018 - 2020 Fellow (similar to endowed chair), Halıcıoğlu Data Science Institute at UC San Diego
2017 - current Executive Committee, Neurosciences Graduate Program, UC San Diego
2017 - current Executive Committee, Halıcıoğlu Data Science Institute, UC San Diego
2017 - 2019 Co-founder, Diversity Outreach and Training Committee, Cognitive Neuroscience Society
2016 - 2019 Diversity Committee Chair: Neurosciences Graduate Program, UC San Diego
2014 - current Cognitive Science Student Association Faculty Representative, UC San Diego
2014 - 2016 Diversity Representative: UC San Diego, Neurosciences Graduate Program
2014 - current Data Science Student Society (DS3) Faculty Representative, UC San Diego
2013 - current Consultant, National Academy of Sciences, Science & Entertainment Exchange
Honors and Awards
• 2017 National Academy of Sciences Kavli Frontiers of Science, Symposium Chair
• 2016 Computational and Systems Neuroscience New Attendee Award
• 2015 National Academy of Sciences Kavli Fellow
• 2015 Alfred P. Sloan Research Fellow in Neuroscience
• 2011 AAAS: Finalist - Early Career Award for Public Engagement with Science
• UC Berkeley: Outstanding Graduate Student Instructor, University-wide teaching award
Research Support
• 2019 - 2023: NIGMS R01 GM134363-01: Tools for parameterizing and visualizing
electrophysiological rhythmic and arrhythmic features. $1,262,000 (total). (PI: B. Voytek)
• Intel Corporation: On Device Telemetry Workload Correlations with Personas and Environment.
$15,000 (total). (PI: B. Voytek)
• 2018 - 2020: Whitehall Foundation 2017-12-73: Prefrontal Oscillatory Mechanism of “Activity Silent”
Memory. $225,000 (total). (PI: B. Voytek)
• 2018: UC Stem Cell Program Innovative Award. $100,000 (total). (PI: A. Muotri; Co-PI: B. Voytek).
• 2018: UC San Diego, Shiley-Marcos Alzheimer’s Disease Research Center (ADRC): Research
Training in Alzheimer’s Disease, $35,000 (total). (PI: B. Voytek)
• 2017 - 2020: National Science Foundation BCS COGNEURO 1736028: Oscillatory phase dynamics
coordinate cognitive neural networks, $471,777 (total). (PI: B. Voytek)
• 2017 - 2020: National Science Foundation DGE NRT 1735234: NRT-IGE: Augmenting, Piloting, and
Scaling Computational Notebooks to Train New Graduate Researchers in Data-Centric Programming,
$498,751 (total). (PI: James Hollan; Co-PIs: Philip Guo, Scott Klemmer, Bradley Voytek)
Selected Publications
• Voytek B & Knight RT. Prefrontal cortex and basal ganglia contributions to visual working memory.
Proc Natl Acad Sci USA 2010.
• Voytek B, Canolty RT, Shestyuk A, Crone NE, Parvizi J, Knight RT. Shifts in gamma phase-
amplitude coupling frequency from theta to alpha over posterior cortex during visual tasks. Front Hum
Neurosci 2010.
• Voytek B, Davis M, Yago E, Barceló F, Vogel EK, Knight RT. Dynamic neuroplasticity after human
prefrontal cortex damage. Neuron 2010.
• Voytek JB & Voytek B. Automated cognome construction and semi-automated hypothesis
generation. J Neurosci Methods 2012.
• Voytek B, D’Esposito M, Crone NE, Knight RT. A method for event-related phase/amplitude coupling.
NeuroImage 2013.
• Voytek B, Kayser A, Badre D, Fegen D, Chang EF, Crone NE, Parvizi J, Knight RT, D’Esposito M.
Oscillatory dynamics coordinating human frontal networks in support of goal maintenance. Nature
Neuroscience 2015.
• Voytek B, Kramer MA, Case J, Lepage KQ, Tempesta ZR, Knight RT, Gazzaley A. Age-related
changes in 1/f neural electrophysiological noise. Journal of Neuroscience 2015.
• Voytek B & Knight RT. Dynamic network communication as a unifying neural basis for cognition,
development, aging, and disease. Biological Psychiatry 2015.
• Tran TT, Hoffner NC, LaHue SC, Tseng L, Voytek B. Alpha phase dynamics predict age-related
visual working memory decline. NeuroImage 2016.
• Voytek B. The virtuous cycle of a data ecosystem. PLOS Computational Biology 12(8): 1-6. 2016.
• Cole SR, van der Meij R, Peterson EJ, de Hemptinne C, Starr PA, Voytek B. Nonsinusoidal beta
oscillations reflect cortical pathophysiology in Parkinson's disease. J Neurosci 2017.
• Cole SR & Voytek B. Brain oscillations and the importance of waveform shape. Trends Cogn Sci
2017.
• Voytek B. Social Media, Open Science, and Data Science are Inextricably Linked. Neuron 2017.
• Gao RD, Peterson EJ, Voytek B. Inferring synaptic excitation/inhibition balance from field potentials.
NeuroImage 2017.
• Moore SM, Seidman JS, Ellegood J, Gao R, Savchenko A, Troutman TD, Abe Y, Stender J, Lee D,
Wang S, Voytek B, Lerch JP, Suh H, Glass C, Muotri A. Setd5 haploinsufficiency alters neuronal
network connectivity and leads to autistic-like behaviors in mice. Translational Psychiatry 2019.
• Cole SR, Donoghue T, Gao R, Voytek B. NeuroDSP: A package for neural digital signal processing.
Journal of Open Source Software 2019.
• Jackson N, Cole SR, Voytek B, Swann NC. Characteristics of waveform shape in Parkinson’s
disease detected with scalp electroencephalography. eNeuro 2019.
• Veerakumar A, Tiruvadi V, Howell B, Waters AC, Crowell AL, Voytek B, Posse PR, Denison L,
Rajendra JK, Edwards JA, Bijanki KR, Choi KS, Mayberg HS. Field potential 1/f activity in the
subcallosal cingulate region as a candidate signal for monitoring deep brain stimulation for treatment
resistant depression. J Neurophysiol. 2019.
• Cole S & Voytek B. Cycle-by-cycle analysis of neural oscillations. J Neurophysiol. In press.
• Trujillo CA*, Gao R*, Negraes PD*, Chaim IA, Domissy A, Vandenberghe M, Devor A, Yeo GW,
Voytek B#, Muotri AR#. Complex Oscillatory Waves Emerging from Cortical Organoids Model Early
Human Brain Network Development. Cell Stem Cell. 2019. *,# these authors contributed equally
• Robertson MM, Furlong S, Voytek B, Donoghue T, Boettiger CA, Sheridan MA. EEG Power Spectral
Slope differs by ADHD status and stimulant medication exposure in early childhood. J Neurophysiol.
122(6): 2427-2437. 2019.
• Molina JL, Voytek B, Thomas ML, Joshi YB, Bhakta SG, Talledo JA, Swerdlow NR, Light GA.
Memantine effects on EEG measures of putative excitatory/inhibitory balance in schizophrenia. Biol
Psychiatry Cogn Neurosci Neuroimaging. In press.
• Ghatak S, Dolatabadi N, Gao R, Wu Y, Scott H, Trudler D, Sultan A, Ambasudhan R, Nakamura T,
Masliah E, Talantova M, Voytek B, Lipton SA. NitroSynapsin ameliorates hypersynchronous neural
network activity in Alzheimer hiPSC models. Mol Psychiatry. In press.
• Tran TT, Rolle CE, Gazzaley A, Voytek B. Linked sources of neural noise contribute to age-related
cognitive decline. J Cogn Neurosci. In press.
Book
• Verstynen T & Voytek B (2014). Do Zombies Dream of Undead Sheep? A Neuroscientific View of the
Zombie Brain: Princeton University Press.

1. Name
Tsui-Wei (Lily) Weng

2. Education – degree, discipline, institution, year


Ph.D., EECS, MIT, Sep 2020
M.S., Communication Engineering, National Taiwan University, June 2013
B.S., Electrical Engineering, National Taiwan University, June 2011

3. Academic experience – institution, rank, title (chair, coordinator, etc. if appropriate),
when (ex. 2002-2007), full time or part time
Assistant Professor, UCSD HDSI, 2021-present

4. Non-academic experience – company or entity, title, brief description of position, when
(ex. 2008-2012), full time or part time
MIT-IBM Watson AI Lab, Research Staff Member, 2020-2021, Full time
Google DeepMind, Research Intern, 2019.05-2019.09, Full time
IBM Research, Research Intern, 2018.05-2018.08, Full time
IBM Research, Research Intern, 2017.05-2017.08, Full time
Mitsubishi Electric Research Lab, Research Intern, 2015.06-2015.08, Full time

5. Certifications or professional registrations

6. Current membership in professional organizations

7. Honors and awards

8. Service activities (within and outside of the institution)

Reviewer for top-tier machine learning conferences including ICML, CVPR, AAAI,
ICLR, ECCV, NeurIPS, 2018-present

9. Briefly list the most important publications and presentations from the past five years –
title, co-authors if any, where published and/or presented, date of publication or
presentation

(i) Five Most Relevant Publications:


1. T.-W. Weng, H. Zhang, P.-Y. Chen, J. Yi, D. Su, Y. Gao, C.-J. Hsieh and L. Daniel,
“Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach”,
ICLR 2018
2. T.-W. Weng, H. Zhang, H. Chen, Z. Song, C.-J. Hsieh, D. Boning, I. S. Dhillon and L.
Daniel, “Toward Fast Computation of Certified Robustness for ReLU Networks”, ICML
2018
3. T.-W. Weng, P.-Y. Chen, L. M. Nguyen, M. S. Squillante, A. Boopathy, I. Oseledets, and
L. Daniel, “PROVEN: Verifying Robustness of Neural Networks with a Probabilistic
Approach”, ICML 2019
4. A. Boopathy, T.-W. Weng, P.-Y. Chen, S. Liu and L. Daniel, “CNN-Cert: An Efficient
Framework for Certifying Robustness of Convolutional Neural Networks”, AAAI 2019
5. A. Boopathy, T.-W. Weng, S. Liu, P.-Y. Chen and L. Daniel, “Efficient Training of
Robust and Verifiable Neural Networks”, under submission to NeurIPS 2020

(ii) Five Other Significant Publications


1. Y.-S. Weng, T.-W. Weng, and L. Daniel, “Neural Network Control Policy Verification
with Persistent Adversarial Perturbations”, ICML 2020
2. J. Mohapatra, T.-W. Weng, P.-Y. Chen, S. Liu and L. Daniel, “Towards Verifying
Robustness of Neural Networks Against a Family of Semantic Perturbations”, CVPR 2020
3. T.-W. Weng, K. Dvijotham, J. Uesato, K. Xiao, S. Gowal, R. Stanforth, Pushmeet Kohli,
“Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous
Control”, ICLR 2020.
4. H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh and L. Daniel, “Efficient Neural Network
Robustness Certification with General Activation Functions”, NeurIPS 2018
5. C.-Y. Ko, Z. Lyu, T.-W. Weng, L. Daniel, N. Wong, and D. Lin, “POPQORN:
Quantifying Robustness of Recurrent Neural Networks”, ICML 2019

10. Briefly list the most recent professional development activities


1. Name: Yusu Wang

2. Education – degree, discipline, institution, year

School, university    Dates of attendance  Location      Major subject or field  Degrees  Date received
Stanford University   2004-2005            Stanford, CA  Computer Science        Postdoc  2005
Duke University       2000-2004            Durham, NC    Computer Science        Ph.D.    2004
Duke University       1998-2000            Durham, NC    Computer Science        M.Sc.    2000
Tsinghua University   1993-1998            China         Computer Science        B.Sc.    1998

3. Academic experience – institution, rank, title (chair, coordinator, etc. if appropriate),
when (ex. 2002-2007), full time or part time

Period of employment  Institution, firm or organization         Location    Rank, title, or position
2020-Present          Univ. California, San Diego               California  Professor
2018-2020             Foundation Research Community of          Ohio        Co-director of CoP
                      Practice (CoP) at Translational Data
                      Analytics Institute @ OSU
2017-2020             Ohio State University                     Ohio        Professor
2011-2017             Ohio State University                     Ohio        Associate Professor
2012-2013             Institute of Science and Technology       Austria     Visiting Professor
2005-2011             Ohio State University                     Ohio        Assistant Professor

4. Non-academic experience – company or entity, title, brief description of position, when
(ex. 2008-2012), full time or part time
N/A
5. Certifications or professional registrations
N/A
6. Current membership in professional organizations
2005-present ACM member

7. Honors and awards

(2002) Department service award, Computer Science Dept, Duke University
(2004) Best Ph.D. Dissertation, Computer Science Dept., Duke University
(2006) U.S. Department of Energy (DOE) Early Career Principal Investigator (ECPI) Award
(2008) U.S. National Science Foundation (NSF) Career Award
(2008) Top Reviewer for journal Computational Geometry: Theory and Applications
(2010) A Best Paper Award, Eurovis
(2011) Lumley Research Award, College of Engineering, OSU
(2015) Best Paper Award at ACM SIGSPATIAL GIS
(2015) Best Student Paper Award at Conf. Learning Theory (COLT)

8. Service activities (within and outside of the institution)

Steering Committee:
(06/2020-present) Computational Geometry Steering Committee
Associate Editors:
(2019-present) SIAM Journal on Computing (SICOMP)
(2020-present) Computational Geometry: Theory and Applications
(2010-present) Journal of Computational Geometry
Program committee (Co-)Chair:
(2016) Joint STOC/SoCG Workshop Day
(2019) 35th Symposium on Computational Geometry (SoCG)
Co-organizers:
TGDA@OSU Conference (2016, 2018), TGDA@OSU summer school (2018), NSF CBMS
Conference on Elastic Functional and Shape Data Analysis (2019), AMW workshop on
Women in Computational Topology at JMM (Joint Mathematics Meeting) (2019).

9. Briefly list the most important publications and presentations from the past five years

• (2021) AMS Southeastern Sectional Meeting of the Society, Keynote speaker
• (2020) 32nd Canadian Conference on Computational Geometry (CCCG), Keynote speaker
• (2019) 34th Summer Conf. on Topology and its Applications, Semi-Plenary speaker
• (2017) 6th Mini-Symposium on Computational Topology, Australia, Keynote speaker
• (2016) International Workshop on Topological Data Analysis in Biomedicine (TDA-Bio),
part of ACM BCB, Keynote speaker
• Banerjee, L. Magee, D. Wang, X. Li, B. Huo, J. Jayakumar, K. Matho, M. Lin, K. Ram, M.
Sivaprakasam, J. Huang, Y. Wang, and P. Mitra. Semantic segmentation of microscopic
neuroanatomical data by combining topological priors with encoder-decoder deep networks.
Nature Machine Intelligence. Sept. 2020.
• L. Yan, Y. Wang, E. Munch, E. Gasparovic, and B. Wang. A structural average of labeled
merge trees for uncertainty visualization. IEEE Trans. Vis. Comput. Graph. (TVCG), 26(1),
832–842 (2020).
• Q. Zhao and Y. Wang. Learning metrics for persistence-based summaries and applications
for graph classification. In 33rd Conf. Neural Information Processing Systems (NeurIPS),
9855–9866, 2019.
• C. Chen, X. Ni, Q. Bai, Y. Wang. A topological regularizer for classifiers via persistent
homology. In 22nd Intl. Conf. Artificial Intelligence and Statistics, 2573–2582, 2019.
• T. K. Dey, J. Wang and Y. Wang. Graph reconstruction by discrete Morse theory. In Sympos.
Comput. Geom. (SoCG), 31:1–31:15, 2018.

10. Briefly list the most recent professional development activities


BIOGRAPHICAL SKETCH: RUTH J. WILLIAMS
Address: Dept. of Mathematics, UCSD, 9500 Gilman Drive, La Jolla, CA 92093-0112.
Email: rjwilliams@ucsd.edu, Web page: http://www.math.ucsd.edu/~williams/
a. Professional Preparation.
B.Sc. (Hons.), Mathematics, University of Melbourne, Melbourne, Australia, 1977.
M.Sc., Mathematics, University of Melbourne, Melbourne, Australia, 1979.
Ph.D., Mathematics, Stanford University, Palo Alto, CA, 1983.
b. Appointments.
Charles Lee Powell Chair in Mathematics I, UC San Diego, 2011-present.
Distinguished Professor, Dept. of Mathematics, UC San Diego, 2009–present.
Professor, Dept. of Mathematics, UC San Diego, 1991–2009.
Associate Professor, Dept. of Mathematics, UC San Diego, 1988–91.
Assistant Professor, Dept. of Mathematics, UC San Diego, 1983–88.
Postdoctoral Member in Probability, Courant Inst. of Math. Sciences, NYU, 1983–84.
Visiting Positions (last 25 years).
Visiting Scholar, Center of Mathematical Sciences & Applic., Harvard Univ., 9/19-11/19.
Visiting Member of ACEMS (Australian Research Council Centre of Excellence for Mathematical
and Statistical Frontiers), Univ. Melbourne, Australia, 9/15, 7/18 and 7/19.
Visiting Member, Center for Modeling Stochastic Sys., Monash Univ., Australia, 3/17-5/17.
G. C. Steward Visiting Fellow in Math., Gonville & Caius College, Cambridge, 3/10-4/10.
Visiting Fellow, Isaac Newton Institute for Math. Sciences, Cambridge, 3/10-4/10.
Visiting Prof. of Operations, Information & Technology, Stanford University, 9/01–6/02.
Research Professor, Mathematical Sciences Research Institute, Berkeley CA, 9/97–5/98.
Visiting Scientist, Inst. for Math. & Its Applic., U. Minnesota, 3–6/86, 1–3/94 & 10-11/15.
c. Honors.
Fellow of the Society for Industrial and Applied Mathematics, since 2020.
Corresponding Member of the Australian Academy of Science, elected 2018.
Award for the Advancement of Women in Operations Research & Management Sci., 2017.
John von Neumann Theory Prize, 2016 (jointly with M. I. Reiman), awarded by the Institute
for Operations Research and the Management Sciences (INFORMS).
Inaugural Fellow of the American Mathematical Society, Class of 2013.
Fellow of the American Academy of Arts and Sciences since 2009.
INFORMS Fellow since 2008.
INFORMS Applied Probab. Soc. Best Pub. Award 2007 (joint with Gromoll and Puha).
Guggenheim Fellowship, 2001-2002.
Invited 45 minute speaker, International Congress of Mathematicians, Berlin, 1998.
Fellow of the American Association for the Advancement of Science since 1995.
Fellow of the Institute of Mathematical Statistics since 1992.
Alfred P. Sloan Research Fellow, 1988–92.
National Science Foundation Presidential Young Investigator Award, 1987–93.

d. Research Interests.
Probability, stochastic processes and their applications; multidimensional reflected diffusions;
stochastic differential (delay) equations; measure-valued processes; fluid and diffusion approximations
for complex networks; analysis and control of stochastic networks with applications
to operations management, telecommunications and systems biology.
e. Five Selected Recent Publications.
1. Yingjia Fu and Ruth J. Williams, Stability of a subcritical fluid model for fair bandwidth
sharing with general file size distributions, Stochastic Systems, in press.
2. J. A. Mulvany, A. L. Puha and R. J. Williams, Asymptotic behavior of a critical fluid
model for a multiclass processor sharing queue via relative entropy, Queueing Systems, 93
(2019), 351-397.
3. S. C. Leite and R. J. Williams, A constrained Langevin approximation for chemical reaction
networks, Annals of Applied Probability, 29 (2019), 1541-1608.
4. R. J. Williams, Stochastic Processing Networks, invited article for Annual Review of
Statistics and Its Application, 3 (2016), 323-345.
5. D. Lipshutz and R. J. Williams, Existence, uniqueness and stability of slowly oscillating
periodic solutions for delay differential equations with non-negativity constraints, SIAM J.
on Mathematical Analysis, 47 (2015), 4467–4535.
f. External Professional Activities (5 illustrative examples).
(i) Member of the Council of the National Academy of Sciences, 2019-2022.
(ii) Member of the selection committee for the biennial INFORMS Impact Prize, 2018, 2020
(committee chair in 2020).
(iii) Member of Governance Board of MATRIX (Australian Mathematics Inst.), 2015-pres.
(iv) Associate Editor for Applied Probability Trust Journals: Journal of Applied Probability
and Advances in Applied Probability (2016-present).
(v) President, Institute of Mathematical Statistics, 2012.
g. UC San Diego Recent Service Activities (5 illustrative examples).
(i) Founding Faculty and Council Member, Halıcıoğlu Data Science Institute, 2018-present.
(ii) Faculty Advisory Committee for Moore Science Communication grant, 2017–19.
(iii) Physical Sciences Task Force on the Status of Women in the Physical Sciences, 2017–18.
(iv) Council Member, Mathematics Department, 2016–2020 (one of three Council members
elected to represent the faculty).
(v) Chair, Mathematics Department Hiring Committee, 2016-2018.

1. Name: Arya Mazumdar

2. Education – degree, discipline, institution, year


a. Ph.D., Electrical and Computer Engineering, University of Maryland College
Park, 2011

1. Academic experience – institution, rank, title (chair, coordinator, etc. if appropriate),
when (ex. 2002-2007), full time or part time
a. University of California San Diego, Associate Professor, 2021-present.
b. University of Massachusetts Amherst, Associate Professor, 2019-2021.
c. University of Massachusetts Amherst, Assistant Professor, 2015-2019.
d. University of Minnesota Twin Cities, Assistant Professor, 2013-2015.
e. Massachusetts Institute of Technology, Postdoctoral Associate, 2011-2012.

2. Non-academic experience – company or entity, title, brief description of position, when
(ex. 2008-2012), full time or part time
a. Amazon, Senior Scientist, 2019-2020
b. IBM Almaden, PhD Intern, 2010.
c. HP Labs, PhD Intern, 2008.

3. Current membership in professional organizations: Senior Member, IEEE and IEEE
Information Theory Society; Member, ACM

4. Honors and awards


a. Best Paper Award, European Association for Signal Processing, 2020
b. NSF CAREER Award, 2015
c. Distinguished Dissertation Award, University of Maryland, 2011
d. Jack K. Wolf Paper Award, ISIT, 2010.

5. Service activities (within and outside of the institution)


a. UCSD: DSC Program Committee, Industry Liaison Committee
b. UMass: Faculty Hiring Committee, Graduate Committee, Admissions Committee
c. Editorial Boards: Associate Editor: IEEE Transactions on Information Theory;
Area Editor: Now Publishers Foundations and Trends; Guest Editor: Entropy
d. Organization:
i. Information Theory School, 2019
ii. Workshop on Coding Theory for Optimization, Learning, Inference
e. Review Panels: NSF, BSF (US+Israel), ISF (Israel), Research Grant Council
(Hong Kong)
f. Conference Program Committees: ISIT, AAAI, AISTATS, DSP, ITW, NVMW

6. Briefly list the most important publications and presentations from the past five years –
title, co-authors if any, where published and/or presented, date of publication
a. A Mazumdar, S Pal, “Semisupervised clustering by queries and locally encodable
source coding,” IEEE Transactions on Information Theory, vol. 67, no. 2, 2021.
Preliminary Version in NeurIPS 2017 (Spotlight paper).
b. V Gandikota, D Kane, R Maity, A Mazumdar, “vqSGD: vector quantized
stochastic gradient descent,” AISTATS, 2021.
c. A Ghosh, R Maity, A Mazumdar, “Distributed Newton can communicate less and
resist byzantine workers,” NeurIPS, 2020.
d. V Gandikota, A Mazumdar, S Pal, “Recovery of sparse linear classifiers from
mixture of responses,” NeurIPS, 2020.
e. S Ubaru, S Dash, A Mazumdar, O Gunluk, “Multilabel classification by
hierarchical partitioning and data-dependent grouping,” NeurIPS, 2020.
f. R McKenna, R Maity, A Mazumdar, G Miklau, “A workload-adaptive mechanism
for linear queries under local differential privacy,” VLDB, 2020.
g. A Mazumdar, S Pal, “Recovery of sparse signals from a mixture of
linear samples,” International Conference on Machine Learning (ICML), 2020.
h. A Krishnamurthy, A Mazumdar, A McGregor, S Pal, “Algebraic and Analytic
Approaches for Parameter Learning in Mixture Models,” ALT, 2020.
i. A Krishnamurthy, A Mazumdar, A McGregor, S Pal, “Sample complexity of
learning mixture of sparse linear regressions,” NeurIPS, 2019.
j. W Huleihel, A Mazumdar, M Medard, S Pal, “Same-cluster querying for
overlapping clusters,” NeurIPS, 2019.
k. L Flodin, V Gandikota, A Mazumdar, “Superset technique for approximate
recovery in one-bit compressed sensing,” NeurIPS, 2019.
l. A Mazumdar, A McGregor, S Vorotnikova, “Storage capacity as information
theoretic vertex cover and the index coding rate,” IEEE Transactions on Information
Theory, vol. 65, no. 9, 2019.
m. A Mazumdar, B Saha, “Clustering with noisy queries,” NIPS, 2017.
n. A Mazumdar, B Saha, “Query complexity of clustering with side information,”
NIPS, 2017.
o. A Barg, A Mazumdar, “Group testing schemes from codes and designs,” IEEE
Transactions on Information Theory, vol. 63, no. 11, Nov 2017.
p. S Ubaru, A Mazumdar, Y Saad, “Low rank approximation and decomposition of
large matrices using error correcting codes,” IEEE Transactions on Information Theory,
vol. 63, no. 9, Sep 2017.
q. A Mazumdar, “Nonadaptive group testing with random set of defectives,” IEEE
Transactions on Information Theory, vol. 62, no. 12, Dec 2016.

7. Briefly list the most recent professional development activities


a. Courses Taught Recently: Algorithms for Data Science, Optimization in
Computer Science, Undergraduate Probability and Statistics, Information Theory,
Coding Theory
Angela J. Yu
Associate Professor (Asst. Prof. 2008-2014)
Department of Cognitive Science, University of California San Diego
ajyu@ucsd.edu
http://www.cogsci.ucsd.edu/~ajyu

Education
Princeton University (04/05–07/08) Princeton, NJ
Post-doctoral fellow in the Center for the Study of Brain, Mind, and Behavior
UCL Gatsby Computational Neuroscience Unit (10/00–06/05) London, UK
Ph.D. in Computational Neuroscience.
Massachusetts Institute of Technology (09/96–06/00) Cambridge, MA
B.S. Mathematics, B.S. Computer Science, B.S. Brain & Cognitive Sciences; GPA: 5.0/5.0

Selected Pubs
Guan, J, Ryali, C, & Yu, A J (2018). Computational modeling of social face perception in
humans: Leveraging the active appearance model. bioRxiv, https://doi.org/10.1101/360776.
Ryali, C, & Yu, A J (2018). Beauty-in-averageness and its contextual modulations: A Bayesian
statistical account. Adv. in Neural Information Processing Systems, 32.
Guo, D, Yu, A J (2018). Why so gloomy? A Bayesian explanation of human pessimism bias
in the multi-armed bandit task. Adv. in Neural Information Processing Systems, 32.
Ryali, C, Gautam, R, & Yu, A J (2018). Demystifying excessively volatile human learning: A
Bayesian persistent prior and a neural approximation. Adv. in Neural Information Processing
Systems, 32.
Wang, W, Hu, S, Ide, J S, Zhornitsky, S, Zhang, S, Yu, A J, & Li, C-S R (2018). Motor
preparation disrupts proactive control in the stop signal task. Frontiers in Human Neuroscience,
doi: 10.3389/fnhum.2018.00151.
Cogliati Dezza, I, Yu, A J, Cleeremans, A, Alexander, W (2017). Learning the value of information
and reward over time when solving exploration-exploitation problems. Nature Scientific
Reports, 7:16919.
Harlé, K M, Guo, D, Zhang, S, Paulus, M, Yu, A J (2017). Anhedonia and anxiety underlying
depressive symptomatology have distinct effects on reward-based decision-making. PLoS ONE,
12(10):e0186473.
Harlé, K M, Zhang, S, Ma, N, Yu*, A J, & Paulus, M P* (2016). Reduced neural recruitment
for Bayesian adjustment of inhibitory control in methamphetamine dependence. Biological
Psychology: Cog. Neurosci. and Neuroimaging, 1: 48-459. *Co-senior authors.
Li L, Malave, V, Song, A, & Yu, A J (2016). Extracting Human Face Similarity Judgments:
Pairs or Triplets? Proceedings of the Cognitive Science Society Conference.
Ma, N & Yu, A J (2016). Inseparability of Go and Stop in Inhibitory Control: Go Stimulus
Discriminability Affects Stopping Behavior. Frontiers in Decision Neuroscience, 10 (54).
Harlé, K M, Zhang, S, Schiff, M, Mackey, S, Paulus, M P, & Yu, A J (2015). Altered statistical
learning and decision-making in methamphetamine dependence: Evidence from a two-armed
bandit task. Frontiers in Psychology, 6 (1910).
Harlé, K M, Steward, J L, Zhang, S, Tapert, S, Paulus, M P, & Yu, A J (2015). Bayesian
neural adjustment of inhibitory control predicts emergence of problem stimulant use. Brain,
138:3413-26.
Ma, N & Yu, A J (2015). Statistical Learning and Adaptive Decision-Making Underlie Human
Response Time Variability in Inhibitory Control. Frontiers in Psychology, 6(1046).
Ide, J S, Hu, S, Zhang, S, Yu, A J & Li, C-S R (2015). Impaired Bayesian learning for cognitive
control in cocaine dependence. Drug and Alcohol Dependence, 151: 220-227.
Ahmad, S & Yu, A J (2015). A rational model for individual differences in preference choice.
Proceedings of the Cognitive Science Society Conference.
Zhang, S, Song, M, & Yu, A J (2015). Bayesian hierarchical model of local-global processing:
Visual crowding as a case-study. Proceedings of the Cognitive Science Society Conference.
Yu, A J & Huang, H (2014). Maximizing masquerading as matching: Statistical learning and
decision-making in choice behavior. Decision, 1 (4): 275-287.
Harlé, K M, Shenoy, P, Steward, J L, Tapert, S, Yu*, A J, Paulus*, M P (2014). Altered
neural processing of the need to stop in young adults at risk for stimulant dependence. Journal
of Neuroscience, 34(13): 4567-4580. *Co-senior authors.
Ahmad, S, Huang, H, & Yu, A J (2013). Context-sensitivity in human active sensing. Adv. in
Neural Information Processing Systems 26.
Zhang, S & Yu, A J (2013). Forgetful Bayes and myopic planning: Human learning and
decision-making in a bandit setting. Adv. in Neural Information Processing Systems 26.
Shenoy, P & Yu, A J (2013). A rational account of contextual effects in preference choice:
What makes for a bargain? Cognitive Science Society Conference.
Dayanik, S & Yu, A J (2013). Reward-rate maximization in sequential identification under a
stochastic deadline. SIAM Journal on Control and Optimization, 51 (4), 2922-2948.
Yu, A J (2013). Bayesian Models of Attention. Chapter in Handbook of Attention, Eds. S.
Kastner & K. Nobre. Oxford, UK: Oxford University Press.
Ide, J S, Shenoy, P, Yu*, A J, & Li*, C-R (2013). Bayesian prediction and evaluation in the
anterior cingulate cortex. Journal of Neuroscience, 33: 2039-2047. *Co-senior authors.
Shenoy, P & Yu, A J (2012). Rational impatience in perceptual decision-making: a Bayesian
account of discrepancy between two-alternative forced choice and Go/NoGo behavior. Adv. in
Neural Information Processing Systems 25.
Yu, A J (2012). Change is in the eye of the beholder. Nature Neuroscience 15: 933-935.
Shenoy, P, Rao, R, & Yu, A J (2010). A rational decision making framework for inhibitory
control. Adv. in Neural Information Processing Systems 23: 2146-2154.
Yu, A J & Cohen, J D (2009). Sequential effects: Superstition or rational behavior? Adv. in
Neural Information Processing Systems 21: 1873-1880.
Yu, A J, Dayan, P, & Cohen J D (2008). Dynamics of attentional selection under conflict:
Toward a rational Bayesian account. J. Exp. Psy.: Human Perc. and Perf., 35: 700-717.
Frazier, P & Yu, A J (2008). Sequential hypothesis testing under stochastic deadlines. Adv.
in Neural Information Processing Systems 20: 465-72.
Yu, A J. (2007) Adaptive behavior: Humans act as Bayesian learners. Current Biology 17.
Cohen, J D, McClure, S M, & Yu, A J (2007). Should I stay or should I go? How the human
brain manages the tradeoff between exploitation and exploration. Philosophical Transactions
of the Royal Society B: Biological Sciences 362: 933-942.
Yu, A J (2007). Optimal change-detection and spiking neurons. Adv. in Neural Information
Processing Systems 19: 1545-52. MIT Press, Cambridge, MA.
Dayan, P & Yu, A J (2006). Norepinephrine and neural interrupts. Adv. in Neural Information
Processing Systems 18: 243-50.
Yu, A J & Dayan, P (2005). Uncertainty, neuromodulation, and attention, Neuron, 46: 681-692.
Yu, A J & Dayan, P (2005). Inference, attention, and decision in a Bayesian neural architecture.
In Adv. in Neural Information Processing Systems 17: 1577-84.
Yu, A J & Dayan, P (2003). Expected and unexpected uncertainty: ACh and NE in the
neocortex. In Adv. in Neural Information Processing Systems 15.
Yu, A J & Dayan, P (2002). Acetylcholine in cortical inference. Neural Networks, 15 (4/5/6):
719-730.
Dayan, P & Yu, A J (2002). Acetylcholine, uncertainty, and cortical inference. Adv. in Neural
Information Processing Systems 14.
Other Activities
Journal editor: Decision, Frontiers in Behavioral Neurosci., Frontiers in Human Neurosci.
Journal Reviewer: Adaptive Behavior, Brain Research, Cognition, Cognitive Psychology,
Computational Brain & Behavior, Current Biology, Decision, eLife, European Journal of
Neuroscience, Frontiers in Computational Neuroscience, Frontiers in Decision Neuroscience,
Frontiers in Behavioral Neuroscience, Frontiers in Human Neuroscience, Journal of Autonomous
Agents and Multi-Agent Systems, Journal of Neuroscience, Journal of Theoretical Biology,
Memory & Cognition, Nature Communications, Nature Human Behavior, Nature Reviews,
Neural Computation, Neuron, PLoS Computational Biology, PLoS ONE, PNAS, Psychological
Review, Psychonomic Bulletin & Review, Psychopharmacology, Science
Conference Reviewer/Organizer: Cogsci, Cosyne, IJCAI, RLDM, NIPS (AC, SAC)
Zhiting Hu
Halıcıoğlu Data Science Institute
University of California San Diego
La Jolla, CA 92093
Phone: +1 (412) 320-0630
Email: zhh019@ucsd.edu, zhitinghu@gmail.com
Homepage: http://zhiting.ucsd.edu/

Education
2020 Ph.D., Machine Learning Department, Carnegie Mellon University
Advisor: Eric P. Xing
2016 M.S., Language Technologies Institute, Carnegie Mellon University
Advisor: Eric P. Xing
2014 B.S., Computer Science, Peking University, China

Academic Experience
starting 2021.9 Assistant Professor, Halıcıoğlu Data Science Institute, UC San Diego

Non-Academic Experience
2020 – 2021.9 Full-time Visiting Academic, Amazon Alexa AI
2017 – 2020 Full-time Research Scientist, Petuum Inc.

Awards & Honors (Selected)


2019 Best Demo Paper Nomination, ACL 2019
2019 Best Paper Award, ICLR 2019 drlStructPred Workshop
2017 NVIDIA Pioneering Research Award
2017 IBM Ph.D. Fellowship
2017 Baidu Ph.D. Scholarship
2016 Outstanding Paper Award, ACL 2016
2014 Excellence Award of Stars of Tomorrow Internship Program, Microsoft Research Asia
2014 GPA 1st/140, EECS, Peking University
2013 Outstanding Undergraduate Award, China Computer Federation (CCF)
2013 Google Excellence Scholarship
2009 First Prize in China National Biology Olympiad for Senior High School

Service Activities
Co-organizer, NeurIPS 2019 Workshop on Learning with Rich Experience: Integration of Learning Paradigms
Co-organizer, CVPR 2019 Workshop on Towards Causal, Explainable and Universal Medical Visual Diagnosis
Co-organizer, ICML 2019 Workshop on Learning and Reasoning with Graph-Structured Representations
Co-organizer, ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models
Reviewer, NeurIPS, ICML, ACL, EMNLP, NAACL, CVPR, AAAI, KDD, WWW, JMLR, MLJ, TPAMI, etc.
Outstanding reviewer, EMNLP 2020

Publications (Selected)
Google Scholar Profile
[1] Yue Wu, Pan Zhou, Andrew Gordon Wilson, Eric P Xing, Zhiting Hu.
Improving GAN Training with Probability Ratio Clipping and Sample Reweighting
Neural Information Processing Systems (NeurIPS 2020).
[2] Zhengzhong Liu, Guanxiong Ding, Avinash Bukkittu, Mansi Gupta, Pengzhi Gao, Atif Ahmed,
Shikun Zhang, Xin Gao, Swapnil Singhavi, Linwei Li, Wei Wei, Zecong Hu, Haoran Shi, Xiaodan
Liang, Teruko Mitamura, Eric P Xing, Zhiting Hu.
A Data-Centric Framework for Composable NLP Workflows
Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), demo.
[3] Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Tom Mitchell, Eric P Xing.
Learning Data Manipulation for Augmentation and Weighting
Neural Information Processing Systems (NeurIPS 2019).
[4] Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He,
Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, etc.
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL 2019).
Best Demo Paper Nomination, https://github.com/asyml
[5] Zhiting Hu*, Bowen Tan* (equal contrib.), Zichao Yang, Ruslan Salakhutdinov, Eric P Xing.
Connecting the Dots between MLE and RL for Sequence Prediction
ICLR 2019 Workshop on Deep Reinforcement Learning Meets Structured Prediction,
Best Paper Award
[6] Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Xiaodan Liang, Lianhui Qin, Haoye Dong, Eric P
Xing.
Deep Generative Models with Learnable Knowledge Constraints
Neural Information Processing Systems (NeurIPS 2018).
[7] Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric P Xing.
On Unifying Deep Generative Models
International Conference on Learning Representations (ICLR 2018).
[8] Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P Xing.
Toward Controlled Generation of Text
International Conference on Machine Learning (ICML 2017).
[9] Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric P Xing.
Harnessing Deep Neural Networks with Logic Rules
Annual Meeting of the Association for Computational Linguistics (ACL 2016).
Outstanding Paper Award
Curriculum Vitae (abbreviated)
R. Stuart Geiger, Ph.D.
stuart@stuartgeiger.com

Name: Geiger II, Richard Stuart
Dept: Communication & Data Science
Title: Assistant Professor (effective 7/1/2020)

Education

School, college, university | Dates of attendance | Location | Major subject or field | Degrees or certificates | Date received
University of Texas at Austin | 8/2005 - 5/2007 | Austin, TX | Humanities Honors | B.A. | 5/2007
Georgetown University | 8/2007 - 5/2009 | Washington, DC | Communication, Culture, and Technology | M.A. | 5/2009
UC Berkeley, School of Information | 8/2010 - 12/2015 | Berkeley, CA | Information Management & Systems | Ph.D. | 12/2015

Academic Experience

Period of employment | Institution, firm or organization | Location | Rank, title, or position
10/2019 – 6/2020 | UC Berkeley, Berkeley Institute for Data Science | Berkeley, CA | Professional researcher, assistant rank (full-time)
1/2016 – 9/2019 | UC Berkeley, Berkeley Institute for Data Science | Berkeley, CA | Postdoctoral scholar (full-time)
5/2009 – 8/2010 | Georgetown University, Dept of Communication, Culture, and Technology | Washington, DC | Research associate (full-time)

Current memberships in professional organizations


1. Association for Computing Machinery (ACM), 2009-present
2. Society for the Social Studies of Science (4S), 2009-present
3. Association of Internet Researchers (AoIR), 2013-present
4. International Communication Association (ICA), 2014-present

Honors and Awards


1. Research grant (lead PI): $138,055 from the Sloan Foundation & Ford Foundation for 2018-2020
grant on “The Visible and Invisible Work of Maintaining Open-Source Software” w/ Co-PIs L. Irani
& A. Paxton
2. Best presentation award: Awarded at the 2018 Annual European Conference on Computer-Supported
Cooperative Work (ECSCW): Nancy, France, June 7, 2018. For “The Types, Practices, and Roles of
Documentation.”
3. Best paper award (1st runner up): The 2018 David B. Martin Best Paper Award (1st runner up),
from the European Society for Socially Embedded Technologies. For “The Types, Practices, and Roles
of Documentation,” with N. Varoquaux, C. Mazel-Cabasse, and C. Holdgraf.

Service activities

1. Lead organizer of the Best Practices in Data Science working group at the UC-Berkeley Institute for
Data Science (2018-2020)
2. Co-organizer of workshop and event series on Diversity & Inclusion in Data Science at UC-Berkeley
(2018-2020)
3. Program committee member of the ACM Conference on Collective Intelligence (2019-present)
4. Co-organizer of the Data Science Studies / Critical Data Studies conference track at the Annual
Meeting of the Society for the Social Studies of Science (4S) (2016-2018)
5. Undergraduate Research Apprenticeship Mentor, UC-Berkeley (2018-2020)

Recent selected publications and presentations

1. Geiger, R.S., K. Yu, Y. Yang, M. Dai, J. Qiu, R. Tang, and J. Huang. 2020. "Garbage In, Garbage Out:
Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled
Training Data Comes From?" In Proceedings of the ACM Conference on Fairness, Accountability, and
Transparency.
2. Geiger, R.S. 2019. “The Visible and Invisible Work of Maintaining and Sustaining Open-Source
Software.” Keynote at the SciPy (Scientific Python) 2019 conference, Austin, Texas. July 10, 2019.
3. Geiger, R.S. 2018. “Key Values: What We Talk About When We Talk About ‘Open Science.’” Keynote
at the 2018 Hawai’i Open Science Symposium, Manoa, HI. Apr 20, 2018.
4. Geiger, R.S., N. Varoquaux, C. Mazel-Cabasse, and C. Holdgraf. 2018. “The Types, Roles, and
Practices of Documentation in Data Analytics Open Source Software Libraries: A Collaborative
Ethnography of Documentation Work.” Computer Supported Cooperative Work.
https://doi.org/10.1007/s10606-018-9333-1
5. Geiger, R.S. and Halfaker, A. 2017. “Operationalizing conflict and cooperation between
automated software agents in Wikipedia: A replication and expansion of Even Good Bots Fight."
In Proceedings of the ACM on Human-Computer Interaction (Nov 2017 issue, CSCW 2018
Online First). https://doi.org/10.1145/3134684
6. Geiger, R.S. 2017. "Beyond opening up the black box: Investigating the role of algorithmic systems in
Wikipedian organizational culture." Big Data & Society 4(2).
http://stuartgeiger.com/algoculture-bds.pdf
7. Geiger, R.S. 2016. “Bot-based collective blocklists in Twitter: the counterpublic moderation of
harassment in a networked public space.” Information, Communication, and Society 19(6).
http://stuartgeiger.com/blockbots-ics.pdf

Recent Professional Development Activities


1. UC Sexual Violence and Sexual Harassment Prevention Training (6/18/2019)
2. UC Cyber Security Awareness Fundamentals (8/26/2019)
Margaret (Molly) E. Roberts
Contact Information
University of California, San Diego
Social Sciences Building 301
9500 Gilman Drive, #0521
La Jolla, CA 92093-0521
http://margaretroberts.net
meroberts@ucsd.edu

Academic Appointments
University of California, San Diego
Chancellor’s Associates Endowed Chair, 2020-.
Associate Professor, Dept of Political Science and Halıcıoğlu Data Science Institute, July 2020-.
Associate Professor, Dept of Political Science, July 2018-2020
Assistant Professor, Dept of Political Science, July 2014-2018.

Education
Harvard University
Ph.D., Government, 2014

Stanford University
M.S. Statistics, June 2009

Stanford University
B.A. International Relations & Economics, June 2009

Books
- Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. Text as Data. Princeton
University Press. (In Press)
- Roberts, Margaret E. Censored: Distraction and Diversion Inside China’s Great Firewall.
Princeton University Press. (2018)

Selected Publications
- Eddie Yang and Margaret E. Roberts. 2021. “Censorship of Online Encyclopedias: Implications
for NLP Models.” In Conference on Fairness, Accountability, and Transparency (FAccT ’21)
- Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. “Machine Learning for
Social Science: An Agnostic Approach.” Annual Review of Political Science. (In Press)
- Roberts, Margaret E. “Resilience to online censorship.” Annual Review of Political Science 23
(2020): 401-419.
- Iyad Rahwan, Manuel Cebrian, Nick Obradovich, Josh Bongard, Jean-François Bonnefon, Cynthia
Breazeal, Jacob W. Crandall, Nicholas A. Christakis, Iain D. Couzin, Matthew O. Jackson,
Nicholas R. Jennings, Ece Kamar, Isabel M. Kloumann, Hugo Larochelle, David Lazer, Richard
McElreath, Alan Mislove, David C. Parkes, Alex ‘Sandy’ Pentland, Margaret E. Roberts, Azim
Shariff, Joshua B. Tenenbaum, Michael Wellman. “Machine behaviour.” Nature. 2019 Apr;
568(7753):477
- Hobbs, William R., and Margaret E. Roberts. “How sudden censorship can increase access to
information.” American Political Science Review 112.3 (2018): 621-636.
- King, Gary, Jennifer Pan, and Margaret E. Roberts. 2017. “How the Chinese Government
Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument.” American
Political Science Review.
- King, Gary, Patrick Lam, and Margaret E. Roberts. 2017. “Computer-Assisted Keyword and
Document Set Discovery from Unstructured Text.” American Journal of Political Science.
- Roberts, Margaret E, Brandon M. Stewart, and Edo M. Airoldi. 2016. “A model of text for
experimentation in the social sciences.” Journal of the American Statistical Association, 111
(515): 988-1003.
- King, Gary, Jennifer Pan, and Margaret E. Roberts. 2014. “Reverse Engineering Chinese
Censorship: Randomized Experimentation and Participant Observation.” Science, 345 (6199):
1-10.
- Roberts, Margaret E, Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis,
Shana Gadarian, Bethany Albertson, David Rand. 2014. “Structural Topic Models for
Open-Ended Survey Responses.” American Journal of Political Science, 58 (4): 1064-1082.
- King, Gary, Jennifer Pan, and Margaret E. Roberts. 2013. “How Censorship in China Allows
Government Criticism but Silences Collective Expression.” American Political Science Review,
107(2), 326-343.

Selected Honors and Awards


• 2019 Best Book in the Information Technology and Politics Section, APSA
• 2019 Best Book Award in the Human Rights Section, APSA
• 2019 Goldsmith Book Prize
• 2019 International Studies Association Human Rights Section’s Best Book Runner Up
• Foreign Affairs Best Books of 2018
• 2018 Society for Political Methodology Statistical Software Award for stm: An R package
for Structural Topic Models
• 2017 National Science Foundation RIDIR Award, “Collaborative Research: Analytical
tools for text based social data integration” (with Amarnath Gupta and Brandon Stewart)
$1,516,099
• 2016 UC San Diego, Hellman Fellowship, “Estimating the Impact of the Great Firewall
on Political Opinion in China” $46,447
• 2016 UC San Diego, Academic Senate Research Award, “The Credibility of Media Information
to Citizens” (with Seth Hill) $14,032
• 2015 UC San Diego, Integrated Digital Infrastructure Grant, “A High-Performance Storage,
Management and Computation Platform for Heterogeneous, Multilingual Text Data
to Enable Social Science Research” (with James Fowler and Amarnath Gupta) $94,000
• 2015 National Science Foundation RAPID Grant, “RAPID: Measuring the Intent of Chinese
Leaders through Censorship Behavior” (with Gary King and Jennifer Pan) $200,000
• 2015 APSA Division of Political Communication’s Outstanding Dissertation Award
• 2015 Richard J. Herrnstein Prize for Dissertation “Fear, Friction, and Flooding: Methods
of Online Information Control.”
• 2014 Edward M. Chase Prize for the best dissertation on a subject relating to the promotion
of world peace

Selected Service
• Editorial Board Member: Political Analysis, Asian Survey, American Journal of Political
Science, American Political Science Review, China Quarterly, Political Behavior, World
Politics
• American Political Science Association Methodology Section Chair (2019)
• Text as Data Association Board (2019-)
• Society for Political Methodology Diversity Committee (2014-present)
David Danks

Education
Ph.D., Philosophy, University of California, San Diego, 2001
M.A., Philosophy, University of California, San Diego, 1999
A.B., Philosophy, Princeton University, 1996

Academic experience
(As of July 2021: University of California, San Diego, Professor of Data Science & Philosophy)
Carnegie Mellon University, L.L. Thurstone Professor of Philosophy & Psychology, 2016 –
CMU, Head, Department of Philosophy, 2014 -
CMU, Professor of Philosophy & Psychology, 2014 - 2016
CMU, Associate Professor of Philosophy & Psychology, 2008 - 2014
CMU, Assistant Professor of Philosophy, 2003 - 2008
Florida Institute for Human & Machine Cognition, Research Scientist, 2001 – 2012 (full-time for
2001-2003; part-time from 2003-2012)
Colorado College, Visiting Assistant Professor of Philosophy, 2002 - 2003

Membership in professional organizations


American Philosophical Association
Cognitive Science Society
Philosophy of Science Association

Honors and awards


Andrew Carnegie Fellowship (2017)
James S. McDonnell Foundation Scholar (2008)

Service activities (selected)


National Academies Committee on Responsible Computing Research, Member
Technology Transformation Services (GSA) AI Portfolio, Expert advisor
National Security Commission on Artificial Intelligence, SGE for Ethics Line of Effort
Pittsburgh Task Force on Public Algorithms, Member
Grefenstette Center for Ethics in Science, Tech., & Law (Duquesne Univ.), Advisory Board
Partnership to Advance Responsible Technology, Founding Board member
Salesforce Ethical & Responsible Use advisory council, External member
IBM Watson AI XPRIZE competition, Lead/Presiding judge
CMU Center for Informed Democracy and Social Cybersecurity (IDeaS), Founding co-Director
CMU Block Center for Technology & Society, Chief Ethicist
CMU President’s Task Force on Campus Climate, Co-chair

Most important recent publications & presentations


Lütge, C., Poszler, F., Acosta, A. J., Danks, D., Gottehrer, G., Mihet-Popa, L., & Naseer, A.
(2021). AI4People: Ethical guidelines for the automotive sector – fundamental requirements
and practical recommendations. International Journal of Technoethics, 12(1), 101-125.
Zhou, Y., & Danks, D. (2020). Different “intelligibility” for different folks. In A. Markham, J.
Powles, T. Walsh, & A. L. Washington (Eds.), Proceedings of the 2020 AAAI/ACM
Conference on Artificial Intelligence, Ethics, & Society (pp. 194-200). New York: ACM.
Danks, D. (2019). The value of trustworthy AI. In Proceedings of the 2019 AAAI/ACM
Conference on Artificial Intelligence, Ethics, and Society.
Danks, D., & Plis, S. M. (2019). Amalgamating evidence of dynamics. Synthese, 196(8), 3213-3230.
Geary, T., & Danks, D. (2019). Balancing the benefits of autonomous vehicles. In Proceedings
of the 2019 AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society.
Danks, D. (2018). Privileged (default) causal cognition: A mathematical analysis. Frontiers in
Psychology, 9: 498. doi:10.3389/fpsyg.2018.00498
Danks, D., & Ippoliti, E. (Eds.) (2018). Building theories: Heuristics and hypotheses in science.
Berlin: Springer-Verlag.
LaRosa, E., & Danks, D. (2018). Impacts on trust of healthcare AI. In Proceedings of the 2018
AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society.
doi:10.1145/3278721.3278771
London, A. J., & Danks, D. (2018). Regulating autonomous vehicles: A policy proposal. In
Proceedings of the 2018 AAAI/ACM Conference on Artificial Intelligence, Ethics, and
Society. doi:10.1145/3278721.3278763
Roff, H. M., & Danks, D. (2018). “Trust but Verify”: The difficulty of trusting autonomous
weapons systems. Journal of Military Ethics, 17, 2-20.
Danks, D., & London, A. J. (2017). Algorithmic bias in autonomous systems. In C. Sierra (Ed.),
Proceedings of the 26th International Joint Conference on Artificial Intelligence (pp. 4691-4697).
Danks, D., & London, A. J. (2017). Regulating autonomous systems: Beyond standards.
Intelligent Systems, 32(1), 88-91.
Hyttinen, A., Plis, S., Järvisalo, M., Eberhardt, F., & Danks, D. (2017). A constraint optimization
approach to causal discovery from subsampled time series data. International Journal of
Approximate Reasoning, 90, 208-225.
Wellen, S., & Danks, D. (2016). Adaptively rational learning. Minds & Machines, 26(1), 87-102.
DOI: 10.1007/s11023-015-9370-1
