01Ld W20 - CST2200 - 04 - Coronavirus
2
Coronavirus – Where’s the data?
• Centers for Disease Control and Prevention (CDC)
• Very structured and controlled
• High degree of accuracy, reliable, current
• Data is ready for research
• https://www.cdc.gov/coronavirus/2019-ncov/summary.html
3
Coronavirus – Where’s the data?
• Social networks, news publications, blogs, etc.
• Twitter feeds - https://twitter.com/hashtag/coronavirus?lang=en
• Unstructured, uncontrolled, less reliable, possibly out of date, not research-ready
• BUT STILL COMPLETELY VALUABLE … why is it valuable? You tell me
4
Coronavirus – One example of interactive data
The Center for Systems Science and Engineering at Johns Hopkins University
• Housed in the Johns Hopkins Department of Civil and Systems Engineering,
CSSE takes a multidisciplinary approach to modeling, understanding, and
optimizing systems of local, national, and global importance. These include
medicine, health care delivery, national infrastructure, information security,
disaster response, and education. In addition to faculty from across multiple
engineering departments, CSSE utilizes the expertise of researchers from
the schools of Medicine, Public Health, Nursing, Arts and Sciences,
Business, and Education; and from JHU’s Applied Physics Laboratory,
already one of the nation’s leading centers of systems engineering.
• https://systems.jhu.edu/research/public-health/ncov/
5
Coronavirus
Let’s discuss issues related to:
• Data scraping / wrangling / ETL (formal or ad-hoc?)
• Do we have time to write a requirements document?
• Data warehousing for reporting and sharing
• “Real-time” or “near real-time” analysis: what's the difference?
• The cost of data-related errors can be measured in actual lives
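To ground the discussion points above, here is a minimal sketch of an ad-hoc extract / transform / load pass in Python. Everything in it is an assumption made for illustration: the column names, the region names, and the counts are invented sample data, not figures from the CDC or Johns Hopkins. The point is only to show where a validation step sits and why rejected rows should be counted rather than silently dropped.

```python
import csv
import io

# Hypothetical sample data: made-up regions and counts, for illustration only.
RAW = """region,date,confirmed
Region A,2020-02-01,10
Region A,2020-02-02,14
Region B,2020-02-01,3
Region B,2020-02-02,
Region C,2020-02-01,-5
"""

def extract(text):
    """Extract: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: keep rows whose count is a non-negative integer; set the rest aside."""
    clean, rejected = [], []
    for row in rows:
        value = (row.get("confirmed") or "").strip()
        if value.isdigit():
            clean.append({"region": row["region"],
                          "date": row["date"],
                          "confirmed": int(value)})
        else:
            rejected.append(row)  # missing or negative counts are never guessed
    return clean, rejected

def load(clean):
    """Load: latest count per region (a stand-in for a warehouse table)."""
    latest = {}
    for row in sorted(clean, key=lambda r: r["date"]):
        latest[row["region"]] = row["confirmed"]
    return latest

rows = extract(RAW)
clean, rejected = transform(rows)
latest = load(clean)
print(latest)
print(len(rejected), "rows rejected")
```

In this sketch a blank count and a negative count are both rejected and reported. Whether to reject, repair, or flag such rows is exactly the kind of decision a requirements document would settle, which is one way to frame the "formal or ad-hoc?" question.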