Data Acquisition and Management

Week 01: Lecture 01


Day 01: 5th Jan 2023

Instructor: Fatima Islam Mouri


Outline for Today

• Self Introduction and Interaction

• Introduction:
• Course Outline

• Course Content

• Break & Brainstorming

• Discussion & Q/A


• Contact:

fa.mouri@uwinnipeg.ca
Course Outline
• Topics: 5 Modules

• Assignments: 40%
• Each week: submission on Sunday (11:59 PM)

• Late submission: each day 25% penalty Not allowed; [Contact: Nawal]
• Extension: Depends on situation and proof

• Quizzes: 60%
• Each week: on Thursday (in class)

• Final quiz: 7th Feb 2023


Pre-Requisite
Database and Programming Essentials
• Why?
• Computer Science concepts
• Data and Database
• Programming language and Tools

• What will be the next step?


• Data management
• Data collection or processing
• Tools
Data Management
• Collecting
• Organizing
• Protecting
• Storing

• Why do we need to do these?


Data Management
• Reason:
• Visibility
• Eliminate redundancy
• Reliability
• Improve decision making
• Security
• Data back up
• Data encryption
• Minimize data loss
• Scalability
• Improve organizational efficiency
• Compliance with regulatory requirements
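Two of the reasons above, eliminating redundancy and keeping reliable backups, can be illustrated with a minimal sketch. The customer records, the `deduplicate` and `checksum` helper names, and the choice of e-mail as the matching key are all assumptions made for illustration; they are not taken from the course material.

```python
import hashlib

# Hypothetical customer records; the third row duplicates the first
# except for letter case and trailing whitespace in the e-mail.
customers = [
    {"email": "a@example.com", "name": "A. Khan"},
    {"email": "b@example.com", "name": "B. Roy"},
    {"email": "A@example.com ", "name": "A. Khan"},
]


def deduplicate(rows):
    """Keep the first row seen for each normalized e-mail address."""
    seen = set()
    unique = []
    for row in rows:
        key = row["email"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


unique_customers = deduplicate(customers)

# Simulate a backup by copying the serialized data, then compare
# checksums to confirm the copy matches the original.
original = repr(unique_customers).encode("utf-8")
backup = bytes(original)
backup_ok = checksum(original) == checksum(backup)

print(len(unique_customers), backup_ok)
```

Comparing digests rather than full contents is a common way to check that a backup copy is intact without holding both copies side by side.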
Data Management

Techniques

• Data Preparation
• Data Pipeline
• ETL
• Data Catalogs
• Data Warehouses
• Data Governance
• Data Architecture
• Data Security
• Data Modeling
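ETL (extract, transform, load) is one of the techniques listed above. A minimal sketch of the three steps, using only the Python standard library, might look like the following. The sample sales rows, the `orders` table, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for an exported sales file.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

total_by_customer = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(total_by_customer)
```

In an ELT variant, the raw rows would be loaded first and the cleaning step would run inside the database instead.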
Data Science Hierarchy of Needs

Target of this course


Data Engineer
Data Engineer
• Responsibilities:
• Performance optimization
• Stream processing
• Implementing ELT process
• Make data available for analysis and business operations
• Design, store and manage data
• Develop and maintain data architecture

• Skills:
• Database optimization
• Data techniques
• Knowledge of programming languages
• Data handling
• Understanding of relational and non-relational databases


Data
Data format
• Data Type:
• Symbol
• Character
• Text
• Number
• Audio/Video
• Image

• File formats:
• CSV, DOC, PDF
• HTML, XML, JSON
• PHP, CSS
• JPG, PNG, GIF
• ZIP, RAR, TAR
• AI, PPT, EPS
• MP3, MOV
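To see how file format affects data types, the sketch below writes the same records as JSON and as CSV and reads them back. The records are made-up sample values, not real measurements, and the field names are assumptions for illustration.

```python
import csv
import io
import json

# Made-up sample records (not real measurements).
records = [
    {"id": 1, "city": "Winnipeg", "temp_c": -12.5},
    {"id": 2, "city": "Toronto", "temp_c": -3.0},
]

# JSON round trip: numbers remain numbers.
json_text = json.dumps(records)
from_json = json.loads(json_text)

# CSV round trip: every value is read back as text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "city", "temp_c"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
from_csv = list(csv.DictReader(io.StringIO(csv_text)))

print(type(from_json[0]["temp_c"]).__name__)  # float
print(type(from_csv[0]["temp_c"]).__name__)   # str
```

Because CSV carries no type information, a pipeline reading CSV usually needs an explicit conversion step for numeric columns.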
Data
Data format Example
Data
Data Source
• Internal Data
• Sales Data

• Financial Data

• Human Resources Data

• Website logs

• Used within a company


Data
Data Source

• External Data
• Government Data

• Social Media Data

• Weather forecasts

• Publicly available datasets for ML

https://www.ventivtech.com/blog/whats-the-difference-between-internal-and-external-data
Data
Essential Programming Language

• SQL

• Python

• Java

• Bash Script

• R

• PySpark
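SQL and Python from the list above are often used together. As a minimal sketch, Python's built-in `sqlite3` module can run SQL against an in-memory database. The `employees` table and its rows are hypothetical sample data used only for illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")

# Hypothetical sample rows, inserted with parameter placeholders.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Sales", 52000), ("Ben", "Sales", 48000), ("Cy", "HR", 50000)],
)

# Aggregate with SQL, then use the result as ordinary Python data.
avg_by_dept = conn.execute(
    "SELECT department, AVG(salary) FROM employees "
    "GROUP BY department ORDER BY department"
).fetchall()

for department, avg_salary in avg_by_dept:
    print(department, avg_salary)
```

The `?` placeholders let the database driver handle quoting, which avoids building SQL strings by hand.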
Common Data Repositories

• Database

• Data Warehouse

• Data Lake

• Data Mart

• Big Data stores


https://www.astera.com/type/blog/data-repository/
Book:
https://uwinnipeg.on.worldcat.org/oclc/1202466316
https://uwinnipeg.on.worldcat.org/oclc/1202027443