Sol03 en

Assigned: 2021/12/07

Big Data Analysis SOL#3 Due: 2021/12/21


Department / Class    Student ID    Name

Part I: Glossary (30 pts., each 2 pts.)

1. Symbol
 Symbolic/expressive means of specific meaning, mainly for communication, expression or
recording
─ Sound, light, color, graphics (patterns, icons), shapes (geometries, arrangements, abstract
shapes/symbols), words (characters, letters), waveforms (sound, light, electric), movements (body
language, sign language), etc. can all be symbols.

2. Data
 Facts and observations obtained from relevant situations can be regarded as knowledge of people,
events and things
─ Narrowly defined as a numerical feature obtained through observation, usually consisting of a series of
symbols, which can be a pile of organized or unorganized text, numbers, files, etc., for sharing,
communicating and storing
─ After measurement, collection, reporting and analysis, it can be used for different applications, and can
also be visualized with charts, graphs or other analytical tools to assist in the communication of
information, personal interpretation and analysis.

3. Information
 The result obtained by applying the data to a meaningful situation can be seen as an abstract concept
of understanding the data
─ After storage, analysis, and interpretation, the corresponding meaning can be obtained, but the result
may vary from person to person. Even the same person may interpret the same data differently in
different time and space contexts, not because of bias or disagreement in the data, but because
of the influence of personal knowledge, perspectives, and various external factors.
─ The way or medium in which the information is recorded also affects the outcome of the information
conveyed

4. Knowledge
 Things that are believed or valued based on the content of the information
 The accumulation of other people's knowledge and experience
─ It is a known state or fact that can be seen as the sum of what is understood, discovered, or learned by
humans
─ Knowledge gained from the experience of others, a common understanding (consensus) of people about
a particular person, thing, or object
5. Intelligence
 The ability to think, analyze, and search for the truth, based on the comprehensive ability formed by
the neural organs (the material basis) to deeply understand various people and situations
 The accumulation of personal knowledge, experience, and experimentation
─ Including perception, knowledge, memory, understanding, association, emotion, logic, discrimination,
calculation, analysis, judgment, culture, middle ground, tolerance, decision, etc.

6. Digitization
 The process of converting information, which may include text, pictures, audio, etc., into a digital
medium
─ Most of the conversion process is carried out using sensors or scanners, mainly for the conversion of
the information itself
─ Photographs taken by digital cameras are a common medium for digitizing information.

7. Digitalization
 The main purpose is to use digital technology or methods to improve the business model or customer
experience.
 Unlike digitization, digitalization emphasizes the "process": it is the process itself that becomes
digital, including the interactions and communication media between people and organizations and
between organizations.
─ In organizations, digitalization often refers to the process of automating parts of the organization's
operations by incorporating digital technology into existing operational processes and business models.
─ Email, social media, APIs that integrate corporate functions, and a variety of software that enables
internal operational processes to be completed on digital channels are common examples of content.

8. Embedded System
 Microcomputer systems designed for specific applications embedded in a controlled target
─ The system must be miniaturized to facilitate integration and embedding into the target
─ In addition to the ability to run programs and perform computation, the required peripheral hardware
modules are usually built into the system chip so that the system can do its work with the minimum
amount of hardware resources.
─ The software programs designed for specific objectives are called firmware.
─ Usually the system is not a stand-alone system, but a subsystem of a larger device
9. Micro Controller Unit (MCU)
 Embedded controllers and microprocessors are collectively referred to as chip systems that can
independently and effectively perform hardware control and data computation.
 A microcontroller is like a computer unit with basic functions concentrated in a single chip, so it is
often called a microcomputer controller or a single-chip controller.
 The basic components of a microcontroller include:
 Central Processing Unit (CPU)
 Memory, including RAM & ROM
 Input/Output, I/O
 Other Modular Units

10. Micro Processor Unit (MPU)


 Basic unit integration of computer system, including CPU (ALU+CU), internal memory, bus and I/O
ports, etc. …

11. Internet
 A vast complex network linked from network to network and providing standardized services that
can be used to handle the interaction and sharing of large amounts of data and even provide virtual
services

12. Cloud Computing


 Cloud computing is a computing method/virtual computing environment that provides a variety of
terminals and other devices with dynamically scalable and virtualized resources on demand through
shared software and hardware resources and information.
 The main points of emphasis are its dynamic configuration and flexibility of use
─ The service characteristics of cloud computing are similar to those of clouds and the water cycle in the
natural world, hence the name
─ The amount of computing, storage, data and application resources pooled on the Internet is increasing
with the expansion of the Internet, which is transforming the Internet from a traditional communication
platform to a ubiquitous and intelligent computing platform
─ Cloud computing relies heavily on the sharing of resources to achieve economies of scale, with service
providers integrating large amounts of resources for multiple users, allowing users to easily request/rent
resources, adjust usage according to demand, and release unneeded resources back to the system
architecture

13. Edge Computing


 Edge computing arose mainly because, in today's cloud applications, waiting until all information
has been uploaded to the cloud before supervising, controlling, and analyzing is sometimes too slow
to meet urgent needs. Computing capability is therefore added to the field terminal/gateway where
data is aggregated, so that it can supervise, analyze, and respond in real time; that is, computation
and processing take place at the edge relative to the data cloud.
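The idea can be illustrated with a minimal sketch, assuming a hypothetical gateway that receives temperature readings. The threshold, the function name `process_at_edge`, and the summary fields are assumptions made for illustration only. The gateway reacts to out-of-range readings locally and would upload only a small summary to the cloud.

```python
from statistics import mean

# Hypothetical alarm threshold in degrees Celsius (an assumption for illustration).
ALARM_THRESHOLD = 80.0


def process_at_edge(readings, threshold=ALARM_THRESHOLD):
    """Analyze readings locally and return (alerts, summary).

    alerts  : readings that need an immediate local response
    summary : the small record that would be uploaded to the cloud
    """
    alerts = [r for r in readings if r > threshold]
    summary = {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alarm": bool(alerts),
    }
    return alerts, summary


alerts, summary = process_at_edge([71.5, 72.0, 85.5, 70.0])
print(alerts)
print(summary)
```

The design choice here is that the time-critical decision (the alarm) never waits for a round trip to the cloud, while the cloud still receives enough data for long-term analysis.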
14. Data Acquisition (DAQ)
 A device that takes samples of signals that measure real-world physical conditions and converts the
resulting samples into numerical values that can be manipulated by a computer, also known as a
monitor meter
 The main components include
 Sensor: converts physical parameters into electrical signals
  Signal conditioning circuit: converts the sensor signal into a form that can be sampled
  Analog-to-digital converter (A/D converter, ADC): converts the conditioned sensor signal into a
numerical value
  Processing core: controls the device's peripherals for measurement processing and
communicates with the computer to send back data
  Communication interface: the interface for data exchange between the MCU and the computer,
commonly USB or UART
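The analog-to-digital step can be sketched as follows, assuming an ideal unipolar ADC. The reference voltage, the resolution, and the function names are illustrative assumptions and do not describe any specific device.

```python
def adc_convert(voltage, v_ref=3.3, bits=10):
    """Quantize an input voltage to the integer code of an ideal unipolar ADC."""
    levels = 2 ** bits
    # Clamp the input to the converter's range [0, v_ref].
    clamped = min(max(voltage, 0.0), v_ref)
    code = int(clamped / v_ref * levels)
    # The highest code is levels - 1, so a full-scale input saturates there.
    return min(code, levels - 1)


def code_to_voltage(code, v_ref=3.3, bits=10):
    """Map an ADC code back to the voltage at the bottom of its quantization step."""
    return code * v_ref / (2 ** bits)


code = adc_convert(1.0, v_ref=4.0, bits=8)
print(code, code_to_voltage(code, v_ref=4.0, bits=8))
```

The difference between the original voltage and the value recovered from the code is the quantization error, which shrinks as the number of bits grows.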

15. Feature Extraction


 Feature extraction is a process that starts from an initial data set and constructs informative and
non-redundant derived values from it.
─ Feature extraction is a dimensionality reduction step where the initial data set is reduced to
more manageable groups (features) for analysis, while maintaining the accuracy and integrity
of the original data set description.
─ For the processing of more complex and informative image data, there are many specialized
feature extraction algorithms, developed in computer graphics and image processing
disciplines.
─ For the analysis of extracted sub-data sets, feature selection is usually performed for the
analysis proposition to exclude irrelevant, redundant or highly correlated features.
─ If the extracted features are qualitative, additional data coding (assigning corresponding values
for processing and analysis) may be required.
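As a minimal sketch of the two ideas above, the first function reduces a window of raw samples to a few derived values, and the second assigns integer codes to qualitative values. The choice of features (mean, standard deviation, peak-to-peak) and the function names are assumptions made for illustration; real applications choose features to suit the analysis.

```python
from statistics import mean, pstdev


def extract_features(samples):
    """Reduce a window of raw samples to a few derived values (features)."""
    return {
        "mean": mean(samples),
        "std": pstdev(samples),  # population standard deviation
        "peak_to_peak": max(samples) - min(samples),
    }


def encode_labels(labels):
    """Give each distinct qualitative value an integer code, in order of first appearance."""
    codes = {}
    for label in labels:
        codes.setdefault(label, len(codes))
    return [codes[label] for label in labels], codes


features = extract_features([2, 4, 4, 4, 5, 5, 7, 9])
encoded, mapping = encode_labels(["red", "green", "red", "blue"])
print(features)
print(encoded, mapping)
```

Note that integer codes impose an artificial order on nominal values, so this simple coding is only appropriate when the analysis method does not interpret the codes as magnitudes.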
Part II: Short-Answer Type (70 pts., each 5 pts.)

1. What’s the difference between “data” and “information”?


 When data is examined in context or analyzed, it becomes information. In other words, when data
is processed, the resulting usable output is called information.
 The biggest difference between data and information is not whether it has been "processed"
or not, but whether it is "meaningful" or not.
 In academic discourse, data are the units that make up information
 In the computer field, data are stored or computed in the form of characters, fields, records,
files, and databases.
 In the field of data analysis, data represent a set of qualitative or quantitative variables about
one or more people and things, while information is processed and useful data

2. Relationships between symbols, data, information, knowledge and intelligence.


 In order to think and communicate, human beings create various symbols and use them to compose
data and store them, or to obtain information through processing for analysis and transmission, thus
generating knowledge and accumulating it, gradually forming intelligence.
 Symbols form and store data
 Information is obtained by processing data for analysis and transmission
 Generating knowledge from information and accumulating it, gradually forming personal intelligence

3. Classification of “data”.
Data
  Unstructured data (can be converted into structured data)
  Structured data
    Qualitative data
      Nominal (categorical) data
      Ordinal data
    Quantitative data
      Interval data
      Ratio data
Data presentation: data visualization / descriptive statistics

4. What are digitization and digitalization?
 Digitization refers to the process of converting information into a digital medium, which can
include text, images, audio, etc. Most conversions are done using sensors or scanners and are
primarily focused on the information itself.
 Digitalization refers to the use of digital technology or methods to improve the business model
or customer experience. Unlike digitization, digitalization emphasizes the "process": it is the
process itself that becomes digital, including the interactions and communication media between
people and organizations and between organizations.

5. Operations of data processing.


 Create: Choose the type of electronic file according to its content
 Convert: Save to different media
 Sort: Sort in order
 Merge: Merge files of the same type and subject into one file
 Distribute: Distribute data to different files according to conditions and specifications
 Search: Find the required data or file by a key value
 Compute & Listing: Compute data or list results according to instructions
 Update: Edit, append, delete, etc. to the data
 Check: Compare the data content against the search or comparison target
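Several of these operations can be sketched on in-memory records, as below. The record fields (`id`, `dept`, `score`) and the sample values are made up for illustration; the same operations apply to files or database tables.

```python
records = [
    {"id": 3, "dept": "EE", "score": 72},
    {"id": 1, "dept": "CS", "score": 90},
]
more_records = [{"id": 2, "dept": "CS", "score": 65}]

# Merge: combine records of the same kind into one collection.
merged = records + more_records

# Sort: order the records by a key.
by_id = sorted(merged, key=lambda r: r["id"])

# Search: find a record by its key value.
index = {r["id"]: r for r in by_id}
found = index.get(2)

# Distribute: split the records into groups according to a condition.
groups = {}
for r in by_id:
    groups.setdefault(r["dept"], []).append(r)

# Compute: derive a result from the data.
average = sum(r["score"] for r in by_id) / len(by_id)

# Update: edit a record in place.
index[2]["score"] = 70

print([r["id"] for r in by_id], sorted(groups), round(average, 2), found)
```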

6. Why digitize (digitization)?


 Easy to save and record
 Data protection by encryption and decryption algorithms
 Easy exchange and transmission
 Reduces data size through compression and decompression algorithms for easy storage and
transmission
 Easy to edit and process, and less likely to damage the original data
 Easy to analyze, process, debug, and remove errors

7. What are embedded systems?


 Microcomputer systems designed to be embedded in a controlled target for a specific application
 The system must be miniaturized to facilitate integration and embedding into the target.
 Unlike general processors, in addition to the ability to run and compute programs, more emphasis
is placed on independence, and the required peripheral hardware modules are usually added to the
system chip to perform the work with minimum hardware resources.
 The software program designed for a specific target is called firmware.
 Usually the system is not a standalone system, but a subsystem of a large device.
8. What is the Internet of Things (IoT)?
 Built on the Internet and various wired or wireless sub-networks, the IoT uploads data from
items/equipment to the network so that the equipment can be monitored and operated remotely.
 It is even possible to connect all the equipment in a factory to a central control server through
the Internet of Things for unified management, and to make production management decisions by
means of machine learning or artificial intelligence to achieve the goal of intelligent automated
factories.
 A simplified conceptual description is "Connecting everything".

9. Please list the three-layer / five-layer architecture of IoT.


Three-Layer Architecture:
 Sensing/device layer: consisting of sensors, actuators, hardware, software, connectivity, and
gateways to form a device that connects and interacts with the network
 Network/Communication Layer: The connection medium (connection method) and data exchange
method (protocol) of the device for data exchange and transmission
 Application layer: the application function of IoT technology, or the so-called IoT platform
Five-layer architecture:
 Application layer: the interactive interface between the user and the device
 Transport layer: data transfer between the layers to ensure data communication and secure
communication
 Network layer: helps devices communicate with routers
 Data link layer: transferring data within the system architecture, finding and correcting errors in the
physical layer
 Physical layer: establish communication channels to enable devices to connect within a specified
environment

10. Six contents of digital transformation (DX).


 Operational Process
 Marketing & Sales
 Supporting Functions
 Research & Development
 Information Technology
 Digital Reinvention

11. Five aspects that need attention in digital transformation (DX).
 Organization & Culture
 People & Capabilities
 Technology and Tools
 Data Ecosystem Management
 Digital Transformation Strategy
12. Please list five different data sources.
 Physical Media Data
 Enterprise/Institutional Data
 Cloud Data / Network Data
 Open Data
 IoT Data

13. Please explain about the collation of data.


 Clear data irregularities
 Identify missing values and anomalies
 Eliminate invalid (harmful/abnormal) data
 Fill in missing parts
 Data adjustment
 Normalization/Standardization of data
 Constructing data features (creation, addition, and deletion)
 Data dimension conversion
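Three of these steps (eliminating invalid data, filling in missing parts, and normalization) can be sketched as follows. The valid range, the mean-fill strategy, and the function name `collate` are assumptions chosen for illustration; other strategies such as median filling or z-score standardization are equally common.

```python
from statistics import mean


def collate(values, low, high):
    """Clean a list of numeric readings in which None marks a missing value.

    1. Treat values outside [low, high] as invalid and mark them missing.
    2. Fill missing entries with the mean of the valid values.
    3. Min-max normalize the result to the range [0, 1].
    """
    marked = [v if v is not None and low <= v <= high else None for v in values]
    valid = [v for v in marked if v is not None]
    if not valid:
        raise ValueError("no valid values to collate")
    fill = mean(valid)
    filled = [fill if v is None else v for v in marked]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]


cleaned = collate([10.0, None, 30.0, 999.0, 20.0], low=0.0, high=100.0)
print(cleaned)
```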

14. Steps of feature engineering.


 Data Cleaning
 Data Integration
 Data Selection
 Data Transformation
 Data Mining
 Pattern Evaluation
 Knowledge Presentation

15. Data storage technology and storage architecture.


Data Storage Technology
 Storage Area Network, SAN
 Network Attached Storage, NAS
 Cloud Storage
Data Storage Architecture
 File Storage
 Block Storage
 Object-based Storage
