
National College of Business Administration & Economics

Bahawalpur Campus

Department of Computer Science & IT

Assignment 3

Name of Student: Mashavia Ahmad

Session: 2019-2021

Name of Supervisor: Dr. Ghulam Gilanie

Working Title: Assignment 3

Start Date: 20th Dec 2020


1. Data/text mining:
Data mining is a process used by companies to turn raw data into useful
information. By using software to look for patterns in large batches of data, businesses can learn
more about their customers to develop more effective marketing strategies, increase sales and
decrease costs. For example, banks use data mining to better understand market risks. It is commonly applied to credit ratings and to intelligent anti-fraud systems that analyze card transactions, purchasing patterns, and customer financial data.
Text mining (also referred to as text analytics) is an artificial intelligence (AI) technology that
uses natural language processing (NLP) to transform the free (unstructured) text in documents
and databases into normalized, structured data suitable for analysis or to drive machine learning
(ML) algorithms. Examples include call center transcripts, online reviews, customer surveys, and
other text documents. Text mining and analytics turn these untapped data sources from words to
actions.
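As a rough illustration of the text-mining idea described above, the short Python sketch below turns a few invented customer reviews into structured term counts; the sample data and stop-word list are assumptions for illustration only.

```python
# Minimal text-mining sketch: turn free text (e.g. customer reviews) into
# structured term counts that can feed later analysis. The sample reviews
# and the stop-word list are illustrative assumptions, not real data.
from collections import Counter
import re

reviews = [
    "The card transaction failed twice before it went through.",
    "Great support, my transaction issue was resolved quickly.",
    "Suspicious transaction flagged on my card, support was slow.",
]

STOP_WORDS = {"the", "my", "was", "it", "on", "before", "went", "through"}

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

term_counts = Counter(t for review in reviews for t in tokenize(review))
print(term_counts.most_common(5))   # e.g. [('transaction', 3), ('card', 2), ...]
```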

Problem Statement:
With increases in the use of warranty, concessionary, and public-private partnerships, as well as
other innovative contracting processes, changes in the use of pavement management data can be
expected. For instance, historical pavement performance data and forecasted conditions may be
used to set acceptable condition levels and to determine whether contractual performance
requirements have been satisfied. As a result, a higher level of reliability is required of the data
than is needed for traditional processes, and so data collection processes may need to be
modified.
Tasks: The research will include the following tasks:
1. Identify data needs for managing innovative contracting projects, such as critical data for
measuring performance.
2. Determine the impacts innovative contracting has on pavement management practices, and
develop recommendations for accommodating these impacts (i.e., selecting applicable
performance measures).
3. Identify means for collecting data to support performance measures.
4. Develop guidelines for ensuring pavement management needs are satisfied by innovative
contracted projects.
Final Product:
The final product of the research is a set of guidelines for ensuring pavement management needs
are satisfied by innovative contracting practices.
Objective:
There are three specific objectives for the research. First, the research will identify the various
impacts innovative contracting has on pavement management systems. The second objective is
to determine how to account for the impacts innovative contracting has on pavement
management systems; for example, developing performance metrics and applicable data to
measure said
impacts. The final research objective is to develop guidelines for ensuring pavement
management needs are satisfied by innovative contracting practices.

2. Speech/voice recognition/classification:
Speech recognition, or speech-to-text, is the ability for a machine or program to identify words
spoken aloud and convert them into readable text. Rudimentary speech recognition software has
a limited vocabulary of words and phrases, and it may only identify these if they are spoken very
clearly. A speech recognition system basically translates spoken language into text. There are various real-life examples of speech recognition systems; for example, Apple's Siri recognizes speech and transcribes it into text. One measure that may improve a computer's voice-recognition accuracy is to eliminate background noise and echoes, for instance by installing carpeting, tapestries, or soundproofing material to reduce sounds that might interfere with the computer's ability to understand you. The algorithms used in this form of technology include PLP features, Viterbi search, deep neural networks, discriminative training, the WFST framework, and so on; Google's recent publications on speech track ongoing advances in this area. Speech classification, by contrast, is a means for automatic classification of audio signals: using these techniques, incoming speech signals can be classified, sorted, and prioritized.
Voice recognition is a computer software program or hardware device with the ability to decode
the human voice. Voice recognition is commonly used to operate a device, perform commands,
or write without having to use a keyboard or mouse, or press any buttons. Speech recognition
technologies such as Alexa, Cortana, Google Assistant and Siri are changing the way people
interact with their devices, homes, cars, and jobs. The technology allows us to talk to a computer
or device that interprets what we're saying in order to respond to our question or command.
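A hedged sketch using the third-party SpeechRecognition package (not any specific product named above); "sample.wav" is an assumed recording.

```python
# Speech-to-text sketch using the SpeechRecognition package
# (pip install SpeechRecognition). The file name is an assumption.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    recognizer.adjust_for_ambient_noise(source)  # reduce the impact of background noise
    audio = recognizer.record(source)            # read the whole file into memory

try:
    # Uses Google's free web API under the hood; needs an internet connection.
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Speech was unintelligible.")
except sr.RequestError as err:
    print(f"Recognition service unavailable: {err}")
```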

3. Natural language processing:


Natural language processing is a subfield of linguistics, computer science, and artificial
intelligence concerned with the interactions between computers and human language, in
particular how to program computers to process and analyze large amounts of natural language
data. Natural Language Processing, or NLP for short, is broadly defined as the automatic
manipulation of natural language, like speech and text, by software. The study of natural
language processing has been around for more than 50 years and grew out of the field of
linguistics with the rise of computers. It involves the reading and understanding of spoken or
written language through the medium of a computer. This includes, for example, the
automatic translation of one language into another, but also spoken word recognition, or the
automatic answering of questions. For example, we can use NLP to create systems like speech
recognition, document summarization, machine translation, spam detection, named entity
recognition, question answering, autocomplete, predictive typing and so on. Python is the leading
coding language for NLP because of its simple syntax, structure, and rich text processing tools.
Natural language processing helps computers communicate with humans in their own language
and scales other language-related tasks. For example, NLP makes it possible for computers to
read text, hear speech, interpret it, measure sentiment and determine which parts are important.
The goal of natural language processing is to specify a language comprehension and production
theory to such
a level of detail that a person is able to write a computer program which can understand and
produce natural language. Natural Language Processing (NLP) is a branch of Artificial
Intelligence (AI) that studies how machines understand human language. Its goal is to build
systems that can make sense of text and perform tasks like translation, grammar checking, or
topic classification.
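A minimal NLP sketch with spaCy, assuming the package and its small English model are installed; it shows tokenization and named-entity recognition, two of the tasks listed above.

```python
# Minimal NLP sketch with spaCy (assumes: pip install spacy and
# python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple's Siri was demonstrated in California in 2011.")

print([token.text for token in doc])   # tokens
for ent in doc.ents:                   # named entities with labels
    print(ent.text, ent.label_)        # e.g. California GPE, 2011 DATE
```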

Problem Statement:
The problem of natural language understanding (NLU) is central as it is a prerequisite for many
tasks such as natural language generation (NLG). The consensus was that none of our current
models exhibit 'real' understanding of natural language.

Innate biases vs. learning from scratch: A key question is what biases and structure should we
build explicitly into our models to get closer to NLU. Similar ideas were discussed at the
Generalization workshop at NAACL 2018, which Ana Marasovic reviewed for The Gradient and
I reviewed here. Many responses in our survey mentioned that models should incorporate common
sense. In addition, dialogue systems (and chat bots) were mentioned several times. On the other
hand, for reinforcement learning, David Silver argued that you would ultimately want the model
to learn everything by itself, including the algorithm, features, and predictions. Many of our
experts took the opposite view, arguing that you should actually build in some understanding in
your model. What should be learned and what should be hard-wired into the model was also
explored in the debate between Yann LeCun and Christopher Manning in February 2018.

Program synthesis: Omoju argued that incorporating understanding is difficult as long as we do


not understand the mechanisms that actually underlie NLU and how to evaluate them. She argued
that we might want to take ideas from program synthesis and automatically learn programs based
on high-level specifications instead. Ideas like this are related to neural module
networks and neural programmer-interpreters. She also suggested we should look back to
approaches and frameworks that were originally developed in the 80s and 90s, such
as FrameNet, and merge these with statistical approaches. This should help us infer common-sense properties of objects, such as whether a car is a vehicle, has handles, etc. Inferring such
common sense knowledge has also been a focus of recent datasets in NLP.
Embodied learning: Stephan argued that we should use the information in available structured
sources and knowledge bases such as Wikidata. He noted that humans learn language through
experience and interaction, by being embodied in an environment. One could argue that there
exists a single learning algorithm that if used with an agent embedded in a sufficiently rich
environment, with an appropriate reward structure, could learn NLU from the ground up.
However, the compute for such an environment would be tremendous. For comparison, AlphaGo
required a huge infrastructure to solve a well-defined board game. The creation of a general-
purpose algorithm that can continue to learn is related to lifelong learning and to general problem
solvers. While many people think that we are headed in the direction of embodied learning, we
should thus not underestimate the infrastructure and compute that would be required for a full
embodied agent. In light of this, waiting for a full-fledged embodied agent to learn language
seems ill-advised. However, we can take steps that will bring us closer to this extreme, such as
grounded language learning in simulated environments, incorporating interaction, or leveraging
multimodal data.
4. Computer/machine vision:
Computer vision is an interdisciplinary scientific field that deals with
how computers can gain high-level understanding from digital images or videos. From the
perspective of engineering, it seeks to understand and automate tasks that the human visual
system can do. Computer vision, an AI technology that allows computers to understand and label
images, is now used in convenience stores, driverless car testing, daily medical diagnostics, and
in monitoring the health of crops and livestock. From our research, we have seen that computers
are proficient at recognizing images. One of the other reasons why computer vision is
challenging is that when machines see images, they see them as numbers that represent
individual pixels. On
top of that, making the machines do complex visual tasks is even more challenging in terms of
the required computing and data resources. Computer vision is necessary to enable self-driving
cars. Manufacturers such as Tesla, BMW, Volvo, and Audi use multiple cameras, lidar, radar,
and ultrasonic sensors to acquire images from the environment so that their self-driving cars can
detect objects, lane markings, signs and traffic signals to safely drive.
Machine vision (MV) is the technology and methods used to provide imaging-based automatic
inspection and analysis for such applications as automatic inspection, process control, and robot
guidance, usually in industry. Machine vision uses sensors (cameras), processing hardware and
software algorithms to automate complex or mundane visual inspection tasks and precisely guide
handling equipment during product assembly. Applications include Positioning, Identification,
Verification, Measurement, and Flaw Detection. Vision systems are capable of measuring parts,
verifying parts are in the correct position, and recognizing the shape of parts. Also, vision
systems can measure and sort parts at high speeds. Computer software processes images captured
during the process being assessed in order to capture data.
A machine vision system uses a camera to view an image, computer vision algorithms then
process and interpret the image, before instructing other components in the system to act upon
that data. Computer vision can be used alone, without needing to be part of a larger machine
system.
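A hedged OpenCV sketch of a basic machine-vision inspection step, assuming opencv-python is installed and "part.jpg" is an image of a manufactured part; the thresholds are illustrative and would be tuned for a real task.

```python
# Simple inspection sketch: find the part's outline and report its area.
import cv2

image = cv2.imread("part.jpg", cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("part.jpg not found")

blurred = cv2.GaussianBlur(image, (5, 5), 0)   # suppress sensor noise
edges = cv2.Canny(blurred, 50, 150)            # detect part outlines
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# A crude pass/fail check: flag parts whose largest contour area is unexpected.
largest = max((cv2.contourArea(c) for c in contours), default=0)
print("largest contour area:", largest)
```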

5. Machine learning:
Machine learning is an application of artificial intelligence (AI) that
provides systems the ability to automatically learn and improve from experience without being
explicitly programmed. Machine learning focuses on the development of computer programs that
can access data and use it to learn for themselves. However, machine learning remains a relatively 'hard' problem. There is no doubt that the science of advancing machine learning algorithms through research is difficult, and machine learning also remains hard when implementing existing algorithms and models to work well for a new application. Each of the respective approaches, however,
can be broken down into two general subtypes – Supervised and Unsupervised Learning.
Supervised Learning refers to the subset of Machine Learning where you generate models to
predict an output variable based on historical examples of that output variable. The goals of
artificial intelligence include learning, reasoning, and perception. AI is being used across
different industries including finance and healthcare. Weak AI tends to be simple and single-task
oriented,
while strong AI carries out tasks that are more complex and human-like. Examples include medical diagnosis, image processing, prediction, classification, learning associations, regression, and so on. The
intelligent systems built on machine learning algorithms have the capability to learn from past
experience or historical data.
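A minimal supervised-learning sketch with scikit-learn, using the built-in iris dataset purely as a stand-in for "historical data".

```python
# Train a classifier on labelled examples, then evaluate it on unseen data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)                      # learn from past (training) examples
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```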

Problem Statement:
Movie dataset analysis: the challenge is aimed at making use of machine learning and artificial intelligence to interpret a movie dataset. The dataset made available to participants covers the scripts of the movies, trailers of the movies, Wikipedia data about the movies, and images from the
movies. In this project, we aim to impart the ability to get rid of biases in a machine or an AI
system. Specifically, we will aim to go beyond information retrieval to do reasoning over the
multimodal dataset and develop algorithms to remove the bias. The dataset is available at:
https://github.com/BollywoodData/Bollywood-Data. For ease of use, we have made available pre-processed versions of these datasets. We have applied the Watson NLP API and Open IE to produce
more enriched text. Similarly, for previews, we have identified emotions in selected frames along
with metadata for the movies. Participants are at liberty to use one or more of these datasets to
interpret, predict, and draw intelligence of any sort from the dataset provided. The following
section outlines a few potential problems that can be taken up.
PROBLEM DESCRIPTION
1. Probable Use case to implement- Enable multimodal question answering on top of this dataset.
• The user should be able to ask questions (in text format), and the output should be text and/or an image.
• The user may also provide an image as an input, and the output should be the plot/points relevant to that image.
Description: Enable a multimodal question answering system and help in capturing information about the dataset.
Stage 1 - Extract the data from the Wikipedia-data folder and extract the plot text for each Bollywood movie. Using this data, one should be able to query the dataset with a natural language query, and the output of the query should be natural language or an image. This image can be extracted from the image data in the corresponding folder on GitHub.
Stage 2 – Take the data from the image-data folder on GitHub as input; the output should be text or natural language corresponding to the image. This text can be taken from the Wikipedia-data folder containing the plot of each movie.
2. Probable Use case to implement- Convert the movie plot into an entity-relationship graph where each path traversal provides a different story arc of the movie.
• Use this graph to summarize the movie plot in 5 lines.
Description: Convert the movie plot into an entity-relationship graph where each path traversal provides a different story arc of the movie.
Stage 1 - Extract the data from the Wikipedia-data folder and extract the plot text for each Bollywood movie. Using this data, one should be able to summarize the movie plot in 5 lines.
Stage 2 – Use this text data to construct entity-relationship graphs. Then, using these entity-relationship graphs, find the various arcs of the movie story.
3. Probable Use case to implement- The dataset has been used to show bias present in Bollywood (http://proceedings.mlr.press/v81/madaan18a/madaan18a.pdf).
• Develop algorithms to remove/reduce such biases.
Description: Design and develop an algorithm to remove gender bias in text.
Stage 1 – Extract the Wikipedia plot data from the Wikipedia-data folder and try to construct a different and unbiased version of a story.
Stage 2 – Use an attention model to pinpoint various parts of the story and then debias those parts. Further, show these nodes in an interactive visualization.
4. Probable Use case to implement- Develop an interesting visualization to interactively explore this dataset.
Description: Develop an interesting visualization to explore this dataset.
Stage 1 – To explore the whole dataset, we look for innovative ideas and applications that allow a user to explore the whole dataset. This also includes providing an interface for the user to navigate to relevant parts of the dataset.
Stage 2 – The application should have the capability to flag the relevant parts of the dataset and show those in the form of an interactive visualization.
About Dataset
The dataset represents a large multimodal dataset derived out of multiple sources. The data
consists of:
Wikipedia Data - Contains text from plots of all movies from 1970 – 2017. The plots are taken
from Wikipedia.
Image Data – Posters of all movies from 1970-2017.
Scripts Data – PDF scripts for 13 movies. The scripts contain complete dialogues.
Preview Data - Previews of around 880 movies from 2010-2017. The dataset is available at https://github.com/BollywoodData/Bollywood-Data. For ease of use, we also provide pre-processed versions of these datasets. We have applied the Watson NLP API and Open IE to produce
more enriched text. Similarly, for previews, we have identified emotions in selected frames along
with metadata for the movie. We encourage participants to propose interesting problems and
novel solutions.
EXPECTATION
• The solution should be AI driven.
• Participants should demonstrate, through a system demo, at least some useful application.
• The outcome should include a document explaining the thought process and design approach used to arrive at the solution.
EVALUATION CRITERIA
The evaluation criteria are listed on the hackathon page.
TOOLS & TECHNOLOGY
• IBM Cloud
• IBM Watson
• App development framework for desktop (e.g. Python, Java) and mobile (e.g. Android, iOS)

6. Expert systems:

An expert system is a computer program that uses artificial intelligence (AI) technologies to
simulate the judgment and behavior of a human or an organization that has expert knowledge and
experience in a particular field. In artificial intelligence, an expert system is a computer system
emulating the decision-making ability of a human expert. Expert systems are designed to solve
complex problems by reasoning through bodies of knowledge, represented mainly as if–then
rules rather than through conventional procedural code. An expert system (ES) is a knowledge-
based system that employs knowledge about its application domain and uses an inferencing
(reasoning) procedure to solve problems that would otherwise require human competence or
expertise. An expert system generally consists of four components: a knowledge base, the search
or inference system, a knowledge acquisition system, and the user interface or communication
system. Two excellent examples of expert system applications in a commercial environment
include Apple's SIRI, a dialog system that attempts to replicate the decision-making capabilities
of a human expert, and the insurance company Blue Cross's automated insurance claim
processing system, which essentially replaces manual, on-site claims handling.
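A toy forward-chaining sketch can make the rule-based idea concrete; the facts and if-then rules below are invented purely for illustration.

```python
# Toy expert system: knowledge is kept as if-then rules and an inference loop
# keeps firing rules until no new facts appear.
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "refer_to_doctor"),
]

def infer(facts):
    """Apply if-then rules to the fact set until it stops growing."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer({"fever", "cough", "short_of_breath"}))
# includes 'flu_suspected' and 'refer_to_doctor'
```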

7. Automation and Robotics:


Automation is the creation and application of technologies to produce and deliver goods and
services with minimal human intervention. The implementation of automation technologies,
techniques and processes improve the efficiency, reliability, and/or speed of many tasks that
were previously performed by humans. Automation, application of machines to tasks once
performed by human beings or, increasingly, to tasks that would otherwise be impossible. Three
types of automation in production can be distinguished: (1) fixed automation, (2) programmable
automation, and (3) flexible automation. Automated cells typically perform the manufacturing
process with less variability than human workers. This results in greater control and consistency
of product quality. Smaller environmental footprint. By streamlining equipment and processes,
reducing scrap and using less space, automation uses less energy. Automation brings
in necessary agility to testing and helps it to respond faster and more effectively to changes.
Agility requires frequent code deployments, which can also be automated. This frees testers from
mundane, repetitive tasks so that they can focus more on testing. Examples of everyday automation that most people experience on a daily basis include washing machines, dishwashers, refrigerators, bus doors, air conditioning systems in cars, and turning on a complete home theater system with the push of just one button, to name just a few.
Robotics is a branch of engineering that involves the conception, design, manufacture, and
operation of robots. This field overlaps with electronics, computer science, artificial intelligence,
mechatronics, nanotechnology and bioengineering. Robotics is an interdisciplinary research area
at the interface of computer science and engineering. Robotics involves design, construction,
operation, and use of robots. The goal of robotics is to design intelligent machines that can help
and assist humans in their day-to-day lives and keep everyone safe. Robotics, design,
construction, and use of machines (robots) to perform tasks done traditionally by human beings.
Robots are widely used in such industries as automobile manufacture to perform simple
repetitive tasks, and in industries where work must be performed in environments hazardous to
humans. The five major fields of robotics (human-robot interface, mobility, manipulation, programming, and sensors) are each important to robotics development. Robotics
technology influences every aspect of work and home. Robotics has the potential to positively
transform lives and work practices, raise efficiency and safety levels and provide enhanced levels
of service. In these industries robotics already underpins employment. Examples are the robot
dog Aibo, the Roomba vacuum, AI-powered robot assistants, and a growing variety of robotic
toys and kits.

Problem Statement:
Despite the attempts and research work by robotics researchers to emulate human intelligence and appearance, that result has not yet been achieved. Most robots still cannot see well and are not versatile; objects are often not properly recognized. For robotics technology to work effectively and properly, it is important to prioritize and address the inefficiencies associated with it. The wide use of robotics technology will take away many human jobs and may create unemployment in society; because the use of robots for various jobs will reduce the jobs available to people, their introduction should be done systematically. The development of robots will take over many high-end precision jobs and will help in various sectors such as agriculture, the military, health, and so on. This should position robots as helpers in the workplace, with some degree of balance between actual requirements and excess. Society should support and care for developments in robotics technology, as this will be beneficial for people and for the various sectors of an economy. Many tasks that are beyond human ability can be performed with the help of robotics, and robots can be very helpful in military operations. The advancement of robot technology is remarkable, and today robots can be seen in virtually all fields, from transport to health and from recreation to industry. The use of this technology will draw criticism from society for taking away the jobs of ordinary people; to address these issues, robots should be applied to selected tasks and used mostly in areas that humans cannot reach or are not capable of performing in.

Survey
The survey is one of the important forecasting techniques for gathering information on a particular segment of technology or a product. Surveys can be based on a few selected questions that provide valuable information about a product or an area of technology. A survey can be anything from a short paper-and-pencil feedback form to a one-to-one in-depth interview, and it is one of the primary forecasting techniques. Surveys are broadly divided into questionnaires and interviews. They help supply information about the use of a particular technology and people's views on its advancement. They can also surface ideas and other productive uses of a technology in areas where it is not yet applied, and bring to notice various issues arising from or connected to a technology in the external environment. Surveys can be conducted across various sectors and the general public to gather opinions on the technology and on how it can be used more effectively in other areas.
The following are the benefits of surveys as a forecasting technique:
The survey method is relatively inexpensive, especially self-administered surveys.
It is useful in explaining and describing the ideas or views of a large population.
Surveys can be administered in remote areas by mail or telephone.
Many questions can be asked about a given topic to get a better picture of it.
There is flexibility in how the questions are delivered: face-to-face interviews, telephone interviews, group-administered oral surveys, etc.
Standardized questions ensure that similar data can be collected from the group.
High reliability can be obtained.
The standardization of questions makes the measurement of results more precise.
Survey research used in forecasting helps furnish information about the various other areas in which the technology under forecast can be applied.

Use of robotics
Robotics is the art and science of robots, which includes their design, manufacture, application, and practical use. Robots will soon be everywhere, from our homes to our workplaces. Robots are used for handling dangerous materials, assembling products, cutting and polishing, spray-painting, and product inspection. Their use in diverse tasks such as detecting bombs, cleaning sewers, and performing intricate surgery is increasing steadily and will continue to rise in coming years. With the rapidly rising power of artificial intelligence techniques and microprocessors, robots have dramatically raised their potential as flexible automation tools. The new upsurge of robotics applications is demanding advanced intelligence. Robotic technology is being combined with a wide variety of complementary technologies – force sensing (touch), machine vision, speech recognition, and advanced mechanics. The introduction of robots with structured touch and vision has dramatically changed the speed and efficiency of delivery systems and new production. Robots are becoming so accurate that they can be applied where manual operations are no longer a valid option. The manufacture of semiconductors is one example, where the required volumes cannot be achieved with simple mechanization and human labor. Significant benefits are achieved by enabling rapid changeover of products and design evolutions, which cannot be matched with fixed hard tooling.

Analysis
From the forecasting methods, the following analysis is interpreted:
The major hardware committed to the hierarchical control structure of robotics will become standardized in the next twenty years. In the next decade, interactive software will be available to form a parametric structure at all levels to meet specific task requirements.
Parallel architectures will create a new paradigm that makes such software developments feasible.
Sensors will be implemented at the microchip level with on-board integration of amplification, analog-to-digital conversion, data reduction, linearization, and so on. The goal is to make sensors inexpensive and lightweight, like computer chips, by 2020.
Between 2015 and 2020, robots would be seen in every South Korean household and in many European households.
Aggressive image analysis technology will be required if significant progress in the next decade is to occur.
The target is to produce intelligent robots that can make decisions, sense the environment, and learn, and that will be used in 30% of households and organizations by 2022.
By the year 2030, robots would be capable of performing most manual jobs at the human level.
By the year 2015, one third of the US military fighting force will be composed of robots.
In the year 2035, the first completely independent robot soldiers will be in operation.
By the years 2013–2014, agricultural robots will be developed.
Medical robots will be performing minimally invasive surgery by 2017.
In the years 2017–2022, household robots will be in full use.
By the year 2021, nanorobots will be introduced.
It is assumed that there is the possibility of developing a first-generation universal force-reflecting manual controller within the next five years. The desired attributes of the manual controller are: lightweight, compact, portable, adaptable, minimal mass, and transparency of the force-feedback signal.
Conclusions
Robotics develops man-made mechanical devices that can move automatically or with the help of remote controls. Robotics technology is already in use in various sectors such as industry, transportation, and healthcare. The goal of robotic technology is to broaden the use and the effectiveness of robots in various fields. Robots are developed to perform multifarious activities for the welfare of human beings in the most integrated and planned manner, enhancing productivity and quality. Despite the shortcomings and problems faced, robotics technology is set to usher in a new era in the next decade, making tasks easier and smoother.

8. Internet of everything:
The Internet of things describes the network of physical objects—“things”—that are embedded
with sensors, software, and other technologies for the purpose of connecting and exchanging data
with other devices and systems over the Internet. Cisco defines the Internet of Everything (IoE)
as the networked connection of people, process, data, and things. The benefit of IoE is derived
from the compound impact of connecting people, process, data, and things, and the value this
increased connectedness creates as “everything” comes online. The Internet of Everything has
four pillars namely people, data, things and processes. In simple terms: IoE is the intelligent
connection of people, process, data and things. The Internet of Everything (IoE) describes a
world where billions of objects have sensors to detect, measure, and assess their status, all
connected over public or private networks using standard and proprietary protocols. IoT devices
can be used to monitor and control the mechanical, electrical and electronic systems used in
various types of buildings (e.g., public and private, industrial, institutions, or residential) in home
automation and building automation systems. Examples include connected appliances, smart home security systems, autonomous farming equipment, wearable health monitors, smart factory equipment, wireless inventory trackers, ultra-high-speed wireless internet, and biometric cybersecurity scanners.
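A hedged sketch of the device side of such a system: a simulated sensor producing JSON readings that could be posted to an ingestion endpoint. The URL is a placeholder assumption; real deployments typically use MQTT or a vendor cloud SDK rather than plain HTTP.

```python
# Simulated IoT temperature sensor; publish() shows how a reading could be sent.
import json
import random
import time
import urllib.request

ENDPOINT = "http://example.com/iot/readings"   # hypothetical ingestion endpoint

def read_sensor():
    """Stand-in for reading a physical temperature sensor."""
    return round(20 + random.uniform(-2.0, 2.0), 2)

def publish(reading):
    payload = json.dumps({"device": "thermo-01", "celsius": reading, "ts": time.time()})
    req = urllib.request.Request(ENDPOINT, data=payload.encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status

for _ in range(3):
    print(read_sensor())   # a real device would forward each reading via publish()
```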
9. Semantic web:
The Semantic Web is an extension of the World Wide Web through standards set by the World
Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-
readable. The term “Semantic Web” refers to W3C's vision of the Web of linked data. Semantic
Web technologies enable people to create data stores on the Web, build vocabularies, and write
rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL,
OWL, and SKOS. The Semantic Web provides a common framework that allows data to be
shared and reused across application, enterprise, and community boundaries. It is a collaborative
effort led by W3C with participation from a large number of researchers and industrial partners.
Semantics (from Ancient Greek: σημαντικός sēmantikós, "significant") is the study of meaning,
reference, or truth. The term can be used to refer to subfields of several distinct disciplines
including linguistics, philosophy, and computer science. Many companies have adopted semantic web technologies for commercial use; examples include Best Buy, the BBC World Cup site, Google, Facebook, and Flipboard. Google, Microsoft, Yahoo, and Yandex have agreed on Schema.org, a vocabulary for associating meaning with data on the web.
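A minimal sketch with the rdflib library, assuming it is installed: a tiny RDF graph is built and queried with SPARQL, the invented triples standing in for linked data.

```python
# Build a small RDF graph and run a SPARQL query over it (pip install rdflib).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, FOAF

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, RDF.type, FOAF.Person))
g.add((EX.alice, FOAF.name, Literal("Alice")))
g.add((EX.alice, FOAF.knows, EX.bob))

results = g.query("""
    SELECT ?name WHERE {
        ?person a foaf:Person ;
                foaf:name ?name .
    }
""", initNs={"foaf": FOAF})

for row in results:
    print(row[0])    # -> Alice
```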
10. Information retrieval:
Information retrieval is the activity of obtaining information system resources that are relevant to
an information need from a collection of those resources. Searches can be based on full-text or
other content-based indexing. Information retrieval (IR) is finding material (usually documents)
of an unstructured nature (usually text) that satisfies an information need from within large
collections (usually stored on computers). An information retrieval process begins when a user
enters a query into the system. Queries are formal statements of information needs, for example
search strings in web search engines. In information retrieval a query does not uniquely identify
a single object in the collection. Information retrieval can provide organizations with immediate
value--while it's important to try to figure out ways to capture tacit knowledge, information
retrieval provides a means to get at information that already exists in electronic formats.
Retrieval is the act of getting something back, or of accessing stored data and files in a computer.
An example of retrieval is when you lose your keys down an elevator shaft and you get your
maintenance man to help you get them back.

Information retrieval tools aid the library user to locate, retrieve and use the needed information
in various formats. These information retrieval tools are bibliography, index and abstract, shelve
lists, Online Public Access Catalogue (OPAC) and library card catalogue. For an internet search
engine, data retrieval is a combination of the user-agent (crawler), the database, and how it's
maintained, and the search algorithm. The user then views and interacts with the query
interface. A simple model of an information retrieval system provides a framework for
subsequent discussion of artificial intelligence concepts and their applicability in information
retrieval. Concepts surveyed include pattern recognition, representation, problem solving and
planning, heuristics, and learning.

Problem Statement:

Information Retrieval (IR) is one of the fundamental areas of information science. IR allows users to search for and find relevant documents in a collection (also referred to as a library). Users present their information needs by providing some keywords, referred to as a query. The IR system searches the document collection for relevant documents based on the user's query. These relevant documents are ranked based on their degrees of relevance. There are many algorithms to calculate the degree of relevance; a simple and popular one is based on the Term Frequency, or TF. The TF value of a document for a specific query is calculated as the frequency of the query term(s) in the document divided by the total number of terms in that document. For example, consider a simple document containing only one line of text: "I am taking a Python class. There are 35 students in the Python class". For this document, there are 14 terms in total (including the number 35). If the query is "students", the TF will be 1/14 = 0.0714; if the query is "Python", the TF will be 2/14 = 0.1429. For a given query, the IR system calculates the TF for every document in the collection and ranks the documents from high to low. The ranked list is presented to users as the search result. If the TF for a document is zero, the document is considered not relevant to the query and will not be presented. If the TF of every document in the collection is zero, it means no relevant documents were found in the collection. In this project you are requested to develop an IR system that ranks documents based on their degrees of relevance, calculated using the TF algorithm mentioned above.
Constraints:
• The collection of documents will be provided.
• The program shall be written using the Python programming language.
• A class (or function) shall be implemented and utilized in your program.
• An error exception block shall be implemented in your program to capture errors.
• The user's input query contains one term.
• With the user's input query, your program shall be able to provide a ranked list of relevant documents and their associated TF values.
• All past queries and the corresponding result lists need to be recorded in a query-result log file (optional: only unique queries and their result lists are recorded).
• Record your name, class number, and date of creation in each of your Python programs.
In addition, you should:
1. Meet all the basic requirements listed in the constraints.
2. Provide a Readme document (50 points) that: a. outlines the structure of your program; b. briefly specifies the purpose of each file in your final project folder, including this readme.txt file; c. briefly specifies how to use your program (including some screenshots); d. briefly specifies what can be improved in the future.
3. Improve the quality of your code: a. Readability: (i) use comments or docstrings to increase readability; (ii) use whitespace to separate blocks of code. b. Reusability: (i) use functions to increase reusability; (ii) use classes to increase reusability; (iii) store your functions and classes in files, and import them into your main program to access these classes or functions. c. Stability: (i) tolerate input errors from the user; (ii) tolerate and respond appropriately to different types of errors.
4. Paste the captured screens here demonstrating that your program works.
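A minimal sketch of the TF ranking described above; the two-document collection is a stand-in for the files provided with the project.

```python
# TF = frequency of the query term in a document / total number of terms in it.
import re

collection = {
    "doc1.txt": "I am taking a Python class. There are 35 students in the Python class",
    "doc2.txt": "Information retrieval ranks documents for a user query",
}

def term_frequency(document, query):
    terms = re.findall(r"\w+", document.lower())
    return terms.count(query.lower()) / len(terms) if terms else 0.0

def rank(collection, query):
    scores = {name: term_frequency(text, query) for name, text in collection.items()}
    # Keep only relevant documents (TF > 0), ranked from high to low.
    return sorted(((n, s) for n, s in scores.items() if s > 0),
                  key=lambda item: item[1], reverse=True)

print(rank(collection, "Python"))   # [('doc1.txt', 0.1428...)]
```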

11. Wireless sensor networks:


Wireless networks are computer networks that are not connected by cables of any kind. The use
of a wireless network enables enterprises to avoid the costly process of introducing cables into
buildings or as a connection between different equipment locations. There are basically three
different types of wireless networks – WAN, LAN and PAN: Wireless Wide Area
Networks (WWAN): WWANs are created through the use of mobile phone signals typically
provided and maintained by specific mobile phone (cellular) service providers. A wireless
network allows devices to stay connected to the network but roam untethered to any wires.
Access points amplify Wi-Fi signals, so a device can be far from a router but still be
connected to the network. In networking terminology, wireless is an adjective that describes any
network or device that does not need a wired connection to transmit information or perform
tasks. Instead of physical wires (copper or optical fiber), wireless networks and devices use light
waves or radio frequencies to function. Wireless networks operate using radio frequency (RF)
technology, a frequency within the electromagnetic spectrum associated with radio wave
propagation. When an RF current is supplied to an antenna, an electromagnetic field is created
that is then able to propagate through space. The difference between WiFi and wired connectivity is in the method of connection: WiFi is a term that refers to a short-range wireless connection to a wireline broadband connection. A router, modem, or switch with wireless capability behaves in the same way by connecting a device with WiFi capability wirelessly to a wired broadband connection. A WiFi connection transmits data via wireless signals, while an Ethernet connection transmits data over a cable. An Ethernet connection is generally faster than a WiFi connection and provides greater reliability and security.

Problem Statement:
A search process in an unstructured wireless network generally involves the whole network, which leads to a flooding problem. Existing systems use a flooding algorithm to carry out the search process, but this approach suffers from poor search quality and inefficiency. The flooding algorithm needs to visit every node across the unstructured network to find the desired property, which consumes extra time. Energy-rate allocation and flooding are the main problems in unstructured networks; they lead to high computational cost and extra processing time. To improve the performance of any wireless network environment, it is necessary to impose some access structure on the network. Energy consumption is a key aspect of wireless networks, and a random-walk process will unbalance the energy and the dynamic query-search process. A naive approach based on flooding or random-walk algorithms will lead to flooding problems, and unbalanced energy-rate allocation will in turn lead to network-lifetime problems.
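As a rough illustration of why flooding is costly, the sketch below floods a small hypothetical topology and counts the messages sent; it is a toy model, not a simulation of a real wireless protocol.

```python
# Flooding search: every node forwards the query to all neighbours, so the
# message count grows with the whole network rather than the path to the target.
from collections import deque

# Hypothetical topology: node -> neighbours
network = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D", "E"],
    "D": ["B", "C", "F"], "E": ["C"], "F": ["D"],
}

def flood_search(network, source, target):
    """Breadth-first flooding; returns (found, number of messages sent)."""
    visited, messages = {source}, 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True, messages
        for neighbour in network[node]:
            messages += 1                      # every forwarded copy costs energy
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return False, messages

print(flood_search(network, "A", "F"))   # -> (True, 11) for this toy topology
```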

12. Big data:


Big data is a term that describes the large volume of data – both structured and unstructured –
that inundates a business on a day-to-day basis. But it's not the amount of data that's important.
It's what organizations do with the data that matters. Big data is a field that treats ways to
analyze, systematically extract information from, or otherwise deal with data sets that are too
large or complex to be dealt with by traditional data-processing application software. Big Data
helps the organizations to create new growth opportunities and entirely new categories of
companies that can combine and analyze industry data. These companies have ample information
about the products and services, buyers and suppliers, consumer preferences that can be captured
and analyzed. Big data refers to massive complex structured and unstructured data sets that are
rapidly generated and transmitted from a wide variety of sources. These attributes make up the
three Vs of big data: volume (the huge amounts of data being stored), velocity (the speed at which data is generated and transmitted), and variety (the many forms the data takes). Big data is a term used to
describe a collection of data that is huge in size and yet growing exponentially with time. Big
Data analytics examples includes stock exchanges, social media sites, jet engines, etc. Big
Data could be 1) Structured, 2) Unstructured, 3) Semi-structured.

13. Data sciences:


Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and
systems to extract knowledge and insights from structured and unstructured data. Data
science is related to data mining, machine learning and big data. Data science can be defined as a
blend of mathematics, business acumen, tools, algorithms and machine learning techniques, all
of which help us in finding out the hidden insights or patterns from raw data which can be of
major use in the formation of big business decisions. By extrapolating and sharing these insights,
data scientists help organizations to solve vexing problems. Combining computer science,
modeling, statistics, analytics, and math skills—along with sound business sense—data scientists
uncover the answers to major questions that help organizations make objective decisions. Data
science is the field of study that combines domain expertise, programming skills, and knowledge
of mathematics and statistics to extract meaningful insights from data. A Stack Overflow report
said that the growth of Python is even larger than it might appear from tools like Stack Overflow
Trends. Much of the growth has been attributed to web development and data science. Given the
recent developments, learning Python has been said to be essential for a good career track.

Problem Statement:
The problem statement stage is the first and most important step of solving an analytics problem.
It can make or break the entire project. A good data science problem should be relevant, specific,
and unambiguous. It should align with the business strategy. A problem statement should
describe
an undesirable gap between the current-state level of performance and the desired future-state
level of performance. A problem statement should include absolute or relative measures
of the problem that quantify that gap, but should not include possible causes or solutions. What
types of questions can data science answer? Data science and statistics are not magic. A typical workflow proceeds through the following steps:
Step 1: Define the problem. First, it's necessary to accurately define the data problem that is to be
solved.
Step 2: Decide on an approach.
Step 3: Collect data.
Step 4: Analyze data.
Step 5: Interpret results.
Step 6: Conclusion.
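A minimal pandas sketch of steps 3-5 above, assuming a hypothetical sales.csv file with region and revenue columns.

```python
# Collect (load) data, analyse it, and interpret a simple result.
import pandas as pd

df = pd.read_csv("sales.csv")                       # step 3: collect data
summary = df.groupby("region")["revenue"].mean()    # step 4: analyse data
print(summary.sort_values(ascending=False).head())  # step 5: interpret results
```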

14. Information security:


Information Security refers to the processes and methodologies which are designed and
implemented to protect print, electronic, or any other form of confidential, private and
sensitive information or data from unauthorized access, use, misuse, disclosure, destruction,
modification, or disruption. The fundamental principles (tenets) of information security
are confidentiality, integrity, and availability. Every element of an information security program
(and every security control put in place by an entity) should be designed to achieve one or more
of these principles. Information systems security, more commonly referred to as INFOSEC,
refers to the processes and methodologies involved with keeping information confidential,
available, and assuring its integrity. It also refers to: Access controls, which prevent unauthorized
personnel from entering or accessing a system. Information security is designed to protect the
confidentiality, integrity and availability of computer system and physical data from
unauthorized access whether with malicious intent or not. Confidentiality, integrity and
availability are referred to as the CIA triad. Three primary goals of information security are
preventing the loss of availability, the loss of integrity, and the loss of confidentiality for systems
and data. The typical threat types are physical damage, natural events, loss of essential services, disturbance due to radiation, compromise of information, technical failures, unauthorized actions, and compromise of functions.
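As a small illustration of the integrity tenet, a cryptographic digest can reveal whether stored data has been modified; the file name below is only an example.

```python
# A SHA-256 digest lets you detect whether a file changed after the digest was recorded.
import hashlib

def file_digest(path):
    sha = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            sha.update(chunk)
    return sha.hexdigest()

recorded = file_digest("config.ini")        # store this somewhere safe
# ... later ...
if file_digest("config.ini") != recorded:
    print("Integrity violation: the file has changed.")
```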

15. Software/reverse engineering:


Software reverse engineering (SRE) is the practice of analyzing a software system, either in
whole or in part, to extract design and implementation information. Reverse engineering skills
are also used to detect and neutralize viruses and malware, and to protect intellectual property.
The purpose of reverse engineering is to facilitate the maintenance work by improving the
understandability of a system and to produce the necessary documents for a legacy
system. The goals of reverse engineering include coping with complexity and recovering lost information. For
example, when a new machine comes to market, competing manufacturers may buy one machine
and disassemble it to learn how it was built and how it works. A chemical company may use
reverse engineering to defeat a patent on a competitor's manufacturing process. If the object you
want to reverse engineer is patented, you will have some limitations. It cannot be reverse-
engineered for duplication purposes. This means if you want to recreate a part for your machine,
it's illegal if that part has a patent, and you don't have permission from the patent owner.
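As a small, hedged illustration of the idea, Python's standard dis module can recover a readable view of compiled bytecode; this is only an analogy to the disassemblers and decompilers used on native binaries in real software reverse engineering.

```python
# Disassemble a compiled Python function to inspect how it works internally.
import dis

def checksum(data: bytes) -> int:
    return sum(data) % 256

dis.dis(checksum)   # prints the bytecode instructions behind the function
```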
Problem Statement:
Another obsolescence originated problem that can be solved by reverse engineering is the need
to support (maintenance and supply for continuous operation) existing legacy devices that are no
longer supported by their original equipment manufacturer. The problem is particularly critical in
military operations.
Step 1: Capture Data. The first step in reverse engineering a part is to capture the data from the
existing part.
Step 2: Refine the Model. Now that you have the detailed dimensions of the part from the scan
files, they can be refined into a final part.
Step 3: Manufacturing.
16. Cloud computing:
Cloud computing is the on-demand availability of computer system resources, especially data
storage (cloud storage) and computing power, without direct active management by the user. The
term is generally used to describe data centers available to many users over the Internet. There
are three main service models of cloud computing – Infrastructure as a Service (IaaS), Platform
as a Service (PaaS) and Software as a Service (SaaS). Cloud computing is an application-based
software infrastructure that stores data on remote servers, which can be accessed through the internet. The front end enables a user to access data stored in the cloud using an internet browser or cloud computing software. Simply put, cloud computing is the delivery
of computing services—including servers, storage, databases, networking, software, analytics,
and intelligence—over the Internet (“the cloud”) to offer faster innovation, flexible resources,
and economies of scale. Thanks to cloud computing services, users can check their email on any
computer and even store files using services such as Dropbox and Google Drive. For example,
Adobe customers can access applications in its Creative Cloud through an Internet-based
subscription.
Problem Statement:
Even though security, privacy, and trust issues have existed since the evolution of the Internet, the reason why they are so widely discussed these days is the cloud computing scenario. Any
client/small organization/enterprise that processes data in the cloud is subjected to an inherent
level of risk because outsourced services bypass the "physical, logical and personnel controls" of
the user [1]. When storing data on cloud, one might want to make sure if the data is correctly
stored and can be retrieved later. As the amount of data stored by the cloud for a client can be
enormous, it is impractical (and might also be very costly) to retrieve all the data, if one’s
purpose is just to make sure that it is stored correctly. Hence there is a need to provide such
guarantees to a client. Hence, it is very important for both the cloud provider and the user to have
mutual trust such that the cloud provider can be assured that the user is not some malicious
hacker and the user can be assured of data consistency, data storage [2] and the instance he/she is
running is not malicious. Hence, the need to develop trust models/protocols is pressing.
What needs to be done to solve this problem? Both the user and the cloud provider instance must
make sure that whatever requests/response they get is from a trusted source by estimating the
correctness of data that they receive. This can be done by implementing a trust based protocol
that runs between the user and the instance before they start transferring any “real
requests/responses”.
The protocol/model will determine the trust at both the ends by probing each other with
challenges and then decide whether the other end is legitimate to handle requests/provide
responses.
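A minimal sketch of one building block such a protocol could use: an HMAC-based challenge-response, assuming a pre-shared secret provisioned out of band. This is an illustrative fragment, not the full trust model described above.

```python
# Both sides hold a shared secret and prove knowledge of it without sending it.
import hmac
import hashlib
import os

SHARED_SECRET = b"pre-provisioned-secret"        # assumption: exchanged out of band

def make_challenge():
    return os.urandom(16)                        # random nonce from the verifier

def respond(challenge, secret=SHARED_SECRET):
    return hmac.new(secret, challenge, hashlib.sha256).hexdigest()

def verify(challenge, response, secret=SHARED_SECRET):
    expected = hmac.new(secret, challenge, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response)

challenge = make_challenge()                     # user probes the instance
print(verify(challenge, respond(challenge)))     # True only if the secrets match
```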

17. Pattern recognition/classification:


Pattern recognition is the automated recognition of patterns and regularities in data. It has
applications in statistical data analysis, signal processing, image analysis, information retrieval,
bioinformatics, data compression, computer graphics and machine learning. The problem is to
divide a given set W of objects into two or more subsets that differ in certain features or that cluster together. There are two kinds of pattern recognition problems and methods:
• classification without learning;
• classification with learning.
The process of pattern recognition involves matching the information received with the
information already stored in the brain. The development of neural networks in the outer layer
of the brain in humans has allowed for better processing of visual and auditory patterns. The
authors wrote, “Because pattern detection is a core component of human intelligence, people
with superior cognitive abilities may be equipped to efficiently learn and use stereotypes about
social groups.” Certain types may be more likely to act on social stereotypes without being aware
of it. An example of pattern recognition is classification, which attempts to assign each input
value to one of a given set of classes (for example, determine whether a given email is "spam" or
"non- spam"). However, pattern recognition is a more general problem that encompasses other
types of output as well.
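As a small sketch of "classification with learning", the example below trains a naive Bayes spam filter on a few invented messages (scikit-learn assumed installed).

```python
# Learn the spam/ham pattern from a handful of labelled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "cheap loans click here",
            "meeting moved to 3pm", "lunch tomorrow?"]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)                         # learn the pattern from examples
print(model.predict(["free prize click now"]))      # -> ['spam']
```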

18. Signal/image processing:


The field of signal and image processing encompasses the theory and practice of algorithms and
hardware that convert signals produced by artificial or natural means into a form useful for a
specific purpose. Image processing work includes restoration, compression, quality evaluation,
computer vision, and medical imaging. Image processing aims to transform an image into digital
form and performs some process on it, to get an enhanced image or take some utilized
information from it. Digital Signal Processing is used everywhere. DSP is used primarily in
the arenas of audio and speech processing, RADAR, seismology, SONAR, voice recognition, and some financial signals. MATLAB is the most popular software used in the field
of Digital Image Processing.
Signal processing is an electrical engineering subfield that focuses on analysing, modifying, and
synthesizing signals such as sound, images, and scientific measurements. Digital Signal Processors (DSPs) take real-world signals like voice, audio,
video, temperature, pressure, or position that have been digitized and then mathematically
manipulate them. In the real world, analog products detect signals such as sound, light,
temperature or pressure and manipulate them. In the context of digital signal processing (DSP), a
digital signal is a discrete time, quantized amplitude signal. In other words, it is a sampled signal
consisting of samples that take on values from a discrete set (a countable set that can be mapped
one-to-one to a subset of integers).
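A minimal NumPy/SciPy sketch of digital filtering, one of the basic DSP operations mentioned above; the sample rate, frequencies, and filter order are illustrative choices.

```python
# Clean a noisy sine wave with a low-pass Butterworth filter.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000                                   # sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 5 * t)          # 5 Hz component we want to keep
noisy = signal + 0.5 * np.sin(2 * np.pi * 120 * t)   # 120 Hz interference

b, a = butter(N=4, Wn=30, btype="low", fs=fs)        # 30 Hz low-pass filter
cleaned = filtfilt(b, a, noisy)             # zero-phase filtering

print(np.max(np.abs(cleaned - signal)))     # residual error is far below the noise amplitude
```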
Problem Statement:
The problem of active vibration control consists of two parts: estimating the frequency of the disturbance and, given this frequency, designing a controller to reject the disturbance. There are two approaches to designing controllers that achieve the required disturbance-rejection capabilities based on LQG.
In LQG theory, the noise processes that drive the system are broadband, since they are white noise processes. Therefore, some modification is required to address the narrowband disturbance rejection problem. The modification considered here is that of disturbance modelling.

19. Distributed/information systems:


Distributed information systems represent an increasingly important trend to computer
users. Distributed processing is a technique for implementing a single logical set of processing
functions across a number of physical devices, so that each performs some part of the total
processing required. Typical examples of distributed computing and information
systems are systems that automate the operations of commercial enterprises such as banking and
financial transaction processing systems, warehousing systems, and automated factories. Each of
them has its own internal information processing system. There are various types of information
systems, for example: transaction processing systems, decision support systems, knowledge
management systems, learning management systems, database management systems, and office
information systems. An important goal of a distributed system is to make it easy for users (and
applications) to access and share remote resources. Resources can be virtually anything, but
typical examples include peripherals, storage facilities, data, files, services, and networks, to
name just a few. A distributed system, also known as distributed computing, is a system with
multiple components located on different machines that communicate and coordinate actions in
order to appear as a single coherent system to the end-user. Telephone and cellular networks
are also examples of distributed networks. Telephone networks have been around for over a
century and started as an early example of a peer-to-peer network. Cellular
networks are distributed networks with base stations physically distributed in areas called cells.

20. Human computer interaction:


Human–computer interaction studies the design and use of computer technology, focused on the
interfaces between people and computers. Researchers in the field of HCI observe the ways in
which humans interact with computers and design technologies that let humans interact with
computers in novel ways. Human–Computer Interaction (HCI) is the study of the way in
which computer technology influences human work and activities. It can be critical to the many
stakeholders in a design process: customers, users, service providers, and marketers, as well as
designers who want to build upon the system and the ideas it embodies. Design rationale can
contribute to theory development in HCI in three ways. The importance of human-computer interaction is increasing tremendously as technology spreads. The
goal of HCI is to improve the interaction between users and computers by making computers
more user-friendly and receptive to the user's needs. Human-computer interaction (HCI) is a
design field that focuses on the interfaces between people and computers. HCI incorporates
multiple disciplines, such as computer science, psychology, human factors, and ergonomics,
into one field. Learn the principles of HCI to help you create intuitive and usable interfaces. In
recent years, HCI research
based on gaze gestures has emerged and is increasing rapidly. Methodology for Hand
Gesture Recognition for Human-Computer Interaction: in this method, when the user makes a gesture, the system instantly captures an image of the hand gesture with the help of its camera module.
Problem Statement:
Software hegemony, seamlessness of thought, and the building of computer science upon a
foundation of secrecy, present a series of problems to the individual. Advanced computer
systems are one area where a single individual can make a tremendous contribution to the advancement of human knowledge. A system that excludes any individual from exploring it fully may prevent that individual from "thinking outside the box" and therefore from advancing the state of the art.

Some of the new directions in Human-Computer Interaction (HCI) suggest bringing


advanced computing into all aspects of life. Computers everywhere, constantly monitoring
our activities, and responding "intelligently", have the potential to make matters worse,
from the above perspective, because of the possibility of excluding the individual user from
knowledge not only of certain aspects of the computer upon his or her desk, but also of the
principle of operation and the function of everyday things.
