Assignment 3: Department of Computer Science & IT
Bahawalpur Campus
Session: 2019-2021
Problem Statement:
With increases in the use of warranty, concessionary, and public-private partnerships, as well as
other innovative contracting processes, changes in the use of pavement management data can be
expected. For instance, historical pavement performance data and forecasted conditions may be
used to set acceptable condition levels and to determine whether contractual performance
requirements have been satisfied. As a result, a higher level of reliability is required of the data
than is needed for traditional processes, and so data collection processes may need to be
modified.
Tasks: The research will include the following tasks:
1. Identify data needs for managing innovative contracting projects, such as critical data for
measuring performance.
2. Determine the impacts innovative contracting has on pavement management practices, and
develop recommendations for accommodating these impacts (i.e., selecting applicable
performance measures).
3. Identify means for collecting data to support performance measures.
4. Develop guidelines for ensuring pavement management needs are satisfied by innovative
contracted projects.
Final Product:
The final product of the research is a set of guidelines for ensuring pavement management needs
are satisfied by innovative contracting practices.
III. OBJECTIVE
There are three specific objectives for the research. First, the research will identify the various
impacts innovative contracting has on pavement management systems. The second objective is
to determine how to account for the impacts innovative contracting has on pavement
management systems; for example, developing performance metrics and applicable data to
measure said
impacts. The final research objective is to develop guidelines for ensuring pavement
management needs are satisfied by innovative contracting practices.
2. Speech/voice recognition/classification:
Speech recognition, or speech-to-text, is the ability of a machine or program to identify words
spoken aloud and convert them into readable text. Rudimentary speech recognition software has
a limited vocabulary of words and phrases, and it may only identify these if they are spoken very
clearly. A speech recognition system essentially translates spoken language into text. There are
various real-life examples of such systems; for example, Apple's Siri recognizes speech and
transcribes it into text. One measure that may improve a computer's voice-recognition accuracy
is to eliminate echoes and background noise, for instance by installing carpeting, tapestries, or
soundproofing material to reduce sounds that might interfere with the computer's ability to
understand the speaker. The algorithms used in this form of technology include PLP features,
Viterbi search, deep neural networks, discriminative training, the WFST framework, etc.;
Google's recent publications on speech describe ongoing advances in this area. Speech
classification, in contrast, is a means for the automatic classification of audio signals: using these
techniques, incoming speech signals can be classified, sorted, and prioritized.
Voice recognition is a computer software program or hardware device with the ability to decode
the human voice. Voice recognition is commonly used to operate a device, perform commands,
or write without having to use a keyboard, mouse, or press any buttons. Speech recognition
technologies such as Alexa, Cortana, Google Assistant and Siri are changing the way people
interact with their devices, homes, cars, and jobs. The technology allows us to talk to a computer
or device that interprets what we're saying in order to respond to our question or command.
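Of the algorithms listed above, Viterbi search lends itself to a compact illustration. The sketch below decodes the most likely hidden-state sequence in a toy hidden Markov model; the states ("silence"/"speech"), the observations, and all probabilities are invented for illustration and bear no relation to a production recognizer:

```python
# Minimal Viterbi search over a toy HMM. All states and probabilities
# are invented for illustration; real recognizers use acoustic models.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state path for an observation sequence."""
    # best[t][s] = (prob of best path ending in state s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            prob, prev = max(
                (best[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            row[s] = (prob, prev)
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(best[-1], key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return list(reversed(path))

states = ("silence", "speech")
start_p = {"silence": 0.6, "speech": 0.4}
trans_p = {"silence": {"silence": 0.7, "speech": 0.3},
           "speech": {"silence": 0.4, "speech": 0.6}}
emit_p = {"silence": {"low": 0.9, "high": 0.1},
          "speech": {"low": 0.2, "high": 0.8}}

print(viterbi(["low", "high", "high"], states, start_p, trans_p, emit_p))
```

In a real recognizer the observations would be acoustic feature vectors (e.g. PLP features) rather than symbols, but the dynamic-programming structure is the same.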
Problem Statement:
The problem of natural language understanding (NLU) is central as it is a prerequisite for many
tasks such as natural language generation (NLG). The consensus was that none of our current
models exhibit 'real' understanding of natural language.
Innate biases vs. learning from scratch: A key question is what biases and structure should we
build explicitly into our models to get closer to NLU. Similar ideas were discussed at the
Generalization workshop at NAACL 2018, which Ana Marasovic reviewed for The Gradient and
I reviewed here. Many responses in our survey mentioned that models should incorporate common
sense. In addition, dialogue systems (and chat bots) were mentioned several times. On the other
hand, for reinforcement learning, David Silver argued that you would ultimately want the model
to learn everything by itself, including the algorithm, features, and predictions. Many of our
experts took the opposite view, arguing that you should actually build in some understanding in
your model. What should be learned and what should be hard-wired into the model was also
explored in the debate between Yann LeCun and Christopher Manning in February 2018.
5. Machine learning:
Machine learning is an application of artificial intelligence (AI) that
provides systems the ability to automatically learn and improve from experience without being
explicitly programmed. Machine learning focuses on the development of computer programs that
can access data and use it to learn for themselves. However, machine learning remains a relatively
'hard' problem. There is no doubt that the science of advancing machine learning algorithms
through research is difficult, and it remains equally hard to make existing algorithms and models
work well for a new application. The respective approaches can be broken down into two general
subtypes: Supervised and Unsupervised Learning.
Supervised Learning refers to the subset of Machine Learning where you generate models to
predict an output variable based on historical examples of that output variable. The goals of
artificial intelligence include learning, reasoning, and perception. AI is being used across
different industries including finance and healthcare. Weak AI tends to be simple and single-task
oriented, while strong AI carries out tasks that are more complex and human-like; example
applications include medical diagnosis, image processing, prediction, classification, learning of
associations, regression, etc. Intelligent systems built on machine learning algorithms have the
capability to learn from past experience or historical data.
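As a minimal illustration of supervised learning (generating a model that predicts an output variable from historical examples of that variable), the following sketch fits a straight line by least squares. The data points are invented for illustration:

```python
# Supervised learning in miniature: fit y = w*x + b by least squares
# from historical (x, y) examples, then predict an unseen output.
# The data points below are invented for illustration.

def fit_line(xs, ys):
    """Return slope w and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

# Historical examples of the output variable (here y = 2x + 1 exactly).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)            # learned parameters
print(w * 5.0 + b)     # prediction for an unseen input
```

The same pattern (learn parameters from labeled examples, then predict) underlies the medical-diagnosis, classification, and regression applications mentioned above.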
Problem Statement:
Movie dataset analysis: the challenge is aimed at making use of machine learning and artificial
intelligence to interpret a movie dataset. The dataset made available to participants covers the
scripts of the movies, trailers of the movies, Wikipedia data about the movies, and images from
the movies. In this project, we aim to impart the ability to get rid of biases in a machine or an AI
system. Specifically, we will aim to go beyond information retrieval to do reasoning over the
multimodal dataset and develop algorithms to remove the bias. The dataset is available at:
https://github.com/BollywoodData/Bollywood-Data. For ease of use we have made available
pre-processed versions of these datasets. We have applied the Watson NLP API and Open IE to
produce more enriched text. Similarly, for previews, we have identified emotions in selected
frames along with metadata for the movies. Participants are at liberty to use one or more of these
datasets to interpret, predict, and draw intelligence of any sort from the data provided. The
following section outlines a few potential problems that can be taken up.
PROBLEM DESCRIPTION
1. Probable Use case to implement- Enable multimodal Question Answering on top of this dataset.
The user should be able to ask questions (in text format), and the output should be text and/or
an image. The user may also provide an image as an input, and the output should be the
plot/points relevant to that image.
Description: Enable a multimodal Question Answering system and help in capturing information
about the dataset.
Stage 1 - Extract the data from the Wikipedia-data folder and extract the plot text for each
Bollywood movie. Using this data, one should be able to query the dataset with natural language
queries, and the output of each query should be in natural language or an image. This image can
be extracted from the image data in the corresponding folder on GitHub.
Stage 2 - Extract the data from the image-data folder on GitHub as an input; the output should be
text or natural language corresponding to the image. This text can be taken from the
Wikipedia-data containing the plot of each movie.
2. Probable Use case to implement- Convert the movie plot into an entity-relationship graph
where each path traversal provides a different story arc of the movie. Use this graph to
summarize the movie plot in 5 lines.
Description: Convert the movie plot into an entity-relationship graph where each path traversal
provides a different story arc of the movie.
Stage 1 - Extract the data from the Wikipedia-data folder and extract the plot text for each
Bollywood movie. Using this data, one should be able to summarize the movie plot in 5 lines.
Stage 2 - Use this text data to construct entity-relationship graphs. Further, use these
entity-relationship graphs to find the various arcs of the movie story.
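Stage 2 can be sketched with a plain adjacency-list graph, where each path traversal corresponds to a candidate story arc. The entities and relations below are invented for illustration:

```python
# Sketch of Stage 2: represent a movie plot as an entity-relationship
# graph and enumerate path traversals as candidate "story arcs".
# The entities and relations below are invented for illustration.

def all_paths(graph, start, end, path=None):
    """Enumerate all simple paths from start to end (depth-first)."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:          # avoid revisiting entities (cycles)
            paths.extend(all_paths(graph, nxt, end, path))
    return paths

# Adjacency list: entity -> entities it is related to in the plot.
plot_graph = {
    "hero": ["village", "villain"],
    "village": ["villain"],
    "villain": ["showdown"],
}

for arc in all_paths(plot_graph, "hero", "showdown"):
    print(" -> ".join(arc))
```

In the actual project, the nodes and edges would come from entity and relation extraction over the Wikipedia plot text (e.g. via Open IE) rather than being hand-written.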
3. Probable Use case to implement- The dataset has been used to show bias present in
Bollywood (http://proceedings.mlr.press/v81/madaan18a/madaan18a.pdf). Develop algorithms
to remove/reduce such biases.
Description: Design and develop an algorithm to remove gender bias in text.
Stage 1 - Extract the Wikipedia plots data from the Wikipedia-data folder and try to construct a
different, unbiased version of a story.
Stage 2 - Use an attention model to pinpoint the biased parts of the story and then de-bias those
parts. Further, show these nodes in an interactive visualization.
4. Probable Use case to implement- Develop interesting visualizations to interactively explore
this dataset.
Description: Develop interesting visualizations to explore this dataset.
Stage 1 - To explore the whole dataset, we look for innovative ideas and applications that allow
a user to explore the whole dataset. This also includes providing an interface for the user to
navigate to relevant parts of the dataset.
Stage 2 - The application should have the capability to flag the relevant parts of the dataset and
show them in the form of an interactive visualization.
About Dataset
The dataset represents a large multimodal dataset derived out of multiple sources. The data
consists of:
Wikipedia Data - Contains text from plots of all movies from 1970 – 2017. The plots are taken
from Wikipedia.
Image Data – Posters of all movies from 1970-2017.
Scripts Data – PDF scripts for 13 movies. The scripts contain complete dialogues.
Preview Data - Previews of around 880 movies from 2010-2017. The dataset is available at
https://github.com/BollywoodData/Bollywood-Data. For ease of use we also provide
pre-processed versions of these datasets. We have applied the Watson NLP API and Open IE to produce
more enriched text. Similarly, for previews, we have identified emotions in selected frames along
with metadata for the movie. We encourage participants to propose interesting problems and
novel solutions.
EXPECTATION
- The solution should be AI driven.
- Participants should demonstrate, through a system demo, at least some useful application.
- The outcome should include a document explaining the thought process and design approach
used to arrive at the solution.
EVALUATION CRITERIA
The evaluation criteria are listed on the hackathon page.
TOOLS & TECHNOLOGY
- IBM Cloud
- IBM Watson
- App development frameworks for desktop (e.g. Python, Java) and mobile (e.g. Android, iOS)
5. Expert systems:
An expert system is a computer program that uses artificial intelligence (AI) technologies to
simulate the judgment and behavior of a human or an organization that has expert knowledge and
experience in a particular field. In artificial intelligence, an expert system is a computer system
emulating the decision-making ability of a human expert. Expert systems are designed to solve
complex problems by reasoning through bodies of knowledge, represented mainly as if–then
rules rather than through conventional procedural code. An expert system (ES) is a knowledge-
based system that employs knowledge about its application domain and uses an inferencing
(reason) procedure to solve problems that would otherwise require human competence or
expertise. An expert system generally consists of four components: a knowledge base, the search
or inference system, a knowledge acquisition system, and the user interface or communication
system. Two examples of expert system applications in a commercial environment are Apple's
Siri, a dialog system that attempts to replicate the decision-making capabilities of a human
expert, and the insurance company Blue Cross's automated insurance claim processing system,
which essentially replaces on-site human claims processing.
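The if-then-rule reasoning described above can be sketched as a tiny forward-chaining inference engine: facts are asserted, and rules fire repeatedly until nothing new can be derived. The medical-style rules and facts below are invented for illustration and are not real expert knowledge:

```python
# Minimal forward-chaining expert system sketch: a knowledge base of
# if-then rules plus an inference loop. Rules are invented examples.

# Each rule: (set of condition facts, conclusion fact).
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "refer_to_doctor"),
]

def infer(facts, rules):
    """Apply rules repeatedly until no new facts can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # the rule fires
                changed = True
    return facts

result = infer({"fever", "cough", "short_of_breath"}, RULES)
print(sorted(result))
```

This corresponds to the knowledge base plus inference-system components named above; a full expert system would add a knowledge acquisition component and a user interface around the same loop.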
Problem Statement:
Despite the attempts and research work by robotics researchers to emulate human intelligence
and appearance, that result has not yet been achieved. Most robots still cannot see well and are
not versatile: objects are often not properly recognized. For robotics technology to function
effectively, it is important to address the inefficiencies associated with it. The wide use of
robotics technology will also take away many human jobs and create unemployment in society;
because the use of robots to perform various jobs will reduce the jobs available to human beings,
their introduction should be done systematically. The development of robots will take over many
high-end precision jobs and will help in various sectors such as agriculture, the military, and
health, leading to robots acting as helpers in the workplace, with some degree of balance
between actual requirements and greed. Society should support and care for developments in
robotics technology, as this will be beneficial for people and for the various sectors of an
economy. Many tasks that are beyond human ability can be performed with the help of robotics,
and robots will be very helpful in military operations. The advancement of robot technology is
remarkable; today, robots can be seen in virtually all fields, from transport to health and from
recreation to industry. The use of this technology will draw protests from society for taking
away the jobs of ordinary people; to address this, robots should be applied to selected tasks,
mostly in areas that humans cannot reach or are not capable of performing in.
Survey
A survey is one of the important forecasting techniques for gathering information on a particular
segment of technology or a product. Surveys can be based on a few selected questions that help
provide valuable information about a product or an area of technology. A survey can be anything
from a short paper-and-pencil feedback form to a one-to-one in-depth interview, and it is one of
the primary forecasting techniques. Surveys are broadly divided into questionnaires and
interviews. A survey supplies information about the use of a particular technology and people's
views on its advancement. It can also surface ideas for other productive uses of a technology in
areas where it has not yet been applied, and it brings to notice various issues arising from, or
connected to, a particular technology in the external environment. Surveys can be conducted
across various sectors and among the general public to gather opinions on a technology and on
how it can be used more effectively in other areas.
The following are the benefits of surveys as a forecasting technique:
- The survey method is relatively inexpensive, especially self-administered surveys.
- It is useful in explaining and describing the ideas or views of a large population.
- Surveys can be administered in remote areas using mail or telephone.
- Many questions can be asked about a given topic, giving a better picture of it.
- There is flexibility in how the questions are administered: face-to-face interviews, telephone
interviews, group-administered oral surveys, etc.
- Standardized questions ensure that similar data can be collected from the group, so high
reliability can be obtained.
- The standardization of the questions makes the measurement of the results more precise.
- Survey research used in forecasting helps furnish information about the various other areas in
which the technology under forecast could be used.
Use of robotics
Robotics is the art and science of robots, which includes their design, manufacture, application,
and practical use. Robots will soon be everywhere, from our homes to our workplaces. Robots
are used for handling dangerous materials, assembling products, cutting and polishing,
spray-painting, and inspecting products. Robots are also used in diverse tasks such as detecting
bombs and cleaning sewers, and their use in intricate surgery is increasing steadily and will
continue to rise in the coming years. With the rapidly rising power of artificial intelligence
techniques and microprocessors, robots have dramatically raised their potential as flexible
automation tools. The new upsurge of robotics is in applications demanding advanced
intelligence. Robotic technology is converging with a wide variety of complementary
technologies: force sensing (touch), machine vision, speech recognition, and advanced
mechanics. The introduction of robots with structured touch and vision has dramatically changed
the speed and efficiency of delivery systems and new production. Robots are becoming so
accurate that they can be applied where manual operations are no longer a valid option.
Semiconductor manufacturing is one example, where the required level of quality cannot be
achieved with simple mechanization and human labor. Significant benefits are achieved by
enabling rapid changeover of products and product evolutions, which cannot be matched with
schematic hard tooling.
Analysis
From the forecasting methods, the following analysis is interpreted:
- The major hardware committed to the hierarchical control structure of robotics will become
standardized in the next twenty years. In the next decade, interactive software will be available
to form a parametric structure at all levels to meet specific task requirements.
- Parallel architectures will create a new paradigm for making software developments feasible.
- Sensors will be implemented at the microchip level with on-board integration of amplification,
analog-to-digital conversion, data reduction, linearization, and so on. The goal is to make
sensors as inexpensive and lightweight as computer chips by 2020.
- Between 2015 and 2020, robots would be seen in every South Korean and many European
households.
- Aggressive image analysis technology will be required if significant progress is to occur in
the next decade.
- Intelligent robots that can make decisions, sense the environment, and learn are targeted to
be in use in 30% of households and organizations by 2022.
- By 2030, robots would be capable of performing most manual jobs at the human level.
- By 2015, one third of the US military fighting force would be composed of robots.
- In 2035, the first completely independent robot soldiers would be in operation.
- By 2013-2014, agricultural robots would be developed.
- Medical robots would be performing minimally invasive surgery by 2017.
- In 2017-2022, household robots would be in full use.
- By 2021, nanorobots would be introduced.
- It is assumed that there is a possibility of developing a first-generation universal
force-reflecting manual controller within the next five years. The desired attributes of the
manual controller are: lightweight, compact, portable, adaptable, minimum mass, and
transparency of the force-feedback signal.
Conclusions
Robotics develops man-made mechanical devices that can move automatically or with the help
of remote controls. Robotics technology is already in use in various sectors such as industry,
transportation, and healthcare. The goal of robotic technology is to broaden the use and
effectiveness of robots in various fields. Robots are developed to perform multifarious activities
for the welfare of human beings in the most integrated and planned manner, enhancing
productivity and quality. Despite the shortcomings and problems faced, robotics technology is
set to open a new era in the next decade, making tasks easier and smoother.
8. Internet of everything:
The Internet of things describes the network of physical objects—“things”—that are embedded
with sensors, software, and other technologies for the purpose of connecting and exchanging data
with other devices and systems over the Internet. Cisco defines the Internet of Everything (IoE)
as the networked connection of people, process, data, and things. The benefit of IoE is derived
from the compound impact of connecting people, process, data, and things, and the value this
increased connectedness creates as “everything” comes online. The Internet of Everything has
four pillars namely people, data, things and processes. In simple terms: IoE is the intelligent
connection of people, process, data and things. The Internet of Everything (IoE) describes a
world where billions of objects have sensors to detect, measure, and assess their status, all
connected over public or private networks using standard and proprietary protocols. IoT devices
can be used to monitor and control the mechanical, electrical, and electronic systems used in
various types of buildings (e.g., public and private, industrial, institutional, or residential) in
home automation and building automation systems. Examples include connected appliances,
smart home security systems, autonomous farming equipment, wearable health monitors, smart
factory equipment, wireless inventory trackers, ultra-high-speed wireless internet, and biometric
cybersecurity scanners.
9. Semantic web:
The Semantic Web is an extension of the World Wide Web through standards set by the World
Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-
readable. The term “Semantic Web” refers to W3C's vision of the Web of linked data. Semantic
Web technologies enable people to create data stores on the Web, build vocabularies, and write
rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL,
OWL, and SKOS. The Semantic Web provides a common framework that allows data to be
shared and reused across application, enterprise, and community boundaries. It is a collaborative
effort led by W3C with participation from a large number of researchers and industrial partners.
Semantics (from Ancient Greek: σημαντικός sēmantikós, "significant") is the study of meaning,
reference, or truth. The term can be used to refer to subfields of several distinct disciplines
including linguistics, philosophy, and computer science. Many companies have now adopted
semantic web technologies for commercial use; examples include Best Buy, the BBC World Cup
site, Google, Facebook, and Flipboard. Google, Microsoft, Yahoo, and Yandex have agreed on
Schema.org, a vocabulary for associating meaning with data on the web.
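The linked-data idea can be sketched as pattern matching over (subject, predicate, object) triples, which is the core of what RDF stores and SPARQL queries provide. The triples below are invented for illustration; real systems would use an RDF store and the SPARQL query language rather than this toy matcher:

```python
# Sketch of the RDF/SPARQL idea: data as (subject, predicate, object)
# triples, queried by simple patterns. Triples are invented examples.

TRIPLES = [
    ("BBC", "publishes", "WorldCupSite"),
    ("WorldCupSite", "uses", "RDF"),
    ("RDF", "standardizedBy", "W3C"),
]

def match(pattern, triples):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# "What does the World Cup site use?" (analogous to a SPARQL SELECT)
print(match(("WorldCupSite", "uses", None), TRIPLES))
```

Because every statement shares the same triple shape, data from different sources can be merged and queried uniformly, which is exactly the cross-boundary sharing the Semantic Web framework aims for.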
10. Information retrieval:
Information retrieval is the activity of obtaining information system resources that are relevant to
an information need from a collection of those resources. Searches can be based on full-text or
other content-based indexing. Information retrieval (IR) is finding material (usually documents)
of an unstructured nature (usually text) that satisfies an information need from within large
collections (usually stored on computers). An information retrieval process begins when a user
enters a query into the system. Queries are formal statements of information needs, for example
search strings in web search engines. In information retrieval a query does not uniquely identify
a single object in the collection. Information retrieval can provide organizations with immediate
value--while it's important to try to figure out ways to capture tacit knowledge, information
retrieval provides a means to get at information that already exists in electronic formats.
Retrieval is the act of getting something back, or of accessing stored data and files in a computer.
An example of retrieval is when you lose your keys down an elevator shaft and you get your
maintenance man to help you get them back.
Information retrieval tools aid the library user to locate, retrieve and use the needed information
in various formats. These information retrieval tools are bibliography, index and abstract, shelve
lists, Online Public Access Catalogue (OPAC) and library card catalogue. For an internet search
engine, data retrieval is a combination of the user-agent (crawler), the database and how it is
maintained, and the search algorithm. The user then views and interacts with the query
interface. A simple model of an information retrieval system provides a framework for
subsequent discussion of artificial intelligence concepts and their applicability in information
retrieval. Concepts surveyed include pattern recognition, representation, problem solving and
planning, heuristics, and learning.
Problem Statement:
Information Retrieval (IR) is one of the fundamental areas in information science. IR allows
users to search for and find relevant documents in a collection (also referred to as a library).
Users present their information needs by providing some keywords, which are referred to as a
query. The IR system searches the document collection for relevant documents based on the
user's query. These relevant documents are then ranked based on their degree of relevance.
There are many algorithms to calculate the degree of relevance; a simple and popular one is
based on the Term Frequency, or TF. The TF value of a document with respect to a specific
query is calculated as the frequency of the query term(s) in the document divided by the total
number of terms in that document. For example, consider a simple document containing only the
text "I am taking a Python class. There are 35 students in the Python class". This document has
14 terms in total (including the number 35). If the query is "students", the TF will be
1/14 ≈ 0.0714; if the query is "Python", the TF will be 2/14 ≈ 0.1429. For a given query, the IR
system calculates the TF for every document in the collection and ranks them from high to low.
The ranked list is presented to users as the search result. If the TF for a document is zero, the
document is considered not relevant to the query and is not presented. If the TF of every
document in the collection is zero, no relevant documents were found in the collection. In this
project you are requested to develop an IR system that ranks documents based on their degree of
relevance, calculated using the TF algorithm described above.
Constraints:
- The collection of documents will be provided.
- The program shall be written in the Python programming language.
- A class (or function) shall be implemented and utilized in your program.
- An error-exception block shall be implemented in your program to capture errors.
- The user input query contains one term.
- Given the user's input query, your program shall be able to provide a ranked list of relevant
documents and their associated TF values.
- All past queries and the corresponding result lists need to be recorded in a query-result log
file (optional: only unique queries and their result lists are recorded).
- Record your name, class number, and date of creation in each of your Python programs.
- Meet all the basic requirements listed in the constraints.
- Provide a Readme document (50 points) that:
  a. Outlines the structure of your program.
  b. Briefly specifies the purpose of each file in your final project folder, including the
readme.txt file itself.
  c. Briefly specifies how users can use your program (including some screenshots).
  d. Briefly specifies what can be improved in the future.
- Improve the quality of your code:
  a. Readability: i. use comments or docstrings to increase readability; ii. use whitespace to
separate blocks of code.
  b. Reusability: i. use functions to increase reusability; ii. use classes to increase
reusability; iii. store your functions and classes in files, and import them into your main
program to access these classes or functions.
  c. Stability: i. tolerate input errors from the user; ii. tolerate, and respond appropriately
to, different types of errors.
- Paste the captured screens here demonstrating that your program works.
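A minimal sketch of the TF-based ranking described above, assuming single-term queries; the sample documents are invented for illustration (the real collection will be provided):

```python
# TF-based ranking sketch: TF = occurrences of the query term divided
# by the total number of terms in the document; documents with TF > 0
# are returned highest first. Sample documents are for illustration.

def term_frequency(term, document):
    """TF = occurrences of term / total terms in the document."""
    words = document.lower().replace(".", " ").split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)

def rank(query, documents):
    """Return (doc_id, TF) pairs with TF > 0, highest TF first."""
    scores = [(doc_id, term_frequency(query, text))
              for doc_id, text in documents.items()]
    return sorted([s for s in scores if s[1] > 0],
                  key=lambda s: s[1], reverse=True)

docs = {
    "d1": "I am taking a Python class. There are 35 students in the Python class",
    "d2": "The students are reading",
}
print(rank("Python", docs))
print(rank("students", docs))
```

The full project would wrap this in a class, read the provided collection from disk, add the error-exception handling and query-result log file, and record the required header comments, per the constraints above.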
Problem Statement:
A search process in an unstructured wireless network generally involves the whole network, and
because of this it typically runs into the flooding problem. The existing system uses a flooding
algorithm to implement the search process, but this system suffers from poor search quality and
inefficiency. The flooding algorithm must visit every node of the unstructured network to find a
given property, which consumes extra time. Energy rate allocation and flooding are the main
problems in unstructured networks; they lead to high computational cost and consume extra
processing time. To improve the performance of any wireless network environment, it is
necessary to impose structure on access to the network. Energy consumption is a key aspect of
wireless networks: a random walk process will unbalance the energy and the dynamic query
search process. A naive flooding/random-walk algorithm will lead to flooding problems, and
unbalanced energy rate allocation will lead to network lifetime problems.
Problem Statement:
The problem statement stage is the first and most important step of solving an analytics problem.
It can make or break the entire project. A good data science problem should be relevant, specific,
and unambiguous. It should align with the business strategy. A problem statement should
describe
an undesirable gap between the current-state level of performance and the desired future-state
level of performance. A problem statement should include absolute or relative measures
of the problem that quantify that gap, but should not include possible causes or solutions. What
types of questions can data science answer? "Data science and statistics are not magic."
Step 1: Define the problem. First, it's necessary to accurately define the data problem that is to be
solved.
Step 2: Decide on an approach.
Step 3: Collect data.
Step 4: Analyze data.
Step 5: Interpret results.
Step 6: Conclusion.
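The requirement above that a problem statement quantify the gap between current-state and desired future-state performance can be sketched as follows; the performance figures are invented for illustration:

```python
# Quantifying the gap a problem statement should include: the absolute
# gap and the relative gap between current and target performance.
# The figures below are invented for illustration.

def gap_measures(current, target):
    """Return the absolute gap and the relative gap (as a fraction of current)."""
    absolute = target - current
    relative = absolute / current
    return absolute, relative

# e.g. on-time delivery is 80% today; the business target is 95%.
abs_gap, rel_gap = gap_measures(80.0, 95.0)
print(abs_gap)   # percentage-point shortfall
print(rel_gap)   # relative improvement required
```

Stating both measures, with no causes or solutions attached, keeps the problem statement specific and unambiguous, as described above.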