Professional Documents
Culture Documents
Overview and Tutorial On Artificial Intelligence Systems: Eric Conrad
Overview and Tutorial On Artificial Intelligence Systems: Eric Conrad
Eric Conrad
Philips, Helen. Instant Expert: The Human Brain, NewScientist.com news service. http://www.newscientist.com/channel/health/brain/dn9969 2 ibid.
The network contains input nodes (artificial neurons), a layer of hidden nodes, and an output node. Hidden nodes add complexity, which allows representation of complex relationships.3 Nodes are connected by synapses, each given a weight. A node fires if a threshold weight is exceeded. If the final output node fires, a DoS attack has been detected. The DoS ANN is trained by feeding it data from previously identified DoS attacks. During each round of the training loop, the synaptic weights are adjusted until the ANN identifies all DoS attacks in the data set (and does not misidentify normal traffic as a DoS). When that process is complete, the DoS ANN is trained and can be used on new live data to identify future DoS attacks. The DoS ANN can be retrained (if necessary) as DoS attacks change and new attack data is collected.
Expert Systems
An expert system is a form of artificial intelligence that uses a knowledge base (KB) and inference engine to make decisions. Input for the knowledge base is gathered through a user interface. The inference engine uses a series of If-Then statements to derive intelligence from the knowledge base. Expert systems tend to focus on one area of (typically narrow) expertise (the problem domain). Early examples of successful expert systems include Stanford Universitys Mycin (an expert system designed to diagnose blood disorders) and DECs XCON. DECs (Digital Equipment Company) XCON (eXpert CONfigurator) system (originally called R1) was designed to simplify mainframe orders. A typical mainframe order in the
3
This diagram shows one layer of hidden nodes; multiple layers might be used.
late 1970s and early 1980s could contain any one of 5,000 parts, and many parts (such as disk drives) had specific dependencies (proper cables, port capacity, and so on). Humanplaced orders often contained mistakes and additional (unnecessary) parts, and this inefficiency cost DEC millions of dollars each year. XCON could analyze an order much faster than a human could and with higher accuracy. By 1985, XCON had processed 80,000 orders, with accuracy of 95%98%, saving DEC 25 million dollars that year.4 Here is an example XCON If-Then rule:
If the most current active context is distributing massbus devices And there is a single-port disk drive that has not been assigned to the massbus And there are no unassigned dual-port disk drives And the number of devices that each massbus should support is known And there is a massbus that has been assigned at least one disk drive And this massbus should support additional disk drives And the type of cable needed to connect the disk drive to the previous device on the massbus is known Then assign the disk drive to the massbus5
Genetic Algorithms
A genetic algorithm is a type of search algorithm (an algorithm that takes input and computes an output, where multiple paths might be taken). Genetic algorithms are a part of evolutionary computation that use concepts borrowed from nature to conduct the search, including selection, mutation, and crossover rate. Genetic algorithms use strings6 to represent a genome (also called a chromosome). Genetic algorithms are used in information security for tasks such as misuse detection, intrusion detection, data mining and analysis, and related tasks. Genomes are designed to perform a specific function. A simple example is a tic-tac-toe7 genetic algorithm (called TTT GA). There are 827 distinct boards8 in a TTT game9, so the TTT genome ash 827 genes (specific functions). Each gene has an allele (value), which is the corresponding move to make.
Huyc, Chris. Configuration with R1/XCon (1978). Middlesex University. http://www.cwa.mdx.ac.uk/bis2040/lect6ES2/xCon.html 5 Goel, Ashok. CS 6601 Artificial Intelligence. Fall 2003 class notes. http://www.cc.gatech.edu/classes/AY2004/cs6601_fall/notes/rules.txt 6 Genetic programming is a related field, using tree structures of LISP expressions instead of strings. 7 Hochmuth, Gregor. On the Genetic Evolution of a Perfect Tic-Tac-Toe Strategy. http://www.geneticprogramming.org/sp2003/Hochmuth.pdf 8 ibid, Hochmuth 9 After discarding impossible grids such as all Xs and taking into account mirror images, and so on.
Here is a gene representing one of the 827 possible TTT boards (# 437), and the 5 possible legal responses (alleles):10
Each gene has one allele, which might be chosen from the pool of 1 (final move) to 9 (empty board) alleles. A population of X (which usually numbers in the hundreds to thousands) TTT genomes are generated. Each TTT genome contains 827 genes, along with 827 randomly generated alleles. Based on the previous diagram, in a first generation genome, gene437 would be randomly assigned one of the 5 legal alleles. The population of genomes then play a series of TTT games with each other. During each game, the TTT GA looks up the gene representing the current board and plays the corresponding move represented by the genes allele. A fitness score is assigned to each genome (more wins = higher fitness score). When rated, genomes are crossbred, and higher performing genomes are more likely to breed. The children genomes inherit a percentage of alleles from both parents. Some children then receive random mutations to their genomes (where Allele1 in the preceding diagram is randomly swapped for another allele, such as 5, for example). The next generation of genomes performs the same function, and the process repeats. This process can evolve a genome that plays a perfect game of tic-tac-toe. Gregor Hochmuths TTT GA evolved a perfect genome after 373 generations, using a population of 500 genomes11.
10 11
Selection
One form of selection is called roulette wheel selection. Imagine a roulette wheel with variable-size slots. A genome that is twice as fit as its neighbor has a slot twice as large. When the wheel is figuratively spun, genomes with larger slots are more likely to win a chance to crossbreed. In this diagram, Genome 6 scored the highest fitness score (and is more likely to be selected for crossbreeding); genomes 4 and 5 scored the lowest.
Crossover
The simplest type of cross over is 1-point crossover, where the parent alleles are swapped beginning at a specific gene.
Other crossover schemes exist, including N-point crossover (parent alleles are swapped at multiple points) and uniform crossover (each parent allele has a 50/50 chance of being selected). After crossbreeding, some children are randomly mutated:
Summary
Although it does not actually simulate human intelligence, AI techniques enable computers to simulate activities that humans perform. We covered neural networks and expert systems.