
Productive Thinking...the Gestalt Emphasis


Lecture Material
- Illusions and Ambiguous Figures
- Organizing Principles of Perceptual Grouping
  - Introduction
  - Wertheimer on Perceptual Grouping
- Partitioning the Physical World into Objects
  - Some Natural Scenes
  - Some Artificial Scenes
  - A Procedure that Utilizes Physical Constraints
- Problem Solving: Representation and Constraints
  - Cryptarithmetic Problem
  - Checkerboard Problem
  - Mutilated Checkerboard Problem
  - MatchMaker Problem

Terms, Concepts, and Questions


- Terms, Concepts, and Questions

Assignments/Exercises
- Traffic Lights!
- Cryptarithmetic Problem
- Matchstick Problems

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/gestalt.html11/11/2005 12:59:12 PM

Illusions and Ambiguous Figures

Does the mind always represent the world accurately and unambiguously?

Perceptual illusions and ambiguous figures were of special interest to the Gestaltists, and artists have also been fascinated by these perceptual phenomena. Perceptual illusions and ambiguous figures are of special interest in the investigation of thinking because:
- Illusions seem to indicate that our mind does not always accurately represent the perceptual input. For the Gestaltists, this suggested that the mind was "actively" involved in interpreting the perceptual input rather than passively recording it.
- Ambiguous figures exemplify the fact that sometimes the same perceptual input can lead to very different representations. Again, the Gestaltists took this as suggesting that the mind is actively involved in interpreting the input.
- What I will call completion figures are figures that the mind rather unambiguously interprets in a particular way despite the fact that the input is incomplete relative to what is typically "seen."

What follows are a variety of examples of these phenomena which you can select and explore.

Illusions
- Müller-Lyer Illusion
- Hering-Helmholtz Illusion
- Ebbinghaus Illusion

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Illusions.html (1 of 2)11/11/2005 12:59:30 PM

Ambiguous Figures
- Rubin Vase
- Old/Young Woman
- What is this a photo of?

Completion Figures
- Triangle Completion

Some Artists' Versions

Escher

Riley

Magritte


Untersuchungen zur Lehre von der Gestalt - 1

Wertheimer on organizing principles of perceptual grouping


Max Wertheimer was one of the more famous Gestalt researchers. "Gestalt" can be translated as "form," and part of the emphasis of the Gestalt group was that "the whole (or gestalt) is greater than the sum of its parts." We aren't interested here in exactly what this slogan was meant to convey. Notice, however, that it represented an explicit focus on the question of composition; that is, how are our ideas structured? The slogan was meant to convey the Gestaltists' belief that this process of composition probably isn't a simple one.

The slogan can be understood a bit better if we get a little more technical. Summation is, of course, an additive process. If you recall the axioms of addition, the contribution of one number to the sum is independent of when you add it, or, put another way, of where it appears in the summation. E.g., 2 + 3 + 4 = 9 = 3 + 2 + 4. The 3 contributes the same amount in either position, and in fact the whole equation yields the same result regardless of the context. And 3 is still three in the expression 1,899,622 + 3 + 567,333.

The Gestaltists were suggesting that we really have to consider the context in which some element occurs in order to understand how it contributes to the whole, or gestalt. They thought that the mind rarely combines or organizes things independent of the context in which they appear. Consider the two configurations shown below:

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/wertheimer.html (1 of 2)11/11/2005 12:59:43 PM


In both of these configurations we have a larger rectangle placed between two smaller rectangles. But note that the "large rectangle" in the top figure is exactly the same size as the two "smaller rectangles" in the lower figure. Here the "context" is presumably influencing the way in which I encode the various rectangles.

The fact that there are two ways of viewing configurations of this sort was very important to the arguments that went on between the Gestaltists and behaviorists during the earlier part of the 20th century. On one view, an absolute or context-independent view, we place the center rectangle of the top figure and the left and right rectangles of the lower figure into the same equivalence class because they are exactly the same size and shape. But on another view, the middle rectangles of each figure should be seen as equivalent because they both depict a large rectangle between two smaller rectangles.

The fact that there are two possibilities may pose a problem for a "passive mind." The environment can hardly be asked to decide which of these alternatives is seen by the viewer. The "stimulus" seems to be a somewhat ambiguous notion when thought of in this way. But since the behaviorists studied the way in which stimuli became associated with responses, they couldn't allow the stimulus to be a fuzzy notion. (Note that ambiguous figures present exactly the same problem, and this is one of the reasons that they have been studied so extensively in perception.)

Recall that I referred to the large rectangle between two smaller rectangles. Relations such as 'between' also gave rise to some difficulties. Is "between" something that is out there in the world? Can you point at it? Well, not really. It seems to be a way in which we describe a situation and not an intrinsic property of the situation. (Cf. the Würzburg Group discussed in Chapter 1 of your text.)
And if that isn't bad enough, I can also correctly say things such as: "There is not a circle in either picture." "There is not a triangle in either picture." "There is not an apple in either picture." Etc., ad nauseam. Even this simple, innocuous everyday word "not" causes problems, since we can use it to make true statements about the world. But how does the situation or stimulus elicit statements of this sort? There are an infinite number of such statements that I could make about any situation!


Untersuchungen zur Lehre von der Gestalt - 2

On Perceptual Grouping
Adapted from Max Wertheimer, Untersuchungen zur Lehre von der Gestalt, II. Psychologische Forschung, 1923, 4, 301-350.

Now we turn to Wertheimer's consideration of perceptual grouping. What is most important about this work in the present context is that Wertheimer attempted to see what our mind does against a consideration of what our mind might have done. In the work we review here, Wertheimer explores the way we typically group elements perceptually. But in order to determine whether the mind is exerting a "bias," he considers some of the ways these same elements could have been grouped but weren't. The behaviorists typically weren't this analytic. Consequently, it was the Gestaltists who kept trying to point out these pesky problems that can arise when you are a bit more analytic about what you are up to.

The examples that Wertheimer constructed were very simple; most of them consisted of a set of dots. The purpose of these examples was to aid in the understanding of the different factors that influence the grouping or composition of elements into wholes. The Gestaltists had suggested what they called unit-forming factors that influence how elements are grouped or organized into wholes. These unit-forming factors were:
- similarity;
- proximity;
- common fate;
- good continuation;
- set; and
- past experience.

The examples that follow are mainly concerned with the first two factors. The first example shown below consists of a row of 10 dots. Think of labeling these dots from left to right as a, b, ..., j. Below this row of dots are shown two ways in which the dots might be organized. The one on the left is in terms of groups of two, as indicated by ab/cd/..., where the elements that are close to each other are grouped into a unit. An alternative basis for grouping, depicted on the right by a/bc/de/..., involves grouping the rightmost element of one pair with the leftmost element of the adjacent pair; e.g., bc. This is a logical possibility, and if you work at it you may be able to "see" that organization. But it is not the organization that we would typically see; nonetheless, we can imagine a mind that would be biased to see this latter organization of the dots.
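The two alternative groupings can be written out explicitly. The sketch below is my own illustration (the labels a-j follow the text; the slicing scheme is just one way to express the two partitions):

```python
dots = list("abcdefghij")

# Grouping by proximity: ab / cd / ef / gh / ij
pairs = [dots[i:i + 2] for i in range(0, len(dots), 2)]

# The alternative grouping: a / bc / de / fg / hi / j
shifted = ([dots[:1]]
           + [dots[i:i + 2] for i in range(1, len(dots) - 1, 2)]
           + [dots[-1:]])

print(pairs)    # [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h'], ['i', 'j']]
print(shifted)  # [['a'], ['b', 'c'], ['d', 'e'], ['f', 'g'], ['h', 'i'], ['j']]
```

Both are perfectly good partitions of the ten dots; the point is that only the first is the one we spontaneously see.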

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/wertheimer2.html (1 of 6)11/11/2005 1:00:30 PM


We have shown only two ways in which the dots might be organized. In fact, there are many possible ways. There are 10 dots in this example. If we think of these 10 dots as being members of a set, then what we have called a grouping is simply a partition of this set. The number of possible partitions of a set of n items is given by the Bell number (a sum of the Stirling numbers of the second kind), and these numbers get large very rapidly.

To get some idea of why this is the case, consider something called the power set. The power set is simply the set of all possible subsets of a set. For example, the power set of the set {x,y,z} is the set { {x,y,z}, {x,y}, {x,z}, {y,z}, {x}, {y}, {z}, {} }. Note that a set is a subset of itself and every set has the empty set, {}, as a subset. Now, a partition of a set is simply a selection of a set of subsets such that no element occurs in more than one of the subsets selected and each element occurs in one of the selected subsets. For example, [ {x,z}, {y} ] is a partition. Note that there are 8 sets in the power set of this set of three elements. In general, a set of size n has 2 to the n subsets. The set of 10 dots below has 2 to the 10, or 1024, subsets, and then there are all the ways to choose from these 1024 subsets to form partitions!

The next example below is similar except that now the dots are arranged on a diagonal. Again, two of the many possible groupings are shown, with the one that we are biased to use appearing on the lower left.
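These counts are easy to compute. The sketch below is mine, not from the original page: it counts subsets with 2**n and counts partitions of an n-element set with the Bell triangle recurrence, which makes vivid just how many groupings of the ten dots are logically available to a mind.

```python
def bell(n):
    """Number of partitions of an n-element set, via the Bell triangle:
    each row starts with the last entry of the previous row, and each
    later entry adds the entry to its left and the one above-left."""
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
    return row[0]

print(2 ** 10)   # 1024 subsets of the 10 dots (the power set)
print(bell(3))   # 5 partitions of {x, y, z}
print(bell(10))  # 115975 partitions of the 10 dots
```

So the perceiver who settles instantly on ab/cd/ef/gh/ij has, in effect, dismissed over a hundred thousand logically possible groupings.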

The figure below increases the number of dots to 15; but, despite the enormous increase in the number of possible groupings, we don't see this as very different from the 10-dot figure above.


In order to help you visualize these groupings, the two groupings above have been animated, as well as a random grouping.

In the next example below, it is clear that the distance between the dots influences whether we organize them as rows or columns.

In the top matrix, all of the dots are equidistant and you can probably organize them in a number of ways. But the matrices below show that similarity in "color" is another factor that influences the way we group these elements.


In the examples below, we tend to group the dots as forming two lines; e.g., in the left and middle figures we tend to see the A and C segments grouped as a single "line" and the B segment as another, rather than seeing the A, B, and C segments each as a line.

The next two figures, one composed of dots and the other of continuous lines, further demonstrate our bias in grouping such figures.


The next example illustrates the factor that the gestaltists referred to as common fate. Here the arrows pointing in a common direction tend to be grouped.

Finally, the examples below provide evidence for the effect of past experience and context on the grouping of elements. Notice that in the top figure the middle lines are grouped into a single whole, but in the next figure they are grouped into two elements.


And, finally, the image below shows only the center element, which is, of course, the same in both examples.


Partitioning the Physical World

REALITY AND POSSIBILITY:


Seeing Reality by Imagining the Possibilities

"In order to determine what happened, imagine all of the possibilities and when you have eliminated all but one, then no matter how improbable, that remaining possibility is the truth." (Sherlock Holmes' Father, I think.)

Think of clouds in the sky, waves in the sea, and flags in the wind. Describe their shape. The pictures below provide a snapshot of the shape of clouds, waves, and flags as caught by the camera for a brief moment.

Recall that in the previous section we considered the question of how we organize simple figures such as sets of dots. When there were ten dots we noticed that we grouped them into an organization that was referred to as a partition. A partition is a special way in which to divide a set into subsets: a set of subsets such that each member of the set is in one and only one subset. With our previous examples of dots and lines this seemed like such an obvious condition to impose that you probably didn't even think about it.

But study the clouds in the picture above of the coast near Cancún, and the sailboats and breaking waves near Diamond Head on Oahu in the picture below. How many clouds are there? How many waves? Notice that in order to try to answer the question of 'how many' you need to establish exactly where the borders are that separate one cloud from another and one wave from another. Not an easy task even in a snapshot; and the reality is much more complicated, since had the snapshots been taken a few instants later, an entirely different configuration of clouds and waves would have been recorded. Clouds and waves have a shape and at times can be individuated; but the individuals don't persist. The shape changes, and the identity of the momentarily individuated item soon disappears forever in the constant flux of change.

But what about the flags shown in the next picture below? In this case it is quite easy to individuate these flags flying over Fort Sumter. And their shape? Clearly they are rectangular in shape.
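The partition condition just restated, each member in one and only one subset, can be made concrete in a few lines. This is a toy checker of my own, not part of the original page:

```python
def is_partition(subsets, universe):
    """True iff the subsets cover every element of `universe` exactly once:
    no element missing, and none repeated within or across subsets."""
    members = [x for s in subsets for x in s]
    return sorted(members) == sorted(universe)

print(is_partition([{"x", "z"}, {"y"}], {"x", "y", "z"}))  # True
print(is_partition([{"x", "z"}, {"z"}], {"x", "y", "z"}))  # False: z twice, y missing
```

The hard perceptual question, of course, is not checking a proposed partition but deciding where the borders of the subsets (clouds, waves, flags) lie in the first place.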
http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Waltz1.html (1 of 3)11/11/2005 1:00:40 PM


But is this so obvious? Or is this an inference not drawn from the visual information alone, but also drawn using our information about how flags are made? The flag poles also help with this inference. If the last flag pole at the far left were not visible, would you still be as certain that there are exactly six flags in the picture? Is it really so easy to decide the contours of those flags on the far left? Or, put in a way that harks back to the idea of a partition: is it really a simple matter to determine, for each contour, the one and only one flag that it belongs to?

Notice that, as with the waves and the clouds, the shape of the flags is varying constantly with the vagaries of the wind. Is our belief that the flags really have a constant rectangular shape justified? If so, why wouldn't we be justified in believing that clouds were actually square and waves rectangular?

Consider next the picture shown below on the left of one of the newer buildings in Manhattan. What is its shape? How many buildings are in the picture? Now look at the picture shown below on the right. Is this the same building? What is its shape? How many buildings are shown in this picture? Notice that in this case we assume that the shape of the building is not changing in the wind, and its individuation is secure (think of the problems establishing property rights if this weren't the case!). What has changed in the two pictures is the point from which the building is viewed.


Additionally, in both pictures various degrees of shadowing create "contours" where we are quite certain there are no contours, only shadows. But how do you know this? Have you made some assumptions which have licensed this conclusion? Also of particular interest in the picture on the right is the trihedral vertex: the point where three lines intersect in what is almost the center of the picture. What is going on here? Trihedral vertices will be examined in detail on the next page, and perhaps we will then understand why this aspect of the picture may seem problematic.

In this section we will examine the role assumptions play in our ability to quickly and quite accurately organize the complex visual information that informs us about our everyday world. The particular assumption that is our focus is that much of our world is populated with objects that don't change shape, despite the fact that their shape appears to change as we move around relative to the objects or the objects move relative to ourselves. This assumption, that many of the objects of our world do not change shape, is called "the rigid body assumption." On the next page are various drawings that will sharpen our visual intuitions regarding partitioning the world of objects.

Next Page on Partitioning the Physical World into Objects

Partitioning the Physical World 2

Some Artificial Figures


It appears that our mind probably has to adopt the rigid body assumption, an assumption about the world, in order to be able to organize the contours of objects into a set of objects. To get some feel for what we are talking about, look at the pictures shown below, which were designed to thwart our ability to quickly organize the world into a set of objects.

Try to organize the mass of connecting lines shown above into a single partition of objects. I created this picture from a basic rectangular wire frame that outlines a rectangular solid. Unless you use certain types of computer graphics programs, you rarely encounter this type of schematic information. When we view a rectangular solid we see only a subset of the edges of the object. What you should have experienced, if you are at all like me, is that you could organize these wire frames in a variety of different ways, but you probably weren't able to fix on a single coherent partition of the lines into a set of "objects." Even though the shape of this object is not changing, it is difficult to come to a fixed organization of the object.

The next picture is more of the same, but I made it larger and a little more complex. If you have a large enough screen, expand your browser window so you can view the entire picture. Now try to focus on the center and note the ways in which your mind organizes the lines.

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Waltz1A.html (1 of 5)11/11/2005 1:01:12 PM


If this didn't challenge your mind enough, then try the next picture, which is more of the same, but lots more and now all in a jumble. The lack of regularity in the composition will probably make it harder for you to structure this picture.


And, if this hasn't sated your visual appetite, then sink into this next one, which is constructed solely from a simple triangle. You should be able to see many ways in which to organize this picture, some of which are triangles, others squares, and still others rectangles.


Now that you've experienced some of the difficulties your mind has with both photographs of the natural world and with these artificially created pictures, we are ready to ask why the mind doesn't run into these kinds of difficulties all of the time. Well, we don't really know the full answer to this question. But one intriguing line of research that addresses this issue also fits in with our emphasis on comparing the logical possibilities with those that actually seem to be considered by the mind.

The discussion here is based on work done by David Waltz in the early 1970s on the question of how one might go from two-dimensional information about lines and the way they are connected to an organization of these lines and their connections into a set of objects. One way to try to get your mind around this problem is to focus your eyes on some messy area in your current environment. My desk is always available to fill this bill. Notice that the objects overlap each other and only parts of their surfaces are visible. Now think of simply drawing a line on a sheet of paper for each object edge that you can see. These lines will intersect and form vertices. This set of lines and the vertices that they create were what Waltz took as the input to his computer program. The output was to be an organization of these lines and vertices into a set of objects: what I referred to above as a partition of this vertex information into a set of objects. The next page will briefly present the work of Waltz on this partition problem.

Waltz Research on Partitioning the Physical World into Objects

The Waltz Research on Partitioning a Simple Visual World into Objects

Partitioning the World into Objects - Using Physical Constraints


The middle drawing below consists of about 80 lines and 73 vertices (points where lines join). The vertices are indicated in the drawing on the right with a blue circle. In the middle drawing no attempt has been made to make it appear three-dimensional. It is a flat, two-dimensional collection of lines that intersect. Nonetheless, you have probably organized this jumble of lines and vertices in a way that is consistent with the view that a collection of three-dimensional objects is depicted in this drawing. The drawing on the left includes the lines that might complete the various objects.

The grouping of the lines and vertices of the middle figure into a set of objects results in a partitioning of the set of lines. A partition is a grouping of a set of things into subsets such that no element occurs in more than one subset and each element occurs in one of the subsets. How might this partitioning be accomplished? The set of possible partitionings is enormous, even for this trivial example.

Completed "Objects"

Basic Lines and Vertices

Vertices Explicitly Depicted

Of course, we don't exactly know how the human mind accomplishes this task. But, we can study a simplification of this task and thereby:
- obtain a clear understanding of the nature of this problem; and
- demonstrate the way in which assumptions about the world can simplify what is otherwise an intractable problem.

The work we will discuss is based on research carried out by David Waltz. His work was more extensive than what we will present here. We will limit our discussion to scenes where no more than three lines come together at any vertex, and we will ignore shadows and cracks. Waltz actually considered a more general case, but the story is more simply told if we limit ourselves to scenes that only involve trihedral vertices. We take two rectangles, as shown in the figure below on the left, and join them to obtain the set of lines and vertices shown in the figure on the right. This figure will serve as our simple example object for use in explaining this work.

Two Rectangles Composed

Example Figure

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Waltz2.html (1 of 6)11/11/2005 1:01:52 PM


Example Figure - Vertices Only

Example Figure - Lines Only

The next figure above on the left shows the example figure with the lines removed and only the vertices shown. Note that the three lighter colored vertices involve two lines and the rest involve three: these are trihedral vertices. Notice that even though these vertices are not identical, it appears that there are only a few types represented here. The next figure above on the right shows the example figure as only the set of lines that connect the vertices. The vertices themselves are deleted.

Now it turns out that for the simple case that we are considering, the lines must be either boundary lines or interior lines. For a boundary line, the object lies on one side of the line but not on the other. If we think of walking clockwise around the object on its boundary lines, then we can distinguish two types of boundary lines based on the direction that we are moving. For an interior line, the object lies on both sides. The lighter colored lines in this figure will turn out to be interior lines. We can also distinguish two types of interior lines: those that are convex and those that are concave. The middle white line above is an example of a concave line, and the remaining interior lines are convex.

Now pick up some square or rectangular object, move it around, and rotate it while focusing on a particular corner. You will notice that the vertex changes and may even disappear as you rotate the object. There is a systematic way (using a Euclidean three-space and moving the observer relative to this space) to exhaustively identify the types of vertices that can occur in the world we are considering. The figure below illustrates all of the possibilities. Notice that there are only 4 types of junctions or vertices. This does not mean that they are physically the same. For example, the size of the angle of an L vertex can vary enormously. But for purposes of interpreting scenes composed of objects it is important to see these as equivalent.


Notice that I have labeled this image as depicting the physically possible vertices; and there are only 18 of them! What are the logical possibilities? The L vertex has two lines, and each line could be labeled in one of 4 ways. Thus, there are 16 logically possible L vertices. The other 3 types of vertices each have 3 lines. Again, each line could be labeled in one of 4 ways. Thus, each of these vertex types admits 64 possibilities. The total logical set of possibilities is therefore 16 + (3 x 64), or 208. Thus, by building in knowledge about what is physically possible, by making the so-called "rigid body assumption," we can drastically reduce the possible labelings that must be considered.

Now recall the obvious fact that any line connects two vertices. And, of course, a line can only be labeled in one way. Thus, a decision about how to label one particular vertex affects the other vertex which shares the line with that particular vertex. When making one decision influences another decision, we say that there is a dependency between the decisions, or that one decision constrains the other. Sometimes, as in the traffic light example you worked out, the constraint is so strong that knowing one decision completely determines another. Sometimes it is weaker, and knowing one decision precludes others, but does not uniquely determine the other.
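The counting argument can be spelled out directly. The numbers are the ones given in the text; the variable names are mine:

```python
line_labels = 4                   # two boundary directions, convex, concave
logical_L = line_labels ** 2      # L junction has 2 lines -> 16 labelings
logical_3 = line_labels ** 3      # each three-line junction type -> 64 labelings
total_logical = logical_L + 3 * logical_3   # 16 + (3 x 64)
physically_possible = 18          # the catalog shown in the figure

print(total_logical, physically_possible)  # 208 18
```

Dropping from 208 logically possible junction labelings to 18 physically possible ones is exactly the kind of reduction that makes the labeling problem tractable.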


The figure above shows our example object with each vertex identified as to its type. Below the object is a vertex-by-vertex table where a cell entry is 1 if the two vertices share a line. I have also referred to this as a communication matrix. This term is used because, in communication, if I can communicate something to you, then you may communicate my message to someone else; if you do, then I have indirectly communicated with that person. Things are analogous here. For example, referring to the matrix and picture above, notice that L1 shares a line with A1 but not with L2. But A1 shares a line with L2. Consequently, a decision about how to label L1 will affect L2 even though there is no direct connection.

The algorithm that works out all of these effects is called a constraint propagation algorithm. The animation below will give you some feel for how it proceeds. Note that I have not attempted to be completely accurate to the algorithm in this animation, but simply to give you an intuitive feel for how it works. You will notice that on some steps the algorithm fills in the labels in blue and on other steps in green. The latter steps, those in green, represent the case where a label is simply propagated along a line from one vertex to the next.
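The indirect-influence idea behind the communication matrix is just graph reachability. The sketch below is my own and encodes only the three vertices named in the example (L1, A1, L2); the full matrix from the figure is not reproduced here:

```python
def reachable(start, goal, edges):
    """Breadth-first search over an undirected 'shares a line' relation:
    can a labeling decision at `start` (possibly indirectly) reach `goal`?"""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, frontier = {start}, [start]
    while frontier:
        v = frontier.pop()
        for w in adj.get(v, set()) - seen:
            seen.add(w)
            frontier.append(w)
    return goal in seen

# L1 and L2 do not share a line directly, but both share one with A1.
edges = [("L1", "A1"), ("A1", "L2")]
print(reachable("L1", "L2", edges))  # True: the influence propagates via A1
```

Constraint propagation works out not just whether an influence can travel along these paths, but which labelings survive it.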


The point at which the algorithm begins is arbitrary. In this example figure, the algorithm would arrive at a unique labeling for each line in the figure. This is what typically happens, and it can happen quite rapidly because of the small number of possible labelings for any vertex and the way in which the labelings constrain each other because of the rigid body assumption mentioned earlier. Intuitively, what this assumption says is that if the world is made up of rigid objects, then the shape of one object can't really depend on the shape of another object...there are no constraints to propagate beyond object boundaries. Note this is not the case with non-rigid objects such as flags or pools of water. Toss a rock into a pool of water, and the effect is propagated indefinitely. We know from the study of dependency networks and the analysis of algorithms for propagation of dependency that generally this class of problems is computationally complex (or intractable to use a more technical term)....that is, if you have a large problem then it may take a very long time to see if there is a consistent way in which to assign values to the network of dependencies. The algorithm works efficiently for partitioning a physical scene into objects because the rigid-body assumption holds most of the time. And, it appears that this knowledge is "built into" our way of reasoning about such scenes. Thus, this is an example where the mind is biased, but biased in a way that allows it to efficiently, and usually accurately, interpret physical scenes. What about all those crazy pictures at the beginning? Well, what I was attempting to do was to create pictures that were complex enough to either defeat your ability to maintain focus on a coherent set of vertices or create pictures that were ambiguous, that is, had more than one consistent labelings. The picture below depicts a "ribbon" of vertices. 
If you focus in the center, you will see that there are not enough constraints to force you to a single interpretation. You can see a line as convex or concave, and it can shift back and forth. Your mind can give this picture more than one consistent interpretation.

There are some pictures to which you can give no consistent interpretation, though your mind has a hard time determining exactly why. Many of the constraints hold "locally," but there is no consistent global interpretation. These figures are often referred to as impossible figures...because they are. The impossible triangle that was used earlier is shown again below on the left. It is also shown on the right, but with circles used to impose "local" views of the figure. If you look at each of these views, you will see that each is perfectly fine. It is when the constraints get "passed" to the next view that we recognize the impossibility of the figure.
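The filtering idea behind this kind of line labeling can be sketched as plain constraint propagation. The sketch below is a toy, not Waltz's actual junction catalog: the line names, the three labels ('+' convex, '-' concave, '>' boundary), and the single made-up constraint ("two lines meeting at a vertex cannot both be boundaries") are illustrative assumptions. Once one line is known to be a boundary, that knowledge propagates to its neighbor; the neighbor's remaining ambiguity mirrors the "ribbon" figure, where the constraints are too weak to force a single interpretation.

```python
from collections import deque

# Toy constraint network in the spirit of Waltz filtering (illustrative only).
# Each line may be labeled convex '+', concave '-', or boundary '>'.
# Made-up constraint: two lines meeting at a vertex cannot both be boundaries.
def compatible(vx, vy):
    return not (vx == '>' and vy == '>')

def revise(domains, x, y):
    """Drop labels of x that no label of y supports; True if anything dropped."""
    removed = False
    for vx in set(domains[x]):
        if not any(compatible(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

def propagate(domains, arcs):
    """AC-3-style propagation: keep revising until no domain shrinks."""
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y):
            queue.extend((a, b) for (a, b) in arcs if b == x)
    return domains

# L1 is known to lie on the silhouette (boundary); L1-L2 and L2-L3 meet at vertices.
domains = {'L1': {'>'}, 'L2': {'+', '-', '>'}, 'L3': {'+', '-', '>'}}
arcs = [('L1', 'L2'), ('L2', 'L1'), ('L2', 'L3'), ('L3', 'L2')]
propagate(domains, arcs)
print(domains['L2'])  # the boundary label has been filtered out; '+' and '-' remain
```

Knowing L1 removes one possibility from L2, but L2 and L3 stay ambiguous...exactly the sort of underconstrained situation the ribbon figure illustrates.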

And, just in case this all hasn't seemed impossible enough....here are two more figures. The impossible prong on the right and an impossible nut on the left!


Problem Solving and Productive Thinking

Problem Solving - The Cryptarithmetic Example


In addition to research in perception, the Gestaltists also focused on problem solving. In retrospect, we can see that problem solving often exhibits two properties that support the Gestaltist position. First, it is typically the case that many steps are required to solve a problem. These "steps" often don't need to be (and in some cases couldn't be) made in a physical sense. This complexity of the "response" supports the idea of a mind actively manipulating "mental stuff." Second, a problem can often be represented in differing ways...and the way in which the problem is represented is often crucial to the solution. In this section, we will use several different problems to exemplify and amplify these points. The first problem that we will examine is an example of a cryptarithmetic problem. The problem statement is given below; the same problem is given in the text. Take some time trying to solve the problem so that our discussion of it will be easier for you to grasp.

If you were unable to solve the problem, try to determine why. If you solved it, think back over your problem-solving efforts and try to identify points where you ran into particular difficulties. Problems such as this are not easy to solve. You must "think" of the problem in the right way, follow a strategy for pursuing the solution, and be systematic in remembering intermediate steps in your solution. One of the questions that we would like to understand better is exactly what makes one problem hard and another easy. In this introductory-level course we won't be able to develop the sophistication required to fully describe the way this question is being addressed, but we can begin to point out some of the features of problem solving that are importantly related to problem difficulty.

First, a little review. Previously, we discussed the line-labeling problem. We noted that the general problem of satisfying some arbitrary set of constraints over a set of objects is a very difficult problem. It is what is referred to as an intractable problem. But the physical world, it turns out, isn't really an arbitrary set of objects. Rigid objects can't influence other rigid objects over arbitrary distances. This fact, and, just as importantly, the fact that our mind seems to utilize this fact, makes the line-labeling problem tractable. Now, it turns out that this cryptarithmetic problem also involves satisfying a set of constraints. But your mind doesn't automatically exploit the implications of this fact. You must have some knowledge of algebra and use it to construct an algebraic representation of the problem in order to actually take advantage of the fact that this is a constraint satisfaction problem.

Note that this is not the only way in which the problem could be represented. You could simply think of it as involving a set of 10 letters and a set of 10 integers, and think of the basic problem-solving move as assigning each of the integers to a letter and testing to see whether the assignment is correct. This set-theoretic representation, together with this move of assigning integers to letters, is, in fact, a way in which the problem can be solved. I know of no agent, human or machine, that ever solved it in this fashion. The reason is that even with one of the assignments given, namely D = 5, there are still 9 more assignments to make, and there are 9! (362,880) ways of making them. A very large set to look through! And, to make matters worse, there is no way to know when you are getting close. In this way of representing and thinking about the problem, an assignment is either correct or incorrect. There is no such thing as being partially correct.

But now contrast this set-theoretic representation with the algebraic representation of the problem, depicted below. The representation consists of rewriting the problem as a set of six equations where we have explicitly added the carry terms (c1, c2, ....) involved in the addition. At the top of the picture, in the green box, the possible value assignments are explicitly shown (the 'v' is the symbol used to represent logical 'or'). The picture also includes arrows linking the terms in the equations. Each of the words is six letters long, and there are only 10 letters involved in the problem. Thus, letters must recur at various points in the equations. The arrows link places where the same letter occurs. Now, we know that the integer we assign to a letter in one equation constrains the integer that can be assigned to another letter in that equation. For example, assigning 5 to D in (1) constrains us to assign 0 to T and 1 to c1. If these letters occur in other equations, then this assignment affects those equations....again, the constraints propagate!
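The set-theoretic representation can be run directly, which makes its cost concrete. The sketch below (a Python illustration, not code from the course) fixes D = 5 and tries permutations of the remaining nine digits until DONALD + GERALD = ROBERT checks out, rejecting assignments that produce a leading zero. It examines assignments blindly; nothing in this representation tells it that it is "getting close."

```python
from itertools import permutations

def value(word, assign):
    """Read a word as a base-10 number under a letter-to-digit assignment."""
    return int("".join(str(assign[ch]) for ch in word))

def brute_force():
    letters = "ABEGLNORT"                      # every letter except D, given as 5
    digits = [d for d in range(10) if d != 5]
    for perm in permutations(digits):          # up to 9! = 362,880 assignments
        assign = dict(zip(letters, perm), D=5)
        if assign["G"] == 0 or assign["R"] == 0:   # no leading zeros
            continue
        if value("DONALD", assign) + value("GERALD", assign) == value("ROBERT", assign):
            return assign
    return None

solution = brute_force()
print(value("DONALD", solution), value("GERALD", solution), value("ROBERT", solution))
# 526485 197485 723970
```

An assignment here is simply correct or incorrect; the search gets no partial credit and no guidance, which is exactly the weakness of this representation.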

Note that once we have represented the problem as one that involves six subproblems, we have the opportunity to (intelligently) choose which subproblem to work on first, which next, and so on. Intelligent choices involve following the dependencies (as well as knowing a bit about how to exploit algebraic identities, as in equation 5) so that the number of candidate values for a particular letter is strongly constrained. The figure below was created to help you visualize these constraints, or dependencies, between subproblems. The letters are the letters in the problem. Each rectangle or triangle represents one of the six subproblems: the triangles represent the cases where an equation involves three letters, and the rectangles the cases where only two letters are involved. The letters within a shape, and the lines between them, indicate that the letters occur in the same equation. The intersections of the figures reflect the cases where the same letter appears in different equations. This gives you an overall picture of the structure of the dependencies in this problem. The picture to the right also includes the carry terms, and each geometric figure is labeled with the corresponding equation or subproblem.

By seeing the problem represented in this way, I hope that the relation between this type of problem solving and the line-labeling problem is now apparent. Think of each equation (the rectangle or triangle above) as corresponding to a vertex. The letters of the equation then correspond to the lines at the vertex, and the integers are the "labels" for the letters, corresponding to the labels for the lines. We have seen two very different cases where a problem has been broken down, or decomposed, into parts or subproblems. And, having found a "good" decomposition and an order in which to work on the subproblems, the problem becomes rather easy to solve. This will be a recurring theme. But be warned: we can't always find a good decomposition, and there are a great many to look through. Recall the partitioning of the dots and the Stirling number for the number of partitions....the same combinatorial principles apply to problem decompositions.
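The payoff of the algebraic decomposition can also be made concrete. The sketch below (again an illustration, not code from the course) works through the six column equations right to left, assigning digits only as each column requires them and passing the carry constraint to the next column. Because an inconsistent partial assignment is abandoned as soon as a single column fails, it visits only a tiny fraction of the 9! complete assignments that the set-theoretic representation must consider.

```python
def solve_by_columns():
    """Depth-first search over the column equations of DONALD + GERALD = ROBERT."""
    top, bottom, total = "DONALD", "GERALD", "ROBERT"
    # Columns right to left: (D,D,T), (L,L,R), (A,A,E), (N,R,B), (O,E,O), (D,G,R).
    columns = list(zip(top[::-1], bottom[::-1], total[::-1]))

    def candidates(assign, used, letter):
        """A letter's fixed value if it has one, else every unused digit."""
        if letter in assign:
            return [assign[letter]]
        return [d for d in range(10) if d not in used]

    def dfs(col, carry, assign, used):
        if col == len(columns):
            # All columns satisfied; reject a leftover carry or a leading zero.
            if carry == 0 and assign["G"] != 0 and assign["R"] != 0:
                return assign
            return None
        a, b, s = columns[col]
        for da in candidates(assign, used, a):
            a_asn, a_used = {**assign, a: da}, used | {da}
            for db in candidates(a_asn, a_used, b):
                b_asn, b_used = {**a_asn, b: db}, a_used | {db}
                digit, out = (da + db + carry) % 10, (da + db + carry) // 10
                if s in b_asn:                      # sum letter already fixed: check it
                    if b_asn[s] != digit:
                        continue
                    result = dfs(col + 1, out, b_asn, b_used)
                else:                               # sum letter forced by the column
                    if digit in b_used:
                        continue
                    result = dfs(col + 1, out, {**b_asn, s: digit}, b_used | {digit})
                if result:
                    return result
        return None

    return dfs(0, 0, {"D": 5}, {5})      # the problem gives D = 5

print(solve_by_columns())
```

Notice how the first column (D + D = 10) immediately forces T = 0 and a carry of 1, just as in the discussion above: each column both consumes and produces constraints, so candidate values are pruned long before a complete assignment is built.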

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/GProblemSolve.html (3 of 3)11/11/2005 1:02:08 PM

Checkerboard Problem

Problem Solving and Problem Representation


Below is presented what is referred to as the checkerboard problem. An illustration of the starting state of this problem is also presented to help you think about the solution. Read the problem statement and solve the problem.

Now that you have solved this problem, move on to a small variant of it called the mutilated checkerboard problem. Mutilated Checkerboard Problem
http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/CBoard1.html11/11/2005 1:02:14 PM

Mutilated Checkerboard Problem

Problem Solving and Problem Representation


Below is presented what is referred to as the mutilated checkerboard problem. An illustration of the starting state of this problem is also presented to help you think about the solution. Read the problem statement and solve the problem.

Were you able to solve this problem? If so, did it take a while to get on track? Now move on and try the matchmaker problem. Matchmaker Problem
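If you want to check the standard impossibility argument after trying the problem yourself, it is a parity count, assuming the usual form of the puzzle (two diagonally opposite corner squares removed, to be covered by dominoes that each span two adjacent squares). The sketch below just counts colors: opposite corners share a color, every domino must cover one square of each color, and after the mutilation the counts no longer match.

```python
# Color the 8x8 board the usual way: (row + col) even = black, odd = white.
squares = [(r, c) for r in range(8) for c in range(8)]
removed = {(0, 0), (7, 7)}          # two diagonally opposite corners: same color
remaining = [sq for sq in squares if sq not in removed]

black = sum(1 for r, c in remaining if (r + c) % 2 == 0)
white = sum(1 for r, c in remaining if (r + c) % 2 == 1)

# Every domino covers exactly one black and one white square, so a perfect
# cover of 31 dominoes would require black == white.
print(black, white)   # 30 32: no cover is possible
```

The difficulty of the problem lies in finding this representation: once the board is seen in terms of colors rather than squares, the answer is immediate.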
http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/CBoard2.html11/11/2005 1:02:21 PM

MatchMaker Problem

Problem Solving and Problem Representation


Below is presented what is referred to as the Matchmaker Problem. Read the problem statement and solve the problem.

You should have found this very easy to solve. And you probably noticed that there is an abstract resemblance between this problem and the mutilated checkerboard problem. Hopefully, your experience with these problems has alerted you to the importance of how a problem is thought about, that is, how a problem is represented.
http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/CBoard3.html11/11/2005 1:02:33 PM

Terms, Concepts, Questions

Some Terms, Concepts and Questions

q Productive Thinking
q Reproductive Thinking
q What are some of the reasons for proposing this distinction?
q What determines these choices?
q Problem Representation
q Alternatives and Choice
q Shift or Change of Representation
q What is the relation between Choice and Bias?
q What makes a problem difficult?
q Where do problem decompositions come from?

Problem Decomposition

q Problem Representation
q Problem Constraints
q Subproblem Dependencies
q Dependence
q Independence
q Consistency

Constraint Propagation

How is knowledge incorporated into this procedure?

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/GestaltT_C.html11/11/2005 1:02:34 PM

Traffic Lights - Possibilities and Dependencies



Illustrated below is a "traffic light"...a vertical arrangement of a red, an amber, and a green light, each of which can be either on or off. We will use the idea of traffic lights at an intersection to illustrate the relation between the concepts of possibilities, dependencies, and constraints. In addition to the basic traffic light, the illustration below lists two sets of possibilities. One is termed the logical possibilities (or, in this case, you could think of it as the physically logical possibilities). There are 8 logical possibilities, ranging from the case where all three lights are on at the same time to the case where all three lights are off at the same time. Note that each object - one of the lights - can be in one of two states - on or off. Since there are three lights, the number of logical possibilities is the number of states raised to the power of the number of objects - in this case, 2 to the 3rd, or 8. In the darker gray box is the set of what I have termed 'designed possibilities.' This is the set of possibilities that any well-behaved traffic light will exhibit. And there are only three such possibilities: the cases where one of the three lights is on and the remaining two are off. Finally, the pointers to the circles are meant to convey the obvious but important fact that the designed possibilities are a subset of the logical possibilities.

Keeping these ideas in mind, pretend that for some reason you can't see a traffic light, but a companion traveling with you can. Further pretend that when you ask your companion to tell you the state of the traffic light, your companion says that it is red. Notice that if you assume that this is a well-behaved traffic light, then you know all you need to know, because the only designed possibility in which the red light is on is one where the green and amber lights are off. You can "infer" this because this dependency between the on/off values of the three lights will always hold for well-behaved traffic lights.

Whenever the value of one thing depends on the value of another, we say that there is a dependency between the two. In some cases the dependency may be such that knowing one value uniquely determines another. That is the case here...knowing that the red light is on uniquely determines the values of the remaining lights. However, knowing that the amber light is off doesn't allow us to infer the value of the green light or the value of the red light. It does, however, let us infer that one of these lights is on and one is off. Any time there is a dependency among the values that different entities can take, we can potentially take advantage of it to reason about those entities. Working crossword puzzles is one long exercise in attempting to use dependencies to reason to a unique set of values for each square in the puzzle.

Now let us gain some experience in explicitly thinking about possibilities and dependencies. Consider the standard traffic intersection with a light at each of the four corners. Work out the number of logically possible states that this four-light configuration can take on. Next, work out the 'designed' possibilities and write out the inferences that can be derived when these dependencies hold. Then work out a system that functions in the same manner but uses only two lights...a red and a green light on each "traffic light." Finally, decide whether you could reduce the number of lights to just one...say, a green light on each "traffic light."
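The counting in this exercise is easy to check mechanically. The sketch below (an illustrative Python snippet, not part of the original page) enumerates the 2 to the 3rd logical states of one light, picks out the three designed states, and verifies the inferences discussed above: "red is on" pins down the whole state, while "amber is off" leaves two designed possibilities open.

```python
from itertools import product

LIGHTS = ("red", "amber", "green")

# Logical possibilities: each light independently on (True) or off (False).
logical = [dict(zip(LIGHTS, states)) for states in product([False, True], repeat=3)]

# Designed possibilities: exactly one light on at a time.
designed = [state for state in logical if sum(state.values()) == 1]

print(len(logical), len(designed))   # 8 3

# Hearing "it is red" uniquely determines the state of a well-behaved light.
red_states = [s for s in designed if s["red"]]
assert len(red_states) == 1
assert not red_states[0]["amber"] and not red_states[0]["green"]

# Hearing "amber is off" leaves two designed possibilities: red on, or green on.
amber_off = [s for s in designed if not s["amber"]]
print(len(amber_off))                # 2
```

The same enumeration, with `repeat=3` replaced by the appropriate count, answers the four-corner intersection exercise: four three-state lights give 2 to the 12th logical states, while the designed dependencies collapse that set to a handful.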
http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/TrafficLights.html11/11/2005 1:02:43 PM

DONALD + GERALD = ROBERT

The Cryptarithmetic Problem


Below is an example of a cryptarithmetic problem; the same problem is given in your text. Solve this problem and try to write down as detailed a record as possible of your thinking and decisions as you attempt to solve it.

If you were unable to solve the problem, try to determine why. If you solved it, think back over your problem solving efforts and try to identify points where you ran into particular difficulties.

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/CryptProblem.html11/11/2005 1:02:45 PM

Matchstick Problems - Table of Contents

MatchStick Problems
q Four Squares
q Four Squares Solutions
q 40 Squares Problems
q Large and Small Matches Problems

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Matchstick_contents.html11/11/2005 1:02:52 PM

Table of Contents 305

Go to Course Syllabus

Table of Contents
Part I. Historical Perspective and Basic Approaches to the Study of Thinking
q Introduction
q Associationism and Behaviorism
q Productive Thinking...the Gestalt Emphasis
q Experimental Decomposition of Thinking
q Computational Approach to the Study of Thinking
q Cognitive Development and Learnability

Part II. Aspects of Thinking/Cognition


q Deduction
q Induction, Concepts, and Reasoning under Uncertainty
q Understanding, Interpreting and Remembering Events
q Problem Solving and Planning

Home Course Materials Page


http://www.rci.rutgers.edu/~cfs/305_html/contents.html11/11/2005 1:02:54 PM

Introduction Contents

Introduction Lecture Material


q Some Quotes
q The Mind and Formal Systems
  r Some Examples
q Structure and Randomness Discussion
  r Media and Memory
  r Structure Experiment

Terms, Concepts, and Questions


q Terms, Concepts, and Questions

Assignments/Exercises
q Exercise/Assignment ...What Kind of Mind..
q Exercise/Assignment ...What does the Mind See? Some views...
q Exercise/Assignment ...Looking for Structure in Recall
q Exercise/Assignment ...What is Thinking and what isn't
q Exercise/Assignment ...What kind of memory is a photograph?

http://www.rci.rutgers.edu/~cfs/305_html/Intro/Introduction.html11/11/2005 1:03:33 PM

Syllabus 305

Syllabus 830:305 Cognition - Section 1


Fall, 2005

Text: Mayer, R. E. Thinking, Problem Solving, Cognition. Second Edition. New York: W. H. Freeman, 1992.
Place: Room PH 115, Busch Campus
Time: Monday & Thursday 1st Period (8:40 - 10:00 A.M.)
Instructor: Prof. Charles Schmidt
Office: Room 135A, Psychology Bldg, Busch Campus
Phone: 732 445-2874
Email: cfs@rci.rutgers.edu
Course URL: http://www.rci.rutgers.edu/~cfs
Office Hours: Monday, 12:00 - 1:00 PM or by appointment
T.A.:

Click here for PDF version of the Syllabus

Course Outline

Part I. Historical Perspectives and Basic Approaches to the Study of Thinking

Introduction
Reading: Chapter 1. Beginnings. pp. 3-18.
Website: Introduction


Associationism and Behaviorism
Reading: Chapter 2. Associationism: Thinking as Learning by Reinforcement. pp. 19-38.
Website: Associationism and Behaviorism

Productive Thinking...the Gestalt Emphasis
Reading: Chapter 3. Gestalt: Thinking as Restructuring Problems. pp. 39-78.
Website: Productive Thinking...the Gestalt Emphasis

Experimental Decomposition of Thinking
Reading: Chapter 7. Mental Chronometry: Thinking as a Series of Mental Operations. pp. 203-224.
Website: Experimental Decomposition of Thinking

EXAM 1

Computational Approach to the Study of Thinking
Reading: Chapter 6. Computer Simulation: Thinking as a Search for a Solution Path. pp. 167-202.
Website: Computational Approach to the Study of Thinking

Cognitive Development and Learnability
Reading: Chapter 10. Cognitive Development: Thinking as Influenced by Growth. pp. 283-323.
Website:

Part II. Aspects of Thinking


Deduction
Reading: Chapter 5. Deductive Reasoning: Thinking as Logically Drawing Conclusions. pp. 283-323.
Website: Deduction

EXAM 2

Induction, Concepts, and Reasoning under Uncertainty
Reading: Chapter 4. Inductive Reasoning: Thinking as Hypothesis Testing. pp. 81-113. Chapter 9. Question Answering: Thinking as a Search of Semantic Memory. pp. 259-279.
Website: Induction, Concepts, and Reasoning under Uncertainty

Understanding, Interpreting and Remembering Events
Reading: Chapter 8. Schema Theory: Thinking as an Effort after Meaning. pp. 225-258.
Website: Understanding, Interpreting and Remembering Events

Problem Solving and Planning
Reading: Chapter 14. Analogical Reasoning: Thinking as Based on Analogs, Models and Examples. pp. 415-454. Chapter 13. Expert Problem Solving: Thinking as Influenced by Experience. pp. 387-414. Chapter 15. Mathematical Problem Solving: Thinking as Based on Domain-Specific Knowledge. pp. 455-489. Chapter 16. Everyday Thinking: Thinking as Based on Social Contexts. pp. 490-507.
Website: Problem Solving and Planning

Final Exam (Dec 17, 8-11 AM)

Homework, Grading, Etc.


Exams

The exams will cover the material presented in the text, lectures, and the website.

Assignments

The website will often include, for a section, some assignments, exercises or questions to be considered. These activities are primarily intended to focus your thinking about the course material being covered. In many cases there may be no obviously correct answer. In other instances, the primary purpose of the exercise is to help you reflect upon your own thinking and performance when doing a cognitive task. The assignments may be discussed in class.

Class Participation

The lecture material for much of this course is provided on the website. The purpose of this is not to relieve you of the onerous task of attending class. The purpose is to:

q Allow you to read over the material before it will be presented and discussed in class, so that you can determine which aspects of the material you may not understand;
q Relieve you of the necessity of taking extensive notes in class, giving you the freedom to follow the lecture and discussion actively and critically;
q Relieve me from covering the material in detail, allowing me to emphasize the main points of the material as well as add additional information;
q Provide additional time and opportunity for questions and discussion in class.

Finally, you will notice that the material on the website often includes examples of the ideas under discussion. It is important for you not only to work through the examples, but also to make sure that you understand the basic idea or concept that the example is being used to illustrate. The examples will, at times, include a great deal of detail, since this can easily be included in the web pages. The detail is there to help you develop your intuitions concerning the ideas presented. It is not presented as something that you are expected to be able to reproduce.

If at all possible, I suggest that you at least skim the material on the website prior to the class in which it will be presented. If there are aspects that you do not understand but are reluctant to ask about in class, you might let me know via Email (cfs@rci.rutgers.edu) prior to the class in which the material will be presented. I will make it a point to read this directory prior to class.

Extra Credit

If you wish to do an extra credit project for the course, the project should be approved no later than the third week in October and turned in to the instructor no later than Monday, Dec. 12. Some possible extra credit projects include:
q creating content related to the course that could potentially be included in the course website. This might involve creating examples or experiments related to the course material, extending the depth with which a topic is covered, or adding additional related topics.
q creating additional tools for use of the website.
q participating in some research on problem solving. In this case, you would analyze and discuss your own data in relation to various ideas about human problem solving.
q ...

Course Grade

Your course grade will mainly be determined by your performance on the exams. In addition to the exams, your participation in class - questions formulated, discussion, and assignments - may also contribute to the determination of your grade for the course. And, of course, any extra credit work will also be considered when assigning the final course grade.

Printing the pages on this site


If for some reason you decide that you wish to print one or more of these pages, be sure that the print setup is in landscape mode. Note, however, that these HTML pages have not been constrained to any particular vertical limit. Consequently, a page may print onto several pages, and the page breaks may occur at arbitrary points. If at all possible, I recommend using these pages online rather than printing them, since they were developed under the assumption that this would be the primary mode of use. Using them online will allow you to view the animations and JavaScript-related features, as well as the most recent updates of the pages.
http://www.rci.rutgers.edu/~cfs/305_html/syllabus.html (5 of 5)11/11/2005 1:03:54 PM

Associationism and Behaviorism

Associationism and Behaviorism Lecture Material


q Some Quotes
q Some S-R Theory

Terms, Concepts, and Questions


q Terms, Concepts, and Questions

Assignments/Exercises
q Anagram Exercise/Assignment
  r Anagram Critique

http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/Assoc_Behav.html11/11/2005 1:04:01 PM

Mental Chronometry

Experimental Decomposition of Mental Processes


Lecture Material
q Mental Olympics
q Scanning Short Term Memory
q Additive Factors and Analysis of Variance

Terms, Concepts, and Questions


q Some Terms, Concepts and Questions

Assignments/Exercises
q What Might be Predicted ....?

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MChron.html11/11/2005 1:04:09 PM

TOC: Computational Approach

Computational Approach to the Study of Thinking


Lecture Material
q Following Instructions
q Machines/Automata
q Levels Hypothesis
q Search / Generate and Test
q Search Control Strategies
q Search Control Animations
q Problem Reduction Search
q Production Rule Example

Terms, Concepts, and Questions


q Terms, Concepts, and Questions

Assignments/Exercises
q Computation and Computers as Physical Devices
q Computing and Thinking

http://www.rci.rutgers.edu/~cfs/305_html/Computation/comptoc.html11/11/2005 1:04:12 PM

Deduction

Deduction
Lecture Material
q Propositional Logic: Some Intuitive Ideas
q Reasoning in the Syntax and in the Semantics; An Example
q Contradiction and Proof
q Euler Diagrams and Quantified Expressions
q Defeasible Inference: Inheritance
q Deduction Overview

Reference Material
q Truth Tables
q Some Logical Identities
q Some Logical Implications
q Liars Paradox
q Some Definitions for First Order Logic
q Some Rules for Quantifiers
q Some Definitions of Terms used in the Study of Formal Systems
q Resolution Theorem Proving

Terms, Concepts, and Questions


q Some Terms, Concepts and Questions

Assignments/Exercises
q Describing Things

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/deduction_toc.html11/11/2005 1:04:14 PM

Induction, Concepts, and Reasoning under Uncertainty

Induction, Concepts, and Reasoning under Uncertainty*


Lecture Material
q Introduction
q Knowing a Concept and Concept Indistinguishability
q Example of Concept Identification Task
  r Possible Hypotheses after One Example
q Structuring the Hypotheses Space and Hypothesis Revision
  r Version Space and Learning the Concept of an Arch
q Example of Inducing Rules on Patterns
q Standard Probability Axioms and Beliefs
q Belief Revision/Bayes Rule
q The Grue Property

Reference Material
q Algebra of Sets and Definition of a Lattice
q Graph of a 3 Element Lattice
q Dutch Book
q http://www.pbs.org/wgbh/pages/frontline/shows/gamble/

Terms, Concepts, and Questions


q Some Terms, Concepts and Questions

Assignments/Exercises

http://www.rci.rutgers.edu/~cfs/305_html/Induction/induction_toc.html11/11/2005 1:04:16 PM

Understanding, Interpreting and Remembering Events

Understanding, Interpreting and Remembering Events


Lecture Material
q Properties of Language
q Syntax and Sentence Understanding
q Parsing Sentences
q NonDeterminism and Parsing
q CaseGrammar
q Example of Representing Textual Information
q Example of Story Interpretation

Terms, Concepts, and Questions


q Some Terms, Concepts and Questions

Assignments/Exercises

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/Understanding_toc.html11/11/2005 1:04:20 PM


Problem Solving and Planning


Lecture Material
q River Crossing Problems
  r Missionary and Cannibals
  r Jealous Husbands
  r Solution Spaces
q Tower of Hanoi
  r Tower of Hanoi Story
  r Tower of Hanoi 3 Disk Solution
  r The Tower of Hanoi 3 - Disk Space
  r Tower of Hanoi Problem Decomposition
  r n Disks and m Pegs
q "Tower of Hanoi" Example Problem
  r A Problem from Tower of Hanoi Space
  r Monster Problems
  r Related Spaces

Terms, Concepts, and Questions


Terms, Concepts, Questions - Problem Solving

Assignments/Exercises

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/ProblemSolving_toc.html11/11/2005 1:04:22 PM

Course Materials for Courses Taught By Prof. Charles F. Schmidt

Syllabi and Course Materials for:


Fall, 2005  830:472  Computation and Cognition
Fall, 2005  830:305  Cognition

These materials are under active development. They are intended for use by students at Rutgers taking one of the above courses with Prof. Schmidt. Consequently, no explicit attempt has been made to ensure that these materials form a coherent presentation for someone who is not taking the associated course from this instructor.

As you might expect, there is considerable overlap in the material covered in these courses. The course 830:305 Cognition is a general introduction to the study of human cognition, whereas 830:472 focuses on the study of cognition (by human or machine) as a kind of computation. Consequently, 472 treats the relation between computation and cognition more rigorously, deeply and technically than 305 does. In general, the materials for each course are different, although in some cases the same material is used in each site. Where there is a corresponding section in each course, you may find it helpful to consult that section in the course that you are not taking. I have tried to keep the presentation of the materials in the 305 course at as simple and intuitive a level as the material allows. Consequently, if you are a student in 472 and are having difficulty with the material in that course site, you might consult the corresponding section in the 305 site. Or, if you are bored to tears with the simplicity of the material presented in the 305 site, you might see if there is a corresponding section in the 472 site.

These materials are constantly under development, and some sections are incomplete. Inquiries, comments, corrections, etc. concerning these materials may be sent to: Prof. Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/index.html11/11/2005 1:04:30 PM

Quotes

Some Quotes Exemplifying the Rationalist and Empiricist Positions *


Here are some quotes that highlight the rationalist and empiricist views of the human mind. Leibniz and Boole illustrate the rationalist position, where the mind is regarded as an entity that, like a mathematical system, follows rules that are unique to itself and, in a sense, independent of the external world. Locke and Hume, speaking from the empiricist position, do not see the mind as independent, but as derivative from the world of experience. During the first half of the 20th century, the empiricist position dominated the thinking and research on human reasoning in the United States. During the second half of the century, the argument against this position intensified, and the rationalist position has achieved increasing dominance. Much of the impetus for the rationalist position has arisen from the mathematics associated with defining and studying computation, as well as from the experience of using computation in everyday life. (*The quotes below provide only a glimpse of these individuals' thoughts and works. Where possible, I have provided a link to an online version of the original text, or to a related writing, in case you wish to pursue their ideas further.)

Gottfried Wilhelm von Leibniz (1646-1716), On reasoning. 1677


"All our reasoning is nothing but the joining and substituting of characters, whether these characters be words or symbols or pictures, ... if we could find characters or signs appropriate for expressing all our thoughts as definitely and as exactly as arithmetic expresses numbers or geometric analysis expresses lines, we could in all subjects in so far as they are amenable to reasoning accomplish what is done in Arithmetic and Geometry. For all inquiries which depend on reasoning would be performed by the transposition of characters and by a kind of calculus, which would immediately facilitate the discovery of beautiful results ..."

George Boole (1815 - 1864), An Investigation of the Laws of Thought on which are Founded the Mathematical Theories of Logic and Probabilities (London, 1854);
"Nature and Design of this Work The design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic and construct its method; to make that method itself the basis of a general method for the application of the mathematical doctrine of Probabilities; and, finally, to collect from the various elements of truth brought to view in the course of these inquiries some probable intimations concerning the nature and constitution of the human mind." The Calculus of Logic by George Boole, first published in The Cambridge and Dublin Mathematical Journal, vol. 3 (1848)

http://www.rci.rutgers.edu/~cfs/305_html/Intro/Quotes.html (1 of 2)11/11/2005 1:14:23 PM


John Locke (1632-1704) on Empiricism from Essay concerning human understanding (1690):
"2. All ideas come from sensation or reflection. Let us then suppose the mind to be, as we say, white paper, void of all characters, without any ideas:- How comes it to be furnished? Whence comes it by that vast store which the busy and boundless fancy of man has painted on it with an almost endless variety? Whence has it all the materials of reason and knowledge? To this I answer, in one word, from EXPERIENCE. In that all our knowledge is founded; and from that it ultimately derives itself. Our observation employed either, about external sensible objects, or about the internal operations of our minds perceived and reflected on by ourselves, is that which supplies our understandings with all the materials of thinking. These two are the fountains of knowledge, from whence all the ideas we have, or can naturally have, do spring."

David Hume (1711-1776) from An Enquiry Concerning Human Understanding (1777 edition)
"Nothing, at first view, may seem more unbounded than the thought of man, which not only escapes all human power and authority, but is not even restrained within the limits of nature and reality. To form monsters, and join incongruous shapes and appearances, costs the imagination no more trouble than to conceive the most natural and familiar objects. And while the body is confined to one planet, along which it creeps with pain and difficulty; the thought can in an instant transport us into the most distant regions of the universe; or even beyond the universe, into the unbounded chaos, where nature is supposed to lie in total confusion. What never was seen, or heard of, may yet be conceived; nor is any thing beyond the power of thought, except what implies an absolute contradiction. But though our thought seems to possess this unbounded liberty, we shall find, upon a nearer examination, that it is really confined within very narrow limits, and that all this creative power of the mind amounts to no more than the faculty of compounding, transposing, augmenting, or diminishing the materials afforded us by the senses and experience. When we think of a golden mountain, we only join two consistent ideas, gold, and mountain, with which we were formerly acquainted. A virtuous horse we can conceive; because, from our own feeling, we can conceive virtue; and this we may unite to the figure and shape of a horse, which is an animal familiar to us. In short, all the materials of thinking are derived either from our outward or inward sentiment: The mixture and composition of these belongs alone to the mind and will. Or, to express myself in philosophical language, all our ideas or more feeble perceptions are copies of our impressions or more lively ones. To prove this, the two following arguments will, I hope, be sufficient. 
First, when we analyse our thoughts or ideas, however compounded or sublime, we always find, that they resolve themselves into such simple ideas as were copied from a precedent feeling or sentiment. Even those ideas, which, at first view, seem the most wide of this origin, are found, upon a nearer scrutiny, to be derived from it. The idea of God, as meaning an infinitely intelligent, wise, and good Being, arises from reflecting on the operations of our own mind, and augmenting, without limit, those qualities of goodness and wisdom. We may prosecute this enquiry to what length we please; where we shall always find, that every idea which we examine is copied from a similar impression."

Introduction
Charles F. Schmidt


Formal Systems and a Language of Thought



"All our reasoning is nothing but the joining and substituting of characters, whether these characters be words or symbols or pictures, ... if we could find characters or signs appropriate for expressing all our thoughts as definitely and as exactly as arithmetic expresses numbers or geometric analysis expresses lines, we could in all subjects in so far as they are amenable to reasoning accomplish what is done in Arithmetic and Geometry." Leibniz (1677)

Judging from the quote above, it appears that Leibniz was quite certain that thinking of the mind as a formal system is a useful way to view reasoning. The explicit idea of a formal system is pretty much an intellectual product of the 20th century, but Leibniz uses areas of mathematics as examples of what he has in mind, and these areas certainly qualify as examples of what we would now refer to as a formal system.

The interest in the idea of a formal system arises from the intuition that there is a kind of "language of thought." Indeed, a naive assumption is that the language of thought is determined by the natural language that we have learned as a child, and it is certainly hard to escape the intuition that the language we speak is intimately related to our thought. But we needn't resolve this issue now, because in the 20th century the idea of a formal language or system has been well-defined. In one sense, this is simply an abstraction of the idea of a natural language, and as such it provides a clear presentation of some of the basic properties of a "language." We could be very careful and exact in defining the class of things that we call formal systems, but at this point we just want to get the idea out there so you can use it to help think about the issues that were, and still are, argued about by people who study human reasoning.

A formal system consists first of all of a set of things; usually we think of this set of things as a set of symbols. 
A symbol is something that someone "dreams up," as opposed to something that nature provides on its own. To capture this idea, it is often said that symbols are 'arbitrary.' For example, the letter 'A' is not a phenomenon of nature: someone decided to adopt a set of conventions to make this form and treat it as the letter 'A'. And a symbol doesn't automatically refer to anything other than itself; you and I had to learn that the letter 'A' could be used to refer to the sound 'A'. Another property of symbols is that we usually try (or are taught to try) to make the symbols unambiguous: if you are writing an 'A' you try to write it in such a way that it won't be confused with any other symbol in the set of symbols you are using.

The numbers used in mathematics, the letters used to write down a natural language, and the notes used to write down music are just some of the familiar examples of differing sets of symbols. Notice that the letters of our alphabet and the notes used in music are each finite in number; there are 26 letters in the English alphabet. But we can use these finite sets to create sets of things, expressions, and the set of possible expressions is not really bounded in size: the set is infinite. In a technical sense, there are an infinite number of sentences in the English language, and we can use the alphabet to express each of these. A similar claim could be made about the number of musical expressions.

In the picture on the left, I have used three different sets of symbols: integers {3,2,5...}, a 'stick' {|}, and the English alphabet {B,a,n,...}. In each of the gray boxes I have grouped these symbols in particular ways to serve as examples for this discussion. First, note that I added some symbols, for example + and =, as well as a blank space and a period (.) in the case of the sentences. These symbols seem to be a bit different, and they are. 
Recall that we have only a finite number of symbols but we want to be able to create an infinite set of things that we call expressions from this finite set. Well, the only way in which
http://www.rci.rutgers.edu/~cfs/305_html/Intro/Ideas_Syntax.html (1 of 3)11/11/2005 1:14:30 PM


to obtain an infinite set of expressions from a finite vocabulary is to define ways in which to compose expressions from the elements of the vocabulary. The symbols +, =, and the blank space in the case of sentences are used to represent a composition of elements of a set. For example: 2, 3, 2+3, 2+3+2, 2+3+2+3, and so on.

But as soon as we allow ourselves to string elements of our set together, we need to introduce the idea of following rules for stringing them together. We call these rules syntactic rules: they are rules that define the way in which we form expressions using our basic set of symbols. In the figure above, the first two sentences in the lower box are syntactically correct. The last sentence, shown in red, is not syntactically correct. A more general term that is often used for this distinction is to say that syntactically correct expressions are well-formed expressions or formulae.

Now we can create an infinite set of expressions from a finite set of elements. Can there be more? Well, yes. We would like to be able to say something more about these expressions; more specifically, we would like to be able to say something about possible relations between elements of these expressions. Note that I have exemplified the "commutative law" in the equations in the upper right. 
Now, if the commutative law holds, then if we have the expression '2 + 3 = 5', we can infer or derive the expression '3 + 2 = 5' using the commutative axiom. This represents a rule of inference, and rules of inference are another component of a formal system. I used a similar type of rule with the "Bacon and Eggs" phrase to derive the lower sentence from the first. Note that the rule of inference says something about how to modify one expression to yield another, and technically, that is all it says.

This last point is important: formal systems are also often called syntactic systems to contrast them with systems where the expressions are intended to refer to something outside the system, that is, to have an associated semantics. Now, this can get really tricky, but the intuitions are familiar. "Bacon and Eggs have high Cholesterol." is simply a well-formed expression and nothing more from the syntactic point of view. But, of course, these words refer to something outside the syntactic system, and in addition to being syntactically correct, the sentence may be semantically correct: it may make a true statement about the things that the word 'Bacon' and the word 'Eggs' and the word 'Cholesterol' refer to in the world.
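To see that such a rule really is purely syntactic, here is a minimal sketch in Python (my own illustration, not part of the figure): it derives 'b + a = c' from 'a + b = c' by pattern-matching on the shape of the string alone, without ever evaluating the arithmetic. The function name and the single-space expression format are just illustrative choices.

```python
import re

def commute(expr):
    """Apply the commutative rule syntactically:
    from 'a + b = c' derive 'b + a = c'.
    The rule inspects only the shape of the string."""
    m = re.fullmatch(r"(\S+) \+ (\S+) = (\S+)", expr)
    if m is None:
        return None  # the rule does not apply to this expression
    a, b, c = m.groups()
    return f"{b} + {a} = {c}"

print(commute("2 + 3 = 5"))  # → 3 + 2 = 5
```

Note that the function never checks whether the equation is true; a false but well-formed input such as '2 + 2 = 5' is rewritten just as readily.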



So how does this help us think about the mind? Well, perhaps the mind has a finite vocabulary of "basic ideas"; perhaps it has a finite set of ways of composing these ideas into well-formed expressions (complex ideas); and perhaps it has a set of syntactically defined rules of inference. Perhaps, then, there is a sense in which we have an infinite set of ideas (how do we fit them into our brain then?). And perhaps the mind can imagine syntactic expressions that are false, or that describe a completely imaginary world such as Alice's Wonderland. Could this be possible without a language of thought? This idea that the mind possesses a "language of thought" in this formal sense is, more or less, the rationalist position. This stands in contrast to the empiricist position that relies on the "world outside our mind" to populate our mind with ideas.



Some Examples

This page provides some examples that may help you to understand the idea of a combinatoric system and that of a formal system. A formal system consists of:
- A finite set of symbols, the vocabulary.
- A set of expressions that are formed from the vocabulary. This set of expressions is a subset of the set of all possible strings over the vocabulary.
- Rules of inference.

For example, if the vocabulary is the set of words: { Jim, is, old }

then the set of possible expressions that could be formed using concatenation (i.e., putting one element after another) to combine the elements of this vocabulary is: {Jim, is, old, Jim Jim, Jim is, Jim old, is Jim, is is, is old, old Jim, old is, old old, Jim Jim Jim, Jim is Jim, Jim old Jim, is Jim Jim, is is Jim, is old Jim, old Jim Jim, old is Jim, old old Jim, Jim Jim is, Jim is is, Jim old is, is Jim is, is is is, is old is, old Jim is, old is is, old old is, Jim Jim old, Jim is old, Jim old old, is Jim old, is is old, is old old, old Jim old, old is old, old old old

...} The set of possible expressions is the set of all combinations of the elements of the vocabulary of length n, where n = 1, 2, 3, 4, ... We have explicitly shown the sets of possible expressions of length 1, 2 and 3 for this three-element vocabulary. There are 3 expressions of length 1, 9 of length 2, and 27 of length 3. In general, if the vocabulary has n elements, then the total number of possible expressions of length j or less is given by the sum of n to the ith power for i = 1 to j. Thus, for this example, the total would be 39 for j = 3, 120 for j = 4, 363 for j = 5, 1092 for j = 6, and so on. This is an example of what is called a combinatoric system: concatenation provides a basis for generating the set of all of the possible combinations that can be obtained from a base vocabulary. However, in a formal system not all possible expressions are legal or syntactically correct expressions. Only the expressions in bold
http://www.rci.rutgers.edu/~cfs/305_html/Intro/ExampleFS.html (1 of 4)11/11/2005 1:14:34 PM


above would be syntactically correct if this formal system used the syntax of our natural language.

In order to illustrate the notion of inference in a formal system, I have provided a different example. Here the vocabulary is { |, +, = }, where | is a 'stick' which I am interpreting as a mark denoting 1. This vocabulary can then be thought of as one that can be used to describe ordinary addition. In this case, I have computed all of the possible expressions of length 5 or less. The items in bold face represent those that are syntactically correct expressions. (I may have missed some!) Those that are in bold and in italics and underlined are expressions that, while syntactically correct, would violate the rules of addition. If we were to generate more of the possible expressions we would find the expressions | + | = | | and | | = | + |. This is an example of a pattern which could be used to define a rule of inference; namely, whenever you encounter the expression on the left you may infer or derive the expression on the right. This rule of inference is valid in addition because the operation + is commutative. Note that this rule of inference can be stated by simply making reference to the syntax (or "shape") of the statements. We don't have to "understand" addition.

{ |, +, =, | |, | +, | =, + |, + +, + =, = |, = +, = =, | | |, | | +, | | =, | + |, ..., = = = = = }

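The enumeration and the counts above can be checked mechanically. The following short Python sketch (my own illustration, using the three-word vocabulary from the first example) generates all concatenations up to length 3 and verifies the totals:

```python
from itertools import product

vocab = ["Jim", "is", "old"]

# All concatenations of length 1 through 3 over the vocabulary.
expressions = [" ".join(p)
               for n in (1, 2, 3)
               for p in product(vocab, repeat=n)]
print(len(expressions))  # 3 + 9 + 27 = 39

def total(n, j):
    """Number of possible expressions of length j or less
    over an n-element vocabulary: n + n**2 + ... + n**j."""
    return sum(n**i for i in range(1, j + 1))

print(total(3, 4), total(3, 5), total(3, 6))  # 120 363 1092
```

Only one of the 39 strings, "Jim is old", is a well-formed English sentence; the combinatoric system generates the rest with equal indifference.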


Structure

Structure and Randomness


We have already mentioned that one of the basic questions in the investigation of human cognition is whether our thoughts are typically highly structured and, if so, how this structure comes about. The question as posed presupposes that we know what we mean when we say that something is structured or unstructured. Actually, we can't really talk about structure in a careful way until we have adopted some sort of mathematical framework. For example, we could talk about the set of names of persons in this class, and we could contrast this with the names of persons in this class that are alphabetically ordered. A set is an unordered collection of things. If there are 100 persons in this class, then there are 100! (recall that '!' is read as factorial and n! is n x (n-1) x ... x 1) possible ways to order the list of names. Assuming that no two persons have the same name, there is only one way in which to alphabetically order the list of names. Consequently, we might decide that a distinguishing feature of structure is that it is rare relative to the set of possibilities. But there is a problem here, because I could pick out any one of the 100! orderings and it would, in this sense, be just as rare as the alphabetic case. So that doesn't quite work.
(To give you some idea of the magnitude of these numbers, 20! = 2,432,902,008,176,640,000. 100! is much larger.)
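To get a feel for how quickly these counts grow, a couple of lines of Python (my own check, not part of the original page) make the point:

```python
from math import factorial

print(factorial(20))             # 2432902008176640000, the value quoted above
print(len(str(factorial(100))))  # 100! runs to 158 digits
```

Even listing one ordering per nanosecond, no computer could enumerate all 100! orderings; rarity alone is cheap, which is exactly the problem raised above.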

Another tack is to simply consider how some collection of stuff was generated. The two "pictures" (7 x 7 cells filled with the colors red, blue, green and yellow) below were generated randomly. That is, I used a random number generator to assign a color to each cell. This procedure can yield an enormous number of different 7 x 7 pictures. For example, if there were only two cells instead of 49, then each of these cells could have any of the 4 colors; thus, there are 4 x 4, or 16, different two-cell "pictures". For the 49-cell case you need to raise 4 to the 49th power. Since there are 7 cells in a row and each of these could be colored in 4 ways, there are 16,384 different ways in which to color a row. And there are 7 rows! And, if you are still curious, there are 316,912,650,057,057,350,374,175,801,344 ways to color the total 7 x 7 square.
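The multiplication behind these counts can be spelled out in a few lines of Python (my own arithmetic check):

```python
# Each cell is colored independently from 4 choices, so counts multiply.
colors, row_cells, grid_cells = 4, 7, 49

print(colors ** 2)           # 16 different two-cell "pictures"
print(colors ** row_cells)   # 16384 ways to color one row
print(colors ** grid_cells)  # every possible 7 x 7 picture
```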

Two Randomly Generated 7 x 7


And now some that were not randomly generated, but were generated by following some rules that I thought up.

http://www.rci.rutgers.edu/~cfs/305_html/Intro/Structure.html (1 of 2)11/11/2005 1:14:38 PM


Two Non-Randomly Generated 7 x 7


Notice that these last two pictures could have been randomly generated. (I may be lying to you about following some rules.) But because we can see regularities, because it looks like the generator of these pictures was following rules, we think it unlikely that they were randomly generated. And this is analogous to the way cognitive psychologists reason in studying the mind: if they can describe the products of the mind as exhibiting regularities or following rules, then they assume that the thinking is a structured process. BUT minds seem to be able to create infinitely many products, and regularity is not always so obvious in these products. So the cognitive scientist who studies human reasoning will have to be very clever, and maybe a little lucky as well, to be able to unravel the way the mind works.

**One interesting and quite general way to describe structure is the following: Assume that we have some fixed programming language. Now, for each entity write a program that will generate the entity. The shorter the program the more structured the entity. This is sometimes referred to as Minimal Length Encoding. This and other related ideas are often referred to as Kolmogorov complexity.**
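A cheap way to get a feel for this idea is to use a general-purpose compressor as a rough stand-in for minimal length encoding: structured data admits a short description, while (pseudo)random data does not. The Python sketch below is my own illustration; zlib is only a crude approximation of the Kolmogorov ideal, and the particular strings are made up for the demonstration.

```python
import random
import zlib

structured = b"redbluegreenyellow" * 50  # rule-generated: repeat one pattern
random.seed(0)                           # fixed seed for reproducibility
noisy = bytes(random.getrandbits(8) for _ in range(len(structured)))

# The compressor finds the short description of the structured data
# but can do almost nothing with the noise.
print(len(structured), len(zlib.compress(structured)), len(zlib.compress(noisy)))
```

The repeated pattern shrinks to a few dozen bytes, while the noisy string stays close to its original 900 bytes, mirroring the intuition that the non-random 7 x 7 pictures have short generating rules.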



Structure and Generation

Media and Memory


Some Analogies
In the page containing quotes, there was a quote from John Locke that read, "Let us then suppose the mind to be, as we say, white paper, void of all characters, without any ideas; ..." An empty sheet of paper is used here as an analogy to help us think about the following question: What is the medium that serves as a basis for storing and remembering events or ideas? Compared to John Locke's day, we now have many different types of media. Paper, blackboards, photographic film, magnetic audio tape, CDs, random access memories (RAM), hard disks, optical disks, etc. are some of the media that we use to 'remember' or store information. These media differ from each other in many ways.

Consider first audio tape and photographic film. Both of these storage media are designed to be sensitive to the physical energy that comes in contact with the media. (See the page What kind of memory is a photograph? for a simple discussion of how black and white film records an image.) The physical energy interacts directly with the medium (the magnetic tape or the film), and as a result of the physical laws that govern this interaction an acoustic or visual "image" is laid down. The medium, the audio tape or film, has no ability to exert any control over this process of storing a "memory". The memory stored is solely a function of the physical energy and the sensitivity of the medium to the physical spectrum of energy.

Contrast this with the white paper of John Locke or a blackboard. It is true that a physical process must be used to create marks on the paper or blackboard. But unless it is a work of art, a drawing, the marks are not the point of the memory. The marks 'stand for' or represent a memory, and it really doesn't matter if the marks are black, yellow or green; thick or thin; angular or loopy; because the memory is not about the marks, but about what the marks represent. 
But let us return to the magnetic tape and film: memories that are laid down as a result of direct interaction with physical energy. Let us consider what is involved when magnetic tape is used to "remember" the acoustic events of, say, a live performance of music. We say that the physical energy is continuously valued. A line is typically used to convey the idea of something that is continuously valued. The figure to the right depicts a line. The idea is that the line is composed of an infinite number of points. If we divide the line in half, then there are still an infinite number of points in that half line, and so on.

How might we remember this line? One way would be to place a stick at the leftmost point of the line and then trim the stick at the rightmost point of the line. The stick would then be 'about' the same length as the line. We could never guarantee that it would be exactly the same length, but our stick would now constitute a pretty good memory for the length of the line. Note that the stick is not the line, just a medium used to remember something about the line. It won't remember the color or thickness of the line. But if we have a crayon, we could use the crayon to create something that resembles aspects of our remembered line: we simply take a piece of paper, place our stick on the paper, and trace along the length of the stick with our crayon. The crayon may not be the right color, or thickness, and we may have some tremor in our hands as we trace the line. Consequently, artifacts (aspects that weren't there in the original line) may be introduced in the process of "remembering."

Now, with this very simple example we have, I hope, demonstrated several features that seem to be characteristic of this attempt to store information about physical events or energies with these types of media; namely,
http://www.rci.rutgers.edu/~cfs/305_html/Intro/Structure_Generation.html (1 of 6)11/11/2005 1:14:55 PM

- the physical event that is 'stored' or 'remembered' is distinct from what is stored in the medium;
- what is stored is always partial information about the physical event;
- if the physical event is continuously valued, this value can only be approximated by the storage medium;
- the 'retrieval' of the memory for use in recreating the event may (and usually does) introduce artifacts.

In the case of audio tape, which stores the continuous physical energy that constitutes the sound, the magnetic medium is typically not capable of 'remembering' all of the physical information. Rather, it is designed primarily to record the physical energy in the acoustic range of the spectrum, and it is limited with respect to the accuracy with which the energies are recorded. The information recorded is said to be analog information. The way we used the stick to record and reproduce the remembered line illustrates what is meant by analog. Note that if we asked how long in meters the original line was, we couldn't answer that question with this stick memory. The stick "is" the length. And, because it directly reflects the length, we were able to use it as a basis for drawing the "remembered" line.

Contrast this with remembering the length of the line by "measuring" it. The figure to the right shows the line again, but now I have placed a "ruler" over this line. According to this ruler the line is 3 units long. As you can see, this isn't a very accurate measure because I have used rather coarse units. The next figure divides those units in half, giving us a more accurate approximation of the length of our line: it now appears to be 6 units long. But this is still an approximation, and we can do better if we refine our ruler even more. The next refinement yields an estimate of the line's length as 13 units, and the final refinement yields an estimate of 31 units. Using this last "ruler", the line's length could be "remembered" simply by remembering the number 31 and the unit length of this ruler. Now we can simply answer from this memory the question about the length of the line: it is 31 units long. But this memory, "31 units", can't reproduce the line directly; only indirectly. A procedure must be devised to use this information to recreate our remembered line.

Magnetic tape used to record sound is similar to our stick. 
It remembers physical aspects of the original events, and these physical aspects provide the basis for reproducing the original event. Of course, you need to amplify the signal and drive some quite complicated devices called speakers, which then reproduce the remembered sounds. And even if you spend a great deal of money, artifacts will be introduced when the events are recreated.

Is this a good analogy for the mind? Does the mind have a medium which physically interacts with physical energy to "remember" those physical events? The answer is probably yes, but a highly qualified yes. For example, the ear drum does vibrate as a consequence of the physical energy that strikes it. But it doesn't appear that we remember these vibrations. They are simply a first step, and this analog information is quickly changed to digital information. To the degree that we can remember something as complex as a musical performance, that memory is more similar to the way sound is "remembered" on a CD than to our magnetic audio tape. (Magnetic tape can, of course, be used as the medium to "remember" digital information, and in this case it would be similar to the CD format.) But exactly how does a "CD memory" differ from an "analog magnetic tape memory"?
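The ruler exercise above is, along one dimension, exactly what digital measurement does. A tiny Python sketch (my own illustration; the line length 3.9 is a made-up value, and the specific unit counts differ from those in the figure) shows how the count of whole units converges on the true length as the unit shrinks:

```python
def measure(length, unit):
    """Count how many whole units fit along the line: a digital
    approximation that improves as the unit gets finer."""
    return int(length / unit)

line = 3.9  # hypothetical length of the line, in arbitrary units
for unit in (1.0, 0.5, 0.25, 0.125):
    print(unit, measure(line, unit), measure(line, unit) * unit)
```

The remembered number is exact, but what it remembers is only an approximation; the residue below one unit is lost forever, which is the price of the digital code.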

http://www.rci.rutgers.edu/~cfs/305_html/Intro/Structure_Generation.html (2 of 6)11/11/2005 1:14:55 PM

Structure and Generation

The compact disk (CD) was probably the earliest use of the digital encoding of information (in this case sound) that appeared on the mass market. Think for a moment about a CD of a live performance that you might have purchased. When you purchase such a CD you are purchasing a "memory" for some events that occurred at a particular place over a particular period of time. The events, 'sounds,' are changes in physical energy in the acoustic spectrum. A memory for these complex events must involve some medium which can be altered in a way that allows the events to be reproduced. Recall that the physical energy that we recognize as sound is energy described by values that are said to be continuous. But the CD's "memory" is digital. Consequently, it is an approximation of the continuous information, and the digital format is, in some sense, an arbitrary code or language within which to remember acoustic events. It is arbitrary in the sense that other codes could have been chosen. Indeed, codes are constantly changing in consumer electronics, much to the bewilderment of the consumer.

Let us look a bit more closely at the nature of the code used in CDs. The format for audio CDs is 16 bit/44.1 kHz, stereo, and is known as the Red Book standard. The 44.1 kHz refers to the fact that each second of audio contains 44,100 samples. The 16 bit refers to the fact that this many bits are used to encode the information in each of these 44,100 samples. The figures to the right will help you to visualize this method of encoding acoustic energy. Each of these figures shows a sound recorded over a 6 second interval. (This is not a "CD-quality" encoding. The format is 8 bit/22 kHz; but this is sufficient for illustrative purposes.) In the top figure a tuning fork engineered to vibrate at 261.6 hertz (middle C) is recorded. The second figure is a recording of an electronic tuning device that is playing the A above middle C. A is typically defined as 440 hertz.
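As a quick aside, the Red Book numbers just quoted fix the storage cost of this digital "memory." A back-of-the-envelope calculation for uncompressed audio:

```python
samples_per_second = 44_100  # the 44.1 kHz sampling rate
bits_per_sample = 16         # the 16-bit sample depth
channels = 2                 # stereo

# Bytes needed to remember one second of sound:
bytes_per_second = samples_per_second * (bits_per_sample // 8) * channels
print(bytes_per_second)       # 176400
print(bytes_per_second * 60)  # 10584000 -- roughly 10 MB per minute
```

That is a great deal of storage for one minute of "memory," which is part of what makes the compression ideas discussed later commercially important.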
Finally, the bottom figure is a male baritone voice singing an A. At the top of each figure is a graph where the X coordinate is time measured in seconds. The dots represent the sampling of the acoustic energy over time. If these were pictures of the physical sounds, then rather than dots we would have a jumble of waveforms that represented the changing physical energy over time. But these waves have to be sampled, and the dots represent the result of the sampling. The graph below each figure is the result of computing


a Fast Fourier Transform of the sound sample that corresponds to the sound spectra that the red line passes through. The X axis corresponds to hertz, or the sound frequency. The Y axis corresponds to the amount of energy at that frequency, measured in dB.

The Fourier transformation is a way of decomposing the acoustic spectra into its frequency components. Human hearing does something quite analogous to this, which is why we can hear differing pitches. (Note that our vision system does not decompose the visual spectrum of energy into its components.) Notice that the greatest energy appears at the frequency of C in the case of the tuning fork, and A in the case of the electronic tuner and the human voice. But the actual distribution of the energy over the spectrum and over time is what allows us to easily distinguish these differing sound sources. Technically, a great deal of information is lost when, instead of remembering the actual physical signal, we remember a digital code that stands for or represents the original physical information. But note that we could just remember that an A was heard. In this case, we have lost all of the information that distinguishes the middle panel from the lower panel representing the human voice. Now, you have heard music stored on CDs played. It sounds very much as it was intended to sound; namely, like the musical events it encodes. And most of us would have difficulty telling the difference between the same music recorded as an analog signal on tape and that recorded on a CD.
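The Fourier decomposition described above can be reproduced in a few lines of numerical code. A sketch (assuming NumPy is available; the 440 Hz tone and the 22 kHz rate echo the figures' examples): sample a pure A for one second, transform it, and read off the frequency carrying the greatest energy.

```python
import numpy as np

sample_rate = 22050                       # samples per second, as in the 22 kHz example
t = np.arange(sample_rate) / sample_rate  # one second of sample times
signal = np.sin(2 * np.pi * 440 * t)      # a pure A at 440 hertz

energy = np.abs(np.fft.rfft(signal))                   # energy at each frequency
freqs = np.fft.rfftfreq(len(signal), 1 / sample_rate)  # the frequencies, in hertz

print(freqs[np.argmax(energy)])  # 440.0 -- the peak sits at A
```

A real voice or instrument would show energy spread over many frequencies as well; it is exactly that spread, lost if we merely remember "an A was heard," that distinguishes the sound sources.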


But don't be fooled by this "functional equivalence"...that is, by the fact that they both do a good job of "remembering" musical events. There is a fundamental difference. The 1's and 0's of the digital language of the CD are an "arbitrary" language that is used to "remember" the musical events. This language of 1's and 0's can also be used to encode color, images, text, numbers, shape, or whatever. For example, I see no reason, in principle, why a musical CD couldn't be played back as a dynamically changing array of colors. Since this code is arbitrary, we must associate with the code a set of instructions that can serve to appropriately interpret the code...in this case, to interpret the code as musical events. Consequently, every CD player has a computer chip which contains the instructions that must be followed to interpret the 1's and 0's as musical information that must be mapped back into an analog representation that can be amplified and drive the speakers of your sound system. To reiterate, the point of all this is that: An arbitrary language used to represent X requires a set of instructions on how to interpret expressions in the language and an agent capable of executing the instructions. An arbitrary language, the 1's and 0's of our example, is a language which bears no intrinsic relation to what is represented. In our example the instructions are written into the microprocessor, which also serves as the "agent" that carries out those instructions. Hopefully, you can now see that if we assume that the mind possesses an "arbitrary" language of thought, then that commits us to believing that our memories are interpreted.

Arbitrary Languages, Interpreters, and "Compression"

Once "memory" is viewed in this way, we can explore all kinds of possible ways in which to "remember things." The most obvious commercially exploited possibility is to "compress" the original information in a way that requires fewer resources (e.g., space) to remember but still can be quickly interpreted in a way that quite accurately reproduces the remembered events. The animation that appears at the top of this page is a simple example of this. The animation consists of the 5 pictures of the line and rulers that are also shown on this page, shown sequentially. The table below lists each picture and its size in bytes:

Line Only              1,241 bytes
Line & Ruler 1         1,395 bytes
Line & Ruler 1,2       1,649 bytes
Line & Ruler 1,2,3     1,828 bytes
Line & Ruler 1,2,3,4   1,894 bytes
Sum of Pictures        8,007 bytes
Animation              4,042 bytes

These sum to 8,007 bytes, but the animation requires only 4,042 bytes of storage. The animation stores the sequence of pictures by taking into account the fact that much of the information is the same from one picture to the next. This same information needn't be stored each time as long as we can instruct our interpreter to repeat the same information appropriately. Thus, the animation requires about half the space required to store all of the pictures. Some of this space is taken to store control parameters that are passed to the interpreter. For example, the interpreter is provided information which determines how long each frame is displayed, how often the animation is repeated, and the like. The language or format for these memories varies, and in the commercial world fortunes are made or lost depending on which standards (languages) are adopted. The animation pictures are stored in a format called GIF, which is one of the standard formats used for pictures displayed on the Web. The other is JPEG. MPEG is a format for encoding moving pictures. DVD and, I believe, the direct satellite providers use an MPEG format. But let us return to the mind and consider whether the mind uses these kinds of principles for storing information. To consider this we will return to the use of the 7x7 squares considered earlier. However, we will continue this on a new page, since otherwise this page will take forever to load. Onward to Structure, Memory and Figures.
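The animation's trick, storing only what changes from one frame to the next, can be sketched directly. This is only an illustration of the idea, not the actual GIF format (a real GIF adds palette tables, LZW coding, and the control parameters mentioned above):

```python
def compress(frames):
    """Keep the first frame whole; for each later frame keep only the
    cells that differ from the previous frame (a dict of index -> value)."""
    deltas = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append({i: v for i, (p, v) in enumerate(zip(prev, cur)) if p != v})
    return deltas

def decompress(deltas):
    """The 'interpreter': replay the stored changes to regenerate every frame."""
    frame = list(deltas[0])
    frames = [list(frame)]
    for changes in deltas[1:]:
        for i, v in changes.items():
            frame[i] = v
        frames.append(list(frame))
    return frames
```

For the line-and-ruler pictures, where most cells repeat from frame to frame, the deltas are much smaller than the frames themselves; that is exactly why the animation takes roughly half the bytes of the five separate pictures. Notice also that the compressed memory is useless without `decompress`, the interpreter, which is the general point about arbitrary languages.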


Introduction
Charles F. Schmidt


Structure Experiment

Structure and Example Figures


In order to develop some intuitions about the relation between structure and memory, we will conduct an informal experiment. Listed below (Figures 1 - 5) are links to various example figures similar to the colored figures that we have used to illustrate ideas about randomness and structure. Each of these figures will be shown for a few seconds. When it is shown, you should look at it and try to remember exactly the way the colored cells are arranged in the figure shown. Do each recall task in the order provided. That is, first look at Figure 1, then recall that figure; next look at Figure 2, then recall that figure; and so on. To open a figure, simply place the mouse over the name and depress the mouse button. The figure will pop up and then close after a few seconds. Then place the mouse over the words "Memory Task Instructions" and depress the mouse button. A window will pop up with instructions on how to record your recall. To close that window, simply click anywhere in the window. Open and read the Memory Task Instructions now. Then close that window and begin the experiment. Refer to the memory instructions each time to ensure that you follow the same recall procedure each time. Follow this procedure for each of the five figures.

Figure 1 | Figure 2 | Figure 3 | Figure 4 | Figure 5 | All Figures | Memory Task Instructions

Scoring your Data


After you have completed your recall of each of the figures, you can now proceed to scoring your recall results. To help with this, all of the figures can be viewed by clicking your mouse on the phrase "All Figures" shown in the Table above. Again, you can close this window by clicking anywhere in the window. There are several things that we can look at. The most straightforward is accuracy. Here I would suggest that you determine how many cells were marked
- with the correct color,
- with the incorrect color,
- or were left blank.

Your tally should sum to 49 if you remembered that these were all 7 x 7 matrices. But is this measure of accuracy really a straightforward measure of your recall? It may be that you recalled the pattern quite well, but you reversed something, or you were one off, etc. This might have led to a much poorer score than you "deserve." (You can probably recall some test where you did everything correctly in solving a math problem but made an elementary arithmetical error on one point in the solution.)
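If you want to be mechanical about the tally, a scorer might look like this. It is purely illustrative and assumes each 7 x 7 figure and each recall have been written out as 49-character strings, one character per cell, with '.' marking a blank cell (none of these conventions come from the assignment itself):

```python
def score_recall(target, recall, blank="."):
    """Count cells recalled with the correct color, with an incorrect
    color, or left blank. The three counts always sum to len(target).
    Assumes the target itself contains no blank cells."""
    correct = sum(t == r for t, r in zip(target, recall))
    blanks = sum(r == blank for r in recall)
    wrong = len(target) - correct - blanks
    return correct, wrong, blanks
```

For example, `score_recall("RRGGB", "RRG.R")` tallies 3 correct, 1 wrong, and 1 blank. Note that this scorer has exactly the weakness discussed above: a recall that reproduces the pattern perfectly but shifted by one cell would score very poorly.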
http://www.rci.rutgers.edu/~cfs/305_html/Intro/StructureEx/Structure_exp.html (1 of 2)11/11/2005 1:15:34 PM


Another simple thing to do is to count the number of words used in your description of each figure. Is there a relation between this score and your recall result?

Introduction
Charles F. Schmidt


Terms, Concepts, Questions

Some Terms, Concepts and Questions*

- problem: goal, goal directed; well- vs. ill-defined; state. (How are these terms defined and related?)
- Associationism: elements (ideas) and associations (connections); similarity, contiguity, contrast. (How are these terms defined and related?)
- Empiricism vs Rationalism. (What distinguishes these views?)
- Mind / Brain: ideas / brain states ("neural structures"); thinking / brain activity. (What are the possible ways in which the mind and the brain might be related?)
- Formal Systems: symbols, syntax, rules of inference, (interpretation). (What is the possible relation to human reasoning?)
- Atomism - Holism: composition, context. (What is the relation to the idea of a formal system?)
- Structured: random, well-formed, composition, similar, equal.
- Normative Theory / Descriptive Theory.
- Ideas and Semantics. (Could a photograph lie? Could a photograph tell the truth?)

* Each entry lists a general "concept," followed by "terms" related to that concept, and then "questions" to pose to yourself.

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/IntroT_C.html11/11/2005 1:15:39 PM

Seeing and Describing

Assignment - Seeing and Describing


In this assignment you will be asked to simply describe what you see...the description can be in any form which you can create on a piece of paper...e.g., words, sentences, icons, drawings, etc. Below you see a bullet list. Clicking on one of the examples in this list will take you to a page where an image will be presented to you. If all goes as intended, here is what will happen. On entering the page you will see a gray rectangle for about a second. Then an image, either of a natural scene or a piece of art, will be presented for about 5 seconds. Look at the image during this interval. Then the gray rectangle should appear again. Now record your observations on a sheet of paper. (Use a separate sheet for each example and mark on the sheet the example number.) When you have finished, go back to this page by clicking on the phrase Seeing and Describing, which will appear at the bottom of the page. Then go on to do another example. BUT, things go slightly differently for those examples marked as Scene + . In this case, after you have finished recording your description, you will go on to the next image that can be reached from that page. This will be a related image. After viewing it, revise your original description of the first image if it is appropriate to do so. (Do not describe the second image.) Use a different color pen or some other means to distinguish between your original observations and the revisions. Then come back to this page by clicking on the phrase Seeing and Describing. Try to do all of the examples, but if, for some reason, you cannot, then at least do Example 1, one of Examples 3 and 5, and one of Examples 4 and 6. The Table below allows you to choose either a small or large version. If you have a 17 inch or better screen, then the large version is the appropriate one to choose. Large Picture Version Small Picture Version

Example 1 - Scene
Example 2 - Scene
Example 3 - Scene +
Example 4 - Painting
Example 5 - Scene +
Example 6 - Print
Example 7 - Scene

(Each example is listed in both the Large and Small Picture Version columns.)

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/IntroM1.html11/11/2005 1:15:50 PM

Assignment - What kind of Mind ....?

Assignment - What kind of Mind...


In the quote from John Locke's Essay Concerning Human Understanding he invites us to "suppose the mind to be, as we say, white paper, void of all characters, without any ideas." Although the mind, or at least one's own mind, would seem to be easily examined and understood on its own terms, it turns out that we have often recruited other concepts to use in developing an understanding of the mind. Locke suggests a blank sheet of paper as a starting point. Leibniz and Boole considered a calculus. For this assignment, I would like you to do something similar, but with a hopefully interesting twist. Forget about the mind. Simply choose some concept (that you understand fairly well) -- a blank sheet of paper, an email system, a calculus, a stone, an ATM machine, .... whatever. Then ask yourself what kind of "mind" such a concept would constitute. That is, could it remember, retrieve, forget, learn, ... and if so, what would be the properties of the way these functions would be realized by such a concept? Obviously, there are no right or wrong answers. You should strive to put your ideas together in a coherent and analytic fashion. A few paragraphs will probably serve to hint at how you are approaching the task, but you can write more if you wish. Also be prepared to summarize your thinking for your classmates.

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/IntroAssign.html11/11/2005 1:15:59 PM

Seeing and Describing

Assignment 1 - Some Views of Scene 1


Different manipulations of the image:

Red portions of the image.

Yellow portions of the image.

Blue portions of the image.

Green portions of the image.

http://www.rci.rutgers.edu/~cfs/305_html/Intro/IntroM1_Reps.html (1 of 2)11/11/2005 1:16:15 PM


Introduction
Charles F. Schmidt


Exercise Assignment 1

Assignment - Looking for Structure in Recall


One of the major issues that will engage our attention in this course revolves around the question of how our thoughts and ideas are organized or structured. The Empiricist position discussed in chapter 1 of your text suggested that there are three broad principles that serve to structure our ideas. Your text mentions three doctrines of associationism that were suggested as factors that accounted for how our ideas are structured. These were:
- Contiguity in Time & Space
- Similarity
- Contrast

In order to consider this question of how ideas are structured, take 10-15 minutes to try to recall and write down the events that you experienced 3 or 4 days ago (and/or, if you were with someone, have them write down the events that they recall). Next, go over the recall and try to identify the ways in which your thoughts are organized. Some hints on how to do this: you will need to identify all the "simple ideas" (ones which would be meaningless if you tried to break them down further; e.g., "brushed my teeth and got dressed" is probably two "simple ideas" and not one). Usually commas, conjunctions, or other linguistic connectives will separate simple ideas, or they will appear in different sentences. Now, to identify ideas that are related to each other in your recall, you need to ask yourself, "Why did I output these ideas together?" The answer could be the factors mentioned above or perhaps other factors. See what you find. An additional way in which to give yourself some feel for how tightly your thoughts were organized is to ask yourself whether you could have organized them differently and still created something that read coherently. Another way is to do a recall task for some static scene...for example, your room or the house you grew up in. Are there different principles of organization, or are they pretty much the same? Another thought experiment to do is to think of novels or movies that you are familiar with and think about the ways in which these materials are organized.

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/intro_ea1.html11/11/2005 1:16:17 PM

Exercise/Assignment 2

Exercise - What is and is not "thinking"?


Think about the various activities that you do and try to identify:
- some that clearly involve thinking;
- some that clearly do not involve thinking;
- some that seem especially difficult to classify one way or the other.

Can you suggest a basis for distinguishing these cases? What is the basis suggested in your text? As a (perhaps related) aside, think about a thermostat that senses the temperature and causes the heat to go on if the temperature drops below its setting... You would probably not want to say that the thermostat is thinking. Or think of playing chess against a computer...do you want to say that the computer is thinking? Or think of Andre Agassi trying to return a 128 mph serve hit by Goran Ivanisevic...do you want to say that Andre is thinking? Or think of yourself driving down the Turnpike at 80 mph and swerving to avoid something lying in the roadway in front of you...were you thinking?

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/intro_ea2.html11/11/2005 1:16:19 PM

Photograph and Memory and Semantics

What Kind of Memory is a Photograph?


One of the questions that can be posed is whether a photograph could lie...or, for that matter, tell the truth. This question is an attempt to get you to think about semantics. Semantic issues arise if we think that symbols refer to something other than themselves. So, if we have ideas in our minds, then do these ideas have a semantics...i.e., do they refer to something? Are all our ideas true? Do all or just some of our ideas refer to something? Does the question of how our ideas get into our minds bear on these questions? Below is a figure that describes in a rudimentary fashion how photographic film works. Read it over ... it may help you to think a bit more clearly about some of these issues. (Or it may just confuse you more...but at least you will understand a little better how film works!)

Introduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Intro/Photo_Memory.html11/11/2005 1:16:25 PM

Quotes - Behaviorism

Some Quotes From the Behaviorist Tradition


Here are some quotes from two of the more influential and radical behaviorists, beginning with the "father" of behaviorism, John B. Watson.

John B. Watson, Behaviorism, 1924. Chapter title: "Chapter X. Talking and Thinking, which, when rightly understood, go far in breaking down the fiction that there is any such thing as a 'mental' life."

"The behaviorist advances the view that what the psychologists have hitherto called thought is in short nothing but talking to ourselves."

"My theory does hold that the muscular habits learned in overt speech are responsible for implicit or internal speech (thought)."

B. F. Skinner, About Behaviorism, 1974.

"The present argument is this: mental life and the world in which it is lived are inventions. They have been invented on the analogy of external behavior occurring under external contingencies. Thinking is behavior. The mistake is in allocating the behavior to the mind."

Associationism and Behaviorism


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/Quotes_SR.html11/11/2005 1:17:24 PM

S-R Theory

Some S - R Theory
There were a variety of flavors of S - R psychology. The important terms of S - R psychology were:
- S
- R
- '-'

Now that seems pretty simple, doesn't it? And it is, although it seemed so simple that many characterized it as simplistic! The basic concern shared by all flavors of behaviorism was that psychology concern itself only with observables. Thus the label behaviorism, since on this view behavior is that component of an organism's activity that is "observable." The first and second terms, the S for stimulus and the R for response, were the observable anchors of this S - R theory. Now, on the "radical behaviorism" view, the third term, the '-', is an empty term. And this is how it should be, since we cannot observe what goes on in the organism between the time that the organism is exposed to the S and the time that it emits an R. The organism is simply viewed as a "black box." This is the attitude most of us adopt most of the time to the technology around us. We don't care how the calculator works as long as it returns the correct output for the values that we input to it. However, note that in the example of the calculator, we can precisely describe the input to the calculator and the output provided by the calculator. We know that the calculator will respond to the numbers entered and will not be sensitive to which finger we used to enter the number, or whether we used the eraser tip of a pencil, or whether they were entered quickly or slowly, etc. And we know exactly where and when to look at the calculator's screen to find the output. But are things really this simple with living organisms? Can we apply the formula:

R = f(S)

Here we have used standard mathematical notation, which reads "R is some function of S." This seems fairly innocuous. But note that as soon as we give the statement mathematical pretensions, we begin to suspect that what started out as a slogan will require some scrutiny before we can claim that it is anything more than a slogan. Functions take some mathematical object (like a number) and return a similar mathematical object as the value of the function. But we aren't entirely sure what a stimulus or a response is, much less how to assign numbers or some other suitable mathematical object to them. Now you begin to appreciate Skinner's solution....just forget about all of these niceties.......a response is what the experimenter decides to measure, and a stimulus is what the experimenter decides to vary in some systematic fashion. The justification is in the results. If the "response" varies in some systematic way with the "stimulus", then the experimenter made some wise choices and 'truth' has been found. Forget about mathematically describing anything; simply publish the curve that shows the relation between "stimulus" and "response". Thus, this type of S - R Theory was a theory about why one didn't need or want a theory. Other S - R psychologists were less comfortable with this approach. They were looking for a way in which to make theoretical statements while ensuring that those theoretical statements were really about observables. Their "slogans" went something like this: the response is a function not only of the external stimulus but also of internal physical events, and those internal events are themselves a function of observable external events.

http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/SR_Theory.html (1 of 3)11/11/2005 1:17:35 PM


The kind of internal physical events they had in mind were things like the motivational level of the organism (if it was dozing off, it probably was not behaving in quite the same way as when it was wired!) and changes in the strength of associations, and the like. But recall, like any upstanding behaviorist, these folks wanted to talk only about observables. (Those of you with some philosophical background can relate this position to that of the positivist movement that was quite prominent in the U.S. at this time.) But what if internal events are solely a function of external events? Then we can talk about internal events by talking about external events! So now on to operationalism - which is kind of a fancy way to justify Skinner's position that a response is what I say it is, and the same goes for a stimulus. (Skinner was a fan of Gertrude Stein ..."...a rose is a rose is a rose...." so it probably all makes sense in some sense of sense.) Operationalism says that it is the scientist's responsibility to 'operationally define' any term that the scientist is going to use. A term is operationally defined when the rules for making the observations that the term ultimately refers to are specified. Thus, a response might be defined as any depression of a bar that is sufficient to cause the recorder to make a mark on a strip of paper. Notice that the response is now something that is recorded by some instrument. And we can define 'associative strength' (or what these types called 'habit strength') as a term that increases as some function of the number of reinforced pairings of a stimulus with a response. And, voilà! ... an internal event, i.e., learning, is now under the "control" of the experimenter, because it is the experimenter that controls the number of times a stimulus and response pairing is reinforced. Thus 'habit strength,' an internal event, becomes operationally defined as a value that increases as some function of the number of trials on which the stimulus and response pair is reinforced.
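The operational definition just given requires only some bounded, increasing function of the number of reinforced trials; it does not fix the form of that function. A purely illustrative choice (the exponential shape and the growth rate of 0.1 are my own, not Hull's):

```python
import math

def habit_strength(reinforced_trials, rate=0.1):
    """An illustrative habit-strength function: zero before any
    reinforced pairings, growing with each one, leveling off toward 1.
    Any function with these properties fits the operational definition."""
    return 1 - math.exp(-rate * reinforced_trials)
```

The experimenter never observes this value directly; it is defined entirely by the observable trial count, which is the whole point of operationalism.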

Similarly, the motivation or 'drive' as it was referred to could be defined to increase as some function of the per cent ad libitum body weight of the animal. Some of the main terms used in this S - R theory (often referred to as the Hull-Spence Theory after its two main proponents) are shown below.

H, or habit strength, and D, or drive, have already been mentioned. One of the major preoccupations of the heyday of this theory was to determine how the values of these terms combined to yield the observed behavior. Analysis of Variance, a relatively new statistical technique at the time, was used to attempt to answer questions concerning how the values of these "theoretical" terms combined to determine the observed behavior. We will take up this use of the Analysis of Variance in more detail in the section on mental chronometry. As a result of a great many experiments, it was argued that the motivational level, D, and the habit strength, H, of a particular S - R connection combined multiplicatively, introducing a term referred to as reaction potential, E:

E = H x D


Finally, the equation that mentions the observed response, R, is:

R = E - O

The response is determined by its reaction potential, E, minus a term, O, referred to as oscillatory inhibition. This term is essentially a random variable and is assumed to be a normal random variable. Recall that it is assumed that there is a set of responses connected to a particular stimulus. Each of these responses has some associative strength to the stimulus. Thus, they can presumably be (partially) ordered based on this strength. This partial order is what they refer to as the response hierarchy. Note that for a fixed value of D, the response with the highest value of habit strength will always be associated with the highest level of reaction potential, E. Thus, the organism would always do the same thing. But not if we add a random variable, as is done in the equation above. Now the dominant response(s) will still have the greatest probability of being observed, but they will not always be observed, because sometimes the random variable will assume a value that causes the dominant response to have, at the moment, a lesser value than some other response.
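Putting the pieces together, one trial of this selection rule can be simulated in a few lines. The habit strengths, drive level, and noise spread below are arbitrary illustrative values, not quantities Hull or Spence reported:

```python
import random

def select_response(habit_strengths, drive, noise_sd=0.2):
    """Compute E = H x D for each response in the hierarchy, subtract a
    normally distributed oscillatory inhibition, and return the index of
    the momentarily strongest response."""
    momentary = [h * drive - random.gauss(0, noise_sd) for h in habit_strengths]
    return momentary.index(max(momentary))

# The dominant response (index 0) wins most trials, but not all of them:
random.seed(1)
wins = sum(select_response([1.0, 0.6, 0.3], drive=1.0) == 0 for _ in range(1000))
print(wins)
```

Running this shows exactly the behavior described above: with the noise term the dominant response is the most probable outcome rather than the only one, and shrinking `noise_sd` toward zero makes the organism deterministic again.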

There is much more that could be said of this version of S - R Theory. It eventually gave way to much more sophisticated mathematical models of learning such as Stimulus Sampling Theory as developed by Estes and others. And the basic idea that feedback from the environment can be used to learn an "arbitrary" function has been developed in exciting and extraordinary ways within an area commonly referred to as "Connectionism." And, there are still quite lively debates as to whether any of this has anything to do with human reasoning. However, one thing is quite certain; we now possess a rather deep mathematical understanding of the requirements for and implications of this style of learning.

Associationism and Behaviorism


Charles F. Schmidt


Terms, Concepts, Questions

Some Terms, Concepts and Questions*

- Associationism: Law of Exercise, Law of Effect. (Who decides? Can these terms be precisely defined?)
- S - R Theory: Stimulus, Response, Habit Family Hierarchy. (Can anything be learned? What makes one thing more difficult than another?)
- Mediation Theory: Mediational Response, Credit Assignment Problem; Peripheralist Theory, Centralist Theory. (Who decides?)
- Transfer: Positive Transfer, Negative Transfer, Identical Elements. (How defined? If a task has elements, then how does this fit with S - R Theory?)

* Each entry lists a general "concept," followed by "terms" related to that concept, and then "questions" to pose to yourself.

Associationism and Behaviorism
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/AssocT_C.html11/11/2005 1:17:37 PM

Anagrams

Anagram Experiments
On p. 25, in the discussion of the experiments on anagrams, your text states that "Each way of arranging the letters can be considered a response in the habit family hierarchy." Note that 5 letters have 5! or 120 different orderings. (If you want to see all 120 of them for the letters {b e a h c}, then click on the top thumbnail image below. To see a graphical picture of the trace of the recursive function that generated these orderings, click on the bottom thumbnail image below.) In the anagram experiment, subjects were given one ordering, e.g. beahc, and the average of the times that it took subjects to arrive at the correct answer (in this case beach) was analyzed.
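The 120 orderings are easy to enumerate with the Python standard library; a quick check:

```python
from itertools import permutations

letters = "beahc"
orderings = {"".join(p) for p in permutations(letters)}

print(len(orderings))        # 5! = 120 distinct orderings
print("beach" in orderings)  # the solution word is one of them
print("beahc" in orderings)  # so is the scrambled stimulus itself
```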

Critique this set of experiments and the theoretical interpretation of the results. Then find one or more friends and run the following experiment, measuring the amount of time (in seconds) it takes your friend to find each answer.

Below are six 5-letter words. The middle column contains a cyclic permutation of the word on the left; the right-hand column contains another permutation of the same word. Choose three items from the middle column and three from the right column, but no two from the same row, and use these six items in your experiment.

    word      cyclic    other
    beach     hbeac     ahbce
    train     intra     itnar
    sugar     garsu     urgsa
    crate     ratec     tearc
    heart     thear     rehta
    model     eldom     dmleo

When you have completed your experiment, look at the data by rank ordering the 6 times and then computing and comparing the average rank for items from the middle column with the average rank for items from the column on the right. (If you did this experiment with more than one friend, do this scoring separately for each friend.) If you obtained some clear results, comment on them in relation to the discussion of S-R psychology in your text.**

** For the instructor's comments click here.
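If you would like to automate the rank-order scoring, here is a small sketch; the sample times at the bottom are invented purely for illustration:

```python
def average_ranks(times_mid, times_right):
    """Rank-order all six solution times (1 = fastest) and return the
    average rank for each column."""
    labeled = [(t, "mid") for t in times_mid] + [(t, "right") for t in times_right]
    labeled.sort()  # fastest time gets rank 1
    ranks = {"mid": [], "right": []}
    for rank, (_, col) in enumerate(labeled, start=1):
        ranks[col].append(rank)
    return {col: sum(rs) / len(rs) for col, rs in ranks.items()}

# Hypothetical times in seconds for one friend:
result = average_ranks([12, 30, 25], [45, 60, 18])
print(result)  # the lower average rank marks the faster column
```

A lower average rank for the middle (cyclic) column would suggest those items were solved faster.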

Here are some additional questions to think about (but you needn't hand your thoughts in): Is this a typical type of task involving thinking? Is there an optimal strategy for solving these problems? What might the distribution of probabilities look like for the 120 different orderings associated with a word? Could a person generate all of these orderings?

http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/anagrams.html (1 of 2)11/11/2005 1:17:50 PM


Anagram Experiments Critiqued


"One nice thing about the S-R association representation of thinking is that it makes precise predictions that can be tested." p. 25

This critique of the anagram experiments really resulted from the above quote in your text. Since the details of how to derive the "precise predictions" weren't worked out, I decided to try to work them out myself. My first step was to try to determine how to characterize the experimental task within the S-R framework. An anagram problem is defined as one where the given sequence of letters must be rearranged in such a way that all of the letters are used and they form a word. (Typically the initial sequence of letters is not a word.) Abstractly, then, the problem is:

    <letter sequence> ---> ?x   where x is a word

Now, as good S - R psychologists, one of the first questions we must ask is exactly what is the stimulus (S) and what is the response (R)? The most obvious answer would seem to be that the stimulus is the <letter sequence> and the response is the answer in the form of some word. But if we adopt this assumption, then what can we assume about the strength of association between these two elements? For example, what is the strength of association between:

    beahc and beach
    hbeac and beach
    ahbce and beach

or between any of the other 116 possibilities and beach? Presumably the strength of association should be the same for all of them, because we have probably not had the exciting experience of seeing these possibilities before taking this course! It was this observation that suggested the experiment that you did in the assignment.

The initial experiments discussed in your text involved varying the frequency with which the answer, the word x, occurs in some sampling of texts. This frequency is taken as a "measure" of exactly what? We have already ruled out that it is a measure of the strength of association that a person would have between a particular sequence of letters and the word. And note, there was no mention of, nor explicit manipulation of, the particular sequence of letters used in each of the problems. The experiment simply involved creating problems where the answers to some of the problems were high frequency words and the answers to others were low frequency words.

What we are left with is only the 'R side' of the S-R relation. The frequency of occurrence of a word is presumed to be a measure of the relative position of that word as a response in some response hierarchy. But what is the response hierarchy? If we have a particular stimulus, S, then our S-R theory assumes that there is a set of responses that are "elicited" by that stimulus. Each response in this set has some strength of association to the particular stimulus. In general, these responses can be partially or completely ordered using this measure of strength of association. This is what is meant by a response hierarchy (it is really not a hierarchy in the mathematical sense of the term). The response(s) with the highest, or high, values of strength of association are referred to as the dominant response(s).
The probability that a response will be emitted is presumably a function of its position in this ordering along with other relevant variables.
http://www.rci.rutgers.edu/~cfs/305_html/Behaviorism/anagram_critique.html (1 of 3)11/11/2005 1:17:55 PM

Now, we can give an S-R interpretation of the manipulation of the high and low frequency words as answers to an anagram problem. The interpretation would be (I think) that regardless of the stimulus, the higher the word frequency of the answer, the higher the strength of association of that word to the stimulus. Note that for both high and low frequency words, the answer is probably the highest valued word in the response hierarchy that solves the problem. The advantage of the high frequency word over the low frequency word rests not in this relative measure of strength but in its absolute value. That is, in order to predict the results we assume that: (1) a word need not ever be paired with a stimulus in order for that word to be in the set of responses "elicited" by the stimulus; and/or (2) the probability of any word in this set being emitted depends on its absolute value as determined by frequency of usage, not on its relative position in a particular response hierarchy.

The first assumption would seem to predict that, in general, every word that we know is, in some sense, "elicited" by every sequence of letters (the stimulus) and that the "response hierarchy" is independent of the actual stimulus. This is a logical possibility and therefore we can't rule it out a priori. But it would be quite surprising if true. The second assumption is not required if the first assumption is unconditionally made. In this case, every word occurs to every stimulus and thus every high frequency word has a greater probability of being elicited than every low frequency word.

But what if the response hierarchy varied with the letter sequence? Then it could turn out that a low frequency word is the dominant response for some particular problem while a high frequency word is not among the dominant responses for some other problem.
If the probability of response is dependent on the relative position in the response hierarchy, then the low frequency word would have the advantage in this situation. In order to avoid this we could assume that the probability of response is independent of the particular response hierarchy elicited. This seems rather strange, but it would certainly make the theory more mathematically tractable.

The next idea considered in your text is that the "response strength" of a word is measured by some function (e.g., the average) of the estimates of the frequency of each adjacent letter pair. For example, the pairs for 'beach' would be {be, ea, ac, ch}. This is essentially a way of estimating a word's "strength" from its components! This is fairly ad hoc. Why not also use the triples (e.g., {bea, eac, ach}) or the quadruples (e.g., {beac, each})? But more importantly, this has changed the rules of the game. Remember, in the above the R of the S-R was the word. Now the R has components! That is, strength(R) = f(r1, r2, ..., rn).

Now, intuitively this makes perfectly good sense. But it complicates our S-R theory. Recall the blank slate analogy that is often used to characterize the assumptions typically made in this S-R approach to the study of cognition. A slate has a smooth and homogeneous surface; it has no intrinsic structure or divisions. Whatever structure appears is the result of the way "the world" has written on this slate. If a word has divisions, then it must be the world that places these there. And if the strength of an association is a function of the strength of the components of the response, then we must discover the components and the function that combines the values of the components in order to derive any prediction from this theory. (Remember these remarks when you read about Thorndike's view on transfer of learning as discussed in your text.)

The final explanation offered in the text really has nothing at all to do with frequency.
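The pair-decomposition idea a few sentences back is easy to make concrete. A short sketch of slicing a word into its adjacent n-grams; the bigram frequency table at the end is invented purely for illustration:

```python
def ngrams(word, n):
    """Adjacent letter n-grams of a word, e.g. the letter pairs for n=2."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(ngrams("beach", 2))  # ['be', 'ea', 'ac', 'ch'] -- the pairs in the text
print(ngrams("beach", 3))  # ['bea', 'eac', 'ach'] -- the triples
print(ngrams("beach", 4))  # ['beac', 'each'] -- the quadruples

# One possible 'strength' function: the average of (made-up) pair frequencies.
fake_bigram_freq = {"be": 30, "ea": 50, "ac": 20, "ch": 40}
strength = sum(fake_bigram_freq[p] for p in ngrams("beach", 2)) / 4
print(strength)  # 35.0
```

The arbitrariness of the choice is visible in the code: nothing privileges n = 2 over n = 3 or n = 4, which is exactly the "why not also use the triples?" objection.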
It is simply an assertion that "subjects tend to make dominant responses such as moving just one or two letters first; if that doesn't work, they try "weaker" responses such as moving all five letters." (p. 27) Recall that a dominant response is the one that has the highest strength. Note that in this explanation we have shifted from talking about the word that is the answer to the anagram
problem, e.g. "beach", to talking of the stimulus, i.e. the sequence of letters. Now, it may well be that we are more likely to move one letter than two or three or four; but the question for the S-R psychologist is to explain how this results from experience and why this set of responses is "elicited" by the stimulus. In the previous explanations the stimulus "elicited" words as responses; but now the stimulus "elicits" letter-moving responses.

A further problem arises with this explanation due to the fact that "moving just one letter first" is not really a well-defined description of a particular response in this context. With the five-letter stimuli used in these experiments, any one of the five different letters might be moved, and there are many locations to which the letter might be moved. Thus, there are really many such "dominant" responses.

A final comment: it is possible that there is no good general strategy for solving arbitrary anagram problems. Systematically examining each of the possible permutations of the sequence of letters guarantees that you will find the solution if it exists. But this exhaustive strategy doesn't really require a great deal of thought; it simply requires an algorithm for generating all of the permutations and the patience to look through them systematically. This leads one to ask whether this task is representative of the kinds of tasks that characterize human reasoning. Although some may find that doing anagrams is an interesting way to pass the time, I suspect they are not particularly characteristic of our reasoning activities. (Although one test of 'intelligence' involves only analogy problems!)
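The exhaustive strategy just mentioned can be sketched in a few lines; the tiny word list here is a stand-in for a real dictionary:

```python
from itertools import permutations

def solve_anagram(scramble, dictionary):
    """Exhaustively try every ordering; return the orderings that are words.

    Guaranteed to find the answer if one exists, but it examines up to n!
    candidates -- patience, not insight.
    """
    words = set(dictionary)
    return sorted({"".join(p) for p in permutations(scramble)} & words)

tiny_dictionary = ["beach", "train", "sugar", "crate", "trace", "react"]
print(solve_anagram("beahc", tiny_dictionary))  # ['beach']
print(solve_anagram("ratec", tiny_dictionary))  # ['crate', 'react', 'trace']
```

Note that the second call shows another wrinkle: some letter sequences have several word solutions, so even the exhaustive strategy does not define a unique "correct" response.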

Mental Olympics
Mental performance depends on physical processes...things that happen in the brain. This appears to be a truism that few would now dispute. Almost everyone who studies human reasoning believes that the mental depends on the physical. However, there are various ways in which to interpret 'depends on', and over this there is considerable disagreement.

We often make a distinction between a thing's structure and its function. For example, a stringed instrument such as an acoustic guitar has a structure that includes a resonating chamber located beneath the strings, a neck against which the fingers can press the strings to change the length of the vibrating string, and attachment points which allow the strings to be anchored and stretched under considerable tension. This structure is largely determined by the guitar's intended function, namely to produce musical sounds. In general, in trying to determine the function that something performs, a good starting point or heuristic is to assume that there is a relation between its structure and its function.

But, of course, this doesn't always pay off. Sometimes the connection between structure and function is quite difficult to determine. Many times there are structural characteristics of an object for which one can find no functional explanation. And often one can find a structural similarity which is quite unrelated to functional similarity. For example, if you look inside your car's engine you will find reservoirs and tubing which structurally are quite similar. However, the reservoirs may hold brake fluid, water, battery acid, or oil, and the lines may carry oil, gasoline, water, air, or electric current. In this case, it is quite important to the integrity of your engine that you not assume that similarity of structure implies similarity of function.
Now, in the case of thinking about the mind/brain dependence, one of the simplest interpretations is to assume that there are basic physical components of the brain that are composed or 'wired together' in a simple manner to yield cognitive performance. Think again of the acoustic guitar mentioned above. The number of physical components is really very small. Yet in the hands of a skilled musician, this guitar can be used to create the sounds of a literally infinite set of musical compositions.

But what are the basic physical components of the brain? Certainly back in 1868, when Donders conducted his experiments on reaction time tasks, we had no idea how to answer this question.(*) But maybe that didn't matter. Rather than focusing on the physical stuff of the brain, why not simply try to identify the simplest things that you could ask the mind/brain to do? For example, why not present a stimulus, and then ask the person to indicate whether some stimulus was present? Not only was this a simple task, it was also one that had to be involved in a great many mental tasks. Let's just assume that this task involves a "noticing function" and a "responding function." That seems innocent enough! But now comes the reification trick...assume that the noticing function is carried out by a noticing component! Notice that in this case we knew absolutely nothing about the actual structure; we have nothing more than an intuitive analysis of function and an assertion of simplicity, and from this we have concluded that there is a component of the brain that carries out the noticing function.

If you buy that, then one can go ahead and develop the whole logic of the subtraction method and carry out experiments which purportedly tell you exactly how fast certain functions can be carried out by the brain/mind. Now, you may have surmised from the way I have presented this argument that I don't think that the conclusions follow logically.
But one can, and often does, stumble onto the "truth" using reasoning strategies that are only a heuristic for finding "truth", not a guarantee. In other words, this story could be correct. However, one aspect of the logic of this approach to the study of the mind/brain was challenged rather effectively. The use of the subtractive method to identify the time associated with a process assumes that the process is independent of the "context" in which that process takes place. But, of course, to use the subtractive method you had to devise experiments that differed in the processes that were presumably required to perform them. Thus, this assumption of independence could never be tested, but only accepted or rejected. This criticism of the subtractive method lessened the appeal of this experimental methodology.

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MOlympics.html (1 of 4)11/11/2005 1:19:19 PM

Then a couple of things happened in the middle part of the twentieth century that revived the importance of reaction time measurements in cognitive research. One had to do with developments in the area of statistics, which eventually resulted in a somewhat more sophisticated understanding of their potential use. In the area of cognitive psychology, Saul Sternberg was the important figure whose work on the scanning of short term memory reinvigorated the use of reaction times in the study of cognitive processes. To read more about this memory scanning research and the statistical rationale behind it click here.

The other important development was more diffuse. By the 1950s electronics was a familiar world. And it was clear to most people that you could design electronic circuitry that would carry out functions that we intuitively thought of as "mental" functions. Computers were being designed and used. The relation between the physical functions that electronic components could be designed to carry out on electrical signals and the "information processing functions" required to manipulate information was becoming well understood. For example, circuits could be built that would "carry out" logical functions, e.g., 'and', 'or', 'not' and so on. And you could build devices that would serve as buffers to hold onto signals until some other component was ready for them; devices that served as a memory for information; and so on.

This relation between hardware, the organization of electrical components, and information processing was not lost on cognitive psychologists. Perhaps the brain was itself a particular organization of "hardware" that realized certain information processing functions. If the primitive components could be identified and experimentally isolated, then they could be studied and we could determine their basic properties.
For example, we might be able to determine: the speed at which a component could operate; the storage capacity of a component; the length of time items could be retained; and the type of items that could be stored by a component. But again there is the question of exactly what the components are. We can distinguish two ways in which this question has been approached.

One approach identifies components with the basic functions, predicates, control regimes, and data types of a programming language. Thus, the analogy is to software, and the relation to the brain is much more abstract. It is simply that there is a "language," in the sense of a programming language, of the mind which is realized using some of the physical properties of the brain. We will turn to this approach in more detail in the next section on computational approaches to cognition.

The other approach is to identify components of the brain by analogy to hardware. The hardware analogy assumes that it is reasonable to think of the brain as the hardware of the mind. Consequently, we should expect to find components that are organized in ways that are analogous to computers. Thus the idea of buffers, short term memory, long term memory, a central processing unit, and so on. Of course, computers can be designed in many ways. Most of the computers that have been designed follow an architecture that is referred to as the von Neumann architecture. In fact, we don't really have any deep understanding of the space of possible computer architectures. Consequently, it may be that we shouldn't become too enamored with the analogy that is drawn between current computer architectures and the "architecture" of the brain. The figure below provides a rough picture of the "architecture" that has been suggested by this approach. It is rough in the sense that probably no one who espouses this approach agrees with it in every detail.
The picture was developed by Card, Moran, and Newell as a first order approximation to the architecture of the human information processing system. Their idea was that such an approximation could be useful for making educated guesses about human performance. Consequently, they went through the research literature that has focused on identifying components of the architecture and studying their properties.

The properties of interest are several. For a process, the main question is how long it takes to carry out a single step...we refer to this below as the cycle time, and it is analogous to determining the speed of the CPU in a computer. Three processes are distinguished below and depicted as oval figures: input processes (vision and audition); a cognitive process; and a motor process. The time given in the equations is the authors' best guess at the value for that property. The figures in brackets indicate the range of times that had been observed in the various experiments that studied this property.
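To give the flavor of such first-order approximations, here is a back-of-envelope sketch in the spirit of this model. The cycle times used are commonly quoted nominal values (roughly 100 ms perceptual, 70 ms cognitive, 70 ms motor); treat them as illustrative stand-ins, not as the exact numbers in the figure.

```python
# Nominal processor cycle times in milliseconds; the ranges reported in
# the literature are wide, so these are order-of-magnitude planning numbers.
PERCEPTUAL_MS = 100  # one perceptual processor cycle
COGNITIVE_MS = 70    # one cognitive processor cycle
MOTOR_MS = 70        # one motor processor cycle

def simple_reaction_ms(cognitive_cycles=1):
    """Estimate reaction time as perceive -> decide -> respond."""
    return PERCEPTUAL_MS + cognitive_cycles * COGNITIVE_MS + MOTOR_MS

print(simple_reaction_ms())   # 240 ms: one cycle of each processor
print(simple_reaction_ms(2))  # 310 ms: a choice requiring an extra cognitive step
```

This is exactly the style of "educated guess" the approximation was meant to support: chain the relevant processor cycles and add up their nominal times.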


In addition to the processes, the storage components are identified: a Visual and an Auditory Buffer, a Short Term Memory, and a Long Term Memory. Three parameters or properties are associated with each of these storage components. The first, symbolized as d for decay time, is an estimate of the amount of time that an item can be held in that storage device. Notice that the decay times are quite short for the buffers, indicating that items are lost after a short period of time, but that items are assumed never to be lost once they have been stored in Long Term Memory.

The second parameter, m, refers to the capacity of the storage unit. In the case of time, we had a physical basis for the unit of measurement; for storage this is somewhat problematic. Since most of the experiments used letters of the alphabet as the stimuli, the results for the buffers are given in letters. For the Short Term Memory component things are more complicated. The capacity unit is given in chunks. It isn't entirely clear exactly what a chunk is, but the intuition is quite clear. For example, if I give you 13 letters of the alphabet in some random arrangement, you will remember some subset of them--around 7 plus or minus 2 is the classic conjecture of George Miller. If I give you 13 words to remember, you will also remember some subset of them, but the number of letters contained in that subset of words will be much larger than 7. From these types of observations, it appears that we must allow our short term memory the cleverness to structure or chunk things in a way that affects its storage capacity.

If you want to see an example that illustrates this chunking phenomenon, then do Experiment 1 and Experiment 2. In both of these experiments:

- Be prepared to try to remember the set of letters you will be presented.
- After the letters are presented, you will be shown some numbers. Mentally subtract 3 from each of these numbers.
- Then you will see a ?
- When you see this, try to recall the letters. Simply write them down.


Click on the image to the left labeled Experiment 1 to start the experiment. When you finish that experiment, return to this page and then click on the image to the right labeled Experiment 2. Determine how many letters you correctly remembered in each experiment.

Now do Experiments 3 and 4. The procedure is the same as before except that in this case you will be asked to remember a list of words. Click on the image to the left labeled Experiment 3 to start the next experiment. When you finish that experiment, return to this page and then click on the image to the right labeled Experiment 4. Determine how many words you correctly remembered in each of these experiments.

Now you know why psychologists talk about chunks, even though we really aren't all that sure what they are or how "chunking" is to be explained.

The final parameter in the picture is v, the "data-type" that the component is able to store. Finally, the red arrows indicate the presumed communication pathways between the components that the different processes serve to realize. Note that, according to this view, everything must go through Short-Term Memory.

This first order approximation, aside from illustrating the general nature of this approach to the study of the mind/brain, can also serve a practical purpose. If we believe the picture, and take the times as a lower bound on how fast we can carry out some basic operations, then we can use this model to guesstimate how fast the human mind could accomplish some task. For example, using this type of approximation, Newell and Simon came up with an estimate that it takes roughly 10 years to become an expert at something. We will speak more of this later in the semester when we look at expert reasoning.

* The image to the left is called a Hipp Chronoscope. It was the development of this instrument that enabled Donders, back in 1868, to measure reaction time in milliseconds. A description and more detailed pictures of this instrument can be accessed on the Web by clicking the figure.

Experimental Decomposition of Mental Processes


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MOlympics.html (4 of 4)11/11/2005 1:19:19 PM

Scanning STM

Scanning Short Term Memory


You will be shown a sequence of items which you are to remember. After an interval, a ? followed by an item will appear. Say Yes if the item was in the original sequence of items that you saw. Say No if it was not in the original sequence of items that you saw. Make your decision as quickly and accurately as possible. Now click on the image to the right labeled Experiment 1 to begin the experimental trial.

This single trial illustrates one of the more well-known experiments that uses reaction time to measure the speed of a basic mental process. In this case, the process is presumably the scanning of short term memory for a matching item. The basic experimental paradigm involves selecting a set of items, in this case the integers 0 through 9, and constructing subsets of this set that vary in size, where the items within a subset occur in a random, as opposed to numeric, order. The subset is presented one item at a time to the subject. Then, after an interval, the target item is shown and the subject must determine as quickly as possible whether or not that item was in the original subset.

If we assume that there is a component of the brain that is used to temporarily store a set of items, a short term memory (STM), then this experiment provides data on the time that is required to access, search, and match a target item against the items that have been stored in STM. In this type of experiment, Sternberg found that the time to determine whether or not the target was in the original set was a linearly increasing function of the number of items in the original set.

He considered two types of scanning processes that might yield this result. The first he termed exhaustive search. In this type of process all of the items are examined before an answer is made. The theoretical predictions associated with this type of scanning process are shown at the top of the figure below. Note that there is no order effect associated with this type of search: since all of the items are examined, the position of the target item in the original list will have no effect on the reaction time.

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MChronscan.html (1 of 3)11/11/2005 1:19:31 PM

The other type of scanning process is referred to as self-terminating search. The predictions associated with this type of process are shown at the bottom of the figure above. In this type of process, the search terminates as soon as a match is found. If the search is serial, then on average the target item will be found after half of the list has been searched. This is the reason why the curves diverge as a function of list length for the self-terminating search but remain parallel for the exhaustive search. In both types of search, if the target item is not on the list, then all of the items must be looked at. But if it is on the list, then for self-terminating search only half of the items, on average, need to be scanned. Thus the differing slopes of the Negative and Positive Target curves.

Sternberg's evidence was for parallel curves with a slope of roughly 40 msec. per item. He interpreted this as consistent with an exhaustive search. This result ran counter to many people's intuitions, and this added to the interest in this type of research. (Psychologists tend to be particularly impressed if an experimental result runs contrary to their intuitions. Whether or not this is a desirable trait is another story for another time.)

Now, recall the criticism of the subtractive method. Sternberg wanted to determine that this scanning process was a basic component of the mind/brain. He had shown that one variable, the length of the list to be scanned, affected this process in a very systematic fashion. Did this mean that the scanning process was a basic component of the mind/brain?
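The two sets of predictions can be written down directly. In the sketch below, a is the time for everything other than scanning (encoding, decision, response; the 400 ms value is an arbitrary illustrative intercept) and b is the comparison time per item, set to Sternberg's roughly 40 msec. slope:

```python
A_MS = 400  # encoding + decision + response time (illustrative intercept)
B_MS = 40   # comparison time per item (Sternberg's ~40 msec. slope)

def exhaustive_rt(n, target_present):
    # Every item is compared regardless of the target,
    # so present and absent trials give the same prediction.
    return A_MS + B_MS * n

def self_terminating_rt(n, target_present):
    # Target absent: all n comparisons. Target present: (n + 1) / 2 on average.
    comparisons = (n + 1) / 2 if target_present else n
    return A_MS + B_MS * comparisons

for n in (2, 4, 6):
    print(n,
          exhaustive_rt(n, True), exhaustive_rt(n, False),
          self_terminating_rt(n, True), self_terminating_rt(n, False))
# Exhaustive: the positive and negative predictions coincide (parallel lines).
# Self-terminating: the positive slope is half the negative slope,
# so the two curves diverge as n grows.
```

Running the loop makes the contrast in the figure concrete: the exhaustive predictions stay parallel while the self-terminating positive and negative curves pull apart with list length.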


Well, the logic to answering this question is this. If, it is argued, one can find other independent variables that could potentially affect this component, and they affect this component independently of each other; then one's belief that you have found a basic component is strengthened. Why, you may ask? Well, I don't really know. There seems to be a bias here that in order for something to be "basic" it should be independent of other components...independence needn't be a prerequisite of what one decides to call "basic"....but its a nice thing, so...why not? Perhaps the bias reflects the preference for linear functions. Well, now to testing for independence to increase our confidence that we have a basic component. What we need is another variable. If you click on the image labeled Experiment 2 on the left then you can do a trial from another condition which will illustrate the variable that Sternberg chose to study. This was the same type of trial that you did above except that we have made the contrast between the items and the background somewhat less pronounced. This type of manipulation Sternberg referred to as stimulus degradation. The figure to the right illustrates the rationale for this study. Sternberg thought of this task as involving three basic processes. The encoding of the test stimulus (shown in blue); the exhaustive search process (shown in green); and the decision process (shown in red). Now, it was argued that the stimulus degradation manipulation could affect both the encoding and the search process. If it affects the search process, then it should add to the time it takes to make each comparison. If this is the case, then these two variables--stimulus degradation and list length would interact...that is, the amount of time it takes to make a decision will depend not only on the number of items but also on the degree of stimulus degradation. 
The presence of an interaction of these variables would then lessen the confidence that these folks would have in their identification of the basic processes involved in this task. Luckily for them, stimulus degradation did not interact with list length, and they felt it appropriate to continue to believe that they had not only identified a basic process of the mind/brain -- namely, scanning short-term memory for a match -- but also identified the nature of the process (exhaustive search) and the speed with which an individual comparison could be made. To appreciate in a more detailed fashion this argument about independence and the use of the interaction test in an analysis of variance as a test, you are encouraged to look over the page on Additive Factors and Analysis of Variance.
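To make the additivity claim concrete, here is a numerical sketch of the three-stage model. The stage durations are hypothetical round numbers chosen for illustration, not Sternberg's estimates; the point is only that if degradation affects encoding alone, it shifts the whole RT function up by a constant without changing the slope:

```python
# Hypothetical stage durations (ms) -- illustrative values, not Sternberg's estimates.
ENCODE = {"intact": 50, "degraded": 90}   # stimulus quality affects encoding only
COMPARE_PER_ITEM = 38                     # exhaustive search: one comparison per list item
DECIDE = 200                              # response selection and execution

def mean_rt(list_length, quality):
    """Additive model: RT is the sum of independent stage durations."""
    return ENCODE[quality] + COMPARE_PER_ITEM * list_length + DECIDE

# Additivity: degradation raises every RT by the same constant, so the
# slope (time per comparison) is identical in the two quality conditions.
slope_intact = mean_rt(4, "intact") - mean_rt(3, "intact")
slope_degraded = mean_rt(4, "degraded") - mean_rt(3, "degraded")
print(slope_intact, slope_degraded)  # both 38: no interaction
```

If degradation instead lengthened each comparison, the slope would differ between the two quality conditions -- exactly the interaction pattern the experiment was testing for.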

Back to Mental Olympics


Charles F. Schmidt

Experimental Decomposition of Mental Processes


ANOVA and Additive Models

Analysis of Variance and the Additive Factors Method

Analysis of Variance is a statistical procedure much beloved and revered in psychology. In its most revered form it involves more than one variable with two or more levels of each variable. The simplest version of this most esteemed form is shown below, where we have two independent variables, A and B. For each of these variables we have two values, A1 and A2 and B1 and B2. These are completely crossed, yielding four experimental groups.

Now, since we need to add, square and do similar violence to our dependent measure, it is most advisable to believe that your dependent variable is measuring something that has the same properties as a well-behaved domain such as the real numbers or at least the integers. In the case of the mental chronometry work, we are presumably in fine shape because the reaction time measure is a measure of good old physical time just like the physicists measure. We also should assume that the response measure has a random component associated with it; and this random component should have a more or less Gaussian distribution. It is also nice if real numbers (or at least integers) can be used to describe the levels of the independent variables. This is because we are going to use these values in the equations below...the terms such as a_i and b_j. All of these assumptions are probably rarely met in psychological experiments, but no one seems to care very much.

Now notice in the figure below we have written a bunch of formulae and done some algebra on these formulae. In these formulae, R_ij refers to the dependent variable or response to the treatment combination A_iB_j of the independent variables. If, in our analysis of experimental data, the main effects for A and B are significant as well as the interaction of A and B, then all of the terms in these formulae contribute to the prediction of the response. The terms in red represent the interaction terms.
Intuitively, the interaction terms reflect the possibility that the response is influenced by the particular combination of values of the independent variables.

But, if the interaction is non-significant, then these terms drop out and we are left with the table shown below. Here, the interaction terms have been dropped and you can see that a simple linear equation describes the way in which the response depends on the values of the independent variables.
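Since the figure containing the formulae did not survive conversion to this format, the decomposition can be written out directly: R_ij = mu + a_i + b_j + (ab)_ij, with each effect defined as a deviation from the relevant mean. A sketch with made-up cell means that happen to be perfectly additive:

```python
# 2x2 table of hypothetical cell means R[i][j] for factors A (rows) and B (columns).
# Model: R_ij = mu + a_i + b_j + (ab)_ij, all effects deviations from means.
R = [[300.0, 340.0],
     [360.0, 400.0]]

mu = sum(sum(row) for row in R) / 4                              # grand mean
a = [sum(R[i]) / 2 - mu for i in range(2)]                       # main effect of A
b = [sum(R[i][j] for i in range(2)) / 2 - mu for j in range(2)]  # main effect of B
ab = [[R[i][j] - (mu + a[i] + b[j]) for j in range(2)]           # interaction terms
      for i in range(2)]

# For these cell means every interaction term is zero: the table is additive,
# so R_ij = mu + a_i + b_j exactly, as in the linear equation described above.
print(mu, a, b, ab)
```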

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MChronAdd.html (1 of 2)11/11/2005 1:19:46 PM


What is called the Additive Factors Method is simply the use of this feature of the analysis of variance model to test for statistical independence. That is, if the interaction terms are not significant, then the response does not depend on the particular combination of values of the independent variables. Thus, there is now, in a weak sense, a method for testing the independence required by the subtractive method, rather than simply assuming that independence holds. Namely, if the investigator has identified two stages of a mental process, and believes that two independent variables have been identified which affect the stages independently, then an experiment that establishes this statistical independence gives the investigator more confidence that the two stages are in fact independent stages of a mental process.

Terms, Concepts and Questions - Decomposition of Mental Processes

Some Terms, Concepts and Questions

- Time
- Process
- Stage -- How are stages identified?
- Subtraction Method
- Additive Factors Method
- Statistical Independence
- Interaction
- Parallel / Serial Search
- Exhaustive / Self-Terminating
- Structure/Function
- Components
- Decay Time
- Capacity
- Decomposition of Mind
- Data Type
- Processes
- Cycle Time -- Relation to the methods above?


http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MChronTermsQs.html11/11/2005 1:19:48 PM

Assignment: What Might be Predicted...

Assignment: What Might be Predicted Based on this Picture of the Mind's Architecture?
Recall that the figure below (discussed on the page entitled Mental Olympics) was intended by the researchers who developed it as a summary of what some believe are the most important properties of the human "architecture" that underlies or supports human cognition. Assume that this characterization of the human architecture is accurate. Note that there are basically three sorts of assumptions embedded in this picture.
- Assumptions about the basic components of the human architecture.
- Assumptions, represented by the red arrows, about how information can or cannot flow between the components of this architecture.
- Assumptions about the basic properties of each component.

1. State one or more predictions that you think are probably true about human reasoning that can be made from these assumptions, together with a brief statement showing how the prediction follows from the assumptions represented in this figure.

2. State one or more predictions that you think are probably false about human reasoning that can be made from these assumptions, together with a brief statement showing how the prediction follows from the assumptions represented in this figure.

http://www.rci.rutgers.edu/~cfs/305_html/MentalChron/MChron_Assign.html (1 of 2)11/11/2005 1:19:50 PM


Following Instructions

Instructions, Computation and Thinking


"Follow these few simple instructions and you will have..." If you have encountered the phrase above, you have probably come to appreciate that the notion of "simple" instructions is not a very well-defined idea. This is partly the case because we fail to distinguish clearly between the instructions themselves -- what is written down or depicted in some manner -- and the agent who is to carry out the instructions. If my agent is Picasso, the instruction "Paint a picture" will do; if my agent is Bach, the instruction "Compose a fugue" will do. For agents less experienced and expert in these matters, much more extensive instructions are required; and, it is not clear that we could even come up with a successful set of instructions for every task that we might like to see accomplished.

In the early part of the 20th century there was a great deal of interest in rigorously analyzing the idea of a set of instructions and the notion of an agent that can carry out the instructions. One of the first things that we can note is that the idea of a set of instructions and that of the agent that carries out the instructions are not separable ideas. The instructions always presume an agent with certain capacities.

In mathematics, the idea of proof had undergone considerable refinement. Is there a sense in which proving something in an area of mathematics can be thought of as simply following instructions? If so, then how sophisticated must the agent be that can follow these instructions? Each area of mathematics had developed a notation for representing the objects of interest, and had usually developed procedures which, if followed, would yield some new expression in the notation. And, if all was well, it was said that the new expression 'followed from' the starting expressions. For example, in propositional logic we might start with two expressions such as p, p -> q, and then come up with some new expression, q. This rule of inference, modus ponens, is presumably a valid thing to do in propositional logic.

Part of the interest in studying instructions was to try to make as explicit as possible the notion of a proof -- what does it mean for one thing to follow from another? One possibility is that stating a proof depends very heavily on the agent's knowledge and capacity. The hope was that the opposite possibility would be true. Namely, that one could find a simple language for stating instructions that presumed only a marginally competent agent but was flexible enough to be used to state how to carry out proofs and compute functions. Note, we are not saying anything at this point about whether the agent could come up with the instructions for doing these things. Only that if someone comes up with the instructions, then the agent can follow the instructions.

You have probably guessed by now that this work formed the basis for the field that is now known as the mathematical study of computation. Computation was defined mathematically long before computers were built or Wall Street had the vaguest notion of their future economic importance. Now, we will introduce a bit of mathematical notation in describing this field...after all, it is mathematics. But the ideas are quite intuitive, and the notation is simply to try to guard against reading more into all this than is really there. For some reason, our culture both denigrates and elevates computers and computation. Some dry and boring notation will hopefully help us avoid either of these extremes.
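Modus ponens is mechanical enough that a very simple agent can apply it by rote. A minimal sketch (the representation of propositions as strings is an illustrative choice, not part of the logic itself):

```python
def modus_ponens(facts, rules):
    """Repeatedly apply modus ponens: from p and (p -> q), add q.
    facts: set of proposition names; rules: set of (antecedent, consequent) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for p, q in rules:
            if p in derived and q not in derived:
                derived.add(q)
                changed = True
    return derived

# From p and p -> q, the agent derives q -- purely by following instructions,
# with no understanding of what p and q mean.
print(modus_ponens({"p"}, {("p", "q")}))
```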

The Intuitive Idea


An instruction always involves an action that the agent is to carry out...place the medium size widget through the upper hole in the medium gray colored whatzit. The action takes place in some context -- in this case a context of widgets and whatzits. In addition to an action, an instruction will often state a condition which must be satisfied if the action is to be carried out. The condition also refers to the context in which the actions take place. Usually, more than one instruction is required to complete the task. Consequently, we also need to worry about how to put instructions together. One simple way to do this is to order them. In addition to the set of instructions, we have the stuff that is provided at the start of the task...the widgets and whatzits. And, we also need some way of knowing whether we have successfully completed the instructions. Now, let's move from these intuitions to some mathematics to see whether we can create a set of instructions suitable for a dumb agent.

http://www.rci.rutgers.edu/~cfs/305_html/Computation/instructions.html (1 of 3)11/11/2005 1:20:37 PM

The Idea of a Machine or Automaton


We introduce three kinds of things.

- We will need a set Q, which we will refer to as a set of control states. q0, q1,..., qf, and the like will be used to refer to elements of Q. Q is a finite set, but that's about all there is to say about this set.
- We will need a set S (usually represented by the capital Greek letter Sigma). s1, s2, si, and the like will be used to refer to elements of S. We will also include a special element, #, which will be used to refer to a blank. Again, this set is finite. And, you can think of it as the vocabulary used to refer to the domain.
- Finally, we will need a set of actions, A. This set is finite, and we will use R, L or some si to refer to an element of A. When we use si, this is a shorthand for the action of writing down si. We will use ak to refer to some element of A.

Now, we are ready to define the syntax of an instruction. An instruction is a tuple:

q_i s_j a_k q_m

The first two elements state a condition and the last two elements are the action portion of the instruction. Now, if this is what an instruction looks like, what sort of capacities does the agent who is to carry out these instructions require? Let's take each element in turn.

Q is a finite set of control states. Our agent will have to be able to represent or "hold onto" as many different control states or names as are needed for a certain set of instructions. And, additionally, the agent will have to keep track of which of those control states is the current state that the agent is in.

We also need some way to communicate input to the agent. Let's assume that the agent has a tape divided into squares where we can place an element of S on a square. We will assume that the agent is able to be at and have access to the contents of only one square at any point in time. What will be on the square will be an element of S or #, and our agent must be able to recognize each of these elements of S. That is, if a and b are elements of S, then our agent must be able to tell the difference between a and b. These capabilities allow our agent to be able to recognize the condition side of the instruction.

The agent needs to be able to store all of the instructions for a task and be able to find the instruction that matches the agent's current state and the element from S that is on the square of the tape where the agent's read head is currently located. If no such instruction exists, then the agent "hangs" or fails because there is no instruction that covers this condition. However, if there is an instruction for this condition, then the agent must follow the rest of this instruction. Recall that the third position is where the action is. We only need the agent to be able to execute three types of actions.
One is to move one square to the right on the tape, R; another is to move one square to the left on the tape, L; and finally, to write one of the si on to the current square of the tape. The final capability is to change the name of its current control state from the name in position 1 of the instruction to the name of the control state in position 4 of the instruction and remember this as the current state.


This is a rather simple agent and the instructions certainly seem quite clear. There are a few additional conventions that are needed in order to fully specify the idea of following a set of instructions. First, we need to make sure that our agent always starts in the same control state. We will call this control state the start state, and it is usually denoted as q0. And, we must insure that the agent always starts to the left of the square where the first input element has been placed. With these two conventions we have a well-defined start of a computation.

Finally, we need some way for the agent to know that it has finished the instructions and needn't look for any more. This is accomplished by designating some non-empty subset of Q as final states. Then, if the agent is in one of these final states, the agent "knows" that the computation is finished.

The contents of the tape at the beginning of the computation are called the input, and a subset of the contents of the tape at the end of a computation that has successfully halted may be called the output. Together they are called an input-output pair, or an <I,O> pair. It is this pair then that specifies the result of following instructions to take some initial notation into some new or final notation.
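The whole scheme -- tuples, tape, read head, start and final states -- is small enough to sketch as a program. This is one illustrative implementation, not code from the course pages; for brevity the head starts on the first input square rather than one square to its left, and the example program (the successor function in unary notation) is likewise my own:

```python
def run_tm(instructions, tape, q0="q0", finals={"qf"}, blank="#", max_steps=1000):
    """Follow (state, symbol, action, next_state) tuples.
    Actions: "R" (move right), "L" (move left), or a symbol to write."""
    table = {(q, s): (a, qn) for (q, s, a, qn) in instructions}
    cells = dict(enumerate(tape))         # tape squares indexed by position
    state, head = q0, 0
    for _ in range(max_steps):
        if state in finals:
            # Output: the non-blank contents of the tape, left to right.
            return "".join(cells[i] for i in sorted(cells) if cells[i] != blank)
        key = (state, cells.get(head, blank))
        if key not in table:
            raise RuntimeError("agent hangs: no instruction for " + repr(key))
        action, state = table[key]
        if action == "R":
            head += 1
        elif action == "L":
            head -= 1
        else:
            cells[head] = action          # write the symbol onto the current square
    raise RuntimeError("no final state reached")

# A tiny program: scan right across a block of 1s and append one more 1
# (the successor function in unary notation).
succ = [("q0", "1", "R", "q0"),   # condition (q0, 1): move right, stay in q0
        ("q0", "#", "1", "qf")]   # condition (q0, #): write a 1 and halt
print(run_tm(succ, "111"))  # prints "1111"
```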

Machines/Automata

Machines that follow Instructions - But So What?


We have arrived at a rather minimal agent who can understand and carry out very simple instructions. (If you want to read Turing's development of these ideas, click here.) But can this agent be instructed to solve any really interesting problems? Only a very small subset of problems? All problems? ... Well, we can define several interestingly different agents by only slightly altering the way they are constructed. The simplest machine or automaton that we think of as a computational device is called a finite state machine (FSM). This machine can read from its input tape but, most significantly, it cannot write on the tape. So what, you might say. Well, think of yourself doing some long division problem, or shopping for a big party, or doing the cryptarithmetic problem that we did earlier. Now, what if you couldn't write anything down to aid yourself in carrying out the computation required? It turns out that if you can't write to a memory, then certain types of functions cannot be computed using this limited instruction set/agent. Exactly what these limits are is a bit more subtle than you probably can imagine. So don't trust your intuitions. But, do explore the idea of an FSM a bit further by perusing some of the further information provided below.

- Formal definition of Finite State Machines...probably only useful if you are comfortable with set theory;
- Three ways of depicting an FSM...this one computes a^n b^m;
- Animation of the above FSM based on the Markov State Representation;
- Animation of the above FSM based on the Tuple Representation;

The example above showed that we could recognize the set consisting of 0 to n a's followed by 1 to m b's. Note, this set has no largest element. One million a's followed by 10 b's, or whatever, is included. So the size of the set or language recognized is not the critical feature. What is critical is whether the agent has to keep track of dependencies in the strings that make up the elements of the set.
If we change the set or language to strings that involve n a's followed by n b's, then no FSM can recognize this set. In order to recognize this set, we have to keep track of and make sure that there are exactly the same number of a's as b's. We can write an FSM to do this for a bounded n, say up to 100, but we can't do it in general. And, perhaps more significantly, the FSM doesn't capture regularities of this sort in a succinct and elegant way. A Turing Machine (TM) can recognize this set. The Turing Machine can write to the tape as well as move in either direction on the tape. A seemingly small change, but one that has some rather amazing consequences. But, before we move on to those, it will help if you study the representation of the Turing Machine referred to below. Try to build a picture of how and why this algorithm works.
- Depiction of a Turing Machine that recognizes a^n b^n;
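The a^n b^m machine mentioned above can be written down directly as a transition table. A sketch (the state names are arbitrary); notice that the machine carries nothing with it from square to square except its current state:

```python
# States: "q0" = still reading a's (not accepting), "q1" = reading b's (accepting).
# The language is 0 or more a's followed by 1 or more b's.
TRANSITIONS = {("q0", "a"): "q0", ("q0", "b"): "q1", ("q1", "b"): "q1"}

def accepts(string, start="q0", finals={"q1"}):
    """Run the FSM: no tape to write on, just a current control state."""
    state = start
    for symbol in string:
        if (state, symbol) not in TRANSITIONS:
            return False                  # no matching transition: reject
        state = TRANSITIONS[(state, symbol)]
    return state in finals

print(accepts("aaabb"), accepts("ba"), accepts("aaa"))  # prints True False False
```

Nothing in the table counts the a's. Recognizing a^n b^n would require exactly that kind of record keeping, which is why no finite transition table suffices for it.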

http://www.rci.rutgers.edu/~cfs/305_html/Computation/Machines_305.html (1 of 2)11/11/2005 1:20:39 PM


Now that you have seen how simple even these machines are, let's turn to the amazing aspects of all this. The first aspect is referred to as Church's Thesis or the Church-Turing Thesis. This thesis is the proposal that the informal notion of "a function that can be computed by an algorithm" (or computed "mechanically") can be identified with the set of functions computable by a Turing Machine. Roughly, this says that we won't be able to come up with any formalism, F, that captures our intuitive notion of an algorithm that will be more powerful than a Turing Machine. Here "more powerful" means that there are functions that can be computed by F that are not computable by a Turing Machine, but not the converse. Thus, this very simple device appears to be sufficient to compute anything that is computable. Aha, the qualification! It turns out that not everything is computable...there are an infinite number of functions that are; and, there are an infinite number of functions that are not...such is life. At least, we now have some understanding of what the limits of computation (or following instructions) are, as well as an appreciation for the source of these limitations.

Now, for the other amazing aspect. One can construct something called a Universal Turing Machine. It is a Turing Machine just like the others except that its instructions are about how to simulate another Turing Machine! A Turing Machine can be thought of simply as a long string of symbols...the tuples that we have become acquainted with. In addition to the tuples, there is the tape, which is also just a string of symbols. And, finally, there are the two control pieces of information that the machine keeps track of; namely, its current control state and the current location of the read head. All of this information can be put together onto a tape. This tape can then be seen as the input to our Universal Turing Machine. The tape contains the program (the specific TM) and the input data.
Thus, one can view a TM as data or as instructions describing a process. Now, despite being overwhelmed by these amazing facts, you may still wonder what this all has to do with human reasoning! From the advent of work on computation there has always been the question of the relation between computation and the mind. If human reasoning is a kind of computation, then we are a kind of computational device. And, it follows then, that one way to develop an understanding of human reasoning is to try to determine exactly what kind of computational device we are and also determine exactly how we accomplish various reasoning tasks. If we aren't computational devices, then we have to figure out some other way to talk about and study the mind. But even if we aren't a computational device; it is important to understand exactly what constitutes a computational device in order to understand the differences between mind and machine.


Levels Hypothesis

Levels Hypothesis
The so-called Levels Hypothesis claims that some devices can be described at a variety of levels...it will suffice to distinguish three such levels. They are:
- the Physical Level;
- the Symbolic or Computational Level; and
- the Semantic or Knowledge Level.

Computers and humans are, according to this hypothesis, both devices which can appropriately be described at each of these levels. (A lump of coal, a red apple, etc. are examples of things that aren't appropriately described at these various levels...only rather special things can be described at all of these levels.)

What is a level? Intuitively, it is a way of describing things. When we speak of our brain, its neurons, their organization, and the like we are describing things at the Physical Level. In addition to the way we carve up and name the "stuff" at this level, there are also laws, in this case physical laws, that describe the way in which the "stuff" behaves. Computers can be described at the hardware level: chips, transistors, bus, etc. And, the physical laws of electricity hold in describing this level. Thus, a level represents a kind of commitment to the existence and lawful behavior of certain kinds of entities.

The Symbolic or Computational Level is a level at which we describe symbols, expressions composed from symbols, processes that map from expressions to expressions, and the like. The Turing Machine was, of course, a description at this level. Now, the idea is that there are usually many ways to physically realize or instantiate a computational device. Nonetheless, the computational description and laws describe the device quite independently of its physical instantiation.

The Semantic or Knowledge Level is the level at which we describe the notion of a rational agent. One sense of this is that a device is rational to the extent that it uses its knowledge to attempt to satisfy its goals. Another, broader and more technical sense, is that a computational device realizes some semantics if and only if it generates outputs that are derivable from some well-specified semantics of the domain.

In order to simplify the discussion, let us turn to the game of Tic Tac Toe and try to use it to illustrate some of these distinctions.
An illustration of the idea that there are many ways to physically realize something that is described computationally can be viewed by clicking on Tic Tac Toe Example.

http://www.rci.rutgers.edu/~cfs/305_html/Computation/Levels.html (1 of 2)11/11/2005 1:20:46 PM


The three different physical settings within which Tic Tac Toe could be played are clearly quite different. However, all that matters about the physical setting is that it provide a way in which to represent the computational ideas of a move; of the entities controlled by each player at each point in the game; and finally, a relation that holds over 8 subsets of size 3 of the entities. Viewing the animation of a game in these three settings illustrates that the physical similarity between the traces of a game can range from fairly substantial to non-existent.

Now, reflect for a moment on the various sequences of events that are possible at each level. At the physical level, any physically realizable sequence of moves is possible. For example, as depicted in the figure to the right, 3 X moves could immediately be made across the top row. This sequence violates the rules of Tic Tac Toe, but certainly no physical law prevents it. Thus, if we find that only "legal" Tic Tac Toe sequences are observed, then either we must assume that these regularities arise from some other level or there is some Physical "Tic Tac Toe" Law that we have yet to discover.

Similarly, at the level of the game the set of legal move sequences includes moves that we would view as quite "irrational". For example, we find it strange if a player's next move could win the game, but the player makes some other non-winning move. The figure to the right provides an example of this "irrational" case. The "semantics" of a game is to try to achieve the goal of winning. Sequences of moves can violate this "semantics" but still constitute a syntactically correct game. Thus, we have the set of possible Tic Tac Toe games and the set of competitive Tic Tac Toe games. And, this latter set is a subset of the former.
There seems to be only a small set of devices which can also be viewed at this semantic level, where we consider not only what the device can do but what it "ought" to do if it is considered to be "rational." These remarks point to the potential relevance of normative ideas and models to the study of reasoning and intelligence. The levels hypothesis claims that the mind can be viewed from these three related but distinct perspectives. Hopefully, the Tic Tac Toe example will help you to remember these differing perspectives on the mind.
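The gap between the game level and the knowledge level can be made concrete: a legal-move generator knows nothing about winning, while a "rational" player filters those same legal moves through the goal. A sketch (the nine-character board encoding, with '.' for an empty square, is an illustrative choice):

```python
# The 8 winning lines: three rows, three columns, two diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def legal_moves(board):
    """Game level: any empty square is a syntactically legal move."""
    return [i for i, c in enumerate(board) if c == "."]

def winning_moves(board, player):
    """Knowledge level: the subset of legal moves that achieve the goal."""
    wins = []
    for i in legal_moves(board):
        after = board[:i] + player + board[i + 1:]
        if any(all(after[j] == player for j in line) for line in LINES):
            wins.append(i)
    return wins

# X to move with two in the top row: five legal moves, but only one of them
# (square 2, completing the row) is the "rational" winning move.
board = "XX.OO...."
print(legal_moves(board), winning_moves(board, "X"))
```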

Generate and Test

ARTIFICIAL INTELLIGENCE AND SEARCH


Work on the mathematical study of computation has informed us concerning the syntax of the minimal instructions/agent that allows computation to be defined. But what is the "minimal agent/syntax" that is required to realize intelligence? This is one of the questions that researchers in the field of Artificial Intelligence have addressed. The "birth date" of AI is generally traced to the Dartmouth Conference, which was held in the summer of 1956. Earlier attempts at using computers to realize "intelligent" systems had been made within the framework of "neural-like" networks. AI followed a more abstract road in that it sought to understand "intelligence" and design "intelligent" systems within the framework of the kind of symbol systems developed in the mathematical study of computation.

The basic requirement is to again define a kind of "dumb agent" that can carry out certain basic instructions. But, now the desire is for the "dumb agent" to be able to learn or become more intelligent as a result of its activities. The idea of search served as the starting point for our "dumb" or "minimally intelligent" agent. The basic components that are involved in search are:
- Some way of representing the information that is available at the start of the search; this is usually referred to as the start state and is often symbolized as s0;
- Some way of advancing the search by generating new information from the given information; these ways of generating new information are often referred to as generators or operators, and they are thought of as partial functions which take the agent from one state of information to another;
- Some way of representing the goal as a test that can be evaluated against the agent's state of information at each point in its search.

The search procedure is typically referred to as generate and test. The figure displayed to the right depicts a simple and concrete situation which we will use to discuss the ideas of search. Four colored blocks are depicted. We will pretend that the picture is the "world" and the letters at the top represent this world. That is, if the expression at the top maps correctly into those aspects of the world that we wish to capture, then the expression is said to represent this world. In this case each block is referred to by the letter that is on its face. Also, the left to right ordering is reflected in the expression. And, finally, the '#' symbol is used to indicate the different stacks of blocks. In this case each stack contains a single block. Note that this expression does not encode the color of the blocks, the separations between the blocks, their size, and so on. For example, if the blocks were all blue, this expression would not change.

http://www.rci.rutgers.edu/~cfs/472_html/AI_SEARCH/GenTest1.html (1 of 3)11/11/2005 1:21:10 PM


Now, assume for the moment that our problem is to create two stacks of blocks, and the situation depicted above is our initial state of the world. This goal situation might be represented by the string #wx#yz#, where we treat w, x, y, and z as variables that may take any block in the world as a value. Now, we will look at the way this problem is approached using the method of generate and test known as State Space Search.

To carry out state space search we require generators that correspond to actions that can be done in our world. And, the generators should be rules that, when applied to our representation of the world, yield a representation of a new world. And, this new world should correspond to the world that would obtain if we had actually carried out the action represented by the generator. Several types of action can be done in our little world. The blocks could be moved about, stacked, and unstacked. Note that although we might be able to change the color of the blocks in our world by painting them, this generator cannot be represented because the color of blocks is not represented in our representation language.

Clearly, we will need a stack generator to achieve our goal. The stack generator might look something like: Stack(sx,cy), which might be thought of as indicating that stack is an action that takes two arguments, a stack, sx, and a clear block, cy. To actually realize this action we would need to construct a procedure which would take an expression like #A#B#C#D# and be able to produce an expression like #A#CB#D#, which represents the world shown in the next picture. We won't worry about how to actually do this since that would take us into more detail than is necessary at this point. Now with this stack generator, we can solve the problem. For example, stack can be applied to #A#CB#D# to produce #CB#DA# as is shown in the next picture. It is obvious that the solution can be obtained.
What is less obvious is the nature of the state space within which the solution is to be found. Look again at the initial state represented as #A#B#C#D#. How many different stacking moves can be made from this state? Well, block A could be put on top of B or C or D; B could be placed on top of A or C or D; and so on. It would appear that there are 12 different possible stacks! And what are the stacks that could be created from these stacks? Well, take one case. If we had placed B on top of C as in the above example, then we could now place either A or D on top of the CB stack, or we could place D on A or A on D. It would appear then that each of our 12 possibilities admits these 4 possibilities...thus 12 x 4, or 48, possible worlds have been considered, and half of them achieve the goal of two stacks of two blocks.
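These generators and the test can be sketched in Python. The '#'-delimited string representation follows the text, while the function names and the simple queue-based control loop are my own choices, not notation from the course:

```python
from collections import deque

def stack_moves(state):
    """Generate every state reachable by one stacking move.

    A state is a '#'-delimited string of stacks, e.g. '#A#B#C#D#' is
    four single-block stacks; the last letter of a stack is its top,
    and only a top (clear) block may be moved onto another stack.
    """
    stacks = [s for s in state.split('#') if s]
    successors = set()
    for i in range(len(stacks)):
        for j in range(len(stacks)):
            if i == j:
                continue
            new = list(stacks)
            new[j] = new[j] + new[i][-1]   # top of stack i goes onto stack j
            new[i] = new[i][:-1]           # may leave stack i empty
            successors.add('#' + '#'.join(s for s in new if s) + '#')
    return successors

def generate_and_test(start, is_goal):
    """Generate states with stack_moves, test each with is_goal."""
    frontier, seen = deque([start]), {start}
    while frontier:
        state = frontier.popleft()
        if is_goal(state):                  # the test
            return state
        for nxt in stack_moves(state):      # the generator
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None

def two_by_two(state):
    """The goal #wx#yz#: exactly two stacks of two blocks each."""
    stacks = [s for s in state.split('#') if s]
    return len(stacks) == 2 and all(len(s) == 2 for s in stacks)
```

Here stack_moves('#A#B#C#D#') yields exactly the twelve stacking moves counted above, and generate_and_test finds a two-by-two goal state two moves from the start.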


However, note that if we had just generated those states of the world that had a stack of three blocks, then we could never achieve the goal. Or, if the starting state had been #ABC#D# rather than #A#B#C#D#, we would not have been able to achieve the goal, because we did not represent the action of unstacking as a generator. We leave it to the reader to work out the details of what happens to the search space if we add this unstack generator. The world is more realistically represented in this case, and more problems are solvable in this world; but a much larger space of states must be considered. Add the possibility of exchanging the positions of blocks, for example from #A#B#C#D# to #C#B#A#D#, and the space grows again.

Now, hopefully you have a better intuitive feel for the basic way in which a state space comes about. With this understanding you can better appreciate the issues that arise when we consider how to systematically control the search through this space for a solution. In order to give you another way in which to understand the basic control regimes discussed in class, you can click Search Control Strategies below to see animations of a hypothetical breadth-first, depth-first, and hill-climbing search. Search Control Strategies
Charles F. Schmidt



Exhaustive Search Methods

Methods of Controlling Search


We consider first the exhaustive search methods known as breadth-first and depth-first search. We will use the hypothetical search tree exhibited below to illustrate some of the ideas involved in the various search methods.

The root node of this tree is labeled s0 to indicate that this is the starting state for our search. The node in the center that is three levels down and has a g to its right is our goal node. For purposes of illustration we will pretend that this is the total search tree. This is, of course, pretty unrealistic; most search trees will have hundreds of thousands, if not millions, of nodes. Our little example has only 23 nodes, or 22 states that can be reached from the start node, and the goal node is only three moves away from the start node. The figure below uses this tree to illustrate one kind of exhaustive, systematic search, which is referred to as breadth-first search.

In this type of search, the search proceeds by generating and testing each node that is reachable from a parent node before it expands any of these children. The test is indicated in this animation by the appearance of a ? over the node generated. The node and branches remain yellow in color as long as they must be retained in memory. Search is terminated when a solution is found; that is, when the test of a state returns true. In this animation, when the goal node is discovered, it is changed to green to indicate this fact. Then, the solution is traced back to the starting node to indicate the noting of the solution path; this is also indicated by changing this path to a green color. Finally, the paths that were searched but were not part of the solution are changed to gray to indicate that they no longer need to be held in memory. In this example, 18 nodes were opened before the solution was found.

The search is said to be exhaustive because it is guaranteed to generate all reachable states before it terminates with failure. By generating all of the nodes at a particular level before proceeding to the next level of the tree (the breadth-first strategy), this control regime guarantees that the space of possible moves will be systematically examined. We usually assume that the space that will be searched is finite. Of course, it could be quite large, and the solution may lie a thousand steps away from the start node. Consequently, in practice one would usually specify a depth, or move distance from the starting node, beyond which you would not search. In our case the maximum depth was only three.
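A minimal Python rendering of this breadth-first control regime may help; the small dictionary tree below is a stand-in of my own, not the 23-node tree in the figure:

```python
from collections import deque

def breadth_first(start, successors, is_goal):
    """Generate and test every child of a parent before expanding any child.

    The queue holds whole paths; every generated-but-unexpanded node must
    stay in memory (the yellow nodes in the animation), which is why this
    regime is memory-hungry.  The first goal found lies on a shortest path.
    """
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()     # oldest path first: level by level
        node = path[-1]
        if is_goal(node):             # the test (the '?' in the animation)
            return path
        for child in successors(node):
            frontier.append(path + [child])
    return None

# A small stand-in tree; 'g' plays the role of the goal node.
tree = {'s0': ['a', 'b'], 'a': ['c', 'd'], 'b': ['e', 'g']}
path = breadth_first('s0', lambda n: tree.get(n, []), lambda n: n == 'g')
```

Here path comes back as ['s0', 'b', 'g']: every node at depth one was tested before any node at depth two, and the path found is the shortest in this little tree.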

http://www.rci.rutgers.edu/~cfs/305_html/Computation/ExhaustiveSearch_305.html (1 of 3)11/11/2005 1:21:40 PM


You will notice that this method of search requires considerable memory resources. It does, however, guarantee that if we find a solution it will be the shortest possible. Another type of exhaustive search is referred to as depth-first search, because the tree is examined to some depth d before another path is considered. The next animation illustrates this type of exhaustive search.

Recall that in this example, the maximum depth that can be reached in our tree is only three. When this limit is reached and the solution has not been found, it is possible to extend the search to other branches of the tree that were initially ignored. To realize this, the search backtracks to the previous level and explores any remaining alternatives at this level, and so on. It is this systematic backtracking procedure that guarantees that all of the possibilities will be systematically and exhaustively examined. In this animation, you will notice that nodes become gray after backtracking has occurred; again, this indicates that these nodes need no longer be retained in memory. In this example, only 12 nodes were explored before the solution was found, and we never had to hold more than three nodes in memory. However, the solution might have been the first node generated on the far right of the tree; in that case, the breadth-first procedure would have found the solution much quicker than the depth-first procedure. Again, if the tree is very deep and the maximum depth searched is less than the maximum depth of the tree, then this procedure is "exhaustive" only modulo the depth that has been set.

Finally, we illustrate one kind of heuristic search, namely, hill-climbing. In this case, the control regime is basically that of depth-first search. However, at each choice point in the depth-first search all of the candidates are generated and then evaluated using an evaluation function. The result of this evaluation informs the decision about which of these alternatives to pursue. The best candidate according to the evaluation function is depicted in blue in this animation.
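Depth-first control and hill-climbing differ only in the order in which children are tried; a sketch, again over a stand-in tree and a made-up evaluation function of my own:

```python
def depth_first(node, successors, is_goal, limit):
    """Depth-first control with a depth cutoff.

    One branch is followed down to the limit before any sibling is tried;
    backtracking is just the unwinding of the recursion, so memory holds
    only the current path (at most limit + 1 nodes).
    """
    if is_goal(node):
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        rest = depth_first(child, successors, is_goal, limit - 1)
        if rest is not None:
            return [node] + rest
    return None

def hill_climb(node, successors, is_goal, limit, evaluate):
    """Same control regime, but at each choice point all candidates are
    generated, evaluated, and tried best-first (the blue node)."""
    if is_goal(node):
        return [node]
    if limit == 0:
        return None
    for child in sorted(successors(node), key=evaluate, reverse=True):
        rest = hill_climb(child, successors, is_goal, limit - 1, evaluate)
        if rest is not None:
            return [node] + rest
    return None

# Stand-in tree: 'g' is the goal, two moves from the start.
toy_tree = {'s0': ['a', 'b'], 'a': ['c', 'd'], 'b': ['e', 'g']}
succ = lambda n: toy_tree.get(n, [])
rank = lambda n: {'b': 1, 'g': 2}.get(n, 0)   # hypothetical evaluation function

dfs_path = depth_first('s0', succ, lambda n: n == 'g', 3)
hc_path = hill_climb('s0', succ, lambda n: n == 'g', 3, rank)
```

Both return ['s0', 'b', 'g'] here, but depth-first first wanders down through a, c, and d and backtracks, while the evaluation function sends hill-climbing to b immediately.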

Now that you have seen these differing control regimes, review them with an eye toward the computational resources, time and memory, that each requires. After all of this, click below if you want a diversion: a page with only the animations, where, if you can get them all onto your screen at once, you can watch them race each other to the solution. Breadth First, Depth First Search and Hill Climbing





Search Animations

Breadth-First Search

Depth-First Search

Hill-Climbing


http://www.rci.rutgers.edu/~cfs/305_html/Computation/SearchAnimations_305.html11/11/2005 1:21:42 PM

Problem Reduction Search



Sometimes problems only seem hard to solve. A hard problem may be one that can be reduced to a number of simple problems...and, when each of the simple problems is solved, then the hard problem has been solved. This is the basic intuition behind the method of problem reduction (also referred to as problem decomposition and related to the method known as divide-and-conquer). This type of search differs in many ways from state space search.

The typical problem used to illustrate problem reduction search is the Tower of Hanoi problem, because this problem has a very elegant solution using this method. Here we will compare the state space formulation with the problem reduction formulation of this problem. The story that is typically quoted to describe the Tower of Hanoi problem describes the specific problem faced by the priests of Brahma. Just in case you didn't decide to read this story, the gist of it is that 64 size-ordered disks occupy one of 3 pegs and must be transferred to one of the other pegs. But only one disk can be moved at a time, and a larger disk may never be placed on a smaller disk. Rather than deal with the 64-disk problem faced by the priests, we will consider only three disks...the minimum required to make the problem mildly interesting and useful for our purpose here, namely to illustrate problem reduction search.

The figure below shows the state space associated with a 3-disk Tower of Hanoi problem. The problem involves moving from a state where the disks are stacked on one of the pegs to a state where they are stacked on a different peg. In this case, we will consider the state at the top of the figure to be the starting state; all three disks are on the left-most peg. And we will consider the state at the bottom right to be the goal state; in this state the three disks are all stacked on the right-most peg.

State Space for the 3 Disk Tower of Hanoi Problem


Recall that in state space search the generators correspond to moves in the state space. Thus, the two states below the top state in the triangle of states correspond to the movement of the smallest disk either to the right-most peg or to the middle peg. The shortest solution to this problem corresponds to the path down the right side of the state space. This solution is shown in the animation below.

http://www.rci.rutgers.edu/~cfs/305_html/Computation/ProbRed_305.html (1 of 2)11/11/2005 1:21:58 PM


In problem reduction search the problem space consists of an AND/OR graph of (partial) state pairs. These pairs are referred to as (sub)problems. The first element of the pair is the starting state of the (sub)problem and the second element of the pair is the goal state of the (sub)problem. There are two types of generators: non-terminal rules and terminal rules. Non-terminal rules decompose a problem pair, <s0, g0>, into an ANDed set of problem pairs {<si,gi>, <sj,gj>, ...}. The assumption is that the subproblems are in some sense simpler than the problem itself. The set is referred to as an ANDed set because the solution of all of the subproblems implies that the problem has been solved; all of the subproblems must be solved in order to solve the parent problem. Any subproblem may itself be decomposed into subproblems. But in order for this method to succeed, all subproblems must eventually terminate in primitive subproblems. A primitive subproblem is one which cannot be decomposed (i.e., there is no non-terminal rule that is applicable to the subproblem) and whose solution is simple or direct. The terminal rules serve as recognizers of primitive subproblems.

The symmetry of the state space shown above may have led you to suspect that the Tower of Hanoi problem can be elegantly solved using the method of problem decomposition. The AND tree that solves the 3-disk problem is shown below.

AND Tree Showing the Problem Reduction Solution to the 3 Disk Tower of Hanoi Problem
Let us number the state space solution shown in the state space above 1 through 8 so that we can refer to the states by number. State 1 corresponds to the topmost or starting state and state 8 to the bottom right corner or goal state. These two states are the first and second element of the problem shown as the root node in the AND tree above. The red arc is used to indicate that the node is an AND node. The problem is decomposed into three subproblems: the left subproblem consists of states 1 and 4; the middle subproblem consists of states 4 and 5; and the right corresponds to states 5 and 8. Note that the left and the right subproblems correspond to the top and bottom nodes of the upper and lower triangles respectively. The middle subproblem corresponds to the move that links these two triangles of states. Note that this middle subproblem has no further decomposition; it is a primitive problem that corresponds to moving the large disk from the first to the third peg. The yellow border is used to depict primitive or terminal subproblems. The left and right subproblems are not primitive, and they are each decomposed further. The three subproblems for each of these are primitive and correspond to the first three and the last three moves of the solution.

In this example, only AND nodes are shown. An OR node corresponds to the case where more than one non-terminal rule is applied to a particular subproblem. For an OR node, at least one of the alternatives must be solved in order to solve the (sub)problem.
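The non-terminal rule behind this AND tree can be written directly as a short recursive program. A sketch in Python (the peg names and the pair-of-pegs move format are arbitrary choices of mine):

```python
def hanoi(n, src, dst, spare):
    """Problem reduction for the Tower of Hanoi.

    <move n disks from src to dst> decomposes into an ANDed triple:
    move the n-1 smaller disks onto the spare peg, make the primitive
    move of the largest disk, then rebuild the n-1 disks on the target.
    The 1-disk problem is the primitive (terminal) subproblem.
    """
    if n == 1:
        return [(src, dst)]                    # primitive subproblem
    return (hanoi(n - 1, src, spare, dst)      # left subproblem
            + [(src, dst)]                     # middle: the big disk moves
            + hanoi(n - 1, spare, dst, src))   # right subproblem

moves = hanoi(3, 'left', 'right', 'middle')
```

Solving every subproblem yields 2**n - 1 moves; for three disks, moves 1-3 and 5-7 come from the left and right subproblems, and move 4 is the primitive middle move, matching the tree above.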


Blocks Problem Example of Production Rule Architecture

Example of Production Rule Architecture used to Clear the Bottom Block in a Stack of Four Blocks

The production rule system exemplified here is slightly different from that described in your text. It is a system that was written for use in the lab associated with 472, which I also teach. Nonetheless, working through this description and following the animation should give you a better feel for how production systems work. Note particularly that the amount of memory, the working memory, available to support the search for a solution is quite small. And, since no tree is maintained, the "state of the search" is maintained in this working memory.

The figure below illustrates the general form of the production rules used in this example. LHS refers to the left hand side or condition side of the rule, and RHS refers to the right hand side or action portion of the rule. The form used here mirrors the form used in the lab for 830:472, and, in fact, this software was used to generate this example. In this form, the RHS is divided into two portions: the first line, here shown in red, lists those expressions which are deleted from working memory (WM), and the second line, here shown in blue, lists those expressions which are added to WM.

The specific production rules used in this example are shown in the field below. The first unstacks, or makes a block clear, if we have already established the goal of clearing that block. The second rule introduces a goal of clearing a block that is currently not clear. Note that in this architecture we have explicitly introduced "meta" relations, e.g. the 'goal' relation, for use in controlling the action of the production rule system.

The next field animates the activity of this system using these rules. The illustration includes three components. First, in the upper right of the figure is shown the rule that was applied on a particular cycle, together with the bindings that were found for the variables in the rule in the match phase. Second, a depiction of the contents of working memory (WM) is shown. In this depiction, those items shown in darkened colors are items that either are not yet present or are no longer active or relevant. Third, to the lower right is depicted the state of the "world" that corresponds to the expressions in WM at that point.
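This is not the lab software itself, but the two rules described above can be sketched as a miniature recognition-act loop in Python. Working memory is a set of tuples; the tuple formats, the rule ordering, and the choice to delete a goal once it is satisfied are all assumptions of mine, not features of the 830:472 system:

```python
def step(wm):
    """One recognition-act cycle over working memory (a set of tuples).

    R1 (unstack): LHS goal(clear x), on(y, x), clear(y)
        RHS deletes on(y, x) and the satisfied goal; adds clear(x).
    R2 (subgoal): LHS goal(clear x), on(y, x), y not clear, no goal for y
        RHS adds goal(clear y).
    Returns a description of the firing, or None if nothing matched.
    """
    goals = [g for g in wm if g[0] == 'goal']
    for (_, _, x) in goals:                                   # match R1
        for fact in list(wm):
            if fact[0] == 'on' and fact[2] == x and ('clear', fact[1]) in wm:
                y = fact[1]
                wm -= {('on', y, x), ('goal', 'clear', x)}    # RHS deletes
                wm |= {('clear', x)}                          # RHS adds
                return ('unstack', y, x)
    for (_, _, x) in goals:                                   # match R2
        for fact in list(wm):
            if (fact[0] == 'on' and fact[2] == x
                    and ('clear', fact[1]) not in wm
                    and ('goal', 'clear', fact[1]) not in wm):
                wm.add(('goal', 'clear', fact[1]))            # RHS adds
                return ('subgoal', fact[1])
    return None

def run(wm, limit=20):
    """Cycle until no rule matches; the trace is the search history."""
    trace = []
    while len(trace) < limit:
        fired = step(wm)
        if fired is None:
            break
        trace.append(fired)
    return trace

# A stack of four blocks (D on C on B on A) and the goal of clearing A.
wm = {('on', 'B', 'A'), ('on', 'C', 'B'), ('on', 'D', 'C'),
      ('clear', 'D'), ('goal', 'clear', 'A')}
trace = run(wm)
```

Two subgoal firings push goals down the stack, then three unstack firings clear A. Note that the only "state of the search" is working memory itself.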
http://www.rci.rutgers.edu/~cfs/305_html/Computation/PRULEEx.html (1 of 2)11/11/2005 1:22:14 PM




Terms, Concepts And Questions - Computation

Some Terms, Concepts and Questions

Vocabulary:
- Agent, Instruction, Actions
  What is the relation between the Agent and the Instructions?
- Finite State Machine, FSM: action {R}
- Turing Machine, TM: actions {R, L, W}, infinite tape
  How do these machines differ? Resources required?
- British Museum Algorithm
- Generate and Test: Generators, Start State, Test, Control Structure
- State Space Search: Exhaustive (Breadth-First, Depth-First), Heuristic Search (Hill-Climbing)
  Resources required?
- Problem Reduction (Means-Ends Analysis): Non-Terminal Rule, Terminal Rule
  How is the AND/OR Search Space Defined?
- Production System (Recognition-Act System): Production Rule, Left Hand (Condition) Side (LHS), Right Hand (Action) Side (RHS), Working Memory, Long Term Memory, Conflict Resolution
  How is control realized?


http://www.rci.rutgers.edu/~cfs/305_html/Computation/ComputationQs.html11/11/2005 1:22:19 PM

Computation and Computers as Physical Devices

Computers as Physical Devices and Computation


If you can describe the physical laws that govern the operation of each physical device that makes up a computer, then do you understand the computation that is being carried out on the computer?

If you can describe the linkage of each physical device that makes up a computer and the physical laws that govern the flow of energy through the linked devices, then do you understand the computation that is being carried out on the computer?

If you have a complete temporal record of the changes in the distribution of energy in the physical devices that make up a computer, and know the physical laws that govern those changes, then do you understand the computation that is being carried out on the computer?

http://www.rci.rutgers.edu/~cfs/305_html/Computation/Computer_Thinking_Qs.html11/11/2005 1:22:24 PM

Computing and Thinking

Computing ?=? Thinking
Computation ?=? Mind


Can a computational system be designed to:
- intentionally ignore an input?
- respond differently to the same physical events (either input events or internal events), i.e., interpret physical events?
- modify its internal state?
- respond differently depending on its internal state?
- modify its way of doing things?
- represent knowledge about the world?
- acquire knowledge about the world?
- ...

And:
- What is essential to these abilities?
- What are the minimum capabilities required to say that something is a computational system?
- Are there problems which no computational system will ever be able to solve?


http://www.rci.rutgers.edu/~cfs/305_html/Computation/Computer_Thinking_Qs2.html11/11/2005 1:22:30 PM

Propositional Logic: Some Intuitive Ideas

Propositional Logic: Some Intuitive Ideas


On seeing a film with Steve McQueen after he had died: "He must have made that movie before he died." ...Yogi Berra
"It ain't over 'til it's over." ...Yogi Berra

One of the questions we might ask about human reasoning is whether or not it is logical. One answer to this question is: "Human reasoning is either logical or it is not logical." On the face of it, this is not a very interesting answer to what seems like a profound question. But notice the form of this "answer": either it is or it isn't. This form is interesting precisely because we recognize that any answer of the form 'p or not p' is not a very interesting answer. Why do we know this? Is this something we must be taught, or does the human mind have built into it certain assumptions, such as this one, that provide the basis for what we often refer to as logic?

Consider next the value assigned to some statement p. If the value assigned to p is something like 'true' or 'false,' then most of us would, I hope, agree that things are either true or false; and, if something is not true, then it is false. That is, we would agree that if we are speaking of the truth or falsity of a statement, then negation of a statement should reverse these values. The little table to the left reflects this assumption. This table is referred to as the truth table for negation. What it tells us is that if we have some proposition, p, and it is true (T), the entry on the left, then the negation of p (the funny little hyphen-like sign, ¬) shown on the right is false (F). And row 2 says that if p is F then the negation of p is T.

Two things can be noted at this point. First, in order to think of something as being true or false, that something can't be the thing itself! If I point to an apple, the apple itself and the fact that-it-is-red, that-it-is-on-the-counter, etc. are not things that are true or false. A statement about that apple, for example that 'the apple is red,' may be true or false. It is because our mind can represent beliefs about things that we can talk about a relation between those beliefs and the things themselves. And one such relation is whether the belief is true or false.

Second, note that we have made a distinction here between some statement about something, p, and an operation on the statement, namely, negation. Negation is a unary operator: it takes a single proposition, and it maps the truth value of p to the opposite truth value. That is what the truth table above says, and you can think of it as analogous to the minus sign, which is a unary operator on a number. That is, if I have some number n, then -n is also a number. For example, -(3) is -3 and -(-3) is 3. Now a question. This operation of negation: is it something that happens in the world, or is it something that only our mind can carry out? If it is not something that is in the world, then our mind may already have the idea "built-in," so to speak.

Recall that we distinguished above between a representation about something and the thing itself. This distinction is, roughly speaking, the distinction between what is termed syntax and semantics. So far we have spoken of some proposition which we refer to as p, a unary operation, '¬', on propositions which we refer to as negation, and two values, true and false. Our language so far consists of names for statements, where the names are things like p, q, p1, p2, ..., and the ¬ operator. The syntax of this language is quite simple: we can either say a name, or a name preceded by any number of ¬ symbols, and nothing else. That is, we cannot say p¬, or pq, or ¬pq¬p¬r, and so on. And we assume that every p has associated with it one and only one value from the set {T,F} or {True,False}. This truth value is what
http://www.rci.rutgers.edu/~cfs/305_html/Deduction/deduction_intuitions.html (1 of 3)11/11/2005 1:22:58 PM


tells us about the relation between the proposition and "the world." This is the semantics: a mapping of each p, q, ... to one and only one truth value. Thus, all we have are two sets, a set of names of propositions and the set of truth values, {T,F}, and a mapping from each proposition name to a single truth value. What could be simpler?

Well, things are going to become a bit trickier. There are other operations that we feel are appropriate to carry out on propositions. These latter operations are all binary operators: they operate on two propositions rather than one. For example, let p be "the apple is red" and q be "the apple is on the counter." Then it seems reasonable to be able to say that "the apple is red" and "the apple is on the counter," or, in our shorthand for these propositions, p and q. Now, just as in the case of negation, the intuition is that the truth value of this conjunctive statement should be a function of the truth values of each component. What, then, should be the function between the truth values of 'p' and of 'q' and the truth value of 'p and q'? The truth table on the left provides the standard answer. For example, the top row says that if p is T and q is T, then the conjunctive statement is T. For all other combinations, the table says that the truth value of the conjunctive statement is F. Again, as in the case of 'not', it does not seem that this sense of 'and' is a property of the world. Rather, it appears to be a property of the mind...one of the operations that the mind feels comfortable doing on (at least some but maybe all) propositions!

Our language provides many words that can be used to put two statements together: for example, or; if...then; ...because...; ...therefore...; and so on. In the development of formal logic, one of the decisions that must be made is how many logical operators are required, together with the truth table that defines the meaning of each operator.

To the left is shown the truth table of the logical operator known as disjunction, or inclusive 'or'. This sometimes causes confusion because we intuitively recognize another sense of 'or', which is referred to as exclusive 'or'. The truth table for this operator is shown to the right.

There are a few more logical operators that are typically defined. The most important is the conditional, or "if...then" operator, which is shown below.

This is probably the one that causes the most confusion when people study propositional logic. Part of the difficulty may be that we don't really have an unambiguous natural language term for this operator. For example: "If p, then q"; "p only if q"; "q is a necessary condition for p"; "p is a sufficient condition for q"; "q if p"; "q follows from p"; "q provided p"; "q is a logical consequence of p"; and "q whenever p" have all been suggested as appropriate readings for this logical operator.
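The connectives discussed so far can be tabulated mechanically. A small Python sketch, with Booleans standing in for T and F and operator names of my own choosing (the last entry anticipates the biconditional, discussed next):

```python
def truth_table(op):
    """All four rows for a binary connective, in the order TT, TF, FT, FF."""
    return [(p, q, op(p, q)) for p in (True, False) for q in (True, False)]

AND = lambda p, q: p and q          # conjunction
OR  = lambda p, q: p or q           # disjunction, inclusive 'or'
XOR = lambda p, q: p != q           # exclusive 'or'
IF  = lambda p, q: (not p) or q     # the conditional, 'if p then q'
IFF = lambda p, q: p == q           # the biconditional, 'p if and only if q'
```

truth_table(IF) reproduces the table above: the conditional is false only in the one row where p is true and q is false.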


(As you read the chapter in the text you will find that psychologists have done a good deal of research on
people's ability to interpret conditional statements. It isn't always clear exactly why this has generated so much interest.)

One final truth table and we will have looked at all of the standardly defined logical operators. This final operator is referred to as the biconditional and is often read as "if and only if." It turns out that for purposes of developing propositional logic you don't really need all of these logical operators, because you can define some in terms of others. This is perhaps one of the points at which our intuitions and the development of propositional logic part company. For example, an equivalence for the conditional is shown below. This means that we can avoid using this connective if we wish. Now, you may have to think a long time before this equivalence seems intuitively obvious. But if you refer to the truth table above for the conditional and work out the truth values for 'not p or q', you will see that you end up with the same truth table. Another logical operation that can be "defined away" is the biconditional: conjunction and the conditional can be used to define an identity for the biconditional. A more extensive list of some of the more important logical identities is provided for your reference.

Now, if you are blurring over a bit at this point, not to worry...that is entirely appropriate if you haven't spent much time pondering truth tables. So let us step away from this detail and recall the basic ideas:

- there is a set of basic propositions;
- each proposition has a truth value associated with it;
- complex propositions can be formed using the logical operators;
- the truth value of a complex proposition should be a function of the truth values of its constituent basic propositions.
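The claim that some connectives can be "defined away" is easy to check by brute force. A sketch in Python, with Booleans standing in for T and F; the conditional's table is written out directly as a dictionary so the check is not circular:

```python
from itertools import product

def tautology(expr, n):
    """True iff expr, a function of n propositions, holds under all
    2**n assignments of truth values to its basic propositions."""
    return all(expr(*row) for row in product((True, False), repeat=n))

assert tautology(lambda p: p or not p, 1)        # a tautology
assert not tautology(lambda p: p and not p, 1)   # a contradiction

# The conditional's truth table, row by row: TT, TF, FT, FF.
IF = {(True, True): True, (True, False): False,
      (False, True): True, (False, False): True}

# The identity for the conditional: 'p -> q' matches 'not p or q' everywhere.
assert tautology(lambda p, q: IF[p, q] == ((not p) or q), 2)
# The biconditional defined via conjunction and the conditional.
assert tautology(lambda p, q: (p == q) == (IF[p, q] and IF[q, p]), 2)
```

Each assertion is exactly the exercise suggested above: work out both truth tables and confirm they agree in every row.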

Each of these basic ideas seems quite reasonable. And each seems like something that could well be true about the way our mind works. Well, it is a pretty sure bet that our mind doesn't work exactly this way. But is that because it "tries" but can only approximate this way of reasoning? Or is there a totally different "logic" that is used by the mind? Or have our intuitions totally misled us, so that each of the ideas above is totally foreign to the working of the mind?



Proof in Propositional Logic

Syntactic and Semantic Proof


Here we consider an example of a proof. The figure to the right provides the basis for our example. In this example we begin with a set of assertions stated in English. The first step is to identify the basic propositions that are contained in these assertions. These are shown below and will be referred to as f, s, m, and c. Next, we must determine the meaning of the connectives used in the English assertions in order to identify the complex propositions. These are shown as three premises in the next box and correspond to the first three sentences. We wish to determine whether or not the negation of s follows from these premises.

Another way of putting the question is shown in the last box. Here the entire set of English assertions is mapped to a single complex proposition. If this complex proposition is a tautology, then the negation of s is implied by the three premises. A tautology is a proposition that is true under all possible assignments of truth values to its constituent propositions; the complex proposition 'p or not p' is an example. Conversely, a contradiction is a complex proposition that is false under all possible assignments of truth values; the complex proposition 'p and not p' is an example.

There are two basic ways in which to try to determine whether a proposition deductively follows from a set of propositions. One way is to reason in the syntax of logical expressions, and the other is to reason in the semantics of the expressions.

A Syntactic Proof
We first consider reasoning in the syntax of logical expressions. This is referred to as the proof-theoretic approach. In this approach the logical expressions are manipulated in order to discover a deductive proof. The figure below shows some of the better-known rules of inference. Each rule is shown on the left, and the common name for the rule is shown on the right.

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/proplogic_proofs.html (1 of 5)11/11/2005 1:23:38 PM


To read these as rules of inference you must do two things. First, think of p, q, r, and so on as standing for any proposition. Thus, these are general rules and technically should probably be called inference schemata. Second, find the '>' that is not enclosed in any sort of parenthesis. The expression to the left of this implication is the premise, and the expression to the right is the inference that is permitted from the premise. For example, the first rule, which is called addition, says that if 'p' is true then you may infer that 'p or q' is true. And, of course, if you check out the truth table for this you will immediately see why it is valid; you will also note that this complex expression is a tautology.

The figure to the right shows the syntactic proof of our example. For each line of the proof I have written the basis for that line to the right. Thus, the first two lines are simply the first two premises. The conjunction of these premises provides the left side of the inference rule called hypothetical syllogism, and the third line is the right hand side of this hypothetical syllogism. Thus, each line of this proof is justified, and of course the last line is the proposition that we set out to prove.

This seems rather straightforward. But there is a practical problem. The number of rules of inference that we might try at any point is usually quite large. And, in fact, finding a proof is simply a form of search...and one that we want to be exhaustive so that we will be sure to find a proof if one exists. Thus the practical problem: these search spaces can get very large, and their size increases very dramatically as we increase the number of premises that we are given.
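Each rule in the figure can itself be checked mechanically, in the way just suggested for addition: confirm that its premise-implies-conclusion form is a tautology. A sketch (the figure is not reproduced here, so the rule names below follow their common names rather than any particular notation):

```python
from itertools import product

IF = lambda p, q: (not p) or q   # the material conditional

def valid(rule, n):
    """A rule of inference is valid iff its 'premises imply conclusion'
    form is a tautology, i.e. true in every one of the 2**n rows."""
    return all(rule(*row) for row in product((True, False), repeat=n))

# Addition:                p  =>  p or q
assert valid(lambda p, q: IF(p, p or q), 2)
# Modus ponens:            p and (p -> q)  =>  q
assert valid(lambda p, q: IF(p and IF(p, q), q), 2)
# Hypothetical syllogism:  (p -> q) and (q -> r)  =>  p -> r
assert valid(lambda p, q, r: IF(IF(p, q) and IF(q, r), IF(p, r)), 3)
# A broken "rule" fails the check: affirming the consequent is not valid.
assert not valid(lambda p, q: IF(q and IF(p, q), p), 2)
```

The last line shows why the check matters: a schema that merely looks rule-like is rejected because some row of its truth table comes out false.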

A Semantic Proof
Perhaps reasoning in the semantics will make life a bit simpler. Recall that the semantics of propositions is simply their truth value. Consequently, the first step is to consider all possible combinations of truth values that could be assigned to the basic propositions in our example. These are shown in the figure to the right. Notice that we have 4 basic propositions, and each can be assigned two different truth values. In this case, there are 16, or 2 to the 4th, possible unique combinations of truth values. In general, if there are n basic propositions, then we will need to consider 2 raised to the nth power combinations of truth value assignments. The next step is to consider the truth value of the complex proposition of our example under all of these possible assignments of truth values to the basic propositions. This complex proposition is shown again here for ease of reference.
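Generating the 2 to the nth rows of such a truth table is mechanical. A minimal sketch (the proposition names f, s, m, c follow the example in the text):

```python
from itertools import product

# All 2**n combinations of truth values for n basic propositions.
# For the four propositions of the example this yields 16 rows.
rows = list(product([True, False], repeat=4))
print(len(rows))   # 16 rows for n = 4

# The first few rows, one assignment to (f, s, m, c) per line:
for f, s, m, c in rows[:3]:
    print(f, s, m, c)
```

Doubling n from 4 to 8 already takes the table from 16 rows to 256, which is the exponential growth the text warns about.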

We can't determine the possible truth values of this complex proposition in a single step. We must build up the truth values of the propositions in this expression from the basic constituent propositions. We could do this in any order, but in this example we will proceed in a left to right fashion.

The first constituent is 'f or s.' The figure to the right shows all of the truth value assignments for 'f' and 's' on the left and, using the rule for 'or,' assigns the appropriate truth values to this complex proposition.

The next step is to consider the complex proposition 'f or s implies m.' The figure to the left shows the results for this proposition. Note that we used the results for 'f or s' in determining the truth value assignments for this proposition.

Next, we consider the constituent 'm implies c'. The figure to the right shows the result for this proposition.

The next figure on the left shows the result of the conjunction of these complex propositions.

And, the next figure on the right shows the truth values for the conjunction of the full set of premises for this example.

Finally, the figure to the left shows the result for the entire expression. And, it turns out that we have shown that this expression has the value T or true for all possible assignments of truth values to its constituents.
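The entire computation can be reproduced in a few lines. The premises themselves appear only in the figures; judging from the constituents worked through above ('f or s', '(f or s) implies m', 'm implies c') and the conclusion 'not s', a plausible reconstruction adds the premise 'not c' — treat that as my assumption, not the original. Under that assumption:

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def whole_expression(f, s, m, c):
    # Assumed premises (the originals are in figures not reproduced here):
    #   (f or s) -> m,   m -> c,   not c
    premises = implies(f or s, m) and implies(m, c) and (not c)
    # Conclusion: not s ("stones don't sing")
    return implies(premises, not s)

# The expression is a tautology: true under all 16 assignments.
assert all(whole_expression(f, s, m, c)
           for f, s, m, c in product([True, False], repeat=4))
print("tautology: the conclusion follows")
```

This is exactly the semantic proof: the conditional from the conjoined premises to the conclusion takes the value T in every row of the table.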

Thus, 'not s' or "stones don't sing" does indeed logically follow from these premises! We have proved this in the semantic model for these expressions. Again, the procedure is simple and straightforward. But the procedure requires considerable memory and "computation" to obtain the result. And remember that the number of combinations of truth values that must be considered grows exponentially with n.

Deduction
Charles F. Schmidt


Inconsistency and Proof


A set of axioms is inconsistent if both p and not-p can be proved from the axioms. It is said that if a set of axioms is inconsistent, then anything is provable from that set of axioms. At first blush this may not be immediately obvious to you. But a simple example should help. We repeat some of the logical implications in the figure below since these will be used in our example.

The figure below shows a proof of r (an arbitrary proposition). Here we are given the single premise (1) of p and not p and we will see if r can be proven from this premise alone. The lines numbered 3 through 6 show the steps of the proof. Note that although r is a simple proposition, we could substitute any complex proposition and the same pattern of proof could be used.

The rules of inference used in this proof are intuitively reasonable. It is hard to imagine any argument against their validity. For example, the rule of inference referred to as simplification simply takes a conjunctive expression that is true and allows one to assert either of its components as true. Addition allows one to take a true expression and add a disjunct to it; given the truth table for disjunction, this clearly yields an expression that is true. The only slightly involved rule of inference is disjunctive syllogism. The intuition behind this rule is actually quite straightforward: if we have a disjunction of two expressions that is true, and we know that one of the expressions is false, then the remaining expression must be true. As you can see, this is the point where holding that a contradiction is true gets us into trouble.
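The proof figure is not reproduced here, but both the disjunctive syllogism step and the overall result can be checked by truth table. The sketch below is my own illustration of the argument:

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# Disjunctive syllogism: ((p or r) and not p) -> r is a tautology.
assert all(implies((p or r) and (not p), r)
           for p, r in product([True, False], repeat=2))

# Ex falso quodlibet: (p and not p) -> r is a tautology, because the
# antecedent 'p and not p' is false under every assignment, and a
# conditional with a false antecedent is true. Anything follows from
# a contradiction.
assert all(implies(p and (not p), r)
           for p, r in product([True, False], repeat=2))
print("anything follows from a contradiction")
```

The second check makes the troubling point concrete: since 'p and not p' can never be true, the conditional to an arbitrary r holds vacuously in every row.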


http://www.rci.rutgers.edu/~cfs/305_html/Deduction/proof_inconsistency.html11/11/2005 1:23:45 PM

Euler Diagrams and Quantified Expressions


Euler diagrams, inaccurately referred to as Venn diagrams in your text, were an informal way to convey the ideas involved in what subsequently became known as quantification. They are not the best way in which to think about quantified statements, but the psychological literature discussed in your text has used them in the study of what it terms categorical syllogisms. One kind of quantification is called universal quantification and the other existential. The idea behind universal quantification is that some statements are universally true; for example, "All triangles have three sides." Consequently, we don't want to have to write down this statement for each individual, particularly if, as is the case with triangles, there are an infinite number of such individuals. Note that the semantics for quantified statements is still True or False.

Your text provides two alternative Euler diagrams for the statement "All A are B," and claims that this indicates that the statement is ambiguous. The two alternative Euler diagrams provided in your text for this universally quantified statement are shown in the figure to the right. Beneath each is the quantified expression that corresponds to the diagram. For example, the one on the left states: "for all x, whenever A(x) then B(x)," whereas the one on the right states: "for all x, A(x) and B(x)."

The next figure, shown on the left, illustrates the Euler diagrams that your text provides for what it terms a particular affirmative, namely "Some A are B." This corresponds to what we referred to above as existential quantification. The idea here is that there is at least one individual about whom a statement is true; for example, "Some triangles are equilateral triangles." Again, we are not being explicit about how many there are, only that there is at least one. Note that there are four Euler diagrams that correspond to the particular affirmative statement "Some A are B."
The corresponding existentially quantified expression is shown at the bottom of the figure and reads: "there is some x such that A(x) and B(x)." The next figure, on the lower left, shows the diagram that corresponds to what your text refers to as the "Universal Negative," namely the statement "No A are B." Again, the quantified statements that correspond to this case are shown at the bottom; the two statements shown are equivalent.

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/EulerDiags.html (1 of 3)11/11/2005 1:24:07 PM

Euler Diagrams and Quantified Expressions

And finally the above figure on the right shows the diagram that corresponds to what your text refers to as the "Particular Negative," namely the statement "Some A are not B." Again, the quantified statement that corresponds to this case is shown at the bottom.

Your text refers to these statements, "All A are B," "Some A are B," "No A are B," and "Some A are not B," as 'categorical propositions,' and states on page 118 that "most of the propositions are ambiguous." By this the author means that there is more than one Euler diagram that can be associated with the "categorical proposition." But this presumes two things: first, that these "categorical propositions" represent an appropriate syntax for logical expressions; and second, that Euler diagrams represent an appropriate semantic model for these propositions. Neither of these presumptions is made in modern logic (and, as far as I know, they may never have been made by anyone other than some psychologists who used these syllogisms in their research). Thus, in making sense of this chapter, it is useful to distinguish between an English statement, e.g., "All A are B"; the logical expression(s) that correspond to the English statement; and the semantic model(s) that correspond to the logical expression. Logics are typically constructed so that there is no ambiguity in the technical sense between a logical expression and the semantic model that corresponds to the expression.

Now, if we assume that humans, when reading such 'categorical propositions,' map them directly into the set of Euler diagrams and reason in these diagrams, then the "ambiguity" of this mapping may explain some of the difficulties that people have when reasoning with these "categorical syllogisms." To illustrate this, we consider the categorical syllogism: All B are A; No C are B; are some A not C?

The figure to the right illustrates the mapping of the premises into Euler diagrams (the top three diagrams enclosed in light gray boxes), and the composition of these possibilities to yield the four alternatives shown one level down and enclosed by a darker gray rectangle. Now, in order to infer from the premises that "Some A are not C," this statement must be true in all possible semantic models of the premises. In this case there are four such models, and if you check them visually you can see that the conclusion is true in each model. Therefore the conclusion follows from the premises.

The figure on the right below shows the same problem. Here the premises and the conclusion are expressed in the syntax of first order predicate logic (FOL). The proof that the conclusion follows from the premises is shown on the right of the figure. The top line simply restates the premises as a conjunctive statement. And, since we are not asked to prove a universal statement, we have replaced the universal quantification with a single constant, K. The next shaded rectangle shows the premises of the rule of inference known as modus ponens. These premises are directly obtained from line 1 of the proof. The next lightly shaded rectangle contains the conclusion of applying modus ponens, namely A(K). The next line asserts the conjunction of A(K) and premise 2 from line 1. Then, since we have proved the statement for some constant K, we can substitute the variable x for K and existentially quantify over the expression. This is shown in the next line. The last line simply eliminates the conjunct B(x) from the previous line in order to arrive at a form that is identical to the conclusion we were asked to prove. This proof is an example of a proof carried out in the syntax. Note that it is rather simple and straightforward.
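The semantic side of this argument can also be checked exhaustively over small finite models. One caveat, reflected in the code: instantiating the universal premise with the constant K presupposes that there is at least one B, so the check below restricts attention to models with a nonempty B (a common assumption in traditional syllogistic reasoning). The sketch is my own illustration:

```python
from itertools import product

def models(universe_size):
    # Each individual independently is or is not a member of A, B, C,
    # giving 8 membership patterns per individual.
    patterns = list(product([False, True], repeat=3))
    for assignment in product(patterns, repeat=universe_size):
        A = {i for i, (a, b, c) in enumerate(assignment) if a}
        B = {i for i, (a, b, c) in enumerate(assignment) if b}
        C = {i for i, (a, b, c) in enumerate(assignment) if c}
        yield A, B, C

for A, B, C in models(3):
    all_B_are_A = B <= A          # "All B are A"
    no_C_are_B = not (C & B)      # "No C are B"
    # Restrict to models with a nonempty B (existential import, matching
    # the instantiation with the constant K in the proof above).
    if all_B_are_A and no_C_are_B and B:
        assert A - C              # "Some A are not C" holds
print("conclusion holds in every model with a nonempty B")
```

The reasoning mirrors the proof: pick any member of B; it is in A (first premise) and not in C (second premise), so A minus C is nonempty.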



Defeasible Inference -- Inheritance


Recall that formal logic uses a single uniform "language" within which to represent knowledge. A representation that has this property is often referred to as a domain independent representation. One of the advantages of a domain independent representation language is that the procedures that manipulate the expressions in the language can themselves be written so that they are domain independent: deduction in first order logic is the same procedure whether we are reasoning about cooking or about Euclidean geometry. But it turns out that the algorithms that implement the deductive procedures of first order logic are generally not very efficient. As the number of facts that must be considered increases, the number of possible inferences increases very rapidly. Consequently, the time and space required for the computation of a deduction can become very large. It seems generally to be true that we can design a domain dependent representation language that is more efficient in reasoning about certain aspects of the domain than a domain independent one. What is lost, of course, is generality, and in the case of logic we usually also lose the guarantee that if a proof for some statement exists then it will be found. (Newell reflected on this tradeoff between what he referred to as weak methods and strong methods back in the 60's. Weak methods are very general but usually rather inefficient, whereas strong methods are usually quite efficient but are limited in generality because of their domain dependence. We will see this distinction again when we look at the question of what constitutes expertise.) One suggestion to account for human "deductive-like" reasoning is that we utilize not a domain independent representation language but domain dependent languages. And AI researchers have developed domain dependent reasoning strategies and carefully investigated the advantages and disadvantages of their use.
You may recall the figure on the right from our discussion of search. In this case, I adopted a very simple and restricted language for use in representing the block problem. And it would be quite easy to write special procedures for this language that could carry out certain types of deductively valid inferences. For example, it would be trivial to write a procedure to "deduce" that block C is to the 'right of' block A. Now note that I have used this as an example of a domain dependent language. But the "data-type" of this language is that of a string (i.e., a sequence of characters). In what sense are strings a domain dependent language for the "blocks domain"? Well, what people really mean by domain dependent is this: if the syntax of the language rather directly reflects a property of the domain that we wish to reason about, then we tend to refer to that representation language as domain dependent. Here the string, a left to right structure, mirrors the left to right structure in the domain. And I used the '#' to delimit stacks, and the order in which I wrote each stack down mirrors the "on top of" relation. These syntactic properties of the "data-type" can then be exploited to yield efficient, but limited, procedures that yield deductively valid inferences.
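The figure itself is not reproduced here, so the exact string is an assumption, but given the conventions described ('#' delimits stacks, stacks written left to right), the 'right of' procedure really is trivial. A sketch:

```python
# Hypothetical string encoding of a blocks-world state (the actual figure
# is not reproduced here): stacks are delimited by '#' and appear left
# to right, e.g. a stack containing A and B, then a stack containing C.
state = "#AB#C#"

def right_of(state, x, y):
    # x is 'right of' y when x's stack comes later in the string than y's.
    stacks = [s for s in state.split("#") if s]
    pos = {block: i for i, s in enumerate(stacks) for block in s}
    return pos[x] > pos[y]

assert right_of(state, "C", "A")       # "C is to the right of A"
assert not right_of(state, "A", "C")
```

Notice how the procedure never consults any axioms: the spatial fact is read directly off the syntax of the string, which is exactly what makes the representation domain dependent, efficient, and limited.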

Trees, Hierarchies, and Inheritance


One of the earliest and most thoroughly investigated special representational languages can be thought of as a kind of graph known as a tree. This type of graph was used to represent information about classes, instances of the classes, and properties of the classes. You will probably remember that we used trees to represent search. Trees are much beloved graphs because they have some properties that can be exploited when writing algorithms to search them.

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/Defeasible_Inheritance.html (1 of 5)11/11/2005 1:24:50 PM

The figure to the right illustrates the way in which a variety of class information, in this case information about animals, could be organized. The organization is shown as a tree where Animal is the root node. Various classes of animals are shown in blue, and these classes are organized in a hierarchy that reflects the class inclusion relations among them. For example, both Birds and Mammals are subclasses of Animal; Mammal is a superclass of Elephant; and so on. In white, at the terminal nodes of this tree, are shown instances or individuals that are members of the class. For example, both jumbo and clyde are members of the class Elephant. Finally, in yellow are shown various properties that are characteristic of a class. For example, Birds fly; and so on.

Contrast this representation with the figure to the left. This figure contains the same information but expresses it as a set of logical implications. Note that more information is explicitly stated in this figure. There would be even more except that I got lazy and simply entered '...' to indicate that much more would have to be listed to make it complete. Note also that on the left are some of the logical implications that involve negation. Again, many more would have to be listed to complete the list. Finally, although I did impose some organization on the presentation of the information in this figure, this organization is neither required nor depended on when we carry out deduction. The statements could be written down in any order. Note that this isn't true for the tree representation. Comparing these two figures should help you see that the tree structure is a succinct way in which to organize this type of information. And, because it is organized in this way, an inference procedure can take advantage of the fact that lower entities in the tree inherit the properties of their ancestors: for example, "jumbo is an elephant," "jumbo is grey," "jumbo is a mammal," "jumbo bears live young," and so on.

The picture to the right marks in red the nodes along the path that must be identified in this structure in order to determine that "jumbo is grey." The procedure simply searches for jumbo in the terminal elements of the tree. If jumbo were not found there, jumbo would be sought in the set of parents of each of these terminal elements. In this case, jumbo is a terminal element. Consequently, the procedure focuses only on this element and begins tracing the paths that move upward from jumbo. The first node encountered is Elephant, which does not equal "grey," so the procedure continues to the next connected nodes, which are {Mammal, Grey}. At this point the procedure has derived from this structure that "jumbo is grey," and the process terminates. Note that this pattern is reminiscent of the repeated application of modus ponens. If we had tried to derive that "jumbo is white," the procedure would have continued until it reached the top of the structure. At that point there would be no new places to look. Consequently, the procedure has failed to derive that "jumbo is white," and based on this failure derives that "jumbo is not white." This type of inference is known as Negation by Failure.

This is really quite encouraging. A simple data structure allows us to define a highly efficient procedure for making inferences. But are these inferences "deductively valid"? The answer is: it depends. The procedure followed is most definitely not a deductive procedure in the technical sense; Negation by Failure is certainly not a deductive rule of inference. In order to do deductive inference we would have to list all the information as illustrated in the second figure and then employ some deductive procedure such as resolution theorem proving. Whether all of the inferences that a domain dependent procedure allows us to make are deductively valid will simply depend on the procedure. There is no way to know in advance, and it may be quite hard to firmly answer the question even when the procedure is known. What will generally be true is that inference procedures that depend on the map between properties of the data structure and properties of the domain will not provide all deductively valid inferences. This is because the map focuses the procedure on a subset of the deductively valid inferences. For example, the procedure outlined above would not yield the deductively valid statement "jumbo or clyde implies not-flies".
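The traversal just described can be sketched in a few lines. The tree below is my reconstruction from the prose; the node names and property labels are assumptions, since the original figure is not reproduced here:

```python
# Parent links for the class/instance/property tree described above
# (reconstructed from the text, not the original figure).
parents = {
    "jumbo": ["Elephant"], "clyde": ["Elephant"], "tweety": ["Penguin"],
    "Elephant": ["Mammal", "Grey"],
    "Penguin": ["Bird"], "Robin": ["Bird"],
    "Bird": ["Animal", "Flies"],
    "Mammal": ["Animal", "Bears-live-young"],
    "Animal": [], "Grey": [], "Flies": [], "Bears-live-young": [],
}

def derivable(node, target):
    # Trace upward from the node; succeed as soon as the target is found.
    frontier = [node]
    seen = set()
    while frontier:
        current = frontier.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            frontier.extend(parents.get(current, []))
    return False

assert derivable("jumbo", "Grey")        # "jumbo is grey"
# Negation by Failure: exhausting the tree without finding the target
# licenses the (non-deductive) conclusion "jumbo is not white".
assert not derivable("jumbo", "White")
```

Note that the same procedure also derives that tweety flies, which is exactly the exception problem taken up in the next section.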

Defeasible Inference
An entirely different issue arises with the type of knowledge that we are considering here. Notice the nodes circled in red. This tree representation has been used to represent that both penguins and robins are birds, and that birds fly. Notice that our procedure could derive that tweety is a penguin and also that tweety flies. The figure to the left shows the problem. In general, birds fly, and of course penguins are birds. But they are an exception: they swim very nicely, but they just don't fly. We need some way in which to represent these facts which yields the appropriate conclusions and blocks the inappropriate conclusions about tweety. This turns out to be a rather serious problem because it raises the question of the exact nature of this type of knowledge. In standard logics either a statement holds universally or it doesn't. But here we have knowledge that is quite general, yet not without exception. And, if you think about it a bit, this seems to be a characteristic of much of our knowledge. But how can we reason with knowledge that is not universal but must be qualified with terms like 'normally p' or 'typically p'? Or is there a simple way in which to handle these exceptions within first order logic and/or within the graph language of a tree structure as depicted above?

Exceptions: General Axioms are often inadequate.


Because of exceptions, the general axiom stating that if x is a bird then x flies leads to an inconsistency, as illustrated in the axioms in the figure to the right below. An alternative is to throw out the general axiom and explicitly list the exceptions, as illustrated above. However, other exceptions come to mind: what if a wing is broken, what if an oil spill has covered the bird, what if the bird is still too young to fly, and so on. The fact that we can come up with so many exceptional conditions makes us suspect that there isn't a closed set of exceptions. Perhaps first order axioms are simply not the way to capture this type of knowledge. But how can exceptions be handled while retaining the ability to represent this type of knowledge?

Default Theory
One suggestion for representing and reasoning with common sense knowledge was developed by Ray Reiter and is known as default theory.

The idea is to reason in first order logic but to have available a set of default rules which are used only if an inference cannot be obtained within the first order formulation. The general framework of this proposal is depicted on the left. Probably the simplest place to start is with the example rule shown at the bottom. One kind of knowledge that we seem to possess and use is knowledge about what is typically the case. Here, we have the "generalization" that "Typically an American adult owns a car." Clearly this is not the same as the logical statement, "For all x, if x is an American and x is an adult, then x owns a car." Consequently, such knowledge has to be used with care or we will constantly be introducing contradictions into our beliefs. The default rule consists of three components. First, there is the prerequisite. Think of this as some kind of foundation or evidence for the rule; this foundation must be logically derived from your beliefs. You must be able to prove that "John Doe" is both an American and an adult to employ this rule. The second component requires that you not be able to prove that "John Doe" does not own something which is a car. This is the consistency test. If the prerequisite and the consistency component have been satisfied, then the third component of the rule, the consequent, may be asserted.

It turns out that only under very special circumstances can this way of formulating this type of knowledge be used efficiently. Notice that checking whether some belief is consistent with our current beliefs amounts to trying to prove its negation, and deduction is generally an intractable problem. Also note that a default conclusion isn't the same as a deduction. It may be defeated by subsequent information, and consequently the use of default theories results in a logic that is non-monotonic. For example, we may run into John Doe and he may tell us that he doesn't own a car. Rather than arguing with John by pointing out to him that he is an American and an adult and therefore should, by our default theory, own a car, we should probably just throw out the statement that "John Doe owns a car" and substitute the statement "John Doe does not own a car".
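The three components of a default rule, and the non-monotonic retraction just described, can be sketched as follows. This is a deliberately simplified illustration: the knowledge base is a set of literal strings, "provable" is plain membership (standing in for full first order deduction), and the predicate names are my own:

```python
# A toy knowledge base of ground literals (illustrative names only).
kb = {"American(JohnDoe)", "Adult(JohnDoe)"}

def apply_default(kb, prerequisites, justification, consequent):
    # 1. Prerequisite: every prerequisite must be derivable (here: present).
    # 2. Consistency: the negation of the justification must NOT be derivable.
    # 3. If both hold, the consequent may be asserted (defeasibly).
    if all(p in kb for p in prerequisites) and ("not " + justification) not in kb:
        kb.add(consequent)

apply_default(kb, ["American(JohnDoe)", "Adult(JohnDoe)"],
              "OwnsCar(JohnDoe)", "OwnsCar(JohnDoe)")
assert "OwnsCar(JohnDoe)" in kb   # the default conclusion is drawn

# Non-monotonicity: later information defeats the default conclusion,
# so the earlier belief is withdrawn rather than argued for.
kb.add("not OwnsCar(JohnDoe)")
kb.discard("OwnsCar(JohnDoe)")
assert "OwnsCar(JohnDoe)" not in kb
```

In a real default theory both the prerequisite check and the consistency check are full deduction problems, which is exactly why the efficient-use conditions mentioned above are so restrictive.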



Overview Of Deduction
Are we capable of reasoning logically? If so, do we typically reason logically, or is this a rarely seen capacity? In order to approach these questions, we must first see whether we can characterize what it means to reason logically and determine whether there are certain capacities that are presupposed. The figure to the right provides my attempt at succinctly defining deduction in the proof-theoretic sense. Deduction presupposes what I refer to as a language, L, that is suitable for expressing statements about a domain. Note that there is no requirement that this language be universal or "domain independent," although this is usually presumed when we speak of logic and deduction. Secondly, we require some way of forming complex statements from "simple" statements; the logical connectives are used for this purpose. And finally we include the quantifiers that allow us to make statements over individuals. The language is used to construct statements that are about some domain. In particular, each statement is to have an unambiguous meaning; namely, it is to be either True or False but never both. Now, to carry out deduction, we must start with a set of statements, S, about the domain of interest. And we require that the statements be consistent; that is, we should not be able to derive a contradiction 'p and not p' from the initial set of statements. Given all this, deduction is simply a procedure that adds statements to S, but in such a way that it is guaranteed that inconsistency is not introduced. Said another way, the procedure preserves the truth value assignments of the statements. What have we required by virtue of this definition? First, note that we have made a distinction between statements about some world and the world that the statements are about. This is often referred to as the distinction between syntax and semantics in logic. This distinction is not unique to logic.
A basic assumption about human reasoning is that the "contents" of the mind are about things. For example, I believe that I am typing at this moment. The belief itself is not "me-typing" but a statement that represents "me-typing." Thus, one of the main requisites for reasoning logically is our possession of a syntax within which to form statements about "things." Let's call this a representational capacity. A second requirement that arises from our definition is that: 1) statements must be True or False but not both; and 2) the truth value of a statement cannot be changed. Collectively, we can refer to this as the assumption that deduction is truth functional. Consequently, if some statement, s, is True, then the deductive procedure must ensure that no statements are ever added that would allow the conclusion 'not-s'. This is a very strong constraint. It may help you to appreciate this constraint if you think back to the cryptarithmetic problem. The requirements for that problem were quite analogous to this requirement of truth functionality. Each letter had to have one and only one value, and the value couldn't be changed. In that problem, the representation of the problem as a set of equations helped us to see the dependencies that existed among the individual letters. In the present deductive case, we simply have a set of statements. There are no explicit clues in the syntax of the statements to suggest the way in which the truth value of one statement might depend on the truth value of other statements. There are two ways to deal with this consistency constraint, and we have seen that both have been considered in the study of deductive reasoning.

The first is to limit the deductive procedure to adding those statements which are permitted under all possible assignments of truth values to the statements that are used in the derivation. This is simply a way of saying that the resulting statement can be derived independently of the particular truth values assigned to the statements in the model. If deductive inference is limited in this fashion we obtain what is termed a monotonic logic, one where no statement is ever withdrawn. A second way to deal with the consistency constraint is to explicitly check the constraint. Thus, if I "deduce" p, then before adding p to the set S, I see whether I can "deduce" not-p. If I cannot, then I allow p to be added to the set S. The adoption of this strategy allows many more statements to be "deduced." The disadvantage is that in the best of circumstances it is very difficult to test whether both p and not-p can be derived from a set S, and in the worst of circumstances it is impossible. Further, if p is added, and then later q, it may turn out that the addition of q now allows not-p to be "deduced." Thus, although the check for consistency sounds like a "local" property (just check that you can't derive 'p and not-p'), it is really global; that is, the check must be made for all p.

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/DeductionOverview.html (1 of 2)11/11/2005 1:25:06 PM

Well, where does that put us with respect to our original question, namely, do we reason logically? On the one hand, it does seem that we possess the basic requirements to reason logically. Our mind uses representations and implicitly understands the distinction between syntax and semantics. And I suspect that for at least some domains we naturally think of statements about the domain as being either True or False but not both, and therefore recognize that 'p and not-p' makes no sense. But what I have tried to show in this little discussion is that deduction turns out to be quite a tricky procedure to realize. Deduction in first order logic is only semi-decidable, and the problem of deduction is again one of those problems that computer scientists refer to as intractable. Consequently, it seems very unlikely that deduction in this technical sense is something that characterizes our reasoning. However, this is not the same thing as saying that our reasoning is illogical or that we do not employ patterns of reasoning that lead us to conclusions. From this point of view, the important question for cognitive psychologists is not whether, in some special and usually quite simple experimental situations, our reasoning arrives at the same conclusion as a formal logic. Rather, it is to identify those patterns of reasoning that we do employ to derive new beliefs from our existing beliefs. Clearly, we do add beliefs based on our current beliefs. And clearly we usually do that under some constraints; we don't usually accept that anything follows from anything.


Truth Tables

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/TTRef.html11/11/2005 1:25:13 PM

Some Logical Identities


The figure to the right provides a listing of some of the more important logical identities. To the right of the identity is listed either the property that is the basis for the identity (for example, idempotence) or the common name given to the identity (for example, contrapositive). These identities are provided mainly for reference. In looking over the identities, think back to your high school algebra class where you learned various identities for equations. This is exactly analogous. Logic is also a kind of algebra.
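Each identity can be verified mechanically by checking that both sides agree on every row of the truth table, just as an algebraic identity holds for every value of its variables. The sketch below does this for two of the standard identities; the function names are my own.

```python
from itertools import product

def equivalent(f, g, n):
    """Check a logical identity: f and g agree under all 2**n assignments."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=n))

implies = lambda a, b: (not a) or b

# Contrapositive: (p -> q) is identical to (not q -> not p)
print(equivalent(lambda p, q: implies(p, q),
                 lambda p, q: implies(not q, not p), 2))  # True

# De Morgan: not (p and q) is identical to (not p) or (not q)
print(equivalent(lambda p, q: not (p and q),
                 lambda p, q: (not p) or (not q), 2))     # True
```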

Propositional Logic: Some Intuitive Ideas


Charles F. Schmidt

Deduction

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/log_equiv.html11/11/2005 1:25:20 PM

Some Logical Implications


The figure to the right provides a listing of some of the more important logical implications. To the right of the rule of inference is listed the common name given to the rule of inference (for example, addition). These rules are provided mainly for reference.
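A rule of inference is valid when the conditional formed from its premises and conclusion is a tautology--true on every row of the truth table. The sketch below checks two of the standard rules this way, and shows that a familiar fallacy fails the test; the encoding is my own.

```python
from itertools import product

def tautology(f, n):
    """True if f comes out true under every one of the 2**n assignments."""
    return all(f(*vals) for vals in product([False, True], repeat=n))

implies = lambda a, b: (not a) or b

# Modus ponens: (p and (p -> q)) -> q
print(tautology(lambda p, q: implies(p and implies(p, q), q), 2))  # True
# Addition: p -> (p or q)
print(tautology(lambda p, q: implies(p, p or q), 2))               # True
# Affirming the consequent is NOT valid: (q and (p -> q)) -> p
print(tautology(lambda p, q: implies(q and implies(p, q), p), 2))  # False
```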

Propositional Logic: Some Intuitive Ideas


Charles F. Schmidt

Deduction

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/log_inf.html11/11/2005 1:25:21 PM

Liar's Paradox

Shown to the right is one of the many variants of what is known as the Liar's Paradox. It is self-explanatory and is a non-trivial example of thinking in truth tables. The answer is at the bottom, so don't look there if you want to solve it yourself. This is included for laughs; and, in case you ever find yourself in a place where there are people who always tell the truth as well as your garden-variety liars.
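Puzzles of this truth-teller/liar kind yield to exactly the enumeration a truth table uses: try every assignment of "truth-teller" and "liar" to the speakers and keep only the consistent ones. Since the figure's own variant is not reproduced here, the sketch below solves a classic stand-in puzzle instead.

```python
from itertools import product

# Stand-in puzzle (not the one in the figure): A says "We are both liars."
# What are A and B?  A truth-teller's statement must be true; a liar's
# statement must be false -- so a speaker's type must equal the truth
# value of what the speaker says.
solutions = []
for a_truthful, b_truthful in product([True, False], repeat=2):
    statement = (not a_truthful) and (not b_truthful)  # "we are both liars"
    if a_truthful == statement:   # A's type matches A's statement
        solutions.append((a_truthful, b_truthful))
print(solutions)  # [(False, True)]: A is a liar, B tells the truth
```

Note that A cannot be a truth-teller (the statement would then be false), so A lies, and the statement's falsity forces B to be truthful.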

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/Liar%27sParadox.html11/11/2005 1:25:29 PM

Some Definitions for First Order Logic

First Order Logic
Syntax of First Order Logic

Semantics for First Order Logic

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/FOL.html11/11/2005 1:25:46 PM

Some Rules for Quantifiers

Rules and Examples for the use of Quantifiers

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/Quantifiers.html11/11/2005 1:25:57 PM

Formal Systems - Definitions


(from Ruth E. Davis, Truth, Deduction, and Computation. New York: Computer Science Press, 1989.)

Listed below are the technical definitions of many of the terms that are used in the investigation of deductive reasoning.

Denumerable - A set is denumerable if it can be put into a one-to-one correspondence with the positive integers.

Countable - A set is countable if it is either finite or denumerable.

A formal theory T consists of:
(1) A countable set of symbols. (A finite sequence of symbols of T is called an expression of T.)
(2) A subset of the expressions, called the well-formed formulas (abbreviated wffs) of T. The wffs are the legal sentences of the theory.
(3) A subset of the wffs called the axioms of T.
(4) A finite set of relations R1, ..., Rn on wffs, called rules of inference. For each Ri there is a unique positive integer j such that for every j wffs and each wff A one can effectively decide whether the given j wffs are in the relation Ri to A; if so, A is called a direct consequence of the given wffs by virtue of Ri. For example, the rule modus ponens is a relation on three wffs, A, A -> B, and B, by which B is a direct consequence of A and A -> B.

Since the set of axioms is often infinite, this set is often specified by providing a finite set of axiom schemata. A schema is a statement form; it provides a template showing the form of a wff while leaving some pieces unspecified through the use of metavariables. In the above example A and B are metavariables which stand for wffs of the theory. An instance of a schema is a wff obtained from the statement form by substitution.

Deducible from S in T - Let S be a set of wffs, and let P be a wff in the formal theory T. We say that P is deducible from S in T (denoted by S |-T P) if there exists a finite sequence of wffs P1, ..., Pn such that Pn = P and for 1 <= i <= n, Pi is either an axiom, a formula in S (called a hypothesis), or a direct consequence of previous Pi's by virtue of one of the rules of inference.
Proof or Derivation - The sequence of Pi's is called a derivation of P from S, or a proof of P from S.

Theorem - If P is deducible from the empty set, we write |-T P, and say that P is a theorem or P is provable in T.

The following properties of deducibility are consequences of the definition of deducible from. Let S1 and S2 be sets of wffs, and A a wff; then:
1. If S1 is a subset of S2 and S1 |- A, then S2 |- A. This property is called monotonicity.
2. S1 |- A if and only if there is a finite subset S2 of S1 such that S2 |- A. This property is called compactness. (Since a derivation must be finite, it must require only a finite number of hypotheses.)
3. If S2 |- A, and for each wff B in S2, S1 |- B, then S1 |- A. (If every wff in S2 can be derived from the set of wffs S1, then anything that can be derived from S2 can also be derived from S1.)

Interpretation - An interpretation supplies a meaning for each of the symbols of a formal theory such that any wff can be understood as a statement that is either true or false in the interpretation.

Model - An interpretation is a model for a set of wffs S if every wff in S is true in the interpretation.

Completeness - A theory is complete if every sentence that is true in all interpretations is provable in the theory.

Soundness - A theory is sound if every provable sentence is true in all interpretations. If a theory is sound and complete then truth and deduction are equivalent. A computation method is complete if for every sentence S, the algorithm will terminate on input S in a finite amount of time, indicating whether or not S is true in all interpretations.

Decidable - A formal theory is decidable if there exists an effective procedure that will determine, for any sentence of the theory, whether or not that sentence is provable in the theory. (Any property is said to be decidable if there exists an effective procedure (i.e., terminating algorithm) that will determine whether or not the property holds.)

Consistent - A theory is consistent if it contains no wff such that both the wff and its negation are provable.

First Order Predicate Calculus (Logic) is sound, complete, and consistent, but not decidable.
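The definition of "deducible from S in T" is itself effectively checkable: given a candidate sequence P1, ..., Pn, each step must be an axiom, a hypothesis, or a direct consequence of earlier steps. The sketch below does this for a toy propositional theory whose only rule of inference is modus ponens; the tuple encoding of conditionals is my own invention.

```python
def is_derivation(steps, axioms, hypotheses):
    """Check the definition of deducibility for a toy theory whose only
    rule of inference is modus ponens.  A conditional wff A -> B is
    written ('->', A, B); steps is the candidate sequence P1..Pn."""
    for i, wff in enumerate(steps):
        if wff in axioms or wff in hypotheses:
            continue
        earlier = steps[:i]
        # direct consequence by modus ponens: some A and ('->', A, wff)
        # must both occur earlier in the sequence
        if any(("->", a, wff) in earlier for a in earlier):
            continue
        return False        # this step is not justified
    return True

S = ["p", ("->", "p", "q"), ("->", "q", "r")]
proof = ["p", ("->", "p", "q"), "q", ("->", "q", "r"), "r"]
print(is_derivation(proof, axioms=[], hypotheses=S))   # True: S |- r
print(is_derivation(["q"], axioms=[], hypotheses=S))   # False: unjustified
```

Notice that checking a given proof is easy; the hard problem is finding one.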

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/FormalSystemDefs.html (2 of 2)11/11/2005 1:25:59 PM

Example of Resolution Theorem Proving

Resolution Theorem Proving*


Resolution Theorem Proving (Robinson, 1965) is one proof theoretic method for proving theorems in First Order Logic. One of the steps in this procedure is to limit the syntax of FOL. All FOL expressions are converted to what is referred to as conjunctive normal form. To the right is a database for use in the example of resolution theorem proving. The statements are shown in English (in dark blue), in conjunctive normal form (in green), and in the syntax of First Order Logic (in white).

The figure to the left shows the successful part of the search for a proof of the assertion that Marcus hates Caesar. The database is reiterated in yellow at the top of the figure. Then you will notice the statement that Marcus does not hate Caesar. Resolution theorem proving works by negating the assertion that is to be proved, and trying to prove a contradiction...or the empty set. The numbers refer to the item in the database which is "resolved" with the expression. This procedure involves two syntactic rules. One is substitution which is
carried out by a process called unification. You don't really want to know the details; but what this process does is substitute, in a clever way, values for variables. The Marcus/x2 at the top of the search represents the substitution of Marcus for the variable x2. The other rule is modus ponens, although that may not be so obvious. You need to remember that 'not p or q' is identical to 'p -> q'. Then clearly if you have 'p' and 'not p or q' you can write down 'q'. That is what is going on in the figure to the left. And, if you get rid of everything--represented by the white box in the figure, which stands for the empty set--then you have derived a contradiction! Since the only thing added to the database was the negation of what you wanted to prove, that negation must be what is responsible for the contradiction... therefore, you just proved it!
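For the propositional case--where no unification is needed--the resolution loop itself is short enough to sketch. The clause encoding (sets of string literals, with '-' marking negation) is my own, and this is a brute-force saturation rather than the search strategy any production prover would use.

```python
def resolve(c1, c2):
    """All resolvents of two clauses (sets of literals; '-' negates)."""
    out = []
    for lit in c1:
        neg = lit[1:] if lit.startswith("-") else "-" + lit
        if neg in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {neg})))
    return out

def refutes(clauses):
    """Saturate under resolution; an empty resolvent is a contradiction."""
    clauses = set(map(frozenset, clauses))
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:
                        return True   # derived the empty clause
                    new.add(r)
        if new <= clauses:
            return False              # nothing new: no refutation
        clauses |= new

# Prove q from {p, p -> q} by refutation: add the negation -q.
kb = [{"p"}, {"-p", "q"}]             # 'p -> q' written as 'not p or q'
print(refutes(kb + [{"-q"}]))         # True: q is proved
print(refutes(kb + [{"q"}]))          # False: no contradiction
```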

(* This example has been adapted from E. Rich and K. Knight, Artificial Intelligence. Second Edition. New York: McGraw-Hill, 1991.)
Robinson, J.A. 1965. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1): 23-41.

Deduction

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/ResolutionTP.html (2 of 2)11/11/2005 1:26:17 PM

Terms, Concepts, Questions

Some Terms, Concepts and Questions

syntax
semantics
Logic
Proof
Deduction
truth value
truth table
tautology
contradiction
satisfiable
unsatisfiable, inconsistent

What is meant by the expressiveness of a logic? What are the defining or basic features that define a deductive logic?

Propositional Logic

proposition
complex proposition
atomic formula
Well formed formula, wff.
Quantification
identities

How expressive is Propositional Logic?

First Order Logic (FOL)

How expressive is First Order Logic?

Syntactic Proof

implications
resolution

Decidable, tractable, complete, sound, consistent?

Semantic Proof
NonDeductive Inference
Defeasible Inference
Trees, Hierarchies, and Inheritance

truth tables

Tractable, complete, sound, consistent?

domain dependent representation
representation dependent inference
Trees, strings

Relation between domain and special representations? Relation between domain dependent representations and valid inference?

Non-first Order Knowledge, Exceptions, Default Theory

Negation by Failure
Consistency and Defaults

Deductive? Monotonic?

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/Deduction_C.html (2 of 2)11/11/2005 1:26:19 PM

Describing Things in English and Logic

Describing Things
The picture to the right will be used in this exercise. The picture contains a variety of things. Despite its simplicity there are many things that could be written down to describe aspects of the picture.
- First, write down, in English, one or more informal descriptions of the world depicted in this picture.
- Next, using the syntax of propositional logic, write down a set of propositions, both simple and complex, that describe aspects of the world depicted in this picture.
  - Next to each proposition write down an English sentence that corresponds to that proposition.
- Finally, using the syntax of first order logic, write down a set of expressions, both simple and complex, that describe aspects of the world depicted in this picture.
  - Next to each expression write down an English sentence that corresponds to that expression.

Some Questions:
- What were some things that were: (1) impossible, or (2) very hard, or (3) awkward to express
  - in English;
  - in the propositional logic;
  - in the first order logic.
- Compare the terms used in the informal English description and the terms used in the English sentences that corresponded to the logical expressions. What are some of the differences?
- Is it possible to write down all of the expressions that are true in the world depicted in this picture?

Deduction
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Deduction/DescAssign.html11/11/2005 1:26:47 PM

Introduction to Induction
One of the characteristics of human reasoning is that we form generalizations based on our experience or observations. But,
- What is the relation between a generalization and the observations on which it is based?
- What should be the relation between a generalization and the observations on which it is based?
- What can be the relation between a generalization and the observations on which it is based?

The first question is an empirical question and is the focus of psychological research. In the first half of this century psychologists simply referred to this question as one of learning...and the term induction was rarely encountered in the psychological literature.

The second question became of particular interest with the rise of empirical science. What should be the relation between the concepts and laws developed by a science and the observations on which the laws and concepts are based and to which they apply? One might think that the scientific concepts and laws should follow from or be implied by the observations....thus, if the observations are true, and if the laws follow deductively from the observations, then the laws are true since deduction is truth preserving! Life is wonderful, science is the way to truth, ... But, almost everyone who carefully thought about this problem became convinced that things weren't this simple. Induction could not be reduced to deduction. (If it could, then once a law was established it could never be defeated by any new observations, since it would in fact simply be a deductive truth.) So, whereas deduction seemed like a nice dream about what the relation should be, it was only a dream. And, this whole idea of induction had to be looked at and studied quite carefully.

Around the middle of this century, it became possible to pose the last question. By posing the question as well as the induction process as a mathematical system, it became possible to say something about the kinds of "theories" that could be learned from observations. This mathematical area is known as "learnability theory".

Now, as mentioned above, it appears that induction cannot be reduced to deduction. But, if the relation isn't deductive, then what else might it be? It certainly shouldn't be arbitrary! The figure to the right provides an initial way to think about this question. Recall that observations (or examples) of a concept or law are distinguished.
Thus, one place to begin is to assume that we have a set of examples; some of which are positive--that is, they exemplify the concept; and some of which may be negative--that is, they are not examples of the concept. How do we get these sets? We don't say...we simply assume that we have them or we invent a "teacher" who presents them to our induction process. Further, since the induction process is supposed to yield a theory of the observations, we
will assume that the induction process has some language suitable for forming theories. Usually, we simply assume that the language is a logic...but this is done for generality and to simplify the development of the ideas. Again, where the language comes from we don't say. The induction process simply has it. Finally, there is the training sequence, TS, which consists of some of the examples presented in some sequence to the induction process.

With these givens, we can now say a little bit about the relation that should exist between observations (examples) and our theory. We wish our theory, S in the above figure, to allow the positive examples of the concept to be derived or deduced and not to allow the negative examples of the concept to be derived.

Now comes the tricky part. If you think about it a bit you will recognize that this is a very weak criterion. It can be easily satisfied. The set TS is always finite. Consequently, one can simply write down in the language each of the examples. Thus, S is just the examples that have been seen, together with the appropriate logical connective to the concept that allows each example to be derived from S. Less carefully stated, S is just a memorization of the examples of the training sequence. But this doesn't allow one to make any predictions about new examples!

Consequently, the requirement is added that we have some procedure that generalizes S to S'. And S' is then required to make the correct predictions for all possible future examples. To do this S' must "go beyond the information given" and consequently S' can not be deductively related to S. This is one way to view what is known as the induction problem. And, we know that we can not guarantee that we can find an inductive procedure that meets this criterion for any concept that can be imagined.

Induction, Concepts, Uncertainty
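The contrast between merely memorizing the training sequence and generalizing beyond it can be made concrete in a small sketch. The examples and the "black" generalization below are my own inventions for illustration.

```python
# The weak criterion: the theory must accept every positive example in
# the training sequence TS and reject every negative one.  Rote
# memorization of TS satisfies it trivially -- but it predicts nothing
# about examples that were never seen.
TS = [(("black", "square"), True),
      (("white", "circle"), False),
      (("black", "circle"), True)]

memorized = {ex for ex, label in TS if label}

def S_memorize(example):
    return example in memorized     # S: just the observed positives

def S_general(example):
    return "black" in example       # S': a generalization of the data

print(all(S_memorize(ex) == label for ex, label in TS))  # True
print(all(S_general(ex) == label for ex, label in TS))   # True
print(S_memorize(("black", "triangle")))  # False: silent on new cases
print(S_general(("black", "triangle")))   # True: makes a prediction
```

Both theories satisfy the weak criterion; only S' ventures a prediction about the unseen black triangle, and nothing in TS can guarantee that prediction is right.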
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Induction/InductionIntro.html (2 of 2)11/11/2005 1:27:11 PM

Knowing a Concept and Concept Indistinguishability


What does it mean to say that someone knows a concept? What kinds of capabilities do we expect from someone that knows a concept? There are a variety of ways to answer this question. The figure below lists a set of capabilities that one might expect to be available to a person who knows a concept. The first is the ability to correctly recognize whether a particular example is or is not an example of the concept in question. I have referred to this procedure as Identify. A second is the ability to Generate positive and/or negative examples of a concept. And, finally, there is the ability to explicitly State the definition of the concept.

This provides a characterization of what it might mean to "know" a concept; and it also provides a basis for deciding when two different entities...say a "teacher" and a "learner"...have the same concept. This is indicated in the lower section of the above figure. Notice that I have not tried to define the notion of identity between C and C'. Rather I have been content to focus on the conditions under which we might characterize C and C' as indistinguishable. And, as you can see, the definition depends on whether the answers of the two concept holders are or are not the same.

Induction, Concepts, Uncertainty
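The three capabilities, and the behavioral test of indistinguishability, can be sketched over a toy instance space. The space and the two concepts below are invented for illustration; they are not the ones in the figure.

```python
# A concept holder supports Identify, Generate, and State.
SPACE = [(shape, color) for shape in ("square", "circle")
                        for color in ("black", "white")]

def make_concept(definition, predicate):
    return {"identify": predicate,
            "generate": lambda positive: [x for x in SPACE
                                          if predicate(x) == positive],
            "state": lambda: definition}

C  = make_concept("black things", lambda x: x[1] == "black")
C2 = make_concept("non-white things", lambda x: x[1] != "white")

# Indistinguishable: every Identify answer agrees on every instance,
# even though the two holders State different definitions.
print(all(C["identify"](x) == C2["identify"](x) for x in SPACE))  # True
print(C["state"]() == C2["state"]())                              # False
```

The point of the sketch: indistinguishability is defined by matching answers, not by matching internal definitions.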
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Induction/p_induction.html11/11/2005 1:27:27 PM

Example of Concept Learning

Example of A Typical Concept Identification Experiment


The typical concept identification experiment involves:
- defining an n dimensional space;
- defining the concept within this space;
- presenting the learner a sequence of positive and negative examples.

The example concept experiment developed here involves 4 dimensions, each of which has two values. The dimensions are:
- Shape: Square(sq) or Circle(c);
- Size: Large(l) or Small(s);
- Color: White(w) or Black(b);
- Position of Shape on Field: Right(r) or Left(l);

where the letters in parentheses indicate the abbreviation that will be used in the figures when we describe the concept instance as a conjunction of propositions. Note that with 4 dimensions, each of which can take on one of two values, there are 2 to the 4th, or 16, possible concept instances. Clicking on the example figure on the right will open a window in which these 16 concept instances are displayed.

A concept definition will partition these 16 possibilities into two sets that correspond to the positive examples and the negative examples of the concept. Consider the example shown to the right. Assume that this example is an instance of the concept that is to be learned. In this example a white small square is shown on the left of the field. The description of this example as a conjunctive description is shown beneath the figure. If this is the only instance--that is, if the concept is exactly this conjunction of features--then all of the other possibilities lie in the other element of the partition. If the concept is 'white and square', then the space is partitioned into two sets of 4 and 12 elements.

In order to gain some experience with this type of task, there are two different examples of a concept identification experiment provided below. For each of them, you can present examples of a concept that you are to learn by sequentially clicking on the 'Example #' links provided. Clicking on an 'Example #' link will take you to a page which contains an example from the concept space. As you look at each example sequentially, try to identify the definition of the concept. If you scroll down the page where an example is presented, you will see either a '+' or a '-' indicating whether the example is a positive or negative example of the concept. Determine whether you think that an example is positive or negative prior to scrolling to check the answer.

Experiment 1. The first experiment provides only four examples. They are given below:

Example 1
Example 10

Example 4
Example 13

Make a note of the concept definition that you arrived at after these few examples. You may have noticed that in this experiment you were shown only positive examples of the concept. What kind of learning strategy is required to learn from positive examples only?

Experiment 2. The next experiment uses all 16 instances from the concept space and includes positive and negative examples of the concept.

Example 1  Example 2  Example 3  Example 4
Example 5  Example 6  Example 7  Example 8
Example 9  Example 10  Example 11  Example 12
Example 13  Example 14  Example 15  Example 16

Did you identify the concept? Was there any point prior to the last example where you could be absolutely certain that you had learned the correct concept?

Click on Possible Hypotheses to see some hypotheses that could have been considered after Example 1. Did you consider more than one of these at the time?
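The claim above--that the concept 'white and square' partitions the 16 instances into sets of 4 and 12--can be checked with a short sketch. The dimension values follow the list above; the encoding as tuples is mine.

```python
from itertools import product

# The experiment's instance space: 4 dimensions, 2 values each.
dims = [("square", "circle"), ("large", "small"),
        ("white", "black"), ("right", "left")]
instances = list(product(*dims))
print(len(instances))        # 16 = 2**4

# The concept 'white and square' partitions the space into 4 and 12.
positives = [x for x in instances if "white" in x and "square" in x]
print(len(positives), len(instances) - len(positives))  # 4 12
```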

Possible Hypotheses after One Example


Charles F. Schmidt

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/ExampleConcepts/InductEx.html (3 of 3)11/11/2005 1:27:38 PM

Some Possible Hypotheses after seeing Example 1

Some Possible Hypotheses after a Single Positive Example

+
The example above is a positive exemplar of the concept. A great many hypotheses concerning the concept are supported by this single exemplar. Some of these are shown below. The Concept Language is shown in the box in the upper left and consists of four dimensions or properties each of which can take on one of two values. The exemplar is shown again on the right together with a description of the exemplar in the concept language.

Both conjunctive and disjunctive hypotheses are shown. These are described with the English names of the values on each dimension to simplify reading and understanding each of these hypotheses.

Note that 26 possible hypotheses are shown here. Recall that there are only 16 possible instances of the concept. Many other hypotheses are possible. For example, Black and Large or Circle; Black or White and Circle; ... Clearly, the number of hypotheses about a concept can be larger than the number of instances of the concept! And, if you think back to your own experience with this problem, you will undoubtedly note that you did not entertain many hypotheses at all.

The number of possible hypotheses can become large when our concept language allows a great many different expressions to be constructed. If "And" or conjunction were the only logical connective allowed, then only 15 different hypotheses would be possible. Clearly, if we possess a bias for conjunctive concepts, this vastly reduces the number of hypotheses that we would initially consider. It also increases the likelihood that we may develop a hypothesis that is inconsistent with the training set that we have seen.

We can actually set a bound on the number of possible concepts when we know the number of possible concept instances. Recall that in this case the number of possible instances is 2 to the 4th, or 16. These 16 possible instances can be grouped in 2 to the 16th, or 65,536, ways. This is what is referred to as the power set. The power set is the set of all possible subsets of a set. (You may remember that we referred to the power set when considering Wertheimer's discussion of perceptual grouping.) The power set includes the empty set as well as the set that includes all items. These two bounding sets can be excluded as possible interesting concepts. Thus, we can reduce the space of possible hypotheses from 65,536 to 65,534!!

Example of Concept Identification Task
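Both counts mentioned above can be verified mechanically. Note that reading the "15 conjunctive hypotheses" figure as the non-empty subsets of a single positive example's four feature values is my interpretation of the claim, not something stated explicitly in the text.

```python
from itertools import combinations, product

instances = list(product(*[(0, 1)] * 4))   # 4 binary dimensions
print(len(instances), 2 ** len(instances)) # 16 65536: the power set bound

# A conjunction consistent with one positive example fixes some
# non-empty subset of that example's four feature values.
example = ("white", "small", "square", "left")
conjunctions = [c for r in range(1, 5)
                for c in combinations(example, r)]
print(len(conjunctions))                   # 15 = 2**4 - 1
```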
Charles F. Schmidt

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/ExampleConcepts/Ex1_Possible.html (2 of 2)11/11/2005 1:27:53 PM

Structuring the Hypothesis Space and Hypotheses Revision


One of the interesting features of natural language is that it allows sentences to be constructed that can be ordered from general to specific. For example, "Someone is going someplace next week." is more general than either "Harry is going someplace next week." or "Someone is going to New York next week"; and all of these are more general than "Harry is going to the Village on Tuesday." When things can be partially ordered (or completely ordered), we can "move" systematically from one point in the order to another. Logical statements as well as sets can also be partially ordered. In fact, they can be partially ordered in a special way - in a structure that is referred to as a lattice. This is important because it suggests that this order, when it exists in a concept space, might be exploited to our advantage in inductive learning.

The figure to the right consists of 2 of the example instances that we used earlier; namely, a large black square and a small black square. I have chosen a single dimension - size - so that the set of concepts is quite small. And, the power set of a 2 element set is also conveniently small: it contains only four elements. These four elements of the power set are the four sets shown in the figure. The top set consists of both the small and the large black squares. The bottom set is the "empty" set. The lines connecting the sets represent the fact that there is a subset relation between the sets connected by the line. The middle two sets are both subsets of the top set, and the empty set is a subset of each of the sets above it. (The empty set is considered to be a subset of every set.) This arrangement illustrates the way in which these sets can be partially ordered. And, in fact, these sets form a lattice. The page Algebra of Sets and Definition of a Lattice provides a more detailed discussion of sets and lattices and you may want to refer to it later.
But first work through the material on this page so that you can see the way in which these abstract ideas relate to inductive learning. For our purposes here, the important feature of a lattice is that its structure can be used to find the generalization of a pair of elements of the lattice that is the smallest generalization that must be made to include both elements. And, it can also be used to find the specialization of a pair of elements of the lattice that is the smallest specialization that must be made to exclude both elements. Recall that these are the two hypothesis revisions that might be made when a new instance is inconsistent with the existing hypothesis. If the new instance is positive and the hypothesis fails to identify it as such, then the hypothesis must be generalized. If the new instance is negative and the hypothesis fails to identify it as such, then the hypothesis must be specialized.

The example above involves the minimum required to even talk of a space of concept examples. Recall that in our experimental example we used 4 dimensions, each with two values. This resulted in a total of 2 to the 4th, or 16, possible concept examples. When we know the number of concept examples, then we can actually set a bound on the number of possible concept hypotheses. These 16 possible instances can be grouped in 2 to the 16th, or 65,536, ways. This is what is referred to as the power set. The power set is the set of all possible subsets of a set. (You may remember that we referred to the power set when considering Wertheimer's discussion of perceptual grouping.) The power set includes the empty set as well as the set that includes all items. These two bounding sets can be excluded as possible interesting concepts. Thus, we can reduce the space of possible hypotheses from 65,536 to 65,534!!

In order to gain an appreciation of these ideas, it will be useful to consider a more realistic example. However, we will not use the full experimental space because I really have no desire to draw a lattice of 65,536 sets. Consequently, only part of the space is used, by limiting our consideration to only the dimensions of color and size. We will fix the shape at the value 'square' and fix the value of position at 'left'. The power set of these 4 possible concept instances yields a space of 16 possible concept hypotheses. The lattice of these 16 sets is shown below.

The curly braces are used in this figure to enclose the elements of a set. The 5 horizontal levels along which the sets are organized reflect the partial order structure of this set of sets. The top set includes all four of the possible concept instances. The next level consists of four sets, each of which contains a different combination of three of the four possible concept instances. Similarly, the next level consists of six sets, each of which contains a different combination of two of the four possible concept instances. The fourth level consists of 4 sets, each containing one of the four possible concept instances. And, finally, the bottom level consists of the empty set, which contains none of the possible concept instances. The lines in the figure connect a set at level n - 1 with a set at level n if the set at level n - 1 is a subset of the set at level n.

Now in order to see how the lattice property proves useful to hypothesis revision, let us choose one of the sets to represent our current hypothesis about the concept. Choose the set at the middle level at the far left. According to this hypothesis the concept is any instance that is black. And, we assume that this hypothesis is consistent with the training instances that have been seen to this point. Now assume that we encounter an instance that consists of a large white square. Further, suppose that we are told that this instance is a positive instance of the concept. Consequently, our hypothesis must be generalized. Note that there are 5
hypotheses in this space that are more general than the current hypothesis. How should we choose among this candidate set of revisions? Note first that the current hypothesis is a subset of only 3 of the 5. Thus, 2 of the 5, the two at the far right of the figure, can be eliminated, since adopting either of these would result in a hypothesis that was inconsistent with the training sequence that has been observed to this point. Consider next the generalizations of the current instance. There are seven generalizations that include this instance. However, only two of these seven generalizations include both the current instance and the current hypothesis. And one of these two is the most general hypothesis at the top level of the lattice. One other generalization of the hypothesis remains. That is the set at the far left of the lattice at level 2. This hypothesis is preferred since it is the minimal generalization required to bring the hypothesis in line with the training sequence.

A similar line of argument could be illustrated if a negative instance were presented, such as a small black square. In this case the specialization of the hypothesis would result in the identification of the set at the far left of the lattice at level 4.

The figure to the left shows a training sequence that is used to illustrate movement in the hypothesis space. The training sequence consists of three positive instances of the concept that is to be learned. They are presented in order from left to right. The animation below annotates the movement in the space using white ovals and by redrawing the relevant 'subset line' in white.

Here we have assumed that we began with the most specific hypothesis, namely the null set at the very bottom level of the lattice. Since the training set included only positive examples, this resulted in all hypothesis revisions involving generalization, or movement up the lattice.

Example of Concept Identification Task


Charles F. Schmidt

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/ExampleConcepts/LatticeSpace.html (4 of 4)11/11/2005 1:28:35 PM

Version Space

Example of Version Space Approach to Concept Learning


We have argued that if the hypothesis space can be structured, then this structure can be exploited in hypothesis revision. But you may have noted that the space of hypotheses, even for trivial examples, is very large. To explicitly represent the structure of this space, even for rather simple concepts, can involve representing millions of possible hypotheses! (For example, if we use 5 dimensions where each dimension can take on one of two values, then there are 2 to the 5, or 32, possible instances. The size of this power set is 2 to the 32, or 4,294,967,296. This is a large space to explicitly represent and hold in memory.) This raises some serious doubts about the usefulness of these ideas. However, recall that when we looked at graph search similar concerns were raised. There we noted that if the search space has some minimal algebraic structure, then the space can be implicitly represented and searched using the general method of generate and test. A similar story can be told in this case. Whenever mathematics provides a constructive method for solving some problem in a domain, we know that there exists some computational algorithm that can realize this constructive method. Tom Mitchell, a computer scientist, demonstrated in his dissertation research how this structure could be efficiently exploited. Essentially, he developed a way in which to implicitly represent and use a structured space of hypotheses. The space was called a version space. The basic idea is to start an inductive learning task with two of the possible hypotheses. But they are a very special two hypotheses: the most general hypothesis (corresponding to the top of the lattice structures that we have discussed) and the most specific hypothesis (corresponding to the bottom of the lattice structures that we have discussed). A positive example will always be consistent with the most general hypothesis, but may be inconsistent with the most specific hypothesis.
Consequently, this most specific hypothesis will be made more general. A negative example will be consistent with the most specific hypothesis, but may be inconsistent with the most general hypothesis. Consequently, this most general hypothesis will be made more specific. Thus, at any point in the training sequence the learner maintains two hypotheses; and the true hypothesis must lie somewhere in the region of the hypothesis space that connects these two hypotheses. If at some point the most general hypothesis is the same as the most specific hypothesis, then the learner has arrived at a unique definition of the concept. Thus, with a method of this sort the definition of a concept is usually not uniquely determined; it is only when enough training instances have been observed that this unique determination is possible. We won't explain the algorithm that generates the new hypotheses in response to the training instances. However, an example of the process is illustrated below. The concept that is being learned in this example is the concept of an "arch". The six training examples that are used are shown below. The first is a positive example of the concept and, except for instance number 5, the other instances are all negative examples.
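Mitchell's full candidate-elimination algorithm is not reproduced in these pages, but a minimal sketch for conjunctive attribute-value hypotheses conveys the two-boundary idea. The toy training data and the simplified pruning are assumptions for illustration, not the "arch" example.

```python
def matches(h, x):
    """A hypothesis matches an instance if every attribute is '?' or equal."""
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def candidate_elimination(examples, n_attrs):
    S = None                        # most specific boundary: matches nothing yet
    G = {('?',) * n_attrs}          # most general boundary: matches everything
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}  # drop general hypotheses missing x
            if S is None:
                S = tuple(x)                     # first positive: S is x itself
            else:                                # minimally generalize S to cover x
                S = tuple(sv if sv == xv else '?' for sv, xv in zip(S, x))
        else:
            assert S is None or not matches(S, x), "no consistent hypothesis"
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                    continue
                # Minimally specialize g so it excludes the negative example,
                # choosing attribute values from S so g stays more general than S.
                for i in range(n_attrs):
                    if g[i] == '?' and S is not None and S[i] != '?' and S[i] != x[i]:
                        new_G.add(g[:i] + (S[i],) + g[i + 1:])
            G = new_G
    return S, G

# Toy training sequence for the concept "big things" over (size, color):
training = [(('big', 'red'), True),
            (('small', 'red'), False),
            (('big', 'blue'), True)]
S, G = candidate_elimination(training, 2)
```

After these three examples the two boundaries meet at ('big', '?'), illustrating the convergence described above.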


In the figure below, a geometric representation of the concept instance is shown in the upper left of the figure, together with an indication of whether the example is positive or negative. A representation of the example in a concept language is shown to the right. Below the dashed line are shown the most general and most specific representations of the concept that hold after the current example is considered. Note that at this point the most general hypothesis is the most general possible, while the most specific is simply a representation of the current example.

The animation below shows the way in which these two hypotheses are altered during the training sequence.


Note that a positive example may affect the most specific hypothesis by making it more general. Conversely, a negative example may affect the most general hypothesis by making it more specific. The "true" hypothesis is bounded by these two hypotheses. The representation of these bounds, together with an appropriate revision strategy, allows the space of possible hypotheses that are consistent with the current training sequence to be implicitly represented. If the most specific and most general hypotheses are equal, then the concept has been uniquely identified from the training sequence. If these cross (that is, the most general becomes more specific than the most specific), then there is no hypothesis in this concept language consistent with the training sequence. Note that what is required in order to realize this type of representation of a concept space is a representation language where the expressions in the concept language can be partially ordered, and where the partial order satisfies the property of a lattice. Think of various commonsense concepts. Is this property rare or common in the language used to describe them?

Induction,Concepts, Uncertainty
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Vspace.html (3 of 3)11/11/2005 1:29:00 PM

Patterns and the Induction of Rules



Another type of concept learning or induction task is one where you are given the first n, usually 3 or so, members of a pattern, and you are to discover the rule that covers the examples and predicts the future elements in the sequence. Usually the rule is some function of n. An interesting set of materials of this type, studied with children, consisted of patterns generated from "geometric numbers." These are functions that not only have a simple mathematical form, but also allow one to depict the function in a regular geometric form. The figure to the right animates the first 3 elements of one of these geometric sequences. The figure below summarizes the basic questions that were asked in these experiments.

Clicking on the figures below will open a page that shows cards 1 - 3 and card 9 so that you can try solving these rule pattern problems.

Pattern Problem 1

Pattern Problem 2

Pattern Problem 3

Pattern Problem 4


Pattern 1
Pattern 2
Pattern 3
Pattern 4

Clicking on the figures above will open a page that animates the first 11 elements of the sequence.

Note that in order to answer these questions you probably found it useful (and perhaps even necessary) to think about the pattern using the geometric representation as well as a representation as an equation expressing the value as a function of n.
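The pattern cards themselves are not reproduced in this copy, so as an illustration, two standard geometric numbers (triangular and square) show the kind of rule being induced: a simple function of n whose values can also be drawn as a regular figure.

```python
def triangular(n):
    """Dots that can be stacked as rows of 1, 2, ..., n (a triangle)."""
    return n * (n + 1) // 2

def square(n):
    """Dots that can be arranged as an n-by-n square."""
    return n * n

cards_1_to_3 = [triangular(n) for n in range(1, 4)]  # what the first cards show
card_9 = triangular(9)                               # the predicted 9th element
```

Given 1, 3, 6 on the first three cards, the rule n(n+1)/2 predicts 45 for card 9; the square-number rule predicts 1, 4, 9, ... instead.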

Induction,Concepts, Uncertainty
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Patterns/Rules_Patterns.html (2 of 2)11/11/2005 1:29:20 PM

Standard Probability Axioms

Standard Probability Axioms and Assigning Probabilities to Beliefs


In deductive reasoning we assumed that the semantics for the propositions under consideration was either True or False. Then, we considered how these propositions, and the complex propositions that could be formed from them using the logical connectives, could be used to derive the consequences of this set of "beliefs". Although it certainly seems to be the case that we can hold beliefs that are either true or false, it also seems to be the case that some of our knowledge is uncertain; that is, we cannot say that it is True or False. The question is, if we have uncertain beliefs, then

q How is uncertainty represented?
q How do we reason with uncertain beliefs?

Normally, or at times, we think that it is possible or likely or probable that we possess knowledge that is uncertain, that is, knowledge to which we are unable to assign the value of true or false, but to which we nonetheless think we can assign varying degrees of certainty. The words italicized above provide some examples of the linguistic hedges that we employ to convey our uncertainty. You may recall that in discussing deduction we spoke of "default inferences." For example, "Rutgers will probably lose the football game on Saturday" may have been a default rule acquired by many football fans during the 2001 season. However, since such statements are about the future, we can often imagine that our inference may turn out to be mistaken even though we think it unlikely. There are perhaps many different types of uncertain knowledge. Predicting the outcome of the flip of the coin at the start of the game may seem to some to call into play a different kind of uncertainty.....or perhaps not. The most carefully formulated ideas about uncertainty date back to the 17th century, when games of chance were subjected to mathematical scrutiny. The theory of probabilities is the result, and one proposal is that probability theory provides the appropriate treatment of all of the various types of uncertain knowledge that we may entertain. On this view the semantics of human uncertainty is equivalent to probabilities; that is, for each uncertain belief that a person holds, there is some "subjective probability" that represents the degree to which the person believes the statement to be true. This raises the issue of how we might acquire these subjective probabilities, as well as the question of how they might be updated in light of new information and experience. But we will leave this issue aside for now and assume that such subjective probabilities are an appropriate way in which to represent uncertainty.
This allows us to consider what is required of the human mind if we are to accurately use a probability calculus to guide our reasoning with uncertain knowledge. The standard probability axioms are shown in the figure below.
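The figure with the axioms does not reproduce in this copy, so as a concrete sketch, a probability assignment can be checked against the usual constraints: non-negativity, a partition summing to one, and disjoint events adding. The numbers below are illustrative only.

```python
# A toy assignment over a three-way partition (mutually exclusive, exhaustive).
pr = {'A': 0.5, 'B': 0.3, 'C': 0.2}

assert all(p >= 0.0 for p in pr.values())    # probabilities are non-negative
assert abs(sum(pr.values()) - 1.0) < 1e-12   # the whole partition is certain
p_A_or_B = pr['A'] + pr['B']                 # disjoint events: probabilities add
```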


Note: a partition is mutually exclusive and exhaustive. These axioms must be satisfied by any assignment of probabilities to a set of propositions. The next figure, shown below, illustrates some of these relations between propositions and the assigned probabilities for a very simple world that involves the toss of two pennies. At the top of the figure the set of propositions that can hold in this simple world is enumerated. Next, the universe of possibilities is shown. This is the set of all possible state descriptions that could possibly hold. In this simple world there are only four such states. Since each proposition may be either true or false, the number of possible worlds is given by 2 to the n, where n is the number of propositions.
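The 2-to-the-n count of state descriptions can be checked by enumeration:

```python
from itertools import product

# Each penny comes up heads (H) or tails (T); a state description fixes both.
coins = ['penny 1', 'penny 2']
worlds = list(product('HT', repeat=len(coins)))
```

For two pennies this yields the four states HH, HT, TH, TT; each added proposition doubles the count.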


Notice that in the above figure we have only stated constraints on the probability assignment. An actual probability assignment has not been made. If each penny is a "fair" coin and the procedure for tossing the coins is "fair," then it is reasonable to assume that the possible events are equiprobable. Each of the two coins can come up heads or tails. Thus, if these events are equiprobable, then each has a probability of 1/2 or .5. And, the outcome for one coin has no influence on the outcome for the other coin - the events are independent. Then the probability of the joint events can be obtained by multiplying their probabilities. Thus, the likelihood that they both come up heads is .5 x .5 or .25; that the first comes up heads and the second tails is also .25; that the second comes up heads and the first tails is .25; and that they both come up tails is .25. Note that the probability of at least one of the coins coming up heads is .75, since we must add the cases that can yield this outcome.

Note that in this simple case, where the outcomes of the coin tosses were independent, we needn't worry about belief revision. Independence implies that the previous history of the outcomes has no influence on the current likelihood of either coin coming up heads or tails. However, what if we had rather special coins? Perhaps one of the coins, we aren't sure which, will come up heads with a somewhat higher probability whenever the other coin has come up tails on 4 or more of the most recent 6 trials. In this case, the outcomes are not independent, and we will constantly have to revise our estimates based on prior outcomes of the coin tosses.

Now that we have examined a very simple world involving only two coins, we are ready to consider the general case. Note that in the general case we cannot assume that the events are equiprobable and independent. Consequently, belief revision will clearly be a computationally expensive task.
And, another difficult problem is to determine exactly how to assign probabilities to a particular event.
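The arithmetic for the fair, independent case just described is direct:

```python
p_heads = 0.5                           # each fair coin
p_both_heads = p_heads * p_heads        # independence: joint = product
p_both_tails = (1 - p_heads) * (1 - p_heads)
p_at_least_one_head = 1 - p_both_tails  # complement of "both tails"
```

Each joint outcome has probability .25, and "at least one head" collects three of the four equiprobable states, giving .75.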


For example, pretend that we have 30 pennies that are tossed on every trial. And assume that these pennies are very special pennies that can influence the outcome of some of their fellow pennies. For example, penny 1 and penny 9 may be particularly in sympathy with each other and always take on exactly the same value. And, half of the time penny 6 will take on the opposite value of penny 1, and so on. How can we determine, for example, the probability of penny 6 coming up heads? The next figure below discusses this general problem of assigning probabilities to a set of propositions in a coherent fashion for this general case.

If the propositions are not all independent, then assigning probabilities to propositions in a coherent fashion seems to be one of those intractable problems that we keep encountering. And, it is not entirely clear that the world is populated with domains where the propositions are independent. The contrived world of games of chance may represent the exception rather than the rule.

Induction,Concepts, Uncertainty


http://www.rci.rutgers.edu/~cfs/305_html/Induction/ProbabilityAxioms.html (5 of 5)11/11/2005 1:29:51 PM

Conditional Probabilities and Bayes Rule



The general idea of belief revision is that whenever new information becomes known, this new information may require us to revise our beliefs. We encountered this idea in the previous discussion of deduction and what were termed defeasible inferences. If I believe that it is highly likely that Larry owns a car, then when I find out that Larry lives in Manhattan I may consider revising this belief. Or if I learn that Larry likes carrots, then I may consider revising my belief about Larry's ownership of a car. What does fondness for carrots have to do with car ownership!? The problem with deductive logic and with standard probability theory is that there is nothing in these formalisms that allows us to indicate that 'living in Manhattan' and 'liking carrots' are probably not equally relevant to questions of car ownership. Recall that in a probabilistic representation of knowledge, each proposition (simple or complex) has associated with it a probability. Consequently, the process of belief revision is one of updating the probabilities of events when new information becomes available. The probability of some proposition p after the receipt of information that some proposition q has occurred is called its conditional probability. It is written as Pr(p|q) and read as the probability of p given that q has occurred. For example, the probability of 'a fire in the Empire State Building' may be some small value, say .0002. And the probability of 'smoke in the Empire State Building' may also be some small value, say .002. But if smoke has been observed, then it may be appropriate to update or revise our estimate of 'a fire in the Empire State Building.' For example, it might be changed to .0015. Bayes Rule is a formula for belief revision. This rule is given below. Note that the propositions referred to in the formulae below include both simple and complex propositions; and this updating process must be carried out for every proposition.
Thus, again we run into a problem of the tractability of this method.
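The formula in the figure does not reproduce in this copy; Bayes Rule itself states Pr(p|q) = Pr(q|p) Pr(p) / Pr(q). Applied to the fire-and-smoke example, with an assumed likelihood Pr(smoke|fire) = 0.9 (a value not given in the text):

```python
def bayes(p_h, p_e_given_h, p_e):
    """Bayes Rule: Pr(H|E) = Pr(E|H) * Pr(H) / Pr(E)."""
    return p_e_given_h * p_h / p_e

p_fire = 0.0002            # prior probability of fire (from the example)
p_smoke = 0.002            # probability of smoke (from the example)
p_smoke_given_fire = 0.9   # assumed: smoke is very likely given fire

p_fire_given_smoke = bayes(p_fire, p_smoke_given_fire, p_smoke)
```

Under these assumed numbers, observing smoke raises the estimate from .0002 to .09; the text's illustrative .0015 would correspond to a much weaker assumed likelihood.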


Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Bayes.html (2 of 2)11/11/2005 1:30:21 PM

The "Grue" Property

The "Grue" Property à la Nelson Goodman


The "grue" property is defined as: x is grue if and only if x is green and is observed before the year 2000, or x is blue and is not observed before the year 2000. This is a "weird" property, but there is no obvious reason why we couldn't make up such a property. Now, let us pretend that the x referred to above are actually emeralds. Further, pretend that we have observed many emeralds and they have all been green, and thus have had the property "grue". Then, intuitively, this should increase our belief that the next emerald we observe will be green and that it will be grue. This intuition is fine until New Year's Eve of 1999. Our pretend emeralds observed in 2000 should be grue, and therefore blue and not green. Strange.......Is it still strange if we pretend the x are marbles rather than emeralds? Why is this strange?
Goodman, Nelson. Fact, Fiction and Forecast. Indianapolis: Bobbs-Merrill, 1965.
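The definition transcribes directly into a predicate:

```python
def is_grue(color, observed_before_2000):
    """Goodman's predicate: green and observed before 2000, or blue and not."""
    return ((color == 'green' and observed_before_2000) or
            (color == 'blue' and not observed_before_2000))
```

Every green emerald observed before 2000 is grue, so the same observations seem to confirm "all emeralds are green" and "all emeralds are grue" equally well, even though the two rules diverge after the cutoff.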

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Grue.html11/11/2005 1:30:29 PM

Algebra of Sets and Definition of a Lattice

Sets and Lattices

In this section on induction, the idea of a set is often used to speak of the space of inductive hypotheses. The table above describes some of the algebraic properties of the operations of union and intersection that are defined over sets. (In the figure above, union is represented by the U-shaped symbol and intersection by an upside-down U-shaped symbol.) Recall that a set is simply a collection of things. For example, the letters a, b, s could be grouped together as a set, which would be expressed as {a,b,s}, where the curly braces are used to enclose the items that constitute the set. In addition to the set {a,b,s}, we define the sets {b,a} and {c,t,q}. The union operation can then be illustrated as:

{a,b,s} U {b,a} = {a,b,s}
{a,b,s} U {c,t,q} = {a,b,s,c,t,q}

and the intersection operation as:

{a,b,s} I {b,a} = {a,b}
{a,b,s} I {c,t,q} = {}

where {} is the empty set. (The empty set is referred to by its own symbol in the above figure.) Finally, we can define a relation between two sets in order to make this all a bit more interesting. The relation is referred to as 'subset of' and is often represented as a 'U' on its side with a line underneath and the open portion of the 'U' to the right. In the example above, the set {b,a} is a subset of the set {a,b,s} since every element of the set {b,a} is in the set {a,b,s}. It is also true that the set {b,a} is a subset of the set {b,a}; that is, it is a subset of itself. Sometimes the inverse of 'subset of' is also used and is read as 'contains'. Again this relation is represented as a 'U' on its side with a line underneath, but in this case the open portion of the 'U' is to the left. In the example above, the set {a,b,s} contains the set {b,a} since every element of the set {b,a} is in the set {a,b,s}. The subset (or the contains) relation can be used to partially order a set of sets. If some set A is a subset of a set B, then these sets are ordered with respect to each other.
If a set A is not a subset of another set B, and B is not a subset of A, then these two sets are not ordered with respect to each other. Thus, this relation can be used to partially order a set of sets. Partial orders are so popular with computer scientists and mathematicians that they have given them a nickname - POsets.
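Python's built-in sets make the running example executable (| is union, & is intersection, <= is 'subset of', >= is 'contains'):

```python
a = {'a', 'b', 's'}
b = {'b', 'a'}
c = {'c', 't', 'q'}

assert a | b == {'a', 'b', 's'}                  # union
assert a | c == {'a', 'b', 's', 'c', 't', 'q'}
assert a & b == {'a', 'b'}                       # intersection
assert a & c == set()                            # the empty set
assert b <= a and b <= b                         # subset (including of itself)
assert a >= b                                    # contains
```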

Sets become more interesting when they can be shown to possess some additional structural or algebraic properties. The figure above defines a particular kind of structure that is referred to as a lattice. The next two figures below provide additional definitions and properties related to lattices using the 'contains' relation. Before you become too enthralled with the technical details provided here, step back and consider the implications of this structure for concept learning. Recall that one of the problems in inductive tasks is to determine how to revise a hypothesis in response to new evidence. Usually there are many different ways to revise the hypothesis, and there is no obvious way in which to choose one revision over another. And all of the revisions usually cannot be followed, because there are often too many of them. With this in mind, simply recall that a lattice is a partially ordered set where any pair of sets (hypotheses) has a least upper bound and a greatest lower bound. Now let our current hypothesis be H and the current training example be E. If E is a subset of H, then no change to H is required. If E is not a subset of H, then H must be changed. The minimal generalization of H is the least upper bound of E and H, and the minimal specialization of H is the greatest lower bound of E and H. Thus, the lattice serves as a kind of map that allows us to locate our current hypothesis, H, with reference to the new information, E. The example lattice referred to below will help to illustrate this general idea.


An example of the lattice that is formed from the power set of the set {a,b,c} is provided and an examination of its structure should clarify these ideas.

The figure above shows the correspondence that exists between the algebra of propositional logic and the algebra of sets. Recall that we have referred to a hypothesis either as a logical expression or rule that defines a concept, or as a subset of the possible instances constructible from some set of dimensions. It is because of this correspondence that we can use either language to refer to a concept definition. Note that disjunction corresponds to union, negation to set complement (everything that is not in a set), and conjunction to intersection. And recall that union and intersection were the important operators used to define a lattice. Now, it should come as no surprise that the expressions in propositional logic can also be organized into a corresponding lattice. It is this correspondence that is used in the version space approach to concept learning that we examined.

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Lattices.html (3 of 3)11/11/2005 1:31:08 PM

Lattice of the Power Set of {a,b,c}



The figure to the right is based on the power set (all possible subsets) of the three element set {a, b, c}. These subsets form a special kind of partial order that is referred to as a lattice.

Notice that the set at the top of the figure, U, consists of all of the elements of the set. Sets D, E, and F are each in the subset relation to U (for example, every element of D is an element of U, and so on). This subset relation is the basis for partially ordering the sets. We have placed the sets D, E, and F below U and drawn an arrow from each of these sets to the set U to represent the fact that they are ordered with respect to U. Note that they are not ordered with respect to each other. It is for this reason that we refer to this as a partial order.

Notice further that the set A is a subset of D and of E; B is a subset of D and F; and so on. Again the position and the arrows indicate the ordering relations that hold under the subset relation. Finally, at the bottom of the figure is the empty set. It is ordered in the figure with respect to sets A, B, and C. Now, the subset relation is a transitive relation. That is, if A is a subset of D, and D is a subset of U, then A is a subset of U. We didn't bother to put arrows into the figure to reflect this transitivity. Graphs or diagrams of partial orders that leave out the transitive relations are referred to as Hasse diagrams. We have mentioned that a lattice is a special kind of partial order, and now it is time to illustrate this claim. Consider first the operation of union. Consider any pair of sets in the diagram. The union of the pair will always yield a set which contains them both. For example, D union E equals U; A union B equals D; and so on. (If one of the sets precedes the other in the partial order, then the union yields the set that occurs higher in the order. For example, A union D equals D.) The set obtained under union is referred to as the least upper bound. Next consider the set operation of intersection. Again consider any pair of sets in the diagram. For example, D intersection E equals A; A intersection B equals the empty set; and so on. The set obtained under intersection is referred to as the greatest lower bound for the pair. If every pair of elements in the partial order has a unique least upper bound and a unique greatest lower bound, then the partial order is said to form a lattice. In the example shown here, the elements of the partial order are sets and the ordering relation is subset.
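For a lattice this small, the least upper bound and greatest lower bound can be computed by brute force over the power set (the singleton sets play the role of A and B in the figure):

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a set, as frozensets."""
    items = list(items)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(items, r)
                                for r in range(len(items) + 1))]

lattice = powerset({'a', 'b', 'c'})
A, B = frozenset({'a'}), frozenset({'b'})

# Least upper bound: the smallest element containing both A and B.
lub = min((s for s in lattice if A <= s and B <= s), key=len)
# Greatest lower bound: the largest element contained in both.
glb = max((s for s in lattice if s <= A and s <= B), key=len)
```

The least upper bound of {a} and {b} is their union {a,b} (the set D in the figure), and their greatest lower bound is the empty set, just as the text describes.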

Induction,Concepts, Uncertainty

Back to Sets and Lattices

http://www.rci.rutgers.edu/~cfs/305_html/Induction/SimpleLattice.html11/11/2005 1:31:26 PM

Dutch Book

Subjective Probabilities and Dutch Book
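The figure for this page does not reproduce in this copy. The core of the Dutch book argument is that degrees of belief which violate the probability axioms can be turned into a guaranteed loss; a toy sketch, with assumed numbers, follows.

```python
# Incoherent beliefs: Pr(rain) + Pr(no rain) = 1.2 > 1, violating the axioms.
p_rain, p_no_rain = 0.6, 0.6
stake = 1.0

# A bettor who prices bets by these beliefs pays belief * stake for a ticket
# that returns `stake` if the event occurs. A bookie sells them both tickets:
bookie_income = (p_rain + p_no_rain) * stake
bookie_payout = stake                  # exactly one of the two events happens
guaranteed_profit = bookie_income - bookie_payout
```

Whatever the weather, the bookie pays out only one stake but has collected 1.2 stakes, so the incoherent bettor is guaranteed to lose.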

Induction,Concepts, Uncertainty

http://www.rci.rutgers.edu/~cfs/305_html/Induction/DutchBook.html11/11/2005 1:32:00 PM


Terms, Concepts, Questions - Induction

Some Terms, Concepts and Questions


positive example
negative example
Induction
Training Sequence
concept space
generalize - specialize
extensionally defined
intensionally defined
Induction Problem
Description - Generalization
Concept Revision

Concept

Structuring the Hypothesis (Concept) Space
Version Space
Concept as an interpretation in a language other than the observation language

more general than - more specific than
Lattice

Relation to Hypothesis (concept) Revision

Mutually Exclusive Axiom
Exhaustive Axiom
Probability Theory
Conditional Probability
Probability (Belief) Revision
Independence
Dependence
Assigning Probability to Events
Revising Probabilities / Bayes Rule

Induction,Concepts, Uncertainty
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Induction/Induction_C.html11/11/2005 1:32:16 PM

Properties of Language

Noam Chomsky's linguistic research in the late 1950s and 1960s was among the first to use the work in formal theories of computation to illuminate some of the properties of the human mind. (Click here to go to a reprint of a 1968 lecture by Chomsky on the Linguistic Contributions to the Study of Mind.) At this time most of the work in psychology was dominated by the behaviorist point of view. The emphasis was on the learning of 'verbal materials' - nonsense syllables, randomly constructed lists of words, and the like. There was virtually no work on the learning of materials that were syntactically well-structured. And from the behaviorist point of view, to the extent that a theory might be required, the ideal theory was one that predicted 'observed behavior'. This is quite a tall order if one moves outside the confines of a well-controlled and suitably circumscribed experimental situation. And even in such an experimental context, a probabilistic prediction seemed to be the best that could be expected.

For example, in the early 60s a psychologist, Gordon Bower, showed that a one-element Markov process could model the behavior observed in a very circumscribed experimental situation. The experiment was what is known as a paired-associate learning task. In such a task there are a number of items, the "stimuli," that are presented, and these are followed by a number of items termed the "responses." The subject's task is to learn to respond with the correct response when one of the stimulus items is shown. Presumably, the learning involves building an association between these paired elements. In the Bower experiment the stimuli were single-digit integers and the responses were alphabetic materials. A Markov process shares many of the properties of the finite state machines that we studied earlier. You may recall that one of the ways we used to represent a finite state machine is known as a Markov diagram.
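A one-element model of this kind can be sketched as a two-state Markov process; the 0.3 learning rate below is an assumed value for illustration, not Bower's estimate.

```python
import random

# Transition probabilities out of each state; 'learned' is absorbing.
transitions = {
    'unlearned': {'learned': 0.3, 'unlearned': 0.7},
    'learned':   {'learned': 1.0},
}

def step(state, rng):
    """Sample the next state from the current state's transition distribution."""
    r = rng.random()
    cumulative = 0.0
    for next_state, p in transitions[state].items():
        cumulative += p
        if r < cumulative:
            return next_state
    return state

rng = random.Random(0)
state = 'unlearned'
for _ in range(50):        # over many trials the item is almost surely learned
    state = step(state, rng)
```

While the item is 'unlearned', each trial crosses over with probability 0.3; once 'learned', every subsequent step stays 'learned', which is what makes the model predict an abrupt, all-or-none transition.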
In fact, Markov was a major figure in the early work on the development of mathematical models of computation (Markov, A. A Theory of Algorithms. Moscow: National Academy of Sciences, 1954.). In a Markov process, each of the possible transitions from one state to the next has some probability of occurring (of course, these must sum to one). However, there may be many possible starting states, each of which has some initial probability (again, these must sum to one). And there need not be a final state. Thus, you can see that one can view a finite state machine as a special case of the more general idea of a Markov process. Markov processes, and stochastic processes in general, were being explored during this period in a variety of domains within the social sciences, including economics, linguistics, political science, and sociology, as well as psychology.

But how could one predict linguistic behavior, even probabilistically? There are a great many possible sentences that anyone might say at any point in time. As I write this, I don't even know exactly what I will say next and exactly how I will say it. And, to make matters worse, Chomsky argued that the number of sentences in any natural language is, in principle, infinite. Perhaps we could come close to predicting the linguistic utterances used in "greeting behavior" using a probabilistic model; but almost anything else seems hopeless. (Political debates and speeches might be an additional exception.) When you have a game that is impossible to win, don't play that game. Find one that you have some chance of winning. This is what Chomsky did;
http://www.rci.rutgers.edu/~cfs/305_html/Understanding/LanguageProps.html (1 of 2)11/11/2005 1:33:02 PM

Properties of Language

he changed the game. His 1956 article (Chomsky, N. Three models for the description of language. IRE Transactions on Information Theory, 1956, IT2(3), 113-124.) defined a new game. In this game a theory is not asked to predict specific behaviors in a specific context. Rather the theory is asked to 'generate all syntactically correct strings of words (and only) the syntactically correct strings of words of some language.' That is, the theory should capture the essential properties of all language behavior. This eventually led psychologists to shift there attention from the memorization of linguistically related materials to questions about the kind of capacities that the human mind must possess in order to use language. The properties of natural language became more important than some specific linguistic utterance.

The figure above provides a list of various general properties that are exemplified in our use of natural language. Cognitive psychology is still attempting to work out the implications of these properties for our theory of the mind. For example, recall the Gestaltist's interest in ambiguous figures. Do ambiguous sentences suggest similar properties of the mind? What is the relation between metaphor and our experience? How do we determine the meaning and aptness of a metaphor? The starting point for this work on language and the mind was in the area of syntax and it is to this work that we turn next.

Understanding, Interpreting and Remembering Events


Charles F. Schmidt


Syntax and Sentence Understanding



The sentences below are similar in their surface form but differ significantly in the way they are understood. Consider the sentences:

John saw the boy in the park with a telescope.
John saw the boy in the park with a dog.
John saw the boy in the park with a statue.

or, to summarize the three: John saw the boy in the park with a {telescope; dog; statue}. These sentences are (on the surface) identical except for the choice of the last word. However, they can be seen to differ in important ways if we use our linguistic knowledge to determine which part of the sentence the prepositional phrase 'with a x' modifies in each case. The figure to the left provides a representation of the underlying structure which explicitly shows the aspect of the sentence that is modified by the prepositional phrase. In the first sentence, the seeing is being accomplished using a telescope. In the second, it is the boy who is with a dog. And, in the third, it is the park that contains a statue. The fact that the surface form of the sentence does not provide a basis for these distinctions is part of the argument that the mind must possess rules which capture linguistic knowledge and are used to understand linguistic utterances.
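The three attachment possibilities can be made concrete with informal nested structures; the bracketing and labels below are my own illustration, not a standard notation:

```python
# Each parse attaches "with a X" to a different node: the verb phrase (an
# instrument of the seeing), the noun phrase "the boy", or the noun phrase
# "the park".
telescope = ("saw", ["the boy"], ["in the park"], {"instrument": "with a telescope"})
dog       = ("saw", ["the boy", "with a dog"], ["in the park"])
statue    = ("saw", ["the boy"], ["in the park", "with a statue"])

def attachment(parse):
    # Report which constituent the with-phrase ended up inside.
    if len(parse) == 4:
        return "verb"
    return "object" if len(parse[1]) == 2 else "location"

print([attachment(p) for p in (telescope, dog, statue)])  # ['verb', 'object', 'location']
```

The word strings are nearly identical, but the structures assigned to them are not; the difference lives entirely in the representation, not on the surface.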

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/SyntaxEx.html (1 of 2)11/11/2005 1:33:20 PM


Parsing Example

Parsing a sentence involves the use of linguistic knowledge of a language to discover the way in which a sentence is structured. Exactly how this linguistic knowledge is represented and can be used to understand sentences is one of the questions that has engaged the interest of psycholinguists, linguists, computational linguists, and computer scientists. The figure to the right provides an example of the way in which the computational formalism of a context free grammar can be used to represent a fragment of the linguistic knowledge that any speaker of English would possess. Here linguistic knowledge is represented as a set of rules. A rule consists of two parts: the part to the left of the arrow (the left side) and the part to the right of the arrow (the right side). You will note that some of the rules have a vertical slash on the right side. For example, the second rule reads as NP --> Det N | Prop. The slash is a convention used to represent that the NP can be replaced either by Det N or by Prop. Thus, this is really two rules. The top-most rule states that the symbol S is replaced by the symbols NP and VP. Here S refers to 'sentence' and NP and VP to 'noun phrase' and 'verb phrase' respectively. These are part of the non-terminal vocabulary of the grammar. Words that actually occur in a sentence, such as 'the,' 'home,' and 'go,' are part of the terminal vocabulary of the context free grammar. You may have noted the similarity between this formalism and the method of problem reduction that was described in the section on the computational approach to cognition. A context free grammar can be viewed as a special kind of problem reduction.
Like the method of problem reduction, the context free grammar involves breaking a problem into subproblems until only primitive subproblems remain. The reading of the first rule as "If the goal is to discover the syntactic structure of this string of words, then try to partition the string of words into one part that is a noun phrase and another part which is a verb phrase" may help you to see this as an example of a rule that decomposes a problem into subproblems. However, note that this is not a general problem solving system, but rather one that is dedicated to the single goal of assigning structure to sentences. We will use this simple set of syntactic rules expressed as a context free grammar to illustrate one of the questions that cognitive psychologists have pursued in their attempt to discover how our minds are able to use linguistic knowledge to understand sentences. The question concerns the interplay between the input and the process of attempting to "parse" the input....or in this example, the process of identifying the syntactic structure that can be assigned to the sentence. You may recall that when we discussed the cryptarithmetic problem we noted that, in principle, any of the subproblems (assigning the correct integers to some subset of the letters) could be pursued at any point. However, in practice, it was advantageous to solve some of the subproblems before others. The parse tree shown in the center represents the decomposition of the sentence into its components. Is there a particular order in which each of these subproblems should be pursued? There are many orders that are possible.

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/Parsing.html (1 of 3)11/11/2005 1:33:57 PM

[Figures: the parse tree of the sentence "The boy went home," surrounded by four animations: a Left to Right Top-Down Parse, a Right to Left Top-Down Parse, a Left to Right Bottom-Up Parse, and a Right to Left Bottom-Up Parse.]

Four of these possibilities are animated in the figures surrounding the parse tree. In these animations, the symbol(s) to which a rule is applied are briefly underlined and shown in red. Then, a new line is added showing the result of applying the rule to the symbol(s). Each of these ways of parsing the sentence arrives at the same assignment of structure to the sentence. They differ only in the way in which the rules are used and the order in which they are applied. The top two animations illustrate what is referred to as a top-down parse because the expansion of the tree begins at the top or root node. The bottom two animations illustrate what is referred to as a bottom-up parse because the construction of the tree begins from the terminal nodes (the words of the sentence) of the tree. A top-down parse is realized by matching the left side of a rule to an appropriate symbol and then replacing that symbol with the symbols on the right side of the rule. For example, replacing S with NP VP. The bottom-up ordering is realized by matching the right side of a rule to the appropriate symbol(s) and then replacing the matched symbol(s) with the symbol on the left side of the rule. For example, replacing NP VP with S. In addition to the way the rule is used, we can distinguish the order in which the operations are applied. The animations on the left illustrate a left to right order; that is, at a particular level of the tree, the subproblems on the left are chosen before the subproblems on the right. The animations on the right illustrate the reverse or right to left order. In language understanding it would seem most natural to parse an input sentence in a left to right order since that is the order in which the words are heard. However, in the presence of noise, it might be that the order chosen is determined by the ease with which a subproblem can be solved.
Deviation from the left to right order of processing requires some means of remembering the words that have been received but temporarily ignored. The distinction between top-down and bottom-up processing is one that aroused considerable interest. Although these terms were originally defined by the kind of parsing process illustrated here, they have been used in a broader sense; and, it is perhaps better to refer to a bottom-up process as a postdictive process and a top-down process as a predictive process.
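The top-down, left-to-right (predictive) strategy can be sketched as a recursive-descent parser. The grammar below is a plausible reconstruction of the fragment discussed above; the exact rules in the figure are not reproduced here, and the treatment of 'home' as a Prop is my assumption:

```python
# Each nonterminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "N"], ["Prop"]],
    "VP":   [["V", "NP"]],
    "Det":  [["the"]],
    "N":    [["boy"]],
    "V":    [["went"]],
    "Prop": [["home"]],
}

def parse(symbol, words, i):
    """Top-down, left-to-right parse; returns (tree, next position) or None."""
    if symbol not in GRAMMAR:                 # terminal: must match the input word
        if i < len(words) and words[i] == symbol:
            return symbol, i + 1
        return None
    for rhs in GRAMMAR[symbol]:               # predict: try each rule in turn
        children, j, ok = [], i, True
        for part in rhs:                      # expand subproblems left to right
            result = parse(part, words, j)
            if result is None:
                ok = False                    # prediction failed; try the next rule
                break
            subtree, j = result
            children.append(subtree)
        if ok:
            return (symbol, children), j
    return None

tree, end = parse("S", "the boy went home".split(), 0)
print(tree)
```

The input words are consulted only at the leaves, as the last step of verifying a prediction; the failed attempt to read 'home' as Det N is exactly the kind of abandoned hypothesis a predictive parser must be able to retract.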


Recall that the bottom-up process involves the triggering of the right-hand side of rules. In our example the initial triggering of a rule depends entirely on the input string; in fact, a rule is not applied until the input required for its application is satisfied. In this sense, the knowledge is utilized postdictively and the process never goes beyond or gets 'ahead' of what the input justifies. Contrast this with the top-down process. Here verification of a rule occurs as the last step in the application of the rules. When the left-hand side of a rule is triggered, the contents of the right-hand side can be viewed as a prediction about what will be seen. A final distinction concerns the question of determinacy. Most of the time our intuitions support the idea that the sentences we hear have but one interpretation. But notice that even when this is true, it does not imply that at each point in the process of sentence understanding only one choice, the correct choice, was available. A parsing process that considers many different possible interpretations is referred to as a nondeterministic process. There are a variety of ways to realize a nondeterministic process. One way is to pursue all of the possible hypotheses at once. You may remember that when we studied induction we noted that this space could be quite large for most inductive tasks. It appears that the space can also be quite large in the language understanding task.


Marcus Work

Look-Ahead Argument
Recall that one of the issues that has engaged the interest of cognitive scientists is whether human sentence understanding is a deterministic or nondeterministic process. That is, as we process an incoming sentence do we always or almost always make the choices that will result in the correct interpretation? Or do we consider possible interpretations that turn out to be incorrect? The figure to the left provides examples that might be interpreted as requiring that we consider possible interpretations that turn out to be incorrect....that we use a method of parsing that is nondeterministic. In the first two examples, the surface form of the sentence (whether it begins with 'Is' or 'The') provides reliable information on which to correctly guess that the (a) sentences are to be interpreted as question sentences and the (b) sentences as declarative sentences. But note that in the third pair of sentences it is not until the eighth word that the input provides information on which to choose the correct interpretation. Below this third example pair is shown a representation of the way these two sentences are structured. Note that one must either:

- develop both interpretations;
- choose one and hope you are lucky and don't have to backtrack to an alternative; or
- wait until the eighth word arrives before making a choice.

It is some work that considers this last choice that we will consider here. One kind of parsing strategy that had been studied in work on parsing formal languages is a parser that is referred to as a look-ahead parser. A look-ahead parser is one that when deciding how to interpret input symbol i is allowed to look at the next k input items before making its decision.
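The look-ahead idea can be illustrated with a toy rule over a small constituent buffer. The example sentences, the assumption of a two-word noun phrase, and the take/taken cue are simplifications of mine; Marcus' actual rules are much richer:

```python
# A three-cell buffer: [first word, noun phrase, the verb after it].
# Peeking at the third cell decides between an imperative ("have ... take")
# and a yes-no question ("have ... taken") before any structure is built,
# so no backtracking is ever needed.
def sentence_type(words):
    buffer = [words[0], words[1:3], words[3]]  # assumes a two-word NP
    if buffer[0] == "have":
        return "imperative" if buffer[2] == "take" else "yes-no question"
    return "declarative"

print(sentence_type("have the students take the exam".split()))
print(sentence_type("have the students taken the exam".split()))
```

Note what the constraint buys and costs: with a fixed buffer of k cells the decision is deterministic, but if the intervening noun phrase were too long to fit in the buffer, the rule could no longer decide, which is exactly the breakdown the constrained look-ahead hypothesis predicts.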

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/Marcus.html (1 of 3)11/11/2005 1:34:16 PM


In his dissertation research, M. Marcus set about the task of attempting to construct a parser for natural language that parsed sentences deterministically. The figure to the left is adapted from Marcus' research. At the top of this figure is his statement of the determinism hypothesis. Recall the various exhaustive search methods that were considered in the section on computational approaches to the study of cognition (Exhaustive Search Methods). Since these exhaustive methods may consider all of the possible hypotheses, it is obvious that both the method of breadth-first search and the method of depth-first search are nondeterministic methods. But notice that they realize the nondeterminism in differing ways. In breadth-first search each alternative is explicitly represented and maintained until a solution is found. In depth-first search back-tracking is used when one alternative is abandoned and another is explicitly considered. In the center of this figure, Marcus enumerates various ways in which nondeterminism might be realized in an algorithm without explicitly maintaining the alternatives. Finally, at the bottom of the figure are listed the properties that Marcus hypothesized as sufficient to realize a deterministic parser of English. And the figure below provides a schematic picture of the structure of his parser. Consider the first property; namely, that the parser be Partially Data Driven. The data that supports this claim is exemplified by sentences (1a) and (1b). A strictly top-down parse would consider many possible parses. But the first word of each of these sentences can be used to immediately limit the hypotheses considered. But the parser must also Reflect Expectations. This is a characteristic of top-down parsing, and the assumption here is that human sentence understanding is not only responsive to the implications of the current input data, but also predictive. Sentences (2a) and (2b) provide an example of the type of data that support this assumption.
In (2a) a noun phrase is expected, whereas in (2b) a sentence is expected. Finally, in order to correctly handle sentences such as (3a) and (3b), it is assumed that there is a capacity for Constrained Look-Ahead. Note that because the capacity for look-ahead is constrained to some limit, k, if the parsing of some sentence exceeds this bound, we would expect the parser to break down and be unable to deterministically construct a parse of the sentence. In the figure to the left, the Stack is the structure that contains the subproblems introduced during the parse. This stack provides the basis for implementing the predictive property of the parser.

The Input Buffer provides the structure that is required to realize the Look-Ahead property of the parser. Beneath this diagram is an example rule. Notice that the If portion of the rule makes reference both to the Stack (the predictive aspect of the parsing process) and to the Input Buffer (the data-driven aspect of the parsing process). Is this then a correct model of human sentence understanding? Perhaps not, but it illustrates the way in which our ability to construct computational artifacts that attempt to realize human competence and performance in some domain (in this case sentence parsing) extends our ability to imagine and attempt to empirically test the various possible ways our minds might work.


Case Grammar

In the previous pages on the interpretation of language we considered what is referred to as syntactic knowledge. Roughly speaking, syntactic knowledge includes knowledge about:

- syntactic categories such as noun, verb, verb phrase, ...; and
- how these categories may co-occur and may be ordered.

More abstractly, syntactic knowledge may be defined as linguistic knowledge that can be stated without any reference to whatever the words may refer to. Semantic knowledge is linguistic knowledge that does depend on properties of a word's referent. For example, "The stone sang beautifully." is a syntactically correct sequence of words, but it is semantically anomalous. Stones can't sing beautifully; in fact, they can't sing at all! This is semantic knowledge because it states a regularity that depends on the referent of the word stones. If the referent of stones were the Welsh people, then many would not find the sentence semantically anomalous. Charles Fillmore was one of the first linguists to introduce a representation of linguistic knowledge that blurred this strong distinction between syntactic and semantic knowledge of a language. He introduced what was termed case structure grammar, and this representation subsequently had considerable influence on psychologists as well as computational linguists. The figure to the right presents the basic ideas that define a case structure grammar. Notice that linguistic knowledge is organized around verbs or, more precisely, a verb sense. (A verb may have more than one sense or meaning, and these are represented separately. For example, to run for office is a different sense of run than to run to first base, and these would be two different representations in a case grammar.) Associated with each verb sense is a set of cases. Some of the cases are obligatory and others are optional. A case is obligatory if the sentence would be ungrammatical if it were omitted. For example, John gave the book is ungrammatical.

There are two notable features that are illustrated in the example representation. First, the cases associated with a verb seem to be associated with questions that one would naturally ask about an event. Who did what to whom when? The representation seems well adapted to the retrieval of the information provided in a sentence. This feature was particularly appealing to psychologists and computational linguists. A second interesting feature is that the same representation is provided to both the active and passive forms of the sentence. In the figure the active form is shown above the representation and the passive form below. This feature would be consistent with the finding that we rarely recall the exact syntactic form of a sentence but do recall the basic information provided by the sentence.

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/CaseGram1.html (1 of 3)11/11/2005 1:34:37 PM
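A case-frame lexical entry can be sketched as a small data structure. The case names and the obligatory/optional split below follow the general idea of case grammar; they are not Fillmore's exact inventory:

```python
# Entry for one sense of "give": an act of transfer with three obligatory cases.
GIVE = {
    "verb_sense": "give-transfer",
    "obligatory": ["agent", "object", "recipient"],
    "optional": ["time", "location"],
}

def fill_frame(entry, fillers):
    # A sentence is ill-formed if an obligatory case is left unfilled.
    missing = [case for case in entry["obligatory"] if case not in fillers]
    if missing:
        raise ValueError("missing obligatory case(s): " + ", ".join(missing))
    return {"verb": entry["verb_sense"], **fillers}

# "John gave the book to Jim" supplies every obligatory case...
frame = fill_frame(GIVE, {"agent": "John", "object": "the book", "recipient": "Jim"})
print(frame["verb"])

# ...whereas "John gave the book" leaves the recipient case unfilled.
try:
    fill_frame(GIVE, {"agent": "John", "object": "the book"})
except ValueError as err:
    print(err)
```

Note that the filled frame answers the who-did-what-to-whom questions directly, and an active and a passive sentence would fill it identically.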

In the next figure to the right, the green portion illustrates the basic case structure representation of the sentence information. But when we hear this sentence we also know the set of inferences shown; namely, that John had a book at some time b, and Jim has the book at some time a, and time b is before time a. The blue box illustrates the way in which these inferences can be linked to the case structure representation. The case structure representation served to inspire the development of what was termed a frame-based representation in AI research. Within a frame-based architecture it is quite natural to have these types of inferences triggered by the representation of the sentence. (For those familiar with certain types of Object Oriented programming languages: the frame-based architecture in AI was a somewhat more complicated and elaborated programming environment.)

One of the consistent findings in human sentence understanding is that we seem to draw these inferences automatically. And we rarely remember whether or not such information was explicitly stated in the sentence. This observation is consistent with some of the features of a frame-based representation as suggested by case structure grammar. Another aspect of the case grammar representation is that it can be effectively used to parse incomplete or noisy sentences. For example, while John gave book is not grammatical, it is still possible to create an appropriate case grammar parse of this string of words. However, case grammar is not a particularly good representation for use in parsing sentences that involve complex syntactic constructions. The web page on representing textual information will give you some appreciation of this difficulty.
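A frame-based system can attach the inference rules to the representation itself, so that the had/has/before facts come along automatically whenever a give-frame is instantiated. A minimal sketch; the predicate and slot names here are invented for illustration:

```python
# Instantiating a give-frame fires its attached rules, yielding the
# inferences discussed above: the agent had the object at time b, the
# recipient has it at time a, and b is before a.
def give_frame(agent, obj, recipient):
    frame = {"verb": "give", "agent": agent, "object": obj, "recipient": recipient}
    frame["inferences"] = [
        ("had", agent, obj, "time_b"),
        ("has", recipient, obj, "time_a"),
        ("before", "time_b", "time_a"),
    ]
    return frame

frame = give_frame("John", "the book", "Jim")
print(frame["inferences"][0])  # ('had', 'John', 'the book', 'time_b')
```

Because the inferred facts are stored alongside the explicitly stated ones, a later memory probe cannot easily distinguish the two, which fits the finding that we rarely remember whether such information was actually stated.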


Example of Representing Textual Information



If sentence understanding is not solely determined by the surface form of a sentence, and if we rarely remember the surface form of a sentence, then how can we study human memory for linguistically presented information? Verbatim recall of such materials is very bad, but we do seem to remember the gist of what was put forward. But what is 'gist'? We know what it is not. It is not the surface form of the sentences, and it seems to include information that is not necessarily even in the surface form of the sentence. Kintsch and other researchers attempted to use ideas from linguistics, case grammar in particular, to provide a basis for representing the information that was either explicitly or implicitly present in a set of linguistic materials. The figure to the right takes one of the examples considered in your text on page 247. At the top of this figure is the input sentence, a single sentence that is quite complex syntactically. At the bottom of the figure I have provided within the dashed boxes two different expansions of this input sentence. These expansions are intended to provide a more explicit statement of the information that is conveyed by the sentence at the top of the figure. There are two such expansions. The first was created to match the one provided in the text. The second is the expansion that seems most correct to me. In the center of the figure is a pictorial representation of the information (both explicit and implicit) that must be recovered from the input sentence in order to fully understand it. Notice that there are three verbs involved: form, grows, and contributes. Also note that I have connected some of the lines by dotted lines to indicate relations among the arguments. And some of the arguments are rendered in light gray to indicate that they are not explicitly mentioned.
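A propositional breakdown of this kind can be sketched directly. The sentence and its decomposition below are a classic Kintsch-style illustration, not the example from the figure:

```python
# Each proposition is a predicate followed by its arguments; modifier
# propositions share arguments with the propositions they qualify.
sentence = "The Greeks loved beautiful art"
propositions = [
    ("LOVE", "GREEK", "ART"),
    ("BEAUTIFUL", "ART"),
]

# Recall can then be scored as the proportion of propositions reproduced,
# rather than by verbatim word overlap with the input sentence.
recalled = [("LOVE", "GREEK", "ART")]
score = len(set(recalled) & set(propositions)) / len(propositions)
print(score)  # 0.5
```

On this scheme a paraphrase that preserves both propositions counts as perfect recall even if it shares almost no surface wording with the original, which is exactly the measurement the surface form cannot provide.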
By mapping linguistic materials into this type of form, the researchers believed that they could obtain a more accurate measure of what a human subject remembered when asked to recall the input materials. This again illustrates one of the challenges that must be faced in studying the human mind. Measures that pertain only to the surface aspects of the experimental materials are not sufficient. There is a strong sense in which the empirical study of topics such as recall of linguistic materials requires a theory about mental representations in order to meaningfully measure recall performance.

http://www.rci.rutgers.edu/~cfs/305_html/Understanding/KintschEx.html (1 of 2)11/11/2005 1:34:54 PM


Story Interpretation

The figure below contains the story, The Lost Pocket. This is a very simple story taken from an old second grade reader (a politically incorrect reader, it should be noted). The bottom of the figure annotates the linguistic form of the first three sentences of the story. The first two sentences are imperative sentences, the syntactic form typically associated with commands. The third sentence uses the modal term 'may,' which is typically associated with giving permission. Thus the linguistic form of the opening sentences would suggest that Mother is commanding Mary to put on her red dress. Similarly, the linguistic form of the third sentence would suggest that Mother is giving Mary permission to go to the store for her. But note, commands are typically given when it is assumed that the person is unlikely to be motivated to do the action without someone in authority giving an explicit command. And permission is typically given when someone is motivated to do the act, but doesn't have the authority to do so without the permission. Mary's reaction to the command to put on her red dress suggests that Mary was quite motivated to do so. If this is true, then why did Mother command Mary to put on the red dress? Did Mother think that Mary didn't like to wear that dress? And Mary didn't remark at all on receiving permission to go to the store. Was Mary not really yearning to go to the store for Mother? Or was Mary quite motivated to put on her red dress and rather indifferent about going to the store for Mother? If so, was Mother badly out of touch with Mary's true motivation, or was Mother doing something else? One point of view is that in order to really understand linguistic utterances we must go beyond a linguistic analysis. On this view, linguistic utterances must be viewed as actions--speech acts. We must determine what speech acts are being performed, not merely the linguistic meaning of the sentences.
Perhaps Mother's utterances are really an attempt to persuade Mary to do this errand for her. Mother allows Mary to do something she wants, putting on the red dress; and, in gratitude, Mary does something her Mother wants, going to the store for her. Thus, understanding the "literal meaning" is not enough; how this literal meaning is being used is what must be inferred. If someone says, "It is stuffy in here," they may not be all that interested in informing you of that fact but more interested in requesting you to do something about it. Thus, none of us would be surprised if someone opened a window in response to such an utterance.


http://www.rci.rutgers.edu/~cfs/305_html/Understanding/BettyStory.html (1 of 2)11/11/2005 1:35:12 PM


Terms, Concepts and Questions - Understanding

Some Terms, Concepts and Questions

deep structure
surface structure
Natural Language
knowledge of syntax
knowledge of semantics
knowledge of world

Parsing

top down (predictive) / bottom up (postdictive)
left-right / right-left
Deterministic
Non-Deterministic

How do these concepts relate to ambiguity, resource requirements, and expectations?

Case Grammar

Verb Sense
Cases

Question answering - Inference

Attributes, values, defaults
Schema, Frames, ...
Consistency
Inheritance
Instances

How are these concepts related to gist, bias, forgetting, remembering, inference, and expectations?


http://www.rci.rutgers.edu/~cfs/305_html/Understanding/UnderstandingTermsQs.html11/11/2005 1:35:19 PM

Missionaries & Cannibals Problem



Missionaries & Cannibals puzzle (mc): 3 missionaries (MMM) and 3 cannibals (CCC) are on the left bank of a river (R). A boat (B) is available which will hold two people, and which can be navigated by any combination of missionaries and cannibals involving one or two people. If the missionaries on either bank of the river are outnumbered at any time by cannibals, the cannibals will indulge in their anthropophagic tendencies and do away with the missionaries who are outnumbered. Find a schedule of crossings that will permit all the missionaries and cannibals to cross the river from the left bank to the right bank safely. Let MMMCCCBr be the start, where the boat and all missionaries and cannibals are on the left bank, and rMMMCCCB be the goal, where the boat and all missionaries and cannibals are on the right bank.
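Because the state space is tiny (at most 4 x 4 x 2 = 32 states), the puzzle can be solved mechanically by breadth-first search over counts for the left bank. The (missionaries, cannibals, boat) state encoding below is my own, not part of the puzzle statement:

```python
from collections import deque

def safe(m, c):
    # Missionaries are never outnumbered on either bank
    # (a bank with no missionaries is trivially safe).
    return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

def solve():
    start, goal = (3, 3, 1), (0, 0, 0)   # counts on the left bank; 1 = boat on left
    loads = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # legal boat loads
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == goal:                # reconstruct the schedule of crossings
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, b = state
        sign = -1 if b == 1 else 1       # the boat carries people away from its bank
        for dm, dc in loads:
            nxt = (m + sign * dm, c + sign * dc, 1 - b)
            if (0 <= nxt[0] <= 3 and 0 <= nxt[1] <= 3
                    and safe(nxt[0], nxt[1]) and nxt not in parent):
                parent[nxt] = state
                frontier.append(nxt)
    return None

path = solve()
print(len(path) - 1)  # number of crossings in the shortest safe schedule
```

Breadth-first search returns a shortest schedule, which for this puzzle takes 11 crossings.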

Problem Solving and Planning



http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/MCInstructions.html11/11/2005 1:36:01 PM

Jealous Husbands Problem



Three jealous husbands and their wives need to cross a river. They find a small boat that can contain no more than two persons. Find the simplest schedule of crossings that will permit all six people to cross the river so that none of the women is left in the company of any of the men unless her husband is present. It is assumed that all passengers on the boat disembark before the next trip, and at least one person has to be in the boat for each crossing.
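Unlike Missionaries & Cannibals, here the identities of the people matter, not just their counts. That extra constraint can be captured in a bank-safety predicate; the H1/W1 labeling below is just for illustration:

```python
# A bank is unsafe if some wife is there with another man while her own
# husband is absent. People are labeled H1..H3 (husbands) and W1..W3 (wives).
def safe_bank(bank):
    for i in (1, 2, 3):
        wife, husband = f"W{i}", f"H{i}"
        other_men = any(p.startswith("H") and p != husband for p in bank)
        if wife in bank and other_men and husband not in bank:
            return False
    return True

print(safe_bank({"W1", "H1", "H2"}))  # True: her own husband is present
print(safe_bank({"W1", "H2"}))        # False: another man, husband absent
print(safe_bank({"W1", "W2", "W3"}))  # True: no men on the bank at all
```

Two banks with the same head-count can thus differ in legality, which is why the space of this problem corresponds to, but is not identical with, the Missionaries & Cannibals space.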


http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/HWInstructions.html11/11/2005 1:36:09 PM

River Crossing Problems



Missionaries & Cannibals puzzle (mc): 3 missionaries (MMM) and 3 cannibals (CCC) are on the left bank of a river (R). A boat (B) is available which will hold two people, and which can be navigated by any combination of missionaries and cannibals involving one or two people. If the missionaries on either bank of the river are outnumbered at any time by cannibals, the cannibals will indulge in their anthropophagic tendencies and do away with the missionaries who are outnumbered. Find a schedule of crossings that will permit all the missionaries and cannibals to cross the river from the left bank to the right bank safely. Let MMMCCCBr be the start, where the boat and all missionaries and cannibals are on the left bank, and rMMMCCCB be the goal, where the boat and all missionaries and cannibals are on the right bank.

Jealous Husbands Problem


Three jealous husbands and their wives need to cross a river. They find a small boat that can contain no more than two persons. Find the simplest schedule of crossings that will permit all six people to cross the river so that none of the women is left in the company of any of the men unless her husband is present. It is assumed that all passengers on the boat disembark before the next trip, and at least one person has to be in the boat for each crossing.

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/RiverCrossingProblems.html (1 of 2)11/11/2005 1:36:42 PM


And finally below are the two spaces shown in correspondence.


Tower of Hanoi Story

The Tower of Hanoi Story


Taken from W.W. Rouse Ball & H.S.M. Coxeter, Mathematical Recreations and Essays, 12th edition. Univ. of Toronto Press, 1974. The De Parville account of the origin, from La Nature, Paris, 1884, part I, pp. 285-286:

In the great temple at Benares, beneath the dome that marks the centre of the world, rests a brass plate in which are fixed three diamond needles, each a cubit high and as thick as the body of a bee. On one of these needles, at the creation, God placed sixty-four discs of pure gold, the largest disc resting on the brass plate, and the others getting smaller and smaller up to the top one. This is the tower of Bramah. Day and night unceasingly the priests transfer the discs from one diamond needle to another according to the fixed and immutable laws of Bramah, which require that the priest on duty must not move more than one disc at a time and that he must place this disc on a needle so that there is no smaller disc below it. When the sixty-four discs shall have been thus transferred from the needle on which at the creation God placed them to one of the other needles, tower, temple, and Brahmins alike will crumble into dust and with a thunderclap the world will vanish.

The number of separate transfers of single discs which the Brahmins must make to effect the transfer of the tower is two raised to the sixty-fourth power minus 1, or 18,446,744,073,709,551,615 moves. Even if the priests moved one disc every second, it would take more than 500 billion years to relocate the initial tower of 64 discs.
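The legend's arithmetic checks out directly:

```python
# 64 discs require 2**64 - 1 single-disc moves; at one move per second that
# is comfortably more than 500 billion years.
moves = 2**64 - 1
print(moves)  # 18446744073709551615

seconds_per_year = 60 * 60 * 24 * 365
years = moves // seconds_per_year
print(years > 500 * 10**9)  # True
```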

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOHstory.html11/11/2005 1:36:47 PM

The Tower of Hanoi 3-Disk Solution

The Tower of Hanoi 3-Disk Space and Solution

The 3-disk Tower of Hanoi Problem is shown above. The left figure depicts the starting state and the right the goal state.

The animation above shows the optimal solution to this problem. In the state space depicted below, this optimal solution corresponds to the path that begins at the apex of the large triangle and terminates at the lower right of the large triangle.
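The optimal solution follows the standard recursive strategy: to move n disks, move the top n − 1 aside to the spare peg, move the largest disk to the target, then move the n − 1 back on top. A short Python sketch (the function name `hanoi` is my own, not from the lecture):

```python
def hanoi(n, source, target, spare):
    """Return the optimal move sequence for n disks: each move is a tuple
    (disk number, from peg, to peg), with disk 1 the smallest."""
    if n == 0:
        return []
    return (hanoi(n - 1, source, spare, target)      # clear the top n-1 disks
            + [(n, source, target)]                  # move the largest disk
            + hanoi(n - 1, spare, target, source))   # restack the n-1 disks

solution = hanoi(3, 'A', 'C', 'B')
print(len(solution))  # 7 moves: the 2**3 - 1 minimum
print(solution[0])    # (1, 'A', 'C')
```

The recursion makes the 2^n − 1 move count immediate: each level doubles the work of the level below it and adds one move for the largest disk.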

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH3DiskSol.html11/11/2005 1:37:01 PM

The Tower of Hanoi 3-Disk Space

The Tower of Hanoi 3-Disk Space

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH3DiskSpace.html11/11/2005 1:37:07 PM

n Disks and m Pegs "Tower of Hanoi" Example Problem

n Disks and m Pegs "Tower of Hanoi" Example Problem


The purpose of this example is to point out that when one parameter of a problem is changed, it may not always be obvious how the change alters the structure of the original problem. In this case the parameter changed is the number of pegs. The animation below begins with 8 disks placed on the leftmost peg. The goal is to move these eight disks to the peg on the right. The rules are the same as in the standard 3-peg problem, except that all four pegs can be used. What is the relation between 4-peg problems and the standard 3-peg version? For that matter, what if we allow m pegs to be used? Clearly, there are similarities and there are differences. For most of us, I suspect, the relationship is not immediately obvious. If you think about it awhile, the relationship for an 8-disk and 4-peg "Tower of Hanoi" problem may become apparent; once found, it may then seem obvious. The animation above shows the solution to this particular problem, and it may help you to discern the general relationship to the standard 3-peg problem.
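One way to state the relationship, as an addition to the original page, is the Frame-Stewart recurrence (long conjectured optimal; for four pegs it was eventually proven optimal): park some k disks using all four pegs, move the remaining n − k disks as a standard 3-peg problem in 2^(n−k) − 1 moves, then unpark the k disks, choosing k to minimize the total.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def frame_stewart(n):
    """Moves needed for n disks on 4 pegs under the Frame-Stewart strategy:
    park k disks (4 pegs), move n-k disks (3 pegs), unpark the k disks."""
    if n == 0:
        return 0
    return min(2 * frame_stewart(k) + 2**(n - k) - 1 for k in range(n))

print(2**8 - 1)          # 255 moves for 8 disks on 3 pegs
print(frame_stewart(8))  # 33 moves suffice with the fourth peg
```

So the extra peg changes more than the constant: the 3-peg problem grows exponentially in n, while the 4-peg problem grows far more slowly.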

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOHnDiskmPegSol.htm11/11/2005 1:37:37 PM

A Problem from the Tower of Hanoi State Space

A Problem from the Tower of Hanoi State Space

Shown above is a problem taken from the space defined by the Tower of Hanoi problem. Write down your solution to this problem. What aspects of your knowledge about the Tower of Hanoi problem facilitated your problem solving efforts on this problem? What aspects of your knowledge about the Tower of Hanoi problem interfered with your problem solving efforts on this problem?

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/From_TOHspace.html11/11/2005 1:37:53 PM

Monster Problems

The Monster Problems


From Kotovsky, K., Hayes, J. R., and Simon, H. A. Why are some problems hard? Evidence from Tower of Hanoi. Cognitive Psychology, 1985, 17, 248-294.

Monster Move Problem

Three five-handed extra-terrestrial monsters were holding three crystal globes. Because of the quantum mechanical peculiarities of their neighborhood, both monsters and globes come in exactly three sizes with no others permitted: small, medium, and large. The small monster was holding the large globe, the medium-sized monster was holding the small globe, and the large monster was holding the medium-sized globe. Since this situation offended their keenly developed sense of symmetry, they proceeded to transfer globes from one monster to another so that each monster would have a globe proportionate to its own size. Monster etiquette complicated the solution of the problem since it requires that:

1. Only one globe may be transferred at a time;
2. If a monster is holding two globes, only the larger of the two may be transferred; and
3. A globe may not be transferred to a monster holding a larger globe.

By what sequence of transfers could the monsters have solved this problem? [Hint condition: Your first goal should be to take care of the small monster (i.e., to get him the right-sized globe).]

Monster Change Problem

Three five-handed extra-terrestrial monsters were holding three crystal globes. Because of the quantum mechanical peculiarities of their neighborhood, both monsters and globes come in exactly three sizes with no others permitted: small, medium, and large. The small monster was holding the large globe, the medium-sized monster was holding the small globe, and the large monster was holding the medium-sized globe. Since this situation offended their keenly developed sense of symmetry, they proceeded to shrink and expand the globes so that each monster would have a globe proportionate to its own size. Monster etiquette complicated the solution of the problem since it requires that:

1. Only one globe may be changed at a time;
2. If two globes have the same size, only the globe held by the larger monster may be changed; and
3. A globe may not be changed to the same size as a globe of a larger monster.

By what sequence of changes could the monsters have solved this problem? [Hint condition: Your first goal should be to take care of the small monster (i.e., to get his globe to the right size).]

Acrobat Problem

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH_Iso.html (1 of 2)11/11/2005 1:38:02 PM


Three circus acrobats developed an amazing routine in which they jumped to and from each other's shoulders to form human towers. The routine was quite spectacular because it was performed atop three very tall flagpoles. It was made even more impressive because the acrobats were very different in size: the large acrobat weighed 700 pounds; the medium acrobat, 200 pounds; and the small acrobat, a mere 40 pounds. These differences forced them to follow these safety rules:

1. Only one acrobat may jump at a time.
2. Whenever two acrobats are on the same flagpole, one must be standing on the shoulders of the other.
3. An acrobat may not jump if someone is standing on his shoulders.
4. A bigger acrobat may not stand on the shoulders of a smaller acrobat.

At the beginning of their act the medium acrobat was on the left, the large acrobat in the middle, and the small acrobat on the right. At the end of the act, they were arranged small, medium, and large from left to right. How did they manage to do this while obeying the safety rules?

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH_Iso.html (2 of 2)11/11/2005 1:38:02 PM

Monster Related Tower of Hanoi 3-Disk Spaces

Monster Related Tower of Hanoi 3-Disk Spaces

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH_MSpaces.html (1 of 2)11/11/2005 1:38:16 PM


Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/TOH_MSpaces.html (2 of 2)11/11/2005 1:38:16 PM

Terms, Concepts, Questions - Problem Solving

Some Terms, Concepts and Questions

Problem Understanding / Problem Solving
- identifying relevant features
- identifying appropriate representation
- Role of Problem Representation
- Role of Search Method

Problem Isomorph
- Recognizing / Establishing an Isomorph
- Organized Factual Knowledge
- Domain Specific (tailored) Representations
- Knowledge of Domain Relevant Decompositions
- Ability to Critique Solutions
- Choice of Base
- Evaluating the Map

Complexity of finding isomorphs

Expertise

Analogy
- Use in creative tasks
- Use as a model - advantages/pitfalls
- Difficulties in their discovery, use and verification

Problem Isomorphs / Parametric Problem Transfer

Problem Solving and Planning


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/ProblemSolving_Planning/ProblemSolving_C.html11/11/2005 1:38:23 PM

Home of 830:472 AI and Psychology

Home Course Materials Page


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/home472.html11/11/2005 1:47:52 PM

TOC

Go to Course Syllabus

Table of Contents
Part I. Historical Perspective and Basic Approaches to the Study of Thinking
- Introduction
- Associationism and Behaviorism
- Productive Thinking...the Gestalt Emphasis
- Experimental Decomposition of Thinking
- Computational Approach to the Study of Thinking
- Cognitive Development and Learnability

Part II. Aspects of Thinking/Cognition


- Deduction
- Induction, Concepts, and Reasoning under Uncertainty
- Understanding, Interpreting and Remembering Events
- Problem Solving and Planning

Home Course Materials Page


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/top305.html11/11/2005 1:48:03 PM

Topics

Table of Contents
I.1. Introduction to Computation & Cognition
I.2. Formal Models of Computation
I.3. Artificial Intelligence and the Design of Intelligent Systems
I.4. Human Cognition as Computation
II.1. Problem-Solving and Learning
II.2. Planning
II.3. Knowledge Representation, Commonsense Knowledge, and Inference

Home 830:472
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/topics.html11/11/2005 1:48:33 PM

I.1. Introduction to Computation & Cognition

I.1. Introduction to Computation & Cognition Overview Page Web Readings


- Timeline of Events Related to Computing (PDF Document)
- Chess, Deep Blue, Kasparov and Intelligence
  - Mean Chess-Playing Computer Tears at Meaning of Thought, by Bruce Weber, February 19, 1996
  - Conventional Wisdom Says Machines Cannot Think, by George Johnson, May 9, 1997
  - IBM Chess Machine Beats Humanity's Champ, by Bruce Weber, May 12, 1997
  - Computer in the News: Kasparov's Inscrutable Conqueror, by Robert D. McFadden, May 12, 1997
- Minsky, M. L. Why People Think Computers Can't. AI Magazine, 1982, 4, 3-15
- Some History
- Depiction of the Space of Possible Theories
- Levels Hypothesis
  - Tic Tac Toe Example
  - Tic Tac Toe and Homomorphisms
  - Tic Tac Toe and Associated Combinatory Spaces
  - A Tinkertoy computer that plays tic-tac-toe by A. K. Dewdney

Text Readings
- Chapter 1. What is Artificial Intelligence?
- Skim Appendix

Assignments
- Problem and use of Logic
- Levels Essay Question

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/Intro/IntroToc.html11/11/2005 1:48:53 PM

I.2. Formal Models of Computation

I.2. Formal Models of Computation Overview Page Web Readings


- Informal Properties of an Algorithm
- Formal Definitions of Finite State Machines
- Three Ways of Viewing a Finite State Machine
  - Finite State Machine Assignment
- Introductory Chapter on Machines and Computation
- Turing on Turing Machines
- Turing Machine Examples
  - Some "Apple" Acceptors
  - AnBn Turing Machine
  - AnBnCn Turing Machine
    - Turing Machine Assignment
  - Encoding of Turing Machines and the Universal Turing Machine
- Hierarchy of Grammars Defined
  - Some Examples of the Different Types of Grammars
  - Derivation Example
- Formal Theory Essay Question
- More Advanced Material
  - Learning Issues in the Context of FSM
    - "Learning" and "FSM"
    - Finding the Minimal FSM
  - FSM and Regular Expressions
  - Computational Complexity

Text Readings

Assignments
- Finite State Machine Assignment
- Turing Machine Assignment
- Formal Theory Essay Question

Table of Contents

http://www.rci.rutgers.edu/~cfs/472_html/TM/FormalModelsToc.html (1 of 2)11/11/2005 1:48:58 PM


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/TM/FormalModelsToc.html (2 of 2)11/11/2005 1:48:58 PM

I.3. Artificial Intelligence and the Design of Intelligent Systems

I.3. Artificial Intelligence and the Design of Intelligent Systems Overview Page Web Readings
- Artificial Intelligence and Search
- The Physical Symbol System Hypothesis
  - Newell and Simon 1975 Turing Award Lecture
- Heuristic Search
- Exhaustive Search Methods
  - Search Animations
  - Graph Search - An Example
  - State Space Formulation - An Example
    - Search Assignment
  - Problem Reduction Search

Text Readings
- Review Appendix (Introduction to Logic)
- Chapter 2. Search and Planning.

Assignments
- Search Assignment

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/AI_SEARCH/AI_SearchToc.html11/11/2005 1:49:01 PM

I.4. Human Cognition as Computation

I.4. Human Cognition as Computation Overview Page Web Readings


- Possible Constraints that Characterize the Mind
- Some quotes illustrating David Marr's point of view
- Overview of Production Rule Systems
  - Information Processing System Assumptions
  - Production Rule Constraints and An Example
  - Blocks Problem Example of Recognition-Act Architecture
    - Production Rule Assignment
  - Soar
  - Protocol Analysis

Assignments
- Production Rule Assignment

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/CogArch/CogArchToc.html11/11/2005 1:49:04 PM

II.1. Problem-Solving and Learning

II.1. Problem-Solving and Learning Overview Page Web Readings


- Introduction
  - Introduction to the Learning Problem
  - The Concept Identification Experiment and Assignment
    - Some Possible Hypothesis after a Single Positive Example
    - Structuring the Hypothesis Space and Hypotheses Revision
    - An Example of acquiring a Decision Tree for the Example Concept
- Winston and Structural Learning
- Partially Ordered Concept Spaces
  - Version Spaces
  - Lattices
    - Subset Example
    - Axioms
- Tower of Hanoi Research
- Learning Baseball Concepts

Text Readings
- Chapter 9. Induction

Assignments
- The Concept Identification Experiment and Assignment

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/Learn/LearningToc.html11/11/2005 1:49:07 PM

II.2. Planning

II.2. Planning Overview Page Web Readings


- Introduction to Planning
- GPS - Means-End Analysis
- STRIPS - Linear Planning
  - Introduction
  - Example
  - Triangle Table
    - Assignment Tower of Hanoi Example
- The Method of Goal Regression *
- Non-Linear Planning: Partial Provisional Planning
- Comparison of Linear and NonLinear Planning Algorithms *
- Different Control Regimes for Problem Reduction *
  - On the Use of Problem Reduction Search for Automated Music Composition (PDF file)
- Reactive Planning
  - Reactive Planning using a "Situation Space" (PDF file)
- Plan Recognition
  - The Plan Recognition Problem (PDF file)

Text Readings
- Chapter 2. Search and Planning. (Review)
- Chapter 3. Logic and Inference.

Assignments
- Assignment Tower of Hanoi Example

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/Planning/PlanningToc.html11/11/2005 1:49:10 PM

II.3. Knowledge Representation, Commonsense Knowledge, and Inference

II.3. Knowledge Representation, Commonsense Knowledge, and Inference Web Readings


- Logic, Knowledge, and Computation
- Example Proof in Syntax and Semantics
- Inconsistency and Proof
- First Order Logic (FOL)
- FOL and Resolution Theorem Proving Example
- Examples of Knowledge that is hard to state in FOL
- Defeasible Inference
- Default Logic
- Frame Systems
- Reference
  - Truth Tables
  - Some Logical Identities
  - Some Logical Implications
  - Formal Systems Definitions

Text Readings
- Review Chapter 3. Logic and Inference
- Skim Chapter 4. Closed World Assumptions
- Chapter 5. Defeasible Inference
- Chapter 6. Reason Maintenance
- Chapter 7. Memory Organization
- Chapter 8. Probabilistic Inference

Table of Contents
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/472_html/Logic_KR/KnowledgeRepToc.html11/11/2005 1:49:13 PM

Müller-Lyer Illusion

Müller-Lyer Illusion

The center connecting line is seen as shorter in the top figure than in the lower figure. The figure below shows the line as it appears in both figures.

The figure below shows both figures superimposed on one another in order to demonstrate in yet another way that the center line is of equal length in both figures.

Additionally, the next figure shows all of the components of the two figures separately. Close your eyes and see if you can mentally put the components together and experience the illusion. I suspect that you can not. What might this tell us about the nature of the illusion ?

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/ML_illusion.html (1 of 2)11/11/2005 1:50:47 PM


Finally, below is an animation of the two figures. You probably experience the illusion even in this animated version. The animation involves superimposing the figures as we did above, but in this case the two figures aren't seen simultaneously.

Back to Illusions and Ambiguous Figures


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/ML_illusion.html (2 of 2)11/11/2005 1:50:47 PM

Hering-Helmholtz Illusion

Hering-Helmholtz Illusion

In the top figure you probably perceive the middle lines as bowing in slightly and in the next figure as bowing out slightly. The parallel lines below show the lines that actually appear in each figure.

Both figures are superimposed below leading to a rather interesting effect.

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Her_Illusion.html (1 of 2)11/11/2005 1:51:05 PM


Back to Illusions and Ambiguous Figures


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Her_Illusion.html (2 of 2)11/11/2005 1:51:05 PM

Ebbinghaus Illusion

Ebbinghaus Illusion

You probably perceive the middle circle in the figure on the left as smaller than the circle in the center of the second figure. They are actually the same size.

Back to Illusions and Ambiguous Figures
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/EbbIllusion.html11/11/2005 1:51:10 PM

Rubin Vase

Rubin's Vase

This ambiguous figure demonstrates our ability to shift between figure and ground, which provides the basis for the two interpretations of the figure.

Back to Illusions and Ambiguous Figures
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Vase.html11/11/2005 1:51:18 PM

Young Woman or Old Woman

Young Woman or Old Woman

In this famous ambiguous figure it is possible to see either a young woman or an old woman. It is a drawing, and if you examine it in detail it will probably be rather hard to decide what all of the different components represent in each of the interpretations. Nose, hat, feather, ear, etc. are identifiable...but your mind seems to be imposing these interpretations on the drawing rather than being compelled by the "perceptual evidence."

Back to Illusions and Ambiguous Figures
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/Woman.html11/11/2005 1:51:25 PM

High Contrast Photo Scene

The high contrast of this photograph may make it difficult to interpret.

Back to Illusions and Ambiguous Figures
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/HiContrast.html11/11/2005 1:51:33 PM

Triangle Completion

Triangle Completion - Seeing what is not there!

The two figures shown above rather strikingly illustrate the mind's willingness to see an equilateral triangle despite the fact that no border information about the center triangle is in the picture.

The figure on the left retains only the slightest trace of the vertices of the center triangle but you can probably still "see" the center triangle.

In the figure on the right, we have retained the outlines of the vertices of each equilateral triangle, but introduced some space where these vertices intersect each other. You may still "see" the triangles in this case, but the effect is not as striking.

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/TriComplete.html (1 of 3)11/11/2005 1:51:52 PM


Here the information about the join of the vertices has been destroyed, lessening the effect even more.

In the final figure on the right all of the lines have been joined, allowing one to parse the picture as a star. When this is done the two triangles are not perceived. However, you may find that you can still see one or the other.

Back to Illusions and Ambiguous Figures

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/TriComplete.html (2 of 3)11/11/2005 1:51:52 PM


Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/TriComplete.html (3 of 3)11/11/2005 1:51:52 PM

Matchstick Problems - 4 Squares

Arrange the Matches into Four Squares


The problem below is discussed in your text. The configuration of matches shown on the left of the figure, under the label 'Start', represents the way the matches are arranged at the start of the problem. The goal is to rearrange the matches so that they form four squares instead of five, with the constraint that only 3 matches may be moved to achieve the solution. If you have not already solved the problem, do so now. If you have solved the problem, see if you can find some additional solutions.

Matchstick Problems
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/MatchStickProbs/FourSquares.html11/11/2005 1:55:38 PM

Matchstick Problems - 4 Squares Solutions

Reasoning to a Solution
When we solve a problem we typically must not only find a solution but first form an understanding of the problem. This can involve deriving inferences from the problem statement as well as determining a way to represent or think about the problem. The figure below depicts the matchstick problem considered in the text as well as some of the possible reasoning that some mind might have moved through in solving the problem. The inferences enclosed in the black box were ideas that turned out to be irrelevant to the problem solution. Read through this material and see if you can identify the relevance of the other ideas to representing and solving the problem. Also check whether you had recognized all of the solutions shown here...if not, can you explain why you didn't find them as well?

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/MatchStickProbs/FourSquares_inf.html (1 of 2)11/11/2005 1:56:00 PM


Animations of two of the above solutions are shown; the subscript refers to the matchstick that was moved.

To the right is a solution that moves only two matches. This solution was submitted by a problem solver in Canada.

Assignment

The example above of the reasoning involved in solving the problem represents my (somewhat contrived) reconstruction of the way in which I thought about the problem. See if you can solve the additional matchstick problems that have been provided and create a reconstruction of what you thought about in solving them.

Matchstick Problems
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/MatchStickProbs/FourSquares_inf.html (2 of 2)11/11/2005 1:56:00 PM

Matchstick Problems - 40 Squares Problems

40 Squares Problems
Below are two additional "Matchstick Problems" that I thought up. In the problem shown on the left, you are to construct 40 squares using as many matches as possible. Find a configuration of squares that achieves this and determine the number of matches used. Are you sure that this is the best that can be done?

In the next problem, shown on the right, you are to construct 40 squares using as few matches as possible. Find a configuration of squares that achieves this and determine the number of matches used. Are you sure that this is the best that can be done? Even though the two problems seem very similar, you probably found it a bit harder to solve this second problem. What do you think makes this one harder? Matchstick Problems
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/MatchStickProbs/Max_Min.html11/11/2005 1:56:47 PM

Matchstick Problems - Large and Small Matches

Two Squares and Ten Squares Problem


Here are some additional matchstick problems that were thought up for your edification and entertainment. How do you need to think about these problems in order to come up with a solution? What is the relation between the solution to the problem on the left and the problem on the right?

Matchstick Problems
Charles F. Schmidt

http://www.rci.rutgers.edu/~cfs/305_html/Gestalt/MatchStickProbs/LargeSmall.html11/11/2005 1:57:00 PM
