

Adapting the Human Plausible Reasoning Theory to a Graphical User Interface

Maria Virvou, Katerina Kabassi, University of Piraeus, Greece
Abstract: This paper describes the adaptation of a cognitive theory, called Human Plausible Reasoning (HPR), for the purposes of an intelligent Graphical User Interface. The GUI is called Intelligent File Manipulator (IFM) and manages files and folders in a similar way to the Windows 98/NT Explorer. However, IFM also incorporates intelligence that aims to render the interaction more human-like than in a standard explorer in terms of assistance with users' errors. IFM constantly reasons about users' actions, goals, plans and possible errors and offers automatic assistance in case of a problematic situation. Human Plausible Reasoning is used in IFM to simulate the reasoning of users in its user modeling component and the reasoning of human expert helpers when they try to provide assistance to users. The adaptation of HPR in IFM has focused on the domain representation, statement transforms and certainty parameters. IFM has been evaluated by comparing its reasoning with that of human experts in computer science who acted as advisors. The evaluation results showed that IFM could generate plausible hypotheses about users' errors and helpful advice to a satisfactory extent; hence HPR seemed to have fulfilled the purpose for which it was incorporated in IFM.

Index Terms: Cognitive Theory, Intelligent Help, Intelligent User Interfaces, User Modeling.

Software users often encounter problems while interacting with the computer. They may make mistakes in the use of the system, they may miscomprehend the system's feedback or they may overlook information provided by the system. These problems often cause them frustration since they do not achieve their goals without errors or, in some cases, they do not achieve them at all. The users' frustration may increase when they believe that they have provided enough evidence for the computer to understand what their real intentions were. However, the computer does not reason in the same way as a human observer would, and this often turns the computer into an unfriendly interactant. The computer lacks the reasoning ability required for making plausible guesses about the users' goals and beliefs, which may be either correct or incorrect. Of course, there have been many efforts within the field of Human-Computer Interaction to render user interfaces more user-friendly. For example, graphical user interfaces are certainly more user-friendly than command language interfaces. However, as McGraw [25] points out, even graphical user interfaces can prove difficult to traverse and use. On the other hand, traditional on-line help is not always sufficiently helpful. For example, Matthews et al. [23] highlight the fact that on-line manuals must explain everything and novices find them confusing, while more experienced users find it quite annoying to have to browse through a lot of irrelevant material. These problems have motivated a lot of research on intelligent help for users. In most existing intelligent help systems, intelligent help is generated based on user models that the systems construct (e.g. [4], [11], [13], [26], [36], [44], [45]). Very often, help is given after an explicit user request, as in UC [7], [24], [44], which is an intelligent help system for Unix users. However, one important problem that has been revealed by empirical studies (e.g. [40]) is that users do not always realize that they have made an error immediately after they have made it. Therefore, they may not know that they need help. This problem can be addressed by active systems that intervene when they judge that there is a problematic situation, without the user having initiated this interaction. Examples of active systems are CHORIS [36] and Office Assistant [13]. CHORIS maintains explicit user models. Explicit user models are based on information that users have provided explicitly about themselves, whereas implicit models infer information by observing and interpreting the users' behavior [34]. However, users who may not have realized that they need help may also have a problem in providing accurate information about themselves. Therefore, explicit user models may not be able to address this problem. Office Assistant, on the other hand, keeps implicit user models. However, Office Assistant mainly intends to help users by optimizing their plans, which may be correct, rather than helping them with their errors.

An interesting variant on a help system is PHelpS [1], which models workers so that it can assist one worker in identifying a peer who can assist him/her. However, PHelpS does not model the reasoning of users but their characteristics. In view of the above, we explored the utility of a cognitive theory, called Human Plausible Reasoning [9], for adding more human reasoning to the computer so that the interaction may become more human-like and user-friendly than it is currently. The main reason for selecting this theory for adaptation in a user interface was that it looked very promising as a domain-independent tool that could render the interaction more human-like, in the sense of providing spontaneous help with users' errors based on human reasoning.


Human Plausible Reasoning theory (henceforth referred to as HPR) is a domain-independent theory originally based on a corpus of people's answers to everyday questions. Starting from a question asked of a person, the theory tries to model the reasoning that this person employs in order to find a plausible answer, assuming that s/he does not have a ready answer. In this respect, the theory tries to model people's reasoning based on analogies, which is employed when they make plausible guesses about something that they do not know well. In a user interface, this kind of reasoning may be useful for a user modeling component to simulate the users' reasoning when they make plausible mistakes in their effort to conform with the interface's formalities and achieve their goals. For example, Haakma [12], in an attempt to explain the behavior of users, points out that analogical reasoning stimulates users to transfer procedural knowledge from one task to another. In this sense HPR provides a formal framework for computing such transfer of knowledge. In addition, it may be used to simulate a human observer's reasoning when s/he is watching a user work. In this case too the human observer makes plausible guesses about the user's intentions and beliefs. The domain that we selected to examine the capabilities of HPR was a graphical user interface of general use by a very wide range of users. Therefore, we developed a GUI that manages files and folders in a similar way to the Windows 98/NT Explorer [28], which is a program used by a very large portion of computer users. However, the GUI that we developed also incorporates the reasoning adapted from HPR and is called Intelligent File Manipulator (IFM). IFM monitors users while they work and constructs user models. In case IFM diagnoses a problem with respect to the user's hypothesized beliefs and intentions, it provides spontaneous advice. HPR has been previously used in another system of a different domain.
That system was called RESCUER [37], [39]. RESCUER provided automatic assistance to users of the UNIX operating system. The user interface of UNIX is a command language interface, which is different from a graphical user interface that involves mouse events. Moreover, command language interfaces are considered less user-friendly than GUIs and are probably used by a smaller number of computer users than GUIs. Therefore, exploring the utility and application of HPR in a GUI after it has been applied in a command language interface is very useful. In particular, it reveals the potential of HPR as a more general framework for the development and incorporation of intelligent human-like help into user interfaces. The remainder of this paper is organized as follows. In Section II we present and discuss related work in the reasoning of intelligent help systems. Then we present briefly the principles of HPR. In Section III we give an overview of the operation of IFM. In Section IV we describe the domain representation in IFM so that HPR may be used. In Sections V and VI we show how an inference mechanism of HPR has been adapted and used in IFM. In Sections VII and VIII we describe briefly how IFM has been evaluated, we discuss the adaptation of HPR in IFM and we give the conclusions drawn from this work.

II. RELATED WORK

A. Human Plausible Reasoning theory

Human Plausible Reasoning (HPR) is a descriptive theory of human plausible inference which categorizes plausible inferences in terms of a set of frequently recurring inference patterns and a set of transformations on those patterns [2], [3], [9]. The theory is used to formalize the plausible inferences that frequently occur in people's responses to questions for which they do not have ready answers.
In this sense the theory includes a variety of inference patterns that do not occur in formal-logic based theories or in the various non-classical logics such as fuzzy logic [46], intuitionist logic [22], or variable-precision logic [27]. Lately, a large part of the literature in Naturalistic Decision Making provides descriptive theories of human reasoning that also address the limitations of more formal reasoning paradigms. Naturalistic decision making (NDM) [16], [17] is a relatively new but rapidly growing research field, which is concerned with the examination and explanation of decision making by experts in environments that satisfy specific criteria [30]. The naturalistic decision-making model is based on extensive field work. It differs from a decision event model in that much effort is devoted to situation assessment or figuring out the nature of the problem and single options are evaluated sequentially through mental simulation of outcomes in order to find one that would be satisfactory. Similarly, HPR tries to model human reasoning. However, a main difference between HPR and NDM is that most NDM models assume that the people modeled have some level of expertise in the field, they are not necessarily expert, but definitely not novice [29]. HPR, on the other hand, can be used for modeling the reasoning of both experienced and non experienced users. Thus, HPR is suitable for simulating the users imperfect but plausible reasoning that may have led them to making plausible errors, which have created problems to them. HPR detects the relationship between a question and the knowledge retrieved from memory and drives the line (type) of inference. For example, if the question asked was whether coffee was grown in the Llanos region in Colombia, the answer would depend on the knowledge retrieved from memory. 
If the subject knew that Llanos was in a savanna region similar to that where coffee grows, this would trigger an inductive, analogical inference, and would generate the answer "yes" [5]. HPR models the reasoning of people who have a patchy knowledge of certain domains such as geography. By "patchy knowledge" we mean partial knowledge of the facts and relations in the domain. The theory assumes that a large part of human knowledge is represented in dynamic hierarchies that are always being updated, modified or expanded. Node A in any hierarchy can be a descriptor of node B in another hierarchy, that is, A can be used to characterize node B. Such a relation can be written as the term A(B). A term A(B) can take values (called referents in the theory). Applying a descriptor to an argument (a node or a sequence of nodes) produces a specific value characterizing the argument. The resulting expression is called a statement. In general, statements are recordings of information within the hierarchies. For example, if flower is in a hierarchy of things and England is in a hierarchy of places, flower-type might be a descriptor for England. This produces a statement of the form:

flower-type(England)={daffodils, roses, ...}

In the above statement, flower-type is a descriptor, England is an argument, flower-type(England) is a term, and daffodils and roses are referents for the term. The statement means: "The types of flower that grow in England are daffodils, roses etc."
TABLE I
HPR'S ELEMENTS OF EXPRESSIONS
Arguments: a1, a2, f(a1); e.g. Sam, whale, Sam's food
Descriptors: d1, d2; e.g. size, animal-type
Terms: d1(a1), d2(a2), d2(d1(a1)); e.g. animal-type(Sam), size(whale), size(animal-type(Sam))
Referents: r1, r2, {r2}; e.g. whale, large, large plus other sizes
Statements: d1(a1) = r1: γ, φ; e.g. size(whale) = large: certain, high frequency (translation: "I am certain almost all whales are large")

TABLE II
DESCRIPTION OF THE RELATIONS
Generalization: a1 GEN a2 in CX(a1, d(a1)); e.g. bird GEN chicken in CX(birds, physical features(birds))
Specialization: a1 SPEC a2 in CX(a2, d(a2)); e.g. chicken SPEC fowl in CX(fowl, food cost(fowl))
Similarity: a1 SIM a2 in CX(A, d(A)), where A represents a superordinate of a1 and a2; e.g. ducks SIM geese in CX(birds, all features(birds))
Dissimilarity: a1 DIS a2 in CX(A, d(A)), where A represents a superordinate of a1 and a2; e.g. ducks DIS geese in CX(birds, neck length(birds))
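As an illustration of the elements in Table I, the following minimal Python sketch shows one way a statement could be stored. The class name `Statement`, its field names and the string-valued certainty and frequency annotations are assumptions made for this example; they are not taken from IFM or from the HPR literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical container for an HPR statement: descriptor(argument) = referents."""
    descriptor: str
    argument: str
    referents: frozenset
    certainty: str = "unspecified"   # qualitative certainty annotation (assumed encoding)
    frequency: str = "unspecified"   # qualitative frequency annotation (assumed encoding)

    @property
    def term(self) -> str:
        # A term is a descriptor applied to an argument, e.g. flower-type(England).
        return f"{self.descriptor}({self.argument})"

    def __str__(self) -> str:
        refs = ", ".join(sorted(self.referents))
        return f"{self.term} = {{{refs}}}"


# The two examples used in the text and in Table I.
flowers_in_england = Statement("flower-type", "England",
                               frozenset({"daffodils", "roses"}))
whale_size = Statement("size", "whale", frozenset({"large"}),
                       certainty="certain", frequency="high")

print(flowers_in_england)
print(whale_size.term)
```

Keeping the referents in a set reflects the fact that a term may take several values at once, as in the flower-type(England) example.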








Fig. 1. A type of hierarchy of flowers

The core theory consists of:
1. A set of primitives.
2. A set of inference rules.

Each primitive consists of elements of expression. These elements are the arguments, descriptors, terms, referents and statements that have already been described. Moreover, a summary of them is given in Table I. The primitives can be classified into four groups:
1) Statements representing people's beliefs about the world.
2) Statements involving relations. These represent different relationships such as generalization (GEN), specialization (SPEC), similarity (SIM) and dissimilarity (DIS) among concepts in hierarchies. Table II illustrates the four relations in the core system and the kinds of statements they occur in.
3) Relational expressions, which are either mutual implications or mutual dependencies. These represent people's approximate knowledge about what depends on what, which can be specified with more or less precision.
4) Certainty parameters that act to condition these three kinds of expression and which affect the certainty of the different inferences described in the next two sections.

The set of inference rules consists of:
a. Statement transforms.
b. Transforms based on dependencies and implications.

In IFM, we have used the inference pattern of statement transforms; therefore we will give a brief description of what statement transforms are.

Statement transforms
Human knowledge about a domain is represented as a collection of statements. An example of a statement is precipitation(Egypt) = very-light, which means that the precipitation of Egypt is very light. The descriptor in this statement is precipitation, Egypt is the argument, very-light is the referent and precipitation(Egypt) is a term. The simplest class of inference patterns is called statement transforms. In general, a statement transform is the move from one statement to another. Statement transforms exploit the four possible relations among arguments and the four relations among referents to yield eight types of statement transform. These eight statement transforms allow plausible conclusions to be drawn. In particular, there are four types of statement transform that are called argument transforms and four types that are called referent transforms. The argument transforms are statement transforms which change the node being characterized, i.e. the argument. In order to achieve this they move up, down or sideways in the argument hierarchy using GEN, SPEC, and SIM or DIS respectively. The referent transforms do the same in the referent hierarchy and change the referent of a term. The main difference between argument transforms and referent transforms is that in the former the transformation is made on the argument of a term, based on the hierarchy of objects to which this argument belongs, whereas in the latter the transformation is made on the referent of a statement, based on the hierarchy of objects to which the referent belongs. Argument hierarchies are usually different from referent hierarchies. For example, from the statement flower-type(England)=roses, we can make the statement transforms illustrated in Table III, given the type hierarchy for flowers shown in Fig. 1 and a similar type hierarchy for geographic regions (not illustrated).
TABLE III
EXAMPLES OF EIGHT TYPES OF STATEMENT TRANSFORM FOR THE STATEMENT FLOWER-TYPE(ENGLAND)=ROSES
Argument transforms
GEN: flower-type(Europe) = roses
SPEC: flower-type(Surrey) = roses
SIM: flower-type(Holland) = roses
DIS: flower-type(Brazil) ≠ roses
Referent transforms
GEN: flower-type(England) = temperate flowers
SPEC: flower-type(England) = yellow roses
SIM: flower-type(England) = peonies
DIS: flower-type(England) ≠ bougainvillea
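The eight transforms of Table III can be generated mechanically once the two hierarchies are available. The sketch below is only an illustration under stated assumptions: the `Hierarchy` class, the function `statement_transforms` and the exact contents of both hierarchies (in particular the parents of Holland and peonies, and the similarity and dissimilarity links) are hypothetical and chosen to reproduce the table; `!=` stands in for the ≠ sign.

```python
from dataclasses import dataclass


@dataclass
class Hierarchy:
    """Hypothetical type hierarchy with GEN/SPEC links and SIM/DIS judgements."""
    parent: dict      # node -> its generalization
    similar: dict     # node -> nodes judged similar in some context
    dissimilar: dict  # node -> nodes judged dissimilar in some context

    def gen(self, node):
        return [self.parent[node]] if node in self.parent else []

    def spec(self, node):
        return [child for child, p in self.parent.items() if p == node]

    def sim(self, node):
        return list(self.similar.get(node, []))

    def dis(self, node):
        return list(self.dissimilar.get(node, []))


def statement_transforms(descriptor, argument, referent, arg_h, ref_h):
    """Return (kind, relation, statement) triples for all eight transform types."""
    results = []
    for rel in ("gen", "spec", "sim", "dis"):
        op = "!=" if rel == "dis" else "="   # DIS yields a negative conclusion
        for a in getattr(arg_h, rel)(argument):
            results.append(("argument", rel.upper(), f"{descriptor}({a}) {op} {referent}"))
        for r in getattr(ref_h, rel)(referent):
            results.append(("referent", rel.upper(), f"{descriptor}({argument}) {op} {r}"))
    return results


# Assumed hierarchies, filled in only as far as Table III requires.
places = Hierarchy(
    parent={"England": "Europe", "Surrey": "England", "Holland": "Europe"},
    similar={"England": ["Holland"]},
    dissimilar={"England": ["Brazil"]},
)
flowers = Hierarchy(
    parent={"roses": "temperate flowers", "yellow roses": "roses",
            "peonies": "temperate flowers"},
    similar={"roses": ["peonies"]},
    dissimilar={"roses": ["bougainvillea"]},
)

transforms = statement_transforms("flower-type", "England", "roses", places, flowers)
for kind, relation, statement in transforms:
    print(kind, relation, statement)
```

Each produced statement is only a plausible hypothesis; nothing in this sketch estimates how certain it is.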

Given the fact that roses grow in England, the GEN argument transform yields the inference that roses also grow in the whole of Europe, to which England belongs; this is a kind of induction. Similarly, for the SPEC operator it is a plausible inference that the county of Surrey in southern England grows roses. The SIM and DIS inferences are also made in some context. In the case of the transforms of arguments the context might be countries of the world with respect to the variable climate. Holland is quite similar to England with respect to climate while Brazil is quite dissimilar (SIM and DIS argument transforms). Therefore, one might plausibly infer that roses grow in Holland as well but not in Brazil. If one believes that roses grow in England, then one might also plausibly infer the following referent transforms. Given that a yellow rose is a kind of rose, which is a kind of temperate flower, one may plausibly infer that yellow roses (SPEC referent transform) and temperate flowers (GEN referent transform) also grow in England. It is also reasonable that peonies grow in England, since they are similar to roses with respect to the climates they grow in. However, bougainvillea grows in more tropical climates, so it is rather unlikely for this kind of flower to grow in England. These plausible inferences may in fact be correct or incorrect.

Certainty parameters
The core theory also introduces certainty parameters, which are approximate numbers ranging between 0 and 1. Certainty parameters affect the certainty of different plausible inferences. SIM and DIS statement transforms depend on the degree of similarity (σ), which represents the similarity of one set to another. In particular, if the degree of similarity is almost 1 there is great confidence in the transformation; otherwise, the confidence decreases. The degree of typicality (τ) represents how typical a subset is within a set (for example, the cow is a typical mammal). Dominance (δ) indicates how dominant a subset is in a set (for example, elephants are not a large percentage of mammals). Finally, the only parameter applicable to every expression is the certainty parameter (γ). This parameter indicates the degree of belief a person has that an expression is true. For example, in the formal representation of statement transforms the certainty parameter represents the degree of certainty of a person about this transform. The formal representation of the similarity statement transforms, which are quite important, is presented in Tables IV and V.

TABLE IV
SIM-BASED ARGUMENT TRANSFORMS AND AN EXAMPLE OF APPLICATION
SIM-based argument transforms
d(a) = r: γ1, φ, μa
a′ SIM a in CX(A, D(A)): σ, γ2
D(A) <--> d(A): α, γ3
a, a′ SPEC A: γ4, γ5
d(a′) = r: γ = f(γ1, φ, μa, σ, γ2, α, γ3, γ4, γ5)
Example
livestock(West Texas) = cattle: γ1 = high, φ = high, μa = high
Chaco SIM West Texas in CX(region, vegetation(region)): σ = moderate, γ2 = moderate
vegetation(region) <--> livestock(region): α = high, γ3 = high
West Texas, Chaco SPEC region: γ4 = high, γ5 = high
livestock(Chaco) = cattle: γ = moderate

TABLE V
SIM-BASED REFERENT TRANSFORMS AND AN EXAMPLE OF APPLICATION
SIM-based referent transforms
d(a) = r: γ1, φ, μr
r′ SIM r in CX(d, D(d)): σ, γ2
D(d) <--> A(d): α, γ3
a SPEC A: γ4
d(a) = r′: γ = f(γ1, φ, μr, σ, γ2, α, γ3, γ4, γ5)
Example
Sound(wolf) = howl: γ1 = high, φ = high, μr = low
Bark SIM howl in CX(sound, means of production(sound)): σ = high, γ4 = high
Sound(wolf) = bark: γ = moderate

HPR has been applied in IFM by assuming that a user asks himself/herself how to issue a correct command to achieve his/her goal. In the case of a user's error, we assume that the user did not know the correct command and tried to use his/her reasoning to infer what the command would be. Thus, statement transforms are used to show what possible plausible errors s/he may have made. Certainty parameters are used to represent the degree of confidence of the system concerning its hypotheses about the beliefs of the user. The way that HPR's representations have been used in IFM is presented in detail in Section V.

B. Reasoning in Intelligent Help Systems

In a user interface, automatic assistance may rely on many reasoning mechanisms. One such reasoning mechanism is plan recognition. The system needs to have recognized what the plans of the users are so that it may provide help in the realization or optimization of these plans. Therefore, many intelligent help systems focus their research on plan recognition. AQUA [31], [32], for example, is a help system that conducts dialogues with UNIX users and tries to help them when they face problems in their interaction. In AQUA, the user explicitly states what his/her situation was when the problem arose. The system tries to identify and highlight his/her planning misconceptions. The system's knowledge base consists of sets of planning relations, such as "action A causes state S" or "action A is a normal plan for achieving state S". Planning relations are associated with planning failures, which are regarded as potential explanations. A similar approach that detects planning misconceptions in the domain of filling in a tax form is that of Calistri-Yeh [4]. In this model there are ten classes of plan-based misconceptions. Examples of these classes are "violated precondition", where the user knows about a precondition but does not know that it has been violated, and "missing precondition", where the user does not even know that the precondition exists. However, the above approaches assume that the user has explicitly stated his/her goal and has realized that s/he has a problem. By contrast, IFM infers users' goals and errors from the actions of the user. Another difference is that IFM uses HPR transforms rather than causal logic to provide a classification of misconceptions. HPR is a cognitive theory and can provide explanations for the cognitive reasoning of a user. Indeed, Calistri-Yeh [4] points out that some limitations of his approach could be addressed by a stronger cognitive and psychological theory to support the selection and weighting of misconception features. An approach similar to that of IFM is presented by Eller and Carberry [11].
Their system works in the domain of naturally occurring dialogues rather than user interaction with a GUI like IFM. They have used meta-rules for hypothesizing the cause of dialogue ill-formedness and for relaxing the plan inference process. This approach of relaxing the semantic interpretation of a users utterance in the context of a dialogue is very close to IFMs approach of transforming a users action in the context of the users interaction with a GUI. In the context of the dialogue, relaxing the semantic interpretation of an utterance means removing some of the constraints on the interpretation and allowing it to be interpreted less precisely than it was originally perceived. This weakening is carried out when the system has difficulty in assimilating the users plans and goals. In IFMs case, there is also a kind of relaxation of a users action that allows it to be interpreted in a less formal way than it was compiled by the GUI language. In IFM such relaxation of a users action is achieved by the HPR statement transforms rather than by meta-rules. The advantage of HPR is that it provides a relatively domain independent method of relaxation. Moreover, HPR is aimed to simulate human reasoning, thus it can be used to simulate both the reasoning of a user and a human expert that acts as an advisor to a user. In this respect, it provides a unifying framework for all kinds of human reasoning required by an intelligent help system. On top of plan recognition, several AI methods and approaches have also been used in order to improve the reasoning of systems in user modeling. Case-Based Reasoning [19], [20] has been used to present to a user a set of solved problems (cases) that are similar to the problem s/he is trying to solve. The user in such systems is expected to learn these cases before solving a new problem e.g. [35]. A quite different approach is adopted by systems using Bayesian Networks (e.g. [21]). 
More specifically, the use of Belief Bayesian Networks entails the development of a model of how users actually reason. This model is used to identify the users misconception and provide adaptive instruction [10]. The above mentioned techniques base their adaptivity on making hypotheses about the users reasoning and can be quite effective in providing intelligent assistance. However, none of these methods provides a generalized framework suitable for directly modeling both users and observers who act as advisors like HPR. Indeed, HPR models the way humans reason in order to make plausible inferences and proposes the criteria that they use in order to select the best one. In IFM, we use this reasoning to simulate how users may have drawn an incorrect but plausible inference about a piece of the domain (e.g. a command), which they do not know well. Furthermore, we use this reasoning to model human experts who act as observers while evaluating the actions of a user. HPR in combination with a decision making method can simulate the reasoning of human experts in their effort to provide spontaneous advice to users if needed. This reasoning involves making hypotheses about what the users really think in the evidence of users actions and making a decision about whether to intervene and how. However, one problem that occurred with HPR was that the theory did not have a formal way for calculating the weights of the criteria that it proposes to be taken into account. A solution to this problem may be given by Multi Attribute Decision Making (MADM). MADM involves making preference decisions (such as evaluation, prioritization, selection) over the available alternatives that are characterized by multiple, usually conflicting criteria. There are many MADM methods. One such method calculates weighted scores for each option and is called SAW (Simple Additive Weighting) [14]. The SAW method is one of the simplest but nevertheless good decision making methods. 
A method like this can be used in a complementary way with HPR. This is because SAW provides a formal way for connecting criteria in order to make a decision. However, it does not specify any criteria. On the other hand, HPR specifies criteria to be taken into account in drawing plausible inferences but does not specify a formal way for calculating the weights of these criteria. Therefore, the SAW method or another similar method can be used in the context of HPR to solve this problem. Indeed, we have used SAW in the context of HPR.
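To make the complementary roles concrete, the following minimal sketch shows the weighted-sum computation at the heart of SAW. It is an illustration only: the function names, the criteria ("similarity", "typicality"), the weights and the scores are all hypothetical and are not the values used in IFM. The sketch also assumes that every criterion value has already been normalized to the range 0 to 1 and that higher is better.

```python
def saw_score(values, weights):
    """Weighted sum of criterion values, divided by the total weight."""
    if set(values) != set(weights):
        raise ValueError("every criterion needs both a value and a weight")
    total_weight = sum(weights.values())
    return sum(weights[c] * values[c] for c in weights) / total_weight


def rank_alternatives(alternatives, weights):
    """Return (name, score) pairs sorted from best to worst."""
    scored = {name: saw_score(values, weights) for name, values in alternatives.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and normalized criterion values for two alternatives.
weights = {"similarity": 0.6, "typicality": 0.4}
alternatives = {
    "alternative A": {"similarity": 0.9, "typicality": 0.5},
    "alternative B": {"similarity": 0.4, "typicality": 0.8},
}

ranking = rank_alternatives(alternatives, weights)
print(ranking)
```

In this division of labor, HPR would supply which criteria appear in `weights`, while SAW supplies only the rule for combining them into a single score.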

III. THE CONTEXT OF APPLICATION OF HPR IN IFM

In this section we give a brief description of IFM. Here, we focus on a description of the overall operation of IFM and its overall algorithmic approach. The issues relating to the application of HPR will be described in detail in subsequent sections. Intelligent File Manipulator (IFM) is a graphical user interface for file manipulation, similar to Windows 98 Explorer [28], that provides intelligent help to its users. IFM monitors users' actions and reasons about them. In case it diagnoses a problematic situation, it provides spontaneous advice. When IFM generates advice, it suggests to the user a command other than the one issued, which was problematic. In this respect, IFM tries to find out what the error of the user has been and what his/her real intention was. A very simple example of a typical interaction of a user with IFM is the following: The user's initial file store state is illustrated in Fig. 2. The user deletes the files letterJohn.txt and letterM.txt, which are in the folder A:\documents\social1\, and intends to delete the folder as well. However, s/he accidentally attempts to delete A:\documents\social\, which is very similar to A:\documents\social1\ because they have similar names and they are neighboring folders in the graphical representation. In this case, if the user executes the erroneous action s/he runs the risk of losing the content of the folder A:\documents\social\. IFM suggests that the user delete the folder A:\documents\social1\ instead of A:\documents\social\ for two reasons:
1. The action del(A:\documents\social1\) is very similar to the action del(A:\documents\social\); therefore there may have been a confusion of the two actions.
2. A:\documents\social1\ has become empty; therefore, its existence has become pointless unless it acquires some new content.
On the other hand, if the command issued by the user were executed, it would result in the destruction of the existing files in folder A:\documents\social\, which is not empty. If the user has been observed to be prone to accidental slips, this would be a third reason for the system to suggest the alternative action. Indeed, the user in this particular example followed IFM's advice. In the case of a standard Explorer, the user would probably have completed his/her plan and would therefore have lost all files stored in the folder A:\documents\social\.

Fig. 2. The user's initial file store state

As can be seen from the example, one important task of IFM's reasoning is error diagnosis. As Cerri and Loia [6] point out, if a system performs error diagnosis, a user modeling component should be incorporated into its architecture. IFM's user modeling component maintains an implicit user model [33], [34]; every time the user issues a command, it generates hypotheses about the user's plans and possible errors or misconceptions. IFM evaluates the user's actions in terms of their relevance to his/her hypothesized goals. As a result, every user action is categorized in one of four categories:
Expected: In this case the action is expected by the system in terms of the user's hypothesized goals.
Neutral: In this case the action is neither expected nor contradictory to the user's hypothesized goals.
Suspect: In this case the action contradicts the system's hypotheses about the user's goals.
Erroneous: In this case the action is wrong with respect to the user interface formalities and would normally produce an error message. One important assumption about users is that they do not intend to produce an error message; therefore actions like this are considered unintended.

The categorization of user actions in the above categories is based on what we call "instabilities". Instabilities are added to and/or deleted from a list as a result of user actions. For example, the creation of an empty folder adds an instability to the file store because the system would expect a subsequent user action by which the folder would acquire some content or be deleted. Instabilities are deleted when an action results in a file store state that should not contain them. For example, an instability associated with the existence of an empty folder is deleted if this folder is removed or if it acquires some content. In this sense the deletion of an instability represents the continuation of a user plan that started earlier. An action is considered expected if it deletes at least one of the existing instabilities of the file store state. It is considered neutral if it neither adds nor deletes instabilities, and suspect if it only adds instabilities although there are already other instabilities that have not been deleted. However, Intelligent File Manipulator uses the categorization of user actions only as a way of acquiring some idea about which actions may need more attention. By no means does it intervene based only on the categorization of commands. A summary of the basic steps of the algorithmic approach of IFM is given below. The basic steps are exemplified by an analysis of the reasoning in the example that was presented at the beginning of this Section.
a. The user issues an action. In the example, the user issued the action del(A:\documents\social\).
b. The system reasons about the action so as to categorize it in one of the four categories.

c. If the action is categorized as expected or neutral, it is executed. The action of the example is not categorized as expected or neutral and is therefore not executed.
d. If the action is categorized as suspect or erroneous, it is transformed based on an adaptation of HPR. The given action is transformed so that similar alternatives can be found which would not be suspect or erroneous. This step is related to the way HPR is applied in the system and is explained at length in the subsequent sections of this paper. The action del(A:\documents\social\) that the user issued in the example is considered by IFM as suspect because it adds an instability to the file store without deleting any. Such an action starts a new plan while there are others pending and is therefore considered to contradict the user's goals and plans. So, IFM applied HPR in order to transform the given action and find similar alternatives. Indeed, the system generated the action del(A:\documents\social1\), which is very similar to the user's initial action.
e. The system reasons about every alternative action so that it can categorize it in one of the four categories, in a similar way as in step b. In the example, IFM reasons about the alternative action del(A:\documents\social1\).
f. If an alternative action is categorized as neutral or expected, it is suggested to the user. Expected actions have priority over neutral ones. The alternative action generated by IFM in the example is found to be expected, as it would delete one instability. Indeed, as the user had previously deleted the contents of the particular folder, the system assumes that it is among the user's goals to delete the folder itself.
g. For each alternative action that is neutral or expected, IFM calculates a degree of certainty, which represents how certain the system is that the user really intended the particular alternative action. For the calculation of the degree of certainty, IFM uses the certainty parameters introduced in HPR. If more than one alternative action has been generated, IFM selects the one with the highest degree of certainty. In the example, the system found only one alternative that was neutral or expected and proposed it to the user.
h. If an alternative action is categorized as suspect or erroneous, it is ignored and is not suggested to the user.
i. If no better alternative can be found, the user's action is executed without the user realizing that the system was alerted.
After the execution of the action, instabilities are deleted or added accordingly. The user of the example followed IFM's advice and, therefore, the system deleted the instability connected with the empty folder A:\documents\social1\ but added one instability for the folder with only one child, A:\documents\social\.

IV. DOMAIN REPRESENTATION IN IFM

The domain representation in IFM concerns knowledge about commands and the file store state. Concepts concerning the GUI are classified in isa hierarchies in order to be compatible with the main underlying assumptions of HPR. In this paper we need to refer to many concepts concerning GUI commands; therefore we will use the following terminology:
Commands will generally mean the actual keywords that refer to their meaning (e.g. copy).
Selections will mean the objects selected (e.g. book.txt).
User actions will mean the complete actions of users. However, we will use a brief textual notation of the form command(object) (e.g. copy(document1.txt)) to mean a sequence of actions involving the mouse and/or the keyboard, such as:
1. select document1.txt
2. press the buttons CTRL-C

An important hierarchy is that of users' actions (Fig. 3). The hierarchy represents the semantic and/or syntactic structure of actions. Moreover, it is constructed in such a way that every descendant node of a parent node inherits all the properties of the parent node.
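Returning to the categorization of Section III, the rules for expected, neutral, suspect and erroneous actions can be illustrated with a minimal Python sketch. It is not IFM's implementation: the function name, the representation of instabilities as sets of strings and the instability labels are all hypothetical. The sketch also assumes that an action which only adds instabilities while none are pending is treated as neutral, a case that the description above leaves open.

```python
from enum import Enum


class Category(Enum):
    EXPECTED = "expected"
    NEUTRAL = "neutral"
    SUSPECT = "suspect"
    ERRONEOUS = "erroneous"


def categorize(pending, added, deleted, acceptable=True):
    """Categorize one user action.

    pending    -- instabilities present before the action (set of labels)
    added      -- instabilities the action would add
    deleted    -- instabilities the action would delete
    acceptable -- False if the interface would reject the action
    """
    if not acceptable:
        # Wrong with respect to the interface formalities.
        return Category.ERRONEOUS
    if deleted & pending:
        # Deletes at least one existing instability: continues an earlier plan.
        return Category.EXPECTED
    if not added:
        # Neither adds nor deletes instabilities.
        return Category.NEUTRAL
    if pending:
        # Only adds instabilities while others are still pending.
        return Category.SUSPECT
    # ASSUMPTION (not stated in the text): starting a first plan is neutral.
    return Category.NEUTRAL


# Hypothetical labels for the example: social1 has already been emptied.
pending = {"empty-folder:A:\\documents\\social1\\"}

issued = categorize(pending, added={"hypothetical-new-instability"}, deleted=set())
alternative = categorize(
    pending,
    added={"single-child:A:\\documents\\social\\"},
    deleted={"empty-folder:A:\\documents\\social1\\"},
)
print(issued.value, alternative.value)
```

Under these assumptions the issued action is categorized as suspect and the alternative as expected, which matches the outcome described for the example in steps d and f.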
In this hierarchy, actions are first classified into six categories depending on their purpose:
a) Selector. This category contains only the action select(T), which corresponds to clicking on an object in order to select it.
b) Clipboard actions. All actions that use the clipboard at an intermediate stage are called clipboard actions. An example is the command copy, which may be issued by the user in three different ways: 1. selection of copy from a menu; 2. selection of the icon assigned to copy from the toolbar; 3. a combination of keys (Ctrl + C).
c) Information providers. All the actions that may be used for providing information to the user. For example, open shows the contents of files or folders and explore shows the contents of folders.
d) Creators. All the actions that create a new object are considered creators. For example, the command mktxt creates a new text document in the file store.
e) Destroyers. All the actions that destroy an object are considered destroyers. For example, the command delDir deletes a directory from the file store.
f) Modifiers. These actions modify the properties of an object. For example, the action Rename(T) changes the name of the object T, where T can be either a file or a folder.

The third level of the hierarchy in Fig. 3 represents the actual GUI actions that correspond to the parent nodes specified. The actions are distinguished by their names and arguments. For example, explore(T) opens the folder T. Some of the actions of the third level of the hierarchy can be analyzed further. For example, the command mkfile may be analyzed in the fourth level of the hierarchy, which specifies what kinds of file may be created, such as text files, Word documents etc. These commands can be found in menus; if the user selects File, then New, s/he is presented with options such as Text Document, wav file, Bitmap image etc.

Fig. 3. A part of the hierarchy of users' actions

Commands can also be divided into two main categories with respect to their syntactic structure. The first category is called with-argument and consists of the commands that take at least one argument. This means that the user must have selected at least one object before executing a command belonging to this category. Examples include the delete, cut and copy commands, because the user must have selected at least one item before executing them. Similarly, all the commands belonging to the categories Selector, Information Provider, Destroyer and Modifier are with-argument commands. In addition, the commands belonging to the category Clipboard are with-argument commands: cut and copy take the argument selected_source and paste takes the argument selected_target. The second category is called without-argument and consists of commands that do not take any argument. This means that the user does not have to select any object before executing a command belonging to this category. For example, the commands mkdir and mkfile belong to this category because the user does not have to select any object before executing them. Similarly, all the commands belonging to the category Creator are without-argument commands. However, this information is not illustrated in Fig. 3, which presents only a part of the hierarchy of user actions because of shortage of space.

V. STATEMENT TRANSFORMS IN IFM

A. The Basic Principle

HPR attempts to formalize the reasoning that people use to answer a question for which they do not have a ready answer. Therefore, the application of HPR in the system required the existence of questions asked to users. In IFM, the system makes the assumption that users ask themselves questions in their effort to choose the right command and the objects that the command refers to. Every time the user issues a command, IFM assumes that the user has asked himself/herself the following questions:
What is the syntactic structure of the action that I should issue?
Is the execution of the action acceptable to Windows?
The above questions form what we call the basic principle. The basic principle represents the assumption that the user has issued an action after having reasoned about it and believes (or hopes) that the action is correct. The HPR statements that correspond to the above questions are the following:
internal-pattern(action)=selected-pattern
Windows-acceptable(selected-pattern)=yes

The user's beliefs about the syntactic structure of the action are represented in the first HPR statement of the basic principle. This statement makes use of the categorization of actions as explained in Section IV and illustrated in Fig. 3. For example, if the user had selected the item document1.txt before executing the command delete, then the first statement would have been internal-pattern(delete document1.txt) = delete selected_item. The connection between the first and the second statement is made by the selected-pattern, which refers to the syntactic structure of the command executed. The second statement is formed by taking the referent of the first statement as its argument. In the case of the delete example, the second statement would be: Windows-acceptable(delete selected_item) = yes. This means that the user believes that the action s/he has issued is acceptable to Windows.

B. Misconceptions

Users' misconceptions may vary a lot, from deep conceptual confusions to accidental slips. However, in IFM misconceptions of all kinds are treated as gaps in the user's knowledge. IFM supposes that the user applied the theory's transforms of similarity, generalization and specialization to what was known to him/her in order to fill these gaps in his/her knowledge. Four cases of transform to the two statements of the basic principle are identified. In each case we explain where the misconception may have occurred and how deep it may have been. The depth of the misconception is related to the part of the two statements of the basic principle which is transformed. The four cases are the following:

1. Intention for another action. This case is characterized by an argument transform in the first statement. The transform results in the replacement of the action issued by another action, which means that a similar but different action was meant to be issued. For example, the user may have issued copy(file1) instead of copy(file2).
In this case the first statement of the basic principle would be:

Such actions are to be replaced by another action, similar to the one issued. The misconception that the user had in this case is considered to be superficial, involving accidental slips. By accidental slips we mean that the user tangles up neighboring objects or commands in the graphical representation, or objects with similar names, for example Doc and Docs. The system generates alternative actions by searching for alternative commands or alternative objects.

2. Intention for a Windows-acceptable internal pattern. This case is characterized by a referent transform in the first statement. It is assumed that the user must have intended a pattern similar to the one typed. Hence, the first statement would be:
internal-pattern(action)=transformed-selected-pattern
In this case, the action intended was the action issued, but the user thought that a different internal pattern corresponded to this action. However, the pattern that the user had in mind was Windows-acceptable. The misconception involved here has to do with selections of objects. The user probably thought that an object was selected, but the object had been accidentally unselected, and that is how the misconception occurred.

3. The action issued and its internal pattern were intended but there was a misconception. This case is characterized by an argument transform in the second statement. This means that the action intended was the action issued and the internal pattern intended was the selected pattern, which the user falsely concluded to be Windows-acceptable. In these cases, the user might have confused the syntax and semantics of two commands. The statements of the basic principle are the following:
internal-pattern(action)=selected-pattern
Windows-acceptable(transformed-selected-pattern)=yes

In this case the system creates alternative commands or alternative objects to be proposed to the user. In terms of the misconception involved, this case reveals a misconception concerning the syntax/semantics of the command. In this case, the misconception is deeper than in the previous ones. For example, the user issues the command copy without having selected any objects because s/he thinks that the copy command is a without-argument command. In this example, the system would suppose that s/he had probably intended the paste command, which is a without-argument command and quite similar to the one issued.

4. The action issued and its internal pattern were intended without certainty that they were Windows-acceptable or not. This case has to do with a referent transform in the second statement, i.e. the value yes or no as to whether the issued pattern was Windows-acceptable or not. The answer yes is not similar to no; therefore, the explanation that one can give is that the user was doubtful and decided to have a go and let Windows complain if there was an error. The statements of the basic principle are the following:
internal-pattern(action)=selected-pattern
Windows-acceptable(selected-pattern)=transformed-referent
This case reveals a problem with the syntax and/or semantics of the command, which is potentially a deep misconception. For example, a user issues a delete command having selected the hard disk C:\ in the left side of the Explorer in a previous action. The user was not sure whether s/he had to select at least one item before executing the command delete and executed the action, which was not Windows-acceptable.

From the above examples of the use of statement transforms in generating hypotheses about possible misconceptions, it is clear that transforms taking place in the second statement imply deeper misconceptions of the user than those generated through transforms in the first statement.

VI. ADAPTING HPR'S CERTAINTY PARAMETERS IN IFM USING AN EMPIRICAL STUDY

A possible problem with the generation of alternative actions is the production of many alternatives. A solution to this problem is to order the alternative actions so that the ones most likely to have been intended by the user come first. The certainty parameters of HPR provide a good tool for ordering the alternatives. Certainty parameters are used in IFM in order to calculate a degree of certainty for every alternative action. However, the certainty parameters of HPR were not immediately applicable in IFM. Their meaning needed to be specified in the domain of IFM. In addition, the exact way of calculating one important certainty parameter, the degree of certainty (γ), was not fully specified in HPR. Finally, we considered it necessary to combine HPR's inference mechanism with the inferential power of user stereotypes [33], [34] to achieve initialization of user models. However, the use of stereotypes presupposes the division of users into classes. These classes then needed to be combined with HPR's primitives.

A. Empirical Study

To address the above problems we conducted our own empirical study [41], which resulted in the collection of useful empirical data from real users. This data was analyzed by human experts. Then, taking into account the results of the analysis, we adapted HPR's certainty parameters to IFM by specifying them fully and by combining them with user stereotypes, as will be explained in detail in subsequent sections. The empirical study involved the following categories of humans:
a) Novice and expert users acting as users of a standard Explorer.
b) Human experts acting as potential advisors of users.
In particular, the empirical study involved 30 users of different levels of expertise in the use of a standard file manipulation program. All of them were asked to interact with a standard Explorer, as they would normally do in their day-to-day activities. While users interacted with the system, their actions were video captured. Approximately 37-50 commands constituted each user's protocol. Those protocols were considered to be a good sample of real-life users' interaction with a standard file manipulation program.

The protocols collected by this experiment were given to 10 human experts in order to be analyzed. All human experts selected to participate in the empirical study possessed a first and/or higher degree in Computer Science and had teaching experience related to the use of file manipulation programs. The human experts were first asked to study the protocols carefully and focus on the following tasks:
Identification of the errors and of the categories into which these errors could be classified, in terms of what the experts believed the cause of those errors was. For example, some errors were made because novice users did not know the usage of certain commands.
Classification of users into categories with respect to the frequency with which they made each error, the type of commands they executed most and the way each user had chosen to execute those commands. For example, some users used the interface's buttons while others used the menus.
Rating of the certainty parameters of HPR in a priority order in terms of their importance as criteria used by the human experts. Such criteria were used by the human experts to select what they thought was appropriate advice. This advice would be the suggestion of a correct alternative action instead of the erroneous one issued by the user.

As a result, human experts identified the following main categories of error: command errors, structure errors, spelling errors, mouse errors, and identical name errors. By command errors, we mean cases where the user had selected the wrong command with respect to his/her hypothesized intentions or cases where a command had failed. For example, some users confused the usage of the cut/copy and paste commands.
By structure errors, we mean that the user had made mistakes due to his/her unawareness of the structure of a standard file manipulation program, for example when the user confused the parent folder on the left part of the Explorer with the folder shown on the right part of the program. These errors were mainly made by novice users due to their lack of knowledge about the system and its operations. By spelling errors we mean all the errors that were made because a user tangled up objects with similar names. Mouse errors refer to mistakes that were made because the user had tangled up neighboring objects in the graphical representation. Finally, by identical name errors we mean that the user confused objects with exactly the same name that were situated in different places in the file store. In general, all errors belonging to the last three categories were considered accidental slips.

In addition to the identification of error categories, human experts classified users into categories with respect to their level of expertise and their degree of carelessness. With respect to their level of expertise, users were classified into three main categories, namely novice, intermediate and expert. From the sample of 30 users, 14 were categorized as novice, 9 as intermediate and only 7 were thought to be experts. In order to identify every user's level of expertise, human experts took into account the user's total number of command and/or structure errors. In addition, human experts identified a user as careless or careful taking into consideration the number of spelling, mouse or identical name errors s/he had made. Only 33% of the users were considered to be careful. The rest had made a lot of accidental slips and were categorized as careless.
Naturally, the degree of carelessness or carefulness may be affected by the degree of motivation of the students concerning the tasks that they carry out. For example, strongly motivated students may be more careful. However, in the experiment that we conducted, all users had the same motive for interacting with the system, since this took place as part of their assignments for an introductory course on Computer Science. Therefore, we assumed that the students of the experiment were equally motivated. The analysis of data relating to the number of errors that users had made is presented below:
Novice users made on average 30% command errors and/or 20% structure errors in the total errors; however, they were characterized as novices when they had made at least 2 command/structure errors in a total of 20 commands.
The percentage of command and structure errors was reduced for intermediate users, who made only 10% of each kind of error. Intermediate users made 0 or 1 command errors and/or structure errors in a total of 20 commands.
None of the expert users made any command or structure errors.
Users that were considered careless made 25%-30% spelling errors, 30% mouse errors and only a few errors due to confusion of identical names. Their weak point was either spelling or mouse errors.
Careful users did make some accidental slips, but not more than 10% spelling and mouse errors and almost no errors due to confusion of files with identical names.
Finally, the experts' rating of the certainty parameters of HPR is presented in the next section.

B. Certainty Parameters in IFM

Among the certainty parameters of HPR, five have been adapted and used in IFM: the degree of certainty (γ), the degree of typicality (τ) of an error set in the set of all errors, the degree of similarity (σ) of a set to another set, the frequency (φ) of an error set in the set of all errors and the dominance (δ) of a subset in a set.

The degree of similarity (σ) is applied in SIM statement transforms. This parameter is used to calculate the similarity between two commands or two objects, and consequently the similarity of two actions. The similarity between two commands is static and is precalculated. Its value is based on the relative position of the commands in the hierarchy of users' actions in Fig. 3. Two commands that are neighboring in the hierarchy of users' actions have a high degree of similarity; for example, the commands mktxt and mkdoc. In addition to their relative distance, the effects of the commands have also been taken into account. For example, the cut and copy commands have a similar effect: they place one or more objects into the clipboard, so they have a great similarity. Moreover, it has been observed that novice users tend to confuse two commands when these are neighboring on the screen. So the similarity of two commands also depends on their relative distance on the screen. For example, two commands such as copy and cut have great similarity, since their relative distance in the hierarchy of users' actions and on the screen is minimal and their execution has similar effects.

A degree of similarity is also calculated when SIM statement transforms are applied to objects of the file store. This similarity cannot be static, since the items of the file store are constantly changing. As a result, the similarity between two objects is calculated dynamically. The value of similarity between two objects depends on their relative position in the file store, as this is displayed on the screen. Moreover, the similarity of their names is also taken into account. For example, the files document1.doc and document2.doc have a high degree of similarity.

Another certainty parameter used is the degree of typicality (τ). The value of typicality is calculated dynamically.
A degree of typicality is associated with every command. The calculation of the value of this certainty parameter is based on the frequency of use of the command by a particular user, as this frequency has been recorded in his/her individual user model. For example, some users never create new files using the Explorer's command mkfile but rather create files through word processors or other application packages. Therefore, it would not be wise for the system to hypothesize that such a user had intended to issue mkfile instead of an erroneous command issued.

The degree of frequency (φ) of an error represents the frequency of occurrence of the particular error for a particular user. In this way we can easily spot the errors that a user is prone to. Hence, past errors may be used to predict new ones. Indeed, it has been observed that users tend to repeat the same errors.

In order to find the most frequent error of a user, we use the dominance (δ) of an error in the set of all errors of the particular user. The value of this parameter shows the percentage of a category of error in the set of all errors of a particular user. For example, if the dominance of deletion errors is 0.8, we can conclude that the particular user is mainly prone to deletion errors. This, of course, does not mean that s/he does not make other kinds of mistake as well.

All of the above parameters are combined in order to calculate a degree of certainty (γ) for every alternative action generated by the system based on the HPR transforms of the basic principle. The degree of certainty represents the likelihood that a user may have intended to issue one of the alternative actions generated. The degree of certainty is an approximate number ranging between 0 and 1 and determines whether an action is to be proposed to the user and with what priority. In HPR, the degree of certainty (γ) is associated with every statement transform and is calculated using a function that combines other certainty parameters.
However, as already mentioned, the exact way of computing γ is not fully specified in the theory. Therefore, we have used the Simple Additive Weighting method (SAW) [14] to specify the computation of γ. The decision about which action is to be proposed to the user, and with what priority, relates to the reasoning of a human advisor who would watch a user work over his/her shoulder. Therefore, for the specification of the computation of γ we focused on the answers of human experts who acted as observers and potential advisors. These experts take some criteria into account while forming their advice. In our approach, these criteria are represented by HPR's certainty parameters.

The first step of the SAW method is the calculation of the weights of the criteria, which was made within the context of IFM by taking into account the results of the empirical study. The relative importance of these criteria was defined by taking into account the opinions of human experts. More specifically, human experts were asked to rank the four certainty parameters with respect to how important they are in their reasoning process. Each human expert was asked to assign one score from the set of scores (1, 2, 3, 4) to each of the four certainty parameters, without assigning the same score to two different parameters. The sum of the scores in the set was 10 (1+2+3+4=10). As soon as the scores of all human experts were collected, they were used to calculate the weights of the certainty parameters. The scores assigned to each certainty parameter by each human expert were summed up and then divided by the sum of scores of all certainty parameters (10*10 human experts = 100). In this way, the sum of all weights is equal to 1. The calculated weights for the certainty parameters were the following:
The weight for the degree of similarity (σ): w_σ = 37/100 = 0.37
The weight for the dominance (δ): w_δ = 32/100 = 0.32
The weight for the degree of frequency (φ): w_φ = 19/100 = 0.19
The weight for the degree of typicality (τ): w_τ = 12/100 = 0.12

Indeed, it was revealed that the most important criterion that human experts had used when they evaluated candidate alternative actions to suggest to a user was the similarity of the alternative action to the one issued by the user. The majority of them thought that similarity was important because users usually tend to tangle up actions or objects that are very similar. This criterion was related to the degree of similarity (σ), and most experts assigned it the score 4. The weight of this parameter, which represents its relative importance as estimated by the empirical study, is 0.37. The second most important criterion that human experts had used was whether a particular user's error was the most frequent error of all errors that this user had made. This criterion corresponded to the dominance of an error in the set of all errors of a particular user. The weight of this criterion according to the above procedure was estimated at 0.32. The third most important criterion when evaluating an alternative action was how frequently the user makes such an error while interacting with the system. This criterion corresponded to the degree of frequency of the particular error. The weight of the degree of frequency was estimated at 0.19. Finally, human experts thought that it was useful for them to know whether a user used the action that they intended to propose quite often or not. It would not be likely that the user had made a mistake in the execution of a command that s/he uses quite often and thus probably knows how to use correctly. However, there is still a possibility that the user may have made a careless mistake in such a command. Therefore, the typicality of a certain command for the particular user was taken into account but was not considered of primary importance, and the weight of that criterion was only 0.12.
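The weight calculation described above, and the way such weights enter a simple additive weighting score, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are hypothetical, and the parameter values given to the two alternative actions are invented for the sake of the example. Only the score totals (37, 32, 19, 12), the number of experts (10) and the deletion-error dominance of 0.8 come from the text.

```python
# Sum of the scores given to each certainty parameter by the 10 experts.
SCORE_TOTALS = {"similarity": 37, "dominance": 32, "frequency": 19, "typicality": 12}
N_EXPERTS = 10
SCORES_PER_EXPERT = 1 + 2 + 3 + 4  # each expert distributes the scores 1..4


def saw_weights(score_totals, n_experts=N_EXPERTS):
    """Normalize the score totals so that the weights sum to 1."""
    grand_total = n_experts * SCORES_PER_EXPERT
    return {name: total / grand_total for name, total in score_totals.items()}


def dominance(error_counts, category):
    """Share of one error category among all errors of a user."""
    total = sum(error_counts.values())
    return error_counts.get(category, 0) / total if total else 0.0


def degree_of_certainty(weights, values):
    """Weighted sum of the certainty parameter values (simple additive weighting)."""
    return sum(weights[name] * values[name] for name in weights)


def best_alternative(weights, alternatives):
    """Name of the alternative action with the highest degree of certainty."""
    return max(alternatives, key=lambda a: degree_of_certainty(weights, alternatives[a]))


weights = saw_weights(SCORE_TOTALS)

# HYPOTHETICAL user history and parameter values, for illustration only.
errors = {"deletion": 8, "spelling": 2}
alternatives = {
    "del(A:\\documents\\social1\\)": {
        "similarity": 0.9,
        "dominance": dominance(errors, "deletion"),
        "frequency": 0.5,
        "typicality": 0.4,
    },
    "copy(A:\\documents\\social\\)": {
        "similarity": 0.3,
        "dominance": dominance(errors, "spelling"),
        "frequency": 0.2,
        "typicality": 0.6,
    },
}
print(weights)
print(best_alternative(weights, alternatives))
```

Because the weights sum to 1 and every parameter lies between 0 and 1, the resulting score also stays between 0 and 1, which is consistent with the range stated for the degree of certainty.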
After the weights had been calculated, the SAW method was used further for the calculation of the degree of certainty ($\gamma$). According to the SAW method, the degree of certainty can be calculated as a linear combination (a weighted sum) of the values of the four certainty parameters.

C. Combination of Certainty Parameters with User Stereotypes

The above description of the use of HPR's certainty parameters in IFM assumes that the system has some information about the user's habits before it can calculate the degree of certainty. However, sufficient information about the user may not be available until the user has interacted with the system for quite a long period of time. A solution to this problem was the incorporation of user stereotypes to provide default assumptions about users until the user model had acquired sufficient information about each individual user. Indeed, as Rich [33], [34] points out, a stereotype represents information that enables the system to make a large number of plausible inferences on the basis of a substantially smaller number of observations; these inferences must, however, be treated as defaults, which can be overridden by specific observations.

Stereotypes may serve as a tool to model the beliefs and preferences that the users of a system may have. GRUNDY [34], the first system that used stereotypical user modeling, used stereotypes to model the preferences of a user and all the features that might influence the selection of a book. However, GRUNDY categorized users based on their explicit answers to questions asked by the system. In IFM, unlike GRUNDY, users are categorized based on implicit inferences, which are made after the user's actions have been observed for a while. IFM needs to know what the habits of a user are and whether s/he is prone to certain kinds of error; users may not be able to provide this kind of information about themselves.

Kobsa et al. [18] describe a stereotype as consisting of two main components: a set of activation conditions (triggers) for applying the stereotype to a user, and a body, which contains information that is typically true of users to whom the stereotype applies. In IFM we use stereotypes for classifying users with respect to their level of expertise and their degree of carelessness while interacting with the system. This classification can be very helpful in identifying users' errors and providing individualized help very early in a user's interaction with the system. Thus, IFM maintains a library of models for every group of users, and every time a new user interacts with the system, IFM must identify the class to which the particular user belongs.

All default assumptions of stereotypes in IFM concern the errors that users belonging to a category usually make. The assumptions about each error are expressed by using the certainty parameters of HPR theory, as they have been adapted in IFM. Thus, frequency shows how often users belonging to a certain group make a particular error. Dominance, which can also be derived from a stereotype, shows how dominant a particular error is within the set of all errors for users belonging to the particular stereotype. Finally, typicality shows how typical a command is in the set of commands that a user of a particular stereotype is expected to use.

$$\gamma(X_j) = \sum_{i=1}^{4} w_i x_{ij},$$

where $w_i$ are the weights of the certainty parameters and $x_{ij}$ are the values of the certainty parameters for the alternative action $X_j$.

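In view of the above, the formula for the calculation of the degree of certainty is:

$$\gamma = 0.37\,\sigma + 0.32\,\delta + 0.19\,\varphi + 0.12\,\tau \qquad (1)$$

where $\sigma$ is the degree of similarity, $\delta$ the dominance, $\varphi$ the degree of frequency and $\tau$ the degree of typicality. As a result, the formula for the calculation of the degree of certainty combined the above certainty parameters in a way that reflected the opinions expressed by the majority of human experts.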
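Since formula (1) is a plain weighted sum, ranking candidate alternative actions with it can be sketched in a few lines of Python. This is only an illustrative sketch, not IFM's implementation: the function names, the dictionary keys and the parameter values of the two candidates are hypothetical, and only the four weights are taken from the text.

```python
# Weights of the certainty parameters as estimated by the empirical study.
# The dictionary keys are illustrative labels, not identifiers from IFM.
WEIGHTS = {
    "similarity": 0.37,
    "dominance": 0.32,
    "frequency": 0.19,
    "typicality": 0.12,
}


def degree_of_certainty(params, weights=WEIGHTS):
    """Return the SAW weighted sum of the four certainty parameters."""
    missing = set(weights) - set(params)
    if missing:
        raise ValueError(f"missing certainty parameters: {sorted(missing)}")
    return sum(weights[name] * params[name] for name in weights)


def rank_alternatives(alternatives, weights=WEIGHTS):
    """Sort (label, params) pairs by decreasing degree of certainty."""
    scored = [(label, degree_of_certainty(p, weights)) for label, p in alternatives]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical parameter values for two candidate alternative actions.
candidates = [
    ("alternative A", {"similarity": 0.5, "dominance": 0.5,
                       "frequency": 0.5, "typicality": 0.5}),
    ("alternative B", {"similarity": 1.0, "dominance": 0.0,
                       "frequency": 0.0, "typicality": 0.0}),
]
ranking = rank_alternatives(candidates)
```

Because the weights sum to 1 and every parameter lies in [0, 1], the resulting degree of certainty also lies in [0, 1], which makes the scores of different alternatives directly comparable.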

The empirical study that we conducted revealed that users could be classified into three major classes according to their level of expertise, namely novice, intermediate and expert. Each of these classes represents an increasing mastery in the use of the particular file manipulation system. Such a classification was considered crucial because it would give the system a first view of the usual errors and misconceptions of a user belonging to a group. For example, novice users are usually prone to command errors, whereas expert users do not usually make mistakes in command use.

One might not expect expert users to make mistakes at all, but this does not correspond to reality. Some experts are very prone to mistakes because of carelessness due to tiredness or haste. As a result, another classification that was considered rather important was dividing users into two groups, careless and careful. Some of the errors identified during the empirical study were attributed to the carelessness of users. The carelessness/carefulness classification can apply to other domains as well. For example, in a tutoring system for mathematics, a learner may make mistakes in calculations due to carelessness rather than lack of understanding of the mathematical concepts. However, in domains where users know that they have to be extremely careful because their actions may have severe consequences (e.g. risk to human life), this kind of classification may not be applicable.

A stereotype usually has a set of trigger events. IFM infers information about the user by watching him/her during his/her interaction with the system; a user who makes a lot of errors is probably a novice, whereas someone who makes only a few errors that can be considered accidental slips is probably an expert. However, the system cannot decide how to categorize a user before s/he has executed a sufficient number of commands.
The empirical study revealed that an early conclusion could be drawn only after the execution of twenty commands. Hence, for example, the stereotype of novice users is activated when a user makes at least 5 structure and/or command errors in a total of 20 commands.
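The novice trigger just described can be sketched as a simple threshold check. The sketch below is hypothetical and not taken from IFM: the `InteractionLog` class and the function name are invented for illustration, and it assumes that the log covers the window of commands observed so far. Only the two thresholds (20 commands, 5 errors) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MIN_COMMANDS = 20      # no categorization before this many commands (from the text)
NOVICE_MIN_ERRORS = 5  # structure and/or command errors that activate "novice"


@dataclass
class InteractionLog:
    """Hypothetical per-user counters collected while observing the user."""
    commands: int = 0
    structure_errors: int = 0
    command_errors: int = 0


def novice_trigger(log: InteractionLog) -> Optional[bool]:
    """Return None while too few commands were observed, else whether
    the novice stereotype should be activated."""
    if log.commands < MIN_COMMANDS:
        return None  # too early to draw a conclusion
    return (log.structure_errors + log.command_errors) >= NOVICE_MIN_ERRORS
```

Returning `None` instead of `False` for short logs keeps "not yet decidable" distinct from "decided: not a novice", which mirrors the requirement that no user is categorized before twenty commands.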
Triggers for categorizing users according to their carelessness when executing certain tasks are constructed similarly to the triggers of the stereotypes that correspond to the users' proficiency. Again, a user is not categorized before having executed 20 commands. A user is categorized as careless if s/he has made at least 5 errors that can be considered accidental slips. Accidental slips refer to spelling mistakes, mouse errors or confusions of objects having similar names.

However, a problem encountered when using stereotypes is that a user does not necessarily remain in a certain stereotype forever. Users' skills and behavior change while they interact with the system. This is why the system must regularly check whether the right stereotype is activated. After every 50 commands the system checks whether the activated stereotype is still appropriate. If the system has observed an improvement in skills or a change of attitude for a specific user, it revises its previous categorization of the user and deactivates the activated stereotype in order to activate the one that best fits the user under the present conditions.

Some of the default assumptions for users of the novice, intermediate and expert stereotypes are presented in Tables VI and VII. Table VI shows examples of how typical certain commands may be for a particular stereotype category. Table VII shows examples of the frequency and dominance of certain types of error for a certain stereotype category. Novice users are more prone to command errors, which are considered to be their weak point, than to structure errors. Intermediate users still commit command and structure errors, although these are less usual for them. Expert users, on the other hand, do not make such errors.

TABLE VI
DEFAULT ASSUMPTIONS OF STEREOTYPES CONCERNING THE TYPICALITY OF COMMANDS RELATING TO THE LEVEL OF EXPERTISE OF A USER

Command    Novice   Intermediate   Expert
Mkdir      0.9      0.7            0.7
mkfile     0.1      0.3            0.3
Mktxt      0.05     0.1            0.1
mkdoc      0.05     0.2            0.1
mkbmp      0        0              0.05
mkwav      0        0              0.05
Deldir     0.4      0.5            0.5
Delfile    0.6      0.5            0.5
Copy       0.1      0.2            0.4
Paste      0.2      0.4            0.9

Stereotypes that classify users according to their degree of carelessness provide default assumptions about the errors made due to carelessness. Such a stereotype contains information about the kinds of accidental slip a user may make and how frequently s/he makes such errors. For example, for a user who is considered careless by the system, 30% of his/her errors are usually mouse errors, 30% are spelling errors and only a few are due to confusion between objects with identical names. Table VIII shows the default assumptions for the careless and careful stereotypes.

TABLE VIII
DEFAULT ASSUMPTIONS REPRESENTING THE CARELESSNESS OF A USER

Stereotype   Error                  Frequency   Dominance
CARELESS     SpellingErrors         0.3         0.4
CARELESS     MouseError             0.3         0.4
CARELESS     IdenticalNamesError    0.1         0.2
CAREFUL      SpellingErrors         0.05        0.45
CAREFUL      MouseError             0.05        0.45
CAREFUL      IdenticalNamesError    0.025       0.1
CAREFUL      StructureError         0           0

VII. EMPIRICAL EVALUATION OF IFM

After the completion of IFM, we conducted an empirical evaluation to check the completeness of IFM's design and the usefulness of its operation. Empirical evaluation refers to the appraisal of a theory by observation in experiments [8].

IFM aims primarily at helping users in situations where they accidentally issue actions that they do not really intend. Such actions include commands that a standard explorer answers with error messages. Most importantly, however, they also include actions that are syntactically correct with respect to a standard explorer's formalities but do not achieve what the user really meant. For example, a user may accidentally delete a file that was useful. For deletion actions, a standard explorer produces warning messages that are not very meaningful (as can be seen in an example that follows). Moreover, deletion warning messages in a standard explorer are always the same ("Are you sure you want to delete X?") irrespective of the particular user's intentions, previous actions, etc. Therefore, users tend to ignore them, since they provide little information.

In other cases, a user may issue a syntactically correct command that s/he does not mean, which may result in an undesired situation to which a standard explorer would not respond at all. For example, a user may accidentally paste a file into an undesired destination and cause disorientation, confusion or even indirect deletion of useful data. As a result, a standard explorer suffers from many usability problems. For a detailed analysis of the usability problems of standard file manipulation programs, please refer to [42].
In general, IFM's reasoning was aimed at rendering the interaction more human-like in terms of intelligent and plausible responses of the system to users' errors. Therefore, an important evaluation goal was to find out how successful IFM was at producing additional reasoning in comparison to a standard explorer and, most importantly, how successful IFM was at reproducing reasoning similar to that of human experts who observed the interaction. For these purposes, we conducted two kinds of experiment.

First, we conducted an experiment very similar to the one described in the empirical study. The user protocols, which had been commented on by the human experts, were used for a competitive testing of IFM. In particular, these protocols were given as input to IFM and IFM's responses were recorded. In this way, IFM's responses were compared to those of a standard explorer. Moreover, IFM's responses were compared to the comments that the 10 human experts had made when they analyzed the protocols.

Second, 16 users and 10 advisors were asked to participate in an experiment where the users interacted directly with IFM rather than with a standard explorer, and the advisors commented on the user protocols that were recorded from these interactions. The users had diverse backgrounds and interests and constituted a representative sample of expert and novice users. All 16 users were asked to interact with IFM as they would normally do with a standard file manipulation program. In particular, these users were given an initial file store state and were asked to organize it in the way they wanted, without being given any specific tasks. Thus, IFM was not aware of the users' goals. Whenever IFM diagnosed a problematic situation, it informed the user that perhaps there was something wrong and suggested an alternative command.

The experiment required making observations about the users as they interacted with the system. Therefore, computer logging was used to register all of the users' actions. The protocols collected were studied very carefully after the completion of the users' interaction with IFM, and the users were then interviewed so that they could give their own views about what had happened during their interaction with the system.

Fig. 4. The user's initial file store state

Below, an example of a part of a user protocol is given. The example originates from the first kind of experiment. In this example, we show what IFM's reactions were to the user's actions and how these compared to a standard explorer and to the reactions of human experts. First, the user's action is given in bold and in a way that briefly describes the mouse and/or typing actions of the user. Then IFM's reasoning that corresponded to the action is given; if a command was characterized as suspect or erroneous, IFM generated alternative commands and suggested that the user replace the command issued with one of the alternatives. Then we show whether IFM's suggestions were compatible with the human experts' suggestions for each command. The user's initial file store state is illustrated in Fig. 4.

1. create_new_folder_in (A:\essay\)
IFM'S REASONING: Expected command because A:\essay\ was empty and IFM assumes that the user wants to assign some content to that folder.

2. rename (A:\essay\New Folder\, A:\essay\programs\)
IFM'S REASONING: Expected command. IFM assumes that the user wants to give a more meaningful name to that folder.

3. cut(A:\project\document\school\exercise.doc)
IFM'S REASONING: Neutral command.

4. copy(A:\essay\programs\)
IFM'S REASONING: Suspect command. IFM would expect a paste action following the cut action at command 3. IFM's alternative action: paste(A:\essay\programs\). This action is suggested because A:\essay\programs\ has been selected by the user and is a newly created folder (on top of a stack of recently created folders) that has not been assigned any content yet. Moreover, the command copy is considered similar to paste; therefore, a user may have mistaken one for the other. Finally, the command paste is expected after the command cut in action 3.

IFM's reasoning about the possible error of the user and the need for a paste action was compatible with the opinion expressed by the majority of the human experts (80%).

The standard explorer does not reason about the real intentions of a user and does not generate any response to this command.

5. create_new_folder_in (A:\)
IFM'S REASONING: Neutral command.

6. rename(A:\New Folder\, A:\SoftEng\)
IFM'S REASONING: Expected command. IFM assumes that the user wants to give a more meaningful name to that folder.

7. cut(A:\project\documents\requirements.txt)
IFM'S REASONING: Neutral command.

8. paste(A:\SoftEng\)
IFM'S REASONING: Neutral command.

9. deldir(A:\project\document\)
IFM'S REASONING: Suspect command. IFM's alternative action: deldir(A:\project\documents\). This action is suggested because A:\project\documents has become empty; therefore, its existence is pointless unless it acquires some new content. On the other hand, this is not the case for A:\project\document. Moreover, A:\project\documents and A:\project\document have very similar names and could be mistaken for each other.

IFM's reasoning was compatible with the opinion expressed by 100% of the human experts.

The standard explorer produces the warning message "Are you sure you want to remove the folder document and move all its contents to the Recycle Bin?". However, this message is the same irrespective of the folder's contents. Even if the folder were empty, the standard explorer would produce the same warning message. Users often do not pay attention to such messages, as they are always the same irrespective of their goals and plans.

10. deldir(A:\project\documents\)
IFM'S REASONING: Expected command. This command verifies IFM's suggestion in action 9.

The degree of success of IFM was measured by the degree of compatibility of its reasoning with that of the human experts. Our goal was to approach the reasoning of human experts. Therefore, IFM was considered successful when it generated advice that was compatible with the advice given by the majority of human experts.

The results of the evaluation were quite encouraging. In cases where the human experts' opinions were in total agreement, IFM produced advice that was either very similar or identical to that of the human experts. This usually corresponded to cases where the error was obvious to human advisors, such as the error in command 9 of the sample session, for which there was 100% agreement among the experts. In general, the degree of compatibility of IFM's advice with a unanimous opinion of human experts was 92%. This meant that IFM was very successful at spotting the most plausible and obvious errors of the users.

However, there were cases where the human experts' opinions diverged. In those cases IFM's advice was either identical to the advice of the majority of human experts (e.g. in command 4) or, in fewer cases, compatible with the advice provided by a minority of experts. More specifically, IFM gave advice compatible with the majority of human experts in 82.7% of all cases. This degree of compatibility revealed that IFM could reproduce human experts' advice to a satisfactory extent. This means that IFM can function to a large extent as a human expert who watches over the user's shoulder and provides useful comments and plausible advice whenever this is considered essential.
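The success criterion used here, agreement with the advice of the majority of experts, can be illustrated with a short sketch. Everything in it is a simplifying assumption rather than the procedure used in the evaluation: advice is reduced to a string, "compatible" is reduced to exact equality, cases without a strict majority are skipped, and the three cases are invented data.

```python
from collections import Counter


def majority_advice(expert_advice):
    """Return the advice given by a strict majority of experts, or None."""
    if not expert_advice:
        return None
    advice, votes = Counter(expert_advice).most_common(1)[0]
    return advice if 2 * votes > len(expert_advice) else None


def majority_compatibility(cases):
    """Fraction of decidable cases where the system's advice equals the
    majority advice. Each case is a (system_advice, expert_advice_list) pair."""
    decided = [(system, majority_advice(experts)) for system, experts in cases]
    decided = [(s, m) for s, m in decided if m is not None]
    if not decided:
        return None
    return sum(1 for s, m in decided if s == m) / len(decided)


# Invented example: three intervention cases, ten experts each.
cases = [
    ("paste(X)", ["paste(X)"] * 8 + ["no advice"] * 2),
    ("deldir(Y)", ["deldir(Y)"] * 10),
    ("copy(Z)", ["paste(Z)"] * 6 + ["copy(Z)"] * 4),
]
rate = majority_compatibility(cases)
```

In this invented data set the system agrees with the majority in two of the three cases. A real evaluation would need a human judgment of compatibility, since two differently worded pieces of advice may still be compatible.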
Concerning the comparison of IFM's responses to those of a standard explorer, the experiment revealed that IFM reacted reasonably in many more cases than a standard explorer and that IFM's reasoning was more sophisticated. For example, at command 4 of the example, the standard explorer did not react at all, and at command 9 it produced a standard message that was neither very informative nor individualized. A more detailed presentation and discussion of the evaluation experiments and results is given in [42].

The second kind of experiment was very similar to the first one. However, this time the users interacted with IFM directly and gave a short interview about their impressions of and comments on IFM after they had used it. The protocols collected in this evaluation experiment revealed that in 85% of the cases where IFM intervened, the user had actually followed IFM's advice, and in only 5% of the cases where IFM intervened had the user totally ignored IFM's advice.

Below, we illustrate an example of a user protocol from the second kind of experiment. In this example, we show what IFM's reactions were to the user's actions and how these compared to the reactions of human experts. The user of this example had interacted with the system for a short period of time and IFM had collected limited information about this user. Therefore, the only available

information about his/her errors, habits and tendencies could have been acquired from the corresponding stereotype. Taking into account the user's performance in his/her first interactions with the system, s/he was categorized as expert but careless. The user's initial file store state is illustrated in Fig. 5. The user issued the following actions:

1. create_new_folder_in(A:\)
2. rename(A:\New Folder\, A:\publications\)
3. copy(A:\description.txt)
4. paste(A:\publications1\)

Fig. 5. The user's initial file store state

IFM had no problem with the first three actions and executed them normally. However, the system judged that there may have been a problem with the fourth action. This was due to the fact that the folder A:\publications1\ already contained a file called description.txt that would be overwritten. Therefore, IFM generated the following alternative actions, each of which is compatible with the user's hypothesised intentions:

Alternative action 1: paste(A:\papers\)
Alternative action 2: paste(A:\publications\)

Both alternative commands have been constructed by replacing the object selected by the user. Having completed the generation of alternative actions, the system calculates the degree of certainty for each alternative action. As mentioned above, the system only uses information given by the stereotypes. Consequently, the values of the degree of dominance and the degree of frequency are acquired from the corresponding stereotype's default assumptions, which are presented in Table VIII. The degree of frequency and the dominance for the first case, where the system supposes that the user has confused two neighboring folders, are $\varphi = 0.3$ and $\delta = 0.4$, respectively. However, the values of these parameters are higher in the second case, where the user has confused two neighboring folders with very similar names (i.e. publications and publications1), because there are two possible causes of the mistake: neighboring position and similarity of names. Therefore, $\varphi = 0.6$ and $\delta = 0.8$, respectively. The value of the degree of typicality ($\tau = 0.9$) for the paste command is acquired from the default assumptions of the stereotype for expert users.

However, the fourth certainty parameter, the degree of similarity, is calculated dynamically and depends on the similarity of the names of the objects and their relative distance in the graphical representation of the file store. The value of the degree of similarity of A:\publications\ and A:\publications1\ is $\sigma = 0.95$, because they are neighboring in the graphical representation and they have very similar names. However, the value of similarity of A:\papers\ and A:\documents2\ is only $\sigma = 0.4$, since they are neighboring in the graphical representation but do not have similar names.

The degree of certainty of the alternatives is calculated by substituting the values of the certainty parameters into formula (1). Hence, the degree of certainty of the first alternative was 0.31, whereas the degree of certainty of the second alternative was 0.75. Therefore, the second alternative was proposed. Indeed, the comparison with the human experts revealed that 60% of the human experts proposed the first alternative action. In this case the user of the example had followed IFM's advice.

The users' answers in the interview they gave after having interacted with the system revealed that 56.25% of the users found the interaction with the system good, 25% found it mediocre but better than a standard explorer and only 18.75% thought that it needed a lot of improvement. Concerning the advice that IFM produced, 62.5% found it really helpful. Only 12.5% of the users found the advice unnecessary and thought that they could achieve their goals without such help. Since the method of intervention is often an important issue in help systems, there was also a question concerning it. Only 12.5% of the users found the method of intervention annoying, whereas 56.25% found it good.

VIII. DISCUSSION AND CONCLUSIONS

In this paper, we have shown how HPR may be used to add more human reasoning to a user interface. In particular, we have tested our ideas in a Graphical User Interface similar to Windows/NT Explorer, which is addressed to a very large number of computer users of varying backgrounds. GUIs, as they stand, are considered more user-friendly than other user interfaces such as command-language interfaces. However, the empirical study that we conducted revealed that GUI users made a lot of mistakes while interacting with the system; therefore, there was scope for the provision of intelligent assistance. The kind of intelligent assistance that we were aiming at was spontaneous assistance similar to the help that a human expert could provide if s/he watched a user work over the user's shoulder. Therefore, we considered it important to incorporate into the system some kind of human reasoning such as HPR.

HPR provides a domain-independent, formal framework for generating hypotheses about the user's beliefs and intentions from the point of view of a human advisor. The formal framework of HPR has been used to simulate the plausible human reasoning of both a user and a human advisor. However, HPR was not immediately applicable to a GUI. First, we needed to specify the HPR questions that would drive the line of inference. Then we had to create hierarchies out of procedural knowledge such as users' actions and commands. We had to adapt HPR's certainty parameters to the domain of a GUI. In some cases we had to complete the specifications, because in HPR there were only guidelines for a possible application; for example, in the certainty parameter $\gamma$.

Finally, we had to address the problem of the initialization of user models.

In IFM, the line of HPR inference is driven by a multiple statement transform that is called the basic principle. The basic principle refers to the selection and assessment of an appropriate action in the user's mind. Through the basic principle, HPR transforms are successfully used to generate a set of quite plausible hypotheses representing possible user errors. A potential problem of the approach is the generation of many alternative actions to be suggested to the user. This problem is addressed by the use of certainty parameters. In particular, certainty parameters are used to rank the alternative actions in priority order from the point of view of a human advisor.

The fact that the way of calculating the degree of certainty had not been fully specified in HPR was addressed by a decision making method. However, an important part of the application of the decision making model we selected, the SAW model, is that the weights of the criteria should be calculated based on what human decision makers usually do. We solved this problem by conducting an empirical study in the context of a GUI. Hence, the answers to the above questions were based on the results of the analysis of the empirical data.

The analysis of empirical data and the application of a decision making model as a way of completing the specification of HPR in a particular domain was not suggested as such by HPR's inventors. However, we found that the two theories were very compatible with and complementary to each other. Their compatibility lies in the fact that they both aim at simulating human reasoning to some extent. In addition, they can complement each other for the following reasons.
HPR gives a unifying framework for how human plausible inferences are made and specifies a set of criteria (certainty parameters) that can be taken into account, but it does not give a formal way of calculating the degree of certainty. SAW, on the other hand, deals exactly with this issue: how to combine the criteria that affect a decision making problem. Indeed, the combination of such decision-making models with HPR and their application in an intelligent user interface has proved quite powerful and effective.

The problem of the initialization of the user model when there is not yet sufficient information about a particular user is addressed by the use of user stereotypes. As Kay [15] points out, stereotypes constitute a powerful mechanism for building user models; therefore, they have been widely used in advisory software for user modeling. However, the approach of combining user stereotypes with the implicit inferential power of HPR is novel. Moreover, it is a useful, generic approach for combining stereotypes with an implicit inference mechanism. This approach is applicable to other domains as well.

The success of HPR at rendering the interaction more human-like, in terms of the system's ability to reason about human errors, was assessed by conducting an empirical evaluation. An important aim of the evaluation was to compare the reasoning of IFM with that of human experts to find out how successful IFM was at reproducing such plausible human reasoning. The results of the evaluation revealed that IFM was indeed quite successful at producing reasoning similar to that of the majority of human experts who took part in the evaluation. In addition, it was considered very important to build a GUI that would provide more adaptive and intelligent, plausible reasoning than a standard explorer. IFM was quite successful at attaining these goals.
In particular, IFM was especially successful at recognizing errors that looked quite obvious to a human advisor but were not recognized by a standard explorer at all. Such reasoning became feasible due to the certainty parameters, which gave a very high degree of certainty to cases where human advisors had almost no doubt about a user's mistake. Recognizing obvious human errors is a great improvement in a user interface in terms of its user-friendliness; this cannot be achieved if human reasoning is not incorporated into a system. However, even human advisors may only model an approximation of users' beliefs in their minds. Therefore, it was beyond the scope of IFM (and consequently of the evaluation) to produce reasoning that would exceed the capabilities of human experts who observed the interaction.

Finally, what we consider the most important contribution of this work is the applicability of the methods that we employed to other domains as well. As a matter of fact, IFM is the second application for providing intelligent help where HPR has been used successfully, after a previous one concerning UNIX users [39]. Additionally, the application of HPR as a domain-independent reasoning mechanism has been investigated in an authoring tool for intelligent tutoring systems [38] and in a virtual reality game that operates as an educational application [43]. The incorporation and adaptation of this theory in all these systems aimed at providing Intelligent Help Systems (IHSs) and Intelligent Tutoring Systems (ITSs) with the ability to follow imperfect but plausible users' and students' reasoning, respectively. In the case of the UNIX help system, the domain of a command-language interface is quite different from the domain of a GUI, but the reasoning of HPR was successful there too. In the case of the tutoring systems, HPR was used to follow the students' reasoning when they formed their answers to questions in tests.
In this way, the ITS would assess the students' performance based not only on the correctness or incorrectness of their answers but also on their reasoning process in teaching-learning dialogues. Indeed, HPR proved to be rather effective, as it was successful at simulating the reasoning of human learners in a variety of domains such as anatomy, geography, etc.

The main limitation that was identified by the application of HPR in IFM and in other domains is that the domain representation should be determined very carefully so that the certainty parameters may be defined in the context of the application. In order to achieve this, an empirical study should always precede the application of the theory in a certain domain. Furthermore, the empirical study is also needed for defining the relative importance of the certainty parameters and therefore enabling the combination of HPR with a decision making model such as SAW. A final limitation of the application of the theory concerns the design of user stereotypes in accordance with the certainty parameters of HPR. However, this problem can also be addressed if an empirical study is conducted before the application of the theory. A detailed presentation of the methodologies that may be used for an empirical study is given in [42].

HPR is a domain-independent theory, and the parts of it that are not completely specified can be completed effectively by the use of empirical studies and other theories, such as SAW. Moreover, stereotypes constitute a widely used method for user modeling, which has been shown to profit from the implicit inferential power of HPR. These conclusions can lead to a generalized framework which can be the basis of a user modeling shell. Indeed, it is within our future research plans to create a user modeling shell that would incorporate the methods used in the present work.

REFERENCES
[1] S. Bull, J. Greer, G. McCall, L. Kettel and J. Bowes, User Modelling in I-Help: What, Why, When and How, User Modeling: Proceedings of the Sixth International Conference, M. Bauer, P.J. Gmytrasiewicz and J. Vassileva, Eds., Springer Wien New York, 2001, pp. 117-126.
[2] M.H. Burstein and A.M. Collins, Modeling a theory of Human Plausible Reasoning, Artificial Intelligence III: Methodology, Systems, Applications, T. O'Shea and V. Sgurev, Eds., Elsevier Science Publishers B.V. (North Holland), 1988, pp. 21-28.
[3] M.H. Burstein, A. Collins and M. Baker, Plausible Generalisation: Extending a model of Human Plausible Reasoning, Journal of the Learning Sciences, vol. 3 and 4, pp. 319-359, 1991.
[4] R. J. Calistri-Yeh, Utilizing User Models to Handle Ambiguity and Misconceptions in Robust Plan Recognition, User Modeling and User-Adapted Interaction, vol. 1, no. 4, pp. 289-322, 1991.
[5] J.R. Carbonell and A. Collins, Natural semantics in artificial intelligence, Proceedings of the Third International Joint Conference on Artificial Intelligence, Stanford, California, 1973, pp. 344-351.
[6] S.A. Cerri and V.A. Loia, Concurrent, Distributed Architecture for Diagnostic Reasoning, User Modeling and User-Adapted Interaction, vol. 7, no. 2, pp. 69-105, 1997.
[7] D. N. Chin, KNOME: Modeling What the User Knows in UC, User Models in Dialog Systems, A. Kobsa and W. Wahlster, Eds., 1989, pp. 74-107.
[8] D. N. Chin, Empirical Evaluation of User Models and User-Adapted Systems, User Modeling and User-Adapted Interaction, vol. 11, no. 12, pp. 181-194, 2001.
[9] A. Collins and R. Michalski, The Logic of Plausible Reasoning: A core Theory, Cognitive Science, vol. 13, pp. 1-49, 1989.
[10] D. Duncan, P. Brna and L. Morss, A Bayesian Approach to Diagnosing Problems with Prolog Control Flow, Proceedings of the 4th International Conference on User Modeling, 1994, pp. 79-86.
[11] R. Eller and S. Carberry, A Meta-rule Approach to Flexible Plan Recognition in Dialogue, User Modeling and User-Adapted Interaction, vol. 2, no. 1-2, pp. 27-53, 1992.
[12] R. Haakma, Towards explaining the behaviour of novice users, International Journal of Human-Computer Studies, vol. 50, pp. 557-570, 1999.
[13] E. Horvitz, J. Breese, D. Heckerman, D. Hovel and K. Rommelse, The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann: San Francisco, 1998, pp. 256-265.
[14] C. L. Hwang and K. Yoon, Multiple Attribute Decision Making: Methods and Applications, Lecture Notes in Economics and Mathematical Systems, Berlin/Heidelberg/New York: Springer, vol. 186, 1981.
[15] J. Kay, Stereotypes, Student Models and Scrutability, Lecture Notes in Computer Science, Intelligent Tutoring Systems, G. Gautier, C. Frasson and K. VanLehn, Eds., Springer, Berlin, vol. 1839, 2000, pp. 19-30.
[16] G. A. Klein, Naturalistic decision making: Implications for design, Crew System Ergonomics Information Analysis Center, 1993.
[17] G. A. Klein, J. Orasanu, R. Calderwood and C. E. Zsambok (Eds.), Decision making in action: models and methods, Norwood, NJ: Ablex Publishing Corporation, 1993.
[18] A. Kobsa, J. Koenemann and W. Pohl, Personalized hypermedia presentation techniques for improving on-line customer relationships, The Knowledge Engineering Review, vol. 16, pp. 111-115, 2001.
[19] L. Kolonder, Case-Based Reasoning, Morgan Kaufmann Publisher Inc., San Mateo CA, 1993.
[20] D. B. Leake, Case-Based Reasoning: Experiences, Lessons and Future Directions, AAAI Press, Menlo Park, California, 1996.
[21] J. Martin and K. VanLehn, Student assessment using Bayesian nets, International Journal of Human-Computer Studies, vol. 42, pp. 575-591, 1995.
[22] Martin-Lof, Constructive mathematics and computer programming, Methodology and Philosophy of Science VI, Amsterdam: North Holland Publishing Company, 1982.
[23] M. Matthews, W. Pharr, G. Biswas and Neelakandan, USCSH: An Active Intelligent Assistance System, Artificial Intelligence Review, Intelligent Help Systems For Unix, St. J. Hegner, P. Mc Kevitt, P. Norvig and R. Wilensky, Eds., vol. 14, pp. 121-141, 2000.
[24] J. Mayfield, Controlling Inference in Plan Recognition, User Modeling and User-Adapted Interaction, vol. 2, no. 1-2, pp. 55-82, 1992.
[25] K.L. McGraw, Performance Support Systems: Integrating AI, Hypermedia and CBT to Enhance User Performance, Journal of Artificial Intelligence in Education, vol. 5, no. 1, pp. 3-26, 1994.
[26] P. McKevitt, The OSCON Operating System Consultant, Artificial Intelligence Review, Intelligent Help Systems For Unix, St. J. Hegner, P. Mc Kevitt, P. Norvig and R. Wilensky, Eds., vol. 14, no. 1-2, pp. 89-119, 2000.
[27] R.S. Michalski and P.H. Winston, Variable precision logic, Artificial Intelligence, vol. 29, pp. 121-146, 1986.
[28] Microsoft Corporation, Microsoft Windows 98 Resource Kit, Microsoft Press, 1998.
[29] E. Norling, Learning to Notice: Adaptive Models of Human Operators, in D. Precup and P. Stone (Eds.), Agents-2001 Workshop on Learning Agents, Montreal, Canada, May 2001.
[30] E. Norling and C. Heinze, Naturalistic Decision Making and Agent Oriented Cognitive Modelling, Proceedings of Fifth Australasian Cognitive Science Conference, Melbourne, 2000.
[31] A. Quilici, AQUA: A system that detects and responds to user misconceptions, User Modeling and Dialog Systems, A. Kobsa and A. Wahlster, Eds., Springer-Verlag, New York, 1988.
[32] A. Quilici, Using Justification Patterns to Advise Novice UNIX Users, in St. J. Hegner, P. Mc Kevitt, P. Norvig and R. Wilensky (Eds.), Artificial Intelligence Review, Intelligent Help Systems for UNIX: Natural Language Dialogue, vol. 14, no. 4/5, pp. 403-420, 2000.
[33] E. Rich, User Modelling via Stereotypes, Cognitive Science, vol. 3, no. 4, pp. 329-354, 1979.
[34] E. Rich, Users are individuals: individualizing user models, International Journal of Human-Computer Studies, vol. 51, pp. 323-338, 1999.
[35] M.E. Shiri, E. Ameur and C. Frasson, Student Modelling by Case-Based Reasoning, Fourth International Conference on Intelligent Tutoring Systems, Lecture Notes in Computer Science, B.P. Goettl, H.M. Halff, C.L. Redfield and V.J. Shute, Eds., Springer-Verlag, Berlin, vol. 1452, 1998, pp. 394-404.
[36] S.W. Tyler, J. L. Schlossberg, R.A. Gargan Jr., L.K. Cook and J.W. Sullivan, An Intelligent Interface Architecture For Adaptive Interaction, Intelligent User Interface, J. W. Sullivan and S. W. Tyler, Eds., ACM Press, New York, Addison-Wesley Publishing Company, pp. 85-109, 1991.
[37] M. Virvou, Automatic reasoning and help about human errors in using an operating system, Interacting with Computers, vol. 11, no. 5, pp. 545-573, 1999.
[38] M. Virvou, A Cognitive Theory in an Authoring Tool for Intelligent Tutoring Systems, 2002 IEEE International Conference on Systems, Man and Cybernetics 2002 (SMC 02), Hammamet, Tunisia, vol. 2, pp. 410-415.
[39] M. Virvou and B. Du Boulay, Human Plausible Reasoning for Intelligent Help, User Modeling and User-Adapted Interaction, vol. 9, pp. 321-375, 1999.
[40] M. Virvou and K. Kabassi, An Empirical Study Concerning Graphical User Interfaces that Manipulate Files, Proceedings of ED-MEDIA 2000, World Conference on Educational Multimedia, Hypermedia and Telecommunications, J. Bourdeau and R. Heller, Eds., pp. 1117-1122, AACE, Charlottesville VA.
[41] M. Virvou and K. Kabassi, Evaluation of the advice generator of an intelligent learning environment, Proceedings of the 2001 IEEE International Conference on Advanced Learning Technologies, IEEE Computer Society, pp. 339-342, 2001.
[42] M. Virvou and K. Kabassi, Experimental Studies within the Software Engineering Process for Intelligent Assistance in a GUI, Journal of Universal Computer Science, vol. 8, no. 1, pp. 51-85, 2003.
[43] M. Virvou, C. Manos, G. Katsionis and K. Tourtoglou, VR-ENGAGE: A Virtual Reality Educational Game that Incorporates Intelligence, 2002 IEEE International Conference on Advanced Learning Technologies, pp. 425-430, 2002.
[44] R. Wilensky, D.N. Chin, M. Luria, J. Martin, J. Mayfield and D. Wu, The Berkeley UNIX Consultant Project, Artificial Intelligence Review, Intelligent Help Systems For Unix, St. J. Hegner, P. Mc Kevitt, P. Norvig and R. Wilensky, Eds., vol. 14, no. 1-2, pp. 43-88, 2000.
[45] R. Winkels, Explorations in Intelligent Tutoring and Help, IOS Press, 1992.
[46] L.A. Zadeh, Fuzzy sets, Information and Control, vol. 8, pp. 338-353, 1965.
