Validation and Verification of Intelligent Systems
-
Gonzalez, A.J., Murillo, M., and Knauf, R., "Validating Human
Behavior Models," Ilmenau Scientific Colloquium, 2000.
Human behavior models (HBMs) have been used by the military training community to assist in tactical training tasks. These models represent typical human behavior in tactical situations such as those seen in battle. They can serve to automate the presence of enemy as well as friendly units, and thus save the manpower otherwise required to populate and control these entities. More recently, they have been used as part of larger aggregate units in constructive simulations to provide greater accuracy and realism to these simulations. Validation of these models has not yet been a research issue because of the community's preoccupation with building them. However, it is now gaining importance as developers transition from a research mode to a deployment mode. The dynamic nature of these models and the need to use experts to judge their validity make validating HBMs especially difficult. This paper discusses some conceptual approaches to this problem. These are based on comparing the behavior of the model to that of an expert, while the latter behaves normally in a simulated environment, under the same conditions as perceived by the model.
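The comparison described above can be pictured as running the model and a human expert through the same simulated scenario and scoring how often their decisions agree. The Python sketch below is only an illustration of that idea, under an assumed trace format and an assumed agreement metric; it is not the authors' implementation.

```python
# Hypothetical sketch: log the action chosen by the model and by the expert at each
# decision point of the same simulated scenario, then measure how often they agree.
# Trace contents and the agreement metric are illustrative assumptions.

def agreement_score(model_actions, expert_actions):
    """Fraction of decision points at which the model chose the expert's action."""
    assert len(model_actions) == len(expert_actions), "traces must cover the same decision points"
    matches = sum(1 for m, e in zip(model_actions, expert_actions) if m == e)
    return matches / len(model_actions)

# Example traces over five decision points in a simulated engagement.
expert_trace = ["advance", "take_cover", "return_fire", "flank", "withdraw"]
model_trace  = ["advance", "take_cover", "return_fire", "hold",  "withdraw"]

print(f"behavioral agreement: {agreement_score(model_trace, expert_trace):.0%}")  # 80%
```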
-
Gonzalez, A.J. and Barr, V., "Validation and Verification of
Intelligent Systems - What Are They and How Are They
Different?," Journal of Experimental and Theoretical
Artificial Intelligence, 2000.
Researchers and practitioners in the field of expert systems generally agree that, to be useful, any fielded intelligent system must be adequately verified and validated. But what does this mean in concrete terms? What exactly is verification? What exactly is validation? How are they different? Many authors have attempted to define these terms and, as a result, several interpretations have surfaced. It is our opinion that there is great confusion as to what these terms mean, how they are different, and how they are implemented. This paper, therefore, has two aims: to clarify the meaning of the terms validation and verification as they apply to intelligent systems, and to describe how several researchers are implementing them. The second part of the paper details some techniques that can be used to perform the verification and validation of such systems. Also discussed is the role of testing as part of these processes.
-
Gonzalez, A.J., "Validation of Human Behavior Models,"
Florida Artificial Intelligence Research Society
Conference, 1999.
Validation of human behavioral models, such as those used to represent hostile and/or friendly forces in training simulations, is an issue that is gaining importance as the military comes to depend more and more on such training methods. However, validation introduces new difficulties because of the dynamic nature of these models and the need to use experts to judge their validity. This paper therefore discusses some conceptual approaches to carrying out this task. These are based on comparing the behavior of the model to that of an expert, while the latter behaves normally in a simulated environment, under the same conditions as perceived by the model.
-
Gonzalez, A.J. and Murillo, M., "Validation of Human
Behavioral Models," Simulation Interoperability
Workshop, 1999.
Validation of human behavioral models, such as those used to represent hostile and/or friendly forces in training simulations, is an issue that is gaining importance as the military comes to depend more and more on such training methods. However, validation introduces new difficulties because of the dynamic nature of these models and the need to use experts to judge their validity. This paper therefore discusses some conceptual approaches to carrying out this task. These are based on comparing the behavior of the model to that of an expert, while the latter behaves normally in a simulated environment, under the same conditions as perceived by the model.
-
Michels, J., Abel, T., Knauf, R., and Gonzalez, A.J.,
"Investigating the Validity of a Test Case Selection
Methodology for Expert System Validation," Florida Artificial
Intelligence Research Society Conference, 1998.
Providing assurances of performance is an important aspect of successful development and commercialization of expert systems. However, this can only be done if the quality of the system can be assured through a rigorous and effective validation process. Unfortunately, a generally accepted validation technique that can, if implemented properly, lead to a determination of validity (a validity statement) has been an elusive goal. This has led to a generally haphazard way of validating expert systems. Validation has traditionally been done mostly through the use of test cases. A set of test cases, whose solutions are previously known and benchmarked, is presented to the expert system. A comparison of the system's solutions to those of the test cases is then used to generate a validity statement. This is an intuitive way of testing the performance of any system, but it requires some consideration of how extensively to test the system in order to develop a reliable validity statement. A completely reliable statement of a system's validity could result from exhaustive testing, but that is commonly considered impractical for all but the most trivial of systems. A better means of selecting "good" test cases must therefore be developed. The authors have developed a framework for such a selection (Abel, Knauf and Gonzalez 1996). This paper describes an investigation undertaken to evaluate the effectiveness of this framework by using it to validate a small but robust expert system that classifies birds.
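The test-case procedure described above amounts to running the system on each benchmarked case and reporting the fraction answered correctly as a simple validity statement. The sketch below is a minimal illustration of that comparison; the classify() function and the bird-classification data are assumptions for the example, not the framework of Abel, Knauf and Gonzalez (1996).

```python
# Minimal sketch of test-case-based validation: each test case carries a benchmarked
# (known-correct) solution, the expert system is run on the case's inputs, and the
# proportion of matching answers is reported as a simple validity statement.

test_cases = [
    # (observed features, benchmarked classification)
    ({"wingspan_cm": 20,  "webbed_feet": False, "sings": True},  "songbird"),
    ({"wingspan_cm": 90,  "webbed_feet": True,  "sings": False}, "waterfowl"),
    ({"wingspan_cm": 180, "webbed_feet": False, "sings": False}, "raptor"),
]

def classify(features):
    """Stand-in for the expert system under validation (illustrative only)."""
    if features["webbed_feet"]:
        return "waterfowl"
    return "songbird" if features["sings"] else "raptor"

passed = sum(1 for inputs, expected in test_cases if classify(inputs) == expected)
print(f"validity statement: {passed}/{len(test_cases)} benchmarked cases solved correctly")
```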
-
Flannery, L.M., and Gonzalez, A.J., "Detecting Anomalies in
Constraint-based Systems," Journal of Engineering Applications
of Artificial Intelligence, Vol. 10, No. 3, pp. 257-268, 1997.
All software systems need to be verified, regardless of the type of techniques employed. Most existing verification tools are designed to work with conventional software or rule-based expert systems. Verification of constraint-based reasoning systems, however, has not been a popular subject of research in computer science and engineering. It can safely be said that Flannery's work on the subject [Flannery, 1993] is the first investigation aimed at producing a tool to verify the constraints in a constraint-based system. However, as a pioneering effort, her work is rather conceptual in nature and ignores some of the more practical aspects of the problem.
The objective of this paper is to describe an automatic constraint verification tool and its testing, performed on a real-world scheduling application. The highlight of this verification tool is its ability to perform open-ended inferencing to detect hidden anomalies, something Flannery's prototype was not capable of doing. In summary, this tool ensures the consistency, completeness, and correctness of a constraint base for a constraint-based reasoning system.
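To make the notion of a "hidden" anomaly concrete, the hypothetical sketch below chains scheduling precedence constraints to expose an inconsistency that no single constraint reveals on its own. It is an assumed, simplified illustration of inference over a constraint base, not the tool described in the paper.

```python
# Hypothetical example: ordering constraints that look plausible pairwise but are
# jointly unsatisfiable (a precedence cycle), detectable only by chaining them.

from collections import defaultdict

# Scheduling-style precedence constraints: "task A must finish before task B".
constraints = [("setup", "machining"), ("machining", "inspection"), ("inspection", "setup")]

def has_cycle(edges):
    """Return True if the precedence constraints imply an impossible ordering."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)

    def reachable(start, target, seen):
        for nxt in graph[start]:
            if nxt == target or (nxt not in seen and reachable(nxt, target, seen | {nxt})):
                return True
        return False

    # An edge (a, b) is anomalous if b can also be chained back around to a.
    return any(reachable(b, a, {b}) for a, b in edges)

if has_cycle(constraints):
    print("anomaly: constraints are jointly unsatisfiable (circular precedence)")
```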
-
Gonzalez, A.J. and Ramasamy, L., "Detecting Anomalies in
Constraint-based Systems," Ilmenau Scientific Colloquium, 1997.
All software systems need to be verified, regardless of the type of techniques employed. Most existing verification tools are designed to work with conventional software or rule-based expert systems. Verification of constraint-based reasoning systems, however, has not been a popular subject of research in computer science and engineering. It can safely be said that Flannery's work on the subject [Flannery, 1993] is the first investigation aimed at producing a tool to verify the constraints in a constraint-based system. However, as a pioneering effort, her work is rather conceptual in nature and ignores some of the more practical aspects of the problem.
The objective of this paper is to describe an automatic constraint verification tool and its testing, performed on a real-world scheduling application. The highlight of this verification tool is its ability to perform open-ended inferencing to detect hidden anomalies, something Flannery's prototype was not capable of doing. In summary, this tool ensures the consistency, completeness, and correctness of a constraint base for a constraint-based reasoning system.
-
Gonzalez, A.J., Xu, L., and Gupta, U.M., "Validation
Techniques for Case-based Reasoning Systems," Ilmenau
Scientific Colloquium, 1997.
Case-Based Reasoning (CaBR) systems, by their nature, have a built-in set of test cases in their case library. Effective use of this unusual feature can facilitate the validation process by minimizing the involvement of domain experts. This can reduce the cost of validation and eliminate the subjective component introduced by experts. This article proposes a validation technique that makes use of the case library to validate the CaBR system. Called the Case Library Subset Test Technique (CLST), it evaluates the correctness of the retrieval and adaptation functions of the CaBR engine with respect to the domain as represented by the case library. It is composed of two phases: (1) the Retrieval Test and (2) the Adaptation Test. A complete description of the technique, as well as its application to validate an existing CaBR system, is discussed in this paper.
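The Retrieval Test phase can be pictured as withholding each case from the library in turn, posing its problem description as a new query, and checking the result against the withheld case's known solution. The following sketch illustrates that idea under assumed case formats and an assumed similarity measure; it is not the CLST specification itself.

```python
# Illustrative leave-one-out retrieval check over a small, assumed case library.

case_library = [
    {"problem": {"temp": 0.9, "pressure": 0.2}, "solution": "open_relief_valve"},
    {"problem": {"temp": 0.8, "pressure": 0.3}, "solution": "open_relief_valve"},
    {"problem": {"temp": 0.1, "pressure": 0.9}, "solution": "reduce_feed_rate"},
    {"problem": {"temp": 0.2, "pressure": 0.8}, "solution": "reduce_feed_rate"},
]

def similarity(p1, p2):
    """Simple inverse-distance similarity over shared numeric features."""
    return -sum(abs(p1[k] - p2[k]) for k in p1)

def retrieve(query, library):
    """Return the most similar case in the library (the retrieval function under test)."""
    return max(library, key=lambda case: similarity(query, case["problem"]))

passed = 0
for i, held_out in enumerate(case_library):
    rest = case_library[:i] + case_library[i + 1:]
    retrieved = retrieve(held_out["problem"], rest)
    passed += retrieved["solution"] == held_out["solution"]

print(f"retrieval test: {passed}/{len(case_library)} withheld cases matched the known solution")
```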
-
Gonzalez, A.J., Gupta, U.G., and Chianese, R.B.,
"Performance Evaluation of a Large Diagnostic Expert
System Using a Heuristic Test Case Generator," Journal
of Engineering Applications of Artificial Intelligence,
Vol. 9, No. 3, pp. 275-284, 1996.
Validating the performance of a knowledge-based system is a critical step in its commercialization. Without exception, buyers of systems intended for serious purposes require a certain level of guarantees about system performance. This is particularly true for diagnostic systems. Yet many problems exist in the validation process, especially as it applies to large knowledge-based systems. One of the biggest challenges facing the developer when validating a system's performance is knowing how much testing is sufficient to show that the system is valid. Exhaustive testing is almost always impractical because of the many possible test cases that can be generated, many of which are not useful. It would thus be highly desirable to have a means of defining a representative set of test cases that, if executed correctly by the system, would provide high confidence in the system's validity. This paper describes the experiences of the development team in validating the performance of a large commercial diagnostic knowledge-based system. The description covers the procedure employed to carry out this task, as well as the heuristic technique used for generating the representative set of test cases.
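The motivation for a representative test set is easy to see numerically: even a modest diagnostic input space explodes combinatorially. The sketch below contrasts the exhaustive cross product of symptom values with one simple coverage heuristic that exercises every individual value at least once. The symptom model and the heuristic are illustrative assumptions; the paper's heuristic generator is not reproduced here.

```python
# Assumed symptom model for a hypothetical diagnostic system.
from itertools import product

symptoms = {
    "vibration": ["none", "low", "high"],
    "temperature": ["normal", "elevated", "critical"],
    "noise": ["quiet", "grinding"],
    "load": ["idle", "partial", "full"],
}

exhaustive = list(product(*symptoms.values()))
print(f"exhaustive test cases: {len(exhaustive)}")   # 3 * 3 * 2 * 3 = 54

# Coverage heuristic: the i-th generated case takes the i-th value (modulo length)
# of each symptom, so every value of every symptom appears in at least one case.
representative = []
for i in range(max(len(values) for values in symptoms.values())):
    representative.append({name: values[i % len(values)] for name, values in symptoms.items()})

print(f"representative test cases: {len(representative)}")   # 3
```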