
Re: [Swarm-Modelling] comparing models


From: Michael McDevitt
Subject: Re: [Swarm-Modelling] comparing models
Date: Tue, 2 Sep 2003 13:46:05 -0700



An extract from a paper I wrote for the Navy earlier this year that generally 
pertains to any modeling effort:

      “Model validation is ‘substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model’ (Schlesinger et al. 1979). Model verification is often defined as ‘ensuring that the computer program of the computerized model and its implementation are correct’. Model accreditation determines if a model satisfies specified criteria according to a specified process.


      A related topic is model credibility. Model credibility is concerned with developing, in (potential) users, the confidence they need in a model and in the information derived from it, such that they are willing to use the model and the derived information.” (Sargent, 1999)


      “Conceptual model validity is determining that (1) the theories and assumptions underlying the conceptual model are correct, and (2) the model representation of the problem entity and the model’s structure, logic, and mathematical and causal relationships are ‘reasonable’ for the intended purpose of the model.”

      “The primary validation techniques used for these evaluations are face validation and traces. Face validation has experts on the problem entity evaluate the conceptual model to determine if it is correct and reasonable for its purpose. This usually requires examining the flowchart or graphical model, or the set of model equations. The use of traces is the tracking of entities through each sub-model and the overall model to determine if the logic is correct and if the necessary accuracy is maintained.” (Sargent, 1999)


      Whatever tests and techniques are applied, the litmus test for validation is that the conceptual model is valid with respect to the problem and the purpose for the model. Remember, “all models are wrong, some are useful” (Sterman, 2000). A model should be a simplification of reality that provides insight. Creating a full-fidelity (scale) model of reality is usually prohibitively expensive, and impossible in most cases. Regardless, it may not provide any additional utility for the added expense.


      The theories and assumptions underlying the model should be tested using mathematical analysis and statistical methods on problem entity data when available. Examples of theories and assumptions are linearity, independence, stationarity, and Poisson arrivals. Examples of applicable statistical methods are fitting distributions to data, estimating parameter values from the data, and plotting the data to determine if they are stationary. In addition, all theories used should be reviewed to ensure they were applied correctly.”
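
      As a concrete illustration of that kind of assumption testing, here is a small Python sketch using scipy; the inter-arrival data are synthetic stand-ins for real problem entity data, and the particular tests are just one reasonable choice:

import numpy as np
from scipy import stats

# Stand-in for measured inter-arrival times; if arrivals are Poisson,
# these should be exponentially distributed.
rng = np.random.default_rng(0)
interarrivals = rng.exponential(scale=2.0, size=500)

# Estimate the distribution parameter from the data (floc=0 pins location).
loc, scale = stats.expon.fit(interarrivals, floc=0)
print(f"estimated mean inter-arrival time: {scale:.3f}")

# Goodness-of-fit of the fitted exponential (Kolmogorov-Smirnov).
ks_stat, p_value = stats.kstest(interarrivals, "expon", args=(loc, scale))
print(f"KS statistic = {ks_stat:.3f}, p = {p_value:.3f}")

# Crude stationarity check: compare the first and second halves of the
# series; a significant shift in mean suggests non-stationarity.
half = len(interarrivals) // 2
t_stat, t_p = stats.ttest_ind(interarrivals[:half], interarrivals[half:])
print(f"half-vs-half means: t = {t_stat:.3f}, p = {t_p:.3f}")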


      “Computerized model verification is defined as ensuring that the computer 
programming and implementation of the conceptual model is correct.”

      “The major factor affecting verification is whether a simulation language or a higher-level programming language such as C or C++ is used. The use of a special-purpose simulation language generally will result in having fewer errors than if a general-purpose simulation language is used, and using a general-purpose simulation language will generally result in having fewer errors than if a general-purpose higher-level language is used. (The use of a simulation language also usually reduces the programming time required and the flexibility.)” (Sargent, 1999)

      Determining that the computer simulation, software, programming, and algorithms faithfully represent the intended conceptual model accurately enough for the purpose of the effort is the next step. Any number of tests can be used to compare the simulation’s behavior with real system behavior. Are the mathematical functions and relationships logically correct, and do they result in the postulated behavior? Is the model a black-box model with hidden algorithms that produce results that appear to mimic reality but which have no logical or causal relationship to reality?


      Before a model can be accredited for the purpose for which it was designed, it must satisfy the sponsor that it is credible and that it is operationally valid under most circumstances required by its application.


      “Operational validity is concerned with determining that the model’s output behavior has the accuracy required for the model’s intended purpose over the domain of its intended applicability.” (Sargent, 1999)  This is done using different approaches depending upon whether the system being modeled is “observable”, where observable implies that appropriate data are available for analysis. The two basic approaches to determining a model’s operational validity are comparison and exploring model behavior, and both are usually required to foster confidence that the model is useful for its intended purpose.

      “‘Comparison’ means comparing/testing the model and system input-output behaviors, and ‘explore model behavior’ means to examine the output behavior of the model using appropriate validation techniques, usually including parameter variability-sensitivity analysis. Various sets of experimental conditions from the domain of the model’s intended applicability should be used for both comparison and exploring model behavior. To obtain a high degree of confidence in a model and its results, comparisons of the model’s and system’s input/output behaviors for several different sets of experimental conditions are usually required.” (Sargent, 1999)

      “Data are needed for three purposes: (1) for building the conceptual model, (2) for validating the model, and (3) for performing experiments with the validated model. In model validation we are concerned only with the first two types of data.” Computer model verification requires the third type. (Sargent, 1999)

      Various validation techniques (and tests) can be used in model validation 
and verification. The techniques can be used either subjectively or 
objectively. By “objectively,” we mean using some type of statistical test or 
mathematical procedure, e.g.,
hypothesis tests and confidence intervals. A combination of techniques is 
generally used.
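
      For instance, an “objective” comparison of model and system outputs might look like the following Python sketch; the two samples are placeholders for real system observations and model replications, and the accuracy requirement comes from the model’s intended purpose:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
system_output = rng.normal(10.0, 2.0, size=30)  # placeholder system data
model_output = rng.normal(10.5, 2.0, size=30)   # one value per model replication

# Hypothesis test: are the mean outputs statistically distinguishable?
t_stat, p_value = stats.ttest_ind(system_output, model_output, equal_var=False)
print(f"Welch t = {t_stat:.3f}, p = {p_value:.3f}")

# 95% confidence interval on the difference of means; the modeler judges
# whether this interval lies within the accuracy required for the purpose.
diff = model_output.mean() - system_output.mean()
se = np.sqrt(model_output.var(ddof=1) / model_output.size +
             system_output.var(ddof=1) / system_output.size)
print(f"difference = {diff:.3f}, 95% CI = ({diff - 1.96*se:.3f}, {diff + 1.96*se:.3f})")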

      John Sterman’s “Checklist for the Model Consumer” includes twenty questions that aid in determining a model’s utility as part of the validation process:
     
      1.  What is the problem at hand?
      2.  What is the problem addressed by the model?
      3.  What is the boundary of the model?
      4.  What factors are endogenous? Exogenous? Excluded?
      5.  Are soft variables included?
      6.  Are feedback effects properly taken into account?
      7.  Does the model capture possible side effects, both harmful and beneficial?
      8.  What is the time horizon relevant to the problem?
      9.  Does the model include as endogenous components those factors that may change significantly over the time horizon?
      10. Are people assumed to act rationally and to optimize their performance?
      11. Does the model take non-economic behavior (organizational realities, non-economic motives, political factors, cognitive limitations) into account?
      12. Does the model assume people have perfect information about the future and about the way the system works, or does it take into account the limitations, delays, and errors in acquiring information that plague decision makers in the real world?
      13. Are appropriate time delays, constraints, and possible bottlenecks taken into account?
      14. Is the model robust in the face of extreme variations in input assumptions?
      15. Are the policy recommendations derived from the model sensitive to plausible variations in its assumptions?
      16. Are the results of the model reproducible? Or are they adjusted (add-factored) by the model builder?
      17. Is the model currently operated by the team that built it?
      18. How long does it take for the model team to evaluate a new situation, modify the model, and incorporate new data?
      19. Is the model documented? Is the documentation publicly available?
      20. Can third parties use the model and run their own analyses with it?

      Figure 13: Twenty Questions to Assess Usefulness (Sterman, 1991)


      The following techniques from Sargent (1999) are commonly used for 
validating and verifying the sub-models and overall model.


      “Animation: The model’s operational behavior is displayed graphically as 
the model moves through time. For example, the movements of parts through a 
factory during a simulation are shown graphically.


      Comparison to Other Models: Various results (e.g., outputs) of the simulation model being validated are compared to results of other (valid) models. For example, (1) simple cases of a simulation model may be compared to known results of analytic models, and (2) the simulation model may be compared to other simulation models that have been validated.
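
      A minimal sketch of case (1) in Python, checking a simple queue simulation against the known analytic M/M/1 result for the mean wait in queue (the rates are illustrative):

import random

def simulated_mean_wait(lam, mu, customers=200000, seed=0):
    """Mean waiting time in queue via Lindley's recursion."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(mu)
        gap = rng.expovariate(lam)  # time until the next arrival
        wait = max(wait + service - gap, 0.0)
    return total / customers

lam, mu = 0.5, 1.0
analytic = lam / (mu * (mu - lam))  # M/M/1 mean wait in queue
print("simulated:", round(simulated_mean_wait(lam, mu), 3),
      "analytic:", round(analytic, 3))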


      Degenerate Tests: The degeneracy of the model’s behavior is tested by appropriate selection of values of the input and internal parameters. For example, does the average number in the queue of a single server continue to increase with respect to time when the arrival rate is larger than the service rate?
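
      A sketch of that degenerate test in Python: an event-driven M/M/1 queue with arrival rate greater than service rate, where the number in system should grow roughly linearly with the simulated horizon (rates are illustrative):

import random

def mm1_number_in_system(arrival_rate, service_rate, horizon, seed=0):
    """Event-driven M/M/1 queue; returns the number in system at the horizon."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    next_arrival = rng.expovariate(arrival_rate)
    next_departure = float("inf")
    while t < horizon:
        if next_arrival <= next_departure:
            t = next_arrival
            n += 1
            next_arrival = t + rng.expovariate(arrival_rate)
            if n == 1:  # server was idle; start the first service
                next_departure = t + rng.expovariate(service_rate)
        else:
            t = next_departure
            n -= 1
            next_departure = (t + rng.expovariate(service_rate)
                              if n > 0 else float("inf"))
    return n

# With arrival rate 2.0 > service rate 1.0 the queue should keep growing.
for horizon in (100, 1000, 10000):
    print(horizon, mm1_number_in_system(2.0, 1.0, horizon))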


      Event Validity: The “events” of occurrences of the simulation model are 
compared to those of the real system to determine if they are similar. An 
example of events is deaths in a fire department simulation.


      Extreme Condition Tests: The model structure and output should be plausible for any extreme and unlikely combination of levels of factors in the system; e.g., if in-process inventories are zero, production output should be zero.
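
      Such tests are easy to automate; e.g., a toy production sub-model (entirely hypothetical) checked at the zero-inventory extreme:

def production_output(in_process_inventory, capacity):
    """Units produced this period (toy sub-model for illustration)."""
    return min(in_process_inventory, capacity)

# Extreme condition: zero in-process inventory must give zero output.
assert production_output(0, capacity=50) == 0
# Sanity check of the capacity-limited case as well.
assert production_output(120, capacity=50) == 50
print("extreme-condition tests passed")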


      Face Validity: “Face validity” is asking people knowledgeable about the system whether the model and/or its behavior are reasonable. This technique can be used in determining if the logic in the conceptual model is correct and if a model’s input-output relationships are reasonable.


      Fixed Values: Fixed values (e.g., constants) are used for various model 
input and internal variables and parameters. This should allow the checking of 
model results against easily calculated values.
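
      For example, feeding the queue model constant (rather than random) times makes the expected result easy to compute by hand; a sketch:

def waits_with_fixed_values(service, gap, customers):
    """Lindley's recursion with constant service and inter-arrival times."""
    wait, waits = 0.0, []
    for _ in range(customers):
        waits.append(wait)
        wait = max(wait + service - gap, 0.0)
    return waits

# With a constant 1.0 service time and arrivals every 2.0 time units,
# no customer should ever wait.
assert all(w == 0.0 for w in waits_with_fixed_values(1.0, 2.0, 100))
print("fixed-value check passed")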


      Historical Data Validation: If historical data exist (or if data are collected on a system for building or testing the model), part of the data is used to build the model and the remaining data are used to determine (test) whether the model behaves as the system does. (This testing is conducted by driving the simulation model with either samples from distributions or traces.)
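
      A sketch of this split in Python, where a simple input model (a fitted exponential distribution, standing in for a full simulation) is built on half of the historical record and tested on the held-out half:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
history = rng.exponential(scale=3.0, size=400)  # placeholder historical data
build, test = history[:200], history[200:]

# Build the model on one half of the data...
loc, scale = stats.expon.fit(build, floc=0)
# ...and test whether the held-out half behaves as the model predicts.
ks_stat, p = stats.kstest(test, "expon", args=(loc, scale))
print(f"fitted mean = {scale:.2f}; KS on held-out data: p = {p:.3f}")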


      Historical Methods: The three historical methods of validation are rationalism, empiricism, and positive economics. Rationalism assumes that everyone knows whether the underlying assumptions of a model are true. Logic deductions are used from these assumptions to develop the correct (valid) model. Empiricism requires every assumption and outcome to be empirically validated. Positive economics requires only that the model be able to predict the future and is not concerned with a model’s assumptions or structure (causal relationships or mechanism).


      Internal Validity: Several replications (runs) of a stochastic model are made to determine the amount of (internal) stochastic variability in the model. A high amount of variability (lack of consistency) may cause the model’s results to be questionable and, if typical of the problem entity, may question the appropriateness of the policy or system being investigated.
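
      A sketch of such a replication study in Python; the stand-in model is a noisy throughput estimate, and the coefficient of variation across seeds is the quantity to watch:

import random
import statistics

def noisy_throughput(seed):
    """Stand-in stochastic model: mean throughput over 1000 noisy periods."""
    rng = random.Random(seed)
    return statistics.mean(rng.gauss(50.0, 8.0) for _ in range(1000))

outputs = [noisy_throughput(seed) for seed in range(20)]
m, s = statistics.mean(outputs), statistics.stdev(outputs)
print(f"mean = {m:.2f}, stdev = {s:.3f}, CV = {s / m:.4f}")
# A large CV across replications flags results that deserve caution.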


      Multistage Validation: Naylor and Finger (1967) proposed combining the three historical methods of rationalism, empiricism, and positive economics into a multistage process of validation. This validation method consists of (1) developing the model’s assumptions on theory, observations, general knowledge, and function, (2) validating the model’s assumptions where possible by empirically testing them, and (3) comparing (testing) the input-output relationships of the model to the real system.


      Operational Graphics: Values of various performance measures, e.g., number in queue and percentage of servers busy, are shown graphically as the model moves through time; i.e., the dynamic behaviors of performance indicators are visually displayed as the simulation model moves through time.


      Parameter Variability–Sensitivity Analysis: This technique consists of changing the values of the input and internal parameters of a model to determine the effect upon the model’s behavior and its output. The same relationships should occur in the model as in the real system. Those parameters that are sensitive, i.e., cause significant changes in the model’s behavior or output, should be made sufficiently accurate prior to using the model. (This may require iterations in model development.)
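
      As an illustration, a sensitivity sweep over one parameter of a toy (s, S) inventory model (all parameter values invented), recording the effect on the fill rate:

import random

def fill_rate(demand_sd, demand_mean=20.0, s=40, S=100, periods=5000, seed=0):
    """Fraction of demand filled in a toy (s, S) inventory model."""
    rng = random.Random(seed)
    level, filled, demanded = float(S), 0.0, 0.0
    for _ in range(periods):
        demand = max(rng.gauss(demand_mean, demand_sd), 0.0)
        filled += min(demand, max(level, 0.0))
        demanded += demand
        level -= demand
        if level < s:  # review at period end: order up to S
            level = float(S)
    return filled / demanded

for sd in (2, 5, 10, 20):
    print(f"demand sd {sd:>2}: fill rate = {fill_rate(sd):.3f}")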


      Predictive Validation: The model is used to predict (forecast) the system behavior, and then comparisons are made between the system’s behavior and the model’s forecast to determine if they are the same. The system data may come from an operational system or from experiments performed on the system, e.g., field tests.


      Traces: The behavior of different types of specific entities in the model is traced (followed) through the model to determine if the model’s logic is correct and if the necessary accuracy is obtained.
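
      A sketch of a simple trace facility in Python; the entities and sub-model stages here are invented, and in a real model each stage would be the actual sub-model logic:

import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("trace")

def process_entity(entity_id, stages=("arrive", "queue", "serve", "depart")):
    """Push one tagged entity through each sub-model stage, logging as it goes."""
    t = 0.0
    for stage in stages:
        t += 1.0  # placeholder for the real sub-model's time advance
        log.info("entity %s: %-6s at t=%.1f", entity_id, stage, t)

for entity_id in ("A1", "A2"):  # trace just a couple of tagged entities
    process_entity(entity_id)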


      Turing Tests: People who are knowledgeable about the operations of a 
system are asked if they can discriminate between system and model outputs.”  
(Sargent, 1999)





      References:


Sargent, R. (1999). Validation and Verification of Simulation Models [Electronic Version]. Proceedings of the 1999 Winter Simulation Conference. Retrieved September 10, 2002, from http://www.informs-cs.org/wsc99papers/prog99.html

Schlesinger, et al. (1979). Terminology for Model Credibility. Simulation, 32(3), pp. 103–104.

Sterman, J. (1991). A Skeptic's Guide to Computer Models [Electronic Version]. Retrieved April 5, 2002, from http://sysdyn.mit.edu/sdep/Roadmaps/RM9/D-4101-1.pdf

Sterman, J. (2000). Business Dynamics: Systems Thinking and Modeling for a Complex World. Boston, Massachusetts: Irwin McGraw-Hill.








All the Best,

Mike McDevitt
Senior Analyst & Modeler
CACI Dynamic Systems Inc.
858-695-8220 x1457


                                                                                
                        
From: address@hidden
Sent by: address@hidden
To: address@hidden
Date: 09/02/2003 01:20 PM
Subject: Re: [Swarm-Modelling] comparing models
Please respond to: modelling




Andy Cleary writes:
 > >So, as to your questions about which techniques are best, just pick a
 > >few, do the work, write down the results.  Pick a few more, do the
 > >work, write down the results.  Etc.  If a sizable sampling of
 > >techniques (e.g. 3 statistical, 2 from feature extraction, 1
 > >state-space reconstruction, 2 in signal analysis) all give you a
 > >certain result (e.g. model 1 and model 2 lead to the same
 > >conclusions), then it may be worth pointing that out to some audience.
 >
 > I don't disagree with you, but if you tried selling this as "validation" to
 > people used to *physics*, you would not get very far.
 >
 > Or to make it more concrete, *I* have not gotten very far in the same
 > circumstances...

Can you give a list of the validation techniques you have used and,
perhaps, a breakdown of which ones were mildly successful and which
ones were definitely not successful?

--
glen e. p. ropella              =><=                           Hail Eris!
H: 503.630.4505                              http://www.ropella.net/~gepr
M: 971.219.3846                               http://www.tempusdictum.com





