David Cook on the VV&A process


Dr. David A. Cook is Associate Professor of Computer Science at Stephen F. Austin State University, where he teaches Software Engineering, Modeling and Simulation, and Enterprise Security. Prior to this, he was Senior Research Scientist and Principal Member of the Technical Staff at AEgis Technologies, working as a Verification, Validation, and Accreditation agent supporting the Airborne Laser. Dr. Cook has over 40 years' experience in software development and management. He was an associate professor and department research director at USAF Academy and former deputy department head of Software Professional Development Program at AFIT. He has been a consultant for the Software Technology Support Center, Hill AFB, UT for 19 years.

Dr. Cook has a Ph.D. in Computer Science from Texas A&M University, is a Team Chair for ABET, Past President for the Society for Computer Simulation, International, and Chair of ACM SIGAda.

Luke Muehlhauser: In various articles and talks (e.g. Cook 2006), you’ve discussed the software verification, validation, and accreditation (VV&A) process. Though the general process is used widely, the VV&A term is often used when discussing projects governed by DoD 5000.61. Can you explain to whom DoD 5000.61 applies, and how it is used in practice?


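David Cook: DOD 5000.61 applies to all Department of Defense activities involving modeling and simulation. For all practical purposes, it applies to all models and simulations that are used by the DOD. This means it also applies to all models and simulations created by civilian contractors that are used for DOD purposes.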

The purpose of the directive, aside from specifying who is the “accreditation authority” (more on this later), is to require Verification and Validation for all models and simulations, and then also to require that each model and simulation be accredited for its intended use. This is the critical part, as verification and validation has almost universally been a part of software development within the DOD. Verification asks the question “Are we building the system in a quality manner?”, or “are we building the system right?”. Verification, in a model (and the resulting execution of the model providing a simulation) goes a bit further – and asks the question “Do the model as built and the results of the simulation actually represent the conceptual design and specifications of the system we built?” The difference is that in a model and simulation, you have to show that your design and specifications of the system you envision are correctly translated into code, and that the data provided to the code also matches specification.

Validation asks the question “are we building a system that meets the users’ actual needs?”, or “are we building the right system?” Again, the validation of a model and resulting simulation is a bit more complex than non-M&S “validation”. In modeling and simulation, validation has to show that the model and the simulation both accurately represent the “real world” from the perspective of the intended use.

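Both of these activities are extremely difficult when you are building models and providing simulation results for notional systems that might not actually exist in the real world. For example, it would be difficult to provide V&V for a manned Mars mission, because, in the real world, there is not a manned Mars lander yet! Therefore, for notional systems, V&V might require estimation and guessing. However, guessing and estimation might be the best you can do!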

5000.61 further requires that there be an ultimate authority, the “accreditation authority”, that is willing to say “based on the Verification and Validation performed on this model, I certify that it provides answers that are acceptable for its intended use”. Again, if you are building a notional system, this requires experts to say “These are guesses, but they are the best guesses available, and the system is as close a model to the real world as possible. We accredit this system to provide simulation results that are acceptable.” If, for example, an accredited simulation shows that a new proposed airplane would be able to carry 100,000 pounds of payload – but the resulting airplane, once built, can only carry 5,000 pounds – the accreditation authority would certainly bear some of the blame for the problem.

In practice, there are processes for providing VV&A. Military Standard 3022 provides a standard template for recording VV&A activities, and many DOD agencies have their own VV&A repository where common models and simulation VV&A artifacts (and associated documentation) are kept.

There are literally hundreds of ways to verify and validate a model (and its associated simulation execution). The V&V “agents” (who have been tasked with performing V&V) provide a recommendation to the Accreditation Authority, listing what are acceptable uses, and (the critical part) the limits of the model and simulation. For example, a model and simulation might provide an accurate representation of the propagation of a laser beam (in the upper atmosphere) during daylight hours, but not be a valid simulation at night, due to temperature-related atmospheric propagation. The same model and simulation might be a valid predictor of a laser bouncing off of a “flat surface”, but not bouncing off of uneven terrain.
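One way to picture a recommendation that lists acceptable uses and limits is as a small record that a simulation run is checked against before its results are trusted. The sketch below is only an illustration: the `Accreditation` class, its field names, and the condition labels are hypothetical and do not come from DoD 5000.61 or Military Standard 3022.

```python
from dataclasses import dataclass, field


@dataclass
class Accreditation:
    """Hypothetical record of what a model is accredited for (illustrative only)."""
    model_name: str
    intended_uses: set
    # Maps a condition name to the set of values the accreditation covers.
    limitations: dict = field(default_factory=dict)

    def covers(self, use, conditions):
        """Return True only if the use is listed and every limited condition is within bounds.

        A condition that the caller does not specify is treated as outside the
        accredited envelope, which is the conservative choice.
        """
        if use not in self.intended_uses:
            return False
        for name, allowed in self.limitations.items():
            if conditions.get(name) not in allowed:
                return False
        return True


# The laser example from the text, with made-up labels.
laser = Accreditation(
    model_name="laser_propagation",
    intended_uses={"upper-atmosphere propagation"},
    limitations={"time_of_day": {"day"}, "surface": {"flat"}},
)

day_run = {"time_of_day": "day", "surface": "flat"}
night_run = {"time_of_day": "night", "surface": "flat"}

print(laser.covers("upper-atmosphere propagation", day_run))    # True
print(laser.covers("upper-atmosphere propagation", night_run))  # False
```

The point of the sketch is that the limitations are part of the accreditation itself, so a result produced outside them is simply not an accredited result.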


Luke: Roughly how many accreditation authorities are there for such projects? Do accreditation authorities tend to specialize in accrediting V&V in certain domains — e.g. some for computer software, some for airplanes, etc.? Are there accreditation authorities that the DoD doesn’t recognize as “legitimate”?


David: Accreditation authorities are simply the component heads who sign a letter saying “Model and Simulation X is approved for the following purposes”. The letter then states what the intended uses are, lists any special conditions, and lists the limitations of the model and the simulation. The accreditation authority is more of a position rather than a person. It can be a person (usually the head of the organization), or a committee.

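Each DOD agency is responsible for the models and simulations it develops or uses. It must either perform VV&A on its own models and simulations, or use models and simulations (from a trusted source) that have already had their own VV&A performed. Note, however, that each DOD agency might have its own data, which must also be accredited. Each project has its own M&S, and probably has domain experts who perform the VV&A. Each project might be delegated the authority to perform its own VV&A.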

There are no non-legitimate accreditation authorities per se; accreditation authorities are not authorized based on knowledge, simply on position. However it is assumed that each M&S area has domain experts who have the specialized knowledge in the application area to perform reliable VV&A. These domain experts span many areas – application domain experts (who might, for example, be an expert on a laser beam), coding domain experts (who can verify that the code is a good representation of the requirements), data domain experts (who verify that the targeting data represents a valid target), and perhaps many others. Typically, each project has a VV&A team or “agent” who perform the V&V, and recommend accreditation (usually in a formal letter) that restates the intended uses and limitations. The recommendation includes all associated artifacts, such as test results, reviews, reports of individual Verification and Validation activities, other models and simulation used to compare against, real-life data (to show validity), and possibly many other items.

If a particular DOD agency uses a model and simulation in its exercises, it is responsible for VV&A of its own M&S. If, on the other hand, an allied agency is using a model that includes artifacts from another agency – the outside agency is responsible for working to make sure that the model, simulation and data representing them are valid. In essence, each DOD agency is responsible to every other DOD component to ensure that their forces and capabilities are appropriately represented to all outside agencies utilizing models and simulation that involve them.


Luke: Are there historical cases in which a model or simulation completed the VV&A process, but then failed to perform as it should have given the accreditation, and then the accreditation authority was held to account for their failure? If so, could you give an example? (Preferably but not necessarily in software.)


David: Because I worked as a consultant on many modeling and simulation projects, I am ethically prevented from discussing actual failures that I know about – mainly because most of the projects I worked on were classified, and I signed non-disclosure agreements.

However, by shifting into hypothetical scenarios, there are several stories that I can use that best illustrate this. One is a story taught in many simulation classes – and I only have secondhand knowledge of it. The other two are anecdotal – but good lessons!

In the first instance, a model was used which predicted “safe runway distance” for an airplane. Feed into the model the weight, altitude, temperature, and humidity, and run the simulation to predict how much runway was needed.

Unfortunately, the day the model was used, it took several hours for the airplane to actually take off. It had a bit more fuel than estimated – adding weight. By takeoff time, the temperature had risen, giving “stickier” tires and runway, and decreasing air density (giving less lift). Also – the humidity had changed, also affecting lift characteristics.

The model did not have a large allowance for error (it tried to give a relatively precise answer) – and with all the factors changing, the airplane went from “enough runway” to “marginal” after the simulation had been run. Combined with a relatively inexperienced pilot (who did not advance the throttle fast enough) – and the airplane overshot the end of the runway. Not much damage (other than a bruised ego) – but the simulation – while accurate, was not used properly.

The other two stories are certainly imaginary – but are passed around like legend in our field. In the first story, in the early days of the Airborne Laser, a very simple model was used to predict laser propagation. Code was reused to model the laser – basically, code for a missile, with the speed of the missile increased to the speed of light. The targeting acquisition, target recognition, etc. were all similar, and once fired, the simulation would show if the target was hit. That is, until the first time they ran it: halfway to the target, the laser beam “ran out of fuel” and fell into the sea.

The second (certainly imaginary) story involves modeling a battle scenario for the Australian Air Force – using helicopters. One of the problems with landing a helicopter was making sure it had a clear landing field – and kangaroos were a problem. So – the developers of the battle simulation, who used object-oriented development, took some code which was basically used to model a ground soldier and modified its behavior to “run at the sound of helicopters”. They then changed its appearance on the simulation to show a small image of a kangaroo. When the model was executed, the simulation showed the kangaroos running away from the helicopter. Until it landed, and then the kangaroos reversed direction, and attacked the helicopter with rifles!

Ok – the last two examples are cute and funny, but show the problems with invalid assumptions and imperfect data.

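I leave this question with a quote that I have always used in my M&S classes: “All models are inaccurate. Some models are useful anyway.” It is very difficult to model the real world entirely inside code. I don't care how well you model a hot tub in a simulation; it is not really a hot tub.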

There are always things you do not consider, data that is not perfect, or constraints that you miss. A model is a best-guess approximation of what will happen in the “real world” – but it is NOT the “real world”. All models have limitations. In spite of that, the resulting model and the simulation still give useful data. The accreditation authority is simply acknowledging that the model and simulation are useful “for their intended use”, and that “limitations exist”. No reputable modeling and simulation expert (nor any accreditation authority) trusts a single model and its resulting simulation to produce data that is used in life-or-death decisions. Multiple sources of validity are required, multiple independently-developed models and simulations are used, and domain experts are consulted to see if the results “feel right”. And tolerances must always be given. An aircraft might encounter a puddle of water when trying to take off. It might hit a bird. Both of these decrease speed, requiring longer takeoff distance. It’s hard to model unforeseen circumstances. If you include a “fudge factor” – how much “fudge factor” is correct? Before an accreditation authority accepts a model or simulation as reliable, many, many steps must be taken to make sure that it produces credible results, and just as many steps must be taken to make sure that the limitations of the model and simulation are listed and observed before accepting the result of the simulation as valid.
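The runway story and the point about tolerances can be made concrete with a small sketch. Everything here is assumed for illustration: `required_runway_m` is a toy stand-in formula rather than real aircraft performance math, and the 15% margin, the weights, and the temperatures are made-up numbers. The only thing it tries to show is how a result that was “enough” at planning time can become “marginal” once the inputs drift.

```python
def required_runway_m(weight_kg, temp_c, base_m=1500.0):
    """Toy stand-in model (NOT real aerodynamics).

    Required distance grows with the square of weight and by 1% per degree
    Celsius above 15 C. Both relationships are assumptions for illustration.
    """
    weight_factor = (weight_kg / 60000.0) ** 2
    temp_factor = 1.0 + 0.01 * (temp_c - 15.0)
    return base_m * weight_factor * temp_factor


def verdict(available_m, required_m, margin=0.15):
    """Classify a runway against the model output plus a tolerance margin."""
    if available_m >= required_m * (1.0 + margin):
        return "enough"
    if available_m >= required_m:
        return "marginal"
    return "insufficient"


runway_m = 1800.0

# Inputs when the simulation was run.
planned = required_runway_m(weight_kg=60000.0, temp_c=15.0)
# Inputs at actual takeoff: extra fuel and a hotter afternoon.
actual = required_runway_m(weight_kg=62000.0, temp_c=25.0)

print(verdict(runway_m, planned))  # enough
print(verdict(runway_m, actual))   # marginal
```

How large the margin should be is exactly the “fudge factor” question raised above; the sketch does not answer it, it only shows why the model has to be rerun when its inputs are no longer the ones it was given.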


Luke: How did the VV&A process develop at DoD? When did it develop? Presumably it developed in one or more domains first, and then spread to become a more universal expectation?


David: Interesting question. Before we can discuss VV&A, we have to take a slight detour through the history of M&S. And I need to tie several threads of thought together.

VV&A is, of course, tied to the use of models and simulations. To be honest, the VV&A of models goes back to the Civil War (and probably earlier) – when mathematical models were used to predict firing data (given desired range, here was the amount of powder and elevation required). Obviously – the models needed a lot of V&V. However, all it took to V&V the model was to load a cannon and fire it. Not a complex process. The accreditation part was implicit – the Secretary of War used to “authorize” the data to be printed. To really need VV&A, however, complex simulations were needed – and it took computing power to achieve complex M&S.

Over the years, modeling became more and more important, as models and simulation were used for problems that could not easily be solved by traditional mathematical methods. To quote from the Wikipedia article,

Computer simulation developed hand-in-hand with the rapid growth of the computer, following its first large-scale deployment during the Manhattan Project in World War II to model the process of nuclear detonation. It was a simulation of 12 hard spheres using a Monte Carlo algorithm. Computer simulation is often used as an adjunct to, or substitute for, modeling systems for which simple closed form analytic solutions are not possible. There are many types of computer simulations; their common feature is the attempt to generate a sample of representative scenarios for a model in which a complete enumeration of all possible states of the model would be prohibitive or impossible.

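VV&A did not really become a serious issue until models and simulations were computerized, and computers were not available until the late 1940s. Starting in the late 1940s, both digital and analog computers became available. However, few (if any) engineers were trained in how to use this newly developed computing power. There are many stories about how modeling and simulation became a strong force in the DOD, but the one I have personal knowledge of is the story of John McLeod. […] Los Angeles. John was an innovator who worked with analog computers and simulation in the early 1950s, and he took delivery of a new analog computer sometime in 1952. Others faced similar problems, and some of them decided to meet as an informal user group to exchange ideas and experiences. To make a long story short, John helped found what became the Society for Computer Simulation (SCS). Over the years, members of this organization have been leaders and innovators in the fields of modeling, simulation, and VV&A. [Note that I had the honor of serving as SCS President from 2011 to 2012, so I am a bit biased.] The SCS has, to this day, the McLeod award to commemorate the advances John McLeod made in the M&S arena. It is only awarded to those that have made significant contributions to the profession.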

The SCS published newsletters. M&S conferences were organized. Leaders in the field were able to meet, publish, and share their expertise. All of which helped integrate M&S into more and more domains. As a result of leaders in the field being able to share M&S information, and also as a result of the huge increase in capabilities and availability of computers to run M&S, the need for VV&A also increased. Over the years, modeling and simulation became more and more important in many domains within the DOD. It helped develop fighters (in fact, aircraft of all types). It helped train our astronauts to land on the moon. It modeled the space shuttle. Complex models and simulations helped us model ballistic missile defense, fight war games with minimal expense (and no lives lost!), and design complex weapon systems. In fact, it’s hard to imagine any technologically sophisticated domain that does not use M&S to save money, save time, and ensure safety. But – these increasingly complex models needed verification and validation, and frequently accreditation.

So – the proliferation in the use of M&S led to an increased need for VV&A. M&S became so complex that VV&A could not be accomplished without “domain experts” – usually referred to as “Subject Matter Experts” (SMEs) – to help. Increased complexity of the M&S led to increased complexity of the VV&A. Various elements within the DOD were performing VV&A on their own, with little official coordination. To leverage the experience of various DOD components and multiple domains, the DOD saw the need for a single point of coordination. As a result, in the 1990s, the DOD formed the Defense Modeling and Simulation Office (DMSO). The DMSO served as a single point of coordination for all M&S (and VV&A) efforts within the DOD. One of the best DMSO contributions was the VV&A Recommended Practices Guide (VV&A RPG) – first published in 1996. The guide has been updated several times over the years, reflecting the increased importance of VV&A in the DOD. In 2008 DMSO was renamed the Modeling and Simulation Coordination Office. The MSCO web site (and the latest version of the VV&A Recommended Practices Guide) can be found at msco.mil

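For those interested in M&S and VV&A, I cannot recommend the MSCO resources enough. They cost nothing (there is not even an email registration) and contain a wealth of information about M&S and VV&A. The RPG Key Concepts document alone contains 34 pages of critical “background” information that you should read before going any further in VV&A.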


Luke: In Cook (2006) you write that one of the reasons V&V is so difficult comes from “having to ‘backtrack’ and fill in blanks long after development.” What does this mean? Can you give an example?


David: Let’s imagine you are designing a new fighter aircraft. It is still on the drawing board, and only plans exist.

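Rather than spend money building an actual prototype first, you develop mathematical models of the jet to help verify performance characteristics. You might actually build a very small physical model, perhaps 1/10th size, for wind tunnel experiments.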

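You also build computer-based models and execute them to estimate flight characteristics. The wind tunnel experience (even though only on a 1/10th size model) will provide data that might cause you to modify or change the computer-based model. This feedback loop consists of “build model – run simulation – examine data – adjust model”, and repeat.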

Eventually, you build a working prototype of the jet. Almost certainly, the actual flight characteristics will not exactly match the computer-based model. The prototype is “real world” – so you have to readjust the computer-based model. The “real-world” prototype is just a prototype – and probably not used for high-speed fighting and turns – but the basic data gathered from the flying of the prototype leads to changes in the computer-based model, which will now be used to predict more about high-speed maneuvering.

Back when I worked on the Airborne Laser – we had models that predicted the laser performance before the laser was actually built or fired! The models were based on mathematical principles, on data from other lasers, and from simpler, earlier models that were being improved on. Once a working Airborne Laser was built and fired – we had “real world” data. It was no surprise to find out that the actual characteristics of the laser beam were slightly different than those predicted by the models. For one thing, the models were simplistic – it was impossible to take everything into account. The result was that we took the real-world data, and modified the computer models to permit them to better predict future performance.

The bottom line is that the model is never finished. Every time you have additional data from the “real world” that is not an exact match to what the model predicts, the model should be examined, and the model adjusted as necessary.
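The “build model, run simulation, examine data, adjust model” cycle can be sketched in a few lines. This is a deliberately tiny illustration under stated assumptions: the model is a hypothetical one-parameter relation `y = k * x`, the 2% tolerance is arbitrary, and the re-fit is an ordinary least-squares scale, none of which comes from the Airborne Laser work described above.

```python
def fit_scale(inputs, observed):
    """Least-squares estimate of k for the toy model y = k * x."""
    numerator = sum(x * y for x, y in zip(inputs, observed))
    denominator = sum(x * x for x in inputs)
    return numerator / denominator


class ToyModel:
    """Hypothetical one-parameter model that is re-fit as real-world data arrives."""

    def __init__(self, k):
        self.k = k
        self.inputs = []
        self.observed = []

    def predict(self, x):
        return self.k * x

    def absorb(self, x, y, tol=0.02):
        """Record a real-world observation; adjust the model if it missed by more than tol.

        Returns True when the model was adjusted.
        """
        self.inputs.append(x)
        self.observed.append(y)
        if abs(self.predict(x) - y) > tol * abs(y):
            self.k = fit_scale(self.inputs, self.observed)
            return True
        return False


model = ToyModel(k=2.0)
print(model.absorb(1.0, 2.01))  # False: prediction is within tolerance
print(model.absorb(2.0, 4.4))   # True: prediction missed, model re-fit
print(round(model.k, 3))        # 2.162
```

A real model would have many parameters and a far more careful adjustment step, but the shape of the loop is the same: each new observation either confirms the model or triggers a revision.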

There are two terms I like to use for models when it comes to VV&A – “anchoring” and “benchmarking”. If I can get another independently-developed model to predict the same events as my model, I have a source of validation. I refer to this as benchmarking. Subject matter experts, other simulations, similar events that lend credence to your model – all improve the validity, and provide benchmarking. Anchoring, on the other hand, is when I tie my model directly to real-world data.

As long as the model is being used to predict behavior – it needs to continually be tied or anchored to real-world performance, if possible. If no real-world data is available, then similar models, expert opinions, etc. can be used to also increase the validity.
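The distinction between anchoring and benchmarking can be expressed as two comparisons against different references. The sketch below is an assumption-laden illustration: the function names, the 5% tolerance, and the sample numbers are all made up, and a maximum relative error is just one of many possible agreement measures. Reference values are assumed to be nonzero.

```python
def max_relative_error(predicted, reference):
    """Largest relative difference between paired values (reference must be nonzero)."""
    return max(abs(p - r) / abs(r) for p, r in zip(predicted, reference))


def validity_evidence(model_out, real_world=None, other_model=None, tol=0.05):
    """Report which sources of validity support the model output.

    "anchored"    : agreement with real-world measurements.
    "benchmarked" : agreement with an independently developed model.
    A key is present only if that reference was supplied.
    """
    evidence = {}
    if real_world is not None:
        evidence["anchored"] = max_relative_error(model_out, real_world) <= tol
    if other_model is not None:
        evidence["benchmarked"] = max_relative_error(model_out, other_model) <= tol
    return evidence


my_model = [10.0, 20.0, 30.0]
measured = [10.2, 19.5, 30.6]      # real-world data (made up)
independent = [10.0, 22.5, 30.0]   # another model's predictions (made up)

print(validity_evidence(my_model, real_world=measured, other_model=independent))
# {'anchored': True, 'benchmarked': False}
```

When no real-world data exists, only the benchmarking entry can be filled in, which matches the point that similar models and expert opinion are then the available sources of validity.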

Just a final note. Models can become so ingrained in thoughts that they become “real”. For example, I remember when the recent Star Trek movie (the 2009 version) came out. A friend of mine said, after viewing the movie, that he had trouble with the bridge of the USS Enterprise. It did not “look real”. I asked what “real” was – and my friend replied “You know, like the REAL USS Enterprise, the NCC 1701” (referring to the original series). Think about it – all are notional and imaginary (sorry, fellow Trekkers) – yet he viewed one as “real” and the other as inaccurate. Models – when no real-world artifact exists – have the potential to become “real” in your mind. It’s worth remembering that a model is NOT real, but only an artifact built to resemble or predict what might (or might not) eventually become real one day.


Luke: Do you have a sense of how common formal verification is for software used in DoD applications? Is formal verification of one kind or another required for certain kinds of software projects? (Presumably, DoD also uses much software that is not amenable to formal methods.)


David: I have not worked on any project that uses formal V&V methods.

I used to teach the basics of formal methods (using ‘Z’ – pronounced Zed) – but it is very time consuming, and not really fit for a lot of projects.

Formal notation shows the correctness of the algorithm from a mathematical standpoint. For modeling and simulation, however, formal methods do not necessarily help you with accreditation – because the formal methods check the correctness of the code, and not necessarily the correlation of the code with real-world data.

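I have heard that some very critical applications (such as reactor code, and code for Mars landers) use formal methods to make sure that the code is correct. However, formal methods take a lot of training and education to use correctly, and also consume a lot of time in actual use. Formal methods seldom (never?) speed up the process; they are used strictly to validate the code.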

From my experience, I have not worked on any project that made any significant use of formal methods – and in fact, I do not have any colleagues that have used formal methods, either.


Luke: Thanks, David!