On “expected results” and the perils of overmodelling

June 13, 2014 · Posted in Command

Recently there was an interesting discussion on the Matrix forum about various details of air combat modelling in Command. One of the residents enquired about the modelling of lopsided victories such as the Bekaa Valley:

I do not have the nice data set prints as above, but I notice much the same sort of issue. E.G., play the Battle of the First Salvo scenario – and just watch the Syria/Israel air combat. Historically, in that time frame, the Israeli side routinely wipes up the Syrian Air Force – but in the simulation the Israelis lose several Eagles every time I run the scenario. Just struck me the last time I ran it that the on screen results should at least be near the ball park of real results.

This was a fair point, and prompted some reflection on the modelling priorities of Command as a simulation:

…and right there lies one of the biggest pitfalls for generic (i.e. not battle/theater-focused) wargame/simulation engines: over-specializing in order to closely match a specific historical result, at the expense of everything else.

Let’s take another similar example to make the peril more obvious. Say we’re developing a tactical land combat engine (think Steel Panthers), and our first litmus test is 73 Easting. We do some trial runs, and although US/UK forces trounce the Iraqi units as in RL, allied losses are usually higher than the minimal ones historically attained. This does not please our target audience (mil/gov or consumers, depending on the game) so we go back and endlessly tweak both data and algorithms until the game consistently recreates the absolute wipe-out of the real battle. Great job! Or we think so.

Now we take that engine and data and go to our next test, a WW3-CentFront scenario. We run it and NATO forces effortlessly shrug off massive Soviet attacks. Oops.

What happened? We overspecialized for 73 Easting’s kill scores.

A simulation does not need to consistently recreate (in terms of bean-counting) a specific historical outcome in order to be realistic. The outcome must certainly be one of the possible results (otherwise you have a real problem) but the true litmus test is whether the game results are often close to the historical outcome. If Iraqi forces are rolling all over allied armor in 73 Easting then definitely something is wrong. If allied forces are dominating but taking losses here and there, it is probably realistic enough.
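To make the "one of the possible results" test concrete, here is a minimal Python sketch. It is a toy model and not how Command resolves combat: the engagement count, the per-engagement loss probability and the historical loss figure are all made-up assumptions chosen only for illustration. It runs the same scenario many times and checks two things: whether the historical loss count ever occurs, and how large a share of the runs do at least as well.

```python
import random
from collections import Counter


def simulate_losses(n_engagements, p_loss, rng):
    """Toy model: each engagement independently costs the favored side
    one unit with probability p_loss (an assumed, illustrative value)."""
    return sum(rng.random() < p_loss for _ in range(n_engagements))


def loss_distribution(n_runs, n_engagements, p_loss, seed=0):
    """Run the toy scenario n_runs times and tally the loss counts."""
    rng = random.Random(seed)  # fixed seed so the tally is repeatable
    return Counter(
        simulate_losses(n_engagements, p_loss, rng) for _ in range(n_runs)
    )


def share_at_or_below(dist, threshold):
    """Fraction of runs whose losses were no higher than threshold."""
    total = sum(dist.values())
    return sum(count for losses, count in dist.items() if losses <= threshold) / total


# Hypothetical numbers: 40 engagements, 5% chance of a loss in each,
# and a historical result of zero losses for the favored side.
dist = loss_distribution(n_runs=10_000, n_engagements=40, p_loss=0.05)
historical_losses = 0

possible = dist[historical_losses] > 0
share = share_at_or_below(dist, historical_losses)
mean_losses = sum(losses * count for losses, count in dist.items()) / sum(dist.values())

print(f"historical outcome reproducible: {possible}")
print(f"share of runs at or below historical losses: {share:.1%}")
print(f"mean simulated losses: {mean_losses:.2f}")
```

With these assumed inputs the clean sweep shows up in a minority of runs while the typical run costs the favored side a couple of units. That is the pattern described above: the historical result is reachable, the favored side still dominates, and nothing needs retuning.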

Let’s say you’re simulating the first night of Desert Storm. If the game flows pretty much like the historical route, the bulk of the Iraqi IADS should be neutralized; that right there is your authenticity criterion. If a few F-117s are lost, does this mean there is a problem in the sim? It is possible (and you have to check the logs to determine that), but a more likely explanation is that the RL Nighthawk crews were simply very lucky. (Many F-117 pilots have openly stated exactly that, and the campaign planners in fact expected several losses.) Can you tweak the models & data so that no F-117 ever gets lost? Sure. But then you’ll probably never be able to recreate the Kosovo shootdown (and the second one that limped back to Aviano and was written off after crash-landing). Overspecializing again.

(As an aside, this is one of the reasons many wargame/sim designers prefer their releases to focus on a single battle or theater at a time. If all you have to worry about is 73 Easting or Medina Ridge or Desert Storm in general, you’re free to massage your models and data to consistently recreate historical outcomes. Fulda Gap? Simply re-tweak the engine for that on the next release.)

So, back to Bekaa. Like 73 Easting this is an unusual mismatch that tempts you to tweak your engine to closely match it, but we’ve seen the dangers of doing that. Let’s break down the factors that enabled the IAF to dominate these engagements:

* Strong sensor jamming, particularly standoff radar jamming. Command supports this already.

* Very strong comms jamming, which effectively forced Syrian aircraft to rely almost exclusively on their own (usually inadequate) sensors instead of sharing a common tactical picture with the Syrian IADS. You can _sort of_ simulate this in Command right now by not having the big Syrian EW/GCI radars present (the end-user effect of not receiving cues from your IADS is the same one as the IADS not being there in the first place) but you still have the limitation of Syrian fighters freely communicating among themselves. We plan to model comms jamming better in the future.

* Sub-optimal placement of Syrian IADS elements (see here for elaboration: http://www.ausairpower.net/APA-SAM-Effectiveness.html). You can do this in Command already.

* Vast differences in pilot proficiency. Command supports this, but still allows a rookie pilot to pull the same tactical and evasive maneuvers as an ace (albeit with reduced evasion benefit). This is crucial, because in Bekaa many Syrian aircraft were attacked and destroyed while literally flying straight and level. Now, could we modify the code so that “advanced” evasion techniques are available only to highly proficient crews? We could, and fairly easily so, but this would then create a very strong incentive for a player handling the Syrian side to micromanage (e.g. “my pilots are not beaming by themselves so I’ll do it manually for them”). This runs contrary to one of our chief design tenets of Command: the player should not have to micromanage to win. So it’s a bind.

* Substantial differences in pilot visibility. This doesn’t matter much if a fighter has good onboard sensors and solid communication with the IADS, but when (a) your sensors are crap and/or jammed, (b) you are cut off from anyone else on your side because your comms are solid static and (c) your limited side or rear visibility prevents you from seeing the IAF fighter coming up on you under his own perfectly working IADS guidance, you end up exactly in the situation of being attacked and killed while flying straight and level. (As Coiler correctly pointed out, most IAF kills were the result of surprise slashing hit-and-run attacks, not artful dogfights.) Command currently assumes a JSF-like 360-deg visibility for all aircraft and it is definitely something we want to improve (our DB master cries for mercy).

* Superior hardware and weapons (particularly AIM-9L vs AA-2). Nothing much to say here; Command of course already models this.

* Superior grand-tactical/operational management of air assets. This is pretty much what Command expects of you, as the player, to achieve: maneuvering your aircraft to get in optimal engagement positions while preventing the enemy from doing the same, selecting the most suitable weapons for the task, reacting on-the-fly to changes in the tactical situation etc. The AI does a reasonable job at this (most of the time) but human intuition is still hard to beat, and AI vs AI clashes cannot reproduce the imbalance of skill that characterized Bekaa.

So, as you can see there is a whole range of factors that have to be taken into account when considering mismatches like Bekaa, and very few of them have to do with the weapon endgame interactions that the OP was inquiring about.

Another user chipped in:

One thing that hasn’t been noted is that you need to be careful when comparing to real world results because the real world results aren’t necessarily the “average.” To use the 73 Easting example, we don’t know whether the one-sided results were unavoidable due to the circumstances, or if the real world results were actually an outlier. We can run the sim 100 times to see what the average results are for the sim. We don’t have that option with the real world.

I’ve seen this problem noted in a game that included the Battle of Midway. In the game, the Japanese usually did better than they did historically. After much studying, our conclusion was that if you fought the real battle a number of times, the historical results were pretty much the best result that was possible. The game was likely giving more realistic results than the actual battle did.

Definitely an interesting discussion.

UPDATE: After this discussion, both the more realistic pilot visibility and comms-disruption were added as features in Command, enabling for the first time a truly realistic recreation of the Bekaa airbattle & tactics in the “Shifting Sands” DLC.
