From Archive to OOB, Part II
Part I of this series lived in the world of uncertainty. We discussed Bronze Age warfare at Kadesh, talked about Caesar and Belisarius, "the Last Roman", and covered the era of Musket and Pike as well as the Napoleonic Wars and the American Civil War. These are eras where sources are thin, rhetorical, contradictory, or written for audiences that cared more about meaning than measurement. In that space, OOB work is disciplined reconstruction: building something defensible from fragments, judging credibility, and saying plainly what you’re inferring.
World War II flips the problem. Here, the archives can feel bottomless: daily strength states, equipment counts, TO&Es, war diaries, maps, signals logs, after-action summaries. But abundance creates its own traps. The questions change from “do we have enough information?” to “what exactly is this document counting?” and “why does the other side’s record disagree?” Modern OOB research is less about filling empty spaces and more about reconciling definitions, correcting inherited narratives, and refusing to let myths do the counting for you.
With that in mind, Part II deals with a classic example—Kursk—then widens out to the other half of the modern OOB problem: translating mountains of data into something the engine can actually use at the level the player experiences.
World War I – The Austro-Hungarian and Russian OOB problem
But before we come to Kursk, let us make a small detour and talk about the "War to End All Wars", especially its Eastern Front.
World War I sits in an awkward but useful middle ground for OOB work. The archives are far richer than in earlier periods, but they don’t yet have the industrial, standardized reporting culture you get in 1943. The defining feature—attrition—constantly erodes the connection between “paper” organization and battlefield capability. A division can be present on the map and still fight like a reinforced brigade; a regiment can exist as a name and a flag while its companies are composites, its specialists stripped out, and its effective strength living somewhere between “present” and “available.” If WWII tempts you to treat numbers as truth, WWI tempts you to treat organizational labels as stable. They often aren’t.

Austro-Hungarian officers circa 1914 (Unknown / Public Domain)
On the Eastern Front, those problems show up in their most demanding form, and the Austro-Hungarian and Russian armies make a useful case study. This is not because they “kept worse records,” but because they are a stress-test for what makes modern-era OOB work difficult: huge forces, multilingual bureaucracies, uneven survival of sources by theater, and constant improvisation under pressure. Both armies produced extensive documentation, but it is often fragmented by region and administrative layer, and repeatedly disrupted by rapid reorganization—exactly the conditions where a tidy OOB can be technically correct and still misrepresent battlefield capability. They also sit under a thicker layer of inherited simplification than the Western Front “standard model,” where public memory and many familiar narratives encourage a cleaner, more uniform picture than the war actually delivered.
For the Austro-Hungarian Army, that complexity is amplified by structure. You are not only tracking divisions and corps—you are tracking which administrative system a unit belongs to, how replacements flow, and how temporary groupings form as losses and emergencies reshape the front. In practice, the order of battle you want for a scenario is frequently a snapshot of a temporary fighting group rather than a clean reflection of a prewar table. The same formation can look “complete” in hierarchy while being hollowed out in its infantry companies, unusually MG-heavy, or artillery-weighted because attachments and pooled assets have shifted where the real power sits.

Washington Times Headline from July 28, 1914 (Public Domain)
For the Russian Imperial Army, the hierarchy can be clear on paper, while the reporting categories and the pace of operational change make reconciliation difficult across sources. Mobilization waves, refits, and reorganizations produce units that are “there” in the chain of command but not necessarily in the condition implied by their designation. Strength returns may distinguish “on hand,” “present,” and “combat-ready” in ways that don’t map neatly onto each other—or onto a game engine. Because the Eastern Front often combines rapid movement with sudden shortages and frequent reconstitution, a unit list can be correct and still generate the wrong battle if you don’t also model condition.
The practical lesson is familiar by now: don’t let tidy organizational names do the counting for you. Cross-check hierarchy against what the sources imply about capability—rifle strength, MG density, artillery attachments, cavalry presence, and the cumulative fatigue of marches and losses. If the documents disagree, the answer is usually not “split the difference,” but “identify what each document is actually describing.” WWI rewards the same detective discipline as WWII—just applied to a world where erosion, improvisation, and administrative complexity are the norm rather than the exception.
Modern Battles and Myths – The World War II Example
If any era invites the comforting idea that OOB work is just transcription, it’s World War II. By the 1940s, armies generated paperwork at an industrial scale: strength states, equipment returns, daily situation reports, unit diaries, organizational tables, and an endless trail of attachments and amendments. The catch is that the paper is not a single, neutral truth. It’s a series of snapshots taken for different purposes, using different definitions, under different pressures. The numbers exist—but the detective work is figuring out what those numbers actually mean.
Kursk (1943) makes the point better than almost anything else because it sits at the crossroads of documentation and myth. For decades, it was packaged as a single, decisive cataclysm: the “largest tank battle in history,” the moment German armor shattered against Soviet defenses, the point where the outcome became inevitable. The legend is familiar, repeatable, and satisfying. It is also exactly the kind of inherited story that OOB research has to interrogate rather than assume.
The Myth of Kursk
The popular version leaned on enormous figures: “thousands of tanks” across the campaign, and Prokhorovka on 12 July as an almost cinematic collision of massed armor—charges at point-blank range, heaps of burning wrecks, German losses supposedly in the hundreds. In that telling, the numbers do more than describe. They deliver the moral: the Germans are broken, the Soviets win by weight and will, and Kursk becomes a mechanized Stalingrad.

Burning Tiger I at Kursk, July 1943 (Deutsches Historisches Museum, Berlin Inv.-Nr.: F 60/2011 / Public Domain)
The story endured because it was useful. It made sense of appalling Soviet losses. It protected reputations. It fit the expectations of wartime reporting and postwar prestige. And for decades, it was hard to challenge in detail, because the easiest numbers to audit—daily operational states and repair returns—were not available equally to everyone.
What the documents actually support
Once unit-level records from both sides became accessible and comparable, the picture narrowed and sharpened. Studies built from strength states, operational logs, and repair-and-recovery data put Prokhorovka well below the classic legend. Using a tight definition of the engagement, the German II SS Panzer Corps had roughly 294 tanks operational on 12 July, while the Soviet 5th Guards Tank Army fielded about 616 tanks and self-propelled guns. Broaden the frame to include neighboring formations, and you get higher totals, but still something on the order of 429 German versus 870 Soviet armored vehicles: around 1,300 in total, not the “1,500 tanks” per side of the legend.
Loss accounting is where the detective work bites hardest. Soviet archival figures indicate the 5th Guards Tank Army lost 359 tanks/SPGs on 12 July (207 destroyed or beyond repair). German records for the SS corps (plus a neighboring corps) show roughly 193 tanks/SPGs lost that day, but only about 20 as irrecoverable total losses. Those numbers don’t soften the battle—they make it more intelligible: a brutal Soviet assault that inflicted damage, suffered heavily, and did not produce the annihilating German catastrophe the myth requires.
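As a quick check on the arithmetic, the figures quoted above can be put side by side. The sketch below uses only the numbers given in the text, which are themselves approximate; the dictionary layout and function names are illustrative assumptions. The loss figures cover a slightly different set of formations than either strength frame, so the sketch deliberately does not divide losses by strength.

```python
# Figures as quoted in the text above; nothing here is new data.
STRENGTH = {
    # II SS Panzer Corps vs 5th Guards Tank Army, tight definition
    "narrow": {"german": 294, "soviet": 616},
    # Broader frame including neighboring formations
    "broad": {"german": 429, "soviet": 870},
}
LOSSES = {
    # "lost" = put out of action on 12 July; "written_off" = irrecoverable
    "german": {"lost": 193, "written_off": 20},
    "soviet": {"lost": 359, "written_off": 207},
}


def total(frame):
    """Combined armored vehicles on both sides for one frame."""
    return sum(STRENGTH[frame].values())


def write_off_share(side):
    """Fraction of the day's losses that were permanent write-offs."""
    entry = LOSSES[side]
    return entry["written_off"] / entry["lost"]


for frame, sides in STRENGTH.items():
    ratio = sides["soviet"] / sides["german"]
    print(f"{frame}: {total(frame)} vehicles, Soviet:German ratio {ratio:.2f}:1")

for side in LOSSES:
    print(f"{side}: {write_off_share(side):.0%} of the day's losses were write-offs")
```

On these figures, roughly one in ten German vehicles lost that day was a permanent write-off, against more than half on the Soviet side. That gap, rather than the raw loss totals, is what the "annihilating catastrophe" version of the story cannot accommodate.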
Why the numbers diverge
In modern war, “loss” is a definitional minefield. A tank can be knocked out, abandoned, recovered, repaired, and returned to service. Some reports count everything disabled on the day; others count only permanent write-offs. “Strength” can mean “on hand” (including vehicles in workshops) or “operational” (ready to fight at a specific reporting time). Even the word “tank” may or may not include assault guns and self-propelled artillery. If two sources are not counting the same thing, they will disagree even when both are internally consistent.
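To see how two honest reports can disagree, here is a minimal sketch in which the same day of fighting is counted under two different loss definitions. The ten-vehicle company, the four fate categories, and the two definition names are all hypothetical, chosen only to illustrate the point.

```python
from collections import Counter
from enum import Enum


class Fate(Enum):
    UNDAMAGED = "undamaged"
    REPAIRED = "knocked out, recovered and repaired"
    IN_WORKSHOP = "knocked out, still under repair"
    WRITTEN_OFF = "destroyed or beyond repair"


# Hypothetical company of ten vehicles after one day of fighting.
company = (
    [Fate.UNDAMAGED] * 4
    + [Fate.REPAIRED] * 2
    + [Fate.IN_WORKSHOP] * 3
    + [Fate.WRITTEN_OFF] * 1
)


def count_losses(fates, definition):
    """Count "losses" under one of two (assumed) reporting conventions."""
    tally = Counter(fates)
    if definition == "disabled_on_day":
        # Everything put out of action at any point during the day.
        return (
            tally[Fate.REPAIRED]
            + tally[Fate.IN_WORKSHOP]
            + tally[Fate.WRITTEN_OFF]
        )
    if definition == "write_offs_only":
        # Only vehicles that will never return to service.
        return tally[Fate.WRITTEN_OFF]
    raise ValueError(f"unknown definition: {definition}")


print(count_losses(company, "disabled_on_day"))
print(count_losses(company, "write_offs_only"))
```

The same ten vehicles yield "six lost" under one convention and "one lost" under the other. Neither report is wrong; they simply answer different questions.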

Soviet and German units around Prokhorovka on the night of 11 July (EyeTruth on WikiCommons / CC BY-SA 3.0)
Then comes friction. Under combat conditions, kill claims inflate almost by nature: multiple crews engage the same target; smoke and dust make identification unreliable; “immobilized” becomes “destroyed” as reports climb the chain; and the same wreck can be counted twice by different units moving through the area. Add the battlefield problem—who actually held the ground afterward and could verify wrecks, recover disabled vehicles, and conduct a sober audit—and the gaps widen again. A formation without reliable access to the battlefield cannot easily confirm what it claims, or what it has truly lost.
And Prokhorovka adds one more layer: career survival. Rotmistrov’s 5th Guards Tank Army burned down its strength for a limited tactical gain. In Stalin’s system, that kind of outcome could end a career or one’s life overnight. Reporting a “crushing blow” against German armor wasn’t just spin—it was self-preservation. And the easiest numbers to inflate were enemy strength and enemy losses: hard to audit immediately, politically useful, and perfectly suited to grow as claims climbed the chain. Once the story becomes “we smashed German armor,” it becomes safer, more defensible, and more repeatable. Printed, cited, and recycled, it hardens into “common knowledge” long before anyone has the material to dismantle it.

Pavel Alekseyevich Rotmistrov (Журнал «Красноармеец» №7, 1943 год / Public Domain)
This is the WWII OOB problem in miniature: abundance does not eliminate uncertainty. It changes where uncertainty lives. The detective work is aligning definitions, timing, and scope—making sure you are comparing operational with operational, write-offs with write-offs, tanks with tanks—and then building an OOB that reflects the reconciled picture rather than the inherited headline.
You can see the same logic play out on the map in Panzer Battles: Battles of Kursk – Southern Flank, where “how many tanks” only matters once you’ve answered the harder questions about readiness, recovery, and what’s actually operational on the day.
The practical lesson for OOB design is simple: in the modern era, the danger is not missing numbers. The danger is believing the wrong kind of number, or treating two incompatible accounting systems as if they were the same.
So what does the designer do?
The first move is to treat every “nice round story-number” as a suspect until it is tied to unit-level documentation. A corps didn’t “have 500 tanks” because a popular narrative needs it; it had what the daily state says it had, minus what was immobile, plus what arrived, minus what broke down on the march. That sounds pedantic. It is also the difference between a scenario that feels like history and one that feels like a legend with counters.
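The bookkeeping in that sentence is simple enough to write down. The sketch below is a direct transcription of it, assuming a hypothetical `DailyState` record with four fields; real daily states use their own categories, and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyState:
    """Hypothetical summary of one formation's armor on one day."""

    on_hand: int     # vehicles the daily state says the unit had
    immobile: int    # of those, vehicles that could not move or fight
    arrived: int     # vehicles that joined after the report
    broke_down: int  # vehicles that fell out on the march

    def operational(self) -> int:
        """On hand, minus immobile, plus arrivals, minus breakdowns."""
        value = self.on_hand - self.immobile + self.arrived - self.broke_down
        if value < 0:
            raise ValueError("inconsistent inputs: negative operational strength")
        return value


# Invented figures for a corps that the popular story credits with "500 tanks".
state = DailyState(on_hand=412, immobile=97, arrived=18, broke_down=11)
print(state.operational())
```

With these made-up inputs, the "500-tank" corps puts 322 vehicles into the fight. That gap between the story-number and the documented state is what the scenario has to reflect.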
The second move is cross-checking. German records may be precise in one category and incomplete in another; Soviet records may be detailed but use different categories and thresholds. When one side’s documents say “operational” and the other says “on hand,” you do not average them. You translate them into the same language—or you keep them separate and build the scenario around what the engine can represent.
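One way to enforce "do not average them" in a research spreadsheet or script is to make every figure carry its counting basis and refuse to compare figures whose bases differ. This is a minimal sketch under assumed names (`Strength`, `to_operational`, `ratio`) and with invented numbers; it is not taken from any particular toolchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strength:
    side: str
    count: int
    basis: str  # assumed categories: "on_hand" or "operational"


def to_operational(figure, in_workshops=None):
    """Translate an on-hand figure, but only with a documented workshop count."""
    if figure.basis == "operational":
        return figure
    if in_workshops is None:
        raise ValueError("cannot translate 'on_hand' without a workshop count")
    return Strength(figure.side, figure.count - in_workshops, "operational")


def ratio(a, b):
    """Force ratio a:b, allowed only when both figures count the same thing."""
    if a.basis != b.basis:
        raise ValueError(f"incompatible bases: {a.basis} vs {b.basis}")
    return a.count / b.count


# Invented example: one side reports operational vehicles, the other on hand.
side_a = Strength("A", 120, "operational")
side_b = Strength("B", 300, "on_hand")

try:
    ratio(side_a, side_b)
except ValueError as err:
    print("refused:", err)

print(ratio(side_a, to_operational(side_b, in_workshops=60)))
```

The design choice is that translation requires evidence (here, a workshop count). Where that evidence is missing, the figures stay separate instead of being blended into a number that no source supports.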
Finally, modern research doesn’t just correct a footnote—it changes how the battle plays. A Kursk scenario built on myth tends to collapse into inevitability. A Kursk scenario built on documentation becomes a tense problem of timing, readiness, repairs, and local concentration: formidable German units that are not infinite, Soviet armor that is massive but not magically invulnerable, and decisions that matter because the forces are real, not symbolic.
From Divisions and Regiments to Fire Teams – Squad Battles and the problem of granularity
Operational games reward you for getting the hierarchy right: divisions, regiments, battalions, batteries, attachments, timing. Tactical games punish you for getting the contents wrong.
At squad scale, the credibility of an OOB lives in details that many sources treat as background noise: how many automatic weapons were actually in a platoon that week, whether a company was short on junior leaders, how much of its paper strength was tied up with runners and ammunition carriers, whether a battalion’s “support” weapons were where the fight actually was.
This is where the detective work stops being “which unit was present” and becomes “what did that unit contain?” A report can list a rifle company as 140 men and still tell you almost nothing about its battlefield capability. The scenario needs a plausible mix: rifle squads, squad LMGs, leaders, light and medium MGs, mortars, anti-tank weapons, pioneers, and the inevitable understrength reality that makes units behave the way they did in the narrative.

TOE, Germany MG Schützenkompanie (NARA Records / Public Domain)
Granularity raises the cost of error. At higher scales, a ten percent mismatch is annoying. At squad scale, “a small mistake” can become a qualitative distortion. One extra MG team, one extra 60mm mortar section, or an upgrade to a squad’s weapon type can change the feel of the fight more than shifting a battalion’s headcount by a hundred men. The same is true for flamethrowers, demolition gear, and improvised anti-tank weapons: rare on paper, decisive in play.
You see that “small kit, big consequences” problem constantly in Squad Battles titles like Winter War and Red Victory, where a handful of SMGs, a mortar section, engineers with demo charges, or improvised anti-tank weapons can outweigh what the company’s headcount suggests.
Validation changes, too. Tactical OOB work triangulates sideways: TO&Es tell you what a unit should look like; narratives and after-action reports tell you how it behaved; captured documents, issue tables, and unit diaries sometimes explain why. That last step, the “why,” is often what makes the scenario click. A weapon might be on the books but be short of ammunition, lack a trained crew, or be parked well behind the action. The OOB isn’t a museum catalogue. It’s a capability at a given moment.
From History to Game: Assigning Unit Qualities, Fatigue, and More
Numbers alone do not make an OOB playable or truthful. An OOB is also a condition: training, cohesion, leadership, fatigue, confidence, and the friction that turns identical headcounts into wildly different outcomes. This is where the designer turns evidence into behavior.
Unit quality and morale are not “flavor.” They are the model’s way of answering questions the sources constantly imply: which formations held under pressure, which ones cracked, which ones acted with initiative, which ones needed to be shoved forward. Letters calling a unit “green,” reports describing a brigade as “shaken,” repeated accounts of a formation standing fast under fire—these become quality grades and morale values.
Leadership is part of that translation. A chain of command is not just a structure; it is performance. A commander who consistently reacts late, misreads the situation, or loses control of subordinates produces a different battle than one who keeps formations moving and coordinated. The OOB encodes who reports to whom; scenario parameters encode how well that command structure actually works.
Fatigue and cohesion do the same work for time. Battles happen after marches, delays, detours, sleepless nights, and rushed deployments. Accounts that describe a formation arriving exhausted or disorganized are not decorative—they are instructions. They tell you what the player should feel at turn one: the pressure to rest, the risk of pushing too hard, the tendency of a tired unit to break at the wrong moment.
Equipment matters most when it changes decisions. In WWII, “tank battalion” is not a generic label; the mix of models matters. In infantry fights, the difference between an understrength rifle company and one with intact automatic weapons changes the fight even if the headcount looks similar. The detective work is matching the documentary language (“on strength,” “operational,” “serviceable,” “present for duty”) to the capabilities the engine actually represents.
Playtesting and the final audit
Once the OOB is assembled, playtesting becomes the last stage of investigation. If the scenario produces results that are consistently at odds with the historical shape of events, it’s rarely because “players are weird.” It’s usually because the model is missing a constraint: a unit arriving too early, a formation too fresh, a weapon mix too generous, a morale value too optimistic, a command relationship too clean.
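One way to make “consistently at odds” concrete is to log a single outcome measure across playtests and compare it against a band you judge historically plausible. Everything in this sketch is hypothetical: the measure (the turn on which an objective fell), the band, the sample results, and the 50% threshold are assumptions for illustration only.

```python
def out_of_band_share(results, low, high):
    """Fraction of playtest results falling outside the plausible band."""
    if not results:
        raise ValueError("no playtest results recorded")
    outside = sum(1 for value in results if not low <= value <= high)
    return outside / len(results)


def needs_audit(results, low, high, threshold=0.5):
    """Flag the scenario when most playtests land outside the band."""
    return out_of_band_share(results, low, high) > threshold


# Candidate constraints to re-examine, taken from the paragraph above.
CHECKLIST = [
    "a unit arriving too early",
    "a formation too fresh",
    "a weapon mix too generous",
    "a morale value too optimistic",
    "a command relationship too clean",
]

# Invented data: the turn on which the objective fell in five playtests,
# against an assumed plausible band of turns 10 to 16.
playtests = [6, 7, 5, 6, 14]

if needs_audit(playtests, low=10, high=16):
    print("Results cluster outside the band; re-check:")
    for item in CHECKLIST:
        print(" -", item)
```

A flag like this does not say which constraint is missing. It only says that the pattern is too consistent to blame on the players, which is the cue to go back to the sources.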
This doesn’t mean forcing history to happen every time. It means making the historical outcome understandable in the model. A good scenario gives both sides real choices. It also ensures that when players fail, they fail in ways that resemble the real problems the historical commanders faced.
Conclusion
Modern OOB design looks easy only from a distance. The archives are deep, but that depth hides traps: incompatible definitions, propaganda-shaped narratives, memoir distortions, and the seductive comfort of numbers that feel authoritative because they are specific. The work is not transcription. It’s investigation.
That investigation has two faces. At the operational level, it’s reconciling record systems and building a force picture that doesn’t double-count, omit, or inherit legend. At the tactical level, it’s reconstructing capability—what formations actually contained, what they could actually do, and why seemingly minor details change the fight completely.
The end result is more than a roster. It is an argument: a reasoned reconstruction of who fought, in what condition, with what means, at what moment. When that argument is built well, the scenario stops feeling like a script and starts feeling like a problem set by history. And that is the point of the detective work: not to win an academic debate, but to put the player inside the real constraints that shaped the battle.
Bibliography
Below are some recommended books on orders of battle and TO&Es.
Nafziger, George F. The German Order of Battle, Vol. 1: Panzers and Artillery in World War II. London: Greenhill Books, 1999.
Nafziger, George F. The German Order of Battle, Vol. 2: Infantry in World War II. London: Greenhill Books, 2000.
Nafziger, George F. The German Order of Battle: Waffen-SS and Other Units in World War II. Cambridge, MA: Da Capo Press, 2001.
Sharp, Charles C. Soviet Order of Battle World War II, Volume I: “The Deadly Beginning”. West Chester, OH: The Nafziger Collection, 1995.
Glantz, David M., and Jonathan M. House. The Battle of Kursk. Lawrence: University Press of Kansas, 1999.







Articles like this, or these delightful instalments on Napoleonic battles or the Musket and Pike series, the anatomy of a siege, from cataphract to the tank, etc., show the depth and passion you put into your work. Its fruits are what we dreamed of.
Thank you very much.
This was extremely helpful and interesting to read. You articulated what I’ve been doing, somewhat haphazardly, for years – and it clarified the process for me. Thank you for sharing.
Thank you for sharing all this info. One can only grasp the huge amount of effort and research you put into your games. I am very proud of being your client. Keep up the good work!
Another great post to follow the pre-WWII considerations.
Really captures the process to develop a viable and challenging scenario without replaying history as it has been told by the victors. Keep this content coming, it is excellent.
@Rod: But I did in the post about the Eternal Logic of Shock Warfare (https://wargameds.com/blogs/news/from-cataphract-to-tank). I try not to always recommend the same books. And I agree: Töppel’s book is excellent.