SA Monte Carlo

A Monte Carlo Simulation of Combat in Battle Aces

Version 0.9.9 - WORK IN PROGRESS

Beta 3 - Patch 4 - 2025-04-25

3,000,000 Simulations

Style

Acknowledgements

Thanks to u/Dolmant for his working mining and publishing the unit data.

Thanks to u/Keiras for advice on getting started (also, I have broadly aped his style).

Thanks to Matsumoto et. al. for their work developing large period pseudo-random number generation for Monte Carlo simulation.

Deck Builder

The Simple Aces Monte Carlo compares the combat performance of every bot. This analysis is summarized in a set of weights that describe a bot's contribution to combat based on its own stats and on friendly and enemy bots. Use this tool to construct a deck for yourself and an opponent and explore the interactions. Scroll down past the interactive section for detailed instructions. If you would like to know more, use the top bar to navigate to later sections that discuss the goal and scope of the project, how it works, and walk through the results in granular detail.

Measure Value as:

Normalize by Value:

In the left section, select a deck slot in “Your Deck” or “Their Deck” to fill the spot with the desired bot or use the tech icon to clear the selection. When filled, a slot will present the bot and primary combat value in white. When empty, a slot is dimmed and the average value of the eligible bots is shown instead.

The value displayed is set using buttons above the decks. This impacts how heavily bots are affected by each other.

“Score”: The combat value as a resource cost equivalent, the sum of survival rate (based on the bot's cost) and damage dealt (based on the cost of destroyed bots).
“Win Rate”: The value as expected win rate. NOTE: To be Implemented.

"Resource": The combat score normalized by resource cost as a %. Expensive bots tend to have smaller scores and effects. A reasonable estimate of a bot's return on investment.
"Bandwidth": The combat score divided by bandwidth. Supply efficient bots will have higher values and high bandwidth bots will have reduced correlations with lower bandwidth bots. Approximates the bot's contribution in a maxed out force.
"Bot Count": The combat value as the expected score (in resources) for a single bot. Each bot is weighted equally, so powerful bots will generate powerful correlation effects. Descriptive for small army sizes.

The skill slider is currently disabled as behaviour does not vary with skill in the present version.

Slots are coloured to indicate synergies and anti-synergies with friend and foe. Each slot refers to the upper deck in the upper mini-grid and lower deck in the lower mini-grid. A slot indicates itself with a dark circle. An upward arrow indicates strong synergy; the bot gains a benefit from the referenced slot. Conversely, a downward arrow indicates anti-synergy; the bot loses effectiveness in the pairing. Small coloured circles are used to indicate small synergy effects. A white circle is used when the synergy is negligible in either direction. In all cases, the colour corresponds to the deck that benefits.

Relationships can be explored in detail by mousing over the deck and highlighting a slot. Here, coloured arrows are replaced with the referenced bot or slot icon and colour-coded text indicating the magnitude of the value correlation. Each bot referenced shows two numbers: on top is the benefit (or penalty) to the highlighted bot, while below in parentheses is the correlation in the reverse direction. The highlightee also displays additional detail, listing its base value and the net correlation value in parentheses.

Many bot pairings result in one-sided synergies. This is natural in enemy pairings. As an example, it is unsurprising to see that the Airship loses a lot of value paired against the Hunter and likewise expected that the Hunter gains a considerable amount against the Airship. However, these kinds of pairings are also very common between two bots in the same deck. For instance, the Crab loses a small amount of value when paired with the Hunter, but the Hunter gains more than is lost (especially when scores are normalized for cost, due to the Crab's lower price). In this case, it is likely that the effect of the Crab in the Crab/Hunter pairing is to significantly increase the lifespan and damage output of Hunters leading to their drastic gain. Though the Hunters also return the favour by slaying attackers threatening any Crabs, a portion of the score that would have been earned by those Crabs is reduced by this supporting fire and given to the Hunter instead. It is important to keep this cannibalizing effect in mind when looking at deck pairings and examine the interaction in both directions.

Looking at certain bots in detail also reveals some hints at the limitations of the model. The Guardian Shield, for instance, heavily skews the value of Core bots (and Core averages) in all but “Per Bot” viewing due to its 0 cost (which is approximated here as 1 Matter or 1 Bandwidth as needed to avoid divide-by-zero errors). This makes sense as the utility of the Guardian Shield comes not merely from its combat abilities but from its free deployment and detailed macro interactions are not a goal of the simulation.

To a player with experience in the game, certain bots are also noticeably undervalued. These tend to be melee and short-range bots (like the Crab), wide radius bots (like the King Crab), and high speed bots (like the Stinger). The replay section, further down, elaborates on the underlying simulation decisions that cause these issues.

Introduction

Welcome, please enjoy the following self-indulgent blog/guide/paper on the nature of combat in the (at time of writing) upcoming RTS Battle Aces from Uncapped Games.

In Battle Aces (BA from here on to spare my phalanges), players take part in matched competitions consisting of short, single games where play is comprised of a harshly pared down subset of elements common to the RTS genre. It is descended most directly from Blizzard's 2010 title, StarCraft II, and, like that game, is fast and furious. Both emphasize tight unit control, intelligent tactical positioning, and multitasking, and they feed back to the player similar vibes: huge futuristic drone armies pushing larger-than-screen-height battle lines while navigating delicate compositions using through silky smooth and responsive pathing.

BA is unlike nearly every other RTS in most other aspects, having been designed with the intent to remove unnecessary elements. In fact, it features no building placement, no worker management, no trade-offs between the collection of disparate resources, and very little "macro"-style gameplay at all. Likewise, it includes no high ground advantage, no covering behind terrain, no capture points or optional objectives, and no map variety. Finally, it features no single-player mode, limited team play, and few practice/non-competitive modes.

In place of genre staples, BA offers each player the opportunity to construct a custom faction from a growing supply of unique units, or bots. Combinations are limited by a few simple rules: Decks contain 8 slots for selecting bots, bots come in 5 tech categories, and 1-2 slots per deck are allotted to each category. The deck building provides the main source of variety and strategy in the game and, along with skill expression, decides the overwhelming portion of how games unfold and how players interact with the main game mode, competitive play up and down a ranked ladder.

This emphasis on player faction diversity, and the asymmetry it creates, pulls BA out from the realm of RTS and into a unique zone shared by only a handful of other titles, like Pocketwatch Games' Tooth and Tail, and the criminally underappreciated Absolver from SloClap that looked to blend TCG-style deck construction with real-time action. All RTS games since my youth have had some amount of faction diversity, with the decision to play one or another made outside the gameplay proper and leaching into the strategy by dictating the available timings and counters. BA, by so frontloading the faction creation and simplifying in-game decision-making, has placed almost all this activity outside the moment-to-moment gameplay. Player decks have narrow spaces to adjust to each other and the breakneck pace prefers hurling players into the haze of combat rather than having them sitting and stewing over long-term decisions. This feeds the main gameplay loop well, once a deck is built, a player can play again and again with match-making coming quickly and games lasting as little as two minutes and never more than ten.

It also means that the game is more fundamentally about how a player's choice in deck relates to the behaviour of the ladder population as a whole. It is a game of averages and puts a premium on players understanding the high-level trends in deck choices in order to engage with the ladder successfully. While this is a mindset already espoused at the top of the ladder in many RTS games, BA, through its design, more directly asks players to disavow the need/expectation to seek to win every individual encounter. It is not an uncontroversial choice. There are players who express frustration with the lack of direct control, or concern that the gameplay complexities of asymmetrical information and long-term planning that can normally balance players have been reduced to pre-game decisions. While these concerns are, in my mind, overanxious, I think they arise from a legitimate discomfort that grows out of finding oneself in a new and unfamiliar space even if that space is merely a new lens on the world they already occupy. RTS players are not always TCG players, card and tabletop gamers are not always RTS players, and everyone (myself included) who engages with this game is going to need a bit of time and assistance in making the switch.

So that is, finally, why we are here. I will repeat this phrase throughout this document and certainly over and over during the lifetime of the title, "Battle Aces is a game of averages," and so, to describe it, to play it, to master it, we will have to learn to speak the language of averages. Farther down, I will be presenting a statistical analysis that combines the rules of deck-building in BA with the rules of its combat to model as much of the game as I can at a population level. As time goes on, this analysis will improve and deepen and, with enough head-scratching and a little luck, I think I can create a holistic description of the game's interactions that players can use as a tool to learn, or bring with them onto the ladder to upset the meta.

I will add a few final notes on the project's status, goals, and intentions before I move on. Simple Aces Monte Carlo is written in C and will be open-source (once it is presentable) so that the source, too, can be used as a learning tool or for any branching works interested parties can imagine. I am happy to collaborate and accept feedback on the design and content of the simulation, this page, and my analysis, or answer questions about my decisions. You can reach me here and here. I am treating this both as a contribution to a growing community and a learning experience for myself, so I reserve the right to make mistakes and take my time as I painstakingly figure out optimal and accurate solutions to the many interesting problems.

I hope you find something here to help you on your Ace's journey, and I look forward to seeing you on the ladder.

Good luck, have fun.

Overview

On Monte Carlo Models

A Monte Carlo model is a type of stochastic representation that describes a large and perhaps unknowably complex real-world system by breaking it into smaller, well-behaved components or inputs (some or all of which can be realistically modelled as random), combining them together, and iterating the collective over an enormous number of randomized permutations. This generally has the effect of both expressing the length and breadth of variation in the system, as most combinations of very large and very small values occur in the system over the course of a very large number of iterations, and of describing the system's behaviour in the average case.

The classic example of simulation in engineering is a production line. A production line has multiple inputs, different materials, parts, even other partially assembled components, that each come with lead times and often suffer random delays or hiccups. It also includes several stages or steps where a worker performs a task, like machining, assembling, painting, or testing. Each stage takes time and the time taken can be modelled as random variation about a mean. Taken even farther, in the emotionless language of our corpo-verlords, the human resources can be modelled as random with each worker's labour efficiency varying about the population means of skill and speed.

There is more at play, however, than the single time taken for a particular stage. The process of a production line is sequential with each stage having to wait until one (or more, if multiple sub-elements have to come together for one activity) previous stages are completed creating, sometimes extreme, delays. Likewise, a stage of production is usually limited in how many inputs it can receive and might create a bottleneck as the speed of previous stages creates too many inputs and a queue forms. In both cases, the aggregate behaviour of the system very quickly becomes chaotic and leads to headaches when attempting to predict output.

This is where the power of Monte Carlo modelling shines. With many behaviours colliding and decisions cascading through or bouncing off one another, simple analyses leave a lot of nuance on the table. If one assumes the mean time for every stage, one woefully underestimates the compounding effects of errors. If one assumes the worst case scenario, where everything goes wrong at once, one is equally unsatisfied, finding in-hand a prediction of the apocalypse. And all in between, the chaotic nature of the system defies attempts to soothe uncertainty. Monte Carlo models cleanly combine both the average and the extreme.

On Event-Based Simulation

Monte Carlo modelling creates and collates diverse data, but this framework puts no limits and gives no guidance on how the system runs or how the data are generated. The actual details of the simulation are based on other factors.

In the digital world, these factors are usually practical concerns regarding time, memory usage, and the purpose of the sim itself. For example, an aircraft flight simulator needs to perform in real-time to train real-world actions, while a simulation of stress in a building's design can take a week or so to run so long as it is very accurate. As well, simulations can vary based on whether they center discrete objects or chronological events. In the BA, simulating the units themselves is part of the point; it enhances the fun to see discrete objects interact. In the above production line example, only specific moments in time when a process starts or stops are really needed and details like worker bathroom breaks can be abstracted into the randomized time computations.

One can break simulation broadly into real-time and non-real-time, and further into agent-based and event-based. In many cases, it feels like an agent-based simulation with bots as the focus is the truest representation of the system. However, event-based simulation can often represent the same interactions and is usually orders of magnitude faster. Inevitably, some interactions of agents can be highly complex and even rely on inter-agent relationships. These relationships are, in event-based sims, forced to be approximated as an acceptable casualty of performance.

For those reasons, I have undertaken to reformulate agent-behaviours in BA into simplified timed events. While agents are still present in the simulation, they are not the owners of any actions but are merely repository states, updated and acted upon by the events that occur. Discussing how and where this assumption creates differences between the model and reality is an important part of this analysis and will come up frequently in the more detailed discussions.

Goals

Battle Aces (and most competitive games) is a massive nest of interconnected selections, decisions, actions, costs, trade-offs, and delays that somehow coalesce into a climactic battle between two players and, finally, an outcome. The advantage that BA has in this arena is that its rules are relatively simple and its playspace relatively small. One can imagine a possible breakdown of the game into its constituent components and their recombination into a Monte Carlo model that might not be practical in a larger game.

Such a simulation will not be perfect, of course; predictive models never are. Its ability to describe professional play will be limited, as top-tier players become so by eliminating much uncertainty and inconsistency from their play. Similarly, its ability to describe the outcome of a specific battle through population level numbers will be weaker than some more direct analyses. Rather, what a Monte Carlo simulation of BA offers is the prediction of trends, a crystallization of population-level behaviours: the choices of the playerbase holistically, the meta of the ladder as a whole. As well as a prediction of ranges, helping to show the upper and lower bounds for each player choice. Finally, statistical modelling opens the door to unravelling complex correlations that describe how many variables can interact and which depends on the others the most.

To get nitty-gritty, I hope to be able to determine a few key values in relation to the count of a bot used in a battle:

Overall contribution to victory in a general case, measuring the improvement in battle results as more of that bot is added,
Counts of highest return, those numbers where adding or removing a bot impacts score the most,
Synergies and anti-synergies, the bots with and against which a bot sees enhanced or degraded performance,
Approximate skill cap, the degree to which player and bot performance are correlated.

As well, I want to analyse individual unit values: health, attack range, damage, speed, ability charge count, etc. Then I want to derive similar numbers to describe these in terms of score contribution to pull out which elements of a bot's specific make up contribute most to their performance.

Details

While I don't want to put anymore detail on top than I really need, I think it does make sense to preview the content of the actual simulation, in advance of detailed discussion here. Generally, a Monte Carlo consists of:

A (pseudo) random number generator (PRNG) -- selected for large periods of generation without repetition and for thread-ability,
An algorithm for setting up initial model conditions -- including all relevant variables and leveraging the PRNG to ensure realistic spread,
An algorithm for propagating the models from initial to end conditions -- leveraging the PRNG to create variation in intermediate states,
Statistical techniques for interpreting the outcomes.

The random number generator and statistical analysis I will leave for later and focus on the BA-specific details. The two middle algorithms are the bulk of the model and run together for a large number of iterations to generate a mountain of outputs.

The model imagines games between two players, each moves through the same initial steps:

Deck selection -- up to 8 bot types are selected for each player according to BA deck building rules,
Mutation -- the stats of the selected bot types are randomly moved up or down by a small percentage,
Resource Allotment -- A random army size is generated using the resources Matter and Energy, both players receiving the same allotment,
Tech Choice -- A random number of upgrades are chosen for each player, unlocking some bots, with the tech cost subtracted from the allotment,
Army Composition -- The remaining allotment determines how many of each bot a player fields, totals are adjusted to ensure cost parity as much as possible,
World Creation -- The units are placed in the simulated game world, all simulations use the same world size and separate players a fixed distance,
Event Execution -- Everything that occurs in BA is an event, each is placed in a chronological queue and then executed, generating more events if needed.
Implemented Events Include:
- Creating a bot,
- Targeting a bot,
- Moving toward a location -- includes intercepting a target unit and resolving collisions,
- Attacking a bot -- including wind up and recoil times as well as projectile speed,
- Taking damage -- resolves scoring for a unit that is destroyed,
- Destroying a bot-- resets other units targeting the victim,
- Starting/Stopping HP Regen,
- Blink -- instantly moves to target position,
- Recall -- including wind up time, instantly returns to home position,
- Starting/Stopping Overclock -- Increases move and attack rates for the duration,
- Ability cooldown and recharging,
- Setup/Unsetup -- Enables attack/movement respectively, and vice versa,
- Detonate -- Destroys the bot on attack, does not grant score to the defender,
- Destruct -- Destroys the bot if it has no more valid targets,
- Guardian Shield -- Doesn't do anything because workers are not simulated.

As you have no doubt noticed. There are a few things conspicuously absent from the simulation, namely:

Terrain,
Core and Expansion buildings,
Workers and resource collections,
Addition of new units to an on-going battle,
Researching new tech during an on-going battle.

Each of these missing pieces has an impact on the utility of the sim. The main point to keep in mind is that the values the model generates are without reference to macro and only describe a bot's value within a pure combat context. Values should be expected to vary when examined in relation to attack timings and in economically-focused play.

On Linear Regression

The analysis of the model results is based on a multivariate regression to a linear model. A linear model is one where an input adds proportionally to an output. Such a model may have any number of inputs, each with its own coefficient of proportionality or weight. The inputs to this system are army composition (bot counts), aggressive or defensive position and player skill for each of two players. NOTE: aggression and skill are not currently implemented. A multivariate scheme describes a model with numerous output values. In our case, each output corresponds to the score earned by a particular bot type. At present, this means the system is generating 53 linear relationships as outputs, each with 110 input weights.

The process of determining this relationship is that of finding the line of best fit to a set of data. It should be familiar to anyone who has completed a university physics lab, and will be detailed later. The nuance to keep in mind now is that this number does not exist in isolation, but also has a corollary value: the error, which describes how closely the actual data actually matches the line of best fit.

I expect that in understanding these many results, the analysis will provide significant insight into the workings of gameplay. However, the linear representation is certain to miss some nuance, especially for very large numbers where small changes in weight result in large differences in predicted score. It is likely to be most descriptive in the base case, for valuing first (or first few) bots added.

Because I want to preserve both the direct value and describe the error, I will present the results in a few ways. Below you will step through the process of creating the results. First, examples of the event-based simulation are shown as pseudo-replays to indicate the type of behaviour analysed. Beyond that you will find raw results of simulations for each bot, with comparisons showing the variability in score relative to the regression line and unit cost. Next, the main line regression results, sorted by bot, are presented in a few different measures (score, win rate, etc.), relating how that bot improves or diminishes player outcomes. Finally, the interplay of bots is described with a more detailed breakdown of several additional regressions performed on subsets of the data to pull out specific synergies and anti-synergies within and between the bots.

As well, I will create a few sections applying the data in different features that might be of ancillary interest. One is a deck builder, a much requested element, that uses the weighted impacts of bots on each other to create a prediction of deck power. The other generates a fake ladder and estimates how a variety of decks and player skills would distribute themselves. NOTE: The ladder simulation is currently disabled.

Replays

Select from a reduced set of 125 simulation iterations in the dialogue to the bottom left to watch a recreation. Track forward and backward in time by using the lower scrollbar. The current combined and slot scores for both decks are shown below the replay field along with a count of living bots.

Replays are a good first step in understanding the inner workings of the simulation. The replay is presented in a scroll-able field showing the approximate position and actions of each bot. Actions are presented using coloured lines (solid for movement and movement abilities like Blink, dashed for attacking and combat abilities like Overclock).

Initial set-up, including the selection of army compositions and placement, can be seen at the 0th second. All iterations take place on a featureless square region several hundred units wide and feature armies of roughly the same size. An army's starting position is a crude fan spanning roughly ninety degrees and oriented toward the opponent. Bots are placed on the field in slot order.

As time moves forward, elements such as the bot radius, speed, and attack range are easily visible. Other values like damage and attack rate are not immediately obvious, and can sometimes be hidden because the time resolution is only 1 second (so, for example, a Butterfly that attacks more than once per second will appear to only attack once) and because display priority is given to abilities when they are used at the same time as other actions. While not all activity is visible, it is still executed by the simulation.

The lower section of the replay describes the current player scores with a large two-coloured bar. Scores are further broken down into dark and light bands. A dark band represents the score generated by the resource value of the living bots of that side. This decreases as those bots are destroyed. The light band appears as combat begins and indicates the value of the opposition bots destroyed. Note that while the simulations aim to create equally valued armies to pit against one another, the cost of unlocked tech branches is subtracted from the allotted resources before units are generated meaning that armies of different tech levels usually have different starting scores. This situation also arises, to a smaller degree, when bot selections prevent an exact match of resources spent, as in the presence of the Kraken.

Underneath, each bot group present in the simulation is shown in its deck slot alongside the number of instances of the bot currently alive. Each bot type also has its own miniature score bar. Value gained through destruction of an opposing bot is proportional to the damage dealt, so a unit group whose members did half the damage to a bot, but did not destroy it, still gets half the points if it is destroyed later. These points are assigned to the group even if the damaging bots have already been destroyed themselves.

A white line is drawn at the centre of each unit group score bar representing the starting value of the score. This line may move left as the scores scale up, and it always identifies the position of the starting score on the current bar. This means a bot with a white bar far to the left has earned many times more points than the bot was initially worth.

Looking at scores for each bot at the end of the simulation gives a visual representation of the total contributions made to combat. Bot scores are the primary output of the simulation and carry on into the remainder of the analysis.

While many of the mechanics of Battle Aces have been recreated faithfully, a large number are looser, and a few are replaced altogether with heuristics and approximations. This results in several more and less visible discrepancies between the sim and reality. The intent is to address some or all of these in future versions. For now it is important to discuss them to provide the appropriate context for interpreting the simulation output.

The common behaviour of all bots is to move directly toward their nearest valid target and begin to attack. This simple behaviour is meant to be in line with attack-move bot behaviour. A few special elements are in play to simplify or smooth out motion.

At the start of a sim, units move in parallel toward each other rather than closing distance as directly as possible. This mostly emulates the clustered movement of bots while cutting down on time pressure from needless collision checks. While not strictly accurate (real bots formations tend to narrow as they move) this behaviour drastically improves performance for a minor degradation. Over a handful of seconds, this tendency is smoothly relaxed and bots begin to hone in as expected.

A major difference from the game is the handling of collisions. While moving, bots do not collide at all and instead only check collisions at the end of a move event. This has the useful effect of generating bulk motion that places fast bots ahead of slow ones without losing time, but leads to cases of overlap that can be visible (see: any sim where Wasps must rush past heavier front liners). This tends to result in more spreading between bots based on their speed than expected and a faster overall start to combat with faster bots fighting earlier. It can also create moments where splash damage is magnified because more units are in the targeted region than would normally be allowed. When collisions are considered, a slowdown is applied to bots. The heuristic for this slowdown is tuned to provide a good balance between reliable bot collisions and simulation time, but it does tend to create unrealistic delays in movement of bots who collide frequently (for example while moving to surround a target in melee) and probably lowers their scores to some degree.

In combat, bots spread out and surround their enemies, this does not involve any specific behaviour, occurring automatically as bots collide with each other. A particular unreality associated with the implementation is that any given bot only navigates either clockwise or counterclockwise when resolving collisions. This creates a visible wave effect around any large fixed line of combat with two clusters of bots rolling around each other. Again, melee and larger units tend to be the most affected, likely with lowered scores.

Target selection, like movement, is simple and follows the basic rule of picking the nearest target after filtering for bots qualifying for extra damage. Bots are permitted to freely retarget and will swap to a better target as soon as one appears. This matches known behaviour for the Destroyer, for example, but is different for the Guardian Shield which always targets the nearest bot. For many bots, the exact behaviour for target selection in-game is not known, so this also represents a limitation to be addressed in the future.

Movement errors are difficult to fully quantify, but can be understood to consistently affect bots that move fast, have short range, and/or large radii, as those units overlap or collide most often. This, in turn, is likely to decrease their score, either because they are destroyed more quickly or deal damage less often. On the other side of the coin, ranged combatants, and especially splash damage dealers, benefit more than they should from movement interactions.

Finally, there are many elements of RTS play, such as kiting, stutter step, and focus fire that are absent from this version of the simulation. A future release is intended to include scaling skill measures for each combattant that will transition between all-size-fits-all attack-move behaviour in the present simulation and sophisticated micro based on each bot's momentary situation.

Scores

This section shows the score values resulting from the simulation. The initial set is reduced to a maximum of 3125 data points to reduce clutter.

Select a bot to view its scores. Select “Raw Points” or “Means and Errors” to show the data as a scatter or error bar plot.

Select Bot:

Crab

Hunter

Guardian Shield

Recall

Recall Hunter

Scorpion

Beetle

Blink

Blink Hunter

Gunbot

Missilebot

Wasp

Hornet

Knight

Crossbow

Ballista

King Crab

Crusader

Bomber

Shocker

Recall Shocker

Mortar

Swift Shocker

Heavy Hunter

Destroyer

Raider

Turret

Heavy Ballista

Gargantua

Sniper

Advanced Blink

Assaultbot

Advancedbot

Behemoth

Advanced Mortar

Blaster

Butterfly

Dragonfly

Falcon

Airship

Advanced Recall

Mammoth

Stinger

Flak Turret

Bulwark

Katbus

Locust

Kraken

Predator

Valkyrie

Artillery

Advanced Destroyer

Shade

Show data as:

Data are charted on the x-axis by selected bot count and on the y-axis by sum of bot score. Each individual bot scores points equal to its resource cost if it is alive at the end of the simulation plus points proportional to damage dealt toward to the destruction of an opposing bot. That is, a bot that deals 50% of the damage to a destroyed bot scores 50% of that bot's resource value. A bot will score points for damage dealt whether or not it is alive at combat's end. Scores are combined so that each bot is represented by the sum of all points earned by all bots of that selected type. In the scatter plot view, each simulations point sum is represented by a separate point, while in the error bar view, all simulations containing the same number of bots are averaged together and a vertical bar extends above and below to the 1st and 3rd quartile of scores (that is, spanning the middle 50% of results).

Points are drawn in one colour if the score value exceeds the resource cost at that bot count and in another if it falls short. Two lines are drawn over the data. The white line shows the linear increase in resource cost with bot count, delineating the two colours. The coloured line shows the regression result for the selected bot. This is the average actual increase in score as bot count is increased.

Just as the varied simulations presented in the replays section are summarized by their final scores, these result charts condense the simulation set into a single image, focused on creating a single output. I described the nature of the model as a linear regression in the overview, and these plots give a clear indication of what that really means. The main outputs of the model are the slopes of the coloured lines corresponding to each score/bot set. This line is created to be as close as possible to all the points on the plot and capture their simplest trend. In the remaining sections, frequent reference will be made to weights, slopes, and values derived from the model. Each refers to the slope of a line on charts like these.

Linear regression is good for creating a first approximation understanding of a problem and also for generating basic solutions, but it can also leave details on the table. It is intuitive to the Battle Aces player, for example, that bot values are not inherently linear. Many bots, like King Crabs, are considerably less effective when their numbers exceed a certain threshold beyond which they cannot move or fight effectively. These drop-offs can be seen in the results plot, but are not retained by the regression weights.

As well, even though the general character of a bots performance can be seen by looking at its line in the chart, the linear regression weight by itself also has little to say about the variance in the score values measured. Almost every bot has many games above and below their cost line showing that, in the right situation, any bot can over or underperform, sometimes significantly. This can be seen clearly in these charts through the tall error bars attached to large army simulations.

A challenging form of non-linear behaviour, visible in many charts, especially those of core bots when the “Raw Points” toggle is active, is a bimodal density with many very high score games and many very low score games with little in-between, likely due to the powerful impact of counter mechanics. This problem is addressed in part through the multivariate nature of the analysis: there are additional weight values, not as neatly presentable in a scatter plot, that characterize correlation between one bot's score and the presence of a different bot and these help explain some of the variance by quantifying synergies and counters. These are explored in a later section.

Regression Analysis

The results of a linear regression of 3,000,000 permutations of combat in Battle Aces are described in a bar chart. Each bot is given a bar equal in height to its weight and bots are ranked from left to right. The bots presented can be filtered by selecting from several dropdowns for tech tier, trait, and manufacturer.

Detailed data are provided in a table below the chart. Clicking a column in the table will redraw the bar chart using the selected measure. Mousing over or clicking on bots in the chart or table will highlight them in both.

Filters:

Several measures are presented to allow bots to be compared for relevant performance metrics. These include:

“AvgScore” is the average score for this bot group across all simulation iterations of which it was a part,
“Weight” is the regression output, the score earned by each bot,
“Weight / Res (%)” is weight divided by cost,
“Weight / BW” is weight divided by bandwidth,
“Win %” is the percentage of victories where the bot was present,
“Pick %” is the percentage of games in which the bot appeared.

This interface gives the first tangible method of comparing many bots at once. By selecting the appropriate measure and filter, bots can be ranked for value, for efficiency, or for win rate in one place. The foregoing sections have detailed several limitations to be kept in mind when using this data. Certain types of interactions (ranged attacking and splash damage) are privileged in the current version of the system and others (speed, girth, and melee attacking) are penalized, however it is very satisfying to see that, a player's expectations hold up well, core Anti-Air, for instance, is highly specialized and so has low weight and win rate.

It is recommended to use the various measures together to get a good idea of a bot's real value. A high win rate and a high weight do not always go together if a bot is very inefficient.

Individual Stat Weights

Explore the stat-specific regression results for a chosen bot. Each stat is shown in a pie chart describing its percentage contribution to the bot's total performance and a wireframe chart comparing the percentage to other bots in the game.

Select a bot to view from the first dropdown. Select a comparison, either a general bot or an average for a particular tier from the second.

Now is the time to admit that I told a small fib in the earlier sections on the results and regression analysis. I said that each bot generates a single weight describing the score per entity. This is untrue. The simulation, in fact, generates one weight for every stat of the bot and adds these weights together (in proportion to stat values) to generate a collective weight. The math is essentially the same, except instead of measuring 20 Crabs, the system is measuring 20*2200 Crab HP, 20*200 Crab damage, 20*SMALL, etc.

The analysis in this pane presents these values exploded out to show the individual contribution of each stat to the total score and displays them in two charts. In the centre, the bot and its total score are displayed. On the left is a pie chart showing the proportion of the score based on each stat. On the right is a wireframe chart showing the contribution compared to an average of comparable bots.

A weight can be positive or negative. Positive meaning that increasing that stats helps the bot's score, while negative means that increasing it hurts the score. Examining common helpful and harmful stats across all bots reinforces conclusions from the replay section. Radius, for example, is one of the most common negative stats. This is expected, as high radius units have less flexibility of movement, but it plays a consistently large role in reducing many scores.

The pie chart represents stat scores as a percentage of total score. These are ranked clockwise from largest to smallest, with scores contributing less than 5% omitted. Main colour slices represent a positive score contribution, while alternate coloured slices represent a score penalty. Mouse over the slices of the pie chart for a detailed breakdown in the central table. This table shows the score contribution on the left and the contribution of that stat in the comparison bot on the right.

Comparisons selected from the second dropdown are based on tech levels and average out the stat scores for all corresponding bots to create a comparison group.

The wireframe chart shows a bot's stat scores relative to the comparison. As in the pie chart, only the highest contribution scores (by %) are shown. The main colour line represents the selected bot and the alternate colour, the comparison. This means that for stats where the bot line is outside the comparison that the selected bot is more dependent on that stats. Where it is inside, the bot is less sensitive. The gray rings represent percentage point thresholds, with the outer band being 100% contribution, and the inner band -100% contribution. The thick grey ring in the middle represents 0% contribution.

Below the charts is a complete table of all stats for the selected bot. This table includes weight values excluded by the pie chart for being too small, as well as weights corresponding to any stats or traits the bot doesn't normally have (these are dimmed).

Each stat can be read in a few ways in the different columns. The “Value” column describes a bot's in-game stat value. The “Weight” column indicates the increase (or decrease) in score if that stat is increased by 1. The “Score” and “Percent” shows the total score for the actual stat value and that product as a percentage of total score. In addition, the table includes all zero stats in the selected bot. Zero stats always contribute 0% to the unit score but can still be used as hypotheticals describing an increase or decrease. Lastly, note that values for boolean properties such as the SMALL or FLYING traits are treated in the table as though they have a value of 0 or 1, so adding BIG to a unit would increase its score by the list amount, while removing the ability to target FLYING would decrease by the listed amount.

While these outputs have less direct application to gameplay being less succinct or meaningful in constructing a deck, they do provide a fascinating lens into the game balance. In theory, an under or over performing unit can be brought in line by adjusting any one or several of its stats, but this analysis allows us to describe which stats are likely to bring larger and smaller changes. Evaluating the accuracy of this section going forward wll mean comparing the performance of balance patches with the breakdown computed here.

Correlations

View each bot in terms of its best and worst pairings, collected into several bins.

Select a bot with the dropdown. Correlating bots are presented as friendly in the top half and hostile in the bottom. They are highly anti-synergistic in the left bin, neutral in the central bin, and highly synergistic in the right-most bin.

Select alternate groupings for comparison using the second dropdown.

Select Bot:

Crab

Hunter

Guardian Shield

Recall

Recall Hunter

Scorpion

Beetle

Blink

Blink Hunter

Gunbot

Missilebot

Wasp

Hornet

Knight

Crossbow

Ballista

King Crab

Crusader

Bomber

Shocker

Recall Shocker

Mortar

Swift Shocker

Heavy Hunter

Destroyer

Raider

Turret

Heavy Ballista

Gargantua

Sniper

Advanced Blink

Assaultbot

Advancedbot

Behemoth

Advanced Mortar

Blaster

Butterfly

Dragonfly

Falcon

Airship

Advanced Recall

Mammoth

Stinger

Flak Turret

Bulwark

Katbus

Locust

Kraken

Predator

Valkyrie

Artillery

Advanced Destroyer

Shade

Select Grouping:

This is the final analysis of the system output weights. The previous sections were concerned with the results of regression coming from measuring a bot's contribution to its own score while these represent the collected effects of bots on each other.

And that's it. If you came here all the way from the Deck Builder, this is where the correlation values for deck power are drawn from. In the same way as the deck builder, the aggregate bot weights and these correlations power the mock ladder section that follows.

Under the Hood

This section will go into a bit more detail about the workings of the model and the assumptions that underpin it. TODO

Future Work

Version 1.0: A non-exhaustive list of features to come...

Add aggressive and defensive positioning logic,
Add defensive combat logic,
DONE Add new beta 2 bots,
DONE Update bot stats for beta 2,
CLI option for generating specific deck match-ups,
CLI option for generating specific deck match-ups,
Add unit movement (kiting) between attack cooldowns,
DONE Program OVERCLOCK_START and OVERCLOCK_STOP events,
DONE Program BLINK_START and BLINK_STOP events,
DONE Program RECALL_START and RECALL_STOP events,
DONE Program SETUP_START and SETUP_STOP events,
DONE Program UNSETUP_START and UNSETUP_STOP events,
DONE Program DETONATE events,
DONE Logic for ability use: Detonate,
Logic for ability use: aggressive Overclock (use on cooldown),
Logic for ability use: defensive Overclock (same as aggro),
Logic for ability use: aggressive Blink (Blink to get in range of target),
Logic for ability use: defensive Blink (Blink back on low HP),
Logic for ability use: aggressive Recall (none),
Logic for ability use: defensive Recall (Recall on low HP),
DONE Logic for ability use: aggressive Setup (leapfrog),
Logic for ability use: defensive Setup (static defense),
Fix units moving when dead,
Fix untargetable units at map boundary,
Implement and compare Glicko/Glicko2 with Elo performance for ladder,
DONE Multi-thread report writing,
Multi-thread ladder match-making,
SIMD for matrix library (maybe),
Make code presentable,

Version 2.0:

Human-ish APM,
Human-ish kiting and stutter stepping,
Human-ish focus fire,
Human-ish ability use,
Fuzzy logic to scale Human-ness with Skill setting,
Iterate ladder in "days" of a gaussian number of games, with players changing decks each day,
Add to a subset of players a tendancy to copy Top Ace decks from the previous day,
Add a chance for higher skill players to pick high performing bots,
Generalize Monte Carlo code and fully separate from BA data, allowing reuse on other titles,

Version 3.0 (ie: optional stuff I may never get to):

Javascript version of sim, allowing users to plug in bots and get a summary of outcomes,
Create page for comparing simulations between patches,

Contact Info

- reddit - GitHub -