Ai Optimization: Part 1

We’re starting to dial in what performance should look like for the upcoming playtest, which will be here before we realize it.

The criterion decided on for Ai Logic was ~2ms on the Game Thread per frame for 20 active Ai. It was pretty much arbitrary, just some numbers that were presented, so I went with it.

Active Ai means Ai that are doing things. That sounds reasonable, but doing things means being available for combat, and an Ai that is available for combat needs to be assessing all targets in the world that are in range.

Profiling 20 Ai in a level that were assessing all targets in the world came out at around 3ms per frame when spread out over 10 frames.

I was optimistic at this point.

In this case, all that had to happen was to increase the number of frames that Ai Logic is spread across from 10 to 20, and the time per frame dropped to less than 2ms. Updating over 20 frames was still responsive enough.
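To make the "spread over N frames" idea concrete, here is a minimal sketch of round-robin time slicing. It is an illustration only: the class and function names are hypothetical, and the real scheduler in the project may assign Ai to frames differently. The point it shows is that going from 10 to 20 frames halves how many of the 20 Ai run their logic on any one frame.

```python
from collections import defaultdict


class TimeSlicedScheduler:
    """Hypothetical round-robin scheduler: each Ai runs once every `num_frames` frames."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.buckets = defaultdict(list)  # slot index -> Ai updated on that slot
        self._next_slot = 0

    def register(self, agent):
        # Hand out slots in turn so every frame gets a similar share of the work.
        self.buckets[self._next_slot].append(agent)
        self._next_slot = (self._next_slot + 1) % self.num_frames

    def agents_for_frame(self, frame_number):
        # Only the Ai in this frame's slot run their logic this frame.
        return self.buckets[frame_number % self.num_frames]


def max_agents_per_frame(agent_count, num_frames):
    """Worst-case number of Ai that update on a single frame."""
    scheduler = TimeSlicedScheduler(num_frames)
    for agent_id in range(agent_count):
        scheduler.register(agent_id)
    return max(len(scheduler.agents_for_frame(f)) for f in range(num_frames))
```

With 20 Ai, `max_agents_per_frame(20, 10)` is 2 and `max_agents_per_frame(20, 20)` is 1. The trade-off is latency: each Ai now reacts at most once every 20 frames instead of once every 10.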

And that solved it. Thanks for reading.

So, are we to expect that the maximum number of Ai in any level in the game is… 20?

You, probably.

Taking a look at the level that the Level Team is working on right now, we see that there are currently over 100 Ai placed in the level. The cost of 100 Ai in the level is well above the previous criteria, but then 100 Ai in the level is not what the previous criteria covered anyway.

Before optimizations, Ai Logic Time in the Real Level was 3x what we wanted!

So, let’s see how much we can do with some optimizations while not giving up any capabilities for the Ai.

Optimization: Target Mapping

Ai get their targets by grabbing the list of players in the world, getting all of the destructibles within a range around them, and by being given a target directly. This is nowhere near efficient because the Ai has to sort through ALL targets and determine whether each one is hostile and should be kept, or is not hostile/attackable and should be rejected.

The idea of Target Mapping is to maintain a map of Factions to Targets for all targets in the world. This way the Ai only needs to pull from the known Factions in order to get hostile or attackable targets.

Implementation of Target Mapping is straightforward:

  • Add Character targets to the Target Mapping when they are added to the world.
  • Remove Character targets from the Target Mapping when they die or are removed from the world.
  • Add Enemy targets to the Target Mapping when they are added to the world.
  • Remove Enemy targets from the Target Mapping when they die or are removed from the world.
  • Add Destructible targets to the Target Mapping when they are added to the world and are attackable by Ai.
  • Remove Destructible targets from the Target Mapping when they are destroyed or are removed from the world.
  • Anytime that there is a Faction change, remove the current mapping and create a new mapping for the Target’s new Faction.
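The bookkeeping in the list above can be sketched as a dictionary from Faction to a set of targets. This is a minimal illustration, not the project's actual code: `TargetMap`, its method names, and the faction strings are all made up for the example, and the real system also has to handle the non-hostile interactions mentioned below.

```python
from collections import defaultdict


class TargetMap:
    """Hypothetical Faction -> targets map, kept up to date as targets come and go."""

    def __init__(self):
        self._by_faction = defaultdict(set)  # faction -> set of targets
        self._faction_of = {}                # target -> its current faction

    def add(self, target, faction):
        # Drop any stale entry first so a target is never listed under two factions.
        self.remove(target)
        self._by_faction[faction].add(target)
        self._faction_of[target] = faction

    def remove(self, target):
        # Called on death, destruction, or removal from the world.
        faction = self._faction_of.pop(target, None)
        if faction is not None:
            self._by_faction[faction].discard(target)

    def change_faction(self, target, new_faction):
        # A faction change is a remove followed by an add under the new faction.
        self.add(target, new_faction)

    def targets_for(self, factions):
        # An Ai only pulls from the factions it already knows are hostile/attackable.
        result = set()
        for faction in factions:
            result |= self._by_faction.get(faction, set())
        return result
```

An Ai that is hostile to, say, the hypothetical "Players" and "Destructibles" factions would call `targets_for(["Players", "Destructibles"])` and never touch targets from any other faction, which is where the saving comes from.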

Once Target Mapping was implemented and I accounted for all of the non-hostile target interaction that Ai need to be able to support, the performance was better.

Target Mapping saved around 35%.

Optimization: Ai Range

Originally all Ai had the same range, meaning that every Ai would pick up and lose targets at the same distance. These ranges were set across all Ai using the worst-case settings: The Heavy Archer!

The Heavy Archer has a range of around 100 meters and therefore all Ai Ranges were set to that range, which is a lot for any melee Ai.

Simply cutting the Ai Ranges in half from 100 meters to 50 meters gains another significant reduction because the Ai doesn’t need to consider as many targets. In the future, each instance of the Ai’s ranges will be carefully selected by the Level Team so that they act appropriately.
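A rough way to see why halving the range helps so much: if targets were spread evenly over flat ground (an assumption that a real level will not satisfy), the number of targets in range scales with the area of the circle, so half the radius means about a quarter of the targets. The sketch below uses hypothetical helper names and 2D positions to keep it short.

```python
def targets_in_range(ai_position, ai_range, target_positions):
    """Return the targets within `ai_range` of the Ai (2D positions, in meters)."""
    ax, ay = ai_position
    range_sq = ai_range * ai_range  # compare squared distances to avoid a sqrt per target
    return [
        (tx, ty)
        for (tx, ty) in target_positions
        if (tx - ax) ** 2 + (ty - ay) ** 2 <= range_sq
    ]


def area_ratio(new_range, old_range):
    """Fraction of the old search area covered by the new range."""
    return (new_range / old_range) ** 2
```

Under that even-spread assumption, `area_ratio(50, 100)` gives 0.25. The real saving depends on how targets are actually placed around each Ai.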

Reduced Ai Range dropped Ai Logic Time significantly.

Very close to the 2ms mark for 100 Ai! Making good progress.

Optimization: Reachable Queries

Reachable Queries are how I am determining what targets an Ai should consider for Melee Attacks. If an Ai knows that they cannot reach a target they shouldn’t consider that target.

For example, if you have a melee-only Ai and there is no way for that Ai to get to a target, then the Ai shouldn’t consider that target and should instead consider another target or go back to what it was doing.

For Ai that have Ranged Attacks that can hit the target in the above scenario, the Ai will use their ranged attacks instead as expected.

Reachable Queries were set to test up to 5 points around each target in order to determine the reachability of that target.

The optimization is to turn that value down to 3 points around each target and to orient those points around each target in a better pattern.
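Here is one way such a query could look, as a sketch only. The post does not say what the improved pattern is, so the layout here is an assumption: points evenly spaced on a circle around the target, with the first point on the side facing the querying Ai so the most likely approach spot is tested first. `can_path_to` is a hypothetical callback standing in for whatever navigation query the engine provides.

```python
import math


def sample_points_around(target, querier, radius, count):
    """Evenly spaced 2D points around `target`; the first one faces `querier`."""
    tx, ty = target
    qx, qy = querier
    base_angle = math.atan2(qy - ty, qx - tx)  # direction from target toward the Ai
    step = 2.0 * math.pi / count
    return [
        (tx + radius * math.cos(base_angle + i * step),
         ty + radius * math.sin(base_angle + i * step))
        for i in range(count)
    ]


def is_reachable(target, querier, radius, count, can_path_to):
    """True as soon as any sample point is reachable; remaining points are skipped."""
    return any(
        can_path_to(point)
        for point in sample_points_around(target, querier, radius, count)
    )
```

Dropping `count` from 5 to 3 caps the worst case (an unreachable target) at three path tests instead of five, and the early exit in `is_reachable` means a reachable target often costs only one.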

Now we’re cooking with gas.

Better reachable query helped get down below 2ms for 100 Ai.

Looking Back

So with these optimizations, where does the original scenario of 20 Ai at 2ms per frame sit?

20 Ai clock in at around 1.3ms across 10 frames post optimization.

Note the Frames column.

Conclusion

Going through this process was really educational. I learned a lot about the Ai system that I built and where its bottlenecks are. In the future I plan on doing further optimizations:

  • Better control of what Ai are active/spawned in the level so that only the Ai that need to be in the level are there. This is actually already in progress because the needed functionality for on-demand spawning of Ai or on-demand waking up of Ai has been added to the new project.
  • Decouple non-combat and combat portions of Ai logic so that only Ai that are in combat consider their combat abilities.
  • Use Async Traces.

Goals

My goal for our Ai is to get >100 Ai to below 2ms across 10 frames. Still a way to go.

Disclaimer

All of these numbers were generated on AMD Ryzen 9 5900X and AMD Ryzen 9 5950X CPUs at stock frequencies. In the near future I will be testing these numbers using the min spec machine, which, from now on, I will call The Beast. EDIT: First tests using The Beast show times that are roughly double the above times (example: Ai logic that takes 1.8ms on the 5900X takes ~3.6ms on The Beast).