It's time for another blog post! In this one, we'll be delving deeper into all the brick optimizations we've made behind the scenes, and how they'll improve your experience playing the game in the future. Turns out there are quite a few!
Unreal Engine uses PhysX to provide collision and process scene queries, such as detecting whether characters have a floor under them to walk on. Since we want players to be able to walk on bricks and interact with them, that means we have to register every brick with PhysX in some way.
In Alpha 4, we were dividing the brick grid into chunks, then simply adding a collider to the scene for every brick and letting PhysX figure out the rest. This works reasonably well for small builds, but it quickly falls apart when loading larger ones. If you've visited Sylvanor's Brickadia City server, you might have noticed frame drops whenever a brick was placed, and that the game became completely unplayable while downloading bricks:
So what happened here? It turns out PhysX was maintaining a global acceleration structure containing all static colliders in the world, and whenever we inserted a brick this structure was marked for a full rebuild. PhysX is able to spread this rebuild over several frames, but it is still enough to bring the game to its knees. We needed to do better.
For Alpha 5, we've upgraded the PhysX SDK used by Unreal Engine to version 4.1, so we can take advantage of several new features that help us solve the issue.
The first of these is actor-centric scene queries. It allows us to properly split the brick grid into a number of chunks which then use their own self-contained acceleration structure, eliminating the global rebuild on placing new bricks. We now only have to rebuild a much smaller chunk, which can be done without causing any frame drops.
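To make the idea concrete, here is a minimal sketch of why per-chunk structures help. It is not the game's code or the PhysX API: `ChunkedBrickGrid`, `chunk_key`, and the chunk size are hypothetical names and values chosen for illustration, and a plain set of positions stands in for a real acceleration structure.

```python
from collections import defaultdict

CHUNK_SIZE = 32  # hypothetical chunk edge length, in grid units


def chunk_key(pos, chunk_size=CHUNK_SIZE):
    """Map a brick position (x, y, z) to the key of the chunk containing it."""
    x, y, z = pos
    # Floor division keeps negative coordinates in the correct chunk.
    return (x // chunk_size, y // chunk_size, z // chunk_size)


class ChunkedBrickGrid:
    """Toy model: each chunk owns its own structure and its own dirty flag."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = defaultdict(set)  # chunk key -> brick positions
        self.dirty = set()              # chunks that need a rebuild

    def insert(self, pos):
        key = chunk_key(pos, self.chunk_size)
        self.chunks[key].add(pos)
        self.dirty.add(key)  # only the containing chunk is invalidated

    def rebuild_dirty(self):
        """Rebuild dirty chunks; return how many bricks had to be reprocessed."""
        touched = sum(len(self.chunks[key]) for key in self.dirty)
        self.dirty.clear()
        return touched
```

In this model, placing one brick into a build only reprocesses the bricks that share its chunk, rather than every brick in the world.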
The second is improved broad phase data structures. This is related to actual physics simulation, such as ragdolls bouncing around on bricks. PhysX has previously had the ability to mark a group of shapes as an aggregate that should be considered one object, preventing expensive global updates when modifying individual shapes. However, this used to be limited to a small number of shapes, meaning we couldn't use it for bricks.
In PhysX 4.1, this limitation has been removed, so we can now create an aggregate combining the colliders of each chunk of bricks. PhysX 4.1 also comes loaded with many unrelated, general performance improvements. So, did all of this make a difference?
It's a huge success! In Alpha 5, these optimizations have eliminated all frame drops from placing bricks, and the game runs just as smoothly while downloading 100000 new bricks per second from a server. But that's not all!
Loading and Saving Optimizations
If you've loaded a large build before, you've probably noticed that it freezes the game for a long time. Uploading save files also takes quite a while. For example, here it took about 12 seconds to upload this build to a local server and then another 16 seconds to load it:
It turns out we can do much better than that! For Alpha 5, we've fixed the save file upload not sending the data as fast as it should, and significantly optimized the time it takes to create the bricks. The slowest part of creating bricks, registering the collision with PhysX, has also been time sliced over multiple frames.
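Time slicing can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `TimeSlicedRegistrar` is a hypothetical name, the budget here is a fixed brick count per frame (a real system might budget by elapsed time instead), and appending to a list stands in for the expensive collision registration.

```python
from collections import deque


class TimeSlicedRegistrar:
    """Registers queued bricks a few at a time, one batch per frame."""

    def __init__(self, per_frame_budget):
        self.per_frame_budget = per_frame_budget  # assumed: max bricks per frame
        self.pending = deque()
        self.registered = []

    def enqueue(self, bricks):
        self.pending.extend(bricks)

    def tick(self):
        """Process at most per_frame_budget bricks; return how many were done."""
        done = 0
        while self.pending and done < self.per_frame_budget:
            # Stand-in for the expensive collision registration call.
            self.registered.append(self.pending.popleft())
            done += 1
        return done
```

Calling `tick()` once per frame keeps each frame's cost bounded, at the price of collision for freshly loaded bricks arriving a few frames later.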
Saving bricks has been optimized too. This one was quite silly, as it turns out we were unnecessarily resizing a buffer for every single brick - so that's about two million useless, large allocations for a build like Brickadia City. Fixing this one-line mistake made saving twice as fast, just like that. If you've used an autosave script on your server, you'll now have much shorter hitches!
Replication Bandwidth Optimizations
Sending millions of bricks to every client uses quite a lot of bandwidth. In Alpha 4, a local dedicated server would be able to send you somewhere around 40000 bricks per second, so it takes quite a while to actually download the full Brickadia City build.
For Alpha 5, we've significantly reduced the amount of data that has to be sent by using delta compression. For example, if 100 bricks are sent in a row that all use the same brick type, we only send this full property once and a skip bit for the remaining ones. As a result, a local server can now send you up to 100000 bricks per second with the same bandwidth.
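The skip-bit idea for a single property can be sketched like this. The record layout is an assumption for illustration only (the real wire format is not described here), and `encode_types` and `decode_types` are hypothetical names.

```python
_UNSET = object()  # sentinel so the first brick always sends its full value


def encode_types(brick_types):
    """Encode brick types as (changed, value) records.

    changed == 0 is the skip bit: the type matches the previous brick,
    so no value is sent. changed == 1 carries the full value.
    """
    records = []
    previous = _UNSET
    for brick_type in brick_types:
        if brick_type == previous:
            records.append((0, None))
        else:
            records.append((1, brick_type))
            previous = brick_type
    return records


def decode_types(records):
    """Rebuild the original sequence from (changed, value) records."""
    out = []
    previous = _UNSET
    for changed, value in records:
        if changed:
            previous = value
        out.append(previous)
    return out
```

For 100 bricks of the same type in a row, this sends the full type once and 99 skip bits, and `decode_types(encode_types(types))` returns the original list.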
You'll likely see lower rates from servers over the internet - but the improvement is going to be there too. If you're hosting from home with a low upload speed, other players will see the builds on your server much faster than before.
Bulk Edit Optimizations
The selection tool allows you to quickly copy, delete, paste, or cut huge numbers of bricks. However, in Alpha 4 this was quite slow in practice, especially deleting. Take this example of removing a large section from the Brickadia City build:
What's going wrong here? Turns out this was also related to the PhysX colliders, but in a different way. For every chunk of the build, we create one actor to contain many shapes for the bricks. When the shapes are deleted again, PhysX performs some extremely inefficient operations that scale quadratically with the number of shapes in the chunk.
In Alpha 5, we've decided to simply cut the chunk size in half since we weren't able to directly fix the performance issue in PhysX. The chunk is a cube, so halving its edge length makes its volume 8 times smaller, and it contains on average 8 times fewer bricks. Since the inefficient operation scales quadratically, that means we should now be a whopping 64 times faster on an individual chunk! But wait, we also have 8 times as many chunks to process now, so only an 8x improvement can actually be expected.
Of course, deleting so many bricks also involves other operations besides this really slow one, so the final improvement will be smaller, but it's still noticeably faster.
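The arithmetic can be checked with a toy cost model. The brick counts below are made-up round numbers, and `relative_delete_cost` is a hypothetical helper that only models the quadratic term.

```python
def relative_delete_cost(bricks_per_chunk, chunk_count):
    """Toy model: each chunk costs (bricks in chunk) squared, summed over chunks."""
    return chunk_count * bricks_per_chunk ** 2


# One old chunk with a hypothetical 8000 bricks...
old_cost = relative_delete_cost(bricks_per_chunk=8000, chunk_count=1)
# ...becomes eight half-size chunks with 1000 bricks each.
new_cost = relative_delete_cost(bricks_per_chunk=1000, chunk_count=8)

per_chunk_speedup = 8000 ** 2 / 1000 ** 2  # 64.0
overall_speedup = old_cost / new_cost      # 8.0
```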
Memory Usage Optimizations
We've also started attacking the rather high memory usage of the game on large builds. While this mostly affects the client, it is also apparent on dedicated servers. For example, in Alpha 4, a dedicated server (top) and a connected client (bottom) with Brickadia City loaded would look like this in the task manager:
In Alpha 5, we've achieved much lower memory usage with the same build loaded:
The most significant improvement, which only affects the client, is the removal of the coverage link lists stored for every brick. Whenever a brick is inserted into the grid, we search for adjacent bricks and check whether any faces are covered between them. Previously, we would store the result of this search to speed up future changes to the brick, such as making it transparent, which requires uncovering the adjacent bricks.
What we didn't realize at the time, though, is how large that data is: over 8 million links stored in 2 million individual lists, adding up to almost 1GB with memory management overheads. So we've just removed this entire cache in Alpha 5. Modifying an existing brick has become ever so slightly slower as we need to search for adjacent ones again to uncover them, but that's totally worth it.
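As a rough illustration of trading the cache for a search on demand, here is a simplified sketch. It assumes every brick is a unit cube stored by position in a set, which real bricks are not, and `adjacent_bricks` is a hypothetical name.

```python
# The six face-adjacent directions of a unit cube.
NEIGHBOR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def adjacent_bricks(grid, pos):
    """Recompute the occupied neighbors of pos on demand; nothing is cached."""
    x, y, z = pos
    candidates = ((x + dx, y + dy, z + dz) for dx, dy, dz in NEIGHBOR_OFFSETS)
    return [neighbor for neighbor in candidates if neighbor in grid]
```

The lookup costs a handful of set queries each time a brick changes, but no per-brick list has to stay in memory between changes.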
Further improvements affecting both the client and server can be attributed to removing various unnecessary buffers that were left over from resizable brick generation, a more efficient storage system for the properties of each of the millions of bricks, and improvements related to the previously mentioned upgrade to PhysX 4.1.
Startup Time Optimizations
You didn't think opening the game in one second was fast enough when running it off an NVMe SSD? Yeah, me neither. So obviously that had to change. By removing some unnecessary dependencies that were pulling unused assets into the menu, combined with general startup improvements in Unreal Engine 4.24 and 4.25, the game now takes only half a second to open. Progress!!
Rendering Optimizations
The final and most important optimizations we've made for Alpha 5 are to the brick rendering system. To cut it short: you can expect roughly double the FPS with huge builds like Brickadia City loaded. Smaller builds will be affected less, but should still run a little faster.
This post is getting quite long though, so let's cut it here for now. Stay tuned for the next technical post where we'll go into more detail about the rendering optimizations!