Well, not really at all - but it's something every project needs to stop and work on every once in a while to make sure things run smoothly. As we race towards the deadline for submission to the IGF, we have had to come to terms with the fact that performance was not where we needed it to be with 16 players on screen. In a game that focuses so much on physics, flying, and tumbling around, not to mention driving at high speeds, you really need steady performance in order to have the most fun.
So for the last week and a half we have been overhauling certain parts of the graphics in order to speed things up. With 16 players on screen at once, we were getting around 20 fps on our decent rigs, so we had a lot to work on.
The main thing we had to work on was the "batch count" in the game, which is basically a measure of how many separate drawing tasks the computer has to hand to the video card in order to render everything on the screen. Grouping as many things as you can into each of these batches is generally the best way to make things faster. So we set about figuring out where every single one of our batches was going. Right off the bat, we found that we could treat the level entirely as one piece and significantly reduce the number of batches it took to render. This was our first easy fix, and as anyone in development will tell you, there is nothing quite so sweet as discovering an easy fix. Once we addressed that, things were running really fast with nobody playing. We knew the performance issues were in large part tied to the number of players in the game, so Brian hooked up a key to pop a fake client into the game. Now it was easy to drop 15 other players into the game and test performance.
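To make the batch idea a bit more concrete, here is a toy sketch in Python. It is not how Ogre does it and not our actual level data: the piece names and materials are made up, and it assumes that pieces sharing a material can be merged into a single batch.

```python
from collections import defaultdict

# Hypothetical level pieces as (name, material) pairs; not from the real game.
level_pieces = [
    ("floor_a", "concrete"),
    ("floor_b", "concrete"),
    ("ramp", "concrete"),
    ("wall_a", "brick"),
    ("wall_b", "brick"),
]

def batches_unmerged(pieces):
    # Every separate piece is submitted on its own: one batch each.
    return len(pieces)

def batches_merged(pieces):
    # Assumption: pieces that share a material can be combined into one
    # group, so the batch count becomes the number of distinct materials.
    groups = defaultdict(list)
    for name, material in pieces:
        groups[material].append(name)
    return len(groups)

print(batches_unmerged(level_pieces), batches_merged(level_pieces))
```

In this made-up level, five separate pieces collapse into two batches. The real savings depend entirely on how many pieces and materials a level actually has.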
Not too surprisingly, performance was still underwhelming, although a bit better because of our changes to the environments. We decided to use NVIDIA's PerfHUD to step through all the rendering steps, which was pretty easy to do since Ogre already supports the tool.
After some testing we figured out that every new player added to the game averaged around 20 new batches to display, which multiplied by 16 added up to a lot. This was because each player's stuff was made up of a lot of different customizable items: there are separate karts, characters, hats, accessories, and wheels. That is only 5 different items (8 if you count every wheel), but each item could have any number of materials on it (in order to make it look awesome, of course), and the kart and the wheels were also casting shadows, each of which made up its own batch.
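Just to put numbers on "a lot", here is the back-of-the-envelope math as a tiny Python sketch. The 20 batches per player is the rough average mentioned above; the function name is made up for illustration, and the level's own batches are left out.

```python
def player_batches(players, batches_per_player=20):
    # Each player adds roughly the same number of batches,
    # so the total grows linearly with the player count.
    return players * batches_per_player

for n in (1, 4, 8, 16):
    print(n, "players ->", player_batches(n), "batches")
```

With a full game of 16 players, that works out to around 320 batches for the players alone.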
So, optimally, we needed each player to use the absolute minimum number of batches, which would be 5, since each player loads 5 meshes. To achieve this, we created a simple shader that lets us do the effects we were using, such as color masks, environment mapping, and rim lighting, all in one material. Each character, kart, hat, etc. could now be rendered in one batch, since everything on it used a single material. Since we are shader noobs, it took us a few days to get the shader working, as well as a few days to move all of the items we had previously made to this new material format and make sure they would work. Brian also created a shader especially for the wheels that made use of Ogre's model instancing to render all 4 wheel models in one batch instead of four. After all these changes, we succeeded in reducing each player to 5 batches instead of 20! Huzzah!
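Here is a rough Python sketch of how the count gets down to 5. It is only a model, not our rendering code: it assumes every item now uses exactly one material, treats "accessory" as a single item, and the `instanced` parameter is a made-up stand-in for the wheel instancing.

```python
# Item types from the post, with how many copies each player has.
# Assumption: every item uses exactly one material after the overhaul.
ITEMS = {"kart": 1, "character": 1, "hat": 1, "accessory": 1, "wheel": 4}

def batches_per_player(items, instanced=("wheel",)):
    total = 0
    for name, copies in items.items():
        # An instanced item draws all of its copies in a single batch;
        # anything else costs one batch per copy.
        total += 1 if name in instanced else copies
    return total

before = 20  # rough per-player average before the overhaul
after = batches_per_player(ITEMS)
print(after, 16 * before, 16 * after)
```

Without the wheel instancing, the same model gives 8 batches per player (one per wheel). With it, a 16-player game drops from roughly 320 player batches to 80.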
We also decided to move away from the stencil shadows we were using to create shadows under the karts and items. They looked great, but the way they are created became more and more of a bottleneck as more players appeared on screen. We dug into Ogre's texture shadow system and set up some render-to-texture shadows that, while not totally inexpensive, at least perform better with many shadows being cast at once.
All that work, and hopefully nobody will ever know about it when they go to play the final game! Here is a screenshot with myself and 15 test players onscreen, with Ogre's performance display.
We still have a lot of improvements to make, the soonest of which will probably be hooking up our new shader to utilize hardware skinning in order to move some of the animation cost to the GPU. But we are off to a good start! Already the game feels a lot more responsive with lots of players, and that's fun for everybody!
P.S. Here is the 3DMark score for the computer this screenshot was taken on.