
Molehill / ND2D – speeding up the engine

 

from: http://www.nulldesign.de/2011/04/07/molehill-nd2d-speeding-up-the-engine/

 


 

One of the biggest challenges in modern computer graphics is still the high cost of rendering thousands of different objects, no matter how simple they are. While developing ND2D, I’m experimenting with different techniques to get good performance.

 

To optimize the rendering you have to know its weaknesses. As a simple rule: every state change on the graphics context (Context3D), and especially every drawTriangles() call, costs a lot of processing power. You’ll notice pretty fast that if you try to render 2000 sprites (a sprite is just two textured triangles, so 4000 triangles in total) with one draw call per sprite, the overhead is so high that the output looks more like a slideshow than a smooth animation. The solution is simple: make as few state changes and draw calls as possible. The implementation is a bit more work…

 

So how do you save draw calls? The answer is geometry batching. Instead of drawing one sprite per draw call, you draw multiple sprites in a single call. To get it to work, you have to dig a bit deeper into shader programming and the graphics hardware:

Single sprite per draw call:


A sprite consists of two triangles, each triangle has three vertices, and each vertex has the following attributes: x,y,z, u,v. This will be the format for our vertex buffer. The shader input parameters (constants) will be the mvp (model-view-projection) matrix, a color (to tint a sprite and to enable transparency) and of course the texture image (image4). This way you’re able to draw one sprite per call, pretty easy and straightforward… but slow.
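To make the vertex format concrete, here is a minimal Python sketch of the data for one sprite. It is only an illustration, not ND2D code: the helper name `make_sprite_quad` is hypothetical, and it assumes the two triangles share two corners through an index buffer, so only four vertices are stored.

```python
# Sketch only: make_sprite_quad is a hypothetical helper, not part of ND2D.
FLOATS_PER_VERTEX = 5  # x, y, z, u, v


def make_sprite_quad(width, height):
    """Return (vertices, indices) for one sprite centered on the origin."""
    hw, hh = width / 2.0, height / 2.0
    vertices = [
        # x,   y,   z,   u,   v
        -hw, -hh, 0.0, 0.0, 0.0,  # corner 0
         hw, -hh, 0.0, 1.0, 0.0,  # corner 1
         hw,  hh, 0.0, 1.0, 1.0,  # corner 2
        -hw,  hh, 0.0, 0.0, 1.0,  # corner 3
    ]
    # Two triangles; corners 0 and 2 are shared (assumption: indexed drawing).
    indices = [0, 1, 2, 0, 2, 3]
    return vertices, indices


vertices, indices = make_sprite_quad(64, 32)
print(len(vertices) // FLOATS_PER_VERTEX, "vertices,", len(indices) // 3, "triangles")
```

With one draw call per sprite, this small buffer is submitted once for every sprite, together with a fresh matrix and color.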

Improvement, batching calls:


You can only batch calls if all the sprites you want to draw share the same texture (setting a texture is also pretty expensive). The main idea is that you pass multiple mvp matrices and multiple colors to the shader instead of just one. Within the shader, a different mvp matrix is used depending on which sprite is being drawn. But how many values can you pass to the shader? Today’s graphics hardware has at least 128 constant registers available on the GPU, so to stay compatible with all the different graphics cards out there, the Molehill API limits you to 128. In the following picture you can see the different inputs that are available to the vertex shader. We won’t bother with temp registers and input vectors for now, because it’s unlikely that we’ll run out of those while drawing sprites. Just keep in mind that the vertex shader has limited storage space: in our case, 128 constants.

 



 (Image taken from the DX8 SDK documentation)

 

A single register can hold a float4. So, let’s do some simple math. Each sprite needs 4 registers for its matrix (4 x float4) and one for its color, 5 in total: 128 / 5 = 25.6, rounded down to 25. We should be able to batch 25 sprites into a single draw call. But how does the shader know which matrix to use? To provide this information to the shader, we simply add a batch identifier to the vertex buffer: x,y,z, u,v, batchID. The vertex shader could then look like this:

 

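...
// One clip-space (mvp) matrix per sprite in the batch: 25 x 4 = 100 registers
parameter float4x4 clipSpaceMatrix[25];
 
void evaluateVertex()
{
    // batchID comes from the vertex buffer and selects this sprite's matrix
    vertexClipPosition = vertexPosition * clipSpaceMatrix[batchID];
}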
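What the shader does per vertex can be emulated on the CPU. The following Python sketch is only an illustration of the selection by batchID; the function names are hypothetical, and the row-vector convention with the translation in the last matrix row is an assumption made for the example, not something taken from the Molehill documentation.

```python
# CPU emulation sketch of the batched vertex shader (names are hypothetical).

def transform(vertex, matrix):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix given as four rows."""
    return tuple(
        sum(vertex[row] * matrix[row][col] for row in range(4))
        for col in range(4)
    )


def translation(tx, ty):
    """Stand-in for a full mvp matrix: a plain 2D translation (assumption)."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [tx, ty, 0.0, 1.0],
    ]


def evaluate_vertex(position, batch_id, clip_space_matrices):
    """Pick the matrix of the sprite this vertex belongs to, then transform."""
    return transform(position, clip_space_matrices[int(batch_id)])


# Two sprites in one batch, each with its own matrix.
matrices = [translation(0.0, 0.0), translation(0.5, -0.25)]
corner = (0.25, 0.25, 0.0, 1.0)

print(evaluate_vertex(corner, 0, matrices))  # sprite 0: unchanged position
print(evaluate_vertex(corner, 1, matrices))  # sprite 1: moved by its own matrix
```

The same corner position ends up in different places depending only on the batch identifier stored with the vertex, which is exactly what lets one draw call place many sprites.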

 

Yay! We just batched our draw calls and the engine will run a lot faster for sprites with the same texture.
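On the CPU side, the engine has to sort sprites into groups before drawing. Here is a rough Python sketch of that bookkeeping, assuming sprites are simple (name, texture) pairs; `build_batches` is a hypothetical helper and not how ND2D actually organizes its scene.

```python
from collections import defaultdict

CONSTANT_REGISTERS = 128      # vertex shader constant limit quoted above
REGISTERS_PER_SPRITE = 4 + 1  # 4 for the mvp matrix, 1 for the color
MAX_BATCH_SIZE = CONSTANT_REGISTERS // REGISTERS_PER_SPRITE  # 128 // 5 = 25


def build_batches(sprites, max_batch_size=MAX_BATCH_SIZE):
    """Group (name, texture) pairs by texture and cut each group into batches.

    Returns a list of (texture, [(batch_id, name), ...]); every entry
    corresponds to one draw call.
    """
    by_texture = defaultdict(list)
    for name, texture in sprites:
        by_texture[texture].append(name)

    batches = []
    for texture, names in by_texture.items():
        for start in range(0, len(names), max_batch_size):
            chunk = names[start:start + max_batch_size]
            # batch_id is the sprite's slot in the matrix and color arrays.
            batches.append((texture, list(enumerate(chunk))))
    return batches


sprites = [("sprite%d" % i, "hero.png") for i in range(2000)]
print(len(build_batches(sprites)), "draw calls instead of", len(sprites))
```

For 2000 sprites sharing one texture, this comes out at 80 draw calls instead of 2000. Sprites with different textures still end up in separate batches.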

But there is more… Right now, we can only batch sprites that share the same texture. Wouldn’t it be great if we could batch just about everything? There is a technique called a texture atlas. The idea is pretty simple as well: instead of using different textures, you “bake” every texture used in your game into a single big texture like this: Pocket God Texture Atlas. All you have to do then is adjust the UV coordinates of your sprites so they point at the original texture’s region inside the big one. Generating a texture atlas at runtime and adjusting the UV coords is in fact a bit more work…
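The UV adjustment itself is a small piece of arithmetic. As a sketch, assuming each original texture occupies a known pixel rectangle (x, y, width, height) inside the atlas, a hypothetical `remap_uv` helper could look like this in Python:

```python
def remap_uv(u, v, region, atlas_width, atlas_height):
    """Map a sprite's original UV (0..1) into its region of the atlas.

    region is (x, y, width, height) in atlas pixels (assumed layout).
    """
    x, y, width, height = region
    return ((x + u * width) / atlas_width, (y + v * height) / atlas_height)


# A 128x64 texture stored at pixel (256, 0) of a 512x512 atlas.
region = (256, 0, 128, 64)
print(remap_uv(0.0, 0.0, region, 512, 512))  # one corner of the region
print(remap_uv(1.0, 1.0, region, 512, 512))  # the opposite corner
```

A real atlas usually needs some padding between regions to avoid neighboring textures bleeding into each other when filtering; that is left out here.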

Have fun exploring the GPU ;)

 
