Element 61

Monday, February 20, 2006

Geomipmapped Terrain, Part 1: The Basics

Geomipmapping is a fairly well known technique at this point. It's a relatively easy algorithm to implement, and it provides extremely good results on medium-sized terrain. It is, however, somewhat more difficult to implement optimally, and the original paper leaves out a few important things, like how to deal with cracks in the landscape or how to compute errors for actual terrain data. There also aren't very many reference implementations, so seeing an example of it actually running is somewhat difficult. I'm one of the few people who has bothered to share significant code or methodology (on GameDev.Net) and to explain all of the necessary steps to create a complete and usable terrain system. Since those explanations are scattered across a number of forum posts, and because I recently finished re-engineering my system into a setup that I believe to be near-optimal, it made sense to describe everything in detail here.

First of all, it's necessary that you read Willem H. de Boer's original article, titled Fast Terrain Rendering Using Geometrical MipMapping. In the paper, de Boer discusses two things of note. One is the method of detail reduction: removing every other vertex at each level. The other is the screen-space error metric, which allows us to determine when to change our level of detail (abbreviated LoD from here on out). The detail reduction method is simple and indeed obvious; there is nothing groundbreaking about it. It's the screen-space error metric that makes geomipmapping different from earlier algorithms, which were generally ad-hoc distance-based mechanisms. Until you've put that screen-space metric in, what you have isn't geomipmapping.
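To make that metric concrete, here's a minimal sketch of the usual simplification of de Boer's formula. The names and parameterization are mine, not the paper's: given a mip level's maximum geometric error (delta, in world units), it computes the minimum camera distance at which that level stays under an error budget of tau pixels.

using System;

static class LodMath
{
    // The projected size of a world-space error delta at distance d is roughly
    //   epsilon = (delta / d) * (verticalResolution / (2 * tan(fovY / 2)))
    // Solving epsilon <= tau for d gives the switch distance for the level.
    public static float MinDistanceForLevel(float delta, float tau,
                                            float verticalResolution, float fovY)
    {
        return delta * verticalResolution /
               (2.0f * tau * (float)Math.Tan(fovY / 2.0f));
    }
}

In practice you precompute this distance per patch, per level, and compare squared distances at runtime so there's no square root in the inner loop.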

Most of the actual constraints on our terrain are described fairly well by de Boer himself. It'll be a uniform, square heightmap whose dimensions are 2^n + 1. The plus one is critical here; most introductory terrain material, the NeHe tutorial for example, uses a simple 2^n heightmap. As it turns out, this breaks level of detail and complicates the quadtree as well.
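To see why the plus one matters, here's a quick sanity check with sizes picked purely for illustration. Neighboring patches share their edge vertices, so it's the vertex intervals, not the vertex counts, that have to divide evenly; and a 2^k + 1 patch keeps landing on 2^k + 1 sizes as you remove every other vertex.

using System;

static class GridCheck
{
    static void Main()
    {
        const int mapSize = 1025;    // (1 << 10) + 1
        const int patchSize = 33;    // (1 << 5) + 1

        // Adjacent patches overlap by one vertex along shared edges:
        // (1025 - 1) / (33 - 1) = 32 patches per side, no remainder.
        Console.WriteLine((mapSize - 1) / (patchSize - 1));

        // Each LoD level removes every other vertex and stays at 2^k + 1.
        for (int verts = patchSize; verts >= 3; verts = (verts - 1) / 2 + 1)
            Console.WriteLine(verts);    // 33, 17, 9, 5, 3
    }
}

A 2^n-sized grid, by contrast, has an odd number of intervals (2^n - 1), so the halving never divides evenly, and that's exactly what breaks the LoD.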

I want to emphasize iterative development here as well. We'll start off with a simple terrain implementation and add more capabilities to it, keeping it in a usable state all the while. You don't want to write a thousand lines only to come up with a black screen. It's damn near impossible to debug that much code at once, especially when several dozen bugs may be scattered through various parts of the system, interacting in unpredictable ways. The goal is that whenever we make a breaking change, the damage is easily localized to a small subset of the code.

As for the technical stuff: I'll be doing code samples in C#/MDX, but they should be easy to follow no matter what your persuasions (or perversions, as the case may be). You OpenGL people should make sure you're comfortable with VBOs; this is a no vertex array (or god forbid immediate mode) zone.

The goal is to render our terrain as fast as possible using 16 bit indices and a minimized number of batches. (These two goals conflict -- there's a quick back-of-the-envelope sketch at the end of this post -- and we'll handle that later.) We want to minimize the amount of CPU time spent while maximizing GPU efficiency. These goals are aggressively pursued, often at the expense of extra memory; I'll discuss memory use at length later. For a 1025 terrain, the memory consumption is about 50 MB. It tops out at around 25 million vertex and pixel shaded tris/sec with LoD disabled on my Athlon64 3200+/X800 GTO, whose theoretical maximum triangle throughput is 38 million tris/sec (and we all know how accurate those numbers are...). In reality you won't see nearly that much triangle throughput, for the simple reason that cards these days can shrug off 20K or 50K triangles per frame without caring at all; you get bottlenecked in other areas at that point and benchmarks become meaningless.

So that's all for now. Next post I'll dive in and get things wired up. Keep in mind that since this is a blog, I won't be giving complete working code; there are too many dependencies on other parts of the engine anyway. I'll show enough that you can see what's going on. The engine source will be available in full sooner or later anyway.
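Here's that back-of-the-envelope on the index/batch conflict. The sizes are illustrative rather than my exact layout:

static class Budget
{
    static void Main()
    {
        // 16 bit indices can address at most 65,536 vertices per draw call,
        // but the full map has far more, so it must be split into patches --
        // and every patch is another potential batch.
        System.Console.WriteLine(1025 * 1025);   // 1,050,625 vertices total
        System.Console.WriteLine(33 * 33);       // 1,089 per 33x33 patch: fits
        System.Console.WriteLine(32 * 32);       // 1,024 patches to batch

        // Memory: with an illustrative 32 byte vertex (position + normal +
        // texcoords), the vertex data alone is already most of the budget,
        // before index buffers and per-level LoD data.
        System.Console.WriteLine(1025L * 1025 * 32);  // 33,620,000 bytes
    }
}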

Dot Plan: The Revival

So I guess this blog flared up briefly in the fall and died almost immediately. I'd like to bring it back, and to do things right this time. I'm shooting for bi-weekly updates, which seems reasonable. (Ignore the fact that I am competing in the Collegiate National Taekwondo Championships this week, and then flying down to Redmond for an interview with MS next week.)

As for content: I'm not terribly interested in the Quake 3 thing at this point, so hopefully somebody else will pick it up. Although really, this is probably a good time to learn how to cope when a big existing code base is dropped on you; it's an important job skill. The resource management articles are going to be restarted from the beginning, due to significant re-engineering. There are now two versions, one in C++ and one in C#; I'll be discussing both soon.

I'll be interleaving that discussion in between posts about the main thing I want to discuss right now, which is my completely re-engineered terrain system. Most people around GDNet know that I'm basically the resident expert when it comes to geomipmapped terrains, but I haven't made a lot of noise about the redesigned system (now in C#/MDX), which is incredibly efficient. Later on I'll probably be discussing the rest of the engine architecture, which is looking like it will turn out to be fantastic. It's quite immature at the moment, though, so I want to let it settle down first. Most likely the source to the engine will be available before the blog posts discussing its design decisions. I'll toss up the first terrain article shortly. Hopefully this can be something that works its way onto people's "things to read" list.

Tuesday, September 20, 2005

Looking at the Quake 3 Source -- Part 3

Alright then, it's finally time to look at Q3's rendering. Everything that's rendered in Quake 3 can be categorized as follows:

  • BSP -- Specifically, brush models. These convex hulls form the majority of the static level geometry.
  • Models -- Pretty much anything that's loaded from an MD3 file. These can come either as part of the map or as entities (players, weapon pickups, etc).
  • Other -- This is kind of a cop-out, I know. This covers stuff that's generated at runtime for effects (flares, lightning) and miscellaneous stuff that comes from neither the BSP map nor MD3 files.

Those of you who are familiar with Q3 BSPs will notice I've skipped curved surfaces entirely here. I don't even want to go near those things right now.
Originally I wasn't going to discuss the BSP system at all. But since there was such a huge delay in getting this part of the blog up, I figured it had better be good. If you're not familiar with the basic concept of BSP (binary space partitioning), then go read up -- there are a ton of resources on the net (The BSP FAQ is the canonical reference on the subject in amateur circles).

Now, it's important to realize that Quake 3 does NOT use the BSP tree for visibility determination, which is what BSP was originally meant for. Instead, the map compiler slices the map up into the BSP tree. Then, for every single leaf, it determines whether or not every other leaf could ever be visible from that leaf. This is called the Potentially Visible Set (PVS). All of the necessary visibility data is thus completely precomputed and stored in the map (compressed as an RLE bitset). When the engine needs to determine what to render, it has a very simple job. First, it uses the BSP tree to figure out which leaf the view point is in. Then it walks that leaf's bitset, finds every other leaf that could possibly be visible from it, and frustum culls those to come up with the final set of visible leaves. The resulting set is nearly optimal for indoor environments, making newer techniques like occlusion culling unnecessary.
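One wrinkle I'm glossing over: in the actual file format, leaves map to "clusters", and the PVS is stored per cluster. Here's a minimal sketch of that runtime job; the structures and names are stand-ins of mine, not Q3's real ones.

struct BspNode  { public int PlaneIndex, FrontChild, BackChild; }
struct BspPlane { public float Nx, Ny, Nz, Dist; }

static class Vis
{
    // Walk from the root until we hit a leaf. Quake-style trees store leaf
    // references as negative child values, decoded as -(value + 1).
    public static int FindLeaf(BspNode[] nodes, BspPlane[] planes,
                               float x, float y, float z)
    {
        int index = 0;
        while (index >= 0)
        {
            BspPlane p = planes[nodes[index].PlaneIndex];
            float d = p.Nx * x + p.Ny * y + p.Nz * z - p.Dist;
            index = (d >= 0) ? nodes[index].FrontChild : nodes[index].BackChild;
        }
        return -(index + 1);
    }

    // One bit per cluster, one row of bytes per source cluster. By this
    // point the RLE compression has already been expanded into visData.
    public static bool ClusterVisible(byte[] visData, int bytesPerCluster,
                                      int from, int to)
    {
        int bit = visData[from * bytesPerCluster + (to >> 3)] & (1 << (to & 7));
        return bit != 0;
    }
}

From there the per-frame work is exactly as described: find the camera's leaf, look up its cluster, test the bit for every other cluster, and frustum cull whatever passes.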

Lighting on the Q3 map is also precalculated, then baked into lightmaps, which are stored back into the BSP file itself. They're fairly low detail, at 16x16 per poly in the map. Although the Quake 2 lighting engine did radiosity, Quake 3's did not. (The story, I gather, is that Carmack didn't like the bleeding and indirect lighting that radiosity created.) Rendering the lighting is a simple matter of drawing another pass that is modulated against the lightmaps; again, these can be collapsed into a single multitexture pass. Everything that isn't baked into the map -- which mainly means all the movable junk, misc models, players, etc -- is dynamically lit at runtime. Quake 3 maps have light entities that are used during the precomputation process, and those light entities are also stored back into the final BSP so that they can be used at runtime.
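For concreteness, the modulate pass amounts to a per-channel multiply. This is just the blend equation written out, not engine code:

static class BlendMath
{
    // blendFunc GL_DST_COLOR GL_ZERO: result = src * dst + dst * 0.
    // Whichever of the base texture and lightmap gets drawn second,
    // the framebuffer ends up holding their product.
    static float Modulate(float src, float dst)
    {
        return src * dst;
    }
}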

Lastly, BSPs are used for all of the collisions in Quake 3. The actual BSP data is constructed entirely from convex hulls. During compilation, overlapping hulls are merged together using constructive solid geometry (CSG). Then the BSP splits everything apart so that no face spans any of the split planes. These properties can all be used for collision detection. I won't go into more detail about that; this is a post about rendering architecture, after all.

Let's move on to Models and Other. Rendering of these things in Q3 is controlled by shader files. These aren't shaders like the GPU programs that we're used to, but rather simply descriptions of how things ought to be rendered. Nowadays they'd probably be referred to as material scripts. Let's look at an example:
textures/base_button/shootme2
{
 qer_editorimage textures/base_button/metal3_3_shootme.tga
 q3map_lightimage textures/base_button/shootme_glow.tga
 q3map_surfacelight 1000
 {
  map $lightmap
  rgbGen identity
 }
 {
  map textures/base_support/metal3_3.tga
  blendFunc GL_DST_COLOR GL_ZERO
  rgbGen identity
 }
 {
  map textures/base_button/shootme_glow.tga
  rgbGen wave sin 0.5 1.0 0 .3
  blendFunc GL_ONE GL_ONE
 }
}
So this is a shader for some kind of glowing button thing. We can see it has three layers. The first is the lightmap (this would actually be applied to a brush in the map). The second is the base texture, modulated against the lightmap already in the framebuffer (that's the GL_DST_COLOR, GL_ZERO blend). The third is an additive pass using the glow texture, modulated by a sine wave, which has the end result of giving it a pulsating glow. I'm not going to go into great depth about everything a shader can have, because we'd be here all day if I did. However, I did happen to find a great manual.

Shaders are dealt with in code/renderer/tr_shader.c. All of the parser code is there, and don't expect any clever stuff -- the parser code is about as brute force as it gets. There are also a ton of misc. operations to work with and manage shaders, but most of them are not particularly interesting. You'll notice that each and every texture layer is specified separately. Q3 assumes that any effect can be done with a single texture per pass and alpha blending. (And this was quite true until the advent of pixel shaders.) The engine does detect certain blending modes as collapsible into a single multitextured pass, and will attempt to do so. It's hard-coded to only ever use two textures per multitexture pass, though, so hardware with more texture units doesn't benefit.
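As a sketch of the kind of collapse test involved (the real logic in tr_shader.c covers more cases and looks nothing like this): an opaque stage followed by a stage that multiplies by the framebuffer is equivalent to a single two-texture modulate pass.

enum StageBlend
{
    Opaque,        // no blendFunc
    ModulateDst,   // blendFunc GL_DST_COLOR GL_ZERO
    Additive       // blendFunc GL_ONE GL_ONE
}

static class Collapse
{
    // (opaque) then (framebuffer * texel) == tex0 * tex1 in one pass,
    // which is exactly the lightmap + base texture case from the example.
    public static bool CanCollapseToModulate(StageBlend first, StageBlend second)
    {
        return first == StageBlend.Opaque && second == StageBlend.ModulateDst;
    }
}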

I think it's time to take a brief look at the model rendering in Q3. Q3 models are keyframe animated, just like Q2's. The difference, though, is that the models can be sliced into multiple parts and then controlled separately. Each player has three models associated with it: head, torso, and legs. These are attached by "tags" embedded in the model. Each tag is a single triangle, used to correctly orient the attached model into the right pose. By rotating the tag that joins the head to the torso, you can make the player turn his head without having to do anything else. So the legs, torso, and head all have their own animations, and they're combined at runtime. It isn't dynamic blending of multiple skeletal animations (arguably the holy grail today), but it's pretty damn clever nonetheless.
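In transform terms, a tag boils down to an origin plus an orientation in the parent part's space, so attaching a part is one matrix concatenation per joint. A sketch with names of my own, using Matrix4x4 as a stand-in for MDX's Matrix type:

using System.Numerics;

static class Tags
{
    // Place an attached part: tag transform (child space -> parent space),
    // then the parent's own world transform. The tag itself is interpolated
    // between keyframes the same way the vertices are.
    public static Matrix4x4 AttachedWorld(Matrix4x4 tagInParentSpace,
                                          Matrix4x4 parentWorld)
    {
        return tagInParentSpace * parentWorld;   // row-vector convention
    }
}

Rotate just the head's tag over time and the head turns while the torso animation keeps playing untouched.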

That more or less covers the external behavior of the rendering subsystem. Most (all?) of this information has been public for years, but it's necessary to make sure we're on the same page before digging into the code, otherwise we'll lose people along the way, and that's never good. It sounds very simple, and externally it is. But there's a lot of code backing this stuff, particularly in processing shaders. There's stuff for processing mirrors and portals, which carry their own complications. Additionally, there's a complete command queueing backend to sort and render things in optimal order. The things in the Other category that I mentioned are usually dynamically generated geometry, and there's plenty of code for that. The list goes on and on. It's not a simple engine internally.

I hope that helps everybody. Sorry about the huge delay in writing this part. As I said earlier, I needed to get copies of both Q3 and Visual Studio. Also, there were some major technical problems that have left my main system a corpse until a few RMAs come back. Not to mention school started again, and that carries a whole load of stuff with it. Anyway, if you have any questions, please post in the comments and I promise I will answer them. The BSP section seemed a little brief, since I only covered rendering and not compilation; the latter is where the real complexity lies (and I don't quite understand it besides). The shaders are for the most part self explanatory.

Wednesday, September 14, 2005

Update...again

Ok, so apparently a few people are still watching and waiting, and I gotta say I appreciate that; it's heartening. In any case, I needed two things before continuing -- Quake 3 and Visual Studio (the reason for not having the latter is a bit of a long story). I have both, and so I promise that I will continue now. I can actually do Q3 builds now, which helps considerably. Expect stuff soon. And to make up for the long delay, I'll do a complete examination of the BSP/environment stuff (sans curved surfaces), which I wasn't originally going to do. When I do post that article, I will bump that one thread on GDNet, so watch there, not here.

As for the resource management articles, I don't know that I'm going to bother. The thing is, holes are starting to appear and I'm considering some redesigns. Depending on how extensive the refactoring is, I may kill the old articles and start over.