Element 61

Saturday, August 27, 2005

Update

I just wanted to let everyone know, in case you're waiting for part 3, that I'm still alive. A few "real life" things going on at the moment, you know. In any case, it'll probably be Monday before I get part 3 of the Q3 stuff up. But I promise it'll be a good entry.

Wednesday, August 24, 2005

Looking at the Quake 3 Source -- Part 2

I had previously said that I was planning to cover the rendering architecture in part 2. I even wrote up a draft. Problem is, it isn't any good. I tried to describe both the architecture and the code at once, and ended up doing a miserable job of both. So instead, what I'm going to do (presumably for part 3) is cover the high level rendering system, which has been publicly known for several years if you've worked with modding or even just artwork. After that, I'll delve into how that architecture is actually implemented underneath. In the meantime, I want to address some of the questions people have been asking, and also give a general overview of how the game flow is structured.

The actual entry point, WinMain, is in code/win32/win_main.c. The main loop is really very simple. It updates the current time, processes input, and then runs a frame. The actual frame logic starts from the Com_Frame function, which is at line 2635 of code/qcommon/common.c. There's a lot of misc stuff going on in this function, but if we cut out all the extras and keep just the important things, it would look like this (pseudocode):

void Com_Frame( void )
{
    if( fps > max_fps )
        Sleep();

    SV_Frame();

    if( !dedicated_server )
        CL_Frame();

    //performance analysis and stuff here
}
Simple, yes? First we execute the server frame (there is always a server). That function is defined in code/server/sv_main.c, line 751, and it's also rather simple. Similarly, CL_Frame is called if we're not a dedicated server (i.e. a client side exists). That function is in code/client/cl_main.c, line 1997. Again, it's really well commented, and very straightforward.

The client code is smeared between the "cgame" folder, which stands for client-game, and the "client" folder. The server code is entirely in the "server" folder. Everything that's used for making mods resides in the "game" folder.

Now, I'm going to stop there to avoid recursing infinitely into the engine, but here's a hint -- use Find In Files. If you're finding too many calls to a function and not the actual function definition, try searching for "void Foo" instead of just "Foo". If you're finding too many uses of a struct and not the actual definition, try "struct Foo_s" or "} Foo_t". The Quake code is really consistent this way, so a few clever alterations of the Find In Files string will quickly hunt down what you need. But best of all, VS 2k3 will usually be able to find the definition very quickly and easily. Just right click and hit "Go to definition" in the menu. (VC6 users will need to enable Browse Info and do a complete rebuild.)
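To make those search strings a bit more concrete, here's a small sketch of the declaration style they rely on. Foo_s, Foo_t and Foo_Reset are placeholder names I made up for illustration; they are not from the Quake 3 source.

```cpp
#include <cassert>

// Hypothetical example of the convention: the struct tag ends in _s and the
// typedef name ends in _t, so searching for "struct Foo_s" or "} Foo_t" lands
// on the definition rather than on one of its many uses.
typedef struct Foo_s {
    int   count;
    float scale;
} Foo_t;

// Searching for "void Foo_Reset" finds this definition; plain "Foo_Reset"
// would also match every call site.
void Foo_Reset( Foo_t *foo )
{
    assert( foo != 0 );
    foo->count = 0;
    foo->scale = 1.0f;
}
```

The return type in front of the name is what separates a definition from a call, which is why the search trick works so well on code that is written this consistently.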

A few people mentioned the Q3 VM, and the odd presence of a C compiler, lcc, in the source code. This is for QuakeScript, and from what I can figure so far it's a replacement for normal game DLLs. In other words, I have almost no idea what it's for. Give me some days (actually, a week or so) to finish dissecting the parts of the engine I'm most interested in, and then we'll move on to other things. I understand that the VM is a bit looming, given that there are references to it all over the place, but I want to have a thorough understanding of what it's doing before I start trying to explain it.

Anyway, that's quite all for tonight. Sorry about the short entry, as I did spend most of today taking apart the rendering code and finding out that it's a bit more complex than I originally thought. Tomorrow you can expect a solid, non-code oriented explanation of how the Quake 3 rendering subsystem works. After that, we'll dive in and take a look at the implementation. (And maybe we'll sneak cvars in there somewhere. Who knows?)

Tuesday, August 23, 2005

Looking at the Quake 3 Source -- Part 1

In case you're not aware (for example, you may have been living in a cave where the only internet is dial up access to a 3 month old backup of the google cache), the Quake 3 source is available for download, under the GPL. My initial impressions are good, given that when I looked at Q2 back in the day, I very nearly gave up hope of ever accomplishing anything. Things are for the most part well organized, categorized, and sensible. The id penchant for clumping a dozen header files into just one continues, but overall it's easy to find the code you're looking for. Hell, it took me 3 days to find the BSP code in Quake 2 because they had called it model_t or some such meaningless thing. I can see why Q3 was so popular for licensing, despite being in C. Is it perfect? Of course not. Naturally there are hacks here and there, and a few very weird design things...and the C versions of what would in C++ be inheritance and aggregation are hilarious. But it's not the horrible tangled mess we were all scared it could have been. Overall though, I think this code is going to go a lot farther than Q1 or Q2 source ever did. Compared to everything else out of id, this source is really quite nice. Most functions have documentation if it's not obvious what they do. All of the members of the major engine structs are well commented, for the most part. Now, there is still evidence of Carmack's "magic" at work. Long stretches of uncommented MMX assembly code. Really strange math code implementations at times, and occasionally stuff going on in the renderer that you have to stop and think about. Despite all that, the overall architecture is good. And I'll be poking around their architecture.

Before we go into the actual code, let's examine how Quake 3 is basically structured. No use looking at all the various structs if we don't know what they're for. So basically, every Quake and Quake based game since the beginning of time has revolved around the fundamental concept of an "entity". Everything in the game world is an entity. All the things you see -- the map, players, doors, items and weapons -- as well as plenty of things you don't see, are entities. Really, everything else that happens in the world is generated from the entities. All of the logic is controlled by interactions between the various entities. And when you're creating your own mods, the entities are what you'll really be working with the most. The core entity struct is the gentity_t, defined in code/game/g_local.h. I won't bother to replicate it here in its entirety, but let's take a look at what's in this thing. A quick glance through this struct will show that the lack of inheritance in C leaves its marks -- there are lots of things here that have no business being in a struct that represents every single object. That aside, I'll outline some of the most interesting members:

  • classname -- Roughly speaking, the type of the object. It's basically a string that tells us what this is -- a player, a part of the map, etc.
  • model, model2 -- The model (and alternate model) that this entity is rendered with. An invisible/non rendered entity wouldn't have a model associated with it.
  • flags -- This variable contains a list of flags bit-ORed together to tweak behaviors. God mode, for example, is set here.
  • physicsObject -- True if this is a movable object, false otherwise. The physics in Q3 is pretty damn simple, but there's great potential here (ODE or Novodex, somebody?).
  • parent -- There's a crude scene hierarchy here; entities can be attached to other entities.
  • speed, movedir -- This is our velocity. That's all we really need to update the object's position from frame to frame, since there's no real physics.
  • health -- Well, straightforward, really. Not just for players; it can be used to destroy any entity you want when its health reaches 0.
  • takedamage -- True if this is a thing that takes damage, dies, etc, false otherwise.
  • item -- When you pick up a quad damage or other item, it goes here. Q3 had a weak item system...well, almost no item system at all. This is another location that might be useful to mess with.
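If the bit-ORed flags member sounds abstract, here's a minimal sketch of the idea. The struct, the flag names and the flag values are all illustrative stand-ins of mine, not copied from g_local.h.

```cpp
#include <cassert>

// Illustrative flag bits; each flag gets its own bit so they can be combined.
enum ExampleFlags {
    EXAMPLE_FL_GODMODE  = 0x0010,
    EXAMPLE_FL_NOTARGET = 0x0020
};

// Cut-down stand-in for an entity: just the two fields this sketch needs.
struct FlagEntity {
    int flags;    // bit-ORed combination of ExampleFlags
    int health;
};

inline void SetFlag( FlagEntity& ent, int flag )   { ent.flags |= flag; }
inline void ClearFlag( FlagEntity& ent, int flag ) { ent.flags &= ~flag; }
inline bool HasFlag( const FlagEntity& ent, int flag ) { return ( ent.flags & flag ) != 0; }

// Damage is ignored while the god mode bit is set.
inline void ApplyDamage( FlagEntity& ent, int amount )
{
    assert( amount >= 0 );
    if( HasFlag( ent, EXAMPLE_FL_GODMODE ) )
        return;
    ent.health -= amount;
}
```

One integer can carry a few dozen independent on/off switches this way, which is presumably why the approach shows up so often in C code.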
Now I've skipped one very important part of the entity struct, the block of function pointers in the middle. These function pointers basically provide logic and AI for all of the entities, and provide a fairly simple but powerful and flexible way of creating simple or complex behaviors. Let's look at the different pointers:
  • reached -- Happens when a "mover" entity reaches the end of its path.
  • blocked -- Happens when an entity's path is blocked by something.
  • touch -- Happens when an entity comes in contact with another entity. If you were writing a contact grenade or a rocket, you could use this to create an explosion.
  • use -- Happens when the player presses the 'use' key on the entity. You could use this to create some sort of switch or button.
  • pain -- Happens when the entity takes damage but isn't killed.
  • die -- Happens when the entity gets killed.
  • think -- Keep reading.
The think function is by far the most interesting of all of the function pointers. It allows us a great deal of flexibility in controlling what's going on. For example, suppose we were writing a machine gun that could fire 5 rounds per second. We would write a MGFireThink function that created a bullet, then set the nextthink member to current time + 200. Then, 200ms from now, the engine would notice that it's time to think and call our MGFireThink again, allowing us to fire another bullet. Or suppose we were making a grenade that explodes after 5 seconds. We could set its think function to GrenadeIdleThink, and its nextthink to current time + 5000. Then, the next time the think happened we could set it to GrenadeExplodeThink. When that function got called, we would create an explosion.
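Here's a rough sketch of how that scheduling could work. It is not the engine's code: ThinkEntity is a cut-down stand-in for gentity_t, the current time is passed in as a parameter instead of being read from the game state, RunThink is my guess at what the engine side boils down to, and the grenade is simplified to a single think function.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for the entity struct: only what the think mechanism
// needs. Times are in milliseconds, as in the text.
struct ThinkEntity {
    int  nextthink;                                // when think should run; 0 = never
    void (*think)( ThinkEntity* self, int now );
    int  shotsFired;
    bool exploded;
};

// Machine gun at 5 rounds per second: fire, then ask to be called again in 200 ms.
void MGFireThink( ThinkEntity* self, int now )
{
    ++self->shotsFired;                            // stands in for spawning a bullet
    self->nextthink = now + 200;
}

// Grenade: runs once and schedules nothing further.
void GrenadeExplodeThink( ThinkEntity* self, int /*now*/ )
{
    self->exploded = true;                         // stands in for creating an explosion
}

// Per-frame check: call think once its time has come.
void RunThink( ThinkEntity* ent, int now )
{
    if( ent->nextthink <= 0 || ent->nextthink > now )
        return;
    ent->nextthink = 0;                            // clear first; think may reschedule itself
    assert( ent->think != NULL );
    ent->think( ent, now );
}
```

Clearing nextthink before the call is what makes one-shot behavior the default: a think function only runs again if it explicitly asks to.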

Anyway, I think that's quite enough for one blog post. Entities are the heart of Quake 3, and there's a lot to be gained by altering the base entity management. A lot of the code in code/game is concerned with working with entities; check out g_weapon.c for a lot of good examples of code that works with entities (specifically, weapon entities). Oh, and one last thing. If you're reading this, please, please leave a comment of some sort, preferably with suggestions of what to cover or what you did/didn't like. I'm a little unsure of where I'm going with this, and where people want me to go.

Wednesday, August 10, 2005

Resource Management System -- Part 2

In Part 1 of this series, we looked at the basic structure of our resource management system. Now, we're going to implement the core class of our system, the Handle. You might want to go back and review the responsibilities and behaviors of the handle class, which I discussed in part 1. Remember, the real power of this design is contained within the handle class. It's the object that is moved around between subsystems inside the engine, and it's the only thing clients should see. Let's jump right into the code for it:

#include <cassert>    //assert
#include <cstddef>    //std::size_t, NULL

template<typename Type> class ManagerBase;
template<typename Type> class Manager;

//defines a resource handle to work with
template<typename Type>
class Handle
{
private:
    friend class ManagerBase<Type>;
    friend class Manager<Type>;
    
    //internal id is just an integer (0 is reserved for invalid ids)
    typedef std::size_t resource_id;

    resource_id        m_Id;
    Manager<Type>*    m_Parent;

    //construct a valid handle

    Handle( resource_id Id, Manager<Type>* Parent ) : m_Id( Id ), m_Parent( Parent )
    {
        assert( m_Id != 0 );
        assert( m_Parent != NULL );
        m_Parent->AddRef( *this );
    }


public:
    //construct an invalid handle
    Handle()
    {
        m_Id = 0;
        m_Parent = NULL;
    }

    ~Handle()
    {
        if( m_Parent != NULL )
            m_Parent->Release( *this );
    }

    //copy a handle (works on valid and invalid handles)

    Handle( const Handle& Other ) : m_Id( Other.m_Id ), m_Parent( Other.m_Parent )
    {
        if( m_Parent != NULL )
        {
            m_Parent->AddRef( *this );
        }
    }

    //assign a handle (works on valid and invalid handles)

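    Handle& operator = ( const Handle& rhs )
    {
        //same resource (or self-assignment): nothing to do, and skipping the
        //Release/AddRef pair means the count never dips to zero in between
        if( m_Id == rhs.m_Id && m_Parent == rhs.m_Parent )
            return *this;

        if( m_Parent != NULL )
            m_Parent->Release( *this );

        m_Id = rhs.m_Id;
        m_Parent = rhs.m_Parent;

        if( m_Parent != NULL )
            m_Parent->AddRef( *this );

        return *this;
    }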

    bool operator < ( const Handle<Type>& rhs ) const { return m_Id < rhs.m_Id; }
    bool operator > ( const Handle<Type>& rhs ) const { return m_Id > rhs.m_Id; }
    bool operator == ( const Handle<Type>& rhs ) const { return m_Id == rhs.m_Id; }
    bool operator != ( const Handle<Type>& rhs ) const { return m_Id != rhs.m_Id; }
    
    bool Valid() const { return m_Id != 0; }
};
Ok, so let's take a look. We have two members. One is the array index of the resource this handle is attached to. The other is a pointer to the manager that we're part of. This pointer is mainly there because handles need to call home from time to time, and we wouldn't want to do anything nasty with globals or, dare I say it, singletons. Handles are a friend of the manager of the same type, because the manager will need to know our resource id later on. We don't want to reveal our resource id publicly; there's no reason anyone else should need it. You'll also notice that I've said that any handle with a resource id of 0 is invalid. This gives us a really easy way of checking whether a handle is invalid. (Strictly speaking the parent should always be NULL as well, but the redundancy can't hurt.)

You might notice that the comparison operators don't take the parent into account. I've assumed that two handles being compared always have the same parent. If this isn't the case in your code, you should extend the equality checks with '&& m_Parent == rhs.m_Parent', and have the ordering operators compare the parents as well.
Now, behavior. First of all, you should notice that the only constructor to create a valid handle is private. Only a manager can create a real handle. Everyone else can create an invalid handle, copy an existing handle, or assign from an existing handle. Handles can be compared; the comparison uses their internal ids, without exposing the ids to the client. I haven't included a full set of comparison operators for brevity's sake. The main reason for equality and inequality operators is so that we can figure out whether or not two handles refer to the same object. The other operators, operator < in particular, are there so that handles can be used in STL containers that rely on functors such as std::less<T>. And that sums up all of the functionality of the handle. Clients can't do much of anything with it, or modify it in any way.
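To show why operator < earns its place, here's a small sketch that uses a handle as a std::map key. ExampleHandle is a stripped-down stand-in I wrote for this purpose (an id and the comparisons, no reference counting, and a public constructor the real handle doesn't have); HandleNames and NameOf are likewise made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for Handle<Type>: just the id and the comparisons.
class ExampleHandle
{
    std::size_t m_Id;
public:
    explicit ExampleHandle( std::size_t Id = 0 ) : m_Id( Id ) { }
    bool operator < ( const ExampleHandle& rhs ) const { return m_Id < rhs.m_Id; }
    bool operator == ( const ExampleHandle& rhs ) const { return m_Id == rhs.m_Id; }
    bool Valid() const { return m_Id != 0; }
};

// std::map orders its keys with std::less, which calls operator <.
typedef std::map<ExampleHandle, std::string> HandleNames;

// Looks up the name stored for a valid handle, or "<unknown>" if there is none.
inline std::string NameOf( const HandleNames& names, const ExampleHandle& h )
{
    assert( h.Valid() );
    HandleNames::const_iterator it = names.find( h );
    return it != names.end() ? it->second : std::string( "<unknown>" );
}
```

The map never sees the id itself; it only ever asks the handles to compare themselves, so the id stays private.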

A word about managers. In the code above, there's a forward declare for a templated Manager and a templated ManagerBase. Both classes will be implemented in the next part. ManagerBase will carry functionality that is common to all managers. Manager will be specialized for each resource type, and will carry logic that is specific to that resource type. For the next part in this series, I'll be implementing a manager to work with images. It'll allow us to control when we load and unload images, as well as making sure that we never load an image more than once.

Monday, August 08, 2005

Resource Management System -- Part 1

This is the first in a series of posts to build a reference counted "resource" management system for your game/graphics engine. First of all, what defines a resource? For our purposes, there are several conditions that determine what can be used as a resource:

  • It can be used by multiple clients, and needs to be reference counted.
  • Most of the clients of these objects do not need raw pointers most of the time.
  • There should be a master list of all of these objects somewhere.
  • Even when this object is not used, we may want to defer destruction of it for any reason.
The more up-to-date of you with regards to C++ might be wondering why we can't simply dole out boost::shared_ptr objects everywhere, and maybe keep a list of them somewhere. This isn't a terrible approach, but it doesn't really provide the most coherent system ever. The only good way to defer destruction is to keep an extra shared_ptr within the manager system. And it doesn't make the reference count inherent to the object. (I am aware of the existence of intrusive pointers in boost, but not familiar with their usage.) Lastly, boost is huge. Just plain massive. Overkill, really. All things considered, we'll do it ourselves for now.

Here's a quick rundown of the system we're going to build. Each type of resource will be managed completely separately; we make no attempt to fuse all the resources together via base classes and RTTI casts. It's just not worth the effort. We will have handles to resources, which are the main currency handed around when working with resources. Finally there will be the manager, which maintains the list of resources, deals with resource creation and destruction, and allows access to raw pointers if desired. To start with, we'll write up a base class that allows an object to be reference counted.
class RefCountedBase
{
private:
    unsigned int        m_RefCount;

public:
    RefCountedBase() : m_RefCount( 0 )
    { }

    unsigned int RefCount() const { return m_RefCount; }

    unsigned int AddRef()
    {
        return ++m_RefCount;
    }

    unsigned int Release()
    {
        assert( m_RefCount > 0 );
        return --m_RefCount;
    }
};
There's really nothing to explain here; any class that inherits from this will have a reference count associated with it that can be increased, decreased, and accessed. If you're familiar with DirectX or COM, you'll no doubt notice the similarity to IUnknown. Indeed, usage is identical, and this base class is totally independent; you can use it on pretty much anything that needs a reference count. Of course, the main problem here is that we need to explicitly manage the reference count, which is a pain. We need to address this, and the best way is an object that behaves similarly to a shared_ptr. This object will be our resource handle. Let's summarize the basic behavior of these handles:
  • Any user constructed handle (that is, a new handle not generated from inside a manager) is always invalid.
  • Invalid handles don't affect any reference counts, since they have no object to affect.
  • The handle should decrement the reference count of its resource when destroyed.
  • The copy constructor should increment the reference count of the resource.
  • Assignment should decrement the current reference count, assign, and increment the reference count of the new resource.
  • A handle does not include a direct pointer to its resource. This helps discourage access to the resource, unless you really need it.
  • Handles can be compared with == and !=.
  • Handles can be used for any reference counted type.
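As a quick illustration of the counting rules in that list, here's a toy sketch. It is not the handle we're going to build: ToyRef holds a direct pointer and has a public constructor, both of which the real handle deliberately avoids, and Counted simply repeats the shape of RefCountedBase so the example stands alone.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as RefCountedBase, repeated so the sketch is self-contained.
class Counted
{
    unsigned int m_RefCount;
public:
    Counted() : m_RefCount( 0 ) { }
    unsigned int RefCount() const { return m_RefCount; }
    unsigned int AddRef() { return ++m_RefCount; }
    unsigned int Release() { assert( m_RefCount > 0 ); return --m_RefCount; }
};

// Toy reference that applies the counting rules automatically.
class ToyRef
{
    Counted* m_Target;   // NULL means "invalid"
public:
    ToyRef() : m_Target( NULL ) { }
    explicit ToyRef( Counted* Target ) : m_Target( Target ) { if( m_Target ) m_Target->AddRef(); }
    ToyRef( const ToyRef& Other ) : m_Target( Other.m_Target ) { if( m_Target ) m_Target->AddRef(); }
    ~ToyRef() { if( m_Target ) m_Target->Release(); }

    ToyRef& operator = ( const ToyRef& rhs )
    {
        if( rhs.m_Target ) rhs.m_Target->AddRef();   // increment the new one first: safe for self-assignment
        if( m_Target ) m_Target->Release();
        m_Target = rhs.m_Target;
        return *this;
    }

    bool operator == ( const ToyRef& rhs ) const { return m_Target == rhs.m_Target; }
    bool operator != ( const ToyRef& rhs ) const { return m_Target != rhs.m_Target; }
};
```

Once the rules live in constructors, the destructor and assignment, nobody has to remember to call AddRef or Release by hand.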
Since we are managing every type of resource independently, we do not require an inheritance hierarchy for handles to follow. Instead, our handle will be a templated class. Amongst other things, this means that you can never mix handles; a handle to an image can't be used as a handle to a vertex buffer. The next question is, what member variables does a handle have? Before answering that question, let's examine what a resource manager needs to do.

A resource manager has one really important responsibility -- it's the only one which has actual pointers to the resources in question. There is no other way to get access to these pointers. If you're not part of the subsystem which contains the manager, you can never access those pointers. As a result, all managers have three main responsibilities:
  • Maintain the list of resources.
  • Provide a way to flush unused resources out of memory.
  • Receive the requests from handles to alter reference counts.
There are a lot of ways we could keep a list of objects, but ideally we want something very efficient. At first, you might be tempted to use an STL container such as a std::set, and store iterators in handles. Unfortunately, if we do that, there are all sorts of nasty catches with iterators becoming invalidated. Instead, what we'll do is store the resource list in a std::vector, and keep an array index in the handles. When a resource is flushed out of memory, we delete its pointer, set it to NULL, but leave that NULL as a blank slot in the resource list. Additionally, for efficiency's sake we add the index of that blank slot to a separate queue. When we load a new resource, we can take a slot off this queue, or if the queue is empty, we can push_back onto the resource list. With this design, all of the operations relating to resources are constant time (amortized, in the case of push_back).
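Here's a minimal sketch of that slot scheme, ahead of the real manager. SlotList and ExampleImage are names I invented for the example, and reserving index 0 as "invalid" is an assumption on my part about how the indices will be used.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical resource type, only here to have something to store.
struct ExampleImage { int width, height; };

class SlotList
{
    std::vector<ExampleImage*> m_Slots;       // index 0 is reserved as "invalid"
    std::queue<std::size_t>    m_FreeSlots;   // indices of blank (NULL) slots

    SlotList( const SlotList& );              // non-copyable: owns raw pointers
    SlotList& operator = ( const SlotList& );
public:
    SlotList() : m_Slots( 1, static_cast<ExampleImage*>( NULL ) ) { }
    ~SlotList() { for( std::size_t i = 0; i < m_Slots.size(); ++i ) delete m_Slots[i]; }

    // Takes ownership of Res and returns its index; reuses a blank slot if there is one.
    std::size_t Insert( ExampleImage* Res )
    {
        assert( Res != NULL );
        if( !m_FreeSlots.empty() )
        {
            std::size_t Index = m_FreeSlots.front();
            m_FreeSlots.pop();
            m_Slots[Index] = Res;
            return Index;
        }
        m_Slots.push_back( Res );
        return m_Slots.size() - 1;
    }

    // Deletes the resource, leaves a NULL in its place and remembers the blank slot.
    void Flush( std::size_t Index )
    {
        assert( Index != 0 && Index < m_Slots.size() && m_Slots[Index] != NULL );
        delete m_Slots[Index];
        m_Slots[Index] = NULL;
        m_FreeSlots.push( Index );
    }

    // Returns the stored pointer, or NULL for a blank or out-of-range index.
    ExampleImage* Get( std::size_t Index ) const { return Index < m_Slots.size() ? m_Slots[Index] : NULL; }
};
```

Because a flushed slot is never erased from the vector, every index that is still in use keeps pointing at the same resource, which is exactly the property the iterator approach couldn't give us.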

That really just about covers it for design. For part 2, we'll cover the implementation of handles and managers, and tie everything together.

Sunday, August 07, 2005

Fixing endian issues -- the quick and easy way

Ok, so the real reason for this blog -- technical discussion. It's Sunday morning, and I'm bored, so I thought we'd start with something mundane and ordinary. Suppose you want your game to work on both Windows and OSX (as it stands now, not the crazy Intel-OSX boxes that will be public in a few years). One of the biggest issues with doing this, apart from the usual problems of OS specific code, non standard code, etc, is the endianness of the processor architecture. I wrote on this topic once before, looking at how these issues were resolved in the Q2 source...but as many people pointed out, the Q2 source is frickin insane. Plus, that was pure C -- we're going to be working in proper C++ here.

The actual origin of the term "endian" is a funny story. In Gulliver's Travels by Jonathan Swift, one of the places Gulliver visits has two groups of people who are constantly fighting. One group believes that a hard boiled egg should be eaten from the big, round end (the "big endians") and the other group believes that it should be eaten from the small, pointy end (the "little endians"). Endian no longer has anything to do with hard boiled eggs, but in many ways, the essence of the story (two groups fighting over a completely pointless subject) still remains.

Suppose we start with an unsigned 2 byte (16 bit) long number; we'll use 43707. If we look at the hexadecimal version of 43707, it's 0xAABB. Now, hexadecimal notation is convenient because it neatly splits up the number into its component bytes. One byte is 'AA' and the other byte is 'BB'. But how would this number look in the computer's memory (not hard drive space, just regular memory)? Well, the most obvious way to keep this number in memory would be like this:

| AA | BB |
The first byte is the "high" byte of the number, and the second one is the "low" byte of the number. High and low refer to the value of the byte in the number; here, AA is high because it represents the higher digits of the number. This order of keeping things is called MSB (most significant byte) or big endian. The most popular processors that use big endian are the PPC family, used by Macs. This family includes the G3, G4, and G5 that you see in most Macs nowadays.

So what is little endian, then? Well, a little endian version of 0xAABB looks like this in memory:
| BB | AA |
Notice that it's the reverse of the other one. This is called LSB (least significant byte) or little endian. There are a lot of processors that use little endian, but the most well known are the x86 family, which includes the entire Pentium and Athlon lines of chips, as well as most other Intel and AMD chips. The actual reason for using little endian is a question of CPU architecture and outside the scope of this article; suffice to say that little endian made compatibility with earlier 8 bit processors easier when 16-bit processors came out, and 32-bit processors kept the trend.
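If you'd like to see which of the two layouts your own machine uses, here's a small sketch that copies the bytes of 0xAABB out of memory and looks at them. BytesInMemory and IsLittleEndian are names I made up, and the code assumes short is 16 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies the in-memory bytes of a 16 bit value into Out, lowest address first.
inline void BytesInMemory( unsigned short Value, unsigned char Out[2] )
{
    assert( sizeof(Value) == 2 );   // assumption: short is 16 bits on this platform
    std::memcpy( Out, &Value, 2 );
}

// True if the machine running this code stores the low byte first.
inline bool IsLittleEndian()
{
    unsigned char Bytes[2];
    BytesInMemory( 0xAABB, Bytes );
    return Bytes[0] == 0xBB;
}
```

On an x86 box the first byte should come out as BB, and on a PPC Mac it should be AA, matching the two diagrams above.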

It gets a little more complicated than that. If you have a 4 byte (32 bit) long number, it's completely backwards, not switched every two bytes. Floating point numbers are also this way, and much to the chagrin of some people, you can't byte-shift them into the right order. This means that you can't arbitrarily change the endian of a file; you have to know what data is in it and what order it's in. When you write an int to a file, it stays in the processor's endian. The only good news is that if you're reading raw bytes that are only 8 bits at a time, you don't have to worry about endians.

Now, we need to deal with this problem. The first thing we need is a way to reverse the bytes of a given variable. We could write functions SwapInt, SwapShort, SwapFloat, etc, but that's hardly good C++. So I present to you the magic of templates and the standard library:
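#include <algorithm>    //std::reverse_copy

namespace Cpu
{
    //returns a copy of Obj with its bytes in reverse order
    template<typename Type>
    Type ByteSwap( const Type& Obj )
    {
        Type NewVal;
        const char* Src = reinterpret_cast<const char*>( &Obj );
        std::reverse_copy( Src, Src + sizeof(Obj), reinterpret_cast<char*>( &NewVal ) );
        return NewVal;
    }
}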
The beauty is entirely in the simplicity. We pretend the value is a byte array, read it backwards with the help of a standard library function, and return the new value. Even better, it's well suited to optimization. Now, the next thing to do is automate calling this function, so that we never really need to call it explicitly. We'll fall back on old style macro magic for this one:
#ifdef CPU_BIG_ENDIAN
#    define LittleEndian(x)    Cpu::ByteSwap(x)
#    define BigEndian(x)        (x)
#else
#    define LittleEndian(x)    (x)
#    define BigEndian(x)        Cpu::ByteSwap(x)
#endif
Usage is really quite simple. Whenever we are reading some value from file, we know what endian the file was written in. We simply wrap the read in the macro for the same endian. So if we're reading a little endian file, it goes something like this:
MyFloat = LittleEndian( FileReader->Read<float>() );
MyShort = LittleEndian( FileReader->Read<short>() );
MyInt = LittleEndian( FileReader->Read<int>() );
And it's as simple as that.
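For anyone who wants to try this end to end, here's a self-contained sketch. BufferReader is a hypothetical stand-in for FileReader that reads from memory instead of a file. I've also assumed that CPU_BIG_ENDIAN normally comes from your build settings; as a fallback the sketch derives it from the __BYTE_ORDER__ macro that GCC and Clang predefine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: CPU_BIG_ENDIAN normally comes from the build system. As a fallback,
// derive it from the __BYTE_ORDER__ macro that GCC and Clang predefine.
#if !defined(CPU_BIG_ENDIAN) && defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#    if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#        define CPU_BIG_ENDIAN
#    endif
#endif

namespace Cpu
{
    // Returns a copy of Obj with its bytes in reverse order.
    template<typename Type>
    Type ByteSwap( const Type& Obj )
    {
        Type NewVal;
        const char* Src = reinterpret_cast<const char*>( &Obj );
        std::reverse_copy( Src, Src + sizeof(Obj), reinterpret_cast<char*>( &NewVal ) );
        return NewVal;
    }
}

#ifdef CPU_BIG_ENDIAN
#    define LittleEndian(x)    Cpu::ByteSwap(x)
#    define BigEndian(x)       (x)
#else
#    define LittleEndian(x)    (x)
#    define BigEndian(x)       Cpu::ByteSwap(x)
#endif

// Hypothetical reader over an in-memory buffer, standing in for FileReader.
class BufferReader
{
    const unsigned char* m_Cur;
    const unsigned char* m_End;
public:
    BufferReader( const unsigned char* Data, std::size_t Size ) : m_Cur( Data ), m_End( Data + Size ) { }

    // Returns the next sizeof(Type) bytes exactly as stored, in file order.
    template<typename Type>
    Type Read()
    {
        assert( static_cast<std::size_t>( m_End - m_Cur ) >= sizeof(Type) );
        Type Val;
        std::memcpy( &Val, m_Cur, sizeof(Type) );
        m_Cur += sizeof(Type);
        return Val;
    }
};
```

Given the bytes BB AA, LittleEndian( Reader.Read<unsigned short>() ) should come out as 0xAABB on either kind of machine, because the macro for the file's byte order only swaps when the CPU disagrees with the file.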

Saturday, August 06, 2005

Is Microsoft trying to kill OpenGL?

First of all, huzzah for the grand opening of Element 61 blog. Yay. Now that that's out of the way, on to the good stuff.

It broke recently that Microsoft would not be supporting OpenGL in Windows Vista (codename 'Longhorn') or some such thing. A lot of information is still missing, but this is about as much as we have to go on in terms of facts:

This information came from the OpenGL BOF held at Siggraph 2005 in LA this last Wednesday evening. This was confirmed at the BOF by NVIDIA, ATI and us (3Dlabs).

As soon as an ICD is loaded the composited desktop is turned off on Windows Vista. If you want the composited desktop Aeroglass experience, you will need to make your application go through Microsoft's OpenGL implementation, which is layered on top of DirectX. As pointed out earlier, this layering can have performance implications. Their implementation supports OpenGL version 1.4 only, without extension support.

We believe it possible to provide an ICD with full composited desktop support while adhering to the stability and security requirements in Windows Vista. But we need Microsoft's help in doing so.

Therefore, as mentioned before, please let your contact in the ISV or IHV or OEM community know how you feel about this and spread the word.

For some more information, you can browse these Microsoft Winhec slides:

Windows Graphics Overview [WinHEC 2005; 171 KB]

Advances in Display and Composition Architecture for Windows [WinHEC 2005; 422 KB]

Regards,
Barthold
3Dlabs
So that's it. Everything else we've heard so far isn't credible. There's a "50%" number being thrown around a lot as the speed penalty we're going to see for OpenGL. As far as I and several others can tell, this number is bullshit, pure and simple. But let's take a closer look at what exactly is going on here.
As soon as an ICD is loaded the composited desktop is turned off on Windows Vista. If you want the composited desktop Aeroglass experience, you will need to make your application go through Microsoft's OpenGL implementation, which is layered on top of DirectX. As pointed out earlier, this layering can have performance implications. Their implementation supports OpenGL version 1.4 only, without extension support.
This paragraph has all the good stuff. An ICD, for those of you who don't know, is an Installable Client Driver. In other words, it's what you get when you go to nVidia or ATI's site and download their graphics drivers. So when one of these drivers is loaded in order to run an OpenGL application, as things stand, Aeroglass will be deactivated. Your desktop will revert to Windows XP mode.
This has no implications for full screen OpenGL applications on a single monitor!
So, when does it matter?
  • Multiple monitor setups
  • Windowed mode applications
I'm going to basically assume that the multiple monitor setups are really just a special case of windowed mode applications. As for windowed OpenGL applications, what is there? In short, everything that isn't a game. Game editors, modelers, CAD applications, scientific and engineering software, etc. are all in this class. So when one of these applications starts, Windows Vista has two choices about what to do:
  • Disable Aeroglass, load an ICD, and run the OpenGL app as normal.
  • Use the slow(er) MS implementation of OpenGL built on Direct3D and keep Aeroglass running as normal.
Personally, I have no objections to the first option. The second option would mean that Aeroglass would continue to occupy texture memory and other GPU resources, which would probably hurt the performance of my OpenGL application. The only benefit of the second option is that the 3D display acceleration is maintained, and I don't see this as a huge problem. Apparently some people do -- some of them are fairly intelligent and I trust their opinions. It's definitely a good idea to be pressuring Microsoft to come up with a more agreeable solution. But let's keep things in proportion, shall we? Try not to listen to the slashdot crowd -- OpenGL is not dead in Vista. But it is without a doubt in danger. We can only hope that the IHVs (nVidia in particular) can force Microsoft to remedy these problems. It's still a year or so to the official Vista release, and I have no doubt that MS can fix the problems.

By the way, everybody, and I mean everybody, should read the two PowerPoint presentations linked above in the quoted post.