In my mind, there are three pressing issues that need addressing:

1. My implementation of stencil shadows is too slow.
2. We need to get static lights working with stencil shadows/bumpmapping.
3. We need to get moving entities working with stencil shadows/bumpmapping.

Number 1 is the key: without faster shadows, 2 and 3 would just bring frame rates to a halt. With this in mind I started looking over the code and thinking about how to speed things up a bit.

Firstly, I decided to 'start again'. My original coding style was 'don't touch id's code, just write around it'. This was fine for an introduction/example mod, which was the original idea, but I don't think it will work for further development, so it's best to start from scratch now rather than later. Of course, we won't be fully 'starting from scratch', as we can pull the useful bits from the current version. Since GLSL fragment shaders aren't supported on my computer, I also decided to do initial development without bumpmapping (i.e. using a simple, ambient, attenuated, fixed-function pass to render lighting effects). My computer does support GLSL vertex shaders, though, so using a shadow vertex extrusion shader is an obvious optimisation (although I'm not sure how much, if at all, it will improve performance). Bumpmapping should then be relatively straightforward to pull from the current version.

So, here is my list of optimisation ideas so far:

* The first optimisation involves generating what I call the Potentially Shadowing Set (PSS). The PSS for cluster X includes X's normal PVS and all 'leafs' marked by the PVSs corresponding to each leaf in X's PVS. PSSs should be generated offline by a utility like the Normal Map Generator program. A light source can only be illuminating visible objects if it is in the viewer's PSS. When drawing shadow faces, only those 'leafs' that are in both the viewer's PSS and the light source's PVS should be marked.
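To make the PSS idea concrete, here is a minimal sketch of the offline generation step: the PSS for cluster X is simply the union of the PVS of every cluster visible from X. The flat row-per-cluster bit-vector layout, sizes and function names below are my own assumptions for illustration; Quake 2's real visibility data is run-length compressed and would need decompressing first.

```c
#include <string.h>

#define MAX_CLUSTERS 64
#define ROW_BYTES ((MAX_CLUSTERS + 7) / 8)

static int bit_test(const unsigned char *row, int i)
{
    return (row[i >> 3] >> (i & 7)) & 1;
}

/* pvs: numclusters rows of ROW_BYTES bytes each, already decompressed.
 * pss_out: a single ROW_BYTES row receiving the PSS for cluster x. */
void BuildPSS(const unsigned char *pvs, int numclusters, int x,
              unsigned char *pss_out)
{
    int c, b;

    memset(pss_out, 0, ROW_BYTES);
    for (c = 0; c < numclusters; c++) {
        if (!bit_test(pvs + x * ROW_BYTES, c))
            continue;                     /* cluster c not visible from x */
        for (b = 0; b < ROW_BYTES; b++)   /* OR in c's whole PVS */
            pss_out[b] |= pvs[c * ROW_BYTES + b];
    }
}
```

At run time, marking shadow leafs would then be an AND of the viewer's precomputed PSS row with the light's PVS row.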
Also, only those surfaces that are front-facing to the light source need be rendered (the current version of the mod has to draw both front- and back-facing surfaces). This should cut down the number of shadow faces rendered quite significantly, although whether it results in a net performance boost or whether the cost of the algorithm cancels it out remains to be seen. Rendering of light effects should be carefully ordered so that PSSs and PVSs are decompressed a minimum number of times. Another advantage is that we won't get lighting artefacts caused by potential shadow casters being culled by the viewer's PVS cluster (the current version of the mod suffers from this problem). When we step through the BSP tree to mark visible nodes we won't need to perform a frustum cull, as all lights are point sources, but we can do simple distance culling based on the light's area-of-effect.

* Other optimisations follow directly from "Fast, Practical and Robust Shadows" by Mark Kilgard, Cass Everitt and others, available from the nVidia Developer website (http://developer.nvidia.com/page/home.html). Another good reference is Eric Lengyel's Gamasutra article "The Mechanics of Robust Stencil Shadows", available at http://www.gamasutra.com/features/20021011/lengyel_01.htm (you need a Gamasutra account to access it, but accounts are free). The optimisations discussed in these articles are: use of a shadow vertex extrusion shader (mentioned above); use of a scissor test to cull pixels definitely not illuminated by the light source; use of fast z-pass if the shadow caster is culled by the 'viewport occlusion pyramid'; and only rendering the sides and dark cap of the shadow volume if the viewer is in shadow and looking away from the light. Unfortunately, we can't take advantage of rendering just the 'silhouette edge' for the static world model, because the portion we render will not be 'closed'.
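The two cheap per-element tests mentioned above (front-facing to the light, and distance culling against the light's area-of-effect) could be sketched like this. The struct layouts mirror Quake 2's style but are declared here purely for illustration:

```c
typedef float vec3_t[3];

typedef struct {
    vec3_t normal;
    float  dist;      /* plane equation: normal . p = dist */
} cplane_t;

static float DotProduct3(const vec3_t a, const vec3_t b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* A surface can cast a shadow for this light only if the light origin
 * lies on the front side of the surface's plane. */
int SurfaceFacesLight(const cplane_t *plane, const vec3_t light_origin)
{
    return DotProduct3(plane->normal, light_origin) - plane->dist > 0.0f;
}

/* Cull a node whose bounding sphere lies entirely outside the light's
 * area-of-effect radius (no frustum test needed for point lights). */
int NodeInLightRange(const vec3_t node_center, float node_radius,
                     const vec3_t light_origin, float light_radius)
{
    vec3_t d = { node_center[0] - light_origin[0],
                 node_center[1] - light_origin[1],
                 node_center[2] - light_origin[2] };
    float reach = node_radius + light_radius;

    return DotProduct3(d, d) <= reach * reach;
}
```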
However, I do have an idea for an optimisation, discussed later, that is similar to this silhouette-edge method, although implementing it will be extremely difficult. Also, the 'depth bounds' optimisation is not worth immediate consideration, as it will not be available on most people's hardware (including mine). Two-sided stencil is an obvious and easy optimisation, but unfortunately it's not available on my system either. I'm hoping that the 'silhouette edge' optimisation will be applicable to moving entities, as I reckon most, if not all, of them will be closed. There are quite a few optimisations here. I think each one should be 'relatively' straightforward to implement, although I'm not saying it'll be 'a walk in the park'. By themselves, I don't think they'll make too much impact, but combined I reckon they could make a difference.

* Use index and vertex buffer objects to store mesh data in VRAM for faster rendering. Quake 2 makes use of compiled vertex arrays, which I don't know too much about yet, but I suspect using buffer objects will be better (perhaps the two can be used simultaneously for even greater performance?).

* It would be more efficient to consider connected, co-planar faces as a single amalgamated face for the purposes of shadow casting. This introduces the apparent problem of 'holes' in such faces, but it turns out that these are dealt with elegantly by the normal algorithm without change, because the 'winding order' of 'hole faces' is the opposite of that of the amalgamated shadow-casting face (i.e. they face the other way relative to the viewer). The resulting difference in sign cancels out the stencil increment from rendering the shadow volume in areas that should be lit. Determining co-planar faces should probably be done offline by a utility similar to the Normal Map Generator program.

* Once we get static lights working, we won't need lightmaps at all anymore.
As such, I'm working under that assumption and simply not rendering them.

* We can cache the results of certain per-node computations, such as a frustum-cull test, for use in subsequent passes. The current version already caches the results of the frustum-cull test, but there may be other tests that are worthwhile.

* I don't think this one is feasible, but I'll mention it for completeness. Consider a room that is a cube with a door that opens into another area, and assume this room contains a shadow-casting light source. It is quite obvious that the walls of the cube do not shadow anything inside the room. The walls obviously do shadow the area behind the door, although it is likely that only one of them shadows that area. Assume the door is a simple rectangle. In this case it would be far more efficient to render a single 'light volume' rather than a number of 'shadow volumes', and mark those fragments that are in light rather than in shadow. Going back to the inside of the room, only 'non-convex' faces can possibly cast shadows in the room (by 'convex' faces I mean the 3D equivalent of OpenGL's definition of a 'convex' polygon). This may simply be another view of the 'silhouette edge' idea, but it is clear that we could use it even for non-closed models. I haven't analysed this idea to completion yet; perhaps the combination of the room's surfaces and 'light-casting' surfaces must form a closed model for this to work. However, I suspect that determining these 'light-casting' surfaces will be an extremely difficult task, probably requiring a process similar to generating a BSP tree from raw data. Quake 2 maps define 'areas' connected by doors that can open and close (Quake 2 uses this information to cull large sections when such doors are closed). These 'areas' might prove useful in determining the 'light-casting' surfaces, but the idea should also be applied elsewhere to get the best out of it (i.e. for an 'around-the-corner' type of situation). Even if such an optimisation were implemented, the algorithm for determining 'light casters' would need to be executed on end-users' computers, which would require it to have a rather limited execution time (i.e. less than an hour). However, there is a possibility that this might give us the biggest performance boost of all. Also note that this is really only suited to indoor games (which, of course, Quake 2 is). I reckon this technique would be very compatible with a portal rendering system.

Those are my thoughts on optimisation so far. I have made a start, but I'm not too far along yet. Moving entities look like they're going to be a bit of a pain, but doable. As for static lights, I had no clue how to handle them until a while ago. Initially I assumed that the illumination from static lights was 'hard coded' into lightmaps, and determining static light properties from lightmaps was not something I wanted to think about. However, I happened to look at the end of a Quake 2 .bsp file in WordPad, on the off-chance I might spot something, when lo and behold I noticed that there is a list of 'entities' in plain text, including information about 'lights'. I'm hoping this is information about static lights saved from the original map file, in which case static lights should be really easy. The whole list is loaded by Quake 2 in "CMod_LoadEntityString". Based on my proposal about shadow casters and the PSS, we could store a static list/array of the static lights that occupy each leaf.

After all this has been done (if it ever gets done) I want to start looking at other interesting graphical effects, such as: transparent surfaces casting a 'stained-glass window' effect; soft shadows; animated water caustic textures; reflections on large water bodies (which cannot be handled by simple environment mapping); physically realistic water that moves in three dimensions and reacts to solid objects; and physically realistic explosions rendered with shaders.
Most of these are crazy pipe dreams, but who knows, maybe some of them will work.
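As a closing footnote, here is a sketch of scanning that plain-text entity list for 'light' entities. The minimal tokenising below is my own simplification for illustration, not Quake 2's actual parser; the real entity lump is a sequence of { "key" "value" ... } blocks, so a proper loader would walk key/value pairs and pull out "origin" and intensity as well.

```c
#include <string.h>

/* Count entities whose classname is exactly "light" in an entity string. */
int CountLightEntities(const char *ents)
{
    int count = 0;
    const char *p = ents;

    while ((p = strstr(p, "\"classname\"")) != NULL) {
        p += strlen("\"classname\"");
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;                               /* skip to the value token */
        if (strncmp(p, "\"light\"", 7) == 0)   /* exact match, so e.g. */
            count++;                           /* "light_flame1" is skipped */
    }
    return count;
}
```

A real loader would hang the parsed lights off the leafs via the PSS proposal above rather than just counting them.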