In my mind, there are three pressing issues that need addressing:

1. My implementation of stencil shadows is too slow.
2. We need to get static lights working with stencil shadows/bumpmapping.
3. We need to get moving entities working with stencil shadows/bumpmapping.

Number 1 is the key: without faster shadows, 2 and 3 would just bring frame rates to a halt. With this in mind I started looking over the code and thinking about how to speed things up a bit.

Firstly, I decided to 'start again'. My original coding style was 'don't touch id's code, just write around it'. This was fine for an introduction/example mod, which was the original idea, but I don't think it will work for further development, so it's best to start from scratch now rather than later. Of course, we won't be fully 'starting from scratch', as we can pull the useful bits from the current version. Since GLSL fragment shaders aren't supported on my computer, I also decided to do initial development without bumpmapping (i.e. using a simple, ambient, attenuated, fixed-function pass to render lighting effects). My computer does support GLSL vertex shaders, though, so using a shadow vertex extrusion shader is an obvious optimisation (although I'm not sure how much, if at all, it will improve performance). Bumpmapping should then be relatively straightforward to pull from the current version.

So, here is my list of optimisation ideas so far:

* The first optimisation involves generating what I call the Potentially Shadowing Set (PSS). The PSS for cluster X includes X's normal PVS and all 'leafs' marked by the PVSs corresponding to each leaf in X's PVS. PSSs should be generated offline by a utility like the Normal Map Generator program. A light source can only be illuminating visible objects if it is in the viewer's PSS. When drawing shadow faces, only those 'leafs' that are in both the viewer's PSS and the light source's PVS should be marked.
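To make the PSS idea concrete, here is a minimal sketch of the offline generation step: the PSS for cluster X is simply the union of the PVS of every cluster visible from X. The flat row-per-cluster bit-vector layout, sizes and function names below are my own assumptions for illustration; Quake 2's real visibility data is run-length compressed and would need decompressing first.

```c
#include <string.h>

#define MAX_CLUSTERS 64
#define ROW_BYTES ((MAX_CLUSTERS + 7) / 8)

static int bit_test(const unsigned char *row, int i)
{
    return (row[i >> 3] >> (i & 7)) & 1;
}

/* pvs: numclusters rows of ROW_BYTES bytes each, already decompressed.
 * pss_out: a single ROW_BYTES row receiving the PSS for cluster x. */
void BuildPSS(const unsigned char *pvs, int numclusters, int x,
              unsigned char *pss_out)
{
    int c, b;

    memset(pss_out, 0, ROW_BYTES);
    for (c = 0; c < numclusters; c++) {
        if (!bit_test(pvs + x * ROW_BYTES, c))
            continue;                     /* cluster c not visible from x */
        for (b = 0; b < ROW_BYTES; b++)   /* OR in c's whole PVS */
            pss_out[b] |= pvs[c * ROW_BYTES + b];
    }
}
```

At run time, marking shadow leafs would then be an AND of the viewer's precomputed PSS row with the light's PVS row.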
Also, only those surfaces that are front-facing to the light source need be rendered (the current version of the mod has to draw both front- and back-facing surfaces). This should cut down the number of shadow faces rendered quite significantly, although whether it results in a net performance boost or whether the cost of the algorithm cancels it out remains to be seen. Rendering of light effects should be carefully ordered so that PSSs and PVSs are decompressed a minimum number of times. Another advantage is that we won't get lighting artefacts caused by potential shadow casters being culled by the viewer's PVS cluster (the current version of the mod suffers from this problem). When we step through the BSP tree to mark visible nodes we won't need to perform a frustum cull, as all lights are point sources, but we can do simple distance culling based on the light's area-of-effect.

* Other optimisations follow directly from "Fast, Practical and Robust Shadows" by Mark Kilgard, Cass Everitt and others, available from the nVidia Developer website (http://developer.nvidia.com/page/home.html). Another good reference is Eric Lengyel's Gamasutra article "The Mechanics of Robust Stencil Shadows", available at http://www.gamasutra.com/features/20021011/lengyel_01.htm (you need a Gamasutra account to access it, but accounts are free). The optimisations discussed in these articles are: use of a shadow vertex extrusion shader (mentioned above); use of a scissor test to cull pixels definitely not illuminated by the light source; use of fast z-pass if the shadow caster is culled by the 'viewport occlusion pyramid'; and only rendering the sides and dark cap of the shadow volume if the viewer is in shadow and looking away from the light. Unfortunately, we can't take advantage of rendering just the 'silhouette edge' for the static world model, because the portion we render will not be 'closed'.
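The two cheap per-element tests mentioned above (front-facing to the light, and distance culling against the light's area-of-effect) could be sketched like this. The struct layouts mirror Quake 2's style but are declared here purely for illustration:

```c
typedef float vec3_t[3];

typedef struct {
    vec3_t normal;
    float  dist;      /* plane equation: normal . p = dist */
} cplane_t;

static float DotProduct3(const vec3_t a, const vec3_t b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* A surface can cast a shadow for this light only if the light origin
 * lies on the front side of the surface's plane. */
int SurfaceFacesLight(const cplane_t *plane, const vec3_t light_origin)
{
    return DotProduct3(plane->normal, light_origin) - plane->dist > 0.0f;
}

/* Cull a node whose bounding sphere lies entirely outside the light's
 * area-of-effect radius (no frustum test needed for point lights). */
int NodeInLightRange(const vec3_t node_center, float node_radius,
                     const vec3_t light_origin, float light_radius)
{
    vec3_t d = { node_center[0] - light_origin[0],
                 node_center[1] - light_origin[1],
                 node_center[2] - light_origin[2] };
    float reach = node_radius + light_radius;

    return DotProduct3(d, d) <= reach * reach;
}
```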
However, I do have an idea for an optimisation, discussed later, that is similar to this silhouette-edge method, although implementing it will be extremely difficult. Also, the 'depth bounds' optimisation is not worth immediate consideration, as it will not be available on most people's hardware (including mine). Two-sided stencil is an obvious and easy optimisation, but unfortunately it's not available on my system either. I'm hoping that the 'silhouette edge' optimisation will be applicable to moving entities, as I reckon most, if not all, of them will be closed. There are quite a few optimisations here. I think each one should be 'relatively' straightforward to implement, although I'm not saying it'll be 'a walk in the park'. By themselves, I don't think they'll make too much impact, but combined I reckon they could make a difference.

* Use index and vertex buffer objects to store mesh data in VRAM for faster rendering. Quake 2 makes use of compiled vertex arrays, which I don't know too much about yet, but I suspect using buffer objects will be better (perhaps the two can be used simultaneously for even greater performance?).

* It would be more efficient to consider connected, co-planar faces as a single amalgamated face for the purposes of shadow casting. This introduces the apparent problem of 'holes' in such faces, but it turns out that these are dealt with elegantly by the normal algorithm without change, because the 'winding order' of 'hole faces' is the opposite of that of the amalgamated shadow-casting face (i.e. they face the other way relative to the viewer). The resulting difference in sign cancels out the stencil increment from rendering the shadow volume in areas that should be lit. Determining co-planar faces should probably be done offline by a utility similar to the Normal Map Generator program.

* Once we get static lights working, we won't need lightmaps at all anymore.
As such, I'm working under that assumption and simply not rendering them.

* We can cache the results of certain per-node computations, such as a frustum-cull test, for use in subsequent passes. The current version already caches the results of the frustum-cull test, but there may be other tests that are worthwhile.

* I don't think this one is feasible, but I'll mention it for completeness. Consider a room that is a cube with a door that opens into another area, and assume this room contains a shadow-casting light source. It is quite obvious that the walls of the cube do not shadow anything inside the room. The walls obviously do shadow the area behind the door, although it is likely that only one of them shadows that area. Assume the door is a simple rectangle. In this case it would be far more efficient to render a single 'light volume' rather than a number of 'shadow volumes', and mark those fragments that are in light rather than in shadow. Going back to the inside of the room, only 'non-convex' faces can possibly cast shadows in the room (by 'convex' faces I mean the 3D equivalent of OpenGL's definition of a 'convex' polygon). This may simply be another view of the 'silhouette edge' idea, but it is clear that we could use it even for non-closed models. I haven't analysed this idea to completion yet; perhaps the combination of the room's surfaces and 'light-casting' surfaces must form a closed model for this to work. However, I suspect that determining these 'light-casting' surfaces will be an extremely difficult task, probably requiring a process similar to generating a BSP tree from raw data. Quake 2 maps define 'areas' connected by doors that can open and close (Quake 2 uses this information to cull large sections when such doors are closed). These 'areas' might prove useful in determining the 'light-casting' surfaces, but the idea should also be applied elsewhere to get the best out of it (i.e. for an 'around-the-corner' type of situation). Even if such an optimisation were implemented, the algorithm for determining 'light casters' would need to be executed on end-users' computers, which would require it to have a rather limited execution time (i.e. less than an hour). However, there is a possibility that this might give us the biggest performance boost of all. Also note that this is really only suited to indoor games (which, of course, Quake 2 is). I reckon this technique would be very compatible with a portal rendering system.

Those are my thoughts on optimisation so far. I have made a start, but I'm not too far along yet. Moving entities look like they're going to be a bit of a pain, but doable. As for static lights, I had no clue how to handle them until a while ago. Initially I assumed that the illumination from static lights was 'hard coded' into lightmaps, and determining static light properties from lightmaps was not something I wanted to think about. However, I happened to look at the end of a Quake 2 .bsp file in WordPad, on the off-chance I might spot something, when lo and behold I noticed that there is a list of 'entities' in plain text, including information about 'lights'. I'm hoping this is information about static lights saved from the original map file, in which case static lights should be really easy. The whole list is loaded by Quake 2 in "CMod_LoadEntityString". Based on my proposal about shadow casters and the PSS, we could store a static list/array of the static lights that occupy each leaf.

After all this has been done (if it ever gets done) I want to start looking at other interesting graphical effects, such as: transparent surfaces casting a 'stained-glass window' effect; soft shadows; animated water caustic textures; reflections on large water bodies (which cannot be handled by simple environment mapping); physically realistic water that moves in three dimensions and reacts to solid objects; and physically realistic explosions rendered with shaders.
Most of these are crazy pipe dreams, but who knows, maybe some of them will work.
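As a closing footnote, here is a sketch of scanning that plain-text entity list for 'light' entities. The minimal tokenising below is my own simplification for illustration, not Quake 2's actual parser; the real entity lump is a sequence of { "key" "value" ... } blocks, so a proper loader would walk key/value pairs and pull out "origin" and intensity as well.

```c
#include <string.h>

/* Count entities whose classname is exactly "light" in an entity string. */
int CountLightEntities(const char *ents)
{
    int count = 0;
    const char *p = ents;

    while ((p = strstr(p, "\"classname\"")) != NULL) {
        p += strlen("\"classname\"");
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;                               /* skip to the value token */
        if (strncmp(p, "\"light\"", 7) == 0)   /* exact match, so e.g. */
            count++;                           /* "light_flame1" is skipped */
    }
    return count;
}
```

A real loader would hang the parsed lights off the leafs via the PSS proposal above rather than just counting them.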