Tuesday, April 15, 2008

Geometry Can Be Abstract, Too

When I first created the geometry loader for the materials I explained, I assumed there was a single geometry source that would suit all platforms and materials, and every material would perform any conversions it needed on the source geometry.

As an example, a material that renders multiple instances of a model using shader constants would require creating multiple copies of the input geometry and inserting instance indices as an additional vertex component in the vertex stream. But if you had hardware support for instancing, the geometry stream for the model would be exactly the same as the source one.
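As a rough sketch of the non-hardware path, the replication step could look like the following. It assumes a flat array of floats with a fixed number of floats per vertex; the names `InstancedGeometry` and `ReplicateForInstancing` are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the replicated geometry.
struct InstancedGeometry
{
    std::vector<float>    vertices;   // (floatsPerVertex + 1) floats per vertex
    std::vector<unsigned> indices;
};

// Builds 'instanceCount' copies of the source geometry and appends the
// instance index to every vertex as one extra float component.
// Assumes floatsPerVertex > 0.
InstancedGeometry ReplicateForInstancing( const std::vector<float>& srcVertices,
                                          const std::vector<unsigned>& srcIndices,
                                          std::size_t floatsPerVertex,
                                          unsigned instanceCount )
{
    InstancedGeometry out;
    const std::size_t vertexCount = srcVertices.size() / floatsPerVertex;

    for ( unsigned instance = 0; instance < instanceCount; ++instance )
    {
        // Copy every vertex and tag it with the instance it belongs to.
        for ( std::size_t v = 0; v < vertexCount; ++v )
        {
            const float* src = &srcVertices[ v * floatsPerVertex ];
            out.vertices.insert( out.vertices.end(), src, src + floatsPerVertex );
            out.vertices.push_back( static_cast<float>( instance ) );
        }

        // Indices of each copy are offset by the vertices of the previous copies.
        const unsigned baseVertex = instance * static_cast<unsigned>( vertexCount );
        for ( unsigned index : srcIndices )
            out.indices.push_back( baseVertex + index );
    }
    return out;
}
```

The vertex shader would then read the extra component to pick the per-instance constants, which is the part that hardware instancing makes unnecessary.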

But there could be more complex operations performed on a geometry stream, e.g. merging several meshes together, adding or removing components, tessellating higher-order primitives, etc. Therefore, what we are interested in is encapsulating in a geometry class the operations performed on any geometry data source, and letting the class perform them behind the scenes, without the client code having to deal with the specifics of geometry processing.

In order to do this, I've created a base structure called a Mesh, that is a container for both the raw data and its metadescription: vertex components, submeshes, etc. It goes like:

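class Mesh
{
protected:
    uint nVertices;     // number of vertices in pVertices
    uint nIndices;      // number of indices in pIndices
    uint nStreams;      // number of entries in pStreams
    uint nBatches;      // number of entries in pBatches
    Stream* pStreams;   // vertex components
    Batch* pBatches;    // mesh subdivisions (submeshes)
    float* pVertices;   // raw vertex data
    uint* pIndices;     // raw index data
};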

with the Stream and Batch structures describing the vertex components and mesh subdivisions respectively (I've taken these terms from Emil “Humus” Persson's demo framework).
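The two structures are not reproduced in this post. Purely as an illustration, and with every field name below being an assumption rather than the actual layout used here or in Humus' framework, they might look something like this:

```cpp
typedef unsigned int uint;

// Hypothetical layout: one Stream per vertex component.
struct Stream
{
    const char* name;        // e.g. "position", "normal", "texcoord0"
    uint        nComponents; // floats per vertex for this component
};

// Hypothetical layout: one Batch per submesh, expressed as ranges
// inside the shared vertex and index arrays.
struct Batch
{
    uint startVertex;
    uint nVertices;
    uint startIndex;
    uint nIndices;
};

// Number of floats that one interleaved vertex occupies, given its streams.
uint FloatsPerVertex( const Stream* pStreams, uint nStreams )
{
    uint total = 0;
    for ( uint i = 0; i < nStreams; ++i )
        total += pStreams[ i ].nComponents;
    return total;
}
```

With a description like this, client code can work out the vertex stride and the range of each submesh without knowing where the data came from.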

Such a structure is general enough to hold almost all types of geometry that can be handled by today's hardware. But it doesn't provide any way to initialize, load or otherwise create this geometry data. To do that, we create specialized classes.

In doing this, Tom Forsyth's article on material abstraction comes in handy yet again. By generalizing his idea of texture sources, we can also define Mesh-derived classes that perform operations such as:

  • Load geometry from a source file, e.g. MeshN3D2 and MeshNVX2 would load the geometry from the n3d2 and nvx2 file formats of Nebula2.

  • Merge geometry together, e.g. CompositeMesh would merge several source meshes into one.

  • Process the geometry somehow, e.g. NormalMesh would compute the normals for a triangle stream, and TransformMesh would apply a given transform (translation, rotation, scale) to the source geometry.

  • Prepare the geometry for some specific rendering, e.g. ShadowMesh would insert the quads for shader-based shadow volume extrusion, and InstanceMesh would insert instance indices in a stream made of multiple copies of some source geometry.

  • Expose an interface for direct geometry manipulation, e.g. BuilderMesh would allow adding custom vertices from the application using a comprehensive interface: AddCoord, AddTexCoord, AddTriangle, etc.

And the usage of such specific classes would look like:

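// Load the source geometry from a file
Mesh* pSrcMesh = Mesh::Create<MeshN3D2>( "torus.n3d2" );

// Compute normals for the source geometry
Mesh* pNormalMesh = Mesh::Create<NormalMesh>( pSrcMesh );

// Scale the source geometry
TransformMesh* pScaleMesh = Mesh::Create<TransformMesh>( pSrcMesh );
pScaleMesh->Scale( 2.0f );

// Merge several source meshes (loaded elsewhere) into one
CompositeMesh* pCompositeMesh = Mesh::Create<CompositeMesh>();
pCompositeMesh->Add( pSrcMesh1 );
pCompositeMesh->Add( pSrcMesh2 );
pCompositeMesh->Add( pSrcMesh3 );

// Prepare the source geometry for instancing and for shadow volumes
Mesh* pInstanceMesh = Mesh::Create<InstanceMesh>( pSrcMesh );
Mesh* pShadowMesh = Mesh::Create<ShadowMesh>( pSrcMesh );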

This kind of abstraction makes it possible to create all sorts of derived classes with only one virtual method, Load(), which would perform the required operations, fill the vertex and index arrays, and make them available either to the client application (e.g. to load vertex and index buffers) or to another mesh that uses it as a source mesh.
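A minimal sketch of that single-virtual-method design follows. It is a simplified stand-in, not the actual classes: it stores positions only, uses std::vector instead of raw arrays, constructs objects directly instead of going through a factory, and `TriangleMesh` is a hypothetical source that replaces a file loader.

```cpp
#include <vector>

// Simplified stand-in for the Mesh base class: positions only,
// three floats per vertex.
class Mesh
{
public:
    virtual ~Mesh() {}

    // Fills the vertex and index arrays; returns false on failure.
    virtual bool Load() = 0;

    const std::vector<float>&    Vertices() const { return vertices; }
    const std::vector<unsigned>& Indices() const  { return indices; }

protected:
    std::vector<float>    vertices;
    std::vector<unsigned> indices;
};

// Hypothetical source mesh: one hard-coded triangle instead of a file.
class TriangleMesh : public Mesh
{
public:
    bool Load() override
    {
        vertices = { 0.0f, 0.0f, 0.0f,  1.0f, 0.0f, 0.0f,  0.0f, 1.0f, 0.0f };
        indices  = { 0, 1, 2 };
        return true;
    }
};

// Derives its data from another mesh; only uniform scale is sketched.
class TransformMesh : public Mesh
{
public:
    explicit TransformMesh( Mesh* pSource ) : pSource( pSource ), scale( 1.0f ) {}

    void Scale( float s ) { scale = s; }

    bool Load() override
    {
        // Load the source first, then build our own arrays from it.
        if ( !pSource || !pSource->Load() )
            return false;

        vertices = pSource->Vertices();
        indices  = pSource->Indices();
        for ( float& component : vertices )
            component *= scale;
        return true;
    }

private:
    Mesh* pSource;
    float scale;
};
```

Because a TransformMesh only sees its source through the base interface, it can wrap a file loader, a composite or another transform without knowing which one it got.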

This abstraction has yet another useful application: reusing geometry across several models. If several models are merged together in order to share the same vertex buffer while using different materials for each of the submeshes, the trick is to create a registry of all meshes keyed by a unique string, similar to the one described by Forsyth for textures, e.g.:

ShadowMesh(CompositeMesh(MeshN3D2("upper_body.n3d2"),MeshN3D2("lower_body.n3d2")))

Now if we have different materials for the upper and lower body of a model using this composite mesh, we would do the following:

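// pMesh is the composite mesh registered under the string above
Material* material1;    // material for the upper body
void* pMeshData1 = material1->Load( pMesh );

Material* material2;    // material for the lower body
void* pMeshData2 = material2->Load( pMesh );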

Internally, both materials would load the same mesh, but instead of duplicating data, they reuse the same geometry buffers, and switch to different vertex and index ranges when rendering them.
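The sharing could be sketched as a registry that hands out one entry per description string. `MeshData`, `MeshRegistry` and `Acquire` are hypothetical names, and the actual parsing of the string and loading of the geometry are left out.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for loaded geometry; a real entry would own the GPU buffers.
struct MeshData
{
    std::vector<float>    vertices;
    std::vector<unsigned> indices;
};

// Hypothetical registry: one MeshData per unique description string.
class MeshRegistry
{
public:
    // Returns the shared entry for 'key', building it only on first request.
    std::shared_ptr<MeshData> Acquire( const std::string& key )
    {
        auto it = entries.find( key );
        if ( it != entries.end() )
            return it->second;

        // A real implementation would parse 'key' and load the geometry here.
        auto data = std::make_shared<MeshData>();
        ++buildCount;
        entries[ key ] = data;
        return data;
    }

    // Number of entries that actually had to be built.
    unsigned BuildCount() const { return buildCount; }

private:
    std::map<std::string, std::shared_ptr<MeshData>> entries;
    unsigned buildCount = 0;
};
```

Two materials asking for the same string receive the same pointer, so each one only needs to remember its own vertex and index range inside the shared buffers.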

Wednesday, April 2, 2008

Yet Another Material Framework

Now that I'm working on some visually appealing demos for the benefit of the world, I've set myself the goal of developing a Demo Framework. Yet another one.

Yep, no graphics programmer has yet been able to use someone else's demo framework, despite the number of them that are readily available. And yet, not only does everyone create their own, but they do it in mostly the same way: create a homogeneous DirectX and OpenGL wrapper, encapsulate your renderer, your models, your textures, and start implementing your fancy demos on top of them.

Quite probably, that's exactly what I've been doing too, but in doing so I've been trying to prove my point: no matter which implementation you are using, you need to plan for the future, you need to support as many different features as possible without sticking to the lowest common denominator of all your target platforms, and you need to be able to take advantage of the most powerful features of every platform or programming interface.

Curiously enough, that's exactly what Tom Forsyth proposed in [Forsyth]. In this great paper, Forsyth presents the concept of abstract materials: an abstract representation of what a shader does, in the form of attributes and operations, instead of explicit shaders. Not only would such a representation hide the complexity of the shader code from the user, it would also encourage code reuse across multiple shaders that share the same operations. Redundant shader code is probably one of the nastiest issues in shader maintenance, and it is the one that interested me the most when I first read the paper.

But then I found that the implementation is a little less clean than expected, because in order to correctly describe abstract materials you need the kind of expressive power that only comes from programmability, be it scripting or a comprehensive, graphical representation of shader effects. The rest of the points in Forsyth's document fall into the “simple” category: texture encapsulation, material fallback, static vs. dynamic materials, etc. But the way he presents a Material Description as an arbitrary enumeration of attributes and rendering flags is way too generic, and would probably lead to an inextensible, bloated code aggregator. The goal of achieving maximum expressiveness is limited by whatever form of code generation we can turn the abstract Material Description into.

For the sake of argument, let's review some known forms of shader encapsulation.

Shader library

This approach is the one that works the best because it is simple and grants the programmer control over the variety and complexity of shaders. Basically, it means that shaders are entirely coded by a programmer on demand, possibly using some external tool (NVIDIA and ATI have excellent shader editors, packed with samples for the benefit of the public). These shaders are then assigned to a Material, that would possibly encapsulate details of the process such as sorting, grouping or rendering passes, and expose requirements such as system parameters or input formats.

As an example, a PhongDiffuse material could require the Position, Texcoord, and Normal vertex components, the DiffuseColor material parameter, the LightDirection and LightColor environment parameters, and the WorldViewProjection system parameter (let's keep it simple). This information is enough to generate a suitable mesh and render it to the screen, assuming all required data are present.
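One way to picture those requirements is as plain data that the material exposes and the loader checks against. The struct and function names below are hypothetical; only the parameter names come from the example above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical description of what a material needs in order to render.
struct MaterialRequirements
{
    std::vector<std::string> vertexComponents;
    std::vector<std::string> materialParams;
    std::vector<std::string> environmentParams;
    std::vector<std::string> systemParams;
};

// Requirements of the PhongDiffuse example.
MaterialRequirements PhongDiffuseRequirements()
{
    MaterialRequirements r;
    r.vertexComponents  = { "Position", "Texcoord", "Normal" };
    r.materialParams    = { "DiffuseColor" };
    r.environmentParams = { "LightDirection", "LightColor" };
    r.systemParams      = { "WorldViewProjection" };
    return r;
}

// Returns the required vertex components that the geometry does not provide.
std::vector<std::string> MissingComponents( const MaterialRequirements& requirements,
                                            const std::vector<std::string>& available )
{
    std::vector<std::string> missing;
    for ( const std::string& name : requirements.vertexComponents )
    {
        if ( std::find( available.begin(), available.end(), name ) == available.end() )
            missing.push_back( name );
    }
    return missing;
}
```

A loader could use the list of missing components either to reject the mesh or to generate the data, for example by computing normals.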

The obvious problem is that artists don't have any power to experiment and define their own materials unless they are willing to program shaders themselves. It is a known fact that artists and designers usually hate coding, but still love to customize their data and behaviors through some procedural representation. That's exactly the same as coding, but they don't perceive it as such (think ActionScript or RenderMan), and that's something we must take into consideration.

Assembled fragments

Shader fragments were made official in DirectX 9, but by that time several authors, myself included, had already built their own fragment-based approaches to building shaders. The concept is that if you can isolate reusable operations into code blocks, and describe how these blocks relate to each other (dependencies, order, inputs and outputs, etc.), you can theoretically build shaders just by selecting which fragments a Material requires. [Hargreaves] and [Osterlind] present different approaches to splitting shaders into fragments that can then be rebuilt into a meaningful whole. For example, the PhongDiffuse shader above could be assembled from the named fragments Projection (transform position to clip space), DiffuseLight (compute the Diffuse color component from the dot product of the Normal and the incident Light direction), etc. If, for example, we wanted to add Bump to this Material, the Normal would be computed in an additional code fragment. With a clever definition of dependencies between code fragments, ensuring that all required fragments are present at assembly time, complete shaders can be built this way.

This solution enforces reusability and moves some flexibility and experimentation into the hands of artists, while the programmer still retains most of the control over how shaders actually work; it is probably the most comprehensive of all. I must warn you though: it requires some heavy thinking about the best way to mix fragments, because the number of combinations grows exponentially.
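The dependency part of fragment assembly can be sketched as a depth-first ordering over a table of fragments. This is not how [Hargreaves] or [Osterlind] do it; `FragmentTable` and `ResolveFragments` are hypothetical, and the dependencies used in any example are made up for illustration.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical table: fragment name -> names of the fragments it depends on.
using FragmentTable = std::map<std::string, std::vector<std::string>>;

namespace detail
{
    // Depth-first visit: emits every dependency before the fragment itself.
    void Visit( const std::string& name, const FragmentTable& table,
                std::set<std::string>& done, std::set<std::string>& inProgress,
                std::vector<std::string>& order )
    {
        if ( done.count( name ) )
            return;
        if ( !inProgress.insert( name ).second )
            throw std::runtime_error( "cyclic fragment dependency: " + name );

        auto it = table.find( name );
        if ( it == table.end() )
            throw std::runtime_error( "unknown fragment: " + name );

        for ( const std::string& dependency : it->second )
            Visit( dependency, table, done, inProgress, order );

        inProgress.erase( name );
        done.insert( name );
        order.push_back( name );
    }
}

// Returns the requested fragments plus everything they depend on,
// ordered so that each fragment comes after its dependencies.
std::vector<std::string> ResolveFragments( const std::vector<std::string>& requested,
                                           const FragmentTable& table )
{
    std::set<std::string> done;
    std::set<std::string> inProgress;
    std::vector<std::string> order;
    for ( const std::string& name : requested )
        detail::Visit( name, table, done, inProgress, order );
    return order;
}
```

A material that only names DiffuseLight would then automatically pull in whatever fragment computes the Normal, which is what keeps the material description short.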

I've seen other takes on the same solution that keep working with shader code but let named fragments be invoked from within the code of other shaders. Others use preprocessor directives to conditionally include or exclude shader code depending on a number of defined values, thus producing a number of code combinations. While working at Tragnarion, I was personally responsible for one of the worst shader assemblers ever: a builder script that programmatically output lines of code depending on the attributes present in the material description. It seemed like a good and simple idea at the time, but it quickly reached critical mass.
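To see why the preprocessor approach gets out of hand, here is a small sketch that enumerates the define headers for every combination of optional features. The function name and the feature names are made up for the illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds one block of #define lines per combination of the given features.
// N features produce 2^N variants, which is the combinatorial growth
// mentioned above.
std::vector<std::string> BuildDefinePermutations( const std::vector<std::string>& features )
{
    std::vector<std::string> variants;
    const std::size_t count = std::size_t( 1 ) << features.size();

    for ( std::size_t mask = 0; mask < count; ++mask )
    {
        std::string header;
        for ( std::size_t bit = 0; bit < features.size(); ++bit )
        {
            // Bit 'bit' of the mask says whether this feature is enabled.
            if ( mask & ( std::size_t( 1 ) << bit ) )
                header += "#define " + features[ bit ] + " 1\n";
        }
        variants.push_back( header );
    }
    return variants;
}
```

Three optional features already mean eight shader variants to compile and test; ten features mean 1024.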

Shader Graph

The whole idea of shader programmability seems to point to good old RenderMan, and Shader Graphs basically rescue the original idea and put a very similar concept in the art pipeline, making it look more like an art tool. A Shader Graph is usually presented in the form of a graph of connected blocks that represent either data or operations. The data implicitly defines the requirements of the Material (textures, numeric parameters, input streams), whereas the operations describe its results. These are usually fed to one of several rendering models that process every output of the graph into the corresponding visual result. For example, a graph that outputs Diffuse and Normal values can be processed by a Phong renderer, but the Diffuse value can be directly sampled from a texture, or processed through some other operations, and the Normal can come from the input geometry, a Bumpmap, or any other source.

The advantage of this approach over the previous one is that only the interface and outputs of a graph are determined by the blocks it contains, but the user is free to arbitrarily define behaviors.

Examples of this approach to shader synthesis include the Material editor in the Unreal Engine, or the mental mill tool from mental images, available as part of the NVIDIA SDK.

Fragments, on their own, are black boxes: they define the input, the output, and of course the operations they encapsulate. On the other hand, fragments are simpler to use, and are a higher-level abstraction of shader behavior when it comes to Material level of detail: a Bump fragment can be implemented in its own way for different targets (or removed altogether) when it comes to automatically simplifying a shader. The low-level operations that a shader graph usually involves (e.g. texture sampling, modulation, etc.) are basically a fancy form of coding (just like ActionScript or RenderMan, remember?) and have the exact same problems when it comes to Material level of detail, data hiding, shader diversity, etc. But the shader graph is too powerful a tool to ignore anyway.

Metamaterials

What is the right way to describe a Material? All of them, of course, as long as they get the job done. For many cases, a shader library will do the trick, and it is a clean way to control how complex the system becomes (it won't easily get out of hand). Fragments are a simple and comprehensive way to combine known pieces, and most materials in a game could benefit from just a handful of these fragments. Shader graphs are the perfect solution for every other effect you may require, letting both artists and graphics programmers play with them for the sake of trial and experimentation.

Each of these approaches works better in certain scenarios. It would be stupid of us to dismiss one or the other just because they seem too simple, or too complex, or for whatever other reason. But all of them share one common trait: Metainformation. Whether the boxes encapsulate an entire material or small pieces of one, the actual description of the material is irrelevant to the design of the material system. This is why I've thought of creating the kind of Material description that Forsyth describes, but in its own abstract way, one that makes it possible to fit a complete shader, a fragment composite, or a shader graph without having to rewrite the whole system.

The base of this system is that the Metainformation is in a homogeneous namespace, that gives it the appearance of a shader library. This simplifies the process of creating materials and using them to load and render geometry to the simplest form:

Material* material = Material::Create( "materials/phong/diffuse" );
Model* model = Model::Create( "models/cube" );
material->Load( model );
material->Render( model );

Now, could you tell me what kind of shader abstraction is behind this Material? No, you can't, because the “materials/phong/diffuse” name is defined in a flat namespace where all kinds of Metamaterials are possible. Let's say the Metamaterial is an effect, in whatever format you prefer (HLSL, GLSL, Cg, etc.), and looks like:

materials/phong/diffuse
{
    stream position
    stream normal
    profile dx9
    {
        effect dx9/phong.fx
        technique tDiffuse
    }
}

Now this makes explicit the implementation of the Phong material for every supported profile (roughly, a platform) each in its preferred format.

If we wanted to clearly separate the code into its building blocks (fragments), we could create a different type of Metamaterial:

materials/phong/diffuse
{
    diffusemap
    phonglighting
    diffuselighting
}

Now we're just saying that by assembling these three fragments (whatever they may be) we're going to achieve the desired effect. Of course the management of these fragments is not trivial, but one of them could look like:

fragment diffusemap
{
    stream texcoord0
    sampler diffmap0
    vertexshader
    {
        out.uv0 = in.uv0;
    }
    pixelshader
    {
        diffuse = sample( diffmap0, in.uv0 );
    }
}

This is probably too simple a shader fragment, but its purpose is to show that fragments are just a higher level of indirection than directly providing a shader file. But in the end it is the same code anyway.
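To make the "same code in the end" point concrete, here is a minimal sketch of an assembler that simply concatenates the vertex and pixel snippets of a list of fragments. The `Fragment` and `ShaderSource` structs and the `AssembleFragments` function are hypothetical, and a real assembler would also have to declare inputs, outputs and samplers.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory form of a fragment like the one above.
struct Fragment
{
    std::string name;
    std::string vertexCode;
    std::string pixelCode;
};

// Bodies of the assembled vertex and pixel shaders.
struct ShaderSource
{
    std::string vertex;
    std::string pixel;
};

// Concatenates the snippets in the given order, labelling each one
// with a comment so that the generated code can be traced back.
ShaderSource AssembleFragments( const std::vector<Fragment>& fragments )
{
    ShaderSource source;
    for ( const Fragment& fragment : fragments )
    {
        if ( !fragment.vertexCode.empty() )
            source.vertex += "// " + fragment.name + "\n" + fragment.vertexCode + "\n";
        if ( !fragment.pixelCode.empty() )
            source.pixel += "// " + fragment.name + "\n" + fragment.pixelCode + "\n";
    }
    return source;
}
```

The order of the input list matters here, which is why fragment systems usually pair this step with some form of dependency resolution.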

Now let's try to achieve the same thing through a clever shader graph:

materials/phong/diffuse
{
    sampler diffuse
    {
        texture diffusemap
        texcoord uv0
    }
    color output
    {
        diffuse = diffuse.rgba
    }
}

Here, sampler and output are different types of graph nodes, each with a well defined set of inputs and outputs. Sampler is a block that takes a Texture and a Texture coordinate set as an input, and outputs a 4-component value. Output is a block with a number of inputs, such as diffuse, normal, specular, etc. that it combines in any way the material wants to. This is too simple an example to make it obvious, but when it comes to lighting, shadowing, environment mapping and other such advanced effects, there are many models that could be used to process the output block.
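As a rough sketch of how such nodes could turn into code, each node below emits a shader expression and the output node pulls from whatever is connected to its diffuse input. The class names and the emitted syntax are assumptions made for the illustration, not part of any existing tool.

```cpp
#include <string>

// A graph node that can be turned into a piece of shader code.
struct GraphNode
{
    virtual ~GraphNode() {}
    virtual std::string Emit() const = 0;
};

// Takes a texture and a texture coordinate set, outputs a 4-component value.
struct SamplerNode : GraphNode
{
    std::string texture;
    std::string texcoord;

    SamplerNode( const std::string& tex, const std::string& uv )
        : texture( tex ), texcoord( uv ) {}

    std::string Emit() const override
    {
        return "sample( " + texture + ", in." + texcoord + " )";
    }
};

// Output block; only the diffuse input is sketched here.
struct OutputNode : GraphNode
{
    const GraphNode* pDiffuse;

    explicit OutputNode( const GraphNode* pDiffuseInput ) : pDiffuse( pDiffuseInput ) {}

    std::string Emit() const override
    {
        return "diffuse = " + pDiffuse->Emit() + ".rgba;";
    }
};
```

Swapping the node connected to the diffuse input, for example for a node that modulates two samplers, changes the generated expression without touching the output block.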

It is obvious that these approaches are all variations on the same theme. The difference is the interface and expressiveness each gives the user: nothing prevents you from creating a Phong block in a shader graph that takes all inputs and processes them into all outputs. It's up to the actual material designer to decide the level of abstraction after all.
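One possible way to keep all three behind the same name lookup is a common interface plus a flat, name-keyed library. The sketch below is only an assumption about how that could be organized; `MetaMaterial`, `EffectMaterial`, `FragmentMaterial` and `MaterialLibrary` are hypothetical names, and `Describe()` stands in for whatever would really build the shader.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common interface: client code never needs to know which kind it got.
class MetaMaterial
{
public:
    virtual ~MetaMaterial() {}
    virtual std::string Describe() const = 0;
};

// A complete effect file plus the technique to use.
class EffectMaterial : public MetaMaterial
{
public:
    EffectMaterial( const std::string& file, const std::string& technique )
        : file( file ), technique( technique ) {}

    std::string Describe() const override
    {
        return "effect " + file + " technique " + technique;
    }

private:
    std::string file;
    std::string technique;
};

// A list of named fragments to be assembled.
class FragmentMaterial : public MetaMaterial
{
public:
    explicit FragmentMaterial( const std::vector<std::string>& fragments )
        : fragments( fragments ) {}

    std::string Describe() const override
    {
        std::string text = "fragments";
        for ( const std::string& fragment : fragments )
            text += " " + fragment;
        return text;
    }

private:
    std::vector<std::string> fragments;
};

// Flat namespace: every kind of Metamaterial is registered by name.
class MaterialLibrary
{
public:
    void Register( const std::string& name, std::unique_ptr<MetaMaterial> material )
    {
        materials[ name ] = std::move( material );
    }

    // Returns nullptr when the name is not registered.
    const MetaMaterial* Find( const std::string& name ) const
    {
        auto it = materials.find( name );
        return it != materials.end() ? it->second.get() : nullptr;
    }

private:
    std::map<std::string, std::unique_ptr<MetaMaterial>> materials;
};
```

A shader graph would be a third class implementing the same interface, and the calling code would stay exactly the same.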

And this is where I'm stuck now, trying to build a Metadescription that can be turned into an actual shader (or set of render states) still keeping the advantage of automatic quality downgrading, code reuse, and extensibility. I will report back when I get there.

References
  • [Forsyth] Tom Forsyth, “Shader Abstraction”, in ShaderX2: Shader Programming Tips and Tricks with DirectX 9
  • [Hargreaves] Shawn Hargreaves, “Generating Shaders From HLSL Fragments”, in ShaderX3
  • [Osterlind] Magnus Österlind, “Shaderbreaker”, in ShaderX

Friday, September 28, 2007

Pushing the limits

While I was working on a graphics engine for video games, development was driven by immediate needs, schedules, release plans and available resources, most notably time, money and people. Now that I have more time to think carefully about graphics on my own, and seeing all the different directions that next-gen technology seems to be opening for the future of real-time graphics, I'm still trying to figure out, in a wild exercise of fortune-telling, how games are supposed to look in the next few years, and what we developers should be focusing on in order to make these next-gen techniques possible.

Even trying to classify all the different areas of research and development in computer graphics now seems like a titanic task, and catching up with all current and future developments little short of utopian. Simply put, there is too much information for single individuals to process, and I can imagine technical directors in major development studios spending a lot of time just staying up to date with cutting-edge technology. That's probably why it looks almost impossible to be a guru of everything that has to do with graphics these days, and still cope with milestones, unless you're one of the lucky guys who gets paid to do research and development, and enlighten the graphics community after that, either in papers and conferences, or by helping create the next stunning-looking blockbuster that everyone will be learning from (i.e. copying) for the next few years. To be honest, I hate those guys as much as I admire them.

In this blog, I want to try and summarize the buzz regarding modern techniques that seem to be breaking new ground these days, and how these techniques could evolve into the graphics of the next generation, and maybe the one after that.

Geometry

The amount of geometric detail that modern hardware is able to handle keeps growing over the years, granting artists the ability to tessellate their finely modeled props and characters as much as they want to, and easily leading to characters with as many as 10,000, 20,000 or 100,000 polygons each. At the same time, Bump-, Normal-, Parallax- and general Relief-maps allow adding the fine detail that plain geometry isn't able to achieve. There doesn't seem to be much room for improvement here, at least when it comes to static, predefined geometry, but of course there is everything to say on dynamic geometry, including adaptive tessellation, constructive geometry, deformations and fractures, etc. Several technologies have already been introduced to dynamically create, deform, break or otherwise modify geometry on both the CPU and the GPU.

The only limitations so far are due to the throughput required to push dynamically generated geometry to the GPU, but with GPU-based techniques that shouldn't be such a huge problem in the future. Now it's time to move from polygon-based geometry to volume-based matter, which will make bodies look more real, closer to their real-life counterparts, behaving as physically accurately as possible within computation limits. If there is to be real progress, next-gen models will move away from the metaphor of creating geometry as the surface enclosing a volume, and instead model it as the actual piece of matter.

Or liquid. Let's just remember that simulating fluids in real time is the next big thing, and the day we're able to simulate the behavior of such substances in real time, there will be room for improvement not only in graphics quality, but in gameplay too, allowing new and exciting environments to be explored and used to challenge the player.

Materials

Just as geometry is gaining in detail and realism, so are textures and shaders. Many of the techniques that have been widely used in film production only differ from what's possible in real time in the texture detail, the number of texture layers and passes, and the precision of lighting and shading algorithms. But new hardware is already allowing more and bigger textures, and more and more intricate shaders. There seems to be no limit to the number of instructions that shader developers will be able to use in the future, and the techniques this will make possible are simply impossible to foresee.

It follows logically that textures and shaders will evolve into accurately modeling the physical properties and appearance of matter instead of surfaces. Also, it is quite probable that textures will be more and more used to reflect not only the properties of materials, but their state too, creating more and more techniques that use textures as general-purpose buffers that will be used to hold the changes and perturbations of the matter to which they are applied. And volume textures (or their generalization) should become more and more standard, following the evolution of polygons into volumes.

Lights, Shadows, Volumes

Most of the buzz you'll find regarding shaders is about different techniques to create convincing simulations of light environments, of all the fine interrelations of lights and physical media. Shadowing, environment mapping, ambient occlusion, radiosity and ray tracing are some of the names of the techniques that have been regularly used in both production and real-time graphics. Most of them are convincing enough, but are ultimately approximations based on several assumptions about perception. But now it seems that modeling the complex way in which real light interacts with the real world (reflectivity, transparency, scattering, etc.) is the way to go.

The challenge of improving the quality of lighting and shadowing includes moving into volumetric rendering. This means that the air (or the particles in it) is now part of the equation, and that no lighting, no environment will look real enough if there is no feeling of density, of smoke, steam, dust. Every single particle in the air (or corresponding fluid or matter) will have to be part of the equation, otherwise the simplification will show. Techniques to simulate light shafts or volumetric scattering already exist, but they are still applied as an extension to the simple technique for lighting or shadowing, usually by faking volume and depth in a 2D space. When true volume rendering becomes widely available, maybe we'll be able to render volumes not as a function of the viewpoint, but as a true 3D effect.

Oh, and of course, expect all of these techniques to be real time in nature. Pre-baking any form of lighting will be banned from development practices the minute geometry and materials fully enter the dynamic world.

Special effects

I would say that where video games will excel in the future will be in making it all look alive. When you combine all of the aforementioned techniques, you have a perfect picture (or stereogram) of reality. But it's a picture anyway, and even though dynamism will be a regular part of that simulation, it's in what happens during change, motion, life, that we recognize a world we can feel is real, instead of just believable. The good thing about special effects is that, thanks to Hollywood, we're already used to them, and seeing worlds explode or cities flooded is already part of the collective consciousness. At the level of video games, even the simplest particle effects, impacts, explosions and atmospheric effects (rain, lightning, etc.) add several levels of depth to an otherwise empty (but nice-looking) world.

Special effects are improving in many different directions: there are a lot of them already, and several of them are combined into one big effect that behaves more realistically; they are less predictable, with the right amount of randomness expected; they accurately match the physical properties of the materials with which they are associated; and they leave their marks and traces in the world after fading out. The latter is an important quality of special effects that is not too common nowadays but comes with the dynamic nature of next-gen graphics. That is to say, special effects are part of the visual representation of a synthetic world, not an addition on top of it.

In fact, I don't think there will be much difference between plain rendering and special effects, all being part of the same process. I mean, they are rendered through the same processes already, but effects will be part of the definition of digital matter, just as dust is part of a building when it collapses. Maybe effects exist only as a function of events, transitions and other types of processes. But they are part of the world on their own, and as such they should be part of the world definition. If we're able to teach a system to behave, just like nature does, it should be easier to create the rules to describe what's expected of the components of that world, and maybe some unexpected behavior will emerge naturally.

The lesson learned from recent movies where CGI interacts with real images in a way that makes it difficult to tell real from digital (see Transformers for an example) is simple: real human vision has already set the quality standards of what's expected from digital images. Of course that doesn't apply to non-photorealistic rendering, but faking reality in a computer is the most widespread, and expected, form of graphics representation today, and the expectations of viewers will go up even higher. Maybe we could be fooled once with simple tricks, but no kid born in the digital era will be. In time, our ability to identify defects in synthetic imaging will grow, and by that time, all the techniques that today are merely convincing will simply not be enough for the casual eye, not to speak of the expert one. The interactive nature of video games, and the necessary sense of immersion, makes it even more difficult, forcing us to create worlds instead of scenes, vision instead of graphics, and life instead of reality.

Thursday, September 27, 2007

Where were you in 1986?

A long time ago, back when most computer games were sold on tapes, there was a game called The Trap Door. This might be an odd way to start a blog about next-generation gaming, but please stick with me to see where this is going. The game was a typical puzzle-solving adventure where you had to perform several tasks for your boss, who had you trapped in a castle. Some elements of this game shock me even now: there were only 6 different screens, and there were 5 missions to solve using all the objects that were lying around in these few rooms, which included things as fun as boiling slimeys, or crushing eyeballs into juice (it was based on a children's show, you know). Every time Berk (the player) picked up some object, you could see him grabbing it from the ground, holding it in his hands, and dropping it when necessary. And in order to solve a mission, you needed to figure out the logic behind all objects just by paying attention to the screen, e.g. in order to crush the eyes, you needed to use some small eyes as seeds, grow them into some eye plant, collect the big eyeballs, put them in some container, and crush them with the help of some nasty creature that came out of the trap door from the title.

I'm bringing this up because this game had some amazingly modern concepts in its gameplay. The idea of an interactive environment seems to be one of the cornerstones of next-generation gaming. But even more, what I like the most in The Trap Door is that there is an implicit language for players to learn and understand, one that explains by itself the rules of the game, letting players try and use game objects as they like, only implicitly assuming that all of them will be of some use in some of the missions. But there is no HUD, no labels, no signals to show what every item does, or how it is used. Through a clever use of simple yet clear graphics, everything seems to announce what it is and what it is supposed to be used for. Compared with recent games that take that kind of freedom to a much larger scale (e.g. Oblivion, where apparently there are thousands of useless objects spread throughout the game, or Dead Rising, where every interactive object is labeled in a HUD even though it's apparent what can or can't be used), it shows an amazing ability to communicate with players in a smarter way, letting them learn the logic instead of being constantly guided by lazy game designers.

I must admit I like being guided a little, at least at the beginning, but following a path that has been laid down for me to follow obediently is little less than being treated as a puppet, or a well-trained monkey. Even though action scenes usually still demand the kind of skill that truly comes from hand-eye coordination, finesse and finely paced moves, when it comes to solving the story, too many games keep treating their players with disdain for their ability to make their way through the game. Some people complain about those games whose storytelling is basically an interactive movie where you get to move the character where he's expected to go, and not much else. It would be better to switch to a simple cutscene where the story is plainly exposed. If the player isn't able to change the course of the story anyway, what's the point in asking him to go through a predefined set of movements or actions?

Next-gen gaming is all about choices: letting players actively interact with, and even change, the story, and portraying the non-player characters not as a game abstraction, but in their true condition as allies, opponents, or plain witnesses. That's what next-generation gaming is supposed to be about, using next-gen technology (graphics, animation, physics) to increase the believability of that story, those characters, those settings. One good thing about having all that computing power at hand is that it forces the game designer to think of more clever ways of challenging the player. The tricks that have been routinely used in the past should be banned and slowly fade into memory. More and more games are proposing truly next-gen game experiences, with or without the help of cutting-edge technology. After all, if The Trap Door did it back in 1986, it sounds simply stupid that a 21st century game designer isn't able to work that out.