// Add all geometry below this node.
sglScene *root = new sglScene;
  :
  :
// Reuse the same traversal state node from frame to frame for better
// performance.
sglCullf trav_state;

while (!quit_flag)
{
   // Do application specific work
   doAppStuff();

   // Adjust internal data-structures (display lists, bounding volumes,
   // etc...) for the parts of the scene graph that may have changed
   // since last frame.
   root->preDraw();

   // Collect the geometry intersecting the given frustum & view_matrix.
   root->cull(trav_state, fov_x, fov_y, win_x, win_y, frustum, view_matrix,
              override_statelets, disc_mask, state_mask);

   // Render the visible geometry.
   trav_state.draw();
}

Where the following is an explanation of the parameters to the cull traversal:
The above code illustrates one typical way of rendering the scene graph. However, there are other ways of rendering the graph that may be useful in some circumstances.
// Add all geometry below this node.
sglScene *root = new sglScene;

// used to move the camera around
sglTransformf *view_trans = new sglTransformf;
root->addChild(view_trans);

// set up the view platform and camera
sglPerspectiveCamera *view_camera = new sglPerspectiveCamera;
view_camera->setFOV(M_PI*0.25, (double)win_x/(double)win_y, 1.0, 200.0);
sglViewPlatform *view_platform = new sglViewPlatform(*view_camera);
view_platform->setEnableZUp(true);
view_trans->addChild(view_platform);
  :
  :
// Reuse the same traversal state node from frame to frame for better
// performance.
sglCullf trav_state;

while (!quit_flag)
{
   // Do application specific work
   doAppStuff();

   // Adjust internal data-structures (display lists, bounding volumes,
   // etc...) for the parts of the scene graph that may have changed
   // since last frame.
   root->preDraw();

   // Collect the geometry intersecting the given frustum & view_matrix.
   root->cull(trav_state, win_x, win_y, *view_platform,
              override_statelets, disc_mask, state_mask);

   // NEW: setup OpenGL projection matrix
   (view_platform->getCamera()).applyProjection();

   // Render the visible geometry.
   trav_state.draw();
}
SGL has the capability to collect rendering statistics (number and type of geometry rendered, state changes, textures, etc...). Enabling statistics gathering is as simple as switching to a debug traversal object. For example:
sglScene *root;
sglCullf trav_state;
sglStatsCullf debug_trav_state;

draw()
{
   root->preDraw();

   if (need_debug_stats)
   {
      root->cull(debug_trav_state, ...);
      debug_trav_state.draw();

      // Get the statistics collected since the last call to stats.reset().
      sglStats &stats = debug_trav_state.getDebugStats();
   }
   else
   {
      root->cull(trav_state, ...);
      trav_state.draw();
   }
}
That said, there are some cases where double precision traversals are useful. Keep in mind the costs: double precision traversal will be slower, and how much slower is hardware dependent. Also, all the graphics hardware I've ever used is single precision, so vertex data is single precision. You'll need to keep vertices close enough to their local origin, and keep your eyepoint close enough to the local origin of the geometry you are looking at, to maintain your desired precision.

In any case, the traversal state used in rendering should match the precision of the nodes in the scene graph to achieve the desired precision. For example, pair sglTransformf with sglCullf, or sglTransformd with sglCulld.

Note, however, that you can still mix single and double precision nodes in a single scene graph, and you can still use either traversal state above. The resulting precision of the modelview matrix at any given point during traversal is a function of the precision of the transform nodes encountered, in order, during traversal and of the precision of the sglCull; a full analysis is beyond the scope of this document.
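To make the pairing concrete, a mostly double precision variant of the first rendering loop might look like the sketch below. It assumes the double precision classes mirror the interfaces of their single precision counterparts used above, and the cull() arguments are abbreviated as in the statistics example.

// Double precision scene: transforms and traversal state both use doubles.
sglScene      *root       = new sglScene;
sglTransformd *view_trans = new sglTransformd;   // double precision transform
root->addChild(view_trans);
  :
  :
sglCulld trav_state;                             // double precision traversal state

while (!quit_flag)
{
   doAppStuff();
   root->preDraw();
   root->cull(trav_state, ...);                  // same parameter list as before
   trav_state.draw();
}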
Finally, the traversal can be driven at a lower level, setting up the traversal state by hand and culling individual nodes directly instead of going through a root sglScene. For example:

sglNode *node1;
sglNode *node2;
sglNode *node3;
sglCullf trav_state;
vector default_statelets;
sglScene::createDefaultStatelets(default_statelets);
deque override_statelets;
unsigned int frame_count = 0;
sglTimespec frame_time;

draw()
{
   // bind any textures that need it
   sglTexture::bindDirtyTextures();

   // Call preDraw on any nodes that may have changed.
   node1->preDraw();
   node2->preDraw();
   node3->preDraw();

   // if no animation nodes exist then the following is unnecessary.
   ++frame_count;
   sglTimespec::getSysTime(frame_time);

   // Perform some cleanup - may not be necessary all the time if nodes
   // are being changed or textures aren't being replaced.
   sgl::deleteUnusedDisplayLists();
   sgl::deleteUnusedTextureObjs();

   // Compute a LOD scale factor.
   float lod_scale = sglScene::computeLODScale(fov_x, fov_y, win_x, win_y);

   // Setup flags required for culling.
   unsigned int cull_flags = 0;
   for (unsigned int i=0; i < frustum.getNumPlanes(); i++)
   {
      cull_flags |= (1 << i);
   }

   // Setup the traversal state object for this frame.
   trav_state.initialize(lod_scale, disc_mask, state_mask, view_matrix,
                         frustum, default_statelets, override_statelets,
                         frame_count, frame_time);

   // Collect the geometry.
   node1->cull(trav_state, cull_flags);
   node2->cull(trav_state, cull_flags);
   node3->cull(trav_state, cull_flags);

   // Render the visible geometry.
   trav_state.draw();
}

Note, the above code still performs all normal culling and state sorting. To disable the cull check (if you already know what's in view), pass a zero into the cull() method:
// Collect the geometry.
node1->cull(trav_state, 0);
node2->cull(trav_state, 0);
node3->cull(trav_state, 0);
The sgl library itself does not provide multi-threading capability, nor does it take advantage of multi-threading internally. Instead, sgl has been designed to allow the user to take advantage of multi-threading, and therefore of multi-processor systems, as his or her situation requires.
The simplest way to use sgl is in a single threaded manner. The
default thread will do all the simulation and culling. For example,
while (!done)
{
   simulate(scene);        // Do whatever you need to do.
   scene->preDraw();       // Readjusts internal structures.
   scene->cull(trav, ...); // Cull nodes not in the view frustum.
   trav->draw(...);        // Draws the nodes that passed the cull test.
}
If you are using a single cpu system, and the time needed to simulate
is small, then the above approach is probably the best one to take.
On the other hand, if simulating your scene can take a lot of time,
then it would be better to split that work onto a separate thread.
Even on a single cpu system, gains can be made by separating the simulation
thread. This is due to the fact that on many systems the cpu feeding
the graphics pipe can spend much of its time waiting on the graphics processor(s)
to finish their work. For example,
  sim thread:                             draw thread:

  simulate(scene);
  while (!done)                           while (!done)
  {                                       {
     sync1();       <----------------->      sync1();
     update(changelist, scene);
                                             scene->preDraw();
     sync2();       <----------------->      sync2();
     simulate(scene, changelist);            scene->cull(trav, ...);
                                             trav->draw(...);
  }                                       }
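The sync1() and sync2() calls above are simply rendezvous points where the two threads wait for each other; sgl does not provide them. As a rough illustration, assuming a POSIX threads environment, they could be implemented with barriers (the names and initialization below are illustrative only):

#include <pthread.h>

// Two barriers shared by the sim and draw threads.  Each sync point is a
// plain barrier wait, so neither thread proceeds until both have arrived.
static pthread_barrier_t sync1_barrier;
static pthread_barrier_t sync2_barrier;

void initSyncPoints()
{
   pthread_barrier_init(&sync1_barrier, NULL, 2);   // 2 = sim thread + draw thread
   pthread_barrier_init(&sync2_barrier, NULL, 2);
}

void sync1() { pthread_barrier_wait(&sync1_barrier); }
void sync2() { pthread_barrier_wait(&sync2_barrier); }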
Note, for this to work, the simulate function cannot alter the scene
graph in any way. However, reading from the scene graph is allowed.
Thus, if the scene graph data structures are a natural fit for the data
that you are simulating, the simulate function can read values from the
scene graph and write changes to a change list that will be processed during
the update.
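The change list itself can be very simple. The sketch below shows one hypothetical way to structure it (the ChangeList class is not part of sgl): simulate() records deferred writes while it only reads the scene graph, and the update step replays them between sync1() and sync2(), when the draw thread is not traversing.

#include <vector>
#include <functional>

// Hypothetical change list; not part of sgl.
class ChangeList
{
public:
   // Called from simulate(): defer a write instead of touching the graph now.
   void record(const std::function<void()> &write) { m_writes.push_back(write); }

   // Called from update(), between sync1() and sync2(), when no traversal
   // is running and the scene graph may be modified safely.
   void apply()
   {
      for (size_t i = 0; i < m_writes.size(); ++i)
         m_writes[i]();
      m_writes.clear();
   }

private:
   std::vector<std::function<void()> > m_writes;
};

With this in place, simulate() calls record() with whatever node changes it computed from the values it read, and update() simply calls apply() before the scene is traversed again.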
Alternatively, if the scene graph data structures do not fit your data domain (you're not just moving model articulations around), then it might be better to create your own data structures independent of the scene graph. Preventing the simulate function from reading from or writing to the scene graph allows a further improvement in concurrency:
  sim thread:                             draw thread:

  while (!done)                           while (!done)
  {                                       {
     sync1();       <----------------->      sync1();
     update(data, scene);
     sync2();       <----------------->      sync2();
     simulate(data);                         scene->preDraw();
                                             scene->cull(trav, ...);
                                             trav->draw(...);
  }                                       }
What else can be done? If you are rendering on a system with
multiple graphics pipes (completely separate systems that can be fed different
commands simultaneously), then multiple draw threads could be used as well.
Multiple draw threads may also make sense when you have multiple cpus and
no hardware graphics at all. If the rendering is going to be done
in software, then you might as well throw as many cpus as possible at the
problem. For example,
  thread 1:                               thread 2:

  glXMakeCurrent(window 1);               glXMakeCurrent(window 2);
  while (!done)                           while (!done)
  {                                       {
     sync1();       <----------------->      sync1();
     simulate(scene);
     scene->preDraw();
     sync2();       <----------------->      sync2();
     scene->cull(trav, ...);                 scene->cull(trav, ...);
     trav->draw(...);                        trav->draw(...);
  }                                       }
[Note, this scenario has not yet been tested in sgl, but should
work.]
A really complex program may want to combine separate sim threads with multiple draw threads for maximum concurrency.
Finally, there is one other way to achieve greater parallelism.
Some scene graph libraries allow the cull process to be separated from
the draw process. This can be beneficial when the scene has many
geodes with small bounding volumes that will take a lot of time to cull.
Unfortunately, this would require that the scene graph itself be buffered.
It also adds extra latency. For example,
  sim thread:                cull thread:               draw thread:

  while (!done)              while (!done)              while (!done)
  {                          {                          {
     sync1();                   sync1();                   sync1();
                                                           swapbuffers();
     sync2();                   sync2();                   sync2();
     simulate(frame3);          cull(frame2);              draw(frame1);
  }                          }                          }
Due to the high cost involved, this is not supported with sgl and
will probably not be supported in the foreseeable future. If you
find that your cull is taking too long, you may want to consider reorganizing
your scene graph. Try grouping spatially related geometry together,
so that larger portions of the graph can be culled sooner.
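For example, a flat graph in which every geode hangs directly off the root forces the cull traversal to test every bounding volume, whereas grouping by region lets a single failed test reject a whole subtree. The sketch below uses sglScene nodes as the grouping nodes, since addChild() is the only grouping call shown above; the makeBuilding() helper and the sizes are purely illustrative.

// Hypothetical helper that creates and returns one geometry node.
sglNode *makeBuilding(int block, int i);

const int num_blocks          = 10;   // illustrative sizes
const int buildings_per_block = 20;

sglScene *root = new sglScene;

// Group the buildings by city block so an entire block can be rejected
// with a single bounding volume test during the cull traversal.
for (int block = 0; block < num_blocks; ++block)
{
   sglScene *block_group = new sglScene;   // one subtree per spatial region
   for (int i = 0; i < buildings_per_block; ++i)
   {
      block_group->addChild(makeBuilding(block, i));
   }
   root->addChild(block_group);
}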