Depth + Texture Representation - International Institute of ...
Depth + Texture Representation:
Study and Support
in OpenSceneGraph
(Final Year Project Report)

Aditi Goswami
200201058
Abstract

Image Based Rendering (IBR) holds a lot of promise for navigating through a real world scene without modeling it manually. Different representations have been proposed for IBR in the literature. One technique uses depth maps and texture images captured from a number of viewpoints; this is a rich and viable representation for IBR, but it is computationally intensive. Our GPU based approach gives results in real time with quality output.

We also provide support for our system on a widely used, open source graphics API called OpenSceneGraph (osg). osg represents a complex 3D scene as a hierarchical, object-oriented model called a scene graph; the actual geometry is stored in the leaf nodes of the scene graph. We add our IBR technique (Depth + Texture Representation) to osg as an independent class, which allows the user to develop hybrid scenes: the geometry coming partially from actual 3D models and partially from our IBR technique.

Keywords: Image based modeling and rendering, Splatting, Triangulation, Blending, Graphics Processing Unit, Vertex Shader, Pixel Shader, Scene Graphs
TABLE OF CONTENTS

1. Introduction
   1.1 Image Based Rendering
   1.2 GPU
   1.3 Cg
   1.4 Scene Graphs
   1.5 Related Background
2. Depth + TR System
   2.1 Representation
   2.2 Rendering
       2.2.1 Rendering one D+TR
       2.2.2 Rendering multiple D+TR
   2.3 Blending on CPU
   2.4 GPU Algorithm
   2.5 Two Pass Algorithm
       2.5.1 Pass 1
       2.5.2 Pass 2
   2.6 Blending on GPU
   2.7 Pipeline of Accelerated D+TR
   2.8 Some Results
3. D+TR & OpenSceneGraph
   3.1 OpenSceneGraph
       3.1.1 "What is a Scene Graph?"
       3.1.2 Nodes in OSG
       3.1.3 Structure of Scene Graph
       3.1.4 Windowing System in OSG
       3.1.5 Skeleton OSG Code
       3.1.6 Callbacks
       3.1.7 osgGA::GUIEventHandler
   3.2 D+TR in OSG
       3.2.1 Representation
       3.2.2 Rendering
           3.2.2.1 Rendering in OSG
       3.2.3 Discussion
   3.3 Conclusions and Results
References
Appendix
Chapter 1
Introduction

The potential of Image Based Modeling and Rendering (IBMR) is to produce new views of a real scene with a realism impossible to achieve by other means, which makes it very appealing. IBMR aims to capture an environment using a number of (carefully placed) cameras; any view of the environment can subsequently be generated from these views. To render the new views generated by the D+TR (Depth + Texture Representation) technique, algorithms such as splatting and triangulation are used, and implementing them on the GPU makes the system run in real time.
1.1 Image Based Rendering

The key features of any Image Based Rendering algorithm are constructing and rendering models. Construction of models is achieved using vision techniques such as stereo, 3D co-ordinate calculation and calibration, while rendering is done using graphics algorithms such as splatting, triangulation and interpolation, as well as algorithms implemented on the GPU. Image Based Rendering is thus a unique example where vision and graphics are applied hand in hand.

IBR techniques can be classified into two broad categories, shown in figure 1 below. The D+TR technique discussed in this report falls under 'representation with geometry'. A representation without geometry renders the model, whose location in the world is fixed, using graphics algorithms such as culling, without computing any 3D co-ordinates. A representation with geometry, in contrast, generates a novel view based on the 3D co-ordinates of the scene.
Figure 1: Classification of Image Based Rendering (with geometry vs. without geometry)
Most IBR techniques studied and implemented in the past suffer either in quality or in speed; achieving both together remains a big challenge. The quality of the novel views rendered using the D+TR technique is very high, but it lacks the speed to make it real time. Our attempt is to implement GPU based algorithms to accelerate the current version of D+TR. The concept of generating the novel view image is the same as in the original D+TR, but the algorithms are adapted to suit the limitations and features of the GPU. These algorithms and limitations are discussed in the chapters that follow.

The figure below shows the basic structure of D+TR, along with the parts of the structure where GPU algorithms are implemented.
Figure 2: Basic structure of the D+TR technique of IBR. The depth maps, calibration parameters and the novel view camera feed into the estimation of valid views for the novel view; the valid views are rendered from the novel viewpoint, the novel view depth and texture are stored for each view, the 3D texture is estimated, and the results are blended into the novel view. The GPU is used to implement and accelerate the four major blocks (shown within dotted lines) of the original D+TR.
As shown in the figure above, the rendering, 3D co-ordinate estimation and blending modules are key parts of the D+TR technique of IBR and are also costly; hence the GPU is used to accelerate them. The GPU algorithms for these modules are explained later.
1.2 GPU (Graphics Processing Unit)

The power and flexibility of GPUs make them an attractive platform for general purpose computation. Modern graphics processing units are deeply programmable and support high precision. GPUs have various advantages as well as certain limitations, discussed below. The figure below shows the modern GPU processor pipeline.
GPUs are also termed fast co-processors; their speed has been increasing at a rate often described as "Moore's law cubed", and they process data in parallel. GPUs are used in many applications: simulation of physical processes such as fluid flow, n-body systems and molecular dynamics, real time visualization of complex phenomena, and rendering of complex scenes, among others.

The kernel of the GPU is the vertex/fragment program, which runs on each element of an input stream and generates an output stream. The input stream is a stream of fragments, vertices or texture data; the output stream is a frame buffer, pixel buffer or texture. Hence the GPU is also termed a parallel stream processor.
Figure 3: Modern GPU processor pipeline. The application on the CPU supplies graphics state and 3D vertices; the vertex shader transforms them to 2D vertices, the rasterizer produces fragments, and the pixel shader shades them (with access to texture memory) before the result is rendered to the frame buffer or to a texture.
Vertex Shader: A vertex shader allows custom processing of per-vertex information on the GPU. The position, color and other related data can be modified by the vertex shader. Each vertex is processed in a sequence of steps before the geometry is rasterized, and the same program is executed for every vertex. There is no interdependence between the vertices, hence shaders can run in parallel. A vertex shader cannot delete or add a vertex. Vertex shaders are used for unique graphical effects such as mesh deformation, procedural geometry, blending, texture generation, interpolation, displacement mapping and motion blur, among others.
Pixel Shader: A pixel shader programs the pixel. It allows custom processing of fragment information on the GPU: the color of each fragment can be computed rather than simply being interpolated by the rasterizer. Fragment or pixel shaders are responsible for texturing, shading and blending, and they too run in parallel in the graphics hardware. Pixel shaders have limited or no knowledge of neighboring pixels. The output of a fragment shader can be color or depth.
Limitations of GPU: Besides providing such a powerful approach to computation, GPUs also impose certain constraints on their use. The graphics processor avoids read backs and does not allow arbitrary writes to arbitrary locations; each operation is performed on a chunk of data. Code written for the CPU cannot simply be ported to the GPU; it requires some initialization and binding procedure.
1.3 Cg

Historically, graphics hardware has been programmed at a very low level. Fixed-function pipelines were configured by setting states such as the texture-combining modes. More recently, programmers configured programmable pipelines by using programming interfaces at the assembly language level. In theory, these low-level programming interfaces provided great flexibility. In practice, they were painful to use and presented a serious barrier to the effective use of hardware.
Using a high-level programming language, rather than the low-level languages of the past, provides several advantages. The compiler optimizes code automatically and performs low-level tasks, such as register allocation, that are tedious and prone to error. Shading code written in a high-level language is much easier to read and understand. It also allows new shaders to be easily created by modifying previously written shaders. Shaders written in a high-level language are portable to a wider range of hardware platforms than shaders written in assembly code.
This chapter introduces Cg, a high-level language tailored for programming GPUs. Cg offers all the advantages just described, allowing programmers to finally combine the inherent power of the GPU with a language that makes GPU programming easy. Cg is very similar to C, which is why it is called C for graphics (Cg). It is a high-level shading language that is easy to use with OpenGL. It has a powerful swizzle operator and built-in vector and matrix types; it supports basic types, structures, arrays and type conversion as in C, and also supports a large number of mathematical, derivative and geometric functions.
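As a small, hypothetical illustration of these features (this is not the project's shader from the Appendix; the sampler name and constants are our own), a Cg fragment program using swizzles and built-in vector operations might look like:

```
// Hypothetical Cg fragment program (illustrative only).
float4 main(float2 uv : TEXCOORD0,
            uniform sampler2D tex) : COLOR
{
    float4 t = tex2D(tex, uv);                        // built-in texture lookup
    float3 swapped = t.bgr;                           // swizzle: reorder r/g/b in one step
    float  luma = dot(t.rgb, float3(0.299, 0.587, 0.114));
    return float4(lerp(swapped, luma.xxx, 0.5), t.a); // built-in lerp, scalar smear
}
```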
The Cg language allows you to write programs for both the vertex processor and the fragment processor. We refer to these programs as vertex programs and fragment programs, respectively. (Fragment programs are also known as pixel programs or pixel shaders, and we use these terms interchangeably in this document.) Cg code can be compiled into GPU assembly code, either on demand at run time or beforehand. Cg makes it easy to combine a Cg fragment program with a handwritten vertex program, or even with the non-programmable OpenGL or DirectX vertex pipeline. Likewise, a Cg vertex program can be combined with a handwritten fragment program, or with the non-programmable OpenGL or DirectX fragment pipeline.

Figure 4: Pipeline of Cg code and compiler
1.4 Scene Graph

A scene graph is a data structure which captures the logical and spatial representation of a graphical scene. A scene graph consists of various kinds of nodes, each representing a particular type of information such as transformations or lighting.

Open Inventor, OpenGL Performer, OpenSceneGraph and Java 3D are some of the common APIs which use a scene graph. The key reasons that many graphics developers use scene graphs are performance, productivity, portability and scalability:
1) Performance

Scene graphs provide an excellent framework for maximizing graphics performance. A good scene graph employs two key techniques: culling of the objects that won't be seen on screen, and state sorting of properties such as textures and materials, so that all similar objects are drawn together. The hierarchical structure of the scene graph makes this culling process very efficient; for instance, a whole city can be culled with just a few operations.
2) Productivity

Scene graphs take away much of the hard work required to develop high performance graphics applications. Furthermore, one of the most powerful concepts in object oriented programming is object composition, enshrined in the Composite design pattern, which fits the scene graph tree structure perfectly and makes it a highly flexible and reusable design.

Scene graphs also often come with additional utility libraries, ranging from helping users set up and manage graphics windows to importing 3D models and images. A dozen lines of code can be enough to load our data and create an interactive viewer.
3) Portability

Scene graphs encapsulate much of the lower level work of rendering graphics and reading and writing data, reducing or even eradicating the platform specific coding required in your own application. If the underlying scene graph is portable, then moving from platform to platform can be as simple as recompiling your source code.
4) Scalability

Along with dynamically managing the complexity of scenes to account for differences in graphics performance across a range of machines, scene graphs also make it much easier to manage complex hardware configurations, such as clusters of graphics machines or multiprocessor/multipipe systems. A good scene graph allows the developer to concentrate on their own application while the rendering framework handles the different underlying hardware configurations.
1.5 Related Background

Image Based Rendering (IBR) has the potential to produce new views of a real scene with a realism impossible to achieve by other means. It aims to capture an environment using a number of cameras that recover the geometric and photometric structure of the scene. The scene can thereafter be rendered from any viewpoint using the internal representations. These representations fall into two broad categories: those without any geometric model and those with a geometric model of some kind. Early IBR efforts produced new views of scenes given two or more images of them [5, 19]; point-to-point correspondences contained all the structural information about the scene used by such methods. Many later techniques also used only the images for novel view generation [15, 11, 8, 20]. They require a large number of input views -- often running into thousands -- for modeling a scene satisfactorily, which makes them practically unusable other than for static scenes. The representation was also bulky and needed sophisticated compression schemes.

The availability of even approximate geometry can reduce the requirements on the number of views drastically. The use of approximate geometry for view generation was a significant contribution of Lumigraph rendering [8] and of view-dependent texture mapping [6]. The Unstructured Lumigraph [4] extends this idea to rendering using an unstructured collection of views and approximate models.
The Depth Image (DI) representation is suitable for IBR as it can be computed from the real world using cameras and can be used for new view generation. A Depth Image consists of a pair of aligned maps: the image or texture map I, which gives the colour of all visible points, and a depth map D, which gives the distance to each visible point. In practice the image and depth are computed with respect to a real camera, though this does not have to be the case. The calibration matrix C of the camera is also included in the representation, giving the triplet (D, I, C). This is a popular representation for image-based modeling, as cameras are cheap and methods like shape-from-X are mature enough to capture dense depth information. It has been used in different contexts [14, 16, 23, 13, 24]. The Virtualized Reality system captured dynamic scenes and modeled them for subsequent rendering using a studio with a few dozen cameras [16]. Many similar systems have been built in recent years for modeling, immersion, videoconferencing, etc. [22, 3]. Recently, a layered representation with full geometry recovery for modeling and rendering dynamic scenes has been reported by Zitnick et al [23]. Special scanners such as those made by CyberWare have also been used to capture such representations of objects and cultural assets, as in the Digital Michelangelo project [2, 1].
Depth Images have been used for IBR in the past. McMillan used them for warping [14], and Mark used an on-the-fly Depth Image for fast rendering of subsequent frames [13]. The Virtualized Reality project computed them using multibaseline stereo and used them to render new views using warping and hole-filling [16]. Zitnick et al [24] use them in a similar way with an additional blending step to smooth discontinuities. Waschbusch et al [24] extended this representation to sparsely placed cameras and presented probabilistic rendering with a view-independent point-based representation of the depth information.

The general framework of rendering Depth Images with blending was presented in [17]. [25] provides a GPU based algorithm for real-time rendering of a representation consisting of multiple Depth Images; it presents a study of the locality properties of Depth Image based rendering, with results on representative synthetic data sets demonstrating the utility of the representation and the effectiveness of the algorithms.
Chapter 2
Depth + TR System: Representation and Rendering

2.1 Representation

The basic representation consists of an image, its depth map and the calibration parameters. The depth map is a two dimensional array of real values, with location (i,j) storing the depth distance to the point that projects to pixel (i,j) in the image. Figure 5 below shows the depth maps and images for a synthetic and a real scene; closer points are shown brighter.
Figure 5: (a) Synthetic scene; (b) Real scene
Depth and texture are stored as images on disk. The depth map contains real values whose range depends upon the resolution of the structure recovery method; images with 16 bits per pixel can store depths up to about 65 meters at millimeter resolution. Further sections explain how the maps are constructed and why we use Depth + Texture for IBR.
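As an illustration of this storage scheme (a sketch; the millimeter step and the function names are our assumptions, not the project's actual encoding), metric depths can be quantized into a 16-bit image and recovered as follows:

```python
import numpy as np

def depth_to_uint16(depth_m, step_m=0.001):
    """Quantize metric depths to 16-bit values (1 mm steps by default).

    With a 1 mm step, a 16-bit image covers 65535 * 0.001 m, about 65 meters.
    """
    q = np.round(np.asarray(depth_m, dtype=np.float64) / step_m)
    return np.clip(q, 0, 65535).astype(np.uint16)

def uint16_to_depth(depth_q, step_m=0.001):
    """Recover metric depth from the quantized 16-bit map."""
    return depth_q.astype(np.float64) * step_m
```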
Construction of D+TR: The D+TR can be created using any suitable 3D structure recovery method, including stereo, range sensors, shape-from-shading, etc. Multicamera stereo remains the most viable option as cameras are inexpensive and nonintrusive. Depth and texture need to be captured only from a few points of view, since geometry can be interpolated. A calibrated, instrumented setup consisting of a dozen or so cameras can capture static or dynamic events as they happen, and a depth map can be computed for each camera using the other cameras in its neighborhood and a suitable stereo program.
There are various reasons for selecting depth and texture for IBR. Rendering a scene using depth maps is an active research area; a depth map gives a visibility-limited model of the scene that can be rendered easily using graphics algorithms, while texture mapping ensures photorealism.
2.2 Rendering

The rendering aspect of the D+TR system is detailed in [17]. In this section we describe the rendering approaches and related issues: first rendering using one depth map, and then using multiple depth maps.
2.2.1 Rendering one D+TR

For rendering one D+TR, two approaches have been implemented and tested.

Splatting: The point cloud can be splatted as point features. Splatting techniques broaden the individual 3D points to fill the space between points; the color of each splatted point is obtained from the corresponding image pixel. Splatting has been used as the method for fast rendering, as point features are quick to render. The disadvantage of splatting is that holes can show up where data is missing if we zoom in too much. Figure 6(a) shows one D+TR rendered using splatting.
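The splatting step can be sketched on the CPU as follows (a sketch; the pinhole camera model, the square splats and all names here are our assumptions, not the project's code): each 3D point is projected into the novel view, widened to a small square of pixels, and resolved against a z-buffer.

```python
import numpy as np

def splat(points, colors, K, R, t, width, height, radius=1):
    """Render a colored 3D point cloud by splatting.

    points: (N, 3) world points; colors: (N, 3); K, R, t: novel view camera.
    Each projected point is widened to a (2*radius+1)^2 square; a z-buffer
    keeps the nearest point per pixel. Returns (color image, depth image).
    """
    color_img = np.zeros((height, width, 3))
    depth_img = np.full((height, width), np.inf)
    cam = (R @ points.T).T + t                      # world -> camera frame
    for p, c in zip(cam, colors):
        if p[2] <= 0:                               # behind the camera
            continue
        uvw = K @ p
        u, v = int(round(uvw[0] / uvw[2])), int(round(uvw[1] / uvw[2]))
        for dv in range(-radius, radius + 1):       # broaden the point feature
            for du in range(-radius, radius + 1):
                x, y = u + du, v + dv
                if 0 <= x < width and 0 <= y < height and p[2] < depth_img[y, x]:
                    depth_img[y, x] = p[2]          # z-buffer test
                    color_img[y, x] = c
    return color_img, depth_img
```

Holes appear exactly where no splat covers a pixel (the depth stays infinite), which is why a second D+TR is needed to fill them.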
Implied Triangulation: Every 2x2 section of the depth map is converted to two triangles by drawing the diagonal. Depth discontinuities are handled by breaking all edges with a large difference in the z-coordinate between their end points and removing the corresponding triangles from the model. Triangulation results in the interpolation of the interior points of the triangles, filling holes created due to the lack of resolution. The interpolation can produce low quality images if there is a considerable gap between the resolutions of the captured and rendered views, such as when zooming in; this is a fundamental problem in image based rendering. Figure 6(b) shows the D+TR rendered using triangulation.
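The implied triangulation can be sketched as follows (a sketch; the discontinuity threshold and the per-triangle test are our assumptions about how the edge breaking is realized): each 2x2 cell of the depth map contributes two triangles, and any triangle spanning a large jump in z is dropped.

```python
import numpy as np

def implied_triangulation(depth, max_dz=0.5):
    """Triangulate a depth map; returns triangles as index triples into the
    flattened (row-major) pixel grid.

    Each 2x2 cell is split along its diagonal into two triangles; a triangle
    is removed if any of its edges spans a depth jump larger than max_dz.
    """
    h, w = depth.shape
    flat = lambda i, j: i * w + j
    triangles = []
    for i in range(h - 1):
        for j in range(w - 1):
            tl, tr = (i, j), (i, j + 1)
            bl, br = (i + 1, j), (i + 1, j + 1)
            for tri in ((tl, tr, bl), (tr, br, bl)):    # split along the diagonal
                zs = [depth[p] for p in tri]
                if max(zs) - min(zs) <= max_dz:         # drop discontinuity spans
                    triangles.append(tuple(flat(*p) for p in tri))
    return triangles
```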
Figure 6: (a) D+TR rendered using splatting; (b) D+TR rendered using triangulation
2.2.2 Rendering Multiple D+TR

The view generated using one D+TR will have holes or gaps corresponding to parts of the occluded scene exposed in the new view position. These holes can be filled using another D+TR that sees those regions. Parts of the scene can be visible to multiple cameras, and the views generated by multiple D+TRs have to be blended in such cases. Hole filling and blending of the views to generate the novel view are explained in the next section. Figure 7 shows the result of rendering multiple D+TRs.
Figure 7: Multiple D+TRs rendered using (a) splatting and (b) triangulation
2.3 Blending on CPU

The views used for novel view generation are blended to provide high quality output. Each valid input view is blended based on its depth and texture.

Blend Function: The angular distance between an input view's camera center and the novel view camera center is used as the criterion for blending. This angular distance is used to calculate the weight given to each view for each pixel; cosine weights are used in the D+TR technique. Figure 8 below is used to explain the blending process.
Figure 8: Blending is based on angular distance

As shown in the figure, t_1 and t_2 are the angular distances of the views c_1 and c_2 from D (the novel view). The corresponding pixel weights for the two views are given by:

w_c1 = cos^n(t_1)  (for view c_1)
w_c2 = cos^n(t_2)  (for view c_2)

n > 2 gives suitable values for blending.
Pixel weights can also be calculated using an exponential function. Exponential blending computes the weights as w_i = e^(-c*t_i), where i is the view index, t_i is the angular distance of view i, and w_i is the weight for that view at the pixel. The constant c controls the fall off as the angular distance increases.
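Both weighting schemes can be written in a few lines (a sketch; measuring the angle at the 3D point between the directions to the two camera centers, and the sign of the exponent, are our reading of the text):

```python
import math

def angular_distance(point, novel_center, view_center):
    """Angle at the 3D point between the directions to the two camera centers."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    a = unit([c - p for c, p in zip(novel_center, point)])
    b = unit([c - p for c, p in zip(view_center, point)])
    d = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.acos(d)

def cosine_weight(t, n=4):
    """w = cos^n(t); n > 2 gives a suitable fall-off (clamped past 90 degrees)."""
    return max(0.0, math.cos(t)) ** n

def exponential_weight(t, c=5.0):
    """w = e^(-c*t); the constant c controls the fall-off with angular distance."""
    return math.exp(-c * t)
```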
Pixel Blending: The blend function should result in smooth changes in the generated view as the viewpoint changes. Thus, views that are close to the new view should be emphasized and views that are far from it should be deemphasized. In addition, z-buffering is used so that the proper pixels are used to render the novel view. An example is explained below using figure 9.

Figure 9: Process of pixel blending when one pixel is seen by multiple views

As shown in the figure above, views 1, 2, 3 and 4 are used in generating the novel view. Consider a ray from the novel view that corresponds to both point A and point B in the scene. D+TR takes care of all the complexities when rendering this pixel in the novel view, as explained below.

Views 1 and 2 are used for point A although views 3 and 4 are also valid views for rendering. This is simply because point A as seen by views 3 and 4 lies behind point B as seen by views 1 and 2 along the same ray from the novel view. A red line shows that a view is discarded, and a green line indicates that the view is considered for blending.

View 3 sees both points A and B along the ray from the novel view, but it is the pixel in view 3 corresponding to point B that is used in rendering the novel view pixel.
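The per-pixel selection and blending described above can be sketched as follows (a sketch; the epsilon tolerance and the data layout are our assumptions): among all valid views projecting onto a novel view pixel, only those whose depth along the ray is within a small tolerance of the nearest depth survive the z-test, and they are blended with normalized weights.

```python
def blend_pixel(candidates, eps=0.05):
    """Blend one novel-view pixel.

    candidates: list of (depth_along_novel_ray, weight, (r, g, b)) tuples,
    one per valid input view. Contributions farther than the nearest depth
    by more than eps are discarded (occluded, like views 3 and 4 for point A
    in figure 9); the rest are blended with normalized weights.
    """
    if not candidates:
        return None                              # hole: no view sees this pixel
    z_min = min(d for d, _, _ in candidates)     # nearest surface along the ray
    kept = [(w, c) for d, w, c in candidates if d - z_min <= eps]
    w_sum = sum(w for w, _ in kept)
    return tuple(sum(w * c[k] for w, c in kept) / w_sum for k in range(3))
```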
2.4 GPU Algorithm

We devised a 2-pass algorithm to render multiple DIs with per-pixel blending. The first pass determines, for each pixel, which views need to be blended; the second pass actually blends them. The property of each pixel blending a different set of DIs is maintained by the new algorithm. The pseudo code of the algorithm, as presented in [25], is given below:
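A sketch of the algorithm, reconstructed from the description of the two passes in section 2.5 (our reconstruction, not the original listing of [25]):

```
// Pass 1: build the shifted novel-view depth buffer (vertex shader)
for each valid input view V:
    for each pixel of V:
        back-project to a 3D point using V's depth map and calibration
        project the point into the novel view; let D be its depth
        write depth (D - Delta), keeping the nearest value

// Pass 2: per-pixel blending (pixel shader)
initialize accumulated color C_d and alpha a_d to 0 for every pixel
for each valid input view V:
    for each fragment of V that lies within Delta of the stored depth:
        a_s = blend weight of V at this pixel (cosine/exponential)
        C_d = (a_s * C_s + a_d * C_d) / (a_s + a_d)
        a_d = a_s + a_d
the accumulated C_d after the last view is the novel view color
```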
2.5 Two Pass Algorithm

The D+TR technique of IBR has a blending module to provide high quality novel views. The read back of the frame buffer is the time consuming operation in the above algorithm. Modern GPUs have a lot of computational power and memory; if the read back is avoided and the blending is done on the GPU, the frame rate can reach interactive rates. The much needed speed, with the original quality, is achieved with the 2 pass algorithm: pass 1 deals with the vertex shader, and pass 2 deals with the pixel shader, which implements the blending process.
2.5.1 Pass One

This pass generates the novel view depth. Each pixel is rendered with a shift of 'Delta' in the depth value: if 'D' is the depth for a pixel, it is rendered at 'D - Delta'. In pass 2, all pixels within the range of Delta are blended and the others are rejected by the depth test. Pass 1 can be explained with the block diagram below.
Figure 10: Process of pass 1. The calibration parameters and the depth map are input to the vertex shader; the 3D co-ordinates (x, y, z) are computed, z is shifted to z - Delta, and the result is rendered to give the novel view depth.
As shown in the block diagram, the calibration parameters and the depth values are passed to the vertex shader, and the rendered surface lies a little behind the actual surface. Each valid input view for the current novel view position is rendered one by one to get the novel view depth.
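The per-vertex work of pass 1 can be sketched on the CPU as follows (a sketch; the pinhole model and all names are our assumptions): a depth pixel is back-projected to 3D with the input view's calibration, transformed into the novel view, and its depth shifted by Delta.

```python
import numpy as np

def pass1_vertex(u, v, d, K_in, R_in, t_in, K_out, R_out, t_out, delta=0.05):
    """Back-project pixel (u, v) at depth d from an input view, then project
    it into the novel view with its depth pulled by Delta (the pass 1 shift).

    Cameras follow x_cam = R x_world + t. Returns (u', v', shifted depth).
    """
    ray = np.linalg.inv(K_in) @ np.array([u, v, 1.0])
    p_cam = ray * d                                   # point in input camera frame
    p_world = R_in.T @ (p_cam - t_in)                 # input camera -> world
    p_novel = R_out @ p_world + t_out                 # world -> novel camera
    uvw = K_out @ p_novel
    return uvw[0] / uvw[2], uvw[1] / uvw[2], p_novel[2] - delta
```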
2.5.2 Pass Two

Blending is done in the pixel shader; the process of blending is explained in the next section. The pixel shader takes the camera centers of the novel view and the current view, along with the textures and the 3D location of the current pixel, to implement this pass. The output for each view from the pixel shader is sent with the next view back to the pixel shader so that the blending is performed in a normalized way, which also increases the quality of the output. Pass 2 is also explained using the block diagram in figure 11. The pixel shader code, along with the vertex shader code, is given in the Appendix at the end of this report.
Figure 11: Implementation of pass 2. Each valid input view's texture is sent one by one to the pixel shader; the shader's output is copied back to the original texture, and for the last input view the output is sent to the frame buffer.
Pass 2 takes more time than pass 1 because the blending process is implemented in this pass and the per-pixel weight calculation is done in the pixel shader. The number of parameters passed to the pixel shader is also larger, which further increases the execution time: the pixel shader takes two textures and two camera centers (the novel view and input view cameras) along with the 3D location of the pixel as input, whereas the vertex shader takes just the 3D location and the modelview and projection matrices.
2.6 Blending on GPU

The concept of blending is the same on the GPU as on the CPU, but here the normalization is done with each incoming input view, whereas on the CPU the normalized weights of all the views are available before the blending process starts.
Blend Function: The blend function used is

    C_f = (a_s C_s + a_d C_d) / a_f
    a_f = a_s + a_d

where C_s is the color of the source pixel (incoming view) and C_d is the color of the destination pixel (the views rendered before this one); a_s and a_d are the corresponding alpha values. With each incoming view, the alpha value of the rendered view is normalized.

C_f and a_f are the color and alpha values that get rendered. If the input view is not the last view, they come back as C_d and a_d together with the next valid input view; if the current input view is the last one, the rendered C_f and a_f are the color and alpha of this pixel location in the finally rendered novel view. The source and destination alpha values are added so that the division in the next pass with another input view keeps the values normalized.
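This incremental blend can be sketched in plain C++ (the actual implementation is a Cg pixel shader; `Pixel`, `blendStep` and `blendViews` are illustrative names). Feeding the views through one at a time yields exactly the normalized weighted average, so colors never exceed their range:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One pixel's contribution from one view: its color and its weight (alpha).
struct Pixel { double color; double alpha; };

// One blending step: src is the incoming view's contribution, dst holds the
// accumulated result of the views rendered before it (C_d, a_d).
//   C_f = (a_s*C_s + a_d*C_d) / a_f,  with  a_f = a_s + a_d
Pixel blendStep(const Pixel& src, const Pixel& dst) {
    double af = src.alpha + dst.alpha;
    double cf = (src.alpha * src.color + dst.alpha * dst.color) / af;
    return Pixel{cf, af};
}

// Feed the valid views through one by one, as the pixel shader does per pixel.
Pixel blendViews(const std::vector<Pixel>& views) {
    Pixel acc = views[0];
    for (std::size_t i = 1; i < views.size(); ++i)
        acc = blendStep(views[i], acc);
    return acc;
}
```

After the last view, `acc.color` equals the sum of weighted colors divided by the sum of weights, which is why no post-processing division is needed.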
The alpha value of the source pixel, which becomes its weight, is calculated in the pixel shader. The block diagram below explains how this alpha value is computed.
FIGURE 12: Process of calculation of the alpha value a_s of the source pixel in the pixel shader, from the center of the novel view, the center of the current view and the 3D location of the pixel.
Every pixel of the neighboring valid views used for novel view generation takes part in the blending, so a weight is calculated for each of them. The accumulated alpha value is normalized with each new view arriving at the pixel shader for blending, by dividing by the total accumulated alpha value.
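The weight computation of Figure 12 might be sketched as follows in plain C++. The exact weighting function is an assumption here (a clamped cosine of the angle subtended at the pixel's 3D location by the two camera centers); the names are illustrative:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}
static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static double len(const Vec3& a) { return std::sqrt(dot(a, a)); }

// Alpha of the source pixel: the angle between the rays from the pixel's 3D
// location to the novel and current camera centers is computed, and its
// cosine (clamped at zero) is used as the weight, so a view looking along
// the novel ray gets the largest alpha.
double sourceAlpha(const Vec3& novelCenter, const Vec3& viewCenter,
                   const Vec3& point3D) {
    Vec3 a = sub(novelCenter, point3D);
    Vec3 b = sub(viewCenter, point3D);
    double cosAngle = dot(a, b) / (len(a) * len(b));
    return cosAngle > 0.0 ? cosAngle : 0.0;
}
```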
2.7 Pipeline of Accelerated D+TR

This section discusses the entire pipeline of the Accelerated D+TR. From figure 13 we can see that the inputs to the system are the depth maps, the calibration parameters and the input view textures. The depth map is used in the pre-processing step and in pass 1; the calibration parameters in the pre-processing step and in pass 2; and the input view texture only in pass 2, where the blending is performed.
FIGURE 13: Pipeline of accelerated D+TR

The figure above shows that the pre-processing step generates the 3D locations of the pixels, pass 1 shifts the depth to the novel view for each valid input view, and pass 2 blends the pixels based on the rendered depth and the valid input view textures.
The input to the system is the calibration parameters, depth maps and texture images from a certain number of views. Based on the novel view camera, the valid views surrounding it are first estimated; these valid views are then used to generate the novel view. To determine the validity of a view, the average of the 3D coordinates of the scene is taken (the origin (0, 0, 0) in our case), and validity is decided from the angle subtended at this average center point by the two camera centers. Views beyond 90 degrees are not considered, as they see the other part of the scene. When fewer views are used, the closest views are preferred.
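The validity test above can be sketched like this (plain C++, illustrative names; since both centers are taken relative to the scene center at the origin, the 90-degree test reduces to the sign of a dot product):

```cpp
#include <cassert>
#include <vector>

struct Cam { double x, y, z; };

// A view is valid when the angle subtended at the average scene center
// (the origin here) between its camera center and the novel camera center
// is below 90 degrees, i.e. when the cosine of that angle is positive.
std::vector<int> validViews(const std::vector<Cam>& cams, const Cam& novel) {
    std::vector<int> valid;
    for (std::size_t i = 0; i < cams.size(); ++i) {
        // sign of the dot product = sign of cos(angle at the origin)
        double d = cams[i].x * novel.x + cams[i].y * novel.y
                 + cams[i].z * novel.z;
        if (d > 0.0)
            valid.push_back(static_cast<int>(i));
    }
    return valid;
}
```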
Once the valid views are known, the 3D location of each valid view is generated from its calibration parameters and depth map and passed as a texture to the vertex shader for the generation of the novel view depth. The vertex shader simply shifts the z coordinate a little backwards, so that pixels falling within the shifted range can be blended in the second pass. In some approaches the vertex shader also performs the 3D coordinate estimation, but this is not the optimal solution. After generating the shifted coordinates, pass 1 passes this information on to pass 2.
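The pre-processing step that turns a depth map entry into a 3D location might look like this; a simple pinhole model is assumed here (focal length f, principal point (cx, cy)), whereas the real system uses the full calibration parameters of each input camera:

```cpp
#include <cassert>
#include <cmath>

struct Point3 { double x, y, z; };

// Back-project pixel (i, j) with stored depth into camera coordinates under
// a pinhole model: the offset from the principal point is scaled by depth/f,
// and z is the stored depth itself.
Point3 backProject(double i, double j, double depth,
                   double f, double cx, double cy) {
    return Point3{(i - cx) * depth / f,
                  (j - cy) * depth / f,
                  depth};
}
```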
Pass 2 performs on-the-fly blending, using a pixel shader that runs on the GPU. For each pixel, the shader has access to the novel view and DI parameters and to the results of the previous rendering through a Frame Buffer Object (FBO). Depending on which DIs have values near the minimum z for each pixel, a different combination of DIs can be blended at each pixel. The color and alpha values are always kept correct, so no post-processing step depending on the number of blended DIs is needed. The algorithm also ensures that the maximum range of color values is never exceeded, which could happen if the summing were done in the loop followed by a single division at the end.
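The per-pixel DI selection described above can be sketched as follows (plain C++, not the actual shader; the threshold value is illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Of the z-values that the candidate DIs project onto one pixel, only those
// within a small threshold of the minimum z take part in the blend; the rest
// belong to occluded surfaces and are discarded.
std::vector<int> disToBlend(const std::vector<double>& z, double threshold) {
    double zMin = *std::min_element(z.begin(), z.end());
    std::vector<int> keep;
    for (std::size_t i = 0; i < z.size(); ++i)
        if (z[i] - zMin <= threshold)
            keep.push_back(static_cast<int>(i));
    return keep;
}
```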
2.8 Results

The FPS of the system was measured while varying the resolution and the number of input DIs; the tables below summarize the results. The experiment was conducted on an AMD64 processor with 1 GB RAM and an nVidia 6600GT graphics card with 128 MB RAM.
Number of Input Views = 18

    Number of views    FPS (Resolution=2)    FPS (Resolution=1)
    2 to 3             75                    21
    3 to 5             46                    13
    4 to 6             38                    10.5
    7 to 8             21.3                  6
    8 to 9             20                    5
    10 to 12           14                    3.2
    13 to 14           11                    2.7

Number of Input Views = 9

    Number of views    FPS (Resolution=2)    FPS (Resolution=1)
    2 to 3             210                   78
    3 to 5             130                   46.5
    4 to 6             109                   40
    7 to 8             40                    14
    8 to 9             38                    13.3
    9                  32                    13
As is clear from the tables above, our system is capable of producing novel views in real time.
Chapter 3

D+TR & OpenSceneGraph

3.1 OpenSceneGraph

OpenSceneGraph, commonly known as OSG, is an open source, cross-platform graphics toolkit for the development of high-performance graphics applications such as flight simulators, games, virtual reality and scientific visualization. OSG is based around the concept of a scene graph: it provides an object-oriented framework on top of OpenGL that frees the developer from implementing and optimizing low-level graphics calls, and it provides many additional utilities for the rapid development of graphics applications.
It is a 3D graphics library for C++ programmers. A scene graph library lets us represent the objects in a scene with a graph data structure, so that related objects sharing some properties can be grouped together and common properties specified for the whole group in one place. OSG can then automatically manage things like level of detail, culling and bounding shapes needed to draw the scene faithfully but without unnecessary detail (which would slow down the graphics hardware drawing the scene).
The OpenSceneGraph project was started in 1998 by Don Burns as a means of porting a hang-gliding simulator written on top of the Performer scene graph. The source code was open sourced in 1999, and the porting of the scene graph element to Windows was carried on by Robert Osfield. The project was made scalable in 2003, and the OpenSceneGraph 1.0 release, the culmination of six years of work by the lead developers and the open source community that has grown up around the project, happened in 2006. The OSG we know now is a cross-platform, scalable, real-time, open source scene graph with over 1000 active developers worldwide and users such as NASA, the European Space Agency, Boeing, Magic Earth, the US Army and many others. Besides enabling the rapid development of custom visualization programs, OSG is also the power behind projects such as osgVolume and Present3D.
Unfortunately, there are currently no real reference manuals or programmer's guides for OpenSceneGraph; the recommendation on the OSG web site is to "Use the Source". Having only the source assumes you can readily understand how it all works, or deduce it from the code; this is not true for many who are new to OSG or to the simulation world, as can be seen from many of the questions on the mailing lists. While OSG has documents generated from the headers and source, much of the material found there lacks context and can thus be difficult to assimilate; a good programmer's guide and reference manual would give the required context.
3.1.1 "What is a scene graph?"

As the name suggests, a scene graph is a data structure used to organize a scene in a computer graphics application. The basic idea is that a scene is usually decomposed into several different parts, and these parts somehow have to be tied together; a scene graph is a graph in which every node represents one of the parts into which a scene can be divided. More strictly, a scene graph is a directed acyclic graph, so it establishes a hierarchical relationship among the nodes.
In this section, we describe a simple scene graph and introduce some basic OSG node types. Suppose we want to render a scene consisting of a road and a truck. A scene graph representing this scene is depicted in Figure 14.

Figure 14: A scene graph, consisting of a road and a truck.

If we render this scene just like this, the truck will not appear at the place we want; we have to translate it to its right position. Fortunately, scene graph nodes do not always represent geometry. In this case, we can add a node representing a translation, yielding the scene graph shown in Figure 15.
Figure 15: A scene graph, consisting of a road and a translated truck.

Let's add two boxes to the scene, one on the truck and the other on the road. Both boxes will have translation nodes above them, so that they can be placed at their proper locations. Furthermore, the box on the truck will also be translated by the truck translation, so that if we move the truck, the box moves too. Since both boxes look exactly the same, we don't have to create a node for each one of them: one node "referenced" twice does the trick, as Figure 16 illustrates. During rendering, the "Box" node will be visited (and rendered) twice, but some memory is spared because the model is loaded just once. This is one of the reasons a scene graph is a "graph" and not a "tree".

Figure 16: A scene graph, consisting of a road, a truck and a pair of boxes.
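The effect of sharing can be illustrated with a minimal, OSG-independent sketch (the `Node` struct and `traverse` function are illustrative, not OSG classes): the single shared node below has two parents, so a traversal visits it twice while only one copy of it exists in memory.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A bare-bones scene graph node: a list of children plus a visit counter.
struct Node {
    std::vector<std::shared_ptr<Node>> children;
    int visits = 0;
};

// Depth-first traversal, analogous to a render traversal of the scene graph.
void traverse(const std::shared_ptr<Node>& n) {
    ++n->visits;
    for (const auto& c : n->children)
        traverse(c);
}
```

Building the graph of Figure 16 with one `box` node attached under two translation nodes, a traversal from the root increments `box->visits` to 2 even though only one `Node` object was ever allocated.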
Up to this point, the discussion has been about "generic" scene graphs. From now on, we will use exclusively OSG scene graphs; that is, instead of a generic "Translation" node, we will use an instance of a real class defined in the OSG hierarchy.

A node in OSG is represented by the osg::Node class. Renderable things are represented by instances of the osg::Drawable class. But osg::Drawables are not nodes, so we cannot attach them directly to a scene graph; it is necessary to use a "geometry node", osg::Geode, instead. Not every node in an OSG scene graph can have other nodes attached to it as children: in fact, we can only add children to nodes that are instances of osg::Group or one of its subclasses.
Using osg::Geodes and an osg::Group, it is possible to recreate the scene graph from Figure 14 using real OSG classes. The result is shown in Figure 17.

Figure 17: An OSG scene graph, consisting of a road and a truck. Instances of OSG classes derived from osg::Node are drawn as rounded boxes with the class name inside; osg::Drawables are represented as rectangles.

That is not the only way to translate the scene graph from Figure 14 into a real OSG scene graph. More than one osg::Drawable can be attached to a single osg::Geode, so the scene graph depicted in Figure 18 is also an OSG version of Figure 14.

Figure 18: An alternative OSG scene graph representing the same scene as the one in Figure 17.

The scene graphs of Figures 17 and 18 have the same problem as the one in Figure 14: the truck will probably be at the wrong position. The solution is the same as before: translating the truck. In OSG, probably the simplest way to translate a node is to add an osg::PositionAttitudeTransform node above it. An osg::PositionAttitudeTransform has associated with it not only a translation but also an attitude and a scale. Although not exactly the same thing, it can be thought of as the OSG equivalent of the OpenGL calls glTranslate(), glRotate() and glScale(). Figure 19 is the OSGfied version of Figure 15.

Figure 19: An OSG scene graph, consisting of a road and a translated truck. For compactness, osg::PositionAttitudeTransform is written as osg::PAT.
3.1.2 Nodes in OSG

As stated in the previous section, OSG comprises various kinds of nodes for representing specific information of a complex 3D scene. Notable among these are:

1) osg::Node – The base class of all internal node classes. It is rarely used in a scene graph directly, but it has important members such as the bounding sphere, the parent list and the NodeCallbacks (for the update, event and cull traversals; more on callbacks later).
2) osg::Group – An example of the Composite design pattern, this class is derived from osg::Node. It provides functionality for adding child nodes and maintains a list of children.

3) osg::Transform – A group node whose children are all transformed by a 4x4 matrix. It is often used for positioning objects within a scene, for trackball functionality or for animation. Transform itself does not provide set/get functions, only the interface for defining what the 4x4 transformation is. Subclasses such as MatrixTransform and PositionAttitudeTransform support the use of an osg::Matrix or an osg::Vec3/osg::Quat respectively:

osg::MatrixTransform – uses a 4x4 matrix for the transform.
osg::PositionAttitudeTransform – uses a Vec3 position, a Quat rotation for the attitude, and a Vec3 for a pivot point.
4) osg::Geode – A Geode is a "geometry node", that is, a leaf node of the scene graph that can have "renderable things" attached to it. In OSG, renderable things are represented by objects of the Drawable class, so a Geode is a Node whose purpose is grouping Drawables; it maintains a list of Drawables.

5) osg::Drawable – A pure virtual class (with six concrete derived classes) which provides all the important draw*() methods. In OSG, everything that can be rendered is implemented as a class derived from Drawable. The Drawable class contains no drawing primitives, since these are provided by subclasses such as osg::Geometry. Note also that a Drawable is not a Node, and therefore cannot be added directly to a scene graph; instead, Drawables are attached to Geodes, which are scene graph nodes.

This class contains a StateSet and a list of parents, along with cull and draw callbacks. The OpenGL state that must be used when rendering a Drawable is represented by a StateSet. StateSets can be shared between Drawables, which is a good way to improve performance, since it allows OSG to reduce the number of expensive changes to the OpenGL state. Like StateSets, Drawables can also be shared between different Geodes, so that the same geometry (loaded into memory just once) can be used in different parts of the scene graph.
FIGURE 20: Inheritance diagram for the osg::Drawable class.

The major classes derived from this base class are:

osg::Geometry – This class adds real geometry to the scene graph. It can have vertices (and vertex data) associated with it directly, or any number of 'PrimitiveSet' instances. Vertex and vertex attribute data (colors, normals, texture coordinates) are stored in arrays. Since more than one vertex may share the same color, normal or texture coordinate, an array of indices can be used to map the vertex arrays to the color, normal or texture coordinate arrays.

osg::ShapeDrawable – Adds the ability to render shape primitives with reduced effort. The shape primitives include Box, Cone, Cylinder, Sphere, Triangle Mesh etc. ShapeDrawable currently doesn't render InfinitePlanes.
6) osg::StateSet – Stores a set of modes and attributes which represent a set of OpenGL state. Note that a StateSet contains just a subset of the whole OpenGL state. In OSG, each Drawable and each Node has a reference to a StateSet, and these StateSets can be shared between different Drawables and Nodes (that is, several Drawables and Nodes can reference the same StateSet). Indeed, this practice is recommended whenever possible, as it minimizes expensive state changes in the graphics pipeline. This state includes the textureModeList, textureAttributeList, attributeList, modeList etc., along with the updateCallback and eventCallback.

All the nodes described above are part of the core OSG module, called osg. There are various other modules, such as osgDB (the plugin support library for managing the dynamic plugins, both loaders and NodeKits), osgGA (the GUI adapter library, to assist the development of viewers), osgGLUT (the GLUT viewer base class) and osgPlugins (28 plugins for reading and writing images and 3D databases). Some of these will be discussed later in this report.
3.1.3 Structure of a Scene Graph

Having described the major node types in OSG, let us discuss a typical scene hierarchy. The graph has an osg::Group at the top (representing the whole scene); osg::Groups, LODs, Transforms and Switches in the middle (dividing the scene into various logical units); and osg::Geodes (geometry nodes containing osg::Drawables and osg::StateSets) as the leaf nodes.
3.1.4 Windowing System in OSG

Just like OpenGL, the core of OSG is independent of the windowing system. The integration between OSG and a particular windowing system is delegated to other, non-core parts of OSG (users are also free to integrate OSG with any exotic windowing system they happen to use). Viewer implements the integration between OSG and Producer, AKA Open Producer, thus offering an out-of-the-box, scalable and multi-platform abstraction of the windowing system.
3.1.5 Skeleton OSG Code

FIGURE 21: Inheritance diagram for the osgProducer::Viewer class

This section describes the steps needed to set up a simple OSG program using the nodes and the windowing system discussed in the sections above:

1) Set up the viewer (an osgProducer::Viewer instance).
2) Create the scene graph for the scene (using nodes such as Geode, Geometry etc.).
3) Attach the viewer and the graph using the setSceneData() method.
4) Start the simulation loop which generates the scene:

    while (!viewer.done()) {
        /* wait for all cull and draw threads to complete */
        viewer.sync();
        /* update the scene by traversing it with the update visitor,
           which calls all node update callbacks and animations */
        viewer.update();
        /* fire off the cull and draw traversals of the scene */
        viewer.frame();
    }
3.1.6 Callbacks

Users can interact with a scene graph using callbacks. Callbacks can be thought of as user-defined functions that are automatically executed depending on the type of traversal (update, cull, draw) being performed. They can be associated with individual nodes or with selected types (or subtypes) of nodes. During each traversal of a scene graph, if a node with a user-defined callback is encountered, that callback is executed.

FIGURE 22: Callback Mechanism

Code that takes advantage of callbacks can also be more efficient when a multithreaded processing mode is used. The code associated with update callbacks runs once per frame, before the cull traversal. An alternative would be to insert the code in the main simulation loop between the viewer.update() and viewer.frame() calls, but callbacks provide an interface that is easier to update and maintain.
3.1.7 osgGA::GUIEventHandler

The GUIEventHandler class provides developers with an interface to the windowing system's GUI events. The event handler receives updates in the form of GUIEventAdapter instances, and can send requests for the GUI system to perform operations using GUIActionAdapter instances.

The information in a GUIEventAdapter instance includes the type of the event (PUSH, RELEASE, DOUBLECLICK, DRAG, MOVE, KEYDOWN, KEYUP, FRAME, RESIZE, SCROLLUP, SCROLLDOWN, SCROLLLEFT). Depending on the type of event, the instance may carry additional information.

The GUIEventHandler uses GUIActionAdapters to request actions of the GUI system. It interacts with the GUI primarily through the 'handle' method, which has two arguments: a GUIEventAdapter instance for receiving updates from the GUI, and a GUIActionAdapter for requesting actions of the GUI. The handle method can examine the type and values of the GUIEventAdapter, perform the required operations, and make a request of the GUI system using the GUIActionAdapter. It returns a boolean set to true if the event has been 'handled', false otherwise.
3.2 D+TR in OSG

In this section, we describe the specification of our D+TR system as ported to OSG.

3.2.1 Representation

The basic representation in OSG consists of a special kind of node called DepthTR, derived from the Geode class. This DepthTR node can store the geometry of the scene and has a special class called dtrDrawable (inherited from osg::Drawable) attached to it (more on dtrDrawable later in this section). Another class, InputView, contains the Depth Images (DIs) and calibration parameters that are the inputs to our D+TR system. A depth map is a two-dimensional array of real values, with location (i, j) storing the depth distance to the point that projects to pixel (i, j) in the image; closer points appear brighter. Depth and texture are stored as images on disk. The depth map contains real values whose range depends upon the resolution of the structure recovery method; images with 16 bits per pixel can store information up to 65 meters. The DepthTR class contains a pointer to the array of InputViews.

The DepthTR class has several important functions, such as load() (loads all input textures and depth maps), projectView() (projects an input view to the novel view orientation), setNovelView() (sets the validity flag for each input view) and getNovelView() (returns the novel view generated from the input textures and depth maps). The dtrDrawable class has a function called drawImplementation(), which is used for rendering the DepthTR node according to the D+TR algorithm presented in the previous chapters. The InputView class has functions for loading a depth map, calculating the 3D locations from the image using get3D(), and projecting the view in the novel view direction.
FIGURE 23: Class diagrams of D+TR in OSG, showing DepthTR (with its InputView* array, view count and load(), projectView(), getNovelView() functions), dtrDrawable (with a DepthTR* member and drawImplementation(osg::State&), cosAngleBlending(...)) and InputView (with its calibration, camera center, image and depth files, and load(), get3D(), projectView(...) functions).
3.2.2 Rendering

We have implemented the implied triangulation approach for rendering in OSG, as described in the previous chapter. In this section, we further describe how the rendering occurs in OSG.

3.2.2.1 Rendering in OSG

OSG provides an excellent framework for maximizing graphics performance. Any scene graph employs three key phases while generating a 3D scene: the App, Cull and Draw phases. In the App phase, the graphics application sets up the scene graph and the parameters necessary for rendering. In the Cull phase, objects that will not appear on the screen are culled; the hierarchical structure of the scene graph enables efficient culling. Finally, in the Draw phase, the scene is actually drawn on the screen. For further optimization, these three phases are carried out by different threads simultaneously, as shown in Figure 24 below.

FIGURE 24: Parallelism in OSG
During the Cull phase, the whole scene graph is traversed by a NodeVisitor and the visibility of each node is determined. The scene graph is constructed in such a way that the geometry nodes (Geodes) lie at the bottom of the graph as leaf nodes; each Geode contains a list of Drawables (Geometry, ShapeDrawable, Text etc.) which can be drawn. During the Draw phase, the scene graph is traversed again by a NodeVisitor, which calls the virtual function drawImplementation(osg::State&) while rendering the Drawables. As the OSG source documentation states, drawImplementation(State&) is a pure virtual method for the actual implementation of the OpenGL drawing calls, such as vertex arrays and primitives, and must be implemented in concrete subclasses of the Drawable base class; examples include osg::Geometry and osg::ShapeDrawable. drawImplementation(State&) is called from the draw(State&) method, with draw() handling the management of OpenGL display lists and drawImplementation(State&) handling the actual drawing itself.

As mentioned earlier, our DepthTR class is derived from the Geode class. We add a dtrDrawable to it for drawing our IBR scene, and override the drawImplementation(osg::State&) function of the dtrDrawable class to render the depth maps using the D+TR algorithm.
3.2.3 Discussion

This section describes the rendering algorithm implemented in the drawImplementation method of the dtrDrawable class, from both the OSG and the D+TR points of view.

During each drawImplementation call to dtrDrawable, all InputViews are projected to the novel view using the projectView() function. This is followed by reading the projected images using glReadPixels() and blending the views that are set to be valid (by the setNovelView() function). Then the angular blending is carried out, and the final novel image is drawn to the framebuffer using glDrawPixels().

Any keyboard event invokes the GUIEventHandler function, which can be used to set the novel view parameters by calling the NextView() function (an auxiliary function in our system that sets the novel view direction). This is followed by a call to drawImplementation, which renders the novel image onto the window.

To summarize the whole process: given n depth images and a particular novel viewpoint, we first find the depth images on the same side as the novel viewpoint. Depth images are considered to be on the same side when the angle subtended at the center of the scene by the depth image's camera center and the novel view camera center is less than a particular threshold angle. Novel views are generated using these depth images. Then, for each pixel p in the novel view, we compare the z-values across all these novel views and keep the nearest z-values within a threshold. Weights are computed by blending for each pixel p as described earlier. The complete rendering algorithm was given above; a flow chart for it is shown in Figure 25 below.

FIGURE 25: Flow chart for complete rendering
3.3 Conclusions and Results

In this chapter, we have described the basic OSG concepts and how the D+TR system is implemented in OSG. We have also described the class structure and the rendering process of our system in detail.

Some snapshots of our system are shown below. The results include the FPS figures for the table data in our system, summarized in the table below:

    Application        Time in seconds per frame
    D+TR on OSG
References:
[1] http://graphics.stanford.edu/projects/mich/.
[2] http://www.cyberware.com.
[3] H. Baker, D. Tanguay, I. Sobel, D. Gelb, M. E. Goss, W. B. Culbertson, and T.
Malzbender. The Coliseum Immersive Teleconferencing System. In International Workshop
on Immersive Telepresence (ITP2002), 2002.
[4] C. Buehler, M. Bosse, L. McMillan, S. J. Gortler, and M. F. Cohen. Unstructured
Lumigraph Rendering. In SIGGRAPH, 2001.
[5] S. Chen and L. Williams. View Interpolation for Image Synthesis. In SIGGRAPH, 1993.
[6] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and Rendering Architecture from
Photographs: A Hybrid Geometry- and Image-Based Approach. In SIGGRAPH, 1996.
[7] B. Girod, C.-L. Chang, P. Ramanathan, and X. Zhu. Light Field Compression Using
Disparity-Compensated Lifting. In ICASSP, 2003.
[8] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The Lumigraph. In
SIGGRAPH, 1996.
[9] I. Ihm, S. Park, and R. K. Lee. Rendering of Spherical Light Fields. In Pacific Graphics,
1997.
[10] R. Krishnamurthy, B.-B. Chai, H. Tao, and S. Sethuraman. Compression and
Transmission of Depth Maps for Image-Based Rendering. In International Conference on
Image Processing, 2001.
[11] M. Levoy and P. Hanrahan. Light Field Rendering. In SIGGRAPH, 1996.
[12] M. Magnor, P. Eisert, and B. Girod. Multi-View Image Coding with Depth Maps and
3-D Geometry for Prediction. In Proc. SPIE Visual Communication and Image Processing
(VCIP-2001), San Jose, USA, pages 263-271, Jan. 2001.
[13] W. R. Mark. Post-Rendering 3D Image Warping: Visibility, Reconstruction, and
Performance for Depth-Image Warping. PhD thesis, University of North Carolina, 1999.
[14] L. McMillan. An Image-Based Approach to Three-Dimensional Computer Graphics.
PhD thesis, University of North Carolina, 1997.
[15] L. McMillan and G. Bishop. Plenoptic Modeling: An Image-Based Rendering
System. In SIGGRAPH, 1995.
[16] P. J. Narayanan, P. W. Rander, and T. Kanade. Constructing Virtual Worlds Using
Dense Stereo. In Proc. of the International Conference on Computer Vision, Jan. 1998.
[17] P. J. Narayanan, Sashi Kumar P, and Sireesh Reddy K. Depth+Texture Representation
for Image Based Rendering. In ICVGIP, 2004.
[18] Sashi Kumar Penta and P. J. Narayanan. Compression of Multiple Depth-Maps for
IBR. In Pacific Graphics, 2005.
[19] S. M. Seitz and C. R. Dyer. View Morphing. In SIGGRAPH, 1996.
[20] H.-Y. Shum and L.-W. He. Rendering with Concentric Mosaics. In SIGGRAPH, 1999.
[21] X. Tong and R. M. Gray. Coding of Multi-View Images for Immersive Viewing. In
ICASSP, 2000.
[22] H. Towles, W.-C. Chen, R. Yang, S.-U. Kum, and H. Fuchs. 3D Tele-Collaboration
Over Internet2. In International Workshop on Immersive Telepresence (ITP2002), 2002.
[23] C. L. Zitnick, S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski. High-Quality
Video View Interpolation Using a Layered Representation. In SIGGRAPH, 2004.
[24] Michael Waschbusch, S. Wurmlin, D. Cotting, F. Sadlo, and M. Gross. Scalable 3D
Video of Dynamic Scenes. In The Visual Computer, 2005.
[25] Pooja Verlani, Aditi Goswami, P. J. Narayanan, Shekhar Dwivedi, and Sashi Kumar
Penta. Depth Images: Representations and Real-time Rendering. In Third International
Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), 2006.
Appendix

I. Features and Advantages of OpenSceneGraph

The stated goal of OpenSceneGraph is to make the benefits of scene graph
technology freely available to all users, both commercial and non-commercial.
OSG is written entirely in Standard C++ and OpenGL; it makes full use of the
STL and design patterns, and leverages the open source development model to
provide a development library that is legacy-free and focused on the needs of
end users.

The stated key strengths of OpenSceneGraph are its performance, scalability,
portability, and the productivity gains associated with using a fully featured
scene graph. In more detail:
Performance

OSG supports view frustum culling, occlusion culling, small feature culling,
Level of Detail (LOD) nodes, state sorting, vertex arrays and display lists as
part of the core scene graph. Together these make OpenSceneGraph one of the
highest-performance scene graphs available.

OpenSceneGraph also supports easy customization of the drawing process, such
as the implementation of Continuous Level of Detail (CLOD) meshes on top of
the scene graph (see the Virtual Terrain Project and Demeter).
Productivity

The core scene graph encapsulates the majority of OpenGL functionality,
including the latest extensions, provides rendering optimizations such as
culling and sorting, and offers a whole set of add-on libraries that make it
possible to develop high-performance graphics applications very rapidly. The
application developer is freed to concentrate on content, and on how that
content is controlled, rather than on low-level coding.

Format support: OSG states that it includes 45 separate plugins for loading
various 3D database and image formats. 3D database loaders include OpenFlight
(.flt), TerraPage (.txp) including multi-threaded paging support, LightWave
(.lwo), Alias Wavefront (.obj), Carbon Graphics GEO (.geo), 3D Studio MAX
(.3ds), Performer (.pfb), Quake Character Models (.md2), DirectX (.x),
Inventor ASCII 2.0 (.iv), VRML 1.0 (.wrl), Designer Workshop (.dw), AC3D
(.ac), and the native OSG ASCII and binary formats (.osg).

Image loaders include .rgb, .gif, .jpg, .png, .tiff, .pic, .bmp, .dds, .tga,
QuickTime (under OS X), and fonts (via the freetype plugin).
Node Kits

OSG also has a set of Node Kits, separate libraries that can be compiled into
your applications or loaded at runtime, which add support for particle systems
(osgParticle), high-quality anti-aliased text (osgText), a special effects
framework (osgFX), OpenGL shading language support (osgGL2), large-scale
geospatial terrain database generation (osgTerrain), navigational light points
(osgSim), osgNV (support for NVIDIA's vertex, fragment, combiner and Cg
shaders), Demeter (CLOD terrain integrated with OSG), osgCal (which integrates
Cal3D and OSG), and osgVortex (which integrates the CM-Labs Vortex physics
engine with OSG).
Portability

The core scene graph has been designed to have minimal dependency on any
specific platform, requiring little more than Standard C++ and OpenGL. This
has allowed the scene graph to be rapidly ported to a wide range of platforms:
originally developed on IRIX, it has since been ported to Linux, Windows and
FreeBSD.

Window Systems

The core OSG library is completely windowing-system independent, which makes
it easy for users to add their own window-specific libraries and applications
on top. The distribution already includes the osgProducer library, which
integrates with Open Producer, and in the Community/Applications section of
the OSG website one can find examples of applications and libraries written on
top of GLUT, Qt, MFC, wxWindows and SDL. Users have also integrated it with
Motif and X.
Scalability

OSG runs not only on portables but all the way up to Onyx InfiniteReality
monsters, and it also supports the multiple graphics subsystems found on
machines like a multi-pipe Onyx.
II. Vertex Shader and Pixel Shader Code

Vertex Shader:
struct appdata
{
    float4 position : POSITION;
    float3 texpos   : TEXCOORD0;
    float3 pointpos : COLOR;
};

struct vs2ps
{
    float4 currpos2 : POSITION;
    float4 currpos  : TEXCOORD1;
    float3 texpos   : TEXCOORD0;
    float3 pointpos : COLOR;
};

vs2ps main(appdata IN, uniform float4x4 modelMatProj)
{
    vs2ps OUT;
    OUT.currpos2 = OUT.currpos = mul(modelMatProj, IN.position);
    OUT.texpos   = IN.texpos;
    OUT.pointpos = IN.pointpos;
    return OUT;
}
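The pixel shader below performs the incremental angular blend. As a plain-C++ restatement of that per-pixel step (names are illustrative; the shader keeps the accumulated weight in the buffer's alpha channel):

```cpp
#include <cmath>

struct RGBA { float r, g, b, a; };

// One blending step: the incoming sample is weighted by the cosine of the
// angle between the input-view and novel-view rays, sharpened by pow(.,8),
// and averaged against the colour already in the buffer. The output alpha
// carries the accumulated weight, which can grow beyond the usual [0,1]
// range, as the shader's own comment notes.
RGBA blendStep(RGBA incoming, float cosAngle, RGBA buffered)
{
    float w = std::pow(cosAngle, 8.0f);   // color.a = pow(dot(v1, v2), 8)
    RGBA out;
    out.r = (incoming.r * w + buffered.r * buffered.a) / (w + buffered.a);
    out.g = (incoming.g * w + buffered.g * buffered.a) / (w + buffered.a);
    out.b = (incoming.b * w + buffered.b * buffered.a) / (w + buffered.a);
    out.a = w + buffered.a;
    return out;
}
```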
Pixel Shader:

struct vpixel_out
{
    float4 color : COLOR;
};

struct vs2ps
{
    float4 currpos2 : POSITION;
    float4 currpos  : TEXCOORD1;
    float3 texpos   : TEXCOORD0;
    float3 pointpos : COLOR;
};

vpixel_out main(vs2ps IN,
                uniform float3 c,
                uniform float3 n,
                uniform sampler2D texture,
                uniform samplerRECT buffer)
{
    vpixel_out OUT;
    float4 color;
    float4 color_old;
    float3 v1;
    float3 v2;
    IN.currpos /= 512.0;
    v1 = normalize(IN.pointpos - c);
    v2 = normalize(IN.pointpos - n);
    color.rgba     = tex2D(texture, IN.texpos.xy).rgba;
    color_old.rgba = texRECT(buffer, IN.currpos.xy).rgba;
    color.a = dot(v1, v2);
    color.a = pow(color.a, 8);
    OUT.color.rgb = (color.rgb * color.a + color_old.rgb * color_old.a)
                    / (color.a + color_old.a);
    OUT.color.a = color.a + color_old.a;  // alpha can get out of range
    return OUT;
}

III. Class Definitions
#include <osg/Geode>
#include <osg/Drawable>
#include <osg/Notify>
#include <osg/LineWidth>
#include <osg/BoundingBox>
#include <GL/gl.h>
#include "matrix/matrix.h"
#include <cmath>

#define FILE_NAME_SIZE 200
#define MAX_DEPTH -100000000.00
#define THRESHOLD 3.50
#define ANGLE_THRESH (M_PI/3)
#define TRIANGULATION 1
#define SPLATTING 2

extern float radius, theta, phi, ex, ey, ez;
extern int mode, k, blendType, novelWidth, novelHeight;
extern int type, resolution;
extern float pointSize;
extern int vn;
extern bool finished;
extern float m[3][4], calib[3][3];
class InputView
{
public:
    char imageFile[FILE_NAME_SIZE];
    char depthFile[FILE_NAME_SIZE];
    int width, height;
    CMatrix modelMatrix;
    CMatrix calibration;
    CMatrix cameraCenter;
    float * depth;
    float * X;
    float * Y;
    float * Z;
    float * weights;
    // tells whether this camera is valid for this novel view.
    bool valid;
    // projected values
    GLubyte * projectedImage;
    float * projectedDepth;
    // tells whether a particular pixel is visible in projectedImage.
    bool * holes;

    InputView();
    // loads depth and texture
    void load(char * imFile, char * depFile);
    void get3D();
    // projects this view into the novel view orientation
    void projectView(float depth_threshold, int type, int resolution,
                     float pointSize);
};
class DepthTR : public osg::Geode
{
public:
    int numberOfInputViews;
    InputView * inputViews;
    // tells whether a particular pixel in the novel view is valid or not.
    bool * valid;
    // temporary vars used to read GL buffers
    GLfloat * projectedZbuffer;
    // novel view params
    int width;
    int height;
    // this threshold is used to decide whether three nearby vertices
    // can form a triangle or not
    float depth_threshold;
    // this threshold is used to eliminate views straight away if they
    // are not near to the novel view camera
    float angle_threshold;
    GLubyte * novelImage;
    float * novelViewDepth;
    float * novelViewX;
    float * novelViewY;
    float * novelViewZ;
    CMatrix R, t;
    CMatrix cameraCenter;
    CMatrix calibration;
    CMatrix modelMatrix;
    bool save;

    DepthTR() : osg::Geode()
    {
        init();
        depth_threshold = THRESHOLD;
        angle_threshold = ANGLE_THRESH;
    }
    void dtrDrawable_drawImplementation();
    // loads all the input textures and depth maps
    void load(char *);
    void computeWeights(int reqNumber, int blendType, float * angles,
                        float * weights, bool * flags);
    // blending functions
    void angleBlending(float * angles, float * weights, bool * flags,
                       int * positions);
    void exponentialAngleBlending(float * angles, float * weights,
                                  bool * flags, int * positions);
    void cosAngleBlending(float * angles, float * weights, bool * flags,
                          int * positions);
    void newAngleBlending(int k, float * angles, float * weights,
                          bool * flags, int * positions);
    void inverseAngleBlending(int k, float * angles, float * weights,
                              bool * flags, int * positions);
    // resets the validity flags.
    void setNovelView(float model[][4], float calib[][3]);
    // returns the novel view generated from the input textures and depth maps
    GLubyte * getNovelView(int blendType, int reqNumber);
    // projects input view number vn onto the novel view orientation
    void projectView(int vn, int type, int resolution, float pointSize);
    void getProjectedDT(int vn);
    void saveProjectedImage(int vn);

private:
    void init();  // shared constructor code, generates the drawables
    class dtrDrawable;
    friend class dtrDrawable;
    bool dtrDrawable_computeBound(osg::BoundingBox &) const;
};
class DepthTR::dtrDrawable : public osg::Drawable
{
public:
    DepthTR * _ss;

    dtrDrawable(DepthTR * ss) :
        osg::Drawable(), _ss(ss) { init(); }

    dtrDrawable() : _ss(0)
    {
        init();
        osg::notify(osg::WARN)
            << "Warning: unexpected call to "
               "osgSim::SphereSegment::Spoke() copy constructor"
            << std::endl;
    }

    virtual osg::BoundingBox computeBound() const;
    void calculateParams();

    void init()
    {
        getOrCreateStateSet()->setAttributeAndModes(
            new osg::LineWidth(2.0), osg::StateAttribute::OFF);
    }
};

#endif
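As noted in the DepthTR comments above, depth_threshold decides whether three neighbouring samples are close enough in depth to be joined during triangulation. The helper below is an illustrative guess at that test, not the report's code:

```cpp
#include <algorithm>

// Reject triangles that would span a depth discontinuity: three vertices
// may be connected only if their depth spread is below the threshold
// (THRESHOLD = 3.50 in the class definitions above).
bool canFormTriangle(float z0, float z1, float z2, float threshold)
{
    float zmax = std::max(z0, std::max(z1, z2));
    float zmin = std::min(z0, std::min(z1, z2));
    return (zmax - zmin) < threshold;
}
```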