SUMMARY PhD. THESIS

THE FACULTY OF ELECTRONICS, TELECOMMUNICATIONS 

AND INFORMATION TECHNOLOGY 

Șerban-Nicolae MEZA, Eng. 

SUMMARY 

PhD. THESIS 

CONTRIBUTIONS TO THE DEVELOPMENT 

OF 2D & 3D VISION SYSTEMS 

Thesis evaluation committee:

Coordinator, 

Prof.dr.ing. Aurel VLAICU 

PRESIDENT: Professor Virgil DOBROTĂ, Ph.D., Eng, Head of Communications Department 

Technical University of Cluj-Napoca 

MEMBERS: - Professor Aurel VLAICU, Ph.D., Eng - scientific supervisor, 

Technical University of Cluj-Napoca; 

- Professor Mihai ROMANCA, Ph.D., Eng - reviewer, 

Transilvania University of Brașov; 

- Professor Radu VASIU, Ph.D., Eng - reviewer, 

Politehnica University of Timişoara; 

- Associate Professor Bogdan ORZA, Ph.D., Eng - reviewer, 

Technical University of Cluj-Napoca.

Key Words 

3D vision, depth from polarisation, 3D LookUp Tables, distributed video coding 

Thesis Table of Contents 

1. Introduction

2. Study on the Theoretical Fundamentals, Existing Solutions and Current Trends in the Evolution of Vision System Architectures, Technologies and Equipment

2.1. Visual Perception. Spatial Perception. Stereovision.

2.2. 2D and 3D Video Representations and Extensions (multiview, free viewpoint)

2.4. Elements of Projective Geometry for 2D & 3D Vision Systems

2.5. The Theory of Video Sampling

2.6. Emerging Theories – Compressive Sampling

2.7. Standards for 2D & 3D Video Data Representation

2.8. Technologies and Architectures for 2D & 3D Video Capturing – Existing Solutions and Trends

2.9. Technologies and Architectures for 2D & 3D Video Rendering – Existing Solutions and Trends

Contributions

3. Study on 2D & 3D Video Signal Coding for Transmission – Emerging Paradigms: Distributed Video Coding

3.1. The Distributed Video Coding Paradigm. Approaches and Perspectives

3.2. General Presentation of the Theory of Distributed Video Coding

3.3. Implemented Architectures for Distributed Video Coding

3.4. Analysis of Distributed Video Coding Architectures

4. Acquiring Depth Information in Vision Systems

4.1. Photo and Video Acquisition for 3D and Stereovision

4.2. The Acquisition of Depth from Defocus

4.3. Principles and Approaches for the Acquisition of Depth from Light Polarisation

4.4. Algorithm and Experimental Model for the Acquisition of Depth from Light Polarisation

5. Real-time 2D High-Definition Video Signal Processing Using FPGA Technology and 3D Look-Up Tables

5.1. Real-time in-Camera Video Processing

5.2. 3D Look-Up Table Processing for Video Signals

5.3. Software Simulator Implementation for 3D Look-Up Table Video Signal Processing

5.4. FPGA Technology Hardware Integration of 3D Look-Up Table Video Signal Processing

5.5. Obtained Results in Real-time Video Signal Processing Using 3D Look-Up Tables

6. Conclusion

7. Bibliography

Appendix





General Subject of the Thesis 

In the last decade, visual representations (images or video sequences) have been present in almost all aspects of everyday life, making multimedia information ubiquitous. Services and products such as digital photography, digital television, DVDs and Blu-ray discs, the World Wide Web, videoconferencing, virtual collaboration and others are all based to a great extent on the use of such visual content. 

Currently, under the momentum of the theoretical and technological developments characteristic of the "digital era", one can witness a wide process of "re-invention" of the video systems and services formerly based on analogue technology. Several factors have contributed in the last decade to an unprecedented growth and development of systems that make use of static (images and photos) and dynamic (videos and animations) visual information: the capabilities available in video rendering devices (screens that allow high contrast ratios, low power consumption and various shapes, sizes and spatial resolutions); high-throughput processing and transmission networks (on the order of megabytes per second); visual information acquisition systems that deliver high-quality images and videos (high resolution, colour depth and temporal sampling of more than 160 frames per second); and the "boom" in applications and use case scenarios. To cope with the overwhelming demand for new products and for technological breakthroughs that surpass the performance of current devices, services and technologies, the field shows a certain segmentation and specialisation adapted to different types of applications and user groups, as well as an import and mixture of techniques, methods and technologies stemming from various fields (statistics, chemistry, physics, biology). However, one of the most important and revolutionary aspects of the past few years is the interest in adding the extra depth dimension to the "classic" brightness information that is captured from the scene. 

This thesis aims to bring personal and original contributions to the challenges currently being pursued worldwide within the general field of 2D & 3D vision systems. Following the structure of the presented work, the research focused on the following: 

a. Acquiring in-depth knowledge in the field of 2D & 3D vision systems engineering, from photography to the latest innovations in image acquisition, processing, transport and rendering, including 3D. 

b. Studying, analysing and comparing emerging methods and paradigms for the coding and compression of the video signal, namely distributed video coding, in the context of the increasing number of mobile video-aware devices and wireless transmissions. 

c. Implementing and proposing new algorithms for acquiring, generating or extracting depth information from defocus and light polarisation, together with brightness information, in order to create the premises for the development of photo-video devices that are able to work without any constraint on illumination or on distance to the objects in the scene, in any generic 3D scenario. 

d. Developing and implementing new methods, techniques and real-time processing algorithms for high-definition video signals, using the advantages offered by FPGA (Field Programmable Gate Array) technology and the LUT (Look-Up Table) approach. 

Each of these objectives was studied thoroughly, and the research activity undertaken by the author resulted in scientific contributions to each of them. Given the complexity of the field in general and the specific tasks under each formulated objective, the thesis is structured in six chapters. 

Thesis Structure and Content 

Chapter 1 describes vision information systems in general, both 2D and 3D, and sets the general framework within which the rest of the thesis work was carried out. 

Chapter 2 describes the characteristic elements of the theory and practice of systems aimed at exchanging visual information between people, and then focuses on the technology and engineering of the main devices used: the video camera and the rendering screen or display. The study is based on what the author considers to be some of the most important highlights in the evolution of video systems, as well as examples of the latest innovations in the field, both theoretical (compressive sampling) and technological (OLED displays). The aim is to set the exact context in which the rest of the thesis contributions apply, together with their relevance, the particular problem each of them solves, and their global impact on the development of 2D & 3D vision systems. 

Chapter 3 presents the emerging distributed video coding paradigm, which combines knowledge from the theory and practice of channel coding with the fundamental theoretical results of Slepian and Wolf (lossless coding of correlated sources) and of Wyner and Ziv (lossy source coding with side information at the decoder). 

The research included the following: 

a. Understanding the fundamental theory of distributed source coding and the way it applies to video coding in particular, and presenting the main results from the field (the theorems of Slepian & Wolf, and Wyner & Ziv). 

b. Studying the "Stanford" distributed video coding architecture, together with the improvements and the channel codes reported for it in the existing literature. 

c. Studying the "PRISM" distributed video coding architecture, together with the improvements and the channel codes reported for it in the existing literature. 

d. Studying the extensions of the "Stanford", "PRISM" and other distributed video coding architectures. 

e. Comparing the main approaches and implementations published in the field of distributed video coding. 
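
The role that channel codes play in distributed source coding can be illustrated with a well-known toy case. The sketch below is not taken from the thesis and does not reproduce the "Stanford" or "PRISM" architectures; it assumes two 3-bit words x and y that differ in at most one bit, with y known only to the decoder. Under that assumption the encoder can send the 2-bit syndrome of x with respect to the repetition code {000, 111} instead of all 3 bits, and the decoder recovers x from the syndrome and y. All function names are hypothetical.

```python
from itertools import product

# Parity-check matrix of the 3-bit repetition code {000, 111}.
H = ((1, 1, 0),
     (0, 1, 1))


def syndrome(word):
    """Return the 2-bit syndrome H * word (mod 2)."""
    return tuple(sum(h * b for h, b in zip(row, word)) % 2 for row in H)


def hamming_distance(a, b):
    """Count the positions in which two equally long words differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(x):
    """Encoder: sees only x and sends its 2-bit syndrome instead of 3 bits."""
    return syndrome(x)


def decode(s, y):
    """Decoder: pick the word in the coset labelled s that is closest to y."""
    coset = [w for w in product((0, 1), repeat=3) if syndrome(w) == s]
    return min(coset, key=lambda w: hamming_distance(w, y))


def check_all_pairs():
    """Verify recovery for every (x, y) pair that differs in at most one bit."""
    for x in product((0, 1), repeat=3):
        for y in product((0, 1), repeat=3):
            if hamming_distance(x, y) <= 1 and decode(encode(x), y) != x:
                return False
    return True
```

Each coset contains two words that are three bit flips apart, so only one of them can lie within one bit flip of the side information y. Practical distributed video codecs replace this tiny code with much longer channel codes, but the division of work is the same: a simple encoder and a decoder that exploits the correlation.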

Chapter 4 presents two methods that, according to the author, provide the premises necessary for implementing an integrated 3D capturing device, independently of the structure of the illumination and without the use of non-visual sensors. The proposed implementation and algorithms for acquiring, generating or extracting depth from defocus and from light polarisation, respectively, offer the following advantages: 

a. Independence from illumination. 

b. The possibility of acquiring depth information from a distance by using optical "close-up" means (lenses); this is a major advantage over methods that use structured light or the propagation time of a laser beam. 

c. The proposed algorithm for "depth from light polarisation" is sufficiently flexible to be applied to video sequences in real time, because its implementation is based on a finite number of simple mathematical operations (multiplications and additions). 

d. Both solutions are based on regular 2D photo/video sensors and require minimal additions or transformations to enable 3D acquisition. 

The research activity was based on a study of the technologies, techniques and algorithms already available for capturing depth information for 3D, on the actual implementation of an algorithm for "depth from defocus" and on the development of a new method that uses information from light polarisation. It showed that the proposed approach could be easily extended and integrated into future 3D-aware photo/video cameras. 
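
To give a sense of the kind of per-pixel arithmetic involved in polarisation imaging, the sketch below computes the degree and angle of linear polarisation from four intensity measurements. This is the standard Stokes-parameter estimate, not the algorithm proposed in the thesis, which this summary does not detail. The choice of four polariser angles (0, 45, 90 and 135 degrees), the ideal-polariser model used to generate test data and all function names are assumptions made for illustration. Converting these quantities into depth is outside the scope of the sketch.

```python
import math


def synthetic_intensity(total, dolp, aolp, theta):
    """Intensity behind an ideal linear polariser at angle theta (radians).

    Assumed model: light of total intensity `total`, degree of linear
    polarisation `dolp` and polarisation angle `aolp` (radians).
    """
    return 0.5 * total * (1.0 + dolp * math.cos(2.0 * (theta - aolp)))


def polarisation_from_four(i0, i45, i90, i135):
    """Return (total intensity, degree, angle) of linear polarisation for one pixel."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # horizontal versus vertical component
    s2 = i45 - i135                      # diagonal components
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aolp = 0.5 * math.atan2(s2, s1)
    return s0, dolp, aolp


def polarisation_maps(img0, img45, img90, img135):
    """Apply the per-pixel computation to four equally sized images (lists of rows)."""
    return [
        [polarisation_from_four(a, b, c, d) for a, b, c, d in zip(r0, r45, r90, r135)]
        for r0, r45, r90, r135 in zip(img0, img45, img90, img135)
    ]
```

The work per pixel is a fixed, small number of additions and multiplications plus one square root and one arctangent, which is consistent in spirit with the real-time argument made in point c above.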

Chapter 5 presents an architecture for real-time, pixel-based and frame-based processing of high-definition video (1920 x 1080 pixels per frame, 50 frames per second). The implementation, using FPGA technology and look-up tables, has the following advantages over a version based on ASICs (application-specific integrated circuits): 

a. Independence of the video processing speed from the complexity of the algorithm that analytically describes the processing. No matter how complex the expressions used to compute the final pixel value are, the actual processing is performed in the same amount of time. 

b. The possibility of reusing the hardware for different processing tasks. FPGA technology allows easy reconfiguration of the processing, including for future 3D video systems. 

c. Easy extension and integration of other user-defined processing functions. The proposed architecture, based on look-up tables, enables the addition of further processing functions according to user needs. 

d. The possibility of integrating third-party processing functions that are defined using proprietary equations, while still ensuring copyright protection. 

The research effort resulted in a functional study of a look-up table based video processing architecture, performed using a software simulator implemented by the author, and in a working proof-of-concept hardware model. It showed that the approach, as well as its actual integration in a video camera, is feasible. 
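
The constant-time property claimed in point a above can be illustrated with a minimal software sketch of 3D look-up table processing. It is not the author's simulator or the FPGA design: the table size of 17 nodes per axis, the use of trilinear interpolation, the normalised [0, 1] colour range and all function names are assumptions. An arbitrary colour transform is sampled once into the table; afterwards every pixel costs eight table reads and a fixed number of multiply-adds, whatever the transform was.

```python
def build_lut(transform, size=17):
    """Sample `transform` ((r, g, b) in [0, 1] -> (r, g, b)) on a size^3 grid."""
    step = 1.0 / (size - 1)
    return [[[transform(r * step, g * step, b * step)
              for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def apply_lut(lut, pixel):
    """Map one (r, g, b) pixel in [0, 1] through the table with trilinear interpolation."""
    size = len(lut)
    idx, frac = [], []
    for c in pixel:
        pos = min(max(c, 0.0), 1.0) * (size - 1)  # clamp, then scale to grid units
        lo = min(int(pos), size - 2)              # lower corner of the enclosing cell
        idx.append(lo)
        frac.append(pos - lo)
    out = [0.0, 0.0, 0.0]
    # Blend the 8 corner nodes of the cell; the weights always sum to 1.
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                w = ((frac[0] if dr else 1.0 - frac[0]) *
                     (frac[1] if dg else 1.0 - frac[1]) *
                     (frac[2] if db else 1.0 - frac[2]))
                node = lut[idx[0] + dr][idx[1] + dg][idx[2] + db]
                for k in range(3):
                    out[k] += w * node[k]
    return tuple(out)


def process_frame(lut, frame):
    """Apply the table to every pixel of a frame given as rows of (r, g, b) tuples."""
    return [[apply_lut(lut, px) for px in row] for row in frame]
```

Changing the processing means rewriting the table contents, not the per-pixel logic, which is what makes the approach attractive for reconfigurable hardware. At 1920 x 1080 pixels and 50 frames per second the per-pixel step has to run 103,680,000 times per second, so a software version like this one is useful only for checking results, not for real-time operation.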

Chapter 6 closes the thesis with a summary and conclusions of the entire endeavour. 

Main Contributions and Conclusions 

Based on the author's study of the existing literature and on the results of the implemented simulations and functional models, this thesis brings the following main contributions to the field of 2D & 3D vision systems and to the state of the art: 

a. An overview of the main technological aspects that support the current development of 3D systems, namely the advances in CMOS and CCD sensor technology and in plasma, LCD and OLED display technology. 

b. Examples of the main emerging concepts and theories in the field of 2D & 3D vision systems: stereo vision and epipolar geometry. 

c. A presentation of the emerging paradigms in the field: compressive sampling for video data acquisition and distributed video coding. 

d. A study of the most important distributed video coding architectures (the "Stanford" and the "PRISM" architectures), together with their improvements and extensions. 

e. A comparison of the existing implementations of codec designs based on the distributed video coding paradigm. 

f. An implementation of an algorithm based on the concept of "depth from defocus". 

g. A new algorithm for extracting depth information from light polarisation, proposed and implemented by the author. 

h. An evaluation of the proposed algorithm for depth extraction from light polarisation, based on various sets of measurements, with depth maps constructed for the test scenes. 

i. An algorithm for video processing using look-up tables, developed together with the algorithm for computing and writing the data in these tables. 

j. A software implementation and simulation, at frame level, of the algorithm for real-time video processing using look-up tables. 

k. An experimental proof-of-concept model using FPGA technology and look-up tables; the prototype, developed with contributions from the author, is protected by copyright and is the intellectual property of the Grass Valley company (Netherlands). 

From the point of view of the economic value of the research outcome, the proposed algorithm for depth from light polarisation offers the premises for creating new photo/video cameras able to capture both brightness and depth information from the scene. In addition, the research results presented in Chapter 5 on real-time video processing using FPGAs and LUTs were integrated in the latest LDK 8xxx series of professional video cameras, launched in 2012 by the Grass Valley company (Netherlands). 

The main published papers presenting results from the thesis to the scientific community so far are the following: 

S.N. Meza, K.J. Damstra, J.V. Rooy, S. Persa, "Embedded real-time look-up table processing for high definition video signals", Proceedings of the 2010 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR 2010), ISBN 978-1-4244-6724-2, pp. 315-319. 

S.N. Meza, A. Vlaicu, B. Orza, "Bridging the gap between video data acquisition, compression and transmission under emerging technologies and scenarios", Proceedings of the 2010 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR 2010), ISBN 978-1-4244-6724-2, pp. 309-314. 

The author also gave the following public presentation on the subject of the thesis: 

S.N. Meza, "Distributed Video Coding", presentation at the 16th Summer School on Image Processing (SSIP), 2008, Technical University of Vienna, Austria. 

Future development of the author's research aims at optimising the implementation of the proposed algorithm for extracting depth information from light polarisation and at extending it to video. In addition, an experimental model of a device capable of 3D (brightness and depth) data acquisition will be built based on the results of Chapter 4. By integrating the polariser filter at sensor level, direct capture of depth information from a single perspective will be possible, without any intervention on scene illumination and without projecting any scanning pattern. 
