Deliverable 5.2 - the School of Engineering and Design - Brunel ...


dea.brunel.ac.uk
10.07.2015

ICT Project 3D VIVANT – Deliverable 5.2
Contract no.: 248420
Search & Retrieval Mechanisms & Tools
4/03/2013

1 INTRODUCTION

1.1 EXECUTIVE SUMMARY

The past decade has witnessed exponential growth in digital multimedia production and communication. A huge number of images and thousands of hours of video are already created each day by professionals and home users; the establishment of the holoscopic imaging technology advanced by 3D VIVANT will lead to a further explosion in digital content production. With such an increasing rate of content production, the development of effective and efficient content-based retrieval tools becomes of utmost importance.

3D VIVANT aims to provide tools that allow the retrieval of similar objects from holoscopic content databases. Given the inherent problems of current text-based search engines, which rely on subjective manual or automatic annotation, 3D VIVANT opted for a search-by-example approach.

The search and retrieval framework can be used by both professional and home users. However, in the hyperlinker scenario only professional users will have access to integral video editing, and thus to the search and retrieval framework. For this reason, the Graphical User Interfaces were designed with the professional user in mind. For a home user, the GUI design would offer fewer configuration options and emphasize ease of use rather than advanced control of the framework.

The search and retrieval framework will enable efficient editing of hyperlinked integral video and content reuse in different scenarios. Moreover, the tools and methodologies can be used by the home / amateur user to search inside large multimedia databases.

To bridge the gap between conventional 2D and 3D technology and holoscopic imaging, 3D VIVANT develops content-based retrieval mechanisms that support multimodal queries: the user poses holoscopic content queries and the framework answers them with similar multimedia content from various modalities, such as 2D images, full 3D models or range scans, with or without texture information.

The software was developed as an autonomous system in order to reduce integration constraints and enable modular usage of the software systems produced in 3D VIVANT, to allow easy adaptation of the framework to the Content Editing Terminal, and to enable different PCs to be used for editing and pre-processing the content. In this approach the S&R framework takes the original integral video sequence and metadata and extracts an XML file to be used in the hyperlinker framework. The selected approach enables integral video pre-processing in advance and distribution of the computational workload across different PCs. The hyperlinker environment is presented in deliverable D5.4 [2]; however, a short presentation of the usage of the S&R framework with the hyperlinker is given later in this document.

Finally, it is worth noting that the software is based on open source libraries that can be compiled on various operating systems, and it can be easily customized.
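As a rough illustration of the export step described in the executive summary, where the S&R framework processes an integral video sequence and its metadata and writes an XML file for the hyperlinker, the following Python sketch serializes a set of per-frame matches. The element and attribute names (`sequence`, `frame`, `object`, `match`, `ref`, `modality`, `score`), the function `build_hyperlinker_xml` and the sample data are all hypothetical; the actual XML schema used by 3D VIVANT is not given in this section.

```python
import xml.etree.ElementTree as ET


def build_hyperlinker_xml(sequence_name, detections):
    """Serialize per-frame object matches into an XML string.

    All element and attribute names are illustrative assumptions,
    not the schema used by the 3D VIVANT hyperlinker.
    """
    root = ET.Element("sequence", name=sequence_name)
    for det in detections:
        frame = ET.SubElement(root, "frame", index=str(det["frame"]))
        obj = ET.SubElement(frame, "object", id=det["object_id"])
        # Each match refers to similar content found in the database.
        for match in det["matches"]:
            ET.SubElement(
                obj,
                "match",
                ref=match["ref"],
                modality=match["modality"],
                score=f"{match['score']:.3f}",
            )
    return ET.tostring(root, encoding="unicode")


# Made-up sample data: one object in frame 0 with a single match.
detections = [
    {
        "frame": 0,
        "object_id": "obj-1",
        "matches": [
            {"ref": "model_017", "modality": "3d-model", "score": 0.91},
        ],
    },
]
xml_text = build_hyperlinker_xml("demo_sequence", detections)
print(xml_text)
```

Because the output is a plain file, a sketch like this could run on a separate PC ahead of editing, which is the kind of workload distribution the autonomous design is meant to allow.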

1.2 DESIGN OBJECTIVES AND DOCUMENT STRUCTURE

The document provides a detailed description of the search and retrieval framework that was developed in the context of Task 5.1 under WP5. It is emphasized that this document presents the framework, i.e. the environment, mechanisms, tools and User Interfaces, while the actual search and retrieval algorithms are described in deliverable D5.1 [1]. The search and retrieval framework is a software platform that provides the tools and user interfaces to: a) prepare the integral video sequences to be searchable, b) search inside a multimedia database for similar content, and c) extract XML files to be used in the hyperlinker environment.

This deliverable first presents the overall architecture of the search and retrieval framework and then divides the presentation of the framework into two fundamental sections: a) database preparation and b) the actual search inside the database. The graphical User Interfaces (GUI) for each function of the system are presented alongside the discussion of the components.

The core of the system is the low-level feature extraction process, during which multimedia descriptors are extracted for each multimedia object in the database. A manifold learning algorithm then combines descriptors from different modalities to unify the search space and provide multimodal search capabilities.

The descriptor extraction procedure is followed by a feature matching step, whose aim is to establish a similarity measure between any two objects. This step is highly dependent on the low-level feature extraction module, and together they form the basis of the search and retrieval framework.

Finally, the search engine's graphical user interface is presented extensively, with examples that demonstrate each step of the search and retrieval procedures.

The rest of the document is structured as follows: Section 2 describes the framework's architecture, basic modules and menu items. Section 3 discusses the structure of the database and the actions for preparing the database for search queries. Section 4 presents the procedures for searching inside the database using two different approaches, while Section 5 presents the implementation details and programming tools. Finally, Section 6 draws the conclusions of the document.
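The feature matching step described above, which establishes a similarity measure between any two objects, can be illustrated with a minimal search-by-example sketch in Python. It assumes that every object is already represented by a fixed-length descriptor in a common search space (the role played by the manifold learning step, which is not implemented here) and uses plain Euclidean distance on normalized vectors as a stand-in similarity measure. The actual descriptors and matching algorithms are those of deliverable D5.1 [1]; the function names and descriptor values below are made up for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length so distances are comparable."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, database, top_k=3):
    """Return the top_k (object_id, distance) pairs closest to the query."""
    q = l2_normalize(query)
    scored = [
        (obj_id, distance(q, l2_normalize(desc)))
        for obj_id, desc in database.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:top_k]


# Toy database mixing modalities; descriptor values are invented.
database = {
    "image_chair": [0.9, 0.1, 0.0],
    "model_chair": [0.8, 0.2, 0.1],
    "scan_car": [0.0, 0.1, 0.9],
}
results = search([1.0, 0.1, 0.0], database, top_k=2)
for obj_id, dist in results:
    print(obj_id, round(dist, 3))
```

In this toy setup the query descriptor would come from the holoscopic content selected by the user, and the ranked list may contain objects of any modality because all descriptors are assumed to share one space.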

