The ESA Earth Observation Payload Data Ground Segment Infrastructure

Gian Maria Pinna
ESA-ESRIN
GianMaria.Pinna@esa.int


Presentation Content
  • MultiMission Payload Data Ground Segment harmonization
  • MM PDGS Logical Model and Decomposition
  • MultiMission Facility Infrastructure
  • MultiMission related activities
  • Distributed Processing Capacity and Grid
  • Hosting of User's Applications and Grid Processing On-Demand
  • EO MM PDGS Future Challenges


Rationale for PDGS harmonization (1)
  • Historically, most parts of the ESA EO Payload Data Ground Segments have been (re)developed for each mission to be operated
  • The re-use concept has not been a stable part of the development strategy
  • This is even more the case for the acquisition, processing and archiving domain (APAD)
  • The User Services part is more in line with a multimission re-use approach


Rationale for PDGS harmonization (2)
  • In 2003 the EO Directorate approved a strategy for the evolution of the several missions' ground segments (handled and/or to be developed) into an open multimission architecture, which included as main themes:
    • Adoption of a common architecture for all missions
    • Harmonization and standardization of interfaces
    • Evolution of current mission payload data ground segments into the common architecture
    • Standardization of products and formats across missions
    • Re-utilization of already available and proven elements
    • Harmonization and rationalization of archives
  • Implementation of the ESA-proposed EO Long Term Data Preservation (LTDP) strategy
  • In line with the specific requirements for On-line Archive Data Access


Ground Segment Decomposition
  [Diagram: mission-specific elements (processors, acquisition, Q/C, etc.) for Missions A, B, C and D sit on top of examples of multimission common elements: Monitoring & Control, Archives, Data Management, User Services I/F, Networks, Products Packaging.]


PDGS Logical Model (OAIS-based)
  [Diagram: Data Producers deliver products to PDGS Ingestion, which passes metadata & browses to PDGS Data Management and products to PDGS Storage. Data Consumers send queries & orders through PDGS Consumer Access (interactive User Services); orders & metadata reach PDGS Order Processing, which retrieves archived products from PDGS Storage and returns output products to the Data Consumers. PDGS Administration oversees the whole chain.]


MultiMission Facility Infrastructure (MMFI)
  • Following these principles, we developed a common multimission framework for the implementation of the Facility Ground Segment, called the MultiMission Facility Infrastructure (MMFI)
  • The MMFI responds to the requirements and needs highlighted in the previous slides
  • The MMFI has been developed over the course of the last years; it re-uses as much as possible functional elements already existing and in use in the ESA and/or national PDGSs, assembled together in a coherent architecture


Building a Facility Ground Segment
  [Diagram: the core MMFI and central services are combined into mission-specific configurations. Elements shown include the Data Library (ULS, Local Inventory, AMS, SatStore), Request Handling, processing systems, Product Distribution (E-PFD, PFD Central Server, Online Archive, PFD Network Server, DRS), NRT Circulation (Cache In, SiteCache, Cache Out), the Ingestion, Processing and Dissemination chains, and Monitoring & Alarm, Logging and the Monitoring and Control Operating Tool, connecting Data Producers to Data Consumers; mission-specific elements complete the MMFI configurations for the individual missions.]


MMFI Main Functionality
  ‣ Data Ingestion and Archiving
  ‣ Systematic Data Driven Processing
  ‣ NRT Processing
  ‣ Subscriptions
  ‣ Standing Requests
  ‣ Distribution on media (CD/DVD/BRD/Tapes)
  ‣ Online Archive Dissemination
  ‣ Reprocessing
  ‣ Data Circulation
  ‣ Order Driven Production


MMFI Architecture
  [Diagram: the MMFI connects to the central infrastructure (MMMC, MMOHS, EOLI, DAIL). Inside the MMFI, the Data Library (ULS, Local Inventory, AMS) and Request Handling (POH, via the CAR and PR interfaces) sit alongside the Ingestion chain (GFE, PSM, Circulation Cache In, fed by VC4 telemetry from the FOS and the NRT flow), the Processing chain (PFM driving the IPFs over a SiteCache), the Dissemination chain (PSM, Product Distribution, PFD, Circulation Cache Out, E-OA Online Archive, DDS, ...), and the Monitoring & Alarm and Logging functions under the Monitoring and Control Operating Tool (MACH).]


MMFI Architecture: Technologies
  ‣ Gbit Ethernet for the production network
  ‣ NFS, FTP
  ‣ CORBA, SOAP
  ‣ Limited SAN usage in the archiving system (no global FS)
  ‣ Wide use of XML (metadata files, Schemas, XQuery) (a minimal extraction sketch follows below)
  ‣ DRB (Data Request Broker)
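As an illustration only of the XML-based metadata handling mentioned above: the file layout and element names below are hypothetical, not the actual MMFI schema, and a minimal extraction step could look like this in Python.

# Illustrative only: parse a (hypothetical) product metadata XML file and
# pull out the fields an inventory ingestion step would typically need.
import xml.etree.ElementTree as ET

SAMPLE_METADATA = """\
<product>
  <identifier>EO_PRODUCT_0001</identifier>
  <sensingStart>2009-01-01T10:00:00Z</sensingStart>
  <sensingStop>2009-01-01T10:01:40Z</sensingStop>
  <footprint>POLYGON((10 40, 11 40, 11 41, 10 41, 10 40))</footprint>
</product>
"""

def extract_metadata(xml_text: str) -> dict:
    """Return the subset of metadata fields used for cataloguing."""
    root = ET.fromstring(xml_text)
    return {
        "identifier": root.findtext("identifier"),
        "sensing_start": root.findtext("sensingStart"),
        "sensing_stop": root.findtext("sensingStop"),
        "footprint": root.findtext("footprint"),
    }

if __name__ == "__main__":
    print(extract_metadata(SAMPLE_METADATA))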


MMFI Architecture: HW/SW
  ‣ Sun/StorageTek SL8500/9310 tape libraries
  ‣ Sun/StorageTek T9940B tapes (200 GB/tape); planned upgrade to T10000B (1 TB/tape)
  ‣ SAM-FS HSM (recently upgraded), 40 PB distributed license
  ‣ Linux, Solaris OS
  ‣ Various other COTS


MMFI Elements
  • All MMFI elements are highly configurable and expandable
  • A plug-in concept is available in most elements for easy expansion (sketched below)
  • Most workflows and functionalities are available natively or are introduced via a set of already developed plug-ins
  • High use of standardized interfaces and modern technologies, e.g.:
    • Standard processors interface
    • Data description language for product analysis, metadata extraction, input data subsetting from the archive, and product reformatting on output
  • Each element is logically allocated to a function
  • The MMFI is modular and not all elements are needed in each mission-specific PDGS
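To make the plug-in idea above concrete, here is a minimal sketch. The registry, the decorator name and the product type are hypothetical stand-ins, not the actual MMFI code; only the concept (extension points picked up by a hosting element without changing it) comes from the slide.

# Illustrative plug-in registry: new extractors are registered per product
# type and dispatched to by the hosting element without code changes there.
from typing import Callable, Dict

_EXTRACTORS: Dict[str, Callable[[bytes], dict]] = {}

def register_extractor(product_type: str):
    """Decorator used by plug-ins to announce which product type they handle."""
    def wrap(func: Callable[[bytes], dict]):
        _EXTRACTORS[product_type] = func
        return func
    return wrap

@register_extractor("HYPOTHETICAL_L0")
def extract_l0(raw: bytes) -> dict:
    # A real plug-in would decode the product header here.
    return {"type": "HYPOTHETICAL_L0", "size": len(raw)}

def extract(product_type: str, raw: bytes) -> dict:
    """Called by the hosting element; dispatches to whichever plug-in is installed."""
    return _EXTRACTORS[product_type](raw)

if __name__ == "__main__":
    print(extract("HYPOTHETICAL_L0", b"\x00" * 1024))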


MMFI installations
  • The MMFI is today installed in 11 Data Centers in Europe (Acquisition, Processing and Archiving), part of the ESA EO network of facilities
  • Two centers are located in ESA establishments (ESRIN and the Kiruna ESA Station), while all the others are hosted by nationally owned centers
  • All centers are equipped with computing nodes providing the processing capacity for various EO missions/sensors
  • All centers are interconnected via the GEANT European link, with contracted throughputs between 32 Mbps and 200 Mbps (HiSEEN)
  • A VPN across all centers is implemented via a dedicated solution (ODAD), which also provides a standard solution for the internal and DMZ(s) MMFI LANs
  • The interconnected MMFI installations host the ESA EO archive, which amounts to more than 3 PB of data


MM PDGS benefits
  • Lower cost & risks for PDGS implementation, including allocation for late-coming requirements
  • Lower costs & risks in the Transfer To Operations phase
  • Lower costs & risks for HW & SW maintenance and recurrent operations
  • Possibility to show project and mission managers in advance what the operations of "their" mission will look like
  • Mission-specific projects benefit from the evolution of MM elements
  • Lower costs & risks when implementing new requirements during the operations phase
  • Easy and almost transparent relocation of operations in case this is required for a certain mission


MM related activities
  • Various other activities are presently on-going that follow the standardisation/harmonisation of the EO PDGS:
    • HARM (Historical Archive Rationalization and Management) is a project that aims at converting the entire EO archive into a common format, removing overlaps between consecutive data dumps from the satellite
    • SAFE (Standard Archive Format for Europe), started as part of the HARM project, is now proceeding along an independent line, to provide ESA with a standard EO format that can be used by a wider variety of missions/sensors, for both L0 and higher processing level products (http://earth.esa.int/safe)
    • HMA (Heterogeneous Missions Access) provides a set of standardised interface specifications for catalogue, ordering and reporting, promoted in particular outside ESA to enhance the interoperability among different EO data holders
    • MACS (MMFI Automatic Configuration System) aims at simplifying and automating the configuration of the MMFI for specific missions/workflows; it also improves configuration control


EO payload data processing needs
  • EO satellites dump payload data at typical rates from a few Mbps to more than 300 Mbps (more than 500 Mbps in the near future, and much more later with Ka-band)
  • Large volumes of archive data for reprocessing (L0), often in the order of hundreds of TB per reprocessing campaign
  • The requirement is to reduce reprocessing time more and more
  • Data segmentation is normally the single satellite dump, in the order of ~1-10 GB (much more in the future)
  • Additional need for auxiliary data for processing, whose size is in some cases also not negligible (GB)
  • Algorithms vary from very fast (a few minutes per single segment) to "very slow" (many hours, up to 12)
  • Processing algorithms are often not well parallelized/parallelizable, making it difficult to take advantage of highly parallel processing infrastructure such as multi-core CPUs and GPUs
  • There is anyway a need for a robust and performant infrastructure to parallelize the processing of multiple segments (see the sizing sketch below)
  • I/O is the main bottleneck
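To see why segment-level parallelism (rather than parallelism inside a single algorithm) is the practical lever, here is a back-of-envelope sizing sketch; all concrete numbers are hypothetical, chosen inside the ranges quoted in the slide.

# Rough sizing of a reprocessing campaign: how long does it take if each
# node processes one whole segment at a time?
CAMPAIGN_SIZE_TB = 300          # "hundreds of TB per campaign"
SEGMENT_SIZE_GB = 5             # "~1-10 GB per satellite dump"
HOURS_PER_SEGMENT = 2           # "a few minutes ... up to 12 hours"
NODES = 100                     # hypothetical processing nodes across centres

segments = CAMPAIGN_SIZE_TB * 1024 / SEGMENT_SIZE_GB
total_cpu_hours = segments * HOURS_PER_SEGMENT
wall_clock_days = total_cpu_hours / NODES / 24

print(f"{segments:.0f} segments, {total_cpu_hours:.0f} CPU-hours,"
      f" ~{wall_clock_days:.0f} days on {NODES} nodes")
# With these assumptions: ~61440 segments, ~122880 CPU-hours, ~51 days,
# hence the interest in pooling processing capacity across MMFI centres.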


MMFI data processing infrastructure
  PFM (Processing Facility Management)
  • Essentially a specialized job scheduler
  • Its main function is the abstraction of the processing systems (IPF, Instrument Processing Facility) towards the higher-level elements of the MMFI (ordering, archive, etc.)
  • Implements our Standard IPF Interface
  • Selection of the best auxiliary data for processing (several selection policies available, expandable by implementing new modules)
  • Powerful subscription mechanism to the LI, based on OQL (Object Query Language), to be notified of data appearing in the archive (systematic processing, reprocessing, circulation, etc.) (sketched below)
  • Recurrent auxiliary data caching on processing nodes to improve performance
  • Optimization of CPU resources, i.e. multi-(core-)CPUs
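A minimal sketch of the subscription idea above: the class and method names are illustrative stand-ins, not the actual PFM/LI interface, and a Python predicate takes the place of an OQL query. Only the concept (a registered query over product metadata driving data-driven processing) comes from the slide.

# Illustrative data-driven triggering: subscribers register a predicate over
# product metadata and are notified when a newly ingested product matches.
from typing import Callable, List, Tuple

Predicate = Callable[[dict], bool]
Callback = Callable[[dict], None]

class LocalInventoryStub:
    """Toy stand-in for the Local Inventory notification mechanism."""
    def __init__(self):
        self._subs: List[Tuple[Predicate, Callback]] = []

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        self._subs.append((predicate, callback))

    def ingest(self, product: dict) -> None:
        # In the real system the product is also archived; here we only notify.
        for predicate, callback in self._subs:
            if predicate(product):
                callback(product)

def start_processing(product: dict) -> None:
    print(f"submitting job for {product['identifier']}")

if __name__ == "__main__":
    li = LocalInventoryStub()
    # e.g. "systematically process every Level-0 product of mission X"
    li.subscribe(lambda p: p.get("level") == "L0", start_processing)
    li.ingest({"identifier": "HYPOTHETICAL_L0_0001", "level": "L0"})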


Distributed Processing
  • Each MMFI center has been equipped with enough processing nodes for all of the center's processing needs (NRT, systematic processing, mission reprocessing, etc.)
  • In the particular case of the reprocessing campaigns, each center uses its processing capacity only for a limited period of time
  • On the other hand, all MMFI centers are today interconnected via the ODAD VPN
  [Diagram: MMFI processing resources at the different centers linked over the HiSEEN / ODAD WAN.]


Distributed Processing Capacity expansion
  The goal is then to implement a distributed processing solution that:
  • reuses the same resources for different processing requests
  • uses the remote processing resources to accommodate temporary high loads, e.g. for reprocessing scenarios
  • centralizes and simplifies M&C of jobs and resources
  • eases the upgrade of hardware when the current resources are continuously under high load
  2008: Distributed Processing Capacity Study (DPCS)
  2009: Distributed Processing Capacity Operationalization (DPCO)


DPCS
  The Distributed Processing Capacity Study focused on:
  • identifying available parallel processing solutions
  • analyzing potential re-use and adaptation of existing products
  • evaluating the interfacing possibilities to the MMFI
  • analyzing network requirements
  • selecting the most suitable solution
  • designing the architecture as an MMFI extension ...
    • ... that naturally fits into the overall MMFI architecture, and ...
    • ... avoids losing too much of the current MMFI capability to optimize processing workflows
  • implementing a demonstrator prototype
  • evaluating the performance with small and large datasets


DPCS
  • The solution naturally coming out of the study is to interconnect the various MMFI centers via a GRID solution
  [Diagram: the PFM and cache of each MMFI center (A, B, C) connect to a shared Enterprise/Departmental Grid of hardware resources (CPUs).]


DPCS Solution Search
  • Workload Management System with scheduler
  • File transfer mechanism (data management)
  • Proximity considerations: cost (in time) of data transfer during scheduling; wait for a local job to finish instead of using a far node (see the sketch below)
  • Distributed job submission
  • Job priorities
  • Support for Enterprise Grid
  • A standardized API for job submission is desirable
  • Possibility to immediately reclaim a node for local high-priority processing (e.g. NRT)
  • Support for heterogeneous platforms
  • Support for up to 1000 processing nodes
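The proximity requirement above is essentially a cost comparison between waiting for a local node and shipping the input to a free remote node. A minimal sketch of that decision follows; the transfer rates, file size and timings are illustrative, not measurements from the study.

# Decide whether to wait for a local node or ship the input to a remote node,
# by comparing estimated completion times.
def completion_time_local(wait_h: float, processing_h: float) -> float:
    return wait_h + processing_h

def completion_time_remote(input_gb: float, link_mbps: float,
                           processing_h: float) -> float:
    transfer_h = input_gb * 8 * 1024 / link_mbps / 3600
    return transfer_h + processing_h

if __name__ == "__main__":
    # 5 GB segment, 2 h of processing, 100 Mbps HiSEEN-class link,
    # local node free in 1.5 h (all figures illustrative).
    local = completion_time_local(wait_h=1.5, processing_h=2.0)
    remote = completion_time_remote(input_gb=5, link_mbps=100, processing_h=2.0)
    print(f"local: {local:.2f} h, remote: {remote:.2f} h ->",
          "run locally" if local <= remote else "send to remote node")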


DPCS Solution: Condor
  • Focus on computationally intensive applications
  • Supports a network of clusters (departmental grid)
    • Nodes are only registered to the local cluster master
    • Requests are routed via the master
  • "Flocking" mechanism (a minimal job-submission sketch follows below)
    • Jobs can be moved to other scheduler instances
    • Automatic when there is local saturation
  • Supports dual-use nodes
    • No job scheduling on nodes that are busy otherwise (by PFM itself)
    • Node preemption with priority to PFM self-managed nodes
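As a concrete illustration of handing a segment-processing job to Condor: this is a minimal sketch, not the DPCS implementation; the wrapper script, segment and log file names are placeholders, and a configured pool (with flocking set up at pool level) is assumed to exist. Only standard Condor submit-description keywords and the condor_submit command are used.

# Write a minimal Condor submit description and hand it to condor_submit.
import subprocess
import textwrap

submit_description = textwrap.dedent("""\
    universe                = vanilla
    executable              = ipf_processor_wrapper.sh
    arguments               = segment_0001.raw
    transfer_input_files    = segment_0001.raw
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    output                  = job_0001.out
    error                   = job_0001.err
    log                     = job_0001.log
    queue
    """)

with open("segment_0001.sub", "w") as f:
    f.write(submit_description)

# Flocking to other pools is a matter of pool configuration on the submitting
# side (e.g. the FLOCK_TO setting), not of the individual job description.
subprocess.run(["condor_submit", "segment_0001.sub"], check=True)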


DPC Structured Processing Bus
  [Diagram: the PFM and cache of each MMFI center (A, B, C, ...) attach to a common Structured Processing Bus.]


DPC MMFI Architecture
  [Diagram: the same MMFI architecture as before (central infrastructure, Data Library, Request Handling, Ingestion, Processing, Dissemination, Monitoring and Control), with the PFM and its IPFs now also attached to the Structured Processing Bus.]


DPC Conclusion
  • Grid technology
    • Rationalizes Processing Resources (Structured Processing Bus)
    • Enhances Flexibility
      • Use of remote resources with minimal effort
      • Partly automated integration of new processing resources
    • New Functionality
      • Preemption for high-priority requests
  • Still to solve (DPCO)
    • avoid transferring input data over the WAN if the input is also available in the remote MMFI archive
    • avoid transferring results back for distribution to users when distribution capacity is also available in the remote centre


Hosting of User’s applicationAnother area where <strong>ESA</strong> EO made use of GRIDtechnologies is the hosting of user’s applicationsEnd-users can require the processing of specificdatasets held in the <strong>ESA</strong> EO archive with theirown algorithmAfter integration, users autonomously theirprocessing jobs, monitor progress and retrievegenerated productsG-POD (Grid Processing On-Demand)http://eopi.esa.int/G-POD


G-POD Objectives
  • Provide a "user-segment" environment
    • Put data & users' processors together
    • Allow "on-demand" processing of the data
    • Assume common functions
    • Open: able to host "any" processor
    • Sizeable and Secure
  • Offer scientists a "production" lab
    • Focus on algorithms
    • Reuse housekeeping functions (data access, catalogue access, common software tools, etc.)
    • Bridge the gap from "prototype" to "production" processor
  • Offer scientists a "collaboration" environment
    • Share tools and functions
    • Reuse output of other processors
    • IPR is kept; the core processor is developed/maintained by the scientist


G-POD Web Portal
  • Temporal/spatial selection of EO products
  • Job definition, submission and live status monitoring
  • Customisable result visualization interfaces
  • Access to output products and documentation


G-POD History and Status
  • 2002-2004: Development of Grid On-Demand as a technology (GRID) demonstrator to support e-science for Earth Observation
    • Sample data on-line, a few demonstrator applications
    • Used mainly for internal research and to generate "nice images"
  • 2005-2006: "Transfer to Operations" and "Industrialization" of the system
    • MERIS full archive + gradually all Envisat low-bit-rate products
    • Supporting a few internal research applications
    • Test-bed of the first external collaboration (with JRC) for the generation of the global MERIS MGVI aggregated level-3 product
    • System maintenance and evolutions by industry
  • 2006: G-POD included as a ground-segment facility to support routine services
    • Routine generation of high-level products (e.g. level-3)
    • Test-bed of G-POD as a support infrastructure for ESA scientists: Call for CAT-1 proposals
  • 2007-2009: G-POD for the long term
    • G-POD CAT-1 opportunity routinely sustained [http://eopi.esa.int/G-POD]
    • G-POD to expand into ESA archiving centres
    • G-POD for GMES (Global Monitoring for Environment and Security)


Examples of G-POD in Production
  • MERIS Level-3 Products NRT generation
    • Joint ESA collaboration with ACRI (France), JRC/Ispra (European Commission) and Brockmann Consult (BEAM)
    • 11 products published on-line daily/monthly: http://earth.esa.int/meris/level3
  • Daily ASAR GM mapping of Antarctica
    • Internal development, in operations since 2005
    • Daily generation of 400-m resolution mosaics, published to the ESA Web Map Server
  • Aeromeris
    • Fast extraction over a user area of pixels and statistics from the complete MERIS level-2 product archive
    • Output to Excel, Google Earth, XML
  • River and Lake Processor
    • ESA/De Montfort University (UK) collaboration
    • Accurate river and lake height measurements in NRT from satellite altimetry (RA2) products, published online: http://earth.esa.int/riverandlake/
  • MERIS True-Colour Mosaics
    • 9 km resolution global monthly mosaics of MERIS data


MM PDGS Future Challenges
  • Improved Data Access
    • new services/capabilities (request broker, QoS, parallel download, peer-to-peer download, SSO, etc.)
    • further merging of the long-term archive with the on-line archive
    • new services for users (User Services Next Generation)
  • GMES more demanding requirements
    • Higher volume of data
    • Improved performance
    • Standardization of services and products
  • Faster mission dataset reprocessing
    • Distributed Processing Capacity
    • Virtualization to host heterogeneous processors on the same node (study proposed in 2009)
    • Processing acceleration (GP-GPU), experimented with in 2008
    • Improved data exchange, high-performance SAN


Thanks

Questions?
