The CMS Trigger Supervisor: - HEPHY
Doctoral Thesis<br />
Departament d’Enginyeria Electrònica<br />
Universitat Autònoma de Barcelona<br />
<strong>The</strong> <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong>:<br />
Control and Hardware Monitoring System of the <strong>CMS</strong><br />
Level-1 <strong>Trigger</strong> at CERN<br />
Ildefons Magrans de Abril<br />
Director:<br />
Dr. Claudia-Elisabeth Wulz<br />
Tutor:<br />
Dr. Montserrat Nafría Maqueda<br />
March 2008
Dr. Claudia-Elisabeth Wulz, <strong>CMS</strong>-<strong>Trigger</strong> Group leader of the Institute for High Energy Physics in Vienna, and<br />
Deputy <strong>CMS</strong> <strong>Trigger</strong> Project Manager<br />
CERTIFIES<br />
That the dissertation <strong>The</strong> <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong>: Control and Hardware Monitoring System of the <strong>CMS</strong> Level-<br />
1 <strong>Trigger</strong> at CERN, presented by Ildefons Magrans de Abril in fulfilment of the degree of Doctor en Enginyeria<br />
Electrònica, has been performed under her supervision.<br />
Bellaterra, March 2008.<br />
Dr. Claudia-Elisabeth Wulz
Abstract<br />
<strong>The</strong> experiments <strong>CMS</strong> (Compact Muon Solenoid) and ATLAS (A Toroidal LHC ApparatuS) at the Large<br />
Hadron Collider (LHC) are the greatest exponents of the rising complexity of High Energy Physics (HEP) data<br />
handling instrumentation. Tens of millions of readout channels, tens of thousands of hardware boards and a<br />
similar number of connections are representative figures. However, the hardware volume is not the only<br />
dimension of complexity: the unprecedentedly large number of research institutes and scientists that form the<br />
international collaborations, and the long design, development, commissioning and operational phases, are<br />
additional factors that must be taken into account.<br />
<strong>The</strong> Level-1 (L1) trigger decision loop is an excellent example of these difficulties. This system is based on<br />
pipelined logic designed to analyze, without deadtime, the data from each LHC bunch crossing occurring every<br />
25 ns, using special coarsely segmented trigger data from the detectors. <strong>The</strong> L1 trigger is responsible for<br />
reducing the rate of accepted crossings to below 100 kHz. While the L1 trigger is taking its decision, the full<br />
high-precision data of all detector channels are stored in the detector front-end buffers, which are only read out if<br />
the event is accepted. <strong>The</strong> Level-1 Accept (L1A) decision is communicated to the sub-detectors through the<br />
Timing, <strong>Trigger</strong> and Control (TTC) system. <strong>The</strong> L1 decision loop hardware system was built by more than ten<br />
research institutes with a development and construction period of nearly ten years, featuring more than fifty<br />
VME crates, and thousands of boards and connections.<br />
In this context, it is mandatory to provide software tools that ease the integration and the short-, medium- and<br />
long-term operation of the experiment. This research work proposes solutions, based on web services technologies, to<br />
simplify the implementation and operation of software control systems to manage hardware devices for HEP<br />
experiments. <strong>The</strong> main contribution of this work is the design and development of a hardware management<br />
system intended to enable the operation and integration of the L1 decision loop of the <strong>CMS</strong> experiment (<strong>CMS</strong><br />
<strong>Trigger</strong> <strong>Supervisor</strong>, TS).<br />
<strong>The</strong> TS conceptual design proposes a hierarchical distributed system which fits well with the web services<br />
based model of the <strong>CMS</strong> Online SoftWare Infrastructure (OSWI). <strong>The</strong> functional scope of this system covers the<br />
configuration, testing and monitoring of the L1 decision loop hardware, and its interaction with the overall <strong>CMS</strong><br />
experiment control system and the rest of the experiment. Together with the technical design aspects, the project<br />
organization strategy is discussed.<br />
<strong>The</strong> main topic follows an initial investigation into the use of the Extensible Markup Language (XML) as a<br />
uniform data representation format for a software environment to implement hardware management systems for<br />
HEP experiments. This model extends the usage of XML beyond the boundaries of the control and monitoring<br />
related data and proposes its usage also for the code. This effort, carried out in the context of the <strong>CMS</strong> <strong>Trigger</strong><br />
and Data Acquisition project, improved the team's overall knowledge of XML technologies, created a pool of<br />
ideas and helped to anticipate the main TS requirements and architectural concepts.<br />
Visual summary<br />
<strong>The</strong> following outline is a visual summary of the PhD thesis. It reproduces the text boxes of the original<br />
diagram, whose main ideas were connected by labeled arrows. <strong>The</strong> author’s contributions to peer reviewed<br />
journals (p), international conferences (c) and supervised master theses (t) are also indicated next to each box.<br />
Motivation (Chapter 1):<br />
•Unprecedented complexity related to the implementation of hardware control systems for the last generation of<br />
high energy physics experiments: very large hardware systems, human collaborations, and design, development<br />
and operational periods.<br />
Generic solution (Chapter 2) [39]p:<br />
•Development model: XML for data and code; interpreted code. [50]p<br />
Concrete case and main thesis goal (Chapters 1, 3):<br />
•Control and monitoring system for the Level-1 (L1) trigger decision loop.<br />
First lessons (Chapters 2, 3):<br />
•Web services and XDAQ middleware as suitable technologies.<br />
•Experience of developing a hardware management system for the <strong>CMS</strong> experiment.<br />
Conceptual design of the control system for the <strong>CMS</strong> L1 decision loop (<strong>Trigger</strong> <strong>Supervisor</strong>, TS)<br />
(Chapter 3) [56]p, [60]c:<br />
•Requirements.<br />
•Project organization.<br />
•Layered design: Framework, System, Services.<br />
Framework design (Chapter 4) [89]c, [90]t:<br />
•Baseline technology survey.<br />
•Additional developments.<br />
•Performance measurements.<br />
System design (Chapter 5) [95]c:<br />
•Design guidelines.<br />
•Distributed software system architecture.<br />
Services design (Chapter 6) [97]t:<br />
•Configuration, interconnection test and GUI services.<br />
<strong>The</strong>sis achievements (Chapters 7, 8):<br />
•New software environment model: confirms XML and XDAQ.<br />
•TS design and project organization as a successful experience for future experiments.<br />
•A building block of the <strong>CMS</strong> experiment.<br />
•A contribution to the <strong>CMS</strong> operation.<br />
•Proposal for a uniform <strong>CMS</strong> experiment control system.<br />
Contents<br />
ABSTRACT<br />
VISUAL SUMMARY<br />
CONTENTS<br />
ACRONYMS<br />
CHAPTER 1 INTRODUCTION<br />
1.1 CERN AND THE LARGE HADRON COLLIDER<br />
1.2 THE COMPACT MUON SOLENOID DETECTOR<br />
1.3 THE TRIGGER AND DAQ SYSTEM<br />
1.3.1 Overview<br />
1.3.2 <strong>The</strong> Level-1 trigger decision loop<br />
1.3.2.1 Calorimeter <strong>Trigger</strong><br />
1.3.2.2 Muon <strong>Trigger</strong><br />
1.3.2.3 Global <strong>Trigger</strong><br />
1.3.2.4 Timing <strong>Trigger</strong> and Control System<br />
1.4 THE <strong>CMS</strong> EXPERIMENT CONTROL SYSTEM<br />
1.4.1 Run Control and Monitoring System<br />
1.4.2 Detector Control System<br />
1.4.3 Cross-platform DAQ framework<br />
1.4.4 Sub-system Online Software Infrastructure<br />
1.4.5 Architecture<br />
1.5 RESEARCH PROGRAM<br />
1.5.1 Motivation<br />
1.5.2 Goals<br />
CHAPTER 2 UNIFORM MANAGEMENT OF DATA ACQUISITION DEVICES WITH XML<br />
2.1 INTRODUCTION<br />
2.2 KEY REQUIREMENTS<br />
2.3 A UNIFORM APPROACH FOR HARDWARE CONFIGURATION CONTROL AND TESTING<br />
2.3.1 XML as a uniform syntax<br />
2.3.2 XML based control language<br />
2.4 INTERPRETER DESIGN<br />
2.4.1 Polymorphic structure<br />
2.5 USE IN A DISTRIBUTED ENVIRONMENT<br />
2.6 HARDWARE MANAGEMENT SYSTEM PROTOTYPE<br />
2.7 PERFORMANCE COMPARISON<br />
2.8 PROTOTYPE STATUS<br />
CHAPTER 3 TRIGGER SUPERVISOR CONCEPT<br />
3.1 INTRODUCTION<br />
3.2 REQUIREMENTS<br />
3.2.1 Functional requirements<br />
3.2.2 Non-functional requirements<br />
3.3 DESIGN<br />
3.3.1 Initial discussion on technology<br />
3.3.2 Cell<br />
3.3.3 <strong>Trigger</strong> <strong>Supervisor</strong> services<br />
3.3.3.1 Configuration<br />
3.3.3.2 Reconfiguration<br />
3.3.3.3 Testing<br />
3.3.3.4 Monitoring<br />
3.3.3.5 Start-up<br />
3.3.4 Graphical User Interface<br />
3.3.5 Configuration and conditions database<br />
3.4 PROJECT COMMUNICATION CHANNELS<br />
3.5 PROJECT DEVELOPMENT<br />
3.6 TASKS AND RESPONSIBILITIES<br />
3.7 CONCEPTUAL DESIGN IN PERSPECTIVE<br />
CHAPTER 4 TRIGGER SUPERVISOR FRAMEWORK<br />
4.1 CHOICE OF AN ADEQUATE FRAMEWORK<br />
4.2 REQUIREMENTS<br />
4.2.1 Requirements covered by XDAQ<br />
4.2.2 Requirements not covered by XDAQ<br />
4.3 CELL FUNCTIONAL STRUCTURE<br />
4.3.1 Cell operation<br />
4.3.2 Cell command<br />
4.3.3 Factories and plug-ins<br />
4.3.4 Pools<br />
4.3.5 Controller interface<br />
4.3.6 Response control module<br />
4.3.7 Access control module<br />
4.3.8 Shared resource manager<br />
4.3.9 Error manager<br />
4.3.10 Xhannel<br />
4.3.11 Monitoring facilities<br />
4.4 IMPLEMENTATION<br />
4.4.1 Layered architecture<br />
4.4.2 External packages<br />
4.4.2.1 Log4cplus<br />
4.4.2.2 Xerces<br />
4.4.2.3 Graphviz<br />
4.4.2.4 ChartDirector<br />
4.4.2.5 Dojo<br />
4.4.2.6 Cgicc<br />
4.4.2.7 Logging collector<br />
4.4.3 XDAQ development<br />
4.4.4 <strong>Trigger</strong> <strong>Supervisor</strong> framework<br />
4.4.4.1 <strong>The</strong> cell<br />
4.4.4.2 Cell command<br />
4.4.4.3 Cell operation<br />
4.4.4.4 Factories, pools and plug-ins<br />
4.4.4.5 Controller interface<br />
4.4.4.6 Response control module<br />
4.4.4.7 Access control module<br />
4.4.4.8 Error management module<br />
4.4.4.9 Xhannel<br />
4.4.4.9.1 CellXhannelCell<br />
4.4.4.9.2 CellXhannelTb<br />
4.4.4.10 CellToolbox<br />
4.4.4.11 Graphical User Interface<br />
4.4.4.12 Monitoring infrastructure<br />
4.4.4.12.1 Model<br />
4.4.4.12.2 Declaration and definition of monitoring items<br />
4.4.4.13 Logging infrastructure<br />
4.4.4.14 Start-up infrastructure<br />
4.5 CELL DEVELOPMENT MODEL<br />
4.6 PERFORMANCE AND SCALABILITY MEASUREMENTS<br />
4.6.1 Test setup<br />
4.6.2 Command execution<br />
4.6.3 Operation instance initialization<br />
4.6.4 Operation state transition<br />
CHAPTER 5 TRIGGER SUPERVISOR SYSTEM<br />
5.1 INTRODUCTION<br />
5.2 DESIGN GUIDELINES<br />
5.2.1 Homogeneous underlying infrastructure<br />
5.2.2 Hierarchical control system architecture<br />
5.2.3 Centralized monitoring, logging and start-up systems architecture<br />
5.2.4 Persistency infrastructure<br />
5.2.4.1 Centralized access<br />
5.2.4.2 Common monitoring and logging databases<br />
5.2.4.3 Centralized maintenance<br />
5.2.5 Always-on system<br />
5.3 SUB-SYSTEM INTEGRATION<br />
5.3.1 Building blocks<br />
5.3.1.1 <strong>The</strong> TS node<br />
5.3.1.2 Common services<br />
5.3.1.2.1 Logging collector<br />
5.3.1.2.2 Tstore<br />
5.3.1.2.3 Monitor collector<br />
5.3.1.2.4 Mstore<br />
5.3.2 Integration<br />
5.3.2.1 Integration parameters<br />
5.3.2.1.1 OSWI parameters<br />
5.3.2.1.2 Hardware setup parameters<br />
5.3.2.2 Integration cases<br />
5.3.2.2.1 Cathode Strip Chamber Track Finder<br />
5.3.2.2.2 Global <strong>Trigger</strong> and Global Muon <strong>Trigger</strong><br />
5.3.2.2.3 Drift Tube Track Finder<br />
5.3.2.2.4 Resistive Plate Chamber<br />
5.3.2.2.5 Global Calorimeter <strong>Trigger</strong><br />
5.3.2.2.6 Hadronic Calorimeter<br />
5.3.2.2.7 <strong>Trigger</strong>, Timing and Control System<br />
5.3.2.2.8 Luminosity Monitoring System<br />
5.3.2.2.9 Central cell<br />
5.3.2.3 Integration summary<br />
5.4 SYSTEM INTEGRATION<br />
5.4.1 Control system<br />
5.4.2 Monitoring system<br />
5.4.3 Logging system<br />
5.4.4 Start-up system<br />
5.5 SERVICES DEVELOPMENT PROCESS<br />
CHAPTER 6 TRIGGER SUPERVISOR SERVICES<br />
6.1 INTRODUCTION<br />
6.2 CONFIGURATION<br />
6.2.1 Description<br />
6.2.2 Implementation<br />
6.2.2.1 Central cell<br />
6.2.2.2 <strong>Trigger</strong> sub-systems<br />
6.2.2.3 Global <strong>Trigger</strong><br />
6.2.2.3.1 Command interface<br />
6.2.2.3.2 Configuration operation and database<br />
6.2.2.4 Sub-detector cells<br />
6.2.2.5 Luminosity monitoring system<br />
6.2.3 Integration with the Run Control and Monitoring System<br />
6.3 INTERCONNECTION TEST<br />
6.3.1 Description<br />
6.3.2 Implementation<br />
6.3.2.1 Central cell<br />
6.3.2.2 Sub-system cells<br />
6.4 MONITORING<br />
6.4.1 Description<br />
6.5 GRAPHICAL USER INTERFACES............................................................................................................. 109<br />
6.5.1 Global <strong>Trigger</strong> control panel ...................................................................................................... 109<br />
CHAPTER 7 HOMOGENEOUS SUPERVISOR AND CONTROL SOFTWARE INFRASTRUCTURE<br />
FOR THE <strong>CMS</strong> EXPERIMENT AT SLHC................................................................................................... 111<br />
7.1 INTRODUCTION .................................................................................................................................... 111<br />
7.2 TECHNOLOGY BASELINE ...................................................................................................................... 111<br />
7.3 ROAD MAP ........................................................................................................................................... 112<br />
7.4 SCHEDULE AND RESOURCE ESTIMATES ................................................................................................ 113<br />
CHAPTER 8 SUMMARY AND CONCLUSIONS.................................................................................... 115<br />
8.1 CONTRIBUTIONS TO THE <strong>CMS</strong> GENETIC BASE...................................................................................... 116<br />
8.1.1 XSEQ........................................................................................................................................... 116<br />
8.1.2 <strong>Trigger</strong> <strong>Supervisor</strong> ...................................................................................................................... 117<br />
8.1.3 <strong>Trigger</strong> <strong>Supervisor</strong> framework ....................................................................................................117<br />
8.1.4 <strong>Trigger</strong> <strong>Supervisor</strong> system........................................................................................................... 118<br />
8.1.5 <strong>Trigger</strong> <strong>Supervisor</strong> services ........................................................................................................ 118<br />
8.1.6 <strong>Trigger</strong> <strong>Supervisor</strong> Continuation ................................................................................................ 119<br />
8.2 CONTRIBUTION TO THE <strong>CMS</strong> BODY ..................................................................................................... 119<br />
8.3 FINAL REMARKS................................................................................................................................... 119<br />
APPENDIX A TRIGGER SUPERVISOR SOAP API............................................................................ 121<br />
A.1 INTRODUCTION .................................................................................................................................... 121<br />
A.2 REQUIREMENTS ................................................................................................................................... 121<br />
A.3 SOAP API ........................................................................................................................................... 121<br />
A.3.1 Protocol....................................................................................................................................... 121<br />
A.3.2 Request message.......................................................................................................................... 123<br />
A.3.3 Reply message ............................................................................................................................. 124<br />
A.3.4 Cell command remote API .......................................................................................................... 125<br />
A.3.5 Cell Operation remote API ......................................................................................................... 125<br />
A.3.5.1 OpInit ...................................................................................................................................................... 125<br />
A.3.5.2 OpSendCommand.................................................................................................................................... 126<br />
A.3.5.3 OpReset ................................................................................................................................................... 127<br />
A.3.5.4 OpGetState .............................................................................................................................................. 128<br />
A.3.5.5 OpKill...................................................................................................................................................... 129<br />
ACKNOWLEDGEMENTS.............................................................................................................................. 131<br />
REFERENCES.................................................................................................................................................. 133<br />
Acronyms<br />
ACM	Access Control Module
AJAX	Asynchronous JavaScript and XML
ALICE	A Large Ion Collider Experiment
API	Application Program Interface
ATLAS	A Toroidal LHC Apparatus
aTTS	Asynchronous Trigger Throttle System
BX	Bunch crossing
BU	Builder Unit
CCC	Central Crate Cell
CCI	Control Cell Interface
CERN	Conseil Européen pour la Recherche Nucléaire
CGI	Common Gateway Interface
CKC	ClocK crate cell
CMS	Compact Muon Solenoid
CSC	Cathode Strip Chamber
CSCTF	Cathode Strip Chamber Track Finder
CVS	Concurrent Versions System
DAQ	Data Acquisition
DCC	DTTF Central Cell
DCS	Detector Control System
DB	DataBase
DBWG	CMS DataBase Working Group
DIM	Distributed Information Management System
DOM	Document Object Model
DT	Drift Tube
DTSC	Drift Tube Sector Collector
DTTF	Drift Tube Track Finder
ECAL	Electromagnetic CALorimeter
ECS	Experiment Control System
ERM	Error Manager
EVM	EVent Manager
FDL	Final Decision Logic
FED	Front-end Device
FLFM	First Level Function Manager
FM	Function Manager
FPGA	Field Programmable Gate Array
FRL	Front-end Readout Link board
FSM	Finite State Machine
FTE	Full Time Equivalent
FU	Filter Unit
GCT	Global Calorimeter Trigger
GMT	Global Muon Trigger
GT	Global Trigger
GTFE	Global Trigger Front-end
GTL	Global Trigger Logic
GUI	Graphical User Interface
HAL	Hardware Access Library
HCAL	Hadronic CALorimeter
HF	Forward Hadronic calorimeter
HLT	High Level Trigger
HTML	HyperText Markup Language
HTTP	HyperText Transfer Protocol
HEP	High Energy Physics
HW	HardWare
I2O	Intelligent Input/Output
JSP	Java Server Pages
LEP	Large Electron Positron collider
LHC	Large Hadron Collider
LHCb	Large Hadron Collider beauty experiment
LMS	Luminosity Monitoring System
LMSS	Luminosity Monitoring Software System
LUT	Look Up Table
L1	Level-1
L1A	Level-1 Accept signal
ORCA	Object Oriented Reconstruction for CMS Analysis
OSWI	Online SoftWare Infrastructure
PCI	Peripheral Component Interconnect bus standard
PSB	Pipeline Synchronizing Buffer
PSI	PVSS SOAP Interface
PVSS	ProzessVisualisierungs- und SteuerungsSystem
RC	Run Control
RCM	Response Control Module
RCMS	Run Control and Monitoring System
RCT	Regional Calorimeter Trigger
RF2TTC	TTC machine interface
RPC	Resistive Plate Chamber and Remote Procedure Call
RU	Readout Unit
SRM	Shared Resources Manager
SW	SoftWare
SCADA	Supervisory Controls And Data Acquisition
SDRAM	Synchronous Dynamic Random Access Memory
SEC	Service Entry Cell
SLHC	Super LHC
SLOC	Source Lines Of Code
SOAP	Simple Object Access Protocol
SRM	Shared Resource Module
SSCS	Sub-detectors Supervisory and Control Systems
sTTS	Synchronous Trigger Throttle System
TCS	Trigger Control System
TFC	Track Finder Cell
TIM	TIMing module
TOTEM	TOTal cross section, Elastic scattering and diffraction dissociation at the LHC
TPG	Trigger Primitive Generator (HF, HCAL, ECAL, RPC, CSC and DT)
TriDAS	Trigger and Data Acquisition System
TS	Trigger Supervisor
TSCS	Trigger Supervisor Control System
TSMS	Trigger Supervisor Monitoring System
TSLS	Trigger Supervisor Logging System
TSM	Task Scheduler Module
TSSS	Trigger Supervisor Start-up System
TTC	Timing, Trigger and Control System
TTCci	CMS version of the TTC VME interface module
TTCrx	A Timing, Trigger and Control Receiver ASIC for LHC Detectors
TTS	Trigger Throttle System
UA1	Underground Area 1 experiment
UDP	User Datagram Protocol
UML	Unified Modeling Language
URL	Uniform Resource Locator
VME	Versa Module Europa bus standard
WSDL	Web Service Description Language
W3C	World Wide Web Consortium
XDAQ	Cross-platform DAQ framework
XML	EXtensible Markup Language
XPath	XML Path language
XSD	XML Schema Document
XSEQ	Cross-platform SEQuencer
Chapter 1<br />
Introduction<br />
1.1 CERN and the Large Hadron Collider<br />
At CERN, the European laboratory for particle physics, the fundamental structure of matter is studied using particle accelerators. The acronym CERN comes from the earlier French title “Conseil Européen pour la Recherche Nucléaire”. CERN is located on the Franco-Swiss border west of Geneva. It was founded in 1954 and is currently funded by 20 European member states. CERN employs just under 3000 people, of whom only a fraction are particle physicists. This reflects the role of CERN: it does not so much perform particle physics itself as offer its research facilities to particle physicists in Europe and, increasingly, the whole world. About half of the world’s particle physicists, some 6500 researchers from over 500 universities and institutes in some 80 countries, use CERN’s facilities.
The latest of these facilities, designed and currently being built at CERN, is the Large Hadron Collider (LHC) [1]. It is housed in a tunnel of 26.7 km circumference located underground at a depth ranging from 50 to 150 meters (Figure 1-1). The tunnel was formerly used for the Large Electron Positron (LEP) collider. The LHC consists of a superconducting magnet system with two beam channels designed to bring two proton beams into collision at a centre-of-mass energy of 14 TeV. It will also be able to provide collisions of heavy nuclei (Pb-Pb) at a centre-of-mass energy of 2.76 TeV per nucleon.
When the two counter-rotating proton bunches cross, protons within the bunches can collide, producing new particles in inelastic interactions. Such inelastic interactions are also referred to as “events”. The probability for such inelastic collisions to take place is determined by the cross section for proton-proton interactions and by the density and frequency of the proton bunches. The related quantity, which is a characteristic of the collider, is called the luminosity. The design luminosity of the LHC is 10^34 cm^−2 s^−1. The proton-proton inelastic cross section σ_inel depends on the proton energy. At the LHC centre-of-mass energy of 14 TeV, σ_inel is expected to be 70 mb (70·10^−27 cm^2). Therefore, the number of inelastic interactions per second (event rate) is the product of the cross section and the luminosity: N_inel = σ_inel·L = 7·10^8 s^−1. As the bunch crossing rate is 40 MHz, and bearing in mind that during normal operation of the LHC not all bunches are filled (only 2808 out of 3564), the average number of events per filled bunch crossing can be calculated as 7·10^8 · 25·10^−9 · 3564/2808 ≈ 22.
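The rate arithmetic above can be verified in a few lines (a sketch using the nominal LHC design values quoted in the text):

```python
# Nominal LHC design parameters quoted above.
L = 1e34             # design luminosity [cm^-2 s^-1]
sigma_inel = 70e-27  # inelastic p-p cross section [cm^2] (70 mb)
bx_period = 25e-9    # bunch-crossing period [s] (40 MHz)
n_bunches = 3564     # bunch slots per orbit
n_filled = 2808      # filled bunch slots per orbit

# Inelastic event rate: N_inel = sigma_inel * L (~7e8 events/s).
n_inel = sigma_inel * L

# Average events per *filled* bunch crossing; the 3564/2808 factor
# restricts the average to the filled bunch slots.
pileup = n_inel * bx_period * n_bunches / n_filled

print(f"{n_inel:.1e} events/s, ~{pileup:.0f} events per filled crossing")
```

This reproduces the quoted figures of about 7·10^8 inelastic interactions per second and about 22 events per filled bunch crossing.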
The main LHC functional parameters that are most important from the experimental point of view are reported in Table 1-1. At the energy scale and raw data rate aimed at by the LHC, the design of the detectors faces a number of new implementation challenges. LHC detectors must be capable of isolating and reconstructing the interesting events, as only a few events can be recorded out of the 40 million bunch crossings each second. Another technical challenge is the extremely hostile radiation environment.
Figure 1-1: Schematic illustration of the LHC ring with the four experimental points.<br />
Design luminosity (L)	10^34 cm^−2 s^−1
Bunch crossing (BX) rate	40 MHz
Number of bunches per orbit	3564
Number of filled bunches per orbit	2808
Average number of events per bunch crossing	22
Table 1-1: Main LHC functional parameters that are most important from the experimental point of view.<br />
<strong>The</strong>re are four collision points spread over the LHC ring which house the main LHC experiments. <strong>The</strong> two<br />
largest, Compact Muon Solenoid (<strong>CMS</strong>, [2]) and A Toroidal LHC ApparatuS (ATLAS, [3]) are general purpose<br />
experiments that take different approaches, in particular to the detection of muons.<br />
CMS is built around a very-high-field solenoid magnet with a massive iron return yoke; its relative compactness derives from the fact that muons are detected by their bending over a relatively short distance in the very high magnetic field. The ATLAS experiment is substantially bigger and relies essentially on an air-cored toroidal magnet system for the measurement of muons.
Two more special-purpose experiments have been approved to start operation at the switch-on of the LHC machine: A Large Ion Collider Experiment (ALICE, [4]) and the Large Hadron Collider beauty experiment (LHCb, [5]). ALICE is a dedicated heavy-ion detector that will exploit the unique physics potential of nucleus-nucleus interactions at LHC energies, while the LHCb detector is dedicated to the study of CP violation and other rare phenomena in the decays of beauty particles.
1.2 <strong>The</strong> Compact Muon Solenoid detector<br />
<strong>The</strong> <strong>CMS</strong> detector is a general-purpose quasi-hermetic detector. This kind of particle detector is designed to<br />
observe all possible decay products of an interaction between subatomic particles in a collider by covering as<br />
large an area around the interaction point as possible and incorporating multiple types of sub-detectors. <strong>CMS</strong> is<br />
called “hermetic” because it is designed to let as few particles as possible escape.<br />
There are three main components of a particle physics collider detector. From the inside out, the first is a tracker, which measures the momenta of charged particles as they curve in a magnetic field. Next come calorimeters, which measure the energy of most charged and neutral particles by absorbing them in dense material, and finally a muon system, which detects the particles that are not stopped in the calorimeters.
<strong>The</strong> concept of the <strong>CMS</strong> detector was based on the requirements of having a very good muon system whilst<br />
keeping the detector dimensions compact. In this case, only a strong magnetic field would guarantee good<br />
momentum resolution for high momentum muons. Studies showed that the required magnetic field could be<br />
generated by a superconducting solenoid. It is also a particularity of <strong>CMS</strong> that the solenoid surrounds the<br />
calorimeter detectors.<br />
Figure 1-2 shows a schematic drawing of the <strong>CMS</strong> detector and its components that will be described in detail in<br />
the subsequent sections. Figure 1-3 shows a transverse slice of the detector. Trajectories of different kinds of<br />
particles and the traces they leave in the different components of the detector are also shown.<br />
The coordinate system adopted by CMS has its origin at the nominal collision point inside the experiment, the y-axis pointing vertically upward, and the x-axis pointing radially inward toward the centre of the LHC. Thus, the z-axis points along the beam direction toward the Jura mountains from LHC Point 5. The azimuthal angle (φ) is measured from the x-axis in the x-y plane, and the polar angle (θ) from the z-axis. Pseudorapidity is defined as η = −ln tan(θ/2). The momentum and energy measured transverse to the beam direction, denoted by p_T and E_T respectively, are computed from the x and y components.
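As an illustration of these definitions (a small sketch, not CMS software), the transverse momentum, angles and pseudorapidity can be computed directly from the Cartesian momentum components:

```python
import math

def kinematics(px, py, pz):
    """Return (p_T, phi, theta, eta) for a momentum vector in the CMS frame."""
    pt = math.hypot(px, py)               # transverse momentum from the x, y components
    phi = math.atan2(py, px)              # azimuthal angle, measured from the x-axis
    theta = math.atan2(pt, pz)            # polar angle, measured from the z-axis
    eta = -math.log(math.tan(theta / 2))  # pseudorapidity: eta = -ln tan(theta/2)
    return pt, phi, theta, eta

# A particle moving along +x (theta = 90 degrees) has eta = 0.
pt, phi, theta, eta = kinematics(10.0, 0.0, 0.0)
print(pt, phi, eta)
```

A particle emitted perpendicular to the beam axis thus has η = 0, while large |η| corresponds to directions close to the beam line.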
Figure 1-2: Drawing of the complete <strong>CMS</strong> detector, showing both the scale and complexity.
Figure 1-3: Slice through <strong>CMS</strong> showing particles incident on the different sub-detectors.<br />
Tracker<br />
<strong>The</strong> tracking system [6] records the helix traced by a charged particle that curves in a magnetic field by<br />
localizing it in space in finely-segmented layers of detecting material composed of silicon. <strong>The</strong> degree to which<br />
the particle curves is inversely proportional to its momentum perpendicular to the beam, while the degree to<br />
which it drifts in the direction of the beam axis gives its momentum in that direction.<br />
Calorimeters<br />
The calorimeter system is installed inside the coil. It slows particles down and absorbs their energy, allowing that energy to be measured. It is divided into two parts: the Electromagnetic Calorimeter (ECAL, [7]), made of lead tungstate (PbWO4) crystals, absorbs particles that interact electromagnetically by producing electron/positron pairs and bremsstrahlung 1 ; and the Hadronic Calorimeter (HCAL, [8]), made of interleaved copper absorber and plastic scintillator plates, detects hadrons, which interact via the strong nuclear force.
Muon system<br />
Of all the known stable particles, only muons and neutrinos pass through the calorimeter without losing most or<br />
all of their energy. Neutrinos are undetectable, and their existence must be inferred, but muons (which are<br />
charged) can be measured by an additional tracking system outside the calorimeters.<br />
A redundant and precise muon system was one of the first requirements of CMS [9]. The ability to trigger on and reconstruct muons, which are an unmistakable signature of a large number of the new physics processes CMS is designed to explore, is central to the concept. The muon system consists of three technologically different components: Resistive Plate Chambers (RPC), Drift Tubes (DT) and Cathode Strip Chambers (CSC).
1 Bremsstrahlung is electromagnetic radiation produced by the deceleration of a charged particle, such as an electron, when<br />
deflected by another charged particle, such as an atomic nucleus.
<strong>The</strong> muon system of <strong>CMS</strong> is embedded in the iron return yoke of the magnet. It makes use of the bending of<br />
muons in the magnetic field for transverse momentum measurements of muon tracks identified in association<br />
with the tracker. <strong>The</strong> large thickness of absorber material in the return yoke helps to filter out hadrons, so that<br />
muons are practically the only particles apart from neutrinos able to escape from the calorimeter system. <strong>The</strong><br />
muon system consists of 4 stations of muon chambers in the barrel region (Figure 1-3 shows how the 4 stations<br />
correspond to 4 layers of muon chambers) and disks in the forward region.<br />
1.3 <strong>The</strong> <strong>Trigger</strong> and DAQ system<br />
1.3.1 Overview<br />
The CMS Trigger and Data Acquisition (DAQ) system is designed to collect and analyze the detector information at the LHC bunch crossing frequency of 40 MHz. The rate of events to be recorded for offline processing and analysis is of the order of 100 Hz. At the design luminosity of 10^34 cm^−2 s^−1 there will be on average around 22 proton collisions per bunch crossing, producing approximately 1 MB of zero-suppressed data 2 in the CMS readout system. The Level-1 (L1) trigger is designed to reduce the incoming data rate to a maximum of 100 kHz by processing fast trigger information coming from the calorimeters and the muon chambers, and selecting events with interesting signatures. The DAQ system must therefore sustain a maximum input rate of 100 kHz, corresponding to an average data flow of 100 GB/s from about 650 data sources, and must provide enough computing power for the software-based High Level Trigger (HLT) to reduce the rate of stored events by a factor of 1000.
In <strong>CMS</strong> all events that pass the Level-1 trigger are sent to a computer farm (Event Filter) that performs physics<br />
selections, using the offline reconstruction software, to filter events and achieve the required output rate. <strong>The</strong><br />
design of the <strong>CMS</strong> Data Acquisition system and of the High Level trigger is described in detail in the Technical<br />
Design Report [10]. <strong>The</strong> architecture of the <strong>CMS</strong> <strong>Trigger</strong> and DAQ system is shown schematically in Figure 1-4.<br />
Figure 1-4: Overview of the <strong>CMS</strong> <strong>Trigger</strong> and DAQ system architecture.<br />
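The data-flow figures quoted above are mutually consistent, as a quick check shows (a sketch using the nominal design numbers, not part of the DAQ software):

```python
# Nominal CMS Trigger/DAQ design figures from the text.
l1_accept_rate = 100e3   # maximum L1 accept rate [Hz]
event_size = 1e6         # zero-suppressed event size [bytes] (~1 MB)
hlt_reduction = 1000     # rate reduction factor of the High Level Trigger

daq_throughput = l1_accept_rate * event_size   # bytes/s into the DAQ builder
storage_rate = l1_accept_rate / hlt_reduction  # events/s written for offline analysis

print(f"{daq_throughput / 1e9:.0f} GB/s, {storage_rate:.0f} Hz to storage")
```

This recovers the 100 GB/s event-building throughput and the order-100 Hz storage rate stated in the text.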
1.3.2 <strong>The</strong> Level-1 trigger decision loop<br />
The L1 trigger [11] is custom pipelined hardware logic designed to analyze the bunch crossing data every 25 ns without deadtime, using special coarsely segmented trigger data from the muon systems and the calorimeters. The L1 trigger reduces the rate of accepted crossings to below 100 kHz.
The L1 trigger has local, regional and global components. At the bottom end, the Local Triggers, also called Trigger Primitive Generators (TPG), are based on energy deposits in calorimeter trigger towers 3 and on track segments or hit patterns in muon chambers, respectively. Regional Triggers combine their information and use pattern logic to determine ranked and sorted trigger objects such as electron or muon candidates in limited spatial regions. The rank is determined as a function of energy or momentum and quality, which reflects the level of confidence attributed to the L1 trigger parameter measurements, based on detailed knowledge of the detectors and trigger electronics and on the amount of information available. The Global Calorimeter and Global Muon Triggers determine the highest-rank calorimeter and muon objects across the entire experiment and transfer them to the Global Trigger, the top entity of the L1 trigger hierarchy.

2 Zero suppression consists of eliminating leading zeros. This encoding is performed by the on-detector readout electronics to reduce the data volume.
3 Each trigger tower identifies a detector region with an approximate (η,φ)-coverage of 0.087 × 0.087 rad.

Figure 1-5: The Level-1 trigger decision loop.
While the L1 trigger is taking its decision, the full high-precision data of all detector channels are stored in analog or digital buffers, which are only read out if the event is accepted. The L1 decision loop takes 3.2 μs, or 128 bunch crossings, which corresponds to the depth of the front-end buffers. The Level-1 Accept (L1A) decision is communicated to the sub-detectors through the Timing, Trigger and Control (TTC) system. Figure 1-5 shows a diagram of the L1 decision loop.
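The latency figure follows directly from the pipeline depth, as a one-line check confirms (a sketch; the numbers are the design values quoted above):

```python
bx_period_ns = 25   # bunch-crossing period [ns] (40 MHz)
buffer_depth = 128  # front-end pipeline buffer depth [bunch crossings]

# Maximum L1 decision latency: buffer depth times the crossing period.
latency_us = buffer_depth * bx_period_ns / 1000
print(latency_us)  # 3.2
```

Any decision taking longer than this would overwrite detector data still waiting in the front-end pipelines, which is why the 3.2 μs budget is a hard constraint on the L1 hardware.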
1.3.2.1 Calorimeter <strong>Trigger</strong><br />
The first step of the Calorimeter Trigger pipeline is formed by the TPGs. For triggering purposes the calorimeters are subdivided into trigger towers. The TPGs sum the transverse energies measured in ECAL crystals or HCAL readout towers to obtain the trigger tower E_T and attach the correct bunch crossing number. The TPG electronics is integrated with the calorimeter readout. The trigger primitives are transmitted through high-speed serial links to the Regional Calorimeter Trigger (RCT, [12]), which determines candidates for electrons or photons, jets and isolated hadrons, and calculates energy sums in calorimeter regions of 4 × 4 trigger towers. These objects are forwarded to the Global Calorimeter Trigger (GCT, [13]), where the best four objects of each category are selected and sent to the Global Trigger.
1.3.2.2 Muon <strong>Trigger</strong><br />
All three components of the muon systems (DT, CSC and RPC) take part in the trigger. <strong>The</strong> barrel DT chambers<br />
provide local trigger information in the form of track segments in the φ-projection and hit patterns in the η-<br />
projection. <strong>The</strong> endcap CSCs deliver 3-dimensional track segments. All chamber types also identify the bunch<br />
crossing of the corresponding event. <strong>The</strong> Regional Muon <strong>Trigger</strong> joins segments to complete tracks and assigns<br />
physical parameters. It consists of the DT Sector Collector (DTSC, [14]), DT Track Finders (DTTF, [15]) and<br />
CSC Track Finders (CSCTF, [16]). In addition, the RPC trigger chambers, which have excellent timing<br />
resolution, deliver their own track candidates based on regional hit patterns. <strong>The</strong> Global Muon <strong>Trigger</strong> (GMT,<br />
[17]) then combines the information from the three sub-detectors, achieving an improved momentum resolution<br />
and efficiency compared to the stand-alone systems.<br />
1.3.2.3 Global <strong>Trigger</strong><br />
<strong>The</strong> Global <strong>Trigger</strong> (GT, [18]) takes the decision to accept an event for further evaluation by the HLT based on<br />
trigger objects delivered by the GCT and GMT. <strong>The</strong> GT has five basic stages: input, logic, decision, distribution<br />
and readout. Three Pipeline Synchronizing Buffer (PSB) input boards receive the calorimeter trigger objects<br />
from the GCT and align them in time. <strong>The</strong> muons are received from the GMT through the backplane. An<br />
additional PSB board can receive direct trigger signals from sub-detectors or the TOTEM experiment [19] for<br />
special purposes such as calibration. <strong>The</strong>se signals are called “technical triggers”. <strong>The</strong> core of the GT is the<br />
Global <strong>Trigger</strong> Logic (GTL) board, in which algorithm calculations are performed. <strong>The</strong> most basic algorithms<br />
consist of applying p_T or E_T thresholds to single objects, or of requiring the jet multiplicities to exceed defined<br />
values. Since location and quality information is available, more complex algorithms based on topological<br />
conditions can also be programmed into the logic. <strong>The</strong> number of algorithms that can be executed in parallel is<br />
128, and up to 64 technical trigger bits may in addition be received directly from a dedicated PSB board. <strong>The</strong> set<br />
of algorithm calculations performed in parallel is called “trigger menu”.<br />
<strong>The</strong> results of the algorithm calculations are sent to the Final Decision Logic (FDL) board in the form of one bit<br />
per algorithm. Up to eight final ORs can be applied and correspondingly eight L1A signals can be issued. For<br />
normal physics data taking a single trigger mask is applied, and the L1A decision is taken accordingly. <strong>The</strong> rest<br />
of L1As are used for commissioning, calibration and tests of individual sub-systems 4 .<br />
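As an illustration of the final-OR stage, the following Python sketch derives one L1A bit per TCS/DAQ partition from the 128 algorithm bits and the 64 technical trigger bits. The function and the mask representation are invented for this example; the real logic is implemented in the FDL firmware.

```python
def final_or(algo_bits, tech_bits, masks):
    """Compute one L1A bit per TCS/DAQ partition.

    algo_bits: 128-bit int, one bit per algorithm result
    tech_bits: 64-bit int, one bit per technical trigger
    masks:     list of up to 8 (algo_mask, tech_mask) pairs,
               one per partition (illustrative representation)
    """
    l1a = []
    for algo_mask, tech_mask in masks:
        fired = (algo_bits & algo_mask) | (tech_bits & tech_mask)
        l1a.append(1 if fired else 0)
    return l1a

# Partition 0 watches algorithm bit 3 only; partition 1 watches technical bit 0.
masks = [(1 << 3, 0), (0, 1 << 0)]
print(final_or(algo_bits=1 << 3, tech_bits=0, masks=masks))  # [1, 0]
```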
<strong>The</strong> distribution of the L1A decision to the sub-systems is performed by two L1A OUT output boards, provided<br />
that it is authorized by the <strong>Trigger</strong> Control System described in Section 1.3.2.4. A TIMing module (TIM) is also<br />
necessary to receive the LHC machine clock and to distribute it to the boards.<br />
Finally, the Global <strong>Trigger</strong> Front-end (GTFE) board sends the GT data records to the DAQ Event Manager<br />
(EVM, Section 1.4.3), located in the surface control room. <strong>The</strong>se records consist of the GPS event time received<br />
from the machine, the total L1A count, the bunch crossing number in the range from 1 to 3564, the orbit number,<br />
the event number for each TCS/DAQ partition, all FDL algorithm bits and other information.<br />
1.3.2.4 Timing, <strong>Trigger</strong> and Control System<br />
<strong>The</strong> Timing, <strong>Trigger</strong> and Control (TTC) system provides for the distribution of L1A and fast control signals (e.g.<br />
synchronization and reset commands, and test and calibration triggers) to the detector front-ends depending on<br />
the status of the sub-detector readout systems and the data acquisition. <strong>The</strong> status is derived from signals<br />
provided by the <strong>Trigger</strong> Throttle System (TTS). <strong>The</strong> TTC system consists of the <strong>Trigger</strong> Control System (TCS,<br />
[20]) module and the Timing, <strong>Trigger</strong> and Control distribution network [21].<br />
<strong>The</strong> TCS allows different sub-systems to be operated independently if required. For this purpose the experiment<br />
is subdivided into 32 partitions. A partition represents a major component of a sub-system. Each partition must<br />
be assigned to a partition group, also called a TCS partition. Within such a TCS partition all connected partitions<br />
operate concurrently. For commissioning and testing up to eight TCS partitions are available, which each receive<br />
their own L1A signals distributed in different time slots allocated by a priority scheme or in round robin mode.<br />
During normal physics data taking there is only one single TCS partition.<br />
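The allocation of L1As among TCS partitions in round-robin mode can be pictured as follows. This is a simplified sketch: the actual TCS allocates time slots in hardware and also supports a priority scheme.

```python
from itertools import cycle

def allocate_slots(active_partitions, n_slots):
    """Assign consecutive L1A time slots to the active TCS partitions
    in round-robin order (illustrative only)."""
    rr = cycle(active_partitions)
    return [next(rr) for _ in range(n_slots)]

# Three of the up-to-eight TCS partitions are active during commissioning.
print(allocate_slots(["DT", "CSC", "RPC"], 7))
# ['DT', 'CSC', 'RPC', 'DT', 'CSC', 'RPC', 'DT']
```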
4 <strong>The</strong> sub-system concept includes the sub-detectors and the Level-1 trigger sub-systems.
Sub-systems may either be operated centrally as members of a partition or privately through a Local <strong>Trigger</strong><br />
Controller (LTC). Switching between central and local mode is performed by the TTCci (TTC <strong>CMS</strong> interface)<br />
module, which provides the interface between the respective trigger control module and the destinations for the<br />
transmission of the L1A signal and other fast commands for synchronization and control. At the destinations the<br />
TTC signals are received by TTC receivers (TTCrx).<br />
<strong>The</strong> TCS, which resides in the Global <strong>Trigger</strong> crate, is connected to the LHC machine through the TIM module,<br />
to the FDL through the GT backplane, and to 32 TTCci modules through the L1A OUT boards. <strong>The</strong> TTS, to<br />
which it is also connected, has a synchronous (sTTS) and an asynchronous branch (aTTS). <strong>The</strong> sTTS collects<br />
status information from the front-end electronics of 24 sub-detector partitions and up to eight tracker and preshower<br />
front-end buffer emulators 5 . <strong>The</strong> status signals, coded in four bits, denote the conditions “disconnected”,<br />
“overflow warning”, “synchronization loss”, “busy”, “ready” and “error”. <strong>The</strong> signals are generated by the Fast<br />
Merging Modules (FMM) through logical operations on up to 32 groups of four sTTS binary signals and are<br />
received by four conversion boards located in a 6U crate next to the GT central crate. <strong>The</strong> aTTS runs under<br />
control of the DAQ software and monitors the behavior of the readout and trigger electronics. It receives and<br />
sends status information concerning the 8 DAQ partitions, which match the TCS partitions. It is coded in a<br />
similar way as for the sTTS.<br />
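The merging performed by the FMMs can be thought of as a worst-case reduction over the partition states. In the sketch below the severity ordering is an assumption made for illustration; the real FMM logic operates on the four-bit hardware codes.

```python
# Assumed severity order, most severe first (illustrative only; the
# actual FMM priorities are defined by the hardware design).
SEVERITY = ["disconnected", "error", "synchronization loss",
            "busy", "overflow warning", "ready"]

def merge_states(states):
    """Merge the sTTS states of several inputs into one summary state
    by taking the most severe one (lowest index in SEVERITY)."""
    return min(states, key=SEVERITY.index)

print(merge_states(["ready", "busy", "ready"]))  # busy
```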
Depending on the meaning of the status signals different protocols are executed. For example, if excessive<br />
trigger rates cause warnings about resource usage, pre-scale factors may be applied in the FDL to the algorithms<br />
causing them. A loss of synchronization would initiate a reset procedure. General trigger rules for minimal<br />
spacing of L1As are also implemented in the TCS. <strong>The</strong> total deadtime at the maximum L1 trigger output rate of<br />
100 kHz is estimated to be below 1%. Deadtime and monitoring counters are provided by the TCS.<br />
1.4 <strong>The</strong> <strong>CMS</strong> Experiment Control System<br />
<strong>The</strong> <strong>CMS</strong> Experiment Control System (ECS) is a complex distributed software system that manages the<br />
configuration, monitoring and operation of all equipment involved in the different activities of the experiment:<br />
<strong>Trigger</strong> and DAQ system, detector operations and the interaction with the outside world. This software system<br />
consists of the Run Control and Monitor System (R<strong>CMS</strong>), the Detector Control System (DCS), a distributed<br />
processing environment (XDAQ) and the sub-system Online SoftWare Infrastructure (OSWI). <strong>The</strong>se<br />
components are described in the following sections.<br />
1.4.1 Run Control and Monitoring System<br />
<strong>The</strong> Run Control and Monitoring System (R<strong>CMS</strong>) ([10], pp.191-208; [22]) is one of the principal components of<br />
the ECS and the one that provides the interface to control the overall experiment in data taking operations. This<br />
software system configures and controls the online software of the DAQ components and the sub-detector<br />
control systems.<br />
<strong>The</strong> R<strong>CMS</strong> system has a hierarchical structure with eleven main branches, one per sub-system, e.g. HCAL, the<br />
central DAQ or the L1 trigger. <strong>The</strong> basic element in the control tree is the Function Manager (FM). It consists of<br />
a finite state machine and a set of services. <strong>The</strong> state machine model has been standardized for the first level of<br />
FMs in the control tree. <strong>The</strong>se nodes are the interface to the sub-detector control software (Section 1.4.4).<br />
<strong>The</strong> R<strong>CMS</strong> system is implemented in the R<strong>CMS</strong> framework, which provides a uniform API to common tasks<br />
like storage and retrieval from the process configuration database, state-machine models for process control, and<br />
access to the monitoring system. <strong>The</strong> framework provides also a set of services which are accessible to the FM’s.<br />
<strong>The</strong> services comprise a security service for authentication and user account management, a resource service for<br />
storing and delivering configuration information of online processes, access to remote processes via resource<br />
proxies, error handlers, a log message application to collect, store and distribute messages, and the “job control”<br />
to start, stop and monitor processes in a distributed environment.<br />
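A Function Manager can be sketched as a finite state machine plus services. The state and transition names below are illustrative, not the standardized CMS state model.

```python
# Illustrative transition table: (current state, input) -> next state.
TRANSITIONS = {
    ("Initial", "configure"): "Configured",
    ("Configured", "start"):  "Running",
    ("Running", "stop"):      "Configured",
    ("Configured", "halt"):   "Initial",
}

class FunctionManager:
    def __init__(self):
        self.state = "Initial"

    def fire(self, transition):
        """Apply a transition, rejecting inputs not allowed in the
        current state."""
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise RuntimeError(f"{transition!r} not allowed in {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state

fm = FunctionManager()
fm.fire("configure")
fm.fire("start")
print(fm.state)  # Running
```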
5 Buffer emulator: Hardware system responsible for emulating the status of the front-end buffers and vetoing trigger<br />
decisions based on this status.
<strong>The</strong> R<strong>CMS</strong> services are implemented in the programming language Java as web applications. <strong>The</strong> controller<br />
Graphical User Interface (GUI) is based on Java Server Pages technology (JSP, [23]). <strong>The</strong> eXtensible Markup<br />
Language (XML, [24]) data format and the Simple Object Access Protocol (SOAP, [25]) are used for<br />
inter-process communication. Finally, the job control is implemented in C++ using the XDAQ framework<br />
(Section 1.4.3).<br />
1.4.2 Detector Control System<br />
<strong>The</strong> Detector Control System (DCS) ([10], pp. 209-222) is responsible for operating the auxiliary detector<br />
infrastructures: high and low voltage controls, cooling facilities, supervision of all gas and fluids sub-systems,<br />
control of all racks and crates, and the calibration systems. <strong>The</strong> DCS also plays a major role in the protection of<br />
the experiment from any adverse event. <strong>The</strong> DCS runs as a slave of the R<strong>CMS</strong> system during the data-taking<br />
process. Many of the functions provided by the DCS are needed at all times; as a result, the DCS must also function<br />
as the master outside data-taking periods.<br />
<strong>The</strong> DCS is organized in a hierarchy of nodes. <strong>The</strong> topmost point of the hierarchy offers global commands like<br />
“start” and “stop” for the entire detector. <strong>The</strong> commands are propagated towards the lower levels of the<br />
hierarchy, where the different levels interpret the commands received and translate them into the corresponding<br />
commands specific to the system they represent. As an example, a global “start” command is translated into a<br />
“HV ramp-up” command for a sub-detector. Correspondingly, a summary of the lower level states defines the<br />
state of the upper levels. As an example, the state “HV on” of a sub-detector is summarized as “running” in the<br />
global state. <strong>The</strong> propagation of commands ends at the lowest level at the “devices” which are representations of<br />
the actual hardware.<br />
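The command translation and state summarization described above can be sketched as a small tree of nodes. The tree, the translation rule and the summarization rule are invented for this example; PVSS-based DCS hierarchies are far richer.

```python
class DCSNode:
    def __init__(self, name, children=(), translate=None):
        self.name = name
        self.children = list(children)
        self.translate = translate or {}
        self.state = "off"

    def command(self, cmd):
        # Translate the incoming command, e.g. "start" -> "HV ramp-up".
        local = self.translate.get(cmd, cmd)
        if not self.children:                 # a "device" at the lowest level
            self.state = "HV on" if local == "HV ramp-up" else local
            return
        for child in self.children:
            child.command(local)
        # Summarize the children: "running" only if all report a good state.
        good = all(c.state in ("HV on", "running") for c in self.children)
        self.state = "running" if good else "error"

tracker = DCSNode("tracker HV", translate={"start": "HV ramp-up"},
                  children=[DCSNode("channel 1"), DCSNode("channel 2")])
top = DCSNode("CMS", children=[tracker])
top.command("start")
print(top.state)  # running
```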
A commercial <strong>Supervisor</strong>y Control And Data Acquisition (SCADA) system, PVSS II [26], was chosen by all<br />
LHC experiments as the supervisory system of their corresponding DCS systems. PVSS II is a development<br />
environment for a SCADA system which offers many of the basic functionalities needed to fulfill the tasks<br />
mentioned above.<br />
1.4.3 Cross-platform DAQ framework<br />
<strong>The</strong> XDAQ framework ([10], pp. 173-190; [27]) is a domain-specific middleware 6 designed for high energy<br />
physics data acquisition systems [28]. <strong>The</strong> framework includes a collection of generic components to be used in<br />
various application scenarios and specific environments with a limited customization effort. One of them is the<br />
event builder [29] that consists of three collaborating components, a Readout Unit (RU), a Builder Unit (BU) and<br />
an EVent Manager (EVM). <strong>The</strong> logical components and interconnects of the event builder are shown<br />
schematically in Figure 1-6.<br />
An event enters the system as a set of fragments distributed over the Front-end Devices (FED’s). It is the task of<br />
the EVB to collect the fragments of an event, assemble them and send the full event to a single processing unit.<br />
To this end, a builder network connects ~500 Readout Units (RUs) to ~500 Builder Units (BUs). <strong>The</strong> event<br />
data is read out by sub-detector specific hardware devices and forwarded to the Readout Units. <strong>The</strong> RUs<br />
temporarily store the event fragments until the reception of a control message to forward a specific event fragment<br />
to a Builder Unit. A Builder Unit collects the event fragments belonging to a single collision event from all RUs<br />
and combines them to a complete event. <strong>The</strong> BU exposes an interface to event data processors, called the filter<br />
units (FU). This interface can be used to make event data persistent or to apply event-filtering algorithms. <strong>The</strong><br />
EVM interfaces to the L1 trigger readout electronics and controls the event building process by mediating<br />
control messages between RU’s and BU’s.<br />
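The assembly step performed by a Builder Unit can be sketched as follows. This is a toy model: the fragment routing decided by the EVM, error handling and flow control are all omitted.

```python
from collections import defaultdict

class BuilderUnit:
    """Toy Builder Unit: collects the fragments of each event and
    assembles them into a full event once all FEDs have reported."""
    def __init__(self):
        self.pending = defaultdict(dict)   # event_id -> {fed_id: data}
        self.full_events = {}

    def receive(self, event_id, fed_id, data, n_feds):
        self.pending[event_id][fed_id] = data
        if len(self.pending[event_id]) == n_feds:
            frags = self.pending.pop(event_id)
            # Concatenate fragments in FED order to build the full event.
            self.full_events[event_id] = b"".join(frags[f] for f in sorted(frags))

bu = BuilderUnit()
for fed, frag in [(0, b"aa"), (1, b"bb"), (2, b"cc")]:
    bu.receive(event_id=42, fed_id=fed, data=frag, n_feds=3)
print(bu.full_events[42])  # b'aabbcc'
```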
All DAQ components, i.e. the Event Managers (8), Readout Units (~500), Builder Units (~4000) and Filter<br />
Units (~4000), are supervised by the R<strong>CMS</strong> system.<br />
6 Middleware is a software framework intended to facilitate the connection of other software components or applications.<br />
It consists of a set of services that allow multiple processes running on one or more machines to interact across a network.
[Figure 1-6 appears here. Its labels read: Readout Units buffer event fragments; event data fragments are stored<br />
in separate physical memory systems; the Event Manager interfaces between RUs, BUs and the <strong>Trigger</strong>; Builder<br />
Units assemble event fragments; full event data are stored in a single physical memory system associated to a<br />
processing unit; events are processed and stored persistently by the collection of Filter Units.]<br />
Figure 1-6: Logical components and interconnects of the event builder.<br />
1.4.4 Sub-system Online Software Infrastructure<br />
In addition to the sub-system DCS sub-tree and the Readout Units tailored to fit the specific front-end<br />
requirements, the sub-system Online SoftWare Infrastructure (OSWI) consists of Linux device drivers, C++<br />
APIs to control the hardware at a functional level, scripts to automate testing and configuration sequences,<br />
standalone graphical setups and web-based interfaces to remotely operate the sub-system hardware.<br />
Graphical setups were developed using a broad spectrum of technologies: the Java programming language [30],<br />
the C++ language with the Qt library [31], or the Python scripting language [32]. Web-based applications were<br />
developed either with the Java programming language and the Tomcat server [33], or with the C++ language and<br />
the XDAQ middleware.<br />
Most of the sub-detectors implemented their supervisory and control systems with C++ and the XDAQ<br />
middleware. <strong>The</strong>se distributed systems are mainly intended to download and upload parameters in the front-end<br />
electronics. <strong>The</strong> sub-detector control systems also expose a SOAP API in order to integrate with the R<strong>CMS</strong>.<br />
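The SOAP integration amounts to exchanging XML envelopes over HTTP. A minimal envelope carrying a hypothetical Configure command could be built as below; the command element and its parameter are invented for illustration, not the actual RCMS/XDAQ message schema.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def make_soap_command(command, parameters):
    """Build a minimal SOAP envelope carrying a control command
    (illustrative message layout)."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    cmd = ET.SubElement(body, command)
    for name, value in parameters.items():
        ET.SubElement(cmd, name).text = str(value)
    return ET.tostring(env, encoding="unicode")

msg = make_soap_command("Configure", {"runNumber": 1234})
print(msg)
```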
1.4.5 Architecture<br />
Figure 1-7 shows the architecture of the <strong>CMS</strong> Experiment Control System which integrates the online software<br />
systems presented in Sections 1.4.2, 1.4.3, and 1.4.4.<br />
Up to eight instances of the R<strong>CMS</strong>, or R<strong>CMS</strong> sessions, can exist concurrently. Each of them operates a subset of<br />
the <strong>CMS</strong> sub-detectors. An R<strong>CMS</strong> session consists of a central Function Manager (FM) that coordinates the<br />
operation of the sub-system FMs involved in the session. An R<strong>CMS</strong> session normally involves a number of<br />
sub-detectors, DAQ components and the L1 trigger.<br />
<strong>The</strong> sub-detector FM operates the sub-detector supervisory and control systems, which in turn configure the<br />
sub-detector front-end electronics. <strong>The</strong> DAQ FM configures and controls the DAQ software and hardware<br />
components in order to set up a distributed system able to read out the event fragments from the sub-detectors,<br />
and to build, filter and record the most promising events.<br />
[Figure 1-7 appears here. Its labels show up to eight Run Control sessions, each with sub-detector, DAQ and<br />
<strong>Trigger</strong> FMs; the DCS Panel, DCS <strong>Supervisor</strong> and DCS servers; per sub-detector a DCS and an XDAQ front-end<br />
crate; the XDAQ RUs, BUs, FUs and EVMs; and the OSWI operating the GT, GMT, RCT, GCT and CSCTF<br />
trigger crates.]<br />
Finally, the L1 trigger FM drives the configuration of the L1 decision loop. <strong>The</strong> L1 trigger generates L1As that<br />
are distributed to the 32 sub-detector partitions according to the configuration of the TTC system. Up to eight<br />
exclusive subsets of the sub-detector partitions or DAQ partitions can be handled independently by the TTC<br />
system. Each R<strong>CMS</strong> session controls the configuration of one DAQ partition. <strong>The</strong>refore, the L1 decision loop is<br />
a shared infrastructure among the different sessions. A software facility to control it must be able to serve up to<br />
8 R<strong>CMS</strong> sessions concurrently while avoiding inconsistent configuration operations among sessions. <strong>The</strong> design<br />
of the L1 decision loop hardware management system is the main subject of this PhD thesis.<br />
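The consistency requirement can be illustrated with a reservation scheme in which a session must own the shared resource before configuring it. This is a deliberately simplified sketch; the mechanism actually adopted for the Trigger Supervisor is developed in the following chapters.

```python
import threading

class SharedTriggerResource:
    """Toy consistency guard: a session must hold the reservation
    before it may configure the shared L1 decision loop."""
    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None
        self.config = {}

    def reserve(self, session):
        with self._lock:
            if self.owner not in (None, session):
                raise RuntimeError(f"resource held by {self.owner}")
            self.owner = session

    def configure(self, session, key, value):
        with self._lock:
            if self.owner != session:
                raise RuntimeError("reserve the resource first")
            self.config[key] = value

    def release(self, session):
        with self._lock:
            if self.owner == session:
                self.owner = None

loop = SharedTriggerResource()
loop.reserve("session-1")
loop.configure("session-1", "trigger_menu", "physics_v1")
loop.release("session-1")
print(loop.config)  # {'trigger_menu': 'physics_v1'}
```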
Figure 1-7: Architecture of the <strong>CMS</strong> Experiment Control System.<br />
1.5 Research program<br />
1.5.1 Motivation<br />
<strong>The</strong> design and development of a software system to operate DAQ hardware devices includes the definition of<br />
sequences containing read, write, test and exception handling operations for initialization and parameterization<br />
purposes. <strong>The</strong>se sequences, for instance, are responsible for downloading firmware code and for setting tunable<br />
parameters like threshold values or parameters to compensate for the accrued radiation damage. Mechanisms to<br />
execute tests on hardware devices and for detecting and diagnosing faults are also needed.<br />
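A typical element of such sequences is a write followed by a read-back verification, with an exception raised for diagnosis when the two disagree. The Board class and the register name below are hypothetical.

```python
class Board:
    """Stand-in for a hardware board with addressable registers."""
    def __init__(self):
        self.registers = {}

    def write(self, reg, value):
        self.registers[reg] = value

    def read(self, reg):
        return self.registers.get(reg)

def set_and_verify(board, reg, value):
    """Write a parameter, read it back and raise a diagnostic
    error if the verification fails."""
    board.write(reg, value)
    readback = board.read(reg)
    if readback != value:
        raise IOError(f"{reg}: wrote {value}, read back {readback}")
    return readback

b = Board()
set_and_verify(b, "muon_pt_threshold", 14)
print(b.read("muon_pt_threshold"))  # 14
```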
However, choosing a programming language, reading the hardware application notes and defining configuration,<br />
testing and monitoring sequences is not enough to deal with the complexity of the latest generation of HEP<br />
experiments. <strong>The</strong> unprecedented number of hardware items, the long periods of preparation and operation, and<br />
last but not least the human context, are three complexity dimensions that need to be added to the conceptual<br />
design process.<br />
Number<br />
Fabjan and Fischer [34] have observed that the availability of the ever increasing sophistication, reliability and<br />
convenience in data handling instrumentation has led inexorably to detector systems of increased complexity.<br />
<strong>CMS</strong> and ATLAS are the greatest exponents of this rising complexity. <strong>The</strong> progression in channel numbers,<br />
event rates, bunch crossing rates, event sizes, and data rates in three well known big experiments which belong to
the decades 1980s (UA1), 1990s (H1) and 2000s (<strong>CMS</strong>) is shown in Table 1-2. <strong>The</strong> huge number of channels,<br />
the highly configurable data handling instrumentation (DHI) based on FPGAs, and the distributed nature of this<br />
hardware system were unprecedented requirements to cope with during the conceptual design.<br />
Experiment                      UA1       H1         <strong>CMS</strong><br />
Tracking [channels]             10^4      10^4       10^8<br />
Calorimeter [channels]          10^4      5·10^4     6·10^5<br />
Muons [channels]                10^4      2·10^5     10^6<br />
Bunch crossing interval [ns]    3400      96         25<br />
Raw data rate [bit·s^-1]        10^9      3·10^11    4·10^15<br />
Tape write rate [Hz]            10        10         100<br />
Mean event size [byte]          100k      125k       1M<br />
Table 1-2: Data acquisition parameters for UA1 (1982), H1 (1992) and <strong>CMS</strong> [35].<br />
Time<br />
<strong>The</strong> preparation and operation of HEP experiments typically spans over a period of many years (e.g. 1992, <strong>CMS</strong><br />
Letter of intent [36]). During this time the hardware and software environments evolve. Throughout all phases,<br />
integrators have to deal with system modifications [28]. In such a heterogeneous and evolving environment, a<br />
considerable development effort is required to design and implement new interfaces, synchronize and integrate<br />
them with all other sub-systems, and support the configuration and control of all parts.<br />
<strong>The</strong> long operational phases also influence the discussion about the convenience of using commercial<br />
components rather than in-house solutions. <strong>The</strong>re is simply not enough manpower to build all components in-house.<br />
However, the use of commercial components has a number of risks: First, a selected component may turn<br />
out to have insufficient performance or scalability, or simply have too many bugs to be usable. Significant<br />
manpower is therefore spent on selecting components, and on validating selected components. Another<br />
significant risk with commercial components is that the running time of the <strong>CMS</strong> experiment, at least 15 years<br />
starting from 2008, is much larger than the lifetime of most commercial software products [37].<br />
Human<br />
Despite the necessary and highly hierarchical structure of a collaboration of more than 2000 people, different<br />
sub-systems might implement solutions based on heterogeneous platforms and interfaces. <strong>The</strong>refore, the design of a<br />
hardware management system should maximize the range of technologies that can be integrated. A second aspect<br />
of the human context that should guide the system design is that only some of the software project members are<br />
computing professionals: most are trained as physicists, and they often work only part-time on software.<br />
1.5.2 Goals<br />
This research work, carried out in the context of the <strong>Trigger</strong> and Data Acquisition (TriDAS) project of the <strong>CMS</strong><br />
experiment at the Large Hadron Collider, proposes web-based technological solutions to simplify the<br />
implementation and operation of software control systems to manage hardware devices for high energy physics<br />
experiments. <strong>The</strong> main subject of this work is the design and development of the <strong>Trigger</strong> <strong>Supervisor</strong>, a hardware<br />
management system that enables the integration and operation of the Level-1 trigger decision loop of the <strong>CMS</strong><br />
experiment. An initial investigation about the usage of the eXtensible Markup Language (XML) as a uniform data<br />
representation format for a software environment to implement hardware management systems for HEP<br />
experiments was also performed.
Chapter 2<br />
Uniform Management of Data Acquisition<br />
Devices with XML<br />
2.1 Introduction<br />
In this chapter, a novel software environment model, based on web technologies, is presented. This research was<br />
carried out in the context of the <strong>CMS</strong> TriDAS project in order to better understand the difficulties of building a<br />
hardware management system for the L1 decision loop. This research was motivated by the unprecedented<br />
complexity in the construction of hardware management systems for HEP experiments.<br />
<strong>The</strong> proposed model is based on the idea that a uniform approach to manage the diverse interfaces and operations<br />
of the data acquisition devices would simplify the development of a configuration and control system and should<br />
save development time. A uniform scheme would be advantageous for large installations, like those found in<br />
HEP experiments [2][3][4][5][38] due to the diversity of front-end electronic modules, in terms of configuration,<br />
functionality and multiplicity (e.g. Section 1.3).<br />
2.2 Key requirements<br />
This chapter proposes to work toward an environment to define hardware devices and their behavior at a logical<br />
level. <strong>The</strong> approach should facilitate the integration of various different hardware sub-systems. <strong>The</strong> design<br />
should at least fulfill the following key requirements.<br />
• Standardization: <strong>The</strong> running time of the <strong>CMS</strong> experiment is expected to be at least 15 years which is a<br />
much larger period than the lifetime of most commercial software products. To cope with this, the<br />
environment should maximize the usage of standard technologies. For instance, we believe that standard<br />
C++ with its standard libraries and XML-based technologies will still be used 10 years from now.<br />
• Extensibility: A mechanism to define new commands and data for a given interface must exist, without the<br />
need to change either control or controlled systems that are not concerned by the modification.<br />
• Platform independence: <strong>The</strong> specification of commands and configuration parameters must not impose a<br />
specific format of a particular operating system or hardware platform.<br />
• Communication technology independence: Hardware devices are hosted by different sub-systems that<br />
expose different capabilities and types of communication abilities. Choosing the technology that is most<br />
suitable for a certain platform must not require an overall system modification.<br />
• Performance: <strong>The</strong> additional benefits of any new infrastructure should not imply a loss of execution<br />
performance compared to similar solutions which are established in the HEP community.
2.3 A uniform approach for hardware configuration control and<br />
testing<br />
Taking into account the above requirements, we present a model for the configuration, control and testing<br />
interface of data acquisition hardware devices [39]. <strong>The</strong> model, shown in Figure 2-1, builds upon two principles:<br />
1) <strong>The</strong> use of the eXtensible Markup Language (XML [24]) as a uniform syntax for describing hardware devices,<br />
configuration data, test results and control sequences.<br />
2) An interpreted, run-time extensible, high-level control language for these sequences that provides<br />
independence from specific hosts and interconnect systems to which devices are attached.<br />
This model, as compared to other approaches [40], enforces the uniform use of XML syntax to describe<br />
configuration data, device specifications, and control sequences for configuration and control of hardware<br />
devices. This means that control sequences can be treated as data, making it easy to write scripts that manipulate<br />
other scripts and embed them into other XML documents. In addition, the unified model makes it possible to use<br />
the same concepts, tools, and persistency mechanisms, which simplifies the software configuration management<br />
of large projects 7 .<br />
Figure 2-1: Abstract description of the model.<br />
2.3.1 XML as a uniform syntax<br />
When designing systems composed of heterogeneous platforms and/or evolving systems, platform independence<br />
is provided by a uniform syntax, using a single data representation to describe hardware devices, configuration<br />
data, test results, and control sequences. A solution based on the XML syntax presents the following advantages.<br />
• XML is a W3C (World Wide Web Consortium) non-proprietary, platform independent standard that plays<br />
an increasingly important role in the exchange of data. A large set of compliant technologies, like XML<br />
schema [42], DOM [43] and XPath [44] are defined. In addition, tools that support programming become<br />
available through projects like Apache [45].<br />
• XML structures can be formally specified and extended, following a modularized approach, using an XML<br />
schema definition.<br />
7 Software Configuration Management is the set of activities designed to control change by identifying the work products<br />
that are likely to change, establishing relationships among them, defining mechanisms for managing different versions of<br />
these work products, controlling the changes imposed, and auditing and reporting on the changes made [41].
• XML documents can be directly transmitted using any kind of protocol, including HTTP [46]. In this case,<br />
SOAP [25], an XML-based protocol, can be used.<br />
• XML documents can be automatically converted into documentation artifacts by means of an XSLT<br />
transformation [47]. <strong>The</strong>refore, system documentation can be automatically and consistently maintained.<br />
• XML is widely used for nonevent information in HEP experiments: “XML is cropping up all over in online<br />
configuration and monitoring applications” [48].<br />
On the other hand, XML has one big drawback: by default it uses a textual data representation, which causes much<br />
more network traffic when transferring data. Even Base64- or uuencoded byte arrays are approximately 1.5 times<br />
larger than a binary format. Furthermore, additional processing time is required for translating between XML<br />
and native data representations. <strong>The</strong>refore, the current approach is not well suited for devices generating<br />
abundant amounts of real-time data, but is still valid for configuration, monitoring and slow control purposes.<br />
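The size penalty is easy to quantify: Base64 maps every 3 input bytes to 4 output characters, i.e. a factor of 4/3 before XML element framing and line wrapping are added on top, which is where overall figures around 1.5 come from.

```python
import base64

payload = bytes(range(256)) * 4          # 1024 bytes of binary data
encoded = base64.b64encode(payload)

ratio = len(encoded) / len(payload)
print(ratio)  # about 1.34 before any XML framing

# Wrapping the encoded data in an XML element adds the tag overhead on top.
xml = b"<data>" + encoded + b"</data>"
assert len(xml) > len(encoded) > len(payload)
```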
Figure 2-2: Example program in XSEQ exemplifying the basic features of the language.<br />
2.3.2 XML based control language<br />
A control language (XSEQ: cross-platform sequencer) that processes XML documents to operate hardware<br />
devices has been syntactically and semantically specified. <strong>The</strong> language is XML based and has the following<br />
characteristics:<br />
• Extensibility: <strong>The</strong> syntax has been formally specified using XML schema. A schema document contains the<br />
core syntax of the language, describing the basic structures and constraints on XSEQ programs (e.g. variable<br />
declarations and control flow). <strong>The</strong> basic language can be extended in order to cope with user specific<br />
requirements. Those extensions are also XML schema documents, whose elements are instances of abstract<br />
elements of the core XML schema. This mechanism is one of the most important features of the language<br />
because it facilitates a modular integration of different user requirements and eases resource sharing (code<br />
and data). <strong>The</strong> usage and advantages of this feature will be discussed in Section 2.4.1.
Uniform Management of Data Acquisition Devices with XML 16<br />
• Imperative and object-oriented programming styles: The language provides standard imperative constructs,<br />
like most other programming languages, to carry out conditionals, sequencing and iteration. It is<br />
also possible to use the main object-oriented programming concepts like encapsulation, inheritance,<br />
abstraction and polymorphism.<br />
• Exception handling with error recovery mechanisms.<br />
• Local execution of remote sequences with parameter passing by reference.<br />
• Non-typed scoped variables.<br />
Additional functionalities have been added to the core syntax in the form of modular XML schema extensions, in<br />
order to fit frequently encountered use cases in data acquisition environments:<br />
• Transparent access to PCI and VME devices: This extension facilitates the configuration and control of<br />
hardware devices, following a common interface for both bus systems. This interface is designed to facilitate<br />
its extension in order to cope with future technologies.<br />
• File system access.<br />
• SOAP messaging: This allows inclusion of control sequences and configuration data into XML messages.<br />
<strong>The</strong> messages can be directly transported between remote hosts in a distributed programming environment.<br />
• DOM and XPath interface to facilitate integration in an environment where software and hardware device<br />
configuration are fully XML driven.<br />
• System command execution interface with redirected standard error and standard output to internal string<br />
objects.<br />
In Figure 2-2 an XSEQ program is shown where basic features of the language are exemplified. In Figure 2-3 an<br />
example is given of how the hardware access is performed following the proposed model. Device specifications,<br />
configuration data and control sequences are XML documents. In this example, configuration data are retrieved<br />
through an XPath query from a configuration database.<br />
Figure 2-3: Example of a program in XSEQ, which shows how the model is applied. Device specifications<br />
(register_table.xml), configuration data (retrieved from a configuration database accessible through an XPath<br />
query) and control sequences are all based on the uniform use of XML.
2.4 Interpreter design<br />
To enable code sharing among different platforms, we have chosen a purely interpreted approach that allows<br />
control sequences to run independently of the underlying platform in a single compile/execution cycle. In<br />
addition, the interpreted approach is characterized by small program sizes and an execution environment that<br />
provides controlled and predictable resource consumption, making it easily embeddable in other software<br />
systems.<br />
An interpreter [49] for XSEQ programs has been implemented in C++ under Linux. <strong>The</strong> pattern of the interpreter<br />
is based on the following concepts:<br />
• <strong>The</strong> source format is a DOM document already validated against the XSEQ XML schema document and the<br />
required extensions. This simplifies interpreter implementation and separates the processing into two<br />
independent phases: 1) syntactic validation and 2) execution.<br />
• Every XML command has a C++ class representation that inherits from a single class named XseqObject.<br />
• A global context accessible to all instruction objects. It contains: 1) the execution stack, which stores non-static<br />
variables; 2) the static stack, which stores static variables and is useful to retain information from<br />
previous executions; 3) the code cache, which maintains already validated DOM trees in order to accelerate<br />
the interpretation process; 4) the dynamic factory, which facilitates run-time extension of the interpreter; and 5)<br />
debug information to properly trace the execution and find errors.<br />
2.4.1 Polymorphic structure<br />
Every class inherits from the single abstract class XseqObject, and each has information about how to perform its<br />
task. For example, the XSEQ conditional command is represented by the XseqIf class. This class inherits from the<br />
XseqObject class, and the execution algorithm is implemented in the overridden eval() method.<br />
Figure 2-4: Example of a program in XSEQ, which exemplifies the use of the tag. It dynamically extends<br />
the interpreter (semantics) in order to execute new commands (syntax) defined in an XSD<br />
document.
C++ classes that implement the functionality of each syntactic language extension are grouped and compiled as<br />
shared libraries. Such libraries can be dynamically linked to the running interpreter. They are associated with a<br />
concrete syntactic language extension by means of the special XSEQ command . This facility allows keeping<br />
syntax language extensions, defined in XML schema modules, separate from the run-time interpreter extensions.<br />
It enables two different sub-systems with similar requirements but different platforms<br />
to share code by simply assigning different interpreter extensions to the same language extension. Figure<br />
2-4 exemplifies the use of the tag.<br />
2.5 Use in a distributed environment<br />
The interpreter is also available as an XDAQ pluggable module (Section 1.4.3). XDAQ includes an executive<br />
component that provides applications with the necessary functions for communication, configuration, control and<br />
monitoring. All configuration, control and monitoring commands can be performed through the SOAP/HTTP<br />
protocol.<br />
In Figure 2-5 the use of the interpreter in an XDAQ framework is shown. This is the basic building block that<br />
facilitates the deployment of the model in a distributed environment.<br />
Figure 2-5: Use of the interpreter in an XDAQ framework.<br />
To operate this application, the user must provide in XML format the configuration of the physical and logical<br />
properties of the system and its components. <strong>The</strong> configuration process defines the available web services as<br />
XSEQ scripts.<br />
Once the running application is properly configured, the client can send commands through SOAP messages.<br />
Depending on the received command, the corresponding XSEQ script is executed. The SOAP message itself can<br />
be processed using the language extension to manipulate SOAP messages. Such functionality is useful when<br />
parameters must be remotely passed. Finally, every XSEQ program ends by returning a SOAP message that will<br />
be forwarded by the executive to the client.<br />
2.6 Hardware management system prototype<br />
<strong>The</strong> architecture of a hypothetical hardware management system for the <strong>CMS</strong> experiment is shown in Figure 2-6.<br />
A number of application scenarios were integrated [50]. Hardware modules belonging to the Global <strong>Trigger</strong> [18],<br />
the Silicon-Tracker sub-detector [6] and the Data Acquisition system [10] participated in this demonstrator.<br />
The basic building block presented in Section 2.5 was implemented for every different platform that played the<br />
role of hardware module host. The same infrastructure was used to develop a central node, which was in charge<br />
of buffering all calls from clients, coordinating the operation of all sub-system control nodes, and forwarding the<br />
responses from the different sub-system control nodes back to the client.<br />
Hardware modules were quite heterogeneous in terms of configuration, functionality and multiplicity. In<br />
addition, the control software sub-system for every sub-detector was independent from the others. <strong>The</strong>refore, a<br />
diverse set of control software sub-systems existed. This offered a heterogeneous set of interfaces that had to be<br />
understood by a common configuration and control system.<br />
Control sequences executed by the sub-system control nodes depended on a set of language extensions. <strong>The</strong><br />
language was augmented, following a modular approach, by means of the XML schema technology (Section<br />
2.4.1). For a given language extension the interpreter was associated with a platform specific support. Some<br />
language extensions were shared by several sub-systems. For instance, platform 2 and platform 3 were operating<br />
the GT crate through different PCI to VME interfaces. <strong>The</strong> tag was used for binding a common GT<br />
language extension to a specific interpreter extension that knew how to use the concrete PCI to VME interface.<br />
<strong>The</strong> tag was also used to share code between platform 3 and platform 4 in order to test PCI and VME<br />
memory boards. <strong>The</strong> default language extension to execute system commands was used to operate the Fast<br />
Merging Module board (FMM, [51]) and to forward the standard output and the standard error to XSEQ string<br />
objects. Finally, a driver to read and write registers from and to a flash memory embedded into a PCI board was<br />
implemented following the chip application notes.<br />
The homogeneous use of XML syntax to describe data, control sequences, and language extensions allowed a<br />
distributed storage of any of these documents, which could be simply accessed through their URLs. Interpreter<br />
run-time extensions could also be remotely linked and, therefore, a local binary copy was not necessary. Another<br />
advantage of this approach was that both hardware and software configuration schemes were unified, since the<br />
online software of the data acquisition system was also fully XML driven.<br />
<strong>The</strong> default SOAP extension of the control language made it possible to manipulate, send, and receive SOAP<br />
messages.<br />
Figure 2-6: Hardware management system based on the XSEQ software environment.
2.7 Performance comparison<br />
Timing measurements have been performed on a desktop PC (Intel D845WN chipset) with a Pentium IV<br />
processor (1.8 GHz), 256 MB SDRAM memory (133 MHz), and running Linux Red Hat 7.2, with kernel version<br />
2.4.9–31.1.<br />
<strong>The</strong> main objective of this section is to present a comparison of the existing interpreter implementation with a<br />
Tcl interpreter [52], focusing on the overhead induced by the interpreter approach when accessing hardware<br />
devices. Tcl has been chosen as a reference because it is a well-established scripting language in the HEP<br />
community, and it shares many features with XSEQ: it is simple, easily extensible and embeddable.<br />
For both interpreters the same hardware access library (HAL [53]) has been used to implement the necessary<br />
extensions. This library has also been used to implement a C++ binary version of the test program for reference<br />
purposes.<br />
<strong>The</strong> test is a loop that reads consecutive memory positions of a memory module. In order to properly identify the<br />
interpreter overhead and to decouple it from the driver overhead, the real hardware access has been disabled and<br />
a dummy driver emulates all accesses. <strong>The</strong> results are shown in Table 2-1.<br />
XSEQ: 16.9 μs | Tcl: 16 μs | C++: 2.63 μs<br />
Table 2-1: Comparison of average execution times (memory read) for XSEQ, Tcl and C++.<br />
The results indicate that the overhead induced by the interpreted approach lies in the same order of<br />
magnitude as that of the Tcl interpreter. Execution times of XSEQ can be further reduced with customized language<br />
extensions that encapsulate a specific macro behavior. For instance, a loop command with a fixed number of<br />
iterations has been implemented. This command reduces the timing of the test program to 5.3 μs. However,<br />
flexibility is reduced, because the macro command cannot be modified at run time.<br />
2.8 Prototype status<br />
In this chapter a uniform model based on XML technologies for the configuration, control and testing of data<br />
acquisition hardware was presented. It matches well the extensibility and flexibility requirements of a long<br />
lifetime experiment that is characterized by an ever-changing environment.<br />
<strong>The</strong> following chapters present the design and development details of the Level-1 trigger hardware management<br />
system or <strong>Trigger</strong> <strong>Supervisor</strong>. Theoretically, this would have been an ideal opportunity to apply XSEQ. However, the<br />
prototype status of the software, the limited resources and the reduced development time were decisive reasons to<br />
exclude this technological option from the initial survey.<br />
Therefore, the XSEQ project did not reach its final goal, which is the same as that of any other software<br />
project: to be used. On the other hand, this effort, carried out in the context of the <strong>CMS</strong> <strong>Trigger</strong> and Data<br />
Acquisition project, improved the overall team knowledge of XML technologies, created a pool of ideas and<br />
helped to anticipate the difficulties of building a hardware management system for the Level-1 trigger.
Chapter 3<br />
<strong>Trigger</strong> <strong>Supervisor</strong> Concept<br />
3.1 Introduction<br />
<strong>The</strong> <strong>Trigger</strong> <strong>Supervisor</strong> (TS) is an online software system. Its purpose is to set up, test, operate and monitor the<br />
L1 decision loop (Section 1.3.2) components on one hand, and to manage their interplay and the information<br />
exchange with the Run Control and Monitoring System (R<strong>CMS</strong>, Section 1.4.5) on the other. It is conceived to<br />
provide a simple and homogeneous client interface to the online software infrastructure of the trigger sub-systems.<br />
Facing a large number of trigger sub-systems and a potentially highly heterogeneous environment<br />
resulting from different sub-system Application Program Interfaces (API), it is crucial to simplify the task of<br />
implementing and maintaining a client that allows operating several trigger sub-systems either simultaneously or<br />
in standalone mode.<br />
An intermediate node, lying between the client and the trigger sub-systems, which offers a simplified API to<br />
perform control, monitoring and testing operations, will ease the design of this client. This layer provides a<br />
uniform interface to perform hardware configurations, monitor the hardware behavior or to perform tests in<br />
which several trigger sub-systems participate. In addition, this layer coordinates the access of different users to<br />
the common L1 trigger resources.<br />
<strong>The</strong> operation of the L1 decision loop will necessarily be within the broader context of the experiment operation.<br />
In this context, the R<strong>CMS</strong> will be in charge of offering a control window from which an operator can run the<br />
experiment, and in particular the L1 trigger system. On the other hand, it is also necessary to be able to operate<br />
the L1 trigger system independently of the other experiment sub-systems. This independence of the TS will be<br />
mainly required during the commissioning and maintenance phases. Once the TS is accessed through R<strong>CMS</strong>, a<br />
scientist working on a data taking run will be presented with a graphical user interface offering choices to<br />
configure, test, run and monitor the L1 trigger system. Configuring includes setting up the programmable logic<br />
and physics parameters such as energy or momentum thresholds in the L1 trigger hardware. Predefined and<br />
validated configuration files are stored in a database and are proposed as defaults. Tests of the L1 trigger system<br />
after configuration are optional. Once the TS has determined that the system is configured and operational, a run<br />
may be started through R<strong>CMS</strong> and the option to monitor can be selected. For commissioning periods more<br />
options are available in the TS, namely the setting up of different TCS partitions and separate operation of sub-systems.<br />
The complexity of the TS is a representative example of the discussion presented in Section 1.5.1: 64 crates,<br />
O(10³) boards with an average of 15 MB of downloadable firmware and O(10²) configurable registers per board,<br />
8 independent DAQ partitions, and O(10³) links that must be periodically tested in order to assure correct<br />
connection and synchronization are figures of merit of the numeric complexity dimension; the human dimension<br />
of the project complexity is represented by a European, Asian and American collaboration of 27 research<br />
institutes in experimental physics. <strong>The</strong> long development and operational periods of this project are also<br />
challenging due to the fast pace of the technology evolution. For instance, although the TS project just started in<br />
August 2004, we have already observed how one of the trigger sub-systems has been fully replaced (Global
Calorimeter <strong>Trigger</strong>, [13]) and recently a number of proposals to upgrade the trigger sub-systems for the Super<br />
LHC (SLHC, [54]) have been accepted [55].<br />
This chapter presents the conceptual design of the <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong> (TS, [56]). This design was approved<br />
by the <strong>CMS</strong> collaboration in March 2005 as the baseline design for the L1 decision loop hardware management<br />
system. <strong>The</strong> conceptual design is not the final design but the seed of a successful project that lasted four years<br />
from conception to completion and involved people from all <strong>CMS</strong> sub-systems. Because the conceptual design<br />
takes into account the challenging context of the last generation of HEP experiments, in addition to the<br />
functional and non-functional requirements, the description model and concrete solution can serve as an example for<br />
future experiments of how to deal with the initial steps of designing a hardware management system.<br />
3.2 Requirements<br />
3.2.1 Functional requirements<br />
<strong>The</strong> TS is conceived to be a central access point that offers a high level API to facilitate setting a concrete<br />
configuration of the L1 decision loop, to launch tests that involve several sub-systems or to monitor a number of<br />
parameters in order to check the correct functionality of the L1 trigger system. In addition, the TS should provide<br />
access to the online software infrastructure of each trigger sub-system.<br />
1) Configuration: <strong>The</strong> most important functionality offered by the TS is the configuration of the L1 trigger<br />
system. It has to facilitate setting up the content of the configurable items: FPGA firmware, LUTs,<br />
memories and registers. This functionality should hide from the controller the complexity of operating the<br />
different trigger sub-systems in order to set up a given configuration.<br />
2) High Level <strong>Trigger</strong> (HLT) Synchronization: In order to properly configure the HLT, it is necessary to<br />
provide a mechanism that propagates the L1 trigger configuration to the HLT, so as to assure a consistent<br />
overall trigger configuration.<br />
3) Test: <strong>The</strong> TS should offer an interface to test the L1 trigger system. Two different test services should be<br />
provided: the self test, intended to check each trigger sub-system individually, and the interconnection test<br />
service, intended to check the connection among sub-systems. Interconnection and self test services involve<br />
not only the trigger sub-systems but also the sub-detectors themselves (Section 3.3.3.3).<br />
4) Monitoring: <strong>The</strong> TS interface must enable the monitoring of the necessary information that assures the<br />
correct functionality of the trigger sub-systems (e.g., measurements of L1 trigger rates and efficiencies,<br />
simulations of the L1 trigger hardware running in the HLT farm), sub-system specific monitoring data (e.g.,<br />
data read through spy memories), and information for synchronization purposes.<br />
5) User management: During the experiment commissioning the different sub-detectors are tested<br />
independently, and many of them might be tested in parallel. In other words, several run control sessions,<br />
running concurrently, need to access the L1 trigger system (Section 1.4.5). <strong>The</strong>refore, it is necessary that the<br />
TS coordinates the access to the common resources (e.g., the L1 trigger sub-systems). In addition, it is<br />
necessary to control the access to the L1 trigger system hierarchically in order to determine which<br />
users/entities (controllers) can have access to it and what privileges they have. A complete access control<br />
protocol has to be defined that should include identification, authentication, and authorization processes.<br />
Identification includes the processes and procedures employed to establish a unique user/entity identity<br />
within a system. Authentication is the process of verifying the identification of a user/entity. This is<br />
necessary to protect against unauthorized access to a system or to the information it contains. Typically,<br />
authentication takes place using a password. Authorization is the process of deciding if a requesting<br />
user/entity is allowed to have access to a system service. A hierarchical list of users with the corresponding<br />
level of access rights as well as the necessary information to authenticate them should be maintained in the<br />
configuration database. The lowest-level user should only be allowed to monitor. A medium-level user, such<br />
as a scientist responsible for the data taking during a running period of the experiment, may manage<br />
partition setups, select predefined L1 trigger menus and change thresholds, which are written directly into<br />
registers on the electronics boards. In addition to all the previously cited privileges the highest-level user or<br />
super user should be allowed to reprogram logic and change internal settings of the boards. In addition to<br />
coordinating the access of different users to common resources, the TS must also ensure that operations<br />
launched by different users are compatible.<br />
6) Hierarchical start-up mechanism: In order to maximize sub-system independence and client decoupling<br />
(Section 3.2.2, Point 3), a hierarchical start-up mechanism must be available (Section 3.3.3.5 describes the<br />
operational details). As will be described later, the TS should be organized in a tree-like structure, with a<br />
central node and several leaves. <strong>The</strong> first run control session or controller should be responsible for starting<br />
up the TS central node, and in turn this should offer an API that provides start-up of the TS leaves and the<br />
online software infrastructure of the corresponding trigger sub-system.<br />
7) Logging support: <strong>The</strong> TS must provide logging mechanisms in order to support the users carrying out<br />
troubleshooting activities in the event of problems. Logbook entries must be time-stamped and should<br />
include all necessary information such as the details of the action and the identity of the user responsible.<br />
The log registry should be available online and should also be recorded for offline use.<br />
8) Error handling: An error management scheme, compatible with the global error management architecture,<br />
is necessary. It must provide a standard error format, and remote error handling and notification<br />
mechanisms.<br />
9) User support: A graphical user interface (GUI) should be provided. This should allow a standalone<br />
operation of the TS. It would also help the user to interact with the TS and to visualize the state of a given<br />
operation or the monitoring information. From the main GUI it should be possible to open specific GUIs for<br />
each trigger sub-system. Those should be based on a common skeleton to be filled in by the trigger<br />
sub-system developers, following a given methodology described in a document that will be provided. An<br />
adequate online help facility should be available to help the user operate the TS, since many of the users of<br />
the TS would not be experienced and may not have received detailed training.<br />
10) Multi user: During the commissioning and maintenance phases, several run control sessions run<br />
concurrently. Each of them is responsible for operating a different TCS partition. In addition, the TS should<br />
allow standalone operations (not involving the R<strong>CMS</strong>), for instance, to execute tests or monitor the L1<br />
trigger system. Therefore, it is necessary that several clients can be served in parallel by the TS.<br />
11) Remote operation: <strong>The</strong> possibility to program and operate the L1 trigger components remotely is essential<br />
due to the distributed nature of the <strong>CMS</strong> Experiment Control System (Section 1.4.5). It is important also to<br />
consider that, unlike in the past, most scientists can in general not be present in person at the experiment<br />
location during data taking and also during commissioning, but have to operate and supervise their systems<br />
remotely.<br />
12) Interface requirements: In order to facilitate the integration, the implementation and the description of the<br />
controller-TS interface a web service based approach [57] should be followed. <strong>The</strong> chosen communication<br />
protocol to send commands and state notifications should be the same as for most <strong>CMS</strong> sub-systems, and<br />
especially the same as already chosen for run control, data acquisition and slow control. <strong>The</strong>refore Simple<br />
Object Access Protocol (SOAP) [25] and the representation format Extensible Markup Language (XML)<br />
[24] for exchanged data should be selected. <strong>The</strong> format of the transmitted data and the SOAP messages is<br />
specified using the XML schema language [42], and the Web Services Description Language (WSDL) [58]<br />
is used to specify the location of the services and the methods the service exposes. To overcome the<br />
drawback that XML uses a textual data representation, which causes much network traffic to transfer data, a<br />
binary serialization package provided within the <strong>CMS</strong> online software project and I2O messaging [59] could<br />
be used for devices generating large amounts of real-time data.<br />
Due to the long time required to finish the execution of configuration and test commands, an asynchronous<br />
protocol is necessary to interface the TS. This means that the receiver of the command replies immediately<br />
acknowledging the reception, and that this receiver sends another message to the sender once the command<br />
is executed. An asynchronous protocol improves the usability of the system because the controller is not<br />
blocked until the completion of the requested command.<br />
3.2.2 Non-functional requirements<br />
1) Low-level infrastructure independence: <strong>The</strong> design of the TS should be independent of the online<br />
software infrastructure (OSWI) of any sub-system as far as possible. In other words, the OSWI of a concrete
sub-system should not drive any important decision in the design of the TS. This requirement is intended to<br />
minimize the TS redesign due to the evolution of the OSWI of any sub-system.<br />
2) Sub-system control: <strong>The</strong> TS should offer the possibility of operating a concrete trigger sub-system.<br />
<strong>The</strong>refore, the design should be able to provide at the same time a mechanism to coordinate the operation of<br />
a number of trigger sub-systems, and a mechanism to control a single trigger sub-system.<br />
3) Controller decoupling: <strong>The</strong> TS must operate in different environments: inside the context of the common<br />
experiment operation, but also independently of the other <strong>CMS</strong> sub-systems, such as during the phases of<br />
commissioning and maintenance of the experiment, or during the trigger sub-system integration tests. Due to<br />
the diversity of operational contexts, it is useful to facilitate the access to the TS through different<br />
technologies: R<strong>CMS</strong>, Java applications, web browser or even batch scripts. In order to allow such a<br />
heterogeneity of controllers, the TS design must be totally decoupled from the controller, and the following<br />
requirements should be taken into account:<br />
a. <strong>The</strong> logic of the TS should not be split between a concrete controller and the TS itself;<br />
b. <strong>The</strong> technology choice to develop the TS should not depend on the software frameworks used<br />
to develop a concrete controller.<br />
In addition, the logic and technological decoupling from the controller increases the evolution potential and<br />
decreases the maintenance effort of the TS. It also increases development and debug options, and reduces<br />
the complexity of operating the L1 trigger system in a standalone way.<br />
4) Robustness: Due to 1) the key role of the TS in the overall <strong>CMS</strong> online software architecture, and 2) the<br />
fact that a malfunction can result in significant losses of physics data as well as economic losses, the TS<br />
should be considered a critical system [60], and design decisions therefore had to be taken accordingly.<br />
5) Reduced development time: The schedule constraints are also a non-functional requirement. The project<br />
development phase only started in May 2005, a first demonstrator of the TS system was expected to be<br />
ready four months later, and an almost final system had to be drafted for the second phase of the Magnet<br />
Test and Cosmic Challenge that took place in November 2006. The aim was that the TS would be able to<br />
follow the monthly increasing deployment of <strong>CMS</strong> experiment components during the Global Run exercises<br />
that started in May 2007.<br />
6) Flexibility: <strong>The</strong> TS has to be designed as an open system capable of adopting non-foreseen functionalities<br />
or services required to operate the L1 decision loop or just specific sub-systems. <strong>The</strong>se new capabilities<br />
must be added in a non-disruptive way, without requiring major developments.<br />
7) Human context awareness: <strong>The</strong> TS design and development has to take into account the particular human<br />
context of the L1 trigger project. <strong>The</strong> available human resources in all sub-systems were limited and their<br />
effort was split among hardware debugging, physics related tasks and software development including<br />
online, offline and hardware emulation. In this context, most collaboration members were confronted with a<br />
heterogeneous spectrum of tasks. In addition, the most common professional profiles were hardware experts<br />
and experimental physicists with no software engineering academic background. <strong>The</strong> resources assigned to<br />
the TS project were also very limited; initially and for more than one year, one single person had to cope<br />
with the design, development, documentation and communication tasks. An additional Full Time Equivalent<br />
(FTE) joined the project after this period, and a number of students have collaborated for a few<br />
months on small tasks.
Design 25<br />
Figure 3-1: Architecture of the <strong>Trigger</strong> <strong>Supervisor</strong>. [Figure: on the controller side, a run control (RC) session and optional trigger sub-system GUIs access the TS central node control cell via SOAP; the central node operates, via SOAP (HTTP, I2O, custom), one customizable control cell (TS leaf) per sub-system; every cell publishes its interface in a WSDL document, and each leaf drives the OSWI of its sub-system. Customizing the cells is TS responsibility; the OSWI is trigger sub-system responsibility.]<br />
3.3 Design<br />
<strong>The</strong> TS architecture is composed of a central node, in charge of coordinating access to the different sub-systems,<br />
namely the trigger sub-systems and the sub-detectors concerned by the interconnection test service (Section<br />
3.3.3.3), and a customizable TS leaf (Section 3.3.2) for each of them, which offers the central node a well-defined<br />
interface to operate the OSWI of each sub-system. Figure 3-1 shows the architecture of the TS.<br />
Each node of the TS can be accessed independently, fulfilling the requirement outlined in Section 3.2.2, Point 2).<br />
<strong>The</strong> available interfaces and location for each of those nodes are defined in a WSDL document. Both the central<br />
node and the TS leaves are based on a single common building block, the “control cell”. Each sub-system group<br />
will be responsible for customizing a control cell and keeping the consistency of the available interface with the<br />
interface described in the corresponding WSDL file.<br />
<strong>The</strong> presented design is not driven by the available interface of the OSWI of a concrete sub-system (Section<br />
3.2.2, Point 1) ). This improves the evolution potential of both the low-level infrastructure and the TS.<br />
Moreover, the design of the TS is logically and technologically decoupled from any controller (Section 3.2.2,<br />
Point 3) ). In addition, the distributed nature of the TS design facilitates a clear separation of responsibilities and<br />
a distributed development. <strong>The</strong> common control cell software framework could be used in a variety of different<br />
control network topologies (e.g., N-level tree or peer to peer graph).<br />
3.3.1 Initial discussion on technology<br />
<strong>The</strong> development of a distributed software system like the TS requires the usage of distributed programming<br />
facilities. An initial technological survey pointed to a possible candidate: a C++ based cross-platform data<br />
acquisition framework called XDAQ developed in-house by the <strong>CMS</strong> collaboration (Section 1.4.3). <strong>The</strong> OSWI<br />
of many sub-systems was already based on this distributed programming framework (Section 1.4.4). It was<br />
therefore an obvious option to develop the TS. <strong>The</strong> following reasons backed up this technological option:<br />
• <strong>The</strong> software frameworks used in both the TS and the sub-systems are homogeneous.<br />
• I2O messages could be used as a faster alternative to messages following the SOAP communication<br />
protocol.
<strong>Trigger</strong> <strong>Supervisor</strong> Concept 26<br />
• Monitoring and security packages are available.<br />
• XDAQ development was practically finished, and its API was already considered stable when the<br />
conceptual design was approved.<br />
3.3.2 Cell<br />
Figure 3-2: Architecture of the control cell.<br />
<strong>The</strong> architecture of the TS is characterized by its tree topology, where all tree nodes are based on a common<br />
building block, the control cell. Figure 3-2 shows the architecture of the control cell. <strong>The</strong> control cell is a<br />
program that offers the necessary functionalities to coordinate the control operations over other software<br />
systems, for instance the OSWI of a concrete trigger sub-system, an information server, or even another control<br />
cell. Each cell can work independently of the rest (fulfilling the requirement of Section 3.2.2, Point 2) ), or inside<br />
a more complex topology.<br />
<strong>The</strong> following points describe the components of the control cell.<br />
1) Control Cell Interface (CCI): This is the external interface of the control cell. Different protocols should<br />
be available. An HTTP interface could be provided using the XDAQ facilities; this should facilitate a first<br />
entry point from any web browser. A second interface based on SOAP should also be provided in order to<br />
ease the integration of the TS with the run control or any other controller that requires a web service<br />
interface. Future interface extensions are foreseen (e.g., an I2O interface should be implemented). Each<br />
control cell should have an associated WSDL document that will describe its interface. <strong>The</strong> information<br />
contained in that document instructs any user/entity how to properly operate with the control cell.<br />
2) Access Control Module (ACM): This module is responsible for identifying and authenticating every user<br />
or entity (controller) attempting to access the cell, and for providing an authorization protocol. <strong>The</strong> access control<br />
module should have access to a user list, which should provide the necessary information to identify and<br />
authenticate, and the privileges assigned to each controller. Those privileges should be used to check<br />
whether or not an authenticated controller is allowed to execute a given operation.<br />
3) Task Scheduler Module (TSM): This module is in charge of managing the command requests and<br />
forwarding the answer messages. <strong>The</strong> basic idea is that a set of available operations exists that can be<br />
accessed by a given controller. Each operation corresponds to a Finite State Machine (FSM). <strong>The</strong> default set<br />
of operations is customizable and extensible. <strong>The</strong> TSM is also responsible for preventing the launching of<br />
operations that could enter into conflict with other running operations (e.g., simultaneous self test operations
within the same trigger sub-system, interconnection test operations that cannot be parallelized). <strong>The</strong><br />
extension and/or customization of the default set of operations could change the available interface of the<br />
control cell. In that case, the corresponding WSDL should be updated.<br />
4) Shared Resources Manager (SRM): This module is in charge of coordinating access to shared resources<br />
(e.g., the configuration database, other control cells, or a trigger sub-system online software infrastructure).<br />
Independent locking services for each resource are provided.<br />
5) Error Manager (ERM): This module manages all errors generated in the context of the control cell that<br />
have not been solved locally, as well as those errors that could not be<br />
resolved in a control cell immediately controlled by this one. Both the error format and the remote error<br />
notification mechanism will be based on the global <strong>CMS</strong> distributed error handling scheme. <strong>The</strong> control<br />
over which operations can be executed is distributed among the ACM for user access level control (e.g., a<br />
user with monitoring privileges cannot launch a self test operation), the TSM for conflicting operation<br />
control (e.g., to avoid running in parallel operations that could disturb each other), and the command<br />
code of each operation (e.g., to check that a given user is allowed to set up the requested configuration).<br />
More details are given in Section 3.3.3.1.<br />
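The interplay between the TSM and the SRM can be illustrated with a minimal sketch. The real control cell is implemented in C++ on top of XDAQ; the Python below is only illustrative, and all names (classes, methods, resource identifiers) are hypothetical: each operation declares the shared resources it needs, and the TSM refuses to launch it unless the SRM can lock all of them.<br />

```python
class SharedResourceManager:
    """SRM sketch: an independent lock per named resource."""
    def __init__(self):
        self._locked = set()

    def try_lock(self, resources):
        """Atomically lock all requested resources, or none of them."""
        if self._locked & set(resources):
            return False          # at least one resource is busy
        self._locked |= set(resources)
        return True

    def release(self, resources):
        self._locked -= set(resources)


class TaskSchedulerModule:
    """TSM sketch: launches operations unless they conflict with running ones."""
    def __init__(self, srm):
        self.srm = srm
        self.running = {}

    def launch(self, op_id, resources):
        if not self.srm.try_lock(resources):
            return "conflict"     # e.g. two self tests on the same sub-system
        self.running[op_id] = resources
        return "launched"

    def finish(self, op_id):
        self.srm.release(self.running.pop(op_id))


srm = SharedResourceManager()
tsm = TaskSchedulerModule(srm)
print(tsm.launch("selftest-1", ["GT_crate"]))   # launched
print(tsm.launch("selftest-2", ["GT_crate"]))   # conflict: same hardware
tsm.finish("selftest-1")
print(tsm.launch("selftest-2", ["GT_crate"]))   # launched
```

The all-or-nothing locking mirrors the requirement that conflicting operations (e.g., simultaneous self tests within the same trigger sub-system) must not be started in parallel.<br />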
3.3.3 <strong>Trigger</strong> <strong>Supervisor</strong> services<br />
<strong>The</strong> <strong>Trigger</strong> <strong>Supervisor</strong> services are the final functionalities offered by the TS. <strong>The</strong>se services emerge from the<br />
collaboration of several nodes of the TS tree. In general, the central node is always involved in all services<br />
coordinating the operation of the necessary TS leaves. <strong>The</strong> goal of this section is to describe, for each different<br />
service, what the default operations are in both the central node of the TS and in the TS leaves, and how the<br />
services emerge from the collaboration of these distributed operations. Note that a control cell operation<br />
is always a Finite State Machine (FSM). <strong>The</strong> main reason for using FSMs to define the TS services is that FSMs<br />
are a well-known model in HEP for defining control systems. <strong>The</strong>y are therefore a suitable tool to communicate and<br />
discuss ideas with the rest of the collaboration.<br />
3.3.3.1 Configuration<br />
This service is intended to perform the hardware configuration of the L1 trigger system, which includes the<br />
setting of registers or Look-Up Tables (LUT’s) and downloading the L1 trigger logic into the programmable<br />
logic devices of the electronics boards. <strong>The</strong> configuration service requires the collaboration of the central node of<br />
the TS and all the TS leaves. Each control cell involved implements the operation represented in Figure 3-3.<br />
Figure 3-3: Configuration operation. [Figure: FSM with states Not configured, Configuring, Configured, Enabling, Enabled and Error; the transitions are ConfigurationServiceInit(), Configure(Key), Reconfigure(Key) and Enable(), with Error() leading from the transition states to the Error state.]<br />
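The configuration operation of Figure 3-3 can be sketched as a table-driven state machine. This is an illustrative Python sketch, not the actual C++ operation class of the control cell; the state and command names follow the figure, while everything else is hypothetical.<br />

```python
class ConfigurationOperation:
    """FSM sketch of the configuration operation (states as in Figure 3-3).
    Transition states such as Configuring and Enabling model the asynchronous
    interface: command code executes while the FSM sits in them."""

    TRANSITIONS = {
        # (current state, command) -> (transition state, final state)
        ("Not configured", "Configure"): ("Configuring", "Configured"),
        ("Configured", "Enable"):        ("Enabling", "Enabled"),
        ("Enabled", "Reconfigure"):      ("Configuring", "Configured"),
    }

    def __init__(self):
        self.state = "Not configured"    # state after ConfigurationServiceInit()

    def fire(self, command, ok=True):
        key = (self.state, command)
        if key not in self.TRANSITIONS:
            raise ValueError(f"{command} not allowed in state {self.state}")
        transition, final = self.TRANSITIONS[key]
        self.state = transition           # command code would run here
        self.state = final if ok else "Error"
        return self.state


op = ConfigurationOperation()
print(op.fire("Configure"))    # Configured
print(op.fire("Enable"))       # Enabled
print(op.fire("Reconfigure"))  # Configured
```

A failing command (`ok=False`) drives the FSM from the transition state into the Error state, as in the figure.<br />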
Figure 3-4: Configuration service. [Figure: a run control session (R<strong>CMS</strong> responsibility) sends Configure(TS_key) to the central node of the TS; the central node (TS responsibility) maps the TS key to sub-system keys and sends Configure(TCS_key), Configure(GM_key), Configure(GC_key), etc. to the GT/TCS, Global Muon and Global Calorimeter leaves; each leaf (sub-system responsibility) maps its key to concrete parameters such as the BC table and the throttle logic.]<br />
Due to the asynchronous interface, it is also necessary to define transition states such as Configuring and<br />
Enabling, which indicate that a transition is in progress. All commands are executed while the FSM is in a<br />
transition state. If applicable, an error state is invoked from the transition state. Figure 3-4 shows how the<br />
different nodes of the TS collaborate in order to fully configure the L1 trigger system.<br />
A key<sup>8</sup> is assigned to each node. Each key maps into a row of a database table that contains the configuration<br />
information of the system. <strong>The</strong> sequence of steps that a controller of the TS should follow in order to properly<br />
use the configuration service is as follows.<br />
• Send a ConfigurationServiceInit() command to the central node of the TS.<br />
• Once the operation reaches the Not configured state, the next step is to send a Configure(Key) command,<br />
where Key identifies a set of sub-system keys, one per trigger sub-system that is to be configured. <strong>The</strong><br />
Configure(Key) command initiates the configuration operation in the relevant TS leaves. <strong>The</strong> configure<br />
command in the configuration operation of each TS leaf will check whether or not the user is allowed to set<br />
the configuration identified by a given sub-system key. This means that each trigger sub-system has the full<br />
control over who and what can be configured. This also means that the list of users in the central node of the<br />
TS will be replicated in the TS leaves.<br />
• Once the configuration operation of the TS leaves reaches the Configured state, the configuration operation<br />
in the central node of the TS jumps to the Configured state.<br />
• Send an Enable command. This fourth step is just a switch-on operation.<br />
From the point of view of the L1 trigger system, everything is ready to run the experiment once the configuration<br />
operation reaches the Enabled state.<br />
Each trigger sub-system has the responsibility to customize the configuration operation of its own control cell<br />
and thus has to implement the commands of the FSM. <strong>The</strong> central node of the TS owns the data that relates a<br />
given L1 trigger key to the trigger sub-system keys.<br />
<strong>The</strong> presented configuration service is flexible enough to allow a full or a partial configuration of the L1 trigger<br />
system. In the second case, the Key identifies just a subset of sub-system keys, one per trigger sub-system that is<br />
to be configured, and/or each sub-system key identifies just a subset of all the parameters that can be configured<br />
for a given trigger sub-system. <strong>The</strong> configuration database consists of separate databases for each sub-system<br />
and for the central node. Each trigger sub-system is then responsible for populating the configuration database<br />
and for assigning key identifiers to sets of configuration parameters.<br />
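The key mechanism can be sketched as follows. All key names and the in-memory table are hypothetical stand-ins for the configuration database owned by the central node; the point is that an L1 trigger key resolves to one sub-system key per trigger sub-system to be configured, and a partial configuration simply resolves to a subset.<br />

```python
# Central-node table: L1 trigger key -> sub-system keys (hypothetical values).
L1_KEYS = {
    "physics_v1": {"GT/TCS": "TCS_key_7", "GMT": "GM_key_3", "GCT": "GC_key_5"},
    "gt_only":    {"GT/TCS": "TCS_key_7"},   # partial configuration
}

def configure(l1_key, leaves):
    """Forward Configure(sub-system key) to every concerned TS leaf."""
    subkeys = L1_KEYS[l1_key]
    return {name: leaves[name](key) for name, key in subkeys.items()}

# Each TS leaf maps its own key to concrete configuration actions.
leaves = {name: (lambda key, n=name: f"{n} configured with {key}")
          for name in ("GT/TCS", "GMT", "GCT")}

print(configure("physics_v1", leaves))  # all three sub-systems configured
print(configure("gt_only", leaves))     # only the GT/TCS leaf configured
```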
<sup>8</sup> Key: Name that uniquely identifies the configuration of a given system.
3.3.3.2 Reconfiguration<br />
This section complements Section 3.3.3.1. A reconfiguration of the L1 trigger system may become necessary, for<br />
example if thresholds have to be adapted due to a change in luminosity conditions. <strong>The</strong> new configuration table<br />
must be propagated to the filter farm, as it was required in Section 3.2.1, Point 2). <strong>The</strong> following steps show how<br />
a controller of the TS should behave in order to properly reconfigure the L1 trigger system using the<br />
configuration service.<br />
• Once the L1 trigger system is configured, the configuration operation in the central node of the TS will be in<br />
the Enabled state.<br />
• Send a Reconfigure(Key) command. <strong>The</strong> following steps show how this command behaves.<br />
o Stop the generation of L1A signals.<br />
o Send a Configure(Key) command as in Section 3.3.3.1, and<br />
o Jump to the Configured state.<br />
• <strong>The</strong> controller is also responsible for propagating the configuration changes to the filter farm hosts in charge<br />
of the HLT and the L1 trigger simulation through the configuration/conditions database (Section 3.2.1, Point<br />
2).<br />
• Send an Enable command: This signal will be sent by the controller to confirm the propagation of<br />
configuration changes to the filter farm hosts in charge of the HLT and the L1 trigger simulation. This<br />
command will be in charge of resuming the generation of L1A signals. Run control is in charge of<br />
coordinating the configuration of the TS and the HLT. <strong>The</strong>re is no special interface between the central node<br />
of the TS and the HLT.<br />
3.3.3.3 Testing<br />
<strong>The</strong> TS offers two different test services: the self test service and the interconnection test service. <strong>The</strong> following<br />
sections describe both.<br />
<strong>The</strong> self test service checks that each individual sub-system is able to operate as foreseen. If anything fails during<br />
the test of a given sub-system, an error report is returned, which can be used to define the necessary corrective<br />
actions. <strong>The</strong> self test service can involve one or more sub-systems. In the second, more complex case, the self<br />
test service requires the collaboration of the central node of the TS and all the corresponding TS leaves. Each<br />
control cell involved implements the same self test operation. <strong>The</strong> self test operation running in each control cell<br />
is a FSM with only two states: halted and tested. This is the sequence of steps that a controller of the TS<br />
should follow in order to properly use the self test service.<br />
• Send a SelfTestServiceInit() command. Once the self test operation is initiated, the operation reaches<br />
the halted state (initial state).<br />
• Send a RunTest(LogLevel) command, where the parameter LogLevel specifies the level of detail of the<br />
error report. An additional parameter, type, in the RunTest() command might be used to distinguish among<br />
different types of self test.<br />
<strong>The</strong> behavior of the RunTest() command depends on whether it is the self test operation of the central node of<br />
the TS, or a self test operation in a TS leaf. In the central node of the TS, the RunTest() command is used to<br />
follow the above sequence for each TS leaf, and collect all error reports coming from the TS leaves. In the case<br />
of a TS leaf, the RunTest() command will implement the test itself and will generate an error report that will be<br />
forwarded to the central node of the TS. It is important to note that the error report will be generated in a<br />
standard format specified in an XML Schema Document (XSD). This should ease the automation of test reports.<br />
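The idea of a standard, machine-readable test report can be sketched as below. The element and attribute names are hypothetical, since the real format is fixed by a <strong>CMS</strong>-wide XSD; the sketch only illustrates how the LogLevel parameter could control the level of detail in the report.<br />

```python
from xml.etree import ElementTree as ET

def make_report(subsystem, findings, log_level="ERROR"):
    """Build a self-test report; LogLevel filters the detail included."""
    root = ET.Element("selfTestReport", subsystem=subsystem)
    for severity, message in findings:
        if log_level == "ERROR" and severity != "ERROR":
            continue                     # suppress non-errors at low detail
        item = ET.SubElement(root, "item", severity=severity)
        item.text = message
    return ET.tostring(root, encoding="unicode")

findings = [("ERROR", "link 3 CRC mismatch"), ("WARNING", "temperature high")]
print(make_report("GMT", findings))                     # errors only
print(make_report("GMT", findings, log_level="DEBUG"))  # full detail
```

Because every leaf emits the same schema, the central node can aggregate the reports mechanically.<br />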
<strong>The</strong> interconnection test service is intended to check the connections among sub-systems. In each test, several<br />
trigger sub-systems and sub-detectors can participate as sender/s or receiver/s.<br />
Figure 3-5 shows a typical scenario for participants involved in an interconnection test. <strong>The</strong> example shows the<br />
interconnection test of the <strong>Trigger</strong> Primitive Generators and the Global <strong>Trigger</strong> logic.
Figure 3-5: Typical scenario of an interconnection test. [Figure: a detector front-end feeds the TPG of a trigger sub-system, which acts as sender towards the Global <strong>Trigger</strong> (the receiver) over the trigger links; the TCS issues the Start(L1A) signal, and the data are also read out by the DAQ over the optical S-Link.]<br />
<strong>The</strong> interconnection test service requires the collaboration of the central node of the TS and some of the TS<br />
leaves. Each control cell involved will implement the operation represented in Figure 3-6.<br />
Figure 3-6: Interconnection test operation. [Figure: FSM with states Not tested, Preparing, Ready for test, Testing, Tested and Error; the transitions are ConTestServiceInit(), Prepare_test(Test_id) and Start_test(), with Error() leading from the Preparing and Testing states to the Error state.]<br />
This is the sequence of steps that a controller of the TS should follow in order to properly use the interconnection<br />
test service.<br />
• Send a ConTestServiceInit() command.<br />
• Once the operation reaches the Not tested state, the next step is to send a Prepare_test(Test_id). This<br />
command, implemented in the central node of the TS, will perform the following steps:<br />
o Retrieve from the configuration database the relevant information for the central node of the TS.<br />
o Send a ConTestServiceInit() command to sender/s and receiver/s.<br />
o Send Prepare_test() command to sender/s and receiver/s.<br />
o Wait for Ready_for_test signal from all senders/receivers.<br />
• Once the operation reaches the Ready for test state, the next step is to send a Start_test() command.<br />
• Wait for results.<br />
This is the sequence of steps that the TS leaves acting as senders/receivers should follow when they receive the<br />
Prepare_test(Test_id) command from the central node of the TS.<br />
• Retrieve from the configuration database the relevant information for the leaf (e.g., which role: sender or<br />
receiver, test vectors to be sent or to be expected).<br />
• Send a Ready_for_test signal to the central node of the TS.<br />
• Wait for the Start_test() command.
• Do the test, and generate the test report to be forwarded to the central node of the TS (if the TS leaf is a<br />
receiver).<br />
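The leaf-side logic can be sketched as follows, with hypothetical test vectors standing in for the information retrieved from the configuration database via Test_id: the sender emits a known pattern over the trigger links, and the receiver compares what arrives with what it expects in order to produce the test report.<br />

```python
def run_interconnection_test(test_db, test_id, link):
    """One leaf acting as sender, one as receiver, over a stand-in 'link'."""
    sender_vectors = test_db[test_id]["send"]     # sender role: data to emit
    expected = test_db[test_id]["expect"]         # receiver role: expectation
    received = [link(v) for v in sender_vectors]  # transmission over the link
    mismatches = [(i, e, r) for i, (e, r) in enumerate(zip(expected, received))
                  if e != r]
    return {"test_id": test_id, "passed": not mismatches,
            "mismatches": mismatches}

# Hypothetical test definition: TPG-to-Global-Trigger pattern check.
test_db = {"TPG_to_GT_1": {"send": [0x1A, 0x2B, 0x3C],
                           "expect": [0x1A, 0x2B, 0x3C]}}
print(run_interconnection_test(test_db, "TPG_to_GT_1", lambda v: v))
print(run_interconnection_test(test_db, "TPG_to_GT_1",
                               lambda v: v & 0x0F))  # a broken link
```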
In contrast to the configuration service, the central node of the TS can already check whether a given user can<br />
launch interconnection test operations. However, the TSM of each TS leaf will still be in charge of checking<br />
whether acting as a sender/receiver is in conflict with an already running operation. Each sub-detector must also<br />
customize a control cell in order to facilitate the execution of interconnection tests that involve the TPG modules.<br />
3.3.3.4 Monitoring<br />
<strong>The</strong> monitoring service is implemented by an operation running in a concrete TS leaf, or as a collaborative<br />
service where an operation running in the central node of the TS supervises the monitoring operations<br />
running in a number of TS leaves.<br />
<strong>The</strong> basic monitoring operation is a FSM with only two states: monitoring and stop. Once the monitoring<br />
operation is initiated, the monitoring process is started. At this point, any controller can retrieve items by sending<br />
pull commands. A more advanced monitoring infrastructure should be offered in a second development phase<br />
where a given controller receives monitoring updates following a push approach. This second approach<br />
facilitates the implementation of an alarm mechanism.<br />
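Both approaches can be sketched in a few lines (all names hypothetical): in the pull approach the controller requests items explicitly, while in the push approach it subscribes a callback, which also serves as a simple alarm mechanism.<br />

```python
class MonitoringOperation:
    """Two-state monitoring sketch: 'monitoring' or 'stop'."""
    def __init__(self):
        self.state = "stop"
        self.items = {}
        self.subscribers = []

    def start(self):
        self.state = "monitoring"

    def update(self, name, value):
        if self.state != "monitoring":
            return                       # updates ignored while stopped
        self.items[name] = value
        for callback in self.subscribers:
            callback(name, value)        # push approach

    def pull(self, name):
        return self.items.get(name)      # pull approach


mon = MonitoringOperation()
alarms = []
mon.subscribers.append(
    lambda n, v: alarms.append(n) if n == "temperature" and v > 60 else None)
mon.start()
mon.update("temperature", 75)
print(mon.pull("temperature"))  # 75: the controller pulls the item
print(alarms)                   # ['temperature']: push-based alarm fired
```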
3.3.3.5 Start-up<br />
From the point of view of a controller (run control session or standalone client), the whole L1 trigger system is<br />
one single resource, which can be started by sending three commands. Figure 3-7 shows how this process is<br />
carried out. This approach will simplify the implementation of the client.<br />
Figure 3-7: Start-up service. [Figure: the run control session (R<strong>CMS</strong> responsibility) sends Start(TS_URL) to the job control daemon (JC) of the central node, then Config_trigger_sw(TS_config_data) and Startup_trigger(TS_start_key) to the central node itself; the central node (TS responsibility) repeats the same three-command sequence towards the JC of each leaf, e.g. Start(GT_URL), Config_trigger_sw(GT_config_data) and Startup_trigger(GT_start_key) for the Global <strong>Trigger</strong>; finally each leaf (sub-system responsibility) starts its OSWI.]<br />
<strong>The</strong> first client that wishes to operate with the TS must follow these steps:<br />
• Send a Start(TS_URL) command to the job control daemon in charge of starting up the central node of the<br />
TS, where TS_URL identifies the Uniform Resource Locator from where the compiled central node of the TS<br />
can be retrieved.<br />
• Send a Config_trigger_sw(TS_config_data) command to the central node of the TS in order to properly<br />
configure it. Steps 1 and 2 are separated to facilitate an incremental configuration process.
• Send a Startup_trigger(TS_start_key) command to the central node of the TS. This command will send<br />
the same sequence of three commands to each TS leaf, but now the command parameters are retrieved from<br />
the configuration database register identified with the TS_start_key index.<br />
<strong>The</strong> Config_trigger_sw(TSLeaf_config_data) command that is received by the TS leaf is in charge of<br />
starting up the corresponding online software infrastructure.<br />
<strong>The</strong> release of the TS nodes is also hierarchic. Each node of the TS (i.e., TS central node and TS leaves) will<br />
maintain a counter of the number of controllers that are operating on it. When a controller wishes to stop<br />
operating a given TS node, it has to request the value of the reference counter from the TS node. If it is equal to<br />
1, the controller will send a Release_node command and will wait for the answer. When a TS node receives a<br />
Release_node command it will behave like the controller outlined above in order to release the unnecessary<br />
software infrastructure.<br />
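The hierarchic start-up and reference-counted release can be sketched together. The structure below is a hypothetical simplification (the real sequence goes through the job control daemons and the configuration database): each node starts its children only once, and tears everything down only when its last controller releases it.<br />

```python
class TSNode:
    """Sketch of a TS node with hierarchic start-up and release."""
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)
        self.controllers = 0       # reference counter of active controllers
        self.started = False

    def startup(self):
        self.controllers += 1
        if not self.started:       # Start + Config_trigger_sw + Startup_trigger
            self.started = True
            for child in self.children:
                child.startup()    # same three-command sequence per leaf

    def release(self):
        if self.controllers == 1:  # last controller: Release_node cascades
            self.controllers = 0
            self.started = False
            for child in self.children:
                child.release()
        else:
            self.controllers -= 1


leaves = [TSNode("GT"), TSNode("GMT"), TSNode("GCT")]
central = TSNode("central", leaves)
central.startup()
print([leaf.started for leaf in leaves])  # [True, True, True]
central.release()
print([leaf.started for leaf in leaves])  # [False, False, False]
```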
3.3.4 Graphical User Interface<br />
Together with the basic building block of the TS, the control cell, an interactive graphical environment to interact<br />
with it should be provided. It should feature a display to help the user/developer to operate the control cell in<br />
order to cope with the requirement outlined in Section 3.2.1, Point 9). Two different interfaces are foreseen:<br />
• HTTP: <strong>The</strong> control cell should provide an HTTP interface that allows full operation of the control cell and<br />
visualization of the state of any running operation. <strong>The</strong> HTTP interface should provide an additional entry<br />
point to the control cell (Section 3.3.2), bypassing the ACM, in order to offer a larger flexibility in the<br />
development and debug phases.<br />
• Java: A generic controller developed in Java should provide to the user an interactive window to operate the<br />
control cell through a SOAP interface. This Java application should also be an example of how to interact<br />
with the monitoring operations offered by the control cell, and graphically represent the monitored items.<br />
This Java controller can be used by the R<strong>CMS</strong> developers as an example of how to interact with the<br />
TS.<br />
3.3.5 Configuration and conditions database<br />
In this design, a dedicated configuration/conditions database per sub-system is foreseen. Different sets of<br />
firmware for the L1 trigger electronics boards and default parameters such as thresholds should be predefined<br />
and stored in the database. <strong>The</strong> information should be validated with respect to the actual hardware limitations<br />
and compatibility between different components. However, as it is shown in Figure 3-1, all these databases share<br />
the same database server provided by the <strong>CMS</strong> DataBase Working Group (DBWG). <strong>The</strong> general <strong>CMS</strong> database<br />
infrastructure, which the TS will use, includes the following components:<br />
• HW infrastructure: Servers.<br />
• SW infrastructure: Likely based on Oracle, scripts and generic GUIs to populate the databases, methodology<br />
to create customized GUIs to populate sub-system specific configuration data.<br />
Each trigger sub-system should provide the specific database structures for storing configuration data, access<br />
control information and interconnection test parameters. Custom GUIs to populate these structures should also<br />
be delivered.<br />
3.4 Project communication channels<br />
<strong>The</strong> development of the <strong>Trigger</strong> <strong>Supervisor</strong> required the collaboration of all trigger sub-systems, sub-detectors<br />
and the R<strong>CMS</strong>. Other parties of the <strong>CMS</strong> collaboration are also involved in this project: the Luminosity<br />
Monitoring System (LMS), the High Level <strong>Trigger</strong> (HLT), the Online Software Working Group (OSWG) and<br />
the DataBase Working Group (DBWG). A consistent configuration of the <strong>Trigger</strong> Primitive Generator (TPG)<br />
modules of each sub-detector, the automatic update of the L1 trigger pre-scales as a function of information<br />
obtained from the LMS, the adequate configuration of the HLT and the agreement in the usage of software tools<br />
and database technologies enlarged the number of involved parties during the development of the TS. Due to the<br />
large number of involved parties and sub-system interfaces, a significant effort was dedicated to documentation<br />
and communication.
Project development 33<br />
One of the problems in defining the communication channels is that they may concern different classes of<br />
consumers with fairly different backgrounds and languages: electronics engineers, physicists, programmers and<br />
technicians. Consumers can be roughly divided between the TS team and the rest. For internal use, the TS<br />
members use Unified Modeling Language (UML) [61] descriptions to model and document the status of the<br />
TS software framework: concurrency, communication mechanism, access control, task scheduling and error<br />
management. This model is kept consistent with the status of the TS software framework. This additional effort<br />
is worthwhile because it accelerates the learning curve of new team members, who are able to contribute<br />
effectively to the project in a shorter period of time; it also helps to detect and remove errors, and it can be used as<br />
discussion material with other software experts, for instance to discuss the database interface with the DBWG or<br />
to justify to the OSWG an upgrade of a core library. But this approach is no longer valid when the consumer is<br />
not a software expert. Project managers, electronics engineers and physicists must also contribute. Periodic<br />
demonstrators with all involved parties have proved to be powerful communication channels. This simple<br />
approach has facilitated the understanding of the TS by a wide range of experts and has helped in the continuous<br />
process of understanding the requirements. A practical way to communicate the status of the project has also<br />
facilitated the maintenance of a realistic development plan and manpower forecast calendar.<br />
3.5 Project development<br />
<strong>The</strong> development of the TS was divided into three main development layers: the framework, the system and the<br />
services. <strong>The</strong> framework is the software infrastructure that facilitates the main building block or control cell, and<br />
the integration with the specific sub-system OSWI. <strong>The</strong> system is a distributed software architecture built out of<br />
these building blocks. Finally, the services are the L1 trigger operation capabilities implemented on top of the<br />
system as a collaboration of finite state machines running in each of the cells. <strong>The</strong> decomposition of the project<br />
development tasks into three layers has the following advantages:<br />
1) Project development coordination: The division of the project development effort into three conceptually<br />
different layers facilitates the distribution of tasks between a central team and the sub-systems. In a context<br />
of limited human resources, the central team can focus on tasks with a project-wide scope, such as<br />
project organization, communication, design and development of the TS framework, coordination of<br />
sub-system integration, and sub-system support. The tasks assigned to the sub-systems are those that<br />
require expert knowledge of the sub-system hardware: developing the sub-system TS cells according<br />
to the models proposed by the central team, and developing the sub-system cell<br />
operations required by the central team in order to build the configuration and test services.<br />
2) Hardware and software upgrades: Periodic software platform and hardware upgrades are foreseen during<br />
the long operational life of the experiment. A baseline layer that hides these upgrades and provides a stable<br />
interface avoids the propagation of code modifications to higher conceptual layers. <strong>The</strong>refore, the code and<br />
number of people involved in updating the TS after each SW/HW upgrade are limited and well localized.<br />
3) Flexible operation capabilities: A stable distributed architecture built on top of the baseline layer is the<br />
first step towards providing a simple methodology to create new services to operate the L1 decision loop<br />
(Section 3.2.2, Point 6)). The simplicity of this methodology is necessary because the people in charge of<br />
defining how the experiment is operated are in general not software experts but particle physicists with<br />
almost full-time management responsibilities.
[Figure 3-8 content: periodic demonstrators serve as a communication channel with all involved parties: the<br />
trigger sub-systems and sub-detectors, the luminosity monitoring system, the High Level Trigger, the Run Control<br />
and Monitoring System, the Database Working Group and the Online Software Working Group. The diagram maps<br />
the development layers Services, System and Framework to Chapters 6, 5 and 4 respectively, while the Prototype,<br />
SW Context, Concept and HW Context are covered by Chapters 3 and 1.]<br />
Figure 3-8: <strong>Trigger</strong> <strong>Supervisor</strong> project organization and communication schemes.<br />
The TS framework, presented in Chapter 4, consists of the distributed programming facilities required to build<br />
the distributed software system known as the TS system. The TS system, presented in Chapter 5, is a set of nodes<br />
and the communication channels among them; it serves as the underlying infrastructure that facilitates the<br />
development of the TS services presented in Chapter 6. Figure 3-8 shows a simplified diagram of the project<br />
organization, the communication channels and the contents of Chapters 1, 3, 4, 5 and 6.<br />
3.6 Tasks and responsibilities<br />
The development of the TS framework, system and services can be further divided into a number of tasks. Due<br />
to the limited resources of the central TS team, and in some cases due to the required expertise in concrete<br />
sub-system hardware, these tasks are distributed among the trigger sub-systems and the TS team.<br />
Central team responsibilities<br />
The tasks assigned to the central team are those with a project-wide scope, such as project organization,<br />
communication, design and development of common infrastructure, coordination of sub-system integration,<br />
and sub-system support. The following list describes these tasks.<br />
1) <strong>Trigger</strong> <strong>Supervisor</strong> framework development: <strong>The</strong> creation of the basic building blocks that form the TS<br />
system and that facilitate the integration of the different sub-systems is a major task which requires a<br />
continuous development process from the prototype to the periodic upgrades in coordination with the<br />
OSWG and DBWG.<br />
2) Coordination: The central team is responsible for discussing and proposing to each sub-system a model for<br />
integration with the TS system. The central team also develops the central cell and coordinates<br />
the different sub-systems in order to create the TS services.<br />
3) Sub-system support: It is important to provide adequate support to the sub-systems in order to ease the<br />
integration process and the usage of the TS framework. To this end, the project web page [62] was<br />
regularly updated with the latest version of the user’s guide [63] and the latest presentations, a series of<br />
workshops [64][65][66] was organized, and a web-based support management tool was set up [67].
4) Software configuration management: A set of configuration management actions were proposed by the<br />
central team in order to improve the communication of the system evolution and the coordination among<br />
sub-system development groups. A common Concurrent Versions System 9 (CVS) repository for all the<br />
online software infrastructure of the L1 trigger was created, which facilitates the production and<br />
coordination of L1 trigger software releases. A generic Makefile 10 was adopted to homogenize the build<br />
process of the L1 trigger software. This allowed a more automatic deployment of the L1 trigger online<br />
software infrastructure, and prepared it for integration with the DAQ online software.<br />
5) Communication: <strong>The</strong> central team was also responsible for communicating with all involved parties<br />
according to Section 3.4. <strong>The</strong> communication effort consisted of periodic demonstrators, the framework<br />
internal documentation and presentations in the collaboration meetings.<br />
Sub-system responsibilities<br />
The tasks assigned to the sub-systems were those that required expert knowledge of the sub-system hardware.<br />
<strong>The</strong>se tasks consisted of developing the sub-system TS cells according to the models proposed by the central<br />
team, and the development of the sub-system cell operations required by the central team in order to build the<br />
configuration and test services.<br />
Shared responsibilities<br />
Due to an initial lack of human resources in the sub-system teams, some sub-system cells were initially<br />
prototyped by the central team: GT, GMT, and DTTF. At a later stage, the bulk of these developments was<br />
transferred to the corresponding sub-systems.<br />
3.7 Conceptual design in perspective<br />
<strong>The</strong> TS conceptual design presented in this chapter consists of functional and non-functional requirements, a<br />
feasible architecture that fulfills these requirements and the project organization details. <strong>The</strong>se three points<br />
define the project concept. Some initial technical aspects have also been presented in order to prove the<br />
feasibility of the design: XDAQ as baseline infrastructure and GUI technologies, the usage of FSM’s, services<br />
implementation details and so on.<br />
Over three years the project scope has not been altered, proving the suitability of the initial conceptual ideas.<br />
However, some technical details have evolved towards different solutions, some have disappeared and a few<br />
have been added. The following chapters describe the final technical details of the Trigger Supervisor.<br />
9 <strong>The</strong> Concurrent Versions System (CVS), also known as the Concurrent Versioning System, is an open-source version<br />
control system that keeps track of all work and all changes in a set of files, typically the implementation of a software project,<br />
and allows several (potentially widely-separated) developers to collaborate (Wikipedia).<br />
10 In software development, make is a utility for automatically building large applications. Files specifying instructions for<br />
make are called Makefiles (Wikipedia).
Chapter 4<br />
<strong>Trigger</strong> <strong>Supervisor</strong> Framework<br />
4.1 Choice of an adequate framework<br />
The conceptual design of the Trigger Supervisor presented in Chapter 3 outlines a distributed software control<br />
system with a hierarchical topology, where each node relies on a common architecture. Such a distributed<br />
system requires a distributed programming framework 11 that facilitates the necessary tools<br />
and services for remote communication, system process management, memory management, error management,<br />
logging and monitoring. A suitable solution had to cope with the functional and non-functional requirements<br />
presented in Chapter 3.<br />
As discussed in Section 1.4, the <strong>CMS</strong> Experiment Control System (ECS) is based on three main distributed<br />
programming frameworks, namely XDAQ, DCS and R<strong>CMS</strong>, which as official projects of the <strong>CMS</strong> collaboration<br />
will be maintained and supported during an operational phase of the order of ten years. <strong>The</strong> choice was therefore<br />
limited to these frameworks. Other external projects were not considered because their long-term maintenance<br />
could not be assured.<br />
Among them, XDAQ had proven to be the most complete and the best suited to facilitate the fast development<br />
required in Section 3.2.2, Point 5):<br />
• <strong>The</strong> Online SoftWare Infrastructure (OSWI) of all sub-systems is mainly formed by libraries written in C++<br />
running on an x86/Linux platform. <strong>The</strong>se are intended to hide hardware complexity from software experts.<br />
<strong>The</strong>refore, a distributed programming framework based on C++ would simplify the model of integration<br />
with the sub-system OSWI’s.<br />
• When the survey took place, XDAQ was already a mature product with an almost final API which<br />
facilitated the upgrading effort.<br />
• XDAQ provides infrastructure for monitoring, logging and database access.<br />
<strong>The</strong> R<strong>CMS</strong> and PVSSII/JCOP frameworks were not selected due to the additional complexity of the overall<br />
architecture. First, R<strong>CMS</strong> is written in Java and therefore the integration of C++ libraries would require an<br />
additional effort. Besides, RCMS was being completely redeveloped when the survey took place. Regarding PVSSII,<br />
it could have been adopted if the sub-system C++ code had been run within a Distributed Information<br />
Management (DIM) server [70], which could have provided an adequate remote interface to PVSSII [71].<br />
However, the usage of two distributed programming frameworks (PVSSII and DIM) on two different<br />
platforms (PVSSII runs on Windows and DIM on Linux) would have resulted in an undesirably complex<br />
architecture.<br />
11 A software framework is a reusable software design that can be used to simplify the implementation of a specific type of<br />
software. If this is implemented in an object oriented language, this consists of a set of classes and the way their instances<br />
collaborate [68][69].
Despite the fact that XDAQ was the best available option, it was not an out-of-the-box solution to implement the<br />
<strong>Trigger</strong> <strong>Supervisor</strong> and therefore further development was needed. Section 4.2 describes the requirements of the<br />
<strong>Trigger</strong> <strong>Supervisor</strong> framework. Section 4.3 describes the functional architecture. Section 4.4 discusses the<br />
implementation details. Section 4.5 presents a concrete usage guide of the framework. Finally, the performance<br />
and scalability issues are presented in Section 4.6.<br />
4.2 Requirements<br />
This section presents the requirements of a suitable software framework to develop the TS. It is shown how the<br />
functional (Section 3.2.1) and non-functional (Section 3.2.2) requirements associated with the conceptual design<br />
motivate a number of additional developments which are not covered by XDAQ.<br />
4.2.1 Requirements covered by XDAQ<br />
The basic software infrastructure necessary to implement the TS should fulfill a number of requirements in order<br />
to serve as the core framework of the TS system. The following list presents the requirements which<br />
were properly covered by XDAQ:<br />
1) Web services centric: <strong>The</strong> <strong>CMS</strong> online software, and more exactly, the Run Control and Monitoring<br />
System (R<strong>CMS</strong>) is extensively using web services technologies ([10], p. 202). XDAQ is also a web services<br />
centric infrastructure. <strong>The</strong>refore, it simplifies the integration with R<strong>CMS</strong> (Section 3.2.1, Point 12) ).<br />
2) Logging and error management: According to Sections 3.2.1, Point 7) and 3.2.1, Point 8), the TS<br />
framework should provide facilities for logging and error management in a distributed environment. XDAQ<br />
provides this infrastructure compatible with the <strong>CMS</strong> logging and error management schemes.<br />
3) Monitoring: According to Section 3.2.1, Point 4), the TS framework should provide infrastructure for<br />
monitoring in a distributed environment.<br />
4.2.2 Requirements not covered by XDAQ<br />
Additional infrastructure had to be designed and developed to cope with the requirements of the conceptual<br />
design:<br />
1) Synchronous and asynchronous protocols: The TS framework should facilitate the development of<br />
distributed systems featuring both synchronous and asynchronous communication among nodes (Section<br />
3.2.1, Point 12)).<br />
2) Multi-user: The nodes of a distributed system implemented with the TS framework should allow<br />
concurrent access by multiple clients (Section 3.2.1, Point 10)).<br />
However, the main additional developments were motivated by the human context (Section 3.2.2, Point 7)) of<br />
the project and by time constraints (Section 3.2.2, Point 5)). This section presents a number of desirable<br />
requirements grouped under a few generic guidelines.<br />
Simplify integration and support effort: The resources of the central TS team were very limited. Therefore, it<br />
was necessary to provide infrastructure that simplified the software integration and reduced the need for<br />
sub-system support.<br />
3) Finite State Machine (FSM) based control system: A framework that guides the sub-system developer<br />
reducing the degrees of freedom during the customization process would simplify the software integration<br />
and would reduce the support tasks. A control system model based on Finite State Machines (FSM) is well<br />
known in HEP. It was proposed in Section 3.3 as a feasible model to implement the final services of the<br />
<strong>Trigger</strong> <strong>Supervisor</strong>. FSM’s have been used in other experiment control systems [72][73][74], and are<br />
currently being used by the CMS DCS [75] and other CERN experiments [76][77]. On the other hand, a<br />
well-known model alone is not enough: a concrete FSM had to be provided, with a clear specification of all states<br />
and transitions, their expected behavior, and the input/output parameter data types and names. The more complete<br />
this specification is, the easier the sub-system coordination becomes and the clearer the separation of<br />
responsibilities among sub-systems. Some more concrete implementation details, shown in the<br />
implementation section, like a clear separation of the error management, are intended to ease the
customization and the maintenance phases. In addition, the usage of a well-known model accelerates<br />
the learning curve and therefore the integration process.<br />
4) Simple access to external services: A framework should provide facilities to access Oracle relational<br />
databases, XDAQ applications, and remote web-based services (i.e. SOAP-based, HTTP/CGI based<br />
services) in a simple and homogeneous way. This infrastructure would ease the development of the FSM<br />
transition methods, for instance when it is necessary to access the configuration database.<br />
5) Homogeneous integration methodology independent of the concrete sub-system OSWI: <strong>The</strong> TS<br />
framework should facilitate a common integration methodology independent of the available OSWI and the<br />
hardware setup.<br />
6) Automatic creation of graphical user interfaces: In order to reduce the integration development time, a<br />
framework should provide a mechanism to automatically generate a GUI to control the sub-system<br />
hardware. This should also ensure a common look and feel for all sub-system graphical setups,<br />
so that an operator of the L1 trigger system can learn faster how to operate any sub-system.<br />
7) Single integration software infrastructure: A single visible software framework would simplify the<br />
understanding of the integration process for the sub-systems.<br />
Simplify software tasks during the operational phase: <strong>The</strong> framework architecture should take into account<br />
that support and maintenance tasks are foreseen during the experiment operational phase.<br />
8) Homogeneous online software infrastructure: In addition to simplifying the understanding of the<br />
integration process for the sub-systems, a single integration software infrastructure would ease the creation of<br />
releases, user support and maintenance tasks.<br />
A common technological approach, shared with the Trigger Supervisor, to design and develop sub-system expert<br />
tools, such as graphical setups or command-line utilities to control a concrete piece of hardware, would also help<br />
to simplify the overall maintenance effort of the whole L1 trigger OSWI.<br />
9) Layered architecture: From the maintenance point of view, any additional development on top of XDAQ<br />
had to be designed such that it is easy to upgrade to new XDAQ versions or even to other distributed<br />
programming frameworks.<br />
4.3 Cell functional structure<br />
<strong>The</strong> “cell” is the main component of the additional software infrastructure motivated by the requirements not<br />
covered by XDAQ. This component serves as the main facility to integrate the sub-system’s OSWI with the<br />
<strong>Trigger</strong> <strong>Supervisor</strong>. Figure 4-1 shows the functional structure of the cell for a stable version of the TS<br />
framework.<br />
This functional structure is more detailed than, and differs in several respects from, the cell presented in the<br />
conceptual design chapter. The following sections describe this architecture in detail.<br />
4.3.1 Cell Operation<br />
A cell operation is essentially an FSM running inside the cell which can be remotely operated. In general, FSM’s<br />
are applied to HEP control problems where it is necessary to monitor and control the stable state of a system.<br />
The TS services outlined in Chapter 3 were suitable candidates for this model.
[Figure 4-1 content: the cell exposes an HTTP/CGI (GUI) interface and a SOAP interface guarded by the Access<br />
Control and Response Control modules. Inside, an operations factory and a command factory create operation<br />
and command plug-ins, which are held in the operations pool and commands pool; control panel plug-ins extend<br />
the GUI, and monitorable item handlers feed the monitoring data source. An error management module supervises<br />
execution. Monitor, database, cell and XDAQ xhannels connect to external services, and commands and operations<br />
access the sub-system hardware drivers.]<br />
Figure 4-1: Architecture of the main component of the TS framework: <strong>The</strong> cell.<br />
To use a cell operation it is necessary to initialize an operation instance. <strong>The</strong> cell facilitates a remote interface to<br />
create instances of cell operations. Figure 4-2 shows a cell operation example with one initial state (S1), several<br />
normal states (S2 and S3), transitions between state pairs (arrows), and one event name assigned to each<br />
transition (e₁, e₂, e₃ and e₄).<br />
Operation events are issued by the controller in order to change the current state. The state changes when a<br />
transition named after the issued event, with its origin in the current state, is successfully executed. A transition<br />
named after the event eᵢ has two customizable methods: cᵢ and fᵢ. The method cᵢ returns a boolean value, and the<br />
method fᵢ defines the functionality assigned to a successful transition. If cᵢ returns false, the current state<br />
does not change and fᵢ is not executed. If cᵢ returns true, fᵢ is executed and afterwards the state of<br />
the FSM changes.<br />
A first aspect to note is that each transition has two functions (fᵢ, cᵢ). This design has been chosen to enforce a<br />
customization style that simplifies the implementation, the understanding and the maintenance of the transition<br />
code (fᵢ), whilst facilitating a progressive improvement of the preliminary system check code (cᵢ). For<br />
instance, reading from a database and configuring a board would be a sequence of actions defined by the<br />
transition code, whilst checking that the board is plugged in and that the database is reachable, among other<br />
possible error conditions, would be defined in the check method.<br />
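The check/action pairing described above can be sketched in C++ as a small state machine. This is an illustrative sketch only: the class and method names (CellOperation, addTransition, fireEvent) are hypothetical and do not correspond to the actual TS framework API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Sketch of a cell operation: every transition pairs a check method (c_i)
// with an action method (f_i); the state advances only when c_i returns
// true and f_i has been executed.
class CellOperation {
public:
    using Check  = std::function<bool()>;
    using Action = std::function<void()>;

    explicit CellOperation(std::string initial) : state_(std::move(initial)) {}

    // Register a transition "from --event--> to" with its check/action pair.
    void addTransition(const std::string& from, const std::string& event,
                       const std::string& to, Check c, Action f) {
        table_[{from, event}] = Edge{to, std::move(c), std::move(f)};
    }

    // Issue an event: run c_i; only on success run f_i and change state.
    bool fireEvent(const std::string& event) {
        auto it = table_.find({state_, event});
        if (it == table_.end() || !it->second.check()) return false;
        it->second.action();
        state_ = it->second.target;
        return true;
    }

    const std::string& state() const { return state_; }

private:
    struct Edge { std::string target; Check check; Action action; };
    std::string state_;
    std::map<std::pair<std::string, std::string>, Edge> table_;
};
```

In this style, the database read and board configuration from the example above would live in the Action, while reachability and plugged-in checks would live in the Check.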
[Figure 4-2 content: states S1, S2 and S3 connected by transitions labeled e₁ to e₄. Transition rule: eᵢ: if (cᵢ)<br />
then { fᵢ, move to next state } else { do not move }. Default warning object: warning level = 1000, warning<br />
message = “no message”.]<br />
Figure 4-2: Cell operation.
Each operation has a warning object which provides a way to monitor the status of the operation. As it is<br />
updated with the execution of every new event, the warning object can also be used to provide feedback about<br />
the success level of the transition execution. A warning object contains a warning level and a warning message.<br />
The warning message is destined for human operators, and the warning level is a numeric value that can<br />
be processed by a remote machine controller.<br />
A number of operation-specific parameters can be set. All of them are accessible during the definition and<br />
execution of any of the fᵢ and cᵢ methods. The values of the parameters can be set by the controller when the<br />
operation is initialized or when the controller sends an event. The type of a parameter can be signed or unsigned<br />
integer, string or boolean. The return message, after executing the transition methods, always includes a<br />
payload and the operation warning object. The payload data type can be any of the parameter types.<br />
Standard operations are provided with the TS framework for the implementation of the configuration and<br />
interconnection test services. <strong>The</strong> transition methods for these operations are left empty and each sub-system is<br />
responsible for defining this code. <strong>The</strong> TS services, presented in Chapter 6, appear as a coordinated collaboration<br />
of the different sub-system specific operations. Additional operations can be created by each sub-system to ease<br />
concrete commissioning and debugging tasks. For instance, an operation can be implemented to move data from<br />
memories to spy buffers in order to check the proper information processing in a number of stages.<br />
In order to simplify the understanding of the cell operation model, the intermediate states (Section 3.3.3.1),<br />
representing the execution of the transition methods, are not visible in Figure 4-2. However, each transition has<br />
a hidden state which indicates that the transition methods are being executed.
4.3.2 Cell command<br />
A cell command is a functionality of the cell which can be remotely called. Every command splits its<br />
functionality into two methods: precondition() and code(). The method precondition() returns a<br />
boolean value and the method code() defines the command functionality. If precondition() returns<br />
false, the code() method is not executed; if it returns true, the code() method is executed.<br />
<strong>The</strong> cell commands can have an arbitrary number of typed parameters which can be used within the command<br />
methods.<br />
Like the cell operation, the command has a warning object. This is used to provide better feedback on the<br />
success level of the command execution. The warning object can be modified during the execution of the<br />
precondition() and/or code() methods.<br />
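The precondition()/code() split and the warning object can be sketched as a small template-method hierarchy. The names below (CellCommand, Warning, ResetBoard) and the Warning layout are hypothetical illustrations, not the real TS classes; the default level 1000 and message “no message” are taken from Figure 4-2.

```cpp
#include <string>

// Sketch of the warning object carried by commands and operations.
struct Warning {
    int level = 1000;                   // numeric level for machine controllers
    std::string message = "no message"; // text destined for human operators
};

// Sketch of a cell command: code() runs only if precondition() holds.
class CellCommand {
public:
    virtual ~CellCommand() = default;

    // Template method mirroring the execution rule described above.
    bool execute() {
        if (!precondition()) return false;
        code();
        return true;
    }

    const Warning& warning() const { return warning_; }

protected:
    virtual bool precondition() = 0; // e.g. board plugged in, DB reachable
    virtual void code() = 0;         // the actual command functionality
    Warning warning_;                // may be updated by either method
};

// Hypothetical plug-in command that checks a (simulated) crate before acting.
class ResetBoard : public CellCommand {
public:
    explicit ResetBoard(bool boardPresent) : present_(boardPresent) {}
    bool wasReset = false;

protected:
    bool precondition() override {
        if (!present_) { warning_.level = 3000; warning_.message = "board absent"; }
        return present_;
    }
    void code() override {
        wasReset = true;               // stand-in for real hardware access
        warning_.level = 0;
        warning_.message = "ok";
    }

private:
    bool present_;
};
```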
4.3.3 Factories and plug-ins<br />
A number of operations and commands are provided with the TS framework. These can be extended with<br />
operation and command plug-ins. The operation factory and command factory create instances of<br />
the available plug-ins at the request of an authorized controller. Several instances of the same operation or<br />
command can be operated concurrently.<br />
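A factory of this kind is commonly realized with a registry of creator functions. The sketch below is a generic illustration of the idea under assumed names (CommandFactory, registerPlugin, ReadSpyBuffer); it is not the actual TS factory interface.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Minimal command interface for the sketch.
struct Command {
    virtual ~Command() = default;
    virtual std::string name() const = 0;
};

// Sketch of a command factory: plug-ins register a creator under a type
// name; the factory instantiates them on request.
class CommandFactory {
public:
    using Creator = std::function<std::unique_ptr<Command>()>;

    void registerPlugin(const std::string& type, Creator c) {
        creators_[type] = std::move(c);
    }

    // Returns a fresh instance, or nullptr for an unknown type.
    std::unique_ptr<Command> create(const std::string& type) const {
        auto it = creators_.find(type);
        return it == creators_.end() ? nullptr : it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Hypothetical sub-system plug-in.
struct ReadSpyBuffer : Command {
    std::string name() const override { return "ReadSpyBuffer"; }
};
```

Because each create() call yields a new instance, several instances of the same plug-in can be operated concurrently, as the text requires.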
4.3.4 Pools<br />
The cell operation and command pools are internal cell structures which store all operation and command<br />
instances, respectively. Each instance of an operation or a command is identified by a unique name<br />
(operation_id and command_id). This identifier is used to retrieve and operate a specific instance.<br />
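The pool amounts to an identifier-to-instance map with uniqueness enforced on insertion. A minimal sketch, with hypothetical names (OperationPool, Operation), assuming the pool owns its instances:

```cpp
#include <map>
#include <memory>
#include <string>

// Stand-in for a real cell operation instance.
struct Operation {
    std::string type;
};

// Sketch of the operations pool: each created instance is stored under a
// unique operation_id used later to retrieve and drive it.
class OperationPool {
public:
    // Store a new instance; fails (returns false) if the id is already taken.
    bool add(const std::string& id, std::unique_ptr<Operation> op) {
        return pool_.emplace(id, std::move(op)).second;
    }

    // Retrieve an instance by its id, or nullptr if unknown.
    Operation* find(const std::string& id) {
        auto it = pool_.find(id);
        return it == pool_.end() ? nullptr : it->second.get();
    }

    // Destroy an instance when the controller is done with it.
    bool remove(const std::string& id) { return pool_.erase(id) != 0; }

private:
    std::map<std::string, std::unique_ptr<Operation>> pool_;
};
```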
4.3.5 Controller interface<br />
Compared to the functional design presented in the conceptual design (Section 3.3.2), the input interfaces were<br />
limited to SOAP and HTTP/CGI (Common Gateway Interface 12 ). <strong>The</strong> I2O high performance interface was not<br />
12 <strong>The</strong> Common Gateway Interface (CGI) is a standard protocol for interfacing information servers, commonly a web server.<br />
Each time a request is received, the server analyzes what the request asks for, and returns the appropriate output. CGI can use<br />
the HTTP protocol as transport layer (HTTP/CGI).
added to the definitive architecture because in the end it was only necessary to serve slow control requests. The<br />
possibility to extend the input interface with a sub-system-specific protocol was also dropped because none of<br />
the sub-systems required it.<br />
Both interfaces (SOAP and HTTP/CGI) facilitate the initialization, destruction and operation of any available<br />
command and operation. <strong>The</strong> HTTP/CGI interface also provides access to all monitoring items in the sub-system<br />
cell and other cells belonging to the same distributed system (Section 4.4.4.12). <strong>The</strong> HTTP/CGI interface is<br />
automatically generated during the compilation phase. This simplifies the sub-system development effort and<br />
homogenizes the look and feel of all sub-system GUIs. This human-to-machine interface can be extended with<br />
control panel plug-ins (Section 4.4.4.11). A control panel is also a web-based graphical setup facilitated by the<br />
HTTP/CGI interface but with a customized look and feel. <strong>The</strong> default and automatically generated GUI provides<br />
access to the control panels.<br />
<strong>The</strong> second interface is a SOAP-based machine-to-machine interface. It is intended to facilitate the integration of<br />
the TS with the R<strong>CMS</strong> and to provide a communication link between cells. Appendix A presents a detailed<br />
specification of this interface.<br />
4.3.6 Response control module<br />
The Response Control Module (RCM) was not introduced in the conceptual design chapter. This cell functional<br />
module handles both synchronous and asynchronous responses to the controller. The<br />
synchronous protocol is intended to assure exclusive usage of the cell, while the asynchronous mode enables<br />
multi-user access and enhanced overall system performance.<br />
4.3.7 Access control module<br />
The Access Control Module (ACM) is intended to identify and authorize a given controller. A new controller<br />
trying to gain access to a cell has to identify itself with a user name and a password. The ACM<br />
checks this information in the users database and grants the controller a session identifier, which<br />
is stored and accessible from any cell. The session identifier is the key to those services that are<br />
granted to a concrete user, and it has to be sent with every new controller request.
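The login-then-authorize flow can be sketched as follows. Everything here (AccessControl, login, authorize, the session-id format) is a hypothetical illustration of the described flow, not the TS ACM implementation, and real code would of course not store plaintext passwords.

```cpp
#include <map>
#include <set>
#include <string>

// Sketch of the ACM flow: a controller logs in with user name and
// password, receives a session identifier, and must present that
// identifier with every subsequent request.
class AccessControl {
public:
    void addUser(const std::string& user, const std::string& pass) {
        users_[user] = pass;
    }

    // Returns a session id on success, or an empty string on failure.
    std::string login(const std::string& user, const std::string& pass) {
        auto it = users_.find(user);
        if (it == users_.end() || it->second != pass) return "";
        std::string sid = user + "-session-" + std::to_string(++counter_);
        sessions_.insert(sid);   // stored so any cell could look it up
        return sid;
    }

    // Every controller request must carry a valid session id.
    bool authorize(const std::string& sessionId) const {
        return sessions_.count(sessionId) != 0;
    }

private:
    std::map<std::string, std::string> users_;
    std::set<std::string> sessions_;
    int counter_ = 0;
};
```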
4.3.8 Shared resource manager<br />
The Shared Resource Manager (SRM), outlined in the conceptual design (Section 3.3.2), is no longer solely<br />
responsible for coordinating the access to internal and external resources. In the final design, the concurrent<br />
access to common resources, like the sub-system hardware driver or the communication ports with external<br />
entities, is coordinated by each individual entity. The main reason for this approach is that it is not possible to<br />
assure that all requests pass through the cell.<br />
4.3.9 Error manager<br />
The Error Manager (ERM) is meant to detect any exceptional situation that occurs while a command or<br />
operation transition method is executed and that the method is not able to solve locally. In this case, the<br />
ERM takes control of the method execution and sends the reply message back to the controller with textual<br />
information about what went wrong during the execution of the command or operation transition. This message<br />
is embedded in the warning object of the reply message (Appendix A).<br />
4.3.10 Xhannel<br />
<strong>The</strong> xhannel infrastructure has been designed to gain access to external resources from the cell command and<br />
operation methods. It provides a simple and homogeneous interface to a wide range of external services: other<br />
cells, XDAQ applications and web services. This infrastructure eases the definition of the command and<br />
operation transition methods by simplifying the process of creating SOAP and HTTP/CGI messages, processing<br />
the responses and handling synchronous and asynchronous protocols.
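The homogeneity the xhannel provides can be sketched with one narrow interface and one concrete transport per service type. The class names (Xhannel, CellXhannel, HttpXhannel) and the send signature are assumptions for illustration; the replies are placeholders where a real implementation would perform a SOAP or HTTP round trip.

```cpp
#include <map>
#include <string>

// Sketch of the xhannel idea: one narrow interface hides whether the
// peer is another cell, a XDAQ application or a web service.
class Xhannel {
public:
    virtual ~Xhannel() = default;
    // Send a request and block for the reply (synchronous protocol);
    // an asynchronous variant would take a callback instead.
    virtual std::string send(const std::string& command,
                             const std::map<std::string, std::string>& params) = 0;
};

// One concrete transport per external service type.
class CellXhannel : public Xhannel {              // SOAP to another cell
public:
    std::string send(const std::string& command,
                     const std::map<std::string, std::string>&) override {
        return "soap-reply:" + command;           // placeholder for a SOAP round trip
    }
};

class HttpXhannel : public Xhannel {              // HTTP/CGI to a web service
public:
    std::string send(const std::string& command,
                     const std::map<std::string, std::string>&) override {
        return "http-reply:" + command;           // placeholder for an HTTP request
    }
};

// A transition method only ever sees the common interface.
std::string queryPeer(Xhannel& ch) { return ch.send("GetStatus", {}); }
```

This is why the transition methods stay simple: they call send() and process the reply, never building SOAP or HTTP/CGI messages by hand.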
Implementation 43<br />
4.3.11 Monitoring facilities<br />
<strong>The</strong> TS monitoring infrastructure consists of a methodology to declare cell monitoring items and of additional<br />
infrastructure that facilitates the definition of the code to be executed every time an item is<br />
checked. <strong>The</strong> TS monitoring infrastructure is based on the XDAQ monitoring components.<br />
4.4 Implementation<br />
<strong>The</strong> TS framework is the implementation of the additional infrastructure identified in the discussion of Section 4.2<br />
and formalized with a functional design in Section 4.3. <strong>The</strong> layered architecture of Figure 4-3 shows how the TS<br />
framework is implemented on top of the XDAQ middleware and a number of external software packages 13. <strong>The</strong><br />
TS framework, together with the XDAQ middleware, is used to implement the <strong>Trigger</strong> <strong>Supervisor</strong> system.<br />
4.4.1 Layered architecture<br />
<strong>The</strong> L1 trigger OSWI has the layered structure shown in Figure 4-3. In this organization, the TS framework lies<br />
between a specific sub-system OSWI on the upper side, and the XDAQ middleware and other external packages<br />
on the lower side. Figure 4-4 shows the package level description of the L1 trigger OSWI. Each layer of Figure<br />
4-3 is represented by a box in Figure 4-4 and each box includes a number of packages. <strong>The</strong> dependencies among<br />
packages are also presented in Figure 4-4. Sections 4.4.2 to 4.4.4 present each of the layers outlined in Figure<br />
4-3.<br />
Figure 4-3: Layered description of a Level-1 trigger online software infrastructure.<br />
4.4.2 External packages<br />
This section describes the external packages used by the TS and XDAQ frameworks. <strong>The</strong> C++ classes contained<br />
in these packages are used to enhance the developments described in Section 4.4.<br />
4.4.2.1 Log4cplus<br />
Inserting user notifications, also known as “log statements”, into the code is a method for debugging it (Section<br />
3.2.1, Point 7)). It may also be the only practical way to debug multi-threaded applications and large distributed applications.<br />
Log4cplus is a C++ logging software framework modeled after the Java log4j API [78]. It provides precise<br />
context about a running application. Once inserted into the code, the generation of logging output requires no<br />
human intervention. Moreover, log output can be saved in a persistent medium to be studied at a later time.<br />
<strong>The</strong> Log4cplus package is used to facilitate the debugging of the TS system and to have a persistent register of<br />
the run time system behavior. This facilitates the development of post-mortem analysis tools. Logging facilities<br />
are also used to document and to monitor alarm conditions.<br />
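As a minimal stand-in for the idea (this is deliberately not the log4cplus API), a logger with severity levels writing every record to a persistent sink might look like:<br />

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Minimal stand-in for the idea of log statements: each record carries a
// severity and a message and is appended to a persistent sink (here a
// stream standing in for a file). This is not the log4cplus API.
enum class Level { DEBUG, INFO, WARN, ERROR };

class MiniLogger {
public:
    explicit MiniLogger(std::ostream& sink, Level threshold = Level::INFO)
        : sink_(sink), threshold_(threshold) {}
    void log(Level lvl, const std::string& msg) {
        if (lvl < threshold_) return;  // suppress records below the threshold
        static const char* names[] = {"DEBUG", "INFO", "WARN", "ERROR"};
        sink_ << names[static_cast<int>(lvl)] << ": " << msg << '\n';
    }
private:
    std::ostream& sink_;
    Level threshold_;
};
```

Once such statements are in place, no human intervention is needed to produce the output, and the sink can be replayed later for post-mortem analysis, which is the property the TS logging system relies on.<br />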
13 A software package in object-oriented programming is a group of related classes with a strong coupling. A software<br />
framework can consist of a number of packages.
<strong>Trigger</strong> <strong>Supervisor</strong> Framework 44<br />
Figure 4-4: Software packages of the Level-1 trigger online software Infrastructure.<br />
4.4.2.2 Xerces<br />
Xerces [79] is a validating XML parser written in C++. Xerces makes it easy for C++ applications to read and write<br />
XML data. An API is provided for parsing, generating, manipulating, and validating XML documents. Xerces<br />
conforms to the XML 1.1 [80] recommendation. Xerces is used to ease the parsing of the SOAP request<br />
messages in order to extract the command and parameter names, the parameter values and other message<br />
attributes.<br />
4.4.2.3 Graphviz<br />
Graphviz [81] is a C++ framework for graph filtering and rendering. This library is used to draw the finite state<br />
machine of the cell operations.<br />
4.4.2.4 ChartDirector<br />
ChartDirector [82] is a C++ framework which enables a C++ application to synthesize charts using standard<br />
chart layers. This package is used to present the monitoring information.<br />
4.4.2.5 Dojo<br />
Dojo [83] is a collection of JavaScript functions. Dojo eases building dynamic capabilities into web pages and<br />
any other environment that supports JavaScript. <strong>The</strong> components provided by Dojo can be used to make web<br />
sites more usable, responsive and functional. <strong>The</strong> Dojo toolkit is used to implement the TS graphical user<br />
interface.
4.4.2.6 Cgicc<br />
Cgicc [84] is a C++ library that simplifies the processing of HTTP/CGI requests on the server side (the cell<br />
in our case). This package is used by the CellFramework, Ajaxell and sub-system cell packages to ease the<br />
implementation of the TS web-based graphical user interface.<br />
4.4.2.7 Logging collector<br />
<strong>The</strong> logging collector or log collector [85] is a software component that belongs to the R<strong>CMS</strong> framework<br />
(Section 1.4.1). It is designed and developed to collect logging information from log4j compliant applications<br />
and to forward these logging statements to several consumers at the same time. <strong>The</strong>se consumers can be: Oracle<br />
database, files or a real-time message system. <strong>The</strong> log collector is not a component of the TS framework, but it is<br />
used as a building block of the TS logging system, which is itself a component of the TS system.<br />
4.4.3 XDAQ development<br />
XDAQ (pronounced Cross DAQ) was introduced in Section 1.4.3 as a domain-specific middleware designed for<br />
high energy physics data acquisition systems. It provides platform independent services, tools for local and<br />
remote inter-process communication, configuration and control, as well as technology independent data storage.<br />
To achieve these goals, the framework is built upon industrial standards, open protocols and libraries.<br />
This distributed programming framework is designed according to the object-oriented model and implemented<br />
using the C++ programming language. This infrastructure facilitates the development of scalable distributed<br />
software systems by partitioning applications into smaller functional units that can be distributed over multiple<br />
processing units. In this scheme each computing node runs a copy of an executive that can be extended at runtime<br />
with binary components. A XDAQ-based distributed system is therefore designed as a set of independent,<br />
dynamically loadable modules 14, each one dedicated to a specific sub-task. <strong>The</strong> executive simply acts as a<br />
container for such modules, and loads them according to an XML configuration provided by the user.<br />
A collection of C++ utilities is available to enhance the development of XDAQ components: logging, data<br />
transmission, exception handling facilities, remote access to configuration parameters, thread management,<br />
memory management and communication among XDAQ applications.<br />
Some core components are loaded by default in the executive in order to provide basic functionalities. <strong>The</strong> main<br />
components of the XDAQ environment are the peer transports. <strong>The</strong>se implement the communication among<br />
XDAQ applications. Another default component is the Hyperdaq web interface application which turns an<br />
executive into a browsable web application that can visualize its internal data structure [86].<br />
<strong>The</strong> framework supports two data formats, one based on the I2O [87] specification and the other on XML. I2O<br />
messages are binary packets with a maximum size of 256 KB. I2O messages are primarily intended for the<br />
efficient exchange of binary information, e.g. data acquisition flow. Despite its efficiency, the I2O scheme is not<br />
universal and lacks flexibility. A second type of communication has been chosen for tasks that require higher<br />
flexibility such as configuration, control and monitoring. This message-passing protocol, called Simple Object<br />
Access Protocol (SOAP) relies on the standard Web protocol (HTTP) and encapsulates data using the eXtensible<br />
Markup Language (XML). SOAP is a means to exchange structured data in the form of XML-based messages<br />
among computers over HTTP.<br />
XDAQ uses SOAP for a concept called Remote Procedure Calls (RPC). This means that the SOAP message<br />
contains an XML tag that is associated with a function call, a so called callback, at the receiver side. That way a<br />
controller can execute procedures on remote XDAQ nodes.<br />
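A toy version of this tag-to-callback binding might look as follows (the registry is illustrative; in XDAQ the binding is performed by the framework, not by hand like this):<br />

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Sketch of the RPC idea: the XML tag carried by a SOAP message selects a
// registered callback on the receiver, which produces the reply payload.
using Callback = std::function<std::string(const std::string&)>;

class RpcDispatcher {
public:
    void bind(const std::string& tag, Callback cb) { callbacks_[tag] = std::move(cb); }
    std::string dispatch(const std::string& tag, const std::string& payload) const {
        auto it = callbacks_.find(tag);
        if (it == callbacks_.end()) return "fault: unknown command";
        return it->second(payload);  // invoke the callback bound to the tag
    }
private:
    std::map<std::string, Callback> callbacks_;
};
```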
<strong>The</strong> XDAQ framework is divided into three packages: Core Tools, Power Pack and Work Suite. <strong>The</strong> Core Tools<br />
package contains the main classes required to build XDAQ applications, the Power Pack package consists of<br />
pluggable components to build DAQ applications, and the Work Suite package contains additional infrastructure,<br />
totally independent of XDAQ, which is intended to perform some related data acquisition tasks.<br />
14 XDAQ component, module and application are equivalent concepts.
XDAQ example<br />
A XDAQ application is a C++ class which extends the base class xdaq::Application. It can be loaded into a<br />
XDAQ executive at run-time. Unlike ordinary C++ applications, a XDAQ application does not have a main()<br />
method as an entry point; instead, it has several methods to control specific aspects of its execution. Each of<br />
these methods can be assigned to a RPC in order to facilitate its remote execution.<br />
At startup, a XDAQ executive can be configured by passing the path of a configuration file as a command line<br />
argument. <strong>The</strong> configuration file contains the configuration information of the XDAQ executive. This file uses<br />
XML to hierarchically structure the configuration information in three levels:<br />
• Partition: Each configuration file contains exactly one partition that is a collection of XDAQ executives<br />
hosting XDAQ applications.<br />
• Context: Each context defines one XDAQ executive uniquely identified by its URL, which is composed of<br />
host name and port. A partition may contain an arbitrary number of contexts. <strong>The</strong> <xc:Module> tag inside the<br />
<xc:Context> tag specifies the location of the shared libraries that have to be loaded in order to make the applications<br />
available.<br />
• Application: <strong>The</strong> <xc:Application> tag uniquely identifies a XDAQ application. Each context can be<br />
composed of an arbitrary number of XDAQ applications. Applications can define properties using the<br />
<properties> tag. <strong>The</strong> application properties can be accessed at run-time.<br />
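The three-level structure above might look as follows. This is a schematic sketch only: the element names follow the XDAQ configuration schema as commonly documented, but the class names, identifiers, hosts and paths are placeholders, not the actual GT configuration:<br />

```xml
<!-- Schematic XDAQ configuration: one partition, two contexts.
     Attribute details may differ between XDAQ releases; paths are placeholders. -->
<xc:Partition xmlns:xc="http://xdaq.web.cern.ch/xdaq/xsd/2004/XMLConfiguration-30"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/">
  <xc:Context url="http://host1:port1">
    <xc:Application class="SubsystemCell" id="13" instance="0" network="local">
      <properties xmlns="urn:xdaq-application:SubsystemCell" xsi:type="soapenc:Struct">
        <name xsi:type="xsd:string">GT</name>
      </properties>
    </xc:Application>
    <xc:Module>file://.../libCell.so</xc:Module>
  </xc:Context>
  <xc:Context url="http://host2:port2">
    <xc:Application class="TStore" id="14" instance="0" network="local"/>
  </xc:Context>
</xc:Partition>
```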
<strong>The</strong> cell is implemented as a XDAQ component or application. Figure 4-5 shows the configuration file of the<br />
Global <strong>Trigger</strong> (GT) cell. <strong>The</strong> GT cell runs on the first host and is configured with a number of properties. <strong>The</strong><br />
GT cell is compiled into one library located in the path given by the <xc:Module> tag. A second executive runs on a<br />
different host and contains one single application named Tstore.<br />
[Figure 4-5 listing: the XML markup of the configuration file was lost in extraction. The file defines one partition with two contexts: the first holds the GT cell application, whose properties include the name “GT” and a xhannel list file (“file://…”), together with a module tag pointing to the cell library (“file://…/libCell.so”); the second context holds the Tstore application.]<br />
Figure 4-5: Example of XDAQ configuration file: GT cell configuration file.<br />
4.4.4 <strong>Trigger</strong> <strong>Supervisor</strong> framework<br />
<strong>The</strong> TS framework is the software layer built on top of XDAQ and the external packages. This software layer<br />
fills the gap between XDAQ and a solution that copes with the project-related human factors (Section<br />
3.2.2, Point 7)), time constraints (Section 3.2.2, Point 5)) and the functional requirements not yet covered (Section<br />
3.2.1, Point 10)) discussed in the TS conceptual design. This solution has been developed according to the<br />
requirements discussed in Section 4.2 and the functional architecture presented in Section 4.3.<br />
<strong>The</strong> components of the TS framework can be divided into two groups: the TS core framework and the sub-system<br />
customizable components. <strong>The</strong> TS core framework is the main infrastructure used by the customizable<br />
components. Figure 4-6 shows a Unified Modeling Language (UML) diagram of the most important classes of<br />
the TS core framework and a possible scenario of derived or customizable sub-system classes.<br />
This section presents the structure of the classes contained in the TS framework. <strong>The</strong>ir description follows<br />
the same structure as the cell functional description presented in Section 4.3. <strong>The</strong> implementation of<br />
each functional module is described as a collaboration of classes using the UML. <strong>The</strong> main classes that<br />
collaborate to form the cell functional modules are contained in the CellFramework package. This section<br />
also presents a number of packages developed specifically for this project: the CellToolbox package and a new<br />
library designed and developed to implement the TS Graphical User Interface. Finally, the database interfaces<br />
and the integration of the XDAQ monitoring and logging infrastructures are presented.<br />
4.4.4.1 <strong>The</strong> cell<br />
A SubsystemCell class (or sub-system cell) is a C++ class that inherits from the CellAbstract class, which in<br />
turn is a descendant of the xdaq::Application class. <strong>The</strong> fact that a sub-system cell is a XDAQ application<br />
allows the sub-system cell to be added to a XDAQ partition, thus making it browsable through the XDAQ<br />
HTTP/CGI interface. <strong>The</strong> XDAQ SOAP Remote Procedure Call (RPC) interface is also available to the sub-system<br />
cell. <strong>The</strong> RPC interface, implemented in the CellAbstract class, allows remote usage of the cell<br />
operations and commands. <strong>The</strong> CellAbstract class is also responsible for the dynamic creation of<br />
communication channels between the cell and external services, also known as “xhannels”. <strong>The</strong> xhannel run-time<br />
setup is done according to an XML file known as the “xhannel list”. <strong>The</strong> CellAbstract class implements a GUI,<br />
accessible through the XDAQ HTTP/CGI interface, which can be extended with custom graphical setups called<br />
“control panels”.<br />
[Figure 4-6 diagram: UML class diagram of the TS core framework (xdaq::Application; CellAbstract with addCommand(), addOperation() and addChannel(); CellAbstractContext; CellCommandPort with run(msg); CellXhannel and CellXhannelRequest; the CellCommandFactory, CellOperationFactory and CellPanelFactory; CellCommand, CellOperation, CellPannel and CellWarning; CellToolbox, Ajaxell and DataSource) together with the sub-system customizable classes (SubsystemCell, SubsystemContext, SubsystemOperation, SubsystemCommand, SubsystemPanel, the sub-system monitoring handlers and the sub-system OSWI).]<br />
Figure 4-6: Components of the TS framework and sub-system customizable classes.<br />
4.4.4.2 Cell command<br />
A cell command, presented in Section 4.3.2, is an internal method of the cell that can be executed by an external<br />
entity or controller. <strong>The</strong>re are a few default commands that allow a controller to remotely instantiate, control and<br />
kill cell operations. <strong>The</strong>se commands are presented in the following section. It is also possible to extend the<br />
default cell commands with sub-system specific ones.<br />
Figure 4-7 shows a UML diagram of the TS framework components involved in the creation of the cell<br />
command concept. <strong>The</strong> CellCommand class inherits from the CellObject class which provides access to the<br />
CellAbstractContext object and to the Logger object. <strong>The</strong> CellAbstractContext object is a shared object<br />
among all instances of CellObject in a given cell, in particular among all CellCommand and CellOperation<br />
instances. <strong>The</strong> CellAbstractContext provides access to the factories and to the xhannels.<br />
[Figure 4-7 diagram: CellObject, holding a Logger and the CellAbstractContext (xhannels, factories), is specialized by CellCommand (paramList, run(), virtual init(), code() and precondition()) with an associated CellWarning (message, level); CellSubsystemCommand derives from CellCommand and uses the SubsystemContext with its hardware driver.]<br />
Figure 4-7: UML diagram of the main classes involved in the creation of the cell command concept.<br />
Through a dynamic cast, it is also possible to access a sub-system specific descendant of the CellAbstractContext class (or just cell<br />
context). In some cases, the sub-system cell context gives access to a sub-system hardware driver. <strong>The</strong>refore, all<br />
CellCommand and CellOperation instances can control the hardware. <strong>The</strong> CellObject interface also facilitates<br />
access to the logging infrastructure through the logger object. Each CellCommand or CellOperation object has a<br />
CellWarning object.<br />
<strong>The</strong> CellCommand has one public method named run(). When this method is called, a sequence of three virtual<br />
methods is executed. <strong>The</strong>se virtual methods have to be implemented in the specific CellSubsystemCommand<br />
class: 1) the init() method initializes those objects that will be used in the precondition() and code()<br />
methods (Section 4.3.2); 2) the precondition() method checks the necessary conditions to execute the<br />
command; and 3) the code() method defines the functionality of the command. <strong>The</strong> warning message and level<br />
can be read or written within any of these methods. Finally, the run() method returns the reply SOAP message<br />
which embeds a serialized version in XML of the code() method result and warning objects.<br />
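The run() sequence can be sketched as a template method (the classes below are illustrative stand-ins, not the TS framework itself, and the warning level for a failed precondition is an arbitrary example value):<br />

```cpp
#include <cassert>
#include <string>

// Illustrative re-creation of the run() sequence: run() chains init(),
// precondition() and code(); the reply carries the result together with
// the warning message and level.
class Command {
public:
    virtual ~Command() = default;
    std::string run() {                 // template method, as in CellCommand::run()
        init();
        if (!precondition()) {
            warningLevel = 1000;        // example level for a failed check
            warningMessage = "precondition failed";
            return "";
        }
        return code();
    }
    std::string warningMessage;
    int warningLevel = 0;
protected:
    virtual void init() {}
    virtual bool precondition() { return true; }
    virtual std::string code() = 0;     // the command's actual functionality
};

// A toy sub-system command: echoes its parameter if it is non-empty.
class EchoCommand : public Command {
public:
    explicit EchoCommand(std::string p) : param_(std::move(p)) {}
protected:
    bool precondition() override { return !param_.empty(); }
    std::string code() override { return "echo:" + param_; }
private:
    std::string param_;
};
```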
4.4.4.3 Cell operation<br />
Figure 4-8 shows a UML diagram of the TS framework components involved in the creation of the cell operation<br />
concept.<br />
[Figure 4-8 diagram: CellOperation inherits from CellObject (which holds a Logger and the CellAbstractContext with its xhannels and factories) and from toolbox::lang::class; it holds a paramList, the apply() and virtual initFSM() methods, a CellFSM and a CellWarning (message, level); the predefined commands OpInit, OpSendCommand, OpGetState, OpReset and OpKill act on it; CellSubsystemOperation derives from CellOperation and uses the SubsystemContext with its hardware driver.]<br />
Figure 4-8: UML diagram of the TS framework components involved in the creation of the cell operation<br />
concept.
Like the CellCommand class, the CellOperation class is a descendant of the CellObject class. <strong>The</strong>refore, it has<br />
access to the logger object and to the cell context. <strong>The</strong> CellOperation class inherits also from<br />
toolbox::lang::class. This XDAQ class facilitates a loop that will run in an independent thread executing a<br />
concrete job defined in the CellOperation::job() method. This is known as the “cell operation work-loop”.<br />
An important member of the CellOperation class is the CellFSM attribute. This attribute implements the FSM<br />
defined in Section 4.3.1. <strong>The</strong> initialization code of the CellFSM class is defined in the initFSM() method of the<br />
CellSubsystemOperation class. This method defines the states, the transitions and the (f_i, c_i) methods associated with<br />
each transition.<br />
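A toy finite state machine in this spirit might register transitions with their associated methods like this (illustrative and much simpler than CellFSM; state and event names are examples):<br />

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy FSM: initFSM()-style code registers transitions keyed by
// (state, event), each with an action callback, and fireEvent()
// moves the machine along.
class Fsm {
public:
    void addTransition(const std::string& from, const std::string& event,
                       const std::string& to, std::function<void()> action) {
        transitions_[{from, event}] = {to, std::move(action)};
    }
    void setState(const std::string& s) { state_ = s; }
    const std::string& state() const { return state_; }
    bool fireEvent(const std::string& event) {
        auto it = transitions_.find({state_, event});
        if (it == transitions_.end()) return false;  // event not allowed here
        it->second.second();                         // run the transition method
        state_ = it->second.first;
        return true;
    }
private:
    std::map<std::pair<std::string, std::string>,
             std::pair<std::string, std::function<void()>>> transitions_;
    std::string state_;
};
```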
An external controller can interact with the CellOperation infrastructure through a set of predefined cell<br />
commands: OpInit, OpSendCommand, OpGetState, OpReset and OpKill. <strong>The</strong> OpInit::code() method triggers in<br />
the cell the creation of a new CellOperation object. Once the CellOperation object is created, the operation<br />
work-loop starts. This work-loop periodically reads new events from a given queue. If a new<br />
event arrives, it is passed to the CellFSM object. This queue avoids losing any event and assures that the<br />
events are served in order. <strong>The</strong> rest of the predefined commands are considered events acting on existing operation<br />
objects. <strong>The</strong>refore, the code() method of these commands just pushes the command itself onto the operation<br />
queue.<br />
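The queueing idea can be sketched as follows (single-threaded for brevity; in the real cell the draining loop is the dedicated work-loop thread, and the events would be handed to the FSM rather than collected):<br />

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Sketch of the operation work-loop idea: commands arriving for an existing
// operation are only queued; a loop pops them one by one, so no event is
// lost and the arrival order is preserved.
class OperationQueue {
public:
    void push(const std::string& event) { events_.push(event); }  // from Op* commands
    // Drain the queue in arrival order, returning the processing sequence.
    std::vector<std::string> drain() {
        std::vector<std::string> served;
        while (!events_.empty()) {
            served.push_back(events_.front());  // hand the event to the FSM here
            events_.pop();
        }
        return served;
    }
private:
    std::queue<std::string> events_;
};
```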
4.4.4.4 Factories, pools and plug-ins<br />
Figure 4-9 shows the components involved in the creation of the factory, the pool and the plug-in concepts.<br />
<strong>The</strong>re are three types of factories: command, operation and panel factories. <strong>The</strong> factories are responsible for<br />
controlling the creation, destruction and operation of the respective items (operations, commands or control<br />
panels). Sub-system specific commands, operations and panels are also called plug-ins. <strong>The</strong> available<br />
commands, operations and panels in the factories can be extended at run-time using the CellAbstract::add()<br />
method.<br />
[Figure 4-9 diagram: CellAbstractContext aggregates the CellOperationFactory (createFromOperation(), add()), the CellCommandFactory (createFromCommand(), add()) and the CellPanelFactory (createPanel(), add()), which instantiate CellOperation, CellCommand and CellPannel objects; CellAbstract offers addCommand(), addOperation() and addChannel(); the sub-system side mirrors this with SubsystemContext, SubsystemOperation, SubsystemCommand, SubsystemPanel and SubsystemCell.]<br />
Figure 4-9: TS framework components involved in the creation of the factory, the pool and the plug-in<br />
concepts.
<strong>The</strong> factories also play the role of pools. Each factory keeps track of the created objects and is responsible for<br />
assigning a unique identifier to each of them. After the object creation, this identifier is embedded in the reply<br />
SOAP message and sent back to the controller (Section 4.4.4.5 and Appendix A).<br />
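Combining the factory, pool and plug-in roles, a sketch might look like this (illustrative names; the real factories return objects rather than bare identifiers and embed the identifier in the SOAP reply):<br />

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Sketch of the factory/pool/plug-in ideas: plug-ins register a creator
// under a name at run time; the factory creates objects on request, keeps
// track of them, and hands out unique identifiers.
struct Operation { virtual ~Operation() = default; };

class OperationFactory {
public:
    void add(const std::string& name, std::function<std::unique_ptr<Operation>()> make) {
        creators_[name] = std::move(make);      // run-time extension, like add()
    }
    // Create an instance and return the identifier sent back to the controller.
    int createFromOperation(const std::string& name) {
        auto it = creators_.find(name);
        if (it == creators_.end()) return -1;   // unknown plug-in
        int id = nextId_++;
        pool_[id] = it->second();               // the factory also acts as a pool
        return id;
    }
    std::size_t poolSize() const { return pool_.size(); }
private:
    std::map<std::string, std::function<std::unique_ptr<Operation>()>> creators_;
    std::map<int, std::unique_ptr<Operation>> pool_;
    int nextId_ = 1;
};
```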
4.4.4.5 Controller interface<br />
Figure 4-10 shows the components involved in the creation of the cell Controller Interface (CI). As shown in<br />
Section 4.4.4.1, sub-system cells are XDAQ applications and are therefore able to expose both a HTTP/CGI and<br />
a SOAP interface. <strong>The</strong> cell HTTP/CGI interface is defined in the CellAbstract class by overriding the<br />
default() virtual method of the xdaq::Application class. This method parses the input HTTP/CGI request<br />
which is available as a Cgicc input argument (Section 4.4.2.6). <strong>The</strong> HTTP/CGI response is written into the<br />
Cgicc output argument at the end of the default() method and is sent back by the executive to the browser. <strong>The</strong><br />
TS GUI is presented in Section 4.4.4.11.<br />
[Figure 4-10 diagram: CellAbstract, derived from xdaq::Application and associated with the CellAbstractContext, exposes addCommand(), addOperation() and addChannel() and the callbacks xoap::MessageReference guiResponse(xoap::MessageReference msg), xoap::MessageReference command(xoap::MessageReference msg) and void Default(xgi::Input* in, xgi::Output* out); it uses Ajaxell, and SubsystemCell derives from it.]<br />
Figure 4-10: Components involved in the creation of the controller interface.<br />
A second interface is the SOAP interface. A non-customized cell is able to serve the default commands, which<br />
allow a controller to instantiate, control and kill cell operations. <strong>The</strong> cell SOAP interface and the callback routine assigned<br />
to each SOAP command are defined in the CellAbstract class. This interface is enlarged when a new command<br />
is added using the CellAbstract::addCommand() method.<br />
All SOAP commands are served by the same callback method CellAbstract::command(). This method uses the<br />
CommandFactory object to create a CellCommand object and executes the command public method<br />
CellCommand::run() (Section 4.4.4.2). <strong>The</strong> SOAP message object returned by the run() method is forwarded<br />
by the executive to the controller. Section 4.4.4.6 discusses in more detail the implementation of the synchronous<br />
and asynchronous interaction with the controller, and Appendix A presents the SOAP API from the controller<br />
point of view.<br />
4.4.4.6 Response control module<br />
Figure 4-11 shows a UML diagram of the classes involved in the implementation of the Response Control<br />
Module (RCM). <strong>The</strong> RCM implements the details of the communication protocols with a cell client or controller.<br />
A given controller has two possible ways to interact with the cell: synchronous and asynchronous (Appendix A).<br />
When the controller requests a synchronous execution of a cell command, it assumes that the reply message will<br />
be sent back once the command execution has finished. <strong>The</strong> second way to interact with the cell is the<br />
asynchronous one. In this case, an empty acknowledge message is sent back immediately to the controller<br />
and a second message is sent back when the execution of the command is completed. <strong>The</strong><br />
asynchronous protocol allows implementing cell clients with an improved response time and facilitates the multi-user<br />
(or multi-client) functional requirement outlined in Section 3.2.1, Point 10). <strong>The</strong> asynchronous protocol<br />
facilitates the multi-user interface because the single-user SOAP interface provided by the XDAQ executive is<br />
freed immediately. However, the synchronous protocol is interesting for a controller that wants to<br />
block the access to a given cell whilst it is using the cell.<br />
[Figure 4-11 diagram: CellAbstract (addCommand(), addOperation(), addChannel()) uses the CellCommandFactory held by the CellAbstractContext to instantiate CellCommand objects; CellCommandPort exposes run(msg); CellCommand inherits from CellObject and toolbox::lang::class, and a SoapMessenger with send(xoap::message) delivers the replies.]<br />
Figure 4-11: UML diagram of the classes involved in the implementation of the Response Control Module.<br />
It was shown in Section 4.4.4.5 that all SOAP commands are served by the same callback routine defined in the<br />
method CellAbstract::command(). This method uses the CommandFactory object to create a CellCommand<br />
object and then executes the method CellCommand::run() which returns the SOAP reply message (Section<br />
4.4.4.2). In the synchronous case, the CellCommand::run() method returns just after executing the code()<br />
method. In the asynchronous case, the CellCommand::run() method returns immediately after starting the<br />
execution of the code() method which continues running in a dedicated thread. <strong>The</strong> asynchronous SOAP reply<br />
message is sent back to the controller by this thread when the code() method finishes. <strong>The</strong> thread is facilitated<br />
by the cell command inheritance from the toolbox::lang::class class. Figure 4-12 shows a simplified<br />
sequence diagram of the interaction between a controller and a cell using synchronous and asynchronous SOAP<br />
message protocols.
[Figure 4-12 diagram: in the asynchronous exchange, the controller sends a SOAP message (async=true, cid=xyz) to the cell, which creates CellCommand 1, immediately returns a SOAP acknowledge reply (ack, cid=xyz) and later sends the SOAP reply with the result (result, cid=xyz); in the synchronous exchange, the controller sends a SOAP message (async=false, cid=xyz), the cell creates CellCommand 2 and returns the SOAP reply with the result only when run() has finished.]<br />
Figure 4-12: Simplified sequence diagram of the interaction between a controller and a cell using<br />
synchronous and asynchronous SOAP messages.<br />
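The two protocols can be sketched as follows (illustrative; the real implementation exchanges SOAP messages and obtains the thread through toolbox::lang::class, neither of which is modeled here):<br />

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>

// Sketch of the two reply protocols for a single command: synchronously,
// the caller blocks until code() finishes; asynchronously, an
// acknowledgement returns immediately and the real reply is delivered
// later through a callback, from a dedicated thread.
std::string runSync(const std::function<std::string()>& code) {
    return code();                      // reply only when execution is done
}

std::string runAsync(const std::function<std::string()>& code,
                     const std::function<void(const std::string&)>& onReply,
                     std::thread& worker) {
    worker = std::thread([code, onReply] { onReply(code()); });
    return "ack";                       // immediate acknowledgement message
}
```

The caller of runAsync() must eventually join the worker thread; in the cell, the thread itself sends the second SOAP reply and then terminates.<br />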
4.4.4.7 Access control module<br />
<strong>The</strong> Access Control Module (ACM) is not implemented in version 1.3 of the TS framework, although a<br />
placeholder is available. <strong>The</strong> run() method of the CellCommandPort object (Figure 4-6) is meant to hide the access<br />
control complexity.<br />
4.4.4.8 Error management module<br />
<strong>The</strong> Error Management Module (EMM) catches all software exceptional situations not handled in the command<br />
and operation transition methods. When such a method is executed due to a synchronous request message, the<br />
CellAbstract::command() method is responsible for catching any software exception. If one is caught, the<br />
method builds the reply message with the warning level equal to 3000 (Appendix A) and the warning message<br />
specifying the software exception. When the command or operation transition method is executed after an<br />
asynchronous request, all possible exceptions are caught in the same thread where the code() method runs. In<br />
this second case, the thread itself builds the reply message with the adequate warning information.<br />
In case the cell dies during the execution of a given synchronous request, this will be detected on the client side<br />
because the socket connection between the client and cell would be broken. If the request is sent in asynchronous<br />
mode, the request message is sent through a socket which is closed just after receiving the acknowledge<br />
message. In this case, the reply message is sent through a second socket opened by the cell. <strong>The</strong>refore, the client<br />
is not automatically informed if the cell dies, and it is the client’s responsibility to implement a time-out or a<br />
periodic “ping” routine to check that the cell is still alive.<br />
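A client-side time-out of the kind suggested here might be sketched as follows (illustrative, not part of the TS framework): the client waits on a condition variable for the second reply and gives up after a deadline instead of hanging forever if the cell died.<br />

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>

// Sketch of a client waiting for the asynchronous reply with a time-out.
class ReplyWaiter {
public:
    // Called when the cell's second message (the real reply) arrives.
    void deliver(const std::string& reply) {
        std::lock_guard<std::mutex> lock(m_);
        reply_ = reply;
        arrived_ = true;
        cv_.notify_all();
    }
    // Returns true if a reply arrived before the time-out expired.
    bool waitFor(std::chrono::milliseconds timeout, std::string& out) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return arrived_; })) return false;
        out = reply_;
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool arrived_ = false;
    std::string reply_;
};
```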
4.4.4.9 Xhannel<br />
<strong>The</strong> xhannel infrastructure was implemented to simplify the access from a cell to external web service providers<br />
(SOAP, HTTP, etc.), such as other cells. <strong>The</strong> cell xhannels are designed to hide the concrete details of<br />
the remote service provider protocol and to provide a homogeneous and simple interface. This infrastructure<br />
eases decoupling the development of external services from the cell customization process.<br />
Four different xhannels are provided: CellXhannelCell, the xhannel to other cells; CellXhannelTB, the xhannel to<br />
Oracle-based relational databases; CellXhannelXdaqSimple, the xhannel to access XDAQ applications through<br />
a SOAP interface; and CellXhannelMonitor, the xhannel to access monitoring information collected in a XDAQ<br />
collector. Table 4-1 outlines the purpose of each of the xhannels.<br />
Xhannel class name: Purpose (external service)<br />
CellXhannelCell: To interact with other cells (Section 4.4.4.9.1)<br />
CellXhannelTB: To interact with a Tstore application (Section 4.4.4.9.2)<br />
CellXhannelXdaqSimple: To interact with a XDAQ application<br />
CellXhannelMonitor: To interact with a monitor collector (Section 4.4.4.12)<br />
Table 4-1: Cell xhannel types and their purpose.<br />
Each CellXhannel class has an associated CellXhannelRequest class. <strong>The</strong> CellXhannel classes are in charge of<br />
hiding the sending and receiving process, whilst the CellXhannelRequest classes are in charge of creating the<br />
SOAP or HTTP request messages and of parsing the replies. All concrete xhannel and request classes inherit<br />
from the CellXhannel and CellXhannelRequest base classes, respectively.<br />
4.4.4.9.1 CellXhannelCell<br />
<strong>The</strong> CellXhannelCell class provides access to the services offered by remote cells, whilst the<br />
CellXhannelRequestCell class is used to create the SOAP request messages and to parse the replies. <strong>The</strong><br />
CellXhannelCell class can handle both synchronous and asynchronous interaction modes. <strong>The</strong> asynchronous<br />
reply is caught because the CellXhannelCell is also a XDAQ application which is loaded in the same executive<br />
as the cell. A callback method in charge of processing all the asynchronous replies assigns them to the<br />
corresponding CellXhannelRequestCell object.<br />
A usage example is shown in Figure 4-13. First, the CellXhannel pointer is obtained from the CellContext.<br />
Second, the CellXhannel object is used to create the request and the message (line 5). Third, the request is sent<br />
to the remote cell using the CellXhannelCell (line 7). And finally, when the reply is received (line 12), the<br />
request is destroyed (line 16).<br />
<strong>The</strong> definition of all available xhannels in a cell is made in a XML configuration file called “xhannel list”. When<br />
the cell is started up, this file is processed and the xhannel objects are attached to the CellContext. Figure 4-14<br />
shows an example of an xhannel list file. <strong>The</strong> xhannel list should be referenced from the sub-system configuration<br />
file as shown in Figure 4-5.
Implementation 55<br />
1 CellXhannelCell* pXC = dynamic_cast<CellXhannelCell*>(contextCentral->getXhannel("GT"));<br />
2 CellXhannelRequestCell* req = dynamic_cast<CellXhannelRequestCell*>(pXC->createRequest());<br />
3 map param;<br />
4 bool async = true;<br />
5 req->doCommand(currentSid_, async, "checkTriggerKey", param);<br />
6 try {<br />
7 pXC->send(req);<br />
8 } catch (xcept::Exception& e) {<br />
9 pXC->remove(req);<br />
10 XCEPT_RETHROW(CellException, "Error sending request to Xhannel GT", e);<br />
11 }<br />
12 while (!req->hasResponse()) sleepmillis(100);<br />
13 try {<br />
14 LOG4CPLUS_INFO(getLogger(), "GT key is " + req->commandReply()->toString());<br />
15 } catch (xcept::Exception& e) {<br />
16 pXC->remove(req);<br />
17 XCEPT_RETHROW(CellException, "Parsing error in the GT reply", e);<br />
18 }<br />
19 pXC->remove(req);<br />
Figure 4-13: Example of how to use the xhannel to send SOAP messages to the GT cell.<br />
[XML listing omitted: the markup was lost in extraction. <strong>The</strong> file defines three xhannels: “DB” of type tstore, “MON” of type monitor and “GT” of type cell.]<br />
Figure 4-14: Example of xhannel list file. This file corresponds to the central cell of the TS system and defines<br />
xhannels to the monitor collector, to a Tstore application and to the GT cell.<br />
4.4.4.9.2 CellXhannelTB<br />
<strong>The</strong> CellXhannelTB class is another instance of the xhannel infrastructure. It simplifies the development of<br />
command and operation transition methods that need to interact with an Oracle database server.<br />
[Diagram omitted: cells 1 to 3, each in its own XDAQ executive, use a CellXhannelTB (SOAP) to reach a Tstore application running in a further XDAQ executive, which accesses the Oracle DB through OCCI.]<br />
Figure 4-15: Recommended architecture to access a relational database from a cell.<br />
CellXhannelTB provides read and write (insert and update) access to the database. Figure 4-15 shows the<br />
recommended architecture to access a relational database from a cell using this communication channel.<br />
<strong>The</strong> CellXhannelTB sends SOAP requests to an intermediate XDAQ application named Tstore, which is<br />
delivered with the XDAQ Power Pack package. Tstore allows reading and writing XDAQ table structures in an<br />
Oracle relational database. Tstore is the agreed solution for the <strong>CMS</strong> experiment as intermediate node between<br />
the sub-system online software and the central <strong>CMS</strong> database server. It is designed to efficiently manage<br />
multiple connections with a central database server. <strong>The</strong> communication between Tstore and the server uses the<br />
Oracle C++ Call Interface (OCCI).<br />
4.4.4.10 CellToolbox<br />
<strong>The</strong> CellToolbox package contains a number of classes intended to simplify the implementation of the cell.<br />
Table 4-2 presents the CellToolbox class list.<br />
Class name: Functionality<br />
CellException: Definition of the TS framework exception<br />
CellToolbox: Several methods to create and parse SOAP messages<br />
CellLogMacros: Macros to insert log statements<br />
HttpMessenger: To send a HTTP request<br />
SOAPMessenger: To send a SOAP message<br />
Table 4-2: Class list of the CellToolbox package.<br />
4.4.4.11 Graphical User Interface<br />
When a XDAQ executive is started up, a number of core components are loaded in order to provide basic<br />
functionalities. One of the main core components is Hyperdaq. It facilitates a web interface which turns an<br />
executive into a browsable web application able to provide access to the internal data structure of any XDAQ<br />
application loaded in the same executive [86]. Any XDAQ application can customize its own web interface by<br />
overriding the default() virtual method of the xdaq::Application class (4.4.4.5). <strong>The</strong> web interface<br />
customization process requires developing Hypertext Markup Language (HTML) and JavaScript [88] code<br />
embedded in C++. Mixing three languages in the same code carries a learning-curve cost: developers must<br />
learn two new languages, their syntax and best practices, as well as the testing and debugging methodology<br />
using a web browser.
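The mixing of languages described above can be illustrated with a self-contained sketch. A default()-style web-interface callback typically assembles HTML and JavaScript as C++ strings; renderStatusPage below is a made-up helper, not a TS framework or XDAQ method:

```cpp
#include <sstream>
#include <string>

// Build the HTML page a web-interface callback would stream back to the
// browser: C++ string handling, HTML structure and JavaScript behaviour
// all live in the same function.
std::string renderStatusPage(const std::string& cellName, int activeOperations) {
    std::ostringstream out;
    out << "<html><head><title>" << cellName << "</title>\n"
           "<script>function refreshPage() { window.location.reload(); }</script>\n"
           "</head><body>\n"
        << "<h1>Cell " << cellName << "</h1>\n"
        << "<p>Active operations: " << activeOperations << "</p>\n"
           "<button onclick=\"refreshPage()\">Refresh</button>\n"
           "</body></html>";
    return out.str();
}
```

Even in this short example, the developer must keep the C++ escaping, the HTML nesting and the JavaScript syntax consistent at the same time, which is exactly the cost the Ajaxell library is meant to remove.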
[Screenshot annotations: command execution control; operation execution control; possible events; fish-eye interface (logging, configuration database, support, …); control panels; operation parameters; monitoring information visualization; operation FSM.]<br />
Figure 4-16: Screenshot of the TS GUI. <strong>The</strong> GUI is accessible from a web browser and integrates the many<br />
services of the cell in a desktop-like fashion.<br />
Ajaxell [89] is a C++ library intended to smooth this learning curve. It provides a set of graphical<br />
objects named widgets, such as sliding windows, drop-down lists, tabs, buttons, dialog boxes and so on.<br />
<strong>The</strong>se widgets ease the development of web interfaces with a look-and-feel and responsiveness similar to the<br />
stand-alone tools executed locally or through remote terminals (Java Swing, Tcl/Tk or C++ Qt; see Section<br />
1.4.4). <strong>The</strong> web interface of the cell implemented in the CellAbstract::default() method uses the Ajaxell<br />
library. This is an out-of-the-box solution which does not require any additional development by the sub-systems.<br />
Figure 4-16 shows the TS GUI. It provides several controls: i) to execute cell commands; ii) to<br />
initialize, operate, and kill cell operations; iii) to visualize monitoring information retrieved from a monitor<br />
collector; iv) to access the logging record for audit trails and post-mortem analysis; v) to populate the L1<br />
trigger configuration database; vi) to request support; and vii) to download documentation.<br />
<strong>The</strong> cell web interface fulfills the requirement of automating the generation of a graphical user interface (Section<br />
4.2.2). <strong>The</strong> default TS GUI can be extended with “control panels”. A control panel is a sub-system specific<br />
graphical setup, normally intended for expert operations of the sub-system hardware. <strong>The</strong> control panel<br />
infrastructure allows developing expert tools with the TS framework. This possibility opens the door for the<br />
migration of existing standalone tools (Section 1.4.4) to control panels, and therefore contributes to the<br />
harmonization of the underlying technologies for both the expert tools and the TS. This homogeneous<br />
technological approach has the following benefits: i) smoothing the learning curve of the operators, ii)<br />
simplifying the overall L1 trigger OSWI maintenance, and iii) enhancing the sharing of code and<br />
experience.<br />
<strong>The</strong> implementation of a sub-system control panel is equivalent to developing a SubsystemPanel class which<br />
inherits from the CellPanel class (Figure 4-6). This development consists of defining the<br />
SubsystemPanel::layout() method following the guidelines of the TS framework user’s guide and using the<br />
widgets of the Ajaxell library [90]. <strong>The</strong> example of the Global <strong>Trigger</strong> control panel is presented in Section<br />
6.5.1.<br />
4.4.4.12 Monitoring infrastructure<br />
<strong>The</strong> monitoring infrastructure allows the users of a distributed control system implemented with the TS<br />
framework to be aware of the state of the cells or of any of their components (e.g. CellContext, CellOperation,<br />
etc.). Once a monitoring item is declared and defined for one of the cells, it can be retrieved from any node of the
system. <strong>The</strong> TS framework uses the monitoring infrastructure of XDAQ plus one additional class<br />
(DataSource) to assist in the definition of the code that updates the monitoring data. <strong>The</strong> monitoring<br />
infrastructure has the following characteristics:<br />
• An interface to declare and define monitoring items (integers, strings and tables).<br />
• Centralized collection of monitoring data coming from monitoring items that belong to different cells of the<br />
distributed system.<br />
• An HTTP/CGI interface in the central collector for consumers of monitoring data.<br />
• Visualization of monitoring item history through tables and graphs from the GUI of any cell.<br />
4.4.4.12.1 Model<br />
<strong>The</strong> XDAQ monitoring model is no longer based on FSMs as proposed in Section 3.3.3.4. Figure 4-17 shows a<br />
distributed monitoring system implemented with the TS framework. A central node known as monitor collector<br />
polls the monitoring information from each of the cells that have an associated monitor sensor. <strong>The</strong> monitor sensor<br />
forwards the requests to the cell and sends the updated monitoring information back to the collector. <strong>The</strong><br />
collector is responsible for storing this information and for providing a HTTP/CGI interface. <strong>The</strong> GUIs of the<br />
cells use the collector interface to read updated monitoring information from any cell.<br />
4.4.4.12.2 Declaration and definition of monitoring items<br />
<strong>The</strong> creation of monitoring items for a given cell consists of the monitoring items declaration and the monitoring<br />
update code definition. <strong>The</strong> declaration of a new monitoring item is accomplished by declaring this item in a<br />
XML file called “flashlist”. One of these files exists per cell. <strong>The</strong> declaration step also requires inserting the path<br />
to this file in the configuration file of the corresponding monitor sensor application and also of the central<br />
collector (Figure 4-18). <strong>The</strong> definition step consists of creating the update code of the monitoring items using the<br />
DataSource class. <strong>The</strong> following sections present one example.<br />
[Diagram omitted: cells with attached monitor sensors (one connected to an external system over PCI-to-VME), a central monitor collector with Mstore, Tstore and the monitoring DB (OCCI); sensors are polled over SOAP and data is served over HTTP.]<br />
Figure 4-17: Distributed monitoring system implemented in the TS framework. <strong>The</strong> monitor collector polls<br />
the cell sensor through the sensor SOAP interface, and the system cells read monitoring data stored in the<br />
collector using the HTTP/CGI interface.
[XML listing omitted: the markup was lost in extraction. <strong>The</strong> surviving content references the Subsystem cell, a flag set to true and the flashlist path ${XDAQ_ROOT}/trigger/subsystem/ts/client/xml/flashlist1.xml.]<br />
Figure 4-18: Sub-system cell configuration file configures cell sensor with one flashlist named flashlist1.xml.<br />
Declaration<br />
Figure 4-19 presents an example of a flashlist. This file declares three monitoring items: item1 of type string,<br />
item2 of type int (integer) and table of type table. <strong>The</strong> monitoring items belong to the items group (or<br />
“infospace”) named monitorsource (see below: definition of monitoring items). <strong>The</strong> name of the infospace is<br />
the same as the name of the DataSource descendant class that is used to define the update code of the monitoring<br />
items.<br />
A dedicated tag in the flashlist embeds the definition of the parameters that the monitor<br />
collector will use to poll monitoring information from the sensors. <strong>The</strong> most important attributes are:<br />
• Attribute every: Defines the sampling period in seconds.<br />
• Attribute history: If true, the monitor collector stores the history of past values.<br />
• Attribute range: Defines the size of the monitoring history in time units.<br />
Definition<br />
<strong>The</strong> classes involved in the definition of the monitoring item are shown in the UML diagram of Figure 4-20. <strong>The</strong><br />
monitor collector is responsible for periodically sending SOAP messages to the cell sensors requesting updated<br />
monitoring data. Each monitor sensor translates the SOAP request into an internal event that is forwarded to all<br />
objects created inside a given XDAQ executive that belong to a descendant class of xdata::ActionListener.
<strong>The</strong> DataSource class is a descendant of xdata::ActionListener. It is therefore able to process the incoming<br />
events by overriding the actionPerformed(xdata::Event&) method. This method is responsible for executing<br />
the MonitorableItem::refresh() method which gets the updated value for the monitoring item. A sub-system<br />
specific descendant of the DataSource is meant to contain the refresh methods for each of the monitoring items<br />
of the cell. <strong>The</strong> DataSource class is responsible also for creating the infospace object with the same name<br />
declared in the flashlist (Figure 4-19).<br />
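The registration-and-refresh pattern just described can be modelled in a few lines of self-contained C++. MonitorRegistry and its methods below are illustrative stand-ins for DataSource and MonitorableItem, not the framework API:

```cpp
#include <functional>
#include <map>
#include <string>

// Registry of named monitoring items, each paired with the callback that
// recomputes its value; refreshAll() plays the role of actionPerformed()
// reacting to a monitor sensor request.
class MonitorRegistry {
public:
    // Register an item together with the function that refreshes its value.
    void add(const std::string& name, std::function<std::string()> refresh) {
        items_[name] = std::move(refresh);
    }
    // Invoke every refresh handler and return the updated values.
    std::map<std::string, std::string> refreshAll() {
        std::map<std::string, std::string> values;
        for (auto& item : items_)
            values[item.first] = item.second();  // call the refresh handler
        return values;
    }
private:
    std::map<std::string, std::function<std::string()>> items_;
};
```

A sub-system would register one handler per item declared in its flashlist, typically reading the hardware driver kept in the cell context.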
[XML listing omitted: the markup was lost in extraction. <strong>The</strong> flashlist declares item1 (string), item2 (int) and table (table) in the monitorsource infospace.]<br />
Figure 4-19: Declaration of monitoring items using a flashlist.<br />
[UML diagram omitted: DataSource inherits from xdata::ActionListener and xdaq::Application; it holds monitorables_ (a std::map of MonitorableItem objects), infospaceName_ and infospace_, and implements actionPerformed(xdata::Event&). Each MonitorableItem holds name_, serializable_ and refreshFunctional_, and offers refresh(). CellAbstract, CellAbstractContext, SubsystemContext and the sub-system monitoring handlers complete the picture.]<br />
Figure 4-20: Components of the TS framework involved in the definition of monitoring items.
4.4.4.13 Logging infrastructure<br />
Each cell of a distributed control system implemented with the TS framework can send logging statements to a<br />
common logging database. Logging records can also be retrieved and visualized from any cell. Figure 4-21<br />
shows the logging model for a distributed control system implemented with the TS framework.<br />
<strong>The</strong> architecture of the data logging model consists of the following components:<br />
• Logging database: A relational database stores the logging information that is sent from the logging<br />
collector. <strong>The</strong> logging database is set up according to the schema proposed for the entire <strong>CMS</strong> experiment.<br />
• Logging collector: <strong>The</strong> logging collector is part of the R<strong>CMS</strong> framework (Section 4.4.2.7). It is a hub that<br />
accepts logging messages via UDP protocol 15 . <strong>The</strong> collector filters the logging messages by logging level, if<br />
necessary, and relays them to other applications, databases or other instances of logging collector.<br />
• Logging console: A XDAQ application named XS included with the Work Suite package (Section 4.4.3) is<br />
used as logging console to retrieve the logging information from the database. This application lists logging<br />
sessions according to their cell session identifier, i.e. the identifier of the session that a given<br />
controller has initiated with a distributed control system implemented with the TS framework. <strong>The</strong><br />
logging console is able to display the logging messages. In addition, the user can filter the logging messages<br />
of each session using keywords.<br />
• Logging Macros: <strong>The</strong> TS framework provides macros to notify a log from inside the command and<br />
operation transition methods. <strong>The</strong>se macros accept a cell session identifier, a logger object and a message<br />
string. <strong>The</strong> cell session identifier is accessible in any command and operation. <strong>The</strong> logger object is<br />
accessible from any class descendant of CellObject class (Section 4.4.4.2).<br />
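The record such a macro would emit can be sketched in a self-contained way as follows; formatLogRecord and the record layout are illustrative assumptions, not the framework's actual macro names or format:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Format one logging statement, stamped with the cell session identifier so
// the logging console can later group and filter records per session.
std::string formatLogRecord(const std::string& sessionId,
                            const std::string& logger,
                            const std::string& message) {
    std::ostringstream out;
    out << "[INFO][session " << sessionId << "][" << logger << "] " << message;
    return out.str();
}

// Macro in the spirit of the framework's logging macros: it takes the session
// identifier, a logger name and a message string.
#define CELL_LOG_INFO(sessionId, logger, message) \
    (std::cout << formatLogRecord((sessionId), (logger), (message)) << std::endl)
```

Inside a command or operation transition method, such a macro would be called with the session identifier and logger that the surrounding CellObject makes accessible.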
[Diagram omitted: cells send log statements over UDP to log collectors, which relay them to a Chainsaw console, an XML file and, through a further collector, to the logging DB (OCCI); the XS console reads the records over HTTP.]<br />
Figure 4-21: Logging model of a distributed control system implemented with the <strong>Trigger</strong> <strong>Supervisor</strong><br />
framework.<br />
15 User Datagram Protocol (UDP) is one of the core protocols (together with TCP) of the Internet protocol suite. Using UDP,<br />
programs on networked computers can send short messages sometimes known as datagrams to one another. UDP is<br />
sometimes called the Universal Datagram Protocol. UDP does not guarantee reliability or ordering in the way that the<br />
Transmission Control Protocol (TCP) does.
4.4.4.14 Start-up infrastructure<br />
<strong>The</strong> start-up infrastructure of the TS framework consists of one component, the job control (Section 1.4.1). This<br />
is a XDAQ application included as a component of the R<strong>CMS</strong> framework. <strong>The</strong> purpose of the job control<br />
application is to launch and terminate XDAQ executives. Job control is a small XDAQ application running on a<br />
XDAQ executive, which is launched at boot time. It exposes a SOAP API which allows launching further<br />
XDAQ executives, each with its own set of environment variables, and terminating them. A distributed system<br />
implemented with the TS framework has a job control application running at all times in every host of the<br />
cluster. In this context, a central process manager would coordinate the operation of all job control applications<br />
running in the cluster.<br />
4.5 Cell development model<br />
<strong>The</strong> TS framework, together with XDAQ and the external packages, forms the software infrastructure that<br />
facilitated the development of a single distributed software system to control and monitor all trigger sub-systems<br />
and sub-detectors. This section describes how to implement a cell to operate a given sub-system hardware. <strong>The</strong><br />
integration of this node into a complex distributed control and monitoring system is exemplified with the TS<br />
system presented in Chapter 5.<br />
[Diagram omitted: install framework, do cell, prepare cell context, prepare xhannels, then an iterative loop of do command, do operation, do monitoring item, do control panel, and compile &amp; test.]<br />
Figure 4-22: Usage model of the TS framework.<br />
Figure 4-22 schematizes the development model associated with the TS framework. It consists of a number of<br />
initial steps common to all control nodes, and an iterative process intended to customize the functionalities of the<br />
node according to the specific operation requirements.<br />
• Install framework: <strong>The</strong> TS and XDAQ frameworks have to be installed in the CERN Scientific Linux<br />
machine where the cell should run. <strong>The</strong> installation details are described in the <strong>Trigger</strong> <strong>Supervisor</strong><br />
framework user’s guide [63].<br />
• Do cell: Developing a cell consists of defining a class descendant of CellAbstract (Section 4.4.4.1).<br />
• Prepare cell context: <strong>The</strong> cell context, presented in Section 4.4.4.2, is an object shared among all<br />
CellObject objects that form a given cell. <strong>The</strong> CellAbstractContext object contains the Logger, the<br />
xhannels and the factories. <strong>The</strong> cell context can be extended in order to store sub-system specific shared<br />
objects like a hardware driver. To extend the cell context it is necessary to define a class descendant of<br />
CellAbstractContext (e.g. SubsystemContext in Figure 4-6). <strong>The</strong> cell context object has to be created in<br />
the cell constructor and assigned to the context_ attribute. <strong>The</strong> cell context attribute can be accessed from<br />
any CellObject object, for instance a cell command or operation.<br />
• Prepare xhannel list file: <strong>The</strong> preparation of the xhannel list consists of defining the external web service<br />
providers that will be used by the cell: other cells, Tstore application to access the configuration database or<br />
any other XDAQ application (Section 4.4.4.9). Once the cell is running, the xhannels are accessible through<br />
the cell context object.
Performance and scalability measurements 63<br />
• Do plug-in: Additional cell operations (Section 4.4.4.3), commands (Section 4.4.4.2), monitoring items<br />
(Section 4.4.4.12) and control panels (Section 4.4.4.11) can be gradually implemented when they are<br />
required. <strong>The</strong> details are described in the corresponding sections and in the TS framework user’s guide [63].<br />
4.6 Performance and scalability measurements<br />
This section presents performance and scalability measurements of the TS framework. This discussion focuses<br />
on the most relevant framework factors that affect the ability to build a distributed control system complex<br />
enough to cope with the operation of O(10^2) VME crates and assuming that each crate is directly operated by one<br />
cell. <strong>The</strong>se factors are the remote execution of cell commands and operations using the TS SOAP API (Appendix<br />
A). <strong>The</strong> measurements are neither meant to evaluate external developments (i.e. monitoring, database, logging<br />
and start-up infrastructures) nor the responsiveness of the TS GUI which was presented in [90].<br />
4.6.1 Test setup<br />
Timing and scalability tests have been carried out in the <strong>CMS</strong> PC cluster installed in the underground cavern.<br />
<strong>The</strong> tests ran on 20 identical rack-mounted PCs (Dell PowerEdge SC2850, 1U dual Xeon 3 GHz, hyper-threading<br />
and 64-bit capable) equipped with 1 GB memory and connected to the Gigabit Ethernet private<br />
network of the <strong>CMS</strong> cluster. All hosts run CERN Scientific Linux version 3.0.9 [91] with kernel version<br />
2.4.21.40.EL.cernsmp and version 1.3 of the <strong>Trigger</strong> <strong>Supervisor</strong> framework.<br />
<strong>The</strong> most relevant factors of the cell commands and operations are presented. In order to evaluate the scalability<br />
of each factor under test, five distributed control system configurations have been set up. Table 4-3<br />
summarizes the setups.<br />
Setup name | # of hosts | # of level-0 cells | # of level-1 cells | # of level-2 cells | Total # of cells | Notes<br />
Central | 1 | 1 | 0 | 0 | 1 |<br />
Central_10Level1 | 11 | 1 | 10 | 0 | 11 |<br />
Central_10Level1_10Level2 | 20 | 1 | 10 | 10 | 21 | Level-2 cells are all in the same branch<br />
Central_10Level1_20Level2 | 20 | 1 | 10 | 20 | 31 | Level-2 cells are distributed in 2 branches<br />
Central_10Level1_100Level2 | 20 | 1 | 10 | 100 | 111 | Level-2 cells are equally distributed in 10 branches<br />
Table 4-3: System configuration setups.<br />
Each table row specifies a test setup. A test setup consists of a number of cells organized in a hierarchical way.<br />
<strong>The</strong>re is always 1 level-0 cell or central cell which coordinates the operation of up to 10 level-1 cells and,<br />
depending on the setup, the level-1 cells also coordinate a number of level-2 cells. Figure 4-23 presents the example<br />
of the Central_10Level1_20Level2 setup architecture. This setup consists of 1 central cell, 10 level-1 cells<br />
controlled by the central cell, and 20 level-2 cells, 10 controlled by the first level-1 cell and 10 by the second.<br />
4.6.2 Command execution<br />
This section measures the remote execution of cell commands. This study has been carried out with the<br />
central_10Level1 setup. <strong>The</strong>se tests measure the necessary time for the central cell to remotely execute a number<br />
of commands in the first level-1 cell. Each measurement starts when the first request message is sent from the central<br />
cell and finishes when the last reply arrives.<br />
<strong>The</strong> first exercise measures the time to execute commands which have a code() method that does nothing.<br />
Figure 4-24 shows the test results.
[Diagram omitted: the central cell talks SOAP (CellXhannelCell) and HTTP/CGI to level-1 cells 1 to 10; level-1 cells 1 and 2 each control a branch of 10 level-2 cells.]<br />
Figure 4-23: Central_10Level1_20Level2 test setup architecture, consisting of 1 central cell, 10 level-1 cells<br />
controlled by the central cell, and 20 level-2 cells, 10 controlled by the first level-1 cell and 10 by the second.<br />
<strong>The</strong> first conclusion that can be extracted from Figure 4-24 is that in both synchronous and asynchronous<br />
communication cases, the execution time scales linearly. A second conclusion is that there is a small time<br />
overhead due to the asynchronous protocol. For instance, the execution of 256 commands in synchronous mode<br />
takes 1.81 seconds whilst the execution of the same number of commands in asynchronous mode takes 1.94<br />
seconds. This overhead is due to the additional complexity of handling the asynchronous protocol in both the<br />
client (central cell) and the server (first level-1 cell). In synchronous mode the average time to execute a<br />
command is 7 ms, slightly better than the 7.7 ms obtained in asynchronous mode.<br />
[Plot omitted: “Remote command execution with delta = 0”; execution time in seconds (0 to 2.5) versus number of messages (0 to 300), for synchronous and asynchronous SOAP.]<br />
Figure 4-24: Summary of performance tests to study the remote execution of cell commands between the<br />
central cell and a level-1 cell.<br />
However, this overhead becomes negligible when the performance test presents a more realistic<br />
scenario. In this scenario the remote command executes a delay (delta). This delay in the code() method<br />
emulates for instance a hardware configuration sequence or a database access. Figure 4-25 summarizes the<br />
results of performance tests intended to study the remote execution of 256 cell commands between the central<br />
cell and a level-1 cell in synchronous and asynchronous mode (Y axis) and as a function of delta (X axis).<br />
<strong>The</strong> results in synchronous mode increase approximately linearly with the level-1 cell command delay (delta)<br />
whilst the results in asynchronous mode remain constant when delta increases. <strong>The</strong> performance advantage is
[Plot omitted: “Remote execution of 256 commands as a function of delta”; execution time in seconds (0 to 30) versus delta time in seconds (0 to 0.12), for synchronous and asynchronous SOAP.]<br />
Figure 4-25: Summary of performance tests to study the remote execution of 256 cell commands between<br />
the central cell and a level-1 cell in synchronous and asynchronous mode.<br />
visible down to 2 messages and for deltas as small as 20 milliseconds. This demonstrates the suitability of the<br />
asynchronous protocol for improving the overall performance of a given controller. This feature is particularly<br />
valuable during the configuration of the trigger sub-systems because the asynchronous protocol allows<br />
starting the configuration process in parallel in all the trigger sub-systems. <strong>The</strong>refore, the overall configuration<br />
time is approximately the configuration time of the slowest sub-system rather than the sum of all<br />
configuration times.<br />
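The parallel fan-out that yields this "slowest sub-system" bound can be sketched with plain threads. This is a self-contained illustration of the idea, not TS framework code:

```cpp
#include <functional>
#include <thread>
#include <vector>

// Start every configuration task concurrently and wait for all of them:
// the total wall-clock time is approximately the duration of the slowest
// task, not the sum of all task durations.
void configureAllInParallel(const std::vector<std::function<void()>>& tasks) {
    std::vector<std::thread> workers;
    workers.reserve(tasks.size());
    for (const auto& task : tasks) workers.emplace_back(task);
    for (auto& worker : workers) worker.join();
}
```

In the TS, the asynchronous SOAP protocol plays the role of the threads: the central cell dispatches all configuration requests before waiting for any reply.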
4.6.3 Operation instance initialization<br />
This section discusses the performance and scalability of the cell operation initialization. <strong>The</strong> test setups used for<br />
these measurements are: Central_10Level1, Central_10Level1_10Level2, Central_10Level1_20Level2 and<br />
Central_10Level1_100Level2. Each test consists of measuring the overall time necessary to initialize an<br />
operation in each node of the configuration setup. <strong>The</strong> measurement includes the operation initialization in the<br />
central cell plus the remote initialization in the sibling cells. <strong>The</strong> test finishes when the last reply message arrives<br />
at the central cell.<br />
Figure 4-26: Total time to initialize an operation instance in all cells of a setup as a function of the number<br />
of cells.<br />
Figure 4-26 shows the results of measuring the total time to initialize an operation instance in each cell as a<br />
function of the number of cells in the setup, and Figure 4-27 shows the same measurement as a function of the<br />
number of cell levels in the setup. <strong>The</strong> tests were only done in the synchronous case because the operation<br />
initialization request is only available in synchronous mode (cell blocked). This interface constraint was set in<br />
order to assure that no operation events were received before the operation instance was created.<br />
Figure 4-27: Total time to initialize an operation instance in all cells of a setup as a function of the number<br />
of cell levels. It is interesting to note that, due to the synchronous protocol, the number of cells in the setup<br />
defines the total initialization time. E.g. the Central_10Level1_20Level2 and Central_10Level1_100Level2 setups<br />
have different total initialization times despite having the same number of levels (3).<br />
<strong>The</strong> results show that the average time to initialize a cell operation is 13.4 ms. We can also conclude that the<br />
overall time to initialize one operation in each cell scales linearly with the number of cells, independently of the<br />
number of cell levels.<br />
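This linear behaviour can be written down as a rough model (an extrapolation from the 13.4 ms average quoted above, not TS code):

```python
# Rough linear model of the total operation initialization time (assumed,
# not TS code): with a blocking synchronous request the times of the
# individual cells add up, regardless of how cells are arranged in levels.

PER_CELL_S = 0.0134  # average initialization time per cell (13.4 ms)

def total_init_time(n_cells):
    return n_cells * PER_CELL_S

# Approximate cell counts of the four test setups (central cell + siblings).
for n_cells in (11, 21, 31, 111):
    print(n_cells, "cells ->", round(total_init_time(n_cells), 3), "s")
```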
4.6.4 Operation state transition<br />
This section discusses the performance and scalability of the cell operation transition. <strong>The</strong> test setups used for<br />
these measurements are again: Central_10Level1, Central_10Level1_10Level2, Central_10Level1_20Level2 and<br />
Central_10Level1_100Level2. Each test consists of measuring the overall time necessary to execute an operation<br />
transition in each node of the configuration setup. <strong>The</strong> measurement includes the operation transition in the<br />
central cell plus the remote execution of an operation transition in the sibling cells. <strong>The</strong> test finishes when the<br />
last reply message arrives at the central cell. All cell operation transition methods have an internal delay of 1<br />
second. This time lapse, defined in milliseconds, is called “delta” and is meant to emulate a hardware<br />
configuration sequence and/or a database access.<br />
Figure 4-28: Total time to execute an operation transition in all cells of a setup as a function of the number<br />
of cells and in synchronous mode.<br />
Figure 4-28 shows the results of measuring the total time to execute an operation transition in all cells of a setup<br />
as a function of the number of cells in the setup and in synchronous mode. This figure shows that, in<br />
synchronous mode, the overall execution time scales linearly with the number of cells and is therefore<br />
independent of the number of cell levels, as shown in Figure 4-29.<br />
Figure 4-29: Total time to execute an operation transition in all cells of a setup as a function of the cell levels<br />
and in synchronous mode. It is interesting to note that, due to the synchronous protocol, the number of cells<br />
in the setup defines the total execution time. E.g. the Central_10Level1_20Level2 and<br />
Central_10Level1_100Level2 setups have different total execution times despite having the same number of<br />
levels (3).<br />
Figure 4-30 shows the results of measuring the total time to execute an operation transition in all cells of a setup<br />
as a function of the number of cells and in asynchronous mode. This figure shows that, in asynchronous mode,<br />
the overall execution time is, for all test cases, much shorter than in the synchronous case. The overall time<br />
equals the sum of the worst case in each level (1 second per level of the test setup). Figure 4-31 shows that in<br />
asynchronous mode the overall execution time scales linearly with the number of levels.<br />
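The two scaling laws can be summarised in a small sketch (assumed model, not TS code), where every transition method contributes a fixed delta of 1 second:

```python
# Assumed scaling model for operation transitions (not TS code): each
# transition method has an internal delay ("delta") of 1 second.

DELTA = 1.0  # seconds per cell transition

def sync_transition_time(n_cells):
    # Synchronous mode: cells execute their transitions one after another.
    return n_cells * DELTA

def async_transition_time(n_levels):
    # Asynchronous mode: all cells of a level run in parallel, and levels
    # run in sequence, so only the number of levels matters.
    return n_levels * DELTA

print(sync_transition_time(111))  # Central_10Level1_100Level2, synchronous
print(async_transition_time(3))   # the same setup, asynchronous
```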
Figure 4-30: Total time to execute an operation transition in all cells of a setup as a function of the number<br />
of cells and in asynchronous mode.
Figure 4-31: Total time to execute an operation transition in all cells of a setup as a function of the number<br />
of cell levels and in asynchronous mode.
Chapter 5<br />
<strong>Trigger</strong> <strong>Supervisor</strong> System<br />
5.1 Introduction<br />
<strong>The</strong> TS system is a distributed software system, initially outlined in the TS conceptual design chapter (Section<br />
3.3). It consists of a set of nodes and the communication channels among them. <strong>The</strong> TS system is designed to<br />
facilitate a stable platform, despite hardware and software upgrades, on top of which the TS services can be<br />
implemented following a well defined methodology. This approach implements the “flexibility” non-functional<br />
requirement discussed in Section 3.2.2, Point 6).<br />
This chapter is organized in the following sections: Section 5.1 is the introduction; in Section 5.2 the system<br />
design guidelines are discussed; in Section 5.3 the system building blocks, the sub-system integration strategies<br />
and an overview of the system architecture are presented; Section 5.4 describes the TS control, monitoring,<br />
logging and start-up systems. Finally, the service development process associated with the TS system is<br />
discussed in Section 5.5.<br />
5.2 Design guidelines<br />
<strong>The</strong> TS system design principles, presented in this section, have two main sources of inspiration: i) the software<br />
infrastructure presented in Chapter 4, which consists of a number of external packages, the XDAQ middleware<br />
and the TS framework; ii) the functional and non-functional requirements described in the TS conceptual design,<br />
with special attention to the “human context awareness” non-functional requirement (Section 3.2.2, Point 7),<br />
which already guided the design decisions of the TS framework.<br />
5.2.1 Homogeneous underlying infrastructure<br />
<strong>The</strong> design of the TS system is solely based on the software infrastructure presented in Chapter 4, which consists<br />
of a number of external packages, the XDAQ middleware and the TS framework. A homogeneous underlying<br />
software infrastructure simplifies the support and maintenance tasks during the integration and operational<br />
phases. Moreover, the concrete usage of the TS framework was encouraged in order to profit from a number of<br />
facilities designed and developed to fulfill additional functional requirements and to cope with the project human<br />
factors and the reduced development time (Section 4.2.2).<br />
5.2.2 Hierarchical control system architecture<br />
<strong>The</strong> TS control system has a hierarchical topology with a central cell that coordinates the operation of the lower<br />
level sub-system central cells. <strong>The</strong>se second level cells are responsible for operating the sub-system crate or for<br />
coordinating a third level of sub-system cells that finally operate the sub-system crates. A hierarchical TS control<br />
system eases the implementation of the following system level features:
1) Distributed development: Each sub-system always has one central cell exposing a well defined interface.<br />
This cell hides the implementation details of the sub-system control infrastructure from the TS central cell.<br />
This approach simplifies the role of the TS system coordinator, who then just needs to worry about the<br />
interface definition exposed by each sub-system central cell. <strong>The</strong> respective sub-system software maintainer<br />
takes care of implementing this interface. At the sub-system level, the development of the sub-system control<br />
infrastructure is further divided into smaller units following the same approach. This development<br />
methodology eased the central coordination tasks by dividing the overall system complexity into much<br />
simpler sub-systems which could be developed with minimal central coordination.<br />
2) Sub-system control: <strong>The</strong> hierarchical design facilitates the independent operation of a given sub-system.<br />
This is possible by operating the corresponding sub-system central cell interface. This feature fulfills the<br />
non-functional requirement outlined in Section 3.2.2, Point 2).<br />
3) Partial deployment: <strong>The</strong> hierarchical design simplifies the partial deployment of the TS system by just<br />
deploying certain branches of the TS system. This is useful, for instance, to create a sub-system test setup.<br />
4) Graceful degradation: <strong>The</strong> hierarchical design facilitates a graceful degradation in line with the<br />
“Robustness” non-functional requirement stated in Section 3.2.2, Point 4). If something goes wrong during<br />
the system operation, only one branch of the hierarchy needs to be restarted.<br />
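The branch-wise forwarding behind these four features can be illustrated with a minimal sketch (hypothetical Python classes and cell names, not the TS implementation):

```python
# Hypothetical sketch of the hierarchical control idea: a command sent to
# a sub-system's central cell is forwarded down its branch only, so one
# branch can be operated, deployed or restarted independently.

class Cell:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def execute(self, command, log):
        log.append((self.name, command))   # act locally first
        for child in self.children:        # then forward down the branch
            child.execute(command, log)

dttf = Cell("DTTF", [Cell("TFC1"), Cell("TFC2")])
central = Cell("Central", [dttf, Cell("GT")])

log = []
dttf.execute("configure", log)  # operate just the DTTF branch
print([name for name, _ in log])  # ['DTTF', 'TFC1', 'TFC2']
```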
5.2.3 Centralized monitoring, logging and start-up systems architecture<br />
<strong>The</strong> TS framework uses the monitoring, logging and start-up infrastructure provided by the XDAQ middleware<br />
and the R<strong>CMS</strong> framework. This infrastructure is characterized by enforcing a centralized architecture. <strong>The</strong>refore,<br />
the TS monitoring, logging and start-up systems cannot be purely hierarchical systems, as proposed in Section<br />
3.3.3, due to the trade-off of reusing existing components.<br />
5.2.4 Persistency infrastructure<br />
<strong>The</strong> TS system requires a database infrastructure to store and retrieve configuration, monitoring and logging<br />
information. <strong>The</strong> following points present the design guidelines for this infrastructure.<br />
5.2.4.1 Centralized access<br />
A <strong>CMS</strong> wide architectural decision enforces the centralization of common services to access the persistency<br />
infrastructure. <strong>The</strong>se common access points should facilitate a simple interface to the persistency infrastructure<br />
and should be responsible for managing the connections to the persistency server. <strong>The</strong> <strong>CMS</strong> database task force<br />
recommends using one single Tstore (Section 4.4.4.9.2) application for all nodes of the TS system.<br />
5.2.4.2 Common monitoring and logging databases<br />
<strong>The</strong> TS monitoring and logging systems (Sections 5.4.2 and 5.4.3) are based on XDAQ and R<strong>CMS</strong><br />
infrastructure. In this context, single monitor and logging collector applications periodically gather the<br />
monitoring and logging information, respectively, and facilitate an HTTP/CGI interface to any possible<br />
information consumer. <strong>The</strong>se collectors are also responsible for storing the gathered information in the L1 trigger<br />
monitoring and logging databases. <strong>The</strong>se two databases are common to all L1 trigger sub-systems.<br />
5.2.4.3 Centralized maintenance<br />
All TS databases are maintained in the central <strong>CMS</strong> database server (Oracle database 10g Enterprise Edition<br />
Release 10.2.0.2, [92]) which is under the responsibility of the <strong>CMS</strong> and the CERN-IT database services.<br />
5.2.5 Always on system<br />
<strong>The</strong> TS configuration and monitoring services are used to operate the L1 trigger when the experiment is running<br />
but are also used during the integration, commissioning and test operations of the L1 trigger in standalone mode.<br />
In addition, the TS services to test each of the L1 trigger sub-systems and to check the inter sub-system
connections and synchronization are required outside the experiment running periods. <strong>The</strong>refore, the TS system<br />
should always be available.<br />
5.3 Sub-system integration<br />
Figure 5-1 shows an overview of the TS system with the central node controlling twelve TS nodes, one per<br />
sub-system, including all L1 trigger sub-systems and sub-detectors: the Global <strong>Trigger</strong> (GT), the Global Muon<br />
<strong>Trigger</strong> (GMT), the Drift Tube Track Finder (DTTF), the Cathode Strip Chamber Track Finder (CSCTF), the<br />
Global Calorimeter <strong>Trigger</strong> (GCT), the Regional Calorimeter <strong>Trigger</strong> (RCT), the Electromagnetic Calorimeter<br />
(ECAL), the Hadronic Calorimeter (HCAL), the Drift Tube Sector Collector (DTSC), the Resistive Plate<br />
Chambers (RPC), the Tracker and the Luminosity Monitoring System (LMS). This is the entry point for any<br />
controller that wishes to access only sub-system specific services. For some sub-systems, an additional level of<br />
TS nodes can be controlled by the sub-system central node.<br />
Figure 5-1: Overview of the <strong>Trigger</strong> <strong>Supervisor</strong> system.<br />
5.3.1 Building blocks<br />
<strong>The</strong> following sections present the building blocks used to build the TS system. <strong>The</strong> main role is played by the<br />
cell. In addition, the XDAQ and R<strong>CMS</strong> frameworks contribute a number of secondary elements.<br />
5.3.1.1 <strong>The</strong> TS node<br />
<strong>The</strong> TS node, shown in Figure 5-2, is the basic unit of a distributed system implemented with the TS framework.<br />
It has three main components: the cell, the monitor sensor and the job control. <strong>The</strong> cell is the element that has to<br />
be customized (Section 4.5); the monitor sensor is an XDAQ application intended to interact with the monitor<br />
collector, forwarding update requests to the cell and sending the updated monitoring information back to the<br />
monitor collector (Section 4.4.4.12). Finally, the job control is a building block of the start-up system (Section<br />
4.4.4.14).<br />
<strong>The</strong> cell has two input ports exposing respectively the cell SOAP (s) and HTTP/CGI (h) interfaces and four<br />
output ports corresponding to the monitoring (mx), database (dx), cell (cx) and XDAQ (xx) xhannels (Section<br />
4.4.4.9). <strong>The</strong> functionality of the cell is meant to be customized according to the specific needs of each sub-system.<br />
<strong>The</strong> customization process consists of implementing control panel, command and operation plug-ins, and<br />
adding monitoring items (Section 4.5). Those cells intended to directly control a sub-system crate should also<br />
embed the sub-system crate hardware driver (Section 4.5).<br />
Figure 5-2: Components of a TS node. (s: SOAP interface, h: HTTP/CGI interface, xe: XDAQ executive,<br />
op: Operation plug-ins, c: Command plug-ins, m: monitoring item handlers, d: hardware driver, cp: control<br />
panel plug-in).<br />
<strong>The</strong> sub-system cells are meant to act as abstractions of the corresponding sub-system hardware. <strong>The</strong>se black<br />
boxes expose a stable SOAP API regardless of hardware and/or software upgrades. This facilitates a stable<br />
platform on top of which the TS services (Chapter 6) can be implemented. This approach allows largely<br />
decoupling the evolution of sub-system hardware and software platforms from changes in the operation<br />
capabilities offered by the TS.<br />
5.3.1.2 Common services<br />
<strong>The</strong> common services of the TS system, shown in Figure 5-3, are unique nodes of the distributed system which<br />
are used by all TS nodes. <strong>The</strong>se nodes are the logging collector, the Tstore, the monitor collector and the Mstore.<br />
Figure 5-3: Common service nodes. (tc: Tomcat server, u: UDP interface, x: XML local file, j: JDBC<br />
interface, xe: XDAQ executive, s: SOAP interface, o: OCCI interface, h: HTTP/CGI interface).<br />
5.3.1.2.1 Logging collector<br />
<strong>The</strong> logging collector or log collector [85] is a software component that belongs to the R<strong>CMS</strong> framework. It is a<br />
web application written in Java and running on a Tomcat server. It is designed and developed to collect logging<br />
information from log4j compliant applications and to distribute these logs to several consumers. <strong>The</strong>se<br />
consumers can be: an Oracle database, files, other log collectors or a real time message system. <strong>The</strong> log<br />
collector is part of the TS logging infrastructure (Section 4.4.4.13).<br />
5.3.1.2.2 Tstore<br />
<strong>The</strong> Tstore is an XDAQ application delivered with the XDAQ Power Pack package. Tstore provides a SOAP<br />
interface which allows reading and writing XDAQ table structures in an Oracle database (Section 4.4.4.9.2). <strong>The</strong><br />
<strong>CMS</strong> DataBase Working Group (DBWG) stated that having one single Tstore application for all cells of the TS<br />
system already assures a suitable management of the database connections.<br />
5.3.1.2.3 Monitor collector<br />
<strong>The</strong> monitor collector is also an XDAQ application delivered with the XDAQ Power Pack package. This XDAQ<br />
application periodically pulls from all TS system sensors the monitoring information of all items declared in the<br />
sub-system flashlist files. <strong>The</strong> collection of each flashlist can be performed at regular intervals by providing the<br />
collector with a snapshot of the corresponding data values at retrieval time. Optionally, a history of data values<br />
can be buffered in memory at the collector node. This buffered data can be made persistent for later retrieval. <strong>The</strong><br />
interface between sensor and collector is SOAP. <strong>The</strong> collector also provides an HTTP/CGI interface to read the<br />
monitoring information coming from the whole TS system. <strong>The</strong> monitor collector is part of the TS monitoring<br />
infrastructure (Section 4.4.4.12).<br />
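The pull cycle can be modelled roughly as follows (an illustrative sketch with invented names, not the XDAQ monitoring API):

```python
# Illustrative model of the monitor collector's pull cycle (not XDAQ code):
# the collector pulls a snapshot of every sensor's flashlist items and
# optionally keeps a bounded history of past values in memory.
from collections import deque

class Sensor:
    def __init__(self, items):
        self.items = items
    def snapshot(self):
        return dict(self.items)  # current values of the declared items

class Collector:
    def __init__(self, sensors, history_depth=10):
        self.sensors = sensors
        self.history = deque(maxlen=history_depth)  # bounded buffer
    def pull(self):
        flashlists = {name: s.snapshot() for name, s in self.sensors.items()}
        self.history.append(flashlists)  # could later be made persistent
        return flashlists

sensors = {"GT": Sensor({"temperature": 41.5}),
           "DTTF": Sensor({"crate_on": True})}
collector = Collector(sensors)
print(collector.pull())
```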
5.3.1.2.4 Mstore<br />
<strong>The</strong> Mstore application is an XDAQ application delivered with the Work Suite package of XDAQ. This<br />
application takes flashlist data from a monitor collector and forwards it to a Tstore application for persistent<br />
storage in a database.<br />
5.3.2 Integration<br />
All sub-systems use the same building blocks, presented in Section 5.3.1, to integrate with the TS system.<br />
However, each sub-system follows a particular integration model which depends on a number of parameters<br />
related to either the sub-system Online SoftWare Infrastructure (OSWI) or to the sub-system hardware setup.<br />
This section presents the definition of all integration parameters, the description of the most relevant integration<br />
models and finally a summary of all the integration exercises.<br />
5.3.2.1 Integration parameters<br />
This section presents the sub-system infrastructure parameters which were relevant during the integration<br />
process with the TS system. <strong>The</strong>se have been separated into those related to the OSWI and those related to the<br />
sub-system hardware setup.<br />
5.3.2.1.1 OSWI parameters<br />
Usage of HAL<br />
This parameter defines the low level software infrastructure to access the sub-system custom hardware boards.<br />
<strong>The</strong> <strong>CMS</strong> recommendation to access VME boards is the Hardware Access Library (HAL [53]). HAL is a library<br />
that provides user-level access to VME and PCI modules in the C++ programming language. Most of the<br />
sub-systems follow the <strong>CMS</strong> recommendation to access VME boards, with the exception of the RCT and the GCT. In<br />
the GCT case, board control is provided by a USB interface and the GCT software infrastructure uses a USB<br />
access library. In the RCT case, a sub-system specific driver and user level C++ libraries were developed.<br />
C++ API<br />
On top of HAL or the sub-system specific hardware access library or driver, most of the sub-systems have<br />
developed a C++ library which offers a high level C++ API to control the hardware from a functional point of<br />
view.<br />
XDAQ application<br />
Some sub-systems have developed their own XDAQ application to remotely operate their hardware setups<br />
(Section 1.4.4). In some of these cases the sub-system XDAQ application is the visible interface to the hardware<br />
from the point of view of the cell.<br />
Scripts<br />
In addition to the compiled applications (i.e. C++ and XDAQ applications), some sub-systems have opted for an<br />
additional degree of flexibility by enhancing their OSWI with interpreted scripts. Python and HAL sequences are<br />
being used. Scripts are used to define test procedures but also to define configuration sequences. <strong>The</strong>se<br />
configuration scripts used to mix the configuration code with the configuration data. In the final system,<br />
configuration data is retrieved separately from the configuration database. However, during the commissioning<br />
phase, some sub-systems retrieve configuration scripts from the configuration database. This is an acceptable
practice because it helps to decouple the continuous firmware updates from the maintenance of a consistent<br />
configuration database.<br />
5.3.2.1.2 Hardware setup parameters<br />
Bus adapter<br />
From the hardware point of view, the L1 trigger sub-system hardware is hosted in VME crates controlled by an<br />
x86/Linux machine. With few exceptions, the interface between the PC and the VME crate is done with a PCI to<br />
VME bus adapter [93].<br />
Hardware crate types and number<br />
<strong>The</strong>se parameters tell us how many different types of crates and how many units of each type have to be<br />
controlled. It was decided to have a one-to-one relationship between cells and crates. In other words, each cell<br />
controls one single crate and each crate is controlled by only one cell. This approach enhances the reusability of<br />
the same sub-system cells in different hardware setups. For instance:<br />
1) During the debugging phases in the home institute laboratory, and during the initial commissioning<br />
exercises, when just one or a few crates were available, a single cell controlling one single crate was developed<br />
in order to enhance the board debugging process. Afterwards, this cell was reused as a part of a more<br />
complex control system.<br />
2) During the system deployment in its final location, when the complete hardware setup must be controlled,<br />
all individual cells implemented during the debugging and commissioning exercises were reused and<br />
integrated into the corresponding sub-system control system.<br />
Exceptions to this rule are the GT, the GMT and the RPC integration models. Board level cells were discarded<br />
due to the higher complexity of the resulting distributed control system: the control of one single crate with a<br />
number of boards would already require a central cell coordinating the operations of as many cells as boards.<br />
Hardware crate sharing<br />
This parameter tells us whether or not a given sub-system crate is shared by more than one sub-system. This has<br />
to be taken into account because sharing a crate also means sharing the bus adapter.<br />
5.3.2.2 Integration cases<br />
<strong>The</strong> TS sub-systems presented in the following sections are examples of the main different integration cases.<br />
Each integration case corresponds to a different L1 trigger sub-system or sub-detector, and it is defined by the<br />
parameters presented in Sections 5.3.2.1.1 and 5.3.2.1.2. <strong>The</strong> result of each integration case is a set of building<br />
blocks and the communication channels among them.<br />
5.3.2.2.1 Cathode Strip Chamber Track Finder<br />
<strong>The</strong> hardware setup of the CSCTF is one single VME crate controlled by a PCI to VME bus adapter. <strong>The</strong> OSWI<br />
consists of C++ classes built on top of the HAL library. <strong>The</strong>se classes offer a high level abstraction of the VME<br />
boards and facilitate their configuration and monitoring.<br />
<strong>The</strong> integration model for the CSCTF represents the simplest integration case. One single cell running in the<br />
CSCTF host was enough. <strong>The</strong> customization process of the CSCTF cell is based on using the C++ classes of the<br />
CSCTF OSWI to operate the crate.<br />
5.3.2.2.2 Global <strong>Trigger</strong> and Global Muon <strong>Trigger</strong><br />
<strong>The</strong> integration of the GT and the GMT represents a special case because, despite being two different<br />
sub-systems, they share the same crate.<br />
<strong>The</strong> integration model followed in this concrete case, shown in Figure 5-4, contradicts the rule of one cell per<br />
crate. In this case two cells access the same crate. Compared to the single cell integration model, this approach<br />
has several advantages:<br />
1) Smaller complexity: During the initial development process, we realized that the overall complexity of two<br />
individual cells was smaller than the complexity of one single cell. <strong>The</strong>refore, this solution was easier to<br />
maintain.<br />
2) Enhanced distributed development: <strong>The</strong> development work to integrate the GT and GMT sub-systems can<br />
be more easily split between two different developers working independently.<br />
3) Homogeneous architecture: <strong>The</strong> Interconnection test service between GT and GMT can be logically<br />
implemented like any other interconnection test service between two sub-systems hosted in different crates.<br />
Concerning the OSWI, it consists of C++ classes built on top of HAL. <strong>The</strong>refore, the definition of the cells’<br />
command and operation transition methods is based on using this API.<br />
Figure 5-4: Integration model used by the GT and GMT.<br />
5.3.2.2.3 Drift Tube Track Finder<br />
<strong>The</strong> DTTF hardware setup consists of six identical track finder crates, one central crate and one clock crate. Due<br />
to limitations of the device driver specifications, it is not possible to have more than three PCI to VME interfaces<br />
per host. <strong>The</strong>refore, the six track finder crates are controlled by two hosts. An additional host controls the clock<br />
crate and the central crate. <strong>The</strong> OSWI is based on C++ classes built on top of HAL.<br />
Figure 5-5 shows the integration model followed by the DTTF. As usual, each crate is controlled by one cell.<br />
<strong>The</strong>re are four different cells: 1) track finder cell (TFC) which is in charge of controlling a track finder crate, 2)<br />
clock crate cell (CKC), 3) the central crate cell (CCC) and 4) the DTTF central cell (DCC) which is in charge of<br />
coordinating the operation of all other cells. <strong>The</strong> DCC provides a single access point to operate all DTTF crates<br />
and simplifies the implementation of the TS central cell.<br />
<strong>The</strong> customization process of the DTTF crate cells (i.e. TFC, CKC and CCC) uses the C++ class libraries of the<br />
DTTF OSWI. <strong>The</strong>refore, all crate cells must run in the same hosts where the PCI to VME interfaces are plugged in.<br />
Figure 5-5: Integration model for the DTTF.<br />
5.3.2.2.4 Resistive Plate Chamber<br />
<strong>The</strong> OSWI of the RPC <strong>Trigger</strong> system consists of three different XDAQ applications that are used to control<br />
three different types of crates: 1) twelve RPC <strong>Trigger</strong> crates, 2) one RPC Sorter crate and 3) one RPC CCS/DCC<br />
crate.<br />
<strong>The</strong> integration model of the RPC with the TS is shown in Figure 5-6. In this case, the hardware interface is<br />
facilitated by XDAQ applications and these applications are operated by one cell, the RPC cell.<br />
Figure 5-6: RPC integration model.<br />
5.3.2.2.5 Global Calorimeter <strong>Trigger</strong><br />
<strong>The</strong> Global Calorimeter <strong>Trigger</strong> (GCT) hardware setup consists of one main crate and three data source card<br />
crates. <strong>The</strong> particularity of this hardware setup is that all boards are controlled independently through a USB
interface. <strong>The</strong>refore, it is possible to control the four crates from one single host because the limitation of the<br />
CAEN driver does not exist.<br />
<strong>The</strong> OSWI consists of a C++ class library, a Python language extension and XDAQ applications. <strong>The</strong> low-level<br />
OSWI for both the data source crates and the main crate is based on a C++ class library built on top of a USB<br />
driver. A second component of the GCT software is the Python extension, which allows Python<br />
programs to be written to create complex configuration and test sequences or simple hardware debugging routines<br />
without having to compile C++ code. <strong>The</strong> third component is an XDAQ application which allows remote access<br />
to the boards in the data source crates.<br />
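The role of the Python extension can be illustrated with a short sketch. All names below (write_register, configure_main_crate, the board identifiers) are hypothetical, and a dictionary stands in for the USB-attached hardware; the point is only that a configuration sequence is plain Python that composes low-level calls without recompiling C++ code:

```python
# Sketch of a GCT-style configuration sequence written in Python.
# In the real system the low-level calls are provided by a Python
# extension wrapping the C++/USB class library; here a dictionary
# stands in for the hardware so the sketch is self-contained.
# All names (write_register, configure_main_crate, ...) are hypothetical.

registers = {}  # stand-in for board registers reached over USB

def write_register(board, name, value):
    """Low-level call; in reality implemented by the C++ extension."""
    registers[(board, name)] = value

def read_register(board, name):
    return registers[(board, name)]

def configure_main_crate(lut_values):
    """A configuration sequence: plain Python, no C++ compilation needed."""
    for board in ("leaf0", "leaf1", "wheel"):
        write_register(board, "mode", 1)          # enable processing mode
    for addr, value in enumerate(lut_values):
        write_register("wheel", f"lut_{addr}", value)

configure_main_crate([3, 1, 4])
```

Debugging routines can reuse the same primitives interactively, which is the main benefit claimed for the extension.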
Figure 5-7 shows the integration model followed by the GCT. This integration model maximizes the usage of the<br />
existing infrastructure. It consists of one single cell, which embeds a Python interpreter in order to execute<br />
Python sequences to configure the main crate. This same cell coordinates the operation of the data source crates<br />
through the remote SOAP interface of the GCT XDAQ applications.<br />
Figure 5-7: GCT integration model (d: Python interpreter, Python configuration sequences, Python extension and USB driver).<br />
5.3.2.2.6 Hadronic Calorimeter<br />
<strong>The</strong> HCAL sub-detector has its own supervisory and control system which is responsible for the configuration,<br />
control and monitoring of the sub-detector hardware and for handling the interaction with R<strong>CMS</strong> (Section 1.4.4).<br />
In addition to this infrastructure, a HCAL cell will provide the interface to the central cell to set the configuration<br />
key of the trigger primitive generator (TPG) hardware and to participate in the interconnection test service<br />
between the HCAL TPG and the RCT. <strong>The</strong> HCAL cell also exposes a SOAP interface that makes it easier for the<br />
HCAL supervisory software to read the information that is set by the central cell. <strong>The</strong> HCAL integration model<br />
is shown in Figure 5-8. This model is equally valid for the ECAL sub-detector.
Figure 5-8: HCAL integration model.<br />
5.3.2.2.7 <strong>Trigger</strong>, Timing and Control System<br />
<strong>The</strong> TTC hardware setup (Section 1.3.2.4) consists of one crate per sub-system, with as many TTCci boards as<br />
there are TTC partitions assigned to the sub-system. Table 5-1 shows the TTC partitions and TTCci boards assigned to<br />
each sub-system. Some sub-systems share the same TTC crate. This is the case of: 1) DTTF and DTSC, 2) RCT<br />
and GCT, and 3) CSC and CSCTF. <strong>The</strong> GT has no TTCci board because the GTFE board receives the TTC<br />
signals from the TCS directly through the backplane.<br />
Sub-system | # of partitions | Partition names | # of TTCci<br />
Pixels | 2 | BPIX, FPIX | 2<br />
Tracker | 4 | TIB/TID, TOB, TEC+, TEC- | 4<br />
ECAL | 6 | EB+, EB-, EE+, EE-, SE+, SE- | 6<br />
HCAL | 5 | HBHEa, HBHEb, HBHEc, HO, HF | 5<br />
DT | 1 | DT | 1<br />
DTTF | 1 | DTTF | 1<br />
RPC | 1 | RPC | 1<br />
CSCTF | 1 | CSCTF | 1<br />
CSC | 2 | CSC+, CSC- | 2<br />
GT | 1 | GT | 0<br />
RCT | 1 | RCT | 1<br />
GCT | 1 | GCT | 1<br />
Totem and Castor | 2 | Totem, Castor | 2<br />
Totals | 28 | | 27<br />
Table 5-1: TTC partitions.
<strong>The</strong> Integration model for the TTCci infrastructure is shown in Figure 5-9. Every TTCci board is controlled by<br />
one TTCci XDAQ application. <strong>The</strong> central cell of each L1 trigger sub-system interacts with the TTCci XDAQ<br />
application through a TTCci cell. <strong>The</strong> TTCci cell retrieves the TTCci configuration information and passes it to<br />
the TTCci XDAQ application.<br />
<strong>The</strong> sub-detector TTCci boards are operated slightly differently. <strong>The</strong> sub-detector supervisory software interacts<br />
directly with the TTCci XDAQ application. <strong>The</strong> sub-detector central cell also has a TTCci cell which controls<br />
the TTCci XDAQ applications running in the sub-detector supervisory software tree. This additional control path<br />
is necessary to run TTC interconnection tests between the TCS module, located in the GT crate, and TTCci<br />
boards that belong to sub-detectors. <strong>The</strong> sub-detector TTCci cells can control more than one TTCci XDAQ<br />
application.<br />
<strong>The</strong> configuration of the L1 trigger sub-systems TTCci boards is driven by the TS. On the other hand, the<br />
configuration of the sub-detector TTCci boards is driven by the corresponding sub-detector supervisory software.<br />
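The TTCci control path described above can be sketched as follows: a TTCci cell fetches a board configuration and forwards it to the TTCci XDAQ application as a SOAP message. The envelope layout, the Configure element and the configuration text are hypothetical, and the transport (an HTTP POST in the real system) is stubbed out:

```python
# Sketch of the TTCci control path: a TTCci cell fetches a board
# configuration and forwards it to the TTCci XDAQ application as a
# SOAP message. Element names and the configuration text are
# hypothetical; the HTTP transport is stubbed out.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def fetch_ttcci_configuration(key):
    """Stand-in for the database lookup done through the Tstore xhannel."""
    return f"BGO 0x30;DELAY 12;KEY {key}"

def build_configure_message(config_text):
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    cmd = ET.SubElement(body, "Configure")   # hypothetical command element
    cmd.text = config_text
    return ET.tostring(env, encoding="unicode")

message = build_configure_message(fetch_ttcci_configuration("TSC_KEY_1"))
```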
Figure 5-9: TTCci integration model.<br />
5.3.2.2.8 Luminosity Monitoring System<br />
<strong>The</strong> Luminosity Monitoring System (LMS) provides beam luminosity information. <strong>The</strong> LMS cell uses the<br />
monitoring xhannel (Section 4.4.4.12) to retrieve information from the L1 trigger monitoring collector. This<br />
information is sent periodically to an LMS XDAQ application which gathers luminosity information from several<br />
sources and distributes it to a number of consumers, for instance the luminosity database. Figure 5-10 shows the<br />
LMS integration model.
Figure 5-10: LMS integration model.<br />
5.3.2.2.9 Central cell<br />
<strong>The</strong> central cell coordinates the operation of the sub-system central cells using the cell xhannel interface (Section<br />
4.4.4.9). Figure 5-11 shows the integration model of the central cell with the rest of sub-system central cells.<br />
Figure 5-11: Central cell integration model.<br />
5.3.2.3 Integration summary<br />
Table 5-2 summarizes the most important parameters that define the integration model for each of the sub-systems,<br />
including L1 trigger sub-systems and sub-detectors.
Subsystem | Online software related parameters | | | | HW setup parameters | | TS system parameters |<br />
 | HAL | C++ API | XDAQ apps. | Scripts | Crates (type/#) | Shared crates | Cells (type/#) | Integration case<br />
GT | Yes | Yes | No | Yes (HAL) | 1 | GT/GMT | 1 | Section 5.3.2.2.2<br />
GMT | Yes | Yes | No | Yes (HAL) | 1 | GT/GMT | 1 | Section 5.3.2.2.2<br />
GCT | No (Usb) | Yes | Yes | Yes (Python) | GC A (3), GC B (1) | No | GC A (3), GC B (1), CN (1) | Section 5.3.2.2.5<br />
DTTF | Yes | Yes | No | No | D A (6), D B (1), D C (1) | DTTF crates host DTSC receiver board | D A (6), D B (1), D C (1), CN (1) | Section 5.3.2.2.3<br />
CSCTF | Yes | Yes | No | No | 1 | No | 1 | Section 5.3.2.2.1<br />
RCT | No | Yes | No | No | R A (18), R B (1) | No | R A (18), R B (1), CN (1) | Section 5.3.2.2.3<br />
DTSC | Yes | Yes | No | No | D A (10) | Receiver optical board in DTTF crate | DT A (10), CN (1) | Section 5.3.2.2.3<br />
RPC | Yes | Yes | Yes | No | RP A (12), RP B (1), RP C (1) | No | 1 | Section 5.3.2.2.4<br />
ECAL | Yes | Yes | Yes | No | NA | NA | 1 | Section 5.3.2.2.6<br />
HCAL | Yes | Yes | Yes | No | NA | NA | 1 | Section 5.3.2.2.6<br />
Tracker | NA | NA | NA | NA | NA | NA | 1 | Section 5.3.2.2.6<br />
LMS | NA | NA | Yes | NA | NA | NA | 1 | Section 5.3.2.2.8<br />
TTC | Yes | Yes | Yes | No | 7 | DTTF/DTSC, RCT/GCT, CSCTF/CSC | 8 | Section 5.3.2.2.7<br />
CC | NA | NA | NA | No | NA | NA | 1 | Section 5.3.2.2.9<br />
Table 5-2: Summary of integration parameters.<br />
5.4 System integration<br />
<strong>The</strong> TS system is formed by the integration of the local-scope distributed systems presented in Section 5.3.2. <strong>The</strong><br />
TS system itself can be described as four distributed systems with an overall scope: the TS control system, the<br />
TS monitoring system, the TS logging system and the TS start-up system. <strong>The</strong> following sections describe, for each<br />
of the four systems, the node structure and the communication channels among the nodes.<br />
5.4.1 Control system<br />
<strong>The</strong> TS control system (TSCS) and the TS monitoring system (TSMS) are the main distributed systems with an<br />
overall scope. <strong>The</strong>se two systems facilitate the development of the configuration, test and monitoring services
outlined in the conceptual design. Figure 5-12 shows the TSCS. It consists of the sub-system cells, one Tstore<br />
application, the sub-system relational databases and the communication channels among all these nodes.<br />
Figure 5-12: Architecture of the TS control system. (s: SOAP interface, h: HTTP/CGI interface, d: hardware<br />
driver, cx: cell xhannel interface (SOAP), dx: Tstore xhannel interface (SOAP), o: OCCI interface).<br />
<strong>The</strong> TSCS is a purely hierarchical control system where each node can communicate only with the nodes of the<br />
immediately lower level. <strong>The</strong> central node of the TSCS uses its cell xhannel interface to coordinate the operation of the<br />
sub-system central cells. Sub-system central cells are responsible for coordinating the operation of all sub-system<br />
crates. <strong>The</strong> crate operation is done through an additional level of cells when the sub-system has more than one<br />
crate, or directly when the sub-system is contained in one single crate. Each sub-system has its own relational<br />
database that can be accessed from the sub-system cell using the Tstore xhannel interface. All database queries<br />
sent through the Tstore xhannel are centralized in the Tstore application. This node’s task is to manage the<br />
connections with the database server and to translate the SOAP request messages into OCCI requests<br />
understandable by the Oracle database server (Section 4.4.4.9.2).<br />
<strong>The</strong> TSCS can be remotely controlled using the TS SOAP interface (Appendix A) or using the TS GUI. Both<br />
interfaces are accessible from any node of the TSCS. On the other hand, not all services are available in all the<br />
nodes. <strong>The</strong> central node of the TSCS facilitates access to the global level services, the sub-system central nodes<br />
facilitate the access to the sub-system level services and finally the crate cells facilitate the access to the crate<br />
level services. <strong>The</strong> TS services are discussed in Chapter 6.<br />
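The strictly hierarchical structure of the TSCS can be sketched in a few lines: each cell only talks to the cells one level below it through its cell xhannels, so a configure request issued at the central node fans out level by level. Class and method names here are illustrative, not the actual TS framework API:

```python
# Sketch of the strictly hierarchical TSCS: each cell communicates only
# with the cells one level below it through its cell xhannels.
# Class and method names are illustrative, not the real TS API.

class Cell:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)   # cell xhannels to the level below
        self.configured = False

    def configure(self, key):
        # a cell first configures the level below, then itself
        for child in self.children:
            child.configure(key)
        self.configured = True
        return key

crate_cells = [Cell(f"crate{i}") for i in range(3)]
dttf_cell = Cell("DTTF", crate_cells)      # sub-system central cell
gt_cell = Cell("GT")                       # single-crate sub-system
central = Cell("central", [dttf_cell, gt_cell])
central.configure("TSC_KEY_1")
```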
5.4.2 Monitoring system<br />
<strong>The</strong> TS monitoring, logging and start-up systems are not hierarchical. <strong>The</strong>se systems depend heavily on<br />
existing infrastructure provided by the XDAQ middleware or the R<strong>CMS</strong> framework. <strong>The</strong> usage model for this<br />
infrastructure is characterized by a centralized architecture (Section 4.4.4.12).<br />
<strong>The</strong> TS Monitoring System (TSMS), shown in Figure 5-13, is a distributed application intended to facilitate the<br />
development of the TS monitoring service. <strong>The</strong> TSMS consists of the same cells that participate in the TSCS, the<br />
sensor applications associated to each cell, one monitor collector application, one Mstore application, the Tstore<br />
application and the monitoring relational database.<br />
A TSCS cell that wishes to participate in the TSMS has to provide a descendant of the DataSource class.<br />
This class defines the code intended to create the updated monitoring information. <strong>The</strong> monitor collector<br />
periodically requests from the sensors of the TSMS, through a SOAP interface, the updated monitoring<br />
information of all items declared in the flashlist files (Section 4.4.4.12). <strong>The</strong> Mstore application is responsible for
embedding the collected monitoring information into a SOAP message and sending it to the Tstore<br />
application in order to be stored in the monitoring database. A user of the TSCS can visualize any monitoring<br />
item of the TSMS with a web browser connected to the HTTP/CGI interface of any cell.<br />
Figure 5-13: Architecture of the TS monitoring system.<br />
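The DataSource customization can be sketched as follows. The real interface is C++ and part of the TS framework; this Python sketch uses hypothetical names (CrateTemperature, refresh, the flashlist item name) purely to show the shape of the hook that produces updated monitoring values:

```python
# Sketch of how a cell joins the TSMS: it provides a descendant of a
# DataSource-like class whose update hook produces fresh values for the
# items declared in a flashlist. The real interface is C++; all names
# here are hypothetical.

class DataSource:
    """Base class: the sensor calls refresh() when the collector polls."""
    def refresh(self):
        raise NotImplementedError

class CrateTemperature(DataSource):
    def __init__(self):
        self.readings = iter([41.5, 42.0])   # stand-in for a hardware probe

    def refresh(self):
        # return {flashlist item name: updated value}
        return {"crate_temperature": next(self.readings)}

sensor_sources = [CrateTemperature()]
snapshot = {k: v for src in sensor_sources for k, v in src.refresh().items()}
```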
5.4.3 Logging system<br />
Figure 5-14 shows the TS Logging System (TSLS). <strong>The</strong> logging records are generated by any node of the TSCS<br />
and stored in the logging database. <strong>The</strong> TSLS also provides a filtering GUI embedded in the TS GUI of any<br />
cell. It allows any user to follow the execution flow of the TS system.<br />
<strong>The</strong> TS logging collector is responsible for filtering the logging information and for sending it to its final<br />
destinations including the TS logging database. <strong>The</strong> persistent storage of logging records in the logging database<br />
facilitates the development of post-mortem analysis tools. <strong>The</strong> TS logging collector can also send the TS logging<br />
records to a number of destinations: i) a central <strong>CMS</strong> logging collector intended to gather all logging information<br />
from the <strong>CMS</strong> online software infrastructure, ii) an XML file and iii) a GUI-based log viewer (Chainsaw [94]).<br />
5.4.4 Start-up system<br />
Figure 5-15 shows the TS Start-up System (TSSS). <strong>The</strong> TSSS makes it possible to remotely start up the TSCS or any<br />
subset of its nodes. <strong>The</strong> TSSS consists of one job control application in each host of the TS cluster. Each job<br />
control application exposes a SOAP interface which allows starting or killing an application in the same host.<br />
<strong>The</strong> job control applications are installed as operating system services and are started at boot time. A central<br />
process manager coordinates the operation of the job control applications in order to start/stop the TS nodes.<br />
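The start-up scheme can be sketched as one job-control service per host plus a central manager that drives them. The SOAP transport is replaced by direct calls and all names (JobControl, StartupManager, the host and application names) are illustrative:

```python
# Sketch of the TSSS: one job-control service per host exposes
# start/kill, and a central process manager drives them. SOAP transport
# is replaced by direct calls; names are illustrative.

class JobControl:
    """Stand-in for the per-host job control application."""
    def __init__(self, host):
        self.host = host
        self.running = set()

    def start(self, app):
        self.running.add(app)

    def kill(self, app):
        self.running.discard(app)

class StartupManager:
    def __init__(self, job_controls):
        self.job_controls = {jc.host: jc for jc in job_controls}

    def start_nodes(self, plan):
        # plan: {host: [application names to launch on that host]}
        for host, apps in plan.items():
            for app in apps:
                self.job_controls[host].start(app)

manager = StartupManager([JobControl("host1"), JobControl("host2")])
manager.start_nodes({"host1": ["central-cell"], "host2": ["dttf-cell", "tstore"]})
```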
5.5 Services development process<br />
<strong>The</strong> <strong>Trigger</strong> <strong>Supervisor</strong> Control System (TSCS) and the <strong>Trigger</strong> <strong>Supervisor</strong> Monitoring System (TSMS) provide<br />
a stable layer on top of which the TS services have been implemented following a well defined methodology<br />
[95]. Figure 5-16 schematizes the TS service development model associated with the TS system. <strong>The</strong> following<br />
description explains each of the steps involved in the creation of a new service.
• Entry cell definition: <strong>The</strong> first step to implement a service is to designate the cell of the TSCS that<br />
facilitates the client interface. This cell is known as Service Entry Cell (SEC). When the service involves<br />
more than one sub-system, the SEC is the TS central cell. When the scope of the service is limited to a given<br />
sub-system, the SEC is the sub-system central cell. Finally, when the service scope is limited to a single<br />
crate, the SEC is the corresponding crate cell.<br />
• Operation states: <strong>The</strong> second step is to identify the operation states. <strong>The</strong>se represent the stable states of the<br />
system under control that are to be monitored during the execution of the operation. For instance, a<br />
configuration operation intended to set up one single crate could have as many states as boards, and the<br />
successful configuration of each board could be represented as a different operation state.<br />
• Operation transition: Once the FSM states are known, the next step is to define the possible transitions<br />
among stable states and for each transition identify an event that triggers this transition.<br />
• Operation transition methods: For each FSM transition, the conditional and functional methods and<br />
associated parameters have to be defined. <strong>The</strong>se methods actually perform the system state change. In case the<br />
SEC is a crate cell, these methods use the hardware driver, located in the cell context, to modify the crate<br />
state. When the SEC is a central cell, these methods use the xhannel infrastructure to operate lower level<br />
cells and XDAQ applications, and to read monitoring information. New services may require new<br />
operations, commands and monitoring items in lower level cells. <strong>The</strong> developer of the SEC is responsible<br />
for coordinating the required developments in the lower level cells.<br />
• Service test: <strong>The</strong> last step of the process is to test the service.<br />
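The steps above amount to defining an FSM as data: stable states, the transitions between them, and a conditional plus a functional method per transition. The sketch below is illustrative (the real operations are C++ plug-ins in the TS framework, and the names here are hypothetical):

```python
# Sketch of the service development steps as data: stable states,
# transitions, and a conditional plus functional method per transition.
# Names are illustrative; the real operations are C++ plug-ins.

class Operation:
    def __init__(self, initial):
        self.state = initial
        self.transitions = {}   # (state, event) -> (target, cond, func)

    def add_transition(self, src, event, dst, cond, func):
        self.transitions[(src, event)] = (dst, cond, func)

    def fire(self, event, **params):
        dst, cond, func = self.transitions[(self.state, event)]
        if not cond(**params):        # conditional method guards the move
            return False
        func(**params)                # functional method changes the system
        self.state = dst
        return True

op = Operation("halted")
op.add_transition("halted", "configure", "configured",
                  cond=lambda key: key.startswith("TSC"),
                  func=lambda key: None)
ok = op.fire("configure", key="TSC_KEY_1")
```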
Although changes to the L1 decision loop hardware and associated software platforms are expected during the<br />
operational life of the experiment, these changes may occur independently of the requirement of new services or<br />
the evolution of existing ones. <strong>The</strong> TS system is a software infrastructure that provides a stable abstraction of<br />
the L1 decision loop despite hardware and software upgrades.<br />
<strong>The</strong> stable layer of the TS system enables the development of new services to be coordinated simply by following a<br />
well defined methodology, with very limited knowledge of the TS framework internals and independently of<br />
hardware and software platform upgrades. This approach to coordinating the development of new L1 operation<br />
capabilities fits the professional background and experience of managers and technical coordinators well.<br />
Chapter 6 presents the result of applying this methodology to implement the configuration and interconnection<br />
test services outlined in Section 3.3.3.<br />
Figure 5-14: Architecture of the TS logging system.<br />
Figure 5-15: Architecture of the TS start-up system.<br />
Figure 5-16: TS services development model: entry cell definition, operation states, operation transitions, operation transition methods, service test.<br />
Chapter 6<br />
<strong>Trigger</strong> <strong>Supervisor</strong> Services<br />
6.1 Introduction<br />
<strong>The</strong> TS services are the final <strong>Trigger</strong> <strong>Supervisor</strong> functionalities developed on top of the TS control and<br />
monitoring systems. <strong>The</strong>y have been implemented following the TS services development process described in<br />
Section 5.5. <strong>The</strong> functional descriptions outlined in Section 3.3.3 were initial guidelines. <strong>The</strong> logging and start-up<br />
systems directly provide the corresponding final services and do not require any further customization beyond<br />
the system integration presented in Section 5.4.<br />
Guided by the “controller decoupling” non-functional requirement presented in Section 3.2.2, Point 3), the TS<br />
services were totally implemented on top of the TS system and did not require the implementation of any<br />
functionality on the controller side. This approach to implement the TS services simplified the development of<br />
controller applications, and it eased the deployment and maintenance of the TS system and services.<br />
<strong>The</strong> goal of this chapter is to describe for each different service: the functionality seen by an external controller,<br />
the internal implementation details from the TS system point of view, and finally, the service operational use<br />
cases.<br />
This chapter has been organized in the following sections: Section 6.1 is the introduction, the configuration<br />
service is presented in Section 6.2, Section 6.3 is dedicated to the interconnection test service, Section 6.4<br />
describes the monitoring service, and finally, Section 6.5 presents the graphical user interfaces.<br />
6.2 Configuration<br />
6.2.1 Description<br />
<strong>The</strong> TS configuration service facilitates setting up the L1 trigger hardware. It defines the content of the<br />
configurable items: FPGA firmware, LUTs, memories and registers. Figure 6-1 illustrates the client point of<br />
view to operate the L1 trigger with this service. In general, the TS Control System (TSCS) provides two<br />
interfaces to access the TS services: a SOAP based protocol for remote procedure calls (Appendix A) and the TS<br />
GUI based on the HTTP/CGI protocol (Section 4.4.4.11 and Section 6.5). Both interfaces to the central cell<br />
expose all TS services. <strong>The</strong> following description presents the service operation instructions without the SOAP<br />
or HTTP/CGI details.<br />
Up to eight remote clients can use this service simultaneously in order to set up the L1 trigger and the TTC<br />
system (Sections 1.3.2 and 1.4.5). <strong>The</strong> first client that connects to the central cell initiates a configuration<br />
operation and executes the first transition configure with a key assigned to the TSC_KEY parameter. <strong>The</strong> key<br />
corresponds to a full configuration of the L1 trigger which is common for all DAQ partitions. When the<br />
configure transition finalizes, the L1 trigger system should be in a well defined working state. Additional clients<br />
attempting to operate with the configuration service have to initiate another configuration operation and also to
execute the configure transition. To avoid configuration inconsistencies, these additional clients have to provide<br />
the same configuration TSC_KEY parameter, otherwise they are not allowed to reach the configured state.<br />
All clients can execute the partition transition with a second key assigned to the TSP_KEY parameter and the<br />
run number assigned to the Run Number parameter. This key identifies the configurable parameters of the L1<br />
decision loop which are exclusive to the DAQ partition that the corresponding client is controlling. <strong>The</strong><br />
following list presents these parameters:<br />
• TTC vector: This 32 bit vector identifies the TTC partitions assigned to a DAQ partition.<br />
• DAQ partition: This number from 0 to 7 defines the DAQ partition.<br />
• Final-Or vector: This vector defines which algorithms of the trigger menu (128 bits) and technical triggers<br />
(64 bits) should be used to trigger a DAQ partition.<br />
• BX Table: This table defines which bunch crossings should be used for triggering and which fast and<br />
synchronization signals should be sent to the TTC partitions belonging to one DAQ partition.<br />
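The widths given above (a 32-bit TTC vector, a partition number from 0 to 7, a 128-bit algorithm mask plus a 64-bit technical-trigger mask) can be captured in a small sketch. The container and function name are illustrative, not the actual TSP_KEY representation:

```python
# Sketch of the DAQ-partition parameters listed above, with the widths
# given in the text: a 32-bit TTC vector, a partition number 0-7 and a
# 128+64 bit Final-OR vector. The container and checks are illustrative.

def make_partition_key(ttc_vector, daq_partition, algo_mask, tech_mask):
    assert 0 <= ttc_vector < 2**32        # one bit per TTC partition
    assert 0 <= daq_partition <= 7        # eight DAQ partitions
    assert 0 <= algo_mask < 2**128        # trigger menu algorithms
    assert 0 <= tech_mask < 2**64         # technical triggers
    return {"ttc_vector": ttc_vector,
            "daq_partition": daq_partition,
            "final_or": (algo_mask, tech_mask)}

key = make_partition_key(0b101, 0, algo_mask=1 << 10, tech_mask=0)
```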
Figure 6-1: Client point of view of the TS configuration service (operation states: halted, configured, partitioned, enabled, suspended).<br />
<strong>The</strong> enable transition starts the corresponding DAQ partition controller in the TCS module. <strong>The</strong> suspend<br />
transition temporarily stops the partition controller without resetting the associated counters. <strong>The</strong> resume<br />
transition facilitates the recovery of the normal running state. Finally, the stop transition, which can be executed<br />
from either the suspended or enabled states, stops the DAQ partition controller and resets all associated<br />
counters.<br />
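The transition rules described above can be written out as a table of legal moves between the stable states. The target state after stop is an assumption (the text only says the controller is stopped and its counters reset):

```python
# The transition rules described above as a table: which events are
# legal in each stable state of the configuration operation.
# The target state after 'stop' is an assumption.

TRANSITIONS = {
    ("halted", "configure"): "configured",
    ("configured", "partition"): "partitioned",
    ("partitioned", "enable"): "enabled",
    ("enabled", "suspend"): "suspended",
    ("suspended", "resume"): "enabled",
    ("enabled", "stop"): "partitioned",    # assumption: stop returns here
    ("suspended", "stop"): "partitioned",  # stop allowed from both states
}

def next_state(state, event):
    return TRANSITIONS.get((state, event))   # None if the move is illegal

path = ["halted"]
for event in ("configure", "partition", "enable", "suspend", "resume", "stop"):
    path.append(next_state(path[-1], event))
```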
6.2.2 Implementation<br />
<strong>The</strong> configuration service requires the collaboration of the TSCS nodes, the Luminosity Monitoring Software<br />
System (LMSS), the sub-detectors supervisory and control systems (SSCS), and the usage of the L1 trigger<br />
configuration databases. All involved nodes are shown in Figure 6-2.
Figure 6-2: Distributed software and hardware system involved in the implementation of the TS<br />
configuration and interconnection test services.<br />
6.2.2.1 Central cell<br />
<strong>The</strong> role of the central cell in the configuration services is twofold: to facilitate the remote client interface<br />
presented in Section 6.2.1 and to coordinate the operation of all involved nodes. Both the interface to the client<br />
and the system coordination are defined by the configuration operation installed in the central cell (Figure 6-1).<br />
This section describes the stable states, and the functional (f_i) and conditional (c_i) methods of the central cell<br />
configuration operation transitions (Section 4.3.1).<br />
Initialization()<br />
This method stores the session_id parameter in an internal variable of the configuration operation instance.<br />
This number will be propagated to lower level cells when a cell command or operation is instantiated. <strong>The</strong><br />
session_id is attached to every log record in order to help identify which client directly or indirectly executed a<br />
given action in a cell of the TSCS.<br />
Configure_c()<br />
<strong>The</strong> conditional method of the configure transition checks whether this is the first configuration operation<br />
instance. If this is the case, this method disables the isConfigured flag, iterates over all cell xhannels accessible<br />
from the central cell and initiates a configuration operation in all trigger sub-system central cells with the same<br />
session_id provided by the client. If one of these configuration operations cannot be successfully started this<br />
method returns false, the functional method of the configure transition is not executed and the operation state<br />
stays halted. This method does not retrieve information from the configuration database.<br />
In case this is not the first configuration operation instance, this method checks whether the parameter TSC_KEY is<br />
equal to the TSC_KEY variable stored in the cell context. If the keys differ, the configure transition is not executed<br />
and the operation state stays halted. Otherwise, this method enables the isConfigured flag, returns true and the<br />
functional method of the configure transition is executed.
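The branching above can be sketched as follows (a minimal Python sketch; op, start_subsystem_op and the key arguments are illustrative stand-ins, not the actual TS framework interfaces):<br />

```python
def configure_c(op, start_subsystem_op, tsc_key, stored_tsc_key):
    """Conditional method of the configure transition (sketch).

    Returns True if the functional method may run; False leaves the
    operation in the 'halted' state.
    """
    if op.get("first_instance", True):
        op["isConfigured"] = False
        # Fan out: start a configuration operation in every trigger
        # sub-system central cell, propagating the client's session_id.
        for cell in op["subsystem_cells"]:
            if not start_subsystem_op(cell, op["session_id"]):
                return False  # one start failed: stay halted
        return True
    # Later instances: proceed only if the client supplies the same key.
    if tsc_key != stored_tsc_key:
        return False  # key mismatch: stay halted
    op["isConfigured"] = True
    return True
```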
<strong>Trigger</strong> <strong>Supervisor</strong> Services 90<br />
Configure_f()<br />
<strong>The</strong> functional method for this transition performs the following steps:<br />
1. If the isConfigured flag is false, the method executes steps 2, 3, 4 and 5. Otherwise this method does<br />
nothing.<br />
2. To read, from the TSC_CONF table of the central cell configuration database (Figure 6-3), the row with the<br />
unique identifier equal to TSC_KEY. This row contains as many identifiers as sub-systems have to be<br />
configured (sub-system keys). If a sub-system shall not be configured, the corresponding position in the<br />
TSC_KEY row is left empty.<br />
3. To execute the configure transition in each sub-system central cell sending as a parameter the sub-system<br />
key. This transition is not executed in those sub-systems with an empty key. Section 6.2.2.2 presents the<br />
configuration operation of the sub-system central and crate cells.<br />
4. To store in the cell context the current TSC_KEY.<br />
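Steps 2 to 4 can be sketched as follows (the TSC_CONF row is modelled as a plain dictionary mapping sub-system names to keys; the configure callable stands in for the remote configure transition of each sub-system central cell):<br />

```python
def configure_f(tsc_conf_row, configure, cell_context, tsc_key, is_configured):
    """Functional method of the configure transition (sketch).

    An empty key in the row means 'do not configure this sub-system'.
    Returns the list of sub-systems that were configured.
    """
    if is_configured:
        return []                       # step 1: nothing to do
    configured = []
    for subsystem, key in tsc_conf_row.items():
        if not key:                     # empty key: skip this sub-system
            continue
        configure(subsystem, key)       # step 3: remote configure transition
        configured.append(subsystem)
    cell_context["TSC_KEY"] = tsc_key   # step 4: remember the current key
    return configured
```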
[Figure 6-3 schema: the main table TSC_CONF holds one row per TSC_KEY, with one sub-system key per column (GT_KEY, GMT_KEY, DTTF_KEY, CSCTF_KEY, GCT_KEY, RCT_KEY, RPCTrig_KEY, ECAL_TPG_KEY, HCAL_TPG_KEY, DT_TPG_KEY); each key points into a sub-system table such as GT_CONF, GMT_CONF, DTTF_CONF or CSCTF_CONF, and GTL_CONFIG further references GTL_KEY, GTL_FW_KEY, GTL_REG_KEY, GTL_SEQ_KEY and URL_TRIG_MENU.]<br />
Figure 6-3: <strong>The</strong> L1 configuration database is organized hierarchically. <strong>The</strong> main table is named<br />
TSC_CONF.<br />
Partition_c()<br />
This method performs the following steps:<br />
1. To read from the TSP_CONF table (Figure 6-4) the row with the unique identifier equal to TSP_KEY. This row<br />
points to the hardware configuration parameters that affect only the given DAQ partition, namely: the 32-bit<br />
TTC vector, the DAQ partition identifier, the 128 + 64 bit Final-Or vector and the bunch crossing<br />
table.<br />
2. To use the GT cell commands to check that the DAQ partition and the TTC partitions are not being used. If<br />
there is an inconsistency, this method returns false, the functional method of the partition transition is not<br />
executed and the operation state stays configured. Section 6.2.2.3.1 presents the GT cell commands.<br />
Partition_f()<br />
This method performs the following steps:<br />
1. To read from TSP_CONF table the row with the unique identifier equal to TSP_KEY.<br />
2. To execute the GT cell commands (Section 6.2.2.3.1) in order to:<br />
a. Set up the DAQ partition dependent parameters retrieved in the first step.<br />
b. Reset the DAQ partition counters.<br />
c. Assign the Run Number parameter to the DAQ partition.
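The partition check above can be condensed into a sketch (the row fields follow the TSP_CONF columns of Figure 6-4, with TTC_VECTOR as a 32-bit mask; daq_in_use and ttc_in_use_mask are illustrative stand-ins for the GT cell status queries):<br />

```python
def partition_c(row, daq_in_use, ttc_in_use_mask):
    """Conditional method of the partition transition (sketch).

    Returns False, leaving the operation in the 'configured' state,
    if the TSP_CONF row is missing or the DAQ/TTC partitions are busy.
    """
    if row is None:
        return False
    if row["DAQ_PARTITION"] in daq_in_use:
        return False                     # DAQ partition already in use
    if row["TTC_VECTOR"] & ttc_in_use_mask:
        return False                     # a requested TTC partition is busy
    return True
```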
[Figure 6-4 schema: table TSP_CONF with columns TSP_KEY, TTC_VECTOR, FIN_OR, DAQ_PARTITION and BC_TABLE.]<br />
Figure 6-4: <strong>The</strong> database table that stores DAQ partition dependent parameters is named TSP_CONF.<br />
Enable_c()<br />
This method checks whether this is the first configuration operation instance. If this is the case, this method<br />
disables the isEnabled flag. Otherwise, this method enables the isEnabled flag and checks in all trigger<br />
sub-system central cells that the configuration operation is in the configured state.<br />
Enable_f()<br />
<strong>The</strong> functional method of the enable transition performs the following steps:<br />
1. If the isEnabled flag is disabled, the method executes steps 2 and 3. Otherwise this method only executes<br />
step 3.<br />
2. To execute the enable transition in the configuration operation of all sub-system central cells. This enables<br />
the trigger readout links with the DAQ system and the LMS software.<br />
3. To execute the GT cell commands to start the DAQ partition controller in the TCS module.<br />
Suspend_c()<br />
This method checks nothing.<br />
Suspend_f()<br />
This method executes in the GT cell a number of commands that simulate a busy sTTS signal (Section 1.3.2.4) in<br />
the corresponding DAQ partition. <strong>The</strong> procedure stops the generation of L1A’s and TTC commands in this DAQ<br />
partition. Section 6.2.2.3 presents these commands.<br />
Resume_c()<br />
This method checks nothing.<br />
Resume_f()<br />
This method executes in the GT cell a command that disables the simulated busy sTTS signal that was enabled in<br />
the functional method of the suspend transition. Section 6.2.2.3.1 presents these commands.<br />
Stop_c()<br />
This method checks nothing.<br />
Stop_f()<br />
This method executes in the GT cell the command to stop a given DAQ partition (Section 6.2.2.3.1).<br />
Destructor()<br />
This method is executed when the remote client finishes using the configuration operation service and destroys<br />
the configuration operation instance. <strong>The</strong> destructor method of the last configuration operation destroys the<br />
configuration operations running in the sub-system central cells. This stops the trigger readout links with the<br />
DAQ system and the LMS software.
6.2.2.2 <strong>Trigger</strong> sub-systems<br />
Each trigger crate is configured by a configuration operation running on a dedicated cell for that crate (Section<br />
5.3.2.1.2). A configuration operation provided by the sub-system central cell coordinates the operation over all<br />
crate cells. When a trigger sub-system consists of one single crate, the central cell and the crate cell are the same.<br />
A complete description of all integration scenarios was presented in Section 5.3.2.2.<br />
Figure 6-5 shows the configuration operation running in all trigger sub-system cells. <strong>The</strong> description of the<br />
functional and conditional methods depends on whether it is a crate cell or not. This is a generic description that<br />
can be applied to any trigger sub-system. It is not meant to provide the specific hardware configuration details of<br />
a concrete trigger sub-system. Specific sub-system configuration details can be checked in the code itself [96].<br />
This section describes the stable states, and the functional (f_i) and conditional (c_i) methods of the trigger<br />
sub-system cell configuration operation transitions. This description includes the sub-system central and crate cell<br />
cases.<br />
[Figure 6-5 state machine: OpInit(“configuration”, “session_id”, “opid”) creates the operation in the halted state; configure(“opid”, KEY) leads to configured, enable(“opid”) to enabled, suspend(“opid”) to suspended, and resume(“opid”) back to enabled.]<br />
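The state machine of Figure 6-5 can be sketched as a transition table (a minimal Python sketch; the conditional argument stands for the c_i methods described below, which may veto a transition):<br />

```python
# Allowed transitions of the sub-system configuration operation,
# with states and triggers as in Figure 6-5.
TRANSITIONS = {
    ("halted", "configure"): "configured",
    ("configured", "enable"): "enabled",
    ("enabled", "suspend"): "suspended",
    ("suspended", "resume"): "enabled",
}

def step(state, transition, conditional=lambda: True):
    """Run one transition; if it is undefined for the current state or
    the conditional method returns False, the operation stays put."""
    target = TRANSITIONS.get((state, transition))
    if target is None or not conditional():
        return state
    return target
```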
Initialization()<br />
This method stores the session_id parameter in an internal variable of the configuration operation instance. If<br />
the current operation instance was started by the central cell, the session_id is the same as the one provided by<br />
the central cell client.<br />
Configure_c()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and initiates a configuration operation in all crate cells and TTCci cells (if the trigger sub-system has a TTCci<br />
board). If the operation runs in a crate cell, this method checks if the hardware is accessible using the hardware<br />
driver.<br />
If one of these configuration operations cannot be successfully started or the hardware is not accessible, this<br />
method returns false, the functional method of the configure transition is not executed and the operation state<br />
stays halted.<br />
Configure_f()<br />
Figure 6-5: <strong>Trigger</strong> sub-system configuration operation.<br />
<strong>The</strong> functional method for this transition performs the following steps:<br />
1. To read from the trigger sub-system configuration database the row with the unique identifier equal to KEY.<br />
If the operation runs in the trigger sub-system central cell, this row contains as many identifiers as crate<br />
cells. If a crate cell is not going to be configured, the corresponding position in the KEY row is left empty. If<br />
the operation runs in a crate cell, this row contains configuration information, links to firmware or look up<br />
table (LUT) files and/or references to additional configuration database tables. Section 6.2.2.3 presents the<br />
GT configuration database example.<br />
2. If the operation runs in the trigger sub-system central cell, this method executes in each crate cell and TTCci<br />
cell the configure transition sending as a parameter the crate or TTCci key. If the operation runs in a crate<br />
cell, the configuration information is retrieved from the configuration database using the database xhannel.<br />
<strong>The</strong> crate is configured with this information using the hardware driver.
Enable_c()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and checks if the current state is configured. If the operation runs in a crate cell, this method checks if the<br />
hardware is accessible using the hardware driver.<br />
If one of these configuration operations is not in the configured state or the hardware is not accessible, this<br />
method returns false, the functional method of the enable transition is not executed and the operation state<br />
stays configured.<br />
Enable_f()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and executes the enable transition. If the operation runs in a crate cell, this method configures the hardware in<br />
order to enable the readout link with the DAQ system.<br />
Suspend_c()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and checks if the current state is enabled. If the operation runs in a crate cell, this method checks if the hardware<br />
is accessible using the hardware driver.<br />
If one of these configuration operations is not in the enabled state or the hardware is not accessible, this method<br />
returns false, the functional method of the suspend transition is not executed and the operation state stays<br />
enabled.<br />
Suspend_f()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and executes the suspend transition. If the operation runs in a crate cell, this method configures the hardware in<br />
order to disable the readout link with the DAQ system.<br />
Resume_c()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and checks if the current state is suspended. If the operation runs in a crate cell, this method checks if the<br />
hardware is accessible using the hardware driver.<br />
If one of these configuration operations is not in the suspended state or the hardware is not accessible, this<br />
method returns false, the functional method of the resume transition is not executed and the operation state<br />
stays suspended.<br />
Resume_f()<br />
If the operation runs in the trigger sub-system central cell, this method iterates over all available cell xhannels<br />
and executes the resume transition. If the operation runs in a crate cell, this method configures the hardware in<br />
order to enable again the readout link with the DAQ system.<br />
Destructor()<br />
<strong>The</strong> destructor method of the trigger sub-system central cell configuration operation is executed by the destructor<br />
method of the last TS central cell configuration operation. If the operation runs in the trigger sub-system central<br />
cell, this method iterates over all available cell xhannels and destroys all configuration operations. If the<br />
operation runs in a crate cell, this method configures the hardware in order to disable the readout link with the<br />
DAQ system.<br />
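The central-versus-crate branching that recurs in all of these methods can be sketched with the enable transition as an example (Python; xhannels and hardware are illustrative stand-ins for the cell xhannel list and the hardware driver):<br />

```python
def enable_f(is_central, xhannels, hardware):
    """Functional method of the enable transition (sketch).

    In the sub-system central cell the call fans out over the cell
    xhannels; in a crate cell it acts on the hardware driver directly.
    """
    if is_central:
        for xhannel in xhannels:
            xhannel("enable")            # forward the transition
        return "forwarded"
    hardware["readout_link"] = True      # enable the DAQ readout link
    return "enabled"
```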
6.2.2.3 Global <strong>Trigger</strong><br />
<strong>The</strong> GT cell operates the GT where L1A decisions are taken based on trigger objects delivered by the GCT and<br />
the GMT (Section 1.3.2.3). <strong>The</strong> GT cell plays a special role in the configuration of the L1 trigger. It facilitates a<br />
set of cell commands used by the central cell configuration operation and an implementation of the trigger<br />
sub-system configuration operation presented in Section 6.2.2.2. This section presents the interface of the GT cell<br />
[97] involved in the configuration and the interconnection test services.
6.2.2.3.1 Command interface<br />
<strong>The</strong> GT command interface is used by the configuration and interconnection test operations running in the<br />
central cell, and also by the GT control panel (Section 6.5.1). <strong>The</strong> command interface has been mostly designed<br />
according to the needs of these clients. <strong>The</strong> commands can be classified as a function of the GT boards: <strong>Trigger</strong><br />
Control System (TCS), Final Decision Logic (FDL) and Global <strong>Trigger</strong> Logic (GTL).<br />
FDL commands<br />
<strong>The</strong> FDL is one of the GT modules that are configured during the partition transition of the central cell<br />
configuration operation (Section 6.2.2.1). For instance, to set up the Final-Or of the FDL for a given DAQ<br />
partition, to monitor the L1A rate counters for each of the 192 L1A’s (FDL slice) coming from the GTL or to<br />
apply a pre-scaler to a certain algorithm or technical trigger.<br />
Number of slice (xdata::UnsignedShort): the number of FDL slices depends on the firmware. Currently 192<br />
slices are foreseen on the FDL, so valid values are [0:191].<br />
DAQ partition (xdata::UnsignedShort): the number of DAQ partitions is 8, so valid values are [0:7].<br />
Pre-scale factor (xdata::UnsignedLong): the pre-scaler value for a slice is held in a 16 bit register, so the range<br />
of valid values is [0:65535].<br />
Update step size (xdata::UnsignedLong): the update step size is held in a 16 bit register, so the range of valid<br />
values is [0:65535].<br />
Bit for refresh rate (xdata::UnsignedShort): each of 8 bits refers to a different multiplicity defined in the FDL<br />
firmware, so valid values are [0:7].<br />
Table 6-1: Description of parameters used in FDL commands.<br />
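The valid ranges of Table 6-1 can be captured in a small validation helper (a sketch; the parameter names are shortened for illustration):<br />

```python
# Valid ranges for the FDL command parameters of Table 6-1;
# the upper bounds follow from the register widths quoted there.
FDL_PARAM_RANGES = {
    "slice": range(0, 192),               # 192 FDL slices in current firmware
    "daq_partition": range(0, 8),         # 8 DAQ partitions
    "prescale_factor": range(0, 65536),   # 16-bit register
    "update_step_size": range(0, 65536),  # 16-bit register
    "refresh_rate_bit": range(0, 8),      # one of 8 multiplicity bits
}

def validate(params):
    """Return the names of parameters whose values are out of range."""
    return [name for name, value in params.items()
            if value not in FDL_PARAM_RANGES[name]]
```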
SetFinOrMask:<br />
Description: Each slice can be added to the Final-Or of one or more DAQ partitions. This command adds or<br />
removes a specific slice to or from a DAQ partition’s Final-Or according to the ”Enable for Final-Or” parameter.<br />
Parameters: Number of slice, Number of DAQ partition, Enable for Final-Or<br />
Return value: Slice number: ”Number of slice” ”enabled/disabled” for Final-Or in DAQ partition<br />
number: ”Number of DAQ partition”<br />
GetFinOrMask:<br />
Description: Reads out whether a slice is currently part of the Final-Or of a certain DAQ partition.<br />
Parameters: Number of slice, Number of DAQ partition<br />
Return value: xdata::Boolean<br />
SetVetoMask:<br />
Description: Each slice can suppress a L1A for one or more DAQ partitions. This command enables or disables<br />
that mechanism for a given slice and DAQ partition.<br />
Parameters: Number of slice, Number of DAQ partition, Enable for veto<br />
Return value: Slice number: ”Number of slice” ”enabled/disabled” as veto for DAQ partition number:<br />
”Number of DAQ partition”<br />
GetVetoMask:<br />
Description: Reads out if a certain slice is currently defined as veto for a certain DAQ partition.<br />
Parameters: Number of slice, Number of DAQ partition<br />
Return value: xdata::Boolean<br />
SetPrescaleFactor:<br />
Description: To control L1A rates that are too high, a pre-scale factor can be applied individually for each<br />
slice. Setting the factor to 0 or 1 does no pre-scaling.<br />
Parameters: Number of slice, Pre-scale factor<br />
Return value: Pre-scale factor of slice number: ”Number of slice” set to: ”Pre-scale factor”<br />
GetPrescaleFactor:<br />
Description: Reads out the pre-scale factor for a certain FDL slice.<br />
Parameters: Number of slice<br />
Return value: xdata::UnsignedLong<br />
ReadRateCounter:<br />
Description: Reads out the rate counter for a certain slice.<br />
Parameters: Number of slice<br />
Return value: xdata::UnsignedLong<br />
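The pre-scaling semantics noted for SetPrescaleFactor (a factor of 0 or 1 means no pre-scaling) can be illustrated as follows; the keep-every-n-th-L1A interpretation is an assumption for illustration:<br />

```python
def prescale(l1a_count, factor):
    """Number of L1As that pass a slice's pre-scaler (sketch).

    A factor of 0 or 1 disables pre-scaling; a factor n is assumed
    to keep every n-th L1A.
    """
    if factor in (0, 1):
        return l1a_count     # no pre-scaling applied
    return l1a_count // factor
```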
SetUpdateStepSize:<br />
Description: Sets the common step size for the reset period of all rate counters.<br />
Parameters: Update step size<br />
Return value: Update step size set to: ”Update step size”<br />
SetUpdatePeriod:<br />
Description: Sets the update period of the rate counters for a certain slice, based on the common update step<br />
size. <strong>The</strong> update period is chosen by setting a register. Each register bit corresponds to a factor the common<br />
update period is multiplied with. An array in the code of the command maps bit numbers to multiplicities.<br />
Parameters: Number of slice, Bit for refresh rate<br />
Return value: Update period of slice number: ”Number of slice” set to: ”multiplicity”<br />
GetNumberOfAlgos:<br />
Description: Depending on the version of the firmware of the FDL chip, the number of Technical <strong>Trigger</strong>s<br />
(TT’s) may differ. This command gives back the number of TT’s currently implemented.<br />
Parameters: none<br />
Return value: xdata::UnsignedShort<br />
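The bit-to-multiplicity mapping described for SetUpdatePeriod might be sketched as follows; the powers of two in BIT_TO_MULTIPLICITY are hypothetical, since the actual array lives in the command code:<br />

```python
# Hypothetical bit-to-multiplicity map; the real array is defined in
# the SetUpdatePeriod command and may hold different factors.
BIT_TO_MULTIPLICITY = [1, 2, 4, 8, 16, 32, 64, 128]

def update_period(common_step_size, refresh_rate_bit):
    """Update period of a slice's rate counter: the common step size
    multiplied by the factor selected by the register bit."""
    return common_step_size * BIT_TO_MULTIPLICITY[refresh_rate_bit]
```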
TCS commands<br />
<strong>The</strong> <strong>Trigger</strong> Control System module (TCS) controls the distribution of L1A’s (Section 1.3.2.4). <strong>The</strong>refore, it<br />
plays a crucial role with respect to Data Acquisition and readout of the trigger components. <strong>The</strong> TCS command<br />
interface of the GT cell is used by the configuration operation running in the central cell (Section 6.2.2.1) and by<br />
the GT control panel (Section 6.5.1). This interface provides very fine grained control over the TCS module.<br />
Assigning TTC partitions to DAQ partitions, assigning time slots, controlling the random trigger generator and<br />
the generation of fast and synchronization signals, and loading predefined bunch crossing tables separately for<br />
each DAQ partition are tasks the command interface has to cope with.<br />
Commands of the TCS can be grouped into commands affecting more than one DAQ partition controller (PTC)<br />
and PTC dependent commands. <strong>The</strong> first group of commands therefore contains the prefix ”Master” whereas<br />
commands of the second group start with ”Ptc”. <strong>The</strong> second group of commands has the number of the PTC as a<br />
common parameter.<br />
DAQ partition (xdata::UnsignedShort): the number of DAQ partitions is 8, so valid values are [0:7].<br />
Number of PTC (xdata::UnsignedShort): for each DAQ partition there is a PTC implemented on the TCS chip,<br />
so valid values are [0:7].<br />
Detector partition (xdata::UnsignedShort): refers to one of 32 TTC partitions. Valid values are [0:31].<br />
Time slot (xdata::UnsignedShort): the time slot for a PTC is calculated from an 8 bit value. Valid values are<br />
[0:255].<br />
Random trigger frequency (xdata::UnsignedLong): the random frequency is calculated from a 16 bit register<br />
value. Valid values are [0:65535].<br />
Table 6-2: Description of parameters used in TCS commands.<br />
MasterSetAssignPart:<br />
Description: This command assigns a TTC partition to a DAQ partition. In case the TTC partition is already<br />
part of a DAQ partition it will be assigned to the new partition anyway.<br />
Parameters: Detector partition, DAQ partition<br />
Return value: Detector partition ”Detector partition” assigned to DAQ partition: ”DAQ partition”.<br />
MasterGetAssignPart:<br />
Description: Returns the number of the DAQ partition a certain TTC partition is part of.<br />
Parameters: Detector partition<br />
Return value: xdata::UnsignedShort<br />
MasterSetAssignPartEn:<br />
Description: This command enables or disables a TTC partition. Before a TTC partition can be assigned to a<br />
DAQ partition it has to be enabled.<br />
Parameters: Detector partition, Enable partition<br />
Return value: Detector partition enabled/disabled<br />
MasterGetAssignPartEn:<br />
Description: Reads out whether or not a certain TTC partition is enabled.<br />
Parameters: Detector partition<br />
Return value: xdata::Boolean<br />
MasterStartTimeSlotGen:<br />
Description: Depending on the registers that define the time slots for every DAQ partition, the time slot<br />
generator switches between the DAQ partitions in round robin mode. This command starts the time slot generator.<br />
Parameters: none<br />
Return value: Time slot generator started.<br />
PtcGetTimeSlot:<br />
Description: Returns the current time slot assignment for a certain PTC.<br />
Parameters: Number of PTC<br />
Return value: xdata::UnsignedShort<br />
PtcStartRnd<strong>Trigger</strong>:<br />
Description: Starts the random trigger generator for a specified PTC.<br />
Parameters: Number of PTC<br />
Return value: Random trigger generator started for DAQ partition controller ”number of PTC”<br />
PtcStopRnd<strong>Trigger</strong>:<br />
Description: Stops the random trigger generator for a specified PTC.<br />
Parameters: Number of PTC<br />
Return value: Random trigger generator stopped for DAQ partition controller ”number of PTC”<br />
PtcRndFrequency:<br />
Description: Sets the frequency of triggers generated by the random trigger generator for a specified PTC.<br />
Parameters: Number of PTC, Random trigger frequency<br />
Return value: Random frequency of partition group: ”number of PTC” set to: ”random trigger frequency”<br />
PtcGetRndFrequency:<br />
Description: Reads out the frequency of the random trigger generator for a PTC.<br />
Parameters: Number of PTC<br />
Return value: xdata::UnsignedLong<br />
PtcStartRun:<br />
Description: Starts a run for a PTC, by first resetting and starting the PTC and then sending a start run<br />
command pulse.<br />
Parameters: Number of PTC<br />
Return value: Run started for PTC: ”number of PTC”<br />
PtcStopRun:<br />
Description: Stops a run for a PTC.<br />
Parameters: Number of PTC<br />
Return value: Run stopped for PTC: ”number of PTC”<br />
PtcCalibCycle:<br />
Description: Starts a calibration cycle for the specified PTC.<br />
Parameters: Number of PTC<br />
Return value: Calibration cycle for DAQ partition ”number of PTC” started.<br />
PtcResync:<br />
Description: Manually starts a resynchronization procedure for the specified PTC.<br />
Parameters: Number of PTC<br />
Return value: Resynchronization procedure for DAQ partition ”number of PTC” initialized.<br />
PtcTracedEvent:<br />
Description: Manually sends a traced event for a specified PTC.<br />
Parameters: Number of PTC<br />
Return value: Traced event initiated for DAQ partition ”number of PTC”.<br />
PtcHwReset:<br />
Description: Manually sends a hardware reset to the PTC.<br />
Parameters: Number of PTC<br />
Return value: Hardware for DAQ partition ”number of PTC” has been reset.<br />
PtcResetPtc:<br />
Description: Resets the state machine of the PTC.<br />
Parameters: Number of PTC<br />
Return value: PTC ”number of PTC” reset.<br />
Other commands<br />
This section describes a number of commands not specifically implemented for a certain type of GT module but<br />
rather used during the initialization, for debugging or for filling the database with register data.<br />
Item (xdata::String): refers to a register item, defined in the HAL “AddressTable” file for a module. If the<br />
specified item is not found, HAL will throw an exception that is caught in the command.<br />
Offset (xdata::UnsignedInteger): the offset to the register address specified by an Item parameter according to<br />
the HAL “AddressTable” file. In case the offset gets too large, a HAL exception caught by the command will<br />
indicate that.<br />
Board serial number (xdata::String): only serial numbers of GT modules that are initialized will be accepted.<br />
<strong>The</strong> GetCrateStatus command returns a list of boards in the crate.<br />
Bus adapter (xdata::String): the GT cell only accepts bus adapters of type ”DUMMY” and ”CAEN”.<br />
“Module Mapper” File (xdata::String): the full path to the HAL “ModuleMapper” file has to be specified. If the<br />
file is not found, a HAL exception caught by the command will inform the user about that.<br />
“AddressTableMap” File (xdata::String): the full path to the HAL “AddressTableMap” file has to be specified.<br />
If the file is not found, a HAL exception caught by the command will inform the user about that.<br />
Table 6-3: Description of parameters used in the auxiliary commands.<br />
GtCommonRead:<br />
Description: This command was written to read out register values from any GT module in the crate. This is<br />
useful for debugging. When the offset parameter is used correctly, lines of memories can also be read out.<br />
Parameters: Item, Offset, Board serial number<br />
Return value: xdata::UnsignedLong<br />
GtCommonWrite:<br />
Description: Generic write access for all GT modules.<br />
Parameters: Item, Value, Offset, Board serial number<br />
Return value: Register value for item: ”Item” set to: ”Value” (offset=”Offset”) for board with serial<br />
number: ”board serial number”<br />
GtInitCrate:<br />
Description: <strong>The</strong> initialization of the GT crate object (GT crate software driver) is done during start-up<br />
of the cell application. If the creation of the crate object did not work correctly, or if another type of bus adapter<br />
or different HAL files should be used, this command is used. Only if the ”reinitialize crate” parameter is set to<br />
true is a new CellGTCrate object instantiated.<br />
Parameters: Module Mapper File, AddressTableMap File, Bus adapter, Reinitialize crate<br />
Return value: <strong>The</strong> GT crate has been initialized with ”bus adapter” bus adapter.<br />
Board with serial nr.: ”board1 serial number” in slot nr. ”board1 slot number”<br />
Board with serial nr.: ”board2 serial number” in slot nr. ”board2 slot number”<br />
GtGetCrateStatus:<br />
Description: <strong>The</strong> crate object dynamically creates associative maps during its initialization, in which<br />
information about the modules in the crate is put. This information can be read out using this command.<br />
Parameters: Module Mapper File, AddressTableMap File, Bus adapter, Reinitialize crate<br />
Return value: <strong>The</strong> GT crate has been initialized with ”bus adapter” bus adapter.<br />
Board with serial nr.: ”board1 serial number” in slot nr. ”board1 slot number”<br />
Board with serial nr.: ”board2 serial number” in slot nr. ”board2 slot number”<br />
GtInsertBoardRegistersIntoDB:<br />
Description: This command reads out all registers for a specified GT module that are in the configuration<br />
database and inserts a row of values with a unique identifier and optionally a description into the corresponding<br />
GT configuration database table.<br />
Parameters: Board serial number, Primary Key, Description<br />
Return value: Register values have been read from the hardware and inserted into table ”Name of Register<br />
Table” with Primary Key: ”Primary Key”<br />
6.2.2.3.2 Configuration operation and database<br />
<strong>The</strong> configuration operation of the GT cell is interesting for two reasons: it is responsible for configuring<br />
the GT hardware that is common to all DAQ partitions, and it serves as an example of a configuration operation<br />
defined for a trigger sub-system crate cell (Section 6.2.2.2). This section describes in detail the<br />
functional method of the configure transition for this operation and the GT configuration database.<br />
Figure 6-6: Flow diagram of the configure transition functional method.<br />
Configure_f()<br />
<strong>The</strong> flow diagram for this method is shown in Figure 6-6. <strong>The</strong> method performs the following steps:<br />
1. To retrieve a row from the main table of the GT configuration database named GT_CONFIG (Figure 6-7).<br />
This row is identified by the key that is given as a parameter to the operation. If a certain board should not<br />
be configured at all, the corresponding entry in the GT_CONFIG table has to be left empty.<br />
2. To loop over all boards in the GT crate in order to log those not found.
<strong>Trigger</strong> <strong>Supervisor</strong> Services 102<br />
Figure 6-7: Main table of the GT configuration database.<br />
3. For all boards that are initialized, the BOARD_FIRMWARE table, shown in Figure 6-8, is retrieved. An<br />
attempt is made to load new firmware if the version number of the current firmware does not match the<br />
firmware version of the configuration.<br />
4. <strong>The</strong> same loop is executed over all possible board memories found in the BOARD_MEMORIES table. Empty<br />
links are omitted just like above.<br />
Figure 6-8: Each BOARD_CONFIG table references a set of sub tables.
5. <strong>The</strong> register table for each board is retrieved. If this table is empty because of a missing link, a warning<br />
message is issued, because loading registers is essential to put the hardware into a well defined state.<br />
6. Finally, the method attempts to download a sequencer file for every board. This sequencer file can be used to<br />
write values into a set of registers.<br />
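The six steps above can be condensed into a short sketch (illustrative Python is used for brevity, although the TS framework is implemented in C++; the GT_CONFIG content, the board names and the log convention are all invented):<br />

```python
# Toy in-memory stand-in for the GT configuration database of Figures 6-6 to 6-8.
GT_CONFIG = {
    "key42": {
        "L1_BOARD": {"firmware": "v2.1",
                     "registers": {"MASK": 0xFF},
                     "memories": {"BX_TABLE": [0, 1]},
                     "sequencer": None},
        "SPARE_BOARD": None,  # empty entry: this board must not be configured
    }
}

def configure_f(key, crate, log):
    row = GT_CONFIG[key]                          # 1. retrieve the GT_CONFIG row
    for board in crate:                           # 2. loop over crate boards
        cfg = row.get(board["name"])
        if cfg is None:
            log.append("skip %s (empty entry)" % board["name"])
            continue
        if board["firmware"] != cfg["firmware"]:  # 3. reload firmware on mismatch
            board["firmware"] = cfg["firmware"]
            log.append("firmware %s -> %s" % (board["name"], cfg["firmware"]))
        board["memories"] = dict(cfg["memories"]) # 4. load board memories
        if not cfg["registers"]:                  # 5. registers are essential
            log.append("WARNING: no register table for %s" % board["name"])
        else:
            board["registers"] = dict(cfg["registers"])
        if cfg["sequencer"]:                      # 6. optional sequencer file
            log.append("sequencer for %s" % board["name"])
    return log
```

A board whose GT_CONFIG entry is empty is skipped and only logged, mirroring step 2 above.<br />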
6.2.2.4 Sub-detector cells<br />
HCAL and ECAL sub-detectors have just one cell each (Section 5.3.2.2.6). <strong>The</strong> configuration operation<br />
customized by the sub-detector cells is the same as for the trigger cells (Section 6.2.2.2). <strong>The</strong> configuration<br />
operation of the sub-detector cell only performs work during the execution of the functional method of its<br />
configure transition. This method sets the sub-detector TPG configuration key in an internal variable of the<br />
sub-detector cell. However, the sub-detector cell is not responsible for actually setting up the hardware. Instead,<br />
when the sub-detector FM requests the configuration of the TPG (Section 1.4.5), the sub-detector supervisory<br />
system performs the following sequence:<br />
1. It reads the key using a dedicated cell command of the sub-detector cell.<br />
2. It uses this key to retrieve the hardware configuration from the sub-detector configuration database.<br />
3. It configures the TPG hardware.<br />
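This three-step sequence can be sketched as follows (illustrative Python; the cell command name, the database content and the hardware representation are assumptions, not the actual sub-detector API):<br />

```python
class SubDetectorCell:
    """Holds the TPG configuration key set by the configure transition."""
    def __init__(self):
        self.tpg_key = None
    def get_tpg_key(self):            # hypothetical dedicated cell command
        return self.tpg_key

# invented sub-detector configuration database content
SUBDET_CONFIG_DB = {"tpg_key_7": {"thresholds": [1, 2, 3]}}

def configure_tpg(cell, hardware):
    key = cell.get_tpg_key()          # 1. read the key from the cell
    config = SUBDET_CONFIG_DB[key]    # 2. retrieve the hardware configuration
    hardware.update(config)           # 3. configure the TPG hardware
    return hardware
```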
6.2.2.5 Luminosity monitoring system<br />
<strong>The</strong> Luminosity Monitoring System (LMS) cell implements a configuration operation which resets the LMS<br />
software (Section 5.3.2.2.8) during the functional method of its enable transition. This method announces that<br />
the trigger system is running and that the LMS readout software can be started. <strong>The</strong> destructor method of the LMS<br />
configuration operation stops the LMS software. <strong>The</strong>refore, the LMS system remains enabled as long as there is at<br />
least one configuration operation instance running in the central cell.<br />
6.2.3 Integration with the Run Control and Monitoring System<br />
<strong>The</strong> Experiment Control System (ECS) presented in Section 1.4 coordinates the operation of all detector<br />
sub-systems, among them the L1 decision loop. <strong>The</strong> interface between the central node of the ECS and each of<br />
the sub-systems is the First Level Function Manager (FLFM), which is basically a finite state machine.<br />
Figure 6-9 shows the state diagram of the FLFM. It consists of solid and dashed ellipses that symbolize states. <strong>The</strong><br />
solid ellipses are steady states that are exited only if a command arrives from the central node of the ECS or an<br />
error occurs. <strong>The</strong> dashed ellipses are transitional states, which execute instructions on the sub-system<br />
supervisors and self-trigger a transition to the next steady state upon completion of the work. <strong>The</strong> command<br />
Interrupt may force the transition to Error from a transitional state. <strong>The</strong> transitions themselves are instantaneous<br />
and guaranteed to succeed, as no execution of instructions takes place. <strong>The</strong> entry point to the state machine is the<br />
Initial state [98].<br />
This FLFM has to be customized by each sub-system. This customization consists of implementing the code of<br />
the main transitional states. For the L1 decision loop, the code for the Configuring, Starting, Pausing,<br />
Resuming and Stopping states has been defined. This definition uses the TS SOAP API described in Appendix A<br />
to access the TS configuration service. In this context, the FLFM acts as a client of the TS.<br />
During the configuring state, the FLFM instantiates a configuration operation in the central cell of the TS and<br />
executes the configure and the partition transitions.<br />
During the starting state, the FLFM executes the enable transition.<br />
During the pausing state, the FLFM executes the suspend transition.<br />
During the resuming state, the FLFM executes the resume transition.<br />
Finally, the FLFM stopping state executes the stop transition.<br />
<strong>The</strong> parameters TSC_KEY, TSP_KEY and Run Number are passed during the corresponding transitions<br />
(Section 6.2.2.1).
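The mapping from FLFM transitional states to TS transitions can be sketched as follows (illustrative Python; the real FLFM drives the TS through the SOAP API of Appendix A, and the RUN_NUMBER parameter name and the stub interface below are invented stand-ins):<br />

```python
# FLFM transitional state -> TS transitions, with the parameters each one needs.
FLFM_ACTIONS = {
    "Configuring": [("configure", ["TSC_KEY"]), ("partition", ["TSP_KEY"])],
    "Starting":    [("enable", ["RUN_NUMBER"])],
    "Pausing":     [("suspend", [])],
    "Resuming":    [("resume", [])],
    "Stopping":    [("stop", [])],
}

class TSClientStub:
    """Records the transitions the FLFM would execute on the central cell."""
    def __init__(self):
        self.calls = []
    def execute(self, transition, params):
        self.calls.append((transition, params))

def run_transitional_state(state, ts, keys):
    for transition, wanted in FLFM_ACTIONS[state]:
        ts.execute(transition, {k: keys[k] for k in wanted})
```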
Figure 6-9: Level-1 function manager state diagram.
Interconnection test 105<br />
6.3 Interconnection test<br />
6.3.1 Description<br />
Due to the large number of communication channels between the trigger primitive generator (TPG) modules of<br />
the sub-detectors and the trigger system, and between the different trigger sub-systems, it is necessary to provide<br />
an automatic testing mechanism. <strong>The</strong> interconnection test service of the <strong>Trigger</strong> <strong>Supervisor</strong> is intended to<br />
automatically check the connections between sub-systems.<br />
From the client point of view, the interconnection test service is another operation running in the TS central cell.<br />
Figure 6-10 shows the state machine of the interconnection test operation. <strong>The</strong> client of the interconnection test<br />
service initiates an interconnection test operation in the central cell and executes the first transition prepare with<br />
a key assigned to the IT_KEY parameter and an optional second string assigned to the custom parameter. This<br />
transition prepares the L1 trigger hardware and the TS system for the starting of the interconnection test. <strong>The</strong><br />
start transition enables the starting of the test. Finally, the client executes the analyze transition to get the test<br />
result from the sub-system central cells.<br />
Figure 6-10 content: OpInit(“interconnectionTest”, “session_id”, “opid”) creates the operation in the halted<br />
state; prepare(IT_KEY, “custom”) leads to prepared, start(“opid”) to started, analyze(“opid”) to analyzed,<br />
and resume(“opid”) returns the operation from analyzed to started.<br />
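The client-side sequence can be sketched with a toy stand-in for the central cell operation (illustrative Python; the state names and transitions follow Figure 6-10, while the object interface and the result payload are invented):<br />

```python
class InterconnectionTestOp:
    """Toy stand-in for the interconnection test operation of the central cell."""
    def __init__(self, session_id):
        self.session_id = session_id
        self.state = "halted"
    def prepare(self, it_key, custom=""):
        assert self.state == "halted"
        self.it_key, self.custom, self.state = it_key, custom, "prepared"
    def start(self):
        assert self.state == "prepared"
        self.state = "started"
    def analyze(self):
        assert self.state == "started"
        self.state = "analyzed"
        return {"result": "ok"}       # invented result payload
    def resume(self):
        assert self.state == "analyzed"
        self.state = "started"

# client-side sequence: OpInit(...), then prepare, start, analyze
op = InterconnectionTestOp("session_1")
op.prepare("IT_KEY_1", custom="gt-to-gmt")
op.start()
result = op.analyze()
```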
6.3.2 Implementation<br />
<strong>The</strong> following sections describe how the TS interconnection test service is formed by the collaboration of<br />
different cell operations installed in different cells of the TS system. In addition, this service requires the<br />
collaboration of the Sub-detectors <strong>Supervisor</strong>y and Control Systems (SSCS), and the usage of the L1 trigger<br />
configuration databases (Figure 6-2). A single operation suffices in the TS central cell. However, every<br />
interconnection test requires specific operations for the concrete sender and receiver sub-system central cells and<br />
crate cells.<br />
6.3.2.1 Central cell<br />
<strong>The</strong> role of the central cell in the interconnection test service is similar to the role played in the configuration<br />
service: to facilitate the remote client interface presented in Section 6.3.1 and to coordinate the operation of all<br />
involved nodes. Both the interface to the client and the system coordination are defined by the interconnection<br />
test operation installed in the central cell (Figure 6-10). This section describes the stable states, and the<br />
functional (f_i) and conditional (c_i) methods of the central cell interconnection test operation transitions.<br />
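The common mechanics behind these method pairs can be sketched as follows (illustrative Python; the dispatch by method-name suffix is an assumption made for compactness, not the actual framework implementation):<br />

```python
def attempt_transition(op, name, target):
    """Run <name>_c; only if it returns True run <name>_f and change the state."""
    if not getattr(op, name + "_c")():
        return op.state               # conditional failed: the state is unchanged
    getattr(op, name + "_f")()
    op.state = target
    return op.state

class ToyOp:
    def __init__(self, ready):
        self.state = "halted"
        self.ready = ready
    def prepare_c(self):
        return self.ready             # e.g. "could all sub-system ops be started?"
    def prepare_f(self):
        pass                          # the actual work of the transition
```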
Initialization()<br />
This method stores the session_id parameter in an internal variable of the interconnection test operation<br />
instance. This number will be propagated to lower level cells when a cell command or operation is instantiated.<br />
<strong>The</strong> session_id is attached to every log record in order to help identify which client directly or indirectly<br />
executed a given action in a cell of the TSCS.<br />
Prepare_c()<br />
This method performs the following steps:<br />
Figure 6-10: Interconnection test operation.<br />
1. To read the IT_KEY row from the IT_CONF database table shown in Figure 6-11. This row contains two keys<br />
(TSC_KEY and TSP_KEY) and the cell operation names that have to be initiated in each of the central cells of<br />
those sub-systems involved in the interconnection test.<br />
2. To initiate the corresponding operation in the required trigger sub-system central cells with the same<br />
session_id provided by the central cell client. This method also initiates a configuration operation in the<br />
central cell. If one of these operations cannot be successfully started then this method returns false, the<br />
functional method of the prepare transition is not executed and the operation state stays halted.<br />
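These two steps can be sketched over a toy IT_CONF row (illustrative Python; the column names follow Figure 6-11, but the row content and the init_operation callback are invented):<br />

```python
# Toy IT_CONF table; a None in a *_IT_CLASS column means the sub-system
# does not take part in this particular interconnection test.
IT_CONF = {
    "it_key_1": {"TSC_KEY": "tsc_1", "TSP_KEY": "tsp_1",
                 "GT_IT_CLASS": "GtSenderTest",
                 "GMT_IT_CLASS": "GmtReceiverTest",
                 "RCT_IT_CLASS": None},
}

def prepare_c(it_key, session_id, init_operation):
    row = IT_CONF[it_key]                         # step 1: read the IT_KEY row
    started = []
    for column, op_class in sorted(row.items()):  # step 2: spawn sub-system ops
        if column.endswith("_IT_CLASS") and op_class:
            ok = init_operation(column[:-len("_IT_CLASS")], op_class, session_id)
            if not ok:
                return False, started             # stay halted on any failure
            started.append(op_class)
    return True, started
```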
Prepare_f()<br />
This method performs the following steps:<br />
1. To execute the configure and the partition transitions with the TSC_KEY and TSP_KEY keys respectively in<br />
the central cell configuration operation. This configures the TCS module in order to deliver the required<br />
TTC commands to the sender and/or to the receiver sub-systems. By reconfiguring the BX table of a given<br />
DAQ partition, the TCS can periodically send any sequence of TTC commands to a set of TTC partitions<br />
(i.e. senders or receivers or both). <strong>The</strong> usual configuration use case is that senders are waiting for a BC0<br />
signal 16 to start sending patterns, whilst the receiver systems do not need any TTC signal. <strong>The</strong> configuration<br />
operation is also used to configure intermediate trigger sub-systems in order to work in transparent mode.<br />
2. To execute the prepare transition in the interconnection test operation of each trigger sub-system central<br />
cell, passing the custom string parameter. This parameter is intended to be used by the sub-system<br />
interconnection test operation (Section 6.3.2.2).<br />
Start_c()<br />
This method checks if the interconnection test operation state of each trigger sub-system central cell is in<br />
prepared state. This method also checks if the configuration operation of the central cell is in partitioned state.<br />
If one of these operations is not in the expected state, this method returns false, the functional method of the<br />
start transition is not executed and the operation state stays prepared.<br />
Start_f()<br />
This method performs the following steps:<br />
IT_CONF table columns: IT_KEY (key), TSC_KEY, TSP_KEY, GT_IT_CLASS, GMT_IT_CLASS, DTTF_IT_CLASS,<br />
CSCTF_IT_CLASS, GCT_IT_CLASS, RCT_IT_CLASS, RPCTrig_IT_CLASS, ECAL_IT_CLASS, HCAL_IT_CLASS, DTSC_IT_CLASS.<br />
Figure 6-11: Main database table used by the central cell interconnection test operation.<br />
16 This TTC command signals the beginning of an LHC orbit.
1. To execute the start transition in the interconnection test operation of each trigger sub-system central cell.<br />
This enables input and output buffers on the receiver and sender sides respectively.<br />
2. To execute the enable transition in the configuration operation of the central cell. This enables the delivery<br />
of TTC commands to the sender and receiver sub-systems.<br />
Analyze_c()<br />
This method checks if the interconnection test operation state of each trigger sub-system central cell is in<br />
started state. This method also checks if the configuration operation of the central cell is in enabled state. If<br />
one of these operations is not in the expected state, this method returns false, the functional method of the<br />
analyze transition is not executed and the operation state stays started.<br />
Analyze_f()<br />
This method performs the following steps:<br />
1. To execute the suspend transition in the configuration operation of the central cell. This temporarily stops the<br />
delivery of TTC commands to the sender and receiver sub-systems.<br />
2. To execute the analyze transition in the interconnection test operation of each trigger sub-system central<br />
cell. This method retrieves the test result from the sub-systems and disables the input and output buffers on<br />
the receiver and sender sides respectively. Usually, the sender returns nothing and the receiver returns the<br />
result after comparing the expected patterns with the actual received patterns.<br />
Resume_c()<br />
This method checks in the interconnection test operation of each trigger sub-system central cell that the current<br />
state is analyzed. This method also checks if the configuration operation of the central cell is in suspended state.<br />
If one of these operations is not in the expected state, this method returns false, the functional method of the<br />
resume transition is not executed and the operation state stays analyzed.<br />
Resume_f()<br />
This method performs the following steps:<br />
1. To execute the resume transition in the interconnection test operation of each trigger sub-system central cell.<br />
This enables input and output buffers on the receiver and sender sides respectively.<br />
2. To execute the resume transition in the configuration operation of the central cell. This enables the delivery of<br />
TTC commands to the sender and receiver sub-systems.<br />
6.3.2.2 Sub-system cells<br />
<strong>The</strong> interconnection test operation interface running in the trigger sub-system cells is almost the same as the one<br />
running in the TS central cell (Figure 6-10), with the difference that the IT_KEY parameter does not exist. This<br />
section describes the stable states, and the functional (f_i) and conditional (c_i) methods of the sub-system cells<br />
interconnection test operation transitions. This description includes the crate cell and the trigger sub-system<br />
central cell cases. <strong>The</strong> following method descriptions do not match a concrete interconnection test example but<br />
describe the relevant aspects common to all the cases.<br />
Initialization()<br />
This method stores the session_id parameter in an internal variable of the interconnection test operation instance.<br />
This number will be propagated to lower level cells when a cell command or operation is instantiated. <strong>The</strong><br />
session_id is attached to every log record in order to help identify which client directly or indirectly executed a<br />
given action in a cell of the TSCS.<br />
Prepare_c()<br />
If the operation runs in the sub-system central cell, this method reads the custom parameter and initiates the<br />
interconnection test operation in the crate cells involved in the test. If the operation runs in a crate cell, this<br />
method checks if the hardware is accessible. If an operation cannot be started in the crate cells or the hardware is<br />
not accessible, this method returns false, the functional method of the prepare transition is not executed and the<br />
operation state stays halted.
Prepare_f()<br />
This method reads the custom parameter and executes the necessary actions to prepare the sub-system to perform<br />
the test according to this parameter. If the operation runs in the sub-system central cell, this method executes the<br />
prepare transition in the required interconnection test operation running in the lower level crate cells. If the<br />
operation runs in a crate cell, this method prepares the patterns to be sent or to be received.<br />
Start_c()<br />
If the operation runs in the sub-system central cell, this method checks if the current state of the interconnection<br />
test operation running in the crate cells is prepared. If the operation runs in a crate cell, this method checks if the<br />
hardware is accessible. If one of these checks fails, this method returns false, the functional method of the start<br />
transition is not executed and the operation state stays prepared.<br />
Start_f()<br />
If the operation runs in the sub-system central cell, this method executes the start transition in the interconnection<br />
test operation running in the lower level crate cells. If the operation runs in a crate cell, this method enables the<br />
input or the output buffers depending on whether the crate is on the receiver or on the sender side.<br />
Analyze_c()<br />
If the operation runs in the sub-system central cell, this method checks if the current state of the interconnection<br />
test operation running in the crate cells is started. If the operation runs in a crate cell, this method checks if the<br />
hardware is accessible. If one of these checks fails, this method returns false, the functional method of the<br />
analyze transition is not executed and the operation state stays started.<br />
Analyze_f()<br />
If the operation runs in the sub-system central cell, this method executes the analyze transition in the<br />
interconnection test operation running in the lower level crate cells, gathers the results and returns them to the<br />
central cell. If the operation runs in a crate cell, this method compares the expected patterns, prepared during the<br />
prepare transition, against the received ones and returns the result to the sub-system central cell.<br />
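The receiver-side comparison can be sketched as follows (illustrative Python; the pattern representation and the result format are invented):<br />

```python
def analyze_patterns(expected, received):
    """Compare the prepared patterns against the captured ones, word by word."""
    mismatches = [(i, e, r)
                  for i, (e, r) in enumerate(zip(expected, received))
                  if e != r]
    passed = not mismatches and len(expected) == len(received)
    return {"passed": passed, "mismatches": mismatches}
```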
Resume_c()<br />
If the operation runs in the sub-system central cell, this method checks if the current state of the interconnection<br />
test operation running in the crate cells is analyzed. If the operation runs in a crate cell, this method checks if the<br />
hardware is accessible. If one of these checks fails, this method returns false, the functional method of the<br />
resume transition is not executed and the operation state stays analyzed.<br />
Resume_f()<br />
If the operation runs in the sub-system central cell, this method executes the resume transition in the<br />
interconnection test operation running in the lower level crate cells. If the operation runs in a crate cell, this<br />
method re-enables the input or the output buffers depending on whether the crate is on the receiver or on the<br />
sender side.<br />
6.4 Monitoring<br />
6.4.1 Description<br />
<strong>The</strong> TS monitoring service provides access to the monitoring information of the L1 decision loop hardware. This<br />
service is implemented using the TSMS presented in Section 5.4.2. <strong>The</strong> HTTP/CGI interface of the monitor<br />
collector provides remote access to the monitoring information.<br />
Event data based monitoring system<br />
A second source of monitoring information is the event data. For instance, the GTFE board is designed to gather<br />
monitoring information from almost all boards of the GT and to send this information as an event fragment every<br />
time that the GT receives a L1A. <strong>The</strong>refore, an online monitoring system for the GT could be based on<br />
extracting this data from the corresponding event fragment. This approach would be very convenient because<br />
every event would contain precise monitoring information of the L1 hardware status for the corresponding bunch
crossing (BX). In addition, this approach would not require the development of a complex monitoring software<br />
infrastructure. On the other hand, we would face two limitations:<br />
• <strong>The</strong> GT algorithm rates are accumulated in the Final Decision Logic (FDL) board and the current version of<br />
the GTFE board cannot access its memories and registers. <strong>The</strong> only way to read out the rate counters is<br />
through VME access.<br />
• <strong>The</strong> GTFE board will send event fragments only when the DAQ infrastructure is running.<br />
<strong>The</strong>se limitations could be overcome using the TS monitoring service. This is meant to be an “always on”<br />
infrastructure (Section 5.2.5) and to provide an HTTP/CGI interface to access all monitoring items, and<br />
specifically the GT algorithm rate counters. <strong>The</strong>refore, the TS monitoring service is the only feasible approach to<br />
read out the GT algorithm rates and to achieve an “always on” external system depending on this information.<br />
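An external client of the monitor collector could look as follows (illustrative Python; the URL, the fetch CGI parameter and the name=value response format are all invented, since the concrete HTTP/CGI interface is not specified here):<br />

```python
from urllib.parse import urlencode

def rate_counter_url(collector_url, items):
    """Build the query URL for a set of monitoring items (hypothetical CGI)."""
    return "%s?%s" % (collector_url, urlencode({"fetch": ",".join(items)}))

def parse_rates(response_text):
    """Parse an assumed 'name=value' per-line response into float counters."""
    rates = {}
    for line in response_text.splitlines():
        name, _, value = line.partition("=")
        if value:
            rates[name.strip()] = float(value)
    return rates
```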
6.5 Graphical user interfaces<br />
<strong>The</strong> HTTP/CGI interface of every cell facilitates the generic TS web-based GUI presented in Section 4.4.4.11.<br />
This is automatically generated and provides a homogeneous look and feel to control any sub-system cell<br />
independent of the operations, commands and monitoring customization details. <strong>The</strong> generic TS GUI of the<br />
DTTF, GT, GMT and RCT was extended with control panel plug-ins. <strong>The</strong> following section presents the Global<br />
<strong>Trigger</strong> control panel example [90].<br />
6.5.1 Global <strong>Trigger</strong> control panel<br />
<strong>The</strong> GT control panel is integrated into the generic TS GUI of the GT cell. It uses the GT cell software in order<br />
to get access to the GT hardware. This control panel has the following features:<br />
• Monitoring and control of the GT hardware: <strong>The</strong> GT Control Panel implements the most important<br />
functionalities to monitor and control the GT hardware. That includes monitoring of the counters and the<br />
TTC detector partitions assigned to DAQ partitions, setting the time slots, enabling and disabling the TTC<br />
sub-detectors for a given DAQ partition, setting the FDL board mask, starting a run, stopping a run, starting<br />
random triggers, stopping random triggers, changing the frequency and step size for random triggers, and<br />
resynchronizing and resetting each of the DAQ partitions.<br />
• Configuration database population tool: <strong>The</strong> GT Control Panel allows hardware experts to create<br />
configuration entries in the configuration database without the need of any knowledge of the underlying<br />
database schema.<br />
• Access control integration: <strong>The</strong> GT Control Panel supports different access control levels. Depending on<br />
the user logged in (i.e. an expert, a shifter or a guest) the panel visualizes different information and allows<br />
different tasks to be performed.<br />
• <strong>Trigger</strong> menu generation: the GT Control Panel allows the visualization and modification of the trigger<br />
menu. <strong>The</strong> trigger menu is the high-level description of the algorithms that will be used to select desired<br />
physics events. For each algorithm it is possible to visualize and modify the name, algorithm number, prescale<br />
factor, algorithm description and condition properties (i.e. threshold, quality, etc.)<br />
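A trigger menu entry as exposed by the panel could be represented as follows (illustrative Python; the field names mirror the text above, while the class layout and the helper function are invented):<br />

```python
from dataclasses import dataclass, field

@dataclass
class Algorithm:
    """One trigger menu entry: name, number, prescale, description, conditions."""
    name: str
    number: int
    prescale_factor: int = 1
    description: str = ""
    conditions: dict = field(default_factory=dict)  # e.g. {"threshold": 7}

def set_prescale(menu, name, prescale_factor):
    """Modify the prescale factor of one algorithm, as the panel would."""
    for algo in menu:
        if algo.name == name:
            algo.prescale_factor = prescale_factor
            return algo
    raise KeyError(name)
```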
Figure 6-12 presents a view of the GT control panel where it is shown which TTC partitions (32 columns) are<br />
assigned to each of the eight DAQ partitions (8 rows). <strong>The</strong> red color means that a given TTC partition is not<br />
connected.
Figure 6-12: GT control panel view showing the current partitioning state.
Chapter 7<br />
Homogeneous <strong>Supervisor</strong> and Control<br />
Software Infrastructure for the <strong>CMS</strong> Experiment<br />
at SLHC<br />
This chapter presents a project proposal to homogenize the supervisory control, data acquisition, and control<br />
software infrastructure for an upgraded <strong>CMS</strong> experiment at the SLHC. Its advantage is a unique, modular<br />
development platform enabling an efficient use of manpower and resources.<br />
7.1 Introduction<br />
This proposal aims to develop the <strong>CMS</strong> Experiment Control System (ECS) based on a new supervisory and<br />
control software framework. We propose a homogeneous technological solution for the <strong>CMS</strong> infrastructure of<br />
<strong>Supervisor</strong>y Control And Data Acquisition (SCADA [99]). <strong>The</strong> current <strong>CMS</strong> software control system consists of<br />
the Run Control and Monitoring System (R<strong>CMS</strong>), the Detector Control System (DCS), the <strong>Trigger</strong> <strong>Supervisor</strong><br />
(TS), and the Tracker, ECAL, HCAL, DT and RPC sub-detector supervisory systems. This infrastructure is<br />
based on three major supervisor and control software frameworks: PVSSII (Section 1.4.2), R<strong>CMS</strong> (Section<br />
1.4.1) and TS (Chapter 4). In addition, each sub-detector has created its own SCADA software.<br />
A single SCADA software framework used by all <strong>CMS</strong> sub-systems would have advantages for the<br />
maintenance, support and operation tasks during the experiment life-cycle:<br />
1) Overall design strategy optimization: <strong>The</strong>re is an evident similarity in technical requirements for controls<br />
amongst the different levels of the experiment control system. A common SCADA framework will allow an<br />
overall optimization of requirements, design and implementation.<br />
2) Support and maintenance resources: <strong>The</strong> project should enable an efficient use of resources. A common<br />
SCADA infrastructure for <strong>CMS</strong> will manage the increasing complexity of the experiment control and reduce<br />
the effects of current and future constraints on manpower.<br />
3) Accelerated learning curve: Operators and developers will benefit from a common SCADA infrastructure<br />
due to: 1) One-time learning cost, 2) Moving between <strong>CMS</strong> control levels and sub-systems will not imply a<br />
change in technology.<br />
This project proposal is based on the evolution of the software infrastructure used to integrate the L1 trigger sub-systems.<br />
Section 7.2 presents the project technology baseline and the criteria for its selection. Section 7.3<br />
presents an overview of the project road map. Finally, Section 7.4 outlines the project schedule and the required<br />
human resources.<br />
7.2 Technology baseline<br />
<strong>The</strong> design and development of the unique underlying supervisory and control infrastructure should initially start<br />
from the software framework currently used to implement the L1 trigger control software system, i.e. the TS
framework. <strong>The</strong> following paragraphs describe the principal objective criteria for which this technological<br />
baseline has been chosen:<br />
1) Proven technology: It is used in the implementation of a supervisory and control system that coordinates<br />
the operation of all L1 trigger sub-systems, the TTC system, the LMS and to some extent the ECAL, HCAL,<br />
DT and RPC sub-detectors. This solution was successfully used during the second phase of the Magnet Test<br />
and Cosmic Challenge, has been used in the monthly commissioning exercises of the <strong>CMS</strong> Global Runs and<br />
is the official solution for the experiment operation.<br />
2) Homogeneous TriDAS infrastructure and support: <strong>The</strong> TS framework is based on XDAQ, which is the<br />
same middleware used by the DAQ event builder (Section 1.4.3). This component is a key part of the DAQ<br />
system and as such it is not likely to evolve towards a different underlying middleware. <strong>The</strong>refore, a<br />
supervisory and control software framework based on the XDAQ middleware could profit from a long term,<br />
in-house supported solution. In addition, a SCADA infrastructure based on the XDAQ middleware would<br />
homogenize the underlying technologies for the DAQ and for the supervisory control infrastructure that<br />
would automatically reduce the overall support and maintenance effort.<br />
3) Simplified coordination and support tasks: <strong>The</strong> TS framework is designed to reduce the gap between<br />
software experts and experimental physicists and to reduce the learning curve. Examples are the usage of<br />
well known models in HEP control systems like finite state machines or homogeneous integration<br />
methodologies independent of the concrete sub-system Online SoftWare Infrastructure (OSWI) and<br />
hardware setup, or the automatic creation of graphical user interfaces. <strong>The</strong> latter is a development<br />
methodology characterized by a modular upgrading process and one single visible software framework.<br />
4) C++: <strong>The</strong> OSWI of all sub-systems is mainly formed by libraries written in C++ running on x86/Linux<br />
platforms. <strong>The</strong>se are intended to hide hardware complexity from software experts. <strong>The</strong>refore, a SCADA<br />
infrastructure based on C++, like the TS framework, would simplify the complexity of the integration<br />
architecture.<br />
7.3 Road map<br />
This project aims to reach the technological homogenization of the <strong>CMS</strong> Experiment Control System following a<br />
progressive and non-disruptive strategy. This shall allow a gradual and smooth transition from the current<br />
SCADA infrastructure to the proposed one. An adequate approach could have the following project tasks:<br />
1) L1 trigger incremental development: Continue with the current development and maintenance process in<br />
the L1 trigger using the proposed framework.<br />
2) Sub-detector control and supervisory software integration: This task involves the incremental adoption<br />
of a common software framework for all sub-detectors in order to homogenize the control and supervisory<br />
software of <strong>CMS</strong>. <strong>The</strong> participating sub-detectors are ECAL, HCAL, DT, CSC, RPC, and Tracker.<br />
Currently, this step is partially achieved because all sub-detectors are partially integrated with the TS system<br />
in order to: 1) Automate the pattern tests between the sub-detector TPG’s and the regional trigger systems,<br />
2) Check configuration consistency between L1 trigger and the trigger primitive generators.<br />
3) L1 trigger emulators supervisory system: This task involves the upgrade of the supervisory software of<br />
the L1 trigger emulators to the proposed common framework. <strong>The</strong> hardware emulators of the L1 trigger<br />
have been deployed as components of the <strong>CMS</strong>SW framework [100]. This task does not involve any change<br />
in the emulator code or in the <strong>CMS</strong>SW framework.<br />
4) High Level <strong>Trigger</strong> (HLT) supervisory system: This task involves the upgrade of the supervisory<br />
software of the HLT to the proposed common framework. In this way the components of the HLT (filter<br />
units, slice supervisors, and storage managers) will be launched, configured and monitored as the other<br />
software components of the <strong>CMS</strong> online software [101]. This task does not involve any change on the<br />
supervised components.<br />
5) Event builder supervisory system: This task involves the deployment of the event builder supervisory<br />
system as nodes of the proposed framework. <strong>The</strong> event builder supervisory software will launch all software<br />
components, will configure and will monitor the Front-End Readout Links (FRL), the Front-End Driver<br />
Network (FED Builder Network), and the different slices of Event Managers (EVM), Builder Units (BU)
and Readout Units (RU). This task does not involve the modification of the event builder components<br />
(Section 1.4.3).<br />
6) Experiment Control System feasibility study and final homogenization step: This is the last stage of the<br />
homogenization process. This task involves the feasibility study to change the top layer of the ECS and,<br />
afterwards, its substitution by components of the proposed framework. This means the substitution of the<br />
Function Managers by the nodes of the proposed SCADA software. This task also involves the feasibility<br />
study and homogenization of the top software layer of the DCS in order to be supervised, controlled and<br />
monitored by the ECS (Section 1.4.2).<br />
7.4 Schedule and resource estimates<br />
Schedule and resource estimates have been approximated with the COCOMO II model [102], assuming the<br />
delivery of 50000 new Source Lines Of Code (SLOC), the modification of 10000 SLOC and the reuse of 30000<br />
SLOC, with the model parameters rated for a project of average complexity. <strong>The</strong> SLOC effort has been<br />
estimated using the development experience with the TS and R<strong>CMS</strong> frameworks. Additional assumptions are a<br />
development team working in an in-house environment, with extensive experience with related systems and a<br />
thorough understanding of how the system under development will contribute to the objectives of <strong>CMS</strong>.<br />
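As an illustration of the estimation method, the nominal-parameter form of the COCOMO II equations can be sketched in a few lines. <strong>The</strong> coefficients below are the published nominal values, not the calibrated cost drivers behind the thesis estimate, and the 0.5 and 0.3 adaptation factors for modified and reused code are assumptions, so the output only approximates the figures of Tables 7-1 and 7-2.<br />

```python
# Hedged sketch of a COCOMO II estimate with nominal coefficients.
# The adaptation factors (0.5 for modified, 0.3 for reused code) are
# illustrative assumptions, not the calibration used in the thesis.

def cocomo2(new_ksloc, modified_ksloc, reused_ksloc,
            a=2.94, b=1.0997, c=3.67):
    # Equivalent size in KSLOC: modified and reused code count only partially.
    size = new_ksloc + 0.5 * modified_ksloc + 0.3 * reused_ksloc
    effort = a * size ** b                               # person-months
    schedule = c * effort ** (0.28 + 0.2 * (b - 0.91))   # months
    return effort, schedule

effort, schedule = cocomo2(50, 10, 30)   # 50k new, 10k modified, 30k reused SLOC
print(f"effort ~ {effort:.0f} person-months, schedule ~ {schedule:.0f} months")
```

With these nominal values the sketch lands in the same range as the total effort quoted below; the schedule comes out shorter, as expected for a calculation that ignores the rated cost drivers.<br />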
<strong>The</strong> four project phases are: 1) Inception: This phase includes the analysis of requirements, system definitions,<br />
specification and prototyping of user interfaces, and cost estimation; 2) Elaboration: This period is meant to<br />
define the software architecture and test plan; 3) Construction: this includes the coding and testing phases; 4)<br />
Transition: this last phase includes the final release delivery and set up of support and maintenance<br />
infrastructure.<br />
Table 7-1 shows the schedule for the project phases and the required resources per phase in person-months. This<br />
estimate includes the resources to deliver the infrastructure stated in Section 7.3: all templates, standard elements<br />
and functions required to achieve a homogeneous system and to reduce as much as possible the development<br />
effort for the sub-system integration developers. This estimate does not include the sub-system integration,<br />
which follows the transition phase.<br />
Phase Phase effort (Person-months) Schedule (Months)<br />
Inception 16 3<br />
Elaboration 64 8<br />
Construction 199 14<br />
Transition 32 13<br />
Table 7-1: Project phases schedule and associated effort in person-months.<br />
We summarize in Table 7-2 the top-level resource and schedule estimate of the project.<br />
Total effort (Person-months) 311<br />
Schedule (months) 38<br />
Table 7-2: Top-level estimate for elaboration and construction.
Chapter 8<br />
Summary and Conclusions<br />
<strong>The</strong> life span of the last generation of HEP experiment projects is of the same order of magnitude as a human<br />
being’s life, and both the experiment’s and the human being’s life phases share a number of analogies:<br />
During the conception period of a HEP experiment, key people discuss the feasibility of a new project. For<br />
instance, the initial informal discussions about <strong>CMS</strong> started in 1989 and continued for nearly three years. This<br />
period finished with a successful conceptual design (<strong>CMS</strong> Letter of intent, 1992). In a similar way, the<br />
conception of a human being would follow a dating period and the decision of having a common life project.<br />
Right after the conceptual design, the research and prototyping phase starts. During this period, research and<br />
prototyping tasks are performed in order to prove the feasibility of the initial design. A successful culmination<br />
of this period is the release of a number of Technical Design Reports (TDR’s) describing the design details, the<br />
project schedule and organization. For the <strong>CMS</strong> experiment this period lasted until the year 2002. This second<br />
period is similar to human childhood and infancy, where the child grows up, experiments with her<br />
environment, learns the basic knowledge for life and roughly plans what she wants to be when she<br />
grows up.<br />
<strong>The</strong> next stage in the life of a HEP experiment is the development phase. During this time, the building blocks<br />
described in the individual TDR’s are produced. For the <strong>CMS</strong> experiment this period lasted approximately until<br />
early 2007. Following the analogy of the human being, this period could be similar to the education period<br />
spent in high school and college, where the adolescent studies several different subjects.<br />
Before being operational, the building blocks produced during the development phase need to be assembled and<br />
commissioned. <strong>The</strong> <strong>CMS</strong> commissioning exercises started in 2006 with the Magnet Test and Cosmic Challenge and<br />
continued during 2007 with a monthly, incremental commissioning exercise known as the Global Run.<br />
This is similar to what happens to recent graduates starting their careers with a traineeship in a company or<br />
research institute: they learn how to use the knowledge acquired during their education in order to perform<br />
a concrete task.<br />
After a successful commissioning period the experiment is ready for operation. <strong>The</strong> <strong>CMS</strong> experiment is expected<br />
to be operational for at least 20 years. During this phase, periodic sub-system upgrades will be necessary to cope<br />
with radiation damage or with new requirements due to the SLHC luminosity upgrade. This period would be like<br />
the adult professional life, when the person is fully productive and needs to periodically undergo medical checks<br />
or refresh her knowledge in order to keep up with the continuous evolution of the job market.<br />
Finally, the experiment will be decommissioned at the end of its operational life. <strong>The</strong> analogy also works in this<br />
case, because at the end of a successful career a person will also retire.<br />
<strong>The</strong> long life span is not the only complexity dimension of the last generation of HEP experiments that finds a<br />
good analogy in the metaphor of the human being. <strong>The</strong> sheer number of collaborating sub-systems is<br />
likewise astonishing on both sides.<br />
We have discussed the time scale and complexity similarities between human beings and HEP experiments, but<br />
we can still go further in this analogy and ask: “What is the experiment’s genetic material?” In other words, what<br />
is the seed of a HEP experiment project? It cannot be people, because only a few collaboration members stay<br />
during the whole lifetime of the experiment. <strong>The</strong> answer is that the experiment’s genetic material is the<br />
knowledge consisting of successful ideas applied in past experiments and of novel contributions from other<br />
fields which promise improved results. This set of ideas is a potential future HEP experiment.<br />
And people? Where do the members of the collaboration fit? In this analogy, the scientists, engineers and<br />
technicians are responsible for transmitting and expressing the experiment’s genetic material. In other words, the<br />
collaboration members are the hosts of the experiment DNA and are also responsible for its expression in actual<br />
experiment body parts. <strong>The</strong>refore, even though some people are better able than others to transmit and express<br />
the experiment DNA, no one is indispensable.<br />
<strong>The</strong> metaphor between the most advanced HEP experiments and human beings serves the author to explain<br />
how this thesis contributed to <strong>CMS</strong>, and to the HEP and scientific communities. <strong>The</strong> following sections<br />
summarize the contributions of this work to both the <strong>CMS</strong> body or experiment, and the <strong>CMS</strong> DNA or knowledge<br />
base of the <strong>CMS</strong> collaboration and HEP communities.<br />
8.1 Contributions to the <strong>CMS</strong> genetic base<br />
This work encompasses a number of ideas intended to enhance the expression of a concrete <strong>CMS</strong> body part, the<br />
control and hardware monitoring system of the L1 trigger or <strong>Trigger</strong> <strong>Supervisor</strong> (TS). A successful final design<br />
was reached not just by gathering a detailed list of functional requirements. It was necessary to understand the<br />
complexity of the task, and the most promising technologies had to be proven.<br />
<strong>The</strong> unprecedented number of hardware items, the long periods of preparation and operation, and the human and<br />
political context were presented as three complexity dimensions related to building hardware management<br />
systems for the latest generation of HEP experiments. <strong>The</strong> understanding of the problem context and associated<br />
complexity, together with the experience acquired with an initial generic solution, guided us to the conceptual<br />
design of the <strong>Trigger</strong> <strong>Supervisor</strong>.<br />
8.1.1 XSEQ<br />
An initial generic solution to the thesis problem context proposed a software environment to describe<br />
configuration, control and test systems for data acquisition hardware devices. <strong>The</strong> design followed a model that<br />
matched well the extensibility and flexibility requirements of a long lifetime experiment that is characterized by<br />
an ever-changing environment. <strong>The</strong> model builds upon two points: 1) the use of XML for describing hardware<br />
devices, configuration data, test results, and control sequences; and 2) an interpreted, run-time extensible, high-level<br />
control language for these sequences that provides independence from a specific host platform and from<br />
interconnect systems to which devices are attached. <strong>The</strong> proposed approach has several advantages:<br />
• <strong>The</strong> uniform usage of XML assures a long-term technological investment and reduced in-house<br />
development, thanks to the large existing body of standards and tools.<br />
• <strong>The</strong> interpreted approach enables the definition of platform independent control sequences. <strong>The</strong>refore, it<br />
enhances the sub-system platform upgrade process.<br />
<strong>The</strong> syntax of an XML-based programming language (XSEQ, XML-based sequencer) was defined. It was shown<br />
how an adequate use of XML schema technology facilitated the decoupling of syntax and semantics, and<br />
therefore enhanced the sharing of control sequences among heterogeneous sub-system platforms.<br />
An interpreter for this language was developed for the CERN Scientific Linux (SLC3) platform. It was shown<br />
that the performance of an interpreter for an XML-based programming language oriented to hardware control<br />
could be at least as good as that of an interpreter for a HEP standard language for hardware control.<br />
<strong>The</strong> model implementation was integrated into a distributed programming framework specifically designed for<br />
data acquisition in the <strong>CMS</strong> experiment (XDAQ). It was shown that this combination could be the architectural<br />
basis of a management system for DAQ hardware. A feasibility study of this software defined a number of<br />
standalone applications for different <strong>CMS</strong> hardware modules and a hardware management system to remotely<br />
access these heterogeneous sub-systems through a uniform web service interface.
8.1.2 <strong>Trigger</strong> <strong>Supervisor</strong><br />
<strong>The</strong> experience acquired during this initial research together with the L1 trigger operation requirements seeded<br />
the conceptual design of the <strong>Trigger</strong> <strong>Supervisor</strong>. It consists of a set of functional and non-functional<br />
requirements, the architecture design together with a few technological proposals, and the project tasks and<br />
organization details.<br />
<strong>The</strong> functional purpose of the TS is to coordinate the operation of the L1 trigger and to provide a flexible<br />
interface that hides the burden of this coordination. <strong>The</strong> required operation capabilities had to simplify the<br />
process of configuring, testing and monitoring the hardware. Additional functionalities were required for<br />
troubleshooting, error management, user support, access control and start-up purposes. <strong>The</strong> non-functional<br />
requirements were also discussed. <strong>The</strong>se take into account the magnitude of the infrastructure under control, the<br />
implications related to the periodic hardware and software upgrades necessary in a long-lived experiment like<br />
<strong>CMS</strong>, the particular human and political context of the <strong>CMS</strong> collaboration, the required long term support and<br />
maintenance, the limitations of the existing <strong>CMS</strong> online software infrastructure and the particularities of the<br />
operation environment of the <strong>CMS</strong> Experiment Control System.<br />
<strong>The</strong> design of the TS architecture fulfills the functional and non-functional requirements. This architecture<br />
identifies three main development layers: the framework, the system and the services. <strong>The</strong> framework is the<br />
software infrastructure that provides the main building block, the cell, and the integration with the specific sub-system<br />
OSWI. <strong>The</strong> system is a distributed software architecture built out of these building blocks. Finally, the<br />
services are the L1 trigger operation capabilities implemented on top of the system as a collaboration of finite<br />
state machines running in each of the cells.<br />
<strong>The</strong> decomposition of the project development tasks into three layers enhances the coordination of the<br />
development tasks; and helps to keep a stable system, in spite of hardware and software upgrades, on top of<br />
which new operation capabilities can be implemented without software engineering expertise.<br />
8.1.3 <strong>Trigger</strong> <strong>Supervisor</strong> framework<br />
<strong>The</strong> TS framework is the lowest level layer of the TS. It consists of the basic software infrastructure delivered to<br />
the sub-systems to facilitate their integration. This infrastructure is based on the XDAQ middleware and a few<br />
external libraries. XDAQ was chosen among the <strong>CMS</strong> officially supported distributed programming frameworks<br />
(namely XDAQ, R<strong>CMS</strong> and JCOP) as the baseline solution because it offered the best trade-off between<br />
infrastructure completeness and fast sub-system integration. Although XDAQ was the best available option,<br />
further development was needed to reach the usability required by a community of customers with no software<br />
engineering background and limited time dedicated to software integration tasks.<br />
<strong>The</strong> cell is the main component of the additional software infrastructure. This component is an XDAQ application<br />
that needs to be customized by each sub-system in order to integrate with the <strong>Trigger</strong> <strong>Supervisor</strong>. <strong>The</strong><br />
customization process has the following characteristics:<br />
• Based on Finite State Machines (FSM): <strong>The</strong> integration of a sub-system with the TS consists of defining<br />
FSM plug-ins. A FSM model was chosen because this is a well known approach to define control systems<br />
for HEP experiments and therefore it would accelerate the customer’s learning curve. FSM plug-ins wrap<br />
the usage of the sub-system OSWI and offer a stable remote interface despite software platform and<br />
hardware upgrades.<br />
• Simple: Additional facilities were also delivered to the sub-systems in order to simplify the customization<br />
process. <strong>The</strong> most important one is the xhannel API. It provides a simple and homogeneous interface to a<br />
wide range of external services: other cells, XDAQ applications and web services.<br />
• Automatically generated GUI: A mechanism to automatically generate the cell GUI reduced the<br />
customization time and facilitated a common look and feel for all sub-systems’ graphical setups. <strong>The</strong><br />
common look and feel improved the learning curve for new L1 trigger operators.<br />
• Remote interface: <strong>The</strong> cell provided a human and a machine interface based on the HTTP/CGI and the<br />
SOAP protocols respectively, fitting well the web services based model of the <strong>CMS</strong> Online SoftWare<br />
Infrastructure (OSWI). This interface facilitated the remote operation of the sub-system specific FSM plug-ins.<br />
This interface could also be enlarged with custom functionalities using command plug-ins.
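<strong>The</strong> FSM-based customization model described above can be sketched as a minimal state machine. All names below (FSMPlugin, the states, the handlers) are illustrative assumptions, not the actual TS framework API; the handlers stand in for calls into the sub-system OSWI.<br />

```python
# Minimal sketch of an FSM plug-in in the spirit of the cell customization
# model. Names are illustrative, not the real TS framework API; the handlers
# stand in for the sub-system specific hardware access they would wrap.

class FSMPlugin:
    def __init__(self, initial, transitions):
        # transitions: (current_state, event) -> (next_state, handler)
        self.state = initial
        self.transitions = transitions

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        next_state, handler = self.transitions[key]
        handler()                 # wraps the sub-system OSWI call
        self.state = next_state

# A hypothetical sub-system cell going halted -> configured -> enabled.
log = []
plugin = FSMPlugin("halted", {
    ("halted", "configure"):  ("configured", lambda: log.append("load LUTs")),
    ("configured", "enable"): ("enabled",    lambda: log.append("start triggers")),
})
plugin.fire("configure")
plugin.fire("enable")
print(plugin.state, log)
```

Because the remote interface only exposes the FSM events, the hardware access wrapped inside the handlers can change without affecting clients, which is the stability property argued above.<br />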
8.1.4 <strong>Trigger</strong> <strong>Supervisor</strong> system<br />
<strong>The</strong> intermediate layer of the TS is the TS System (TSS). It provides a stable layer on top of which the TS<br />
services have been implemented. <strong>The</strong> TS system is designed to require little maintenance and to provide a<br />
methodology for developing services which can fit present and future experiment operational requirements. In this<br />
scheme, the development of new services requires very limited knowledge of the internals of the TS<br />
framework; it only needs to follow a well defined methodology. <strong>The</strong> stable TS system together with the<br />
associated methodology makes it possible to accommodate new functionalities in a non-disruptive way, without<br />
requiring major developments.<br />
<strong>The</strong> TSS consists of four distributed software systems with well defined functionalities: TS Control System<br />
(TSCS), TS Monitoring System (TSMS), TS Logging System (TSLS) and TS Start-up System (TSSS). <strong>The</strong><br />
following points describe the design principles:<br />
• Reduced number of basic building blocks: <strong>The</strong> TSS is based solely on the sub-system cells and on already<br />
existing monitoring, logging and start-up components provided by the XDAQ and R<strong>CMS</strong> frameworks.<br />
Reusing XDAQ and R<strong>CMS</strong> components minimized the development effort and at the same time guaranteed<br />
long-term support and maintenance. A reduced number of basic building blocks also helped to<br />
communicate architectural concepts.<br />
• Nodes and connections without logic: <strong>The</strong> TSCS is a collection of nodes and the communication channels<br />
among them. It does not include the logic of the L1 decision loop operation capabilities. This is<br />
implemented one layer above following a well defined methodology. <strong>The</strong> improved modularity obtained by<br />
decoupling the stable infrastructure (TSCS) from the L1 trigger operation capabilities eases the distribution<br />
of development tasks. Sub-system experts and technical coordinators were responsible for maintaining<br />
and/or implementing L1 trigger operation capabilities, whilst the TS central team focused on assuring a<br />
stable TSCS.<br />
• Hierarchical control system: It is shown how a hierarchical topology for the TSCS enhances distributed<br />
development, facilitates the independent operation of a given sub-system, simplifies a partial deployment<br />
and provides graceful system degradation.<br />
• Well defined sub-system integration model: <strong>The</strong> integration of each sub-system is done according to<br />
guidelines proposed by the TS central team. <strong>The</strong>se are intended to maximize the deployment of the TSS in<br />
different set-ups, and to ease the hardware evolution without affecting the services layer that provides<br />
the L1 trigger operation capabilities.<br />
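<strong>The</strong> hierarchical topology and graceful degradation argued above can be illustrated with a toy control tree; the tree layout below is chosen for illustration only and does not reproduce the actual TSCS deployment.<br />

```python
# Toy sketch of a hierarchical control tree with graceful degradation:
# a node that cannot be configured is reported, but its siblings proceed.
# The tree layout is illustrative, not the actual TSCS deployment.

class Node:
    def __init__(self, name, children=(), healthy=True):
        self.name = name
        self.children = list(children)
        self.healthy = healthy

    def configure(self):
        """Configure this node, then recurse; collect failures instead of aborting."""
        failed = [] if self.healthy else [self.name]
        for child in self.children:
            failed += child.configure()
        return failed

central = Node("central", [
    Node("GT"),
    Node("GMT", [Node("DTTF"), Node("CSCTF", healthy=False)]),
    Node("GCT"),
])
print("unreachable nodes:", central.configure())
```

A partial deployment is simply a smaller tree, and an unavailable sub-system degrades the report rather than the whole operation, which mirrors the properties listed above.<br />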
8.1.5 <strong>Trigger</strong> <strong>Supervisor</strong> services<br />
<strong>The</strong> TS services are the L1 decision loop operation capabilities. <strong>The</strong> current services cover the final functionalities<br />
identified during the conceptual design. <strong>The</strong>se have been implemented on top of the TS system, according to<br />
the proposed methodology. <strong>The</strong> following services were presented:<br />
• Configuration: This is the main service provided by the TS. It facilitates the configuration of the L1<br />
decision loop. Up to eight remote clients can use this service simultaneously without risking inconsistent<br />
configurations of the L1 decision loop. <strong>The</strong> configuration information (e.g. firmware, LUT’s, registers) is<br />
retrieved from the configuration database using a database identifier provided by the client. R<strong>CMS</strong> uses the<br />
remote interface provided by the central node of the TS in order to configure the L1 decision loop.<br />
• Interconnection test: It is intended to automatically check the connections between sub-systems. From the<br />
client point of view, the interconnection test service is another operation running in the TS central cell.<br />
• Logging and start-up services: <strong>The</strong>y are provided by the corresponding TS logging and start-up systems<br />
and did not require any further customization process.<br />
• Monitoring: This service, facilitated by the TS monitoring system, provides access to the monitoring<br />
information of the L1 decision loop hardware. It is designed to be an “always on” source of monitoring<br />
information, regardless of the availability of the DAQ system.<br />
• Graphical User Interface (GUI): This service is facilitated by the HTTP/CGI interface of every cell. It is<br />
automatically generated and provides a homogeneous look and feel to control any sub-system cell
independent of the operations, commands and monitoring customization details. It was also shown that the<br />
generic TS GUI could be extended with subsystem specific control panels.<br />
8.1.6 <strong>Trigger</strong> <strong>Supervisor</strong> Continuation<br />
A continuation line for the TS was presented. <strong>The</strong> project proposal is intended to homogenize the <strong>Supervisor</strong>y<br />
Control And Data Acquisition infrastructure (SCADA) for the <strong>CMS</strong> experiment. A single SCADA software<br />
framework used by all <strong>CMS</strong> sub-systems would have advantages for the maintenance, support and operation<br />
tasks during the experiment operational life. <strong>The</strong> proposal is based on the evolution of the TS framework. A<br />
tentative schedule and resource estimates were also presented.<br />
8.2 Contribution to the <strong>CMS</strong> body<br />
<strong>The</strong> main initial goal of this PhD thesis was to build a tool to operate the L1 trigger decision loop and to integrate<br />
it in the overall Experiment Control System. This objective has been achieved: <strong>The</strong> <strong>Trigger</strong> <strong>Supervisor</strong> has<br />
become a real body part of the <strong>CMS</strong> experiment and it serves its purpose.<br />
Periodic demonstrators brought the TS to its first joint operation with the Experiment Control System in<br />
November 2006, during the second phase of the Magnet Test and Cosmic Challenge ([103], p. 9). It has<br />
continued improving and serving every monthly commissioning exercise since May 2007, and it is the official tool<br />
of the <strong>CMS</strong> experiment to operate the L1 decision loop ([104], p. 190).<br />
Using the introductory analogy, the <strong>CMS</strong> Experiment Control System would be the experiment brain, and the<br />
<strong>Trigger</strong> <strong>Supervisor</strong> a specialized brain module, just like the human brain is thought to be divided into specialized<br />
units, for instance to turn sounds into speech or to recognize a face. <strong>The</strong> development of the <strong>CMS</strong> <strong>Trigger</strong><br />
<strong>Supervisor</strong> can be seen as the expression of a newly added genetic material in the <strong>CMS</strong> DNA.<br />
This thesis also has an important influence on how the <strong>CMS</strong> experiment is being controlled. <strong>The</strong> operation of the<br />
<strong>CMS</strong> experiment is shaped by the way the configuration and monitoring services of the TS allow operating the<br />
L1 decision loop.<br />
Continuing with the analogy, if the TS is a specialized brain module, the TS system would be the static neural<br />
net and the TS services would be the behavior patterns stored in it. <strong>The</strong> possibility to adopt new operation<br />
capabilities on top of a stable architecture, without requiring major upgrades, suits a long-lived experiment well,<br />
just like the human brain, which keeps an almost invariant neural architecture but is able to learn and adapt to its<br />
environment.<br />
8.3 Final remarks<br />
This thesis contributes to the <strong>CMS</strong> knowledge base and by extension to the HEP and scientific communities. <strong>The</strong><br />
motivation and goals, a generic solution and finally a successful design for a distributed control system are<br />
discussed in detail. This new <strong>CMS</strong> genetic material has achieved its full expression and has become a <strong>CMS</strong> body<br />
part, the <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong>. This is the maximum impact we could initially expect inside the <strong>CMS</strong><br />
Collaboration.<br />
A more complicated question is the impact of the exposed material outside the <strong>CMS</strong> collaboration. Answering<br />
this question is like asking how well the added <strong>CMS</strong> genetic material will spread. To an<br />
important extent, the chances of successfully propagating the knowledge written in this thesis depend on how well<br />
adapted <strong>CMS</strong> is to its environment; in other words, on how successful <strong>CMS</strong> will be in fulfilling its physics goals.<br />
Appendix A<br />
<strong>Trigger</strong> <strong>Supervisor</strong> SOAP API<br />
A.1 Introduction<br />
This appendix specifies the SOAP Application Program Interface (API) exposed by a <strong>Trigger</strong> <strong>Supervisor</strong> (TS) cell.<br />
<strong>The</strong> audience for this specification is mainly application developers who require the remote execution of cell<br />
commands and/or operations (e.g. the developer of the L1 trigger function manager, in order to use the TS<br />
services provided by the TS central cell).<br />
A.2 Requirements<br />
• Command and operation control: <strong>The</strong> protocol should allow the remote initialization, operation and<br />
destruction of cell operations and the execution of commands.<br />
• Controller identification: <strong>The</strong> protocol should enforce the identification of the controller in the cell in<br />
order to be able to classify all the logging records as a function of the controller.<br />
• Synchronous and asynchronous communication: <strong>The</strong> protocol should allow both synchronous and<br />
asynchronous communication modes. <strong>The</strong> synchronous protocol is intended to assure exclusive usage of<br />
the cell. <strong>The</strong> asynchronous mode should enable multi-user access and achieve an enhanced overall system<br />
performance.<br />
• XDAQ data type serialization: <strong>The</strong> protocol should be able to encode different data types like integer,<br />
string or boolean. <strong>The</strong> encoding scheme should be compatible with the XDAQ encoding/decoding data type<br />
from/to XML.<br />
• Human and machine interaction mechanism: <strong>The</strong> protocol should embed a warning message and level in<br />
each reply message. <strong>The</strong> warning information should facilitate a machine comprehension of the request<br />
success level.<br />
A.3 SOAP API<br />
A.3.1 Protocol<br />
<strong>The</strong> cell SOAP protocol allows both synchronous and asynchronous communication between the controller and<br />
the cell. Figure A-1 shows a UML sequence diagram that exemplifies the synchronous communication protocol<br />
between a controller and a cell. In that case, the controller is blocked until the reply message arrives. This<br />
protocol also blocks the cell. <strong>The</strong>refore, additional requests coming from other controllers will not be served<br />
until the cell has replied to the former controller.<br />
Figure A-2 shows a UML sequence diagram that exemplifies the asynchronous communication protocol between<br />
a controller and a cell. In the asynchronous case, the controller is blocked only for a few milliseconds per request,<br />
until it receives the acknowledgement message. <strong>The</strong> asynchronous reply is received in a parallel thread that listens on the<br />
corresponding port. In this case, the overall response time as a function of the number of SOAP request<br />
messages (n) grows as O(1) instead of O(n) (synchronous case): the total response time is only slightly<br />
longer than the longest remote call.<br />
Figure A-1: UML sequence diagram of a synchronous SOAP communication between a controller and a<br />
cell.<br />
Figure A-2: UML sequence diagram of an asynchronous SOAP communication between a controller and a<br />
cell.<br />
On the cell side, each asynchronous request opens a new thread where the command is executed. <strong>The</strong>refore,<br />
several controllers are allowed to remotely execute commands concurrently in the same cell.<br />
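<strong>The</strong> O(1) versus O(n) scaling argued above can be illustrated with threads standing in for asynchronous requests; this is a timing sketch, not the actual SOAP transport.<br />

```python
# Timing sketch of the synchronous vs asynchronous protocols: n simulated
# remote calls of 0.1 s each. The synchronous controller waits about
# n * 0.1 s, the asynchronous one roughly the duration of the longest call.
import threading
import time

def remote_call(duration=0.1):
    time.sleep(duration)      # stands in for the cell executing a command

def synchronous(n):
    start = time.monotonic()
    for _ in range(n):
        remote_call()
    return time.monotonic() - start

def asynchronous(n):
    start = time.monotonic()
    threads = [threading.Thread(target=remote_call) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:         # the reply listener thread would collect results here
        t.join()
    return time.monotonic() - start

print(f"sync:  {synchronous(5):.2f} s")
print(f"async: {asynchronous(5):.2f} s")
```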
Whatever communication mechanism is used, the reply message embeds the warning information. <strong>The</strong> warning<br />
level provides the request success level to the controller. <strong>The</strong> warning message completes this information with a<br />
human-readable message.
SOAP API 123<br />
A.3.2 Request message<br />
Figure A-3 shows an example of a request message. This request executes the command ExampleCommand in a<br />
given cell.<br />
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"<br />
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><br />
<SOAP-ENV:Body><br />
<ExampleCommand async="true" cid="1" sid="controller1"><br />
<param name="aParam" xsi:type="xsd:integer">3</param><br />
<callbackFun>CommandResponse</callbackFun><br />
<callbackUrl>http://centralcell.cern.ch:50001</callbackUrl><br />
<callbackUrn>urn:xdaq-application:lid=13</callbackUrn><br />
</ExampleCommand><br />
</SOAP-ENV:Body><br />
</SOAP-ENV:Envelope><br />
Figure A-3: SOAP request message example.<br />
<strong>The</strong> first XML tag (or just tag) inside the body of the SOAP message (i.e. ExampleCommand) identifies the cell<br />
command to be executed in the remote cell. <strong>The</strong> attribute async takes a boolean value and tells the cell whether<br />
this request has to be executed synchronously or asynchronously. <strong>The</strong> cid attribute is set by the controller and<br />
the same value is set by the cell in the reply message cid. This mechanism allows a controller to identify<br />
request-reply pairs in an asynchronous communication (cid is not necessary in the synchronous communication<br />
case). <strong>The</strong> sid attribute identifies a concrete controller. <strong>The</strong> value of this attribute is added into all log message<br />
generated by the execution of the command. It is therefore possible to trace the actions of each individual<br />
controller by analyzing the logging statements.<br />
<strong>The</strong> asynchronous communication modality requires the specification of three additional tags: callbackFun,<br />
callbackUrl and callbackUrn. <strong>The</strong> values of these tags uniquely identify the controller-side callback that will<br />
handle the asynchronous reply.<br />
When async is equal to false (i.e. synchronous communication), the cid attribute and the callbackFun,<br />
callbackUrl and callbackUrn tags are not needed.<br />
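The request structure described above can be sketched as follows; this is an illustrative Python fragment (not part of the Trigger Supervisor code base) that builds only the command element and omits the enclosing SOAP envelope and namespace declarations:

```python
import xml.etree.ElementTree as ET

def build_request(command, asynchronous, cid=None, sid=None,
                  callback_fun=None, callback_url=None, callback_urn=None):
    """Build the command element of a request message.  Tag and attribute
    names (async, cid, sid, callbackFun, callbackUrl, callbackUrn) follow
    the description in the text."""
    elem = ET.Element(command)
    elem.set("async", "true" if asynchronous else "false")
    if asynchronous:
        # cid and the callback tags are only needed for asynchronous calls
        elem.set("cid", str(cid))
        ET.SubElement(elem, "callbackFun").text = callback_fun
        ET.SubElement(elem, "callbackUrl").text = callback_url
        ET.SubElement(elem, "callbackUrn").text = callback_urn
    if sid is not None:
        elem.set("sid", str(sid))
    return elem

req = build_request("ExampleCommand", True, cid=3, sid="controller1",
                    callback_fun="CommandResponse",
                    callback_url="http://centralcell.cern.ch:50001",
                    callback_urn="urn:xdaq-application:lid=13")
print(ET.tostring(req, encoding="unicode"))
```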
<strong>The</strong> parameters of the command are set using the tag param. <strong>The</strong> name of the parameter is defined with the<br />
attribute name. <strong>The</strong> type of the parameter is defined with the attribute xsi:type and its value is set inside the tag.<br />
Table A-1 presents the list of possible types and their correspondence with the class that facilitates the<br />
marshalling process 17 .<br />
xsi:type attribute      XDAQ class<br />
xsd:integer             xdata::Integer<br />
xsd:unsignedShort       xdata::UnsignedShort<br />
xsd:unsignedLong        xdata::UnsignedLong<br />
xsd:float               xdata::Float<br />
xsd:double              xdata::Double<br />
xsd:Boolean             xdata::Boolean<br />
xsd:string              xdata::String<br />
Table A-1: Correspondence between xsi:type data types and the class that facilitates the marshalling process.<br />
17 In the context of data transmission, marshalling or serialization is the process of transmitting an object across a network<br />
connection in binary form. <strong>The</strong> series of bytes can be used to deserialize or unmarshall an object that is identical in its<br />
internal state to the original one.
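On the controller side, the mapping of Table A-1 can be mimicked with a small unmarshalling sketch; the parameter name runNumber and the Python conversions below are illustrative (inside the cell the xdata classes of Table A-1 play this role):

```python
import xml.etree.ElementTree as ET

XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

# Illustrative mapping from the xsi:type attribute of a param tag to a
# Python conversion, following the rows of Table A-1.
XSI_TO_PY = {
    "xsd:integer": int,
    "xsd:unsignedShort": int,
    "xsd:unsignedLong": int,
    "xsd:float": float,
    "xsd:double": float,
    "xsd:Boolean": lambda s: s.strip().lower() == "true",
    "xsd:string": str,
}

def unmarshal_param(param_elem):
    """Return (name, converted value) for one param element."""
    convert = XSI_TO_PY[param_elem.get(XSI + "type")]
    return param_elem.get("name"), convert(param_elem.text)

# The parameter name 'runNumber' is purely illustrative.
xml = ('<param xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
       'name="runNumber" xsi:type="xsd:integer">3</param>')
print(unmarshal_param(ET.fromstring(xml)))  # ('runNumber', 3)
```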
A.3.3 Reply message<br />
Figure A-4 shows an example of a reply message. This message is the asynchronous response sent by the cell<br />
after executing the command ExampleCommand requested with the request message of Figure A-3.<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;CommandResponse cid="..."&gt;<br />
&lt;payload&gt;Hello World!&lt;/payload&gt;<br />
&lt;warning level="..."&gt;Warning message&lt;/warning&gt;<br />
&lt;/CommandResponse&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
Figure A-4: SOAP reply message example.<br />
A.3.4 Cell command remote API<br />
<strong>The</strong> SOAP API for cell commands has already been presented to exemplify the request and reply messages in<br />
Sections A.3.2 and A.3.3.<br />
A.3.5 Cell Operation remote API<br />
<strong>The</strong> SOAP API for cell operations consists of a number of request messages which allow a controller to remotely<br />
instantiate an operation, reset it, execute a transition, get its state and finally kill the operation instance. <strong>The</strong><br />
following sections present the request and reply messages for all relevant cases.<br />
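The lifecycle just listed can be sketched as a controller-side helper. Only OpInit is named in the text below; the other message names (OpSendCommand, OpGetState, OpKill) and the transport callable are assumptions for illustration, exercised here with a fake transport instead of real SOAP:

```python
class CellOperationClient:
    """Illustrative controller-side wrapper for the cell operation API."""

    def __init__(self, post):
        # post: callable (message_name, tags) -> reply dict, standing in
        # for the real SOAP transport
        self.post = post

    def init(self, class_name, op_id=None):
        tags = {"operation": class_name}
        if op_id is not None:
            tags["opId"] = op_id          # optional instance identifier
        reply = self.post("OpInit", tags)
        # the reply carries the identifier finally assigned by the cell
        return reply["operation"]

    def send_command(self, op_id, transition):
        return self.post("OpSendCommand",
                         {"operation": op_id, "command": transition})["payload"]

    def get_state(self, op_id):
        return self.post("OpGetState", {"operation": op_id})["payload"]

    def kill(self, op_id):
        return self.post("OpKill", {"operation": op_id})["payload"]

# Minimal fake transport to exercise the flow.
def fake_post(name, tags):
    if name == "OpInit":
        return {"operation": tags.get("opId", "op_42")}
    if name == "OpGetState":
        return {"payload": "halted"}
    return {"payload": "Ok"}

client = CellOperationClient(fake_post)
opid = client.init("MTCCIIConfiguration", "my_opid")
print(opid, client.get_state(opid))  # my_opid halted
```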
A.3.5.1 OpInit<br />
Figure A-5: Acknowledge reply message.<br />
Figure A-6 shows the request message to instantiate a new operation.<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;OpInit async="false"&gt;<br />
&lt;operation opId="..."&gt;MTCCIIConfiguration&lt;/operation&gt;<br />
&lt;callbackFun&gt;NULL&lt;/callbackFun&gt;<br />
&lt;callbackUrl&gt;NULL&lt;/callbackUrl&gt;<br />
&lt;callbackUrn&gt;NULL&lt;/callbackUrn&gt;<br />
&lt;/OpInit&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
Figure A-6: Request message to create an operation instance.<br />
This request example corresponds to a synchronous request. It is therefore not necessary to specify values for<br />
the cid, callbackFun, callbackUrl and callbackUrn tags. <strong>The</strong> operation tag serves to specify the operation class<br />
name, and the opId attribute is an optional attribute defining the instance name or identifier. If opId is not<br />
specified, the cell will assign a random opId to the operation instance.<br />
Figure A-7 shows the reply message to the request of Figure A-6. In this case, the callback function was not<br />
specified (i.e. the callbackFun, callbackUrl and callbackUrn tags were set to NULL). <strong>The</strong>refore, the tag inside the<br />
body is named NULL. Inside the callback tag NULL there are two more tags: payload and operation. <strong>The</strong> payload<br />
tag contains a string with information about the instantiation process. <strong>The</strong> tag operation contains the name (or<br />
identifier) that has been assigned to the operation instance. This identifier is used by the controller to refer to that<br />
operation instance. <strong>The</strong> operation warning object is also embedded in the reply message.
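A controller-side sketch of parsing such a reply body, assuming the tag layout described above (a callback tag wrapping payload and operation); the fragment is illustrative, not Trigger Supervisor code:

```python
import xml.etree.ElementTree as ET

def parse_reply(body_xml):
    """Extract the callback tag name and its child tags (payload,
    operation, ...) from a reply body."""
    callback = ET.fromstring(body_xml)
    result = {"callback": callback.tag}
    for child in callback:
        result[child.tag] = child.text
    return result

# Reply to an OpInit with no callback specified: the body tag is NULL.
reply = ("<NULL>"
         "<payload>InitOperation done</payload>"
         "<operation>my_opid</operation>"
         "</NULL>")
info = parse_reply(reply)
print(info["operation"])  # my_opid
```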
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;NULL&gt;<br />
&lt;payload&gt;InitOperation done&lt;/payload&gt;<br />
&lt;operation&gt;my_opid&lt;/operation&gt;<br />
&lt;/NULL&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
Figure A-7: Reply message to the operation instantiation request of Figure A-6.<br />
Figure A-9 shows the reply message to the request of Figure A-8. <strong>The</strong> tag payload contains the result of the<br />
transition execution, which depends on the customization process. <strong>The</strong> operation warning object is also embedded<br />
in the reply message.<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;NULL&gt;<br />
&lt;payload&gt;Ok&lt;/payload&gt;<br />
&lt;/NULL&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
Figure A-9: Reply message to the transition execution request of Figure A-8.<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;NULL&gt;<br />
&lt;payload&gt;Operation reset Ok&lt;/payload&gt;<br />
&lt;/NULL&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;NULL&gt;<br />
&lt;payload&gt;halted&lt;/payload&gt;<br />
&lt;/NULL&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
&lt;soap-env:Envelope&gt;<br />
&lt;soap-env:Header/&gt;<br />
&lt;soap-env:Body&gt;<br />
&lt;NULL&gt;<br />
&lt;payload&gt;Operation killed&lt;/payload&gt;<br />
&lt;/NULL&gt;<br />
&lt;/soap-env:Body&gt;<br />
&lt;/soap-env:Envelope&gt;<br />
Acknowledgements<br />
First of all I want to thank Claudia-Elisabeth Wulz, Joao Varela, Wesley Smith and Sergio Cittolin for granting<br />
me the privilege to lead the conceptual design and development effort of the <strong>Trigger</strong> <strong>Supervisor</strong> project.<br />
My special thanks to Marc Magrans de Abril for being the “always on” motor of the project, for his continuous<br />
will to improve, for the never ending flow of ideas and most important for being my brother and strongest<br />
support.<br />
This thesis work could not have reached its full expression without the hard work of so many <strong>CMS</strong> collaboration<br />
members: managers, sub-system cell developers and TS central team members built the bridge between a dream<br />
and a reality.<br />
I am also grateful for the very careful reading of the manuscript by Marco Boccioli, Iñaki García Echebarría,<br />
Joni Hahkala, Elisa Lanciotti, Raúl Murillo García, Blanca Perea Solano and Ana Sofía Torrentó Coello. <strong>The</strong>ir<br />
suggestions improved the English and made this document readable for others besides myself.<br />
Many thanks to all my colleagues at the High Energy Physics Institute of Vienna as it was always a pleasure to<br />
work with them.<br />
Last but not least, I wish to thank my family for the unconditional support.
References<br />
[1] P. Lefèvre and T. Petterson (Eds.), “<strong>The</strong> Large Hadron Collider, conceptual design”, CERN/AC/95-05.<br />
[2] <strong>CMS</strong> Collaboration, “<strong>The</strong> Compact Muon Solenoid”, CERN Technical Proposal, LHCC 94-38, 1995.<br />
[3] ATLAS Collaboration, “ATLAS Technical Proposal,” CERN/LHCC 94-43.<br />
[4] ALICE Collaboration, “ALICE - Technical Proposal for A Large Ion Collider Experiment at the CERN<br />
LHC”, CERN/LHCC 95-71.<br />
[5] LHCb Collaboration, “LHCb Technical proposal”, CERN/LHCC 98-4.<br />
[6] <strong>CMS</strong> Collaboration, “<strong>The</strong> Tracker System Project, Technical Design Report”, CERN/LHCC 98-6.<br />
[7] <strong>CMS</strong> Collaboration, “<strong>The</strong> Electromagnetic Calorimeter Project, Technical Design Report”,<br />
CERN/LHCC 97-33. <strong>CMS</strong> Addendum CERN/LHCC 2002-27.<br />
[8] <strong>CMS</strong> Collaboration, “<strong>The</strong> Hadron Calorimeter Technical Design Report”, CERN/LHCC 97-31.<br />
[9] <strong>CMS</strong> Collaboration, “<strong>The</strong> Muon Project, Technical Design Report”, CERN/LHCC 97-32.<br />
[10] <strong>CMS</strong> Collaboration, “<strong>The</strong> <strong>Trigger</strong> and Data Acquisition Project, Volume II, Data Acquisition & High-<br />
Level <strong>Trigger</strong>, Technical Design Report,” CERN/LHCC 2002-26.<br />
[11] <strong>CMS</strong> Collaboration, “<strong>The</strong> TriDAS Project - <strong>The</strong> Level-1 <strong>Trigger</strong> Technical Design Report”,<br />
CERN/LHCC 2000-38.<br />
[12] P. Chumney et al., “Level-1 Regional Calorimeter <strong>Trigger</strong> System for <strong>CMS</strong>”, in Proc. of Computing in<br />
High Energy Physics and Nuclear Physics, La Jolla, CA, USA, 2003.<br />
[13] J.J. Brooke et al., “<strong>The</strong> design of a flexible Global Calorimeter <strong>Trigger</strong> system for the Compact Muon<br />
Solenoid experiment”, <strong>CMS</strong> Note 2007/018.<br />
[14] R. Martinelli et al., “Design of the Track Correlator for the DTBX <strong>Trigger</strong>”, <strong>CMS</strong> Note 1999/007<br />
(1999).<br />
[15] J. Erö et al., “<strong>The</strong> <strong>CMS</strong> Drift Tube Track Finder”, <strong>CMS</strong> Note (in preparation).<br />
[16] D. Acosta et al., “<strong>The</strong> Track-Finder Processor for the Level-1 <strong>Trigger</strong> of the <strong>CMS</strong> Endcap Muon<br />
System”, in Proc. of the 5th Workshop on Electronics for LHC Experiments, Snowmass, CO, USA, Sept.<br />
1999, CERN/LHCC/99-33 (1999).<br />
[17] H. Sakulin, “Design and Simulation of the First Level Global Muon <strong>Trigger</strong> for the <strong>CMS</strong> Experiment at<br />
CERN”, PhD thesis, University of Technology, Vienna (2002).<br />
[18] C.-E. Wulz, “Concept of the <strong>CMS</strong> First Level Global <strong>Trigger</strong> for the <strong>CMS</strong> Experiment at LHC”, Nucl.<br />
Instr. Meth. A 473/3 231-242 (2001).<br />
[19] TOTEM Collaboration, paper to be published in Journal of Instrumentation (JINST).<br />
[20] <strong>CMS</strong> <strong>Trigger</strong> and Data Acquisition Group, “<strong>CMS</strong> L1 <strong>Trigger</strong> Control System”, <strong>CMS</strong> Note 2002/033.<br />
[21] B. G. Taylor, “Timing Distribution at the LHC”, in Proc. of the 8th Workshop on Electronics for LHC<br />
and Future Experiments, Colmar, France (2002).<br />
[22] V. Brigljevic et al., “Run control and monitor system for the <strong>CMS</strong> experiment”, in Proc. of Computing<br />
in High Energy and Nuclear Physics 2003, La Jolla, CA (2003).
[23] JavaServer Pages Technology, http://java.sun.com/products/jsp/<br />
[24] W3C standard, “Extensible Markup Language (XML)”, http://www.w3.org/XML<br />
[25] W3C standard, “Simple Object Access Protocol (SOAP)”, http://www.w3.org/TR/SOAP<br />
[26] PVSS II system from ETM, http://www.pvss.com<br />
[27] J. Gutleber and L. Orsini, “Software architecture for processing clusters based on I2O,” in Cluster<br />
Computing, New York, Kluwer Academic Publishers, Vol. 5, pp. 55–65 (2002).<br />
[28] J. Gutleber, S. Murray and L. Orsini, “Towards a homogeneous architecture for high-energy physics<br />
data acquisition systems”, Comput. Phys. Commun. 153, Issue 2 (2003) 155-163.<br />
[29] V. Brigljevic et al., “<strong>The</strong> <strong>CMS</strong> Event Builder”, in Proc. of Computing in High-Energy and Nuclear<br />
Physics, La Jolla CA, March 24-28 (2003).<br />
[30] P. Glaser et al., “Design and Development of a Graphical Setup Software for the <strong>CMS</strong> Global <strong>Trigger</strong>”,<br />
IEEE Transactions on Nuclear Science, Vol. 53, No. 3, June 2006.<br />
[31] Qt Project, http://trolltech.com/products/qt<br />
[32] Python Project, http://www.python.org/<br />
[33] Tomcat Project, http://tomcat.apache.org/<br />
[34] C. W. Fabjan and H.G. Fischer, “Particle Detectors”, Rep. Prog. Phys., Vol. 43, 1980.<br />
[35] R.E Hughes-Jones et al., “<strong>Trigger</strong>ing and Data Acquisition for the LHC”, in Proc. of the International<br />
Conference on Electronics for Particle Physics, May 1995.<br />
[36] <strong>CMS</strong> Collaboration, “<strong>CMS</strong> Letter of Intent”, CERN/LHCC 92-3, LHCC/I 1, Oct 1, 1992.<br />
[37] K. Holtman, “Prototyping of the <strong>CMS</strong> Storage Management”, Ph.D. <strong>The</strong>sis, Technische Universiteit<br />
Eindhoven, Eindhoven, May 2000.<br />
[38] CDF II Collaboration, “<strong>The</strong> CDF II Detector: Technical Design Report”, FERMILAB-PUB-96/390-E,<br />
1996.<br />
[39] J. Gutleber, I. Magrans, L. Orsini and M. Nafría, “Uniform management of data acquisition devices<br />
with XML”, IEEE Transactions on Nuclear Science, Vol. 51, Nº. 3, June 2004.<br />
[40] M. Elsing and T. Schorner-Sadenius, “Configuration of the ATLAS trigger system,” in Proc. of<br />
Computing in High Energy and Nuclear Physics 2003, La Jolla, CA (2003).<br />
[41] Roger Pressman, “Software Engineering: A Practitioner's Approach”, McGraw-Hill, 2005.<br />
[42] W3C standard, “XML Schema“, http://www.w3.org/XML/Schema<br />
[43] W3C standard, “Document Object Model (DOM)”, http://www.w3.org/DOM/<br />
[44] W3C standard, “XML Path Language (XPath)”, http://www.w3.org/TR/xpath<br />
[45] Apache Project, http://xml.apache.org/<br />
[46] W3C standard, “HTTP - Hypertext Transfer Protocol”, http://www.w3.org/Protocols/<br />
[47] W3C standard, “XSL Transformations (XSLT)”, http://www.w3.org/TR/xslt<br />
[48] G. Dubois-Felsman, “Summary DAQ and <strong>Trigger</strong>”, in Proc.of Computing in High Energy and Nuclear<br />
Physics 2003, La Jolla, CA (2003).<br />
[49] S. N. Kamin, “Programming Languages: An Interpreter-Based Approach”, Reading, MA, Addison-Wesley, 1990.<br />
[50] I. Magrans et al., “Feasibility study of an XML-based software environment to manage data acquisition<br />
hardware devices”, Nucl. Instr. Meth. A 546 324-329 (2005).<br />
[51] E. Cano et al., “<strong>The</strong> Final Prototype of the Fast Merging Module (FMM) for Readout Status Processing<br />
in <strong>CMS</strong> DAQ”, in Proc. of the 10th Workshop on Electronics for LHC Experiments and Future<br />
Experiments, Amsterdam, The Netherlands, September 29 - October 03, 2003.
[52] J. Ousterhout, “Tcl and Tk Toolkit”, Reading, MA, Addison-Wesley, 1994.<br />
[53] HAL Project, http://cmsdoc.cern.ch/~cschwick/software/documentation/HAL/index.html<br />
[54] Albert De Roeck, John Ellis and Fabiola Gianotti, “Physics Motivations for Future CERN<br />
Accelerators”, CERN-TH/2001-023, hep-ex/0112004.<br />
[55] <strong>CMS</strong> SLHC web page, http://cmsdoc.cern.ch/cms/electronics/html/elec_web/common/slhc.html<br />
[56] I. Magrans, C.-E. Wulz and J. Varela, “Conceptual Design of the <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong>”, IEEE<br />
Transactions on Nuclear Science, Vol. 53, Nº. 2, November 2005.<br />
[57] W3C Web Services Activity, http://www.w3.org/2002/ws/<br />
[58] W3C standard, “Web Services Description Language (WSDL)”, http://www.w3.org/TR/wsdl<br />
[59] I2O Special Interest Group, “Intelligent I/O (I2O) Architecture Specification v2.0”, 1999.<br />
[60] I. Magrans and M. Magrans, “<strong>The</strong> <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong> Project”, in Proc. of the IEEE Nuclear<br />
Science Symposium 2005, Puerto Rico, 23-29 October, 2005.<br />
[61] Unified Modeling Language, http://www.rational.com/uml/<br />
[62] <strong>Trigger</strong> <strong>Supervisor</strong> web page, http://triggersupervisor.cern.ch/<br />
[63] I. Magrans and M. Magrans, “<strong>Trigger</strong> <strong>Supervisor</strong> - User’s Guide”,<br />
http://triggersupervisor.cern.ch/index.php?option=com_docman&task=doc_download&gid=32<br />
[64] <strong>Trigger</strong> <strong>Supervisor</strong> Framework Workshop,<br />
http://triggersupervisor.cern.ch/index.php?option=com_docman&task=doc_download&gid=16<br />
[65] <strong>Trigger</strong> <strong>Supervisor</strong> Interconnection Test Workshop,<br />
http://triggersupervisor.cern.ch/index.php?option=com_docman&task=doc_download&gid=44<br />
[66] <strong>Trigger</strong> <strong>Supervisor</strong> Framework v 1.4 Workshop,<br />
http://indico.cern.ch/getFile.py/access?resId=0&materialId=slides&confId=24530<br />
[67] <strong>Trigger</strong> <strong>Supervisor</strong> Support Management Tool, https://savannah.cern.ch/projects/l1ts/<br />
[68] R.E. Johnson and B. Foote, “Designing reusable classes”, Journal of Object-Oriented Programming,<br />
1(2): pp. 22-35, 1988.<br />
[69] L. Peter Deutsch, “Design reuse and frameworks in the Smalltalk-80 system”, in Software Reusability -<br />
Volume II, Applications and Experience, pp. 57-72, 1989.<br />
[70] C. Gaspar and M. Dönszelmann, “DIM - A Distributed Information Management System for the<br />
DELPHI Experiment at CERN”, in Proc. of the 8 th Conference on Real-Time Computer Applications in<br />
Nuclear, Particle and Plasma Physics, Vancouver, Canada, June 1993.<br />
[71] R. Jacobsson, “Controlling Electronic Boards with PVSS”, in Proc. of the 10th International Conference<br />
on Accelerator and Large Experimental Physics Control Systems, Geneva, 10-14 October 2005, P-01.045-6.<br />
[72] B. Franek and C. Gaspar, “SMI++ Object-Oriented Framework for Designing and Implementing<br />
Distributed Control Systems”, IEEE Transactions on Nuclear Science, Vol. 52, Nº. 4, August 2005.<br />
[73] T. Adye et al., “<strong>The</strong> DELPHI Experiment Control”, in Proc. of the International Conference on<br />
Computing in High Energy Physics 1992, Annecy, France.<br />
[74] A. J. Kozubal, L. R. Dalesio, J. O. Hill and D. M. Kerstiens, “A State Notation Language for Automatic<br />
Control”, Los Alamos National Laboratory report LA-UR-89-3564, November, 1989.<br />
[75] R. Arcidiacono et al., “<strong>CMS</strong> DCS Design Concepts”, in Proc. of the 10th International Conference on<br />
Accelerator and Large Experimental Physics Control Systems, Geneva, Switzerland, 10-14 Oct. 2005.<br />
[76] A. Augustinus et al., “<strong>The</strong> ALICE Control System - a Technical and Managerial Challenge”, in Proc. of<br />
the 9th International Conference on Accelerator and Large Experimental Physics Control Systems,<br />
Gyeongju, Korea, 2003.
[77] C. Gaspar et al., “An Integrated Experiment Control System, Architecture and Benefits: the LHCb<br />
Approach”, in Proc. of the 13th IEEE-NPSS Real Time Conference, Montreal, Canada, May 18-23,<br />
2003.<br />
[78] Log4j Project, http://logging.apache.org/log4j/docs/index.html<br />
[79] Xerces-C++ project, http://xml.apache.org/xerces-c/<br />
[80] W3C recommendation, “XML 1.1 (1st Edition)”, http://www.w3.org/TR/2004/REC-xml11-20040204/<br />
[81] Graphviz Project, http://www.graphviz.org/<br />
[82] ChartDirector Project, http://www.advsofteng.com/<br />
[83] Dojo project, http://dojotoolkit.org/<br />
[84] Cgicc project, http://www.gnu.org/software/cgicc/<br />
[85] Logging Collector documentation, http://cmsdoc.cern.ch/cms/TRIDAS/R<strong>CMS</strong>/<br />
[86] J. Gutleber, L. Orsini et al., “HyperDAQ, Where Data Acquisition Meets the Web”, in Proc. of the 10th<br />
International Conference on Accelerator and Large Experimental Physics Control Systems, Geneva,<br />
Switzerland, 10-14 Oct. 2005.<br />
[87] I2O Special Interest Group, “Intelligent I/O (I2O) Architecture Specification v2.0”, 1999.<br />
[88] ECMA standard-262, “ECMAScript Language Specification”, December 1999.<br />
[89] I. Magrans and M. Magrans, “Enhancing the User Interface of the <strong>CMS</strong> Level-1 <strong>Trigger</strong> Online<br />
Software with Ajax”, in Proc. of the 15 th IEEE-NPSS Real Time Conference, Fermi National<br />
Accelerator Laboratory in Batavia, IL, USA, May 2007.<br />
[90] A. Winkler, “Suitability Study of the <strong>CMS</strong> <strong>Trigger</strong> <strong>Supervisor</strong> Control Panel Infrastructure: <strong>The</strong> Global<br />
<strong>Trigger</strong> Case”, Master <strong>The</strong>sis, Technical University of Vienna, March 2008.<br />
[91] Scientific Linux CERN 3 (SLC3), http://linux.web.cern.ch/linux/scientific3/<br />
[92] Oracle Corp., http://www.oracle.com/<br />
[93] CAEN bus adapter, model: VME64X - VX2718, http://www.caen.it<br />
[94] Apache Chainsaw project, http://logging.apache.org/chainsaw/index.html<br />
[95] I. Magrans and M. Magrans, “<strong>The</strong> Control and Hardware Monitoring System of the <strong>CMS</strong> Level-1<br />
<strong>Trigger</strong>”, in Proc of the IEEE Nuclear Science Symposium 2007, Honolulu, Hawaii, October 29 -<br />
November 2, 2007.<br />
[96] Web interface of the <strong>Trigger</strong> <strong>Supervisor</strong> CVS repository, http://isscvs.cern.ch/cgi-bin/viewcvsall.cgi/TriDAS/trigger/?root=tridas<br />
[97] P. Glaser, "System Integration of the Global <strong>Trigger</strong> for the <strong>CMS</strong> Experiment at CERN", Master thesis,<br />
Technical University of Vienna, March 2007.<br />
[98] A. Oh, “Finite State Machine Model for Level 1 Function Managers, Version 1.6.0”,<br />
http://cmsdoc.cern.ch/cms/TRIDAS/R<strong>CMS</strong>/Docs/Manuals/manuals/level1FMFSM_1_6.pdf<br />
[99] IEEE standard C37.1-1994, “IEEE standard definition, specification, and analysis of systems used for<br />
supervisory control, data acquisition, and automatic control”.<br />
[100] <strong>CMS</strong> Collaboration, “<strong>CMS</strong> physics TDR - Detector performance and software”, CERN/LHCC 2006-<br />
001.<br />
[101] A. Afaq et al., “<strong>The</strong> <strong>CMS</strong> High Level <strong>Trigger</strong> System”, IEEE NPSS Real Time Conference, Fermilab,<br />
Chicago, USA, April 29 - May 4, 2007.<br />
[102] B. Boehm et al., “Software cost estimation with COCOMO II”. Englewood Cliffs, NJ: Prentice-Hall,<br />
2000. ISBN 0-13-026692-2.<br />
[103] <strong>CMS</strong> Collaboration, “<strong>The</strong> <strong>CMS</strong> Magnet Test and Cosmic Challenge (MTCC Phase I and II) -<br />
Operational Experience and Lessons Learnt”, <strong>CMS</strong> Note 2007/005.
[104] <strong>CMS</strong> Collaboration, “<strong>The</strong> Compact Muon Solenoid detector at LHC”, To be submitted to Journal of<br />
Instrumentation.