
RCRA 2011

The 18th RCRA International Workshop on
“Experimental Evaluation of Algorithms for solving
problems with combinatorial explosion”

A workshop of the
22nd International Joint Conference on Artificial Intelligence (IJCAI 2011)

Barcelona, 17–18 July 2011


Preface

Many problems in Artificial Intelligence show an exponential explosion of the search space. Although stemming from different research areas in AI, such problems are often addressed with algorithms that share a common goal: the effective exploration of huge state spaces. Many algorithms developed in one research area are applicable to other problems, or can be hybridised with techniques from other areas. Artificial Intelligence tools often exploit or hybridise techniques developed by other research communities, such as Operations Research.

In recent years, research in AI has increasingly focused on the experimental evaluation of algorithms, the development of suitable methodologies for experimentation and analysis, the study of languages, and the implementation of systems for the definition and solution of problems.

The scope of this workshop series is to foster the cross-fertilisation of ideas stemming from different areas, to propose benchmarks for new challenging problems, to compare models and algorithms from an experimental viewpoint, and, in general, to compare different approaches with respect to efficiency, problem modelling, and ease of development.

RCRA workshops are organised by “Rappresentazione della Conoscenza e Ragionamento Automatico” (RCRA, rcra.aixia.it), a scientific community interested in Knowledge Representation and Automated Reasoning. The RCRA group, founded in 1993, is part of the Italian Association for Artificial Intelligence (AI*IA, www.aixia.it).



Workshop committees

Workshop chairs

Marco Gavanelli, University of Ferrara, Italy
Toni Mancini, Sapienza University, Roma, Italy

Programme committee

Marco Alberti, Universidade Nova de Lisboa, Portugal
Tolga Bektas, University of Southampton, UK
Francesco Calimeri, University of Calabria, Italy
Agostino Dovier, University of Udine, Italy
Esra Erdem, Sabanci University, Istanbul, Turkey
Wolfgang Faber, University of Calabria, Italy
Pierre Flener, Uppsala University, Sweden
Scott E. Grasman, Missouri University of Science and Technology, USA
Angel Juan, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Henry Kautz, University of Rochester, USA
Daniel Le Berre, Université d’Artois, Lens Cedex, France
Inês Lynce, INESC-ID, Lisboa, Portugal
Marco Maratea, University of Genova, Italy
Joao Marques-Silva, University College Dublin, Ireland
Michela Milano, University of Bologna, Italy
Massimo Narizzano, University of Genova, Italy
Angelo Oddi, ISTC-CNR, Roma, Italy
Gilles Pesant, University of Montreal, Canada
Pilar Pozos Parra, Universidad Juarez Autonoma de Tabasco, Tabasco, Mexico
Steve Prestwich, University College Cork, Ireland
Helena Ramalhinho Dias Lourenço, Universitat Pompeu Fabra, Barcelona, Spain
Daniel Riera i Terrén, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Fabrizio Riguzzi, University of Ferrara, Italy
Rubén Ruiz Garcìa, Universidad Politécnica de Valencia, Spain
Alessandro Saetti, University of Brescia, Italy
Andrea Schaerf, University of Udine, Italy
Bart Selman, Cornell University, USA



Helmut Simonis, University College Cork, Ireland
Mirek Truszczyński, University of Kentucky, Lexington, KY, USA

External referees

Sara Ceschia, University of Udine, Italy
Raffaele Cipriano, University of Udine, Italy
Giuseppe Filippone, University of Calabria, Italy
Jonathan Gaudreault, FOCAC, Université Laval, Canada
Giovambattista Ianni, University of Calabria, Italy
Marco Kuhlmann, Uppsala University, Sweden
Marco Manna, University of Calabria, Italy
Vasco Manquinho, INESC-ID, Lisboa, Portugal
Malek Mouhoub, University of Regina, Canada
Luca Pulina, University of Sassari, Italy
Kristian Reale, University of Calabria, Italy
Francesco Ricca, University of Calabria, Italy
Greg Rix, University of Montreal, Canada
Andrea Roli, University of Bologna, Italy
Giorgio Terracina, University of Calabria, Italy

Local organisation

Angel Juan, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Daniel Riera i Terrén, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Josep Jorba, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Helena R. Lourenço, Universitat Pompeu Fabra, Spain
David Masip, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Joan M. Marques, IN3-Universitat Oberta de Catalunya, Barcelona, Spain
Joan A. Pastor, IN3-Universitat Oberta de Catalunya, Barcelona, Spain



Table of contents

Full papers

Parallel Search for Boolean Optimization
Ruben Martins, Vasco Manquinho and Inês Lynce . . . . . . . . . . 1

Hydra-MIP: Automated Algorithm Configuration and Selection for Mixed Integer Programming
Lin Xu, Frank Hutter, Holger Hoos and Kevin Leyton-Brown . . . . . . . . . . 16

Predicting Natural Hazards Evolution: How to Overcome the Impact of Input-parameter Uncertainty
Andrés Cencerrado, Ana Cortés and Tomás Margalef . . . . . . . . . . 31

A Hybrid Algorithm Combining Path Scanning and Biased Random Sampling for the Arc Routing Problem
Sergio González Martín, Angel A. Juan, Daniel Riera and José Cáceres Cruz . . . . . . . . . . 46

Algorithms for Interval Data Minmax Regret Paths
Carolinne Torres, César Astudillo, Matthew Bardeen and Alfredo Candia . . . . . . . . . . 55

Community of Scientist Optimization: Foraging and Competing for Research Resources
Alfredo Milani and Valentino Santucci . . . . . . . . . . 66

An Empirical Study of Learning and Forgetting Constraints
Neil Charles Armour Moore, Ian Gent and Ian Miguel . . . . . . . . . . 81

Job Shop Scheduling with Routing Flexibility and Sequence Dependent Setup-Times
Angelo Oddi, Riccardo Rasconi, Amedeo Cesta and Stephen Smith . . . . . . . . . . 96

Automatic Generation of Efficient Domain-Specific Planners from Generic Parametrized Planners
Mauro Vallati, Chris Fawcett, Alfonso Gerevini, Holger Hoos and Alessandro Saetti . . . . . . . . . . 111

Taking Advantage of Domain Knowledge in Optimal Hierarchical Deepening Search Planning
Pascal Schmidt, Florent Teichteil-Königsbuch and Patrick Fabiani . . . . . . . . . . 124

Short papers

Solving Disjunctive Temporal Problems with Preferences using Boolean Optimization solvers
Marco Maratea, Maurizio Pianfetti and Luca Pulina . . . . . . . . . . 139

Visualizing Learning Dynamics in Large-Scale Networks
Manal Rayess and Sherief Abdallah . . . . . . . . . . 147

ACO Algorithms for Solving a New Fleet Assignment Problem
Javier Diego, Miguel Ortega-Mier, Alvaro Garcia-Sanchez and Ignacio Rubio . . . . . . . . . . 155

A New Guillotine Placement Heuristic for the Orthogonal Cutting Problem
Slimane Aboumsabah and Ahmed Riadh Baba-Ali . . . . . . . . . . 163

Solving Distributed FCSPs with Naming Games
Stefano Bistarelli, Giorgio Gosti and Francesco Santini . . . . . . . . . . 171



Already published papers

On Improving MUS Extraction Algorithms
Joao Marques-Silva and Inês Lynce . . . . . . . . . . 179

Applying UCT to Boolean Satisfiability
Alessandro Previti, Raghuram Ramanujan, Marco Schaerf and Bart Selman . . . . . . . . . . 180

An Efficient Hierarchical Parallel Genetic Algorithm for Graph Coloring Problem
Reza Abbasian and Malek Mouhoub . . . . . . . . . . 181

Checking Safety of Neural Networks with SMT Solvers: a Comparative Evaluation
Luca Pulina and Armando Tacchella . . . . . . . . . . 182

Plan Stability: Replanning versus Plan Repair
Maria Fox, Alfonso Gerevini, Derek Long and Ivan Serina . . . . . . . . . . 183



Parallel Search for Boolean Optimization

Ruben Martins, Vasco Manquinho, and Inês Lynce

IST/INESC-ID, Technical University of Lisbon, Portugal
{ruben,vmm,ines}@sat.inesc-id.pt

Abstract. The predominance of multicore processors has increased the interest in developing parallel Boolean Satisfiability (SAT) solvers, and more parallel SAT solvers are emerging as a result. Even though parallel approaches are known to boost performance, parallel approaches to Boolean optimization remain scarce. This paper proposes parallel search algorithms for Boolean optimization and introduces a new parallel solver for Boolean optimization problem instances. Using two threads, an unsatisfiability-based algorithm searches on the lower bound value of the objective function while, at the same time, a linear search is performed on the upper bound value of the objective function. Searching in both directions and exchanging learned clauses between these two orthogonal approaches makes the search more efficient. This idea is further extended to a larger number of threads by dividing the search space according to different local upper values of the objective function. The parallel search on different local upper values leads to constant updates of the lower and upper bound values, which reduce the search space. Moreover, different search strategies are performed on the upper bound value, increasing the diversification of the search.

1 Introduction

An increasing number of parallel Boolean Satisfiability (SAT) solvers have come to light in the recent past, as multicore processors have become the dominant platform. The use of SAT is widespread, with many practical applications, and it is clear that the optimization version of SAT, i.e. Boolean optimization, can be applied to solve many practical optimization problems. Competitive performance and robustness of Boolean optimization solvers are certainly required to achieve this goal.

When compared with SAT instances, Boolean optimization instances tend to be more intricate, as it is not sufficient to find an assignment that satisfies all the constraints; rather, an optimization function has to be taken into account. Hence, it is a natural step to develop parallel algorithms for Boolean optimization, following the recent success in the SAT field.

Although this is a natural step, there are only a few parallel implementations for solving Boolean optimization. SAT4J PB RES//CP 1 implements a resolution-based algorithm that competes with a cutting-plane-based algorithm to find a new upper bound or to prove optimality. When one of the algorithms finds a new upper bound, it terminates the search of the other algorithm and both restart their search within the new upper bound. If one of the algorithms proves optimality, then the problem is solved and the search is stopped. Clause sharing is not performed between these two algorithms. In the context of Integer Linear Programming (ILP), the commercial solver CPLEX is known to have the option of performing parallel search 2, but no detailed description is available.

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011). In conjunction with IJCAI 2011, Barcelona, Spain, July 17–18, 2011.

1 http://www.satcompetition.org/PoS/presentations-pos/leberre.pdf

Parallel algorithms have the advantage of allowing the implementation of orthogonal approaches that complement each other. That is the case in SAT4J PB RES//CP, where cutting planes are run against resolution. Another alternative, which will be explored in this paper, is to run an algorithm that searches to increase the lower bound value against an algorithm that searches to decrease the upper bound value. Furthermore, one may have more than one algorithm searching on the upper bound value.

The main contribution of this paper is two-fold. First, we introduce a parallel search algorithm for Boolean optimization that uses two threads: one thread searches to reduce the upper bound value, while the other searches to increase the lower bound value. Second, a more complex parallel algorithm is introduced, which extends the previous algorithm with additional threads searching to reduce the upper bound value.

The paper is organized as follows. The next section describes the preliminaries, namely Maximum Satisfiability (MaxSAT) and Pseudo-Boolean Optimization (PBO). Section 3 describes a parallel two-thread search algorithm for Boolean optimization, which is extended to a multithread algorithm in Section 4. Afterwards, an experimental evaluation of the new algorithms is presented and the paper concludes.

2 Preliminaries

In this section we briefly describe the Boolean optimization formalisms to be used in the remainder of the paper, namely Maximum Satisfiability (MaxSAT) and Pseudo-Boolean Optimization (PBO). Moreover, we also review the encoding from MaxSAT to PBO.

The MaxSAT problem can be defined as finding an assignment to the problem variables that minimizes (maximizes) the number of unsatisfied (satisfied) clauses in a CNF formula ϕ. MaxSAT has several variants, such as partial MaxSAT, weighted MaxSAT and weighted partial MaxSAT. In the partial MaxSAT problem, some clauses in ϕ are declared as hard, while the remainder are declared as soft. The objective in partial MaxSAT is to find an assignment such that all hard clauses are satisfied while minimizing the number of unsatisfied soft clauses. Finally, in the weighted versions of MaxSAT, soft clauses can have weights greater than 1 and the objective is to satisfy all hard clauses while minimizing the total weight of unsatisfied soft clauses.
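As a concrete illustration of the weighted partial variant just defined, the following sketch (hypothetical helper names, brute force over a toy instance; real MaxSAT solvers are far more sophisticated) computes the minimum total weight of unsatisfied soft clauses, with clauses written DIMACS-style as lists of signed integers:

```python
from itertools import product

# Toy weighted partial MaxSAT instance over variables 1..3.
hard = [[1, 2], [-1, 3]]                    # must all be satisfied
soft = [([-2], 2), ([-3], 1), ([1], 1)]     # (clause, weight) pairs

def satisfied(clause, assign):
    """A clause holds if some literal is true under the assignment."""
    return any(assign[abs(l)] == (l > 0) for l in clause)

def optimum(hard, soft, nvars):
    """Minimum total weight of unsatisfied soft clauses over all
    assignments that satisfy every hard clause."""
    best = None
    for bits in product([False, True], repeat=nvars):
        assign = {i + 1: b for i, b in enumerate(bits)}
        if all(satisfied(c, assign) for c in hard):
            cost = sum(w for c, w in soft if not satisfied(c, assign))
            best = cost if best is None else min(best, cost)
    return best
```

Here the optimum is 1: setting x1 = x3 = true and x2 = false satisfies both hard clauses and falsifies only the soft clause (¬x3) of weight 1.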

A related Boolean optimization formalism is Pseudo-Boolean Optimization (PBO). PBO is defined as finding an assignment to the problem variables such that all pseudo-Boolean constraints are satisfied and the value of a linear cost function is minimized. Unlike MaxSAT, constraints in PBO are more general and there are no soft constraints.
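The standard encoding from MaxSAT to PBO (the paper's own Example 1 falls on a page missing from this extraction, so the sketch below is a generic illustration with hypothetical names, not the authors' exact formulation) hardens each soft clause (c, w) with a fresh relaxation variable r, replacing it by c ∨ r, and minimizes the weighted sum of the relaxation variables:

```python
from itertools import product

def encode_to_pbo(hard, soft, nvars):
    """Turn a weighted partial MaxSAT instance into clausal PBO:
    every soft clause (c, w) becomes the hard clause (c OR r) for a
    fresh relaxation variable r, and the objective is min sum(w * r)."""
    clauses, objective = list(hard), []
    for clause, weight in soft:
        nvars += 1                        # fresh relaxation variable
        clauses.append(clause + [nvars])  # c OR r must now hold
        objective.append((weight, nvars))
    return clauses, objective, nvars

def pbo_optimum(clauses, objective, nvars):
    """Brute-force minimum of the linear objective over satisfying assignments."""
    best = None
    for bits in product([False, True], repeat=nvars):
        a = {i + 1: b for i, b in enumerate(bits)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            value = sum(w for w, v in objective if a[v])
            best = value if best is None else min(best, value)
    return best
```

An optimal PBO solution sets a relaxation variable to 1 exactly when its soft clause is falsified, so the PBO optimum equals the MaxSAT optimum of the original instance.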

2 http://www.ibm.com/software/integration/optimization/cplex-optimizer/<br />


3 Parallel Search on the Lower and Upper Bound Values

Unsatisfiability-based algorithms are very effective for several Boolean optimization problems [10, 18, 2]. These algorithms work by iteratively identifying unsatisfiable sub-formulas ϕU of the original formula ϕ. At each step, a SAT (or pseudo-Boolean) solver is used to check whether the formula is unsatisfiable. If that is the case, for each soft constraint 3 in the identified unsatisfiable sub-formula ϕU, a new relaxation variable is added such that, when assigned value 1, the soft constraint becomes satisfied [18]. Moreover, additional constraints are added to ϕ such that only one of the newly created relaxation variables can be assigned value 1. Next, the solver checks whether the formula remains unsatisfiable. The procedure ends when the working formula becomes satisfiable and the solver returns a solution (i.e. the optimum value was found), or when ϕU contains only hard constraints (i.e. the original problem instance is unsatisfiable) [10].
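The loop described above can be sketched for unweighted MaxSAT as follows. This is a deliberately naive rendering, not the Fu–Malik implementation the paper cites: the "SAT solver" is a brute-force enumerator, the "core" is simply the whole set of soft clauses (a valid, if maximally weak, unsatisfiable subset), and the one-of-the-new-relaxation-variables constraint is checked directly rather than encoded into clauses. The count of UNSAT iterations is the optimum:

```python
from itertools import product

def satisfied(clause, relax, assign):
    """A relaxed soft clause holds if the clause or any of its relaxation vars is true."""
    return any(assign[abs(l)] == (l > 0) for l in clause) or any(assign[r] for r in relax)

def solve(hard, soft, relax, one_groups, nvars):
    """Brute-force stand-in for a SAT solver: find an assignment satisfying all
    hard clauses, all relaxed soft clauses, and every constraint saying exactly
    one relaxation variable of a group is assigned value 1."""
    for bits in product([False, True], repeat=nvars):
        a = {i + 1: b for i, b in enumerate(bits)}
        if (all(satisfied(c, [], a) for c in hard)
                and all(satisfied(c, r, a) for c, r in zip(soft, relax))
                and all(sum(a[v] for v in g) == 1 for g in one_groups)):
            return a
    return None

def fu_malik_sketch(hard, soft, nvars):
    """Unsatisfiability-based lower-bound search, relaxing every soft clause
    per iteration; each UNSAT answer raises the lower bound by one."""
    relax = [[] for _ in soft]       # relaxation vars attached to each soft clause
    one_groups, cost = [], 0
    while solve(hard, soft, relax, one_groups, nvars) is None:
        group = []                   # one fresh relaxation var per soft clause
        for r in relax:
            nvars += 1
            r.append(nvars)
            group.append(nvars)
        one_groups.append(group)     # exactly one new var may be set to 1
        cost += 1
    return cost
```

A real solver instead extracts a (much smaller) core from the SAT solver's UNSAT proof and encodes the cardinality constraint into CNF, which is what the improved encodings cited next are about.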

The original procedure proposed by Fu and Malik [10] has been improved, namely by using more effective encodings [21, 20] for the constraints on the relaxation variables, as well as different strategies to minimize the overall number of relaxation variables needed [20, 2]. Moreover, generalizations to the weighted MaxSAT variants have also been proposed [18, 3].

The most classical approach to Boolean optimization is the use of branch-and-bound algorithms, where an upper bound on the value of the objective function is updated whenever a new solution is found. In these algorithms, lower bounds are estimated and, whenever the lower bound is higher than or equal to the upper bound, the search procedure can safely backtrack, since extending the current set of variable assignments will surely not result in a better solution. Several MaxSAT and PBO algorithms follow this approach using different lower bounding procedures [15, 16, 4, 12, 17].

Another classical approach is to perform a linear search on the value of the objective function [8]. In this case, whenever a new solution is found, the upper bound value is updated and a new constraint is added such that all solutions with a higher value are excluded. Several PBO solvers use this approach [23, 9, 14, 1]. Moreover, by using an encoding to PBO, MaxSAT instances can also be solved with this approach [14].
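The linear upper-bound search just described can be sketched as follows (hypothetical helper names; a brute-force enumerator stands in for the pseudo-Boolean solver, and the "new constraint" is the strict bound passed to it):

```python
from itertools import product

def cost(assign, terms):
    """Value of a linear objective sum(w * x_v) under an assignment."""
    return sum(w for w, v in terms if assign[v])

def find_model(clauses, terms, bound, nvars):
    """Brute-force stand-in for a PB solver: any assignment satisfying all
    clauses whose objective value is strictly below the given bound."""
    for bits in product([False, True], repeat=nvars):
        a = {i + 1: b for i, b in enumerate(bits)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses) \
                and cost(a, terms) < bound:
            return a
    return None

def linear_search(clauses, terms, nvars):
    """Tighten the upper bound until the bounding constraint makes the
    formula unsatisfiable; the last recorded solution is then optimal."""
    ub, best = sum(w for w, _ in terms) + 1, None
    while True:
        model = find_model(clauses, terms, ub, nvars)
        if model is None:
            return best, ub
        best, ub = model, cost(model, terms)
```

For the instance min 2·x1 + 3·x2 subject to (x1 ∨ x2), the loop finds a solution of value 3, tightens the bound, finds value 2, and then proves no value below 2 exists.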

Notice that the unsatisfiability-based procedures correspond to searching on the lower bound of the value of the optimal solution: at each iteration the working formula is unsatisfiable, and the algorithm terminates when the working formula becomes satisfiable. On the other hand, a linear search on the values of the objective function corresponds to searching on the upper bound. In this case, the working formula is satisfiable at each iteration; the algorithm terminates when the problem instance becomes unsatisfiable, and the optimum value is given by the last recorded solution.

An algorithm that searches on both the lower and upper bounds of the objective function has already been proposed [19]. The search is initially done by a pseudo-Boolean solver that searches on the upper bound value of the objective function. However, the use of the pseudo-Boolean solver is limited to 10% of the time limit given to solve the formula. If the PBO solver proves optimality within this time limit, the optimal solution has been found without having to search on the lower bound side. On the other hand, if the PBO solver is not able to prove optimality within the time limit, an unsatisfiability-based algorithm is used to search on the lower bound value of the objective function. wbo [18, 19] is a weighted Boolean optimization solver that uses this approach. Experimental results show that searching on the upper and lower bound values leads to solving more instances. Since these approaches are orthogonal, they complement each other on several classes of problem instances. In this paper, for simplicity of the algorithmic description, it is assumed that the Boolean optimization problem to be solved is weighted partial MaxSAT. However, the algorithms described next can be easily generalized to other Boolean formulations.

3 In MaxSAT a soft constraint is a clause, but in more general formulations it can be any linear pseudo-Boolean constraint.

3.1 Parallel Search

Nowadays, extra computing power no longer comes from higher processor frequencies but rather from a growing number of cores and processors. Exploiting this architecture will allow Boolean optimization solvers to become more effective and to solve more problem instances. In this section we propose to perform a parallel search on the upper and lower bound values of the objective function. Even though searching on both the upper and the lower bound is not new [21, 19], searching on both of them in parallel is, to the best of our knowledge, novel. In this paper we propose a parallelization of the wbo solver; the new solver is named pwbo. pwbo uses a linear search algorithm to search on the upper bound side and an unsatisfiability-based algorithm to search on the lower bound side.

A parallel search with these two orthogonal strategies results in a performance as good as the best strategy for each problem instance. However, if both threads cooperate through clause sharing, it is possible to perform better than the best strategy. Additionally, both strategies can also cooperate in finding the optimum value. If, during the search, the lower bound value provided by the unsatisfiability-based algorithm and the upper bound value provided by the other thread become the same, the optimum solution has been found. Therefore, it is not necessary for either thread to continue the search to prove optimality, since their combined information already proves it.
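The meet-in-the-middle termination condition can be sketched with two threads over shared bounds. This is a coordination sketch only, not pwbo's code: the scripted update sequences stand in for the unsatisfiability-based and linear-search algorithms, and the class and function names are hypothetical:

```python
import threading

class Bounds:
    """Shared lower/upper bounds; the search stops as soon as they meet."""
    def __init__(self, lb, ub):
        self.lb, self.ub = lb, ub
        self.lock = threading.Lock()
        self.done = threading.Event()

    def raise_lb(self, value):
        with self.lock:
            self.lb = max(self.lb, value)
            if self.lb >= self.ub:
                self.done.set()      # combined information proves optimality

    def lower_ub(self, value):
        with self.lock:
            if value < self.ub:
                self.ub = value
            if self.lb >= self.ub:
                self.done.set()

def worker(bounds, updates, apply_update):
    for v in updates:
        if bounds.done.is_set():
            return                   # the other thread already closed the gap
        apply_update(v)

# Scripted stand-ins for the two search algorithms (the optimum here is 4):
b = Bounds(lb=0, ub=11)
t_lower = threading.Thread(target=worker, args=(b, [1, 2, 4], b.raise_lb))
t_upper = threading.Thread(target=worker, args=(b, [9, 6, 4], b.lower_ub))
t_lower.start(); t_upper.start()
t_lower.join(); t_upper.join()
```

Neither thread has to prove optimality on its own: the run ends once the lower-bound thread has reached 4 and the upper-bound thread has matched it.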

3.2 Clause Sharing

It is well known that conflict-driven clause learning is crucial to the efficiency of modern Boolean optimization solvers. The description of conflict-driven clause learning procedures is outside the scope of this paper and is assumed; we refer to the literature for detailed explanations of these procedures [22, 25]. In the context of parallel solving, sharing learned clauses is expected to further prune the search space and boost the performance of the parallel solver.

In parallel SAT solving, learned clauses that have fewer than a given number of literals are shared among the different threads. More advanced heuristics can be used for controlling the throughput and quality of the shared clauses [11]. Moreover, the literal block distance [5] can also be used for sharing clauses in a parallel context [13]. In our approach, we start by sharing clauses that have 5 or fewer literals. This cutoff is dynamically adjusted using the throughput and quality heuristic proposed by Hamadi et al. [11]. Additionally, all clauses with literal block distance 2 are also shared.
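The sharing policy of the paragraph above can be sketched as a pair of small predicates. The size/LBD filter follows the text directly; the adjustment function is only a crude illustrative stand-in (target, lo and hi are invented here) for the throughput-and-quality heuristic of Hamadi et al. [11], which is considerably more elaborate:

```python
def should_share(clause, lbd, size_cutoff):
    """Share a learned clause if it is short enough, or unconditionally
    if its literal block distance is 2."""
    return len(clause) <= size_cutoff or lbd == 2

def adjust_cutoff(shared_recently, cutoff, target=10, lo=1, hi=20):
    """Illustrative throughput control: widen the size cutoff when too few
    clauses were shared in the last window, narrow it when too many were."""
    if shared_recently < target:
        return min(hi, cutoff + 1)
    if shared_recently > target:
        return max(lo, cutoff - 1)
    return cutoff
```

With the initial cutoff of 5, a 3-literal clause is shared regardless of its LBD, and an 8-literal clause is shared only when its LBD is 2.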


It should be noted that in the pwbo solver not all conflict-driven learned clauses can be shared between the two threads. This is due to the fact that the working formulas are different. In the unsatisfiability-based algorithm, the input formula ϕMS is a weighted partial MaxSAT formula with soft and hard constraints. However, in the thread that performs the linear search on the upper bound value of the objective function, we encode the input formula ϕMS into a PBO formulation ϕPBO. As a result of that encoding (see example 1), the set of variables in ϕPBO might have been extended with additional relaxation variables necessary to encode the soft clauses of the original formula ϕMS. In order to define the conditions for safe clause sharing, we start by defining soft and hard learned clauses.

Definition 1 (Soft and Hard Learned Clauses). If, in the conflict analysis procedure used in the unsatisfiability-based algorithm, at least one soft clause is used in the clause learning process, then the generated learned clause is labeled soft. On the other hand, if only hard clauses are used, then the generated learned clause is labeled hard.

Since ϕMS contains both soft and hard clauses, it will also have soft and hard learned clauses. On the other hand, ϕPBO only has hard clauses and, as a result, will only have hard learned clauses. Nevertheless, as mentioned previously, ϕPBO may contain additional variables not present in ϕMS. As a result, the safe sharing procedure between the two threads is as follows:

– A hard learned clause from the unsatisfiability-based algorithm can be safely shared with the other thread. This is due to the fact that the resolution operations used in ϕMS can also be reproduced in ϕPBO, since all original hard clauses of ϕMS are also present in ϕPBO.

– A soft learned clause from the unsatisfiability-based algorithm is not shared, since it may not be valid for formula ϕPBO.

– A hard learned clause generated when solving ϕPBO can be shared with the unsatisfiability-based algorithm if the learned clause does not contain relaxation variables. This is safe since one can reproduce the generation of the hard learned clause by resolution steps using only hard clauses also present in ϕMS.

Finally, between iterations of the unsatisfiability-based algorithm, the working formula ϕMS is also extended with additional relaxation variables. However, since these variables are added to soft clauses, if a conflict-based learned clause contains any relaxation variable, then it will necessarily be labeled soft, since at least one soft clause must have been used in the learning procedure.
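The safe-sharing conditions above reduce to two small predicates, one per direction (a sketch with hypothetical names; the soft/hard label is the one from Definition 1, and relaxation variables are those introduced by the encoding or by the relaxation iterations):

```python
def can_share_to_pbo(label):
    """MaxSAT thread -> PBO thread: only hard learned clauses (no soft
    clause took part in conflict analysis) are valid in the PBO formula."""
    return label == "hard"

def can_share_to_maxsat(clause, relaxation_vars):
    """PBO thread -> MaxSAT thread: all PBO learned clauses are hard, but
    they are only valid in the MaxSAT formula when they mention no
    encoding-introduced relaxation variables."""
    return not any(abs(lit) in relaxation_vars for lit in clause)
```

So a clause learned on the PBO side such as (x1 ∨ ¬x3) is shareable, while one mentioning a relaxation variable is not.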

4 Parallel Search on the Upper Bound Value

The previous section presented a parallel search solver for Boolean optimization based on two orthogonal strategies, with one thread used for each strategy. For computer architectures with more than two cores, we can extend the previous idea by performing a parallel search on the upper bound value of the objective function. Therefore, if n cores are available, we can use one thread to search on the lower bound value of the objective function, while at the same time k threads search on different local upper bound values of the objective function and n − k − 1 threads search on the upper bound value of the objective function. Local bound threads enforce a local upper bound value in their search. The iterative search on different local upper bound values leads to constant updates of the lower and upper bound values, which reduce the search space. Next, an example of this approach is described; afterwards, a more detailed description of the algorithm is provided.

Example 2. Consider a weighted partial MaxSAT formula ϕMS as input. For the input formula, one can easily find initial lower and upper bounds; suppose the initial lower and upper bound values are 0 and 11, respectively. Moreover, consider that the optimal solution is 3 and that our goal is to find it using four threads, t0, t1, t2 and t3. Thread t0 applies an unsatisfiability-based algorithm (i.e., it searches on the lower bound of the optimum value of the objective). This thread starts with a lower bound of 0 and iteratively increases the lower bound until the optimum value is found.

Thread t1 searches on the upper bound value of the objective function, while threads t2 and t3 search on different local upper bound values of the objective function. The initial input formula ϕMS is encoded into the pseudo-Boolean formalism (see Section 2) and an additional constraint is added to limit the value of the objective function in each thread. For example, thread t1 starts its search with an upper bound value of 11, and threads t2 and t3 can start their search with local upper bound values of 3 and 7, respectively.

Suppose that thread t2 finishes its computation and finds that the formula is unsatisfiable for an upper bound of 3. This means that there is no solution with value 0, 1 or 2 for the objective function; therefore, the global lower bound value can be updated to 3. Thread t2 is now free to search on a different local upper bound value, for example 5. In the meantime, thread t3 finds a solution with objective value 6; hence, the global upper bound value can be updated to 6. Thread t1 updates its upper bound value to 6, and thread t3 is now free to search on a different local upper bound value, for example 4. Afterwards, consider that thread t1 finds a solution with objective value 3. Again, the global upper bound value can be updated to 3. Since the global lower bound value is now the same as the global upper bound value, the optimum has been found and the search terminates.
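The bound bookkeeping of Example 2 can be replayed with a small sketch (hypothetical class, sequential rather than threaded): an UNSAT answer at a local upper bound v rules out every value below v and so raises the global lower bound, while any solution lowers the global upper bound:

```python
class BoundManager:
    """Global bounds shared by the threads of Example 2."""
    def __init__(self, lb, ub):
        self.lb, self.ub = lb, ub

    def report_unsat(self, local_ub):
        self.lb = max(self.lb, local_ub)   # no solution with value < local_ub exists

    def report_solution(self, value):
        self.ub = min(self.ub, value)      # a solution of this value was found

    def optimum_found(self):
        return self.lb == self.ub

# Replay of Example 2's events:
mgr = BoundManager(lb=0, ub=11)
mgr.report_unsat(3)      # t2: unsatisfiable with local upper bound 3 -> lb = 3
mgr.report_solution(6)   # t3: solution of value 6 -> ub = 6
mgr.report_solution(3)   # t1: solution of value 3 -> ub = 3, bounds meet
```

After the last event the bounds meet at the optimum, 3, and the search terminates.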

4.1 Algorithmic Description

In what follows, it is shown how the parallel search on the values of the objective function can be implemented in pwbo. Algorithm 1 describes pwbo. It receives a weighted partial MaxSAT formula (ϕMS) and the number of available threads (n). The thread with index 0 is referred to as the lower bound thread and applies an unsatisfiability-based algorithm to ϕMS. The thread with index 1 is referred to as the upper bound thread and searches on the upper bound value of the objective function. The threads indexed 2 to n − 1 are referred to as local upper bound threads and search on different local upper bound values of the objective function. For the sake of simplicity, it is considered that there is only one thread that searches on the upper bound value of the objective function. However, this algorithm can be easily generalized for k local upper



Algorithm 2 Parallel Algorithms for Boolean Optimization

PARALLELLOWERALG(ϕ)
 1  localLB ← 0
 2  while (search)
 3  do (st, ϕU, model) ← PBSOLVER(ϕ)
 4     if st = UNSAT
 5        then localLB ← localLB + COREWEIGHT(ϕU)
 6             RELAXCORE(ϕ, ϕU)
 7             UPDATELOWERBOUND(localLB, 0)
 8             if globalUB = globalLB
 9                then search ← false
10     else if st = SAT
11        then UPDATEUPPERBOUND(localLB, 0)
12             globalModel ← model
13             search ← false

PARALLELUPPERALG(ϕ, id)
 1  while (search)
 2  do (st, ϕU, model) ← PBSOLVER(ϕ ∪ { Σ cj lj ≤ threadUB[id] − 1 })
 3     if st = UNSAT
 4        then UPDATELOWERBOUND(threadUB[id], id)
 5             CLEARLOCALCONSTRAINTS(ϕ)
 6     else if st = FORCED ABORT
 7        then CLEARLOCALCONSTRAINTS(ϕ)
 8     else if st = SAT
 9        then UPDATEUPPERBOUND(VALUE(model), id)
10             globalModel ← model
11     if globalUB = globalLB
12        then search ← false

unsatisfiable sub-formula provided by the PB solver if ϕ is unsatisfiable, and model<br />

contains an assignment to the variables of ϕ when the formula is satisfiable. In this<br />

thread, if the outcome of the PB solver is forced abort, it means an optimal solution has<br />

been found by another thread (search was set to false) and the procedure terminates.<br />

When the status of the PB solver is unsatisfiable (line 3), the unsatisfiable sub-formula<br />

ϕU is relaxed in the procedure RELAXCORE. We refer to the literature for<br />

the details of this procedure [2, 18]. Next, if localLB is greater than the current global<br />

lower bound, the global lower bound is updated in UPDATELOWERBOUND (line 7).<br />

Notice that this may result in forcing one or more upper bound threads to abort and<br />

updating their upper bound limits. Otherwise, it means that an upper thread has already<br />

proved a better lower bound, and the search proceeds. If the status of the PB solver<br />

is satisfiable (line 10) it means that the unsatisfiability-based algorithm has found an<br />

optimal solution. As a result, the upper bound is updated (line 11), the solution is stored<br />

(line 12) and the flag search is set to false so that the remaining threads terminate.<br />



10 R. Martins, V. Manquinho, I. Lynce<br />

The PARALLELUPPERALG procedure takes as input a PBO formula ϕ (section 2)<br />

and a thread identifier. At each iteration, a PB solver is used to solve ϕ (line 2), with an<br />

additional constraint that limits the value of the objective function. Let this constraint<br />

be named the thread bound constraint.<br />

Notice that the thread bound constraint cannot be shared among all threads, since<br />

it is only valid if the optimum value is lower than the thread upper bound. The same<br />

sharing rules must apply to conflict-driven learned clauses that depend on the thread<br />

bound constraint. Therefore, it is necessary to define what is a local constraint and in<br />

what conditions it can be shared with other threads.<br />

Definition 2 (Local Constraint). The thread bound constraint is labeled as a local<br />

constraint. Let ω be a conflict-driven learned clause and let ϕω be the set of constraints<br />

used in the implication graph to learn ω. The new clause ω is defined as a local constraint<br />

if at least one constraint in ϕω is a local constraint.<br />
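Definition 2 amounts to a transitive "taint" rule on learned clauses, which can be sketched in a few lines of Python (the identifiers are ours, not pwbo's):<br />

```python
def label_learned_clause(clause_id, antecedents, local_set):
    """Mark a learned clause as local iff any antecedent used in the
    implication graph to derive it is itself local (Definition 2)."""
    if any(a in local_set for a in antecedents):
        local_set.add(clause_id)

local_set = {"thread_bound"}   # the thread bound constraint is local by definition

label_learned_clause("w1", ["c3", "c7"], local_set)            # only global parents
label_learned_clause("w2", ["thread_bound", "c3"], local_set)  # one local parent
label_learned_clause("w3", ["w2", "c9"], local_set)            # locality is transitive

assert "w1" not in local_set        # w1 may be shared with other threads
assert {"w2", "w3"} <= local_set    # w2 and w3 must stay local
```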

After the call to the PB solver (line 3), if it returns unsatisfiable, it means that a new<br />

lower bound has been found. The lower bound is updated (line 4) and if the thread is<br />

searching on a local upper bound then it gets a new local upper bound value. Since the<br />

formula given to the PB solver was unsatisfiable, it is necessary to remove the thread<br />

bound constraint (line 5). Additionally, all local clauses are also removed since they<br />

may not be valid with the new local upper bound.<br />

If the status of the solver is forced abort, it means that some other thread already<br />

proved that the current search space is redundant. This can happen if the thread local<br />

upper bound is smaller than the global lower bound, or if the thread local upper bound<br />

is greater than the global upper bound. Local constraints are therefore removed (line<br />

7). In fact, the local constraints are only removed when the forced abort is caused by<br />

an update on the global lower bound value. Otherwise, local constraints remain valid.<br />

If the PB solver returns satisfiable, a new upper bound has been found. Therefore, the<br />

global upper bound is updated (line 9) and the model is stored (line 10). If the thread<br />

is searching on a local upper bound then it gets a new local upper bound value, since<br />

the upper bound thread will continue the search on the new upper bound that has been<br />

found. If the thread is searching on the global upper bound the search then proceeds<br />

as usual. Finally, after the necessary updates depending on the PB solver status, it is<br />

checked whether the global upper bound is equal to the global lower bound. If this<br />

occurs, optimality is proved and the search terminates (lines <strong>11</strong>-12).<br />

We should note that some details are not fully described in this algorithmic description<br />

due to lack of space. In particular, updates to global data structures are inside<br />

critical regions and locks are used to prevent two or more threads from updating these<br />

data structures at the same time. Moreover, updates to global lower and upper bounds<br />

only take place when the new values improve the current ones. Additionally, the update<br />

on the saved model is also inside a critical region and is only done when the global<br />

upper bound is updated.<br />
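The locking discipline just described can be illustrated with a small sketch (a hedged reconstruction; the class and attribute names are ours, using Python's `threading.Lock` in place of pwbo's actual synchronization code):<br />

```python
import threading

class SharedBounds:
    """Global bounds shared by all threads. Updates happen inside a
    critical region and only when they improve the current value; the
    saved model is updated under the same lock as the upper bound."""
    def __init__(self):
        self.lock = threading.Lock()
        self.global_lb = 0
        self.global_ub = float("inf")
        self.model = None

    def update_lower(self, value):
        with self.lock:                    # critical region
            if value > self.global_lb:     # improve-only update
                self.global_lb = value

    def update_upper(self, value, model):
        with self.lock:
            if value < self.global_ub:
                self.global_ub = value
                self.model = model         # model saved under the same lock

bounds = SharedBounds()
bounds.update_upper(6, "model@6")
bounds.update_upper(8, "model@8")   # worse value: silently ignored
assert bounds.global_ub == 6 and bounds.model == "model@6"
```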

Finally, when the global bounds are updated at UPDATELOWERBOUND and<br />

UPDATEUPPERBOUND, this may force the PB solver in other threads to stop<br />

(resulting in a forced abort status). As a result, new thread local upper bounds must be<br />

defined for the aborted threads. Hence, each aborted thread is assigned a new local upper<br />

bound that covers the broadest range of yet untested bounds. More formally, the new<br />


Table 1. Number of industrial partial MaxSAT instances solved by sequential and parallel solvers<br />

Benchmark set #I QMaxSAT pm2 wbo pwbo-2T pwbo-4T pwbo-4T-CNF<br />

bcp-fir 59 50 58 42 44 44 56<br />

bcp-hipp-yRa1 55 46 45 22 22 24 40<br />

bcp-msp 64 26 14 16 15 15 20<br />

bcp-mtg 40 40 40 31 32 33 40<br />

bcp-syn 74 32 39 34 36 36 40<br />

CircuitTraceCompaction 4 4 4 4 4 4 4<br />

HaplotypeAssembly 6 0 5 5 5 5 5<br />

pbo-mqc 168 153 129 147 167 168 168<br />

pbo-routing 15 15 15 15 15 15 15<br />

PROTEIN INS 12 6 3 1 1 1 2<br />

Total 497 372 352 317 341 345 390<br />

those categories. The evaluation was performed on two AMD Opteron 6172 processors<br />

(2.1 GHz with 64 GB of RAM) running Fedora Core 13 with a timeout of 1,800 seconds<br />

(wall clock time).<br />

The results were obtained by running each parallel solver on each instance for three<br />

times. Similarly to what is done when analyzing randomized solvers, the median time<br />

was taken into account. This means that an instance must be solved by at least two of<br />

the three runs to be considered solved. We should note, however, that this measure is<br />

more conservative than the one used in the SAT Race 2008 5 which is commonly used<br />

by parallel SAT solvers [11].<br />
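The median-of-three scoring rule can be stated precisely as a small helper (a sketch; the names and the `None` convention are ours):<br />

```python
# Each solver/instance pair is run three times and scored by the median
# wall-clock time; an instance counts as solved only if at least two of
# the three runs finish within the 1,800 s timeout.
TIMEOUT = 1800.0

def median_result(times):
    """times: three wall-clock times, with float('inf') for a timed-out run.
    Returns the median time, or None if the instance is not solved."""
    med = sorted(times)[1]                   # median of three runs
    return med if med < TIMEOUT else None

assert median_result([100.0, 120.0, float("inf")]) == 120.0   # 2/3 solved
assert median_result([90.0, float("inf"), float("inf")]) is None  # 1/3 solved
```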

Table 1 gives the number of partial MaxSAT instances from the industrial category<br />

that were solved by sequential and parallel solvers. The sequential solvers considered<br />

were QMaxSAT 6 (ranked 1st in the MaxSAT Evaluation 2010), pm2 [2] (ranked 2nd)<br />

and wbo [18, 19] (ranked 3rd). Note that wbo is also our reference solver, as the new<br />

parallel algorithms were implemented on top of wbo. SAT4J MAXSAT [14] and SAT4J<br />

MAXSAT RES//CP were not evaluated since their performance is not comparable to the<br />

remaining state-of-the-art partial MaxSAT solvers. For the 497 instances tested, SAT4J<br />

MAXSAT 2.2.3 and SAT4J MAXSAT RES//CP can only solve 277 and 290 instances,<br />

respectively.<br />

The parallel solvers evaluated correspond to the different versions of pwbo. pwbo<br />

is a parallel solver implemented on top of wbo. pwbo 2T uses two threads according<br />

to what is described in section 3, thus having one thread searching on the lower<br />

bound value and another thread searching on the upper bound value. pwbo 4T and pwbo<br />

4T-CNF use four threads according to what is described in section 4, thus having one<br />

thread searching on the lower bound value and three threads searching on the upper<br />

bound value. The difference between pwbo 4T and pwbo 4T-CNF is on the number<br />

of threads that search on local and global upper bound values. Increasing the number<br />

of threads that search on local upper bound values helps to reduce the search space<br />

by finding new lower and upper bounds. On the other hand, increasing the number of<br />

5 http://baldur.iti.uka.de/sat-race-2008/<br />

6 http://www.maxsat.udl.cat/10/solvers/QMaxSat.pdf<br />

Fig. 1. Cactus plot with running times of solvers (x-axis: instances, 0–400; y-axis: time in seconds, 0–1,800; series: wbo, pwbo T2, pwbo T4, PM2, QMaxSAT, pwbo T4-CNF).<br />

threads that search on the global upper bound increases the diversification of the search,<br />

since those threads are searching using different strategies. pwbo 4T uses two threads to<br />

search on local upper bound values and one thread to search on the global upper bound<br />

value. On the other hand, pwbo 4T-CNF uses one thread to search on local upper bound<br />

values and two threads to search on the global upper bound value with the different<br />

strategies described in section 4.2. The objective function for partial MaxSAT instances<br />

corresponds to a cardinality constraint, since all coefficients are 1. Therefore, pwbo<br />

4T-CNF uses Sinz’s encoding [24] to translate the cardinality constraint into clauses.<br />
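Sinz's sequential-counter encoding [24] for an at-most-k cardinality constraint can be sketched as follows. This is a generic reconstruction of the published encoding, not pwbo's implementation; the brute-force check at the end verifies it for n = 4, k = 2:<br />

```python
from itertools import count, product

def sinz_at_most_k(xs, k, fresh):
    """CNF clauses (DIMACS-style signed ints) enforcing sum(xs) <= k via
    Sinz's sequential counter [24]; `fresh` yields unused variable ids.
    Auxiliary s[i][j] reads: at least j+1 of xs[0..i] are true."""
    n = len(xs)
    s = [[next(fresh) for _ in range(k)] for _ in range(n - 1)]
    cls = [[-xs[0], s[0][0]]] + [[-s[0][j]] for j in range(1, k)]
    for i in range(1, n - 1):
        cls += [[-xs[i], s[i][0]], [-s[i - 1][0], s[i][0]]]
        for j in range(1, k):
            cls += [[-xs[i], -s[i - 1][j - 1], s[i][j]],
                    [-s[i - 1][j], s[i][j]]]
        cls += [[-xs[i], -s[i - 1][k - 1]]]   # the (k+1)-th true literal is forbidden
    cls += [[-xs[n - 1], -s[n - 2][k - 1]]]
    return cls

# Brute-force sanity check for n = 4, k = 2: with the auxiliary variables
# projected out, the encoding must accept exactly the assignments with <= 2 ones.
n, k = 4, 2
clauses = sinz_at_most_k([1, 2, 3, 4], k, count(n + 1))
n_aux = (n - 1) * k

def satisfiable_with(x_bits):
    for aux in product([False, True], repeat=n_aux):
        assign = dict(enumerate(x_bits + aux, start=1))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

for x_bits in product([False, True], repeat=n):
    assert satisfiable_with(x_bits) == (sum(x_bits) <= k)
```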

Clearly, all versions of pwbo perform better than the sequential solver wbo. When<br />

analyzing each benchmark family, one can conclude that the benefits obtained from<br />

parallel solvers are not the same for all benchmarks families, although in general the<br />

number of solved instances tends to increase for all families. There is a significant boost<br />

when using two threads (pwbo T2), showing that a parallel search on the lower and<br />

upper bounds makes the search more efficient and solves more instances. When using<br />

four threads the number of solved instances still increases. pwbo T4 shows that reducing<br />

the search space by doing a local upper bound search allows solving more instances.<br />

Another significant boost is given by the diversification of the search. Indeed, pwbo<br />

T4-CNF with its combination of search diversification and search space reduction is<br />

able to solve more instances than the best sequential solver (QMaxSAT), thus improving<br />

the current state of the art.<br />

Figure 1 contains a cactus plot with the running times of all the solvers for which<br />

data is given in Table 1. Without doubt, the parallel versions of pwbo perform better<br />

than wbo. Moreover, the best performing solver is pwbo T4-CNF, which clearly outperforms<br />

all other solvers, including the best sequential solver QMaxSAT. Finally, Table 2<br />


Table 2. Speedup on the 312 instances solved by wbo and all pwbo solvers<br />

Solver Time (s) Speedup<br />

wbo 36,208.33 1.00<br />

pwbo 2T 22,798.28 1.59<br />

pwbo 4T 18,203.79 1.99<br />

pwbo 4T-CNF 13,236.87 2.74<br />

contains the speedup resulting from using pwbo, the parallel version of wbo. wbo is<br />

compared against pwbo 2T, pwbo 4T and pwbo 4T-CNF. The results are conclusive. The<br />

speedup increases as the number of threads increases, being almost 2 in pwbo 4T when<br />

local upper bound search is used and close to 3 in pwbo 4T-CNF when diversification of<br />

the search is combined with reduction of the search space.<br />
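The speedup column in Table 2 is simply wbo's total time divided by each parallel solver's total time; the table's arithmetic can be checked directly:<br />

```python
# Speedup = wbo's total time / parallel solver's total time (Table 2).
wbo_time = 36208.33

speedup = {name: round(wbo_time / t, 2)
           for name, t in [("pwbo 2T", 22798.28),
                           ("pwbo 4T", 18203.79),
                           ("pwbo 4T-CNF", 13236.87)]}

# Matches the reported speedups of 1.59, 1.99 and 2.74.
assert speedup == {"pwbo 2T": 1.59, "pwbo 4T": 1.99, "pwbo 4T-CNF": 2.74}
```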

6 Conclusions<br />

This paper introduces new parallel algorithms for Boolean optimization. This work was<br />

in part motivated by the recent success of parallel SAT algorithms, also taking into account<br />

that parallel algorithms for Boolean optimization are scarce. Two new algorithms<br />

were proposed. The first algorithm uses two threads, one searching on the lower bound<br />

value and the other one searching on the upper bound value of the objective function.<br />

The second algorithm uses an additional number of threads to search on local upper<br />

bound values. Moreover, this algorithm is further improved by increasing the diversification<br />

of the search through different search strategies on the global upper bound.<br />

Experimental results, obtained on a significant number of problem instances, clearly<br />

show the efficiency of the new proposed algorithms.<br />

Due to the success of our approach in partial MaxSAT, we plan to further extend<br />

our evaluation to weighted Boolean optimization, as future work. Moreover, we propose<br />

to further increase the diversification of the search by implementing a portfolio of<br />

complementary algorithms. The portfolio of algorithms can then be used to search on<br />

local and global upper bounds thus increasing the efficiency of the solver. Finally, an<br />

experimental study of the scalability of our approach should also be performed.<br />

Acknowledgement. This work was partially supported by FCT under research projects<br />

BSOLO (PTDC/EIA/76572/2006) and iExplain (PTDC/EIA-CCO/102077/2008), and<br />

INESC-ID multiannual funding through the PIDDAC program funds.<br />

References<br />

1. F. Aloul, A. Ramani, I. Markov, and K. A. Sakallah. Generic ILP versus specialized 0-1 ILP:<br />

An update. In International Conference on Computer-Aided Design, pages 450–457, 2002.<br />

2. C. Ansótegui, M. Bonet, and J. Levy. Solving (Weighted) Partial MaxSAT through Satisfiability<br />

Testing. In International Conference on Theory and Applications of Satisfiability<br />

Testing, pages 427–440, 2009.<br />

3. C. Ansótegui, M. Bonet, and J. Levy. A New Algorithm for Weighted Partial MaxSAT. In<br />

AAAI Conference on Artificial Intelligence, pages 3–8, 2010.<br />


4. J. Argelich, C. M. Li, and F. Manyà. An improved exact solver for partial Max-SAT. In<br />

International Conference on Nonconvex Programming: Local and Global Approaches, pages<br />

230–231, 2007.<br />

5. G. Audemard and L. Simon. Predicting Learnt Clauses Quality in Modern SAT Solvers. In<br />

International Joint Conference on Artificial Intelligence, pages 399–404, 2009.<br />

6. O. Bailleux, Y. Boufkhad, and O. Roussel. A Translation of Pseudo Boolean Constraints to<br />

SAT. Journal on Satisfiability, Boolean Modeling and Computation, 2:191–200, 2006.<br />

7. O. Bailleux, Y. Boufkhad, and O. Roussel. New Encodings of Pseudo-Boolean Constraints<br />

into CNF. In International Conference on Theory and Applications of Satisfiability Testing,<br />

pages 181–194, 2009.<br />

8. P. Barth. A Davis-Putnam Enumeration Algorithm for Linear Pseudo-Boolean Optimization.<br />

Technical Report MPI-I-95-2-003, Max Planck Institute for Computer Science, 1995.<br />

9. N. Eén and N. Sörensson. Translating pseudo-Boolean constraints into SAT. Journal on<br />

Satisfiability, Boolean Modeling and Computation, 2:1–26, 2006.<br />

10. Z. Fu and S. Malik. On solving the partial MAX-SAT problem. In International Conference<br />

on Theory and Applications of Satisfiability Testing, pages 252–265, 2006.<br />

11. Y. Hamadi, S. Jabbour, and L. Sais. Control-Based Clause Sharing in Parallel SAT Solving.<br />

In International Joint Conference on Artificial Intelligence, pages 499–504, 2009.<br />

12. F. Heras, J. Larrosa, and A. Oliveras. MiniMaxSAT: An efficient weighted Max-SAT solver.<br />

Journal of Artificial Intelligence Research, 31:1–32, 2008.<br />

13. S. Kottler. SArTagnan. SAT Race, Solver Description, 2010.<br />

14. D. Le Berre and A. Parrain. The Sat4j library, release 2.2: system description. Journal on<br />

Satisfiability, Boolean Modeling and Computation, 7:59–64, 2010.<br />

15. C. M. Li, F. Manyà, and J. Planes. New inference rules for Max-SAT. Journal of Artificial<br />

Intelligence Research, 30:321–359, 2007.<br />

16. H. Lin and K. Su. Exploiting inference rules to compute lower bounds for MAX-SAT solving.<br />

In International Joint Conference on Artificial Intelligence, pages 2334–2339, 2007.<br />

17. V. Manquinho and J. Marques-Silva. Search pruning techniques in SAT-based branch-and-bound<br />

algorithms for the binate covering problem. IEEE Transactions on Computer-Aided<br />

Design, 21(5):505–516, 2002.<br />

18. V. Manquinho, J. Marques-Silva, and J. Planes. Algorithms for Weighted Boolean Optimization.<br />

In International Conference on Theory and Applications of Satisfiability Testing, pages<br />

495–508, 2009.<br />

19. V. Manquinho, R. Martins, and I. Lynce. Improving Unsatisfiability-Based Algorithms for<br />

Boolean Optimization. In International Conference on Theory and Applications of Satisfiability<br />

Testing, pages 181–193, 2010.<br />

20. J. Marques-Silva and V. Manquinho. Towards more effective unsatisfiability-based maximum<br />

satisfiability algorithms. In International Conference on Theory and Applications of<br />

Satisfiability Testing, pages 225–230, 2008.<br />

21. J. Marques-Silva and J. Planes. Algorithms for Maximum Satisfiability using Unsatisfiable<br />

Cores. In Design, Automation and Testing in Europe Conference, pages 408–413, 2008.<br />

22. J. Marques-Silva and K. Sakallah. GRASP: A new search algorithm for satisfiability. In<br />

International Conference on Computer-Aided Design, pages 220–227, 1996.<br />

23. H. Sheini and K. Sakallah. Pueblo: A Modern Pseudo-Boolean SAT Solver. In Design,<br />

Automation and Testing in Europe Conference, pages 684–685, March 2005.<br />

24. C. Sinz. Towards an Optimal CNF Encoding of Boolean Cardinality Constraints. In International<br />

Conference on Principles and Practice of Constraint Programming, pages 827–831,<br />

2005.<br />

25. L. Zhang, C. F. Madigan, M. W. Moskewicz, and S. Malik. Efficient conflict driven learning<br />

in a Boolean satisfiability solver. In International Conference on Computer-Aided Design,<br />

pages 279–285, 2001.<br />



Hydra-MIP: Automated Algorithm Configuration and<br />

Selection for Mixed Integer Programming<br />

Lin Xu, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown<br />

Department of Computer Science<br />

University of British Columbia, Canada<br />

{xulin730, hutter, hoos, kevinlb}@cs.ubc.ca<br />

Abstract. State-of-the-art mixed integer programming (MIP) solvers are highly<br />

parameterized. For heterogeneous and a priori unknown instance distributions, no<br />

single parameter configuration generally achieves consistently strong performance,<br />

and hence it is useful to select from a portfolio of different configurations. HYDRA<br />

is a recent method for using automated algorithm configuration to derive multiple<br />

configurations of a single parameterized algorithm for use with portfolio-based<br />

selection. This paper shows that, leveraging two key innovations, HYDRA can<br />

achieve strong performance for MIP. First, we describe a new algorithm selection<br />

approach based on classification with a non-uniform loss function, which<br />

significantly improves the performance of algorithm selection for MIP (and SAT).<br />

Second, by modifying HYDRA’s method for selecting candidate configurations,<br />

we obtain better performance as a function of training time.<br />

1 Introduction<br />

Mixed integer programming (MIP) is a general approach for representing constrained<br />

optimization problems with integer-valued and continuous variables. Because MIP serves<br />

as a unifying framework for NP-complete problems and combines the expressive power<br />

of integrality constraints with the efficiency of continuous optimization, it is widely used<br />

both in academia and industry. MIP used to be studied mainly in operations research,<br />

but has recently become an important tool in AI, with applications ranging from auction<br />

theory [19] to computational sustainability [8]. Furthermore, several recent advances in<br />

MIP solving have been achieved with AI techniques [7, 13].<br />

One key advantage of the MIP representation is that highly optimized solvers can<br />

be developed in a problem-independent way. IBM ILOG’s CPLEX solver 1 is particularly<br />

well known for achieving strong practical performance; it is used by over 1,300<br />

corporations (including one-third of the Global 500) and researchers at more than 1,000<br />

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving<br />

Problems with Combinatorial Explosion (RCRA 2011).<br />

In conjunction with IJCAI 2011, Barcelona, Spain, July 17–18, 2011.<br />

1 http://ibm.com/software/integration/optimization/cplex-optimization-studio/<br />



universities [16]. Here, we propose improvements to CPLEX that have the potential to<br />

directly impact this massive user base.<br />

State-of-the-art MIP solvers typically expose many parameters to end users; for<br />

example, CPLEX 12.1 comes with a 221-page parameter reference manual describing<br />

135 parameters. The CPLEX manual warns that “integer programming problems are more<br />

sensitive to specific parameter settings, so you may need to experiment with them.” How<br />

should such solver parameters be set by a user aiming to solve a given set of instances?<br />

Obviously—despite the advice to “experiment”—effective manual exploration of such a<br />

huge space is infeasible; instead, an automated approach is needed.<br />

Conceptually, the most straightforward option is to search the space of algorithm<br />

parameters to find a (single) configuration that minimizes a given performance metric<br />

(e.g., average runtime). Indeed, CPLEX itself includes a self-tuning tool that takes this<br />

approach. A variety of problem-independent algorithm configuration procedures have<br />

also been proposed in the AI community, including I/F-Race [3], ParamILS [15, 14],<br />

and GGA [2]. Of these, only PARAMILS has been demonstrated to be able to effectively<br />

configure CPLEX on a variety of MIP benchmarks, with speedups up to several orders<br />

of magnitude, and overall performance substantially better than that of the CPLEX<br />

self-tuning tool [13].<br />

While automated algorithm configuration is often very effective, particularly when<br />

optimizing performance on homogeneous sets of benchmark instances, it is no panacea.<br />

In fact, it is characteristic of NP-hard problems that no single solver performs well on<br />

all inputs (see, e.g., [30]); a procedure that performs well on one part of an instance<br />

distribution often performs poorly on another. An alternative approach is to choose a<br />

portfolio of different algorithms (or parameter configurations), and to select between<br />

them on a per-instance basis. This algorithm selection problem [24] can be solved by<br />

gathering cheaply computable features from the problem instance and then evaluating<br />

a learned model to select the best algorithm [20, 9, 6]. The well-known SATZILLA [30]<br />

method uses a regression model to predict the runtime of each algorithm and selects<br />

the algorithm predicted to perform best. Its performance in recent SAT competitions<br />

illustrates the potential of portfolio-based selection: it is the best known method for<br />

solving many types of SAT instances, and almost always outperforms all of its constituent<br />

algorithms.<br />

Portfolio-based algorithm selection also has a crucial drawback: it requires a strong<br />

and sufficiently uncorrelated portfolio of solvers. While the literature has produced many<br />

different approaches for solving SAT, there are few strong MIP solvers, and the ones that<br />

do exist have similar architectures. However, algorithm configuration and portfolio-based<br />

algorithm selection can be combined to yield automatic portfolio construction methods<br />

applicable to domains in which only a single, highly parameterized algorithm exists.<br />

Two such approaches have been proposed in the literature. HYDRA [28] is an iterative<br />

procedure. It begins by identifying a single configuration with the best overall performance,<br />

and then iteratively adds algorithms to the portfolio by applying an algorithm<br />

configurator with a customized, dynamic performance metric. At runtime, algorithms<br />

are selected from the portfolio as in SATZILLA. ISAC [17] first divides instance sets into<br />



clusters based on instance features using the G-means clustering algorithm, then applies<br />

an algorithm configurator to find a good configuration for each cluster. At runtime,<br />

ISAC computes the distance in feature space to each cluster centroid and selects the<br />

configuration for the closest cluster. We note two theoretical reasons to prefer HYDRA to<br />

ISAC. First, ISAC’s clustering is solely based on distance in feature space, completely<br />

ignoring the importance of each feature to runtime. Thus, ISAC’s performance can<br />

change dramatically if additional features are added (even if they are uninformative).<br />

Second, no amount of training time allows ISAC to recover from a misleading initial<br />

clustering or an algorithm configuration run that yields poor results. In contrast, HYDRA<br />

can recover from poor algorithm configuration runs in later iterations.<br />

In this work, we show that HYDRA can be used to build strong portfolios of CPLEX<br />

configurations, dramatically improving CPLEX’s performance for a variety of MIP<br />

benchmarks, as compared to ISAC, algorithm configuration alone, and CPLEX’s default<br />

configuration. This achievement leverages two modifications to the original HYDRA approach,<br />

presented in Section 2. Section 3 describes the features and CPLEX parameters<br />

we identified for use with HYDRA, along with the benchmark sets upon which we evaluated<br />

it. Section 4 evaluates HYDRA-MIP and presents evidence that our improvements to<br />

HYDRA are also useful beyond MIP. Section 5 concludes and describes future work.<br />

2 Improvements to Hydra<br />

It is difficult to directly apply the original HYDRA method to the MIP domain, for two<br />

reasons. First, the data sets we face in MIP tend to be highly heterogeneous; preliminary<br />

prediction experiments (not reported here for brevity) showed that HYDRA’s linear<br />

regression models were not robust for such heterogeneous inputs, sometimes yielding<br />

extreme mispredictions of more than ten orders of magnitude. Second, individual HYDRA<br />

iterations can take days to run—even on a large computer cluster—making it difficult<br />

for the method to converge within a reasonable amount of time. (We say that HYDRA<br />

has converged when substantial increases in running time stop leading to significant<br />

performance gains.)<br />

In this section, we describe improvements to HYDRA that address both of these issues.<br />

First, we modify the model-building method used by the algorithm selector, using a<br />

classification procedure based on decision forests with a non-uniform loss function.<br />

Second, we modify HYDRA to add multiple solvers in each iteration and to reduce the<br />

cost of evaluating these candidate solvers, speeding up convergence. We denote the<br />

original method as HydraLR,1 (“LR” stands for linear regression and “1” indicates the<br />

number of configurations added to the portfolio per iteration), the new method including<br />

only our first improvement as HydraDF,1 (“DF” stands for decision forests), and the<br />

full new method as HydraDF,k.<br />



2.1 Decision forests for algorithm selection<br />

There are many existing techniques for algorithm selection, based on either regression<br />

[30, 26] or classification [10, 9, 25, 23]. SATZILLA [30] uses linear basis function<br />

regression to predict the runtime of each of a set of K algorithms, and picks the one<br />

with the best predicted performance. Although this approach has led to state-of-the-art<br />

performance for SAT, it does not directly minimize the cost of running the portfolio<br />

on a set of instances, but rather minimizes the prediction error separately in each of K<br />

predictive models. This has the advantage of penalizing costly errors (picking a slow<br />

algorithm over a fast one) more than less costly ones (picking a fast algorithm over a<br />

slightly faster one), but cannot be expected to perform well when training data is sparse.<br />

Stern et al. [26] applied the recent Bayesian recommender system Matchbox to algorithm<br />

selection; similar to SATZILLA, this approach is cost-sensitive and uses a regression<br />

model that predicts the performance of each algorithm. CPHYDRA [23] uses case-based<br />

reasoning to determine a schedule of constraint satisfaction solvers (instead of picking a<br />

single solver). Its k-nearest neighbor approach is simple and effective, but determines<br />

similarity solely based on instance features (ignoring instance hardness). Finally, ISAC<br />

uses a cost-agnostic clustering approach for algorithm selection. Our new selection<br />

procedure uses an explicit cost-sensitive loss function—punishing misclassifications in<br />

direct proportion to their impact on portfolio performance—without predicting runtime.<br />

Such an approach has never before been applied to algorithm selection: all existing classification<br />

approaches use a simple 0–1 loss function that penalizes all misclassification<br />

equally (e.g., [25, 9, 10]). Specifically, this paper describes a cost-sensitive classification<br />

approach based on decision forests (DFs). Particularly for heterogeneous benchmark<br />

sets, DFs offer the promise of effectively partitioning the feature space into qualitatively<br />

different parts. In contrast to clustering methods, DFs take runtime into account when<br />

determining that partitioning.<br />

We constructed cost-sensitive DFs as collections of T cost-sensitive decision trees [27].<br />

Following [4], given n training data points with k features each, for each tree we construct<br />

a bootstrap sample of n training data points sampled uniformly at random with<br />

repetitions; during tree construction, we sample a random subset of log2(k) + 1 features<br />

at each internal node to be considered for splitting the data at that node. Predictions are<br />

based on majority votes across all T trees. For a set of m algorithms {s1,...,sm}, an<br />

n × k matrix holding the values of k features for each of n training instances, and an<br />

n × m matrix P holding the performance of the m algorithms on the n instances, we<br />

construct our selector based on m · (m − 1)/2 pairwise cost-sensitive decision forests,<br />

determining the labels and costs as follows. For any pair of algorithms (i, j), we train a<br />

cost-sensitive decision forest DF(i, j) on the following weighted training data: we label<br />

an instance q as i if P (q, i) is better than P (q, j), and as j otherwise; the weight for that<br />

instance is |P (q, i) − P (q, j)|. For test instances, we apply each DF(i, j) to vote for<br />

either i or j and select the algorithm with the most votes as the best algorithm for that<br />

instance. Ties are broken by only counting the votes from those decision forests that<br />

involve algorithms which received equal votes; further ties are broken randomly.<br />
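The labeling, weighting, and voting scheme just described can be sketched compactly. The following is a minimal illustration, not the authors' implementation; the function names are ours, and the forests themselves are abstracted away, since any classifier honoring per-instance weights could stand in.

```python
import random
from collections import Counter

def pairwise_labels_and_weights(P, i, j):
    # P[q][s]: performance (e.g. runtime) of algorithm s on instance q; lower is better.
    # Label each instance with the better of i and j, and weight it by the
    # performance gap, so that a misclassification costs exactly what it
    # would cost the portfolio.
    labels = [i if P[q][i] <= P[q][j] else j for q in range(len(P))]
    weights = [abs(P[q][i] - P[q][j]) for q in range(len(P))]
    return labels, weights

def select_algorithm(pair_votes, rng=random):
    # pair_votes: {(i, j): winner}, one vote per pairwise forest DF(i, j)
    # for a single test instance.
    counts = Counter(pair_votes.values())
    top = max(counts.values())
    tied = [a for a, c in counts.items() if c == top]
    if len(tied) > 1:
        # Tie-break: recount only the forests whose two algorithms both tied.
        sub = Counter(w for (i, j), w in pair_votes.items()
                      if i in tied and j in tied)
        if sub:
            best = max(sub.values())
            tied = [a for a, c in sub.items() if c == best]
    return tied[0] if len(tied) == 1 else rng.choice(tied)
```

With two instances and runtimes P = [[1.0, 5.0], [2.0, 1.0]], the pair (0, 1) yields labels [0, 1] with weights [4.0, 1.0]: the first instance matters four times as much to get right.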



We made one further change to the mechanism gleaned from SATZILLA. Originally, a subset of candidate solvers was chosen by determining the subset for which portfolio performance is maximized, taking model mispredictions into account; a similar procedure was used to determine presolver policies. These internal optimizations were performed on the same instance set used to train the models, which can be problematic if the models overfit the training data; in this work, we therefore use 10-fold cross-validation instead.

2.2 Speeding up convergence

HYDRA uses an automated algorithm configurator as a subroutine, which is called in every iteration to find a configuration that augments the current portfolio as well as possible. Since algorithm configuration is a hard problem, configuration procedures are incomplete and typically randomized. Because a single run of a randomized configuration procedure might not yield a high-performing parameter configuration, it is common practice to perform multiple runs in parallel and to use the configuration that performs best on the training set [12, 14, 28, 13].

Here, we make two modifications to HYDRA to speed up its convergence. First, in each iteration, we add k promising configurations to the portfolio, rather than just the single best. If algorithm configuration runs were inexpensive, this modification to HYDRA would not help: additional configurations could always be found in later iterations, if they indeed complemented the portfolio at that point. However, when each iteration must repeatedly solve many difficult MIP instances, it may be impossible to perform more than a small number of HYDRA iterations within any reasonable amount of time, even when using a computer cluster. In such a case, when many good (and rather different) configurations are found in an iteration, it can be wasteful to retain only one of them.

Our second change to HYDRA concerns the way that the 'best' configurations returned by different algorithm configuration runs are identified. HydraDF,1 determines the 'best' of the configurations found in a number of independent configurator runs by evaluating each configuration on the full training set and selecting the one with the best performance. This evaluation phase can be very costly: e.g., if we use a cutoff time of 300 seconds per run during training and have 1,000 instances, then computing the training performance of each candidate configuration can take nearly four CPU days. Therefore, in HydraDF,k, we select the configuration for which the configuration procedure's internal estimate of the average performance improvement over the existing portfolio is largest. This alternative is computationally cheap: it does not require any evaluations of configurations beyond those already performed by the configurator. However, it is also potentially risky: different configurator runs typically use the training instances in a different order and evaluate configurations using different numbers of instances. It is thus possible that the configurator's internal estimate of improvement for a parameter configuration is high, but that the configuration turns out not to help on instances the configurator has not yet used.



Fortunately, adding k parameter configurations to the portfolio in each iteration mitigates this problem: if each of the k selected configurations has independent probability p of yielding a poor configuration, the probability of all k configurations being poor is only p^k.
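This argument is easy to quantify; the following toy helper (ours, with p = 0.3 chosen purely for illustration) makes the point:

```python
def prob_all_poor(p, k):
    # If each of the k configurations added per iteration is poor with
    # independent probability p, this is the chance that all of them are.
    return p ** k
```

With p = 0.3, adding k = 4 configurations per iteration leaves a probability of only 0.3^4 = 0.0081 that every one of them is poor, versus 0.3 for a single pick.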

3 MIP: Features, Data Sets, and Parameters

While the improvements to HYDRA presented above were motivated by MIP, they can nevertheless be applied to any domain. In this section, we describe all domain-specific elements of HYDRA-MIP: the MIP instance features upon which our models depend, the CPLEX parameters we configured, and the data sets upon which we evaluated our methods.

3.1 Features of MIP Instances

We constructed a large set of 139 MIP features, drawing on 97 existing features [21, 11, 17] and also including 42 new probing features. Specifically, existing work used features based on problem size, graph representations, proportions of different variable types (e.g., discrete vs. continuous), constraint types, coefficients of the objective function, the linear constraint matrix, and the right-hand side of the constraints. We extended those features by adding further descriptive statistics where applicable, such as medians, variation coefficients, and interquantile distances of vector-based features. For the first time, we also introduce a set of MIP probing features based on short runs of CPLEX with default settings. These comprise 20 single probing features and 22 vector-based features. The single probing features are as follows. Presolving features (6 in total) are the CPU times for presolving and relaxation, and the numbers of constraints, variables, nonzero entries in the constraint matrix, and clique table inequalities after presolving. Probing cut usage features (8 in total) are the number of each of 7 different cut types, and the total number of cuts applied. Probing result features (6 in total) are the MIP gap achieved, # of nodes visited, # of feasible solutions found, # of iterations completed, # of times CPLEX found a new incumbent by primal heuristics, and # of solutions or incumbents found. Our 22 vector-based features contain descriptive statistics (averages, medians, variation coefficients, and interquantile distances, i.e., q90 − q10) for the following 6 quantities reported by CPLEX over time: (a) improvement of the objective function; (b) number of integer-infeasible variables at the current node; (c) improvement of the best integer solution; (d) improvement of the upper bound; (e) improvement of the gap; (f) nodes left to be explored (average and variation coefficient only).
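As a rough sketch of how such vector-based statistics can be computed (the function name and the choice of population standard deviation are ours; the paper does not specify these details):

```python
import statistics

def vector_feature_stats(xs):
    # Descriptive statistics used for the vector-based probing features:
    # average, median, variation coefficient (std / mean), and the
    # interquantile distance q90 - q10.
    mean = statistics.mean(xs)
    qs = statistics.quantiles(xs, n=10)   # 9 cut points: q10, q20, ..., q90
    return {
        "avg": mean,
        "median": statistics.median(xs),
        # Assumption: population standard deviation; the paper leaves this open.
        "vc": statistics.pstdev(xs) / mean if mean else 0.0,
        "q90_q10": qs[-1] - qs[0],
    }
```

Applied to each time series CPLEX reports during the 5-second probing run, this turns a variable-length vector into a fixed-size block of features.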



3.2 CPLEX Parameters

Out of CPLEX 12.1's 135 parameters, we selected a subset of 74 parameters to be optimized. These are the same parameters considered in [13], minus two parameters governing the time spent on probing and solution polishing. (These led to problems when the captime used during parameter optimization differed from that used at test time.) We were careful to keep fixed all parameters that change the problem formulation (e.g., parameters such as the optimality gap below which a solution is considered optimal). The 74 parameters we selected affect all aspects of CPLEX. They include 12 preprocessing parameters; 17 MIP strategy parameters; 11 parameters controlling how aggressively to use which types of cuts; 8 MIP "limits" parameters; 10 simplex parameters; 6 barrier optimization parameters; and 10 further parameters. Most parameters have an "automatic" option as one of their values. We allowed this value, but also included other values (all other values for categorical parameters, and a range of values for numerical parameters). Taking into account that 4 parameters were conditional on others taking certain values, these parameters gave rise to 4.75 × 10^45 distinct parameter configurations.

3.3 MIP Benchmark Sets

Our goal was to obtain a MIP solver that works well on heterogeneous data. Thus, we selected four heterogeneous sets of MIP benchmark instances, composed of many well-studied MIP instances. They range from a relatively simple combination of two homogeneous subsets (CL∪REG) to heterogeneous sets drawing on instances from many sources (e.g., MIX). While previous work on automated portfolio construction for MIP [17] has only considered very easy instances (ISAC(new), with a mean CPLEX default runtime below 4 seconds), our three new benchmark sets are much more realistic, with CPLEX default runtimes ranging from seconds to hours.

CL∪REG is a mixture of two homogeneous subsets, CL and REG. CL instances come from computational sustainability; they are based on real data used for the construction of a wildlife corridor for endangered grizzly bears in the Northern Rockies [8] and are encoded as mixed integer linear programming (MILP) problems. We randomly selected 1,000 CL instances from the set used in [13], 500 for training and 500 for testing. REG instances are MILP-encoded instances of the winner determination problem in combinatorial auctions. We generated 500 training and 500 test instances using the regions generator from the Combinatorial Auction Test Suite [22], with the number of bids selected uniformly at random between 750 and 1,250, and a fixed bids/goods ratio of 3.91 (following [21]).

CL∪REG∪RCW is the union of CL∪REG and another set of MILP-encoded instances from computational sustainability, RCW. These instances model the spread of the endangered red-cockaded woodpecker, conditional on decisions about which parcels of land to protect. We generated 990 RCW instances (10 random instances for each combination of 9 maps and 11 budgets) using the generator from [1] with the same parameter settings, except for a smaller sample size of 5. We split these instances 50:50 into training and test sets.

ISAC(new) is a subset of the MIP data set from [17]; we could not use the entire set, since the authors had irretrievably lost their test set. We thus divided their 276 training instances into a new training set of 184 instances and a test set of 92 instances. Due to the small size of the data set, we did this in a stratified fashion, first ordering the instances by CPLEX default runtime and then picking every third instance for the test set.
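This stratified split can be sketched as follows (a hypothetical helper, not the authors' code); with 276 instances it reproduces the 184/92 division:

```python
def stratified_split(instances, runtime):
    # Order instances by (CPLEX default) runtime, then send every third one
    # to the test set: train and test then cover the whole hardness range,
    # giving a roughly 2:1 split (184/92 for 276 instances).
    ordered = sorted(instances, key=runtime)
    test = ordered[2::3]
    train = [x for k, x in enumerate(ordered) if k % 3 != 2]
    return train, test
```

The point of stratifying by runtime is that, with so few instances, a random split could easily concentrate the hard instances in one of the two sets.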

MIX combines subsets of the sets studied in [13]. It includes all instances from MASS (100 instances), MIK (120 instances), and CLS (100 instances), as well as subsets of CL (120 instances) and REG200 (120 instances); see [13] for a description of each underlying set. We preserved the training/test split from [13], resulting in 280 training and 280 test instances.

4 Experimental Results

In this section, we examine HYDRA-MIP's performance on our MIP data sets. We begin by describing the experimental setup, and then evaluate each of our improvements to HydraLR,1.

4.1 Experimental setup

For algorithm configuration we used PARAMILS version 2.3.4 with its default instantiation of FOCUSEDILS with adaptive capping [14]. We always executed 25 parallel configuration runs with different random seeds and a 2-day cutoff. (Running times were always measured as CPU time.) During configuration, the captime for each CPLEX run was set to 300 seconds, and the performance metric was penalized average runtime (PAR-10, where PAR-k of a set of r runs is the mean over the r runtimes, counting each timed-out run as having taken k times the cutoff time). For testing, we used a cutoff time of 3,600 seconds. In our feature computation, we used a 5-second cutoff for computing probing features. We omitted these probing features (only) for the very easy ISAC(new) benchmark set. We used the MATLAB R2010a implementation of cost-sensitive decision trees; our decision forests consisted of 99 such trees. All of our experiments were carried out on a cluster of 55 dual 3.2 GHz Intel Xeon PCs with 2 MB cache and 2 GB RAM, running OpenSuSE Linux 11.1.
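The PAR-k metric defined above can be computed with a small illustrative helper (ours, not the authors' code):

```python
def par_k(runtimes, cutoff, k=10):
    # Penalized average runtime: the mean over all runs, with each run that
    # reached the cutoff counted as k times the cutoff (k = 10 gives PAR-10).
    return sum(t if t < cutoff else k * cutoff for t in runtimes) / len(runtimes)
```

For example, with a 300-second cutoff, runtimes [100, 200, 300] give PAR-10 = (100 + 200 + 3000)/3 = 1100 seconds; the heavy penalty on timeouts is what makes PAR-10 sensitive to unsolved instances, not just average speed.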

In our experiments, the total running time of the various HYDRA procedures was often dominated by the time required to run the configurator, and therefore turned out to be roughly proportional to the number of HYDRA iterations performed. Each iteration required 50 CPU days for algorithm configuration, as well as validation time to (1) select the best configuration in each iteration (only for HydraLR,1 and HydraDF,1); and (2) gather performance data for the selected configurations. Since HydraDF,4 selects 4 solvers in each iteration, it has to gather performance data for 3 additional solvers per iteration (using the same captime as at test time, 3,600 seconds), which roughly offsets its savings from skipping the validation step. Using the format (HydraDF,1, HydraDF,4), the overall runtime requirements in CPU days were as follows: (366, 356) for CL∪REG; (485, 422) for CL∪REG∪RCW; (256, 263) for ISAC(new); and (274, 269) for MIX. Thus, the computational cost per iteration of HydraLR,1 and HydraDF,1 was similar.

| Data set    | Model | Train time | Train PAR (solved) | Test time | Test PAR (solved) | SF time | SF PAR |
|-------------|-------|------------|--------------------|-----------|-------------------|---------|--------|
| CL∪REG      | LR    | 39.7       | 39.7 (100%)        | 39.4      | 39.4 (100%)       | 1.00×   | 1.00×  |
|             | DF    | 39.7       | 39.7 (100%)        | 39.3      | 39.3 (100%)       |         |        |
| CL∪REG∪RCW  | LR    | 105.1      | 105.1 (100%)       | 102.6     | 102.6 (100%)      | 1.04×   | 1.04×  |
|             | DF    | 98.8       | 98.8 (100%)        | 98.8      | 98.8 (100%)       |         |        |
| ISAC(new)   | LR    | 2.68       | 2.68 (100%)        | 2.36      | 2.36 (100%)       | 1.18×   | 1.18×  |
|             | DF    | 2.19       | 2.19 (100%)        | 2.00      | 2.00 (100%)       |         |        |
| MIX         | LR    | 52         | 52 (100%)          | 56        | 172 (99.6%)       | 1.17×   | 1.05×  |
|             | DF    | 48         | 48 (100%)          | 48        | 164 (99.6%)       |         |        |

Table 1. MIPzilla performance (average runtime and PAR in seconds, and percentage solved), varying predictive models. Train columns report cross-validation performance; column SF gives the speedup factor achieved by cost-sensitive decision forests (DF) over linear regression (LR) on the test set.

4.2 Algorithm selection with decision forests

To assess the impact of our improved algorithm selection procedure, we evaluated it in the context of SATZILLA-style portfolios of different CPLEX configurations, dubbed MIPzilla. As component solvers, we always used the CPLEX default plus CPLEX configurations optimized for the various subsets of our four benchmarks. Specifically, for ISAC(new) we used the six configurations found by GGA in [17]. For CL∪REG, CL∪REG∪RCW, and MIX we used one configuration optimized for each of the benchmark instance sets that were combined to create the distribution (e.g., CL and REG for CL∪REG). We took all such optimized configurations from [13] where available, and optimized the remaining configurations ourselves using PARAMILS.

In Table 1, we present performance results for MIPzilla on our four MIP benchmark sets, contrasting the original linear regression (LR) models with our new cost-sensitive decision forests (DF). Overall, MIPzilla was never worse with DF than with LR, and sometimes substantially better. For relatively simple data sets, such as CL∪REG and CL∪REG∪RCW, the difference between the models was quite small. For more heterogeneous data sets, MIPzilla performed much better with DF than with LR: e.g., 18% and 17% better in terms of final portfolio runtime on ISAC(new) and MIX, respectively. Overall, our new cost-sensitive classification-based algorithm selection was clearly preferable to the previous mechanism based on linear regression. In further experiments, we also evaluated alternative approaches based on random regression forests (trained separately for each algorithm, as in the linear regression approach), decision forests without costs, and support vector machines (SVMs) both with and without costs. We found that the cost-sensitive variants always outperformed the cost-free ones. In these more extensive experiments, we observed that cost-sensitive DFs always performed very well, whereas linear regression performed inconsistently, with especially poor performance on heterogeneous data sets.

Our improvements to the algorithm selection procedure, although motivated by the application to MIP, are in fact problem independent. We therefore conducted an additional experiment to evaluate the effectiveness of SATZILLA based on our new cost-sensitive decision forests, compared to the original version using linear regression models. We used the same data used for building SATzilla2009 [29]. The numbers of training/test instances were 1211/806 (RAND category, with 17 candidate solvers), 672/447 (HAND category, with 13 candidate solvers), and 570/379 (INDU category, with 10 candidate solvers). Table 2 shows that, by using our new cost-sensitive decision forests, we improved SATZILLA's PAR score by 29% (on average over the three categories) compared to the previous (competition-winning) version of SATZILLA; for the important industrial category, we observed PAR improvements of 34%. Because there exists no highly parameterized SAT solver with strong performance across problem classes (analogous to CPLEX for MIP), we did not investigate HYDRA for SAT.² However, we note that this paper's findings suggest that there is merit in constructing such highly parameterized solvers for SAT and other NP-hard problems.

| Category | Model | Train time | Train PAR (solved) | Test time | Test PAR (solved) | SF time | SF PAR |
|----------|-------|------------|--------------------|-----------|-------------------|---------|--------|
| RAND     | LR    | 172        | 332 (99.5%)        | 177       | 458 (99.1%)       | 1.08×   | 1.13×  |
|          | DF    | 147        | 308 (99.5%)        | 164       | 405 (99.3%)       |         |        |
| HAND     | LR    | 518        | 2224 (94.7%)       | 549       | 2858 (92.9%)      | 1.16×   | 1.26×  |
|          | DF    | 363        | 1327 (97.0%)       | 475       | 2268 (94.4%)      |         |        |
| INDU     | LR    | 459        | 2195 (94.6%)       | 545       | 3085 (92.1%)      | 1.12×   | 1.34×  |
|          | DF    | 382        | 1635 (96.1%)       | 487       | 2300 (94.4%)      |         |        |

Table 2. SATZILLA performance (average runtime and PAR in seconds, and percentage solved), varying predictive models. Train columns report cross-validation performance; column SF gives the speedup factor achieved by cost-sensitive decision forests (DF) over linear regression (LR) on the test set.

² The closest SAT equivalent of what CPLEX is for MIP would be MiniSAT [5], but it does not expose many parameters and does not perform well on random instances. The highly parameterized SATenstein solver [18] cannot be expected to perform well across the board for SAT; in particular, local search is not the best method for highly structured instances.



| Data set    | Solver           | Train time | Train PAR (solved) | Test time | Test PAR (solved) |
|-------------|------------------|------------|--------------------|-----------|-------------------|
| CL∪REG      | Default          | 424        | 1687 (96.7%)       | 424       | 1493 (96.7%)      |
|             | ParamILS         | 145        | 339 (99.4%)        | 134       | 296 (99.5%)       |
|             | HydraDF,1        | 64         | 97 (99.9%)         | 63        | 63 (100%)         |
|             | HydraDF,4        | 42         | 42 (100%)          | 48        | 48 (100%)         |
|             | MIPzilla         | 40         | 40 (100%)          | 39        | 39 (100%)         |
|             | Oracle(MIPzilla) | 33         | 33 (100%)          | 33        | 33 (100%)         |
| CL∪REG∪RCW  | Default          | 405        | 1532 (96.5%)       | 406       | 1424 (96.9%)      |
|             | ParamILS         | 148        | 148 (100%)         | 151       | 151 (100%)        |
|             | HydraDF,1        | 89         | 89 (100%)          | 95        | 95 (100%)         |
|             | HydraDF,4        | 106        | 106 (100%)         | 112       | 112 (100%)        |
|             | MIPzilla         | 99         | 99 (100%)          | 99        | 99 (100%)         |
|             | Oracle(MIPzilla) | 89         | 89 (100%)          | 89        | 89 (100%)         |
| ISAC(new)   | Default          | 3.98       | 3.98 (100%)        | 3.77      | 3.77 (100%)       |
|             | ParamILS         | 2.06       | 2.06 (100%)        | 2.13      | 2.13 (100%)       |
|             | HydraLR,1        | 1.67       | 1.67 (100%)        | 1.52      | 1.52 (100%)       |
|             | HydraDF,1        | 1.2        | 1.2 (100%)         | 1.42      | 1.42 (100%)       |
|             | HydraDF,4        | 1.05       | 1.05 (100%)        | 1.17      | 1.17 (100%)       |
|             | MIPzilla         | 2.19       | 2.19 (100%)        | 2.00      | 2.00 (100%)       |
|             | Oracle(MIPzilla) | 1.83       | 1.83 (100%)        | 1.81      | 1.81 (100%)       |
| MIX         | Default          | 182        | 992 (97.5%)        | 156       | 387 (99.3%)       |
|             | ParamILS         | 139        | 717 (98.2%)        | 126       | 357 (99.3%)       |
|             | HydraLR,1        | 74         | 74 (100%)          | 90        | 205 (99.6%)       |
|             | HydraDF,1        | 60         | 60 (100%)          | 65        | 181 (99.6%)       |
|             | HydraDF,4        | 53         | 53 (100%)          | 62        | 177 (99.6%)       |
|             | MIPzilla         | 48         | 48 (100%)          | 48        | 164 (99.6%)       |
|             | Oracle(MIPzilla) | 34         | 34 (100%)          | 39        | 155 (99.6%)       |

Table 3. Performance (average runtime and PAR in seconds, and percentage solved) of HydraDF,4, HydraDF,1, and HydraLR,1 after 5 iterations. Train columns report cross-validation performance.

4.3 Evaluating HYDRA-MIP

Next, we evaluated our full HydraDF,4 approach for MIP; on all four MIP benchmarks, we compared it to HydraDF,1, to the best configuration found by PARAMILS, and to the CPLEX default. For ISAC(new) and MIX we also assessed HydraLR,1. We did not do so for CL∪REG and CL∪REG∪RCW because, based on the results in Table 1, we expected the DF and LR models to perform almost identically. Table 3 presents these results. First, comparing HydraDF,4 to PARAMILS alone and to the CPLEX default, we observed that HydraDF,4 achieved dramatically better performance, yielding between 2.52-fold and 8.83-fold speedups over the CPLEX default and between 1.35-fold and 2.79-fold speedups over the configuration optimized with PARAMILS, in terms of average runtime. Note that (probably due to the heterogeneity of the data sets) the built-in CPLEX self-tuning tool was unable to find any configuration better than the default for any of our four data sets. Compared to HydraLR,1, HydraDF,4 yielded a 1.3-fold speedup for ISAC(new) and a 1.5-fold speedup for MIX. HydraDF,4 also typically performed better than our intermediate procedure HydraDF,1, with speedup factors of up to 1.21 (ISAC(new)). However, somewhat surprisingly, it actually performed worse on one distribution, CL∪REG∪RCW. We analyzed this case further and found that in HydraDF,4, after iteration three, PARAMILS did not find any configurations that would further improve the portfolio, even with a perfect algorithm selector. This poor PARAMILS performance can be explained by the fact that HYDRA's dynamic performance metric only rewards configurations that make progress on solving some instances better; almost certainly starting in a poor region of configuration space, PARAMILS did not find configurations that made progress on any instances over the already strong portfolio, and thus lacked guidance towards better regions of configuration space. We believe this problem can be addressed by means of better configuration procedures in the future.

[Figure 1: Performance per iteration of HydraDF,4, HydraDF,1, and HydraLR,1, evaluated on test data, alongside MIPzilla DF and Oracle(MIPzilla). Panels: (a) CL∪REG; (b) CL∪REG∪RCW; (c) ISAC(new); (d) MIX. Each panel plots PAR score against the number of Hydra iterations (1–5).]



Figure 1 shows the test performance that the different HYDRA versions achieved as a function of their number of iterations, as well as the performance of the MIPzilla portfolios we built manually. When building these MIPzilla portfolios for CL∪REG, CL∪REG∪RCW, and MIX, we exploited ground-truth knowledge about the constituent subsets of instances, using a configuration optimized specifically for each of these subsets. As a result, these portfolios yielded very strong performance. Although our various HYDRA versions did not have access to this ground-truth knowledge, they still roughly matched MIPzilla's performance (indeed, HydraDF,1 outperformed MIPzilla on CL∪REG). For ISAC(new), our baseline MIPzilla portfolio used CPLEX configurations obtained by ISAC [17]; all HYDRA versions clearly outperformed MIPzilla in this case, which suggests that its constituent configurations are suboptimal. For ISAC(new), we also observed that for (only) the first three iterations, HydraLR,1 outperformed HydraDF,1. We believe this occurred because in later iterations the portfolio contained stronger solvers, making the predictive models more important. We also observed that HydraDF,4 consistently converged more quickly than HydraDF,1 and HydraLR,1. While HydraDF,4 stagnated after three iterations on data set CL∪REG∪RCW (see our discussion above), it achieved the best performance at every given point in time on the three other data sets. For ISAC(new), HydraDF,1 had not converged after 5 iterations, while HydraDF,4 converged after 4 iterations and achieved better performance. For the other three data sets, HydraDF,4 converged after two iterations. The performance of HydraDF,4 after the first iteration (i.e., with 4 candidate solvers available to the portfolio) was already very close to that of the best portfolios for MIX and CL∪REG.

4.4 Comparing to ISAC

We spent considerable effort attempting to compare HydraDF,4 with ISAC [17], since ISAC is also a method for automatic portfolio construction and was previously applied to a distribution of MIP instances. ISAC's authors supplied us with their training instances and the CPLEX configurations their method identified, but are generally unable to make their code available to other researchers and, as mentioned previously, were unable to recover their test data. We therefore compared HydraDF,4's and ISAC's relative speedups over the CPLEX default (thereby controlling for different machine architectures) on their training data. We note several caveats: HydraDF,4 was given only 2/3 as much training data as ISAC (due to the need to recover a test set from [17]'s original training set); the methods were evaluated using only the original ISAC training set; the data set is very small, and hence high-variance; and all instances were quite easy even for the CPLEX default. In the end, HydraDF,4 achieved a 3.6-fold speedup over the CPLEX default, as compared to the 2.1-fold speedup reported in [17].

As shown in Figure 1, all versions of HYDRA performed much better than a MIPzilla portfolio built from the configurations obtained from ISAC's authors for the ISAC(new) data set. In fact, even a perfect oracle over these configurations only achieved an average runtime of 1.82 seconds, which is a factor of 1.67 slower than HydraDF,4.

5 Conclusion

In this paper, we showed how to extend HYDRA to achieve strong performance on heterogeneous MIP distributions, outperforming CPLEX's default, PARAMILS alone, ISAC, and the original HYDRA approach. This was done using a cost-sensitive classification model for algorithm selection (which also led to performance improvements in SATZILLA), along with improvements to HYDRA's convergence speed. In future work, we plan to investigate more robust selection criteria for adding multiple solvers in each iteration of HydraDF,k, taking into account both performance improvement and performance correlation; this may allow us to avoid the stagnation we observed on CL∪REG∪RCW. We expect that HydraDF,k can be further strengthened by using improved algorithm configurators, such as model-based procedures. Overall, the availability of effective procedures for constructing portfolio-based algorithm selectors, such as our new HYDRA, should encourage the development of highly parameterized algorithms for other prominent NP-hard problems in AI, such as planning and CSP.

References

1. K. Ahmadizadeh, C. Dilkina, B.and Gomes, and A. Sabharwal. An empirical study of<br />

optimization for maximizing diffusion in networks. In CP, 2010.<br />

2. C. Ansotegui, M. Sellmann, and K. Tierney. A gender-based genetic algorithm for the<br />

automatic configuration of solvers. In CP, pages 142–157, 2009.<br />

3. M. Birattari, Z. Yuan, P. Balaprakash, and T. Stüzle. Empirical Methods for the Analysis of<br />

Optimization Algorithms, chapter F-race and iterated F-race: an overview. 2010.<br />

4. L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.<br />

5. N. Eén and N. Sörensson. An extensible SAT-solver. In <strong>Proceedings</strong> of the 6th Intl. Conf. on<br />

Theory and Applications of Satisfiability Testing, LNCS, volume 2919, pages 502–518, 2004.<br />

6. C. Gebruers, B. Hnich, D. Bridge, and E. Freuder. Using CBR to select solution strategies in<br />

constraint programming. In ICCBR, pages 222–236, 2005.<br />

7. A. Gilpin and T. Sandholm. Information-theoretic approaches to branching in search. Discrete<br />

Optimization, 2010. doi:10.1016/j.disopt.2010.07.001.<br />

8. C. P. Gomes, W. van Hoeve, and A. Sabharwal. Connections in networks: A hybrid approach.<br />

In CPAIOR, 2008.<br />

9. A. Guerri and M. Milano. Learning techniques for automatic algorithm portfolio selection. In<br />

ECAI, pages 475–479, 2004.<br />

10. E. Horvitz, Y. Ruan, C. P. Gomes, H. Kautz, B. Selman, and D. M. Chickering. A Bayesian<br />

approach to tackling hard computational problems. In UAI, pages 235–244, 2001.<br />

<strong>11</strong>. F. Hutter. Automated Configuration of Algorithms for Solving Hard Computational Problems.<br />

PhD thesis, University Of British Columbia, Computer Science, 2009.<br />

14<br />

29


12. F. Hutter, D. Babić, H. H. Hoos, and A. J. Hu. Boosting Verification by Automatic Tuning of<br />

Decision Procedures. In FMCAD, pages 27–34, Washington, DC, USA, 2007. IEEE Computer<br />

Society.<br />

13. F. Hutter, H. H. Hoos, and K. Leyton-Brown. Automated configuration of mixed integer<br />

programming solvers. In CPAIOR, 2010.<br />

14. F. Hutter, H. H. Hoos, K. Leyton-Brown, and T. Stützle. ParamILS: an automatic algorithm<br />

configuration framework. Journal of Artificial Intelligence Research, 36:267–306, 2009.<br />

15. F. Hutter, H. H. Hoos, and T. Stützle. Automatic algorithm configuration based on local<br />

search. In AAAI, pages <strong>11</strong>52–<strong>11</strong>57, 2007.<br />

16. IBM. IBM ILOG CPLEX Optimizer – Data Sheet. Available online: ftp://public.dhe.<br />

ibm.com/common/ssi/ecm/en/wsd14044usen/WSD14044USEN.PDF, 20<strong>11</strong>.<br />

17. S. Kadioglu, Y. Malitsky, M. Sellmann, and K. Tierney. ISAC - instance specific algorithm<br />

configuration. In ECAI, 2010.<br />

18. A. KhudaBukhsh, L. Xu, H. H. Hoos, and K. Leyton-Brown. SATenstein: Automatically building local search SAT solvers from components. In IJCAI, pages 517–524, 2009.<br />

19. D. Lehmann, R. Müller, and T. Sandholm. The winner determination problem. In Combinatorial<br />

Auctions, chapter 12, pages 297–318. 2006.<br />

20. K. Leyton-Brown, E. Nudelman, G. Andrew, J. McFadden, and Y. Shoham. A portfolio<br />

approach to algorithm selection. In IJCAI, pages 1542–1543, 2003.<br />

21. K. Leyton-Brown, E. Nudelman, and Y. Shoham. Empirical hardness models: Methodology<br />

and a case study on combinatorial auctions. Journal of the ACM, 56(4):1–52, 2009.<br />

22. K. Leyton-Brown, M. Pearson, and Y. Shoham. Towards a universal test suite for combinatorial<br />

auction algorithms. In ACM-EC, pages 66–76, 2000.<br />

23. E. O’Mahony, E. Hebrard, A. Holland, C. Nugent, and B. O’Sullivan. Using case-based<br />

reasoning in an algorithm portfolio for constraint solving. In Irish Conference on Artificial<br />

Intelligence and Cognitive Science, 2008.<br />

24. J. R. Rice. The algorithm selection problem. Advances in Computers, 15:65–118, 1976.<br />

25. H. Samulowitz and R. Memisevic. Learning to solve QBF. In AAAI, pages 255–260, 2007.<br />

26. D. Stern, R. Herbrich, T. Graepel, H. Samulowitz, L. Pulina, and A. Tacchella. Collaborative<br />

expert portfolio management. In AAAI, pages 210–216, 2010.<br />

27. K. M. Ting. An instance-weighting method to induce cost-sensitive trees. IEEE Trans. Knowl.<br />

Data Eng., 14(3):659–665, 2002.<br />

28. L. Xu, H. H. Hoos, and K. Leyton-Brown. Hydra: Automatically configuring algorithms for<br />

portfolio-based selection. In AAAI, pages 210–216, 2010.<br />

29. L. Xu, F. Hutter, H. Hoos, and K. Leyton-Brown. SATzilla2009: an Automatic Algorithm<br />

Portfolio for SAT. Solver description, SAT competition 2009, 2009.<br />

30. L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla: portfolio-based algorithm<br />

selection for SAT. Journal of Artificial Intelligence Research, 32:565–606, June 2008.<br />



1 Introduction<br />

Natural hazards can dramatically threaten people's lives through the many kinds of damage they cause: disrupting daily activities, producing economic losses, and even breaking the peace of a whole country. For this reason, any effort to minimize the impact of natural catastrophes, such as tornados, floods, and forest fires, is welcome. Since the occurrence of these phenomena is hard to predict, most research efforts focus on predicting their evolution over time, relying on physical or mathematical models.<br />

Nevertheless, environmental hazards are very difficult systems to simulate. Theoretical and model-related issues aside, many simulators lack precision in their results because of the inherent uncertainty of the data needed to define the state of the system. This uncertainty arises because it is difficult to gather precise values at the exact places where the catastrophe is taking place, or because the hazard itself distorts the measurements. In many cases, the only alternative is to work with interpolated, outdated, or even completely unknown values, which obviously degrades the accuracy and quality of the resulting predictions.<br />

To overcome this input-uncertainty problem, we have developed a two-stage prediction strategy. First, a parameter adjustment process compares the results provided by the simulator with the real observed disaster evolution. Then, the underlying simulator is executed with the adjusted parameters obtained in the previous phase, in order to predict the evolution of the particular hazard at a later time instant. The success of this method depends mainly on the effectiveness of the adjustment technique. In this regard, our research group has developed several solutions for input-parameter optimization, all characterized by intensive data management: a statistical approach based on exhaustive exploration of databases of previous fires [5], evolutionary computation [8], calibration based on domain-specific knowledge [4], and even hybrids of some of the above [7]. Since all these approaches perform the calibration stage in a data-driven fashion, they all match the Dynamic Data Driven Application Systems (DDDAS) paradigm [13, 14].<br />

These adjustment techniques have been shown to improve the quality of the predictions. However, in some cases applying them becomes a problem of exploring huge search spaces. This is a serious disadvantage because, in urgent situations, a successful prediction is not determined only by the accuracy of the results: time restrictions must also be taken seriously. While a natural catastrophe is taking place, urgent decisions must be made to fight it effectively. There are often several constraints that raise the question of how to deal with the combinatorial explosion of the search space.<br />



To be useful, any evolution prediction for an ongoing hazard must be delivered as fast as possible so that it is not outdated. Consequently, we face an urgency–accuracy trade-off. For this purpose, we introduce a new methodology to characterize each element of the proposed DDDAS prediction process, with the aim of enhancing the efficiency of our strategy. In particular, we have carried out this research using forest fires as a case study, designing experimental testbeds based on the statistical analysis of the results obtained from thousands of simulations with different well-known simulators.<br />

This work is part of a more ambitious project that aims to determine in advance how a certain combination of natural hazard simulator, computational resources, adjustment strategy, and frequency of data acquisition will perform, in terms of execution time and prediction quality.<br />

This paper is organized as follows. In the next section, an overview of the two-stage DDDAS for forest fire spread prediction is given. In Section 3, we discuss how this framework can be generalized to any natural hazard, and we describe the methodology for prediction time assessment. In Section 4, the experimental study is reported and, finally, the main conclusions are presented in Section 5.<br />

2 DDDAS for Forest Fire Spread Prediction<br />

In the field of physical systems modelling, and specifically forest fire behavior modelling, there exist several fire propagation simulators [9–11], based on physical or mathematical models [1], whose main objective is to predict the fire evolution. These simulators need certain input data, which define the characteristics of the environment where the fire is taking place, in order to evaluate its future propagation. This data usually consists of the current fire front, terrain topography, vegetation type, and meteorological data such as humidity, wind direction, and wind speed. Some of this data can be retrieved in advance and with noticeable accuracy, for example the topography of the area and the predominant vegetation types. However, some data is very difficult to obtain reliably. For instance, an accurate fire perimeter is very hard to get because of the difficulties involved in obtaining images or measurements in real time. Meteorological data is also sensitive to imprecision, since it is often distorted by the fire itself. This circumstance is not specific to forest fires; it occurs in any system whose state evolves dynamically over time (e.g. floods [19], thunderstorms [20, 21], etc.). These restrictions concerning uncertainty in the input parameters, added to the fact that the inputs are set up only at the very beginning of the simulation process, become an important drawback: as the simulation goes on, variables previously initialized may change dramatically, misleading the simulation results. To overcome these restrictions, we need a system capable of dynamically obtaining real-time input data where possible and, otherwise, properly estimating the values of the input parameters needed by the underlying simulator.<br />



(a) Classic prediction (b) Two-stage prediction<br />

Fig. 1. Prediction Methods<br />

The classic way of predicting forest fire behaviour, summarised in Figure 1(a), takes as input the initial state of the fire front (RF = real fire) as well as the input parameters given for some time tx. The simulator then returns the prediction (SF = simulated fire) for the state of the fire front at a later time tx+1. When the simulation result SF for time tx+1 is compared with the real fire RF at the same instant, the forecasted fire front tends to differ to a greater or lesser extent from the real fire line. One reason for this behaviour is that the classic calculation of the simulated fire is based on a single set of input parameters afflicted with the insufficiencies explained above. To overcome this drawback, a simulator-independent data-driven prediction scheme was proposed to optimize dynamic model input parameters [3]. By introducing a prior calibration step, as shown in Figure 1(b), the set of input parameters is optimized before every prediction step. The proposed solution comes from reversing the problem: find a parameter configuration such that, given this configuration as input, the fire simulator would produce predictions that match the actual fire behavior. Having detected the simulator input that best describes the current environmental conditions, the same set of parameters can also be used to describe the immediate future, assuming that meteorological conditions remain constant during the next prediction interval. The prediction then becomes the result of a series of automatically adjusted input configurations.<br />
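The calibration-then-predict loop just described can be sketched as follows. Everything here is an illustrative assumption, not the API of any real fire simulator: `simulate` stands for the underlying simulator, fire fronts are represented as sets of burned-cell identifiers, and candidate parameter sets are simply given as a list.

```python
# Sketch of the two-stage (calibration + prediction) strategy.
# `simulate(front, params)` is a hypothetical simulator interface.

def symmetric_difference_size(a, b):
    """Size of the mismatch between two fire fronts (sets of cells)."""
    return len(set(a) ^ set(b))

def calibrate(simulate, real_front_t0, real_front_t1, candidates):
    """Return the candidate parameter set whose simulation from t0
    best matches the real fire front observed at t1."""
    def error(params):
        predicted = simulate(real_front_t0, params)
        return symmetric_difference_size(predicted, real_front_t1)
    return min(candidates, key=error)

def two_stage_predict(simulate, fronts, candidates):
    """fronts[i] is the observed fire front at time t_i. For each
    interval, calibrate on [t_i, t_i+1], then predict from t_i+1."""
    predictions = []
    for i in range(len(fronts) - 1):
        best = calibrate(simulate, fronts[i], fronts[i + 1], candidates)
        predictions.append(simulate(fronts[i + 1], best))
    return predictions
```

The key design point, mirrored from the text, is that calibration reverses the problem: the parameter set is scored by how well its simulation reproduces the already-observed evolution, and the winner is reused for the next interval.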

This strategy works under the hypothesis that the environmental conditions are stable throughout the adjustment and calibration steps. However, this assumption is not always true, especially when dealing with very dynamic parameters such as wind speed and wind direction. For this reason, new techniques had to be introduced so that the system could dynamically acquire data when sudden changes in the initial conditions are detected [7].<br />

Previous works proposed several calibration techniques that make the problem of fire spread prediction fit the DDDAS paradigm rather than the classic prediction scheme [6–8]. Since the two-stage DDDAS for forest fire spread prediction described in Figure 1(b) constitutes a simulator-independent prediction method, the same technique can be extrapolated to any kind of natural disaster by simply exchanging the underlying simulator. Figure 2 shows<br />



a general scheme of a two-stage DDDAS for natural hazard management. In the following section, we describe a methodology to perform prediction time assessment under this prediction framework.<br />

Fig. 2. General two-stage DDDAS for natural hazard evolution prediction<br />

3 Prediction Time Assessment<br />

As stated in Section 1, when dealing with emergency simulation, it is essential to optimize the urgency–accuracy trade-off. The goal is to provide the personnel in charge of deciding how to face an ongoing emergency with intelligent tools able to evaluate, in advance, how a certain combination of simulator, computational resources, adjustment strategy, and frequency of data acquisition will perform, in terms of execution time and prediction quality. In order to bound the problem, we work under certain assumptions:<br />

– We focus on those emergencies where the corresponding simulators present high input-data sensitivity.<br />
– We assume scenarios where the computational resources are dedicated. We are currently working on adapting tools that allow urgent execution of tasks in distributed-computing environments, e.g. SPRUCE [15].<br />
– We rely on the two-stage DDDAS prediction strategy.<br />

Taking these premises into account and bearing in mind the scheme shown in Figure 2, we can define three levels of prediction time assessment: Simulator level assessment (SLA), Adjustment level assessment (ALA), and Prediction level assessment (PLA).<br />

3.1 Simulator level assessment (SLA)<br />

Prediction time assessment at this level must be done independently of the underlying simulator (natural hazard) and the particular setting of its input parameters. The main objective at this level is to define a simulator-independent<br />



methodology to obtain a clustering classification of the simulator execution time, where each cluster has an associated upper bound on the execution time depending on the values of the input parameters. This process is carried out offline and is explained in detail later in this paper. Since this characterization process depends on the execution platform, a different simulator characterization is performed for each available computational resource.<br />

3.2 Adjustment level assessment (ALA)<br />

This level estimates the prediction time increase due to the calibration strategy used in the adjustment stage. As previously mentioned, several calibration strategies have been demonstrated to be useful for improving the prediction quality of a hazard evolution. Each of these optimization schemes must be modeled independently, because they operate in quite different ways. As can be observed in Figure 2, the results obtained at the SLA are tightly related to this level: the SLA lies inside the ALA, so the ALA time is directly proportional to the SLA time.<br />

3.3 Prediction level assessment (PLA)<br />

At this level one can either rely on dynamic data injection into the system or not. A pure DDDAS takes real-time data injection into account, and this is how the DDDAS for forest fire spread prediction has been designed in its advanced form. However, a preliminary version did not consider dynamic data injection and was based on the working hypothesis that the environmental conditions remain constant from the calibration stage to the prediction stage. For this reason, the PLA methodology has been designed in two steps: first, we determine a standard methodology for the prediction stage without real-time data injection; afterwards, the PLA characterization is performed, taking into account the data-gathering frequency and the data sources. The aim is to be able to determine the probability distribution indicating what percentage of prediction improvement has historically been obtained when data was acquired with a certain frequency and from certain data sources. This characterization level, as in the SLA, relies on a massive statistical study. Thus, we can assess in advance the probability of improvement that the dynamic data injection process may produce in the prediction, without the need to modify the underlying simulator.<br />

It is important to notice that in the characterization of the simulator we use the execution time as the classification criterion, whereas the quality of the prediction is the factor taken into account when characterizing the adjustment stage (ALA). This is because the quality of the initial prediction given by the simulator has no influence on the final prediction, while the execution time of each calibration technique is directly proportional to the execution time of the simulator. Organizing the study this way therefore lets us estimate both the accuracy of the prediction and the time needed to produce it.<br />



In the next section, the method followed for the Simulator Level Assessment is detailed in an empirical study and the obtained results are analyzed.<br />

4 Experimental Study<br />

In this section, we present the experimental studies carried out to validate the first two steps of the proposed methodology.<br />

We first describe how we deal with the SLA. As stated above, a good characterization of each simulator in terms of execution time is crucial for the validation of the whole methodology. Afterwards, we present the application of the described strategy to the ALA, where we chose a Genetic Algorithm as the adjustment technique.<br />

4.1 Prediction Time Classification<br />

Simulation time estimation can be tackled by carrying out large sets of executions of the underlying simulator and then analyzing its behavior from the obtained results. However, this may not be trivial in certain cases. While it is easy to detect, even intuitively, that the application is highly sensitive to certain input parameters, some of them produce simulator behavior that is hard to predict. Figures 3 and 4 show examples of each case, respectively. In the former, one can observe that the dimension of the simulated map has a direct influence on the execution time (as was to be expected), whereas in the latter the relation between execution time and wind direction is far less clear (this anomaly is reported in [12]), and it becomes stranger still when variations in wind direction are combined with variations in vegetation type.<br />

Fig. 3. Execution time as a function of number of cells.<br />

Currently, this characterization is performed by carrying out large sets of executions (on the order of tens of thousands) on different initial scenarios (different input data sets), and then applying knowledge-extraction techniques to the information they provide. We record the execution times from the<br />


Fig. 4. Variations in execution time according to variations in wind direction and<br />

vegetation type.<br />

experiment, and then we establish a classification of the input parameter sets according to the elapsed times they produced. At this point, we can apply machine learning techniques to determine classification criteria and, therefore, given a new set of input parameters, estimate how long the execution will last.<br />

This learning process is carried out offline, i.e. the classification rules are established before the hazard occurs. Therefore, at the moment of emergency management, we only have to apply the classification technique, which involves a negligible computational cost.<br />

This highlights the need to rely on complex criteria in order to successfully classify the input data sets according to the execution time they will cause. Consequently, we draw on the Artificial Intelligence field to reach this objective. Specifically, this experimental study shows the results obtained using decision trees as the classification technique.<br />

Test bed description. FireLib is a C function library for predicting the spread rate and intensity of free-burning wildfires, developed in 1996. It is based on the Rothermel fire model [1] to determine the direction and magnitude of the maximum rate of spread. The simulated scenario is described below:<br />

– Domain: for the characterization of fireLib, an artificial 1001×1001-cell map was used (cell width and height: 100 feet). The topography remained constant for all executions.<br />
– Simulation duration: fireLib simulations end once the fire reaches one edge of the map.<br />
– Ignition point: the ignition point was the central cell of the map.<br />

Table 1 shows the probability distribution assigned to each type of input parameter. For wind speed and direction, the chosen distributions and their associated parameters are those used in [18]. The vegetation models correspond to the 13 standard Northern Forest Fire Laboratory (NFFL) fuel models [2].<br />



Input               Distribution   µ, σ          Min, Max<br />
Vegetation model    Uniform        —             1, 13<br />
Wind speed          Normal         12.83, 6.25   —<br />
Wind direction      Normal         56.6, 13.04   —<br />
Dead fuel moisture  Uniform        —             0, 1<br />
Live fuel moisture  Uniform        —             0, 4<br />
Table 1. Input parameter distributions.<br />

Once the distribution of each input parameter was established, a set of 38750 different combinations of input data sets was generated, and the simulations for each scenario were performed.<br />
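As a sketch, sampling such input scenarios from the distributions of Table 1 could look as follows. The distribution parameters are those in the table; the scenario representation (a plain dict) is an illustrative assumption.

```python
import random

def sample_scenario(rng):
    """Draw one input data set according to the Table 1 distributions."""
    return {
        "vegetation_model": rng.randint(1, 13),       # uniform over NFFL models 1..13
        "wind_speed": rng.gauss(12.83, 6.25),         # Normal(mu, sigma)
        "wind_direction": rng.gauss(56.6, 13.04),     # Normal(mu, sigma)
        "dead_fuel_moisture": rng.uniform(0.0, 1.0),  # uniform on [0, 1]
        "live_fuel_moisture": rng.uniform(0.0, 4.0),  # uniform on [0, 4]
    }

def generate_scenarios(n, seed=0):
    """Generate n independent scenarios with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [sample_scenario(rng) for _ in range(n)]
```

In the paper's setting, `generate_scenarios(38750)` would produce the training scenarios, each of which is then run through the simulator to record its execution time.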

Regarding the computational platform, all the experiments in this work were run on a cluster of 32 IBM x3550 nodes, each with two dual-core Intel Xeon 5160 CPUs at 3.00 GHz, 4 MB of L2 cache (2×2), and 12 GB of Fully Buffered DIMM memory at 667 MHz, running Linux kernel 2.6.16.<br />

Fig. 5. Execution times using fireLib.<br />

Empirical evaluation. As one can see in Figure 5, the variance in the simulation time is very noticeable. The great majority of the executions lie under the 2500-second threshold, but several executions lasted more than 30000 seconds, and some even more than 50000 seconds.<br />

From the point of view of emergency prediction, it is crucial to keep the execution time under control, since we may face cases that drastically<br />


slow down the prediction process. An elapsed-time prediction for a simulator execution with an error on the order of thousands of seconds would be prohibitive, so from cases like this one there arises the need to predict how the simulator is going to behave and, therefore, the need for an efficient classification technique.<br />

To respond to this need, the experimental study carried out in this work used decision trees as the classification method, in order to estimate, in advance, the execution time of fireLib given a new, unknown set of input parameters.<br />

The decision trees used in this research were generated by the C4.5 algorithm [17]; specifically, we used J48, the open-source Java implementation of C4.5 in the Weka data mining tool [16]. The data obtained from the 38750 executions was used as a training set, and 1000 new instances were generated (according to the distributions shown in Table 1) to be used as a test set.<br />
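The actual classifier is Weka's J48 (C4.5). Purely as a self-contained illustration of the same idea, the following minimal CART-style decision tree (Gini impurity, axis-aligned numeric splits) maps input-parameter vectors to execution-time classes. This is a sketch, not the J48 implementation.

```python
from collections import Counter

def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a non-empty label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find (impurity, feature, threshold) minimizing weighted Gini."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build(rows, labels, depth=5):
    """Recursively grow the tree; leaves are class labels."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    split = best_split(rows, labels)
    if split is None:
        return majority(labels)
    _, f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build([rows[i] for i in li], [labels[i] for i in li], depth - 1),
            build([rows[i] for i in ri], [labels[i] for i in ri], depth - 1))

def classify(tree, row):
    """Descend from the root until a leaf (a class label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree
```

In the paper's pipeline, `rows` would hold the sampled input parameters of the 38750 training executions and `labels` their recorded execution-time classes; classifying a new row then costs only a handful of comparisons, which is why applying the learned rules during an emergency is negligible.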

The number of classes, and the execution-time intervals they represent, were determined by the setting of our work, i.e. the intervals chosen for each class are those that would matter in a real emergency situation (it makes no sense, for example, to classify by intervals of 10 seconds when predicting forest fire spread). The defined classes are the following, where ET stands for execution time:<br />

– Class A: ET ≤ 900 seconds.<br />

– Class B: 900 seconds < ET ≤ 1800 seconds.<br />

– Class C: 1800 seconds < ET ≤ 3600 seconds.<br />

– Class D: 3600 seconds < ET ≤ 7200 seconds.<br />

– Class E: 7200 seconds < ET.<br />
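These class boundaries translate directly into a lookup (thresholds in seconds, exactly as defined above):

```python
def time_class(et_seconds):
    """Map an execution time in seconds to its class A..E."""
    if et_seconds <= 900:
        return "A"
    if et_seconds <= 1800:
        return "B"
    if et_seconds <= 3600:
        return "C"
    if et_seconds <= 7200:
        return "D"
    return "E"
```

This is the labeling function applied to the recorded execution times of the training runs before the decision tree is learned.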

The results of applying the decision trees to the test set are summarized in Table 2. The main aspect to highlight is the prominence of the main diagonal, which means that perfect matches predominate over the whole set of predictions. Furthermore, the values decrease as one moves away from the main diagonal. Indeed, the worst possible cases (predicting A when the real class is E, and vice versa) never happened.<br />

              Predicted class<br />
Real class    A    B    C    D    E<br />
A           669   14    4    2    0<br />
B            17   72    9    4    0<br />
C             2   12   72   12    4<br />
D             5    6   14   24    5<br />
E             0    3    2   12   36<br />
Table 2. Correspondence between real and predicted classes.<br />



Fig. 6. Classification accuracy.<br />

Figure 6 shows the absolute number of predictions that exactly hit the real class, as well as the absolute number of predictions whose accuracy is characterized by the distance between classes. A distance-X accuracy means that there are X−1 classes between the predicted class and the real class. The most noticeable aspect of this graphic is that if we consider distance 1 as a good prediction accuracy, then the obtained results present 96.8% satisfactory classifications.<br />
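The distance-based accuracy can be recomputed directly from Table 2. The matrix below transcribes that table (rows: real class A..E, columns: predicted class A..E):

```python
CONFUSION = [
    [669, 14,  4,  2,  0],   # real A
    [ 17, 72,  9,  4,  0],   # real B
    [  2, 12, 72, 12,  4],   # real C
    [  5,  6, 14, 24,  5],   # real D
    [  0,  3,  2, 12, 36],   # real E
]

def accuracy_within(matrix, max_distance):
    """Fraction of predictions within max_distance classes of the real one."""
    total = sum(sum(row) for row in matrix)
    ok = sum(matrix[i][j]
             for i in range(len(matrix))
             for j in range(len(matrix))
             if abs(i - j) <= max_distance)
    return ok / total
```

With this matrix, `accuracy_within(CONFUSION, 0)` gives 0.873 (exact hits) and `accuracy_within(CONFUSION, 1)` gives 0.968, matching the 96.8% figure quoted above.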

4.2 Adjustment Time Restriction<br />

Once we have the capability to classify the time incurred by a simulation with a particular set of parameters, the following experimental study shows how we can take great advantage of this technique by applying it in the two-stage prediction strategy described in Section 2.<br />

The aim of this experiment is to demonstrate the benefits of being able to discard, in advance, those initial simulation settings whose execution times would cause the adjustment technique to exceed the pre-set deadline for the prediction. Furthermore, we demonstrate that applying the above-described classification technique does not impact the quality of the prediction results.<br />

Test bed description. In this case, we have used FARSITE [9] as the fire spread simulator. We carried out a process analogous to the one described in Section 4.1, with a training database of 20934 simulations, executed on the same computational platform. The distribution of each input parameter again corresponds to Table 1. This experiment uses the GIS data from the benchmark provided with FARSITE (the Ashley project), and in every case a simulation of 30 hours is performed.<br />

The adjustment technique chosen for this study was a Genetic Algorithm. We analyse the results obtained from the calibration step, considering the adjustment time interval [0 hours – 5 hours]. Ten experiments were carried out, starting from ten different initial random populations of fifty individuals each, and evolving them through five generations.<br />



Population  Calibration error  #generations with  Average estimated<br />
                               Class C members    execution time<br />
0           0.31238            0                  6000 s<br />
1           0.120206           0                  6000 s<br />
2           0.203242           2                  9000 s<br />
3           0.127323           4                  12000 s<br />
4           0.13543            2                  9000 s<br />
5           0.022934           0                  6000 s<br />
6           0.071767           1                  7500 s<br />
7           0.178331           2                  9000 s<br />
8           0.1724             0                  6000 s<br />
9           0.209174           0                  6000 s<br />
Table 3. Results obtained in the calibration interval [0 hours – 5 hours] for each population.<br />

It is worth emphasizing that the computational resource used in this work provides enough computing elements to execute every individual of a given generation on a different node; i.e. all individuals of each generation start their corresponding simulations at the same time and are processed in parallel. This implies that the time incurred in processing each generation depends on the individual that produces the slowest simulation.<br />

In this experiment, we also established a timeout of one hour, so simulations that reached this threshold were discarded from the study.<br />

Analysis of the results. Table 3 summarizes the results obtained in this experiment. The values in the second column correspond to the error of the best individual after five generations for each particular population. Since the underlying fire simulator produces a raster file indicating the time of arrival of the fire at each cell of the simulated map, the quality error is calculated by means of the following formula:<br />

E = ((Cells∪ − InitCells) − (Cells∩ − InitCells)) / (RealCells − InitCells)<br />

This equation measures the difference in the number of cells burned, whether missing or in excess, between the simulated and the real fire. Cells∪ is the number of cells in the union of the cells burned in the real fire and the cells burned in the simulation, Cells∩ is the number of cells in their intersection, RealCells is the number of cells burned in the real fire, and InitCells is the number of cells burned at the starting time.<br />
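With fire fronts represented as sets of burned-cell identifiers (an illustrative representation; the simulator actually produces a raster of arrival times), the error formula reads:

```python
def prediction_error(sim_cells, real_cells, init_cells):
    """E = ((Cells_union - InitCells) - (Cells_intersection - InitCells))
           / (RealCells - InitCells), with all quantities as cell counts."""
    union = len(sim_cells | real_cells)
    inter = len(sim_cells & real_cells)
    init = len(init_cells)
    return ((union - init) - (inter - init)) / (len(real_cells) - init)
```

Note that the initial-cell terms cancel in the numerator, so E is simply the number of mismatched cells (missing plus excess) normalized by the cells newly burned in the real fire; E = 0 means a perfect match.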

As expected, different initial populations lead to different result quality. Nevertheless, since our techniques are meant to be applied in an urgent<br />


situation, it is worth examining the time spent on each evolution process. For this purpose, we perform a post-mortem classification of the individuals involved in that process. This classification follows the methodology described in the previous section, defining the following classes, where ET stands for execution time:<br />

– Class A: ET ≤ 600 seconds.<br />

– Class B: 600 seconds < ET ≤ 1800 seconds.<br />

– Class C: 1800 seconds < ET ≤ 3600 seconds.<br />

As stated above, the time spent in each generation depends on the individual that produces the slowest simulation in that particular generation. Therefore, to evaluate the elapsed time of each evolution process, we analyse, for each population, how many generations contain individuals that notably delay its evolution, i.e. how many generations contain individuals classified as C. The third column of Table 3 summarizes this information.<br />

From the complete set of input parameter combinations tested for the Genetic Algorithm, 23 individuals belonged to Class C. Applying the classification schema described in the previous section, 19 of them were correctly classified, i.e. the process provided an 82.36% hit ratio.<br />

It is worth mentioning that in those cases where a generation included one or more Class C members, at least one of them was correctly classified and, consequently, the misclassified individuals did not affect the average estimated execution time. This value, summarized in the fourth column of Table 3, is the summation of the estimated average time for each generation, which is the average value of the interval corresponding to the slowest class present in the generation.<br />
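The estimate described above can be sketched as follows, assuming (our assumption, not stated explicitly in the paper) that Class A spans the interval from 0 to 600 seconds, so that the class midpoints are 300, 1200 and 2700 seconds:<br />

```python
# Midpoints of the class intervals defined above; Class A is assumed
# (our assumption) to span 0-600 seconds.
CLASS_MIDPOINT = {"A": 300.0, "B": 1200.0, "C": 2700.0}
CLASS_ORDER = {"A": 0, "B": 1, "C": 2}

def estimated_total_time(generations):
    """Sum, over all generations, the midpoint of the interval of the
    slowest predicted class in that generation. `generations` is a list of
    lists of predicted classes ("A"/"B"/"C"), one inner list per generation."""
    return sum(CLASS_MIDPOINT[max(gen, key=CLASS_ORDER.get)]
               for gen in generations)
```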

Another interesting result that should be pointed out from this experiment is the lack of relation between the time incurred during the calibration step and the quality of the results obtained. This fact becomes clear when examining the most and least favorable cases (populations 5 and 3, respectively). As can be seen, the error obtained in population 5 is approximately one sixth of the one obtained in population 3. Moreover, the average execution time of the former was half that of the latter, which in absolute terms means a difference of 200 minutes.<br />

The main conclusion of this experiment is that, if we apply the classification strategy prior to the submission of the individuals to the computing platform, we are able to detect, in advance, combinations of input parameters that would prohibitively increase the duration of the adjustment process. Therefore, we can remove them from the process, and this elimination will not affect the accuracy of the results.<br />

Furthermore, in this experimental study each generation was processed in parallel, so one can appreciate the substantial gains that can be obtained from applying the proposed methodology for prediction-time and quality-enhancement assessment.<br />



5 Conclusions<br />

Natural hazard management is undoubtedly a relevant application area in which the DDDAS paradigm can play a very important role. As has been shown in previous works, the application of this paradigm becomes crucial in order to improve the quality of the predictions given by the simulators. In particular, its combination with the two-stage prediction method described above contributes to relieving the input uncertainty problem and, therefore, to enhancing the quality of the prediction.<br />

This work constitutes an essential part of a very ambitious project, which<br />

consists of determining in advance how a certain combination of natural hazard<br />

simulator, computational resources, adjustment strategy, and frequency of data<br />

acquisition will perform, in terms of execution time and prediction quality.<br />

Since we are working in the area of natural hazard management, it is absolutely necessary to take into account the time incurred by the prediction method. For this purpose, we have designed a methodology to assess the urgency–accuracy trade-off in each particular case.<br />

As is well known, the execution time of a particular simulator depends on the specific setting of its input parameters. However, as has been shown, it is hard to predict how certain variations in certain input parameters will affect the execution time. In this work, we approach this challenge by means of Artificial Intelligence techniques. In particular, we present how we deal with simulator characterization through the use of decision trees as a classification technique.<br />

The experimental studies have been carried out using two different forest fire spread simulators. The proposed classification scheme has been applied to the FireLib and FARSITE simulators and a huge set of input parameter combinations, in order to validate the classification strategy under different setup conditions.<br />

The results obtained demonstrate that the use of decision trees as the classification strategy is suitable for this research, reaching up to 96.8% satisfactory classification predictions. Furthermore, it has been demonstrated that it is possible to notably speed up the calibration process by applying this classification strategy, without any loss in the quality of the results. These results represent a great advance and allow us to tackle the subsequent steps of the proposed methodology on a solid basis.<br />

References<br />

1. R. C. Rothermel. How to Predict the Spread and Intensity of Forest and Range Fires. USDA Forest Service, Ogden, UT, Gen. Tech. Rep. INT-143, pp. 1–5. 1983.<br />

2. F. A. Albini. Estimating wildfire behavior and effects. Gen.Tech.Rep.INT-GTR-<br />

30. Ogden, UT: U.S. Department of Agriculture, Forest Service, Intermountain<br />

Forest and Range Experiment Station. 1976.<br />



3. B. Abdalhaq, A methodology to enhance the Prediction of Forest Fire Propagation,<br />

PhD Thesis dissertation. Universitat Autònoma de Barcelona (Spain). June 2004.<br />

4. K. Wendt, A. Cortés and T. Margalef, Knowledge-guided Genetic Algorithm for<br />

input parameter optimisation in environmental modelling, Procedia Computer Science<br />

2010, Volume 1(1), International Conference on Computational Science (ICCS<br />

2010), pp. 1361–1369.<br />

5. G. Bianchini, A. Cortés, T. Margalef and E. Luque, Improved Prediction Methods for Wildfires Using High Performance Computing: A Comparison, LNCS, Volume 3991, pp. 539–546, 2006.<br />

6. G. Bianchini, M. Denham, A. Cortés, T. Margalef and E. Luque, Wildland Fire<br />

Growth Prediction Method Based on Multiple Overlapping Solution, Journal of<br />

Computational Science, Volume 1, Issue 4, pp. 229–237. Ed. Elsevier Science. 2010.<br />

7. R. Rodríguez, A. Cortés and T. Margalef, Injecting Dynamic Real-Time Data into<br />

a DDDAS for Forest Fire Behavior Prediction, Lecture Notes in Computer Science,<br />

Volume 5545(2), pp. 489–499, 2009.<br />

8. M. Denham, A. Cortés A. and T. Margalef, Computational Steering Strategy to<br />

Calibrate Input Variables in a Dynamic Data Driven Genetic Algorithm for Forest<br />

Fire Spread Prediction, Lecture Notes in Computer Science, Volume 5545(2),<br />

pp. 479–488, 2009.<br />

9. M. A. Finney, FARSITE: Fire Area Simulator-model development and evaluation,<br />

Res. Pap. RMRS-RP-4, Ogden, UT: U.S. Department of Agriculture, Forest Service,<br />

Rocky Mountain Research Station, 1998.<br />

10. A. Lopes, M. Cruz and D. Viegas. FireStation - An integrated software system for the numerical simulation of fire spread on complex topography. Environmental Modelling and Software, 17(3), pp. 269–285. 2002.<br />

11. FIRE.ORG - Public Domain Software for the Wildland fire Community.<br />

http://www.fire.org.<br />

12. fireLib User Manual and Technical Reference (online).<br />

http://www.fire.org/downloads/fireLib/1.0.4/doc.html.<br />

13. F. Darema, Dynamic Data Driven Applications Systems: A New Paradigm for Application<br />

Simulations and Measurements, ICCS 2004, LNCS 3038, Springer Berlin<br />

/ Heidelberg, pp. 662–669. 2004.<br />

14. Dynamic Data Driven Application Systems homepage. http://www.dddas.org.<br />

15. P. Beckman, S. Nadella, N. Trebon and I. Beschastnikh, SPRUCE: A System for<br />

Supporting Urgent High-Performance Computing, Grid-Based Problem Solving Environments,<br />

Volume 239/2007, pp. 295–311. 2007.<br />

16. G. Holmes, A. Donkin and I. H. Witten. Weka: A machine learning workbench,<br />

Proceedings of the Second Australia and New Zealand Conference on Intelligent<br />

Information Systems, Brisbane, Australia. pp. 357–361. 1994.<br />

17. J. R. Quinlan. Improved use of continuous attributes in C4.5, Journal of Artificial<br />

Intelligence Research, Volume 4, pp. 77–90. 1996.<br />

18. R. E. Clark, A. S. Hope, S. Tarantola, D. Gatelli, P. E. Dennison and M. A. Moritz,<br />

Sensitivity Analysis of a Fire Spread Model in a Chaparral Landscape, Fire Ecology,<br />

Volume 4(1), pp. 1–13. 2004.<br />

19. H. Madsen and F. Jakobsen Cyclone induced storm surge and flood forecasting in<br />

the northern Bay of Bengal, Coastal Engineering, Volume 51, Issue 4, pp. 277–296.<br />

2004.<br />

20. S. D. Aberson, Five-day tropical cyclone track forecasts in the North Atlantic basin,<br />

Weather and Forecasting, Volume 13, pp. 1005–1015. 1998.<br />

21. H. C. Weber, Hurricane Track Prediction Using a Statistical Ensemble of Numerical<br />

Models, Monthly Weather Review, Volume 131, pp. 749–770. 2003.<br />



A Hybrid Algorithm combining Path Scanning<br />

and Biased Random Sampling for the Arc<br />

Routing Problem<br />

Sergio González 1 , Angel A. Juan 1 , Daniel Riera 1 , and José Cáceres 1<br />

Estudis d’informàtica, Multimèdia i Telecomunicació<br />

Open University of Catalonia - IN3<br />

Barcelona, Spain<br />

{sgonzalezmarti,ajuanp,drierat,jcaceresc}@uoc.edu<br />

Abstract. The Arc Routing Problem is a class of NP-hard routing problems in which the demand is located on some of the arcs connecting nodes and must be completely served while fulfilling certain constraints. This paper presents a hybrid algorithm which combines a classical heuristic with biased random sampling to solve the Capacitated Arc Routing Problem (CARP). This new algorithm is compared with the classical Path Scanning heuristic, obtaining results which outperform it. As discussed in the paper, the methodology presented is flexible, can be easily parallelised, and does not require any complex fine-tuning process. Some preliminary tests show the potential of the proposed approach as well as its limitations.<br />

Keywords: Arc Routing Problem, Combinatorial Optimisation, Hybrid Algorithms,<br />

Metaheuristics, Simulation.<br />

1 Introduction<br />

The Arc Routing Problem (ARP) is the counterpart of the Vehicle Routing Problem (VRP). In the latter, the demand is placed in nodes (i.e. clients), whereas in the ARP it is located on arcs. The existing literature on the ARP is not as extensive as that for the VRP, although there are approaches to the VRP which can be adapted to the ARP, obtaining fairly good results. The objective of this research is to adapt the general ideas proposed in [15], which were developed for the Capacitated Vehicle Routing Problem (CVRP), and apply them to the Capacitated Arc Routing Problem (CARP).<br />

The structure of this paper is as follows. First, the CARP is stated, establishing some assumptions and the basic notation. Section 3 reviews the existing literature on the state of the art. Later, in section 4, the proposed<br />

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011). In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.<br />



algorithm is introduced, first presenting the classic Path Scanning algorithm proposed by Golden et al. in [9]. Section 5 displays and discusses some results, and finally, section 6 states some conclusions and future work.<br />

2 The Capacitated Arc Routing Problem<br />

The Arc Routing Problem is a family of NP-hard problems where the objective is to determine a routing plan on a graph to serve a given set of nodes and arcs (also known as tasks), in contrast with the Vehicle Routing Problem, where the customers' demands occur in nodes and the arcs are only used to model paths interconnecting nodes. The CARP is a particular case of the ARP in which every vehicle serving a route has a maximum capacity.<br />

The CARP is subject to the following constraints:<br />

(a) All routes begin and end at the depot,<br />

(b) every vehicle has a maximum load capacity, which is considered to be the<br />

same for all vehicles,<br />

(c) every arc with positive demand must be satisfied,<br />

(d) every arc is served by a single vehicle,<br />

(e) every arc can be traversed as many times as required (though it can only be<br />

served by a single vehicle),<br />

(f) and the total routing cost is minimised.<br />

2.1 Basic Notation and Assumptions<br />

The CARP is defined over a non-complete graph G = (V, E), where V = {0, 1, 2, ..., n} represents the set of n nodes, and E ⊆ A, with A = {(i, j) | i, j ∈ V; i ≠ j}, represents the set of m arcs connecting pairs of nodes, on which the demand is located. A fleet of identical vehicles of capacity W, based at depot node 0, must serve the set of customers denoted by E. Every edge e = (i, j) has a traversal cost or length cij and a positive or zero demand dij and, being undirected, can be traversed in both directions. Thus, a subset R ⊆ E of required edges (or tasks) must be served by the fleet, namely those with positive demand, dij > 0.<br />

In this scenario, the classical CARP goal is to determine a minimum length<br />

set of vehicle trips covering all the tasks with the following constraints:<br />

(a) Every trip leaves the depot, visits a subset of tasks whose total demand does<br />

not exceed W and returns to the depot,<br />

(b) and every task must be served by one and only one vehicle, though edges<br />

can be traversed multiple times.<br />
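As a minimal sketch of the two constraints above, the following code (illustrative names and data layout, not from the paper) checks a candidate set of trips against the capacity and single-service requirements; it deliberately ignores route continuity along the graph:<br />

```python
from collections import namedtuple

# Illustrative CARP data layout (not from the paper): an undirected edge
# with its traversal cost and demand.
Edge = namedtuple("Edge", "i j cost demand")

def is_feasible(trips, edges, W, depot=0):
    """Check constraints (a) and (b) above: each trip starts and ends at the
    depot with served demand at most W, and every required edge (demand > 0)
    is served by exactly one vehicle. Each trip is a list of
    (edge_index, served) traversals; route continuity is not checked here."""
    served_count = {k: 0 for k, e in enumerate(edges) if e.demand > 0}
    for trip in trips:
        if not trip:
            return False
        first, last = edges[trip[0][0]], edges[trip[-1][0]]
        # (a) the trip must leave from and return to the depot...
        if depot not in (first.i, first.j) or depot not in (last.i, last.j):
            return False
        # ...and the total served demand must not exceed the capacity W.
        if sum(edges[k].demand for k, served in trip if served) > W:
            return False
        for k, served in trip:
            if served:
                if k not in served_count:
                    return False  # serving an edge with no demand
                served_count[k] += 1
    # (b) every required edge is served exactly once.
    return all(c == 1 for c in served_count.values())
```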



3 Related work on the Capacitated Arc Routing Problem<br />

The ARP has been widely studied, though the number of existing publications in this field is considerably lower than for the VRP. Some authors have published reviews of the advances in the ARP field; particularly good surveys are [25], [14] and [5].<br />

The study of the ARP began in 1735, when Leonhard Euler presented his solution to the Königsberg bridge problem [12]. This problem, also known as the Euler Tour Problem, asks, given a connected graph G = (V, E), to find a closed tour visiting every edge in E exactly once, or to determine that no such tour exists.<br />

The next ARP historically proposed was the Chinese Postman Problem (CPP), suggested in [21]. It is stated as follows: given a connected graph G = (V, E, C), where C is a distance matrix, find a tour which crosses every edge at least once and does so in the shortest possible way. When G is completely directed or completely symmetric, it can be solved in polynomial time [4].<br />

Following the CPP, Orloff suggested the Rural Postman Problem (RPP) in [22], which is formally stated as follows: given an undirected graph G = (V, E, C), where C is the cost matrix for the edges, find a minimum cost tour which passes through every edge in a subset R of E at least once. It can be proved that the RPP is NP-hard [16], and its hardness comes from determining how the tour should connect the various components of R. After that, the Min-Max k-Chinese Postman Problem (MM k-CPP) can be found in [7]. In this case, given a connected undirected graph G = (V, E, C), where C is a cost matrix, with a special depot node, the aim is to find k tours, starting and ending at<br />

with a special depot node, the aim is to find k tours, starting and ending in<br />

the depot node, such that every edge is covered by at least one tour and the<br />

length of the longest tour is minimised. It should be noted that for this problem<br />

the objective is to minimise the makespan, whereas most other problems with<br />

multiple postmen (or vehicles) seek to minimise the total distance travelled.<br />

The CARP was introduced in [8], and was originally stated as follows: given a connected undirected graph G = (V, E, C, Q), where C is a cost matrix and Q is a demand matrix, and given a number of identical vehicles with capacity W, find the necessary tours such that:<br />

(a) Every arc with positive demand is served by exactly one vehicle,<br />

(b) the sum of demands on those arcs served by every vehicle does not exceed<br />

W ,<br />

(c) and the total cost of the tours is minimised.<br />

This problem is considered the classical CARP. It can be proved that the Capacitated Vehicle Routing Problem (CVRP) can be transformed into a CARP [8], and the CARP can be transformed into a CVRP [1], which makes the two classes of problems equivalent, so algorithms used to solve one class can easily be adapted to solve the other, as we intend to do with our algorithm. For the transformation of a CARP into a CVRP, the resulting CVRP requires either fixing some variables or using edges with infinite cost. Furthermore, the resulting CVRP is a complete<br />



graph of larger size than that of the original CARP. There also exists a problem halfway between the CARP and the CVRP, namely the Stringed CVRP [20], in which customers are to be served as in the CVRP but some of them are located along the streets.<br />

During the eighties, problem specific heuristics were the most widely used<br />

methods for solving the CARP. They include the Construct-Strike algorithm,<br />

the Path-Scanning and the Augment-Merge algorithms [9]. The performance<br />

of those classical methods is generally 10% to 40% above the optimal solution.<br />

Pearn [23] proposed modified versions for those heuristics by adding several types<br />

of randomness, outperforming original heuristics. There exist several benchmarks<br />

to test the performance of the algorithms against the classical CARP, which can<br />

be downloaded from [2], and which will be used in this paper to test the proposed<br />

algorithm.<br />

More recently, other problem specific heuristics have been proposed. Some of<br />

them are the Double Outer Scan heuristic [24], which combines the Augment-<br />

Merge and the Path-Scanning methods, and the Node Duplication heuristic [24],<br />

which uses similar ideas to those proposed in the Node Duplication Lower Bound<br />

[<strong>11</strong>].<br />

Recently, most advances in development of heuristics for the classical CARP<br />

regard metaheuristics. Tabu Search algorithms have been constructed for solving<br />

the CARP. The first, called CARPET, was proposed in [13]. In it, unfeasible<br />

solutions are allowed but are also penalised. This algorithm outperformed the<br />

existing ones and is still one of the best performing algorithms for CARP. Also a<br />

combination of Tabu Search and Scatter Search to construct Tabu Scatter Search<br />

was proposed by Greistorfer [10]. Lacomme presented both a Genetic Algorithm<br />

[17] and a Memetic Algorithm [18]. In both algorithms crossover is performed<br />

on a giant tour, and fitness of a chromosome is based on the partitioning of<br />

the tour into vehicle tours. Currently these algorithms are among the very best<br />

performing solutions for the CARP.<br />

Another even younger generation of metaheuristics is that of the Ant Colony<br />

Systems. Lacomme [18] proposes an algorithm where two types of ants are used,<br />

one that makes the solution converge towards a minimum cost solution and<br />

another which ensures diversification to avoid getting trapped in a local minimum.<br />

A Guided Local Search algorithm is proposed in [3], suggesting that the<br />

distance of each edge is penalised according to some function which is adjusted<br />

throughout the algorithm. Computational experiments show that this approach<br />

is promising.<br />

4 Proposed algorithm<br />

In this section the proposed algorithm is detailed. To do so, the Path Scanning algorithm is first reviewed, and then our Randomised Path Scanning approach is presented.<br />



4.1 Path Scanning algorithm<br />

The Path Scanning algorithm [9] is a simple and efficient algorithm which aims<br />

to get competitive results for CARP in low computational times. Its main idea is<br />

to construct five complete solutions, every one of them following an optimisation<br />

criterion. The final solution of the algorithm is the best —in terms of cost— of<br />

the five obtained.<br />

The way every route is constructed is not clearly defined in the original Golden paper, so it allows different interpretations when implementing it. The approach followed in this paper is to extend the current route by selecting only adjacent arcs with unserved demand, choosing the one which best satisfies the given criterion. The five criteria assume that the vehicle is at node i and would extend the route through the selected arc e to node j:<br />

(1) Minimise the cost per unit demand (min{cij/dij})<br />

(2) Maximise the cost per unit demand (max{cij/dij})<br />

(3) Minimise the distance from node j back to depot.<br />

(4) Maximise the distance from node j back to depot.<br />

(5) If vehicle is less than half-full, minimise distance from node j back to depot,<br />

otherwise maximise this distance.<br />

When there is no adjacent arc with unserved demand, the closest arc with unserved demand (in terms of shortest-path distance) is selected. If there exists more than one arc at the same minimum distance, the arc which best satisfies the current optimisation criterion is selected.<br />

Finally, once the vehicle capacity is exhausted, the current route is closed by returning the vehicle to the depot through the shortest path. The original algorithm does not state how the shortest path is computed; in our approach an implementation of Dijkstra's algorithm is used.<br />
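A sketch of the arc-selection step under the five criteria above (illustrative data layout; the paper does not prescribe one) might look as follows, with shortest-path distances back to the depot assumed precomputed with Dijkstra's algorithm:<br />

```python
def best_arc(candidates, criterion, dist_to_depot, load, W):
    """Select the next arc under one of the five Path Scanning criteria.
    Each candidate is a dict with cost "c", demand "d" and end node "j";
    dist_to_depot[j] is the shortest-path distance from node j back to the
    depot (assumed precomputed with Dijkstra). Names are illustrative."""
    if criterion == 1:  # (1) minimise the cost per unit demand
        return min(candidates, key=lambda a: a["c"] / a["d"])
    if criterion == 2:  # (2) maximise the cost per unit demand
        return max(candidates, key=lambda a: a["c"] / a["d"])
    if criterion == 3:  # (3) minimise the distance from j back to the depot
        return min(candidates, key=lambda a: dist_to_depot[a["j"]])
    if criterion == 4:  # (4) maximise the distance from j back to the depot
        return max(candidates, key=lambda a: dist_to_depot[a["j"]])
    # (5) minimise the distance if the vehicle is less than half-full,
    # otherwise maximise it.
    pick = min if load < W / 2 else max
    return pick(candidates, key=lambda a: dist_to_depot[a["j"]])
```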

4.2 Randomised Path Scanning<br />

Recent advances in the development of high-quality pseudo-random number generators have opened new perspectives on the use of Monte Carlo Simulation (MCS) in combinatorial problems. As stated previously, the idea behind our algorithm is based on that of Juan et al. [15] for the CVRP. In that paper, a classical heuristic for the CVRP, the Clarke and Wright Savings (CWS) heuristic, was chosen and combined with the MCS methodology. Thus, some random behaviour was introduced into the CWS heuristic in order to start an efficient search process inside the space of feasible solutions, which makes it possible to improve on the original CWS results.<br />

Notice that this general approach has similarities with the Greedy Randomised Adaptive Search Procedure (GRASP) [6], but our approach does not contain an expensive local search phase and includes a more detailed randomised construction step.<br />

In the case studied here, the Path Scanning heuristic for the CARP has been chosen and combined with MCS to add randomness, allowing it to reach better results.<br />



With that, the Randomised Path Scanning (RPS) is obtained. In the proposed<br />

algorithm, two random processes are introduced into the original algorithm:<br />

(1) When constructing every solution, the optimisation criterion to select the<br />

next arc is not known beforehand. A criterion is randomly selected, with<br />

uniform probability distribution ([23] states that it gets better results than<br />

other probability distributions).<br />

(2) When selecting the next arc, the arc which best satisfies the selected criterion is not chosen by default. Instead, all the candidate arcs are sorted according to the selected criterion and a weight is assigned to every one of them following a geometric distribution. Thus, the next arc is selected randomly with a geometric probability distribution.<br />
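The geometric weighting in step (2) can be sketched as follows (the parameter alpha and the function names are illustrative; the paper does not specify the exact parametrisation):<br />

```python
import random

def biased_choice(sorted_candidates, alpha=0.3, rng=random):
    """Pick one candidate from a list sorted from best to worst under the
    current criterion. Position k receives geometric weight alpha*(1-alpha)^k,
    so better-ranked arcs are chosen more often while any candidate may still
    be selected. The value of alpha is illustrative, not from the paper."""
    weights = [alpha * (1 - alpha) ** k for k in range(len(sorted_candidates))]
    return rng.choices(sorted_candidates, weights=weights, k=1)[0]
```

This kind of rank-based biased sampling is what keeps the construction greedy on average while generating many distinct feasible solutions across iterations.<br />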

With this randomisation, many valid solutions can be generated. An efficient search process inside the space of feasible solutions is started, where each of these feasible solutions consists of a set of round-trip routes from the depot that, altogether, satisfy the demand of the arcs.<br />

Pseudo-code The algorithm is implemented as described next. First, the problem instance is loaded from the data files. Next, the arcs are extracted from the problem instance and stored in a static structure. After that, all the shortest paths between all pairs of nodes are computed using an implementation of Dijkstra's algorithm. A loop constructing complete solutions is started and, finally, the best solution among all the generated ones is selected.<br />

procedure RandPS;<br />
begin<br />
&nbsp;&nbsp;arp = getInstanceInputs();<br />
&nbsp;&nbsp;arcs = getArcs(arp);<br />
&nbsp;&nbsp;paths = constructShortestPathsMatrix(arcs);<br />
&nbsp;&nbsp;while stopping criterion not satisfied do<br />
&nbsp;&nbsp;&nbsp;&nbsp;sol = buildRandomizedSolution(arcs, paths);<br />
&nbsp;&nbsp;&nbsp;&nbsp;if sol.cost < bestSolution.cost then<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bestSolution = sol;<br />
&nbsp;&nbsp;&nbsp;&nbsp;end if<br />
&nbsp;&nbsp;end while<br />
&nbsp;&nbsp;return bestSolution;<br />
end<br />

5 Results<br />

The RPS algorithm has been implemented as a Java application, using state-of-the-art pseudo-random number generators. In particular, some classes from the SSJ library [19] were used, among them the<br />



LFSR113 generator, with a period of approximately 2^113. To test the new algorithm, instances from [2] have been used, which are based on those of [9] and allow comparing the results with the original Path Scanning algorithm.<br />

Table 1 shows the results obtained on the gdb instances. The Path Scanning solutions were obtained from Golden's original article [9]. RPS results and times were obtained with the Java implementation described in the previous section, generating 10000 solutions in the loop and selecting the best one.<br />

Problem Nodes Arcs LB BKS PS PS Time RandPS RandPS Time Gap (%)<br />
gdb1 12 22 316 316 316 0.005 316 0.61 0.00<br />
gdb2 12 26 339 339 367 0.006 339 0.71 7.63<br />
gdb3 12 22 275 275 289 0.003 275 0.63 4.84<br />
gdb4 11 19 287 287 320 0.002 287 0.51 10.31<br />
gdb5 13 26 377 377 417 0.002 383 0.76 8.15<br />
gdb6 12 22 298 298 316 0.001 298 0.56 5.70<br />
gdb7 12 22 325 325 357 0.003 325 0.57 8.96<br />
gdb8 27 46 344 348 416 0.015 358 2.45 13.94<br />
gdb9 27 51 303 303 355 0.017 324 2.74 8.73<br />
gdb10 12 25 275 275 302 0.003 275 0.62 8.94<br />
gdb11 22 45 395 395 424 0.003 395 1.90 6.84<br />
gdb12 13 23 458 458 560 0.001 490 0.58 12.50<br />
gdb13 10 28 536 536 592 0.002 536 0.68 9.46<br />
gdb14 7 21 100 100 102 0.001 100 0.46 1.96<br />
gdb15 7 21 58 58 58 0.001 58 0.42 0.00<br />
gdb16 8 28 127 127 131 0.002 127 0.68 3.05<br />
gdb17 8 28 91 91 93 0.002 91 0.66 2.15<br />
gdb18 9 36 164 164 168 0.003 164 0.92 2.38<br />
gdb19 11 11 55 55 57 0.001 55 0.23 3.51<br />
gdb20 11 22 121 121 125 0.002 121 0.53 3.20<br />
gdb21 11 33 156 156 168 0.002 156 0.89 7.14<br />
gdb22 11 44 200 200 207 0.003 200 1.33 3.38<br />
gdb23 11 55 233 233 241 0.005 235 1.80 2.49<br />

Table 1. Results obtained with the gdb instances. LB = Lower Bound; BKS = Best Known Solution, obtained from [18]. Times are expressed in seconds. Bold indicates that RandPS achieves the BKS, and underline that it outperforms the PS solution.<br />
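The Gap column in Table 1 is consistent with the relative cost difference between the PS and RandPS solutions; a sketch of this (our inference from the table values, not a definition given in the paper):<br />

```python
def gap_percent(ps_cost, rps_cost):
    """Relative cost difference between PS and RandPS, in percent.
    For example, the gdb2 row gives (367 - 339) / 367 * 100, i.e. about 7.63."""
    return (ps_cost - rps_cost) / ps_cost * 100
```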

6 Conclusions and future work<br />

The results obtained show that the classical Path Scanning is outperformed by the new RPS. Furthermore, competitive results are obtained on small and medium-sized instances, so the new algorithm accomplishes the main objective of this research, which was to verify whether the ideas from [15] for the CVRP are valid for the CARP.<br />



Future work in this research will be to add splitting and cache techniques to the algorithm, in order to improve its results and optimise its performance. Due to the independence of all the generated solutions, the algorithm could easily be parallelised in order to improve its performance when attempting to solve larger instances of the CARP.<br />

An additional future objective of this research is to apply the algorithm to different variants of the ARP, especially the Arc Routing Problem with Stochastic Demand (ARPSD): given the randomisation in the proposed algorithm, we think that RPS will be well suited to this problem, with its random behaviour in the arcs' demand.<br />

Acknowledgements<br />

This work has been partially supported by the Spanish Ministry of Science and<br />

Innovation (TRA2010-21644-C03), and has been developed in the context of the<br />

CYTED-IN3-HAROSA network (http://dpcs.uoc.edu).<br />

References<br />

[1] A.A. Assad, B.L. Golden and W.L. Pearn. Transforming arc routing into node<br />

routing problems. Computers and Operations Research, 14(4):285-288, 1987.<br />

[2] J.M. Belenguer. http://www.uv.es/belengue/carp.html<br />

[3] P. Beullens, D. Cattrysse, L. Muyldermans and D. Van Oudheusden. A guided<br />

local search heuristic for the capacitated arc routing problem. European Journal<br />

of Operational Research, 147:629-643, 2003.<br />

[4] N. Christofides. The optimum traversal of a graph. OMEGA, The International<br />

Journal of Management Science, 1(6):719–732, 1973.<br />

[5] A. Corberán and C. Prins. Recent results on arc routing problems: an annotated<br />

bibliography. Networks, 56(1):50-69, 2010.<br />

[6] T.A. Feo and M.G. Resende. Greedy randomized adaptive search procedures. Journal<br />

of Global Optimization, 6:109-133, 1995.<br />

[7] G.N. Frederickson, M.S. Hecht and C.E. Kim. Approximation algorithms for some<br />

routing problems. SIAM Journal of Computing, 7(2):178-193, 1978.<br />

[8] B.L. Golden and R.T. Wong. Capacitated arc routing problems. Networks, 11:305-315, 1981.<br />

[9] B.L. Golden, J.S. DeArmon and E.K. Baker. Computational experiments with<br />

algorithms for a class of routing problems. Computers and Operations Research<br />

10:47-59, 1983.<br />

[10] P. Greistorfer. A Tabu Scatter Search Metaheuristic for the Arc Routing Problem.<br />

Computers & Industrial Engineering, 44:249-266, 2003.<br />

[11] R. Hirabayashi, N. Nishida and Y. Saruwatari. Node duplication lower bounds for the capacitated arc routing problems. Journal of the Operations Research Society of Japan, 35(2):119-133, 1992.<br />

[12] H. Sachs, M. Stiebitz and R.J. Wilson. An historical note: Euler's Königsberg letters. Journal of Graph Theory, 12(1):133-139, 1988.<br />

[13] A. Hertz, G. Laporte and M. Mittaz. A tabu search heuristic for the capacitated<br />

arc routing problem. Operations Research, 48(1):129-135, 2000.<br />



[14] A. Hertz. Recent Trends in Arc Routing. In Graph Theory, Combinatorics and Algorithms: Operations Research/Computer Science Interfaces Series, M.C. Golumbic and I.B.A. Hartman (eds.), 2005.<br />

[15] A.A. Juan, J. Faulin, R. Ruiz, B. Barrios and S. Caballé. The SR-GCWS hybrid<br />

algorithm for solving the capacitated vehicle routing problem. Applied Soft<br />

Computing, 10:215-224, 2010.<br />

[16] A.H.G. Rinnooy Kan and J.K. Lenstra. On general routing problems. Networks,<br />

6:273-280, 1976.<br />

[17] P. Lacomme, C. Prins and W. Ramdana-Chérif. Competitive genetic algorithms<br />

for the capacitated arc routing problem and its extensions. Lecture Notes in Computer<br />

Science, 2037:473-483, 2001.<br />

[18] P. Lacomme, C. Prins and W. Ramdane-Chérif. Competitive memetic algorithms<br />

for arc routing problems. Annals of Operations Research, 131:159-185, 2004.<br />

[19] P. L’Ecuyer. SSJ: A framework for stochastic simulation in Java. <strong>Proceedings</strong> of<br />

the Winter Simulation Conference, 2002, 234-242.<br />

[20] A. Løkketangen and J. Oppen. Arc routing in a node routing environment. Computers<br />

and Operations Research, 33(4):1033-1055, 2006.<br />

[21] K. Mei-Ko. Graphic programming using odd or even points. Chinese Mathematics,<br />

1:237-277, 1962.<br />

[22] C.S. Orloff. A fundamental problem in vehicle routing. Networks, 4:35-64, 1974.<br />

[23] W.L. Pearn. Approximate solutions for the capacitated arc routing problem. Computers<br />

& Operations Research, 16(6):589-600, 1989.<br />

[24] S. Wøhlk. Contributions to arc routing. PhD thesis, University of Southern Denmark,<br />

2005.<br />

[25] S. Wøhlk. A decade of the capacitated arc routing problem. The Vehicle Routing<br />

Problem: Latest Advances and New Challenges. Springer 2010.<br />

54<br />

9


Algorithms for Interval Data Minmax Regret Paths

Carolinne Torres 1, César Astudillo 2, Matthew Bardeen 3, and Alfredo Candia-Véjar 4

1 Facultad de Ingeniería, Universidad de Talca, Chile
carolinne@alumnos.utalca.cl
2 castudillo@utalca.cl
3 mbardeen@utalca.cl
4 acandia@utalca.cl

Abstract. This paper addresses the exact and heuristic resolution of the interval data minmax regret shortest path problem. It is assumed that, in the input graph, only lower and upper bounds are known for the edge lengths, defining a combinatorial optimization problem under uncertainty. The goal is to find an s-t path which minimizes the maximum regret. The literature includes algorithms that solve the classic version efficiently; however, the variant studied in this manuscript is known to be NP-Hard. We propose a Simulated Annealing (SA) algorithm to tackle this problem, and we compare its performance with three other schemes, namely, an exact algorithm that utilizes a Mixed Integer Programming (MIP) formulation and two state-of-the-art heuristics. Our experimental results consider numerous instances possessing different topologies and sizes.

We study a variant of the well-known Shortest Path (SP) problem, for which efficient algorithms have been designed since 1959 [4]. Given a digraph G = (V, A) (V is the set of nodes and A is the set of arcs) with a non-negative cost associated to each arc and two nodes s and t in V, SP consists of finding an s-t path of minimum total cost. Dijkstra designed a polynomial time algorithm and, from this, a number of other approaches have been proposed. Ahuja et al. present the different algorithmic alternatives to solve the problem [1].

Our interest is focused on shortest path problems where uncertainty exists only in the objective function parameters. In the SP problem, for each arc we have a closed interval defining the possibilities for the arc length. A scenario is a vector where each component represents an element of an arc length interval. The uncertainty model used here is the minmax regret approach (MMR), sometimes called robust deviation; in this model the problem is to find a feasible solution that is ε-optimal for any possible scenario, with ε as small as possible.
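For a fixed s-t path, a standard observation in this line of work is that the regret-maximizing scenario sets the arcs on the path to their upper bounds and all remaining arcs to their lower bounds, so the maximum regret can be evaluated with a single shortest-path computation. A minimal sketch of this evaluation (the interval encoding and function names are our own illustrative choices, not the paper's notation):

```python
import heapq

def dijkstra(adj, s, t):
    """Shortest s-t distance in a digraph given as {u: [(v, cost), ...]}."""
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

def max_regret(path_arcs, intervals, s, t):
    """Maximum regret of an s-t path, given arc intervals {(u, v): (lo, hi)}.

    Worst-case scenario: arcs on the path take their upper bound,
    all other arcs take their lower bound.
    """
    scenario = {a: (hi if a in path_arcs else lo)
                for a, (lo, hi) in intervals.items()}
    adj = {}
    for (u, v), c in scenario.items():
        adj.setdefault(u, []).append((v, c))
    path_len = sum(scenario[a] for a in path_arcs)
    # regret = path length in the worst case minus the optimum in that scenario
    return path_len - dijkstra(adj, s, t)
```

For example, with intervals {('s','a'): (1, 4), ('a','t'): (1, 4), ('s','t'): (3, 3)}, the direct arc s-t has maximum regret 1, while the two-arc path has maximum regret 5.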

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011). In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.


One of the properties of the minmax regret model is that it is not as pessimistic as the (absolute) minmax model. This model has been widely studied in combinatorial optimization in recent years; see the review of Aissi et al. [2] and the book by Kasperski [6].

It is known that minmax regret combinatorial optimization problems with interval data (MMRCO) are usually NP-hard even when the classic problem is easy to solve; this is the case for the minimum spanning tree problem, the shortest path problem, assignment problems and others, see [2] for details. In a few cases, an NP-Hard MMRCO problem can be solved by a polynomial time algorithm for a restricted class of problem instances. This is the case for the MMR spanning arborescences problem in digraphs when the input graph is acyclic, for which a polynomial time algorithm exists, see [3]. Several efforts have been made to obtain exact solutions using the broad repertoire of exact methods, principally by formulating an MMR problem as a MIP and then using a commercial code, or by applying branch and bound, branch and cut or Benders decomposition approaches in a dedicated scheme. Several studies have shown that Benders decomposition outperforms both branch and bound and branch and cut for obtaining exact solutions when applied to the following problems: MMR Spanning Trees [8], MMR Paths [11], MMR Assignment [13] and MMR Traveling Salesman [9].

Exact algorithms for Minmax Regret Paths have been proposed in [5, 6, 10, 11]. All these papers show that exact solutions for MMR SP can be obtained by different methods, taking into account several types of graphs and degrees of uncertainty. However, the size of the graphs tested is limited to 2000 nodes.

We present the results of two basic and fast heuristics defined by fixing specific scenarios, namely the midpoint scenario and the upper limit scenario. Several applications of the shortest path problem consider networks with many thousands of nodes or more, so the task of designing heuristics for large MMRP instances is important; see [14] for a recent application. In this context, our main contributions in this paper are the analysis of the performance of the CPLEX solver on a MIP formulation of MMRP, the analysis of the performance of known heuristics for the problem and, finally, the analysis of the performance of a proposed simulated annealing approach. For the experiments we consider two classes of networks, random networks and a class of networks used in telecommunications, both with different problem sizes. Instances containing from 100 to 20000 nodes with different degrees of uncertainty were considered.

In Section 2 we present some notation and the problem definition; in Section 3 a mathematical programming formulation for MMRP is presented, together with the proposed simulated annealing approach and two known heuristics for MMRP. Definitions of the tested problem instances and analysis of the experiments conducted are discussed in Section 4. Finally, in Section 5, conclusions and future work are presented.


Algorithm 1 Heuristic HM
Input: Network G and interval cost function c
Output: A feasible solution Y for MMR-SP
1. For all e ∈ A do
2.   c_e^{sM} = (c_e^+ + c_e^-)/2
3. end For
4. Y ← OPT(s^M)
5. Return Y, Z(Y)

Algorithm 2 Heuristic HU
1. For all e ∈ A do
2.   c_e^{sU} = c_e^+
3. end For
4. Y ← OPT(s^U)
5. Return Y, Z(Y)
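Both heuristics reduce MMR-SP to a single deterministic shortest-path computation on a fixed scenario: HM fixes every arc at the midpoint of its interval, HU at its upper limit. A minimal sketch, assuming the OPT(·) step is realized with Dijkstra's algorithm (function names and the interval encoding are illustrative, not the paper's notation):

```python
import heapq

def shortest_path(adj, s, t):
    """Dijkstra's algorithm returning the arc list of a shortest s-t path."""
    dist, prev = {s: 0.0}, {}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in adj.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v], prev[v] = d + c, u
                heapq.heappush(pq, (d + c, v))
    path, v = [], t          # reconstruct the arc sequence from t back to s
    while v != s:
        u = prev[v]
        path.append((u, v))
        v = u
    return list(reversed(path))

def heuristic(intervals, s, t, scenario="U"):
    """HM fixes the midpoint scenario ("M"), HU the upper-limit scenario ("U")."""
    adj = {}
    for (u, v), (lo, hi) in intervals.items():
        c = (lo + hi) / 2 if scenario == "M" else hi
        adj.setdefault(u, []).append((v, c))
    return heuristic_path(adj, s, t) if False else shortest_path(adj, s, t)
```

Note that HM and HU can disagree: with intervals {('s','a'): (0, 1), ('a','t'): (0, 1), ('s','t'): (1.5, 1.5)}, HU picks the direct arc while HM prefers the two-arc path.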

2.3 Simulated Annealing for MMRP

Simulated Annealing (SA) is a well-known probabilistic metaheuristic proposed by Kirkpatrick et al. for solving hard combinatorial optimization problems in an approximate way [7].

Usually, SA seeks to avoid being trapped in local optima, as would normally occur in algorithms using local search methods. A key characteristic of SA is the possible acceptance of solutions worse than the current one during the exploration of the local neighborhood. In accordance with the physical analogy of SA with metallurgy, several parameters must be tuned in order to find good solutions. Typical parameters are associated with concepts like the neighborhood, the cooling schedule, the size of the internal loop and the termination criterion. These parameters are usually adjusted through experimentation and testing.
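A minimal, problem-independent SA skeleton with geometric cooling and an internal loop of fixed length, matching the general structure just described, can be sketched as follows (the parameter defaults are illustrative placeholders, not the authors' tuned values):

```python
import math
import random

def simulated_annealing(init, neighbor, cost,
                        T0=100.0, beta=0.95, L=50, T_min=1e-3):
    """Generic SA: geometric cooling T <- beta*T, internal loop of length L."""
    current = best = init
    T = T0
    while T > T_min:
        for _ in range(L):                    # internal loop at fixed temperature
            cand = neighbor(current)
            delta = cost(cand) - cost(current)
            # accept improvements always; accept worse moves with prob. exp(-delta/T)
            if delta <= 0 or random.random() < math.exp(-delta / T):
                current = cand
            if cost(current) < cost(best):    # keep the best solution seen so far
                best = current
        T *= beta                             # geometric descent of the temperature
    return best
```

For MMRP, `init` would be the HU solution, `neighbor` the subgraph-modification move described below, and `cost` the maximum regret of the candidate path.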

We now describe the main definitions of the concepts and parameters generally used in SA algorithms.

Search Space A subgraph S of the original graph G is defined such that this subgraph contains an s-t path. In S a classical s-t shortest path subproblem is solved, where the arc costs are chosen as the upper limit arc costs. Then, the optimum solution of this subproblem is evaluated for acceptance.

Initial Solution The initial solution is obtained by applying the heuristic HU to the original network S_1.

Cooling Schedule A geometric descent of the temperature was used, governed by the parameter β.

Internal Loop This loop is defined by a parameter L and depends on the size of the instances tested.

Neighborhood Search Moves Let S_i be the subgraph of G considered at iteration i and let x_i be the solution given by the search space at iteration i. Then we generate a new subgraph S_{i+1} of G from S_i by changing the status of some components of the vector characterizing S_i. The number of components



For Karasan graphs, Table 1 shows the structure of the instances, the value (exact or approximated) given by CPLEX, the values given by the heuristics and their gaps from the optimum value, and the value and gap given by the simulated annealing approach.

We note that CPLEX was able to solve optimally instances with up to 1000 nodes. For instances with 5000, 10000 and 20000 nodes, memory problems only permitted feasible solutions to be found. The gap value, given as a relative percentage, is also presented. It can be seen that the gap increases with the complexity of the instances. With respect to the heuristics HM and HU, it is clear that HU performs better than HM, and the gap of the HU values is always small. SA consistently improves the values found by the heuristics.

For Random graphs, Table 3 shows the structure of each instance, the optimum value (exact or approximated) given by CPLEX, the values given by the heuristics and their gaps from the optimum value, and the value given by the Simulated Annealing approach along with its gap. In this case CPLEX was able to optimally solve all of the instances. Again HU performs better than HM and almost always finds an optimal solution. In two particular cases HM and HU only find good feasible solutions; in these cases, SA was not able to improve on the solution provided by HU.

Table 2 shows the times taken by CPLEX, HM, HU and SA for Karasan graphs. CPLEX had memory problems when solving problems with 10000 and 20000 nodes. The execution time for CPLEX on these instances is also very high (in some cases, over 10 hours). Both heuristic methods take very little time to execute, and Simulated Annealing takes slightly over 2000 seconds in the worst case.

Table 4 shows the times taken by CPLEX, HM, HU and SA for random graphs. We note that the execution times for the optimal solutions given by CPLEX increase as a function of the number of nodes, but the highest time is under two minutes. Execution times for HM and HU are negligible and, for SA, the highest time is about two minutes.

4 Conclusions and Future Work

A simulated annealing algorithm was proposed for solving the interval data minmax regret path problem. Its performance was compared with that of two known simple but effective heuristics for MMRCO problems; the optimal solution (in most cases) was provided by CPLEX from a linear integer programming formulation known for MMRP. Two classes of instances were considered for experimentation: random graphs and Karasan graphs. For random graphs, the optimal solutions were always obtained by CPLEX in reasonable times. The heuristic using the upper bound scenario outperforms the one using the midpoint scenario and almost always finds the optimal solutions. In these cases, Simulated Annealing was obviously not able to improve on the initial solution given by the heuristics.



Table 1. The results of the analyses for Karasan graphs. For graph instances where HU did not find the optimum value, SA was always able to improve the results. In cases with many nodes, optimum values were sometimes not found by CPLEX (G > 0). In these cases, b is used to indicate an unknown gap for the heuristic and SA methods.

Instance            |A|     CPLEX/G      HM/G       HU/G       SA/G
K-100-200-0.9-7     651     79/0         89/12.7    80/1.3     80/1.3
K-100-200-0.9-15    1275    29/0         33/13.8    30/3.4     30/3.4
K-100-200-0.9-50    2500    34/0         34/0       34/0       34/0
K-500-200-0.9-5     2475    1715/0       1840/7.3   1767/3.0   1742.8/1.6
K-500-200-0.9-12    5856    220/0        243/10.5   227/3.2    223/1.4
K-500-200-0.9-17    8211    95/0         95/0       95/0       95/0
K-1000-200-0.9-7    6951    1438/0       1534/6.7   1445/0.5   1442/0.3
K-1000-200-0.9-21   20559   182/0        205/12.6   186/2.2    183.6/0.9
K-1000-200-0.9-60   56400   46/0         48/4.3     47/2.2     47/2.2
K-5000-200-0.9-4    19984   18860/4.56   19684/b    19635/b    19186/b
K-5000-200-0.9-10   49990   3289/0       3406/3.6   3299/0.3   3297/0.3
K-5000-200-0.9-18   89676   1201/0       1268/5.6   1219/1.5   1210.4/0.8
K-10000-200-0.9-4   39984   36537/6.97   38346/b    37685/b    36954/b
K-10000-200-0.9-8   79936   10780/4.02   11361/b    10940/b    10863/b
K-10000-200-0.9-15  149775  3020/1.90    3220/b     3039/b     3034.2/b
K-20000-200-0.9-3   59991   118101/8.62  123392/b   124170/b   122227/b
K-20000-200-0.9-5   99975   50711/6.23   53169/b    51735/b    51319.3/b
K-20000-200-0.9-8   159936  20968/4.25   22107/b    21231/b    21142.7/b

Table 2. Execution times (in seconds) for Karasan graphs.

Instance            |A|     CPLEX     HM    HU    SA
K-100-200-0.9-7     651     0.05      0.00  0.00  0.10
K-100-200-0.9-15    1275    0.05      0.00  0.00  0.12
K-100-200-0.9-50    2500    0.05      0.01  0.01  0.14
K-500-200-0.9-5     2475    9.41      0.01  0.01  1.89
K-500-200-0.9-12    5856    0.85      0.02  0.02  2.56
K-500-200-0.9-17    8211    0.86      0.03  0.03  3.01
K-1000-200-0.9-7    6951    3736.00   0.03  0.03  9.79
K-1000-200-0.9-21   20559   5.21      0.06  0.06  14.80
K-1000-200-0.9-60   56400   4.74      0.16  0.15  28.01
K-5000-200-0.9-4    19984   5319.99   0.09  0.09  216.29
K-5000-200-0.9-10   49990   48183.76  0.19  0.18  145.76
K-5000-200-0.9-18   89676   6567.90   0.31  0.30  142.41
K-10000-200-0.9-4   39984   6304.30   0.19  0.19  2028.52
K-10000-200-0.9-8   79936   12288.09  0.32  0.31  765.58
K-10000-200-0.9-15  149775  43200.00  0.52  0.51  534.51
K-20000-200-0.9-3   59991   11523.00  0.25  0.18  324.20
K-20000-200-0.9-5   99975   10608.20  0.17  0.17  326.60
K-20000-200-0.9-8   159936  12328.30  0.63  0.54  328.80



Table 3. The results of the analyses for Random graphs. In the two instances where HU did not find the optimum value, SA was not able to improve the results further.

Instance                 |A|     CPLEX/G  HM/G       HU/G      SA/G
R-100-200-0.9-0.060      593     164/0    176/7.32   164/0     164/0
R-100-200-0.9-0.110      1088    189/0    190/0.52   190/0.52  190/0.52
R-100-200-0.9-0.250      2475    151/0    151/0      151/0     151/0
R-500-200-0.9-0.010      2494    476/0    477/0.21   477/0.21  477/0.21
R-500-200-0.9-0.020      4989    320/0    322/0.63   320/0     320/0
R-500-200-0.9-0.035      8732    252/0    295/17.06  252/0     252/0
R-1000-200-0.9-0.005     4994    328/0    384/17.07  328/0     328/0
R-1000-200-0.9-0.010     9989    215/0    215/0      215/0     215/0
R-1000-200-0.9-0.060     59939   113/0    139/23.43  113/0     113/0
R-5000-200-0.9-0.001     24995   317/0    317/0      317/0     317/0
R-5000-200-0.9-0.002     49990   320/0    395/23.43  320/0     320/0
R-5000-200-0.9-0.004     99980   262/0    270/3.05   262/0     262/0
R-10000-200-0.9-0.0004   39995   483/0    488/1.04   483/0     483/0
R-10000-200-0.9-0.0008   79991   303/0    303/0      303/0     303/0
R-10000-200-0.9-0.0015   149985  341/0    362/6.16   341/0     341/0
R-20000-200-0.9-0.00013  51997   745/0    745/0      745/0     745/0
R-20000-200-0.9-0.00023  91995   498/0    498/0      498/0     498/0
R-20000-200-0.9-0.00043  171991  289/0    289/0      289/0     289/0

Table 4. Execution times (in seconds) for Random graphs.

Instance                 |A|     CPLEX  HM    HU    SA
R-100-200-0.9-0.060      593     0.04   0.00  0.00  0.03
R-100-200-0.9-0.110      1088    0.08   0.00  0.00  0.04
R-100-200-0.9-0.250      2475    0.16   0.00  0.00  0.07
R-500-200-0.9-0.010      2494    0.29   0.00  0.00  0.65
R-500-200-0.9-0.020      4989    0.61   0.01  0.01  1.06
R-500-200-0.9-0.035      8732    0.71   0.01  0.01  1.53
R-1000-200-0.9-0.005     4994    0.61   0.01  0.01  2.59
R-1000-200-0.9-0.010     9989    0.93   0.01  0.01  3.86
R-1000-200-0.9-0.060     59939   6.03   0.07  0.07  18.34
R-5000-200-0.9-0.001     24995   7.92   0.05  0.05  14.61
R-5000-200-0.9-0.002     49990   16.31  0.08  0.08  2974.00
R-5000-200-0.9-0.004     99980   82.35  0.36  0.37  49.87
R-10000-200-0.9-0.0004   39995   22.12  0.10  0.10  44.21
R-10000-200-0.9-0.0008   79991   21.97  0.14  0.14  5.27
R-10000-200-0.9-0.0015   149985  58.42  0.26  0.27  118.21
R-20000-200-0.9-0.00013  51997   30.58  0.13  0.13  47.40
R-20000-200-0.9-0.00023  91995   68.40  0.23  0.24  77.01
R-20000-200-0.9-0.00043  171991  83.78  0.34  0.33  119.26



For Karasan graphs, the optimal solution given by CPLEX was obtained only for instances with less than 5000 nodes. All of the instances with 10000 or more nodes had estimated gaps between 1.9% and 8.62%. CPLEX was not able to find the optimum in these cases because of memory overflow errors. The heuristic HU almost always outperformed HM and found solutions with small gaps in the cases where the optimum solution was known. Simulated Annealing consistently improved on (or tied) the solutions found by the HU heuristic. Although the solutions found by SA were slightly worse than those found by CPLEX, the time spent finding these solutions was significantly smaller.

It seems clear that for large MMRP networks (over 10000 nodes), it is convenient to use the heuristic HU to find reasonable solutions and, if time permits, to use the Simulated Annealing approach described here to improve those solutions. CPLEX could still offer aid in analyzing the heuristic results by providing bounds for the optimal solution.

For future work with SA for MMRP, it would be important to consider more instances during testing and also to consider more variety in some parameters when defining the test instances, e.g., the parameter c. The neighborhood scheme used here could also be considered for the application of SA to other MMRCO problems, like the minmax regret spanning tree problem.

References

1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows: theory, algorithms, and applications. Prentice Hall, Upper Saddle River, NJ, 1993.
2. H. Aissi, C. Bazgan, and D. Vanderpooten. Minmax and minmax regret versions of combinatorial optimization problems: A survey. European Journal of Operational Research, 197(2):427–438, Sept. 2009.
3. E. Conde and A. Candia. Minimax regret spanning arborescences under uncertain costs. European Journal of Operational Research, 182(2):561–577, Oct. 2007.
4. E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271, Dec. 1959.
5. O. Karasan, M. Pinar, and H. Yaman. The robust shortest path problem with interval data, 2001.
6. A. Kasperski. Discrete Optimization with Interval Data, volume 228 of Studies in Fuzziness and Soft Computing. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008.
7. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
8. R. Montemanni. A Benders decomposition approach for the robust spanning tree problem with interval data. European Journal of Operational Research, 174(3):1479–1490, Nov. 2006.
9. R. Montemanni, J. Barta, M. Mastrolilli, and L. M. Gambardella. The Robust Traveling Salesman Problem with Interval Data. Transportation Science, 41(3):366–381, Aug. 2007.
10. R. Montemanni and L. Gambardella. An exact algorithm for the robust shortest path problem with interval data. Computers & Operations Research, 31(10):1667–1680, Sept. 2004.



11. R. Montemanni and L. M. Gambardella. The robust shortest path problem with interval data via Benders decomposition. 4OR, 3(4):315–328, Dec. 2005.
12. Y. Nikulin. Simulated annealing algorithm for the robust spanning tree problem. Journal of Heuristics, 14(4):391–402, Oct. 2007.
13. J. Pereira and I. Averbakh. Exact and heuristic algorithms for the interval data robust assignment problem. Computers & Operations Research, 38(8):1153–1163, Aug. 2011.
14. P. Sanders and D. Schultes. Engineering fast route planning algorithms. In C. Demetrescu, editor, Proceedings of the 6th international conference on Experimental algorithms, volume 4525 of Lecture Notes in Computer Science, pages 23–36, Berlin, Heidelberg, 2007. Springer-Verlag.



Community of Scientist Optimization: Foraging and Competing for Research Resources

Alfredo Milani 1,2, Valentino Santucci 1

1 Department of Mathematics and Computer Science, University of Perugia, Italy
2 Department of Computer Science, Hong Kong Baptist University, Hong Kong
milani@unipg.it, valentino.santucci@dmi.unipg.it

Abstract. A novel optimization paradigm, called Community of Scientist Optimization (CoSO), is presented in this paper. The approach is based on the metaphor of the behaviour of a community of scientists pursuing research results and foraging for the funds needed to organize and develop research activities. The expressivity of the metaphor makes it possible to devise a wide variety of strategies which can be applied to general optimization domains. Experiments on benchmark problems in numerical optimization show the effectiveness of the approach. From a theoretical standpoint the CoSO framework subsumes other hybrid evolutionary optimization approaches, and it also has great potential for application in non-numerical and agent-based domains.

1 Introduction

Computational solutions to hard optimization problems greatly benefit from the use of evolutionary techniques [1, 10, 11]. Indeed, they are very useful for tackling hard optimization problems such as the ones arising in continuous numerical [21] and combinatorial optimization domains [6, 7]. Evolutionary techniques make use of different metaphors, often inspired by biological [5, 14, 20] or physical phenomena [3, 15], in order to design heuristics and strategies which can be employed during the exploration of the solution search space. Evolutionary algorithms, in general, can be characterized as approximate methods based on some notion of time, i.e. generations or iterations, and some notion of solution items, i.e. individuals, particles, ants, etc., which evolve over time, giving successive approximations of the optimal solution. Different population dynamics [18] characterize the different approaches; in general, individuals can breed, die, move, and change their characteristics or behavior, thus improving their local search performance and eventually collectively approaching the optimal solution.



In recent years, biological behaviors [5, 12] have been the constant inspirational metaphor for evolutionary algorithms. The underlying hypothesis is that the emerging behavior of a number of simple distributed agents (such as bees, ants, schooling fish, birds, etc.) can exhibit a high level of organization and valuable high-level properties, such as converging to some optimal solution. The starting point of this paper can be posed as a somewhat provocative question: if simple organisms offer collective emergent optimization behavior, why not take inspiration from the behavior of highly evolved organisms, i.e. humans?

The idea of Community of Scientist Optimization (CoSO) originates from the investigation of the mechanism of the collective emerging behavior of a very interesting biological organization: the human scientific community. CoSO is based on the distributed optimization process which has produced, and continues to produce, the highest results in the advancement of knowledge, i.e. the method used by the scientific community. In other words, CoSO exploits the mechanisms that humans employ in order to organize, select and finance scientific research; but, despite the suggestive starting inspiration, our final purpose is to investigate the effectiveness and the actual applicability of those mechanisms to computational optimization processes and domains.

The rest of the paper is organized as follows. The next section analyzes the main features of the modern scientific research process, seen as a collective emergent behavior. Section 3 introduces the CoSO approach and the evolutionary foraging optimization model in the framework of numerical optimization. Section 4 discusses experiments with a CoSO implementation on some benchmark problems and its comparison with PSO performance [12]. Finally, in Section 6, a discussion of the theoretical aspects of CoSO, its relationships with other foraging and evolutionary approaches, and a description of possible future lines of research concludes the paper.

2 Research Process as Emerging Behaviour

2.1 Science from Patronage to Self Organization

Scientific research, from the remote times of the great Greek mathematicians, scientists and philosophers until the Middle Ages and the Renaissance, relied mostly on the goodwill of a "mecenate", or patron. The mecenate was usually a prince, a king or a very rich person, who sponsored the activity of a "recognized scientist", often by paying him/her a regular salary and admitting him/her to his court in exchange for a variety of services, from art performances to scientific talks (this was the case, for example, of Leonardo Da Vinci with the court of Ludwig the Moor in Milan, and of the Greek scientist Archimedes, who applied his scientific findings to weapon design for the Greek navy against the Romans). The ability of the mecenate in selecting the right scientists and, conversely, the ability of the scientist to cope with the personal idiosyncrasies of the patron, as well as with the court's social environment, were crucial for the development of science in these early times. In other cases the scientists were people who lived in a favorable situation which allowed them a lot of thinking time: rich people themselves, monks, employees in obscure offices (remember that Einstein himself was an employee in a Swiss patent office at the time of his relativity papers), etc.

It was with the first scientific societies of the 1600s, and eventually with the advent of the industrial revolution, that the process of scientific production started to take off, leading to the huge production of the XIX century and the still never-ceasing flow of highly valuable findings and results produced continuously since.

It must be noted that the most relevant change which took place in the scientific production process was the move from a top-down process, driven by the arbitrary, gracious judgment of the patron, to a distributed, self-organized system which regulates its own expansion and evaluation criteria. Relevant elements in modern scientific research are the role of scientific journals and committees in selecting the papers to publish, and the more objective criteria adopted by government agencies to assign research funds. Fund assignment criteria are often based on the publications and previous results of the proponents, so it is the scientific community itself which indirectly assesses the projects to finance. On the other hand, funds also represent a foraging mechanism which establishes priorities in the use of the resources consumed by the individuals.

2.2 Features and Emerging Behavior of the Research Process<br />

There are some features of the modern process of scientific production which<br />

are worth to be briefly discussed and which will be later integrated in the CoSO<br />

model:<br />

– Funds: scientists need funds to do their research, i.e. to hire new researchers<br />

or to buy tools, labs, books, etc.<br />

– Journals: journals are collections of results which acts as communication<br />

channels among scientists. Scientists read journals: to take inspiration from<br />

previous researches, to avoid to rediscover already known results and to<br />

improve previous results.<br />

– Selection/Publication: the scientific production is self selected. Reputable<br />

scientists select the papers that other scientists want to publish, i.e. to draw<br />

to public attention. The selection is held by mean of (hopefully) objective<br />

criteria, such as considering, if the proposed results improve the published<br />

ones.<br />

– Results: scientific results are findings worth publishing, i.e. which improve on previous knowledge. Sometimes the information contained in a scientific publication is negative, i.e. the work informs the community that a certain hypothesis is false or that a certain research line is not promising; although it is often misconceived, negative information is also very useful for the progress of knowledge.<br />

– Research projects: investigators propose research projects, i.e. programs containing a description of the research area and research plans detailing which resources are needed and how to employ them. A typical research-management dilemma is deciding whether to hire new researchers or to devote existing researchers to the project.<br />

– Fund-assignment criteria and policies: funds are assigned to projects based on the scientific results of the proponents, i.e. the scientific groups with the best results are more likely to obtain funds. Moreover, governmental agencies also guarantee that additional (and possibly conflicting) criteria are met, such as prioritizing strategic topics and ensuring topic diversity (for instance, the European Union funds a limited number of outsider, challenging, high-risk projects per year; as a required feature, these projects must concern new and diverse areas).<br />
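The features above can be sketched as a minimal data model. This is an illustrative sketch only; the class and field names below are not from the paper:<br />

```python
from dataclasses import dataclass, field

@dataclass
class Paper:
    value: float          # objective value f(x) of the reported result
    point: tuple          # the point x that achieved it

@dataclass
class Journal:
    length: int                                # max results per issue (k_j)
    issue: list = field(default_factory=list)  # current list of Papers

@dataclass
class Researcher:
    position: tuple       # current point explored in the search space
    funds: int            # one unit is consumed per iteration

    def alive(self) -> bool:
        # a researcher survives only while funds remain
        return self.funds > 0
```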

The scientific research process is probably the most notable example of collective intelligence, where the valuable emergent behavior is the advancement of knowledge. In the scientific community, each researcher interacts with the others by reading journals and produces new results, either by exploring new directions of research or by deepening existing lines of research. In both cases, if the results improve on previous ones, the new ideas can be published and spread in the community, thus serving as inspiration for further research. Successful research will more likely lead to funds to continue it, while unsuccessful research means no funds and the end of the research activity. Funds are the food of modern scientists: competition for research funds introduces a foraging mechanism which indirectly acts as a selection mechanism. Once funds have been obtained, the successful proponent has to decide on a fund-management strategy, since funds can be used in quite different ways: hiring new researchers, or simply continuing the research alone for a longer time. Decisions must also be taken about research directions: where and how the new and existing researchers should explore. Fund policies are also important as a global regulating mechanism: governmental agencies can establish that certain areas are strategic, that certain areas of research cannot fall below a minimal amount of resources, or that too many projects insist on the same area. Policies which aim at topic diversity can be seen as general heuristics which guarantee a balanced advancement of knowledge and redistribute the risk of failure when research projects are too densely concentrated in one area.<br />

The next section introduces CoSO, an evolutionary foraging approach based on the described features of the collective research process, whose emergent behaviour is suited to numerical optimization problems.<br />

3 Community of Scientist Optimization: an Evolutionary<br />

Foraging Optimization Model<br />

CoSO is an evolutionary foraging optimization algorithm whose key features are inspired by the metaphor of the scientific research process taking place in a community of scientists.<br />



Let a multidimensional numerical optimization problem be represented by an objective function f : Θ → R to be minimized (maximized) over the space of feasible solutions Θ ⊆ R^d (where d is a positive integer indicating the dimensionality of the problem).<br />

CoSO is composed of a dynamic set of researchers R = {r1,...,rn} that share one or more journals Jj and compete to publish their best results, i.e. the best points visited in Θ with respect to the cost function f. Researchers use funds to organize their research, for instance by hiring new researchers to help them. At each iteration a researcher consumes one unit of funds, so researchers can die by fund exhaustion. The activities of searching, publishing, fund distribution, and fund investment are synchronized by discrete time instants, also called iterations. As the iterations progress, the journals reflect the advancement of knowledge about the function f, eventually converging toward the optimal minimum (maximum) value.<br />
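One iteration of such a system can be sketched as follows. The move rule, the publication bound k, and the fund policy here are placeholders for illustration, not the authors' implementation:<br />

```python
import random

def coso_step(researchers, journal, k, f):
    """One CoSO-style iteration: search, publish, fund use, survival.

    `researchers` is a list of dicts with keys 'x' (position tuple) and
    'funds'; `journal` is a list of (value, point) pairs kept sorted
    ascending, holding at most the best k results seen so far.
    """
    for r in researchers:
        # search: perturb the current position (placeholder move rule)
        r['x'] = tuple(xi + random.uniform(-0.1, 0.1) for xi in r['x'])
        # submit the new result to the shared journal
        journal.append((f(r['x']), r['x']))
    # publication: only the best k results survive into the next issue
    journal.sort(key=lambda p: p[0])
    del journal[k:]
    # fund consumption: one unit per iteration; exhausted researchers die
    for r in researchers:
        r['funds'] -= 1
    researchers[:] = [r for r in researchers if r['funds'] > 0]
```

Over many iterations, the journal's best entry tracks the best point found so far, while fund exhaustion removes unproductive researchers from the population.<br />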

3.1 Journals<br />

CoSO journals {Jj} are a set of data structures which record the significant progress of exploration made by the researchers over time. Each journal Jj is statically characterized by:<br />

– a journal length kj, i.e. the maximum number of results which can be published<br />

in a journal issue,<br />

– a set of readers/authors Rj ⊆ R,<br />

– a sequence of journal issues {Jj,t}, one issue for each discrete time instant<br />

t.<br />

A journal issue Jj,t is a list of at most kj papers, i.e. pairs (f(xi,u), xi,u) (where i ranges over the researcher set Rj and u ∈ [0,t]∩N), ordered with respect to f(xi,u), containing (at most) the best kj results obtained up to iteration t by the journal's readers. Researchers refer to the latest journal issues they have read in order to decide their direction of research, and publish in the journals they know. In this sense, journals act as communication channels among researchers.<br />

Finally, note that papers submitted at time t to a journal Jj are published in the journal issue Jj,t+1 if and only if they rank among the best kj results submitted so far, i.e. they must improve on the best previously published results, and at most kj of them are kept.<br />

3.2 Researchers<br />

Researchers represent the active search elements of CoSO. At each time instant<br />

t, a researcher ri is associated with a certain set of properties:<br />

– xi,t, a research position in the multidimensional space Θ,<br />

– vi,t, a movement vector indicating the direction of research with respect to<br />

the previous position at time step t − 1,<br />



Table 2. Experiments Results<br />
<br />
             CoSO                     PSO<br />
Function     Qm       C        Pc    Qm       C        Pc<br />
Sphere       8269     8269     1.00  8024     8024     1.00<br />
Rosenbrock   1339203  1339203  1.00  —        —        0.00<br />
Ackley       340734   299846   0.88  46000    31280    0.68<br />
Rastrigin    67660    67660    1.00  —        —        0.00<br />
Griewank     18460    17722    0.96  136852   120430   0.88<br />

optimization problems, while it shows a similar behavior for simple unimodal<br />

problems.<br />

Fig. 1. Convergence graph on Sphere Function<br />

5 Theoretical aspects of CoSO and Related Works<br />

CoSO shares many elements with PSO [12, 17] and, more generally, subsumes the latter 5 . On the other hand, many important differences exist, first of all<br />
<br />
5 A PSO with all particles connected can be modeled by CoSO with a single journal of dimension 1 and a fund-distribution strategy reassigning one unit of funds to each researcher at each iteration. In this way no researchers are created or deleted.<br />



Fig. 2. Convergence graph on Griewank Function<br />

the introduction of inheritance, foraging, and selection mechanisms, which are completely absent in pure PSO approaches.<br />
<br />
An easy parallel can be drawn between the notion of researchers and PSO particles, since both are characterized by changing their position in the search space. On the other hand, the use of journals in CoSO, combined with foraging, allows a richer dynamic.<br />

It is interesting to recognize some basic mechanisms of foraging: survival and indirect communication (see, for instance, indirect communication through pheromone in ACO [5]). Journals act as communication channels that researchers use to indirectly exchange information about where the areas with good results are, i.e. where the food for survival is. Funds are computational resources which are guaranteed to the best computing entities, i.e. the best researchers. Connecting foraging to performance through the fund-distribution strategy, and allowing communication through channels/journals, makes it possible to obtain a collective emergent behavior consisting in optimizing the performance, i.e. a collective converging behavior.<br />

CoSO also uses elements from classical Genetic Algorithms (GAs) [8, 16, 20]. The foraging mechanism induced by the notion of research funds introduces a selection mechanism which resembles the genetic survival-of-the-fittest strategy [16]. In other words, CoSO implements a kind of "publish or perish" rule, which can be restated as a "get good results, then get funds, or perish" rule. The inheritance methods used in CoSO can certainly also be related to GAs, while the foraging selection mechanism promotes the features of the best researchers.<br />



A remarkable difference is that self-reproduction is used in CoSO, i.e. hired researchers tend to reflect most of the features of their creator, or to evolve them by mutation (see the perturbation of the fund-management strategy si). In this respect, CoSO can also be related to bacterial foraging hybrid algorithms such as [2, 13, 14].<br />

Other hybrid approaches can be related to CoSO, for instance [19], which introduces diversity in a particle swarm optimizer, or the recent [18, 22], where population dynamics and genetic operators are used in the PSO framework. Despite the many connections with existing hybrid approaches, CoSO has the remarkable merit of bringing similar mechanisms together within the single coherent metaphor of the scientific production process.<br />

Another interesting aspect of CoSO is that, despite its application here to numerical optimization, it can easily be extended to other areas and used as a framework for managing distributed agents in problems suited to being solved by collective emergent behavior. Consider, for instance, application domains where agents (e.g. retrieval agents crawling for "interesting" documents, planning agents, web services, etc.) produce non-numerical solution instances or services, which can be compared and shared through journals. The agents can have, in general, different computational capabilities, which will be rewarded with different computational resources by the foraging mechanism.<br />

6 Conclusion<br />

CoSO is an innovative evolutionary approach to computational optimization based on the mechanisms used by the scientific community to manage the process of scientific production. Among its main features are a foraging mechanism, i.e. competition for research funds, which indirectly acts as a selection mechanism; a self-regulating "outsider" strategy which maintains diversity in the research topics; and dynamically adapting research-management strategies for hiring new researchers.<br />

Despite the many points of contact with recent hybrid PSO and foraging proposals [2, 14, 18], the CoSO metaphor offers a single framework where foraging, competition, communication, and search dynamics lead to a collective emergent behavior which results in an efficient optimization process. Experimental results on numerical optimization problems are encouraging, since CoSO outperforms classical PSO [4, 17] on difficult benchmark problems.<br />

Future lines of research will regard the exploration of different criteria for fund assignment (e.g. taking into account the historical performance of researchers) and different evolution mechanisms for journal-relevance distribution and for fund-management strategies (which currently evolve not within a single researcher but in its offspring). Another interesting line of research will be the experimentation of CoSO as a framework for organizing the collective behavior of distributed sets of agents on non-numerical problems.<br />



References<br />

1. Bäck T. (1996) Evolutionary Algorithms in Theory and Practice, Oxford, NY<br />

2. Biswas A., Dasgupta S., Das S., Abraham A. (2007) Synergy of PSO and Bacterial<br />

Foraging Optimization – A Comparative Study on Numerical Benchmarks, In:<br />

Innovations in Hybrid Intelligent Systems, ASC 44, pp. 255–263, Springer<br />

3. Cerny V. (1985) A Thermodynamical Approach to the Travelling Salesman Problem:<br />

an Efficient Simulation Algorithm. In: Journal of Optimization Theory and<br />

Applications, 45:41–51<br />

4. Clerc M., Kennedy J. (2002) The Particle Swarm-Explosion, Stability, and Convergence<br />

in a Multidimensional Complex Space. In: IEEE Transactions on Evolutionary<br />

Computation 6(1):58–73<br />

5. Colorni A., Dorigo M., Maniezzo V. (1991) Distributed Optimization by Ant<br />

Colonies. In: Proceedings of First European Conference on Artificial Life, Elsevier<br />

Publishing, pp. 134–142<br />

6. Dorigo M., Gambardella L. M. (1997) Ant Colony System : A Cooperative Learning<br />

Approach to the Traveling Salesman Problem. In: IEEE Transactions on Evolutionary<br />

Computation, 1(1):53–66<br />

7. Dorigo M., Maniezzo V., Colorni A. (1996) Ant System: Optimization by a Colony<br />

of Cooperating Agents. In: IEEE Transactions on Systems, Man, and Cybernetics<br />

– Part B, 26(1):29–41<br />

8. Eiben A. E., Raué P. E., Ruttkay Z. S. (1994) Genetic Algorithms with Multi-<br />

Parent Recombination. In: Proceedings of Third Conference on Parallel Problem<br />

Solving from Nature, pp. 78–87<br />

9. Feoktistov V. (2006) Differential Evolution. In search of solutions. Springer<br />

10. Hingston P. F., Barone L. C., Michalewicz Z. (2008) Design by Evolution: Advances<br />

in Evolutionary Design. Springer<br />

11. Holland J. H. (1975) Adaptation in Natural and Artificial Systems. University of<br />

Michigan Press, Ann Arbor<br />

12. Kennedy J., Eberhart R. (1995) Particle Swarm Optimization. In: Proceedings of<br />

IEEE Conference on Neural Networks, IEEE Press, pp. 1942–1948<br />

13. Kim D. H., Abraham A., Cho J. H. (2007) A Hybrid Genetic Algorithm and<br />

Bacterial Foraging Approach for Global Optimization. In: Information Sciences,<br />

177(18):3918–3937<br />

14. Kim D. H., Cho C. H. (2005) Bacterial Foraging Based Neural Network Fuzzy<br />

Learning. In: Proceedings of Indian International Conference on Artificial Intelligence,<br />

pp. 2030–2036<br />

15. Kirkpatrick S., Gelatt C. D., Vecchi M. P. (1983) Optimization by Simulated<br />

Annealing. In: Science New Series 220(4598):671–680<br />

16. Michalewicz Z. (1999) Genetic Algorithms + Data Structures = Evolution Programs.<br />

Springer-Verlag<br />

17. Poli R., Kennedy J., Blackwell T. (2007) Particle Swarm Optimization. An<br />

overview. In: Swarm Intelligence 1(1):33–57<br />

18. Qi K., Lei W., Qi-Di W. (2008) A Novel Ecological Particle Swarm Optimization<br />

Algorithm and its Population Dynamics Analysis. In: Applied Mathematics and<br />

Computation 205(1):61–72<br />

19. Riget J., Vesterstrøm J. S. (2002) A Diversity-Guided Particle Swarm Optimizer<br />

– the ARPSO. In: EVALife Technical Report no. 2002-02<br />

20. Schmitt L. M. (2001) Theory of Genetic Algorithms. In: Theoretical Computer<br />

Science, 259:1–61<br />



21. Storn R., Price K. (1997) Differential evolution – A Simple and Efficient Heuristic<br />

for Global Optimization over Continuous Spaces. In: Journal of Global Optimization,<br />

11(4):341–359<br />

22. Yanjiang M., Zhihua C., Jianchao Z. (2009) Dynamic Population-Based Particle<br />

Swarm Optimization Combined with Crossover Operator. In: Proceedings of Ninth<br />

International Conference on Hybrid Intelligent Systems, vol. 1, pp. 399–404<br />



An Empirical Study of Learning and Forgetting<br />

Constraints<br />

Ian P. Gent, Ian Miguel and Neil C.A. Moore<br />

{ipg,ianm,ncam}@cs.st-andrews.ac.uk<br />

School of Computer Science, University of St Andrews, St Andrews, Scotland, UK.<br />

Abstract. Conflict-driven constraint learning provides big gains on many<br />

CSP and SAT problems. However, time and space costs to propagate<br />

the learned constraints can grow very quickly, so constraints are often<br />

discarded (forgotten) to reduce overhead. We conduct a major empirical<br />

investigation into the overheads introduced by unbounded constraint<br />

learning in CSP. To the best of our knowledge, this is the first published<br />

study in either CSP or SAT. We obtain two significant results.<br />

The first is that a small percentage of learnt constraints do most propagation.<br />

While this is conventional wisdom, it has not previously been<br />

the subject of empirical study. Second, we show that even constraints<br />

that do no effective propagation can incur significant time overheads.<br />

Finally, by implementing forgetting, we confirm that it can significantly<br />

improve the performance of modern learning CSP solvers, contradicting<br />

some previous research.<br />

1 Introduction<br />

In this paper, we conduct an empirical investigation into the overheads introduced<br />

by unbounded constraint learning in CSP. To the best of our knowledge,<br />

this is the first published study in either CSP or SAT. We obtain two primary<br />

results. The first is that a small percentage of learnt constraints do most propagation.<br />

Although this is conventional wisdom, no published study exists. Second,<br />

we show that even constraints that do no effective propagation can incur significant<br />

time overheads. This clarifies conventional wisdom which suggests that<br />

watched literal propagators can have lower overheads when not in use. Finally, we<br />

show that forgetting can improve performance of modern learning CSP solvers<br />

by exhibiting a working implementation, contradicting some previous published<br />

research.<br />

2 Background: Learning and Forgetting in SAT and CSP<br />

Nogood learning is an important CSP search technique. In brief, when the solver<br />

reaches a dead-end, a new constraint is added to rule out future branches that<br />

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).<br />
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.<br />



are effective in learning solvers is consistent with this belief: if a few constraints dominate, most can collectively be thrown away without harming search. However, constraint forgetting in some form is a necessity in order to avoid running out of memory, so it would still benefit the solver even if individual constraints were comparably effective. Regardless, the effect must be quantified, and understanding it quantitatively might help in designing effective forgetting strategies.<br />

Procedure Measuring the effectiveness of an individual constraint is more difficult in a learning solver than in a standard backtracking solver, because the learning procedure combines constraints together. Hence a constraint may do little propagation itself, while constraints derived from it during the learning process may do a lot; the influence of a constraint may therefore be wide. This is a subtle issue and we have not attempted to measure it. Rather, we measure only the direct effects of individual constraints, and not their "influence".<br />

Therefore, in this section, the number of unit propagations is used as a measure of the effectiveness of a learnt constraint. This choice is not obvious, so we now discuss why it was made. The problem is that propagations are not necessarily beneficial if they remove values but do not contribute to domain wipeouts or other failures. To get around this issue, as part of its clause-forgetting system (see §4.1), MiniSat [6] measures the number of times a constraint has been identified as part of the reason for a failure. Hence, we did consider using the number of propagations that lead to failure as a measure of constraint effectiveness, rather than the raw number of propagations. However, over our 2050 instances and 566,059 learned constraints, the correlation coefficient between propagation count and count of involvement in conflicts is 0.96. In other words, each propagation is roughly equally likely to be involved in a conflict, so the following results should apply almost equally to propagations resulting in failure. The advantage of using the total number of propagations is that it is more easily defined and less coupled with learning.<br />
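The reported 0.96 figure is an ordinary Pearson correlation over the two per-constraint counts. For reference, it can be computed as in this sketch:<br />

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences, e.g. per-constraint propagation counts and conflict
    involvement counts."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```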

For efficiency reasons, solvers do not collect this data by default. In order<br />

to carry out these experiments our solver was amended to print out a short<br />

message whenever a constraint propagated, giving the unique constraint number<br />

and the node at which the propagation occurred. These data were then analysed<br />

externally with the aid of a statistical package. Although this slows the solver<br />

down, the experiment is fair because counts are not affected.<br />

Note that the later a constraint is posted, the less time it has to propagate. Hence the raw numbers of propagations carried out by each constraint are not directly comparable. To get around this, only constraints learned during approximately the first 50% of nodes are included, and for each constraint the propagations are counted only over the following 50% of nodes, so that every count covers the same number of nodes. For example, if the problem is solved in 9999 nodes, constraints learned between nodes 1 and 5000 are included, and the constraint learned at node 278 is counted from node 278 to node 5277.<br />
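This windowing rule can be sketched as an illustrative helper (not the authors' code):<br />

```python
def count_window(total_nodes, learned_at):
    """Node range over which a constraint's propagations are counted.

    Only constraints learned in (approximately) the first half of the
    search are included; each is counted over the next 50% of nodes,
    so that every count covers the same number of nodes.
    """
    half = round(total_nodes / 2)
    if learned_at > half:
        return None          # learned too late: excluded from the study
    return (learned_at, learned_at + half - 1)
```

With 9999 search nodes, `count_window(9999, 278)` yields the range (278, 5277) from the example above.<br />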



Fig. 1. What proportion of constraints are responsible for what propagation? – single instance (y-axis: cumulative sum of UPs; x-axis: percentile of constraint)<br />

Results and analysis For the instance latinSquare-dg-8 all.xml.minion we exhibit a graph that we will later show is representative of other instances. The upper curve 2 in Figure 1 shows what proportion of the best constraints are responsible for what proportion of all unit propagations (UPs). By "best" we mean doing the most propagations. Each point is an individual constraint, and the constraints are sorted by increasing propagation count moving from left to right along the x-axis. The x-axis is the percentile of the constraint's propagation. The y-axis is the number of propagations accounted for by that constraint and those with a lower percentile. For example, the circled point on the x-axis is the median (50th percentile) constraint by propagation count: it is the 5223rd constraint, out of 10446. The total propagation count for all 5223 constraints is exactly 5223 [sic], out of a total of 26220 for all constraints, i.e. 20% of the total. Hence the bottom 50% of constraints account for just 20% of all propagation. The slope is shallow until the 80th-percentile constraint (marked by a small square), after which it steepens dramatically. Hence the top 20% of constraints do a lot more work than the rest. This agrees with the hypothesis that a minority of constraints do most propagation.<br />
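The curve in Figure 1 is a cumulative sum over constraints sorted by propagation count; its construction can be sketched as:<br />

```python
def cumulative_share(counts):
    """For constraints sorted by ascending propagation count, return
    the fraction of total propagations accounted for by each prefix
    (i.e. by that constraint and all lower-percentile ones)."""
    ordered = sorted(counts)
    total = sum(ordered)
    running, shares = 0, []
    for c in ordered:
        running += c
        shares.append(running / total)
    return shares

# e.g. if 8 constraints propagate once each and 2 propagate 20 times
# each, the bottom 80% of constraints account for only 8/48 of all UPs
shares = cumulative_share([1] * 8 + [20] * 2)
```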

In §2 we noted that each constraint is guaranteed to propagate at least once. This first propagation has the effect of a right branch, so it does not contribute effectively, since the solver would have made it anyway. Hence we now report results with these ineffective propagations deleted. In the black (lower) curve in Figure 1 the same graph is shown with 1 subtracted from the propagation count of each constraint. Here the curve is zero until the 80th percentile, meaning that the worst 80% of constraints contribute no additional propagation after the right branch, i.e. just one propagation each: just 20% of constraints do all useful propagation, and 10% do almost all.<br />

The previous results focus on a specific instance, so we now expand the analysis to all 949 instances from the test set that cannot be solved within 1000 nodes of search. This is done to ensure that a trend has a chance to establish itself: to<br />

2 The points are close enough together to appear as a single curve, rather than as distinct points.<br />



P      Min.  1st Qu.  Median  Mean   3rd Qu.  Max.<br />
1%     0.01  0.01     0.01    0.04   0.03     2.04<br />
5%     0.01  0.02     0.04    0.09   0.09     2.04<br />
10%    0.01  0.05     0.08    0.19   0.18     3.64<br />
15%    0.01  0.09     0.13    0.31   0.31     3.91<br />
20%    0.01  0.12     0.19    0.46   0.47     5.46<br />
25%    0.01  0.17     0.27    0.64   0.68     6.80<br />
30%    0.01  0.23     0.35    0.86   0.92     8.24<br />
35%    0.01  0.30     0.46    1.11   1.22     9.69<br />
40%    0.01  0.37     0.58    1.40   1.58     11.13<br />
45%    0.01  0.47     0.72    1.73   1.99     12.57<br />
50%    0.01  0.57     0.86    2.11   2.51     14.02<br />
55%    0.02  0.67     1.00    2.56   3.22     16.33<br />
60%    0.02  0.78     1.18    3.07   3.93     18.76<br />
65%    0.02  0.89     1.34    3.65   4.86     21.27<br />
70%    0.02  0.99     1.51    4.34   6.09     24.39<br />
75%    0.02  1.09     1.70    5.15   7.56     27.51<br />
80%    0.02  1.19     1.89    6.15   9.50     30.83<br />
85%    0.02  1.32     2.08    7.40   11.75    37.07<br />
90%    0.02  1.44     2.27    9.11   15.37    43.32<br />
95%    0.02  1.55     2.48    11.68  21.88    50.00<br />
100%   0.02  1.65     2.71    16.03  37.06    69.89<br />
<br />
Table 1. What proportion of constraints are responsible for what propagation? – all instances<br />

analyse only a few constraints might be less meaningful. In Table 1, for each chosen percentage P, we give what percentage of the best constraints is needed to account for P% of overall non-branching propagation 3 . These results show that usually a small proportion of the best constraints performs a disproportionate amount of propagation. For example, 10% of all propagation is performed by a median of 0.08% and a maximum of 3.64% of constraints, and 100% by a median of 2.71% and a maximum of 69.89%. Hence the behaviour described above for a single benchmark is robust over many instances: the best few constraints overwhelmingly perform most non-branching propagation. If anything, the sample instance above understates the effect, since it required about 20%, instead of the median of 2.71%, of constraints to do all propagation.<br />

Conclusion We have shown empirically that the best constraints are responsible<br />

for much of the propagation and thus search space reduction.<br />

3 It may seem anomalous that some entries exceed P %, since the best P % constraints<br />

must do at least P % of propagations. This apparent anomaly is because there may<br />

be no integer number of constraints doing P % of propagation, so it is necessary to<br />

overcount.<br />



3.3 Clauses have high time as well as space costs<br />

Unit propagation by watched literals [19] is designed to reduce the amount of time spent propagating infrequently-propagating constraints, through the possibility of watches migrating to inactive literals that do not trigger and thus cost nothing to propagate. Before describing the experiment, we first briefly outline how watched-literal propagation works.<br />

Unit propagation (UP) is a way of propagating clauses. Watched literals are an efficient implementation of UP, first described in [19]. The idea is to watch a pair of literals that are not set to false. Provided such literals exist, the clause can still be satisfied and unit propagation need not happen yet. Suppose one of these literals is set to false: if another non-false literal can be found, the propagator watches it instead; otherwise the single remaining non-false literal must be unit-propagated to true immediately, to avoid the clause becoming unsatisfied. The empirical evidence suggests that, since the propagator only cares about assignments to two literals, it is efficient compared to other unit propagators that watch all assignments (e.g. ones that count false assignments). If the watched literals are set to true early in search, the clause is essentially zero-cost until the solver backtracks beyond that point, because it will never be triggered on those literals.<br />
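A minimal sketch of this two-watched-literal scheme, for a single clause, is shown below. The representation and names are illustrative, and conflict handling is omitted:<br />

```python
FALSE, TRUE, UNSET = 0, 1, 2

def on_watched_false(clause, watches, assignment):
    """Called when a watched literal of `clause` becomes false.

    `clause` is a list of literals, `watches` a mutable pair of indices
    into it, and `assignment` maps literals to FALSE/TRUE/UNSET.
    Returns a literal to unit-propagate to true, or None if the watch
    migrated and the clause needs no further work yet.
    """
    a, b = watches
    if assignment[clause[a]] != FALSE:      # make `a` the false watch
        a, b = b, a
    # try to migrate the watch to a replacement non-false literal
    for i, lit in enumerate(clause):
        if i not in (a, b) and assignment[lit] != FALSE:
            watches[0], watches[1] = i, b
            return None
    # no replacement found: the other watched literal must become true
    return clause[b]
```

Note how a clause whose two watched literals stay non-false is never even inspected, which is the source of the low-overhead intuition discussed above.<br />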

Hence, perhaps weakly-propagating constraints do not cost much time, provided space is available to store them, since infrequently-propagating constraints may do little work. The next question is therefore: do constraints which do not propagate much cost significant time as well as space?<br />

Procedure The minimum amount of time to process a single domain event with a watched-literal propagator can be on the order of a handful of machine instructions, taking nanoseconds to run, during which time the system clock may not tick. Hence, to obtain nano-scale timings, the solver keeps a running total of the number of processor clock ticks as recorded by the RDTSC register specific to Intel processors [13]. Each tick occupies 1/(2.66 × 10^9) seconds, since we used a 2.66 GHz Xeon E5430. The overhead of collecting data is very low, taking only one assembly instruction to get the number and a few more cycles to add it to the running total.<br />

At the end of search, all the cycle counts are printed out and analysed externally<br />

with the aid of a statistical package.<br />

Results and analysis How does time spent correlate with unit propagations performed? Figure 2 is a scatterplot for the single instance used in §3.2. Each point represents a single constraint. The x-axis gives the number of unit propagations (including the initial right-branching one), and the y-axis the total number of processor cycles used to propagate the constraint during the entire search. First, and unsurprisingly, as an individual constraint propagates more, it often requires more time to do so. What may be surprising is that the worst case for constraints is roughly constant, and independent of the number of propagations. That is,<br />



Fig. 2. How much time does propagation take? (y-axis: time in cycles; x-axis: number of UPs)<br />

constraints which do no effective propagation can take a similar amount of time to propagate as constraints which propagate almost 1000 times. For this sample instance, 74% of propagation time is occupied by constraints that never propagate again after the first time. This suggests that learnt constraints can incur a significant time overhead without doing any useful propagation.<br />

Table 2 extends the study to the 1,923 instances out of the full set of 2,028<br />

where at least one constraint is learned. Each row is a chosen percentage R%<br />

of the total non-branching propagations, and the columns are summary statistics<br />

for what % of the overall propagation time the best constraints take to<br />

achieve R% of all propagation. A constraint is “better” than another if it does<br />

more propagations per second of time spent propagating. For example, the third<br />

row says that, in the median over all instances, the best available constraints can<br />

achieve 10% of all non-branching propagation in just 0.62% of the total propagation<br />

time. Using the most efficient constraints, all non-branching propagation<br />

can be achieved in a mean of less than a quarter of the time needed when using all constraints.<br />

All other time spent is completely wasted since it leads to no effective<br />

propagation.<br />
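
The per-row statistic can be reconstructed as follows (our own sketch of the analysis, not the authors' script; all names are ours, and per-constraint cycle counts are assumed positive): constraints are ranked by propagations per cycle, and the most efficient are accumulated until they cover the target fraction of all propagations.<br />

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct ConstraintStats { long propagations; long cycles; };

// Fraction of total propagation time that the "best" constraints (those doing
// the most propagations per cycle spent) need in order to cover a fraction R
// of all non-branching propagations (a reconstruction of Table 2's statistic).
double time_share_for_coverage(std::vector<ConstraintStats> cs, double R) {
    std::sort(cs.begin(), cs.end(),
              [](const ConstraintStats& a, const ConstraintStats& b) {
                  // more propagations per cycle first
                  return static_cast<double>(a.propagations) / a.cycles >
                         static_cast<double>(b.propagations) / b.cycles;
              });
    long total_props = 0, total_cycles = 0;
    for (const auto& c : cs) { total_props += c.propagations; total_cycles += c.cycles; }
    const long target = static_cast<long>(std::ceil(R * total_props));
    long props = 0, cycles = 0;
    for (const auto& c : cs) {
        if (props >= target) break;  // coverage reached
        props += c.propagations;
        cycles += c.cycles;
    }
    return static_cast<double>(cycles) / total_cycles;
}
```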

Conclusion The results on all instances confirm the result from the single<br />

instance, and show that learnt constraints which do no propagation contribute<br />

significantly to the time overhead of the solver.<br />

The design of watched literal propagators makes it possible that constraints<br />

that do not propagate will cost the solver very little in time. This is because the<br />

watches could potentially migrate to “silent literals” that do not trigger often.<br />

Hence, we feel it significant that we have shown that this is often not the case,<br />

and useless clauses can be very costly on an individual basis.<br />
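
To make the mechanism concrete, here is a textbook-style sketch of two-watched-literal maintenance for a single clause (an illustration, not Minion's code, with an assumed encoding of truth values): when a watched literal becomes false, the propagator first tries to migrate the watch to another non-false literal, and only propagates when none exists.<br />

```cpp
#include <utility>
#include <vector>

// Truth value of each literal under the current assignment:
// -1 false, 0 unassigned, +1 true (an illustrative encoding).
enum class Watch { Replaced, Unit, Satisfied };

// A clause is a vector of literal indices; positions 0 and 1 are watched.
// Called when the literal at watch position w (0 or 1) has just become false.
Watch on_watch_false(std::vector<int>& clause, const std::vector<int>& value, int w) {
    const int other = clause[1 - w];
    if (value[other] == 1) return Watch::Satisfied;  // clause already true
    for (int i = 2; i < static_cast<int>(clause.size()); ++i) {
        if (value[clause[i]] != -1) {         // found a non-false literal
            std::swap(clause[w], clause[i]);  // migrate the watch there
            return Watch::Replaced;           // no propagation triggered
        }
    }
    return Watch::Unit;  // every alternative is false: propagate the other watch
}
```

If both watches settle on literals that rarely become false, the clause costs almost nothing thereafter; the measurements above show that this favourable case often does not occur.<br />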

4 Clause forgetting<br />

The above results suggest that, if constraints are picked carefully, the solver can often<br />

remove them to save a lot of time at only a small cost in search size. As described<br />

in §2, this is a well known and well used technique in both CSP and SAT.<br />



R Min. 1st Qu. Median Mean 3rd Qu. Max.<br />

1% 0.00 0.02 0.17 6.12 3.32 100.00<br />

5% 0.00 0.05 0.33 6.17 3.32 100.00<br />

10% 0.00 0.11 0.62 6.30 3.52 100.00<br />

15% 0.00 0.18 0.95 6.50 3.82 100.00<br />

20% 0.00 0.26 1.38 6.79 4.38 100.00<br />

25% 0.00 0.35 1.88 7.12 5.11 100.00<br />

30% 0.00 0.45 2.31 7.52 5.82 100.00<br />

35% 0.00 0.54 2.85 8.07 6.82 100.00<br />

40% 0.00 0.63 3.38 8.46 7.75 100.00<br />

45% 0.00 0.71 4.03 9.01 9.10 100.00<br />

50% 0.00 0.79 4.54 9.46 9.97 100.00<br />

55% 0.00 0.91 5.38 10.50 11.67 100.00<br />

60% 0.00 1.04 6.08 11.16 13.32 100.00<br />

65% 0.00 1.20 6.87 11.97 15.10 100.00<br />

70% 0.00 1.38 7.99 13.06 17.73 100.00<br />

75% 0.00 1.58 9.06 14.00 19.62 100.00<br />

80% 0.00 1.78 10.07 15.27 22.59 100.00<br />

85% 0.00 2.03 11.35 16.78 25.91 100.00<br />

90% 0.00 2.29 12.56 18.55 30.03 100.00<br />

95% 0.00 2.59 14.31 20.76 34.05 100.00<br />

100% 0.00 2.89 15.23 24.01 41.02 100.00<br />

Table 2. How much time does propagation take? (all instances)<br />

Indeed, Katsirelos and Bacchus have implemented relevance-bounded learning<br />

for a g-learning solver in [16]. They report poor results, showing that relevance<br />

bounding with k = 3 leads to more timeouts and slower solution times. However,<br />

only a very small number of similar problems are tried, so the results are inconclusive.<br />

In this section, we try a range of well-known existing strategies for forgetting<br />

learned constraints.<br />

4.1 Context<br />

For size-bounded and relevance-bounded learning [5, 8] the solver must, respectively,<br />

either not learn the constraint if it has more than k literals in it, or remove the<br />

constraint once k literals become unset for the first time. Both have been applied<br />

successfully to the CSP in the past, but using an s-learning solver. Since<br />

they were last tried, algorithms for propagating disjunctions have progressed<br />

significantly with the introduction of watched literal propagation [19], meaning<br />

that learned constraints are faster to propagate. Hence the techniques may no<br />

longer be useful and, if they are useful, the optimal choice of parameters will<br />

probably have changed as long clauses become less burdensome. Also, the learning<br />

algorithms applied have fundamentally changed with the advent of g-nogood<br />

learning. Katsirelos has shown [15] that the properties of clauses change as a<br />

result of g-learning, for example the average clause length can reduce. This also<br />



motivates the re-evaluation of existing forgetting strategies. Finally, theoretical<br />

results [14, 3] from SAT show that there is an exponential separation between<br />

solvers using size-bounded learning and learning unrestricted on length, meaning<br />

that the former may need exponentially more search than the latter on particular<br />

problems. This means that size-bounded learning is theoretically discredited,<br />

but it remains to be seen how it performs in practice.<br />

Recently there has been a collection of new forgetting heuristics in SAT<br />

solvers, which are based on activity. Using activity-based heuristics, the clauses<br />

that are least used for conflict analysis are removed when the solver needs to free<br />

space to learn new clauses. As well as guessing which clauses are least beneficial,<br />

new strategies must also decide how many to keep. This is a difficult trade-off, because<br />

keeping more clauses increases propagation time, but throwing them away reduces<br />

inference power. The best choice is problem dependent. We will experiment on<br />

what we will call the minisat strategy after the solver it originated in [6].<br />

The strategy has 3 main components:<br />

activity each clause has an activity score, which is incremented by 1 each time<br />

it is used as an explanation in the firstUIP procedure<br />

decay periodically, activities are reduced, so that clauses that have been active<br />

recently are prioritised<br />

forgetting just before the scores are decayed each time, half of all constraints<br />

are removed, with the following exceptions:<br />

– those that have unit propagated in the current branch of search are kept,<br />

– those with scores below a fixed threshold are removed first even if the<br />

target of removing half has already been reached, and<br />

– binary and unary clauses are always kept.<br />

In order to implement this algorithm, the frequency of decay and forgetting and<br />

the divisor for decay must be supplied. The threshold below which all clauses<br />

are removed is simply 1 over the size of the clause database.<br />
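
A sketch of this forgetting step (an assumed data layout with names of our choosing; the real MiniSat implementation differs in detail, and the decay component is omitted):<br />

```cpp
#include <algorithm>
#include <vector>

struct LearntClause { double activity; int size; bool locked; };  // locked = currently unit

// Remove roughly half of the learnt clauses, as in the minisat strategy:
// keep locked and binary/unary clauses, remove low-activity clauses first,
// and always remove clauses below the threshold 1 / (database size).
void reduce_db(std::vector<LearntClause>& db) {
    std::sort(db.begin(), db.end(),
              [](const LearntClause& a, const LearntClause& b) {
                  return a.activity < b.activity;  // least active first
              });
    const double threshold = 1.0 / db.size();
    const std::size_t target = db.size() / 2;  // aim to remove half
    std::vector<LearntClause> kept;
    std::size_t removed = 0;
    for (const LearntClause& c : db) {
        const bool removable = !c.locked && c.size > 2;  // keep locked, binary, unary
        if (removable && (removed < target || c.activity < threshold))
            ++removed;
        else
            kept.push_back(c);
    }
    db = std::move(kept);
}
```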

4.2 Experimental evaluation<br />

We will describe an experiment to test the effectiveness of the forgetting strategies<br />

from the literature described above.<br />

Implementing constraint forgetting As mentioned in §3.2 each learned constraint<br />

propagates at least once and this is necessary for the completeness of<br />

g-learning. Hence, when implementing bounded learning, our solver propagates<br />

each new constraint once anyway, even if it is going to be discarded immediately.<br />

In our implementation, currently unit clauses, a.k.a. locked clauses 4 , can be<br />

slated for deletion, meaning that they are not propagated any more, but the<br />

memory cannot be freed until the clause is no longer unit. In our solver, restarts are not<br />

4 nomenclature due to [6]<br />



used. It is possible to prove that deleting clauses is safe (i.e. the solver is still<br />

complete), provided that they are not locked.<br />

For k relevance bounding, recall that the solver must remove the constraint<br />

when k literals become unset for the first time. Our implementation works as<br />

follows: when the constraint is created the literals are sorted by descending<br />

depth at which they became false 5 and the k’th depth is selected. When the<br />

solver backtracks beyond the selected depth, exactly k literals will have become unset<br />

and the constraint can be deleted. There is little runtime overhead using this<br />

implementation.<br />
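
This bookkeeping amounts to the following computation (a simplified sketch with names of our choosing):<br />

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Given the decision depths at which each literal of a newly learned
// constraint became false, return the depth at which the constraint becomes
// deletable under k-relevance bounding: once the solver backtracks above the
// k'th deepest of these depths, exactly k literals have become unset.
int deletion_depth(std::vector<int> falsified_at, int k) {
    std::sort(falsified_at.begin(), falsified_at.end(), std::greater<int>());
    return falsified_at[k - 1];  // the k'th deepest falsification depth
}
```

The solver then needs only to watch for a backtrack past this single precomputed depth, which is why the runtime overhead is small.<br />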

The implementation of size-bounded learning and the minisat strategy follow<br />

straightforwardly from the definitions given above.<br />

Experimental methodology Each of the 2028 instances was executed four<br />

times with a 10 minute timeout, over 3 Linux machines each with 2 Intel Xeon<br />

cores at 2.4 GHz and 2 GB of memory, running kernel version 2.6.18 SMP.<br />

Parameters to each run were identical, and the minimum time for each is used<br />

in the analysis, in order to approximate the run time in perfect conditions (i.e.<br />

with no system noise) as closely as possible. Each instance was run on its own<br />

core, with 1 GB of memory. Minion was compiled statically (-static) using<br />

g++ version 4.4.3 with flag -O3.<br />

Beauty contest We tried each strategy with a wide range of parameters and in<br />

Table 3 report a selection of the best parameters for each. The best parameters<br />

were found by testing a wide interval of possible parameters, and finding a local<br />

optimum. Close to the local optimum more parameters were tried to locate the<br />

best single value where possible (e.g. for discrete parameters). Minion with no<br />

learning at all is also included in the comparison under the name “stock.undefined”.<br />

In the table, the strategies are abbreviated to name.parameter, except minisat<br />

which is abbreviated to minisat.interval.decayfactor.<br />

The “Beauty Contest” columns give both the number of instances solved and<br />

the total amount of time spent. Hence an instance that times out does not count<br />

towards instances solved and costs 600 seconds. The best strategy is that which<br />

solved the most instances, taking into account overall time to break ties. In<br />

the table the best strategies are listed first. Finally, the first and third quartiles and<br />

median nodes per second are given. These statistics show the increase or decrease<br />

in search speed. A solver with forgetting should have a higher search speed<br />

because it has fewer constraints to propagate. The ‘Search measures’ columns<br />

give measures of what effect each strategy has compared to unbounded learning:<br />

that is, how effective search is, as<br />

opposed to how fast. The columns are as follows:<br />

Instances means the number of instances the variant and unbounded both<br />

complete. This is the number of instances being compared in the following two<br />

statistics.<br />

5 this information is available from the learning subsystem<br />



Fig. 3. Graph comparing the best strategy (relevance-bounded k = 6) against other strategies. (Two log-log scatterplots of per-instance speedup with rel.6: (a) No learning, speedup versus stock solve time; (b) Unbounded learning, speedup versus unbounded solve time.)<br />

Nodes inc. means the factor of additional nodes the strategy needs on those<br />

instances. The smaller the number 6 , the less propagation is lost as a result<br />

of forgetting.<br />

Speedup means speedup factor, e.g. speedup factor of 2 means that the strategy<br />

takes half the time to solve all the instances together. Note that because only<br />

instances completed by both are included, there are no timeouts in the total.<br />

The aim is to maximise nodes per second, while keeping the node increase<br />

as little as possible.<br />

Analysis of results In these results, most of the strategies for forgetting clauses<br />

improve over unbounded learning (none.undefined in Table 3) in terms of both<br />

instances solved and overall time. There is an overall increase in the number<br />

of instances solved: provided that the increased node rate compensates for the<br />

increase in the number of nodes searched, there will be a net win. There is an<br />

apparent paradox because for some strategies that beat unbounded learning,<br />

e.g. size.2, the number of nodes increases more than the node rate in the “search<br />

measures” section. However this is not a problem, because “beauty contest” is<br />

based on all instances, whereas “search measures” is based only on instances that<br />

didn’t time out. Hence the paradox arises because, for these strategies, the instances<br />

that timed out were the most improved in terms of nodes and node rate. This<br />

makes sense when the instances that run the longest with unbounded learning<br />

are the most encumbered by useless clauses.<br />

These results are interesting because, contrary to [16], relevance- and size-bounded<br />

learning work well for certain choices of k. However, the results in this<br />

paper were based on a larger set of benchmarks and a larger range of parameters<br />

were tried. Also, different implementation decisions in our solver will result<br />

6 constraint forgetting could occasionally lead to less search, as in backjumping [21],<br />

so a number under 1 is possible in principle<br />



Strategy Beauty contest Search measures<br />

Instances Time 1st Q NPS Median NPS 3rd Q NPS Instances Nodes inc. Speedup<br />

stock.undefined 1667 248598.9 403.9 1353.0 10390.0 1312 129.6 6.7<br />

relevance.6 1641 278203.7 205.3 502.4 1257.0 1336 2.4 4.2<br />

relevance.5 1639 277357.3 217.6 541.6 1433.0 1336 2.8 4.7<br />

relevance.4 1639 280652.1 222.5 533.4 1549.0 1333 3.6 4.3<br />

relevance.7 1637 278973.3 201.7 482.9 1184.0 1336 1.9 4.4<br />

size.10 1637 280804.7 196.7 534.4 1225.0 1336 4.1 5.1<br />

relevance.10 1636 279244.4 178.1 454.1 1021.0 1335 1.6 5.2<br />

relevance.3 1635 280366.6 242.1 566.2 1728.0 1336 5.5 3.4<br />

size.8 1635 281008.0 214.6 566.2 1383.0 1335 5.2 4.5<br />

size.5 1634 283213.5 235.9 595.7 1574.0 1335 7.5 3.9<br />

relevance.14 1631 281037.3 141.7 409.5 874.6 1334 1.3 5.6<br />

size.12 1631 282370.3 187.6 504.2 1143.0 1335 2.1 5.5<br />

size.13 1631 282911.4 180.1 485.7 1081.0 1335 1.8 5.5<br />

size.14 1631 283324.7 180.1 469.2 1044.0 1335 1.6 5.7<br />

relevance.15 1629 282680.8 136.6 404.9 865.1 1335 1.3 5.9<br />

size.9 1629 283146.9 205.9 541.2 1298.0 1334 4.5 5.0<br />

size.11 1629 283882.0 193.7 516.0 1170.0 1333 3.0 5.3<br />

relevance.16 1629 284854.4 134.5 406.7 860.9 1335 1.3 5.6<br />

size.15 1628 287587.7 176.5 463.9 1007.0 1333 1.7 4.7<br />

relevance.13 1627 281439.7 155.0 427.0 928.2 1335 1.4 5.3<br />

relevance.2 1625 287833.7 250.6 580.3 2006.0 1329 61.3 3.2<br />

relevance.12 1623 284866.5 159.0 420.5 928.9 1334 1.4 5.3<br />

size.2 1621 289421.7 257.4 604.3 2088.0 1327 21.6 3.7<br />

relevance.17 1620 288246.0 126.1 402.2 830.4 1335 1.3 5.1<br />

size.20 1619 295401.9 155.1 413.9 907.9 1335 1.3 4.9<br />

relevance.20 1618 293226.9 119.2 361.1 783.1 1334 1.2 5.3<br />

size.1 1616 294566.6 262.4 611.1 2192.0 1323 61.6 3.1<br />

mostrecent.1 1600 302325.7 227.2 544.0 2102.0 1319 65.8 3.1<br />

mostrecent.2 1600 305267.5 206.9 500.7 2008.0 1323 37.0 2.8<br />

mostrecent.10 1569 326114.8 155.6 381.5 1683.0 1323 34.8 2.6<br />

relevance.30 1555 333292.2 98.4 255.6 686.2 1335 1.2 4.1<br />

size.30 1554 330743.5 124.0 359.9 786.2 1335 1.2 4.2<br />

minisat.1.1 1517 349391.3 112.9 278.1 1164.0 1326 8.0 2.1<br />

relevance.40 1501 360096.1 70.5 166.2 635.5 1335 1.1 3.3<br />

size.40 1498 354322.2 108.1 260.1 720.8 1334 1.1 3.9<br />

mostrecent.100 1475 386555.2 77.2 217.8 1002.0 1326 6.1 2.2<br />

minisat.201.501 1440 410767.3 60.8 173.3 810.8 1321 2.0 2.0<br />

minisat.201.1001 1439 411044.4 60.9 170.6 800.4 1321 2.0 2.0<br />

minisat.201.1 1438 410130.1 60.9 174.2 805.6 1321 2.0 2.1<br />

minisat.401.501 1419 431958.5 46.4 152.4 698.8 1319 1.8 1.9<br />

minisat.401.1001 1417 438939.3 45.6 146.5 676.0 1320 1.8 1.7<br />

minisat.401.1 1413 444863.3 43.8 143.5 660.1 1319 1.8 1.6<br />

relevance.100 1404 406542.4 31.4 99.2 564.3 1330 1.0 2.0<br />

size.100 1397 406529.6 40.5 110.5 581.3 1330 1.1 1.9<br />

minisat.601.1001 1373 500036.1 36.8 127.9 586.7 1319 1.6 1.4<br />

minisat.601.501 1371 502484.1 36.1 121.2 583.9 1318 1.5 1.4<br />

mostrecent.1000 1371 559058.3 31.6 106.3 566.1 1330 1.3 1.6<br />

minisat.601.1 1367 510004.5 35.8 126.0 581.4 1316 1.4 1.5<br />

minisat.1.1001 1344 440553.2 22.7 100.7 585.6 1322 3.0 0.9<br />

none.undefined 1343 440552.2 22.2 76.4 510.0 1343 1.0 1.0<br />

minisat.1.501 1343 442209.0 22.6 97.6 574.2 1321 3.0 0.9<br />

Table 3. Comparison of various strategies for forgetting constraints<br />

in a different time-space trade-off. In fact, the best strategy solves 298 more<br />

instances than unbounded learning in about 45 hours less runtime. However, it<br />

still trails stock Minion by 26 instances and about 8 hours of runtime. In spite<br />

of this, Figure 3(a) gives evidence that learning is still valuable and promising<br />

in specific cases. Each point is an instance, with the x-axis the runtime taken<br />

by stock Minion and the y-axis the ratio of stock runtime to rel.6 runtime; points above<br />

the line are speedups and points below are slowdowns. Whilst many instances<br />

are slowed down, speedups of up to 5 orders of magnitude are available on<br />

some types of problem. Apart from the best strategy, various parameters for<br />

relevance-bounded learning perform similarly to k = 6, as well as some size-bounded<br />

learning parameters. It seems clear that they are significantly better<br />

than unbounded learning, but not much different to each other.<br />

The minisat strategy is not effective for any choice of parameters that we<br />

tried. However there is reason to believe that a better implementation might<br />

improve matters. Notice that the increase in nodes for the better strategies<br />

(200.X) is relatively small. Using a profiler, we have discovered that the reason<br />

for slowness is the amount of time taken to maintain and process the scores, and<br />

to process the constraints periodically. Hence perhaps a better implementation<br />

would turn out to perform competitively overall.<br />

Now we will analyse the best forgetting strategy more carefully. Figure 3(b)<br />

depicts the speedup on each instance for relevance-bounded k = 6 compared to<br />

unbounded. It shows that most individual instances are speeded up, sometimes<br />

by two orders of magnitude, although a few are slowed down by up to an order<br />

of magnitude.<br />

In conclusion, whether to use learning remains a modelling decision, where<br />

big wins are sometimes available, but sometimes learning is better turned off.<br />

5 Conclusions<br />

In this paper, we have carried out the first detailed empirical study of the effectiveness<br />

and costs of individual constraints in a CDCL solver. We found that,<br />

typically, a very small minority of constraints contribute most of the propagation<br />

added by learning. While this is conventional wisdom, it has not previously<br />

been the subject of empirical study. It is important to verify and make precise<br />

folklore results, for until evidence exists and is published, folklore is unverifiable and<br />

acts as a barrier to entry for new researchers, who may not yet be aware of folk<br />

knowledge.<br />

Furthermore, these best constraints cost only a small fraction of the runtime<br />

cost. Conversely, constraints that do no effective propagation can incur significant<br />

time overheads. This contradicts conventional wisdom which suggests that<br />

watched literal propagators have lower overheads when not in use. This result<br />

shows why it is important to experiment on “known” results, because they are<br />

not always entirely correct.<br />

Together, these results explain why forgetting can work so well. It is obvious<br />

that forgetting is a positive necessity due to memory constraints, but this research<br />

shows that forgetting is not only necessary but also fortuitously effective<br />

because of the disparity in propagation between constraints.<br />

Finally, we performed an empirical survey of several simple techniques for<br />

forgetting constraints in g-learning, and found that they are extremely effective<br />

in making the learning solver more robust and efficient, contrary to some<br />

previously published evidence.<br />

References<br />

1. R. J. Bayardo and R. C. Schrag. Using CSP look-back techniques to solve real-world SAT instances, pages 203–208. AAAI Press, 1997.<br />

2. N. Beldiceanu, M. Carlsson, and J.-X. Rampon. Global constraint catalog. Technical Report 08, Swedish Institute of Computer Science, 2005.<br />

3. E. Ben-Sasson and J. Johannsen. Lower bounds for width-restricted clause learning on small width formulas. In SAT, volume 6175 of LNCS, pages 16–29, 2010.<br />

4. F. Boussemart, F. Hemery, C. Lecoutre, and L. Sais. Boosting systematic search by weighting constraints. In ECAI 04, pages 482–486, August 2004.<br />

5. R. Dechter. Enhancement schemes for constraint processing: backjumping, learning, and cutset decomposition. Artif. Intell., 41(3):273–312, 1990.<br />

6. N. Eén and N. Sörensson. An extensible SAT-solver. In E. Giunchiglia and A. Tacchella, editors, SAT, volume 2919 of LNCS, pages 502–518. Springer, 2003.<br />

7. T. Feydy and P. J. Stuckey. Lazy clause generation reengineered. In I. P. Gent, editor, CP, volume 5732 of LNCS, pages 352–366. Springer, 2009.<br />

8. D. Frost and R. Dechter. Dead-end driven learning. In AAAI-94, volume 1, pages 294–300. AAAI Press, 1994.<br />

9. I. Gent, I. Miguel, and N. Moore. Lazy explanations for constraint propagators. In PADL 2010, number 5937 in LNCS, January 2010.<br />

10. I. P. Gent, C. Jefferson, and I. Miguel. Minion: A fast scalable constraint solver. In ECAI, pages 98–102, 2006.<br />

11. M. L. Ginsberg. Dynamic backtracking. JAIR, 1:25–46, 1993.<br />

12. E. Goldberg and Y. Novikov. Berkmin: A fast and robust SAT-solver. Discrete Applied Mathematics, 155(12):1549–1561, 2007.<br />

13. Intel. IA-32 Intel Architecture Software Developer’s Manual Volume 1: Basic Architecture. Intel, Inc., 2000.<br />

14. J. Johannsen. An exponential lower bound for width-restricted clause learning. In O. Kullmann, editor, SAT, volume 5584 of LNCS, pages 128–140, 2010.<br />

15. G. Katsirelos. Nogood Processing in CSPs. PhD thesis, University of Toronto, Jan 2009. http://hdl.handle.net/1807/16737.<br />

16. G. Katsirelos and F. Bacchus. Unrestricted nogood recording in CSP search. In CP, pages 873–877, 2003.<br />

17. G. Katsirelos and F. Bacchus. Generalized nogoods in CSPs. In M. M. Veloso and S. Kambhampati, editors, AAAI, pages 390–396, 2005.<br />

18. J. P. Marques-Silva and K. A. Sakallah. GRASP: A new search algorithm for satisfiability. In International Conference on Computer-Aided Design, pages 220–227, November 1996.<br />

19. M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff: Engineering an Efficient SAT Solver. In DAC 01, 2001.<br />

20. O. Ohrimenko, P. J. Stuckey, and M. Codish. Propagation via lazy clause generation. Constraints, 14(3):357–391, 2009.<br />

21. P. Prosser. Domain filtering can degrade intelligent backtracking search. In 13th International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1993.<br />

22. G. Richaud, H. Cambazard, and N. Jussien. Automata for nogood recording in constraint satisfaction problems. In CP06 Workshop on the Integration of SAT and CP techniques, 2006.<br />



Job Shop Scheduling with Routing Flexibility and<br />

Sequence Dependent Setup-Times<br />

Angelo Oddi 1, Riccardo Rasconi 1, Amedeo Cesta 1, and Stephen F. Smith 2<br />

1 Institute of Cognitive Science and Technology, CNR, Rome, Italy<br />

angelo.oddi,riccardo.rasconi,amedeo.cesta@istc.cnr.it<br />

2 Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA sfs@cs.cmu.edu<br />

Abstract. This paper presents a meta-heuristic algorithm for solving a job shop<br />

scheduling problem involving both sequence dependent setup-times and the possibility<br />

of selecting alternative routes among the available machines. The proposed<br />

strategy is a variant of the Iterative Flattening Search (IFS) schema. This<br />

work provides three separate results: (1) a constraint-based solving procedure<br />

that extends an existing approach for classical Job Shop Scheduling; (2) a new<br />

variable and value ordering heuristic based on temporal flexibility that takes into<br />

account both sequence dependent setup-times and flexibility in machine selection;<br />

(3) an original relaxation strategy based on the idea of randomly breaking<br />

the execution orders of the activities on the machines, with an activity selection<br />

criterion based on their proximity to the solution’s critical path. The efficacy of the<br />

overall heuristic optimization algorithm is demonstrated on a new benchmark set<br />

which is an extension of a well-known and difficult benchmark for the Flexible<br />

Job Shop Scheduling Problem.<br />

1 Introduction<br />

This paper describes an iterative improvement approach to solve job-shop scheduling<br />

problems involving both sequence dependent setup-times and the possibility of selecting<br />

alternative routes among the available machines. In recent years there has been<br />

an increasing interest in solving scheduling problems involving both setup-times and<br />

flexible shop environments [3, 2]. This fact stems mainly from the observation that in<br />

various real-world industry or service environments there are tremendous savings when<br />

setup times are explicitly considered in scheduling decisions. In addition, the possibility<br />

of selecting alternative routes among the available machines is motivated by interest in<br />

developing Flexible Manufacturing Systems (FMS) [25] able to use multiple machines<br />

to perform the same operation on a job’s part, as well as to absorb large-scale changes,<br />

in volume, capacity, or capability.<br />

The proposed problem, called in the rest of the paper Flexible Job Shop Scheduling<br />

Problem with Sequence Dependent Setup Times (SDST-FJSSP), is a generalization<br />

of the classical Job Shop Scheduling Problem (JSSP) where a given activity may be<br />

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving<br />

Problems with Combinatorial Explosion (RCRA 2011).<br />

In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.<br />



processed on any one of a designated set of available machines and there are no setup-times.<br />

This problem is more difficult than the classical JSSP (which is itself NP-hard),<br />

since it is not just a sequencing problem; in addition to deciding how to sequence activities<br />

that require the same machine (involving sequence-dependent setup-times), it<br />

is also necessary to choose a routing policy, i.e., deciding which machine will process<br />

each activity. The objective remains that of minimizing makespan.<br />

Although this problem is often met in real manufacturing systems, not many papers<br />

take into account sequence dependent setup-times in flexible job-shop environments.<br />

On the other hand, a richer literature is available when setup-times and flexible job-shop<br />

environments are considered separately. In particular, on the side of setup-times<br />

a first reference work is [7], which relies on an earlier proposal presented in [6]. More<br />

recent works are [28] and [13], which propose effective heuristic procedures based on<br />

genetic algorithms and local search. In these works, the introduced local search procedures<br />

extend an approach originally proposed by [19] for the classical job-shop scheduling<br />

problem to the setup times case. A last noteworthy work is [5], which extends the<br />

well-known shifting bottleneck procedure [1] to the setup-time case. Both [5] and [28]<br />

have produced reference results on a previously studied benchmark set of JSSP with<br />

sequence dependent setup-times problems initially proposed by [7]. Regarding the Flexible<br />

Job Shop Scheduling Problem (FJSSP), an effective synthesis of the existing solving approaches is<br />

proposed in [14]. The core set of procedures which generate the best results include the<br />

genetic algorithm (GA) proposed in [10], the tabu search (TS) approach of [16] and the<br />

discrepancy-based method, called climbing depth-bound discrepancy search (CDDS),<br />

defined in [14]. Among the papers dealing with both sequence dependent setup times<br />

and flexible shop environments there is the work [23], which considers a shop type<br />

composed of pools of identical machines as well as two types of setup times: one modeling<br />

the transportation times between different machines (sequence dependent) and the<br />

other one modeling the required reconfiguration times (not sequence dependent) on the<br />

machines. The other work that deals with sequence dependent setup times and routing<br />

flexibility is [24], which considers a flow-shop environment with multi-purpose machines<br />

such that each stage of a job can be processed by a set of unrelated machines<br />

(the processing times of the jobs depend on the machine they are assigned to). [26] considers<br />

a problem similar to the previous one, where the jobs are composed of a single<br />

step, but setup-times are both sequence and machine dependent. Finally, [27] considers<br />

a job-shop problem with parallel identical machines, release times and due dates but<br />

sequence independent setup-times.<br />

This paper focuses on a family of solving techniques referred to as Iterative Flattening<br />

Search (IFS). IFS was first introduced in [8] as a scalable procedure for solving<br />

multi-capacity scheduling problems. IFS is an iterative improvement heuristic designed<br />

to minimize schedule makespan. Given an initial solution, IFS iteratively applies two steps:<br />

(1) a subset of solving decisions are randomly retracted from a current solution<br />

(relaxation-step); (2) a new solution is then incrementally recomputed (flattening-step).<br />

Extensions to the original IFS procedure were made in two subsequent works [17, 12]<br />

and more recently [20] performed a systematic study aimed at evaluating the effectiveness<br />

of single component strategies within the same uniform software framework.<br />
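
In outline, the IFS schema is the following loop (a generic sketch under assumed interfaces, not the authors' implementation; relax and flatten stand for the two steps just described):<br />

```cpp
#include <functional>

// Generic Iterative Flattening Search skeleton. Solution is left abstract:
// relax retracts a random subset of solving decisions from the incumbent,
// flatten incrementally recomputes a complete solution, and makespan
// evaluates it. Only improving candidates replace the incumbent.
template <typename Solution>
Solution iterative_flattening(
    Solution best, int max_iterations,
    std::function<Solution(const Solution&)> relax,
    std::function<Solution(const Solution&)> flatten,
    std::function<int(const Solution&)> makespan) {
    for (int it = 0; it < max_iterations; ++it) {
        Solution candidate = flatten(relax(best));  // the two-step move
        if (makespan(candidate) < makespan(best))   // keep only improvements
            best = candidate;
    }
    return best;
}
```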

The IFS variant that we propose relies at its core on a constraint-based solver. This<br />



procedure is an extension of the SP-PCP procedure proposed in [21]. SP-PCP generates<br />

consistent orderings of activities requiring the same resource by imposing precedence<br />

constraints on a temporally feasible solution, using variable and value ordering heuristics<br />

that discriminate on the basis of temporal flexibility to guide the search. We extend<br />

both the procedure and these heuristics to take into account both sequence dependent<br />

setup-times and flexibility in machine selection. To provide a basis for embedding this<br />

core solver within an IFS optimization framework, we also specify an original relaxation<br />

strategy based on the idea of randomly breaking the execution orders of the activities on<br />

the machines with an activity selection criterion based on their proximity to the solution’s

critical path.<br />

The paper is organized as follows. Section 2 defines the SDST-FJSSP problem<br />

and Section 3 introduces a CSP representation. Section 4 describes the core constraint-based

search procedure while Section 5 introduces details of the IFS meta-heuristics.<br />

An experimental section (Section 6) describes the performance of our algorithm on a<br />

set of benchmark problems, and explains the most interesting results. Some conclusions<br />

end the paper.<br />

2 The Scheduling Problem<br />

The SDST-FJSSP entails synchronizing the use of a set of machines (or resources)<br />

R = {r1,...,rm} to perform a set of n activities A = {a1,...,an} over time. The set<br />

of activities is partitioned into a set of nj jobs J = {J1,...,Jnj}. The processing of a<br />

job Jk requires the execution of a strict sequence of nk activities ai ∈ Jk and cannot be<br />

modified. All jobs are released at time 0. Each activity ai requires, for its entire duration, the exclusive use of a single resource ri chosen among a set of available resources Ri ⊆

R. No preemption is allowed. Each machine is available at time 0 and can process more<br />

than one operation of a given job Jk (recirculation is allowed). The processing time pir<br />

of each activity ai depends on the selected machine r ∈ Ri, such that ei − si = pir,<br />

where the variables si and ei represent the start and end time of ai. Moreover, for each<br />

resource r, the value st^r_ij represents the setup time between two generic activities ai and aj (aj scheduled immediately after ai) requiring the same resource r, such that ei + st^r_ij ≤ sj. As is traditionally assumed in the literature, the setup times st^r_ij satisfy the so-called triangle inequality (see [7, 4]): for any three activities ai, aj, ak requiring the same resource, the inequality st^r_ij ≤ st^r_ik + st^r_kj

holds. A solution S = {(s1, r1), (s2, r2),...,(sn, rn)} is a set of pairs (si, ri), where<br />

si is the assigned start time of ai, ri is the selected resource for ai and all the above<br />

constraints are satisfied. Let Ck be the completion time of job Jk; the makespan is the value Cmax = max_{1≤k≤nj} Ck. An optimal solution S* is a solution S with the

minimum value of Cmax. The SDST-FJSSP is NP-hard since it is an extension of the<br />

JSSP [11].
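To make the definition above concrete, the following sketch checks a candidate solution against the SDST-FJSSP constraints and returns its makespan. The data layout (dictionaries keyed by activity id) and the function name are our own illustrative choices, not part of the paper.

```python
# Illustrative SDST-FJSSP solution checker (names and layout are ours).
# A solution assigns each activity a start time and a machine; we verify job
# precedences, machine capacity with sequence-dependent setup separation,
# and return the makespan, or None if some constraint is violated.

def makespan_if_feasible(jobs, proc, setup, starts, assign):
    """jobs: list of job -> ordered list of activity ids;
    proc[a][r]: processing time of activity a on machine r;
    setup[r][(a1, a2)]: setup time on r when a2 runs right after a1;
    starts[a], assign[a]: chosen start time and machine of activity a."""
    end = {a: starts[a] + proc[a][assign[a]] for j in jobs for a in j}
    # job precedence: each activity starts after its predecessor ends
    for j in jobs:
        for a1, a2 in zip(j, j[1:]):
            if starts[a2] < end[a1]:
                return None
    # machine capacity + setup times: check consecutive pairs on each machine
    by_machine = {}
    for j in jobs:
        for a in j:
            by_machine.setdefault(assign[a], []).append(a)
    for r, acts in by_machine.items():
        acts.sort(key=lambda a: starts[a])
        for a1, a2 in zip(acts, acts[1:]):
            if end[a1] + setup[r][(a1, a2)] > starts[a2]:
                return None
    return max(end.values())
```

For instance, two single-activity jobs on one machine with a setup time of 1 between them are feasible only if the second job starts at least one time unit after the first one ends.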

3 A CSP Representation<br />

There are different ways to model the problem as a Constraint Satisfaction Problem<br />

(CSP) [18]; here we use an approach similar to [21]. In particular, we focus on assigning<br />



IFS(S, MaxFail, γ)
begin
1. Sbest ← S
2. counter ← 0
3. while (counter ≤ MaxFail) do
4.   RELAX(S, γ)
5.   S ← PCP(S, Cmax(Sbest))
6.   if Cmax(S) < Cmax(Sbest) then
7.     Sbest ← S; counter ← 0
8.   else counter ← counter + 1
9. return (Sbest)
end


path set). As is well known, an activity ai belongs to the critical path (i.e., meets the critical

path condition) when, given ai’s end time ei and its feasibility interval [lbi,ubi], the<br />

condition lbi = ubi holds. For each activity ai, the smaller the difference ubi − lbi<br />

computed on ei, the closer ai is to the critical path condition. At each IFS iteration,

the critical path set is built so as to contain any activity ai with a probability directly<br />

proportional to the γ parameter and inversely proportional to the ubi − lbi value. For<br />

obvious reasons, the critical path-biased relaxation entails a smaller disruption of the

solution S, as it operates on a smaller set of activities; the activities that are farther<br />

from the critical path condition have only a minimal probability of being selected. As

explained in the following section, this difference has important consequences on the<br />

experimental behavior.<br />
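The contrast between the two relaxation strategies can be sketched as follows; the function name, the uniform-random baseline, and the particular way slack scales the selection probability are our own simplifications of the scheme described above, not the paper's exact formula.

```python
import random

# Sketch of the two relaxation selections (our naming, simplified).
# Random relaxation retracts each activity's ordering decisions with
# probability γ; the critical-path-biased variant keeps γ as a ceiling and
# scales each activity's probability down as its end-time slack ub_i - lb_i
# grows, so activities on (or near) the critical path are preferred.

def select_for_relaxation(activities, gamma, slack=None):
    """activities: iterable of ids; slack: optional dict id -> ub_i - lb_i."""
    if slack is None:                      # purely random relaxation
        return [a for a in activities if random.random() < gamma]
    max_slack = max(slack.values()) or 1   # guard against all-zero slack
    chosen = []
    for a in activities:
        # slack 0 (critical path condition met) -> probability gamma;
        # maximal slack -> probability close to 0
        p = gamma * (1.0 - slack[a] / (max_slack + 1))
        if random.random() < p:
            chosen.append(a)
    return chosen
```

Under this sketch, activities far from the critical path are rarely retracted, which matches the smaller-disruption behavior discussed above.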

6 Experimental Analysis<br />

The empirical evaluation has been carried out on an SDST-FJSSP benchmark set purposely synthesized from the first 20 instances of the edata subset of the FJSSP HUdata

testbed from [15], and will therefore be referred to as SDST-HUdata. Each one<br />

of the SDST-HUdata instances has been created by adding to the original HUdata instance<br />

one Setup-Time matrix st^r of size (nJ × nJ) for each machine r, where nJ is the number of jobs. Without loss of generality, the same randomly generated

Setup-Time matrix was added for each machine of all the benchmark instances. Each<br />

value st^r_ij in the Setup-Time matrix models the setup time necessary to reconfigure the

r-th machine to switch from job i to job j. Note that machine reconfiguration times are<br />

sequence dependent: setting up a machine to process a product of type j after processing<br />

a product of type i can generally take a different amount of time than setting up the<br />

same machine for the opposite transition. The elements st^r_ij of the Setup-Time matrix satisfy the triangle inequality [7, 4]: for any three activities ai, aj, ak requiring the same machine, st^r_ij ≤ st^r_ik + st^r_kj holds. The 20 instances taken from

HUdata (namely, the instances la01-la20) are divided into four groups of five (nJ × nA)

instances each, where nJ is the number of jobs and nA is the number of activities per<br />

job for each instance. More precisely, group la01-la05 is (10 × 5), group la06-la10 is<br />

(15×5), group la11-la15 is (20×5), and group la16-la20 is (10×10). In all instances,

the processing times on machines assignable to the same activity are identical, as in the<br />

original HUdata set. The algorithm used for these experiments has been implemented<br />

in Java and run on an AMD Phenom II X4 Quad 3.5 GHz machine under Linux Ubuntu 10.4.1.
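The paper does not spell out how the random Setup-Time matrices were generated. One simple way to obtain an asymmetric matrix that is guaranteed to satisfy the triangle inequality is to draw random entries and then take their min-plus (shortest-path) closure, as in the following sketch; this is our own construction, offered purely as an illustration.

```python
import random

# Synthesize a sequence-dependent setup-time matrix satisfying the triangle
# inequality: draw random times, then apply a Floyd-Warshall style min-plus
# closure, which forces st[i][j] <= st[i][k] + st[k][j] for all i, k, j
# while keeping the matrix asymmetric in general.

def random_setup_matrix(n_jobs, lo=1, hi=20, seed=None):
    rng = random.Random(seed)
    st = [[0 if i == j else rng.randint(lo, hi) for j in range(n_jobs)]
          for i in range(n_jobs)]
    for k in range(n_jobs):               # min-plus closure
        for i in range(n_jobs):
            for j in range(n_jobs):
                st[i][j] = min(st[i][j], st[i][k] + st[k][j])
    return st
```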

Results. Tables 1 and 2 show the results obtained by running our algorithm on the

SDST-HUdata set using the Random or Slack-based procedure in the IFS relaxation step,<br />

respectively. Both tables are composed of 10 columns and 23 rows (one row per problem<br />

instance plus three data wrap-up rows). The best column lists the shortest makespans<br />

obtained in the experiments for each instance; underlined values represent the best values<br />

obtained from both tables (global bests). The columns labeled γ =0.2 to γ =0.9<br />

(see Section 4) contain the results obtained running the IFS procedure with a different<br />

value for the relaxing factor γ. For each problem instance (i.e., for each row) the values<br />

in bold indicate the best makespan found among all the tested γ values (γ runs).<br />



Table 1. Results with random selection procedure<br />

inst. best γ<br />

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9<br />

la01 726 772 731 728 726 729 726 729 740<br />

la02 749 785 785 749 749 749 749 749 768<br />

la03 652 677 658 658 658 652 652 658 675<br />

la04 673 673 673 673 689 689 680 680 690<br />

la05 603 613 613 603 605 605 606 607 632<br />

la06 950 965 950 954 954 971 997 995 1020<br />

la07 916 946 916 925 919 947 950 987 1000<br />

la08 954 973 961 964 954 963 958 1000 1001<br />

la09 1002 1039 1002 1039 1020 1042 1020 1045 1068<br />

la10 977 1017 977 1022 977 1027 1008 1042 1048<br />

la11 1265 1265 1312 1285 1282 1345 1332 1372 1368
la12 1088 1088 1114 1130 1167 1165 1199 1209 1198

la13 1255 1255 1255 1255 1300 1280 1300 1316 1315<br />

la14 1292 1292 1315 1344 1346 1362 1351 1345 1372<br />

la15 1298 1298 1302 1338 1355 1352 1367 1388 1429<br />

la16 1012 1028 1012 1012 1012 1012 1012 1012 1023<br />

la17 864 881 885 885 864 888 864 864 902<br />

la18 985 1021 1007 1029 999 985 985 985 985<br />

la19 956 1006 992 975 956 956 978 959 981<br />

la20 997 1008 1010 997 997 997 997 997 999<br />

B (N) 12 6(1) 7(5) 6(4) 8(5) 6(5) 7(5) 5(3) 1(1)<br />

Av.C. 20149 17579 14767 11215 10950 9530 7782 7588

Av.MRE 19.34 18.29 18.66 18.37 19.42 19.43 20.60 22.44<br />

For each γ run, the last three rows of both tables show, respectively (top to bottom):

(1) the number B of best solutions found locally (i.e., within the current table) and,<br />

underlined within round brackets, the number N of best solutions found globally (i.e.,<br />

between both tables); (2) the average number of utilized solving cycles (Av.C.), and<br />

(3) the average mean relative error (Av.MRE) 6 with respect to the lower bounds of<br />

the original HUdata set (i.e., without setup times), reported in [16]. For all runs, a<br />

maximum CPU time limit was set to 800 seconds.<br />

One significant result that the tables show is the difference in the average number of utilized

solving cycles (Av.C. row) between the random and the slack-based relaxation<br />

procedure. In fact, it can be observed that on average the slack-based approach uses<br />

more solving cycles in the same allotted time than its random counterpart (i.e., the<br />

slack-based relaxation heuristic is faster in the solving process). This is explained by<br />

observing that the slack-based relaxation heuristic entails a less severe disruption of the<br />

current solution at each solving cycle compared to the random heuristic, as the former<br />

generally relaxes a lower number of activities (given the same γ value). The lower the<br />

disruption level of the current solution in the relaxation step, the easier it is to re-gain<br />

solution feasibility in the flattening step. In addition to this efficiency advantage, the slack-based relaxation approach also provides the extra effectiveness that derives from operating

in the vicinity of the critical path of the solution, as demonstrated in [8].<br />

The good performance exhibited by the slack-based heuristic can be also observed<br />

by inspecting the B(N) rows in both tables. Clearly, the slack-based approach finds a<br />

6 The individual MRE of each solution is computed as follows: MRE = 100 × (Cmax −<br />

LB)/LB, where Cmax is the solution makespan and LB is the instance’s lower bound.
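As a quick worked instance of this formula (with made-up numbers, not values from the tables): a makespan of 600 against a lower bound of 500 gives an MRE of 20%.

```python
# MRE formula from the footnote; the sample values are illustrative only.
def mre(cmax, lb):
    return 100.0 * (cmax - lb) / lb

print(mre(600, 500))  # 20.0
```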



Table 2. Results with slack-based selection procedure<br />

inst. best γ<br />

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9<br />

la01 726 739 736 726 726 726 726 726 726<br />

la02 749 785 749 749 749 749 749 749 749<br />

la03 652 658 658 658 658 658 652 658 658<br />

la04 673 686 686 686 673 686 680 673 680<br />

la05 603 613 603 613 605 603 604 603 605<br />

la06 960 963 963 971 960 963 962 970 970<br />

la07 925 941 966 941 925 931 946 972 1000<br />

la08 948 983 963 948 964 993 967 994 973<br />

la09 1002 1020 1020 1002 1002 1040 1069 1052 1042<br />

la10 985 993 991 1007 1022 1022 1017 985 1024<br />

la11 1256 1256 1257 1295 1295 1308 1318 1324 1332
la12 1082 1082 1097 1098 1159 1152 1188 1163 1207
la13 1215 1222 1240 1240 1223 1215 1311 1301 1311
la14 1285 1308 1285 1285 1311 1295 1335 1372 1345
la15 1291 1333 1291 1330 1302 1311 1383 1389 1412

la16 1007 1012 1012 1012 1007 1012 1012 1012 1012<br />

la17 858 889 868 893 895 888 858 859 872<br />

la18 985 1019 1025 1021 1007 985 985 985 985<br />

la19 956 1006 976 987 984 956 980 956 959<br />

la20 997 997 1033 997 997 997 1003 997 997<br />

B (N) 17 3(3) 4(4) 5(5) 8(6) 7(7) 5(5) 8(7) 4(4)<br />

Av.C. 21273 18068 15503 13007 10643 10653 8639 8575<br />

Av.MRE 18.67 18.09 18.26 18.19 18.14 19.58 19.44 20.16<br />

higher number of best solutions (17 against 12), which is confirmed by comparing the<br />

number of locally found bests (B) with the global ones (N), for each γ value, and for<br />

both heuristics.<br />

Another interesting aspect emerges from analyzing the range of γ values where the best performance is obtained (Av.MRE row). Inspecting the Av.MRE values, the

following can in fact be stated: (1) the slack-based heuristic finds solutions of higher<br />

quality w.r.t. the random heuristic over the complete γ variability range; (2) in the random<br />

case, the best results are obtained in the [0.3, 0.5] γ range, while in the slack-based<br />

case the best γ range is wider ([0.3, 0.6]).<br />

7 Conclusions<br />

In this paper we have proposed the use of Iterative Flattening Search (IFS) as a means of<br />

effectively solving the SDST-FJSSP. The proposed algorithm uses as its core solving<br />

procedure an extended version of the SP-PCP procedure proposed by [21] and a new<br />

relaxation strategy targeted to the case of SDST-FJSSP. The effectiveness of the procedure<br />

was demonstrated on 20 modified instances of the edata subset of the FJSSP<br />

HUdata testbed from [15], a well known and difficult Flexible Job Shop Scheduling<br />

benchmark set. In particular, we show that the new slack-based relaxation strategy exhibits

better performance than the random selection one. Further improvement of the<br />

current algorithm may be possible by incorporating additional heuristic information<br />

and search mechanisms. One of the next steps will be the collection of the benchmarks proposed in the cited works [23, 24, 26, 27]; although none of the problems proposed



in these papers coincides exactly with the SDST-FJSSP, they can all be seen as slight variations of it; hence, the proposed IFS procedure can be adapted to solve an interesting and large class of flexible manufacturing scheduling problems. This will be the focus of our future work, together with the realization of a web repository collecting all the relevant benchmark sets.

Acknowledgments<br />

CNR authors are partially supported by EU under the ULISSE project (Contract FP7.218815),<br />

and MIUR under the PRIN project 20089M932N (funds 2008).<br />

References<br />

1. J. Adams, E. Balas, and D. Zawack. The shifting bottleneck procedure for job shop scheduling.<br />

Management Science, 34(3):391–401, 1988.<br />

2. A. Allahverdi, C. Ng, T. Cheng, and M. Kovalyov. A survey of scheduling problems with<br />

setup times or costs. European Journal of Operational Research, 187(3):985–1032, 2008.<br />

3. A. Allahverdi and H. Soroush. The significance of reducing setup times/setup costs. European<br />

Journal of Operational Research, 187(3):978–984, 2008.<br />

4. C. Artigues and D. Feillet. A branch and bound method for the job-shop problem with<br />

sequence-dependent setup times. Annals OR, 159(1):135–159, 2008.<br />

5. E. Balas, N. Simonetti, and A. Vazacopoulos. Job shop scheduling with setup times, deadlines<br />

and precedence constraints. Journal of Scheduling, 11(4):253–262, 2008.

6. P. Brucker, B. Jurisch, and B. Sievers. A branch and bound algorithm for the job-shop<br />

scheduling problem. Discrete Applied Mathematics, 49(1-3):107–127, 1994.<br />

7. P. Brucker and O. Thiele. A branch & bound method for the general-shop problem with<br />

sequence dependent setup-times. OR Spectrum, 18(3):145–161, 1996.<br />

8. A. Cesta, A. Oddi, and S. F. Smith. Iterative Flattening: A Scalable Method for Solving<br />

Multi-Capacity Scheduling Problems. In AAAI/IAAI, 17th National Conference on Artificial

Intelligence, pages 742–747, 2000.<br />

9. R. Dechter, I. Meiri, and J. Pearl. Temporal constraint networks. Artificial Intelligence,<br />

49:61–95, 1991.<br />

10. J. Gao, L. Sun, and M. Gen. A hybrid genetic and variable neighborhood descent algorithm<br />

for flexible job shop scheduling problems. Computers & Operations Research, 35:2892–<br />

2907, 2008.<br />

11. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of

NP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1979.<br />

12. D. Godard, P. Laborie, and W. Nuitjen. Randomized Large Neighborhood Search for Cumulative<br />

Scheduling. In Proceedings of ICAPS-05, pages 81–89, 2005.

13. M. A. González, C. R. Vela, and R. Varela. A Tabu Search Algorithm to Minimize Lateness<br />

in Scheduling Problems with Setup Times. In Proceedings of the CAEPIA-TTIA 2009 13th

Conference of the Spanish Association on Artificial Intelligence, 2009.<br />

14. A. B. Hmida, M. Haouari, M.-J. Huguet, and P. Lopez. Discrepancy search for the flexible<br />

job shop scheduling problem. Computers & Operations Research, 37:2192–2201, 2010.<br />

15. J. Hurink, B. Jurisch, and M. Thole. Tabu search for the job-shop scheduling problem with<br />

multi-purpose machines. OR Spectrum, 15(4):205–215, February 1994.<br />

16. M. Mastrolilli and L. M. Gambardella. Effective neighbourhood functions for the flexible<br />

job shop problem. Journal of Scheduling, 3:3–20, 2000.<br />



17. L. Michel and P. Van Hentenryck. Iterative Relaxations for Iterative Flattening in Cumulative<br />

Scheduling. In Proceedings of ICAPS-04, pages 200–208, 2004.

18. U. Montanari. Networks of Constraints: Fundamental Properties and Applications to Picture<br />

Processing. Information Sciences, 7:95–132, 1974.<br />

19. E. Nowicki and C. Smutnicki. An advanced tabu search algorithm for the job shop problem.<br />

Journal of Scheduling, 8(2):145–159, 2005.<br />

20. A. Oddi, A. Cesta, N. Policella, and S. F. Smith. Iterative flattening search for resource<br />

constrained scheduling. J. Intelligent Manufacturing, 21(1):17–30, 2010.<br />

21. A. Oddi and S. Smith. Stochastic Procedures for Generating Feasible Schedules. In Proceedings

14th National Conference on AI (AAAI-97), pages 308–314, 1997.<br />

22. N. Policella, A. Cesta, A. Oddi, and S. Smith. From Precedence Constraint Posting to Partial<br />

Order Schedules. AI Communications, 20(3):163–180, 2007.<br />

23. A. Rossi and G. Dini. Flexible job-shop scheduling with routing flexibility and separable<br />

setup times using ant colony optimisation method. Robotics and Computer-Integrated Manufacturing,<br />

23(5):503–516, 2007.<br />

24. R. Ruiz and C. Maroto. A genetic algorithm for hybrid flowshops with sequence dependent<br />

setup times and machine eligibility. European Journal of Operational Research, 169(3):781<br />

– 800, 2006.<br />

25. A. K. Sethi and S. P. Sethi. Flexibility in manufacturing: A survey. International Journal of<br />

Flexible Manufacturing Systems, 2:289–328, 1990. 10.1007/BF00186471.<br />

26. E. Vallada and R. Ruiz. A genetic algorithm for the unrelated parallel machine scheduling<br />

problem with sequence dependent setup times. European Journal of Operational Research,<br />

211(3):612–622, 2011.

27. V. Valls, M. A. Perez, and M. S. Quintanilla. A tabu search approach to machine scheduling.<br />

European Journal of Operational Research, 106(2-3):277 – 300, 1998.<br />

28. C. R. Vela, R. Varela, and M. A. González. Local search and genetic algorithm for the job<br />

shop scheduling problem with sequence dependent setup times. Journal of Heuristics, 2009.<br />



Automatic Generation of Efficient Domain-Optimized<br />

Planners from Generic Parametrized Planners<br />

Mauro Vallati¹, Chris Fawcett², Alfonso E. Gerevini¹,
Holger H. Hoos², and Alessandro Saetti¹

¹ Dipartimento di Ingegneria dell’Informazione

Università di Brescia, Italy<br />

{mauro.vallati,gerevini,saetti}@ing.unibs.it<br />

² Computer Science Department

University of British Columbia, Canada<br />

{fawcettc,hoos}@cs.ubc.ca<br />

Abstract. When designing state-of-the-art, domain-independent planning systems,<br />

many decisions have to be made with respect to the domain analysis or<br />

compilation performed during preprocessing, the heuristic functions used during<br />

search, and other features of the search algorithm. These design decisions can<br />

have a large impact on the performance of the resulting planner. By providing<br />

many alternatives for these choices and exposing them as parameters, planning<br />

systems can in principle be configured to work well on different domains. However,<br />

usually planners are used in default configurations that have been chosen because<br />

of their good average performance over a set of benchmark domains, with<br />

limited experimentation of the potentially huge range of possible configurations.<br />

In this work, we propose a general framework for automatically configuring a parameterized<br />

planner, showing that substantial performance gains can be achieved.<br />

We apply the framework to the well-known LPG planner, which has 62 parameters<br />

and over 6.5 × 10^17 possible configurations. We demonstrate that by using

this highly parameterized planning system in combination with the off-the-shelf,<br />

state-of-the-art automatic algorithm configuration procedure ParamILS, the planner<br />

can be specialized, obtaining significantly improved performance.

Introduction<br />

When designing state-of-the-art, domain-independent planning systems, many decisions<br />

have to be made with respect to the domain analysis or compilation performed<br />

during preprocessing, the heuristic functions used during search, and several other features<br />

of the search algorithm. These design decisions can have a large impact on the<br />

performance of the resulting planner. By providing many alternatives for these choices<br />

and exposing them as parameters, highly flexible domain-independent planning systems<br />

are obtained, which then, in principle, can be configured to work well on different<br />

domains, by using parameter settings specifically chosen for solving planning problems<br />

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



from each given domain. However, usually such planners are used with default configurations<br />

that have been chosen because of their good average performance over a set<br />

of benchmark domains, based on limited exploration within a potentially vast space of<br />

possible configurations. The hope is that these default configurations will also perform<br />

well on domains and problems beyond those for which they were tested at design time.<br />

In this work, we advocate a different approach, based on the idea of automatically<br />

configuring a generic, parameterized planner using a set of training planning problems<br />

in order to obtain planners that perform especially well in the domains of these training<br />

problems. Automated configuration of heuristic algorithms has been an area of intense<br />

research focus in recent years, producing tools that have improved algorithm performance<br />

substantially in many problem domains. To our knowledge, however, these techniques<br />

have not yet been applied to the problem of planning.<br />

While our approach could in principle utilize any sufficiently powerful automatic<br />

configuration procedure, we have chosen the FocusedILS variant of the off-the-shelf,<br />

state-of-the-art automatic algorithm configuration procedure ParamILS [8]. At the core<br />

of the ParamILS framework lies Iterated Local Search (ILS), a well-known and versatile<br />

stochastic local search method that iteratively performs phases of a simple local search,<br />

such as iterative improvement, interspersed with so-called perturbation phases that are<br />

used to escape from local optima. The FocusedILS variant of ParamILS uses this ILS<br />

procedure to search for high-performance configurations of a given algorithm by evaluating<br />

promising configurations, using an increasing number of runs in order to avoid<br />

wasting CPU-time on poorly-performing configurations. ParamILS also avoids wasting<br />

CPU-time on low-performance configurations by adaptively limiting the amount of<br />

runtime allocated to each algorithm run using knowledge of the best-performing configuration<br />

found so far.<br />
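The iterated-local-search scheme underlying ParamILS can be illustrated with a minimal skeleton over a discrete configuration space. Real ParamILS/FocusedILS adds adaptive run counts and runtime capping, so everything below (names, neighbourhood, perturbation) is a simplified sketch of the general idea, not the actual tool.

```python
import random

# Minimal iterated local search over a discrete parameter space, in the
# spirit of the ILS outline above: iterative improvement interspersed with
# random perturbation phases that escape local optima.

def ils_configure(space, cost, iters=100, perturb_strength=3, seed=0):
    """space: dict param -> list of values; cost: config dict -> float."""
    rng = random.Random(seed)

    def neighbors(cfg):                    # one-exchange neighbourhood
        for p, vals in space.items():
            for v in vals:
                if v != cfg[p]:
                    yield {**cfg, p: v}

    def local_search(cfg):                 # simple first-improvement descent
        improved = True
        while improved:
            improved = False
            for nb in neighbors(cfg):
                if cost(nb) < cost(cfg):
                    cfg, improved = nb, True
                    break
        return cfg

    best = local_search({p: vals[0] for p, vals in space.items()})
    for _ in range(iters):
        cand = dict(best)                  # perturbation phase
        for p in rng.sample(list(space), min(perturb_strength, len(space))):
            cand[p] = rng.choice(space[p])
        cand = local_search(cand)
        if cost(cand) < cost(best):
            best = cand
    return best
```

On a toy separable cost function the descent alone already reaches the optimum; the perturbation phases matter on rugged configuration landscapes like those of real planners.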

ParamILS has previously been applied to configure state-of-the-art solvers for SAT<br />

[7] and mixed integer programming (MIP) [9]. This resulted in a version of the SAT<br />

solver Spear that won the first prize in one category of the 2007 Satisfiability Modulo<br />

Theories Competition [7]; it further contributed to the SATzilla solvers that won prizes<br />

in 5 categories of the 2009 SAT Competition and led to large improvements in the<br />

performance of CPLEX on several types of MIP problems [9]. Unlike in SAT

and MIP, in planning, explicit domain specifications are available through a planning<br />

language, which creates more opportunities for planners to take problem structure into<br />

account in parameterized components (e.g., specific search heuristics). This can lead to<br />

more complex systems, with greater opportunities for automatic parameter configuration,<br />

but also greater challenges (bigger, richer design spaces can be expected to give<br />

rise to trickier configuration problems).<br />

One such planning system is LPG (e.g., [3, 4]). Based on a stochastic local search<br />

procedure, LPG is a well-known efficient and versatile planner with many components<br />

that can be configured very flexibly via 62 exposed configurable parameters, which<br />

jointly give rise to over 6.5 × 10^17 possible configurations. The default settings of these

parameters have been chosen to allow the system to work well on a broad range of<br />

domains. In this work, we used ParamILS to automatically configure LPG on various<br />

propositional domains; LPG’s configuration space is one of the largest considered so<br />

far in applications of ParamILS.<br />
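The size of such a configuration space is simply the product of the per-parameter domain sizes. The sketch below shows the computation on an invented mix of 62 domain sizes; these are not LPG's actual parameter domains, which jointly yield the ~6.5 × 10^17 figure quoted above.

```python
from math import prod

# Size of a discrete configuration space = product of per-parameter domain
# sizes. The mix below (Boolean, small categorical, numeric-choice domains)
# is invented purely to illustrate the computation.

def config_space_size(domain_sizes):
    return prod(domain_sizes)

sizes = [2] * 30 + [3] * 12 + [4] * 10 + [10] * 10   # 62 illustrative domains
print(len(sizes), config_space_size(sizes))
```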



We tested our approach using ParamILS and LPG on 11 domains of planning problems

used in previous international planning competitions (IPC-3–6). Our results demonstrate<br />

that by using automatically determined, domain-optimized configurations (LPG.sd),<br />

substantial performance gains can be achieved compared to the default configuration<br />

(LPG.d). Using the same automatic configuration approach to optimize the performance<br />

of LPG on a merged set of benchmark instances from different domains also results in<br />

improvements over the default, but these are less pronounced than those obtained by<br />

automated configuration for single domains.<br />

We also investigated to which extent the domain-optimized planners obtained by<br />

configuring the general-purpose LPG planner perform well compared to other state-of-the-art

domain-independent planners. Our results indicate that, for the class of domains<br />

considered in our analysis, LPG.sd is significantly faster than LAMA [10], the top-performing

propositional planner of the last planning competition (IPC-6). 3<br />

Moreover, in order to understand how well our approach works compared to state-of-the-art

systems in automated planning with learning, we have experimentally<br />

compared LPG.sd with the planners of the learning track of IPC-6, showing that in<br />

terms of speed and usefulness of the learned knowledge our system outperforms the<br />

respective IPC-6 winners PbP.s [5] and ObtuseWedge [11].

While in this work we focus on the application of the proposed framework to the

LPG planner, we believe that similarly good results can be obtained for highly parameterized<br />

versions of other existing planning systems. In general, our results suggest that<br />

in the future development of efficient planning systems, it is worth including many<br />

different variants and a wide range of settings for the various components, instead of<br />

committing at design time to particular choices and settings, and using automated procedures

for finding configurations of the resulting highly parameterized planning systems<br />

that perform well on the problems arising in a specific application domain under<br />

consideration.<br />

In the rest of this paper, we first provide some background and further information<br />

on LPG and its parameters. Next, we describe in detail our experimental analysis and<br />

results, followed by concluding remarks and a discussion of some avenues for future<br />

work.<br />

The Generic Parameterized Planner LPG<br />

In this section, we provide a very brief description of LPG and its parameters. LPG<br />

is a versatile system that can be used for plan generation, plan repair and incremental<br />

planning in PDDL2.2 domains [6]. The planner is based on a stochastic local search procedure<br />

that explores a space of partial plans represented through linear action graphs,<br />

which are variants of the very well-known planning graph [1].<br />

Starting from the initial action graph containing only two special actions representing<br />

the problem initial state and goals, respectively, LPG iteratively modifies the<br />

3 The version of LAMA used in the competition has only four Boolean parameters exposed,<br />

which its authors recommend to leave unchanged; it is therefore not suitable for studying automatic<br />

parameter configuration. A newer, much more flexibly configurable version of LAMA<br />

has become available very recently, as part of the Fast Downward system, which we are studying<br />

in ongoing work.<br />



Domain Configuration P1 P2 P3 P4 P5 P6 P7 Total<br />

Blocksworld 1 1 2 1 5 1 2 13<br />

Depots 2 2 1 1 2 2 2 12<br />

Gold-miner 2 3 0 1 4 2 1 13<br />

Matching-BW 1 2 2 1 3 0 2 11
N-Puzzle 4 5 3 2 14 5 2 35
Rovers 0 1 0 0 0 2 1 4
Satellite 2 7 3 1 11 5 3 32
Sokoban 0 1 1 1 1 1 2 7
Zenotravel 3 5 2 3 11 5 3 32
Merged set 0 1 0 1 5 2 2 11

Number of parameters 6 15 8 6 17 7 3 62<br />

Table 1. Number of parameters of LPG that are changed by ParamILS in the configurations<br />

computed for nine domains independently considered (2nd–10th lines) and jointly considered<br />

(“merged set” line). Each P1–P7 column corresponds to a different parameter category (or planner<br />

component).<br />

The last line of Table 1 shows the number of LPG’s parameters that fall into each of<br />

these seven categories (planner components).<br />

Experimental Analysis<br />

In this section, we present the results of a large experimental study examining the effectiveness<br />

of the automated approach outlined in the introduction. While our analysis<br />

is focused on planning speed, we also report preliminary results on plan quality.<br />

Benchmark domains and instances<br />

In our first set of experiments, we considered problem instances from eight known<br />

benchmark domains used in the last four international planning competitions (IPC-3–<br />

6), Depots, Gold-miner, Matching-BW, N-Puzzle, Rovers, Satellite, Sokoban,<br />

and Zenotravel, plus the well-known domain Blocksworld. These domains were selected<br />

because they are not trivially solvable and random instance generators are available<br />

for them, such that large training and testing sets of instances can be obtained.<br />

For each domain, we used the respective random instance generator to derive three<br />

disjoint sets of instances: a training set with 2000 relatively small instances (benchmark<br />

T), a testing set with 400 middle-size instances (benchmark MS), and a testing set<br />

with 50 large instances (benchmark LS). The size of the instances in training set T was chosen such that they can be solved by the default configuration of LPG in 20 to 40 CPU seconds on average. For testing sets MS and LS, the size of the instances was chosen such that they can, on average, be solved by the default configuration of LPG in 50 seconds to 2 minutes and in 3 to 7 minutes, respectively. This does not mean that all our problem instances can be solved by LPG: we only chose the size of the instances according to the performance of the default configuration, and then used the random generators to derive the actual instances.



For the experiments comparing automatically determined configurations of LPG<br />

against the planners that entered the learning track of IPC-6, we employed the same<br />

instance sets as those used in the competition.<br />

Automated configuration using ParamILS<br />

For all configuration experiments we used the FocusedILS variant of ParamILS version<br />

2.3.5 with default parameter settings. Using the default configuration of LPG as the<br />

starting point for the automated configuration process, we concurrently performed 10<br />

independent runs of FocusedILS per domain, using random orderings of the training<br />

set instances. 4 Each run of FocusedILS had a total CPU-time cutoff of 48 hours, and a<br />

cutoff time of 60 CPU seconds was used for each run of LPG performed during the configuration<br />

process. The objective function used by ParamILS for evaluating the quality<br />

of configurations was mean runtime, with timeouts and crashes assigned a penalized<br />

runtime of ten times the per-run cutoff. Out of the 10 configurations produced by these<br />

runs, we selected the configuration with the best training set performance (as measured<br />

by FocusedILS) as the final configuration of LPG for the respective domain.<br />
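The penalized objective and the best-of-n selection described above can be sketched as follows. This is a minimal illustration only; the function and configuration names are ours, not part of ParamILS:

```python
def penalized_mean_runtime(runtimes, cutoff=60.0, penalty_factor=10):
    """Mean runtime over training instances; timeouts and crashes
    (reported here as None) count as penalty_factor * cutoff seconds,
    as in the objective used for the configuration experiments."""
    penalized = [cutoff * penalty_factor if t is None or t >= cutoff else t
                 for t in runtimes]
    return sum(penalized) / len(penalized)

# Select the best of several independent FocusedILS runs by training-set score
# (illustrative data: each entry is the runtimes of one run's final configuration).
runs = {
    "config_A": [12.0, None, 30.5],   # one timeout -> heavy penalty
    "config_B": [25.0, 40.0, 55.0],
}
best = min(runs, key=lambda c: penalized_mean_runtime(runs[c]))
```

With the default cutoff of 60 CPU seconds, a single timeout costs 600 penalized seconds, so `config_B` wins despite its slower solved runs.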

Additionally, we used FocusedILS for optimizing the configuration of LPG across<br />

all of the selected domains together. As with our approach for individual domains, we<br />

performed 10 independent runs of FocusedILS starting from the default configuration;<br />

again, the single configuration with the best performance on the merged training set as<br />

measured by FocusedILS was selected as the final result of the configuration process.<br />

The final configurations thus obtained were then evaluated on the two testing sets<br />

of instances (benchmarks MS and LS) for each domain. We used a timeout of 600 CPU<br />

seconds for benchmark MS, and 900 CPU seconds for benchmark LS.<br />

For convenience, we define the following abbreviations corresponding to configurations<br />

of LPG:<br />

– Default (LPG.d): The default configuration of LPG.<br />

– Random (LPG.r): Configurations selected independently at random from all possible<br />

configurations of LPG.<br />

– Specific (LPG.sd): The specific configuration of LPG found by ParamILS for each<br />

domain.<br />

– Merged (LPG.md): The configuration of LPG obtained by running ParamILS on<br />

the merged training set.<br />

Table 1 shows, for each parameter category of LPG, the number of parameters that<br />

are changed from their defaults by ParamILS in the derived domain-optimized configurations<br />

and in the configuration obtained for the merged training set.<br />

Empirical result 1 Domain-optimized configurations of LPG differ substantially from<br />

the default configuration.<br />

Moreover, we observed that the optimized configurations usually differ considerably from one another.

4 Multiple independent runs of FocusedILS were used, because this approach can help ameliorate<br />

stagnation of the configuration process occasionally encountered otherwise.<br />



                     LPG.d               LPG.r
Domain           Score   % solved    Score   % solved
Blocksworld      99.00      99        0.00      16
Depots           86.00      86        0.00      18
Gold-miner       91.00      91        0.00      19
Matching-BW      14.00      14        0.15       9
N-Puzzle         59.10      89       34.75      86
Rovers           85.81     100       31.21      53
Satellite        96.02     100       18.99      37
Sokoban          73.20      74        2.06      28
Zenotravel       98.70     100        2.47      24
Total           702.8       83.7     89.6       32.2

Table 2. Speed scores and percentage of problems solved by LPG.d and LPG.r for 100 problems<br />

in each of 9 domains of benchmark MS.<br />

Results on specific domains<br />

The performance of each configuration was evaluated using the performance score functions adopted in IPC-6 [2]. The speed score of a configuration C is defined as the sum of the speed scores assigned to C over all test problems. The speed score assigned to C for a planning problem p is 0 if p is unsolved and T*_p / T_p(C) otherwise, where T*_p is the lowest measured CPU time to solve problem p among those of the compared solvers, and T_p(C) denotes the CPU time required by C to solve problem p. Higher values of the speed score indicate better performance.
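The score computation can be sketched as follows (an illustrative implementation of the IPC-6 speed score as defined above; the function name and data layout are our own):

```python
def speed_scores(times_by_solver):
    """IPC-6 speed score: for each problem, a solver scores T*/T if it
    solves the problem (None = unsolved, scoring 0), where T* is the best
    time among the compared solvers; per-solver scores are then summed."""
    solvers = list(times_by_solver)
    n_problems = len(next(iter(times_by_solver.values())))
    totals = {s: 0.0 for s in solvers}
    for p in range(n_problems):
        solved = [times_by_solver[s][p] for s in solvers
                  if times_by_solver[s][p] is not None]
        if not solved:
            continue  # no solver gets points for an unsolved problem
        best = min(solved)
        for s in solvers:
            t = times_by_solver[s][p]
            if t is not None:
                totals[s] += best / t
    return totals

# Two solvers, two problems (times in CPU seconds; None = unsolved):
scores = speed_scores({"LPG.d": [10.0, None], "LPG.sd": [1.0, 5.0]})
```

Here the faster solver earns the full point on each problem it wins, while the slower one earns the fraction best/T, and unsolved problems contribute nothing.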

Table 2 shows the results of the comparison between LPG.d and LPG.r, which we<br />

conducted to assess the performance of the default configuration on our benchmarks.<br />

Empirical result 2 LPG.d is considerably faster and solves many more problems than<br />

LPG.r.<br />

Specifically, LPG.r solves very few problems in 6 of the 9 domains we considered, while<br />

LPG.d solves most of the considered problems in all but one domain. This observation<br />

also suggests that the default configuration is a much better starting point for deriving<br />

configurations using ParamILS than a random configuration. In order to confirm this<br />

intuition, we performed an additional set of experiments using a random configuration as the starting point. As expected, the resulting configurations of LPG performed much worse than LPG.sd, and sometimes even worse than LPG.d.

Figure 2 provides results in the form of a scatterplot, showing the performance of<br />

LPG.sd and LPG.d on the individual benchmark instances. We consider all instances<br />

solved by at least one of these planners. Each cross symbol indicates the CPU time<br />

used by LPG.d and LPG.sd to solve a particular problem instance of benchmarks MS and<br />

LS. When a cross appears under (above) the main diagonal, LPG.sd is faster (slower)<br />

than LPG.d; the distance of the cross from the main diagonal indicates the performance<br />

gap (the greater the distance, the greater the gap). The results in Figure 2 indicate that<br />

LPG.sd performs almost always better than LPG.d, often by 1–2 orders of magnitude.<br />



[Figure 2: two scatterplots, CPU seconds of LPG.sd (y-axis) versus CPU seconds of LPG.d (x-axis), both on log scales from 0.1 to 100 seconds; U marks timed-out runs.]

Fig. 2. CPU time (log scale) of LPG.sd versus LPG.d on the problems of benchmarks MS (upper plot) and LS (lower plot). U corresponds to runs that timed out under the given runtime cutoff.

Table 3 shows the performance of LPG.d, LPG.md, and LPG.sd for each domain<br />

of benchmarks MS and LS in terms of speed score, percentage of solved problems and<br />

average CPU time (computed over the problems solved by all the considered configurations).<br />

These results indicate that LPG.sd solves many more problems, is on average<br />

much faster than LPG.d and LPG.md, and that for some benchmark sets LPG.sd always<br />

performs better than or equal to the other configurations, as the IPC score of LPG.sd is<br />

sometimes the maximum score (i.e., 400 points for benchmark MS, and 50 for benchmark<br />

LS). 5<br />

Empirical result 3 LPG.sd performs much better than both LPG.d and LPG.md.<br />

Interestingly, the results in Figure 2 and Table 3 also indicate that, for larger test problems, the performance gap between LPG.sd and LPG.d tends to increase: for example, on the middle-size instances of Matching-BW, LPG.sd is on average about one order of magnitude faster than LPG.d, while on the largest instances it has an average performance advantage of more than two orders of magnitude.

5 Additional results (not detailed here for lack of space), using 2000 test problems for each of the nine considered domains of the same size as those used for the training, indicate a performance behavior very similar to the one observed for the MS and LS instances considered in Table 3.

MS problems
                  Speed score (% solved)                     Average CPU time
Domain          LPG.d          LPG.md         LPG.sd       LPG.d   LPG.md   LPG.sd
Blocksworld     21.3  (98.8)   74.8  (100)    400 (100)    105.3    28.17     4.29
Depots          124   (90.3)   164   (99)     345 (98.5)    78.1    42.4      5.7
Gold-miner      18.5  (90.5)   232   (100)    374 (100)     94.4     7.4      1.6
Matching-BW     9.74  (15.8)   72.5  (55.3)   375 (97.8)    93.8    42.3      5.6
N-Puzzle        20.1  (85)     27.0  (86.3)   347 (86.8)   321.0   247       31.20
Rovers          131   (100)    162   (100)    400 (100)     72.2    52.9     21.2
Satellite       104   (100)    111   (100)    400 (100)     64.0    59.2      1.3
Sokoban         26.7  (75.8)   191   (94.8)   335 (96.5)    24.6     6.15     1.19
Zenotravel      49.1  (100)    97.2  (99.8)   397 (100)    103.7    57.6     11.1
All above       280.3 (83.3)   304.3 (91.5)   –            115.4    38.8     –

LS problems
                  Speed score (% solved)                     Average CPU time
Domain          LPG.d          LPG.md         LPG.sd       LPG.d   LPG.md   LPG.sd
Blocksworld     5.12  (100)    11.1  (100)    50   (100)   320.9   144.8     30.8
Depots          3.91  (100)    17.4  (100)    44.1 (98)    326.6   181.1     25.7
Gold-miner      1.54  (100)    32.6  (100)    35.9 (100)   327      21.0     21.2
Matching-BW     1.51  (86)     15.2  (94)     47.4 (100)   225      72.3      1.90
N-Puzzle        0.66  (100)    1.41  (100)    50   (100)   344     158        4.44
Rovers          9.61  (100)    48.5  (100)    45.6 (100)   248      48.3     52.7
Satellite       9.43  (100)    28.8  (100)    50   (100)   263      85.4     48.9
Sokoban         4.55  (62)     24.0  (82)     38.7 (94)     70.8     7.00     4.23
Zenotravel      0.52  (100)    4.26  (100)    50   (100)   294      42.9      2.90
All above       12.6  (96)     49.7  (100)    –            309.7    81.3     –

Table 3. Speed score, percentage of solved problems, and average CPU time of LPG.d, LPG.md and LPG.sd for 400 MS and 50 LS instances in each of 9 domains, considered independently, and in all domains (last line).

Empirical result 4 LPG.sd is faster than LPG.d also for instances considerably larger<br />

than those used for deriving the planner configurations.<br />

This observation indicates that the approach used for deriving configurations scales well<br />

with increasing problem instance size.<br />

As can be seen from Table 3, LPG.md usually performs better than LPG.d on the individual domain test sets. Moreover, as the last line shows, it performs better than LPG.d on the set obtained by merging the test sets of all individual domains, which indicates that, by using a merged training set, we successfully produced a configuration with good average performance across all selected domains.



                LPG.sd vs. LAMA         LPG.sd vs. PbP.s
Domain          ∆-speed   ∆-solved      ∆-speed   ∆-solved
Blocksworld     +377.4      +52         +361.7       ±0
Depots          +393.9     +381         +211.1      +54
Gold-miner      +400       +400         +395.6     +319
Matching-BW     +227.8     +118          +40.7     +330
N-Puzzle        +255.7       +4         +279.8      −20
Rovers          +392.9      +14         +313.4       +9
Satellite       +388.1     +157         +253.6       +9
Sokoban         +340.1     +278          −41.6       +5
Zenotravel      +368.3       ±0         −282.1       +8
Total           +3144     +1404         +1532      +714

Table 4. Performance gap between LPG.sd and LAMA (2nd–3rd columns) and between LPG.sd and PbP.s (4th–5th columns) for 400 MS problems in each of 9 domains, in terms of speed score and number of solved problems.

Empirical result 5 LPG.md performs better than LPG.d.<br />

Next, we compared our LPG configurations with state-of-the-art planning systems,<br />

namely, the winner of the IPC-6 classical track LAMA (configured to stop when the<br />

first solution is computed), and the winner of the IPC-6 learning track, PbP. The performance gaps between LPG.sd and these planners on MS problems are shown in Table 4,

where we report the speed score and the number of solved problems (positive numbers<br />

mean that LPG.sd performs better). These experimental results indicate clearly that<br />

our configurations of LPG are significantly faster and solve many more problems than<br />

LAMA.<br />

Empirical result 6 LPG.sd performs significantly better than LAMA on well-known<br />

non-trivial domains.<br />

Moreover, LPG.sd outperforms PbP.s in most of the selected domains: only on Sokoban and Zenotravel does PbP.s obtain a better speed score (while performing slightly worse in terms of solved problems), and only on N-Puzzle does it solve more problems (while generally being slower). Interestingly, for these domains the PbP.s multi-planner runs a single planner with an associated set of macro-actions; these macro-actions clearly help to significantly speed up the search phase of this planner.

to significantly speed up the search phase of this planner.<br />

Empirical result 7 For the considered well-known benchmark domains, LPG.sd performs<br />

significantly better than PbP.s.<br />

Results on learning track of IPC-6<br />

To evaluate the effectiveness of our approach against recent learning-based planners,<br />

we compared our LPG.sd configurations with planners that entered the learning track<br />



Planner          # unsolved   Speed score   ∆-score
LPG.sd               38          93.23       +59.7
ObtuseWedge          63          63.83       +33.58
PbP.s                 7          69.16        −3.54
RFA1                 85          11.44        –
Wizard+FF           102          29.5        +10.66
Wizard+SGPlan        88          38.24        +7.73

Table 5. Performance of the top five planners that took part in the learning track of IPC-6, plus LPG.sd, in terms of number of unsolved problems, speed score, and score gap with vs. without the learned knowledge, on the problems of the learning track of IPC-6.

of IPC-6, based on the same performance criteria as used in the competition. Table 5<br />

shows performance in terms of the number of unsolved problems, speed score, and performance<br />

gap with and without using the learned knowledge (positive numbers mean<br />

that the planner performs better using the knowledge); the results in this table indicate<br />

that LPG.sd performs better than every solver that participated in the IPC-6 learning<br />

track, including the version of PbP.s which won the IPC-6 learning track. Although<br />

LPG.sd solves fewer problems than PbP (obtaining zero score for each unsolved problem),<br />

it achieves the best score as it is the fastest planner on 3 domains (Gold-miner,<br />

N-Puzzle and Sokoban), and it performs close to PbP.s on one additional domain<br />

(Matching-BW). Furthermore, the results in Table 5 indicate that the performance gap between LPG.sd and LPG.d is significant, and greater than the gap achieved by ObtuseWedge, the planner recognised as the best learner at IPC-6.

Empirical result 8 According to the evaluation criteria of IPC-6, LPG.sd performs<br />

better than the winners of the learning track for speed and best-learning.<br />

Further preliminary results on plan quality<br />

Although the experimental analysis in this paper focuses on planning speed, we give<br />

some preliminary results indicating that automatic algorithm configuration is also promising<br />

for optimizing plan quality. Additional experiments to confirm this observation are<br />

in progress. Figure 3 shows results on two benchmark domains (100 problems each<br />

from the MS set) in terms of relative solution quality of LPG.sd and LPG.d over CPU<br />

time spent by the planner; in this context, LPG.sd refers to LPG configured for optimizing plan quality. Training was conducted on LPG runs with a cutoff of 2 CPU minutes, with the objective of minimising the best plan cost (number of actions) found within that time limit (LPG is an incremental planner that computes a sequence of plans of increasing quality). The quality score of a configuration is defined analogously to the speed score described above, but using plan cost instead of CPU time.

Overall, these results indicate that, at least for the domains considered here, LPG.sd<br />

always finds considerably better plans than LPG.d, unless small CPU-time limits are<br />

used, in which case they perform similarly.<br />



[Figure 3: quality score (0–100, y-axis) versus CPU-time limit (1 to 900 seconds, x-axis) for LPG.d and LPG.sd on domains Depots and Gold-miner.]

Fig. 3. Quality score of LPG.d and LPG using domain-optimized configurations for computing<br />

high-quality plans w.r.t. an increasing CPU-time limit (x-axis: ranging from 1 to 900 seconds) for<br />

domains Depots and Gold-miner.<br />

Conclusions and Future Work<br />

We have investigated the application of computer-assisted algorithm design to automated<br />

planning and proposed a framework for automatically configuring a generic planner<br />

with several parameterized components to obtain specialized planners that work efficiently<br />

on given domains. In a large-scale empirical analysis, we have demonstrated<br />

that our approach, when applied to the state-of-the-art, highly parameterized LPG planning<br />

system, effectively generates substantially improved domain-optimized planners.<br />

Our work and results also suggest a potential method for testing new heuristics and<br />

algorithm components, based on measuring the performance improvements obtained by<br />

adding them to an existing highly parameterized planner, followed by automatic configuration for specific domains. The results may reveal not only to what extent new design elements are useful, but also under which circumstances they are most effective, something that would be very difficult to determine manually.

We see several avenues for future work. Concerning the automatic configuration<br />

of LPG, we are conducting an experimental analysis of the usefulness of the proposed framework for identifying configurations that improve planner performance in terms of plan quality, for which this paper has reported preliminary results. Moreover,

we plan to apply the framework to metric-temporal planning domains. Finally,<br />

we believe that our approach can yield good results for other planners that have been<br />

rendered highly configurable by exposing many parameters. In particular, preliminary<br />

results from ongoing work indicate that substantial performance gains can be obtained<br />

when applying our approach to a very recent, highly parameterized version of the IPC-4<br />

winner Fast Downward.<br />

References<br />

1. Blum, A., and Furst, M. L. 1997. Fast planning through planning graph analysis. Artificial Intelligence 90:281–300.



2. Fern, A.; Khardon, R.; and Tadepalli, P. 2008. Learning track of the 6th international planning competition. http://eecs.oregonstate.edu/ipc-learn/.

3. Gerevini, A.; Saetti, A.; and Serina, I. 2003. Planning through stochastic local search and<br />

temporal action graphs. Journal of Artificial Intelligence Research 20:239–290.<br />

4. Gerevini, A.; Saetti, A.; and Serina, I. 2008. An approach to efficient planning with numerical<br />

fluents and multi-criteria plan quality. Artificial Intelligence 172(8-9):899–944.<br />

5. Gerevini, A.; Saetti, A.; and Vallati, M. 2009. An automatically configurable portfolio-based<br />

planner with macro-actions: PbP. In Proc. of ICAPS-09.<br />

6. Hoffmann, J., and Edelkamp, S. 2005. The deterministic part of IPC-4: An overview. Journal<br />

of Artificial Intelligence Research 24:519–579.<br />

7. Hutter, F.; Babić, D.; Hoos, H. H.; and Hu, A. J. 2007. Boosting verification by automatic<br />

tuning of decision procedures. In Formal Methods in Computer-Aided Design, 27–34. IEEE<br />

CS Press.<br />

8. Hutter, F.; Hoos, H. H.; Leyton-Brown, K.; and Stützle, T. 2009. ParamILS: An automatic<br />

algorithm configuration framework. Journal of Artificial Intelligence Research 36:267–306.<br />

9. Hutter, F.; Hoos, H. H.; and Leyton-Brown, K. 2010. Automated configuration of mixed<br />

integer programming solvers. In Proc. of CPAIOR-10.<br />

10. Richter, S.; Helmert, M.; and Westphal, M. 2007. Landmarks revisited. In Proc. of AAAI-07.

11. Yoon, S.; Fern, A.; and Givan, R. 2008. Learning control knowledge for forward search planning. Journal of Machine Learning Research 9:683–718.



Taking Advantage of Domain Knowledge in<br />

Optimal Hierarchical Deepening Search Planning<br />

Pascal Schmidt 1,2 , Florent Teichteil-Königsbuch 1 , and Patrick Fabiani 1<br />

1 Onera - The French Aerospace Lab<br />

F-31055, Toulouse, France<br />

surname.lastname@onera.fr<br />

2 Université de Toulouse<br />

F-31000, Toulouse, France<br />

Abstract. In this paper, we propose a new algorithm, named HDS<br />

for Hierarchical Deepening Search, to solve large structured classical<br />

planning problems using the divide and conquer motto. A large majority<br />

of planning problems can be easily and recursively decomposed<br />

in many easier subproblems, what is efficiently exploited for instance by<br />

domain-independent approaches such as landmark techniques or domainknowledge<br />

formalisms like Hierarchical Task Networks (HTN). We propose<br />

to exploit domain knowledge in the form of HTNs to guide the<br />

generation of multiple levels of subgoals during the search. Compared<br />

with traditional HTN approaches, we rely on task effects and task-level<br />

heuristics to recursively optimize the plan level-by-level, instead of depthfirst<br />

non-optimal planning in the network. Higher level plan solutions are<br />

decomposed into subproblems and refined into finer level plans, which are<br />

in turn decomposed and refined. Backtracks between levels occur when<br />

costs of refined plans exceed the expected costs of higher-level plans,<br />

thus ensuring to produce optimal plans at each level of the hierarchy.<br />

We demonstrate the relevance of our approach on several well-known<br />

domains compared with state-of-the-art domain-knowledge planners.<br />

1 INTRODUCTION<br />

Automated planning is a field of Artificial Intelligence which aims at automatically<br />

computing a sequence of actions that lead to some goals from a given initial<br />

state. Many subareas have been explored, some assuming that effects of actions<br />

are deterministic [6]. Even in this case, solving realistic problems is challenging, because finding a solution path may require exploring a number of states that is exponential in the number of state variables. To cope with this combinatorial explosion, efficient algorithms resort to heuristics, which guide the search towards optimistic or approximate solutions. Notably, hierarchical methods iteratively decompose the planning problem into smaller and much simpler ones.

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011). In conjunction with IJCAI 2011, Barcelona, Spain, July 17–18, 2011.



In a vast majority of problems, the planner must deal with constraints, such<br />

as multiple predefined phases or protocols. Such constraints generally help in solving the planning problem, because they prune many search paths on which these constraints do not hold. They can be given by an expert of the problem to solve, which is often the case in realistic applications such as military missions, or automatically deduced from the model beforehand. In this paper,

we assume that these constraints are known and given to the planner. We thus<br />

propose a new method to model and solve a deterministic planning problem,<br />

based on a hierarchical and heuristic approach and taking advantage of these<br />

constraints.<br />

1.1 Intuition on a simple example<br />

Fig. 1. Path planning graph with high level choice<br />

We illustrate our idea on a simple navigation problem, although our approach primarily targets complex structured problems formalized in a kind of hierarchical STRIPS semantics [6]. In the graph of Figure 1, the robot must go from A to L. A human operator who sees this graph can immediately tell that there is an important choice to make: go around the wall to the north through G, or to the south through H, as shown in Figure 1. It therefore seems promising to solve this problem at coarse grain first, using this information to decide where we should pass before exploring the chosen solution at fine grain, thus avoiding exploration of the non-chosen branch. Refining the chosen path into elementary steps may call the previous choice into question by revealing an unforeseen difficulty. For instance, a hole at E may be discovered when exploring the path via G in detail during planning, forcing the agent to reappraise the choice of this path and to change its higher-level decision to the path via H. We then replan at coarse grain using this new information, until the solution converges.

Intuitively, this approach consists in making jumps in the state graph, and then refining these jumps by recursively making shorter ones, until only elementary steps are applied.

1.2 Related work<br />

The idea of adding domain-dependent control knowledge to help find a plan is widespread. We can cite TLPlan [1], in which the authors use temporal logic



(LTL) to give properties defining "good" plans (i.e., cheap plans that lead to the goal) over a sequence of actions or states (not only the current state). This allows for very precise guidance of the search, either by checking whether the current partial plan is correct, or whether it may lead to a complete plan that satisfies the formulas.

Other approaches use what is called procedural knowledge: an operator who writes a planning problem knows by experience some techniques and some groups of actions (and so on, recursively) that achieve a subgoal, and knows how to break down each goal and subgoal into finer subgoals. Several lines of work follow this direction. In Hierarchical Task Networks (HTNs) [4], the global mission is recursively broken down into a combination of subtasks, until the planner applies only elementary actions. The High-level Actions (HLA) framework [10] differs from HTNs in that no recipe is given for the whole mission: the planner has to build the first high-level plan and then refine it in the same way as for HTNs. Planning algorithms are also associated with the BDI formalism [3]. The main difference between our approach and these formalisms and their associated planning techniques is that we plan one hierarchical level at a time and maintain coherence in the abstraction level of the different tasks within each hierarchical plan. Thus, we allow the planner to foresee shortcuts and difficulties at each level of the hierarchy, avoiding planning an elementary step without knowing its long-term effect at coarse grain.

without knowing the long-term effect of this step at coarse grain.<br />

Other works aim at automatically learning some kind of procedural knowledge: for instance, landmark planning techniques as used in Lama [12], where the planner deduces a set of subgoals from the problem; Macro-FF [2], where the planner tries to build groups of actions that have interesting effects; or HTN-MAKER [8], where the algorithm tries to generalize tasks by analyzing admissible plans. While these works are interesting, they assume that knowledge is learned rather than given by human experts, which targets applications with different design and operational constraints.
with different design and operational constraints.<br />

We now present how we extended the HTN formalism to implement our<br />

contribution, and the algorithm we developed to solve problems expressed in<br />

this formalism. In the last part, we compare the performance of our planner with

SHOP2, dynDFS and TLPlan on several planning benchmarks.<br />

2 FORMALISM<br />

PDDL planning The goal of "classical" planning is to compute a strategy, called a plan, to reach a goal, given exact knowledge of the applicable actions and their effects in a completely known world. A classical planning problem is a tuple P = (s0, g, A), where s0 is the initial state of the world, g is the goal to reach, defined as a set of states, and A is a set of actions. The initial state and all other states of the world are represented by a set of literals L describing the world. The goal is defined by a set of literals, either true or false. If all literals (and their values) are given in the goal description, the goal state is unique; otherwise, the goal defines a set of states. Each action is a tuple a = (name(a), precond(a), effects(a)), where name(a) is the name of the action, precond(a) are the preconditions required on the current state to apply a, and


effects(a) are the modifications made to the current state by the application of a. A plan π is a sequence of actions; π is a solution of the problem if applying all its actions in order from s0 leads to g.
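Under these definitions, plan validity can be checked mechanically. The sketch below is our own illustration, not the authors' implementation: a state is a set of true literals, and each action carries positive/negative preconditions and add/delete effects.

```python
def applicable(state, action):
    """An action applies when its positive preconditions hold in the
    state and none of its negative preconditions do."""
    pre_pos, pre_neg = action["precond"]
    return pre_pos <= state and not (pre_neg & state)

def apply_action(state, action):
    """Effects: remove the delete list, then add the add list."""
    add, delete = action["effects"]
    return (state - delete) | add

def is_solution(plan, s0, goal_pos, goal_neg=frozenset()):
    """A plan solves P = (s0, g, A) if every action is applicable in
    turn from s0 and the final state satisfies the goal literals."""
    state = set(s0)
    for a in plan:
        if not applicable(state, a):
            return False
        state = apply_action(state, a)
    return goal_pos <= state and not (goal_neg & state)

# Illustrative action: move from B to C.
goto_BC = {"name": "goto(B,C)",
           "precond": ({"at(B)"}, set()),
           "effects": ({"at(C)"}, {"at(B)"})}
```

For example, `is_solution([goto_BC], {"at(B)"}, {"at(C)"})` holds, while starting from `{"at(A)"}` the plan fails because the action's precondition does not.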

In order to describe planning problems, the PDDL language (presented in [5]) and its various extensions are widely used. It is based on the STRIPS formalism and splits the problem into two parts: the domain, which contains the set of actions A, and the problem, which contains the initial state of the world s0, the goal definition g, and the formula defining the cost.


Fig. 2. Example of HTN<br />

Expressing hierarchy with HTNs A Hierarchical Task Network (HTN) [4]<br />

is an extension to classical planning that consists in modeling tasks, that is,<br />

abstract actions with different methods to break them down. An HTN problem<br />

is a tuple (s0, g, A, T ), where s0 is the initial state, g is the goal, A the set of<br />

elementary actions (as above), and T the set of tasks. A task t ∈ T is a set of<br />

preconditions and a set of methods: t = (precond(t), M(t)) where precond(t) is a<br />

literal formula that represents the set of states where the task can be performed,<br />

and M(t) is the set of methods m(t). Each method m(t) defines a possible<br />

decomposition of the task into subtasks or elementary actions. There are two<br />

ways of breaking down a task, parallel and sequential. A parallel decomposition<br />

gives the subtasks the possibility to be executed in parallel whereas a sequential<br />

decomposition forces the planner to put the subtasks one after the other in the<br />

given ordering. In most applications, the set of tasks is given by a human expert<br />

of the domain, and has a significant influence on the performance of the planner.<br />

A graphical representation of an HTN is shown in Figure 2. Tasks and elementary<br />

actions are represented in boxes, a horizontal line shows the different<br />

choices of methods for that task, and slanted bars show the decompositions of<br />

methods. Sequential decompositions are represented by arrows. We can see here<br />

a model to solve a path planning problem, where moveTo ?a ?l represents the<br />

highest level task (the mission) consisting for an agent ?a to reach the location<br />



?l, jumpTo a high-level move and goTo an elementary step. The void task is the<br />

termination case, necessary to stop the recursion.<br />

Meta-effects to link tasks In the standard HTN formalism as defined by [4],<br />

each task represents a group of methods to achieve a sub-goal, but the planner<br />

does not have knowledge of the accomplished subtask. Therefore, it is impossible<br />

to tie up a task after another one without exploring it in detail to know its<br />
<br />
effects. In other words, the standard formalism does not allow for helpful coarse-grained<br />
<br />
exploration of the problem. In order to use HTN tasks directly as macro-operators<br />
<br />
at any level, the first extension we need to add to the HTN formalism<br />

must give the planner the ability to know the effect of a task.<br />

In order to do so, we introduce meta-effects for HTNs. These meta-effects<br />

are attached to tasks like effects are attached to elementary actions. This allows<br />

the planner to get knowledge of the main effects of a task and to assemble<br />
<br />
high-level tasks to make a high-level plan. With that knowledge, it will be able<br />

to check the pre-conditions of the next task and compute a high-level heuristic.<br />

Our task t is now a set of preconditions, a set of methods and a set of effects:<br />

t = (precond(t),M(t), effects(t)).<br />

Here is the BNF of the meta-effects in the PDDL language (nonterminal names reconstructed from the trailing description):<br />
<br />
<metaEffect> ::= ":metaEffect" <effect><br />
<br />
<effect> ::= <pEffect><br />
<br />
|"(not" <effect> ")"<br />
<br />
|"(forall" "(" <typedVars> ")" <effect> ")"<br />
<br />
|"(when" <cond> <effect> ")"<br />
<br />
|"(and" <effect>+ ")"<br />
<br />
<pEffect> ::= <boolExp><br />
<br />
|"(assign" <fHead> <fExp> ")"<br />
<br />
|"(increase" <fHead> <fExp> ")"<br />
<br />
|"(decrease" <fHead> <fExp> ")"<br />
<br />
where <fHead> is a function, <fExp> an expression, <boolExp> a<br />
<br />
boolean expression, <cond> a boolean condition and <typedVars> a list of<br />
<br />
typed variables.<br />

Fig. 3. Meta-effects in HTN<br />



An example is shown on Figure 3. We give meta-effects to high-level tasks<br />

moveTo and jumpTo that give the result of the task, i.e. the position of the<br />

robot at the end of the task. These meta-effects are written in a rounded box<br />

on the graphical representation of the HTN. Meta-effects can be more or less precise<br />

depending on which points are considered as relevant by the expert. Here, an<br />

estimated cost of a move or a jump, computed with the Euclidean distance from the<br />

starting point of the task to its destination, can be associated to the meta-effect<br />

if the underlying planner can deal with it. In PDDL, this example is written as:<br />

:metaEffect<br />

(and<br />

(increase (cost) (dist (at ?a) ?l))<br />

(assign (at ?a) ?l)<br />

)<br />
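As an illustration (our own sketch, not the paper's planner), this meta-effect can be applied to a state that maps fluents to values exactly like a normal effect; the coordinate table and location names below are hypothetical:<br />

```python
import math

# Hypothetical location coordinates used for the Euclidean distance estimate.
coords = {"l0": (0.0, 0.0), "l1": (3.0, 4.0)}

def dist(l1, l2):
    (x1, y1), (x2, y2) = coords[l1], coords[l2]
    return math.hypot(x2 - x1, y2 - y1)

def apply_moveto_meta_effect(state, agent, loc):
    """state maps fluents to values, e.g. {("at", "r1"): "l0", "cost": 0.0}."""
    new = dict(state)
    # (increase (cost) (dist (at ?a) ?l))
    new["cost"] = state["cost"] + dist(state[("at", agent)], loc)
    # (assign (at ?a) ?l)
    new[("at", agent)] = loc
    return new
```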

The level of precision of the meta-effects has an important influence on<br />

the planning process: if they are exhaustive with respect to the effects of the<br />

underlying actions, then the effects of the task are totally predictable, and the<br />

choice of a given task will not need to be reconsidered. Contrary to the work<br />
<br />
of [10], who also define meta-effects, our effects are generally not complete, i.e.<br />

some numerical estimations of the final state are not well evaluated and some<br />

predicate changes are not present. This simplification allows us to use meta-effects<br />

the same way as normal effects in any forward planning algorithm.<br />

Inspired by admissible heuristics in classical planning, we define optimistic<br />

meta-effects such that the long term cost of a meta-effect is lower than the real<br />

one.<br />

Macro-tasks to avoid recursion Another weakness of standard HTNs concerns<br />

the modeling of methods that must be decomposed into an unknown number<br />

of subtasks (determined at planning time). For instance, consider our navigation<br />

graph of Figure 2. To break down the jumpTo task, we need to recursively<br />

write that jumpTo is a sequence of one goTo to a given point next to the starting<br />

point, followed by a jumpTo from there to the goal.<br />

This may cause several problems. For modelers and readers who do not have<br />

expert programming skills, it is not very intuitive to break this task down using<br />

recursion. One must deal with termination cases, and decide whether to use<br />
<br />
right or left recursion. Most importantly, task recursion is incoherent with<br />

our idea of doing jumps in the state graph. At a high level of the hierarchy, the planner<br />

tries to plan a moveTo by refining it into a jumpTo and a moveTo. At the next<br />

level, the plan will be a goTo then a jumpTo, then another jumpTo and a moveTo.<br />

That is, the computed plan does not have any consistency in terms of hierarchy.<br />

Thus, we introduce macro-tasks as another extension in the spirit of regular<br />

expressions. The aim is to break down a task into an unknown number of<br />

subtasks that are all at the same level in the hierarchy. A method m is now defined<br />

as a precondition and a macro-task: m = (precond(m), macroTask(m)).<br />

A macro-task is recursively defined by several alternatives that express how to<br />

group subtasks together, the terminal case being a single subtask:<br />

– ordered: subtasks are executed in sequence;<br />



– multseq: a subtask is executed an unknown number of times until a final<br />

condition is met;<br />

– optional: a subtask is executed only if needed;<br />

– pickOne: a subtask is executed, where the value of a variable satisfying a<br />

given constraint is set. If no variable satisfies the constraint, the current<br />

planning branch is considered a dead end and the planner backtracks.<br />
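These alternatives can be illustrated by a small Python sketch (ours, not the paper's parser) that enumerates the subtask sequences a macro-task may expand to; here multseq is bounded by a repetition limit, whereas the actual planner refines lazily against the final condition:<br />

```python
# Macro-tasks as nested tuples: ("task", name), ("ordered", mt1, mt2, ...),
# ("optional", mt), ("multseq", mt), ("pickOne", domain, mt).
def expand(mt, max_rep=3):
    kind = mt[0]
    if kind == "task":                 # terminal case: a single subtask
        yield [mt[1]]
    elif kind == "ordered":            # subtasks executed in sequence
        def seqs(parts):
            if not parts:
                yield []
                return
            for head in expand(parts[0], max_rep):
                for tail in seqs(parts[1:]):
                    yield head + tail
        yield from seqs(mt[1:])
    elif kind == "optional":           # with or without the subtask
        yield []
        yield from expand(mt[1], max_rep)
    elif kind == "multseq":            # repeat until the final condition
        for n in range(max_rep + 1):
            yield from expand(("ordered",) + (mt[1],) * n, max_rep)
    elif kind == "pickOne":            # bind a variable from its domain
        for value in mt[1]:
            for seq in expand(mt[2], max_rep):
                yield [s.replace("?x", value) for s in seq]
```

For example, expanding ("ordered", ("task", "a"), ("optional", ("task", "b"))) yields the sequences ["a"] and ["a", "b"].<br />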

Macro-tasks are lazily refined by the parser, so that the planner’s algorithm<br />

described in the next section only needs to assume that macro-tasks are sequences<br />

of subtasks.<br />

The definition of the grammar is the following (nonterminal names reconstructed from the trailing description):<br />
<br />
<macroTask> ::= <subtask><br />
<br />
| ordered <macroTask>*<br />
<br />
| multseq <macroTask> until <test><br />
<br />
| optional <macroTask><br />
<br />
| pickOne <vars> <test> <macroTask><br />
<br />
where <vars> is a list of variables, <subtask> a subtask with its parameters<br />
<br />
and <test> a boolean test.<br />

Fig. 4. Macro-task example<br />

An example of this extension is presented in Figure 4. Compared with Figure<br />
<br />
3, the model is simpler and more understandable and, above all, every<br />
<br />
occurrence of a same task is at the same level of the hierarchy.<br />

The PDDL decomposition of jumpTo is written as:<br />

:subtasks<br />

(:multseq<br />

(:pickOne (?l1 - loc) (isElem (at ?a) ?l1)<br />

(goTo ?a ?l1)<br />

)<br />

until (= (at ?a) ?l)<br />

)<br />

where loc is a type representing a location and isElem is a boolean function<br />

that tests whether the path between its two arguments is elementary and possible or<br />

not. The keyword multseq represents a set of subtasks with an unknown arity,<br />

and pickOne defines the point of choice of a given variable.<br />



3 ALGORITHM DESCRIPTION<br />

In this section, we present an algorithm that is able to solve any problem expressed<br />

in the previous formalism. The main idea of this algorithm, named HDS<br />

for Hierarchical Deepening Search, consists in first computing a plan with a low<br />

level of precision, then using this plan as a guide to compute a more precise plan,<br />

until we obtain a detailed plan that contains only elementary actions. At each<br />

step, the algorithm backtracks to a previous higher level plan if the cost of the<br />

current plan is higher than the expected quality of the lower-level plans.<br />

3.1 Using a lower precision plan as a guide<br />

Let n be a given level of the hierarchy. We assume first that complete plans<br />

have been constructed for all upper levels including n. The by-level planner uses<br />

the macro-tasks at level n + 1 and the plan Pn at level n to compute a higher<br />

precision plan Pn+1 that solves level n + 1. We can use any forward planning<br />

algorithm, for instance A*, slightly modified to handle constraints from Pn in<br />

its exploration.<br />

The first idea is that the by-level planner uses all tasks (actions and macro-tasks)<br />

as elementary actions, using the meta-effects of macro-tasks as normal<br />

effects. The second idea consists in keeping track, for each state, of its position<br />

in the HTN by means of an extended state σ := (σ.s, σ.p), composed of the<br />

state σ.s and the position σ.p in the HTN. Using this extended state, we can<br />

significantly restrict the branching factor: in each state, we pick the actions that<br />

can be applied according to the position σ.p in the HTN among the ones whose<br />

preconditions are satisfied in state σ.s. This is quite similar to works by [9] and<br />

[10], except that we run this algorithm at each level of the hierarchy, not only<br />

the finest one.<br />
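A minimal sketch of the extended state and the restricted branching (next_in_htn and applicable are illustrative placeholders for the paper's position tracking and precondition tests):<br />

```python
from typing import NamedTuple

class ExtState(NamedTuple):
    s: frozenset  # the world state σ.s
    p: tuple      # the position σ.p in the HTN

def branch(sigma, next_in_htn, applicable):
    # Candidate sons are only the actions allowed next at position σ.p
    # whose preconditions hold in state σ.s.
    return [a for a in next_in_htn(sigma.p) if applicable(a, sigma.s)]
```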

We initialize the forward planner with a root node containing the initial state<br />

of the problem. In each state explored by the forward planning algorithm, we<br />

look at the position in the upper plan and the possible solutions proposed by the<br />

method decomposition of the upper task. Among all of these solutions, we keep<br />

only the applicable ones according to the current state and the preconditions.<br />

The possible sons are defined by the upper plan and the methods of the meta-actions.<br />

We keep track of the current task of the upper plan and the current<br />

position in the methods decomposition of the task. According to the position in<br />

the higher plan, we have different branching possibilities:<br />

– if just entered a primitive action: apply it in the new plan and go to the next<br />

task,<br />

– if just entered a meta-action: sons are the different acceptable methods according<br />

to their preconditions,<br />

– if ordered/parallel: apply in sequence/parallel the different sub-tasks, without<br />

choices,<br />

– if optional: develop two sons, one with the optional subtask inserted, one<br />

without,<br />



– if multseq/multpar: develop in sequence/parallel the subtask until the end<br />

condition is true<br />

– if pickone: develop all sons with all combination of variables accepted by the<br />

condition,<br />

– if face a subtask: check the preconditions, if true apply the effects, otherwise<br />

declare the branch as dead-end.<br />

Using an algorithm inspired by A* for this forward planner, we obtain<br />
<br />
Algorithm 1. As in A*, we maintain a planning tree for each level of hierarchy.<br />

This tree contains at each node:<br />

– a state (node.σ)<br />

– the lowest cost to reach it (node.cost),<br />

– an estimation of the cost to reach the goal (computed by a heuristic)<br />

(node.estim)<br />

– and the sons of this node (node.sons), that is, the reachable states according<br />

to the different applicable and acceptable actions.<br />

The algorithm walks recursively through this tree, choosing at each node<br />

its most promising son, that is, the one with the lowest sum of its cost and its<br />

estimated cost (line 22). Once a tip node is reached, that is, a node without sons,<br />

the algorithm applies all the applicable and acceptable actions from this state<br />

(line 8), and affects the resulting states as sons for the current node (line 14).<br />

The algorithm stops when the goal set has been reached or when it is established<br />

that the problem has no solution, that is, when no more nodes can be<br />

developed. It then extracts the plan from the A* tree (∅ if the problem has no<br />

solution) (line 5).<br />

3.2 Links between levels<br />

We now present (see Algorithm 2) how we construct the complete hierarchical<br />

plan (defined at all levels of the hierarchy) by refining or backtracking between<br />

plans iteratively constructed by the by-level planner.<br />

We initialize the planner with the init task of the problem (line 2), which is<br />

used by the by-level algorithm as a guide to compute the first plan. Then we<br />

keep an instance of the by-level planner for each level. The by-level planner is<br />

launched using the plan extracted from the upper level (line 5).<br />

Then, once the currently lowest level (let us call it n) by-level planner ends its<br />

work, we call a plan updater (line 10) on the higher level plans. This updater<br />

reports the actual best estimated cost to the final node of the upper by-level<br />

plan (at level k, k < n). By propagating this cost to the whole best branch of<br />

the planning structure, the updater will be able to determine whether the best plan<br />
<br />
is still the same or not (line 10). This updater is called on each plan, from level<br />

n − 1 to level 0 (lines 8 to 13). At each step, the current evaluation of the best<br />

possible plan is reevaluated and used for the directly upper plan. The planning<br />

sequence starts again at the coarser level where the best plan has changed.<br />

We continue propagating the new cost estimation towards the coarsest plan.<br />

For each level, we note whether the plan has been questioned or not. We then restart<br />

the computation at the coarsest level which has been questioned.<br />
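The level-linking loop can be sketched schematically as follows (Algorithm 2 in the paper is richer; plan_level and update_estimate are illustrative placeholders for the by-level planner and the plan updater):<br />

```python
def hds(levels, plan_level, update_estimate):
    """plan_level(n, guide) plans level n constrained by the upper-level
    plan `guide`; update_estimate(k, plans) propagates the refined cost to
    level k and returns True if the best plan at that level changed."""
    plans = [None] * levels
    n = 0
    while n < levels:
        guide = plans[n - 1] if n > 0 else None
        plans[n] = plan_level(n, guide)
        # Propagate the new cost estimation upwards; restart at the
        # coarsest level whose best plan has been questioned.
        restart = None
        for k in range(n - 1, -1, -1):
            if update_estimate(k, plans):
                restart = k
        n = restart if restart is not None else n + 1
    return plans[-1]
```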



Algorithm 1: Astar by-level planner:<br />

1 begin runAStar(root,Pn)<br />

2 goalReached ← false;<br />

3 while root.cost < ∞ ∧ ¬goalReached do<br />

4 goalReached ← aStarRecPlanner(root, Pn);<br />

5 return extractPlan(root);<br />

6 begin aStarRecPlanner(node,Pn)<br />

7 if node.sons = ∅ then<br />

8 Ast ← next(node.σ.p) ∩ acceptable(node.σ.s);<br />

9 forall the a ∈ Ast do<br />

10 node’.σ.s ← apply(a.effects, node.σ.s);<br />

11 node’.σ.p ← track(node.σ.p, a, Pn+1);<br />

12 node’.cost ← node.cost + cost(node.σ.s, a, node’.σ.s);<br />

13 node’.estim ← heurist(node’.σ);<br />

14 node.sons ← node.sons ∪ {node’};<br />

15 if node.sons = ∅ then node.estim = ∞;<br />

16 goalReached ← false;<br />

17 else<br />

18 if satisfies(node.σ.s,goal) then<br />

19 goalReached ← true;<br />

20 else<br />

21 node’ ← argmin n’∈node.sons (n’.cost + n’.estim);<br />

22 goalReached ← aStarRecPlanner(node’, Pn);<br />

23 c ← min n’∈node.sons (n’.cost + n’.estim);<br />

24 node.estim ← c-node.cost;<br />

25 return goalReached;<br />
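Algorithm 1 can be mirrored in Python roughly as follows (a sketch with our own naming; expand, heurist and is_goal are injected, and, as in the pseudocode, the goal test is only performed on already expanded nodes):<br />

```python
import math

class Node:
    def __init__(self, sigma, cost=0.0, estim=0.0):
        self.sigma, self.cost, self.estim = sigma, cost, estim
        self.sons = []

def astar_rec(node, expand, heurist, is_goal):
    """One descent: expand a tip node, otherwise recurse into the most
    promising son and back up the refined cost estimate."""
    if not node.sons:
        for sigma2, step_cost in expand(node.sigma):
            son = Node(sigma2, node.cost + step_cost, heurist(sigma2))
            node.sons.append(son)
        if not node.sons:              # dead end: no applicable action
            node.estim = math.inf
        return False
    if is_goal(node.sigma):
        return True
    best = min(node.sons, key=lambda n: n.cost + n.estim)
    reached = astar_rec(best, expand, heurist, is_goal)
    node.estim = min(n.cost + n.estim for n in node.sons) - node.cost
    return reached

def run_astar(root, expand, heurist, is_goal):
    reached = False
    while root.cost + root.estim < math.inf and not reached:
        reached = astar_rec(root, expand, heurist, is_goal)
    return reached
```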

This algorithm terminates either when it finds a plan containing only elementary tasks<br />
<br />
that is not invalidated by the upper by-level planners, that is, when the final<br />

solution is found, or when the estimated cost for the highest level by-level planner<br />

reaches infinity, that is, when it is estimated that no plan can be computed to<br />

reach the goal with the given decomposition.<br />

3.3 HDS Properties<br />

HDS properties first rely on the HTN and its meta-effects. If the solution (resp.<br />

optimal solution) is not reachable through the HTN, HDS will not be able to find<br />

any solution (resp. the optimal solution) of the problem. Assuming the HTN is<br />

well written, i.e. the optimal solution is reachable, the planner may still consider<br />

an intermediate solution that does not allow the planner to reach the optimal<br />

solution if the meta-effects are not optimistic (i.e., their long-term costs are higher<br />
<br />
than the real costs); the backtracking process will not be able to detect it.<br />

Second, the properties of our algorithm depend on the properties of the by-level<br />

planner used:<br />



To implement our by-level planner, we chose the Dijkstra algorithm, modified<br />
<br />
to use macro-tasks and information from the upper plan. Even though far more<br />
<br />
efficient algorithms exist in the literature, we chose to implement a very simple and<br />

quite naive by-level planner in order to highlight the relevance of our global<br />

Hierarchical Deepening Search approach (efficiency does not come from the by-level<br />

planner but from our general framework). Along the same line, we do not<br />

use generic heuristics, such as Hmax or Hadd [7]. Without these heuristics, we<br />

have much less constraints on the formalism, and our planner accepts object<br />

functions, i.e. functions that return an object instead of just a number. We can<br />

also use non-linear functions or effects.<br />

4.2 Comparisons with other planners<br />

Our planner HDS is optimal given an HTN decomposition of the problem. We<br />

compared HDS with TLPlan [1], a non optimal domain-dependent planner based<br />

on LTL temporal logic; with dynDFS [11], a domain-dependent optimal temporal<br />

planner based on the Timelines formalism; and with SHOP2 configured to<br />

find the optimal solution, which is a successful HTN planner (without meta-effects).<br />
<br />
The first three tracks presented next are from the IPC3 planning<br />

competition. All planners were allowed 2 GB of RAM and 10 minutes to plan<br />

each problem on a 3GHz Intel processor.<br />

Satellite. The Satellite STRIPS domain comes from IPC3, where a fleet of satellites<br />
<br />
has to take pictures of various events with various instruments. In the<br />

STRIPS version, each action has a time cost of one. The aim is to minimize<br />

the total time of the mission. Parallelism between satellites is authorized. In<br />

this domain, we compared HDS with dynDFS and with TLPlan. SHOP2 is not<br />

presented here as we did not have HTNs for SHOP2 on this track.<br />

Figure 5 presents planning times and costs for the different planners. Since<br />

parallelism of tasks is not yet available on HDS, our HTN decomposition is<br />

quite weak, including only tasks to initialize a sensor (turn towards an acceptable<br />

ground station then switch on the sensor and calibrate it) and to take a<br />

picture (turn to event then take picture), and cannot really take advantage of<br />

the hierarchical framework.<br />

HDS performance is similar to that of dynDFS, which is specialized in parallelism,<br />
<br />
but can solve fewer problems. In particular, HDS finds optimal plans for the problems<br />
<br />
that it could solve. As a reference, we report also the costs found by the<br />

domain-independent planner Lama [12] which, like TLPlan, cannot take advantage<br />
<br />
of parallelism in terms of cost. Lama was set in the optimizing mode, and<br />

even if it does not always get the optimal cost (in a non parallelized plan), it<br />

can often find a much better solution than the one found by TLPlan, showing<br />

that TLPlan solutions are far from the optimal ones (TLPlan did not take advantage<br />

of parallelism and got high costs as soon as the problem had more than<br />

one satellite).<br />



Fig. 5. Satellite<br />

Fig. 6. HDS vs SHOP2<br />

Freecell. This domain also comes from IPC3 and is inspired by the famous<br />

Microsoft Windows game. We compared ourselves with SHOP2, giving SHOP2<br />

exactly the same HTNs as the ones given to HDS, except meta-effects and macro-tasks,<br />
<br />
which cannot be handled by SHOP2. We configured these HTNs to find the<br />

optimal solution at the finest level of the hierarchy. We forced the planners to<br />

send to the home location all unneeded cards, as automatically done in the<br />

Windows game. We provided another method to move a block of cards of the<br />



same column if enough free cells are available. Figure 6 presents planning times<br />

for both planners. Costs are not plotted since HDS is optimal and SHOP2 is<br />

configured here to be optimal. SHOP2 is able to solve only the first problem,<br />

whereas HDS can solve the first seven. In addition to meta-effects and<br />

macro-tasks, this difference can be due to several other factors: Lisp is far less<br />

efficient than OCaml and HDS can deal with more abstract functions and denser<br />

problem descriptions than SHOP2, leading to more efficient computation.<br />

Zeno Traveler. In this domain, also taken from IPC3, the planner has to<br />

make people reach their destination by plane. The planes have two speed modes:<br />

slow and zoom. Slow consumes far less fuel than zoom, but it is far slower. In<br />

the numeric version, as each move only takes one time step, zoom is never the<br />

best solution. Like Satellite, the lack of parallelism between tasks in our model<br />

restricts the capabilities of HDS. We only gave as knowledge the information that<br />

the plane can only go to destinations where someone is waiting or where someone<br />

needs to go, and that someone already at their destination is not allowed to board<br />

the plane.<br />

We compared HDS with SHOP2 in its optimal mode, using its IPC3 HTN<br />

decomposition. We can see on Figure 6 the advantage of HDS: in the first two<br />

problems, with just one plane, HDS computation time is around 20 milliseconds,<br />

whereas SHOP2 computation time is around 200 milliseconds. For the other<br />

problems, the planner must choose among multiple planes, and even if parallelism<br />

is not really taken into account, SHOP2 cannot solve any of these problems,<br />

whereas HDS can solve the third problem in 14 seconds.<br />

Explore and Guide. This domain particularly highlights the advantages<br />
<br />
of our approach. The goal is, for a helicopter, to drive intruders back to the<br />
<br />
border, having explored their known exit path in order to ensure that no trap<br />

is present. In this problem, non-concurrent high-level tasks are easy to identify<br />

and their effects and costs are well approximated. Once the highest-level plan<br />

is computed, it is very helpful for the computation of the exploration strategy,<br />

that is split into sub-zones by an expert.<br />

We gave to SHOP2 exactly the same HTNs as to HDS, except meta-effects<br />

and macro-tasks. The main algorithmic difference is that HDS first explores<br />

at low precision and then refines this plan (with backtracks), whereas SHOP2<br />

directly explores at the finest precision level. Both planners return the same<br />

optimal solution for each problem. The results are presented in Figure 6, where<br />

we can see that HDS is still between one and two orders of magnitude quicker<br />

than SHOP2.<br />

5 CONCLUSION AND FUTURE WORK<br />

In this paper, we proposed to use both macro-operator techniques and procedural<br />

control knowledge within the same informed planning framework. We<br />

introduced the meta-effects and macro-tasks extensions to the HTN formalism,<br />



allowing us to jump forward in the state graph. We also proposed an algorithm<br />

named HDS that explores, level by level, such a structure, thus detecting traps<br />

and optimizing an abstract plan before refining it into a precise executable plan,<br />

backtracking to another high-level solution if necessary. We furthermore showed<br />
<br />
that HDS, thanks to the proposed extensions to the HTN formalism, is very<br />
<br />
efficient on structured problems and optimal given the decomposition. This contribution<br />

provides assistance in writing large planning problems using domain<br />

expertise, and to reduce the complexity of the underlying planning algorithm.<br />

The required domain expertise can also be automatically extracted from the<br />

model, and used in our approach.<br />

In the near future, we plan to implement real parallelism, not only in our model<br />

but also in our planner. We expect gains by introducing more human expertise<br />

in the domains and better performance on some problems. Since our algorithmic<br />

approach is quite generic, especially concerning the by-level planner, we<br />

plan to extend our contribution to other planning schemes, such as probabilistic<br />

planning, using a forward MDP by-level planner.<br />

References<br />

1. F. Bacchus and F. Kabanza. Using temporal logics to express search control knowledge<br />

for planning. Artificial Intelligence, 2000.<br />

2. A. Botea, M. Enzenberger, M. Müller, and J. Schaeffer. Macro-FF: Improving AI<br />

planning with automatically learned macro-operators. Journal of Artificial Intelligence<br />

Research, 24:581–621, 2005.<br />

3. L. de Silva, S. Sardina, and L. Padgham. First principles planning in BDI systems.<br />

In Autonomous Agents and Multiagent Systems (AAMAS-09), 2009.<br />

4. K. Erol, J. Hendler, and D. S. Nau. HTN planning: complexity and expressivity. In<br />

AAAI’94: Proceedings of the Twelfth National Conference on Artificial Intelligence<br />
<br />
(vol. 2), pages 1123–1128, 1994.<br />

5. M. Fox and D. Long. PDDL2.1: An extension to PDDL for expressing temporal<br />
<br />
planning domains. Journal of Artificial Intelligence Research, 20, 2003.<br />

6. M. Ghallab, D. Nau, and P. Traverso. Automated Planning. Morgan Kaufmann,<br />

San Francisco, CA, USA, 2004.<br />

7. P. Haslum and H. Geffner. Admissible heuristics for optimal planning. pages<br />

140–149. AAAI Press, 2000.<br />

8. C. Hogg, H. Muñoz-Avila, and U. Kuter. HTN-MAKER: Learning HTNs with minimal<br />

additional knowledge engineering required. In Association for the Advancement of<br />

Artificial Intelligence (AAAI-08), 2008.<br />

9. U. Kuter and D. S. Nau. Using domain-configurable search control for probabilistic<br />

planning. In AAAI, Pittsburgh, Pennsylvania, USA, July 2005.<br />

10. B. Marthi, S. Russell, and J. Wolfe. Angelic hierarchical planning: Optimal and online<br />

algorithms. In International Conference on Automated Planning and Scheduling<br />

(ICAPS-08), 2008.<br />

11. C. Pralet and G. Verfaillie. Using constraint networks on timelines to model and<br />

solve planning and scheduling problems. In Proc. ICAPS, 2008.<br />

12. S. Richter, M. Helmert, and M. Westphal. Landmarks revisited. In 23rd AAAI<br />

Conference on Artificial Intelligence (AAAI-08), 2008.<br />



Solving Disjunctive Temporal Problems with<br />

Preferences using Boolean Optimization solvers<br />

Marco Maratea 1, Maurizio Pianfetti 1, and Luca Pulina 2<br />

1 DIST, University of Genova, Viale F. Causa 15, Genova, Italy.<br />

marco@dist.unige.it,maurizio.pianfetti@studenti.ingegneria.unige.it<br />

2 DEIS, University of Sassari, Piazza Università 11, Sassari, Italy.<br />

lpulina@uniss.it<br />

Abstract. The Disjunctive Temporal Problem (DTP), which involves Boolean<br />

combination of difference constraints of the form x − y ≤ c, is an expressive<br />

framework for constraints modeling and processing. When a DTP is unfeasible<br />

we may want to select a feasible subset of its DTP constraints (i.e., disjunctions<br />

of difference constraints), possibly subject to some degree of satisfaction: The<br />

Max-DTP extends DTP by associating a preference, in the form of weight, to<br />

each DTP constraint for its satisfaction, and the goal is to find an assignment to<br />

its variables that maximizes the sum of weights of satisfied DTP constraints. In<br />

this paper we first present an approach based on Boolean optimization solvers to<br />

solve Max-DTPs. Then, we implement our ideas in TSAT++, an efficient DTP<br />

solver, and evaluate its performance on randomly generated Max-DTPs, using<br />

both different Boolean optimization solvers and two optimization techniques.<br />

1 Introduction<br />

The Disjunctive Temporal Problem (DTP), introduced in [8], is defined as the finite<br />

conjunction of DTP constraints, each DTP constraint being a finite disjunction of difference<br />

constraints of the form x−y ≤ c, where x and y are arithmetic variables ranging<br />

over a domain of interpretation (the set of real numbers R or the set of integers Z), and<br />

c is a numeric constant. The goal is to find an assignment to the variables of the problem<br />

that satisfies all DTP constraints. The DTP is recognized to be a good compromise<br />

between expressivity and efficiency, given that the arithmetic consistency of a set of<br />

difference constraints can be checked in polynomial time, and has found applications<br />

in many areas such as planning, scheduling, hardware and software verification, see,<br />

e.g., [19, 5]. Along the years several systems that can solve DTPs have been developed,<br />

e.g., SK [21], TSAT [2], CSPI [18], EPILITIS [23], TSAT++ [4], and MATHSAT [5].<br />

Moreover, the competition of solvers for Satisfiability Modulo Theories (SMT-COMP) 3<br />

has two logics that include DTPs (called QF RDL and QF IDL, respectively).<br />
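The polynomial-time consistency check mentioned above is standardly done by negative-cycle detection on the constraint graph (an edge y → x with weight c for each constraint x − y ≤ c); here is a Bellman-Ford sketch of that standard technique, not code from any of the cited solvers:<br />

```python
def consistent(constraints):
    """constraints: iterable of (x, y, c) triples meaning x - y <= c."""
    nodes = {v for x, y, _ in constraints for v in (x, y)}
    dist = {v: 0 for v in nodes}       # as if a 0-weight source fed every node
    edges = [(y, x, c) for x, y, c in constraints]
    for _ in range(len(nodes) + 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True                # fixed point reached: satisfiable
    return False                       # still relaxing: negative cycle
```

For instance, {x − y ≤ 1, y − x ≤ −2} is inconsistent (adding the two gives 0 ≤ −1), and the corresponding cycle has weight −1.<br />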

When a DTP is unfeasible, i.e., unsatisfiable, we may want to select a feasible subset<br />

of its DTP constraints, which can be possibly subject to some degree of satisfaction:<br />

The maximum satisfiability problem on a DTP (i.e., Max-DTP) extends DTP by associating<br />

a preference, in the form of cost, or weight, to each DTP constraint for taking<br />

<strong>Proceedings</strong> of the 18 th RCRA workshop on Experimental Evaluation of Algorithms for Solving<br />

Problems with Combinatorial Explosion (RCRA 20<strong>11</strong>).<br />

In conjunction with IJCAI 20<strong>11</strong>, Barcelona, Spain, July 17-18, 20<strong>11</strong>.<br />

3 http://www.smtcomp.org/.<br />



into account the reward for the DTP constraint’s satisfaction. The goal is to find an<br />
<br />
assignment to the variables of the problem that maximizes the sum of the rewards of satisfied<br />

DTP constraints. The introduction of preferences in DTPs has been first presented<br />

in [20], where complex preferences can be assigned to each difference constraint.<br />

In this paper we present an approach which extends the lazy SAT-based approach<br />

implemented in solvers for DTPs. The idea is to (i) abstract a Max-DTP P into a Conjunctive<br />

Normal Form (CNF) formula φ and an optimization function f; (ii) find a<br />

solution for φ under f with a Boolean optimization solver; and (iii) verify if the solution<br />

returned is consistent. Step (ii) can be implemented with a variety of approaches<br />

and solvers, ranging from Max-SAT 4 and Pseudo-Boolean (PB) 5 ,toAnswerSetProgramming<br />

(ASP) [12, 13]. Then, we implement our ideas by modifying the DTP solver<br />

TSAT++, a well-known and efficient solver for solving DTPs, and call the resulting<br />

system TSAT#. We finally evaluate its performance on randomly generatedDTPs,<br />

using a well-known generation method from [21] extended with randomlygenerated<br />

weights. We focus our analysis on TSAT#, as representative of thesolversimplementing<br />

the lazy SAT-based approach to DTPs, and consider Max-SAT andPBsolversas<br />

back-engines, as well as two optimization techniques that proved effective for solving<br />

DTPs. Our preliminary results show that the Max-SAT solver AKMAXSAT performs<br />

well on these benchmarks, and that the employed optimization techniqueshelptoreducing<br />

the search time.<br />
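The three steps above form a lazy loop; a minimal sketch (our own illustrative code, not the TSAT# implementation; the Boolean optimization back-engine `optimize` and the arithmetic consistency check `theory_check` are assumed to be supplied):<br />

```python
def solve_max_dtp(clauses, weights, optimize, theory_check):
    """Lazy SAT-based loop for Max-DTP (sketch).

    clauses: soft CNF clauses abstracting the DTP constraints
    weights: reward of each soft clause
    optimize: Boolean optimization back-engine (e.g. a Max-SAT solver);
              returns an assignment maximizing the weighted sum of
              satisfied soft clauses subject to the hard clauses, or None
    theory_check: checks arithmetic consistency of the difference
                  constraints selected by the assignment; returns
                  (True, None) or (False, reason_clause)
    """
    hard = []                                   # reasons learned from failed checks
    while True:
        mu = optimize(clauses, weights, hard)   # step (ii)
        if mu is None:
            return None                         # no feasible abstraction left
        ok, reason = theory_check(mu)           # step (iii)
        if ok:
            return mu
        hard.append(reason)     # block assignments repeating the conflict
```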

2 Formal Background<br />

Disjunctive Temporal Problems. Temporal constraints have been introduced in [8], as<br />
an extension of the Simple Temporal Problem (STP), which consists of a conjunction<br />
of difference constraints. Let V be a set of symbols, called variables. A difference constraint,<br />
or simply constraint, is an expression of the form x − y ≤ c, where x, y ∈ V,<br />
and c is a numeric constant. A DTP formula, or simply formula, is a combination of<br />
constraints via the unary connective “¬” for negation and the n-ary connectives “∧”<br />
and “∨” (n ≥ 0) for conjunction and disjunction, respectively. A constraint literal, or<br />
simply literal, is either a constraint or its negation. If a is a constraint, then ā abbreviates<br />
¬a and ¬ā stands for a. Let the set D (domain of interpretation) be either the set of<br />
the real numbers R, or the set of integers Z. An assignment is a total function mapping<br />
variables to D. Let σ be an assignment and φ be a formula. Then σ |= φ (σ satisfies the<br />
formula φ) is defined as follows.<br />

σ |= x − y ≤ c if and only if σ(x) − σ(y) ≤ c,<br />
σ |= ¬φ if and only if it is not the case that σ |= φ,<br />
σ |= (φ1 ∧ · · · ∧ φn) if and only if for each i ∈ [1, n], σ |= φi, and<br />
σ |= (φ1 ∨ · · · ∨ φn) if and only if for some i ∈ [1, n], σ |= φi.<br />

If σ |= φ then σ will also be called a model of φ. We also say that a formula φ is<br />
satisfiable if and only if there exists a model for it.<br />
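The satisfaction relation can be transcribed directly; a minimal sketch (our own representation: a formula is a constraint triple ('leq', x, y, c), a negation ('not', f), or an n-ary conjunction/disjunction):<br />

```python
def satisfies(sigma, phi):
    """Evaluate sigma |= phi for DTP formulas.

    sigma: dict mapping variables to numbers (the assignment)
    phi:   ('leq', x, y, c)       for the constraint x - y <= c
           ('not', f)             for negation
           ('and', f1, ..., fn) / ('or', f1, ..., fn)
    """
    op = phi[0]
    if op == 'leq':
        _, x, y, c = phi
        return sigma[x] - sigma[y] <= c
    if op == 'not':
        return not satisfies(sigma, phi[1])
    if op == 'and':
        return all(satisfies(sigma, f) for f in phi[1:])
    if op == 'or':
        return any(satisfies(sigma, f) for f in phi[1:])
    raise ValueError("unknown connective: %r" % (op,))
```

For example, σ = {x: 3, y: 1} satisfies x − y ≤ 2 but not x − y ≤ 1; the empty conjunction (n = 0) is true, matching the definition above.<br />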

4 http://www.maxsat.udl.cat/.<br />

5 See, e.g., http://www.cril.univ-artois.fr/PB10/.<br />



A DTP is the problem of deciding whether a formula is satisfiable or not in the<br />
given domain of interpretation D. Notice that the satisfiability of a formula depends on<br />
D: e.g., the formula x − y &gt; 0 ∧ x − y &lt; 1 is satisfiable if D = R, but unsatisfiable if D = Z.<br />


It is a well-known fact that Bellman-Ford (BF) can be used in step 3 to check the satisfiability of a<br />
finite set Q of constraints of the form x − y ≤ c. This is done by first building a constraint<br />
graph for Q, see, e.g., [6]. The soundness and completeness of the algorithm is<br />
guaranteed by the soundness and completeness of the underlying solving procedure for<br />
solving DTPs, i.e., the solving procedure for φ (shown in [4]), and by the soundness<br />
and completeness of the Boolean optimization procedures employed. For solving step<br />
2, a wide range of formulations, solving procedures and techniques can be employed,<br />
e.g., Weighted Max-SAT, Pseudo-Boolean, and ASP.<br />
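A sketch of this standard construction (our own code, following [6], not TSAT#'s): each constraint x − y ≤ c becomes an edge from y to x with weight c, and Q is consistent if and only if the constraint graph has no negative cycle, which Bellman-Ford detects in polynomial time; the constraints on a detected negative cycle also yield a “reason” for the inconsistency:<br />

```python
def check_difference_constraints(Q):
    """Check arithmetic consistency of constraints (x, y, c) meaning x - y <= c.

    Builds the constraint graph (edge y -> x with weight c), runs
    Bellman-Ford with an implicit virtual source (all distances 0), and
    returns (True, None) if consistent, or (False, cycle_constraints)
    listing the constraints on one negative cycle otherwise.
    """
    nodes = {v for (x, y, _) in Q for v in (x, y)}
    dist = {v: 0 for v in nodes}      # virtual source: all distances start at 0
    pred = {v: None for v in nodes}   # relaxing *constraint* that set dist[v]
    for _ in range(len(nodes)):
        changed = False
        for con in Q:
            x, y, c = con
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                pred[x] = con
                changed = True
        if not changed:
            return True, None         # fixpoint reached: consistent
    # an edge still relaxes after |V| passes: a negative cycle exists
    x = next(x for (x, y, c) in Q if dist[y] + c < dist[x])
    for _ in range(len(nodes)):       # walk predecessors to land inside the cycle
        x = pred[x][1]
    cycle, v = [], x
    while True:
        con = pred[v]
        cycle.append(con)
        v = con[1]
        if v == x:
            return False, cycle
```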

In the next paragraphs we present two optimization techniques that can help to improve<br />
the performance of the basic algorithm presented in this section. However, there<br />
is one optimization of the basic procedure that is enabled by default in TSAT#: If the consistency<br />
check at step 3 does not succeed, we add a “reason” to the abstraction formula<br />
φ, i.e., a clause that prevents the employed solver from re-computing an assignment µ having<br />
the literals corresponding to the constraint literals that caused the arithmetic inconsistency<br />
assigned in the same way. Given the BF run, computing such a reason can be done efficiently,<br />
by considering the difference constraints involved in (one of the) negative cycles. Of<br />
course, methods for limiting the number of added reasons are needed, in order to keep<br />
the procedure working in polynomial space, e.g., given a positive integer b, by<br />
adding only the reasons that contain at most b literals.<br />

Optimizations. We herewith highlight two optimization techniques, one theory-dependent<br />
and one theory-independent, that proved to be effective for solving DTPs, and that<br />
can be fruitfully used with black-box engines. Their general idea is to reduce the enumeration<br />
of unfruitful assignments at a reasonable price. The first one, denoted by<br />
IS2, is a preprocessing step: For each unordered pair ⟨ci, cj⟩ of distinct difference constraints<br />
appearing in the formula φ and involving the same variables, all possible pairs<br />
of literals built out of them are checked for consistency. Whenever a pair of literals ⟨li, lj⟩ is inconsistent,<br />
the clause ¬li ∨ ¬lj is added to the input formula before calling TSAT#. The<br />
second technique, called “model reduction”, is based on the observation that an assignment<br />
µ generated by TSAT# can be redundant, that is, there might exist an assignment<br />
µ′ ⊂ µ that propositionally entails the input formula. When this is the case, we can<br />
check the consistency of µ′ instead of µ. Details for both techniques can be found in [4].<br />
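The IS2 consistency check for a pair of literals over the same variable pair reduces to intersecting intervals on the single quantity d = x − y (negating x − y ≤ c yields x − y &gt; c). A sketch with our own representation (over Z the strict bounds could be tightened to non-strict bounds at c + 1; here we keep strictness flags, as over R):<br />

```python
INF = float('inf')

def interval_of(literal):
    """Interval of d = x - y allowed by one literal over the pair (x, y).

    literal: (kind, c) with kind in {'le', 'gt', 'ge', 'lt'}:
      'le': x - y <= c    'gt': x - y > c   (negation of 'le')
      'ge': x - y >= c    'lt': x - y < c   (negation of 'ge')
    Returns (lo, lo_strict, hi, hi_strict).
    """
    kind, c = literal
    return {'le': (-INF, False, c, False),
            'gt': (c, True, INF, False),
            'ge': (c, False, INF, False),
            'lt': (-INF, False, c, True)}[kind]

def pair_inconsistent(l1, l2):
    """True iff two literals over the same variable pair cannot both hold."""
    lo1, ls1, hi1, hs1 = interval_of(l1)
    lo2, ls2, hi2, hs2 = interval_of(l2)
    # tightest bounds of the intersection; on ties a strict bound is tighter
    lo, lstrict = max((lo1, ls1), (lo2, ls2))
    hi, hstrict = min((hi1, not hs1), (hi2, not hs2))
    hstrict = not hstrict
    return lo > hi or (lo == hi and (lstrict or hstrict))
```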

4 Implementation and Experimental Analysis<br />

We have implemented TSAT# as an extension of the TSAT++ solver [3], by integrating<br />
some Max-SAT and PB solvers as back-engines for reasoning about Boolean optimization<br />
problems. Specifically, the employed solvers are: MINIMAXSAT ver. 1.0 [14],<br />
MINISAT+ ver. 1.14 [9], and AKMAXSAT [15], the version submitted to the last Max-<br />
SAT 2010 Competition. These are well-known solvers for Boolean optimization, and<br />
among the best Partial Weighted Max-SAT 6 and Pseudo-Boolean (focusing on the OPT-<br />
SMALL-INT 7 category) solvers in various Max-SAT and PB Evaluations and Competitions.<br />

6 The “partial” version of the problem, where both hard and soft clauses are present, is needed<br />
because the original clauses of the abstracted problem are soft, while the added ones are hard.<br />
7 We recall that this is a category of PB Evaluations and Competitions where (i) no constraint<br />
has a sum of coefficients greater than 2^20 (20 bits), and (ii) the objective function is linear.<br />

We remind that MINISAT+ is a PB solver, AKMAXSAT is a Max-SAT solver,<br />

while MINIMAXSAT accepts problems in both formalisms: Since it has been mainly<br />
evaluated on Max-SAT formulations, we rely on that format in our analysis. Given a<br />
CNF formula φ, and a function w 8 mapping each clause to a positive integer<br />
that represents its weight, the main implementation effort has been devoted to introducing<br />
weights and formulating the optimization problems in the Max-SAT and PB formats. For<br />
Max-SAT problems there is an immediate formulation, obtained by directly assigning weights<br />
to clauses, while PB problems need a “clause selector” to be added to each soft clause,<br />
and the optimization function to be defined over the clause selectors. In the following,<br />

given a Boolean optimization solver X,<br />

1. TSAT#(X) is TSAT# in plain configuration employing X for step 2;<br />

2. TSAT#+p(X) is TSAT# with model reduction enabled employing X for step 2;<br />

3. TSAT#+is(X) is TSAT# with IS2 preprocessing enabled employing X for step 2;<br />

4. TSAT#+is+p(X) is TSAT# with both model reduction and IS2 preprocessing enabled<br />

employing X for step 2.<br />
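The clause-selector construction described above can be sketched as follows (a hypothetical DIMACS-style encoding of our own, not the exact TSAT# output): each soft clause Ci of weight wi gets a fresh selector variable si, the hard clause Ci ∨ si is emitted, and the objective minimizes Σ wi·si, i.e., the weight of violated soft clauses:<br />

```python
def to_pseudo_boolean(soft_clauses, weights, hard_clauses, n_vars):
    """Encode weighted soft clauses as a PB problem via clause selectors.

    Literals are nonzero ints (DIMACS style: -3 means "not x3").
    Returns (constraints, objective): each constraint is a clause (a list
    of literals, read as "at least one true"); the objective is a list of
    (weight, selector_var) terms to minimize.
    """
    constraints = [list(c) for c in hard_clauses]   # hard clauses unchanged
    objective = []
    next_var = n_vars + 1
    for clause, w in zip(soft_clauses, weights):
        s = next_var                        # fresh selector variable s_i
        next_var += 1
        constraints.append(list(clause) + [s])  # C_i or s_i, now hard
        objective.append((w, s))            # pay w_i iff C_i is violated
    return constraints, objective
```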

About the benchmarks, we randomly generated Max-DTPs, using a well-known<br />
generation method from [21] extended with random weights. In particular, in our model<br />
Max-DTPs are randomly generated by fixing the number k of disjuncts per clause, the<br />
number n of arithmetic variables, a positive integer L such that all the constants are<br />
taken in [−L, L], and a positive integer w such that all the weights are taken in [1, w].<br />
Then, (i) the number of clauses m is increased to create bigger problems, (ii) for each<br />
tuple of values of the parameters, 10 instances are generated and then fed to the solvers,<br />
and (iii) the median of the CPU times is plotted against the m/n ratio. We fix k = 2,<br />
L = 100, w = 100, n = 5, 10, and the ratio m/n varying from 6 to 10. The lower<br />
bound of m/n has been fixed as the lowest positive integer for which there is a majority<br />
of unsatisfiable underlying DTPs. Further note that the DTP is already a “difficult”<br />
problem, and the analyses in the literature on DTPs have been performed on problems with<br />
a few tens of variables for the setting used in this paper 9: adding preferences further<br />
increases the difficulty. The timeout for each problem has been set to 1800s on a Linux<br />
box equipped with a Pentium IV 3.2GHz processor and 1GB of RAM.<br />
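The generation model just described can be sketched as follows (our own function and parameter names, following the description above):<br />

```python
import random

def random_max_dtp(m, n, k=2, L=100, w=100, seed=None):
    """Generate a random Max-DTP in the model of [21] plus random weights.

    m: number of clauses, n: arithmetic variables, k: disjuncts per clause;
    constants are drawn from [-L, L], weights from [1, w].
    Returns (clauses, weights): each clause is a list of k difference
    constraints (x, y, c) read as "x - y <= c" with x != y.
    """
    rng = random.Random(seed)
    clauses, costs = [], []
    for _ in range(m):
        clause = []
        for _ in range(k):
            x, y = rng.sample(range(n), 2)      # two distinct variables
            clause.append((x, y, rng.randint(-L, L)))
        clauses.append(clause)
        costs.append(rng.randint(1, w))         # random clause weight
    return clauses, costs
```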

Fig. 1 shows the results for TSAT# employing MINIMAXSAT (top), MINISAT+<br />
(middle) and AKMAXSAT (bottom), respectively, on randomly generated Max-DTPs<br />
with n = 5. For each plot, the left one considers real-valued variables, while the<br />
right one considers integer-valued variables. First, we can note that the optimization<br />
techniques described help to significantly improve the efficiency: Enhancing the plain<br />
TSAT# version with one of the techniques helps reduce the overall CPU time by<br />
around a factor of 3, while enabling both techniques in conjunction improves the performance<br />
by around one order of magnitude. Comparing the performance of the various<br />
Boolean optimization solvers employed, AKMAXSAT is clearly the best underlying<br />

8 w has been originally defined on DTPs. After the abstraction, there is a one-to-one correspondence<br />
between each DTP constraint and the related clause in φ. Thus, with a slight abuse of<br />
notation, we consider the optimization function to be defined on the clauses of φ.<br />
9 In [16], DTPs with many variables n are used, but the analysis is focused on problems with<br />
k &gt; 2 having ratios m/n such that a vast majority of instances are satisfiable.<br />




Fig. 1. Results of TSAT# employing MINIMAXSAT (top), MINISAT+ (middle), and AKMAXSAT<br />

(bottom) on random Max-DTPs with 5 real-valued (left) and integer-valued (right) variables.<br />

solver on these benchmarks among the ones analyzed: On the biggest instances, it gains<br />
around one order of magnitude over MINIMAXSAT, and more than a factor of 20 over<br />
MINISAT+. All considerations made hold with both real- and integer-valued variables.<br />
The intuition for the superior performance of AKMAXSAT is that, given the results<br />
of the 2010 Max-SAT Competition, AKMAXSAT seems to be very effective on randomly<br />
generated and synthetic benchmarks. Given our approach, the starting abstraction formula<br />
has the following structure: It is a fixed-clause-length formula where each variable occurs<br />
(with high probability) once in the formula. But this was not fully expected: During the<br />




Fig. 2. Results of TSAT# employing AKMAXSAT on random Max-DTPs with 10 real-valued<br />

(Left) and integer-valued (Right) variables.<br />

search, adding the constraints corresponding to reasons, the number of occurrences of literals<br />
increases, giving the formula a less “synthetic” structure.<br />

We increase the number of variables n to 10, and focus the analysis on TSAT#<br />
employing AKMAXSAT, i.e., our best Boolean optimization solver on these benchmarks.<br />
From Fig. 2 we can note that the impact of the optimization techniques is now different:<br />
Model reduction dramatically improves the performance of TSAT#(AKMAXSAT), while<br />
the impact of the preprocessing is limited in this setting.<br />

5 Conclusions and Future Work<br />

In this paper we have presented an approach to solving weighted maximum satisfiability<br />
on DTPs. The approach extends the one implemented in TSAT++, by employing<br />
Boolean optimization solvers as reasoning engines. The performance of the resulting<br />
system, TSAT#, employing some Max-SAT and PB solvers, is analyzed on randomly<br />
generated benchmarks, together with the impact that both theory-dependent and theory-independent<br />
optimization techniques have on its performance. The AKMAXSAT Max-SAT<br />
solver is the best option among the solvers analyzed, and both optimization techniques<br />
help to improve the overall performance. Current research includes (i) the integration<br />
of other solvers in TSAT#, e.g., the ASP solver CLASP [11], which proved to be<br />
very competitive at the 2009 PB Competition, (ii) the extension of our algorithm to<br />
deal with other forms of preferences, e.g., where weights can be associated to each difference<br />
constraint, for which it is still possible to rely on PB and ASP formalisms and<br />
systems, and (iii) a comparative analysis with rival tools in which such more complex<br />
preferences can be easily specified, e.g., MAXILITIS and HYSAT [10].<br />

Acknowledgments. We would like to thank Adrian Kügel for providing, and giving<br />
support for, AKMAXSAT, and Michael D. Moffitt and Bart Peintner for discussions<br />
about MAXILITIS and WEIGHTWATCHER.<br />



References<br />

1. J. Argelich, C. M. Li, F. Manyà, and J. Planes. The first and second Max-SAT evaluations.<br />

Journal of Satisfiability, Boolean Modelling and Computation, 4(2-4):251–278, 2008.<br />

2. A. Armando, C. Castellini, and E. Giunchiglia. SAT-based procedures for temporal reasoning.<br />
In Proc. of ECP 1999, volume 1809 of LNCS, pages 97–108. Springer, 1999.<br />

3. A. Armando, C. Castellini, E. Giunchiglia, M. Idini, and M. Maratea. TSAT++: An open<br />

platform for satisfiability modulo theories. ENTCS, 125(3):25–36, 2005.<br />

4. A. Armando, C. Castellini, E. Giunchiglia, and M. Maratea. The SAT-based approach to<br />

Separation Logic. Journal of Automated Reasoning, 35(1-3):237–263, 2005.<br />

5. M. Bozzano, R. Bruttomesso, A. Cimatti, T. A. Junttila, P. van Rossum, S. Schulz, and R. Sebastiani.<br />

MathSAT: Tight integration of SAT and mathematical decision procedures. Journal<br />

of Automated Reasoning, 35(1-3):265–293, 2005.<br />

6. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT<br />

Press, 2001.<br />

7. L. de Moura. http://yices.csl.sri.com/.<br />

8. R. Dechter, I. Meiri, and J. Pearl. Temporal constraint networks. Artificial Intelligence,<br />

49(1-3):61–95, Jan. 1991.<br />

9. N. Eén and N. Sörensson. Translating pseudo-Boolean constraints into SAT. Journal on<br />

Satisfiability, Boolean Modeling and Computation, 2:1–26, 2006.<br />

10. M. Fränzle, C. Herde, T. Teige, S. Ratschan, and T. Schubert. Efficient solving of large<br />
non-linear arithmetic constraint systems with complex Boolean structure. Journal on Satisfiability,<br />

Boolean Modeling and Computation, 1:209–236, 2007.<br />

11. M. Gebser, B. Kaufmann, A. Neumann, and T. Schaub. Conflict-driven answer set solving.<br />
In Proc. of IJCAI 2007, pages 386–392. Morgan Kaufmann Publishers, 2007.<br />
12. M. Gelfond and V. Lifschitz. The stable model semantics for logic programming. In Proc.<br />
of ICLP/SLP 1988, pages 1070–1080, 1988.<br />

13. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive databases.<br />

New Generation Computing, 9:365–385, 1991.<br />

14. F. Heras, J. Larrosa, and A. Oliveras. MiniMaxSat: A new weighted Max-SAT solver. Journal<br />

of Artificial Intelligence Research (JAIR), 31:1–32, 2008.<br />

15. A. Kügel. Improved exact solver for the weighted Max-SAT problem. In Proc. of the 2nd<br />
Pragmatics of SAT (PoS-10) workshop, 2010.<br />
16. B. Nelson and T. K. S. Kumar. CircuitTSAT: A solver for large instances of the disjunctive<br />
temporal problem. In Proc. of ICAPS 2008, pages 232–239. AAAI Press, 2008.<br />
17. R. Nieuwenhuis and A. Oliveras. On SAT modulo theories and optimization problems. In<br />
Proc. of SAT 2006, volume 4121 of LNCS, pages 156–169. Springer, 2006.<br />
18. A. Oddi and A. Cesta. Incremental forward checking for the disjunctive temporal problem.<br />
In Proc. of ECAI-2000, pages 108–112, Berlin, 2000.<br />

19. A. Oddi, R. Rasconi, and A. Cesta. Project scheduling as a disjunctive temporal problem.<br />

In Proc. of ECAI 2010, volume 215 of Frontiers in Artificial Intelligence and Applications,<br />

pages 967–968. IOS Press, 2010.<br />

20. B. Peintner and M. E. Pollack. Low-cost addition of preferences to DTPs and TCSPs. In<br />
Proc. of AAAI 2004, pages 723–728. AAAI Press / The MIT Press, 2004.<br />
21. K. Stergiou and M. Koubarakis. Backtracking algorithms for disjunctions of temporal constraints.<br />
Artificial Intelligence, 120(1):81–117, 2000.<br />
22. O. Strichman, S. A. Seshia, and R. E. Bryant. Deciding Separation formulas with SAT. In<br />
Proc. of CAV 2002, volume 2404 of LNCS, pages 209–222. Springer, 2002.<br />

23. I. Tsamardinos and M. Pollack. Efficient solution techniques for disjunctive temporal reasoning<br />

problems. Artificial Intelligence, 151:43–89, 2003.<br />



Visualizing Learning Dynamics in Large-Scale<br />

Networks<br />

Manal Rayess 1 and Sherief Abdallah 1,2<br />

1 Faculty of Informatics<br />

The British University in Dubai<br />

P.O.Box 502216, Dubai,<br />

United Arab Emirates<br />

manal.rayess@gmail.com<br />

2 (Fellow) School of Informatics,<br />

University of Edinburgh<br />

Edinburgh, EH8 9LE, UK<br />

sherief.abdallah@buid.ac.ae<br />

Abstract. Learning in multiagent systems requires that agents change<br />

their behavior in an attempt to maximize their payoffs. This can result in<br />

the system having complex dynamics. Being able to visualize these complex<br />

dynamics is an important step toward understanding learning in<br />

multiagent systems. Previous work in this area either focused on small-scale<br />

theoretical analysis that is difficult to extend to large-scale networks,<br />

or used global performance metrics (such as the average payoff)<br />

as a rough approximation to the dynamics.<br />

In this paper we propose a new visualization methodology that combines<br />

network analysis with dimensionality reduction to visualize learning dynamics<br />

in large-scale networks of agents. First, the dynamics over the<br />

network are summarized using network measures, then we use dimensionality<br />

reduction to reduce the dimensions even further. We conduct a<br />

comparative study to investigate different network analysis measures and<br />

different dimensionality reduction techniques over different settings. The<br />

results confirm that using network analysis is beneficial for visualizing<br />

dynamics.<br />

1 Introduction<br />

Multiagent systems (MAS) are systems composed of multiple interacting intelligent<br />

agents, where an intelligent agent is a computational element that is capable<br />

of interacting with its environment (as well as with other agents). Learning in<br />

multiagent systems requires that agents change their behavior in an attempt to<br />

maximize their payoffs. This can result in the system having complex dynamics.<br />

Being able to visualize these complex dynamics is an important step toward<br />



understanding learning in multiagent systems. Previous work in this area either<br />

focused on small-scale theoretical analysis that is difficult to extend to large-scale<br />

networks, or used global performance metrics (such as the average payoff)<br />

as a rough approximation to the dynamics.<br />

Traditional techniques have relied heavily on the global performance metrics that<br />
the system is trying to optimize (such as the social welfare), or on other summarizing<br />
statistics of the local performance parameters (such as the average number<br />
of wins), to visualize the performance of multiagent systems. However,<br />
these techniques can overlook important information pertinent to the performance<br />
at the micro level, such as the malfunction of some agent (e.g., due to<br />
a disruption in its learning functionality). Moreover, experiments have shown<br />
that depending on the global performance alone can overlook hidden instability<br />
[1]. Another limitation of existing visualization techniques is that they<br />
are better suited to plotting the learning dynamics of few players<br />
(typically two) in games with a low-dimensional payoff matrix (typically 2×2<br />
or 3×3). A common way to compare the performance of different learning algorithms<br />
in higher-dimensional games is to list the results in a table, which<br />
is a non-visual technique. Examples of these existing techniques are surveyed<br />
in [10]. Figure 1 illustrates some of the traditional visualization techniques.<br />

The long-term goal of our research is to find a technique for visualizing the<br />
learning dynamics of adaptive agents in large-scale networks in a way that<br />
summarizes the interaction of agents over the network with as few<br />
parameters as possible, while remaining sensitive to the learning dynamics<br />
of individual agents. The work we present here provides the first step toward<br />
the above goal. We propose a visualization methodology that consists of two<br />
steps. In the first step we use network analysis measures to summarize agent<br />
interactions over the network into a smaller number of parameters that capture the<br />
network structure. In the second step we use dimensionality reduction to reduce<br />
the number of parameters to one or two dimensions that can be visualized. We<br />
summarize the contributions of this paper in the following points.<br />

1. A new visualization methodology that combines network analysis with dimensionality<br />

reduction.<br />

2. A comparative study of different combinations of dimensionality reduction<br />

techniques and social network measures.<br />

3. Extensive evaluation of our methodology to answer the following questions:<br />

RQ1. Can our visualization technique differentiate between different learning<br />

algorithms in a networked system?<br />

RQ2. Can our visualization technique capture disruptions in the learning of<br />
agents?<br />

RQ3. Can our visualization technique capture disruptions in the network<br />

structure?<br />



Fig. 1. Techniques used for visualizing the performance of learning algorithms in MAS.<br />
1) The policy trajectory plot is used to plot the performance of some learning algorithm<br />
for two players in a 2x2 game. Grayscales are used to indicate the direction of convergence.<br />
2) The simplex plot extends the trajectory plot to a 3x3 game matrix. 3) In<br />
the directional field plot a velocity field is computed to represent the derivative of the<br />
strategies of both players with respect to time. Velocities are displayed by arrows<br />
pointing in the direction of the derivative, and the length of the arrow indicates the absolute<br />
value of the derivative. 4) The cumulative reward plot indicates whether agents<br />
converge to the maximum social welfare profile in coordination (and anti-coordination)<br />
games of n players.<br />

2 Methodology<br />

We propose a two-steps visualization methodology: summarizing interactions<br />

over the network using network analysis measures then use dimensionality reduction<br />

to limit number of parameters to one or two. Dimensionality reduction<br />

is the process of reducing the number of features or parameters of a data set<br />

consisting of a large number of interrelated variables, called latent variables,<br />

while retaining as much as possible of the variation. The most common linear<br />

dimensionality reduction method is the Principal Component Analysis (PCA).<br />

PCA [8] works by transforming the original data set into a new set of uncorrelated<br />

variables (PCs) which are ordered so that the first few retain most of the<br />

variation in all of the original variables (PC1, PC2,...,PCn).<br />
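PCA can be sketched in a few lines via the eigendecomposition of the covariance matrix (our own illustrative numpy code, equivalent in spirit to what the MATLAB toolbox computes):<br />

```python
import numpy as np

def pca(X, n_components=2):
    """Project the rows of X onto the top principal components.

    X: (n_samples, n_features) data matrix.
    Returns (scores, components): scores are the low-dimensional
    coordinates, components the orthonormal PC directions.
    """
    Xc = X - X.mean(axis=0)                   # center each feature
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # sort PCs by explained variance
    components = eigvecs[:, order[:n_components]]
    return Xc @ components, components
```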

Network measures are functions that summarize a graph into numeric values<br />
to simplify the analysis of the network. A key network metric is the centrality<br />
of a node, which measures the extent to which a node is<br />
“central” in the network. A node is said to be more central than others if it has<br />
more ties, it can reach all others more quickly, or it controls the flow between<br />
the others. These three properties give rise to the three measures of node centrality:<br />
degree, closeness, and betweenness [3].<br />



These measures were generalized by later works to weighted networks [2,<br />
5, 7]. In [7] the number of ties, i.e., the degree, is also taken into consideration<br />
(besides the weights of the links) in computing weighted centrality. The relative<br />
importance of the number of links versus their weights can be tuned by a<br />
parameter α. Thus, the weighted degree combines both the degree and<br />
the strength of the node.<br />
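One common formulation of this tuned weighted degree, following [7], combines a node's degree k and strength s (sum of its tie weights) as k^(1−α) · s^α; a sketch of our own:<br />

```python
def weighted_degree(edges, alpha=0.5):
    """Tuned weighted degree centrality (degree-strength combination of [7]).

    edges: iterable of (u, v, weight) for an undirected weighted network.
    Returns {node: k**(1 - alpha) * s**alpha}, where k is the number of
    ties of the node and s the sum of its tie weights. alpha = 0 gives
    the plain degree, alpha = 1 the strength.
    """
    degree, strength = {}, {}
    for u, v, w in edges:
        for a in (u, v):
            degree[a] = degree.get(a, 0) + 1
            strength[a] = strength.get(a, 0.0) + w
    return {a: degree[a] ** (1 - alpha) * strength[a] ** alpha
            for a in degree}
```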

The remainder of this paper shows the experiments we conducted to test<br />
this visualization technique and discusses the results, before concluding by<br />
summarizing the achievements of this work.<br />

3 Experiments and Results<br />

In order to apply the techniques that make up our visualization<br />
method, we made use of several available tools:<br />
the MATLAB dimensionality reduction toolbox for applying different dimensionality<br />
reduction techniques [4], tnet [6] for computing the weighted network measures,<br />
and NetLogo [12], a multiagent programmable modeling environment, for building<br />
a platform for testing the proposed visualization technique on<br />
networks of adaptive agents.<br />

Before testing our visualization technique on networks of intelligent agents,<br />
we carried out several pilot experiments to compare and identify suitable combinations<br />
of a dimensionality reduction technique and social network measure(s). 3<br />
We concluded the following from the pilot experiments: a) PCA gives good<br />
results in the least computational time; b) using PCA with one or two target dimensions<br />
on a combination of social network measures that summarize the interactions<br />
in some network reveals patterns that could not be revealed using the raw<br />
network dynamics, i.e., without using network measures; and c) using weighted<br />
degree centrality (in-degree and/or out-degree) is sufficient to produce good<br />
visualization results.<br />

The remainder of this section shows how we used this visualization technique<br />

on networks of adaptive agents.<br />

3.1 Experimental Settings<br />

We test the proposed visualization technique on large networks with nodes representing<br />

agents that involve in playing some game and learn their actions. We<br />

pick the ”battle of the sexes” game and we implement two learning algorithms,<br />

namely the Q-learning [<strong>11</strong>] and the Infinitesimal Gradient Ascent (IGA) [9].<br />

The Battle of the Sexes is a sample of a coordination game. The game can<br />

be defined by the following payoff matrix:<br />

        I     F<br />
I     4,7   0,0<br />
F     3,3   7,4<br />

3 The details of the pilot experiments are described in a technical report.<br />



For the sake of testing the visualization technique on networks of adaptive<br />
agents, we developed a platform using NetLogo. In this platform, agents are<br />
situated in a network with a pre-specified number of nodes and average node<br />
degree. The network is weighted in a manner similar to weighted social networks,<br />
where weights typically represent the amount of communication between the<br />
corresponding nodes. In every time step each player has to learn two things:<br />
what action to play (one action among all players) and which neighbor(s) to<br />
play with. Learning the action uses the learning algorithm set by the<br />
user (Q-learning or IGA), whereas learning which neighbor to play with always uses<br />
Q-learning. The output of each run is the edge list corresponding to the evolving<br />
weighted network.<br />
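The per-step learning of each player can be sketched as follows (a simplified sketch with our own names, using the ε-greedy Q-learning parameters reported in the caption of Fig. 2; this is not the authors' NetLogo code):<br />

```python
import random

class QLearner:
    """Tabular epsilon-greedy Q-learning over a fixed set of choices."""

    def __init__(self, choices, alpha=0.1, gamma=0.9,
                 epsilon=1.0, decay=0.98, rng=None):
        self.q = {c: 0.0 for c in choices}
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.decay = epsilon, decay
        self.rng = rng or random.Random()

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))      # explore
        return max(self.q, key=self.q.get)            # exploit

    def update(self, choice, reward):
        # stateless update: target is reward + gamma * best estimated value
        best = max(self.q.values())
        self.q[choice] += self.alpha * (reward + self.gamma * best
                                        - self.q[choice])
        self.epsilon *= self.decay                    # decay exploration

# each agent would keep one learner for its action ('I'/'F') and, as
# described above, a second Q-learner for picking a neighbor to play with:
action_learner = QLearner(['I', 'F'])
```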

3.2 Results for Question 1: Can it distinguish different learning<br />
algorithms?<br />

We conducted different runs using both learning algorithms (Q-learning and IGA),<br />
and then applied the dimensionality-reduced network-metric visualization<br />
technique to these runs, as shown in Figure 2.<br />
We can clearly observe the distinction made by this visualization technique<br />
between the two learning algorithms.<br />

Fig. 2. Result of PCA (with target dimension 1) on weighted closeness, for weights corresponding to values of the iterator (the Q vector in Q-learning and the policy in IGA), in 10 different runs on a network of 5 nodes and average node degree 2. The game played is Battle of the Sexes. Parameters of the learning algorithms: Q-learning (ε starts at 1 and decays iteratively by a factor of 0.98, α = 0.1, γ = 0.9); IGA (η starts at 0.03 and decays iteratively by a factor of 0.99).

3.3 Results for Question 2: Can it capture disruptions in learning?

When learning is disrupted at some stage for some agent, that agent simply stops learning. This is simulated by having it play a random action in each forthcoming stage; it thus stops maintaining the values of its learning parameters. We examined different runs for adaptive agents situated in random networks of different sizes (number of nodes × average node degree), ranging from 5 × 3 to 100 × 4, where the learning of random node(s) is disrupted at different times after convergence. We then computed the degree centrality for the resulting edge list, on which we used PCA to reduce the dimensionality. We plot the results of the dimensionality reduction in 2D and mark the point of disruption in each run, as shown in Figure 3.
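The pipeline just described, compute a weighted degree centrality for every snapshot of the evolving network and then project the resulting high-dimensional vectors with PCA [8], can be sketched as below. We use Opsahl et al.'s generalized degree centrality k^(1−α)·s^α [7] with α = 0.5, as in Figures 3 and 4; the toy network and all names are our own illustration.

```python
import numpy as np

def weighted_degree(edges, n, alpha=0.5):
    """Opsahl-style out-degree centrality k**(1-alpha) * s**alpha
    for nodes 0..n-1, given (src, dst, weight) triples [7]."""
    k = np.zeros(n)  # number of outgoing links
    s = np.zeros(n)  # total outgoing weight (strength)
    for src, dst, w in edges:
        k[src] += 1
        s[src] += w
    return k ** (1 - alpha) * s ** alpha

def pca(snapshots, dims=2):
    """Project one centrality vector per time step onto `dims` axes."""
    X = np.asarray(snapshots, dtype=float)
    X -= X.mean(axis=0)                     # center each node's series
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:dims].T                  # principal-component scores

# toy evolving 5-node network: weights grow over 10 time steps
rng = np.random.default_rng(0)
base = [(i, (i + 1) % 5, 1.0) for i in range(5)]
snaps = [weighted_degree([(a, b, w + 0.1 * t + rng.random())
                          for a, b, w in base], n=5) for t in range(10)]
coords = pca(snaps, dims=2)
print(coords.shape)  # one 2-D point per time step
```

A disruption shows up in such a plot as a break in the otherwise smooth trajectory of the per-time-step points.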

Fig. 3. Results of PCA (2 dimensions) on weighted degree (in-degree and out-degree with α = 0.5) corresponding to weighted networks of different node counts and average link degrees. The learning of 20% of the nodes, chosen at random, is disrupted at different times after convergence. The learning algorithm is Q-learning.

3.4 Results for Question 3: Can it capture disruptions in network structure?

When a link is disrupted, i.e. disconnected, at some stage, it no longer takes part in games between neighbors and, hence, carries a zero payoff. As in the previous section, we plot the results of the weighted degree centrality of different runs, where in each run one or more random links are disconnected after convergence. Figure 4 illustrates these plots in 2D, where each plot corresponds to a different run and the point of disruption is circled in red.

3.5 Results for Question 4: Can it be used to identify the type and source of disruption?

If we can distinguish between different types of disruption and identify the source of a disruption, then our visualization technique can be used as a means of explanatory data analysis for multiagent systems.



Fig. 4. Results of PCA (2 dimensions) on weighted degree (in-degree and out-degree with α = 0.5) corresponding to different weighted networks. Random links are disrupted at different times after convergence. The learning algorithm is IGA.

As an experiment, we plotted over time the weighted node degree centrality values computed for the previous runs, and we could observe that the network metrics of disrupted nodes show different trends from those of undisrupted nodes (see, for example, Figure 5). However, in the experiments we conducted it was not clear how to distinguish between different types of disruption.

Fig. 5. Weighted degree centrality of six nodes plotted over time; three of them (in red) have their learning disrupted at time 20.

4 Conclusion and Future Work

In this research we aimed at finding a technique for visualizing the learning dynamics in large-scale networks of multiagent systems, one capable of summarizing the global performance of the whole system (the macro level) by as few parameters as possible, while remaining sensitive to the learning of individual agents (the micro level). To this end we proposed combining dimensionality reduction with weighted network metrics as a means of visualizing the performance of networks of adaptive agents. The results of the experiments have confirmed several claims made in this study.

Much can be done in future work to consolidate these findings and build on them. For example, we can extend the testing to other games (such as the Prisoner's Dilemma and Matching Pennies) and other learning algorithms. Besides trying different tunings and parameters, one important extension of this work is to test whether the technique can be used to explain the performance of the system, for example by unambiguously identifying the type and source of a disruption, should one occur.

References

1. S. Abdallah. Using graph analysis to study networks of adaptive agent. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10, pages 517–524, Richland, SC, 2010. International Foundation for Autonomous Agents and Multiagent Systems.
2. A. Barrat, M. Barthélemy, R. Pastor-Satorras, and A. Vespignani. The architecture of complex weighted networks. Proceedings of the National Academy of Sciences of the United States of America, 101(11):3747–3752, March 2004.
3. L. C. Freeman. Centrality in social networks: conceptual clarification. Social Networks, 1(3):215–239, 1978–1979.
4. MathWorks. Matlab Toolbox for Dimensionality Reduction. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html, November 2010.
5. M. E. J. Newman. Analysis of weighted networks. Physical Review E, 70:056131, 2004.
6. T. Opsahl. Structure and Evolution of Weighted Networks. PhD thesis, University of London, London, UK, 2009. pp. 104–122.
7. T. Opsahl, F. Agneessens, and J. Skvoretz. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32(3):245–251, July 2010.
8. K. Pearson. On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(6):559–572, 1901.
9. S. P. Singh, M. J. Kearns, and Y. Mansour. Nash convergence of gradient dynamics in general-sum games. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, UAI '00, pages 541–548, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
10. H. van den Herik, D. Hennes, M. Kaisers, K. Tuyls, and K. Verbeeck. Multi-agent learning dynamics: A survey. In Cooperative Information Agents XI, Lecture Notes in Computer Science, pages 36–56. Springer Berlin / Heidelberg, September 2007.
11. C. Watkins. Learning from Delayed Rewards. PhD thesis, University of Cambridge, England, 1989.
12. U. Wilensky. NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, 1999.



ACO algorithms for solving a new Fleet Assignment Problem

Javier Diego Martín 1, Ignacio Rubio Sanz 1, Miguel Ortega-Mier 1, and Álvaro García-Sánchez 1

1 Unidad Docente de Organización de la Producción, Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, Spain
jdiego@gesfor.es, ignacio.rubio.sanz@gmail.com, miguel.ortega.mier@upm.es, alvaro.garcia@upm.es

Abstract. The Fleet Assignment Problem (FAP), which transportation companies have to deal with, consists in deciding the fleet size and assigning a type of vehicle to a set of scheduled trips in order to minimize the total operational costs. In this paper we propose a new model for the FAP, referred to as the new flexible model for the fleet assignment problem. We have developed an Ant Colony Optimization (ACO) method, analyzed its performance, and compared it with Branch & Bound on a wide range of instances.

1 Introduction

In passenger transportation, how vehicles are assigned to trips is very important. Effective and efficient management of the vehicle fleet has a significant impact on a company's costs and thus heavily influences the profit achieved. The Fleet Assignment Problem (FAP) addresses this issue: the goal is to find the optimal fleet dimension in order to meet the transport demand. Although the FAP was initially developed to solve problems within the aeronautical industry, we propose a new model that may be applied in different transportation contexts.

Most of the problems presented in the literature are Mixed Integer Programming (MIP) models with several compromises, as can be found in [4], [1] or [5]. We have developed a new FAP model from those models. For large instances, solving times for the FAP model using Branch & Bound are too long; we therefore developed an ACO method. The FAP can be easily formulated on a graph, and ACO is a well-known constructive metaheuristic suitable in that context. We have checked the solution quality obtained and analysed the execution time of ACO in some particular cases.

The rest of the paper is organized as follows. In section 2 the problem is stated in detail. In section 3, we describe the ACO algorithms for addressing the

Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



problem. The results of the experiments, using both a general solver and the ACO framework, are given in section 4, along with the main conclusions.

2 Problem description

A trip is a journey from an origin to a destination with a unique code. For example, a trip could be a flight from JFK to LAX with code AA1203.

Each trip can be scheduled in one of several departure time windows. This means that trip AA1203 might depart on Monday between 9am and 11am, or on Monday between 1pm and 3pm. Here we have considered a uniqueness constraint, i.e., one and only one of the eligible time windows must be assigned. Additionally, a duty is a tuple defined by a trip and a time window. In the previous example, two duties (AA1203, 9–11am) and (AA1203, 1–3pm) exist, and one and only one must be active.

There may exist time-precedence relationships among trips. For example, it might be necessary to impose that a trip cannot depart later than a certain amount of time after another trip's departure. For this requirement to be met, a constraint set 'earlier than' was defined. Finally, two duties can be scheduled so that the time elapsed between their respective departure times must equal some particular value.

Given a set of trips, a set of windows, and the eligible duties, the problem consists in: 1) defining the departure time for every trip (which implies using a particular window) and 2) assigning a type of vehicle to each of them, so that the total cost is minimized.

A solution to the problem consists of a set of rotations over a predefined time horizon, where a rotation is the sequence of duties that a particular vehicle will perform. Moreover, rotations must be cyclic, meaning that the number of vehicles of every type at the beginning of the time horizon must be the same as at its end. For example, suppose that at the beginning of the time horizon there are three MD80s and two A320s at JFK; the rotations are cyclic if at the end of the time horizon there are again three MD80s and two A320s at JFK, and not cyclic otherwise.
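The cyclicity requirement above amounts to comparing two multisets of (location, vehicle type) pairs, one at the start and one at the end of the horizon. A minimal sketch, with data and names of our own invention based on the JFK example:

```python
from collections import Counter

def is_cyclic(start_fleet, end_fleet):
    """A set of rotations is cyclic iff, per (location, vehicle type) pair,
    the vehicle counts at the start and end of the horizon match."""
    return Counter(start_fleet) == Counter(end_fleet)

# the example from the text: three MD80s and two A320s at JFK
start = [("JFK", "MD80")] * 3 + [("JFK", "A320")] * 2
end_ok = [("JFK", "A320")] * 2 + [("JFK", "MD80")] * 3
end_bad = [("JFK", "MD80")] * 2 + [("LAX", "MD80")] + [("JFK", "A320")] * 2

print(is_cyclic(start, end_ok))   # True
print(is_cyclic(start, end_bad))  # False
```

Using a `Counter` makes the check independent of the order in which vehicles finish their rotations.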

3 ACO for solving the FAP

A graph associated with the problem must be defined (nodes and edges). Every node represents a duty assigned to a type of vehicle. There are as many nodes as assignment possibilities, so every node is associated with exactly one time window and one vehicle type. Every edge linking two nodes represents two possible subsequent duties performed by a particular type of vehicle. For an edge to exist linking nodes i and i + 1, the following conditions must be met:

– The vehicle that performs the duty referred to by node i + 1 is the same as that for node i.



whose goal is to adjust and propagate the time domains of the duties. The constraint propagation engine recalculates all the time windows (minimum and maximum departure times for each duty) and, if one of these values changes, we make sure the candidate node can be assigned without affecting the pre-existing duties of the rotation.
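The construction step shared by the ACO algorithms of this paper, an ant extending its rotation by probabilistically choosing the next feasible node from pheromone and heuristic information, follows the standard Ant System random-proportional rule [3]. The sketch below is our own illustration, not the authors' implementation; `feasible`, `tau` and `eta` are assumed inputs, and we note that the paper fixes α to 1 to avoid stagnation.

```python
import random

def select_node(feasible, tau, eta, alpha=1.0, beta=3.0):
    """AS random-proportional rule: pick node j with probability
    proportional to tau[j]**alpha * eta[j]**beta."""
    weights = [tau[j] ** alpha * eta[j] ** beta for j in feasible]
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for j, w in zip(feasible, weights):
        acc += w
        if r <= acc:
            return j
    return feasible[-1]  # numerical safety net

# toy example: three candidate duties with equal pheromone;
# node 2 is much cheaper (higher heuristic value eta = 1/cost)
random.seed(1)
tau = {0: 0.5, 1: 0.5, 2: 0.5}
eta = {0: 0.1, 1: 0.1, 2: 1.0}
picks = [select_node([0, 1, 2], tau, eta) for _ in range(1000)]
print(picks.count(2) / 1000)  # the cheap node dominates
```

In the full algorithm, the feasibility of each candidate node would first be checked by the constraint propagation engine described above.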

4 Experimental results

Although we have developed a new flexible model for the FAP, in our experiments we have only studied airline instances, where trip durations are not variable. In addition, we have established the minimum time aircraft must remain in airports, the cost of using an aircraft, and the costs of the duties, all set according to the customer's experience. We present the parameters used for the experimentation in Table 1.

Table 1. Vehicle parameters. In homogeneous instances, the vehicle used is the MD80.

Name  Minimal Scale (min)  Fixed Cost ($)  Cost/min
MD80  45                   21,000          35
A320  60                   32,000          53.3
B747  90                   66,000          110

The computational study was carried out on an Intel Core Duo, 1.83 GHz, 2 GB RAM computer. The MIP problems were modeled using AIMMS and solved with CPLEX v12.2. We created a new program that generates instances for the problem, since we did not find benchmark instances reflecting the reality modeled by the new FAP model. We studied instances with 50, 250, 500, 1000 and 2000 trips. The ACO parameters were defined following [3] and [2]; a summary is shown in Table 2. The termination condition is the number of iterations, fixed according to the instance size and the execution time of the algorithm: we considered 500 iterations appropriate for small instances (up to 500 trips) and 750 iterations for large ones (over 500 trips). We ran each algorithm 10 times. Finally, the ACO parameter α is always 1, regardless of the algorithm, in order to avoid a stagnation state.

The first analysis consists in ranking the algorithms by how far they deviate from the value of the optimal solution. For every algorithm, we calculated the average error over all instances and all ACO parameters. The results are shown in Table 3. The best algorithms for solving the FAP, regardless of the nature of the problem, are ASRank, MMAS and EAS.

The instance tables display the following information: N.Trips is the number of trips to be assigned, tLP is the execution time (in seconds) to obtain either the optimal solution of the instance or a feasible solution, β is an ACO parameter,



Table 2. ACO parameters: Algorithm is the ACO algorithm implemented, β is a parameter, ρ is the evaporation rate, τ0 is the initial pheromone value, na is the number of ants, SelectNode is the rule used for assigning nodes, and q0 is a constant value.

Algorithm       β        ρ     τ0            na         SelectNode (q0)
EAS             1, 3, 5  0.5   0.5           5, 10, 20  AS
ASRank          1, 3, 5  0.5   0.5           5, 10, 20  AS
MMAS            1, 3, 5  0.8   τmax = 10^20  5, 10, 20  AS
ACS             1, 2, 5  0.1   0.5           5, 10      ACS (0.9, 0.8)
HyperCube (HC)  1, 2, 5  0.5   0.5           5, 10      ACS (0.9)
HCMMAS          1, 2, 5  0.05  0.5           5, 10      ACS (0.9)

Table 3. Performance of the ACO algorithms solving the FAP

Algorithm  Average solution error (%)
EAS        10.74
ASRank     9.53
MMAS       10.05
ACS        14.76
HC         17.99
HCMMAS     11.63

Algorithm is the ACO algorithm used; x is the average of the best solutions over the 10 runs; (%) is the error of the ACO solution relative to the solution obtained with the MIP model; σx is its deviation; s_bs and s_ws are the best and the worst solutions found among the best solutions; it is the average iteration at which the best solution was found in each execution; and t is the average execution time of the algorithm in seconds.

For instances with homogeneous fleets and a single time window per trip, the results do not depend on the ACO algorithm used: ACO outperformed Branch and Bound, since we usually obtained the optimal solution with short execution times. Moreover, we noticed that the performance of the ACO algorithms improves when β > 1, regardless of the number of ants. For the other cases studied, although the experimentation was very extensive, we only show the most remarkable experimental results, in Tables 4, 5 and 6.

From the experimental results obtained, we can conclude:

– Low-medium size instances:

1. For heterogeneous fleet problems with single time windows per trip, ACO is more time consuming than Branch and Bound. Moreover, ACO does not attain the optimal solution, deviating on average 6% from the optimum.

2. For homogeneous fleet problems with multiple time windows, ACO is 80 times faster than Branch and Bound, with an approximate error of 10% with respect to the linear programming solutions. This makes ACO the first recommended method for solving the problem.



Table 4. Heterogeneous fleet and a single time window for each trip. When the number of trips is 2000, the number of ants is 5. The symbol ∗ means that, for this trip size and type of instance, Branch and Bound returns a feasible solution, not the optimal one. The symbol - indicates that Branch and Bound could not solve the instance.

N.Trips  tLP   β  Algorithm  x (%)            σx         s_bs      s_ws      it      t
50       1.42  3  EAS        634565 (0.00)    0          634565    634565    6.5     5.3
                  ASRank     635503 (0.15)    2812.5     634565    643940    4.8     5.3
                  MMAS       634565 (0.00)    0          634565    634565    46.6    5.2
               5  EAS        634565 (0.00)    0          634565    634565    7.5     5.4
                  ASRank     636440 (0.30)    3750       634565    643940    7.3     5.2
                  MMAS       635503 (0.15)    2812.5     634565    643940    26.2    5.3
250      3.35  3  EAS        3165630 (4.23)   23118.5    3135360   3211500   341.5   45.7
                  ASRank     3117160 (2.63)   24714.2    3087500   3164300   210.8   44.9
                  MMAS       3147020 (3.62)   47031.2    3092440   3271800   368     44.4
               5  EAS        3194600 (5.18)   21904.8    3166820   3238760   295.9   44.3
                  ASRank     3116290 (2.60)   20152      3099630   3171840   364     42.8
                  MMAS       3113990 (2.53)   16366      3078910   3142250   376.5   42.3
500      10.6  3  EAS        6396950 (9.37)   41116.3    6336570   6461500   377.8   224.7
                  ASRank     6218530 (6.32)   59074.7    6127820   6306470   441.1   217.8
                  MMAS       6260250 (7.04)   65356.8    6187330   6375870   534.6   224.1
               5  EAS        6411280 (9.61)   47359.2    6330340   6503270   572.3   223.3
                  ASRank     6246110 (6.79)   41420.8    6187990   6311440   448.8   215
                  MMAS       6201050 (6.02)   65617.7    6088480   6305690   436.8   213.5
1000     38    3  EAS        13117700 (8.71)  54935.5    13043400  13182200  552.5   662.6
                  ASRank     12760100 (5.75)  89145.5    12666700  12939500  486.9   655.4
                  MMAS       12953300 (7.35)  30180.2    12888800  12976900  467.7   723.8
               5  EAS        13103800 (8.60)  55938.8    13018500  13207400  351.3   665.3
                  ASRank     12782100 (5.93)  103109     12643000  1297400   613.7   728.2
                  MMAS       12703000 (5.27)  32569.3    12600500  12810800  558.7   716.8
2000∗    -     3  EAS        25913560         106155.8   25752500  26029400  456.6   1382.4
                  ASRank     25729980         40853.8    25673200  25770000  534.8   1278
                  MMAS       25579380         64266.7    25472100  25660100  440.4   1056.8
               5  EAS        25931160         156627.37  25695300  26116100  495     1337.4
                  ASRank     25767350         112500.7   25594900  25890800  471.75  1224.5
                  MMAS       24967760         56325.3    24896200  25032200  560.8   1488.2



Table 5. Homogeneous fleet and multiple time windows for each trip. The symbol ∗ means that, for this number of trips and type of instance, B&B returns a feasible solution, not the optimal one. If the % is negative, the solution obtained by ACO is that % better than the solution obtained with the MIP model.

N.Trips  tLP       β  Algorithm  x (%)           σx    s_bs  s_ws  it     t
50       2.17      3  EAS        8.2 (2.50)      0.60  8     9     18.9   5.3
                      ASRank     8.1 (1.25)      0.54  8     9     47.2   5
                      MMAS       8.2 (2.50)      0.40  8     9     51.3   6.2
                   5  EAS        8.7 (8.7)       0.64  8     10    15.7   5.6
                      ASRank     8.4 (5.00)      0.30  8     9     2.3    5.7
                      MMAS       8.2 (2.50)      0.75  8     9     22.6   6
250      4504.8    3  EAS        36.9 (11.82)    1.14  35    39    154    66.7
                      ASRank     35.6 (7.88)     0.92  34    37    80.6   65.4
                      MMAS       35.4 (7.27)     0.92  34    36    80.2   71
                   5  EAS        37.1 (12.42)    1.04  35    39    64.4   64.9
                      ASRank     36.1 (9.39)     1.04  34    37    43     63.6
                      MMAS       36.2 (9.70)     1.47  34    39    166.3  64.8
500∗     23631.41  3  EAS        74.2 (12.42)    1.33  72    77    144.2  231.4
                      ASRank     71.1 (7.73)     1.51  69    74    125.3  227.9
                      MMAS       71.3 (8.03)     1.62  69    75    146.3  231.1
                   5  EAS        74.8 (13.33)    1.60  71    77    146.7  232.9
                      ASRank     71.8 (8.79)     0.87  70    73    203.7  225.7
                      MMAS       71.6 (8.48)     1.50  69    74    182.6  220
1000∗    4522.21   3  EAS        143.1 (-85.68)  1.45  140   145   147.9  619
                      ASRank     137 (-86.29)    0.89  135   138   353.5  833.1
                      MMAS       138.7 (-86.12)  1.42  136   141   291.6  846.4
                   5  EAS        144 (-85.59)    2.10  142   149   239.2  634.7
                      ASRank     139.8 (-86.01)  1.89  135   142   162.5  838.5
                      MMAS       140 (-85.99)    3.10  135   145   197.4  811.6

3. Finally, for heterogeneous fleet problems with multiple time windows, ACO is 10 times faster than Branch and Bound, but the solution cost is 11% worse with ACO.

– Large size instances:
ACO should be the technique of choice, because it finds reasonably good solutions in acceptable execution times, whereas B&B usually could not attain any feasible solution, as the computer ran out of RAM.

Finally, we have noted that the ACO metaheuristic is very dependent on its parameters, especially the evaporation rate ρ, the β parameter, and the number of ants na; performance improves as the last two grow.

Acknowledgements

This work partly stems from the participation of the authors in a research project funded by the "Ministerio de Industria, Comercio y Turismo": Proyecto Avanza, reference TSI-020100-2009-534, titled OPTILOGI.



Table 6. Heterogeneous fleet and multiple time windows for each trip. The symbol ∗ means that, for this trip size and type of instance, the MIP model returns a feasible solution, not the optimal one. If the % is negative, the solution obtained by ACO is that % better than the solution obtained with the MIP model.

N.Trips  tLP     β  Algorithm  x (%)              σx       s_bs      s_ws      it     t
50       1.54    3  EAS        830515 (12.35)     23171.5  805660    877277    245    10.8
                    ASRank     816587 (10.47)     28662.9  775218    868343    238    9.4
                    MMAS       798368 (8.00)      21806.7  760218    843843    199.1  9.6
                 5  EAS        912806 (23.48)     22786.3  887259    948785    153.6  13.5
                    ASRank     875240 (18.40)     32651.7  807218    947402    183.8  10.8
                    MMAS       842967 (14.03)     18483.6  804652    877343    158.4  11.8
250      2116.7  3  EAS        3154080 (15.91)    40072    3097800   3250180   206.7  107.4
                    ASRank     3075030 (13.01)    41576.6  3009560   3150540   344.5  85.2
                    MMAS       3053370 (12.21)    44938.6  2973730   3106340   369.2  84.6
                 5  EAS        3160620 (16.15)    33075.5  3111770   3211120   291.1  107
                    ASRank     3063880 (12.60)    46758.8  3015690   3140410   330.1  83.7
                    MMAS       3015740 (10.83)    50937.8  2939180   3101220   281.2  81.1
500∗     5656.8  3  EAS        6609470 (15.56)    63928.2  6520420   6698540   318.1  534.4
                    ASRank     6400480 (11.90)    83969.8  6291810   6554270   484.8  527.2
                    MMAS       6407610 (12.03)    57190.3  6302740   6516430   418.1  535.1
                 5  EAS        6535640 (14.27)    83706.2  6386760   6720470   416.6  533.5
                    ASRank     6375490 (11.47)    45306.3  6313620   6477680   614.8  527.7
                    MMAS       6397730 (11.85)    69973.4  6262400   6486810   561.4  531.7
1000∗    9455.7  3  EAS        12616700 (-60.38)  45839.8  12555600  12677200  359.6  794.4
                    ASRank     12332500 (-61.28)  137824   12195300  12579700  234.2  863.8
                    MMAS       12412500 (-61.02)  136277   12244400  12632600  396    789.6
                 5  EAS        12511200 (-60.72)  106052   12330900  12629800  345.8  858
                    ASRank     12299100 (-61.38)  114649   12093300  12443900  506.2  1041.6
                    MMAS       12216300 (-61.64)  176431   11935100  12384000  448.8  797.4

References

1. N. Belanger, G. Desaulniers, F. Soumis, and J. Desrosiers. Periodic airline fleet assignment with time windows, spacing constraints, and time dependent revenues. European Journal of Operational Research, 175:1754–1766, 2006.
2. C. Blum and M. Dorigo. The hyper-cube framework for ant colony optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34:1161–1172, 2004.
3. M. Dorigo and T. Stützle. Ant Colony Optimization. MIT Press, 2004.
4. I. Ioachim, J. Desrosiers, F. Soumis, and N. Belanger. Fleet assignment and routing with schedule synchronization constraints. European Journal of Operational Research, 2:75–90, 1999.
5. H. D. Sherali, E. K. Bish, and X. Zhu. Airline fleet assignment concepts, models, and algorithms. European Journal of Operational Research, 172:1–30, 2006.



A New Guillotine Placement Heuristic for the Orthogonal Cutting Problem *

Slimane Abou Msabah 1 — Ahmed Riadh Baba-Ali 2

1 Department of Computer Science, University of Science and Technology Houari Boumedienne, USTHB, Bab Ezzouar, Algiers, Algeria, slmalg@yahoo.com
2 Department of Electronics, University of Science and Technology Houari Boumedienne, USTHB, Bab Ezzouar, Algiers, Algeria, riadhbabaali@yahoo.fr

Abstract. The orthogonal cutting problem consists in finding an optimal arrangement of n items on bins of identical dimensions. Several placement heuristics are used for this task, and the guillotine constraint further complicates the problem. In this article we are interested in the orthogonal cutting problem under the guillotine constraint. To this end, we propose a new placement heuristic inspired by the BLF routine, which places the items on levels so as to satisfy the guillotine constraint, while exploiting intra-level residues in two directions, vertically then horizontally. Our heuristic, named BLF2G, is combined with a guided genetic algorithm and compared with the other heuristics and metaheuristics found in the literature, on both generated test sets and known test sets.

KEYWORDS: orthogonal cutting, guillotine constraint, combinatorial optimization, heuristics, genetic algorithm.

1 Introduction

The cutting (or placement) problem is an optimization problem whose objective is to determine a suitable arrangement of various items inside wider ones. The main objective is to maximize the use of the raw material and thus minimize the losses. The orthogonal cutting problem draws its interest from the fact that it is applicable in several fields, such as the cutting of sheet steel, paper, fabrics, etc. This is important for mass-production industries, where the optimization of the material plays an important role in the cost of manufacturing.

In our work, we propose a new guillotine placement routine that aims at maximizing the exploitation of the raw material. The development of such a routine has to consider several parameters, such as the shape of the treated objects and the constraints imposed by the production system.

In our case, we consider an orthogonal cutting problem that treats a strip of fixed width and supposedly infinite height, from which items of rectangular shape are generated. The material can be a steel sheet, and the production machines are typically guillotine shears, which impose edge-to-edge cuts (the guillotine constraint). Items keep their original orientation, so that they can be cut from decorated or textured plates.

* Full paper can be found at: http://slmalg.unblog.fr/



The routine we propose derives from the BLF (Bottom Left Fill) routine; the layout of items is made in levels, to ensure the guillotine constraint. Two other mechanisms are set up to exploit intra-level residues, vertically and horizontally.
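To illustrate the level idea that such a routine builds on, here is a minimal shelf-style sketch of our own: items sorted by decreasing height are placed left to right on levels, so that a horizontal edge-to-edge cut separates consecutive levels and vertical cuts separate the items on a level, which satisfies the guillotine constraint. This reproduces only the basic level scheme, not the intra-level residue exploitation of BLF2G, and all names are ours.

```python
def shelf_guillotine(items, strip_width):
    """Place (w, h) rectangles on levels of a strip of fixed width.
    Returns (total_height, placements), each placement being (x, y, w, h).
    Levels guarantee that the layout is achievable with guillotine cuts."""
    placements = []
    y = 0            # bottom of the current level
    level_h = 0      # height of the tallest item on the level
    x = 0            # next free x position on the level
    for w, h in sorted(items, key=lambda it: it[1], reverse=True):
        if x + w > strip_width:   # item does not fit: open a new level
            y += level_h
            x, level_h = 0, 0
        placements.append((x, y, w, h))
        x += w
        level_h = max(level_h, h)
    return y + level_h, placements

# toy instance on a strip of width 10
height, layout = shelf_guillotine([(4, 5), (5, 4), (3, 4), (6, 2), (2, 2)], 10)
print(height, layout)
```

Sorting by decreasing height keeps the wasted area above the shorter items on each level small; BLF2G further recovers those intra-level residues.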

After this introduction to the problem, section 2 of this article lists the placement routines found in the literature. In section 3 we expose our routine and show its utility in exploiting residues, on test sets built to fit the bin. The results of our method, compared with the other heuristics on test sets found in the literature, are explained in section 4. We end the article with a conclusion, presented in section 5.

2 State of the art

In this section we investigate the placement heuristics found in the literature, in order to take advantage of existing methods and propose a routine that fits our case.

Baker, Coffman and Rivest present the BL (Bottom Left) heuristic, which places the items in the lowest, left-most position. They test several sequences of appearance of the items and find that sorting the list of items by decreasing width gives better results [1].

Jakobs uses a BL heuristic that first places the item in the highest position to the right, then slides it successively to the lowest, then the left-most, possible position. He then uses a genetic algorithm to find the sequence that gives the best result [9].

Liu and Teng perfected the BL heuristic by favouring the downward sliding of the item; the item slides from right to left only when no downward movement is possible [10].

The BLF routine tries to place items in the lowest, left-most position by exploiting the internal residues. Ramesh Babu and Ramesh Babu store in a list the points that form the lower-left corners of residues. For every item to be placed, an algorithm goes through the list, respecting the bottom-left order, and places the item in the first suitable position; the list is then updated according to the dimensions of the placed item [13].

Lodi et al. presented an approach named "Floor-Ceiling" (FC), which extends the<br />
way items are placed on levels: the FC approach places items from left to right<br />
at the bottom (floor) of a level and from right to left at its top (ceiling) [11].<br />
Lodi et al. also present a variant of the FC routine that satisfies the guillotine<br />
constraint by performing cuts from edge to edge.<br />

Ben Messaoud et al. present a new level-based placement heuristic (SHF) that<br />
applies the FC algorithm while re-injecting items placed on the ceiling below [2].<br />

Burke et al. present their Best-Fit heuristic, which introduces a notion of<br />
neighborhood and builds the cutting pattern incrementally. The list of items is<br />
first sorted by decreasing height. They try to fill the lowest residue with the<br />
best-fitting item; if no item fits, that space is marked as irrecoverable scrap<br />
up to the next level, and so on, like a bricklayer building a wall. Each item is<br />
shifted to the left or to the right according to the height of the neighboring<br />
items. They also introduce an orientation mechanism for long items to reduce<br />
losses [4].<br />

3 Our contribution<br />

The placement policy is the crucial point of the laying-out process, since it<br />
determines how well the residues are exploited. Placement heuristics can be<br />
divided into two categories:<br />



a) Direct heuristics, which place the current item directly in the first suitable<br />
position according to the applied policy; examples are Finite First Fit, Finite<br />
Next Fit, Bottom Left, and Bottom Left Fill [3]. b) Selective heuristics, which<br />
choose a suitable position among several candidates according to their laying-out<br />
policy; examples are Finite Best Fit and Finite Worst Fit.<br />

Heuristics of the first category lay items out quickly, and the result depends<br />
entirely on the order of appearance of the items; they are therefore well suited<br />
to being combined with stochastic algorithms and metaheuristics. The second<br />
category requires additional time but generally gives better results than the<br />
first.<br />
Our placement routine places each item directly in the first suitable position,<br />
according to a laying-out policy that satisfies the guillotine constraint; it<br />
therefore belongs to the first category.<br />

3.1 Our Placement policy<br />

Using an adequate placement policy gives better results, and the guillotine<br />
constraint makes the problem harder. To find a placement policy well suited to<br />
this setting, we took advantage of the placement methods existing in the<br />
literature. Level-based routines are the best adapted to our scenario: the<br />
edge-to-edge cut required by guillotine shears fits naturally with a layout in<br />
levels.<br />
An item can be laid out on the strip in one of three ways:<br />

Placement in levels. The strip is structured in levels; every level is<br />
characterized by a height and an available width. If the width of the item is at<br />
most the available width, the item is placed and the available width is updated;<br />
if the height of the item exceeds the height of the level, the height of the<br />
level is redefined as the height of the item. We name this phase BLG, since it<br />
applies the Bottom Left placement routine to levels so as to satisfy the<br />
Guillotine constraint. Applying this routine creates BLG sub-levels (Fig. 1).<br />

Fig. 1. The layout on the levels.<br />

Placement in a BLG sub-level. A BLG sub-level is characterized by a width, equal<br />
to the width of the item placed below it, and by an available height. Items are<br />
stacked vertically on these residues. If the width of the item is at most the<br />
width of the BLG sub-level and the height of the item is at most the available<br />
height, the item is placed in the BLG sub-level and the available height is<br />
updated. This phase is named BLFG (Bottom Left Fill Guillotine). Applying this<br />
routine creates BLFG sub-levels (Fig. 2).<br />

Fig. 2. The layout on BLG sub-levels.<br />

Placement in a BLFG sub-level. A BLFG sub-level is characterized by a height and<br />
an available width. If the height of the current item is at most the height of<br />
the sub-level and the width of the current item is at most the available width,<br />
the item is placed in the BLFG sub-level and the available width is updated.<br />
This phase is named BLF2G (Bottom Left Fill 2 (second exploitation of residues)<br />
Guillotine) (Fig. 3).<br />

Fig. 3. The layout on BLFG sub-levels.<br />

After this stage the residues are too small to be exploited; however, given a<br />
data set with smaller items, the exploitation could continue with BLF3G, BLF4G,<br />
and so on.<br />
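To make the three phases concrete, here is a small Python sketch of the policy. This is our own illustration, not the authors' implementation: all class and method names are ours, and residues are sized once at placement time, a simplification of the scheme described above.

```python
class Item:
    def __init__(self, w, h):
        self.w, self.h = w, h

class BLFGSubLevel:
    """Horizontal residue (BLF2G phase): fixed height, shrinking width."""
    def __init__(self, height, width):
        self.height, self.avail_w = height, width

    def try_place(self, item):
        if item.h <= self.height and item.w <= self.avail_w:
            self.avail_w -= item.w
            return True
        return False

class BLGSubLevel:
    """Vertical residue above a placed item (BLFG phase)."""
    def __init__(self, width, height):
        self.width, self.avail_h = width, height
        self.blfg = []  # horizontal residues created inside this sub-level

    def try_place(self, item):
        if item.w <= self.width and item.h <= self.avail_h:
            # the space left beside the item becomes a BLFG sub-level
            self.blfg.append(BLFGSubLevel(item.h, self.width - item.w))
            self.avail_h -= item.h
            return True
        return any(s.try_place(item) for s in self.blfg)

class Level:
    """A level of the strip (BLG phase)."""
    def __init__(self, strip_width):
        self.height, self.avail_w = 0, strip_width
        self.blg = []  # vertical residues above placed items

    def try_place(self, item):
        if item.w <= self.avail_w:
            self.avail_w -= item.w
            self.height = max(self.height, item.h)
            # the space above the item, up to the level height, is a BLG sub-level
            self.blg.append(BLGSubLevel(item.w, self.height - item.h))
            return True
        return any(s.try_place(item) for s in self.blg)
```

A level first tries its own floor (BLG), then the vertical residues above placed items (BLFG), then the horizontal residues inside those (BLF2G), which mirrors the three cases above.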

3.2 The placement algorithm<br />

Read the dimensions of the plates<br />
Load the list of items<br />
For all items<br />
&nbsp;&nbsp;For all levels<br />
&nbsp;&nbsp;&nbsp;&nbsp;If the current item can be placed in the level<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;then place the item; update the level; break<br />
&nbsp;&nbsp;&nbsp;&nbsp;End if<br />
&nbsp;&nbsp;&nbsp;&nbsp;For all BLG sub-levels<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;If the current item can be placed in the BLG sub-level<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;then place the item; update the BLG sub-level; break<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;End if<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;For all BLFG sub-levels<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;If the current item can be placed in the BLFG sub-level<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;then place the item; update the BLFG sub-level; break<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;End if<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Pass to the following BLFG sub-level<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;End for<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Pass to the following BLG sub-level<br />
&nbsp;&nbsp;&nbsp;&nbsp;End for<br />
&nbsp;&nbsp;&nbsp;&nbsp;Pass to the following level<br />
&nbsp;&nbsp;End for<br />
&nbsp;&nbsp;If the item is not placed<br />
&nbsp;&nbsp;&nbsp;&nbsp;then place the item in a new level; update the new level<br />
&nbsp;&nbsp;End if<br />
&nbsp;&nbsp;Pass to the following item<br />
End for<br />
End<br />

3.3 The genetic algorithm<br />

To show the power of our BLF2G placement policy, we combine it with a classic<br />
GA. We use a real codification [7] in which the chromosome is defined by the<br />
order of the items. The order in which the items appear in the laying-out<br />
process, according to our BLF2G policy, determines the quality of each<br />
individual. We implemented a genetic algorithm with a population size of 100 and<br />
fixed the number of generations at 20 times the number of items. Initially we<br />
generate a random population, with a random item ordering in each individual. At<br />
each generation, our BLF2G policy gives the quality of each individual. The<br />
genetic operators are defined as follows:<br />
The crossover operator. We use the partially matched crossover with one cut<br />
point (PMX1), followed by a correction that makes the children valid: child 1 is<br />
corrected by replacing its duplicated genes with its missing genes, in their<br />
order of appearance in parent 2, and child 2 by replacing its duplicated genes<br />
with its missing genes, in their order of appearance in parent 1.<br />

Parent 1: 123|456&nbsp;&nbsp;--PMX1-->&nbsp;&nbsp;123321&nbsp;&nbsp;--correction-->&nbsp;&nbsp;Child 1: 123654<br />
Parent 2: 654|321&nbsp;&nbsp;--PMX1-->&nbsp;&nbsp;654456&nbsp;&nbsp;--correction-->&nbsp;&nbsp;Child 2: 654123<br />

The mutation operator. It is a permutation of the genes at two randomly chosen<br />
sites (here positions 2 and 6): Child: 1|2365|4 becomes 143652.<br />
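The crossover-with-correction and the mutation can be sketched in Python as follows (our own illustration, not the authors' code; all function names are ours). `pmx1` swaps the tails after the cut point, then repairs each child by replacing duplicated genes with the missing genes taken in their order of appearance in the other parent:

```python
def repair(child, other):
    """Replace duplicated genes by the missing genes, in their order of
    appearance in the other parent (the correction step described above)."""
    missing = iter(g for g in other if g not in child)
    seen, out = set(), []
    for g in child:
        if g in seen:
            g = next(missing)
        seen.add(g)
        out.append(g)
    return out

def pmx1(p1, p2, cut):
    """Partially matched crossover with one cut point, then correction."""
    c1 = repair(p1[:cut] + p2[cut:], p2)
    c2 = repair(p2[:cut] + p1[cut:], p1)
    return c1, c2

def mutate(chrom, i, j):
    """Swap the genes at two (normally randomly chosen) sites."""
    chrom = chrom[:]
    chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom
```

On the example above, `pmx1([1,2,3,4,5,6], [6,5,4,3,2,1], 3)` reproduces the children 123654 and 654123, and swapping positions 2 and 6 of 123654 gives 143652.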

4 Experimental results<br />

To estimate the performance of our heuristic, we built test sets that admit a<br />
maximal exploitation of the material (0% scrap) under the BLF2G policy.<br />

Name                        # of items   Plate dimensions   Optimal height<br />
Msa17a, Msa17b, Msa17c      17           200 x 200          200<br />
Msa35a, Msa35b, Msa35c      35           200 x 200          200<br />
Msa75a, Msa75b, Msa75c      75           200 x 200          200<br />
Msa150a, Msa150b, Msa150c   150          200 x 200          200<br />
Table 1. Our generated test sets.<br />

To evaluate our heuristic we combined it with a genetic algorithm. The results<br />
obtained are compared with those of our BLF2G policy applied to the list of<br />
items sorted by Decreasing Heights (DH).<br />

          Msa17           Msa35           Msa75           Msa150<br />
          a    b    c     a    b    c     a    b    c     a    b    c<br />
BLF2G+DH  240  245  263   220  225  229   214  210  210   205  205  218<br />
BLF2G+GA  200  200  200   220  215  219   215  210  218   205  205  219<br />
Table 2. Results of the BLF2G+DH and BLF2G+GA routines.<br />

We notice that the GA combined with our BLF2G placement policy reaches the<br />
optimum on the 17-item (small) test sets and gives results comparable to the<br />
BLF2G+DH heuristic on the medium-size test sets; on the large test sets,<br />
however, the BLF2G+DH heuristic is better.<br />

4.1 Improvement<br />

The GA loses to the DH heuristic on the large test sets. To remedy this, we<br />
inject the individual sorted according to the DH policy into the initial<br />
population of the evolutionary process; we name the result GAguided. The<br />
following table shows the results.<br />



                Msa17           Msa35           Msa75           Msa150<br />
                a    b    c     a    b    c     a    b    c     a    b    c<br />
BLF2G+DH        240  245  263   220  225  229   214  210  210   205  205  218<br />
BLF2G+GA        200  200  200   220  215  219   215  210  218   205  205  219<br />
BLF2G+GAguided  200  200  200   215  210  219   207  205  210   205  205  212<br />
Table 3. Results of the BLF2G+DH, BLF2G+GA, and BLF2G+GAguided routines.<br />

With this improvement, GAguided keeps its advantage over the DH heuristic and<br />
gives results equal to, or even better than, the DH heuristic on test sets of<br />
every size.<br />

4.2 Test sets found in the literature<br />

To evaluate our method further, we use the test sets of Hopper and Turton<br />
(2001) [8] and Burke et al. (2004) [4], which are the most widely used:<br />

Test set               Name          # of items   Optimal height   Box size<br />
Hopper & Turton 2001   C1:P1,P2,P3   16/17        20               20x20<br />
                       C2:P1,P2,P3   25           15               40x15<br />
                       C3:P1,P2,P3   28/29        30               60x30<br />
                       C4:P1,P2,P3   49           60               60x60<br />
                       C5:P1,P2,P3   73           90               60x90<br />
                       C6:P1,P2,P3   97           120              80x120<br />
                       C7:P1,P2,P3   196/197      240              160x240<br />
Burke et al. 2004      N1            10           40               40x40<br />
                       N2            20           50               30x50<br />
                       N3            30           50               30x50<br />
                       N4            40           80               80x80<br />
                       N5            50           100              100x100<br />
                       N6            60           100              50x100<br />
                       N7            70           100              80x100<br />
                       N8            80           80               100x80<br />
                       N9            100          150              50x150<br />
                       N10           200          150              70x200<br />
                       N11           300          150              70x200<br />
                       N12           500          300              100x300<br />
                       N13           3152         960              640x960<br />
Table 4. Test sets found in the literature.<br />

The following table presents the results obtained by our GAguided+BLF2G policy<br />
compared with the GA+BLF, SA+BLF, and New Best-Fit results reported in Burke et<br />
al. (2004).<br />

Name   GA+BLF   SA+BLF   New Best-Fit   GAguided+BLF2G<br />
C1P1   20       20       21             20<br />
C1P2   21       21       22             22<br />
C1P3   20       20       24             21<br />
C2P1   16       16       16             16<br />
C2P2   16       16       16             16<br />
C2P3   16       16       16             15<br />
C3P1   32       32       32             32<br />
C3P2   32       32       34             33<br />
C3P3   32       32       33             31<br />
C4P1   64       64       63             66<br />
C4P2   63       64       62             65<br />
C4P3   62       63       62             62<br />
C5P1   95       94       93             95<br />
C5P2   95       95       92             97<br />
C5P3   95       95       92             95<br />
C6P1   127      127      123            125<br />
C6P2   126      126      122            128<br />
C6P3   126      126      124            126<br />
C7P1   255      255      247            250<br />
C7P2   251      253      244            248<br />
C7P3   254      255      245            249<br />
N1     40       40       45             40<br />
N2     51       52       53             50<br />
N3     52       52       52             53<br />
N4     83       83       83             87<br />
N5     106      106      105            105<br />
N6     103      103      103            106<br />
N7     106      106      107            116<br />
N8     85       85       84             85<br />
N9     155      155      152            153<br />
N10    154      154      152            154<br />
N11    155      155      152            153<br />
N12    313      312      306            309<br />
N13    -        -        964            Out of service<br />
Table 5. Comparison of the GAguided+BLF2G heuristic with the GA+BLF, SA+BLF,<br />
and New Best-Fit heuristics.<br />

We notice that on the small test sets the GA+BLF and SA+BLF methods give better<br />
results than the New Best-Fit heuristic. Our method gives comparable, and<br />
sometimes better, results than the other methods, which is explained by its care<br />
for the guillotine constraint. On the test sets C2P3 and N2 our method reaches<br />
the optimum, which the other methods fail to reach even though they are free of<br />
the guillotine constraint; this shows the strength of our method in exploiting<br />
the material.<br />

On the medium test sets our method holds its position with regard to the other<br />
methods. On the large test sets our method scores better than GA+BLF and<br />
SA+BLF, which confirms that the adopted laying-out policy is sound; however,<br />
the New Best-Fit heuristic scores better than our method there, which confirms<br />
the weakness of the GA on the large test sets.<br />

The following graph compares the four methods by their percentage over the<br />
optimal height.<br />

Fig. 4. Comparison of the GAguided+BLF2G heuristic with the GA+BLF, SA+BLF, and<br />
New Best-Fit heuristics (% over optimal).<br />

According to the graph, the New Best-Fit heuristic starts badly on the small<br />
test sets, but on the medium and large test sets it takes over and shows better<br />
results. The GA+BLF and SA+BLF heuristics give comparable results, with a<br />
slight edge to the genetic algorithm, whatever the size of the problem.<br />
On the small test sets our method gives results comparable to, and sometimes<br />
better than, the GA+BLF and SA+BLF heuristics, though it performs poorly on<br />
C1P2 and C1P3. On the medium test sets our method diverges from the other<br />
heuristics, with some comparable results. On the large test sets our method<br />
gives better results than GA+BLF and SA+BLF, but it stays below the New<br />
Best-Fit heuristic.<br />

5 Conclusion<br />

Our contribution to the problem of rectangular cutting has shown its<br />
efficiency. First, we developed a routine, named BLF2G, that is powerful in<br />
exploiting residues while taking into account the constraint of cutting from<br />
edge to edge. Second, we guided the genetic algorithm with the greedy DH<br />
heuristic by introducing the DH individual into the initial population.<br />
Using GAguided combined with our BLF2G routine allowed us to reach the optimum<br />
on the small Msa17 test sets (a, b, and c). On the medium and large test sets,<br />
however, the GA failed to explore the search space well enough to find the item<br />
sequence giving the optimal solution.<br />
The comparisons with methods that do not take the guillotine constraint into<br />
account, on test sets built to admit the optimum without that constraint, are<br />
very encouraging. Our method repeatedly reaches the optimum, especially on C2P3<br />
and N2, a result the other methods do not achieve; this supports the soundness<br />
of our placement heuristic in exploiting residues.<br />

The guided GA combined with our BLF2G heuristic allowed us to show the<br />
qualities of our placement method on the small and medium test sets. On the<br />
large test sets, our BLF2G+GAguided method was out of service on the test set<br />
N13, whose size is extreme; almost the same happened on the test sets MT01, …,<br />
MT10 of Burke et al. [5].<br />
Our BLF2G routine depends entirely on the GA to find the optimal order of the<br />
items. In the near future we intend to give our BLF2G heuristic more<br />
intelligence by applying new placement policies of the Best-Fit type, so as to<br />
escape the shortcomings encountered by the GA.<br />

References<br />

1. B. S. Baker, E. G. Coffman, and R. L. Rivest, “Orthogonal packings in two dimensions,”<br />
SIAM Journal on Computing, vol. 9, no. 4, pp. 846-855, 1980.<br />
2. S. Ben Messaoud, C. Chu, and M. L. Espinouse, “An approach to solve cutting stock<br />
sheets,” IEEE International Conference on Systems, Man and Cybernetics, pp. 5109-5113, 2004.<br />
3. J. O. Berkey and P. Y. Wang, “Two-dimensional finite bin packing algorithms,” Journal<br />
of the Operational Research Society, vol. 38, pp. 423-429, 1987.<br />
4. E. K. Burke, G. Kendall, and G. Whitwell, “A new placement heuristic for the orthogonal<br />
stock cutting problem,” Operations Research, vol. 52, no. 4, pp. 655-671, 2004.<br />
5. E. K. Burke, G. Kendall, and G. Whitwell, “A simulated annealing enhancement of the<br />
best-fit heuristic for the orthogonal stock-cutting problem,” INFORMS Journal on<br />
Computing, vol. 21, no. 3, pp. 505-516, 2009.<br />
6. D. E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning,”<br />
Reading, MA: Addison-Wesley, 1989.<br />
7. E. Hopper and B. Turton, “A genetic algorithm for a 2D industrial packing problem,”<br />
Computers and Industrial Engineering, vol. 37, no. 1-2, pp. 375-378, 1999.<br />
8. E. Hopper and B. Turton, “An empirical investigation of metaheuristic and heuristic<br />
algorithms for a 2D packing problem,” European Journal of Operational Research,<br />
vol. 128, pp. 34-57, 2001.<br />
9. S. Jakobs, “On genetic algorithms for the packing of polygons,” European Journal of<br />
Operational Research, vol. 88, pp. 165-181, 1996.<br />
10. D. Liu and H. Teng, “An improved BL-algorithm for genetic algorithm of the orthogonal<br />
packing of rectangles,” European Journal of Operational Research, vol. 112, pp. 413-420, 1999.<br />
11. A. Lodi, S. Martello, and D. Vigo, “Heuristic and metaheuristic approaches for a class of<br />
two-dimensional bin packing problems,” INFORMS Journal on Computing, vol. 11,<br />
pp. 345-357, 1999.<br />
12. Z. Michalewicz, “Genetic Algorithms + Data Structures = Evolution Programs,” 3rd<br />
revised and extended edition, Springer, 1996.<br />
13. A. Ramesh Babu and N. Ramesh Babu, “Effective nesting of rectangular parts in multiple<br />
rectangular sheets using genetic and heuristic algorithms,” International Journal of<br />
Production Research, vol. 37, no. 7, pp. 1625-1643, 1999.<br />



Solving Distributed FCSPs with Naming Games ⋆<br />

Stefano Bistarelli 1,2, Giorgio Gosti 3, and Francesco Santini 1<br />

1 Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Italy<br />

[bista,francesco.santini]@dipmat.unipg.it<br />

2 Istituto di Informatica e Telematica (IIT-CNR), Pisa, Italy<br />

stefano.bistarelli@iit.cnr.it<br />

3 Institute for Mathematical Behavioral Sciences, University Of California, Irvine, USA<br />

ggosti@uci.edu<br />

Abstract. Constraint Satisfaction Problems (CSPs) are the formalization of a<br />
large range of problems that emerge from computer science. The solving<br />
methodology described here is based on the Naming Game (NG). The NG was<br />
introduced to represent N agents that have to bootstrap an agreement on a name<br />
(i.e. a word) to give to an object. In this paper we focus on solving<br />
Distributed FCSPs with an algorithm for NGs: each word on which the agents have<br />
to agree is associated with a preference represented as a fuzzy score. The<br />
solution is the agreed word associated with the highest preference value. The<br />
two main features that distinguish this methodology from other DisFCSP solving<br />
methods are that the system can react to small changes in the instance and that<br />
it does not require a pre-agreed agent/variable ordering.<br />

1 Introduction<br />

This paper presents a distributed method to solve Distributed Fuzzy Constraint<br />
Satisfaction Problems (DisFCSPs) [11,15,8,9,14] that comes from a<br />
generalization of the Naming Game (NG) model [12,1,10,7]. DisFCSPs can be<br />
applied to deal with resource allocation, collaborative scheduling, and<br />
distributed negotiation [8].<br />
In DisFCSP protocols, the aim is to design a distributed architecture of<br />
processors, or more generally a group of agents, who cooperate to solve a fuzzy<br />
CSP instance. In this framework, we see the problem as a dynamic system and we<br />
select the stable states of the system as the solutions to our CSP. To do this,<br />
we design each agent so that it will move towards a stable local state. This<br />
system may be called “self-stabilizing” whenever the global stable state is<br />
obtained through the reinforcement of the local stable states [6]. When the<br />
system finds the stable state, the DisFCSP instance is solved. A protocol<br />
designed in this way is resistant to damage and external threats because it can<br />
react to changes in the problem instance. Moreover, in our approach all agents<br />
have an equal chance to reveal private information.<br />

⋆ Research partially supported by the MIUR PRIN 20089M932N: “Innovative and<br />
multidisciplinary approaches for constraint and preference reasoning”, by the<br />
CCOS FLOSS project “Software open source per la gestione dell’epigrafia dei<br />
corpus di lingue antiche”, and by the INDAM GNCS project “Fairness, Equità e<br />
Linguaggi”.<br />



The NGs describe a set of problems in which a number of agents bootstrap a<br />
commonly agreed name for one or more objects. In this paper we discuss an NG<br />
generalization in which agents have individual fuzzy preferences over words.<br />
This is a natural generalization of the NG, because it models the agents’<br />
endogenous preferences and attitudes towards a certain object-naming system.<br />
Moreover, we add binary fuzzy constraints that represent exogenous causes<br />
affecting the agents’ preferences. As shown in [3,4], an NG can be viewed as a<br />
particular crisp CSP instance. But if we add preference levels and constraints,<br />
the NG is no longer a crisp combinatorial problem: this new game may be<br />
interpreted as an optimization problem.<br />
This paper extends the results of [3,4], in which non-fuzzy DCSPs are solved<br />
with NGs. The paper is organized as follows: Section 2 presents the background<br />
on NGs; Section 3 presents the algorithm for solving DisFCSPs; Section 4<br />
presents the tests and results for the fuzzy NG algorithm; Section 5 summarizes<br />
the related work; and Section 6 reports the conclusions and ideas about future<br />
work.<br />

2 Background on Naming Games<br />

The NGs [12,1,10,7] describe a set of problems in which a number of agents<br />
bootstrap a commonly agreed name for one or more objects. The game is played by<br />
a population of N agents which engage in pairwise interactions in order to<br />
negotiate conventions, i.e. associations between forms and meanings, and it is<br />
able to describe the emergence of a global consensus among them. For the sake<br />
of simplicity, the model does not take into account the possibility of<br />
homonyms, so that all meanings are independent and one can work with only one<br />
of them without loss of generality. An example of such a game is that of a<br />
population that has to reach consensus on the name (i.e. the form) to assign to<br />
an object (i.e. the meaning) exploiting only local interactions. However, as<br />
will become clear, the model is appropriate to address all those situations in<br />
which negotiation rules a decision process (e.g. opinion dynamics) [1].<br />

Each NG is defined by an interaction protocol. There are two important aspects<br />
of the NG: the agents interact randomly and use a simple set of rules to update<br />
their state; and, by using a distributed social strategy, the agents converge<br />
to a consistent state in which the object is assigned a unique name.<br />
Generally, at each turn two agents are randomly extracted to perform the roles<br />
of the speaker and the listener (or hearer, as used in [12,1]). The interaction<br />
between the speaker and the listener determines how the agents update their<br />
internal states. DCSPs and NGs share a variety of common features [3,4].<br />

The definition of a self-stabilizing algorithm in distributed computing was<br />
first introduced by [6]. A system is self-stabilizing whenever each system<br />
configuration associated with a solution is an absorbing state (a globally<br />
stable state), and any initial state of the system is in the basin of<br />
attraction of at least one solution. In a self-stabilizing algorithm, we<br />
program the agents of our distributed system to interact with their neighbors.<br />
The agents update their state through these interactions, trying to find a<br />
stable state in their neighborhood. Since the algorithm is distributed, many<br />
legal configurations of the agents’ states and their neighbors’ states start<br />
arising sparsely. Not all of these configurations are mutually compatible, and<br />
so they form mutually inconsistent potential cliques. The self-stabilizing<br />
algorithm must find a way to make the globally legal state emerge from the<br />
competition between these potential cliques. Dijkstra [6] and Collin [5]<br />
suggest that an algorithm designed in this way cannot always converge, and a<br />
special agent is needed to break the system symmetry. More precisely, Dijkstra<br />
[6] and Collin [5] show that we cannot guarantee that a system of uniform<br />
finite-state machines can always solve the ring-ordering problem. However, in<br />
[4] the authors show that a naming-game-based algorithm with homogeneous agents<br />
can find the ring-ordering solution with probability 1.<br />

3 Solving DisFCSPs with Naming Games<br />

As in [15], we assign to each variable xi ∈ X of the DisFCSP P = ⟨X, D, C, A⟩<br />
an agent ai ∈ A. We assume that each agent knows all the constraints that act<br />
over its X variables [15]. Each agent i = 1, 2, …, N (where |A| = N) searches<br />
its own variable domain di ∈ D for the variable assignment that optimizes P.<br />
The degree of satisfaction of a fuzzy constraint tells us to what extent it is<br />
satisfied. Otherwise stated, the goal of the game is to make the agents find an<br />
assignment of their variables that maximizes the overall fuzzy score of the<br />
problem; the fuzzy preferences of the constraints are combined with the min<br />
function.<br />
We restrict ourselves to unary and binary constraints. Each agent has a unary<br />
constraint ci with support defined over its variable xi ∈ X; this unary<br />
constraint represents the local preference of the agent for each variable<br />
assignment di ∈ D. A binary constraint ci,j returns a preference value<br />
p ∈ [0, 1] which states the combined preference over the assignments of xi and<br />
xj together. η[ai := b] is the set of all possible assignments of the variables<br />
in X such that variable ai is assigned b. c_ai η[ai := b] represents the<br />
preference level of agent ai for assignment b, and c_ai,aj η[ai := b, aj := d]<br />
represents the combined preference level of agents ai and aj for the respective<br />
assignments b and d. In the following, we use the symbol ⊗ to directly perform<br />
the composition of fuzzy constraints, and c{s} to denote the set of constraints<br />
that act over s. Thus, max_{b∈Ds}(⊗ c{s} η[s := b]) defines the best fuzzy<br />
level that an agent can take, given its knowledge of the surrounding<br />
constraints and its assignment b. Respectively, top is the set of domain<br />
assignments with the maximum fuzzy value of ⊗ c{s} η[s := b], i.e.<br />
top = {b ∈ Ds | b = argmax_{b∈Ds}(⊗ c{s} η[s := b])}. We may say that the<br />
communication network is determined by the network of binary constraints, since<br />
we suppose an agent ai ∈ A can communicate only with the agents aj ∈ A sharing<br />
a binary constraint, i.e. ci,j ∈ C.<br />
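As a concrete reading of these definitions, the following Python sketch (ours, not the authors' code) computes the min-combination over the constraints acting on one agent and the resulting top set; representing constraints as dictionaries is an assumption made for illustration.

```python
def combined_pref(b, unary, binaries, neighbor_vals):
    """min-combination of the constraints acting over this agent, for its
    candidate value b, given the neighbors' current assignments."""
    prefs = [unary[b]]
    for j, c in binaries.items():   # c maps (our value, neighbor j's value)
        prefs.append(c[(b, neighbor_vals[j])])
    return min(prefs)               # fuzzy composition = min

def top_set(domain, unary, binaries, neighbor_vals):
    """Best reachable fuzzy level and the set of values attaining it."""
    levels = {b: combined_pref(b, unary, binaries, neighbor_vals)
              for b in domain}
    best = max(levels.values())
    return best, {b for b, v in levels.items() if v == best}
```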

At the beginning, each agent marks an element b that maximizes<br />
⊗ c{s} η[s := b]; this is the element that the agent prefers to be in the final<br />
solution. At each turn, the algorithm is based on two entities: a single<br />
speaker, which broadcasts its choice of word together with the related fuzzy<br />
preference, and a set of listeners, namely all the agents that share a<br />
constraint with the speaker. At each turn t, an agent is drawn with uniform<br />
probability to be the speaker. In the following we describe in detail each step<br />
of the interaction scheme that defines the behavior of the speaker and the<br />
listeners; we consider three phases: i) broadcast, ii) feedback, and<br />
iii) update.<br />



3.1 Interaction Protocol<br />

Broadcast. The speaker s executes the broadcast protocol. The speaker checks<br />
whether the marked variable assignment b is in top. If the marked variable<br />
assignment is not in top, it selects a new variable assignment b with uniform<br />
probability from top, and marks it. Then it sends the pair<br />
(b, max_{b∈Ds}(⊗ c{s} η[s := b])) to all its neighboring listeners.<br />

Feedback. All the listeners receive the broadcast message (b, u) from the<br />
speaker. Each listener l computes, for every dk, ⊗ c{s,l} η[s := b][l := dk]<br />
(let us call this value vk for any chosen dk); that is, it computes the<br />
combination of the fuzzy preferences (i.e. vk) for each assignment dk,<br />
supposing that s chooses word b. Each listener sends back to s a feedback<br />
message according to the following two cases:<br />
– Failure. If u > max_k(vk) there is a failure, and the listener sends back a<br />
failure message containing its maximum value and the corresponding assignment<br />
for l: Fail(max_k(vk), l = dk).<br />
– Success. If u ≤ max_k(vk) there is a success, and the listener sends back<br />
Succ.<br />

Update The listeners' feedback determines the update of the listeners and of the speaker. When a listener sends back a Succ, that listener also lowers the preference level of all the v_k with a higher preference value: for all v_k such that v_k > u, it sets v_k = u. If the speaker receives only Succ feedback messages from all its listeners, then it does not need to update.

Otherwise, that is, if the speaker receives a number of Fail(v_j, l_j = d_j) feedback messages from h listeners (with h ≥ 1 and 1 ≤ j ≤ h), then it selects the worst fuzzy preference v_w such that ∀j, v_w ≤ v_j. Then it sends to all listeners a FailUpdate(c_{{l_w}}η[l_w := b_w]). Thus, the speaker sets its assignment to b with the worst fuzzy preference level among the failure feedback messages of the listeners, i.e. c_{{s}}η[s := b] = v_w. In addition, each listener l sets v_l = v_w, i.e. c_{{s,l}}η[s := b][l := d_l] = v_w.
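The three phases can be sketched as a minimal single-turn simulation. The `Agent` shape (a preference map per word, a candidate set `top`, and a marked word) is an illustrative assumption rather than the authors' implementation, and for brevity the failure update is applied per word rather than per listener assignment:

```python
import random

class Agent:
    """Toy agent: a fuzzy preference level in [0, 1] for each word,
    a set of candidate words `top`, and the currently marked word."""
    def __init__(self, pref):
        self.pref = dict(pref)
        self.top = set(pref)
        self.marked = None

def broadcast(speaker):
    """Broadcast: keep the marked word if it is still in top, otherwise
    mark a fresh one drawn uniformly from top; return it with its level."""
    if speaker.marked not in speaker.top:
        speaker.marked = random.choice(sorted(speaker.top))
    b = speaker.marked
    return b, speaker.pref[b]                    # the pair (b, u)

def feedback(listener, b, u):
    """Feedback: compare u with max_k(v_k), the best preference the
    listener can reach; reply Fail(max_k(v_k), d_k) or Succ."""
    d_k = max(listener.pref, key=listener.pref.get)
    v_max = listener.pref[d_k]
    if u > v_max:
        return ("Fail", v_max, d_k)
    return ("Succ",)

def update(speaker, listeners, b, u, replies):
    """Update: on all-Succ, clip every v_k > u down to u; on failure,
    align the speaker and the listeners on the worst failing level v_w."""
    fails = [r for r in replies if r[0] == "Fail"]
    if not fails:
        for l in listeners:
            for w in l.pref:
                if l.pref[w] > u:
                    l.pref[w] = u
        return
    v_w = min(v for _, v, _ in fails)            # worst failure level
    speaker.pref[b] = v_w                        # c_{s}η[s := b] = v_w
    for l in listeners:
        l.pref[b] = min(l.pref[b], v_w)          # listener aligns on v_w
```

Running one turn with a single listener exercises both branches: a broadcast level above the listener's best triggers a Fail and pulls both agents down to v_w, while a level at or below it triggers Succ and clips only the listener's over-optimistic entries.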

3.2 Theorems<br />

With Lemma 1 we state that a subset of constraints C′ ⊆ C has a higher fuzzy preference w.r.t. C. We say that a fuzzy constraint problem is α-consistent if it can be solved with a level of satisfiability of at least α (see also [2]).

Lemma 1 ([2]). Consider a set of constraints C and any subset C′ of C. Then we have C ≤ C′.

The speaker selection rule defines a probability distribution function F that tells us the probability that a certain domain assignment is selected. c_{{s}}η[s := b] and the marked word determine F. In Lemma 2 we relate F to the convergence of the algorithm with probability 1, in relation to the level of satisfiability of the problem.



Lemma 2. If the F function selects only the domain elements with preference level larger than α, then the algorithm converges with probability 1 only if Sol(P) ≥ α.

From [3, 4], if the F function chooses a random element in the word domain, then the algorithm converges to the same word, but this word might not be the optimal one, i.e. the word with the highest fuzzy preference. If we choose F so as to select only words with a preference greater than α, then the algorithm converges to a solution with a global preference greater than α.

With Prop. 1 and Prop. 2 we prepare the background for the main theorem of this section, i.e. Th. 1. Proposition 1 shows the stabilization of the algorithm after some time, while Prop. 2 states that the algorithm converges with a probability of 1.

Proposition 1. For time t → +∞, the weight associated to the optimal solution is equal for all the agents, and it is equal to the minimum preference level of that word.

Proposition 2. For any probability distribution F the algorithm converges with a probability of 1.

At last, we state that the presented algorithm always converges to the best solution of the DisFCSP.

Theorem 1. Since i) the algorithm always converges (see Prop. 2) and ii) we choose a function F according to Lem. 2, the algorithm of Sect. 3.1 always converges to the best fuzzy solution, i.e. to the solution with the highest preference possible.

4 Experimental results<br />

To evaluate the runs we define the probability of a successful interaction at time t, P_t(succ), given the state of the system at that time. P_t(succ) is determined by the probability P(s = a_i) that agent a_i is the speaker at time t, and by the probability P_t(succ|s = a_i) that that agent's interaction is a success: P_t(succ) = Σ_i P_t(succ|s = a_i) P(s = a_i). The P_t(succ|s = a_i) depend on the state of the agents at time t; in particular, they depend on the variable assignment (or word) b selected by F, and on whether c_{{s}}η[s := b] ≤ c_{{l}}η[l := b]. Given an algorithm run, at each time t we can compute P_t(succ|s = a_i) over the states of all agents before the interaction is performed. Since P(s = a_i) = 1/N, we can compute P_t(succ) = Σ_i P_t(succ|s = a_i)/N.
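With the uniform speaker-selection rule this reduces to a plain average over the agents; a minimal sketch, where the per-agent conditional probability is supplied by a caller-provided function (an assumed interface, since how it is derived from the agent state is described only informally above):

```python
def p_success(agents, cond_prob):
    """P_t(succ) = sum_i P_t(succ | s = a_i) * P(s = a_i); under uniform
    speaker selection P(s = a_i) = 1/N, so this is simply the mean of
    the conditional success probabilities over all agents.
    cond_prob(a) returns P_t(succ | s = a) from agent a's current state."""
    return sum(cond_prob(a) for a in agents) / len(agents)
```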

For our benchmark, let us define a Random Fuzzy NG instance (RFNG). To generate such an instance, we assign to each agent the same domain of names D, and for each agent and each agent's name we draw a preference level in [0, 1] from a uniform distribution. Moreover, an RFNG can only have crisp binary equality constraints. We also define the Path RFNG instance [4], which is an RFNG instance in which the constraint network is a path graph. A path graph (or linear graph) is a particularly simple example of a tree: it has two terminal vertices (vertices that have degree 1), while all the others (if any) have degree 2.
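A Path RFNG instance can be generated in a few lines. The encoding below (one preference dictionary per agent plus an explicit edge list for the path of equality constraints) is an illustrative assumption, since the paper does not fix a data structure:

```python
import random

def make_path_rfng(n_agents, words, seed=None):
    """Generate a Path RFNG instance: every agent shares the same word
    domain, each preference level is drawn uniformly from [0, 1], and
    the crisp binary equality constraints form the path
    a_0 - a_1 - ... - a_{n-1}."""
    rng = random.Random(seed)
    prefs = [{w: rng.random() for w in words} for _ in range(n_agents)]
    edges = [(i, i + 1) for i in range(n_agents - 1)]  # path graph
    return prefs, edges
```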

We generated 5 such random instances, with 10 agents and 10 words each. For each of these instances, we computed with a brute-force algorithm the best preference level and the word associated to this solution. Then, we ran this algorithm 10 times on



each instance. To decide when the algorithm finds the solution, a graph crawler checks the agents' marked words and their marked words' preferences. If all the agents agree on the marked variable, this means they have found an agreement on the name. Then, the graph crawler checks whether the shared word has a preference level equal to the best preference; in that case we conclude that the algorithm has found the optimal solution.
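This stopping test can be sketched as follows. Representing the crawler's view as a list of marked words and a list of per-agent preference maps is an assumed encoding, and the tolerance `eps` is an illustrative detail:

```python
def found_optimum(marked_words, prefs, best_pref, eps=1e-12):
    """Graph-crawler check: all agents agree on the same marked word,
    and that word's preference level equals the optimum best_pref
    computed offline by brute force. marked_words[i] is agent i's
    marked word; prefs[i] maps agent i's words to preference levels."""
    if len(set(marked_words)) != 1:
        return False                      # no agreement on the name yet
    w = marked_words[0]
    return all(abs(p[w] - best_pref) <= eps for p in prefs)
```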

In Fig. 1 we measure the evolution in time of P_t(succ) for the path RFNG instances. When P_t(succ) = 1, all interactions are going to be successful; thus we are in an absorbing state, which, from Th. 1, we know is also a solution.

[Figure 1: plot of P_t(succ) versus t ∈ [0, 300] for Runs 1–5.]

Fig. 1. Evolution of the mean P_t(succ) over 5 different path RFNG instances. For each instance, we computed the mean P_t(succ) over 10 different runs. We set N = 10, and the number of words to 10.

In Fig. 2 we show the scaling of the mean number of messages MNM needed by the system to find a solution, for different numbers N of variables in the path RFNG instances. For each N, the MNM was measured over 5 different path RFNG instances. We notice that the points approximately overlap the function cN^{1.8}.

5 Related Work<br />

Whilst a number of approaches have been proposed to solve DCSPs [11, 15] or centralized FCSPs [11] alone, only a few works are related to the combination of DCSPs and fuzzy CSPs. It is important to notice the fundamental difference with the DCSP algorithms designed by Yokoo [15]. Yokoo addresses three fundamental kinds of DCSP algorithms: Asynchronous Backtracking, Asynchronous Weak-Commitment Search and the Distributed Breakout Algorithm [15]. Although these algorithms share the property of being asynchronous, they require a pre-agreed agent/variable ordering. The algorithm presented in this paper does not need this initial condition.

DisFCSPs have been of interest to the Multi-Agent Systems community, especially in the context of distributed resource allocation, collaborative scheduling, and negotiation


[Figure 2: log–log plot of MNM (10^2 to 10^8) versus N (10 to 1000), showing the Path RFNG data points and the fitted curve cN^a.]

Fig. 2. Scaling of the mean number of messages MNM needed by the system to find a solution for different numbers of variables N in path RFNG instances. For each N, the MNM was measured over 5 different path RFNG instances. We notice that the points approximately overlap the function cN^{1.8}.

(e.g. [8]). Those works focus on bilateral negotiations, and when many agents take part, a central coordinating agent may be required. For example, the work in [8] promotes a rotating coordinating agent which acts as a central point to evaluate the different proposals sent by the other agents. Hence the network model employed in those works is not totally distributed.

In [13, 14] the authors define fuzzy GENET, a neural network model for solving binary FCSPs. By transforming FCSPs into [0, 1] integer programming problems, they display the equivalence between the underlying working mechanism of fuzzy GENET and the discrete Lagrangian method. Benchmarking results confirm its feasibility in tackling CSPs and its flexibility in dealing with over-constrained problems.

In [9] the authors propose two approaches to solve these problems: an iterative method and an adaptation of the Asynchronous Distributed constraint OPTimization algorithm (ADOPT) for solving DisFCSPs. They also present experiments comparing the performance of the two approaches, showing that ADOPT is more suitable for low-density problems (density = number of links / number of agents).

6 Conclusions and Future Work<br />

In this paper we have shown how NG problems [12, 1, 10, 7] can be extended with fuzzy preferences over words in order to solve a generic instance of a DisFCSP [11, 15, 8, 9, 14]. In the study of such an algorithm we try to fully exploit the power of distributed computation. Our algorithm is based on the random exploration of the system state space: it travels through the possible states until it finds the absorbing state, where it stabilizes. These goals are achieved through the union of new topics addressed in statistical physics (the NG) and the abstract framework posed by constraint solving.



In other words, we show that a DisFCSP algorithm may work without a predetermined agent ordering, and can probabilistically solve instances that were not thought to be solvable by such algorithms. Moreover, in the real world, a predetermined agent ordering may be a quite restrictive assumption. Hence, it is very important to explore and understand how such distributed systems may work and what problems may exist.

In future work, we intend to evaluate an asynchronous version of this algorithm in depth, and to test it using other comparison metrics, such as communication cost (number of messages sent) and NCCCs (number of non-concurrent constraint checks). Moreover, we would like to compare our algorithm against other distributed and asynchronous algorithms, such as fuzzy GENET and fuzzy ADOPT. Furthermore, we will try to generalize it to generic semiring-based CSP instances [2], and not only fuzzy CSPs.

References<br />

1. A. Baronchelli, M. Felici, E. Caglioti, V. Loreto, and L. Steels. Sharp transition towards shared vocabularies in multi-agent systems. CoRR, abs/physics/0509075, 2005.
2. S. Bistarelli. Semirings for Soft Constraint Solving and Programming, volume 2962 of LNCS. Springer, 2004.
3. S. Bistarelli and G. Gosti. Solving CSPs with naming games. In A. Oddi, F. Fages, and F. Rossi, editors, CSCLP, volume 5655 of LNCS, pages 16–32. Springer, 2008.
4. S. Bistarelli and G. Gosti. Solving distributed CSPs probabilistically. Fundam. Inform., 105(1-2):57–78, 2010.
5. Z. Collin, R. Dechter, and S. Katz. On the feasibility of distributed constraint satisfaction. In IJCAI, pages 318–324, 1991.
6. E. W. Dijkstra. Self-stabilizing systems in spite of distributed control. Commun. ACM, 17:643–644, November 1974.
7. N. L. Komarova, K. A. Jameson, and L. Narens. Evolutionary models of color categorization based on discrimination. Journal of Mathematical Psychology, 51(6):359–382, 2007.
8. X. Luo, N. R. Jennings, N. Shadbolt, H. Leung, and J. H. Lee. A fuzzy constraint based model for bilateral, multi-issue negotiations in semi-competitive environments. Artif. Intell., 148:53–102, August 2003.
9. X. T. Nguyen and R. Kowalczyk. On solving distributed fuzzy constraint satisfaction problems with agents. In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, IAT '07, pages 387–390. IEEE Computer Society, 2007.
10. M. A. Nowak, J. B. Plotkin, and D. C. Krakauer. The evolutionary language game. Journal of Theoretical Biology, 200(2):147–162, September 1999.
11. F. Rossi, P. van Beek, and T. Walsh. Handbook of Constraint Programming (Foundations of Artificial Intelligence). Elsevier Science Inc., New York, NY, USA, 2006.
12. L. Steels. A self-organizing spatial vocabulary. Artificial Life, 2(3):319–332, 1995.
13. J. Wong, K. Ng, and H. Leung. A stochastic approach to solving fuzzy constraint satisfaction problems. In E. Freuder, editor, Principles and Practice of Constraint Programming, volume 1118 of LNCS, pages 568–569. Springer, 1996. doi:10.1007/3-540-61551-2_119.
14. J. H. Y. Wong and H. Leung. Extending GENET to solve fuzzy constraint satisfaction problems. In AAAI '98/IAAI '98, pages 380–385, Menlo Park, CA, USA, 1998. AAAI.
15. M. Yokoo and K. Hirayama. Algorithms for distributed constraint satisfaction: A review. Autonomous Agents and Multi-Agent Systems, 3:185–207, June 2000.



On Improving MUS Extraction Algorithms<br />

Joao Marques-Silva 1,2 and Inês Lynce 2<br />

1 University College Dublin<br />

jpms@ucd.ie<br />

2 INESC-ID/IST, TU Lisbon<br />

ines@sat.inesc-id.pt<br />

Abstract. Minimally Unsatisfiable Subformulas (MUS) find a wide<br />

range of practical applications, including product configuration,<br />

knowledge-based validation, and hardware and software design and verification.<br />

MUSes also find application in recent Maximum Satisfiability<br />

algorithms and in CNF formula redundancy removal. Besides direct applications<br />

in Propositional Logic, algorithms for MUS extraction have<br />

been applied to more expressive logics. This paper proposes two algorithms<br />

for MUS extraction. The first algorithm is optimal in its class,<br />

meaning that it requires the smallest number of calls to a SAT solver.<br />

The second algorithm extends earlier work, but implements a number of<br />

new techniques. The resulting algorithms achieve significant performance gains with respect to state-of-the-art MUS extraction algorithms.

This paper appears in:
Karem A. Sakallah and Laurent Simon (eds.)
Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing (SAT 2011).
Lecture Notes in Computer Science, volume 6695, pages 159–173.
Springer, 2011.
The full paper is available at:
http://dx.doi.org/10.1007/978-3-642-21581-0_14
Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



Applying UCT to Boolean Satisfiability<br />

Alessandro Previti 1 , Raghuram Ramanujan 2 , Marco Schaerf 1 , and Bart Selman 2<br />

1 Dipartimento di Informatica e Sistemistica Antonio Ruberti<br />

Sapienza, Università di Roma<br />

Roma, Italy<br />

elsandro84@gmail.com, marco.schaerf@uniroma1.it<br />

2 Department of Computer Science<br />

Cornell University<br />

Ithaca, New York<br />

{raghu, selman}@cs.cornell.edu<br />

Abstract. In this paper, we investigate the feasibility of applying UCT-style techniques to the satisfiability of CNF formulae. We develop a new family of algorithms based on the idea of balancing exploitation (depth-first search) and exploration (breadth-first search), combined with a

simple heuristic evaluation of nodes. We compare our algorithm with<br />

a DPLL-based algorithm and WalkSAT, using the size of the tree and<br />

the number of flips as the performance measure. While our approach performs<br />

on par with DPLL on instances with little structure, it does quite<br />

well on structured instances, where it can effectively reuse information gathered from one iteration to the next. We conclude with a discussion

of a number of avenues for future work.<br />

This paper appears in:
Karem A. Sakallah and Laurent Simon (eds.)
Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing (SAT 2011).
Lecture Notes in Computer Science, volume 6695, pages 373–374.
Springer, 2011.
The full paper is available at:
http://dx.doi.org/10.1007/978-3-642-21581-0_35
Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



An Efficient Hierarchical Parallel Genetic<br />

Algorithm for Graph Coloring Problem<br />

Reza Abbasian and Malek Mouhoub<br />

Department of Computer Science<br />

University of Regina<br />

Regina, Canada<br />

{abbasiar, mouhoubm}@cs.uregina.ca

Abstract. Graph coloring problems (GCPs) are constraint optimization problems with various applications including scheduling, timetabling, and frequency allocation. The GCP consists in finding the minimum number of colors for coloring the graph vertices such that adjacent vertices have distinct colors. We propose a parallel approach based on Hierarchical Parallel Genetic Algorithms (HPGAs) to solve the GCP. We also propose a new extension to PGA, that is, a Genetic Modification (GM) operator designed for solving constraint optimization problems by taking advantage of the properties between variables and their relations. Our proposed GM for solving the GCP is based on a novel Variable Ordering Algorithm (VOA). In order to evaluate the performance of our new approach, we have conducted several experiments on GCP instances taken from the well-known DIMACS website. The results show that the proposed approach achieves high performance, in both run time and quality of the returned solution, on these graph coloring instances. The quality of the solution is measured here by comparing the returned solution with the optimal one.

This paper appears in:
Natalio Krasnogor (ed.)
Proceedings of the 13th annual conference on Genetic and evolutionary computation (GECCO 2011), pages 521–528.
ACM, 2011.
The full paper is available at:
http://dx.doi.org/10.1145/2001576.2001648
Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



Checking Safety of Neural Networks with SMT<br />

Solvers: a Comparative Evaluation<br />

Luca Pulina 1 and Armando Tacchella 2<br />

1 DEIS, Università di Sassari, Italy<br />

lpulina@uniss.it<br />

2 DIST, Università di Genova, Italy<br />

Armando.Tacchella@unige.it<br />

Abstract. In this paper we evaluate state-of-the-art SMT solvers on<br />

encodings of verification problems involving Multi-Layer Perceptrons<br />

(MLPs), a widely used type of neural network. Verification is a key technology<br />

to foster adoption of MLPs in safety-related applications, where<br />

stringent requirements about performance and robustness must be ensured<br />

and demonstrated. In previous contributions, we have shown that<br />

safety problems for MLPs can be attacked by solving Boolean combinations<br />

of linear arithmetic constraints. However, the generated encodings<br />

are hard for current state-of-the-art SMT solvers, limiting our ability to<br />

verify MLPs in practice. The experimental results herewith presented<br />

are meant to provide the community with a precise picture of current<br />

achievements and standing open challenges in this intriguing application<br />

domain.<br />

This paper appears in:
R. Pirrone and F. Sorbello (eds.)
Proceedings of the 12th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2011).
Lecture Notes in Computer Science, volume 6934.
Springer, 2011.
Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.



Plan Stability: Replanning versus Plan Repair<br />

Maria Fox 1 , Alfonso Gerevini 2 , Derek Long 1 , and Ivan Serina 2<br />

1 Department of Computer and Information Sciences<br />

University of Strathclyde, Glasgow, UK<br />

firstname.lastname@cis.strath.ac.uk

2 Department of Electronics for Automation<br />

University of Brescia, Italy<br />

lastname@ing.unibs.it<br />

Abstract. The ultimate objective in planning is to construct plans for<br />

execution. However, when a plan is executed in a real environment it<br />

can encounter differences between the expected and actual context of<br />

execution. These differences can manifest as divergences between the<br />

expected and observed states of the world, or as a change in the goals<br />

to be achieved by the plan. In both cases, the old plan must be replaced<br />

with a new one. In replacing the plan an important consideration is plan<br />

stability. We compare two alternative strategies for achieving the stable<br />

repair of a plan: one is simply to replan from scratch and the other is<br />

to adapt the existing plan to the new context. We present arguments<br />

to support the claim that plan stability is a valuable property. We then<br />

propose an implementation, based on LPG, of a plan repair strategy<br />

that adapts a plan to its new context. We demonstrate empirically that<br />

our plan repair strategy achieves more stability than replanning and can<br />

produce repaired plans more efficiently than replanning.<br />

This paper appears in:
Derek Long, Stephen F. Smith, Daniel Borrajo, Lee McCluskey (eds.)
Proceedings of the 16th International Conference on Automated Planning and Scheduling (ICAPS 2006), pages 212–221.
AAAI, 2006.
The full paper is available at:
http://www.aiconferences.org/ICAPS/2006/Papers/ICAPS06-022.pdf
Proceedings of the 18th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2011).
In conjunction with IJCAI 2011, Barcelona, Spain, July 17-18, 2011.

