Pseudo-Boolean Evaluation 2006: requirements for solvers
This document lists the requirements that a solver
must conform to. These requirements may evolve slightly over time.
You're invited to check the content of this page regularly. The
revisions of this document are detailed below.
Last modification: 2006/02/20
Revisions of this document
Major
Minor
- 2006/03/20: incomplete solvers can register in all categories.
The only requirement is that they MUST output "o " lines in category
optimisation and intercept the SIGTERM to output the best model they
found.
- 2006/02/20: submitters are advised to avoid outputing useless
comment lines
- 2006/02/15: reordered symbols in the BNF grammar and introduced
the <formula> symbol as the start symbol of the grammar.
- 2006/02/14: Inside the evaluation environment, variable names are
guaranteed to range from x1 to xN where N is the total number of
variables in the input file. Each variable present in the objective
function will occur in at least one constraint.
- 2006/02/13: Whitespace explicitely introduced in the grammar of
the input file
- 2006/02/06: Comment lines ("c " lines) may appear anywhere in the
solver output
Registering the solver to the right category
As explained in the section on the input format (see below), benchmarks
may contain arbitrarily long integers that don't fit in a usual 32-bits
integer. The page on integer overflows
explain why we consider that a strong solver should use a multiple
precision integer library. However, we do not require that you modify
your existing solver to use big integers. We just need that you
register your solver in the categories which it is able to solve. This
is an important decision because registering a solver in a given
category is a claim that it will give correct answers for each
benchmark in that category.
Concerning the integers values, three categories are defined
- Category "small integers" (SMALLINT)
The benchmarks in this category contain only small integers, which
means that they
contain no constraint with a coefficient or a sum of coefficients
greater than 2^{20} (each number has less than 20 bits).
Solvers which use 32-bits
integers are most probably safe, unless they use some fancy learning
scheme. All solvers should register in that category.
- Category "medium integers" (MEDINT)
This category contains the set of benchmarks which are neither SMALLINT
nor BIGINT. This means that there's at least one coefficient or a
constraint with a sum of coefficients between 2^{20}
and 2^{30} (at least 20 bits but less than 30 bits). Solvers
which use 32-bits integers and
no constraint learning are probably safe. Solvers which use 32-bits
integers and even simple learning schemes might give wrong answers.
- Category "big integers" (BIGINT)
The constraints contain big integers, which means
that at least one number in the file is greater than 2^{30}
(at least 30 bits) or there's at least a constraint with a sum of
coefficients greater than 2^{30} (at least 30 bits). Only
solvers with
support for big integers should register in that category.
Two other categories are defined:
- Category SAT/UNSAT
The benchmarks in this category do not contain an objective function.
The solver is expected to answer SATISFIABLE or UNSATISFIABLE. All
solvers should register in that category.
- Category "optimisation" (OPT)
Benchmarks in this category contain an objective function which should
be minimized. Solvers entering this category must be able to find the
best solution and give an OPTIMUM FOUND answer.
Incomplete solvers
Incomplete solvers are welcome in this evaluation. Despite the fact
that they will never answer UNSATISFIABLE or OPTIMUM FOUND, incomplete
solvers can be registered in both the SAT/UNSAT and OPT categories.
In the SAT/UNSAT category, an incomplete solver will stop as soon as it
finds a solution and will time out if it can't find one. The only
difference with a complete solver is that it will time out
systematically on unsatisfiable instances.
In the optimisation category, an incomplete solver will systematically
time out because it will be unable to prove that it has found the
optimum solution. Yet, it may have found the optimum value well before
the time out. In order to get relevant informations in this category,
an incomplete solver must fulfill two requirements:
- it must intercept the SIGTERM sent to the solver on timeout and
output either "s UNKNOWN" or "s SATISFIABLE" with the "v " line
corresponding to the best model it has found
- it MUST output "o " lines whenever it finds a better solution so
that, even if the solver always timeout, the timestamp of the last "o "
line indicates when the best solution was found. Keep in mind that it
is the evaluation environment which is in charge of timestamping "o "
lines.
Submitters must indicate is their solver is complete or incomplete on
the submission form.
Execution environment
The solvers will run on a cluster of computers using the Linux
operating system. They will run under the control of another program
(runsolver) which will enforce some limits on the memory and the total
CPU time
used by the program. Solvers will be run inside a sandbox
that will prevent unauthorized use of the system (network connections,
file creation outside the allowed directory,...)
The pseudo-Boolean solver must accept one first parameter which is
the name of the file containing the pseudo-Boolean instance, and a
second parameter, which is a number between 0 and 4294967295 to be used
as a seed for the random generator if the solver uses random numbers:
./mysolver instancefile.pb 12345
Two executions of a solver with the same parameters and system
resources must output the same result in approximately the same time
(so that the
experiments can be repeated).
The solver may also (optionally) use the values of the following
environment variables:
- PBTIMEOUT (the number of seconds it will be allowed to run),
- PBRAM (the amount of RAM in Mb available to the solver).
- TMPDIR (the absolute pathname of the only directory where the
solver is allowed to create temporary files)
After PBTIMEOUT seconds have elapsed, the solver will first receive
a SIGTERM
to give it a chance to output the best solution it found so far (in the
case of an optimizing solver). One second later, the program will
receive a SIGKILL
signal
from the controlling program to terminate the solver.
The solver cannot write to any file except standard output,
standard
error and files in the TMPDIR directory. A solver is not allowed to
open any network connection or launch unexpected external commands.
Solvers may use several processes or threads.
Input Format
The input file format is a stricter variant of the OPB format
(see the end of the README
file in http://www.mpi-sb.mpg.de/units/ag2/software/opbdp/opbdp1.1.1.tar.gz).
Here is an example:
* #variable= 5 #constraint= 4
*
* comments
*
*
min: 1 x2 -1 x3 ;
1 x1 +4 x2 -2 x5 >= 2;
-1 x1 +4 x2 -2 x5 >= +3;
12345678901234567890 x4 +4 x3 >= 10;
* an equality constraint
2 x2 +3 x4 +2 x1 +3 x5 = 5 ;
Most of the syntax of this file can be described by a simple BNF
grammar (see http://en.wikipedia.org/wiki/Backus-Naur_form).
<formula> is the start symbol of this grammar.
<formula>::= <sequence_of_comments>
[<objective>]
<sequence_of_comments_or_constraints>
<sequence_of_comments>::= <comment> [<sequence_of_comments>]
<comment>::= "*" <any_sequence_of_characters_other_than_EOL> <EOL>
<sequence_of_comments_or_constraints>::=<comment_or_constraint> [<sequence_of_comments_or_constraints>]
<comment_or_constraint>::=<comment>|<constraint>
<objective>::= "min:" <zeroOrMoreSpace> <linearfunction> ";"
<constraint>::= <linearfunction> <relational_operator> <zeroOrMoreSpace> <integer> <zeroOrMoreSpace> ";"
<linearfunction>::= <product> | <product><linearfunction>
<product>::= <integer> <oneOrMoreSpace> <variablename> <oneOrMoreSpace>
<integer>::= <unsigned_integer> | "+" <unsigned_integer> | "-" <unsigned_integer>
<unsigned_integer>::= <digit> | <digit><unsigned_integer>
<relational_operator>::= ">=" | "="
<variablename>::= "x" <unsigned_integer>
<oneOrMoreSpace>::= " " [<oneOrMoreSpace>]
<zeroOrMoreSpace>::= [" " <zeroOrMoreSpace>]
Some comments and details:
- A line starting with a '*' is a comment and can be ignored.
Comment lines are allowed anywhere in the file.
- As a hint to perform memory allocation, the first line of the
file will be a comment containing the word "#variable=" followed by a
space and the number of variables in the file, then a space and the
word
"#constraint=" followed by a space and the number of constraints in the
file. The space between the word and the number is mandatory to make
parsing trivial. This information is only provided as a commodity for
solvers which include a very limited parser. High quality provers are
encouraged to ignore this information as it may not be accurate outside
the evaluation environment (e.g. when a user creates a file by hand).
- Each non comment line must end with a semicolon ';'
- The first non comment line may be an objective function to
minimize. It starts with the word "min:" followed by the linear
function to minimize and terminated by a semicolon. No other objective
function can be found after this first non comment line.
- A constraint is written on a single line and is terminated by a
semicolon.
- A boolean variable (atom) is named by a lowercase 'x' followed by
a strictly positive integer number. The integer number can be
considered as a
identifier of the variable. This integer identifier is strictly less
than 2^{32}. Therefore, a solver can input a variable name by
reading a character (to skip the 'x') and then an unsigned long
int.
- Inside the evaluation environment, variable names are guaranteed
to range from "x1" to "xN" where N is the total number of variables in
the instance (as given on the first line of the file). Each variable
between x1 and xN will occur in at least one constraint or the
objective function. High quality provers are
encouraged to avoid relying on this assumption as it may not hold
outside
the evaluation environment.
- Inside the evaluation environment, each
variable present in the objective function will occur in at least one
constraint. High quality provers are
encouraged to avoid relying on this assumption as it may not hold
outside
the evaluation environment.
- Each variable name must be followed by a space
- The negation of an atom A will not appear in the file (it will be
translated to 1-A)
- The weight of a variable may contain an arbitrary number of
digits. There must
be no space between the sign of an integer and its digits.
- Lines may be very long. Programmers should avoid reading a line
as a whole.
Notice that integers may be of arbitrary size in the
file. See here for a rationale.
The rules let us write a very simple parser and avoid some ambiguities
present in the original description of the OPB format. At the same
time, the format remains easily human readable and is compatible with
solvers using the OPB format.
Output Format
The solvers must output messages on the standard output that
will be used to check the results. The output format is inspired by the
DIMACS
output specification of the SAT competition and may be used to manually
check
some
results.
Messages
With the exception of the "o " line, there is no specific order in the
solvers output lines. However, all
line,
according to its first char, must belong to one of the four following
categories:
- comments ("c " lines)
These lines start by the two characters: lower case c followed by a
space (ASCII code 32).
These lines are optional and may appear anywhere in the solver output.
They contain any information that authors want to
emphasize, such as #backtracks, #flips,... or internal cpu-time. They
are recorded by the evaluation environment for later
viewing but are otherwise ignored. At most one megabyte of solver
output will be recorded. So if a solver is very verbose,
some comments may be lost.
Submitters are advised to avoid outputing comment lines which may be
useful in an interactive environment but otherwise useless in a batch
environment. For example, outputing comment lines with the number of
constraints read so far only increases the size of the logs with no
benefit.
If a solver is really too verbose, the organizers will ask the
submitter to remove some comment lines.
- value of the objective function
("o " lines)
These lines start by the two characters: lower
case o followed by a space (ASCII code 32).
These lines are mandatory for
incomplete solvers. As far as complete solvers are concerned,
these lines are not strictly mandatory but solvers are
strongly invited to output them.
These lines should be output only for optimisation instances. They will
be ignored for SAT/UNSAT instances.
Whenever the solver finds a solution with a better value of the
objective function, it is asked to output an "o " line with the current
value of the objective function. Therefore, an "o " line must contain
the lower case o followed by a space and then by an integer which
represents the better value of the objective function. The integer
output on this line must be the value of the objective function as
found in the instance file. "o " lines should be output as soon as the
solvers finds a better solution and be ended by a standard Unix end of
line character ('\n'). Programmers are advised to flush immediately the
output stream.
Example:
The instance file contains an objective function min: 1 x1 +1 x2
-1 x3
Let f be this objective function found in the file
The solver chooses to rewrite this function as f '=x1+x2+not(x3) to get
only positive weights. It must remember that f=f '-1 (since
-x=not(x)-1). When it finds a solution x1=true, x2=true, x3=false, it
must output "o 2" (f ' has value 3 with this assignment but f has value
2). If x1=false, x2=false and x3=true is a solution, the solver may
successively output
o 2
o 1
o -1
s OPTIMUM FOUND
v -x1 -x2 x3
The evaluation environment will automatically timestamp each of these
lines so that we'll be able to know when the solver has found a better
solution and how good this solution was. The goal is to analyse the way
solvers progress toward the best solution.
- solution ("s " lines)
This line starts by the two characters: lower case s followed by a
space
(ASCII code 32).
Only one such line is allowed.
It is mandatory.
This line gives the answer of the solver. It must be one of the following
answers:
- s SATISFIABLE
this line must be output when the solver has found a solution but it
can't prove that this solution give the least value of the objective
function. This is the answer to output when the instance file contains
no objective function.
- s OPTIMUM FOUND
this line must be output when the solver has found a model
and it can prove that no other solution will give a value of the
objective function strictly less than the one obtained with this model.
Let v be the value of the objective obtained with the valuation output
by the solver. Giving this result is a commitment
that the formula extended with the constraint objective<v is
unsatisfiable.
This answer must not be used for instances which do not contain an
objective function.
- s UNSATISFIABLE
this line must be output when the solver can prove that
the formula has no solution.
- s UNKNOWN
this line must be output in any other case, i.e. when the
solver is not able to tell anything about the formula
It is of uttermost importance to respect the exact spelling of these
answers. Any mistake in the writing of these lines will cause the
answer
to be disregarded.
In contrast with last year's evaluation, solvers are not required to
provide any specific exit code (we had some troubles with exit codes
not matching the "s " lines in a few cases).
- values ("v " lines)
This line starts by the two characters: lower case v followed by a
space
(ASCII code 32).
More than one "v " line is allowed but the evaluation environment will
act as if their content was merged.
It is mandatory.
If the solver finds a solution (it outputs "s SATISFIABLE" or "s
OPTIMUM FOUND"), it must provide a model (or an
implicant) of the instance that will be used to check the correctness
of
the answer. i.e., it must provide a list of
non-contradictory
literals which, when interpreted to true, makes every constraint of the
input formula true. When
optimization is considered, this set of literals should give the
minimum value of the utility function that the solver was able to
discover. The negation of a literal is denoted by a minus sign
immediately followed by the identifier of the variable. The solution
line MUST define the value of EACH VARIABLE. The order of literals does
not matter.
Arbitrary
white space characters, including ordinary white spaces, newline and
tabulation
characters, are allowed between the literals, as long as each line
containing
the literals is a values line, i.e. it begins with the two
chars
"v ".
The values
lines should only appear with SATISFIABLE instance (including instances
for which an OPTIMUM was FOUND).
Values lines must be terminated by a Line Feed character (the
usual Unix line terminator '\n'. A "v " line which does not end with
that
terminator will be ignored because it will be considered that the
solver was interrupted before it could output a complete solution.
For instance, the following outputs are valid for the instances
given in example:
mycomputer:~$ ./mysolver myinstance-sat
c mysolver 6.55957 starting with PBTIMEOUT fixed to 1000s
c Trying to guess a solution...
s SATISFIABLE
v x1 x2 -x5 x4 -x3
c Done (mycputime is 234s).
mycomputer:~$ ./mysolver myinstance-unsat
c mysolver 6.55957 starting with PBTIMEOUT fixed to 1000s
c Trying to guess a solution...
c Contradiction found!
s UNSATISFIABLE
c Done (mycputime is 2s).
Note that we do not require a proof for unsatisfiability.
If the solver does not output a solution line, or if the
solution
line is misspelled, then UNKNOWN will be assumed.
A solver is declared to give a wrong answer in the following cases:
- It outputs UNSATISFIABLE for an instance which can be proved to
be satisfiable.
- It outputs SATISFIABLE or OPTIMUM FOUND, but provides an
assignment which doesn't satisfy each constraint. The only exception is
when the solver outputs an incomplete "v " line (which doesn't end by
'\n') in which case it is assumed that the solver was interrupted
before it could output the complete model and the answer will be
considered as UNKNOWN.
- It outputs OPTIMUM FOUND but there exists a model with a better
value of the objective function that the one obtained from the model
found.
When a solver provides even one single
wrong answer in a given category of benchmarks, the solver's results in
that category will be excluded from the final evaluation results
because they cannot be trusted. Exceptionally, the
organizers may decide to present separately the results of such a
solver but only if it obtained particularly good results and if a
detailed explanation of the problem as well as a solution is provided
by the submitters.
A solver which ends without giving any solution, or just crashes for
some reason (internal bugs...), is simply considered as giving an
UNKNOWN result. It is buggy, but not incorrect.