Pseudo-Boolean Evaluation 2006: requirements for solvers

This document lists the requirements that a solver must conform to. These requirements may evolve slightly over time. You're invited to check the content of this page regularly. The revisions of this document are detailed below.

Last modification: 2006/02/20

Revisions of this document

Major

No major revision

Minor

2006/03/20: incomplete solvers can register in all categories. The only requirement is that they MUST output "o " lines in category optimisation and intercept the SIGTERM to output the best model they found.
2006/02/20: submitters are advised to avoid outputting useless comment lines
2006/02/15: reordered symbols in the BNF grammar and introduced the <formula> symbol as the start symbol of the grammar.
2006/02/14: Inside the evaluation environment, variable names are guaranteed to range from x1 to xN where N is the total number of variables in the input file. Each variable present in the objective function will occur in at least one constraint.
2006/02/13: Whitespace explicitly introduced in the grammar of the input file
2006/02/06: Comment lines ("c " lines) may appear anywhere in the solver output

Registering the solver to the right category

As explained in the section on the input format (see below), benchmarks may contain arbitrarily long integers that don't fit in a usual 32-bit integer. The page on integer overflows explains why we consider that a strong solver should use a multiple-precision integer library. However, we do not require that you modify your existing solver to use big integers. We only ask that you register your solver in the categories it is able to handle. This is an important decision because registering a solver in a given category is a claim that it will give correct answers for each benchmark in that category.

Concerning integer values, three categories are defined:

Category "small integers" (SMALLINT)
The benchmarks in this category contain only small integers, which means that they contain no constraint with a coefficient or a sum of coefficients greater than 2²⁰ (each number has fewer than 20 bits). Solvers which use 32-bit integers are most probably safe, unless they use some fancy learning scheme. All solvers should register in that category.
Category "medium integers" (MEDINT)
This category contains the set of benchmarks which are neither SMALLINT nor BIGINT. This means that there is at least one coefficient, or one constraint with a sum of coefficients, between 2²⁰ and 2³⁰ (at least 20 bits but fewer than 30 bits). Solvers which use 32-bit integers and no constraint learning are probably safe. Solvers which use 32-bit integers and even simple learning schemes might give wrong answers.
Category "big integers" (BIGINT)
The constraints contain big integers, which means that at least one number in the file is greater than 2³⁰ (at least 30 bits) or there is at least one constraint with a sum of coefficients greater than 2³⁰ (at least 30 bits). Only solvers with support for big integers should register in that category.
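
As an illustration of the three integer categories, here is a minimal sketch of how a submitter might estimate the category of an instance. It is not the organizers' classification tool. The function name classify and the data layout are hypothetical. The sketch assumes that "sum of coefficients" means the sum of absolute values, that the right-hand side counts as a "number in the file", and that a value exactly equal to a threshold falls in the lower category. None of these choices is confirmed by the text above.

```python
SMALL_LIMIT = 2**20  # threshold between SMALLINT and MEDINT
BIG_LIMIT = 2**30    # threshold between MEDINT and BIGINT


def classify(constraints):
    """Return 'SMALLINT', 'MEDINT' or 'BIGINT' for a list of constraints.

    Each constraint is a pair (coefficients, degree) of Python integers.
    Assumption: sums are taken over absolute values, so the sum is never
    smaller than any single coefficient of the same constraint.
    """
    worst = 0
    for coefficients, degree in constraints:
        total = sum(abs(c) for c in coefficients)
        worst = max(worst, total, abs(degree))
    if worst > BIG_LIMIT:
        return "BIGINT"
    if worst > SMALL_LIMIT:
        return "MEDINT"
    return "SMALLINT"
```

Python integers have arbitrary precision, so the check itself cannot overflow. A solver written with fixed-width integers would need a big-integer library just to perform it safely.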

Two other categories are defined:

Category SAT/UNSAT
The benchmarks in this category do not contain an objective function. The solver is expected to answer SATISFIABLE or UNSATISFIABLE. All solvers should register in that category.
Category "optimisation" (OPT)
Benchmarks in this category contain an objective function which should be minimized. Solvers entering this category must be able to find the best solution and give an OPTIMUM FOUND answer.

Incomplete solvers

Incomplete solvers are welcome in this evaluation. Despite the fact that they will never answer UNSATISFIABLE or OPTIMUM FOUND, incomplete solvers can be registered in both the SAT/UNSAT and OPT categories.

In the SAT/UNSAT category, an incomplete solver will stop as soon as it finds a solution and will time out if it can't find one. The only difference from a complete solver is that it will systematically time out on unsatisfiable instances.

In the optimisation category, an incomplete solver will systematically time out because it will be unable to prove that it has found the optimum solution. Yet, it may have found the optimum value well before the timeout. In order to provide relevant information in this category, an incomplete solver must fulfill two requirements:

it must intercept the SIGTERM sent to the solver on timeout and output either "s UNKNOWN" or "s SATISFIABLE" with the "v " line corresponding to the best model it has found
it MUST output "o " lines whenever it finds a better solution so that, even if the solver always times out, the timestamp of the last "o " line indicates when the best solution was found. Keep in mind that it is the evaluation environment which is in charge of timestamping "o " lines.

Submitters must indicate whether their solver is complete or incomplete on the submission form.

Execution environment

The solvers will run on a cluster of computers using the Linux operating system. They will run under the control of another program (runsolver) which will enforce some limits on the memory and the total CPU time used by the program. Solvers will be run inside a sandbox that will prevent unauthorized use of the system (network connections, file creation outside the allowed directory, etc.).

The pseudo-Boolean solver must accept one first parameter which is the name of the file containing the pseudo-Boolean instance, and a second parameter, which is a number between 0 and 4294967295 to be used as a seed for the random generator if the solver uses random numbers:

	./mysolver instancefile.pb 12345

Two executions of a solver with the same parameters and system resources must output the same result in approximately the same time (so that the experiments can be repeated).

The solver may also (optionally) use the values of the following environment variables:

PBTIMEOUT (the number of seconds it will be allowed to run),
PBRAM (the amount of RAM in Mb available to the solver),
TMPDIR (the absolute pathname of the only directory where the solver is allowed to create temporary files).
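
The command-line parameters and environment variables above could be read along the following lines. This is only a sketch: the function name read_settings and the dictionary keys are hypothetical, and it assumes that PBTIMEOUT and PBRAM hold plain decimal integers, which the text above does not state explicitly.

```python
def read_settings(argv, environ):
    """Collect the instance file, the seed and the optional limits.

    argv is expected to look like ['./mysolver', 'instancefile.pb', '12345'].
    environ is a mapping such as os.environ.
    """
    if len(argv) < 3:
        raise ValueError("usage: mysolver <instancefile> <seed>")
    seed = int(argv[2])
    if not 0 <= seed <= 4294967295:
        raise ValueError("seed must be between 0 and 4294967295")

    def optional_int(name):
        # The variables are optional: report None when one is not set.
        value = environ.get(name)
        return int(value) if value is not None else None

    return {
        "instance": argv[1],
        "seed": seed,
        "timeout_s": optional_int("PBTIMEOUT"),  # assumed to be an integer
        "ram_mb": optional_int("PBRAM"),         # assumed to be an integer
        "tmpdir": environ.get("TMPDIR"),
    }
```

A solver would typically call read_settings(sys.argv, os.environ) once at startup and seed its random generator with the returned seed so that runs are repeatable.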

After PBTIMEOUT seconds have elapsed, the solver will first receive a SIGTERM to give it a chance to output the best solution it found so far (in the case of an optimizing solver). One second later, the program will receive a SIGKILL signal from the controlling program to terminate the solver.
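
One possible way to react to the SIGTERM within the one-second grace period is sketched below. The names state, final_report, on_sigterm and install_handler are hypothetical, and the sketch assumes the search loop stores its best model in state["best_model"] as a list of literal strings each time it improves.

```python
import signal
import sys

# Shared state updated by the (hypothetical) search loop.
state = {"best_model": None}


def final_report(best_model):
    """Build the final "s " line and, if a model is known, its "v " line."""
    if best_model is None:
        return "s UNKNOWN\n"
    return "s SATISFIABLE\n" + "v " + " ".join(best_model) + "\n"


def on_sigterm(signum, frame):
    # Write the answer at once: SIGKILL follows about one second later.
    sys.stdout.write(final_report(state["best_model"]))
    sys.stdout.flush()
    sys.exit(0)


def install_handler():
    """Register the handler; call this once at startup, from the main thread."""
    signal.signal(signal.SIGTERM, on_sigterm)
```

The handler does as little work as possible, because anything still unwritten when the SIGKILL arrives is lost, and an unterminated "v " line is ignored.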

The solver cannot write to any file except standard output, standard error and files in the TMPDIR directory. A solver is not allowed to open any network connection or launch unexpected external commands. Solvers may use several processes or threads.

Input Format

The input file format is a stricter variant of the OPB format (see the end of the README file in http://www.mpi-sb.mpg.de/units/ag2/software/opbdp/opbdp1.1.1.tar.gz). Here is an example:

* #variable= 5 #constraint= 4
*
* comments
*
* 
min: 1 x2 -1 x3 ;
1 x1 +4 x2 -2 x5 >= 2;
-1  x1 +4   x2 -2   x5 >= +3;
12345678901234567890 x4 +4 x3 >= 10;
* an equality constraint
2 x2 +3 x4 +2 x1 +3 x5 = 5 ;

Most of the syntax of this file can be described by a simple BNF grammar (see http://en.wikipedia.org/wiki/Backus-Naur_form). <formula> is the start symbol of this grammar.

<formula>::= <sequence_of_comments>
             [<objective>]
             <sequence_of_comments_or_constraints>

<sequence_of_comments>::= <comment> [<sequence_of_comments>]
<comment>::= "*" <any_sequence_of_characters_other_than_EOL> <EOL>

<sequence_of_comments_or_constraints>::=<comment_or_constraint> [<sequence_of_comments_or_constraints>]
<comment_or_constraint>::=<comment>|<constraint>

<objective>::= "min:" <zeroOrMoreSpace> <linearfunction>  ";"
<constraint>::= <linearfunction> <relational_operator> <zeroOrMoreSpace> <integer> <zeroOrMoreSpace> ";"

<linearfunction>::= <product> | <product><linearfunction>
<product>::= <integer> <oneOrMoreSpace> <variablename> <oneOrMoreSpace>

<integer>::= <unsigned_integer> | "+" <unsigned_integer> | "-" <unsigned_integer>
<unsigned_integer>::= <digit> | <digit><unsigned_integer>

<relational_operator>::= ">=" | "="

<variablename>::= "x" <unsigned_integer>

<oneOrMoreSpace>::= " " [<oneOrMoreSpace>]
<zeroOrMoreSpace>::= [" " <zeroOrMoreSpace>]

Some comments and details:

A line starting with a '*' is a comment and can be ignored. Comment lines are allowed anywhere in the file.
As a hint to perform memory allocation, the first line of the file will be a comment containing the word "#variable=" followed by a space and the number of variables in the file, then a space and the word "#constraint=" followed by a space and the number of constraints in the file. The space between the word and the number is mandatory to make parsing trivial. This information is only provided as a convenience for solvers which include a very limited parser. High quality provers are encouraged to ignore this information as it may not be accurate outside the evaluation environment (e.g. when a user creates a file by hand).
Each non comment line must end with a semicolon ';'
The first non comment line may be an objective function to minimize. It starts with the word "min:" followed by the linear function to minimize and terminated by a semicolon. No other objective function can be found after this first non comment line.
A constraint is written on a single line and is terminated by a semicolon.
A boolean variable (atom) is named by a lowercase 'x' followed by a strictly positive integer number. The integer number can be considered as an identifier of the variable. This integer identifier is strictly less than 2³². Therefore, a solver can input a variable name by reading a character (to skip the 'x') and then an unsigned long int.
Inside the evaluation environment, variable names are guaranteed to range from "x1" to "xN" where N is the total number of variables in the instance (as given on the first line of the file). Each variable between x1 and xN will occur in at least one constraint or the objective function. High quality provers are encouraged to avoid relying on this assumption as it may not hold outside the evaluation environment.
Inside the evaluation environment, each variable present in the objective function will occur in at least one constraint. High quality provers are encouraged to avoid relying on this assumption as it may not hold outside the evaluation environment.
Each variable name must be followed by a space
The negation of an atom A will not appear in the file (it will be translated to 1-A)
The weight of a variable may contain an arbitrary number of digits. There must be no space between the sign of an integer and its digits.
Lines may be very long. Programmers should avoid reading a line as a whole.

Notice that integers may be of arbitrary size in the file. See the page on integer overflows for a rationale.

The rules let us write a very simple parser and avoid some ambiguities present in the original description of the OPB format. At the same time, the format remains easily human readable and is compatible with solvers using the OPB format.
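
To show how little code the grammar requires, here is a minimal parser sketch. The function names parse_linear and parse_opb and the tuple layout are hypothetical. For brevity it reads whole lines, which the advice above discourages for very long lines, and it does not check that the objective comes before the constraints. It relies on Python's arbitrary-precision integers for big coefficients.

```python
def parse_linear(tokens):
    """Turn ['1', 'x1', '+4', 'x2'] into [(1, 1), (4, 2)].

    Each pair is (coefficient, variable index).
    """
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating integers and variable names")
    terms = []
    for i in range(0, len(tokens), 2):
        name = tokens[i + 1]
        if not name.startswith("x"):
            raise ValueError("bad variable name: " + name)
        terms.append((int(tokens[i]), int(name[1:])))
    return terms


def parse_opb(text):
    """Return (objective, constraints) for an instance given as a string.

    objective is None or a list of terms; each constraint is a triple
    (terms, operator, degree) where operator is '>=' or '='.
    """
    objective = None
    constraints = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*"):
            continue  # blank line or comment
        if not line.endswith(";"):
            raise ValueError("missing semicolon: " + line)
        body = line[:-1].strip()
        if body.startswith("min:"):
            objective = parse_linear(body[4:].split())
            continue
        for op in (">=", "="):  # test '>=' first, since it contains '='
            if op in body:
                lhs, rhs = body.split(op, 1)
                constraints.append((parse_linear(lhs.split()), op, int(rhs)))
                break
        else:
            raise ValueError("no relational operator: " + line)
    return objective, constraints
```

Applied to the example instance above, this sketch yields the objective [(1, 2), (-1, 3)] and four constraints, the last of which uses the '=' operator.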

Output Format

The solvers must output messages on the standard output that will be used to check the results. The output format is inspired by the DIMACS output specification of the SAT competition and may be used to manually check some results.

Messages

With the exception of the "o " line, there is no specific order in the solver's output lines. However, each line, according to its first character, must belong to one of the following four categories:

comments ("c " lines)
These lines start with the two characters: lower case c followed by a space (ASCII code 32).
These lines are optional and may appear anywhere in the solver output.
They contain any information that authors want to emphasize, such as #backtracks, #flips,... or internal cpu-time. They are recorded by the evaluation environment for later viewing but are otherwise ignored. At most one megabyte of solver output will be recorded. So if a solver is very verbose, some comments may be lost.

Submitters are advised to avoid outputting comment lines which may be useful in an interactive environment but are useless in a batch environment. For example, outputting comment lines with the number of constraints read so far only increases the size of the logs with no benefit.

If a solver is really too verbose, the organizers will ask the submitter to remove some comment lines.
value of the objective function ("o " lines)
These lines start with the two characters: lower case o followed by a space (ASCII code 32).
These lines are mandatory for incomplete solvers. As far as complete solvers are concerned, these lines are not strictly mandatory but solvers are strongly invited to output them.
These lines should be output only for optimisation instances. They will be ignored for SAT/UNSAT instances.
Whenever the solver finds a solution with a better value of the objective function, it is asked to output an "o " line with the current value of the objective function. Therefore, an "o " line must contain the lower case o followed by a space and then by an integer which represents the better value of the objective function. The integer output on this line must be the value of the objective function as found in the instance file. "o " lines should be output as soon as the solver finds a better solution and be terminated by a standard Unix end of line character ('\n'). Programmers are advised to flush the output stream immediately.

Example:
The instance file contains an objective function min: 1 x1 +1 x2 -1 x3
Let f be this objective function found in the file
The solver chooses to rewrite this function as f '=x1+x2+not(x3) to get only positive weights. It must remember that f=f '-1 (since -x=not(x)-1). When it finds a solution x1=true, x2=true, x3=false, it must output "o 2" (f ' has value 3 with this assignment but f has value 2). If x1=false, x2=false and x3=true is a solution, the solver may successively output
	o 2
	o 1
	o -1
	s OPTIMUM FOUND
	v -x1 -x2 x3
The evaluation environment will automatically timestamp each of these lines so that we'll be able to know when the solver has found a better solution and how good this solution was. The goal is to analyse the way solvers progress toward the best solution.
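
The rule that an "o " line carries the value of the objective as written in the instance file can be sketched as follows. The function names objective_value and report_improvement and the data layout are hypothetical: the objective is a list of (coefficient, variable index) pairs taken from the file, and the model maps each variable index to True or False.

```python
import sys


def objective_value(objective, model):
    """Evaluate the objective exactly as written in the instance file."""
    return sum(coeff for coeff, var in objective if model[var])


def report_improvement(objective, model, out=sys.stdout):
    """Emit one "o " line and flush so the timestamp is meaningful."""
    out.write("o %d\n" % objective_value(objective, model))
    out.flush()
```

With the objective min: 1 x1 +1 x2 -1 x3 from the example, the assignment x1=true, x2=true, x3=false evaluates to 2 and the assignment x1=false, x2=false, x3=true evaluates to -1, whatever internal rewriting the solver uses.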
solution ("s " lines)
This line starts with the two characters: lower case s followed by a space (ASCII code 32).
Only one such line is allowed.
It is mandatory.
This line gives the answer of the solver. It must be one of the following answers:
- s SATISFIABLE
  this line must be output when the solver has found a solution but it can't prove that this solution gives the least value of the objective function. This is the answer to output when the instance file contains no objective function.
- s OPTIMUM FOUND
  this line must be output when the solver has found a model and it can prove that no other solution will give a value of the objective function strictly less than the one obtained with this model. Let v be the value of the objective obtained with the valuation output by the solver. Giving this result is a commitment that the formula extended with the constraint objective<v is unsatisfiable.
  This answer must not be used for instances which do not contain an objective function.
- s UNSATISFIABLE
  this line must be output when the solver can prove that the formula has no solution.
- s UNKNOWN
  this line must be output in any other case, i.e. when the solver is not able to tell anything about the formula
It is of utmost importance to respect the exact spelling of these answers. Any mistake in the writing of these lines will cause the answer to be disregarded.

In contrast with last year's evaluation, solvers are not required to provide any specific exit code (we had some troubles with exit codes not matching the "s " lines in a few cases).
values ("v " lines)
This line starts with the two characters: lower case v followed by a space (ASCII code 32).
More than one "v " line is allowed but the evaluation environment will act as if their content was merged.
It is mandatory.

If the solver finds a solution (it outputs "s SATISFIABLE" or "s OPTIMUM FOUND"), it must provide a model (or an implicant) of the instance that will be used to check the correctness of the answer. That is, it must provide a list of non-contradictory literals which, when interpreted as true, make every constraint of the input formula true. When optimization is considered, this set of literals should give the minimum value of the objective function that the solver was able to discover. The negation of a literal is denoted by a minus sign immediately followed by the identifier of the variable. The values lines MUST define the value of EACH VARIABLE. The order of literals does not matter. Arbitrary white space characters, including ordinary white spaces, newline and tabulation characters, are allowed between the literals, as long as each line containing the literals is a values line, i.e. it begins with the two characters "v ".
The values lines should only appear for SATISFIABLE instances (including instances for which an OPTIMUM was FOUND).

Values lines must be terminated by a Line Feed character (the usual Unix line terminator '\n'). A "v " line which does not end with that terminator will be ignored because it will be considered that the solver was interrupted before it could output a complete solution.

For instance, the following outputs are valid for the instances given in example:
```
mycomputer:~$ ./mysolver myinstance-sat
c mysolver 6.55957 starting with PBTIMEOUT fixed to 1000s
c Trying to guess a solution...
s SATISFIABLE
v x1 x2 -x5 x4 -x3
c Done (mycputime is 234s).

mycomputer:~$ ./mysolver myinstance-unsat
c mysolver 6.55957 starting with PBTIMEOUT fixed to 1000s
c Trying to guess a solution...
c Contradiction found!
s UNSATISFIABLE
c Done (mycputime is 2s).
```
Note that we do not require a proof for unsatisfiability.

If the solver does not output a solution line, or if the solution line is misspelled, then UNKNOWN will be assumed.
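
The output rules above can be summarised by a small reader such as the one below. This is a sketch of one plausible interpretation, not the code used by the evaluation environment. The function name interpret_output is hypothetical, and the sketch assumes that the answer must match the expected spelling exactly, with no trailing characters.

```python
VALID_ANSWERS = {"SATISFIABLE", "OPTIMUM FOUND", "UNSATISFIABLE", "UNKNOWN"}


def interpret_output(raw):
    """Return (answer, literals, objective_values) from raw solver output."""
    answer = "UNKNOWN"  # assumed when no valid "s " line is present
    literals = []
    objective_values = []
    lines = raw.split("\n")
    # Whatever follows the last '\n' is an unterminated line.
    complete, tail = lines[:-1], lines[-1]
    for line in complete:
        if line.startswith("s "):
            text = line[2:]
            answer = text if text in VALID_ANSWERS else "UNKNOWN"
        elif line.startswith("v "):
            literals.extend(line[2:].split())  # "v " lines are merged
        elif line.startswith("o "):
            try:
                objective_values.append(int(line[2:]))
            except ValueError:
                pass  # malformed "o " line: skipped in this sketch
        # "c " lines and anything else are ignored here
    if tail.startswith("v ") and answer in ("SATISFIABLE", "OPTIMUM FOUND"):
        # The solver was interrupted while printing its model.
        answer = "UNKNOWN"
    return answer, literals, objective_values
```

Running a solver's captured output through such a reader before submission is a cheap way to catch misspelled answers and unterminated "v " lines.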

Bugs and wrong answers...

A solver is declared to give a wrong answer in the following cases:

It outputs UNSATISFIABLE for an instance which can be proved to be satisfiable.
It outputs SATISFIABLE or OPTIMUM FOUND, but provides an assignment which doesn't satisfy each constraint. The only exception is when the solver outputs an incomplete "v " line (which doesn't end with '\n') in which case it is assumed that the solver was interrupted before it could output the complete model and the answer will be considered as UNKNOWN.
It outputs OPTIMUM FOUND but there exists a model with a better value of the objective function than the one obtained from the model found.
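
Submitters may want to check their own models before printing them. A minimal sketch of such a check follows. The function name check_model and the data layout are hypothetical: each constraint is a triple (terms, operator, degree) with terms given as (coefficient, variable index) pairs, and literals are strings such as "x1" or "-x3".

```python
def check_model(constraints, literals, num_vars):
    """Return True if the literals form a complete, consistent model."""
    model = {}
    for lit in literals:
        value = not lit.startswith("-")
        var = int(lit.lstrip("-")[1:])  # skip the optional '-' and the 'x'
        if model.get(var, value) != value:
            return False  # contradictory literals
        model[var] = value
    # Every variable from x1 to xN must be assigned.
    if any(var not in model for var in range(1, num_vars + 1)):
        return False
    for terms, op, degree in constraints:
        lhs = sum(coeff for coeff, var in terms if model[var])
        if op == ">=" and lhs < degree:
            return False
        if op == "=" and lhs != degree:
            return False
    return True
```

For the example instance of the input format section, the literals -x1 x2 -x3 x4 -x5 pass this check, while x1 x2 -x3 x4 -x5 fail it because the equality constraint then evaluates to 7 instead of 5.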

When a solver provides even one single wrong answer in a given category of benchmarks, the solver's results in that category will be excluded from the final evaluation results because they cannot be trusted. Exceptionally, the organizers may decide to present separately the results of such a solver but only if it obtained particularly good results and if a detailed explanation of the problem as well as a solution is provided by the submitters.

A solver which ends without giving any solution, or just crashes for some reason (internal bugs...), is simply considered as giving an UNKNOWN result. It is buggy, but not incorrect.