- Category "no optimization function" (SAT)

This is the category of problems which do not include an objective function. Since there's no optimization to perform, the expected answer is either SAT or UNSAT. -
Category "optimization, small integers" (OPTSMALLINT)

The benchmarks in this category contain an objective function which sould be minimized. They use only small integers, which means that they contain no constraint with a sum of coefficients greater than 2^{20}(20 bits). The expected answer is OPTIMUM FOUND or UNSAT (possibly SAT when a solver finds a model but cannot prove it is the optimum solution). -
Category "optimization, medium integers" (OPTMEDINT)

The benchmarks in this category contain an objective function which sould be minimized. This category contains the set of benchmarks which are neither SMALLINT, nor BIGINT. That means that no number in the file is greater than 2^{30}(30 bits) but there's at least one constraint with a sum of coefficients greater than 2^{20}(20 bits). The expected answer is OPTIMUM FOUND or UNSAT (possibly SAT when a solver finds a model but cannot prove it is the optimum solution). - Category "optimization, big integers" (OPTBIGINT)

The benchmarks in this category contain an objective function which sould be minimized. The constraints may contain big integers, which mean that at least one number in the file is greater than 2^{30}(30 bits). The expected answer is OPTIMUM FOUND or UNSAT (possibly SAT when a solver finds a model but cannot prove it is the optimum solution).

Not all solvers had support for each of these categories. Here are the capabilities for each solver:

Solver | SAT | OPTSMALLINT | OPTMEDINT | OPTBIGINT |
---|---|---|---|---|

bsolo | ||||

galena | ||||

minisat+ | ||||

PBS4 | ||||

Pueblo | ||||

sat4jpseudo | ||||

vallst_0.9.258 | ||||

pb2sat+zchaff |

As one can see, a few bugs still appeared in a few solvers. Here's what we know about these minor bugs.

On one single instance, solver *Pueblo* outputs a solution which
did not satisfy all constraints. According to its author, that problem
was caused by an objective function that contains terms with both
positive and negative coefficients, which was not correctly handled by
the solver. The problem has been fixed since then.

Because of a missing flush in its signal handler, *minisat+*
never outputs a complete solution on the "v " line and therefore,
all its SAT answer are counted as NO CERTIFICATE.

On 3 instances classified as NO CERTIFICATE, solver *sat4jPseudo*
required more than two seconds to output a solution when it received a
SIGTERM and was therefore killed before it could provide a complete
solution.

The wrong UNSAT answers of *vallst_0.9.258* seem to be caused by
an error in the script (a 'grepres' command is called but not found).

Solver Name | #benchs | UNSAT. | OPT. | SAT. | SAT (timeout) | SAT (out of mem.) | UNKNOWN | UNKNOWN (timeout) | UNKNOWN (out of mem.) | UNKNOWN (exit code) | Sig. Caught | NO CERT. | WRONG CERT. | WRONG OPT. | WRONG UNSAT. | Misc. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

bsolo | 113 | 36 | 0 | 8 | 0 | 0 | 69 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

galena | 113 | 36 | 0 | 7 | 0 | 0 | 70 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

minisat+ | 113 | 43 | 0 | 0 | 0 | 0 | 0 | 35 | 0 | 0 | 0 | 35 | 0 | 0 | 0 | 0 |

PBS4 | 113 | 61 | 0 | 28 | 0 | 0 | 0 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Pueblo | 113 | 61 | 0 | 42 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

sat4jpseudo | 113 | 52 | 0 | 17 | 0 | 0 | 0 | 44 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

vallst_0.9.258 | 113 | 38 | 0 | 29 | 0 | 0 | 0 | 46 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

pb2sat+zchaff | 113 | 42 | 0 | 36 | 0 | 0 | 0 | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Solver Name | #benchs | UNSAT. | OPT. | SAT. | SAT (timeout) | SAT (out of mem.) | UNKNOWN | UNKNOWN (timeout) | UNKNOWN (out of mem.) | UNKNOWN (exit code) | Sig. Caught | NO CERT. | WRONG CERT. | WRONG OPT. | WRONG UNSAT. | Misc. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

bsolo | 386 | 10 | 159 | 159 | 21 | 0 | 31 | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

galena | 386 | 9 | 98 | 135 | 0 | 0 | 140 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 |

minisat+ | 386 | 10 | 176 | 0 | 0 | 0 | 0 | 78 | 1 | 1 | 0 | 120 | 0 | 0 | 0 | 0 |

PBS4 | 386 | 10 | 133 | 0 | 0 | 0 | 0 | 243 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Pueblo | 386 | 10 | 160 | 182 | 0 | 0 | 0 | 33 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |

sat4jpseudo | 386 | 10 | 120 | 0 | 225 | 0 | 1 | 29 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |

vallst_0.9.258 | 386 | 10 | 131 | 4 | 0 | 0 | 0 | 231 | 0 | 0 | 0 | 0 | 3 | 0 | 7 | 0 |

pb2sat+zchaff | 386 | 10 | 136 | 0 | 148 | 10 | 0 | 33 | 7 | 42 | 0 | 0 | 0 | 0 | 0 | 0 |

Solver Name | #benchs | UNSAT. | OPT. | SAT. | SAT (timeout) | SAT (out of mem.) | UNKNOWN | UNKNOWN (timeout) | UNKNOWN (out of mem.) | UNKNOWN (exit code) | Sig. Caught | NO CERT. | WRONG CERT. | WRONG OPT. | WRONG UNSAT. | Misc. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

bsolo | 191 | 0 | 28 | 78 | 4 | 0 | 50 | 5 | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

galena | 191 | 4 | 5 | 15 | 0 | 0 | 109 | 0 | 0 | 0 | 51 | 0 | 0 | 0 | 7 | 0 |

minisat+ | 191 | 0 | 24 | 0 | 0 | 0 | 0 | 92 | 7 | 1 | 0 | 67 | 0 | 0 | 0 | 0 |

PBS4 | 191 | 0 | 33 | 0 | 0 | 0 | 0 | 158 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Pueblo | 191 | 0 | 34 | 74 | 0 | 0 | 48 | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

sat4jpseudo | 191 | 2 | 19 | 0 | 107 | 0 | 1 | 62 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

pb2sat+zchaff | 191 | 0 | 14 | 0 | 14 | 2 | 0 | 56 | 31 | 71 | 3 | 0 | 0 | 0 | 0 | 0 |

Solver Name | #benchs | UNSAT. | OPT. | SAT. | SAT (timeout) | SAT (out of mem.) | UNKNOWN | UNKNOWN (timeout) | UNKNOWN (out of mem.) | UNKNOWN (exit code) | Sig. Caught | NO CERT. | WRONG CERT. | WRONG OPT. | WRONG UNSAT. | Misc. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

bsolo | 482 | 90 | 9 | 81 | 1 | 1 | 241 | 5 | 54 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

minisat+ | 482 | 103 | 26 | 0 | 0 | 0 | 0 | 239 | 41 | 9 | 0 | 64 | 0 | 0 | 0 | 0 |

sat4jpseudo | 482 | 85 | 3 | 12 | 157 | 0 | 4 | 218 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 |

pb2sat+zchaff | 482 | 8 | 11 | 0 | 11 | 0 | 0 | 69 | 119 | 259 | 5 | 0 | 0 | 0 | 0 | 0 |

Solver Name | #benchs | UNSAT. | OPT. | SAT. | SAT (timeout) | SAT (out of mem.) | UNKNOWN | UNKNOWN (timeout) | UNKNOWN (out of mem.) | UNKNOWN (exit code) | Sig. Caught | NO CERT. | WRONG CERT. | WRONG OPT. | WRONG UNSAT. | Misc. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

bsolo | 1172 | 136 | 196 | 326 | 26 | 1 | 391 | 10 | 86 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

galena | 690 | 49 | 103 | 157 | 0 | 0 | 319 | 0 | 0 | 0 | 51 | 0 | 0 | 0 | 11 | 0 |

minisat+ | 1172 | 156 | 226 | 0 | 0 | 0 | 0 | 444 | 49 | 11 | 0 | 286 | 0 | 0 | 0 | 0 |

PBS4 | 690 | 71 | 166 | 28 | 0 | 0 | 0 | 425 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Pueblo | 690 | 71 | 194 | 298 | 0 | 0 | 48 | 78 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |

sat4jpseudo | 1172 | 149 | 142 | 29 | 489 | 0 | 6 | 353 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 0 |

vallst_0.9.258 | 499 | 48 | 131 | 33 | 0 | 0 | 0 | 277 | 0 | 0 | 0 | 0 | 3 | 0 | 7 | 0 |

pb2sat+zchaff | 1172 | 60 | 161 | 36 | 173 | 12 | 0 | 193 | 157 | 372 | 8 | 0 | 0 | 0 | 0 | 0 |

**#Benchs:**Number of benchs for this solver**UNSAT.:**solver proved unsatitisfiability ("s UNSATISFIABLE" was output)**OPT.:**solver found an optimum solution ("s OPTIMUM FOUND" was output), the provided implicant satisfies each constraint and no solver gave a better solution**SAT.:**solver found a solution ("s SATISFIABLE" was output) and the provided implicant satisfies each constraint**SAT (timeout):**solver exceeded the time limit but was able to find a solution ("s SATISFIABLE" was output) and the provided implicant satisfies each constraint**SAT (out of mem.):**solver exceeded the memory limit but was able to find a solution ("s SATISFIABLE" was output) and the provided implicant satisfies each constraint**UNKNOWN:**solver couldn't decide ("s UNKNOWN" was output)**UNKNOWN (timeout):**solver exceeded the time limit and gave no answer. These runs are considered to give result UNKNOWN**UNKNOWN (out of mem.):**solver exceeded the memory limit and gave no answer. These runs are considered to give result UNKNOWN.**UNKNOWN (exit code):**solver didn't output a solution line and terminated with an unexpected exit code (different of 0, 10, 20 and 30). These runs are considered to give result UNKNOWN.**Sig. Caught:**solver was terminated by a signal (SIGSEGV for example) and didn't output a solution line**NO CERT.:**solver answered SATISFIABLE but either didn't provide a certificate (the "v " line) or either gave a truncated certificate (which doesn't end in a new line)**WRONG CERT.:**solver gave an implicant but it appears that it doesn't satisfy every constraint**WRONG OPT.:**solver found an optimum solution ("s OPTIMUM FOUND" was output) but there exists an implicant which gives a better value to the objective function**WRONG UNSAT.:**solver proved unsatisfiability but was wrong ("s UNSATISFIABLE" was output)**Misc:**This is a internal check. It counts the runs that don't fall into one of the previous category. Should be 0 if categories are exhaustive and mutually exclusive. Will be negative if categories are not mutually exclusive and positive if categories are not exhaustive.