SAT-based Approaches for Formal Verification with B method

PhD Student:
Daniel Crowley

Co-Supervisor :
Olivier Roussel

Funding : ANR

Problem modelling and solving using constraints

Start year :
2023

Summary

The ANR project BLaSST targets bridging combinatorial and symbolic techniques in automatic theorem prov- ing, in particular for proof obligations generated from B models. Work will be carried out on SAT-based techniques as well as on more expressive SMT formalisms. In both cases encoding techniques, optimized resolution techniques, model generation, and lemma suggestion will be considered. Combining both lines of work, the expected scientific impact is a substantially higher degree of automation of solvers for expressive input languages by leveraging higher-order reasoning and enumerative instantiations over finite domains. The effectiveness of the techniques developed in the project will be quantified by applying them to benchmark sets provided by the industrial partner. The industrial impact of BLaSST will be a higher productivity of proof engineers.

BLaSST brings together academic experts in automated reasoning techniques (CRIL, INRIA and Université de Liège) and an industrial company (CLEARSY) leader in formal methods for a 4-year long effort that aims to provide a breakthrough in the formal verification applied to software design.

The main aim of this PhD position is to take advantage of SAT technologies to address different reasoning tasks for formal developments of software, in particular in approaches such as the B method. Indeed, this thesis will focus on developing and optimizing SAT-based encodings and techniques to achieve the following objectives:

Develop different SAT-based encodings for generating counter-examples of proof obligations.
Develop a method for building proof suggestions in connection with the proposed SAT-encodings to help showing proof obligations.
Develop a method for identifying missing premises to establish validity.

The research will be carried out in strong relationship with our industrial partner CLEARSY.

Applicant profile

We are looking for a PhD candidate with a master degree (or equivalent) in computer science with a background in logic. Prior knowledge in automated reasoning or formal methods (TLA+, B, Z or Alloy for instance) is a plus. The candidate should master one mainstream programming language such as C, C++ or Java.

Context

The use of software systems in critical application domains such as avionics, trains or medical devices is subject to certification requirements, and standards such as CENELEC EN 50128 used in the railway domain strongly recommend the use of formal verification techniques at high integrity levels. CLEARSY, the industrial partner in BLasSST, is a leader in carrying out certified developments, in particular for rail-bound transportation systems, based on the B method [1]. For carrying out formal verification, it develops and maintains Atelier B, an IDE for editing and analyzing B models. Atelier B in particular contains several engines for automatic proof and has been formally qualified for use in the development of safety-critical systems. Nevertheless, the existing automatic provers fail on a significant number of proof obligations (POs) generated from B models. In a recent experiment, among the roughly 77,000 POs of a representative development project, 64% were proved automatically by Atelier B, leaving 28,000 POs to be proved by human interaction. Only very few of the remaining POs are challenging: most of them can be discharged by the application of one or two proof commands in interactive proof. Also, Atelier B provides very little help for explaining why a PO could not be proved. During initial stages of development, many POs are actually not true, and providing counter-examples or suggesting missing hypotheses would be of great help. Combinatorial reasoning is at the heart of SAT solving [4]. In particular, techniques of propagation and lemma learning are crucial for the efficiency of modern SAT solvers. Although every SMT solver embeds a SAT solver for handling the propositional structure of ground (i.e., fully instantiated) problems, SMT solvers do not employ SAT-based techniques at higher levels of reasoning. We believe that a tighter integration of SAT and SMT solving beyond the ground case would be beneficial not just for discharging POs from B models. As an illustration, SAT solvers excel at solving Sudoku puzzles within milliseconds, whereas SMT solvers are defeated by the easiest 9 × 9 Sudokus when these are encoded in the natural way using quantifiers. Although BLaSST is motivated by the analysis of POs generated by the B method, and such POs will be used for validating our contributions, the project will carry out fundamental research in automated theorem proving, and the intended impact of the project goes well beyond one particular application domain.

Objectives

In this Ph.D thesis, we intend to focus on SAT-based encodings of POs. Such encodings can be effective for POs whose validity can be established by purely combinatorial reasoning over finite domains, possibly after brute- force instantiation or after abstraction of the original problem. We will investigate techniques for optimizing the resolution of such problems, for example by using parallel solving and exploiting hidden structures in the original problems (e.g., symmetries [5], sub-formula redundancy [7] and boolean gates [8]). For POs that cannot be faithfully represented by propositional encodings, these can still be useful for computing counter- examples to POs that are actually unprovable, thus avoiding futile proof attempts. Additionally, we will study the interest of extracting specific types of counter-examples in relation to notions such as prime implicants and minimal models (e.g., [6, 9, 3]). Finally, even incomplete SAT encodings can be exploited for abductive reasoning in order to discover missing hypotheses that would make an invalid PO provable, by relying on extended resolution techniques [10, 2] or by analyzing minimal unsatisfiable subsets. In this way, the proposed techniques can help proof engineers carry out interactive proof and understand why certain steps fail.

References

[1] J.-R. Abrial. Modeling in Event-B. Cambridge University Press, 2010.

[2] G. Audemard, G. Katsirelos, and L. Simon. A restriction of extended resolution for clause learning SAT

solvers. In 24th AAAI Conf. Artificial Intelligence. AAAI Press, 2010. [3] R. Ben-Eliyahu and R. Dechter. On computing minimal models. Ann. Math. Artif. Intell., 18(1):3–27, 1996.

[4] Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh, editors. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press, 2009.

[5] J. M. Crawford, M. L. Ginsberg, E. M. Luks, and A. Roy. Symmetry-breaking predicates for search problems. In Fifth Intl. Conf. Princ. of Knowledge Representation and Reasoning (KR’96), pages 148– 159, Cambridge, Mass., 1996. Morgan Kaufmann.

[6] S. Jabbour, J. Marques-Silva, L. Sais, and Y. Salhi. Enumerating prime implicants of propositional formulae in conjunctive normal form. In Logics in Artificial Intelligence - 14th European Conference, JELIA 2014, volume 8761 of LNCS, pages 152–165. Springer, 2014.

[7] Saïd Jabbour, Lakhdar Sais, Yakoub Salhi, and Takeaki Uno. Mining-based compression approach of propositional formulae. In 22nd ACM Intl. Conf. Information and Knowledge Management, CIKM’13, pages 289–298, San Francisco, CA, 2013. ACM.

[8] R. Ostrowski, E. Grégoire, B. Mazure, and L. Sais. Recovering and exploiting structural knowledge from CNF formulas. In Principles and Practice of Constraint Programming - CP 2002, volume 2470 of LNCS, pages 185–199. Springer, 2002.

[9] Y.Salhi. Approaches for enumerating all the essential prime implicants. In Artificial Intelligence:Methodology, Systems, and Applications - 18th International Conference, AIMSA 2018, volume 11089 of LNCS, pages 228–239. Springer, 2018.

[10] G.S.Tseitin.On the Complexity of Derivation in Propositional Calculus,pages 466–483. Springer Berlin Heidelberg, 1983.