School of Computing Science, Newcastle University, UK
Abstract. This PhD project aims to investigate how enough informa-tion can be collected from an interactive formal proof to capture anexpert’s ideas as a high-level proof process. It would then serve for ex-tracting proof strategies to facilitate proof automation. Ways of inferringthis proof process automatically are explored; and a family of tools is de-veloped to capture the different proof processes and their features.
Formal methods such as VDM Event-B or Z provide notations andtechniques to describe and develop formal specifications and designs. Develop-ment steps are justified by discharging relevant (often automatically generated)proof obligations (POs). Proofs can be used to establish that the model is consis-tent (domain/consistency checks), that a detailed design satisfies the propertiesof abstract specification (data reification proofs), and other properties.
The AI4FM proaims to help users discharge such POs in formal devel-
opments on an industrial scale. It investigates how to learn from one interactiveproof so that other similar proofs can be completed automatically. Being part ofthe AI4FM project, this PhD research is dealing with the first steps of learning:capturing, analysing and understanding the expert’s proof process.
Current-generation theorem provers include powerful automation techniques
and can discharge a large proportion of arising POs. In industrial-size formaldevelopments, however, even a small percentage of remaining POs can amountto a disincentive to deploy formal methoThe general heuristics of automatictheorem provers can fail when faced with complex data structures and proofsthat require domain knowledge. In such cases developers revert to interactiveproof. The nature of industrial-style proofs (see Sect. allows grouping theminto “families” of similar POs. In each family a single proof idea is usuallynecessary and all other POs in the family are discharged “in a similar way”. These are not general proof heuristics—they are specific to the context.
Thus we state the overall hypothesis of the AI4FM research project:
Hypothesis 0. We believe that it is possible to extract strategies from successfulproofs that will facilitate automatic proofs of related POs.
We approach this problem as a three-step process: (1) collecting informationabout an expert’s proof; (2) extracting proof strategies; (3) replaying the proof
1 2 E.g. remaining 8% amounted to 2250 POs and over 7 man-months of proof in
strategies to discharge related POs. Current automatic replay techniques, suchas proof planning, require laborious manual development of reusable proof pat-terns (see Sect. AI4FM aims to fill this gap by capturing and extracting suchpatterns (proof strategies) automatically.
The scope of this PhD project is narrower and focuses on the first part of
the problem—collecting information about the proof—with such hypothesis:
Hypothesis 1. Enough information can be collected from interactive proof to fa-cilitate understanding of expert’s high-level reasoning as reusable proof strategies.
Some of this information may need to be provided by the expert during theinteractive proof. Besides that, the capture should be automatic:
Hypothesis 2. Certain information about the proof process can be inferredautomatically, via analysis of proof context and previous proofs.
This PhD research aims to provide a generic framework and tools to capture,analyse and infer expert’s proof process. The information captured would fa-cilitate extraction of reusable proof strategies. It could also be used for proofvisualisation, maintenance or teaching and training (e.g. learning by example).
Note that we do not intend to develop a new theorem prover. The aim is
to capture and infer high-level proof information. When replaying, we are look-ing into driving the theorem provers using the high-level strategies instead ofconstructing the mathematical proofs, thus not affecting their correctness.
Extracting proof strategies from existing proof and using them to discharge POswould significantly improve automation of formal proofs. This section outlineswork in this area and how it relates to this PhD and the AI4FM research project. We also present features of industrial-style proofs, which we expect to be moreamenable to proof strategy capture, extraction and reuse.
Proof Planning. Proof planning is a framework to encode a common structureof a family of proofs as high-level proof strategies (methods). A proof plan canbe used to guide the proofs of similar theorems. Proof critics can providecommon patches to help with recovery from failed applications of the associatedmethods. Rippling is a powerful proof planning method, very successful ininductive proofs. Furthermore, tool support is available for the various proofplanning techniques, namely IsaPlanner for Isabelle
The proof methods (strategies) used by proof planning techniques need to
be encoded manually. Some methods allow specifying parts of the strategy(schemas) in a modular way and then dynamically arranging them to find theproof plan Automatic extraction of strategies from existing proofs is a gapin the process, which AI4FM project aims to address.
Data-Mining Proofs. Some advances have been made in extracting proof strate-gies by data-mining existing proofs Common patterns of low-level proofsteps (e.g. rewrite rule applications) can be found by data-mining well-chosensets of examples. This approach, however, faces difficulty in capturing when theextracted tactics need to be applied. Also, found patterns of low-level proof stepstend to be short and thus of limited reusability. Finally, the need for a corpus ofavailable proofs to learn from would be too restrictive during interactive proof. From the AI4FM perspective, we expect our system to start extracting strategiesand assisting the user with similar proofs as soon as one of them is completed.
Proof Reuse & Term Abstraction. Proof reuse after model change has been ex-plored in If model change is small and well-constrained, parts of previousproofs can be reused directly or adapted in a simple way. Proof reuse applica-bility is identified by comparing shapes of terms in proof goals and hypotheses. Low level details of similar terms can be abstracted using generalisation anti-unification or schemes
Industrial POs. Proof obligations generated during industrial use of formalmethods (we will call them industrial-style POs) exhibit certain features and pat-terns rarely found in mathematical proofs. The involved formal models consistof complex data structures and operations, featuring a large number of variablesand invariants. The POs follow certain method-specific patterns about modelconsistency, properties or data reification. In addition to that, the core datastructures are usually the same (e.g. two operations differ in a couple invari-ants and share large portions of parent data structure), thus large parts of theproofs are very similar. The difficulty of industrial style proofs lies in the size andcomplexity of involved structures. Their proof strategies exhibit domain-specificfeatures and often require limiting the scope of prover tactic applications. Incontrast, mathematical problems and proofs are usually deeper and smaller.
Capture and Analyse: ProofProcess Framework
The process of developing an interactive proof is rarely straightforward andinvolves proof exploration, backtracking and “eureka” moments. The final proofis customarily cleaned up and optimised thus disposing of the insights and proofsteps that have lead to discovering it. We consider the ideas from the originalproof process to be more amenable for reuse in similar proofs and thus moreuseful than the streamlined final proof. Capturing and analysing them thereforebecomes a crucial task, undertaken in the first part of this PhD project.
A number of case studies for proof process analysis were selected initially.
These involved new proofs (e.g. formal verification of the Tokeneer ID Stationproject revisiting old proofs by the members of the AI4FM project as well as doing these proofs using different theorem provers. The exercises fo-cused at discovery of various proof process patterns and similarities as well asidentification of waypoints that guide expert’s decisions.
The case studies served for the development of a new ProofProcess framework
to represent, capture, store and analyse all information regarding the proof pro-cess. The framework is a core part of this PhD project and consists of two parts:a generic data model of a proof process (abstract model and implementation)and a capture/analysis system.
This section provides an overview of the main ideas on representing and
capturing the proof process, as well as general architecture of the framework. Thecore capture functionality of the ProofProcess framework is already available. It establishes the groundwork for development and evaluation of techniques tounderstand the captured proof process (Sect.
The ProofProcess model is designed to represent a high-level proof process, whichis backed by actual proof steps in the theorem prover (e.g. command/tacticapplications). It provides a flexible data structure, which could accommodateany proof process and any proof structure that the expert wishes to describethe proof attempt byThe low-level proof steps are augmented with additional“why” information about the expert’s proof process:
– Proof granularity: we want different layers of abstraction: from high-level
proof plan to the tactic steps. The proof needs to be captured at any arbitrarygranularities that the expert deems appropriate.
– Proof structure: while the proofs are usually done as a sequence of steps,
the actual structure is rarely linear (e.g. the base and step cases of inductiveproofs can be seen as branches in proof trees).
– Multiple proof attempts: we want to capture the whole proof development.
Failed attempts may still contain generally applicable proof steps, eventhough they were not successful in that particular case. All versions leadingto a finished proof should be captured.
– Intent : high-level explanation of expert’s proof direction. These are simple
tags that give names and description to underlying proof reasoning (e.g. akinto Isolation, Collection and Attraction in Ch. 12]). The intent is coupledwith features (below) to paint the whole proof process picture.
– Proof features: anything that drives the expert’s proof. It can be the shape
of the goal, the available lemma, certain datatypes or records, etc. Suchinformation is usually readily available during the proof process, howeverdeducing it from the finished proof is difficult. The features are used to pin-point what triggered a proof step, change in proof structure or direction. Byadapting or generalising them, we would know when the strategies extractedfrom this proof attempt apply in the next one.
A generic model allows building reusable analysis techniques on top of it. The de-sire is that adequate proof intent and feature combinations would be provided or
3 It is being tested with different theorem provers and different proof structures,
e.g. “chain” of command applications in Z/EVES or a structured Isar proof
inferred during proof capture. Then generic analysis techniques and tools couldwork without the knowledge of prover-specific terms or commands. If needed,prover-specific model extensions allow representing detailed proof context.
The abstract ProofProcess model is developed as a formal Z specification. The
ProofProcess system (Sect. has a concrete implementation in EMF/Java.
The industrial-style proofs are usually of significant size and capturing theproof process manually immediately becomes too cumbersome. For this reason,a ProofProcess system is being built alongside the model to develop and testtechniques for capture and analysis of the proof process.
The tools have extensible and modular architecture. They are built on mod-
ern platforms of Java, Eclipse, and EMF The system works as an add-onto theorem prover assistanuser is not forced into new habits of using theprover. Instead, the tool “wire-taps” the link with the theorem prover to cap-ture basic proof information and performs analysis calculations in parallel. Aclose integration is necessary to avoid losing the proof information.
The captured proof activities are subjected to analysis in order to try to
infer certain information about the proof process automatically (as proposedin Hyp. The design allows for modular “matcher” routines, which wouldcalculate the proof process features, structure or intent. These would then becombined to determine the most applicable solution. One approach of combiningdifferent “matchers” using weights is described in
Initial “matchers” already available in the ProofProcess system can recognise
proof re-runs, diverging proof attempts, or basic proof structure, e.g. parallel casesplits. Plans for further proof process analysis routines are discussed next.
This section presents further ideas on analysing and understanding the capturedproof process. These are directions that will be explored in the remainder ofthis PhD research. By building upon the core functionality of the ProofProcessframework, we focus on capturing features of industrial-style proofs and inferringthe proof process automatically by learning from previous examples.
Existing proof reuse/proof planning techniques focus on the shapes of goal terms,e.g. a term is inductive or the goal has a certain skeleton We believethat lemma usage (and extraction) is an important proof feature, especially inindustrial-style proofs with similar data structures. Intermediate lemmas about
4 Implementations for Isabelle/HOL and Z/EVES are available; integration
data structures significantly facilitate automatic proof, however the lemmas arenot obvious and are closely related to the specific structure and characteristics ofthe model. The need for such a lemma would be discovered by the expert duringan interactive proof attempt. Proof features could be used to describe the needfor the lemma and its properties, e.g. how it is related to the data structure. Then a need for a similar lemma would be recognised when a similar PO isencountered next time. Given enough information about the original lemma, thenew lemma can be generated or adapted, and the general proof strategy reused.
Capturing used lemmas is paramount for highly automatic proof tactics,
e.g. ones that perform multiple goal rewrites and trThe factthat a specific lemma was required is not obvious from the tactic applicationand naive reuse would easily fail, leaving the user puzzled. Furthermore, certainlemmas (e.g. associativity, equality substitution) would help recognise specificproof patterns and understand the proof process better.
Industrial scale POs can total hundreds of terms in the proof goal arising fromcomplex data structures. This prevents use of automated tactics and requirescareful interactive guidance (i.e. to make sure that only interesting/requiredhypotheses are used). By analysing the goals before and after tactic applicationwe could try to identify affected terms. This would provide several advantages:
– Understanding the proof process: by following the narrow path of affected
goal terms, we could more easily understand the expert’s path of reasoning.
– Reuse information about “unimportant” terms: by inverting the scope, we
may isolate the terms which never change in certain proofs, thus limitingproof search in similar POs.
– Extracting a toy problem: the affected terms would constitute a minimal
example representing the given problem. It could be extracted as a simplifiedview of the PO, and analysed without the burden of surplus hypotheses.
A similar approach was taken in where hundreds of hypotheses in POs werefiltered to analyse which of them were used in the proof. The knowledge wasused to check whether a proof reuse was possible (e.g. if only unused hypotheseshave changed). Proof planning techniques do not perform such analysis, becausethey are usually applied to much smaller examples.
The approach can be refined further by investigating affected fields in com-
plex data structures; or important features of complex functions. For example,when faced with a list reverse function rev in the proof, the important prooffeature can be that rev is defined in terms of list append function append.
The ProofProcess model is designed to be flexible and allow the expert to spec-ify any proof process with arbitrary proof features. One of the aims of this PhD
5 E.g. auto in Isabelle/HOL prove by reduce in Z/EVES
project is to reuse this information automatically when a similar PO is encoun-tered. Recognising the similarities with a previously captured proof process isthe difficult part of achieving this objective. By performing generalisation andpartial matching of properties and intents with some uncertainty—we call this“fuzzy” matching—we aim to recognise the proof process being undertaken. TheProofProcess model allows setting proof features and intent on every level of theproof tree. When analysing the current proof, we can match it with previousproofs or their subproofs. The ultimate objective is to recognise that somethingsimilar has been attempted before, and infer that the proof process is similar. Note that similar approach and techniques would be applied when extracting theproof strategies from the collected proof process information. The difference isthat when inferring the proof process, we have access to more data at any proofstep—namely the application of the proof step tactic, and a resulting goal—thanwhen guiding the proof search with an extracted proof strategy.
A step even further would be analysing custom proof features. We anticipate
some of the proof properties to be quite abstract, domain specific, or simplydifficult to express formally. For example, an expert may wish to indicate thata function is “addable”. Such property is difficult to express formally, but if en-countered during interactive proof, the expert may easily relate it to appropriateproof strategy. The difficulty lies with matching such proof properties in similarproofs automatically, because we lack semantics for this property. One approachmay be to check what other properties can be inferred when this property isspecified, and relate them in some way.
The ProofProcess model is designed to be generic, thus custom properties
are supported. There is a possibility to provide an extensible mechanism (likelyprover-specific), which could allow specifying semantics for the custom proper-ties. The properties could be linked to custom “matchers”, which would be usedin the general framework.
Acknowledgements. Thanks to Leo Freitas, Gudmund Grov, Cliff Jones andother AI4FM members for the useful input to the ProofProcess framework. This PhD research is supported by EPSRC grant EP/H024050/1 (AI4FM).
1. Abrial, J.R.: Formal methods: Theory becoming practice. J. UCS 13(5), 619–628
2. Abrial, J.R.: Modeling in Event-B - System and Software Engineering. Cambridge
3. Abrial, J.R., Butler, M.J., Hallerstede, S., Hoang, T.S., Mehta, F., Voisin, L.:
Rodin: an open toolset for modelling and reasoning in Event-B. STTT 12(6), 447–466 (2010)
4. Bundy, A.: The Computer Modelling of Mathematical Reasoning. Academic Press
Professional, Inc., San Diego, CA, USA, digital (2010) edn. (1985)
5. Bundy, A.: The use of explicit plans to guide inductive proofs. In: Lusk, E., Over-
beek, R. (eds.) CADE. LNCS, vol. 310, pp. 111–120. Springer (1988)
6. Bundy, A., Basin, D., Hutter, D., Ireland, A.: Rippling: Meta-level Guidance
for Mathematical Reasoning, Cambridge Tracts in Theoretical Computer Science,vol. 56. Cambridge University Press (June 2005)
7. Butterfield, A., Freitas, L., Woodcock, J.: Mechanising a formal model of flash
memory. Sci. Comput. Program. 74(4), 219–237 (2009)
8. Copper, D., Barnes, J.: Tokeneer ID station EAL5 demonstrator: Summary report.
Tech. Rep. S.P1229.81.1 Issue: 1.1, Altran-Praxis (August 2008)
9. Dixon, L., Fleuriot, J.: IsaPlanner: A prototype proof planner in Isabelle. In:
Baader, F. (ed.) CADE, LNCS, vol. 2741, pp. 279–283. Springer (2003)
10. Duncan, H.: The Use of Data-Mining for the Automatic Formation of Tactics.
Ph.D. thesis, School of Informatics, University of Edinburgh (2007)
11. Felty, A.P., Howe, D.J.: Generalization and reuse of tactic proofs. In: Pfenning, F.
(ed.) LPAR. LNCS, vol. 822, pp. 1–15. Springer (1994)
12. Freitas, L., Woodcock, J.: Mechanising Mondex with Z/Eves. Formal Asp. Comput.
13. Heneveld, A.: Using Features for Automated Problem Solving. Ph.D. thesis, School
of Informatics, University of Edinburgh (February 2006)
14. Ireland, A., Bundy, A.: Productive use of failure in inductive proof. J. Autom.
15. Jamnik, M., Kerber, M., Pollet, M., Benzm¨
methods in proof planning. Logic Journal of the IGPL 11(6), 647–673 (2003)
uth, C.: Theorem reuse by proof term transformation. In: Slind,
K., Bunker, A., Gopalakrishnan, G. (eds.) TPHOLs. LNCS, vol. 3223, pp. 152–167. Springer (2004)
17. Jones, C.B.: Systematic Software Development using VDM. Prentice Hall Interna-
18. Krumnack, U., Schwering, A., Gust, H., K¨
order anti-unification for analogy making. In: Orgun, M.A., Thornton, J. (eds.)Australian Conference on Artificial Intelligence. LNCS, vol. 4830, pp. 273–282. Springer (2007)
19. Mehta, F.: Supporting proof in a reactive development environment. In: SEFM.
pp. 103–112. IEEE Computer Society, London, England, UK (2007)
20. Montano-Rivas, O., McCasland, R.L., Dixon, L., Bundy, A.: Scheme-based syn-
thesis of inductive theories. In: Sidorov, G., Aguirre, A.H., Garc´ıa, C.A.R. (eds.)MICAI (1). LNCS, vol. 6437, pp. 348–361. Springer (2010)
21. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for
Higher-Order Logic, LNCS, vol. 2283. Springer (May 2002)
22. Reif, W., Stenzel, K.: Reuse of proofs in software verification. In: Shyamasundar,
R. (ed.) Foundations of Software Technology and Theoretical Computer Science,LNCS, vol. 761, pp. 284–293. Springer (1993)
23. Saaltink, M.: The Z/EVES system. In: Bowen, J., Hinchey, M., Till, D. (eds.) ZUM
’97. LNCS, vol. 1212, pp. 72–85. Springer (1997)
24. Steinberg, D., Budinsky, F., Paternostro, M., Merks, E.: EMF: Eclipse Modeling
Framework. Addison-Wesley Professional, second edn. (December 2008)
25. Velykis, A., Freitas, L.: Formal modelling of separation kernel components. In:
eharbe, D., Gaudel, M.C., Woodcock, J. (eds.) ICTAC. LNCS,
vol. 6255, pp. 230–244. Springer (2010)
26. Wenzel, M.M.: Isabelle/Isar - a versatile environment for human-readable formal
proof documents. Ph.D. thesis, Technische Universit¨
27. Woodcock, J., Davies, J.: Using Z: Specification, Refinement, and Proof. Prentice
Medicinal products and driving* On behalf of the working Group “Medicinal products and automobile driving” created by Afssaps: Christian RICHE (chairman), Charles CAULIN (vice-chairman), Jacques CARON, Anne CHIFFOLEAU, Christian CORBE, Bertrand DIQUET, Alain ESCHALIER, Françoise HARAMBURU, Georges LAGIER, Jean-Pierre LEPINE, Michel MALLARET, Charles MERCIER-GUYON, Louis MERLE, Jea