ACSL Mini-Tutorial
18 Pages
English

ACSL Mini-Tutorial

Downloading requires you to have access to the YouScribe library
Learn all about the services we offer

Description

ACSL Mini-Tutorial
1Virgile Prevosto
1 CEA LIST, Software Security Laboratory, Saclay, F-91191
This work has been partially supported by the CAT ANR project (ANR-05-RNTL-0030x) and by the FP6
European project Open TC. Chapter 1
Foreword
This document is a brief introduction to the ANSI/ISO C Specification Language (ACSL). ACSL allows to formally
specify the properties of a C program, in order to be able to formally verify that the implementation respects these
properties. This verification is done via tools that are able to take into account ACSL annotations attached to the C
code. This tutorial focuses on the most important ACSL constructs and gives an intuitive grasp of their semantics,
through short and illustrative examples of C code annotated with ACSL formulas. A complete reference of the
ACSL language can be found in [1]. ACSL is inspired from the specification language used by Caduceus [2],
which is itself derived from the Java Modeling Language (JML, see [3]). Chapter 2
A First ACSL Example
The most important ACSL concept is the function contract. A function contract for a C function f is a set of
requirements over the arguments of f and/or a set of properties that are ensured at the end of the function. The
formula that expresses the requirements is called a pre-condition, whereas the formula that expresses the properties
ensured whenf returns is a post-condition. Together, these conditions form a contract betweenf and its callers:
each caller must guarantee that ...

Subjects

Informations

Published by
Reads 281
Language English
ACSL Mini-Tutorial
Virgile Prevosto 1 1 CEA LIST, Software Security Laboratory, Saclay, F-91191
This work has been partially supported by the CAT ANR project (ANR-05-RNTL-0030x) and by the FP6 European project Open TC.
Chapter 1
Foreword
This document is a brief introduction to the ANSI/ISO C Specification Language (ACSL). ACSL allows to formally specify the properties of a C program, in order to be able to formally verify that the implementation respects these properties. This verification is done via tools that are able to take into account ACSL annotations attached to the C code. This tutorial focuses on the most important ACSL constructs and gives an intuitive grasp of their semantics, through short and illustrative examples of C code annotated with ACSL formulas. A complete reference of the ACSL language can be found in [ 1 ]. ACSL is inspired from the specification language used by Caduceus [ 2 ], which is itself derived from the Java Modeling Language (JML, see [ 3 ]).
Chapter 2
A First ACSL Example
The most important ACSL concept is the function contract . A function contract for a C function f is a set of requirements over the arguments of f and/or a set of properties that are ensured at the end of the function. The formula that expresses the requirements is called a pre-condition , whereas the formula that expresses the properties ensured when f returns is a post-condition . Together, these conditions form a contract between f and its callers: each caller must guarantee that the pre-condition holds before calling f . In exchange, f guarantees that the post-condition holds when it returns. Let us consider the example of the max function. Informally, its specification can be expressed this way: the function max takes two int as arguments, and returns the greatest one. Let us see how this can be expressed in ACSL: /*@ ensures \result >= x && \result >= y; ensures \result == x || \result == y; */ int max (int x, int y) { return (x > y) ? x : y; } As can be seen above, ACSL annotations are written in special C comments, the difference with plain comments being that annotations begin with ’ /*@ ’. It is also possible to write one-line annotations introduced by ’ //@ ’. The function contract is written immediately above the function declaration. In this example, the contract contains only post-conditions ( ensures clauses), as max does not have any particular requirement. The first ensures clause expresses the fact that the result of max is greater than both x and y , the arguments of max . The second clause mandates that the result is equal to either x or y . Since both clauses must hold, our ACSL specification indeed expresses that the result of max is the greatest of its two arguments. It should be noted already that there is an intuitive demarcation between “complete” and “partial” specifica-tions. The above specification for max can be thought of as “complete”, meaning that any function that satisfies the specification should be deemed a satisfactory implementation for max . Partial specifications on the other hand express some of the properties that are expected to hold for any implementation, but satisfying them is not suffi-cient for an implementation to be satisfactory. Generally speaking, partial formal specifications are the most likely to be encountered in practice for real-life examples, complemented by informal specifications. For instance, the above specification for max is in fact partial, for reasons that will become clearer later.
Chapter 3
Pointers
3.1 A First Specification Let us now consider a small program involving pointers. Informally, the max_ptr function takes two pointers as argument, and if necessary swaps the two pointed values so that the value stored in the first pointer is the minimal one and the value stored in the second pointer is the maximal one. A specification can be the following: /*@ requires \valid (p) && \valid (q); ensures *p<=*q; */ void max_ptr(int *p, int* q); Here, we have a pre-condition (the requires clause). Namely, we want our function to be called with valid pointers as arguments. This is what the built-in ACSL predicate \valid says. Note that \valid takes into account the static type of its argument: in our context, \valid(p) indicates that there address p is in-cluded in an allocated block which is large enough to store an int starting at p . This is thus different from \valid((char *)p) for instance. Our post-condition is that when the function returns, the value pointed to by p is less than or equal to the value pointed to by q . A correct implementation for max_ptr is then void max_ptr(int*p, int*q) { if (*p > *q) { int tmp = *p; *p = *q; *q = tmp; } } Namely, since we require max_ptr to be called with valid pointers, there is no need to perform any check on them. Then, if *p is already less than or equal to *q , there’s nothing to do. If not, we just swap the content of the two pointers.
3.2 Building a Complete Specification The implementation seen in the previous section is correct with respect to the specification of max_ptr . Unfor-tunately, this is not the only conforming implementation: the specification is only partial, and is for instance met by the following function: void max_ptr(int* p, int*q) { *p = *q = 0; }
3.3 Degree of Completeness
Indeed, since 0 0 , we have *p<=*q at the end of the function. Depending on the verification tasks we have in mind (see next section), we may want to refine our specification to avoid an implementation such as above. Namely, we want to enforce a relation between the values pointed to by p and q at the beginning and at the end of the function. A possible specification is then /*@ requires \valid (p) && \valid (q); ensures *p <= *q; ensures (*p == \old (*p) && *q == \old (*q)) || (*p == \old (*q) && *q == \old (*p)); */ void max_ptr(int* p, int*q); The \old built-in function says that its argument must be evaluated in the pre-state ( i.e. at the beginning) of the function. The second ensures clauses thus says that we either leave p and q unchanged or swap the two pointed values, which, together with the first clause implies that we meet exactly the informal specification.
3.3 Degree of Completeness The previous section has teached us that writing a fairly complete specification (in fact we could still add some clauses to the specification above, as we will see in the next chapters) is not immediate, and thus that it is easy to come up with only a partial specification. Hence, it raises two frequently asked questions: how can we be sure that our specification is complete, and how complete must a specification be. The answers however do not lie in ACSL itself. For the first one, one must reason on some model of the specified application. For the second one, there is no definite answer. It depends on the context in which the specification is written and the kind of properties that must be established: the amount of specification required for a given function is very different when verifying a given property for a given application in which calls to the function always occur in a well-defined context and when specifying it as a library function which should be callable in as many contexts as possible.
CALSutotirla
4
Chapter 4
Behaviors
The second ensures clause of our final specification of max_ptr is a bit complicated, and does not explain immediately in which case we will end up. It is possible to express that differently, by using ACSL’s behav-iors . A function can have several behaviors in addition to a general specification. A behavior can have additional ensures clauses, but in addition, it can also have assumes clauses, which indicate when the behavior is trig-gered. There is no requirement for the behaviors to cover all contexts in which a function can be called, and behaviors need not to cover disjoint cases. This can be further specified by the complete behaviors and disjoint behaviors clauses, as in the following specification. /*@ requires \valid (p) && \valid (q); ensures *p <= *q; behavior p_minimum: assumes *p < *q; ensures *p == \old (*p) && *q == \old (*q); behavior q minimum: _ assumes *p >= *q; ensures *p == \old (*q) && *q == \old (*p); complete behaviors p_minimum, q_minimum; disjoint behaviors p_minimum, q_minimum; */ void max_ptr(int* p, int*q); We explain here precisely in which case we keep the same values and in which case we swap. Note that the global ensures clause is implied by the ensures clauses of the two behavior and the fact that p_ minimum q_ omplete set of behaviors. and minimum form a c
Chapter 5
Arrays
5.1 Basic Specification Now that we have specified a max_ptr function, we can use it to extract the maximal value found in a sequence of int s. A first step is to write the prototype of the corresponding function with its specification. /*@ requires n > 0; requires \valid (p+ (0..n 1)); ensures \forall int i; 0 <= i <= n 1 ==> \result >= p[i]; ensures \exists int e; 0 <= e <= n 1 && \result == p[e]; */ int max_seq(int* p, int n); The function takes two arguments: a pointer p to the block containing the int s, and the number n of elements in the sequence. This time, there are pre-conditions on these arguments. First, it is not possible to compute the maximal value of an empty sequence, and so n must be positive. Moreover, the block pointed to by p must contain at least n elements. In other words, p[0] , p[1] , ... p[n-1] must all be valid memory accesses. This is what the second requires clause expresses. The two ensures clauses display some similarities with the contract of the max function above: the result is greater than or equal to every value in the sequence, and there exists an index for which the equality is attained. Note that the formulas in the post-condition only make sense under the assumption that the pre-condition holds: the validity condition ensures that it makes sense to speak about p[i] , and if n could be 0 , it would not be possible to find an index where \result is attained.
5.2 Advanced specification Note: This section can be skipped on a first reading In addition, a more advanced ACSL construction allows to provide a shorter specification of max_seq : \max is a built-in constructor (together with \min , \sum , and a few others), that returns the maximal value taken by a function in an interval of integers. With \max and the \lambda construction to write functions as first-class expressions, our specification becomes: /*@ requires n > 0 && \valid (p + (0..n 1)); ensures \result == \max (0,n 1, \lambda integer i; p[i]); */ int max_seq(int* p, int n);
5.3 Implementation The implementation of the max_seq function is pretty straightforward. For instance, we can use the following code.
5.3 Implementation
int max_seq(int* p, int n) { int res = *p; for (int i = 0; i < n; i++) { if (res < *p) { res = *p; } p++; } return res; }
The specification we have given in the preceding section defines an expected behavior for the function max_seq . We will see later in this document that actually verifying that the function max_seq implements the specification from the preceding section may require additional work.
ACSLtutorial
7
Chapter 6
Assigns clauses
As with the initial specification of max_ptr , an implementation of max_seq could zero all the locations p[0] , p[1] ,. . . , p[n-1] , return zero, and would still satisfy the post-conditions in the specification from sec-tion 5.1 . Again, we can use the \old keyword to avoid that. /*@ requires n > 0; requires \valid (p+ (0..n 1)); ensures \forall int i; 0 <= i <= n 1 ==> p[i] == \old (p[i]); ensures \forall int i; 0 <= i <= n 1 ==> \result >= p[i]; ensures \exists int e; 0 <= e <= n 1 && \result == p[e]; */ int max_seq(int* p, int n); It would be possible, but tedious, to use the same approach to specify that global variables do not change during the execution of max_seq . The ACSL language provides a special clause to specify that a function is not allowed to change memory locations other than the ones explicitly listed. This clause is the assigns clause, and it is part of the function contract. When no assigns clauses are specified, the function is allowed to modify every visible variable. In presence of such clauses, the function can only modify the content of the locations that are explicitly mentioned in these clauses. In our case, we do not expect max_seq to have any visible side-effect, so that the contract becomes: /*@ requires n > 0; requires \valid (p+ (0..n 1)); assigns \nothing ; ensures \forall int i; 0 <= i <= n 1 ==> \result >= p[i]; ensures \exists int e; 0 <= e <= n 1 && \result == p[e]; */ _ int max seq(int* p, int n); Again, it is not necessary to use \old for the values of the expressions p[i] and p[e] in the the post-conditions, since the specification forces them to stay unchanged during the e _ q . xecution of max se assigns clauses can be found in behaviors too. Writing appropriate assigns clauses for p_minimum and q_minimum to complete the sp _ptr is left as an exe ecification of max rcise for the reader.
Chapter 7
Termination
There is yet another property that is implicitely expected from a satisfactory implementation of max_seq . Namely, this function, when called with arguments that satisfy its pre-conditions, should eventually terminate (and return a result that satisfies its post-conditions). The assigns clause in itself only guarantees that each time the function terminates, only the specified locations may have been modified. Similarly, the post-conditions only apply when the function terminates, and so a function that never terminates would for instance satisfy the assigns and ensures parts of the specification for max_seq . The termination of the function is a separate property that can be specified in its contract using the terminates clause. Because in practice many functions are implicitely expected to terminate, the default in ACSL is to expect functions to terminate in all the contexts that satisfy their pre-conditions. It is possible to relax a particular func-tion’s specification by providing a formula that describes the conditions in which the function is guaranteed to terminate. An implementation is then allowed not to terminate when it is called in contexts that do not satisfy this condition. In the following example, the function f can be called with any argument c , but the function is not guaranteed to terminate if c<=0 . /*@ assigns \nothing ; terminates c>0; */ void f (int c) { while (!c); return ;} Another valid contract for the same implementation of f is the following one, where the function implicitely guarantees to terminate whenever it is called, but expects to be called only with non-zero arguments. /*@ requires c!=0; assigns \nothing ; */ void f (int c) { while (!c); return ;}
Chapter 8
Predicates and Logic Functions
So far, we have only used logical built-ins operators and relations. It is often needed to define new logic predicates and logic functions. For instance, if we define (simply) linked lists as such: typedef struct _list { int element; struct _list* next; } list; there are some common properties of lists that we want to be able to deal with in the logic. In particular, the notion of reachability (can a given node be attained from some root through a chain of next fields) plays an important role. It can be defined in ACSL through the following annotation: /*@ inductive reachable{L} (list* root, list* node) { case root_reachable{L}: \forall list* root; reachable(root,root); case next_reachable{L}: \forall list* root, *node; \valid (root) ==> reachable(root >next, node) ==> reachable(root,node); } */ The notion of reachability is defined in ACSL by an inductive predicate. Namely, root_reachable and next_reachable can be seen as axioms indicating the cases under which reachable must hold, and the fact that reachable is inductive implies that these are the only cases under which it can hold. root_reachable s both its arguments are q _ indicates that reachable holds as soon a e ual. next reachable indicates that if root is valid and node can be proved reachable from root->next , then it is also reachable from root . The {L} notation indicates that the predicate takes as parameter a label, which represent a memory state of the program. Indeed, as soon as we deal with pointers or arrays, we must refer to a particular memory state: reachable has no absolute meaning, it must be tied to a point of the program. In a case like this, there is no need to explicitely instantiate the label when we use reachable : the memory state to which it refers will usually be inferred from the context ( e.g. for a precondition, it is the state at the start of the function, and for a post-condition, it is the state just before returning to the caller). More complex predicate may relate two or more memory states, but such definitions are out of scope in this tutorial. Now, the reachable predicate can be used to discriminate between circular and finite lists: in a finite list, we will ultimately reach \null ; /*@ predicate finite{L}(list* root) = reachable(root, \null ); */ Here, it is not necessary to embed finite as inductive. Instead, we directly give its definition in terms of reachability. We can also note that finite is also parameterized by a label L . In fact, it is implicitely the memory state in which reachable itself is evaluated. It would have been possible to make that explicit by writing reachable{L}(root,\null) instead, but this is not necessary here, as L denotes the only state in the envi-ronment at this level.