Explicit allocation on the heap
A simple example
Traditionally, most imperative languages have placed the responsibility for the allocation and deallocation of objects on the heap with the programmer. In Pascal, memory is allocated in the heap by the new procedure. Given a pointer variable p, new(p) causes p to point to newly allocated storage for an object of the type to which p is declared to point. The object is deallocated or freed by calling dispose(p). The program fragment is Algorithm 1.1 on the following page creates a list [1,2,3].
program pointer(input, output);
type ptr = ↑cell;
cell = record
value : integer;
next : ptr
end;
var myList : ptr;
function Insert (item : integer; list : ptr) : ptr;
var temp : ptr;
begin
new(temp);
temp↑.value := item;
temp↑.enxt := list;
Insert := temp;
end;
begin
myList := Insert(1, Insert(2, Insert(3, nil)))
end
Algorithm 1.1 Dynamic allocation of a list in Pascal
Diagram 1.1 The list built by algorithm 1.1
Garbage
Dynamically allocated storage may become unreachable. Objects that are not live, but are not free either, are called garbage. With explicit deallocation, garbage cannot be reused; its space has leaked away. We could generate a space leak in the program in Algorithm 1.1 on the following page by adding a line
$$ myList↑.next∶= nil;
$$ after the list is created(Diagram 1.2).
Now only the first element of list is accessible to the program; the memory containing items 2 and 3 is out of the program’s reach and can neither be used nor recovered. Automatic storage management can recover inaccessible memory: this is the subject of this book.
Diagram 1.2 myList↑.next∶= nil creates a space leak
Dangling references
Memory can also be deallocated while there are still references to it. Suppose we replace the new line in Algorithm 1.1 by
$$ dispose(myList↑.next);
$$
to return item 2 to the heap manager. Again, item 3 has become garbage: this small leak will not harm our tiny program (see Diagram 1.3 on the page). However, the next field of item 1 refers to memory that has been deallocated. A dangling reference has been created.
The program has no control over the use to which the disposed storage is put. It may be cleared, used to store book-keeping information or recycled by the heap manager. If the program follows the dangling reference, the best that can be hoped for is that it will crash immediately. If the heap manager had reallocated the disposed memory to another of the program’s data structures, a single location would represent two different objects. If we are lucky, the program will eventually crash at some future point. If we are unlucky, it will continue to run but produce incorrect results.
Diagram 1.3 dispose(myList↑.next) creates a space leak and a dangling pointer
Sharing
Garbage and dangling references are the two sides of the same coin of explicit allocation. Garbage is created by destroying the last reference before an object is deallocated. Dangling references are created by deallocating an object while references to it remain. It might appear that the solution is that both actions --- destruction of the last reference and deallocation of its target --- should be co-ordinated, but this is not easy in the presence of sharing.
Suppose two lists share a common suffix (see Diagram 1.4 on the following page). A well-behaved list disposal routine will recursively deallocate each item of a list when he pointer to the head of the list is destroyed. However, if either cat or mat were destroyed in this way, the other would consist of a single item and a dangling pointer. This was the problem that led to interest in automatic storage reclamation techniques in the late 1950s [McCarthy, 1981].
Diagram 1.4 Two lists may share a common suffix. mat∶= Insert(‘m’,cat↑.next);
Failures
Dynamic memory in complex programs is hard to manage correctly with explicit allocation and deallocation, and examples of failing programs are common. Programs crash unexpectedly and servers run out of memory for no apparent reason. The effect of such programming errors is indeterminate, particularly in multi-threaded environments. Dangling references may be benign if the hap manager does not reallocate that particular object. Space leaks may lie dormant under testing and even under normal conditions of use. Failures commonly only surface when the program is put under stress or left running for long periods. For example, the input to a complier may be machine generated and violate assumptions about the shape of code that a programmer might reasonably be expected to write. Space leaks may remain undiscovered when the code is run on the development machine. However, when executed on a machine with a smaller memory or on a long-running server, the leak may exhaust the memory. Debugging under these conditions is extremely difficult as failures are often unrepeatable.