Implementation of protection in an OS generally involves three factors. The interface to the user, the interface to the hardware, and the decision making process with regard to filling requests. If we expand our thinking, we can make the same statement about protection in any environment where there are external users, underlying capabilities, and a protection function to be performed.
Because the external interface is so heterogeneous and so little has been done to model its effect on protection systems, it is difficult to cover it at more than a cursory level. Hardware protection and the decision making mechanisms are fairly well developed arts, however, so we will cover them here at length.
The implementation of protection in OSs almost always depends heavily on a hardware separation mechanisms. A separation mechanism is a way to partition information into areas that only communicate through well defined and controlled channels. In order to enforce separation against a serious attacker, it is insufficient to make information flow inconvenient or available only to the knowledgeable as is the case in most personal computer systems.
One way to provide protection is to simulate a hypothetical machine on a physical machine so that all operations of the hypothetical machine are controlled by the simulation. This could be as secure a system as any purely physical system, but the performance of the physical machine is severely reduced because most of the time is spent in controlling the simulation rather than performing user level processing. A purely physical, and very reliable method must be used for most real-world applications.
A physical means for implementing protection is typically provided by architectural features of computer hardware. In order to provide physical partitioning, we must assure that there is no path through which information can flow between users. In the hardware domain, information is typically processed by finite state machines (FSMs) which maintain a state, and transform inputs combined with state information into outputs and changes in state [Meally55] [Moore56-2] . The only information that persists in such a system is state information, and thus we can protect information from flowing between domains by keeping their state information separate. This is just like OSs based on a sign-up sheet, wherein each user has full access to the machine over a different period of time.
A typical model for a computer system is the Von Neumann model in which a machine consists of a control unit (C), an input and output unit (I/O), an arithmetic logic unit (ALU), and a memory unit (M) [VonNeumann63] . The machine fetches instructions telling it what to do next from the memory. The control unit which consists of FSMs then controls M, I/O, ALU, and C to implement the instructions. In most such machines, state information is maintained in a set of special memory elements called "registers", in M, and in peripheral devices such as disks and tapes. *****
There are generally two classes of these registers. One class is used primarily by C for remembering the instruction being performed, the portion of the instruction currently under way, the conditions of the ALU, the memory location of the next instruction to be performed, and other control related information. The other class is used to store the current values of data registers associated with user programs including the result of the last arithmetic operation, the user's general purpose registers, and other user data.
Registers are generally quite expensive to implement compared to the state information stored in M and I/O devices, they operate at extremely high speeds compared to state information stored M and I/O, and they are central to the interpretation of instructions, so they must be connected to many other devices using specially designed complex switching devices. As a result, it is generally not cost or performance effective to provide enough sets of registers to store state information for all of the processes that could possibly coexist on a typical machine. As an alternative, designers provide two sets of registers that can be stored in and reloaded from M, the set of registers in use at any given time being associated with the protection state of the machine.
When the OS is executing its 'kernel' program, it uses a completely independent set of state information from that used by any other process, and it generally has access to the full range of hardware instructions available to the physical machine. When any other program is operating, it generally uses limited state information, and has access to only a limited number of instructions that effect the physical machine. When the machine is using the kernel registers and has full access, we will the machine is running in the kernel state, or kernel mode, and when the other state registers are used and limited instructions are allowed, we say the machine is in the user state, or user mode. Several citations are given in [Denning82] for protected hardware states, but as yet, the originator of this concept has not been identified.
In user mode, machine instructions that are only allowed to kernel mode cannot be executed, so a typical method for providing these services to users is through 'system calls'. A system call is typically implemented by executing an instruction which is not allowed in user mode. This causes an error which the hardware interprets as a request to change to the kernel mode. Since the register that determines where the next instruction comes from is different in kernel mode than in user mode, the next instruction comes from the kernel's memory location after the location of the instruction which the kernel used to return to user mode after the last kernel call. This instruction is typically the beginning of a routine which determines the nature of the request from the state of the user's registers and memory, and either processes the request or denies it.
A similar process occurs whenever the system clock signals to the kernel that it's time to switch to another user's turn to execute programs in a timesharing environment. In this case, the kernel typically saves the values of the old user's registers in the kernel's memory space, restores the register values associated with the new user from kernel space, and returns to user mode. Thus, the next user continues wherever they left off, and the last user waits to regain access to the system.
The astute reader will see several problems that remain to be resolved. In particular, the memory of the machine must be protected in some manner or the state information of one domain could be altered or examined by writing or reading its memory from another domain. Similarly, state information in peripheral devices may be exploited in this manner unless some additional protection is provided. Finally, information may be exchanged between domains by differing usage of shared facilities.
Because of the prominent position of memory in the fetching of instructions and data and the storage of control state information for the kernel and user domains, it presents a problem that must be addressed by hardware in order to afford efficient processing under most current architectures. Memory is generally protected through the use of a technique called 'memory mapping'. Memory mapping involves the use of a special set of control registers that translate memory locations specified by a user into physical memory locations in M. Thus a given user might access a memory location numbered 125, which the 'translation buffer' maps into physical location 2978. If the kernel assures that the memory maps of different users don't have overlaps in physical memory locations, we are guaranteed that they don't share state information, and thus we maintain the separation mechanism. Similarly, shared memory can be implemented with this mechanism for controlled communication. In most practical systems, memory mapping is done on a page by page basis (i.e. in 1000 word groups) so that the size of the memory map needn't be enormous or expensive.
All I/O is generally performed by kernel mode instructions so the kernel can enforce protection of I/O resources. In the case of terminal IO, the user may have an interface that is nearly identical to the kernel's instructions, while for shared devices such as disks and printers, the kernel may abstract the physical characteristics of the device completely, and leave the user with a purely logical abstraction such as a file system.
The logical extension of the two state machine (i.e. kernel and user mode) is the use of machines with numerous states, each with an increasing level of control. In general, a two state machine is much simpler to implement. Because complexity increases the likelihood of errors, and all protection relevant instructions are likely to be equally critical to system wide protection, this is usually the way implementations are performed. If multiple logical protection 'rings' are desired, they can be implemented with a two state machine in which certain domains are treated differently than others by the kernel, rather than by providing a multitude of control registers and classes of instructions in hardware.
The design and implementation of protection in operating systems is a very difficult problem. Although it is possible to implement a correct system through purely ad-hoc methods, as we saw earlier, these techniques have proven inadequate in case after case. Even if such a system were correctly implemented, we would never be able to be sure of that fact unless we could provide some sort of believable proof of its correctness. When OS changes are made, it is highly unlikely that they will maintain the correctness of the OS unless there is some formal method to describe and analyze their effects. Thus much of the current effort is directed towards the implementation of provably correct systems. [Landwehr83] This is typically done through a series of phases in which we design a formal policy, formal models, formal specifications, formal implementations, formal management, and formal maintenance techniques for system use [Klein83] . Some of these steps involve extremely complex procedures, and often take days of CPU time on supercomputers in order to perform analysis. Some work has also been done on the verification of policy maintenance at compile time and at runtime, but results have not come into widespread use as of this time [Denning82] .
In most OSs, user processes perform privileged operations by performing system calls which request the OS to perform those operations for them. System calls may or may not be in correspondence with hardware instructions, and part of the job of the OS is to translate the system calls into sets of operations which abstract the physical hardware from the user such that programs need not be rewritten for each change in hardware configuration. OSs typically abstract virtual memory, a file structure, a process structure, interrupts, communications protocols, and a variety of other mechanisms that are useful to the user.
If we were to implement protection by assuring that all of these mechanisms operated properly, system complexity would preclude implementing a system that could be proven correct. Instead, we generally implement a minimal protection mechanism (i.e. a kernel) that acts as a 'reference monitor' for the system, and implement high level OS services by providing user callable routines that call upon the kernel only for security relevant operations. The security kernel provides protection for itself, other OS routines, and user information, to assure that each is protected from the next. Corrupt OS service routines outside the kernel may cause improper functional behavior or bizarre results, but don't cause violations of the protection policy. The more involved the protection policy is, the larger the trusted portion of the OS must become, and the less trusted it becomes due to its increased complexity.
Access matrices are widely used to hold a symbolic representation of the set of rights available to subjects for access to objects [Denning82] . In the case of the POset policy, 'flow control matrices' hold only a single right which determines whether or not flow is permitted from domain to domain [Cohen86] , while more complex policies require more rights, and thus more complex software. Matrices are well understood data structures which have been used for a long time, and implementations are very straight forward. The programmer merely implements a table lookup for every protection related OS request to determine whether or not the requested right is to be granted.
In cases where there are a large number of subjects and objects, access matrices tend to become very large, and are dominated by empty space. An alternative representation is a set of lists wherein each subject is associated with a list of objects for each type of right. Variations on this theme are easily generated. These structures are also well understood, but because of the unknown size of the structures and the dynamic manner in which they can change size, they may considerably complicate implementation, and thus cause difficulties in assuring proper operation. Most experienced programmers feel that they can write a perfect version of a dynamic structure which works for all possible cases, but the complexities of dynamic structures may be quite subtle when implemented without all of the support usually granted to the programmer by the OS. As an example, there is no dynamic memory access unless it is explicitly programmed, and this implies that an entire dynamic memory access facility will have to become part of the trusted kernel of the OS.
Most policies specify that information must be labeled at all times so that the sensitivity level of information can be determined by observation, and errors in the proper maintenance of information can be detected. Physical and logical labels are typically included on all output from a system, information in its stored form in the system, backup tapes, IO devices, disks, terminals, and everything else imaginable. Labels in the OS are usually maintained as part of the control information associated with files or devices.
Most protection policies require that security relevant operations be auditable, and thus that an uncircumventable and uncorruptable mechanism allow tracking of all subjects performing security relevant operations on all objects. This is not feasible if the security kernel is corrupt, but it provides a degree of redundancy that makes undetectable abuse of the system much more difficult even in the presence of kernel errors. More importantly, it may detect misuse and abuse by systems administrators, managers, and maintainers as well as users.
In the case of a provably secure system, the security kernel must be proven correct. Correctness theory as it currently exists doesn't include provisions for limiting side effects, and thus 'correctness' is not sufficient for correct operation, but rather only a necessary condition. Such proofs typically require 1 day for a 200 line limited Pascal program assuming a CRAY-1 is used to perform the proofs, that the program is of a restricted class, and that the things being proven are sufficiently simplistic. The time required for this type of proof goes up almost exponentially with the length of the program, so keeping the security kernel small is a central issue in keeping it correct.
Another typical implementation constraint is that the security kernel be easily read and understood by a knowledgeable third party. This is intended to preclude the inclusion of an obvious Trojan horse, but in the strict sense does not preclude the introduction of subtle problems that can be exploited.
Basic administrative tools are generally required in order to facilitate the management of secure systems. Some tools have been designed and developed, but there is no consistently applied theory or practice for this aspect of implementation. Simple tools for risk analysis, assuring the maintenance of B-L, Biba, and compartments, limiting the effects of collusions, and automating configuration management have been prototyped [Cohen86] [Cohen87-2] [Cohen87-3] , but these techniques are only the beginning of exploration in this area.