KB#00044-Advantage article describing file locking in BBx
Advantage article describing file locking in BBx
This article presents an overview of maintaining data integrity in a multi-tasking environment. Most of us think of data protection in terms of file and record locking, and that will be the main focus of this article.
When discussing concurrent access to data, invariably the word "atomic" comes up. An operation is considered "atomic" if it can conceptually execute uninterrupted with respect to potentially conflicting operations. In plain English, an operation can be interrupted but with the guarantee that no other process will interfere with it during the interruption. If two processes are attempting to access the same data, one of them must complete the operation before the other begins. The data being accessed can be anything from a single bit to an entire database. There are several ways this is accomplished. One way is to guarantee that no interruptions take place. This is done at the hardware level by disabling cpu interrupts. Normally this is impractical except for brief periods by low-level device drivers. Another way is to use a shared "flag" that indicates that a particular operation is being performed. If a program wishes to access shared data, it must set (lock) the "flag" prior to doing anything. When finished, it must reset (unlock) the flag so that other processes may get their turn:
lock_flag --> perform_operation --> unlock_flag
Only one program can have the flag locked at a time (variations on this will be discussed later). If a program attempts to lock the flag when it is already locked, then that program will either wait for it to be unlocked or generate an error.
Once a program has control over the flag, it has virtually unlimited safe access to the data in question. The access may be only an inquiry or it may also perform an update. The important thing is that the data access is "atomic". This is necessary even if the program is merely performing an inquiry. This may seem counter-intuitive to the BBx programmer but I will later show why this is always true. The single exception to this reasoning is when all accesses to the data are guaranteed to be inquiries, in which case no protection is needed. For the remainder of this article we will assume that at least one process may potentially modify the data.
Atomic operations can show up in places we normally wouldn't think about. On a single-task system such as MS-DOS, one would not think that modifying data could ever be unsafe. However, even a simple system such as MS-DOS uses cpu interrupts to process keyboard input and timer counters. Since interrupts can take place at any time, any access to keyboard buffers (and other data modified by interrupt handlers) must be protected.
Another unexpected example is when performing simple reads and writes to disk on a multi-tasking system (such as Unix). When a program attempts to read a record from a disk file, it is possible that the record can cross disk sector boundaries requiring two accesses to the disk in order to read the entire record. Another task could be trying to write that very same record (also requiring two disk accesses). If these various disk accesses were interleaved, then the program reading the data could get half of the old record and half of the new record. Therefore, simple reads and writes to a disk file are performed "atomically" through the use of lock flags.
The software "flag" described in the preceding paragraphs has an official name. It is called a "semaphore". A handy analogy to this mechanism is a railroad track in which 2 trains traveling in opposite directions must share the same stretch of track. Railroad semaphores (stop lights) are placed at each end of the track so that each train knows if another train is currently using the track. Needless to say, failure to observe such a semaphore could have disastrous results. It's only a matter of time.
In order to use semaphores effectively it is important to understand a subtle but important concept about semaphores: Semaphores do not lock the data; they lock the procedures that access the data. The use of semaphores always follows the following pattern:
lock --> perform operation --> unlock
Advisory locking vs. mandatory locking
There seems to be a recurring debate about the virtues of advisory locking and mandatory locking. "Advisory locking" is essentially what I have been describing so far. A program requests a specific lock before performing a critical operation followed by an unlock. Since the program must first perform a lock, this is the most logical point to check to see if another program already has the lock. Assuming that the lock was successful, no further lock checks are necessary while performing the operation. With advisory locking, the programmer decides exactly when and how locks are used.
"Mandatory locking", in a sense, doesn't really exist. At best it's a relative term that always requires further clarification. Strictly speaking, mandatory locking implies that certain operations will be semaphored automatically. This is more of a convenience issue than a safety issue. When using mandatory locking, the programmer must always consider whether or not the automatic locking taking place is performing the desired protection.
Semaphores and BBx
Now that we know what locking is all about, I will apply these concepts to the way BBx deals with locks. For now, I will concentrate on the locking that is available at the programmer level--not the internal locks that are invisible to the application.
The only kinds of semaphores available in BBx control access to data files. Normally, disk file access uses mandatory locking--record level lock checks are performed on all accesses. For example, the BBx READ verb implies the following:
lock record --> read data --> unlock record
Notice how the record is briefly locked during the read. A WRITE verb does essentially the same thing:
lock record --> write data --> unlock record
If an application wanted to update the data in the record, it would have to read the data, modify it, and write it back out. If the READ and WRITE verbs were used to do this, the operation would come out like this:
lock --> read --> unlock --> modify --> lock --> write --> unlock
Notice that the update is invalid. The record lock is released and then re-locked in the middle of the update operation. Therefore, the update was not atomic and the data cannot be trusted.
The problem is resolved with the EXTRACT verb. The EXTRACT verb performs the following sequence:
lock record --> read data
There is no unlock. If the read-modify-write operation is performed with EXTRACT instead of READ then the following sequence takes place:
lock --> read --> modify --> lock (redundant) --> write --> unlock
Now the update is 100% safe since the lock is not released until the whole operation is complete. However, notice the redundant lock performed by the WRITE verb. While this doesn't compromise data integrity, it does add unneeded overhead to the WRITE verb.
The problem with BBx's mandatory locking is that it imposes locks when they aren't necessary and yet it never adds a degree of additional safety--not even to the careless programmer.
The main complaint about mandatory locking is that an application cannot READ a record that has been EXTRACTed by another user. Clearly, an application attempting a READ instead of an EXTRACT is intending only to make an inquiry. (READing prior to an update is not safe regardless of the locking mode in use.)
With advisory locking enabled, BBx will not perform a lock check on a READ. The READ verb now performs this sequence:
read data --> unlock record if we had it locked
Notice that a conditional unlock is performed since the READ causes the file pointer to move to a new record. Likewise, the WRITE verb is affected the same way:
write data --> unlock record if we had it locked
Some people react to the idea of a WRITE without checking for a lock. However, an update WRITE without a prior EXTRACT is always wrong so the lock check is not helping and serves only to slow down WRITEs.
More about inquiry reads
Earlier in this article, I stated that accesses to shared data must always be protected even if the access is for inquiry only. However, I have just shown how lock checks are not performed on a READ with advisory locking. In reality, there is an internal lock check being performed to make sure the disk read operation is atomic (which is a separate lock check from the EXTRACT). This is safe only when the data involved in the inquiry is contained in a single record. In this case, the inquiry access to the data is performed atomically. However, if the data being read involves several records, there is no way to simply read them all without additional coding logic. This is because BBx's semaphores only protect access to single records as opposed to groups of records.
If an operation requires several file accesses to complete then the programmer has to implement his own special semaphore to perform that operation. The common way of doing it is to EXTRACT a certain record of the file while performing the multiple record update (or inquiry). Since reading other records will cause the EXTRACT to lose the lock, the file must be opened on two channels--one channel to hold the EXTRACT and the other channel to do the work.
Another way to guarantee atomic access to multiple records is to lock the whole file with the LOCK verb. Unfortunately, there are too severe restrictions on the LOCK verb to make it useful for on-the-fly semaphoring (the file must not be opened by another user). Therefore, the LOCK verb is mainly used for large batch updates to a database.
The REMOVE verb
Normally, the REMOVE verb will check for a lock on the record it is removing.
Surprisingly enough, this behavior doesn't change with advisory locking. Some folks might ask why the REMOVE verb checks for locks while the WRITE verb doesn't. The answer is simple if you realize that a REMOVE verb performs a complete update to the record whereas the WRITE verb is only part of an update operation (beginning with an EXTRACT). Therefore, the whole REMOVE operation is completely semaphored.
So far, I have presented a semaphore as an all-or-nothing lock allowing only one process to access data at a time. While this type of semaphore is effective at protecting data, it can be an overkill at times. BBx's keyed files serve as a good example here. The key area of the file is a complicated data structure. In order to look up a key it could take several disk accesses to the key area. It also takes several disk accesses to modify the key area (add a new key or remove an existing key). It doesn't take much imagination to realize that two programs updating the key area at the same time could corrupt the file. However, it's less obvious that if one program is reading the key area while another is updating it, the inquiring program could get confused since the data structure is not consistent until the update is complete. This could cause a core dump or a lock up even though the file itself won't get corrupted. Obviously, access to the key area must be semaphored. However, if several programs wish to look up keys in the key area there is no reason why they should do it one at a time. If several inquiries could be made concurrently then system throughput will increase.
This problem is solved by using an "inquiry semaphore" (or a "read-permit lock"). An inquiry semaphore has more states than a simple "locked" or "unlocked" status. Now you may request a "lock for inquiry" or a "lock for update". The semaphore will allow any number of concurrent inquiries. However, if a program wants to do an update, it must wait until all inquiries are complete before access is granted for an update (hopefully no new inquiry requests will be allowed to "cut in line" ahead of the pending update). While the update is taking place, all inquiry requests must wait until the single update process releases the semaphore.
Don't hog the semaphore
There is yet another problem encountered when using semaphores--the question of how long to keep something locked. Ideally, a data structure should not be locked any longer than necessary to perform an inquiry or update. However, sometimes the human factor must be taken into account.
Consider an airline reservation system. When a passenger makes a reservation on a flight, he will be allowed to reserve a specific seat. Somewhere in a computer database is a record showing the seat availability for that flight. An inquiry must be made to see which seats are available. The customer must decide which seat to take and the record must be updated to reflect the newly reserved seat.
This process could be taking place for that very same flight at a dozen locations all over the world. It is quite impractical to EXTRACT the record and leave it extracted while the customer makes up his mind. In reality, the first read is merely an inquiry. When the customer finally selects a seat, the record must be read again with an EXTRACT and updated. However, it is possible that the same seat had just been taken by somebody else (since the record wasn't kept locked all this time). The software must check for this possibility and the customer will be asked to select another seat.
There is no guarantee that this process will eventually succeed. It is mathematically possible that someone will repeatedly beat the customer to the desired seat until there are no more seats left. That's simply something we must live with--there are no magical answers. In any case, the database is never corrupted.
The deadly embrace
There remains one classical dilemma of semaphore usage--the "deadlock" or "deadly embrace". This situation is possible when a process wants control of more than one semaphore. As an example, assume process 1 has control of semaphore A and process 2 has control of semaphore B. Now process 1 wants semaphore B also. Process 1 must wait until process 2 releases semaphore B. However, if process 2 is currently waiting for semaphore A, a deadlock is created. There are entire chapters in text-books dedicated to the detection and prevention of deadlocks, and that discussion is beyond the scope of this article. However, it is important to be aware that such a situation can occur with excessive use of multiple EXTRACTs on different files.
Last Modified: 12/18/1997 Product: PRO/5 Operating System: All platforms
BASIS structures five components of their technology into the BBx Generations.