Understanding Validation Criteria In A DBMS Model
Validation criteria and semantic (or consistency) constraints relating to
the stored database are actually an extension of the database definition.
A semantic constraint defines the acceptable value domain for an attribute
or a consistency relationship between several attribute values. For
example, the net pay for a period should equal the gross pay less
deductions for income tax withheld, pension, insurance premiums, etc. In
this example, the database designer may define the net pay as a derived
data item to be calculated whenever requested. In this case, the
expression becomes the derivation rule.
Alternatively, the net pay could be explicitly defined, stored and maintained in the database. Now the expression becomes a validation rule used to check the consistency of the values in the database. Due to the dependency, whenever any gross pay or deduction data is entered or modified, the net pay must be modified simultaneously. The designer clearly faces a tradeoff in choosing to explicitly store the data item or to derive its value when necessary.
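The two design alternatives can be illustrated with a minimal sketch. The names `net_pay_rule`, `DerivedPayRecord` and `StoredPayRecord` are hypothetical and chosen only for illustration; the point is that one expression serves as a derivation rule in the first class and as a validation rule in the second.

```python
from dataclasses import dataclass
from decimal import Decimal


def net_pay_rule(gross: Decimal, deductions: list) -> Decimal:
    """The single expression: gross pay less all deductions (hypothetical helper)."""
    return gross - sum(deductions, Decimal("0"))


@dataclass
class DerivedPayRecord:
    """Net pay is not stored; the expression acts as a derivation rule."""
    gross: Decimal
    deductions: list

    @property
    def net_pay(self) -> Decimal:
        # Calculated whenever requested, so it can never be out of date.
        return net_pay_rule(self.gross, self.deductions)


@dataclass
class StoredPayRecord:
    """Net pay is stored; the same expression acts as a validation rule."""
    gross: Decimal
    deductions: list
    net_pay: Decimal

    def is_consistent(self) -> bool:
        # Compare the stored value with what the rule says it should be.
        return self.net_pay == net_pay_rule(self.gross, self.deductions)


deductions = [Decimal("150.00"), Decimal("50.00")]
derived = DerivedPayRecord(Decimal("1000.00"), deductions)
stored = StoredPayRecord(Decimal("1000.00"), deductions, Decimal("800.00"))
was_consistent = stored.is_consistent()

# Gross pay is modified without modifying net pay at the same time.
stored.gross = Decimal("1100.00")
now_consistent = stored.is_consistent()
```

In the stored variant the record becomes inconsistent as soon as gross pay changes without a matching change to net pay, which is the maintenance cost the designer trades against the cost of recomputing the derived value on every request.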
Nevertheless, with either alternative, the user must specify the semantic information as part of the data definition process. If these semantics are not defined to the system such that it knows what the data should look like, the users are collectively responsible for ensuring that the values in the database remain consistent. Some argue that validation criteria and semantic constraints should not be a part of the database definition. The difficulty with such an argument is in determining where definitional information ends and validation information begins.
Validation involves comparing stored data to some expression of what the stored data should look like. In fact, all database definition information provides a basis for validation. To define an item as numeric integer means that a value containing any alphabetic or special characters must be rejected as being invalid for that item. Other parts of the database definition information provide semantic rules for screening out unacceptable values or operations in the database: enumeration of values or ranges on a value set, limitations on the number of instances of an item or repeating group, declaring a data item to be mandatory in a record or unique across entities in a class, or exclusive or dependent characteristics of a relationship.
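As a rough sketch of how definition information can double as validation information, the following assumes a hypothetical `ItemDefinition` structure that records a type, a mandatory flag, a range, an enumeration of values and a uniqueness flag. None of these names come from a particular DBMS; they only mirror the kinds of rules listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ItemDefinition:
    """Hypothetical definition of one data item, used here as validation input."""
    name: str
    type_: type
    mandatory: bool = False
    allowed_range: Optional[tuple] = None   # inclusive (low, high)
    allowed_values: Optional[set] = None    # enumeration of acceptable values
    unique: bool = False                    # unique across existing records


def validate_record(record: dict, definitions: list, existing: list) -> list:
    """Return violation messages; an empty list means the record is acceptable."""
    errors = []
    for d in definitions:
        value = record.get(d.name)
        if value is None:
            if d.mandatory:
                errors.append(f"{d.name}: mandatory item is missing")
            continue
        if type(value) is not d.type_:
            errors.append(f"{d.name}: expected {d.type_.__name__}")
            continue  # further checks assume a value of the right type
        if d.allowed_range is not None:
            low, high = d.allowed_range
            if not (low <= value <= high):
                errors.append(f"{d.name}: {value} outside range {low}..{high}")
        if d.allowed_values is not None and value not in d.allowed_values:
            errors.append(f"{d.name}: {value!r} is not an enumerated value")
        if d.unique and any(r.get(d.name) == value for r in existing):
            errors.append(f"{d.name}: duplicate value {value!r}")
    return errors


EMPLOYEE = [
    ItemDefinition("emp_no", int, mandatory=True, unique=True),
    ItemDefinition("age", int, allowed_range=(16, 99)),
    ItemDefinition("status", str, allowed_values={"active", "retired"}),
]
existing = [{"emp_no": 1, "age": 40, "status": "active"}]

good = validate_record({"emp_no": 2, "age": 30, "status": "active"}, EMPLOYEE, existing)
bad = validate_record({"emp_no": "1A", "age": 130, "status": "active"}, EMPLOYEE, existing)
```

Here the value `"1A"` is rejected purely because the item is defined as an integer, which is the sense in which all definition information provides a basis for validation.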
The viewpoint here is that validation is a process, not a set of criteria. It is a process which compares data to its definition. The data may be stored in the database or in update transactions. The validation criteria may be stored as part of the database definition or the update transaction definition, or they may be embedded in the validation program or in the transaction processing program. Ideally, they should be part of the stored database definition so that they can be enforced at all avenues of access which update the stored data. The more complete and comprehensive the definition, the more effective the validation process can be.
The process of validation requires the specification of three pieces of information:
- Validation criteria or semantic constraints.
- Condition under which the database is to be tested against the criteria.
- Action the system is to take in response to a detected violation.
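The three pieces of information listed above can be pictured as one rule specification. This is only a sketch: `ValidationRule`, `When`, `Action` and `apply_rules` are hypothetical names, and the two conditions and two actions shown are assumed examples rather than an exhaustive set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class When(Enum):
    """Condition under which the database is tested (assumed examples)."""
    ON_EACH_UPDATE = "on each update"
    ON_COMMIT = "on commit"


class Action(Enum):
    """Response to a detected violation (assumed examples)."""
    REJECT = "reject"
    WARN = "warn"


@dataclass
class ValidationRule:
    name: str
    criterion: Callable[[dict], bool]  # returns True when the data is valid
    when: When
    action: Action


def apply_rules(db: dict, rules: list, event: When) -> list:
    """Test db against the rules whose condition matches event.

    Returns the names of violated WARN rules and raises ValueError
    for a violated REJECT rule.
    """
    warnings = []
    for rule in rules:
        if rule.when is not event or rule.criterion(db):
            continue  # rule not due at this event, or criterion satisfied
        if rule.action is Action.REJECT:
            raise ValueError(f"rule violated: {rule.name}")
        warnings.append(rule.name)
    return warnings


rules = [
    ValidationRule(
        name="age in range",
        criterion=lambda db: 16 <= db["age"] <= 99,
        when=When.ON_EACH_UPDATE,
        action=Action.WARN,
    ),
]
warnings_now = apply_rules({"age": 130}, rules, When.ON_EACH_UPDATE)
warnings_later = apply_rules({"age": 130}, rules, When.ON_COMMIT)
```

Keeping the condition and the action separate from the criterion means the same criterion can be tested at different times, or answered with a different response, without being rewritten.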
If the database were static, there would be no need for the second piece
of information: the database could be checked at any time and would always
satisfy the stated validation criteria. The database does change, however,
as update processes act upon it. If an update process consists of multiple
steps, the database could temporarily pass through an invalid state. This
would happen, for example, between the posting of a debit and its
corresponding credit in an accounting transaction.
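The debit and credit case can be sketched as a check that is deferred until every step of the update has been posted. The `Ledger` and `Transaction` classes below are hypothetical, and the sign convention (debits positive, credits negative) is an assumption made to keep the criterion to a single sum.

```python
class Ledger:
    """Hypothetical ledger; debits are positive amounts, credits negative."""

    def __init__(self):
        self.entries = []  # list of (account, amount) pairs

    def is_balanced(self) -> bool:
        # Validation criterion: debits and credits must sum to zero.
        return sum(amount for _, amount in self.entries) == 0


class Transaction:
    """Defers the balance check until all steps of the update are posted."""

    def __init__(self, ledger: Ledger):
        self.ledger = ledger
        self._snapshot = None

    def __enter__(self):
        self._snapshot = list(self.ledger.entries)  # state to restore on failure
        return self

    def post(self, account: str, amount: int) -> None:
        # No check here: the ledger may be temporarily unbalanced.
        self.ledger.entries.append((account, amount))

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None and self.ledger.is_balanced():
            return False  # criterion holds at the end of the transaction
        self.ledger.entries = self._snapshot  # action: undo the whole update
        if exc_type is None:
            raise ValueError("transaction left the ledger unbalanced")
        return False  # let the original exception propagate


ledger = Ledger()
with Transaction(ledger) as tx:
    tx.post("cash", 100)                         # debit posted
    balanced_mid_way = ledger.is_balanced()      # credit still pending
    tx.post("sales", -100)                       # corresponding credit
balanced_at_end = ledger.is_balanced()
```

Testing the criterion after the first `post` would report a violation that is only an artifact of timing, so the check is applied once, when the transaction ends.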
Specifying when to apply validation criteria can avoid testing a database during a temporarily invalid state.
Although the structural information may be sufficient for a user to comprehend and reference the database, it does not provide sufficient information to enable the system to store and subsequently access the stored data. Whereas the structural definition serves to build up a data structure, the storage structure information defines how the system is to break down the structure, map it onto storage media and devices and subsequently access it.
The characteristics of the secondary storage devices must be defined or assumed by the system. This includes such information as the physical block size, the devices and volumes used to store the data and how to partition the database to fit onto the volumes. The dominant characteristics of storage devices are that they consist of a linear sequence of physical blocks, each consisting of a linear sequence of character spaces.
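To make the linear view of storage concrete, the sketch below maps a stream of fixed-width records onto fixed-size blocks and translates a character offset into a block number and a position within the block. The block size of 16 characters, the record layout and the function names `to_blocks` and `locate` are all assumptions chosen for illustration; real devices use much larger blocks.

```python
BLOCK_SIZE = 16  # characters per physical block (assumed, unrealistically small)


def to_blocks(data: str, block_size: int = BLOCK_SIZE) -> list:
    """Split a character stream into fixed-size blocks, padding the last one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size)
    return blocks


def locate(offset: int, block_size: int = BLOCK_SIZE) -> tuple:
    """Translate a linear character offset into (block number, position in block)."""
    return divmod(offset, block_size)


# Hypothetical fixed-width records: 8-character name, 4-digit number, 8-character status.
records = [("SMITH", 42, "ACTIVE"), ("JONES", 43, "RETIRED")]
stream = "".join(
    f"{name:<8}{number:04d}{status:<8}" for name, number, status in records
)
blocks = to_blocks(stream)

# The second record starts at character offset 20 of the stream.
block_no, position = locate(20)
name_of_second = blocks[block_no][position:position + 5]
```

Because the record length (20) is not a multiple of the block size (16), the second record starts part-way through one block and continues into the next, which is one reason the system needs the block size defined before it can find stored data again.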