Sunday, September 26, 2010

Primary keys

A key is a set of columns that can be used to identify or access a particular row or rows. The key is identified in the description of a table, index, or referential constraint. The same column can be part of more than one key.

A unique key is a key that is constrained so that no two of its values are equal. The columns of a unique key cannot contain NULL values. For example, an employee number column can be defined as a unique key, because each value in the column identifies only one employee. No two employees can have the same employee number.

The mechanism used to enforce the uniqueness of the key is called a unique index. The unique index of a table is a column, or an ordered collection of columns, for which each value identifies (functionally determines) a unique row. A unique index can contain NULL values.

The primary key is one of the unique keys defined on a table, but is selected to be the key of first importance. There can be only one primary key on a table.

A primary index is automatically created for the primary key. The primary index is used by the database manager for efficient access to table rows, and allows the database manager to enforce the uniqueness of the primary key. (You can also define indexes on non-primary key columns to efficiently access data when processing queries.)

If a table does not have a “natural” unique key, or if arrival sequence is the method used to distinguish unique rows, using a time stamp as part of the key can be helpful.

Primary keys for some of the sample tables are:

Table
Key Column
Employee table
EMPNO
Department table
DEPTNO
Project table
PROJNO

The following example shows part of the PROJECT table, including its primary key column.

Table 1. A Primary Key on the PROJECT Table
PROJNO (Primary Key) PROJNAME DEPTNO
MA2100 Weld Line Automation D01
MA2110 Weld Line Programming D11

If every column in a table contains duplicate values, you cannot define a primary key with only one column. A key with more than one column is a composite key. The combination of column values should define a unique entity. If a composite key cannot be easily defined, you may consider creating a new column that has unique values.

The following example shows a primary key containing more than one column (a composite key):

Table 2. A Composite Primary Key on the EMP_ACT Table
EMPNO (Primary Key) PROJNO (Primary Key) ACTNO (Primary Key) EMPTIME EMSTDATE (Primary Key)
000250 AD3112 60 1.0 1982-01-01
000250 AD3112 60 .5 1982-02-01
000250 AD3112 70 .5 1982-02-01

Identifying candidate key columns

To identify candidate keys, select the smallest number of columns that define a unique entity. There may be more than one candidate key. In Table 3, there appear to be many candidate keys. The EMPNO, the PHONENO, and the LASTNAME columns each uniquely identify the employee.

Table 3. EMPLOYEE Table
EMPNO (Primary Key) FIRSTNAME LASTNAME WORKDEPT (Foreign Key) PHONENO
000010 Christine Haas A00 3978
000030 Sally Kwan C01 4738
000060 Irving Stern D11 6423
000120 Sean O’Connell A00 2167
000140 Heather Nicholls C01 1793
000170 Masatoshi Yoshimura D11 2890

The criteria for selecting a primary key from a pool of candidate keys should be persistence, uniqueness, and stability:

  • Persistence means that a primary key value for each row always exists.
  • Uniqueness means that the key value for each row is different from all the others.
  • Stability means that primary key values never change.

Of the three candidate keys in the example, only EMPNO satisfies all of these criteria. An employee may not have a phone number when joining a company. Last names can change, and, although they may be unique at one point, are not guaranteed to be so. The employee number column is the best choice for the primary key. An employee is assigned a unique number only once, and that number is generally not updated as long as the employee remains with the company. Since each employee must have a number, values in the employee number column are persistent.

No comments:

Post a Comment