RoR World of Programmers: Primary keys

A key is a set of columns that can be used to identify or access a particular row or rows. The key is identified in the description of a table, index, or referential constraint. The same column can be part of more than one key.

A unique key is a key that is constrained so that no two of its values are equal. The columns of a unique key cannot contain NULL values. For example, an employee number column can be defined as a unique key, because each value in the column identifies only one employee. No two employees can have the same employee number.

The mechanism used to enforce the uniqueness of the key is called a unique index. The unique index of a table is a column, or an ordered collection of columns, for which each value identifies (functionally determines) a unique row. A unique index can contain NULL values.

The primary key is one of the unique keys defined on a table, but is selected to be the key of first importance. There can be only one primary key on a table.

A primary index is automatically created for the primary key. The primary index is used by the database manager for efficient access to table rows, and allows the database manager to enforce the uniqueness of the primary key. (You can also define indexes on non-primary key columns to efficiently access data when processing queries.)

If a table does not have a “natural” unique key, or if arrival sequence is the method used to distinguish unique rows, using a time stamp as part of the key can be helpful.

Primary keys for some of the sample tables are:

Table: Key Column
Employee table: EMPNO
Department table: DEPTNO
Project table: PROJNO

The following example shows part of the PROJECT table, including its primary key column.

Table 1. A Primary Key on the PROJECT Table
PROJNO (Primary Key)	PROJNAME	DEPTNO
MA2100	Weld Line Automation	D01
MA2110	Weld Line Programming	D11

If every column in a table contains duplicate values, you cannot define a primary key with only one column. A key with more than one column is a composite key. The combination of column values should define a unique entity. If a composite key cannot be easily defined, you may consider creating a new column that has unique values.

The following example shows a primary key containing more than one column (a composite key):

Table 2. A Composite Primary Key on the EMP_ACT Table
EMPNO (Primary Key)	PROJNO (Primary Key)	ACTNO (Primary Key)	EMPTIME	EMSTDATE (Primary Key)
000250	AD3112	60	1.0	1982-01-01
000250	AD3112	60	.5	1982-02-01
000250	AD3112	70	.5	1982-02-01

Identifying candidate key columns

To identify candidate keys, select the smallest number of columns that define a unique entity. There may be more than one candidate key. In Table 3, there appear to be many candidate keys. The EMPNO, the PHONENO, and the LASTNAME columns each uniquely identify the employee.

Table 3. EMPLOYEE Table
EMPNO (Primary Key)	FIRSTNAME	LASTNAME	WORKDEPT (Foreign Key)	PHONENO
000010	Christine	Haas	A00	3978
000030	Sally	Kwan	C01	4738
000060	Irving	Stern	D11	6423
000120	Sean	O’Connell	A00	2167
000140	Heather	Nicholls	C01	1793
000170	Masatoshi	Yoshimura	D11	2890

The criteria for selecting a primary key from a pool of candidate keys should be persistence, uniqueness, and stability:

Persistence means that a primary key value for each row always exists.
Uniqueness means that the key value for each row is different from all the others.
Stability means that primary key values never change.

Of the three candidate keys in the example, only EMPNO satisfies all of these criteria. An employee may not have a phone number when joining a company. Last names can change, and, although they may be unique at one point, are not guaranteed to be so. The employee number column is the best choice for the primary key. An employee is assigned a unique number only once, and that number is generally not updated as long as the employee remains with the company. Since each employee must have a number, values in the employee number column are persistent.

RoR World of Programmers

Sunday, September 26, 2010

Primary keys

Identifying candidate key columns

No comments:

Post a Comment