Primary Indexes - Teradata Database

Database Introduction

Product
Teradata Database
Release Number
15.00
Language
English (United States)
Last Update
2018-09-25
dita:id
B035-1091
lifecycle
previous
Product Category
Teradata® Database

You can create a table with a Unique Primary Index (UPI), a Non-Unique Primary Index (NUPI), or No Primary Index (NoPI).

 

IF you create a table with...

THEN...

a NUPI

the PI is a column, or columns, that may have duplicate values.

a UPI

the PI is a column, or columns, that has no duplicate values.

NoPI

there is no PI column and rows are not hashed based on any column values.

Primary Indexes and Data Distribution

Unique Primary Indexes (UPIs) guarantee uniform distribution of table rows.

Nonunique Primary Indexes (NUPIs) can cause skewed data. While not a guarantor of uniform row distribution, the degree of uniqueness of the index will determine the degree of uniformity of the distribution. Because all rows with the same PI value are distributed to the same AMP, columns with a small number of distinct values that are repeated frequently do not make good PI candidates.

The most efficient access methods to get data in a table is through the PI. For this reason, choosing a PI should take the following design goal into consideration: choosing a PI that gives good distribution of data across the AMPs must be balanced against choosing a PI that reflects the most common usage pattern of the table.

If you do not explicitly specify a primary index when a table is created, Teradata Database uses the first column as the nonunique primary index by default.

User tables can also be created explicitly without a primary index. These NoPI tables are used for special purposes. NoPI tables are typically used as staging tables to allow faster data loading. NoPI tables must be used for tables that will use Teradata Columnar partitioning.

Primary Key

A Primary Key (PK), a term that comes from data modeling, defines a column, or columns, that uniquely identify a row in a table. Because it is used for identification, a PK cannot be null. There must be something in that column, or columns, that uniquely identify it. Moreover, PK values should not be changed. Historical information, as well as relationships with others tables, may be lost if a PK is changed or re-used.

A PK is a logical relational database concept. It may or may not be the best column, or columns, to choose as a PI for a table.

Foreign Key

A Foreign Key (FK) identifies table relationships. They model the relationship between data values across tables. Relational databases, like Teradata Database, permit data values to associate across more than one table.

Thus each FK a table may have must exist somewhere as a PK. That is, there must be referential integrity between FKs and PKs.

Relationships Between Primary Indexes and Primary Keys

The following table describes some of the relationships between PKs and PIs.

 

Primary Key

Primary Index

Identifies a row uniquely.

Distributes rows.

Does not imply access path.

Defines most common access path.

Must be unique.

May be unique or nonunique.

May not be null.

May be null.

Causes a Unique Primary Index (UPI) or Unique Secondary Index (USI) to be created.

N/A

Constraint used to ensure referential integrity.

Physical access mechanism.

Required by Teradata Database only if referential integrity checks are to be performed.

Defined for most production tables. Some staging tables may not have a primary index (NoPI table).

  • If Teradata Database performs referential integrity checks, then the column limit is 64.
  • If Teradata Database performs no referential integrity checks, then there is no arbitrary column limit.
  • 64-column limit.

    Values should not be changed if you want to maintain data integrity and preserve historical relations among tables.

    Values can be changed.

    The columns chosen for the UPI of a table are frequently the same columns identified as the PK during the data modeling process, but no hard-and-fast rule makes this so. In fact, physical database design considerations often lead to a choice of columns other than those of the primary key for the PI of a table.