Teradata has lived and understood the data warehouse environment for decades over their competitors. One of the key fundamentals of the teradata is in the ability to allow the AMPs to access data quicker with Partition Primary Indexes. Partition Primary Indexes are always and I mean always defined in the Physical Model. In the past Teradata has hashed the Primary Index, which produced a Row Hash. From the Row Hash, Teradata was able to send the row to a specific AMP. The AMP would place a uniqueness value and the Row Hash plus the Uniqueness value made up the Row ID. The data on each AMP was grouped by table and sorted by ROW ID.
Through years of experience working with data warehouse user queries Teradata has decided to take the hashing to an additional level. In the past you could choose a Unique Primary Index (UPI) or a Non-Unique Primary Index (NUPI). Now Teradata will let you choose either a Partition Primary Index (PPI) or a Non-Partition Primary Index (NPPI).
This allows for fantastic flexibility because user queries will often involve ranges or are specific to a particular department, location, region, or code of some sort. Now the AMPs can find the data quicker because the data is grouped in alphabetical order. You can avoid Full Table Scans more often. An example is definitely called for here. I will show you a table that is hashed and another that has a Partition Primary Index.
Partitioning doesnt have to be part of the Primary Index
understanding partitioning is easy once you understand the basic steps. You do not have to partition by a column that is the primary index. Here is an example:
CREATE SET TABLE EMPLOYEE_TABLE ( EMPLOYEE INTEGER NOT NULL ,DEPT INTEGER ,FIRST_NAME VARCHAR(20) ,LAST_NAME CHAR(20) ,SALARY DECIMAL(10,2) ) PRIMARY INDEX (EMPLOYEE) PARTITION BY DEPT;
You can NOT have a UNIQUE PRIMARY INDEX on a table that is partitioned by something not included in the Primary Index. Here is an interesting brain teaser? If a Primary Index is Non-Unique on a Partition Primary Index table can Teradata utilize the Primary Index column to access the row or rows? Absolutely! Teradata scans all partitions that have not been eliminated and the Hashed Primary Index value is scanned for the Row Hash.
The Bad NEWS about Partitioning on a column that is not part of the Primary Index Before you get too excited about partitioning by a column that is not part of the primary index you should remember The Alamo. This is because when you run queries that dont mention the PARTITION COLUMN in your SQL you have to check every partition and this can be some serious battle. Partitions can range from 1-65,535 partitions. The example below will have to check every partition so be careful.
GREAT Things about Partition Primary Indexes: PPI avoids full table scans without the overhead of a secondary index and allows for instantaneous dropping of old data and rapid addition of newer data. Remember these rules: A Primary KEY cant be changed. A Primary Index always distributes the data. A Partition Primary Index (PPI) partitions data to avoid full table scans.
Two ways to handle Partitioning on a column that is not part of the Primary Index You have two ways to handle queries when you partition by a column that is not part of the Primary Index. a. You can assign a Unique Secondary Index (when appropriate). b. You can include the partition column in your SQL.