In data warehouse environment each dimension table should have primary key which uniquely identifies dimension record. We can use natural keys or business keys as a primary key in dimension table however these keys are not recommended to be used as primary key in dimension table due to following reasons.
1. These keys are generally intelligent alphanumeric keys like ABC123 which consumes lot of indexes space when used as primary key. It makes index traversing slower and index size big. 2. Business keys are often re-uses over the period of time say after 5 years product key prd123 might be used for some other products. And data warehouse keeps historical as well as current data which might problematic if business keys are re-used and used as primary key. To solve this issue, surrogate keys are used as primary key in dimension tables.
Surrogate keys are nothing but integers which do not have any meaning in terms of business and used as primary key in dimension table. Due to this surrogate keys are often called as meaningless key. Data warehouse best practice is to have dimension table joined to fact table using only surrogate key and not business key.
Benefits of using surrogate key.
1. As we discussed earlier, business keys can be re-used over the period of time. Using surrogate keys can make dimension table handle this change easily as surrogate keys are meaningless less numbers and even if business keys are re-used new surrogate key can identify dimension record uniquely. 2. Surrogate keys are generally small integer numbers, which makes index size smaller when used as index column. This gives better performance due small index size. 3. Surrogate keys can be used when source data do not have consistent keys to uniquely identify a record. This often comes handy when new data source are introduced in a data warehouse environment. 4. Surrogate keys are needed to implement slowly changing dimensions.