Normalization
- It is the process of organizing data into related table.
- To normalize database, we divide database into tables and establish relationships between the tables.
- It reduces redundancy. It is done to improve performance of query.
Steps of normalization:
First Normal form
Entities of the table must have unique identifier or entity key.
Second Normal Form
All the attributes of the table must depend on the entity key for that entity.
Third Normal Form
All attributes that are not part of the key must not depend on any other non-key attributes.
De-normalization
The process of adding redundant data to get rid of complex join, in order to optimize database performance. This is done to speed up database access by moving from higher to lower form of normalization.
Example:----
Rules for First Normal Form (1NF)
Eliminate repeating groups. This table contains repeating groups of data in the Software column.
Computer | Software |
1 | Word |
2 | Access, Word, Excel |
3 | Word, Excel |
To follow the First Normal Form, we store one type of software for each record.
Computer | Software |
1 | Word |
2 | Access |
2 | Word |
3 | Excel |
3 | Word |
3 | Excel |
Rules for second Normal Form (2NF)
Eliminate redundant data plus 1NF. This table contains the name of the software which is redundant data.
Computer | Software |
1 | Word |
2 | Access |
2 | Word |
3 | Excel |
3 | Word |
3 | Excel |
To eliminate the redundant storage of data, we create two tables. The first table stores a reference SoftwareID to our new table that has a unique list of software titles.
Computer | SoftwareID |
1 | 1 |
2 | 2 |
2 | 1 |
3 | 3 |
3 | 1 |
3 | 3 |
SoftwareID | Software |
1 | Word |
2 | Access |
3 | Excel |
Rules for Third Normal Form (3NF)
Eliminate columns not dependent on key plus 1NF and 2NF. In this table, we have data that contains both data about the computer and the user.
Computer | User Name | User Hire Date | Purchased |
1 | Joe | 4/1/2000 | 5/1/2003 |
2 | Mike | 9/5/2003 | 6/15/2004 |
To eliminate columns not dependent on the key, we would create the following tables. Now the data stored in the computer table is only related to the computer, and the data stored in the user table is only related to the user.
Computer | Purchased |
1 | 5/1/2003 |
2 | 6/15/2004 |
User | User Name | User Hire Date |
1 | Joe | 5/1/2003 |
2 | Mike | 6/15/2004 |
Computer | User |
1 | 1 |
2 | 1 |
What does normalization have to do with SQL Server?
To be honest, the answer here is nothing. SQL Server, like any other RDBMS, couldn't care less whether your data model follows any of the normal forms. You could create one table and store all of your data in one table or you can create a lot of little, unrelated tables to store your data. SQL Server will support whatever you decide to do. The only limiting factor you might face is the maximum number of columns SQL Server supports for a table.
SQL Server does not force or enforce any rules that require you to create a database in any of the normal forms. You are able to mix and match any of the rules you need, but it is a good idea to try to normalize your database as much as possible when you are designing it. People tend to spend a lot of time up front creating a normalized data model, but as soon as new columns or tables need to be added, they forget about the initial effort that was devoted to creating a nice clean model.
To assist in the design of your data model, you can use the DaVinci tools that are part of SQL Server Enterprise Manager.
Advantages of normalization
2. Better performance:
b. Fewer indexes per table mean faster maintenance tasks such as index rebuilds.
c. Only join tables that you need.
Disadvantages of normalization
2. Tables contain codes instead of real data: Repeated data is stored as codes rather than meaningful data. Therefore, there is always a need to go to the lookup table for the value.
3. Data model is difficult to query against: The data model is optimized for applications, not for ad hoc querying.
No comments:
Post a Comment