Is Normalization is an important aspect of a good database design?
Normalization of data
is a process of analyzing the given relational schemas based on their
functional dependencies and primary keys to achieve some desirable properties
such as, Minimizing data redundancy, and minimizing the insertion, deletion
and updation.
Primary key is basically used to avoid duplicate values or datas
in our parent table, moreover a parent table can have only one primary
key but we can assign a single primary key to combination of columns,
which is called a composite primary key. It is nothing but process of
building a database structure to store data, formal process of developing
a data structure in a manner to eliminate data redundancy and promotes
data integrity.
A database is said to be a relational system, when the database system
satisfies Codd twelve rules and relational theory in algebra. In a relational
database system there exist some relationship among the stored datas,
it is also possible to form relations from the stored data with the help
of relational algebra. Normalization takes a relation schema through a
series of tests to certify whether it satisfies certain normal form.
The process proceeds in top down fashion by evaluating each relation against
the criteria for normal forms and decomposing relations are necessary.
Redundancy is unnecessary repetition of data. It can cause problems with
storage and retrieval of datas in the databases. During the process
of normalization, dependencies can be identified easily. Normalization
theory is based on the functional notion of functional dependency and
moreover it helps in simplifying the structure of tables. There are four
levels of normalization.
First normal form:
It is a relation in which intersection of each rows and column contains
only one value. It is used to identify the repeating group of datas
within the same table, repeating groups is a set of information that stores
similar information that repeats in the same table and these repeating
datas are separated. It is now considered to be part of the formal
definition of the relation in the basic relational model. It states that
the domain of an attribute should contain only atomic values. In other
words first normal form it disallows relation within relation.
Second normal form:
The relation is said to be second normal from if it satisfies first normal
form and every non-key attribute is functionally and fully dependent on
the primary key. Functional dependency, every column in our database table
should be fully and functionally dependent on primary key. No non-key
attribute should exist in the relation. Any data, which is not dependent,
should be segregated. To establish second normal form we need to decompose
and set up new relation for each partial key with its dependent attribute.
Third normal form:
Third normal form is based on the concept of transitive dependency. Relation
is said to be in third normal form if it satisfies both first and second
normal form. Transitive dependencies in a relation are functionally dependent
between one or more non-key attribute. Columns in each table should be
a group of columns that contributes to the description of each row in
our database table. In order to achieve third normal forms decompose and
set up a relation that includes the non-key attribute that functionally
determine other non-key attribute.
BCNF (Boyce Codd Normal form):
A database design is in BCNF if each member of the set of relation schemas
that constitutes the design is in BCNF. BCNF checks for non-trivial dependency
and checks for relational schema.
A functional dependency
is a property of the semantics or meaning of the attributes. A functional
dependency helps us to detect poor E-R design. If the generated relations
are not in desired normal from, the problem can be fixed in the E-R diagram.
That is, normalization can be done as a part of data modeling. Alternately
normalization can be done formally on the relations generated fro the
E-R model.
Even though normalization helps in reducing redundancy, but sometimes
when information is required from more than one table at a faster rate,
it is essential to introduce some amount of redundancy in the table. This
intentional introduction of redundancy into the table is called denormalization.
Thus functional dependency helps in maintaining a relationship among the
datas present in the table. Any columns that are not dependent on
the primary of the parent table should be segregated. Every column should
be dependent on the primary key of the parent table.