Normalization of
data is a process of analyzing the given relational schemas based on
their functional dependencies and primary keys to achieve some desirable
properties such as, Minimizing data redundancy, and minimizing the insertion,
deletion and updation.
______________________________________________________________________________
______________________________________________________________________________
Primary key is basically used to avoid duplicate values or datas
in our parent table, moreover a parent table can have only one primary
key but we can assign a single primary key to combination of columns,
which is called a composite primary key. It is nothing but process of
building a database structure to store data, formal process of developing
a data structure in a manner to eliminate data redundancy and promotes
data integrity.
A database is said to be a relational system, when the database system
satisfies Codd twelve rules and relational theory in algebra. In a relational
database system there exist some relationship among the stored datas,
it is also possible to form relations from the stored data with the
help of relational algebra. Normalization takes a relation schema through
a series of tests to certify whether it satisfies certain normal form.
The process proceeds in top down fashion by evaluating each relation
against the criteria for normal forms and decomposing relations are
necessary. Redundancy is unnecessary repetition of data. It can cause
problems with storage and retrieval of datas in the databases.
During the process of normalization, dependencies can be identified
easily. Normalization theory is based on the functional notion of functional
dependency and moreover it helps in simplifying the structure of tables.
There are four levels of normalization.
First normal form:
It is a relation in which intersection of each rows and column contains
only one value. It is used to identify the repeating group of datas
within the same table, repeating groups is a set of information that
stores similar information that repeats in the same table and these
repeating datas are separated. It is now considered to be part
of the formal definition of the relation in the basic relational model.
It states that the domain of an attribute should contain only atomic
values. In other words first normal form it disallows relation within
relation.
Second normal form:
The relation is said to be second normal from if it satisfies first
normal form and every non-key attribute is functionally and fully dependent
on the primary key. Functional dependency, every column in our database
table should be fully and functionally dependent on primary key. No
non-key attribute should exist in the relation. Any data, which is not
dependent, should be segregated. To establish second normal form we
need to decompose and set up new relation for each partial key with
its dependent attribute.
Third normal form:
Third normal form is based on the concept of transitive dependency.
Relation is said to be in third normal form if it satisfies both first
and second normal form. Transitive dependencies in a relation are functionally
dependent between one or more non-key attribute. Columns in each table
should be a group of columns that contributes to the description of
each row in our database table. In order to achieve third normal forms
decompose and set up a relation that includes the non-key attribute
that functionally determine other non-key attribute.
BCNF (Boyce Codd Normal form):
A database design is in BCNF if each member of the set of relation schemas
that constitutes the design is in BCNF. BCNF checks for non-trivial
dependency and checks for relational schema.
A functional dependency
is a property of the semantics or meaning of the attributes. A functional
dependency helps us to detect poor E-R design. If the generated relations
are not in desired normal from, the problem can be fixed in the E-R
diagram. That is, normalization can be done as a part of data modeling.
Alternately normalization can be done formally on the relations generated
fro the E-R model.
Even though normalization helps in reducing redundancy, but sometimes
when information is required from more than one table at a faster rate,
it is essential to introduce some amount of redundancy in the table.
This intentional introduction of redundancy into the table is called
denormalization. Thus functional dependency helps in maintaining a relationship
among the datas present in the table. Any columns that are not
dependent on the primary of the parent table should be segregated. Every
column should be dependent on the primary key of the parent table.
______________________________________________________________________________
______________________________________________________________________________