Minggu, 26 April 2009

Database Normalization

Database Normalization

Design Database Process

The process includes:

- Gathering user or business's requirement

- Develop E-R Model bases user or business's requirements

- Convert E-R Model go to relationship gatherings (table)

- Relationship normalization to remove anomaly

- Implementation goes to database by make table for each relationship already most normalization

Database Normalization

- Normalization process is the establishment of the database structure so that most of the ambiguity can be removed.

- Normalization stage, starting from the most mild (1NF) to most stringent (5NF)

- Usually only up to the level of 3NF or BCNF because already sufficient to generate the table-table good quality

Why done by normalization?

- Optimizing table structures

- Increasing speed

- Removing same data inclusion

- More efficient in storage media purpose

- Reducing redundancy

- Avoiding anomaly (insertion anomalies, deletion anomalies, and anomaly's update).

- Increased data integrity

A table saying good (efficient) or if the normal 3 to meet the following criteria:

  1. If there is decomposition (decomposition) table, then the decomposition will be guaranteed safe (Lossless-Join Decomposition). That is, after the table is described / in the decomposition into a new table-table, the table-table can generate a new table with the same exact.
  2. Maintain the functional dependence on the change data (Dependency preservation)
  3. No violate Boyce-Code Normal Form (BCNF)

If the three criteria (BCNF) can not be met, then at least the table does not violate the Normal Form of the third stage (3rd Normal Form / 3NF).

Functional Dependency

A functional dependency (FD) is a constraint between two sets of attributes in a relation from a database.
Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R, (written X --> Y) if and only if each X value is associated with precisely one Y value. Customarily we call X the determinant set and Y the dependent attribute. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding value of the Y attribute. For the purposes of simplicity, given that X and Y are sets of attributes in R, X --> Y denotes that X functionally determines each of the members of Y - in this case Y is known as the dependent set. Thus, a candidate key is a minimal set of attributes that functionally determine all of the attributes in a relation.

Example :



Functional Dependency:

  • NRP--> Nama
  • Mata_Kuliah, NRP --> Nilai


Non Functional Dependency:

  • Mata_Kuliah --> NRP
  • NRP --> Nilai

Functional Dependency from tables of Nilai:
Nrp --> Nama
Because to each Nrp value is same, so Name value is same too.

{Mata_Kuliah, NRP} --> Nilai
Because attribut value depend on Mata_Kuliah and NRP by together. In other meaning for the Mata_Kuliah and NRP is same, so have same Nilai too, because Mata_Kuliah and NRP is key ( having the character is unique).

Mata_Kuliah --> NRP
NRP --> nilai


First Normal Form / 1NF

First normal form (1NF or Minimal Form) is a normal form used in database normalization. A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria. These criteria are basically concerned with ensuring that the table is a faithful representation of a relation and that it is free of repeating groups.
The concept of a "repeating group" is, however, understood in different ways by different theorists. As a consequence, there is no universal agreement as to which features would disqualify a table from being in 1NF.

Not allowed for :

  • Attribute that valuably there are many ( Multi valued attribute )
  • Attribute composite or combine of both

So :

  • Price of Domain attribute must represent atomic price

For Example


Table "Mahasiswa"


Or

That Tables is not complete 1NF condition.
Decomposition becomes:

Table mahasiswa :

Table hoby :


Second Normal Form / 2NF

Second normal form (2NF) is a normal form used in database normalization. 2NF was originally defined by E.F. Codd in 1971. A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. Specifically: a 1NF table is in 2NF if and only if, given any candidate key and any attribute that is not a constituent of a candidate key, the non-key attribute depends upon the whole of the candidate key rather than just a part of it.
In slightly more formal terms: a 1NF table is in 2NF if and only if none of its non-prime attributes are functionally dependent on a part (proper subset) of a candidate key. (A non-prime attribute is one that does not belong to any candidate key.)

Functional dependency X --> Y is said full if erase an attribute A of X its mean that Y no longer functional dependent. Functional dependency X --> Y is said partial if erases an A's attribute of X matter Y still functional dependent. Relationship schemer on 2NF form if each attribute non primary key A R full dependent functionally on primary key R.


This table accomplishes 1NF, but not exclude 2NF :


That table is Not accomplishing 2NF form, because (NIM, KodeMk) is regarded as the primary key:

{NIM, KodeMk} -> NamaMhs
{NIM, KodeMk} -> Alamat
{NIM, KodeMk} -> Matakuliah
{NIM, KodeMk} -> SKS
{NIM, KodeMk} -> NilaiHuruf

Table need to decomposition become some table measures 2NF

Functional dependency as follows:

  • {NIM, KodeMk} --> NilaiHuruf (fd1)
  • NIM --> {NamaMhs, Alamat} (fd2)
  • KodeMk --> {Matakuliah, Sks} (fd3)


So

  • fd1 (NIM, KodeMk, NilaiHuruf) --> Tabel Nilai
  • fd2 (NIM, NamaMhs, Alamat) --> Tabel Mahasiswa
  • fd3 (KodeMk, Matakuliah, Sks) --> Tabel MataKuliah


Third Normal Form

The third normal form (3NF) is a normal form used in database normalization. 3NF was originally defined by E.F. Codd in 1971. Codd's definition states that a table is in 3NF if and only if both of the following conditions hold:

  1. The relation R (table) is in second normal form (2NF)
  2. Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every key of R.

An example of a 2NF table that fails to meet the requirements of 3NF is:

Because still there are attribute not primary key (namely Kota and Provinsi) own depended to attribute not other primary key (name is Kodepos) :

  • Kodepos --> {Town, Provinsi }


So that the table require to decomposition become :

  • Mahasiswa ( NIM, Namamhs, Jalan, Kodepos)
  • Kodepos ( Kodepos, Provinsi, Town)


Boyce-Codd Normal Form / BCNF

Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X --> Y, X is a superkey—that is, X is either a candidate key or a superset thereof.

Fourth and Fifth Normal Form

Relationship in fourth normal form (4NF) if relationship in BCNF and not contains multi value dependency. To remove multi value dependency from one relationship, we are divide relationship become two new relationships. Each relationship contains of two attributes have multi value relationship.

Relationship in fifth normal form (5NF) get business with property is calling join without marks sense information loss (lossless join). The fifth normal Form (5 NF) also know as PJNF (projection join normal form). This case is very rare to appearance and hard to detect practically.

Minggu, 19 April 2009

DATABASE AND ER-DIAGRAM

Definition Of Database
database is group of data that stored into magnetic disk, optical disk, or another secondary data storage.
fused collection of datas, that connected mutually, from an enterprise (a company, firm, and government).

  • manufacture company -> production planning data, actual production data, material ordering data, etc
  • hospital -> data of patient, doctor, nurse, etc


DBMS (Database Management System)

DBMS is a collection of databases or a combination of software-based database applications. DBMS is a software designed to assist in the maintenance and utility data collection in large numbers. Application programs are used to access and maintain databases. The main purpose DBMS is to provide an environment that is efficient and easy to use, withdrawal and storage of data and information.


BIT, BYTE, Field

- Bit is the smallest pieces of data that contains the value 0 or 1

- Byte is a set of bit-bit similar

- Field is a set of byte-byte similar, in the database used the term attribute








ATTRIBUTE OR FIELD

Attribute is the nature or characteristics of an entity that provides provide detail on these entities. A relationship can also have attributes. Example attributes:

  1. STUDENTS: NIM, NAME, ADDRESS
  2. CAR: NOMOR_PLAT, COLOR, TYPE, CC

types of attribute

single vs multivalue

  • single -> only can containing mostly 1 value
  • multivalue -> can containing more than 1 value with same kind value

atomic vs composition

  • atomic -> cannot divided into smaller attribute
  • composition -> alliance of smaller attribute

derived attribute -> attribute that can be yield from other attribute value, for example : age can be yield from attribute of birth datenull

value attribute -> attribute with no value for a record

mandatory value attribute -> attribute that must have a value


RECORD / TUPLE

Record is a data line in a relationship. Record consists of a set of attributes where the attribute is an attribute-related entity or to inform the full relationship.


Entity / FILE
File is a collection of similar records and have the same elements, the same attributes but different data value.


File Type
In processing applications, files can category as follows:
- Master File
- Transaction Files
- File Reports
- File History
- File Protection
- File Work
Domain is the set of values that are allowed to reside in one or more attributes. Each attribute in a database relational is defined as a domain
Key elements of record which is used to find these records at the time of access, or can also be used to identify each entity / record / line.


DOMAIN

Domain is the set of values that are allowed to reside in one or more attributes. Each attribute in a database relasional is defined as a domain.


KEY DATA ELEMENT

Key elements of record which is used to find these records at the time of access, or can also be used to identify each entity / record / line.

Kind of key:
Super key is one or more attribute from a table which can be used for identifying entity / record from table uniquely
Candidate key is super key with minimum attribute. Candidate key may not contain attribute of other table so that the candidate is super key, but not the contrary
Primary key
One of the attribute of candidate key can be selected / determined into primary key with 3 criteria:
- The key is more natural for use as reference
- The key is simpler
- The key is unique
Alternate key
Alternate key is attribute of candidate key which is not chosen become primary key
Foreign key
Foreign key is any attribute subjecting to primary key at other tables. Foreign key will be happened in a relation which have one to many cardinality or many to many cardinality. Foreign key usually put at the table which direct to many cardinality.
External key
External key is lexical attribute or collection of lexical attribute which values always identify one object instance


ERD (Entity Relationship Diagram)
ERD is a model of a network that uses word order is stored in the abstract system.
Differences between the DFD and ERD
DFD is a model of network functions that will be implemented by the system
ERD is a model that emphasizes the network data on the structure and relationship data
Elements of the ERD

Entity
In ER Diagram, entity was denoted with rectangle shape. Entity is something that exist in real system as well as abstract where lies the storage of the data or the data being stored
Relationship
This relationship ER Diagram was denoted with a rhombus shape. Relationship is a natural relation that exists between entity. Generally named with a basic verb in order to easy to do the reading of the relation
Relationship Degree
Denoting a number of entity which participated in one relationship. A degree which often used in ERD.
Attribute
Denoting a characteristic from every entity or relationship
Cardinality
Shows a maximum number of tupple that can be related with entity, in other entity

Relationship degree

Unary relationship is Relationship model happen between the entity which coming from the same entity set.

Binary relationship is relationship model happen between 2 entity.

Ternary relationship between instance of 3 entity unilaterally.


Cardinality

There are 3 cardinality relation, that is :

  • One-to-one : relationship degree one-to-one expressed by one event in first entity, only have one relation with one event in second entity and on the contrary.
  • One-to-many or many-to-one : relationship degree one-to-many is equal many-to-one dependent from where that relationship seen. For one event in first entity can have many relationship with event in second entity, the other way, one event in second entity only have relationship with one event in first entity.
  • Many-to-many : happen if each event in an entity having a lot of relationship with event in other entity


NOTATION (ER DIAGRAM)
Simbolic notation in ER Diagram are :
• Rectangle denoting an entity community
• Circle round shape denoting attribute
• Rhombus denoting a relation community
• Line as a connector between relation community with entity community and Entity Community with its attribute



Sabtu, 04 April 2009

DATA FLOW CHART

  • called also data flow diagram (DFD)
  • describe system distribution into module smaller
  • will make user less will realize computer area to will understand system that be done

context diagram

  • consist of one process and describe scope from a system
  • be level highest from DFD that describe entire input to system and outpur from system
  • system is limitted by boundary (described by line broken offs)
  • there may not be any storage (storage)

diagram zero

  • describe process from DFD
  • give opinion according to comprehensive hit system that handled, show function or principal process, data current and eskternal entity
  • in level this maked data existence storage
  • for process is not detailed again in level furthermore so added simbol ‘*’ or ‘p’ by the end of process number
  • balance input and output (balancing) between diagram 0 with context diagram must be taked care

detail diagram

  • be diagram that elaborate what process is there in diagram zero or level at on it
  • numberization level in DFD:

Name Level

Name Diagram

process number

0

Context

1

Diagram 0

1.0, 2.0, 3.0, ...

2

Diagram 1.0

1.1, 1.2, 1.3, ...

3

Diagram 1.1

1.1.1, 1.1.2, ...

  • in one level best not found more than 7 process and maximal 9, when more so dekomposisi

process specification

  • every process in DFD must has process spesification
  • in top level method that used to describe process can by using descriptive sentence
  • in level detailer that is in process most under (functional primitive) want spesification structureder
  • process spesification will be guide for programmer in will make program (coding)
  • method that used in process spesification: process explanation in the form of story, decision table, decision tree

unitary outside

  • something that beyond system, but he gives data into system or give data from system
  • symbolized with notation box
  • external entity doesn't belong part from system
  • nomenclature:

· terminal name shaped noun

· terminal may not has name same except the object really same

data current

  • be place currences information
  • described with straight line that connect component from system
  • data current is showed with arrow direction and line is given name on data current that flow
  • data current flows between process, data storage and show data current from data shaped input for system

  • name gift guide:

· data current name that consist of several word currents relateds with line continues

· there may not be any data current the name same and name gift must reflect its contents

· data current that consist of several elements can be declared with element group

· koid word use ‘data’ and ‘informasi’ to give name in data current

· sedapat may be data current name is written complete

rule other:

· data current name that come into a process may not equal to secretory data current name from process

· data flow that step into or out from data storage unnecessary given whose name:

· simple data current and easy understood

· data current describes entire item datas

· there may not be any data current from terminal to data storage or on the contrary because terminal not part from system, terminal connection with data storage must pass process

process

· process is what done by system

· process can cultivate data or data current enters to be data current out

· process functioneds mentransformasi one or several input datas is one or several product datas as according to desirable spesification

· every has one process or several inputs with produce one or several products

· process often also called bubble

process name gift guide:

· process name consists of verb and noun that reflect process function

· don't use process word as part from name a bubble

· there may not be any several process that has names same

· process must be given number. sequence number sedapat may be follow current or sequence process, but such sequence meaningless number absolutely is sequence process chronologically

data savings

· data storage be existing data repository in system

· symbolized with a pair parallel line or two lines wrongly one sides side opened

· process can take data from or give data to database

· name gift guide:

· name must reflect data storage

· when does the name more than one word so be must be given sign continue

data dictionary

· functioned to help system executant to interpret application in detail and mengorganisasi all data element that used system according to exactly so that user and has base explanation same about input, product, storage and process

· in analysis stage, data dictionary is used as a means of communication between system analyst with user

· in system planning stage, data dictionary is used to design input, report and database

· data current in dad has global, explanation detailer visible in data dictionary

data dictionary holds the followings:

· data current name: must be registerred so that reader that need furthermore explanation about a data current can look for it easily

· alias: alias or name other from data can be written if (there are) any

· data form: used to group data dictionary into its use sewaktu system planning

· data current: show from which data flows and where does data aim

· explanation: give explanation about meaning from data current


balancing in DFD

· data current that come into and out from one process must equal to data current that comes into and out from process details in level/stage at under it

· data current name that come into and out from one process must equal to data current name that enters to into and out from process details

· total and name entitas outside from a process must equal to total and name entitas outside from process details

· matters that must be payed in DFD that has more than one level:

· must found balance input and output between one level and level next

· balance between level 0 and level 1 seen in input/output from data current to or from terminal in level 0, while balance between level 1 and level 2 seen in input/output from data current to/from process concerned

· data current name, data storage and terminal in every level must same if the object same

prohibition in DFD

· data current may not from entitas outside direct aim entitas outside another without pass a process

· data current may not from direct data savings aim to entitas outside without pass a process

· data current may not from direct data savings aim another data savings without pass a process

· data current from one direct process aims another process without pass a data savings best may be avoided