Chapter-2
CHAPTER-2
Data Models
Data
Modeling and Data Models:
Data
model:
A
data model is a collection of concepts that can be used to describe the
structure of a database
Data modeling in the
first step in designing a database refers to the process of creating a specific
data model for a problem.
A model is an abstraction of a real
world object. A data model represents data structures and their
characteristics, relations, constraints and transactions.
Data model is an iterative process we
start with a simple understanding of the problem increases, and finally design
a database in a specific database model.
Importance
of Data Models:
1. Data
Model can facilitate interaction among the designer, the application programmer
and the end user.
2. Applications
are used to transform data into information. But data are viewed in different
ways by different people.
3. For
e.g. the manager and clerk both are working in the same company, the manager
have wide view of company data than the clerk.
4. A
company president has universal view of data.
5. Different
Managers views data differently in a company. The inventory manager is more
concerned about inventory levels, while purchasing manager concerned about
items and supplies.
6. Application
programmers have another view of data i.e., concerned with data locations and
formatting.
7. A
house is a collection of roots, if someone is going to build a house, they have
the overall view i.e., provided by blue print. A sound data environment
requires an overall database blue print based on appropriate data model.
8. When
a good database blue print is available, an application programmer view of data
is different from the managers and end users. When a good database blue print
is not available problems are likely to ensure.
Data
Model basic building blocks:
The basic building blocks of data
models are entities, attributes, relationships and constraints. An entity
represents a real world object person (or) place.
For e.g., a customer entity have different of
customers.
An attribute is a characteristic of an entity.
For e.g. customer entity have attributes
customer_no, customer_name, customer_address etc. A relationship describes an
association between entities. Data models use three types of associations.
One-to-many, many-to-many and one-to-one.
One-to-many
(1:M, 1…..*): A painter paints many different paintings.
Therefore, the database designer label the relationship PAINTER PAINTS
PAINTINGS as one-to-many.
Many-to-many
(M:N, *…….*): An
employee may learn many job skills and each job skill may be learned by many
employees. Therefore, the database
designer label the relationship Employee learns skills as many-to-many (M:N).
One-to-one
(1:1, 1…….1): Each
store manager manages only a single store. Therefore, the data designer label
the relationship employee manages stores as one-to-one (1:1).
By using the business rules we can properly
identify entities, attributes, relationships and constraints.
Business rules:
A business rule is a description of a policy,
procedure (Or) principle within a business organization. Examples of business
rule.
1. A
customer may generate many invoices.
2. A
training session cannot be scheduled for fever than 10 employees or for more
than 30 employees. These
business rules establish entities, relationships and constraints.
The first business rule establishes two
entities (customer, invoices) and a one-to-many relationship between these two
entities.
The 2nd business rule
establishes a constraint. (No fewer than 10 people (or) more than 30 people)
and two entities (training, people) and a relationship between employee and
training.
Discovering business rules:
The main source of business rules are
company manager, policy manager, department manager and written documents such
as company’s procedures, standards (or) operation manuals. A faster direct
source of business rules is direct interviews with the concerned persons.
Translating
business rules into Data Model:
General rule: A
noun in a business rule be translate into an entity in that model, and a verb
associating nouns will translated into a relationship among the entities.
For e.g. the
business rule “ customer may generate many voices “
Containing two nouns (customer and
invoices) and a verb (generate) that associates the noun.
To proper identify the type of
relationship, the relationships are bi-directional.
For e.g. the business rule “ A customer may generate many invoices”,
the relationship is one-to-many (1:M, 1…….*). Customer is the 1 side and
invoice is the many side.
Evolution
of Data Models:
|
Generation
|
Time
|
Model
|
Example
|
|
First
|
1960-70 |
File
system |
VMS |
|
Second |
1970 |
Hierarchical
and network data models. |
IMS,
focus IDMS |
|
Third
|
Mid
1970’s-present |
Relational
data model |
M.S.Acess,
Oracle |
|
Fourth
|
Mid
1980’s-present |
Object
Oriented Model Extended
Relational Model |
Versant Objectivity
|
|
Fifth
|
Present-Future |
XML |
Oracle
log |
Hierarchical Data Model:
The hierarchical
data model is the oldest type of data model, developed by IBM in 1968.
This data model organizes the data in a tree-like structure, in which
each child node (also known as dependents) can
have only one parent node. The database based on the hierarchical
data model comprises a set of records connected to one another through links.
The link is an association between two or more records. The top of the tree
structure consists of a single node that does not have any parent and is called
the root node.
1. The
hierarchical data model consists of a set of nested relationships one-to-many
and one-to-one association.
2. In
hierarchical data model the relations are presented in the form of tree-structure
in which the root segment is kept at the top and further branches emanate
downwards from the root segment.
3. In
this model the type of association can be one-to-one and one-to-many. This
means that many-to-one association is not permitted. This is equivalent to say
that multiple percentages for a child segment is not permitted.
The above conceptual data model can be
mapped into any one ways as shown below.
An alternative1 student file is kept at
the root segment of the tree and the faculty file is kept at the bottom of the
tree. By mapping the conceptual data model into the hierarchical data model the
following facts are observed.
1. The
association from student to enrollment is one-to-many. This mapped without any
modifications.
2. The
association from enrollment to subject is many-to-one. This is not permitted in
hierarchical data model. Hence it is modified into one-to-one association.
3. The
association from subject to faculty in many-to-one. This is not permitted in
hierarchical data model. Hence it is modified into one-to-one association.
In alternative1 while mapping the
conceptual data model into hierarchical data model, the many-to-one association
presents at two levels are modified into one-to-one association. These
modifications will increase the data redundancy.
In alternative2 the faculty file
is kept at the root of the tree and student file is kept at the bottom of the
tree. While mapping the conceptual data model the following facts are observed.
1. The
association from faculty to subject file is one-to-many. So it is mapped
without any modifications.
2. The
association from to subject enrollment is many-to-one. This is not permitted in
hierarchical data model. Hence it is modified into one-to-one association.
3. The
association from enrollment to student is many-to one. This is not permitted in
hierarchical data model. Hence it is modified into one-to-one association.
Finally which alternative has less
redundancy should be selected for implementation.
In alternative2, the association
change between enrollment and student. That means we are changing one type.
When we compare alternative2 with alternative1, alternative2 has less
redundancy and it is implemented.
Advantages:
1. It promotes data sharing.
2. Parent/Child relationship promotes conceptual
simplicity.
3. Database security is provided and enforced by DBMS.
4. Parent/Child relationship promotes data integrity.
5. It is efficient with 1:M relationships.
Disadvantages:
1.
Complex implementation requires knowledge of
physical data storage characteristics.
2.
Changes in structure require changes in all
application programs.
3.
There are implementation limitations (no multi
parent or M:N relationships).
4.
There is no data definition or data
manipulation language in the DBMS.
5.
There is a lack of standards.
Network Data Model:
A Network data model consists of a set of
pair wise association between the entities.
The Network data model was created to
improve database performance, database standards and represent complex
relationships effectively than the hierarchical data model.
To establish database standards, the
conference of database system languages (CODASYL) created the database task
group (DBTG). The DBTG define standard specifications for database creation and
data manipulations.
1. Schema:
The schema provides overall view of the database to the administrator.
2. Sub Schema: The sub schema which defines the portion of
the database seen by the application programs.
3. Database Management Language: That defines the environment in which data
can be changed. The DBTC specify 3 DML components.
a. A
schema data Definition Language (DDL), which enables the data base
administrator to create the database.
b. A
subschema DDL, which allows the application programs to define database
component that will be used by the application.
c. A
Data Manipulation Language, to manipulate the data in the database.
In Network data model, the
Network database as a collection of records in one-to-many record to have more
than one parent.
In Network database, a
relationship is called a set. Each set contains two entities one entity is
owner and other entity is member.
Sets
|
Set
Name |
Owner |
Member
|
|
Ordered
|
Customer
|
Sales
|
|
Buyers
|
Book
|
Sales
|
|
Contribute
|
Author
|
Book
|
|
Catalog
|
Publisher
|
Book
|
Advantages:
1. Conceptual
simplicity is at least equal to that of the hierarchical model.
2. It handles more relationship
types, such as M:N and multi-parent.
3. Data access is
more flexible than in hierarchical and file system models.
4. Data Owner/Member
relationship promotes data integrity.
5. There is
conformance to standards.
6. It includes data
definition language (DDL) and data manipulation language (DML) in DBMS
Disadvantages:
1. System complexity
limits efficiency—still a
navigational system.
2. Navigational
system yields complex implementation, application development, and management.
3. Structural changes
require changes in all application programs.
Relational data model:
The relational model was introduced
in 1970 by E.M.Codd. The foundation of relation is a mathematical concept known
as relation. The Relation is composed of intersecting rows and columns. Each
row in a relation represents a tuple. Each column represents an attribute.
The relational data model is
implemented through a Relational Database Management System (RDBMS).
Tables are related through the sharing
of common attribute. For e.g. the table agent and customer as shown below.
Agent
|
Agent_Code
|
Agent_Name |
Agent_Address |
Agent_PhoneNo |
Agent_Area
code |
|
|
|
|
|
|
Customer
|
Cust_No |
Cust_name |
Cust_Address |
Phone_No |
Agent_Code |
|
|
|
|
|
|
By
matching the Agent_Code in the customer table with Agent_Code in the Agent
table we can find Agent details of that customer.
The relationship types one-to-one,
one-to-many and many-to-many have in a relational data model.
A relationship diagram is a representation
of entities, the attributes within the entities and the relationship between
the entities.
Agent
Customer
|
|
|
Agent_Name |
|
Agent_Address |
|
Agent_PhoneNo |
|
Agent_Area code |
|
Cust_No |
|
Cust_name |
|
Cust_Address |
|
Phone_No |
|
Agent_Code |
In the above
diagram, the relationship is one-to-many. The symbol ∞ indicates many.
The customer represents “many”
sides, because an AGENT can have many CUSTOMERS.
The AGENT represents the “1” side
because each CUSTOMER has only one AGENT.
The languages which are supported to
relational data model is powerful and flexible. Because of that the relational
data model is popular. FoxPro, database, M.S.Acess, SQL are relational database
software’s. This software’s allows the user to specify what must be done
without specifying how it must be done.
SQL based database applications
involves 3 parts. 1. End user interface.
2. Set of tables stored in the database. 3. SQL Engine.
Advantages:
1. Structural
independence is promoted by the use of independent tables. Changes in a table’s structure do not affect data
access or application programs.
2. Tabular view
substantially improves conceptual simplicity, thereby promoting easier database
design, implementation, management, and use.
3. Ad hoc query
capability is based on SQL.
4. Powerful RDBMS
isolates the end user from physical-level details and improves implementation
and management simplicity.
Disadvantages:
1. The RDBMS
requires substantial hardware and system software overhead.
2. Conceptual
simplicity gives relatively untrained people to use a good system poorly, and
if unchecked, it may produce the same data anomalies found in file systems.
3. It may promote “islands of information” problems as individuals and
departments can easily develop their own applications.
Entity relation model:
Peter Chen first introduced the
E.R.data model in 1976; it was the graphical representation of entities and
their relationship in a database.
E.R. models are normally represented
in an entity relationship diagram.
The E.R.Model is based on the following
components.
a. Entity:
entities are the real time objects. Entities represented by a rectangle.
e.g. painter, employee, skills, noun.
b. Attribute:
Attributes are the characteristics of entities.
e.g. Empno,
Empname, Empaddress etc.
c. Relationships: A
relationship describes association among the entities. There are three types of
relationships, one-to-many, many-to-many and one-to-one.
There are two types
of ER notations.
1. Chen
notation
2. Crow’s
foot notation.
For
different types of relationships.
CHEN
NOTATION
Crown’s
Foot Notation
In Chen
notation, entities are represented rectangle and entity names are written in
the capital letters at the centre of the rectangle. Relationships are
represented by a diamond. The diamonds are connected to entities through a
relationship name is written inside the diamond.
In the
crows foot notation, the crow foot is derived from the three pronged symbol
used to represent many relationships. In this notation, the one represented by
a short line segments, and many is represented by the crow’s foot. The
relationship name is written above the relationship line. The relationships are
also show in vertical.
Advantages:
1. Visual modeling yields exceptional conceptual
simplicity.
2. Visual representation makes it an effective
communication tool.
3. It is integrated with dominant relational model.
Disadvantages:
1. There is limited constraint representation.
2. There is limited relationship representation.
3. There is no data manipulation language.
4. Loss of information content occurs when attributes
are removed from entities to avoid crowded displays. (This limitation has been
addressed in subsequent graphical versions)
Object Oriented Model:
In the
Object Oriented Data Model (OODM) both data and their relationships are
contained in a single structure known as an Object.
Object
Oriented Data Model has allowed an object, the object contains operations that
can be performed on it, such as changing data values, finding a specific data
value, and printing data values.
The OODM is based on the following components.
a. An
object is an abstraction of a real world entity.
b. Attributes
describes the properties of an object.
E.g. Person
object contains the attribute name, social security number, date of birth etc.
c. A
collection of similar objects contains attributes and methods. By using those
methods change data values, find data values and print data values in the
objects.
d. Classes
are organized in a class hierarchic key. The class hierarchic key is similar to
upside down tree. In which each class has only one parent.
e. One
of the properties of object oriented data model is inheritance. By using the
inheritance we can inherit attributes and methods from super class to sub
classes.
E.g. customer and employee sub classes of the person
super class. Customer and employee will inherit all attributes and methods from
person.
f. Object
oriented data models are drawn using unified modeling language (UML) class
diagram. The UML class diagrams are used to represent data and other their
relationships.
For E.g. Let us use
invoice program. In this case the invoices are generated by customer, each
invoice contains one (or) more lines, and each line represents an item
purchased by customer.
The following diagram shows an object
representation UML class diagram, ER model
for invoice.
The
object representation of invoice includes all the related objects within the
same object. The 1 next to the customer
object indicates that each invoice related to one customer. The M next to the
line object indicates that each invoice contains no. of lines.
The UML class diagram uses 3 separate
classes (customer, invoice and line) and two relationships to represent this
problem.
The E.R Model also uses the 3 separate
entities and two relationships to represent the invoice problem.
Advantages:
1. Semantic content
is added.
2. Visual
representation includes semantic content.
3. Inheritance
promotes data integrity.
Disadvantages:
1. Slow development
of standards caused vendors to supply their own enhancements, thus eliminating
a widely accepted standard.
2. It is a complex
navigational system.
3. There is a steep
learning curve.
4. High system
overhead slows transactions







Comments
Post a Comment