Chapter-2

 

CHAPTER-2

Data Models

Data Modeling and Data Models:

Data model:

A data model is a collection of concepts that can be used to describe the structure of a database

Data modeling in the first step in designing a database refers to the process of creating a specific data model for a problem.

        A model is an abstraction of a real world object. A data model represents data structures and their characteristics, relations, constraints and transactions.

        Data model is an iterative process we start with a simple understanding of the problem increases, and finally design a database in a specific database model.

Importance of Data Models:

1.    Data Model can facilitate interaction among the designer, the application programmer and the end user.

2.    Applications are used to transform data into information. But data are viewed in different ways by different people.

3.    For e.g. the manager and clerk both are working in the same company, the manager have wide view of company data than the clerk.

4.    A company president has universal view of data.

5.    Different Managers views data differently in a company. The inventory manager is more concerned about inventory levels, while purchasing manager concerned about items and supplies.

6.    Application programmers have another view of data i.e., concerned with data locations and formatting.

7.    A house is a collection of roots, if someone is going to build a house, they have the overall view i.e., provided by blue print. A sound data environment requires an overall database blue print based on appropriate data model.

8.    When a good database blue print is available, an application programmer view of data is different from the managers and end users. When a good database blue print is not available problems are likely to ensure.

Data Model basic building blocks:   

         The basic building blocks of data models are entities, attributes, relationships and constraints. An entity represents a real world object person (or) place.

 For e.g., a customer entity have different of customers.

 An attribute is a characteristic of an entity.

 For e.g. customer entity have attributes customer_no, customer_name, customer_address etc. A relationship describes an association between entities. Data models use three types of associations. One-to-many, many-to-many and one-to-one.

One-to-many (1:M, 1…..*): A painter paints many different paintings. Therefore, the database designer label the relationship PAINTER  PAINTS  PAINTINGS as one-to-many.

Many-to-many (M:N, *…….*):  An employee may learn many job skills and each job skill may be learned by many employees.  Therefore, the database designer label the relationship Employee learns skills as many-to-many (M:N).

One-to-one (1:1, 1…….1):  Each store manager manages only a single store. Therefore, the data designer label the relationship employee manages stores as one-to-one (1:1).

  By using the business rules we can properly identify entities, attributes, relationships and constraints.

Business rules:

  A business rule is a description of a policy, procedure (Or) principle within a business organization. Examples of business rule.

1.    A customer may generate many invoices.

2.    A training session cannot be scheduled for fever than 10 employees or for more than 30 employees.            These business rules establish entities, relationships and constraints.

       The first business rule establishes two entities (customer, invoices) and a one-to-many relationship between these two entities.

            The 2nd business rule establishes a constraint. (No fewer than 10 people (or) more than 30 people) and two entities (training, people) and a relationship between employee and training.

Discovering business rules:

      The main source of business rules are company manager, policy manager, department manager and written documents such as company’s procedures, standards (or) operation manuals. A faster direct source of business rules is direct interviews with the concerned persons.

Translating business rules into Data Model:

 General rule: A noun in a business rule be translate into an entity in that model, and a verb associating nouns will translated into a relationship among the entities.

For e.g. the business rule      customer may generate many voices “

       Containing two nouns (customer and invoices) and a verb (generate) that associates the noun.

       To proper identify the type of relationship, the relationships are bi-directional.

       For e.g. the business rule    “ A customer may generate many invoices”, the relationship is one-to-many (1:M, 1…….*). Customer is the 1 side and invoice is the many side.

Evolution of Data Models: 

Generation

 

Time

Model

Example

First

 

1960-70

File system

VMS

Second

 

1970

Hierarchical and network data models.

IMS, focus IDMS

Third

 

Mid 1970’s-present

Relational data model

M.S.Acess, Oracle

Fourth

 

Mid 1980’s-present

Object Oriented Model

 

Extended Relational Model

Versant

 

Objectivity

Fifth

 

Present-Future 

XML

Oracle log

 

Hierarchical Data Model: 

The hierarchical data model is the oldest type of data model, developed by IBM in 1968. This data model organizes the data in a tree-like structure, in which each child node (also known as dependents) can have only one parent node. The database based on the hierarchical data model comprises a set of records connected to one another through links. The link is an association between two or more records. The top of the tree structure consists of a single node that does not have any parent and is called the root node.

1.    The hierarchical data model consists of a set of nested relationships one-to-many and one-to-one association.

2.    In hierarchical data model the relations are presented in the form of tree-structure in which the root segment is kept at the top and further branches emanate downwards from the root segment.

3.    In this model the type of association can be one-to-one and one-to-many. This means that many-to-one association is not permitted. This is equivalent to say that multiple percentages for a child segment is not permitted.




    The above conceptual data model can be mapped into any one ways as shown below.

       An alternative1 student file is kept at the root segment of the tree and the faculty file is kept at the bottom of the tree. By mapping the conceptual data model into the hierarchical data model the following facts are observed.

1.    The association from student to enrollment is one-to-many. This mapped without any modifications.

2.    The association from enrollment to subject is many-to-one. This is not permitted in hierarchical data model. Hence it is modified into one-to-one association.

3.    The association from subject to faculty in many-to-one. This is not permitted in hierarchical data model. Hence it is modified into one-to-one association.

              In alternative1 while mapping the conceptual data model into hierarchical data model, the many-to-one association presents at two levels are modified into one-to-one association. These modifications will increase the data redundancy.

              In alternative2 the faculty file is kept at the root of the tree and student file is kept at the bottom of the tree. While mapping the conceptual data model the following facts are observed.

1.    The association from faculty to subject file is one-to-many. So it is mapped without any modifications.

2.    The association from to subject enrollment is many-to-one. This is not permitted in hierarchical data model. Hence it is modified into one-to-one association.

3.    The association from enrollment to student is many-to one. This is not permitted in hierarchical data model. Hence it is modified into one-to-one association.

         Finally which alternative has less redundancy should be selected for implementation.

         In alternative2, the association change between enrollment and student. That means we are changing one type. When we compare alternative2 with alternative1, alternative2 has less redundancy and it is implemented.

Advantages:

1. It promotes data sharing.

2. Parent/Child relationship promotes conceptual simplicity.

3. Database security is provided and enforced by DBMS.

4. Parent/Child relationship promotes data integrity.

5. It is efficient with 1:M relationships.

Disadvantages:

1.    Complex implementation requires knowledge of physical data storage characteristics.

2.    Changes in structure require changes in all application programs.

3.    There are implementation limitations (no multi parent or M:N relationships).

4.    There is no data definition or data manipulation language in the DBMS.

5.    There is a lack of standards.

 

Network Data Model:

      A Network data model consists of a set of pair wise association between the entities.

      The Network data model was created to improve database performance, database standards and represent complex relationships effectively than the hierarchical data model.

       To establish database standards, the conference of database system languages (CODASYL) created the database task group (DBTG). The DBTG define standard specifications for database creation and data manipulations.

1.    Schema: The schema provides overall view of the database to the administrator.

2.    Sub Schema:  The sub schema which defines the portion of the database seen by the application programs.

3.    Database Management Language:  That defines the environment in which data can be changed. The DBTC specify 3 DML components.

a.    A schema data Definition Language (DDL), which enables the data base administrator to create the database.

b.    A subschema DDL, which allows the application programs to define database component that will be used by the application.

c.    A Data Manipulation Language, to manipulate the data in the database.

              In Network data model, the Network database as a collection of records in one-to-many record to have more than one parent.

              In Network database, a relationship is called a set. Each set contains two entities one entity is owner and other entity is member.


 

 

 

Sets

Set Name

 

Owner

Member

Ordered

Customer

Sales

Buyers

Book

Sales

Contribute

Author

Book

Catalog

Publisher

Book

 

Advantages:

1. Conceptual simplicity is at least equal to that of the hierarchical model.

2. It handles more relationship types, such as M:N and multi-parent.

3. Data access is more flexible than in hierarchical and file system models.

4. Data Owner/Member relationship promotes data integrity.

5. There is conformance to standards.

6. It includes data definition language (DDL) and data manipulation language (DML) in DBMS

Disadvantages:

1. System complexity limits efficiencystill a navigational system.

2. Navigational system yields complex implementation, application development, and management.

3. Structural changes require changes in all application programs.

Relational data model: 

          The relational model was introduced in 1970 by E.M.Codd. The foundation of relation is a mathematical concept known as relation. The Relation is composed of intersecting rows and columns. Each row in a relation represents a tuple. Each column represents an attribute.

           The relational data model is implemented through a Relational Database Management System (RDBMS).

            Tables are related through the sharing of common attribute. For e.g. the table agent and customer as shown below.

Agent

Agent_Code

Agent_Name

Agent_Address

Agent_PhoneNo

Agent_Area code

 

 

 

 

 

 

 

 

 

 

 

Customer

Cust_No

Cust_name

Cust_Address

Phone_No

Agent_Code

 

 

 

 

 

 

 

 

 

 

By matching the Agent_Code in the customer table with Agent_Code in the Agent table we can find Agent details of that customer.

       The relationship types one-to-one, one-to-many and many-to-many have in a relational data model.

      A relationship diagram is a representation of entities, the attributes within the entities and the relationship between the entities.

                            Agent                                                         Customer


 Agent_Code

 Agent_Name

Agent_Address

Agent_PhoneNo

Agent_Area code

Cust_No

Cust_name

Cust_Address

Phone_No

Agent_Code

 

 

 

 

 

 


In the above diagram, the relationship is one-to-many. The symbol ∞ indicates many.

           The customer represents “many” sides, because an AGENT can have many CUSTOMERS.

           The AGENT represents the “1” side because each CUSTOMER has only one AGENT.

           The languages which are supported to relational data model is powerful and flexible. Because of that the relational data model is popular. FoxPro, database, M.S.Acess, SQL are relational database software’s. This software’s allows the user to specify what must be done without specifying how it must be done.

          SQL based database applications involves 3 parts. 1. End user interface.  2. Set of tables stored in the database. 3. SQL Engine.

 

 

 

Advantages:

1. Structural independence is promoted by the use of independent tables. Changes in a tables structure do not affect data access or application programs.

2. Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management, and use.

3. Ad hoc query capability is based on SQL.

4. Powerful RDBMS isolates the end user from physical-level details and improves implementation and management simplicity.

Disadvantages:

1. The RDBMS requires substantial hardware and system software overhead.

2. Conceptual simplicity gives relatively untrained people to use a good system poorly, and if unchecked, it may produce the same data anomalies found in file systems.

3. It may promote islands of informationproblems as individuals and departments can easily develop their own applications.

Entity relation model:

          Peter Chen first introduced the E.R.data model in 1976; it was the graphical representation of entities and their relationship in a database.

           E.R. models are normally represented in an entity relationship diagram.

 The E.R.Model is based on the following components.

a.    Entity: entities are the real time objects. Entities represented by a rectangle.

e.g. painter, employee, skills, noun.

b.    Attribute: Attributes are the characteristics of entities.

e.g.   Empno, Empname, Empaddress etc.

c.    Relationships: A relationship describes association among the entities. There are three types of relationships, one-to-many, many-to-many and one-to-one.

There are two types of ER notations.

1.    Chen notation

2.    Crow’s foot notation.

              For different types of relationships.

 

CHEN NOTATION

 

          

 

                   

Crown’s Foot Notation

 

 

        In Chen notation, entities are represented rectangle and entity names are written in the capital letters at the centre of the rectangle. Relationships are represented by a diamond. The diamonds are connected to entities through a relationship name is written inside the diamond.

        In the crows foot notation, the crow foot is derived from the three pronged symbol used to represent many relationships. In this notation, the one represented by a short line segments, and many is represented by the crow’s foot. The relationship name is written above the relationship line. The relationships are also show in vertical.

 

 Advantages:

1. Visual modeling yields exceptional conceptual simplicity.

2. Visual representation makes it an effective communication tool.

3. It is integrated with dominant relational model.

Disadvantages:

1. There is limited constraint representation.

2. There is limited relationship representation.

3. There is no data manipulation language.

4. Loss of information content occurs when attributes are removed from entities to avoid crowded displays. (This limitation has been addressed in subsequent graphical versions)

 

Object Oriented Model:

         In the Object Oriented Data Model (OODM) both data and their relationships are contained in a single structure known as an Object.

         Object Oriented Data Model has allowed an object, the object contains operations that can be performed on it, such as changing data values, finding a specific data value, and printing data values.

The OODM is based on the following components.

a.    An object is an abstraction of a real world entity.

b.    Attributes describes the properties of an object.

E.g.   Person object contains the attribute name, social security number, date of birth etc.

c.    A collection of similar objects contains attributes and methods. By using those methods change data values, find data values and print data values in the objects.

d.    Classes are organized in a class hierarchic key. The class hierarchic key is similar to upside down tree. In which each class has only one parent.

e.    One of the properties of object oriented data model is inheritance. By using the inheritance we can inherit attributes and methods from super class to sub classes.

E.g. customer and employee sub classes of the person super class. Customer and employee will inherit all attributes and methods from person.

f.     Object oriented data models are drawn using unified modeling language (UML) class diagram. The UML class diagrams are used to represent data and other their relationships.

For E.g. Let us use invoice program. In this case the invoices are generated by customer, each invoice contains one (or) more lines, and each line represents an item purchased by customer.

    The following diagram shows an object representation UML class diagram, ER model

for invoice.  



The object representation of invoice includes all the related objects within the same object.  The 1 next to the customer object indicates that each invoice related to one customer. The M next to the line object indicates that each invoice contains no. of  lines.

       The UML class diagram uses 3 separate classes (customer, invoice and line) and two relationships to represent this problem.

       The E.R Model also uses the 3 separate entities and two relationships to represent the invoice problem.

Advantages:

1. Semantic content is added.

2. Visual representation includes semantic content.

3. Inheritance promotes data integrity.

Disadvantages:

1. Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard.

2. It is a complex navigational system.

3. There is a steep learning curve.

4. High system overhead slows transactions

 

 

 

 

 

 

 

 

Comments

Popular posts from this blog

DBMS Notes

I Bcom(CA) _ 2021 Notes