Sunday, October 4, 2015

File Organization - DBMS



A file is a collection (or log) of records. Once records are stored in a file, they are accessed using either a primary or a secondary key. A file is organized logically as a sequence of records, and these records are mapped onto disk blocks.


There are two types of access:
1. Sequential access - records are accessed in the order in which they are stored. Sequential access is the main access mode only in batch systems, where files are used and updated at regular intervals.
2. Direct access - a record can be accessed without reading the records between it and the beginning of the file. On-line processing requires direct access. The primary key serves to identify the needed record.
There are three methods of file organization:
1. Sequential organization
2. Indexed-sequential organization
3. Direct organization
Sequential Organization
In sequential organization, records are physically stored in a specified order according to a key field in each record.
Advantages of sequential access:
1. It is fast and efficient when dealing with large volumes of data that need to be processed periodically (batch system).
Disadvantages of sequential access:
1. Requires that all new transactions be sorted into the proper sequence for sequential access processing.
2. Adding or deleting records requires rearranging the file, and locating a record generally means reading through the records stored ahead of it.
3. This method is too slow to handle applications requiring immediate updating or responses.
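The batch update described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes the master file and the transaction batch fit in memory as lists of (key, name) tuples, and the sample records are hypothetical.

```python
import heapq

# Master file, already stored in key order (hypothetical sample data).
master = [(100, "Adams"), (200, "Brown"), (400, "Davis")]

# New transactions arrive in arbitrary order.
transactions = [(300, "Clark"), (150, "Baker")]

# The batch is sorted first, then merged with the master in a single
# sequential pass, producing a new master file in key order.
new_master = list(heapq.merge(master, sorted(transactions)))
```

Note that the whole master is rewritten even though only two records were added, which is why this organization suits periodic batch runs rather than immediate updates.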
Indexed-Sequential Organization
In the indexed-sequential method, records are physically stored in sequential order on a magnetic disk, based on the key field of each record. Each file also contains an index that maps one or more key fields of each data record to its storage location address.
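As a rough Python sketch of the idea, the example below keeps sorted records in blocks and builds a small index over them. It assumes a sparse index holding only the first key of each block, which is one common variant; the block contents and the `lookup` helper are hypothetical.

```python
import bisect

# Data blocks hold records sorted by key (hypothetical sample data).
blocks = [
    [(101, "Alice"), (105, "Bob")],
    [(110, "Carol"), (120, "Dave")],
    [(130, "Eve"), (145, "Frank")],
]

# Sparse index (an assumption): the first key of each block, in block order.
index_keys = [block[0][0] for block in blocks]

def lookup(key):
    # Use the index to find the last block whose first key is <= key.
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every stored key
    # Scan sequentially, but only within that one block.
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None
```

Calling `lookup(120)` consults the index, jumps to the second block, and scans only that block, so the file supports both ordered processing and reasonably fast access by key.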
Direct Organization
Direct file organization provides the fastest direct access to records. When using direct access methods, records do not have to be arranged in any particular sequence on storage media.
Characteristics of the direct access method include:
·         Computers must keep track of the storage location of each record using a variety of direct organization methods so that data can be retrieved when needed.
·         New transactions' data do not have to be sorted.
·         Processing that requires immediate responses or updating is easily performed.
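Hashing is one widely used way to keep track of storage locations in a direct file. The Python sketch below assumes division-remainder hashing on an integer key and in-memory buckets standing in for disk locations; `NUM_BUCKETS`, `store`, and `fetch` are hypothetical names chosen for illustration.

```python
NUM_BUCKETS = 7  # hypothetical; a real file would size this to its data

def bucket_of(key):
    # Division-remainder hashing: the key alone determines the location.
    return key % NUM_BUCKETS

# Each bucket stands in for a storage location on disk.
buckets = [[] for _ in range(NUM_BUCKETS)]

def store(key, record):
    # No sorting is needed: the record goes straight to its bucket.
    buckets[bucket_of(key)].append((key, record))

def fetch(key):
    # Only the one bucket the key hashes to is examined.
    for k, record in buckets[bucket_of(key)]:
        if k == key:
            return record
    return None

for key, name in [(1042, "Perryridge"), (305, "Round Hill"), (215, "Mianus")]:
    store(key, name)
```

Records are stored in arrival order, and `fetch(305)` goes directly to one bucket without touching the rest of the file.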
Example
Fixed Length Records
Let us consider a file of account records for our bank database. Each record of this file is defined (in pseudocode) as:
type deposit = record
Account_number char (10);
Branch_name char (22);
Balance numeric (12, 2);
end

If we assume that each character occupies 1 byte and that numeric (12, 2) occupies 8 bytes, our account record is 40 bytes long. A simple approach is to use the first 40 bytes for the first record, the next 40 bytes for the second record and so on.
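The 40-byte layout can be tried out with Python's `struct` module. This is a sketch under stated assumptions: the balance is stored as an 8-byte floating-point number (the text only says that numeric (12, 2) occupies 8 bytes), short strings are padded with NUL bytes, and the sample accounts are hypothetical.

```python
import struct

# 10-byte account number, 22-byte branch name, 8-byte balance.
# "<" disables alignment padding, so each record is exactly 40 bytes.
RECORD = struct.Struct("<10s22sd")

def pack_record(account_number, branch_name, balance):
    # struct pads short byte strings with NUL bytes up to the field width.
    return RECORD.pack(account_number.encode("ascii"),
                       branch_name.encode("ascii"),
                       balance)

def read_record(buf, i):
    # Record i starts at byte offset i * 40.
    acc, branch, balance = RECORD.unpack_from(buf, i * RECORD.size)
    return (acc.rstrip(b"\0").decode("ascii"),
            branch.rstrip(b"\0").decode("ascii"),
            balance)

# Three records laid out back to back, as the text describes.
file_image = b"".join([
    pack_record("A-102", "Perryridge", 400.0),
    pack_record("A-305", "Round Hill", 350.0),
    pack_record("A-215", "Mianus", 700.0),
])
```

Because every record has the same length, `read_record(file_image, 1)` can compute the offset of the second record directly instead of scanning from the start.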
Variable Length Records
Variable length records arise in database systems in several ways:
- Storage of multiple record types in a file.
- Record types that allow variable lengths for one or more fields.
- Record types that allow repeating fields, such as arrays or multisets.
Several techniques exist for implementing variable length records.
The slotted page structure is commonly used for organizing records within a block. There is a header at the beginning of each block, containing the following information:
·         The number of record entries in the header.
·         The end of free space in the block.
·         An array whose entries contain the location and size of each record.
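The header fields listed above can be modeled in a few lines of Python. This is only a sketch: it assumes records are packed backward from the end of the block so that free space sits between the header and the records, and the `SlottedPage` class, its method names, and the byte sizes charged for the header are hypothetical.

```python
class SlottedPage:
    HEADER_FIXED = 4  # assumed bytes for the entry count and free-space pointer
    SLOT_BYTES = 4    # assumed bytes per (location, size) entry

    def __init__(self, size=4096):
        self.size = size
        self.data = bytearray(size)
        self.slots = []        # header array: (location, size) of each record
        self.free_end = size   # header field: end of free space in the block

    def free_space(self):
        # Free space lies between the end of the header and free_end.
        header = self.HEADER_FIXED + self.SLOT_BYTES * len(self.slots)
        return self.free_end - header

    def insert(self, record):
        # The record and its new header entry must both fit.
        if len(record) + self.SLOT_BYTES > self.free_space():
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1  # slot number identifies the record

    def read(self, slot):
        location, size = self.slots[slot]
        return bytes(self.data[location:location + size])
```

For example, inserting `b"abc"` and then `b"hello!"` into a 64-byte page places them at offsets 61 and 55, and each record of a different length is still found through its header entry.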

