Physical Storage Media

Several types of data storage exist in most computer systems. They vary in speed of access, cost per unit of data, and reliability.
- Cache: most costly and fastest form of storage. Usually very small, and managed by the computer system hardware.
- Main Memory (MM): the storage area for data available to be operated on.
  - General-purpose machine instructions operate on main memory.
  - Contents of MM are usually lost in a power failure or "crash".
  - Usually too small to store the entire database.
- Direct-access Storage (disk): primary medium for long-term storage.
  - Typically the entire database is stored on disk.
  - Data must be moved from disk to MM in order for the data to be operated on.
  - After operations are performed, data must be copied back to disk if any changes were made.
  - Disk storage is called direct access storage as it is possible to read data on the disk in any order (unlike sequential access).
  - Disk storage usually survives power failures and system crashes.
- Tape Storage: used primarily for backup and archival data.
  - Cheaper than disk, but access is much slower, since tape must be read sequentially from the beginning.
  - Used as protection from disk failures.
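The disk-to-memory cycle described in the disk bullets above (copy data into MM, operate on it, copy it back if it changed) can be sketched as follows. This is only an illustration: an ordinary temporary file stands in for the disk, and `BLOCK_SIZE`, `read_block` and `write_block` are hypothetical names chosen for the example, not part of any real database system.

```python
import tempfile

BLOCK_SIZE = 512  # assumed transfer unit in bytes (hypothetical value)

def read_block(f, block_no):
    """Copy one block from the simulated disk into a main-memory buffer."""
    f.seek(block_no * BLOCK_SIZE)
    return bytearray(f.read(BLOCK_SIZE))

def write_block(f, block_no, buffer):
    """Copy a modified buffer back to its block on the simulated disk."""
    f.seek(block_no * BLOCK_SIZE)
    f.write(buffer)
    f.flush()

with tempfile.TemporaryFile() as disk:
    disk.write(bytes(4 * BLOCK_SIZE))   # a 4-block "disk" filled with zero bytes
    buffer = read_block(disk, 2)        # disk -> main memory
    buffer[0:5] = b"hello"              # operate on the in-memory copy only
    write_block(disk, 2, buffer)        # main memory -> disk, since data changed
    reread = bytes(read_block(disk, 2)[:5])
    untouched = bytes(read_block(disk, 1)[:5])
```

Until `write_block` runs, the change exists only in the in-memory buffer, which is why it would be lost in a crash.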
As disk storage is so important in database implementation, we will look at disk characteristics in detail.
Figure 7.2 shows a simple disk.
- The head is a device which stays close to the surface of the platter and reads or writes information encoded magnetically on the platter.
- The platter is organized into concentric tracks of data (see Figure 7.3).
- The arm can be positioned over any one of the tracks.
- The platter is spun at high speed.
- To read information, the arm is positioned over the correct track.
- When the data to be accessed passes under the head, the read or write operation is performed.
Since the platter rotates at high speed, it does not take long for the contents of an entire track to pass under the head.
- The time spent waiting for the desired data to rotate under the head is called disk latency time; it is at most the time of one full revolution.
- Relative to latency time, it takes a long time to reposition the arm.
- The repositioning time, called seek time, grows as the distance the arm must move increases.
- It is therefore useful to store related information on the same track or physically close tracks in order to minimize seek time.
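To make the trade-off between seek time and latency time concrete, here is a rough sketch of an access-time estimate. All numbers are assumptions made up for illustration (3600 rpm, a linear seek model with a 3 ms start-up cost and 0.01 ms per track), and charging an average wait of half a revolution is likewise an assumption; real drives behave differently.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution of the platter, in milliseconds."""
    return 60_000.0 / rpm

def seek_time_ms(distance_tracks, startup_ms=3.0, per_track_ms=0.01):
    """Hypothetical linear seek model: zero if the arm does not move,
    otherwise a fixed start-up cost plus a cost per track crossed."""
    if distance_tracks == 0:
        return 0.0
    return startup_ms + per_track_ms * distance_tracks

def access_time_ms(current_track, target_track, rpm=3600):
    """Seek time plus an assumed average rotational wait of half a revolution."""
    seek = seek_time_ms(abs(target_track - current_track))
    average_wait = rotation_time_ms(rpm) / 2
    return seek + average_wait

same = access_time_ms(100, 100)   # no arm movement, rotational wait only
near = access_time_ms(100, 101)   # adjacent track
far = access_time_ms(100, 900)    # distant track
print(same, near, far)
```

Under these assumed figures the cost grows with the distance the arm travels, which is the reason for keeping related data on the same or nearby tracks.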
Multiple-platter disks (see Figure 7.4) are called disk packs. When we use the term disk from now on, we will be referring to multiple-platter disks.
- Multiple disk arms are moved as a unit by the actuator.
- Each arm has two heads, to read the platter surfaces above and below it.
- The set of tracks over which the heads are located forms a cylinder.
- This cylinder holds the data that is accessible within the disk latency time.
- It is clearly sensible to store related data in the same or adjacent cylinders.
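One way to keep related data in the same cylinder is to number blocks so that a whole cylinder fills up before the arm has to move. The sketch below assumes a simplified geometry in which every track holds the same number of blocks; the names `block_address`, `heads` and `blocks_per_track` are hypothetical and chosen only for this example.

```python
from collections import namedtuple

BlockAddress = namedtuple("BlockAddress", "cylinder head slot")

def block_address(block_no, heads, blocks_per_track):
    """Map a logical block number to (cylinder, head, slot), filling every
    track of a cylinder before moving on to the next cylinder."""
    blocks_per_cylinder = heads * blocks_per_track
    cylinder, offset = divmod(block_no, blocks_per_cylinder)
    head, slot = divmod(offset, blocks_per_track)
    return BlockAddress(cylinder, head, slot)

# 40 consecutive blocks on an assumed disk with 4 heads and 8 blocks per track
addresses = [block_address(n, heads=4, blocks_per_track=8) for n in range(40)]
cylinders_touched = {a.cylinder for a in addresses}
```

With this numbering the first 32 blocks of a file sit in one cylinder, so reading them needs no arm movement at all.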
Data is transferred between disk and main memory in units called blocks.
- A block is a contiguous sequence of bytes from a single track of one platter.
- Block sizes range from 512 bytes to several thousand bytes.
- If several blocks from a cylinder need to be transferred, we may save time by requesting them in the order in which they pass under the heads.
- Similarly, if blocks are from different cylinders, we may save time by requesting them in an order that minimizes actuator movement.
- These techniques may not always be possible, or may be expensive.
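The two ordering ideas above can be sketched as follows. For blocks on different cylinders, the example uses an elevator-style sweep, which is one common heuristic and is swapped in here as an illustration; it reduces arm movement but is not guaranteed to give the minimum. For blocks within one cylinder, requests are sorted by how soon each passes under the heads. The request list and starting positions are invented for the example, and all function names are hypothetical.

```python
def arm_travel(start_cylinder, order):
    """Total number of cylinders the arm crosses when serving requests in order."""
    travel, position = 0, start_cylinder
    for cylinder in order:
        travel += abs(cylinder - position)
        position = cylinder
    return travel

def elevator_order(start_cylinder, requests):
    """Serve requests at or above the arm position in ascending order,
    then the remaining ones in descending order on the way back."""
    upward = sorted(c for c in requests if c >= start_cylinder)
    downward = sorted((c for c in requests if c < start_cylinder), reverse=True)
    return upward + downward

def rotational_order(current_slot, slots, blocks_per_track):
    """Order block slots within one cylinder by how soon each reaches the heads."""
    return sorted(slots, key=lambda s: (s - current_slot) % blocks_per_track)

requests = [98, 183, 37, 122, 14, 124, 65, 67]   # cylinders, in arrival order
start = 53                                       # assumed current arm position
arrival_travel = arm_travel(start, requests)
swept_travel = arm_travel(start, elevator_order(start, requests))
print(arrival_travel, swept_travel)              # prints: 640 299
print(rotational_order(5, [2, 7, 0, 5], 8))      # prints: [5, 7, 0, 2]
```

Reordering requires holding several pending requests and sorting them, which is part of the cost hinted at in the last bullet.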


Page created and maintained by Osmar R. Zaïane
Last Update: Tue Oct 31 12:59:25 PST 1995