Clustering File Organization


Storing one relation per file, with fixed-length records, works well for small databases and also keeps the code size down.
Many large-scale DB systems do not rely directly on the underlying operating system for file management. Instead, one large OS file is allocated to the DB system, and all relations are stored in that file.
To efficiently execute queries involving the join depositor ⋈ customer, one may store the depositor tuples for each cname near the customer tuple for the corresponding cname, as shown in Figure 10.19.
This structure mixes together tuples from two relations, but allows for efficient processing of the join.
If a customer has so many accounts that the records cannot fit in one block, the remaining records appear on nearby blocks. This file structure, called clustering, allows us to read many of the required records using one block read.
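As a minimal sketch of the idea (a toy in-memory model, not the storage format of any real system), the following Python code places each customer tuple directly before that customer's depositor tuples, cuts the record stream into fixed-capacity blocks, and counts how many blocks hold records for one cname. The block capacity, the sample tuples, and the function names are all assumptions made for illustration.

```python
BLOCK_CAPACITY = 4  # records per block; arbitrary value chosen for illustration


def build_clustered_file(customers, depositors, capacity=BLOCK_CAPACITY):
    """Lay out each customer tuple followed by its depositor tuples."""
    records = []
    for cname in customers:
        records.append(("customer", cname))
        for dep_cname, account in depositors:
            if dep_cname == cname:
                records.append(("depositor", cname, account))
    # Cut the record stream into blocks of `capacity` records each.
    return [records[i:i + capacity] for i in range(0, len(records), capacity)]


def blocks_read_for_join(blocks, cname):
    """Count the blocks that hold at least one record for `cname`."""
    return sum(1 for block in blocks if any(rec[1] == cname for rec in block))


# Hypothetical sample data.
customers = ["Hayes", "Turner"]
depositors = [
    ("Hayes", "A-102"),
    ("Hayes", "A-220"),
    ("Hayes", "A-503"),
    ("Turner", "A-305"),
]

clustered = build_clustered_file(customers, depositors)
hayes_blocks = blocks_read_for_join(clustered, "Hayes")
```

With this sample data the Hayes customer tuple and its three depositor tuples fill one block, so the join result for Hayes needs a single block read. A customer with more accounts than fit in a block would spill into the following block, matching the "nearby blocks" case described above.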
Our use of clustering enhances the processing of a particular join but may result in slow processing of other types of queries, such as selection on customer.
For example, the query
```
select *
from customer
```
now requires more block accesses, because the customer relation is now interspersed with the depositor relation.
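A rough back-of-the-envelope sketch of this cost, under simplifying assumptions that are not from the text: all records have the same size, blocks are completely full, and the clustered file has no auxiliary pointers linking customer records, so a scan of customer must read every block of the mixed file. The function name and the sample sizes are hypothetical.

```python
import math


def blocks_for_customer_scan(n_customers, n_depositors, capacity, clustered):
    """Estimate block reads for a full scan of the customer relation."""
    if clustered:
        # Customer tuples are spread through the mixed file, so the scan
        # reads every block that holds customer or depositor records.
        return math.ceil((n_customers + n_depositors) / capacity)
    # Customer stored on its own: only customer blocks are read.
    return math.ceil(n_customers / capacity)


# Hypothetical sizes: 1000 customers, 3000 depositor tuples, 4 records per block.
separate_cost = blocks_for_customer_scan(1000, 3000, 4, clustered=False)
clustered_cost = blocks_for_customer_scan(1000, 3000, 4, clustered=True)
```

Under these assumptions the scan reads 250 blocks when customer is stored separately and 1000 blocks when it is clustered with depositor.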
Thus clustering is a trade-off that depends on the types of query the database designer believes to be most frequent. Careful use of clustering may produce significant performance gains.

Osmar Zaiane
Tue Jul 7 16:00:21 PDT 1998