Next: Ordered Indices Up: Indexing & Hashing Previous: Indexing & Hashing

Basic Concepts

An index for a file works like a catalogue in a library. Cards in alphabetic order tell us where to find books by a particular author.
In real-world databases, indices like this might be too large to be efficient. We'll look at more sophisticated indexing techniques.
There are two kinds of indices.
- Ordered indices: indices are based on a sorted ordering of the values.
- Hash indices: indices are based on the values being distributed uniformly across a range of buckets. The buckets to which a value is assigned is determined by a function, called a hash function.
We will consider several indexing techniques. No one technique is the best. Each technique is best suited for a particular database application.
Methods will be evaluated on:
1. Access Types -- types of access that are supported efficiently, e.g., value-based search or range search.
2. Access Time -- time to find a particular data item or set of items.
3. Insertion Time -- time taken to insert a new data item (includes time to find the right place to insert).
4. Deletion Time -- time to delete an item (includes time taken to find item, as well as to update the index structure).
5. Space Overhead -- additional space occupied by an index structure.
We may have more than one index or hash function for a file. (The library may have card catalogues by author, subject or title.)
The attribute or set of attributes used to look up records in a file is called the search key (not to be confused with primary key, etc.).

Osmar Zaiane
Mon Jul 13 13:28:03 PDT 1998