Use of an Index

Next: Hash Join Up: Join Strategies Previous: Merge-Join

Use of an Index

Frequently the join attributes form a search key for an index on one of the relations being joined.
- In such cases we can consider a join strategy that makes use of such an index.
- Our first join algorithm is more efficient if an index exists on customer (inner loop relation) for cname.
- Then for a tuple d in deposit, we no longer have to read the entire customer relation.
- Instead, we use the index to find tuples in customer with the same value on cname as tuple d of deposit.
- We saw that without an index, we could take as many as 2,010,000 block accesses.
- Using the index, and making no assumptions about physical storage, the join can be performed more efficiently.
- We still need 10,000 accesses to read deposit (outer loop relation).
- However, for each tuple of deposit, we only need an index lookup on the customer relation.
- As and we assumed that 20 pointers fit in one block, this lookup requires at most 2 index block accesses (depth of -tree) plus block access for the customer tuple itself.
- This means 3 block accesses instead of 200, for each of the 10,000 deposit tuples.
- This gives 40,000 total, which appears high, but is still better than 2,010,000.
- More efficient strategies required tuples to be stored physically together.
- If tuples are not stored physically together, this strategy is highly desirable.
- The savings (1,970,000 accesses) is enough to justify creation of the index, even if it is used only once.

Osmar Zaiane
Sun Jul 26 17:45:14 PDT 1998