Next: Merge-Join Up: Join Strategies Previous: Simple Iteration

## Block-Oriented Iteration

1. If we process tuples on a per-block basis we can save many accesses.

The idea is that, if both relations have tuples stored together physically, we can examine all the tuple pairs for a block of each relation at one time. We still need to read all the tuples of one relation for a block of the other relation.

The block method algorithm is:

``` afor each block Bd of deposit do

begin

for each block Bc of customer do

begin

for each tuple d in Bd do

begin

for each tuple c in Bc do

begin

test pair (d, c) to see if a tuple

should be added to the result

end

end

end

end

```

• Instead of reading the entire customer relation for each tuple of deposit, we read the entire customer relation once for each block of deposit.
• Since there are 500 blocks of deposit tuples and 10 blocks of customer tuples, reading customer once for each block of deposit requires 10 * 500 = 5000 accesses to customer blocks.

• Total cost is then 5000 + 500 (for accesses to deposit blocks) = 5500.
• This is obviously a significant improvement over the non-block method, which required roughly 100,000 or 2,000,000 accesses.

• Choice of customer for the inner loop is arbitrary, but does provide a potential advantage.

• Being the smaller relation, it may be possible to keep it all in main memory.
• If this was the case, we would only require 500 blocks to read deposit plus 10 blocks to read customer into main memory, for a total of 510 block accesses.

Osmar Zaiane
Sun Jul 26 17:45:14 PDT 1998