Block-Oriented Iteration

Next: Merge-Join Up: Join Strategies Previous: Simple Iteration

Block-Oriented Iteration

If we process tuples on a per-block basis we can save many accesses.
The idea is that, if both relations have tuples stored together physically, we can examine all the tuple pairs for a block of each relation at one time. We still need to read all the tuples of one relation for a block of the other relation.
The block method algorithm is:
```
 afor each block Bd of deposit do

		 		 begin

		 		 		 for each block Bc of customer do

		 		 		 		 begin

		 		 		 		 		 for each tuple d in Bd do

		 		 		 		 		 		 begin

		 		 		 		 		 		 		 for each tuple c in Bc do

		 		 		 		 		 		 		 		 begin

		 		 		 		 		 		 		 		 		 test pair (d, c) to see if a tuple

		 		 		 		 		 		 		 		 		 should be added to the result

		 		 		 		 		 		 		 		 end

		 		 		 		 		 		 end

		 		 		 		 end

		 		 end
```
- Instead of reading the entire customer relation for each tuple of deposit, we read the entire customer relation once for each block of deposit.
- Since there are 500 blocks of deposit tuples and 10 blocks of customer tuples, reading customer once for each block of deposit requires 10 * 500 = 5000 accesses to customer blocks.
- Total cost is then 5000 + 500 (for accesses to deposit blocks) = 5500.
- This is obviously a significant improvement over the non-block method, which required roughly 100,000 or 2,000,000 accesses.
- Choice of customer for the inner loop is arbitrary, but does provide a potential advantage.
- Being the smaller relation, it may be possible to keep it all in main memory.
- If this was the case, we would only require 500 blocks to read deposit plus 10 blocks to read customer into main memory, for a total of 510 block accesses.

Osmar Zaiane
Sun Jul 26 17:45:14 PDT 1998