Next: Join Strategies for Parallel Up: Join Strategies Previous: Hash Join

## Three-Way Join

1. We now consider the strategies for computing a 3-way join:

``` branch    deposit    customer

```

2. We'll assume that
3. Strategy 1:
• Compute deposit customer using one of the previous methods.
• As cname is a key for customer, the result of this join has at most 10,000 tuples (the number in deposit).

• If we then build an index on branch for bname, we can compute

``` branch    (deposit    customer)

```

by considering each tuple t of

``` (deposit    customer)

```

and looking up tuples in branch with the same bname value using the index.

• (How many accesses, roughly?)
4. Strategy 2:
• Compute the join without any indices.
• This requires checking 50 * 10,000 * 200 possibilities = 100,000,000.
• Not too good of an idea...
5. Strategy 3:
• Perform the pair of joins at once.
• Construct two indices:
• One on branch for bname.
• One on customer for cname.

• Then consider each tuple t in deposit.
• For each t, look up corresponding tuples in customer and in branch with equal values on respective join attributes.
• Thus we examine each tuple of deposit exactly once.
Using strategy 3 it is often possible to perform a three-way join more efficiently than by using two two-way joins.
6. It is hard to calculate exact costs for 3-way joins. Costs depend on how the relations are stored, distribution of values and presence of indices.

Osmar Zaiane
Sun Jul 26 17:45:14 PDT 1998