Consider the query to find the assets and branch names of all branches that
have depositors living in Port Chester.
In relational algebra, this is
\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer \bowtie deposit \bowtie branch)) \]
This expression constructs a huge relation,
\[ customer \bowtie deposit \bowtie branch \]
of which we are only interested in a few tuples.
We are also interested in only two attributes of this relation.
We can see that we only want tuples for which
ccity = ``Port Chester''.
Thus we can rewrite our query as:
\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie deposit \bowtie branch) \]
This should considerably reduce the size of the intermediate relation.
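The effect of selecting early can be checked on a small example. The following is a minimal sketch, not a real query processor: the tuples, the attribute names cname and bname, and the helper functions natural_join, select and project are all assumptions made for illustration. It evaluates both forms of the query and compares the size of the intermediate join.

```python
# Toy relations; the schema and the data are assumed for illustration only.
customer = [
    {"cname": "Ann", "ccity": "Port Chester"},
    {"cname": "Bob", "ccity": "Rye"},
    {"cname": "Cy", "ccity": "Harrison"},
]
deposit = [
    {"cname": "Ann", "bname": "Downtown", "balance": 1500},
    {"cname": "Bob", "bname": "Downtown", "balance": 700},
    {"cname": "Bob", "bname": "Midtown", "balance": 2500},
    {"cname": "Cy", "bname": "Midtown", "balance": 300},
]
branch = [
    {"bname": "Downtown", "assets": 9000000},
    {"bname": "Midtown", "assets": 4000000},
]


def natural_join(r, s):
    """Nested-loop natural join: pair tuples that agree on all shared attributes."""
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out


def select(pred, r):
    """Keep the tuples of r that satisfy pred."""
    return [t for t in r if pred(t)]


def project(attrs, r):
    """Project onto attrs; a set is used so duplicates are removed."""
    return {tuple(t[a] for a in attrs) for t in r}


def in_port_chester(t):
    return t["ccity"] == "Port Chester"


# Original form: join everything, then select.
late_join = natural_join(natural_join(customer, deposit), branch)
late = project(["bname", "assets"], select(in_port_chester, late_join))

# Rewritten form: select on customer first, then join.
early_join = natural_join(
    natural_join(select(in_port_chester, customer), deposit), branch
)
early = project(["bname", "assets"], early_join)

print(len(late_join), len(early_join))
print(late == early)
```

On this toy data both plans return the same answer, but the first materializes 4 joined tuples before selecting while the second materializes only 1.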
Suggested Rule for Optimization:
Perform select operations as early as possible.
If our original query were restricted further to customers with a
balance over $1000, the whole selection could no longer be applied directly
to the customer relation as above.
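The new relational algebra query is
\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit \bowtie branch)) \]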
The selection cannot be applied to customer, as balance
is an attribute of deposit.
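We can still rewrite as
\[ \Pi_{bname,\,assets}((\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit)) \bowtie branch) \]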
If we look further at the subquery (the selection applied to the join of
customer and deposit above), we can split the selection predicate in two:
\[ \sigma_{ccity = \text{``Port Chester''}}(\sigma_{balance > 1000}(customer \bowtie deposit)) \]
This rewriting gives us a chance to use our ``perform selections
early'' rule again.
We can now rewrite our subquery as:
\[ \sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie \sigma_{balance > 1000}(deposit) \]
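As a rough check of this subquery rewriting, the sketch below applies each half of the predicate to the relation that owns the attribute and compares the result with selecting after the join. The data, the attribute names cname and bname, and the helpers natural_join, select and as_set are hypothetical and serve only as an illustration.

```python
# Toy relations; the schema and the data are assumed for illustration only.
customer = [
    {"cname": "Ann", "ccity": "Port Chester"},
    {"cname": "Dee", "ccity": "Port Chester"},
    {"cname": "Bob", "ccity": "Rye"},
    {"cname": "Cy", "ccity": "Harrison"},
]
deposit = [
    {"cname": "Ann", "bname": "Downtown", "balance": 1500},
    {"cname": "Ann", "bname": "Midtown", "balance": 400},
    {"cname": "Dee", "bname": "Downtown", "balance": 800},
    {"cname": "Bob", "bname": "Midtown", "balance": 2500},
]


def natural_join(r, s):
    """Nested-loop natural join on all shared attribute names."""
    return [
        {**t, **u}
        for t in r
        for u in s
        if all(t[a] == u[a] for a in t.keys() & u.keys())
    ]


def select(pred, r):
    """Keep the tuples of r that satisfy pred."""
    return [t for t in r if pred(t)]


def as_set(r):
    """Order-insensitive view of a relation, used only for comparison."""
    return frozenset(tuple(sorted(t.items())) for t in r)


def lives_in_port_chester(t):
    return t["ccity"] == "Port Chester"


def balance_over_1000(t):
    return t["balance"] > 1000


# Selection applied after the join, with the combined predicate.
joined = natural_join(customer, deposit)
unpushed = select(
    lambda t: lives_in_port_chester(t) and balance_over_1000(t), joined
)

# Each selection pushed down to the relation that has the attribute.
small_customer = select(lives_in_port_chester, customer)
small_deposit = select(balance_over_1000, deposit)
pushed = natural_join(small_customer, small_deposit)

print(as_set(unpushed) == as_set(pushed))
print(len(joined), len(small_customer), len(small_deposit))
```

The pushed form never builds the full join of customer and deposit; it joins two already reduced inputs instead.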
Second Transformational Rule:
Replace expressions of the form
\( \sigma_{P_1 \wedge P_2}(e) \) by \( \sigma_{P_1}(\sigma_{P_2}(e)) \),
where \(P_1\) and \(P_2\) are predicates and \(e\) is a relational algebra
expression.
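The rule can be illustrated with a short sketch, assuming relations are modelled as lists of dictionaries. The helper names select and conj, and the sample tuples in e, are hypothetical; the check only demonstrates the equivalence on one input and is not a proof.

```python
def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred, in order."""
    return [t for t in rel if pred(t)]


def conj(p1, p2):
    """Build the predicate p1 AND p2."""
    return lambda t: p1(t) and p2(t)


# e stands for the result of any relational algebra expression
# (sample tuples assumed for illustration).
e = [
    {"cname": "Ann", "ccity": "Port Chester", "balance": 1500},
    {"cname": "Dee", "ccity": "Port Chester", "balance": 800},
    {"cname": "Bob", "ccity": "Rye", "balance": 2500},
]


def p1(t):
    return t["ccity"] == "Port Chester"


def p2(t):
    return t["balance"] > 1000


combined = select(conj(p1, p2), e)    # one selection with P1 AND P2
cascaded = select(p1, select(p2, e))  # P2 applied first, then P1

print(combined == cascaded)
```

Once the conjunction is split this way, each single-predicate selection can be moved independently toward the relation whose attributes it mentions.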