The previous example might seem to suggest that we should decompose schemes
as much as possible.
This too can lead to a bad design.
Consider a design where Borrow-scheme is decomposed into two schemes
We construct our new relations from borrow by:
It appears that we can reconstruct the borrow relation by
performing a natural join on the new schemes amt and loan.
Figure 6.5 shows what we get by computing amtloan.
We notice that there are tuples in amtloan that are
not in borrow.
How did this happen?
The intersection of the two schemes is amount, so the
natural join is made on the basis of equality in the loan amount.
If two loans are for the same amount, there will be four tuples in the
natural join.
Two of these tuples will be spurious - they will not appear in the
original borrow relation, and should not appear in the database.
Although we have more tuples in amtloan,
we have less information.
Because of this, we call this a lossy or
lossy-join decomposition.
A decomposition that is not lossy-join is called a
lossless-join decomposition.
The only way we could make a connection between loan and amt
was through amount.
When we decomposed Lending-scheme into Borrow-scheme and
Branch-scheme earlier, we did not have a similar problem.
Why not?
The only way we could represent a relationship between tuples in the
two relations was through bname.
This did not cause problems.
For a given branch name, there is exactly one assets value and branch
city.
The branch associated with a loan depends on the customer, not on
the amount of the loan (which is not unique).
We'll make a more formal definition of lossless-join:
Let be a relation scheme.
A set of relation schemes is a
decomposition of if
That is, every attribute in appears in at least one for
.
Let be a relation on , and let
That is, is the database that results
from decomposing into .
It is always the case that:
To see why this is, consider a tuple .
When we compute the relations , the
tuple gives rise to one tuple in each .
These tuples combine together to regenerate when we
compute the natural join of the .
Thus every tuple in appears in .
However, in general,
We saw an example of this inequality in our decomposition of
borrow into amt and loan.
In order to have a lossless-join decomposition, we need to impose
some constraints on the set of possible relations.
Let represent a set of constraints on the database.
A decomposition of a relation scheme
is a lossless-join decomposition for if, for all relations
on scheme that are legal under :
In other words, a lossless-join decomposition is one in which, for any
legal relation , if we decompose and then ``recompose'' , we
get what we started with - no more and no less.