Type conversions allow expressions of an actual type Ta to be used in the context of an expected type Te. Of course, programmers can simply write their own type conversion routines as ordinary functions. But languages often provide special support for type conversions of various kinds.
There are several issues involved in language support for type conversions.
Languages may provide three different types of notation for conversions.
TRUNC
and
ROUND
functions for two different conversions from
REAL
to INTEGER
.
INTEGER(3.78)
in Modula-2, (short) 5
in Java.
Coercions interact with typing rules. When a type mismatch occurs between the actual type Ta of expression occuring in a context which expects a type Te, a type checker would normally report an error. However, if the language defines a coercion in this case, then that coercion is applied instead of reporting the error.
When a language supports conversions using type casts, this
effectively allows a family of overloaded conversion functions
to be represented.
For example, the Java cast short (expr)
specifies
six different conversions from each of the primitive types
byte
, char
, int
,
long
, float
, and double
.
It may be that there is a language provides no standard conversion in a particular case for an actual type Ta in the context of an expected type Te. But suppose that there are is an intermediate type Ti for which there exist conversions from Ta to Ti and from Ti to Te. Then these two conversions may be used transitively to achieve the Ta to Te conversion.
Algol 68 is a language with implicit transitive conversions. This is probably an unsafe feature: actual type errors might go undetected if the compiler can find a series of conversions that resolve the type conflict.
Some conversions may be defined transitively. Java's float to char conversion is defined as a float to int conversion followed by a int-to-char conversion.
Cast notation may be used for explicit transitive conversions.
For example, consider (int) (char) Float.POSITIVE_INFINITY
versus (int) Float.POSITIVE_INFINITY
.
The latter conversion narrows the largest representable floating
point value to the largest 32-bit two's complement integer
value 2147483647.
That conversion is also the first step in the float-to-char
conversion; the second step is to discard the high-order
16 bits to get the Unicode 16-bit character '\uffff'
.
Converting from this value to integer yields 65535.
Given a syntactic form specifying a conversion from one type to another, the semantics of the conversion are rules defining the actual mappings from values of the input type to values of the output type.
Lossless conversions preserve the information contained in a value.
Lossy conversions may lose information.
Unsafe conversions are ones in which conversions are not performed according to the semantics of the types involved, but simply by reinterpreting the bit patterns. For example, Modula-2 defines its type transfer functions (casts in a functional notation) as low-level unsafe conversions.
Unsafe conversions provide efficient ways to reinterpret bit patterns. The disadvantages include the introduction of a loophole into the security system of a strongly-typed language and the likelihood that programs will not be portable accross systems that implement the types using different representations.
Unsafe conversions can also occur through the variant record mechanisms of Pascal or Modula-2. This is problably worse than the use of explicit unsafe type transfers, because conversion in this manner may be accidental, i.e., an uncaught programmer error. Ada's variant record mechanism has been carefully designed to disallow this safety problem.
Modula-3 provides for unsafe conversions with its LOOPHOLE
construct. However, LOOPHOLE
and other unsafe features may only be used in explicitly
declared UNSAFE
modules.
Ordinary, safe modules may not use any unsafe features, nor
may they import UNSAFE
modules unless those
modules have a safe interface. This can provide for
a nice division of a complex
system into a large number of safe high-level modules
supported by a few low-level modules using unsafe techniques for
efficiency. If an inexplicable run-time error occurs,
responsibility for the error is localized to the UNSAFE
modules.