Shape

Based on the chapter in Williams & Bizup of the same name.

Shape

Again, we will worry about more complex sentences.

By now, we can comfortably manage the simple sentences. The more complex, the harder it is to keep things readable.

Shape

A sentences get more complicated, it can be hard to manage the complexity.

Quicksort takes O(n log n) time in expectation, averaged over all n! permutations of n elements with equal probability, to sort an array of n distinct elements.

But isn't that good? Character and action are early in the sentence.

Shape

Or,

To sort an array of n distinct elements, in expectation, averaged over all n! permutations of n elements with equal probability, quicksort takes O(n log n) time.

Important part at the end for emphasis.

Shape

The original is easier to read:

To sort an array of n distinct elements, quicksort takes O(n log n) time in expectation, averaged over all n! permutations of n elements with equal probability. Wikipedia, “Quicksort”

Maybe the verb could be sorts, but takes O(?) time is common in this context.

There is some concision work that could be done, but let's ignore it.

Shape

This sentence has a lot going on:

  1. To sort an array intro phrase
  2. of n distinct elements, qualifier
  3. quicksort subject
  4. takes verb
  5. O(n log n) time object
  6. in expectation more object
  7. averaged over all n! permutations of n elements more object
  8. with equal probability. more object

Shape

We could split that into multiple sentences: the sentence is giving a basic explanation of average-case complexity. We will assume it's reasonable as one sentence.

The O(n log n) is only true with a bunch of qualifiers.

The problem is to manage all of those parts.

Revising Sprawl

The question of “shape” will be how we can rearrange the part of a complex sentence to make it more readable.

Let's look at some heuristics to help…

Get to the Subject

As before, we'd like to have the subject appear early in the sentence.

Since most undergraduate students change their major fields of study at least once during their college careers, first-year students who are not certain about the program of studies they want to pursue should not load up their schedules to meet requirements for a particular program. Williams & Bizup, 11th ed, p. 145

Get to the Subject

An introductory phrase/qualifier can have its place, but keep it short. Or move it later in the sentence.

First-year students should not load up their schedules with requirements for a particular program if they are not certain about the program of studies they want to pursue, because most change their major fields of study at least once during their college careers. Williams & Bizup, 11th ed, p. 145

Get to the Subject

When I do this wrong, it's often because I want to write a thought as a implies b, instead of b is implied by a.

A short introductory phrase can sneak in just fine.

Since n is small, insertion sort can be used.

Which is best depends on the context, but if b is the main subject, it's a vote for the later.

Get to the Verb & Object

Similarly, readers want to get the verb and object quickly.

This also mirrors earlier advice, but is more subtle in complex sentences.

Get to the Verb & Object

This sentence has the subject early, but it's so complicated that the verb is very late.

The memory access pattern of Heapsort's heapify and siftDown phases slows it down on modern hardware.

Get to the Verb & Object

With a shuffling, the subject, verb, and object all happen early:

Heapsort is slowed on modern hardware by the memory access pattern of the heapify and siftDown phases.

Don't Interrupt

English grammar is flexible enough that you can, if you are so inclined, inject thoughts into many locations.

In that sentence, if you are so inclined is in the middle of the verb can inject.

That kind of thing, you will find, comes up more often, due to the number of clauses, in more complex sentences.

Don't Interrupt

Avoid interrupting the connection between the subject & verb, or the verb & object.

Those things go together. Don't break them up without a good reason.

Don't Interrupt

A more realistic example:

Heapsort, due to its memory access pattern, is slow on modern hardware.

In this case, the subject is far from the verb:

Heapsort (subject), due to its memory access pattern, is (verb) slow on modern hardware (object).

Don't Interrupt

An example that breaks up the object:

Heapsort accesses memory, due to the nature of the heapify procedure, in non-contiguous locations.
Heapsort accesses memory in non-contiguous locations, due to the nature of the heapify procedure.

Get to the Point

Another heuristic: start with the actual point you're making.

If you start with supporting evidence, the reader has to replay what was just said once they figure out the context.

Due to the nature of the heapify procedure, heapsort accesses memory in non-contiguous locations.

Get to the Point

For example, this sentence has the point at the end:

When the size of the array to be sorted approaches or exceeds the available primary memory, so that (much slower) disk or swap space must be employed, the memory usage pattern of a sorting algorithm becomes important, and an algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical. Wikipedia, “Sorting algorithm”

Get to the Point

Putting the point early lets the reader predict the rationale.

An algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical when the size of the array to be sorted approaches or exceeds the available primary memory, so that (much slower) disk or swap space must be employed and the memory usage pattern becomes important.

[Starts with a familiar topic; emphasizes usage pattern at end; subject early but complex.]

In-Class Exercise

Rewrite this to make it more readable.

Although operations on a hash table take constant time on average, the cost of a good hash function can be significantly higher than the inner loop of the lookup algorithm for a sequential list or search tree, so hash tables are not effective when the number of entries is very small. adapted from Wikipedia, “Hash table”

Restriction: keep the complexity in one sentence.

Reshaping Sprawl

Let's rework the quicksort sentence:

Quicksort takes O(n log n) expected time, when given an array of n distinct elements, where the time is averaged over all n! permutations of n elements, which are equally weighted.

Maybe not actually better, but we can see what's happening…

Reshaping Sprawl

There are a bunch of “relative clauses” modifying the main point.

  1. Quicksort takes O(n log n) expected time, subject+verb, the main point
  2. when given an array of n distinct elements, relative clause
  3. where the time is averaged over all n! permutations of n elements, relative clause
  4. which are equally weighted. relative clause

Reshaping Sprawl

The “memory access” sentence can also be stretched into that pattern:

  1. An algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical
  2. when the size of the array to be sorted approaches or exceeds the available primary memory,
  3. which may cause swap space to be employed,
  4. which makes the memory usage pattern important.

[I think those are relative clauses, but wouldn't bet on it.]

Reshaping Sprawl

Each of these sentences has a main point, and some parts that modify/​restrict/​refine it.

With a more clear idea of what those are, we can make some choices…

Remove Clauses

Maybe you can just remove them. Let concision win if that's appropriate.

In the quicksort sentence:

…averaged over all n! permutations of n elements, which are equally weighted.

How else would you weight them? Isn't equally implied unless you say otherwise?

…averaged over all n! permutations of n elements.

Separate Sentences

Often, it's easy to break these into separate sentences by breaking the complex thought into manageable parts.

An algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical when the size of the array to be sorted approaches or exceeds the available primary memory. The memory usage pattern becomes very important if swap space must be used.

Separate Sentences

As always, some restraint is appropriate.

Many algorithms are fairly efficient when the array fits easily in RAM. These may become impractical when the size of the array to be sorted approaches or exceeds the available primary memory. Swap space must then be used. This causes the memory usage pattern to become very important.

Modifying Phrases

Sometimes the cause can be changed to a modifying phrase, a phrase that could be left out, but helps its meaning.

These can make the clauses less repetitive, varying the sentence structure.

Modifying Phrases

For example,

…exceeds the available primary memory, which may cause swap space to be employed, making the memory usage pattern important.
…exceeds the available primary memory, causing swap space to be used and making the memory usage pattern important.

Ordering Clauses

Generally, arrange clauses from short to long. Ending with shorter clauses is abrupt.

We should devote a few final words to a matter that reaches beyond the techniques of research to the connections between those subjective values that reflect our deepest ethical choices and objective research. Williams & Bizup, 11th ed, p. 157

That last thought falls out of nowhere.

Ordering Clauses

With shorter clauses earlier, the reader can build:

We should devote a few final words to a matter that reaches beyond the techniques of the connections between those objective research and those subjective values that reflect our deepest ethical choices. adapted from Williams & Bizup, 11th ed, p. 158

In-Class Exercise

Reassemble and reshape this sentence. adapted from Wikipedia, “Radix sort”

  1. Radix sort is a non-comparative sorting algorithm
  2. that sorts data with integer keys,
  3. which works by grouping keys by the individual digits,
  4. where those digits share the same significant position value,
  5. that runs in O(wn) time.

Create one (good) sentence out of these parts.

Troubleshooting Sentences

Sometimes you end up with a sentence that sucks, and it's hard to figure out why.

Let's talk about a few things that are often problems in more complex sentences…

Grammatical Coordination

With multiple clauses complicating a sentence, having non-parallel structures can be disorienting.

When using quicksort, care must be taken to choose a good pivot and of repeated values in the list.

Grammatical Coordination

The two parts here have different roles:

  1. When using quicksort, care must be taken
  2. to choose a good pivot and
  3. of repeated values in the list.

The first is an action in the algorithm; the second is a property of the input.

Grammatical Coordination

We can revise to describe parallel issues:

When using quicksort, care must be taken of bad pivots and of repeated values in the list.
When using quicksort, care must be taken to choose a good pivot and to efficiently handle repeated values in the list.

Grammatical Coordination

Another example:

Quicksort is a space-optimized version of the binary tree sort, which has a tree implied by the recursive calls instead of an explicit tree, but makes exactly the same comparisons in a different order. adapted from Wikipedia, “Quicksort”

Grammatical Coordination

Again, this is a statement with two pieces of supporting explanation.

  1. Quicksort is a space-optimized version of the binary tree sort,
  2. which has a tree implied by the recursive calls instead of an explicit tree,
  3. but makes exactly the same comparisons in a different order.

The but refers to the first clause, not the implied tree, which isn't obvious.

Grammatical Coordination

We can repair something like:

Quicksort is a space-optimized version of the binary tree sort, which makes exactly the same comparisons in a different order, and has a tree implied by the recursive calls instead of an explicit tree.

Unclear Connections

The previous example was A which B, but C. The new information in C applied to A, not to the thing immediately before it. That's confusing.

Be careful with these implicit connections around your topics.

Unclear Connections

Pronouns can easily be a problem.

Insertion sort has best-case running time O(n), which is less than merge sort's O(n log n), so it can still be useful.

What can still be useful?

I often do this as the result of incomplete editing.

Unclear Connections

It's easy to fix it once you notice.

Insertion sort has best-case running time O(n), which is less than merge sort's O(n log n), so insertion sort can still be useful.

Or even:

… so despite its slower average-case, insertion sort can still be useful.

In-Class Exercise

Rewrite this sentence to make it more readable.

The Mustache template system is implemented for many programming languages, provides logic-less design, so it doesn't have control flow constructs, which is distributed on GitHub.