18. Sound in Processing

In these notes you will learn:

  • Some of the many different kinds of sound file formats.
  • Some of the basic issues of getting sound to work in a program.
  • How to use the Minim library to get sound to work in Processing.
  • How to use Minim to make a simple simulation of drums.

18.1. Sound File Formats

On a computer, sound is stored as digital audio. Roughly speaking, that means that sound is stored as a sequence of sound levels. While there are dozens of different ways of storing sound in a computer, they can be grouped into a few categories:

  • Uncompressed audio, such as the WAV, AIFF, or AU formats. WAV, for instance, is the default audio file format on Windows computers, and it uses about 10 megabytes per minute of sound. Uncompressed audio formats are not efficient when it comes to storage. Before MP3 players became popular, when CDs were widely used, you could store only about 74 to 80 minutes of audio on a CD.

  • Lossy compression arose to deal with the problem of storage. Examples are MP3, WMA, AAC, and the oddly named OGG Vorbis.

    These formats differ greatly in their details, but what they have in common is a clever approach to storing sound: the observation that there are parts of the audio that most people can’t hear. If most people can’t hear certain parts of the audio, why store them? This achieves a wondrous factor of 10 reduction in the size of the file! The formats are called “lossy” because they delete those parts of the audio that can’t be heard (by most).

    The software that translates uncompressed music to MP3s, called an encoder, is quite complex and has a number of patents associated with it that differ by country. Thus you can’t write software to play and create MP3s without negotiating with the patent holders. One way to avoid this headache is to use a free lossy format such as OGG Vorbis. Alternatively, if you are a big company with lots of money, you can hire some engineers and create your own format, like Microsoft did with WMA.

  • Lossless compression. If you’re an audiophile (or just have very good hearing), then the notion of eliminating parts of the audio might not be attractive to you. Further, for music archives, or for converting live music to digital format, we might not want to lose any information.

    Lossless audio formats achieve some compression by using tricks other than removing things outside the range of the average human ear. For example, 10 seconds of silence takes up as much space as 10 seconds of music in an uncompressed WAV file. However, in a WMA Lossless file the silence takes up much less space.

    Such files are called lossless because they store everything that would have been stored in an equivalent uncompressed audio file. In practice, lossless compression typically achieves a compression ratio of about 2:1: the compressed file takes about half the space of the original, uncompressed audio file. That is not as good as lossy formats, but it is still better than uncompressed audio, and no information is lost.

    The type of audio can make a significant impact on the compression ratio. For example, the FLAC lossless file format compresses voice recordings more efficiently than music recordings.
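The sizes quoted in the list above can be checked with a little arithmetic. The following is only a rough sketch in plain Java (not a Processing sketch). It assumes CD-quality settings (44,100 samples per second, two channels, 2 bytes per sample), which these notes do not state explicitly, and it treats the 10:1 and 2:1 ratios as the approximate figures they are. The class and method names are made up for illustration.

```java
class AudioSizeSketch {
    // Bytes needed for uncompressed audio: every second stores sampleRate
    // samples for each channel, and each sample is bytesPerSample wide.
    static long uncompressedBytes(int sampleRate, int channels,
                                  int bytesPerSample, int seconds) {
        return (long) sampleRate * channels * bytesPerSample * seconds;
    }

    // Approximate size after compressing by a given ratio (10 means 10:1).
    static long compressedBytes(long originalBytes, int ratio) {
        return originalBytes / ratio;
    }

    public static void main(String[] args) {
        // Assumed CD-quality settings: 44,100 samples/s, stereo, 16-bit samples.
        long oneMinute = uncompressedBytes(44100, 2, 2, 60);
        System.out.println("Uncompressed:         " + oneMinute + " bytes");
        System.out.println("Lossless (about 2:1): " + compressedBytes(oneMinute, 2) + " bytes");
        System.out.println("Lossy (about 10:1):   " + compressedBytes(oneMinute, 10) + " bytes");
    }
}
```

Under these assumptions one minute of uncompressed sound comes to 10,584,000 bytes, which matches the "about 10 megabytes per minute" figure for WAV.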

18.2. Sound in Programs

Getting sound to work in a program is trickier than you might think. From a programming point of view, what we need to worry about is the following: almost all sound effects, or music, are played while our program is doing something else at the same time. Everything that we’ve seen in Processing so far has been sequential, with only one action happening at a time. In other words, we have been writing sequential programs. Never have we made it so that two actions occur simultaneously. That is, we have not written any parallel programs. But that’s usually what you need with sound.

It turns out that Processing has a general-purpose solution to the problem of getting things to happen simultaneously: threads. A thread is essentially a part of a program that is allowed to execute alongside other parts of the program, that is, alongside other threads. If your computer has multiple CPUs, then it is possible (but not guaranteed!) that each thread uses its own CPU. If you have only one CPU, then Processing can simulate running two (or more) threads in parallel by running each thread for a very short period of time, then switching to another. In other words, the threads take turns using the CPU.

So far we’ve only been explicitly using one thread. If we want sound to run concurrently with other things in our program, then the natural solution is to use two threads: one thread for the main program, and a second thread for sound.
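To make the idea of two threads concrete, here is a minimal sketch in plain Java, the language that Processing is built on. It is not how Minim works internally. The class name, the method name, and the "sound" thread that merely counts are all invented for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TwoThreadsSketch {
    // Runs a stand-in "sound" task on a second thread while the calling
    // thread does its own work, then reports how many steps both finished.
    static int runBoth(int steps) {
        // AtomicInteger lets both threads update the counter safely.
        AtomicInteger completed = new AtomicInteger(0);

        // Second thread: pretend each loop iteration plays a bit of sound.
        Thread soundThread = new Thread(() -> {
            for (int i = 0; i < steps; i++) {
                completed.incrementAndGet();
            }
        });
        soundThread.start(); // returns immediately; the thread runs on its own

        // Meanwhile, the main thread carries on with its own work.
        for (int i = 0; i < steps; i++) {
            completed.incrementAndGet();
        }

        try {
            soundThread.join(); // wait for the second thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runBoth(1000));
    }
}
```

The counter is an AtomicInteger rather than a plain int because two threads update it at once. With a plain int, some of the increments could be lost, which is a small taste of why threaded code is easy to get wrong.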

However, threads are notoriously tricky to use correctly, and so we won’t use them directly. Instead, we’ll rely on a tool called Minim to manage our sound-related threads.

Note

Getting two or more threads to share CPUs and memory in an efficient and error-free way is surprisingly difficult. So difficult that it may be the topic of entire courses! Even simple threaded programs can contain subtle bugs that are very difficult to spot. Numerous languages, libraries, and programming techniques have been invented to make it easier to write parallel programs.

18.3. Using the Minim Library

Minim is a sound library that comes with Processing, and with it you can do the following sorts of things:

  • Play many different kinds of sound files, including MP3s.
  • Record sounds (e.g. via a microphone) and store them in files.
  • Apply sound effects to sound files, on the fly.

The following example is from the Minim documentation page (always a good resource when you’re stuck). It shows how to play a sound file:

import ddf.minim.*;

Minim minim;
AudioPlayer song;

void setup() {
    size(100, 100);

    minim = new Minim(this);

    // this loads mysong.mp3 from the data folder
    song = minim.loadFile("mysong.mp3");
    song.play();
}

void draw() {
    background(0);
}

void stop() {
    song.close();
    minim.stop();

    super.stop();
}

Note the following:

By default, Processing assumes we do not want sound in our program. When we do wish to include sound, we must tell Processing about it. This is done with the line:

import ddf.minim.*;

We are importing the Minim library into our code. Try commenting out this statement and running the program. Processing will complain that it does not know what Minim is. The import line gives the program access to all of the code in the library called ddf.minim; the * means “all” in this statement.

We must create, and initialize, a Minim variable:

Minim minim;

void setup() {
    // ...

    minim = new Minim(this);

    // ...
}

The expression new Minim(this) creates a new Minim object specifically for this program. The variable this is a special variable that is automatically supplied by Processing. Essentially, this refers to the object that contains your program, and Minim requires access to this object to work correctly.

To play a song, we need to create and initialize an AudioPlayer object, and then we need to tell Processing to start playing the song:

AudioPlayer song;

void setup() {
    //...

    song = minim.loadFile("mysong.mp3");
    song.play();
}

The statement song.play() plays the song starting at the beginning of the file. If you want the song to start at a different point, you can pass the starting position, in milliseconds, as a parameter. For example, to start playing the song 5 seconds in, write song.play(5 * 1000);.

An important point here is that the program doesn’t pause and wait for the song to finish playing. Any code that we place in the draw() function keeps running, and keeps drawing to the screen, while the song plays.

Finally, Minim needs a little bit of help in managing its sound threads. In any program that uses Minim, you need to define a stop() function that looks like the one above. This is another special Processing function, and can be thought of as the ‘opposite’ of setup: it is called right before the program stops. For programs that use Minim, we close the song, stop Minim, and then call the stop() function of the parent object (more on parents and children later).

For now, you don’t have to worry too much about the details of stop(): just know that it is required when you use Minim, and that it turns off the sound.

18.4. Digital Drums

As a simple example that shows graphics and sound working together, let’s write a program that lets the user play a pair of bongo drums (the sound files used are bongo1.wav and bongo7.wav). For simplicity, we’ll draw the drums as circles, and let the user tap a drum by pressing ‘f’ (bongo1) or ‘j’ (bongo7).

Here is the program:

// Drum sounds are from http://www.drumsamples.org/

// import Minim
import ddf.minim.*;

// setup the sound variables;
Minim minim;
AudioSnippet drum1;
AudioSnippet drum2;

// track when a drum has been struck
boolean drum1struck, drum2struck;

void setup() {
    //initialize the screen
    size(500, 500);
    smooth();

    // init sound
    minim = new Minim(this);
    drum1 = minim.loadSnippet("bongo1.wav");
    drum2 = minim.loadSnippet("bongo7.wav");

    drum1struck = false;
    drum2struck = false;
}

void draw() {
    background(255);

    // draw the drums: if a drum has just been struck
    // then fill it with colour as visual feedback

    // drum 1
    if (drum1struck) {
        fill(0);
        drum1struck = false;
        drum1.rewind();
    } else {
        fill(255, 0, 0);
    }
    ellipse(50, 55, 100, 100);

    // drum 2
    if (drum2struck) {
        fill(255);
        drum2struck = false;
        drum2.rewind();
    } else {
        fill(0, 255, 0);
    }
    ellipse(160, 55, 100, 100);
}

void keyPressed() {
    if (key != CODED) {
        if (key == 'f') {
            drum1.play();
           drum1struck = true;
        } else if (key == 'j') {
            drum2.play();
            drum2struck = true;
        }
    }
}

void stop() {
    drum1.close();
    drum2.close();
    minim.stop();

    super.stop();
}

There are a few things to notice about this program:

  • The audio files are loaded as AudioSnippet objects, rather than AudioPlayer objects. Generally AudioSnippets are used for short snippets of sound.
  • After a drum is struck, note that we rewind the sound effect using the call drum1.rewind() for drum1 and drum2.rewind() for drum2. This is because the AudioSnippet object keeps track of how much of the audio file has been played, and the next time play() is called the object resumes playing from that point. This allows us to easily implement ‘pausing’. But we want to be able to repeat the sound, and so we have to rewind our position in the audio file to the beginning. It is also possible to simply call, say, drum1.play(0) every time, rather than using the rewind() function call.
  • In draw(), we provide a brief visual indicator when a drum is struck, by changing the fill colour of the appropriate drum for one call to draw(). To achieve this, we use the drum1struck and drum2struck variables.
  • The keyPressed() function checks whether the user has pressed either the ‘f’ or the ‘j’ key. Note that only lower case ‘f’ and ‘j’ are permitted, because you’ll recall that Strings and char types in Processing are case-sensitive when it comes to comparison: 'f' != 'F'.
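If you would rather accept upper case keys as well, one option is to convert the key to lower case before comparing it. The following is a small sketch in plain Java, separate from the drum program. The class KeyCaseSketch and the method drumForKey are hypothetical names, and the return values 1, 2, and 0 are just a convention chosen for this illustration.

```java
class KeyCaseSketch {
    // Returns which drum (1 or 2) a key should trigger, or 0 for no drum.
    // Converting to lower case first makes 'F' behave the same as 'f'.
    static int drumForKey(char key) {
        char lower = Character.toLowerCase(key);
        if (lower == 'f') {
            return 1;
        } else if (lower == 'j') {
            return 2;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println('f' == 'F');       // chars compare case-sensitively
        System.out.println(drumForKey('F'));  // upper case is accepted here
        System.out.println(drumForKey('j'));
        System.out.println(drumForKey('x'));  // any other key triggers nothing
    }
}
```

The same Character.toLowerCase(key) call could be used inside keyPressed() in the drum program, since Processing code has access to the standard Java classes.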

18.5. Questions

  1. What is Minim? Describe it briefly.
  2. Write the Processing import statement that must be put at the top of any Processing program that uses Minim.
  3. Why are uncompressed audio formats, such as WAV, not often used for storing music?
  4. Briefly explain the difference between lossless audio file compression, and lossy audio file compression.
  5. What is the major technical reason why lossy audio file formats, such as MP3, are so popular?
  6. Name three kinds of things you can do with Minim.
  7. Why does Minim require a stop() function in any program that uses it?

18.6. Programming Questions

  1. Modify the bongo drum program so that the drums are labelled ‘A’ and ‘B’ on the screen.
  2. Add a third drum to the bongo drum program that is played when the user presses ‘C’. It should sound different than the other two.
  3. Change the visual feedback in the bongo drum program so that when a drum is struck the shape of the drum briefly changes from a circle to a rectangle.
  4. Find (or make) a background image of some drums and use it so that it looks like the user is playing a real drum set. Modify the visual feedback to make it fit with the style of the background image.