Sound in Processing (Extra)

In these notes you will learn:

  • Some of the many different kinds of sound file formats.
  • Some of the basic issues of getting sound to work in a program.
  • How to use the Minim library to get sound to work in Processing.
  • How to use Minim to make a simple simulation of drums.

Sound File Formats

On a computer, sound is stored as digital audio. Roughly speaking, that means that sound is stored as a sequence of sound levels. While dozens of different audio file formats have been created, they can be categorized into a few basic types:

  • Uncompressed audio, such as WAV, AIFF, or AU formats. WAV, for instance, is the default audio file format on Windows computers, and it uses about 10 megabytes per minute of sound. Uncompressed audio formats are not efficient for storing, say, 10 songs from your favourite band’s latest album: they take up too much space.

  • Lossless compression audio formats create compressed audio files. For instance, in an uncompressed WAV file, 10 seconds of silence takes up just as much space as 10 seconds of music. But in, say, a WMA Lossless file the silence takes up much less space (thanks to a trick called run-length encoding). Such files are called lossless because they store everything that would have been stored in an equivalent uncompressed audio file. In practice, lossless audio compression typically stores audio in about half the size of the original (i.e. a 2:1 compression ratio).

    Compressing and de-compressing lossless audio files takes more time and computing power than uncompressed audio. But since it does not lose any information (the way, for instance, MP3s do), it is a good format for music archives, or for converting live music to digital format.

    The type of audio can make a significant difference in the compression. For example, the FLAC lossless file format compresses voice recordings more efficiently than music recordings.

  • Lossy audio formats, such as MP3, WMA, AAC, and OGG Vorbis, store audio in a format even more compressed than lossless compression files.

    MP3, for instance, is a popular format for consumer music. The MP3 format can compress audio into a file about 10 times smaller than the uncompressed audio. It does this in a very clever way by reducing the quality of parts of the audio that most people can’t hear. It’s called a lossy format because some of the audio is deleted.

    The software that encodes and decodes MP3s is quite complex, and has a number of patents associated with it that differ by country. Thus you can’t write software to play and create MP3s without negotiating with the patent holders. One way to avoid the licensing hassles of MP3s is to use a free lossy format, such as OGG Vorbis. Or, if you are a big company with lots of resources, you could also create your own lossy format the way Microsoft did with WMA.
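
To make the numbers in the list above concrete, here is a small sketch in plain Java (the language Processing is built on). It assumes CD-quality stereo audio (44,100 samples per second, 2 bytes per sample, 2 channels) to check the "about 10 megabytes per minute" figure, and it shows a toy version of run-length encoding. The class and method names are made up for this illustration, and real lossless formats use considerably more elaborate techniques than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: the names here are illustrative only.
final class AudioSizeSketch {

  // Number of bytes needed to store uncompressed audio.
  static long uncompressedBytes(int sampleRate, int bytesPerSample,
                                int channels, int seconds) {
    return (long) sampleRate * bytesPerSample * channels * seconds;
  }

  // Toy run-length encoding: each run of equal sound levels
  // is stored as one {level, count} pair.
  static List<int[]> runLengthEncode(short[] samples) {
    List<int[]> runs = new ArrayList<>();
    for (short s : samples) {
      int[] last = runs.isEmpty() ? null : runs.get(runs.size() - 1);
      if (last != null && last[0] == s) {
        last[1]++;                     // same level again: extend the run
      } else {
        runs.add(new int[] { s, 1 });  // new level: start a new run
      }
    }
    return runs;
  }

  public static void main(String[] args) {
    // Assumed format: 44,100 samples/s, 2 bytes/sample, 2 channels.
    long perMinute = uncompressedBytes(44100, 2, 2, 60);
    System.out.println(perMinute + " bytes per minute");

    // Ten seconds of mono silence: every sound level is 0.
    short[] silence = new short[44100 * 10];
    System.out.println(runLengthEncode(silence).size() + " run(s)");
  }
}
```

Under these assumptions one minute of sound needs 10,584,000 bytes, which is where the rough 10 megabyte figure comes from. The ten seconds of silence collapse into a single run, while music, whose levels change constantly, would produce almost as many runs as samples.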

It turns out that Processing supports most popular sound file formats, and so we’ll use whatever file format is convenient.

Sound in Programs

Getting sound to work in a program is trickier than you might think. Of course, sound requires that your computer have a sound card and speakers (and a microphone, if you want to do recording), just as your computer needs a graphics card and monitor for graphics. Pretty much all personal computers come with sound cards and speakers, and so that is rarely a problem.

But from a programming point of view what we need to worry about is this: almost all sound effects, or music, are played while our program is doing something else at the same time. Everything we’ve seen in Processing so far has been sequential, with only one action happening at a time. In other words, we have been writing sequential programs. Never have we allowed two actions to occur simultaneously, i.e. we have not written any parallel programs. But that’s usually what we need with sound.

It turns out Processing has a general-purpose solution to the problem of getting things to happen simultaneously: threads. A thread is essentially a special object that is allowed to run at the same time as other threads. If your computer has multiple CPUs (or cores), then it is possible (but not guaranteed!) that each thread uses its own CPU. If you have only one CPU, then Processing can simulate running two threads simultaneously by running each thread for a short time in round-robin fashion (i.e. the threads alternate taking turns on the CPU).

So far we’ve only been (implicitly) using one thread. If we want sound to run at the same time as our program, then the natural solution is to use two threads: one thread for our program, and a second thread for sound.
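
As a rough illustration of the two-thread idea, here is a sketch in plain Java rather than a Processing sketch. The "sound" thread only counts steps instead of playing audio, and all of the names are invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the "sound" thread only counts, it plays nothing.
final class TwoThreadsSketch {

  // Runs a pretend sound thread alongside the calling thread and returns
  // {steps done by the sound thread, steps done by the calling thread}.
  static int[] runBoth(int steps) {
    AtomicInteger soundSteps = new AtomicInteger();

    Thread sound = new Thread(() -> {
      for (int i = 0; i < steps; i++) {
        soundSteps.incrementAndGet(); // stands in for playing a chunk of audio
      }
    });
    sound.start(); // from here on, the sound thread runs on its own

    int drawSteps = 0;
    for (int i = 0; i < steps; i++) {
      drawSteps++; // stands in for drawing one frame
    }

    try {
      sound.join(); // wait until the sound thread has finished
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return new int[] { soundSteps.get(), drawSteps };
  }

  public static void main(String[] args) {
    int[] done = runBoth(1000);
    System.out.println("sound steps: " + done[0] + ", draw steps: " + done[1]);
  }
}
```

Both loops complete all of their steps, but the order in which the individual steps interleave is up to the operating system and can differ from run to run.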

However, threads are notoriously tricky to use correctly, and so we won’t use them directly. Instead, we’ll rely on Minim to manage our sound-related threads.

Note

Getting two or more threads to share CPUs and memory in an efficient and error-free way is surprisingly difficult. Even simple threaded programs can contain subtle bugs that are very difficult to spot. Numerous languages, libraries, and programming techniques have been invented to make it easier to write parallel programs.

Using the Minim Library

Minim is a sound library that comes with Processing, and with it you can do the following sorts of things:

  • Play many different kinds of sound files, including MP3s.
  • Record (e.g. via a microphone) sounds and store them in files.
  • Apply sound effects in real time to sound files.

We will only be using Minim to play music and sound effects.

Here’s an example from the Minim documentation that shows how to play a sound file:

import ddf.minim.*;

Minim minim;
AudioPlayer song;

void setup() {
  size(100, 100);

  minim = new Minim(this);

  // this loads mysong.wav from the data folder
  song = minim.loadFile("mysong.wav");
  song.play();
}

void draw() {
  background(0);
}

void stop() {
  song.close();
  minim.stop();

  super.stop();
}

Note the following:

  • You must tell Processing that you want to use the Minim library functions by importing Minim with this statement:

    import ddf.minim.*;
    

    This line gives the program access to all the code in the ddf.minim library; the * means “all” in this statement.

  • You must create, and initialize, a Minim variable:

    Minim minim;
    
    void setup() {
      // ...
    
      minim = new Minim(this);
    
      // ...
    }
    

    The expression new Minim(this) creates a new Minim object specifically for this program. The variable this is a special variable that is automatically supplied by Processing. Essentially, this refers to the object that contains your program, and Minim needs access to this object to work correctly.

  • Assuming your audio file contains a song, you also need to create and initialize an AudioPlayer object:

    // ...
    AudioPlayer song;
    
    void setup() {
      // ...
    
      minim = new Minim(this);
    
      song = minim.loadFile("mysong.wav");
    
      song.play(0);
    }
    

    The statement song.play(0) plays the song starting at location 0 (i.e. the beginning) of the file. The program keeps running as the song plays, i.e. the song is played in a separate thread of execution.

  • Finally, Minim needs some help managing its sound threads. In any program that uses Minim, you need a stop() function that looks something like this:

    void stop() {
      song.close();
      minim.stop();
    
      super.stop();
    }
    

    The stop() function is a special Processing function that contains code to be run when a thread stops. In other words, the code in stop() will be run automatically when your program stops. All it does is close the song, stop Minim, and then call the stop() function of the parent object.

    For now, you don’t have to worry about the details of stop(): just know that it is required when you use Minim, and that it turns off the sound.
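
The example above expects a file called mysong.wav in the sketch's data folder. If you have no recording at hand, one option is to generate a short test tone. The following is a sketch in plain Java (not a Processing sketch) that uses the standard javax.sound.sampled package. The 440 Hz pitch, the two-second length, the half-volume level, and the class and method names are all assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper: writes a sine-wave test tone to a WAV file.
final class ToneWriterSketch {
  static final int SAMPLE_RATE = 44100; // sound levels per second

  // Builds 16-bit mono samples of a sine wave, low byte first.
  static byte[] sineWave(double frequencyHz, double seconds) {
    int frames = (int) Math.round(SAMPLE_RATE * seconds);
    byte[] data = new byte[frames * 2];
    for (int i = 0; i < frames; i++) {
      double angle = 2.0 * Math.PI * frequencyHz * i / SAMPLE_RATE;
      short level = (short) (Math.sin(angle) * 0.5 * Short.MAX_VALUE);
      data[2 * i] = (byte) (level & 0xff);            // low byte
      data[2 * i + 1] = (byte) ((level >> 8) & 0xff); // high byte
    }
    return data;
  }

  // Wraps the raw samples in a WAV header and saves them to a file.
  static void writeWav(byte[] pcm, File out) {
    // 16 bits per sample, 1 channel, signed, little-endian
    AudioFormat format = new AudioFormat(SAMPLE_RATE, 16, 1, true, false);
    AudioInputStream stream = new AudioInputStream(
        new ByteArrayInputStream(pcm), format, pcm.length / 2);
    try {
      AudioSystem.write(stream, AudioFileFormat.Type.WAVE, out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Writes mysong.wav to the current directory.
    writeWav(sineWave(440.0, 2.0), new File("mysong.wav"));
  }
}
```

After running it, move the resulting mysong.wav into the data folder of your sketch so that minim.loadFile("mysong.wav") can find it.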

Digital Drums

As a simple example that shows graphics and sound working together, let’s write a program that lets the user play a pair of bongo drums (the sound files used are bongo1.wav and bongo7.wav). For simplicity, we’ll draw the drums as circles, and let the user tap a drum by pressing the ‘A’ or ‘B’ key.

Screenshot of the bongo drums.

Here is the program:

// Drum sounds are from http://www.drumsamples.org/

// import Minim
import ddf.minim.*;

// set up the sound variables
Minim minim;
AudioSnippet drum1;
AudioSnippet drum2;

// track when a drum has been struck
boolean drum1struck;
boolean drum2struck;

void setup() {
  // initialize the screen
  size(210, 120);
  smooth();

  // initialize sound
  minim = new Minim(this);
  drum1 = minim.loadSnippet("bongo1.wav");
  drum2 = minim.loadSnippet("bongo7.wav");

  // initially, neither drum has been struck
  drum1struck = false;
  drum2struck = false;
}

void draw() {
  background(255);

  // draw the drums: if a drum has just been struck
  // then fill it with color as visual feedback for the user

  // drum 1
  if (drum1struck == true) {
    fill(0);
    drum1struck = false;
  } else {
    fill(255);
  }
  ellipse(50, 55, 100, 100);

  // drum 2
  if (drum2struck == true) {
    fill(0);
    drum2struck = false;
  } else {
    fill(255);
  }
  ellipse(160, 55, 100, 100);
}

void keyPressed() {
  if (key == 'a' || key == 'A') {
    drum1struck = true;
    drum1.play(0);
  }
  else if (key == 'b' || key == 'B') {
    drum2struck = true;
    drum2.play(0);
  }
}

// Minim requires that this function be added
void stop() {
  drum1.close();
  drum2.close();
  minim.stop();

  super.stop();
}

Notice a couple of things:

  • The audio files are loaded as AudioSnippet objects (instead of AudioPlayer objects) because they are short snippets of sound.
  • The code in draw() looks complex, but it is conceptually pretty simple: if a drum has just been struck, then the on-screen drum is briefly drawn filled-in to give the user visual feedback that it has been struck.
  • The keyPressed() function checks whether the user has pressed the ‘A’ or ‘B’ key. The if-statement tests for both the lowercase and the uppercase letter, so the program works the same whichever case the user types in.
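
The struck-flag logic described above can also be looked at on its own. The following plain-Java sketch is one possible way to package it and is not part of the program above: DrumPadSketch is a hypothetical class that handles the case-insensitive key check and the one-frame fill, and it leaves out the Minim calls and the drawing.

```java
// Hypothetical class: not part of Processing or Minim.
final class DrumPadSketch {
  private final char key;  // lowercase key that strikes this drum
  private boolean struck;  // true until the next frame has been drawn

  DrumPadSketch(char key) {
    this.key = Character.toLowerCase(key);
  }

  // Call from keyPressed(); returns true if this drum was hit.
  boolean handleKey(char pressed) {
    if (Character.toLowerCase(pressed) == key) {
      struck = true;
      return true;
    }
    return false;
  }

  // Call once per frame from draw(). Returns the gray level to fill with
  // (0 is black, used right after a strike; 255 is white) and clears the flag.
  int nextFill() {
    int fill = struck ? 0 : 255;
    struck = false;
    return fill;
  }

  public static void main(String[] args) {
    DrumPadSketch drum = new DrumPadSketch('a');
    drum.handleKey('A');                 // uppercase works too
    System.out.println(drum.nextFill()); // frame right after the strike: 0
    System.out.println(drum.nextFill()); // following frame: 255
  }
}
```

Because the flag is cleared as soon as one frame has been drawn, the drum is filled in for exactly one frame per strike, which is the brief flash the user sees.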

Questions

  1. What is Minim? Describe it briefly.
  2. Write the Processing import statement that must be put at the top of any Processing program that uses Minim.
  3. Why are uncompressed audio formats, such as WAV, not often used for storing music?
  4. Briefly explain the difference between lossless audio file compression, and lossy audio file compression.
  5. What is the major technical reason why lossy audio file formats, such as MP3, are so popular?
  6. Name three kinds of things you can do with Minim.
  7. Why does Minim require a stop() function in any program that uses it?

Programming Questions

  1. Modify the bongo drum program so that the drums are labelled ‘A’ and ‘B’ on the screen.
  2. Add a third drum to the bongo drum program that is played when the user presses ‘C’. It should sound different than the other two.
  3. Change the visual feedback in the bongo drum program so that when a drum is struck the shape of the drum briefly changes from a circle to a rectangle.
  4. Find (or make) a background image of some drums, and use it so that it looks like the user is playing a real drum set. Modify the visual feedback to make it fit with the style of the background image.