In these notes you will learn:
On a computer, sound is stored as digital audio. Roughly speaking, that means that sound is stored as a sequence of sound levels. While there are dozens of different ways of storing sound on a computer, they can be grouped into a few categories:
WAV is the default audio file format on Windows computers, and it uses about 10 megabytes per minute of sound. Uncompressed audio formats like this are not efficient when it comes to storage. Before MP3 players became popular, when CDs were widely used, you could store about 65-70 minutes of audio on a CD.
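The figure of about 10 megabytes per minute can be checked with a little arithmetic. The sketch below is plain Java (the language Processing is built on), and it assumes CD-quality audio: 44,100 samples per second, 2 bytes per sample, and 2 channels (stereo). These parameters are assumptions rather than something stated in these notes, and the class and method names are made up for illustration.

```java
class WavSize {
    // Bytes needed to store uncompressed audio of the given length.
    static long uncompressedBytes(int samplesPerSecond, int bytesPerSample,
                                  int channels, int seconds) {
        return (long) samplesPerSecond * bytesPerSample * channels * seconds;
    }

    public static void main(String[] args) {
        // Assumed CD-quality settings: 44,100 Hz, 16-bit (2 bytes), stereo.
        long bytesPerMinute = uncompressedBytes(44100, 2, 2, 60);
        double megabytesPerMinute = bytesPerMinute / 1_000_000.0;

        System.out.println("Bytes per minute: " + bytesPerMinute);
        System.out.println("Megabytes per minute: " + megabytesPerMinute);
    }
}
```

Under these assumptions one minute of sound takes 10,584,000 bytes, which matches the rough figure of 10 megabytes per minute.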
Lossy compressed formats include MP3, WMA, AAC, and the oddly named OGG Vorbis.
These formats differ greatly in their details, but what they have in common is a clever approach to storing sound: the observation that there are parts of the audio that most people can’t hear. If most people can’t hear certain parts of the audio, why store them? This achieves a wondrous factor of 10 reduction in the size of the file! The formats are called “lossy” because they delete those parts of the audio that can’t be heard (by most people).
The software that translates uncompressed music to MP3s, called an encoder, is quite complex and has a number of patents associated with it that differ by country. Thus you can’t write software to play and create MP3s without negotiating with the patent holders. One way to avoid this headache is to use a free lossy format such as OGG Vorbis. Alternatively, if you are a big company with lots of money, you can hire some engineers and create your own format, like Microsoft did with WMA.
If you can hear more than most people (say, because you have exceptionally good hearing), then the notion of eliminating parts of the audio might not be as attractive to you. Further, for music archives, or for converting live music to digital format, we might not want to lose any information.
Lossless audio formats do achieve some compression, but by using tricks other than removing things outside the range of the average human ear. For example, 10 seconds of silence takes up as much space as 10 seconds of music in an uncompressed WAV file. However, in a WMA Lossless file the silence takes up much less space.
Such files are called lossless because they store everything that would have been stored in an equivalent uncompressed audio file. In practice, lossless compression typically achieves about a 2:1 compression ratio: the file takes about half the space of the original, uncompressed audio file. That is not as good as lossy formats, but it is still better than uncompressed audio, and no information is lost.
The type of audio can have a significant impact on the compression ratio. For example, the FLAC lossless file format compresses voice recordings more efficiently than music recordings.
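To see why silence can be stored so compactly without losing anything, here is a small plain-Java sketch. It is only an illustration under stated assumptions: it uses Java's general-purpose DEFLATE compressor (`java.util.zip`) as a stand-in, whereas real lossless audio formats such as FLAC or WMA Lossless use audio-specific techniques. It also assumes 10 seconds of CD-quality stereo (1,764,000 bytes) and uses random bytes as a crude stand-in for hard-to-compress sound. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class LosslessDemo {
    // Compress bytes with DEFLATE, a general-purpose lossless method.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Recover the original bytes exactly from their compressed form.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] result = new byte[originalLength];
        int filled = 0;
        try {
            while (filled < originalLength && !inflater.finished()) {
                filled += inflater.inflate(result, filled, originalLength - filled);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("compressed data is corrupt", e);
        } finally {
            inflater.end();
        }
        return result;
    }

    public static void main(String[] args) {
        int tenSeconds = 44100 * 2 * 2 * 10;   // assumed CD-quality stereo
        byte[] silence = new byte[tenSeconds]; // every sound level is zero
        byte[] noise = new byte[tenSeconds];
        new Random(42).nextBytes(noise);       // stand-in for "busy" audio

        byte[] packedSilence = compress(silence);
        byte[] packedNoise = compress(noise);

        System.out.println("Original size:      " + tenSeconds + " bytes");
        System.out.println("Compressed silence: " + packedSilence.length + " bytes");
        System.out.println("Compressed noise:   " + packedNoise.length + " bytes");

        // Lossless means the round trip gives back every byte unchanged.
        boolean identical = Arrays.equals(noise, decompress(packedNoise, noise.length));
        System.out.println("Noise restored exactly: " + identical);
    }
}
```

The silence shrinks to a tiny fraction of its original size, while the random noise hardly shrinks at all, and in both cases decompressing gives back exactly the original bytes. Real music sits somewhere between these two extremes, which is consistent with the compression ratio depending on the type of audio.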
Getting sound to work in a program is trickier than you might think. From a programming point of view, what we need to worry about is the following: almost all sound effects, or music, are played while our program is doing something else at the same time. Everything that we’ve seen in Processing so far has been sequential, with only one action happening at a time. In other words, we have been writing sequential programs. Never have we made it so that two actions occur simultaneously. That is, we have not written any parallel programs. But that’s usually what you need with sound.
It turns out that Processing has a general-purpose solution to the problem of getting things to happen simultaneously: threads. A thread is essentially a part of a program that is allowed to execute alongside other parts of the program, that is, alongside other threads. If your computer has multiple CPUs, then it is possible (but not guaranteed!) that each thread uses its own CPU. If you have only one CPU, then Processing can simulate running two (or more) threads in parallel by running each thread for a very short period of time, then switching to another. In other words, the threads take turns using the CPU.
So far we’ve only been explicitly using one thread. If we want sound to run concurrently with other things in our program, then the natural solution is to use two threads: one thread for the main program, and a second thread for sound.
However, threads are notoriously tricky to use correctly, and so we won’t use them directly. Instead, we’ll rely on a tool called Minim to manage our sound-related threads.
Note
Getting two or more threads to share CPUs and memory in an efficient and error-free way is surprisingly difficult. So difficult that it may be the topic of entire courses! Even simple threaded programs can contain subtle bugs that are very difficult to spot. Numerous languages, libraries, and programming techniques have been invented to make it easier to write parallel programs.
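To make the idea of a thread concrete, here is a minimal sketch in plain Java (the language Processing is built on). It is not Minim code and not how Minim is implemented: the "sound" thread only counts pretend beats, and all of the names (`TwoThreads`, `playBeats`, `runDemo`) are made up for illustration. It shows one thread standing in for sound while the main thread carries on with its own work.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TwoThreads {
    // How many pretend beats the sound thread has played so far.
    // AtomicInteger is safe to update from one thread and read from another.
    static final AtomicInteger beatsPlayed = new AtomicInteger(0);

    // Stand-in for playing sound: each "beat" just takes 10 ms.
    static void playBeats(int count) {
        for (int i = 0; i < count; i++) {
            beatsPlayed.incrementAndGet();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Starts a sound thread, "draws" frames on the calling thread,
    // then waits for the sound thread. Returns the number of frames drawn.
    static int runDemo(int beats, int frames) {
        beatsPlayed.set(0);
        Thread soundThread = new Thread(() -> playBeats(beats));
        soundThread.start(); // returns immediately; beats play in the background

        int framesDrawn = 0;
        for (int i = 0; i < frames; i++) {
            framesDrawn++;   // stand-in for drawing one frame
        }

        try {
            soundThread.join(); // wait until the sound thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return framesDrawn;
    }

    public static void main(String[] args) {
        int frames = runDemo(5, 3);
        System.out.println("Frames drawn: " + frames);
        System.out.println("Beats played: " + beatsPlayed.get());
    }
}
```

The key point is that `start()` does not wait: the main thread goes straight on to its loop while the other thread is still "playing". Even in this tiny example some care is needed to share the counter safely between the two threads, which hints at why threaded code is easy to get wrong.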
Minim is a sound library that comes with Processing, and with it you can do the following sorts of things:
The following example comes from the Minim documentation page (always a good resource when you’re stuck). It shows how to play a sound file:
import ddf.minim.*;

Minim minim;
AudioPlayer song;

void setup() {
  size(100, 100);

  minim = new Minim(this);

  // this loads mysong.mp3 from the data folder
  song = minim.loadFile("mysong.mp3");
  song.play();
}

void draw() {
  background(0);
}

void stop() {
  song.close();
  minim.stop();
  super.stop();
}
Note the following:
If we wish to include sound, we must tell Processing about it. This is done with the line:
import ddf.minim.*;
We are importing the Minim library into our code. Try commenting out this statement and running the program. Processing will complain that it does not know what Minim is. The import line gives the program access to all of the code in the library called ddf.minim; the * means “all” in this statement.
We must create, and initialize, a Minim variable:
Minim minim;

void setup() {
  // ...
  minim = new Minim(this);
  // ...
}
The expression new Minim(this) creates a new Minim object specifically for this program. The variable this is a special variable that is automatically supplied by Processing. Essentially, this refers to the object that contains your program, and Minim requires access to this object to work correctly.
To play a song, we need to create and initialize an AudioPlayer object, and then we need to tell Processing to start playing the song:
AudioPlayer song;

void setup() {
  // ...
  song = minim.loadFile("mysong.mp3");
  song.play();
}
The statement song.play() plays the song starting at the beginning of the file. If you want the song to start at a different point, you can pass the starting position in milliseconds as a parameter. For example, if you want to start playing the song 5 seconds in, write song.play(5 * 1000);.
An important point here is that the program doesn’t pause and wait for the song to finish playing. Any code that we place in the draw() function keeps running, and keeps drawing to the screen, while the song plays.
Finally, Minim needs a little bit of help in managing its sound threads. In any program that uses Minim, you need to define a stop() function that looks like the one above. This is another special Processing function, and can be thought of as the ‘opposite’ of setup: it is called right before the program stops. For programs that use Minim, we close the song, stop Minim, and then call the stop() function of the parent object (more on parents and children later).
For now, you don’t have to worry too much about the details of stop(): just know that it is required when you use Minim, and that it turns off the sound.
As a simple example that shows graphics and sound working together, let’s write a program that lets the user play a pair of bongo drums (the sound files used are bongo1.wav and bongo7.wav). For simplicity, we’ll draw the drums as circles, and let the user tap a drum by pressing ‘f’ (bongo1) or ‘j’ (bongo7).
Here is the program:
// Drum sounds are from http://www.drumsamples.org/

// import Minim
import ddf.minim.*;

// setup the sound variables
Minim minim;
AudioSnippet drum1;
AudioSnippet drum2;

// track when a drum has been struck
boolean drum1struck, drum2struck;

void setup() {
  // initialize the screen
  size(500, 500);
  smooth();

  // init sound
  minim = new Minim(this);
  drum1 = minim.loadSnippet("bongo1.wav");
  drum2 = minim.loadSnippet("bongo7.wav");
  drum1struck = false;
  drum2struck = false;
}

void draw() {
  background(255);

  // draw the drums: if a drum has just been struck
  // then fill it with colour as visual feedback

  // drum 1
  if (drum1struck) {
    fill(0);
    drum1struck = false;
    drum1.rewind();
  } else {
    fill(255, 0, 0);
  }
  ellipse(50, 55, 100, 100);

  if (drum2struck) {
    fill(255);
    drum2struck = false;
    drum2.rewind();
  } else {
    fill(0, 255, 0);
  }
  ellipse(160, 55, 100, 100);
}

void keyPressed() {
  if (key != CODED) {
    if (key == 'f') {
      drum1.play();
      drum1struck = true;
    } else if (key == 'j') {
      drum2.play();
      drum2struck = true;
    }
  }
}

void stop() {
  drum1.close();
  drum2.close();
  minim.stop();
  super.stop();
}
There are a few things to notice about this program: