I’m doing it again! Before Adobe MAX sets in, and that’s all we talk about for weeks to come, I wanted to update my series of blog posts.
I wrote a blog post right before 360Flex about music theory and Flash, which was itself a follow-up to a Hello World sound application on the Flash Platform. Each post was accompanied by a demo wherein you could view the source and try things for yourself.
This time, I’ve completely refactored and added a TON of good stuff. In fact, I’ve put so much elbow grease into this thing, I’ve decided that it’s time to just make it an open source project. So, I called it Flashamaphone and dumped it onto a Google Code repository: https://code.google.com/p/flashamaphone.
The goal of this project is to provide an engine for people to use to make their own live instruments and design a cool UI in Flash or Flex. They can have buttons or gizmos in the UI that play certain notes or chords or tones. That’s where folks hook into Flashamaphone—it’ll play the C, or it’ll play a C-chord if you tell it that a C-chord contains C, E, and G notes.
Eventually, I want Flashamaphone to be smart enough to know that you want to play a C-minor 7th, and it’ll know exactly what notes to play. I also want it to give the developer a likely set of notes to play if they want to remain in key. Let’s take our combined UX knowledge and make a better interface for a musical instrument, or rather, tons of new, better interfaces for musical instruments.
And what better time to do this? We have Android tablets that run Flash now! Grab a Samsung Galaxy Tab, pop open your Flash app that uses Flashamaphone as the audio engine, and play some music. You don’t know how to play keyboard? That’s OK; just design an app that never plays the wrong note for a given key signature and fiddle around pressing random keys—it’ll probably sound good!
For a good example of what this isn’t, check out the probably far superior audio engine that Andre Michelle just released called TonFall: http://code.google.com/p/tonfall/. Andre put some cool stuff in it like a sequencer that reads a data file and plays the Super Mario Brothers’ theme. It looks like a great engine, but it also looks like it’s not geared for live performance and doesn’t really have an API that a musician would understand.
So how did Flashamaphone evolve from my last two demos and blog posts?
Well, what was severely lacking before was the ability to play notes live. Basically, we created a note—say a middle C. The middle C would be processed into a raw ByteArray and sent over to the audio buffer. We could choose the duration: it could have been one second, five seconds, or a minute if we wanted.
What about the times when we have no idea how long this sound is going to last? When would that be, you ask? Well, it’s when we’re physically pressing a key on a keyboard. We know that we’ve started pressing the key, we know we’re STILL pressing the key, but we don’t know when we’ll be finished pressing the key. That’s why we can’t rely on duration, and why I had to rewrite a lot of code in my engine.
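To make the difference concrete, here’s roughly how the old, fixed-duration approach looked. This is a simplified sketch rather than the actual demo code: we pre-render every sample for a known duration up front, then just drain that ByteArray into the buffer.

```actionscript
import flash.events.SampleDataEvent;
import flash.media.Sound;
import flash.utils.ByteArray;

const RATE:int = 44100;
var rendered:ByteArray = new ByteArray();

// Pre-render one second of middle C (roughly 261.63 Hz) as a sine wave
for (var i:int = 0; i < RATE; i++)
{
    var sample:Number = Math.sin(2 * Math.PI * 261.63 * i / RATE);
    rendered.writeFloat(sample); // left channel
    rendered.writeFloat(sample); // right channel
}
rendered.position = 0;

var sound:Sound = new Sound();
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, onSampleData);
sound.play();

function onSampleData(event:SampleDataEvent):void
{
    // Drain up to 8192 samples per callback; once the ByteArray runs
    // dry, we write nothing more and the sound simply ends.
    for (var j:int = 0; j < 8192 && rendered.bytesAvailable >= 8; j++)
    {
        event.data.writeFloat(rendered.readFloat()); // left
        event.data.writeFloat(rendered.readFloat()); // right
    }
}
```

The duration is baked in before playback even starts, which is exactly the problem.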
I started with the audio buffer. Previously, you threw a bunch of sound bytes over to the sound buffer, and it would queue them up and play them. Now, there can be no queue: we must play a tiny bit at a time, only as much as we need to keep the sound going. It has to be a small enough sample that we can stop on a dime. In other words, if I physically release the key, the sound has to stop immediately. If I press the key, it has to start immediately.
You might remember that in the “Hello World” blog post, I talked about the SAMPLE_DATA event. You start with an empty sound but provide a function for the sound object to go and get sound data. Each time the event fires, you can supply at most 8192 samples and at minimum 2048 samples.
If we leave it at 8192 samples, we can do the math (8192 samples ÷ 44100 samples per second ≈ 0.19 seconds) and figure out that the buffer updates roughly every 1/5th of a second. That sounds very frequent, but with music, it might not feel as responsive as you’d like.
Instead, I’ve defaulted the buffer to 2048 samples per update, or roughly 1/20th of a second (2048 ÷ 44100 ≈ 0.046 seconds). Hopefully, Flash keeps up OK!
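Here’s a sketch of the live version. I’m hand-waving the engine’s real classes with a single noteFrequency variable (a made-up name for illustration), but the buffering idea is the same: render only the next tiny slice, so stopping is nearly instant.

```actionscript
import flash.events.SampleDataEvent;
import flash.media.Sound;

const RATE:int = 44100;
const BUFFER_SIZE:int = 2048;  // 2048 / 44100 ≈ 46 ms per update
var phase:Number = 0;          // running phase of the current note
var noteFrequency:Number = 0;  // 0 means no key is down

var sound:Sound = new Sound();
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, onSampleData);
sound.play();

function onSampleData(event:SampleDataEvent):void
{
    // Only ever render the next small slice. Writing silence (zeros)
    // keeps the stream alive while no key is down, and a key release
    // takes effect on the very next buffer update.
    for (var i:int = 0; i < BUFFER_SIZE; i++)
    {
        var sample:Number = 0;
        if (noteFrequency > 0)
        {
            sample = Math.sin(phase);
            phase += 2 * Math.PI * noteFrequency / RATE;
        }
        event.data.writeFloat(sample); // left
        event.data.writeFloat(sample); // right
    }
}
```

Key handlers just set noteFrequency when a key goes down and zero it when the key comes up.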
Anyway, the sound buffer wasn’t the only part of the equation. I had to make the buffer reach out to grab something when it needed audio. I thought that I could send notes directly to the buffer, but then I thought there should be a middle layer. This middle layer could interpret physical keys (or whatever other input is in play), manage the resulting notes, and send a byte stream to the buffer. I made one of these middle layers and called it a KeyboardController. I’m envisioning similar creations like a StringController or a HornController or any number of other crazy controllers that might act slightly differently. But they all have one thing in common: sending something to the sound buffer.
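Here’s the shape of that middle layer. The class and method names below are illustrative rather than Flashamaphone’s exact API, but the idea is the same: the controller tracks what’s held down and hands mixed bytes to the buffer on demand.

```actionscript
package
{
    import flash.utils.ByteArray;
    import flash.utils.Dictionary;

    public class SimpleKeyboardController
    {
        private const RATE:int = 44100;
        private var held:Dictionary = new Dictionary();   // keyCode -> frequency
        private var phases:Dictionary = new Dictionary(); // keyCode -> phase

        public function noteOn(keyCode:uint, frequency:Number):void
        {
            held[keyCode] = frequency;
            phases[keyCode] = 0;
        }

        public function noteOff(keyCode:uint):void
        {
            delete held[keyCode];
        }

        // The sound buffer calls this every update; the controller
        // mixes whatever keys are currently down into the stream.
        public function writeSamples(target:ByteArray, count:int):void
        {
            for (var i:int = 0; i < count; i++)
            {
                var mixed:Number = 0;
                var voices:int = 0;
                for (var key:Object in held)
                {
                    mixed += Math.sin(phases[key]);
                    phases[key] += 2 * Math.PI * held[key] / RATE;
                    voices++;
                }
                if (voices > 0) mixed /= voices;
                target.writeFloat(mixed); // left
                target.writeFloat(mixed); // right
            }
        }
    }
}
```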
The next, and major, piece I needed to upgrade here was the “phase” of each tone or note. “What’s a phase?” you might ask. Well, phases are a huge piece of making sounds or “voices” on a synthesizer. Smart audio synthesis folks (dating back to the late 1930s) have settled on a standard four phases, known together as an ADSR envelope, that every note goes through when played. These four phases are:
- Attack: the time taken for the initial run-up of level from nil to peak, beginning when the key is first pressed
- Decay: the time taken for the subsequent run down from the attack level to the designated sustain level
- Sustain: the level during the main sequence of the sound’s duration, until the key is released
- Release: the time taken for the level to decay from the sustain level to zero after the key is released
Let’s turn to Wikipedia to help visualize:

[ADSR envelope graph: the level rises from zero to a peak during attack, falls to the sustain level during decay, holds there during sustain, and drops to zero during release]
So, in my audio engine, I made each tone able to stream a different set of bytes for each phase. A note can be set with the specific amount of time it takes to get through the attack phase and then the decay phase. Then, as you stay on the note, you are perpetually in the sustain phase. After you release the key, your sound winds down with the release phase. This is very important if you think of something like a piano key. Inside the piano, when you press the key, a hammer strikes a string. This would be very loud, so there’s an extremely short lead-up through the attack. We come off of the initial hammer strike, and the note gets a little less intense as we drop down through the decay phase. During the sustain phase, as you hold your finger on the key, the note is definitely not as loud but still very audible. When you release the key, you can still hear the note for a second or two, but it gets very quiet, very quickly. You can see this huge drop-off during the release phase in the previous graph.
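In code, the envelope boils down to a gain between 0 and 1 that you multiply each sample by. Here’s a sketch with made-up times and levels (not the engine’s defaults); for simplicity, it assumes the release always starts from the sustain level.

```actionscript
// t is seconds since the key went down; releaseStart is the t at which
// the key came up (only meaningful once released is true).
function envelopeGain(t:Number, released:Boolean, releaseStart:Number):Number
{
    const ATTACK:Number  = 0.01; // seconds to ramp from nil to peak
    const DECAY:Number   = 0.2;  // seconds to fall from peak to sustain
    const SUSTAIN:Number = 0.6;  // sustain level, 0..1
    const RELEASE:Number = 0.5;  // seconds to fade from sustain to zero

    if (released)
    {
        var r:Number = (t - releaseStart) / RELEASE;
        return r >= 1 ? 0 : SUSTAIN * (1 - r);          // release: fade out
    }
    if (t < ATTACK)
        return t / ATTACK;                              // attack: ramp up
    if (t < ATTACK + DECAY)
        return 1 - (1 - SUSTAIN) * (t - ATTACK) / DECAY; // decay: settle down
    return SUSTAIN;                                      // sustain: hold
}
```

Multiply each generated sample by this gain as you write it to the buffer, and you get the piano-like contour described above.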
What’s crazy is that the project went a little above and beyond what I expected: you can assign a totally different type of voice to each phase. You could make the attack phase sound harsh like an 8-bit game and the sustain phase sound nice and smooth. I’ll cover all this “voicing” stuff in a future post.
The other big thing in this latest revision is the ability to play multiple notes at once. I’ve put this functionality into two places. First, I’ve extended the Tone class and created a Polytone class. I’ve also extended the Note class, creating a Polynote class. Each new class accepts an array of frequencies or notes and will then mix the notes when you want the bytes.
For example, if you pass the Polynote class a C note, an E note, and a G note, you can stream the bytes of all three notes at once. We do this by getting the bytes for each note individually. Then we go through each set of bytes two floats at a time (one float for the left channel, one for the right). We take the average of these numbers and pop it into a new ByteArray. What we have at the end is a mix of the notes, enabling us to play a whole chord with a single key press. I did a similar mixing process in my KeyboardController class, which figures out which keys you have pressed and which keys are in the release phase, and mixes all these notes (whichever phase they happen to be in).
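Here’s what that averaging mix looks like as a standalone sketch (the function name is just for illustration), assuming each note’s ByteArray is positioned at the start and holds at least the requested number of stereo frames:

```actionscript
import flash.utils.ByteArray;

function mixFrames(noteBytes:Array /* of ByteArray */, frames:int):ByteArray
{
    var mixed:ByteArray = new ByteArray();
    for (var f:int = 0; f < frames; f++)
    {
        var left:Number = 0;
        var right:Number = 0;
        for each (var bytes:ByteArray in noteBytes)
        {
            left  += bytes.readFloat();
            right += bytes.readFloat();
        }
        // Averaging keeps the mix inside the -1..1 range Flash expects
        mixed.writeFloat(left / noteBytes.length);
        mixed.writeFloat(right / noteBytes.length);
    }
    mixed.position = 0;
    return mixed;
}
```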
The last feature I’ve added is a debugger. Basically, you can set up a sprite on your stage and have it draw a graphic representation of your sound.
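The drawing itself is just the Graphics API connecting sample values. A sketch, assuming you’ve collected one buffer’s worth of samples into a plain Array of Numbers between -1 and 1:

```actionscript
import flash.display.Graphics;
import flash.display.Sprite;

function drawWaveform(target:Sprite, samples:Array, width:Number, height:Number):void
{
    var g:Graphics = target.graphics;
    g.clear();
    g.lineStyle(1, 0x00FF00);
    g.moveTo(0, height / 2);
    for (var i:int = 0; i < samples.length; i++)
    {
        // Map sample index to x, and sample value (-1..1) to y
        g.lineTo(i / (samples.length - 1) * width,
                 height / 2 - samples[i] * height / 2);
    }
}
```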
Whew! I’ll stop describing things here. We’ve gone from a nice little audio tutorial to a full-blown live synthesizer. I’m glossing over quite a bit, but that’s only because I put a lot of stuff in, and I don’t want to make this post incredibly long. Now, instead of teaching you how to build your own engine, I’m going to switch gears and give you an audio engine you can use.
Flashamaphone is available on Google Code: https://code.google.com/p/flashamaphone
As this is the 3rd post on the subject, you can grab version 0.3 at
or the latest at
Where do we go from here? Well, there are two things I want to do next. The first is to implement what I talked about at the start: provide an API to generate real chords, or let a user know which notes are available in a specific key signature.
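None of this exists in the engine yet, but here’s the rough shape I’m picturing: chords as semitone offsets from a root note, with frequencies coming from the standard equal-temperament formula f = 440 × 2^(n/12). All names here are hypothetical.

```actionscript
// Hypothetical sketch, not yet part of Flashamaphone
var chordIntervals:Object = {
    "major": [0, 4, 7],
    "minor": [0, 3, 7],
    "min7":  [0, 3, 7, 10]  // e.g. C-minor 7th = C, Eb, G, Bb
};

// rootSemitonesFromA4: e.g. middle C is -9 semitones below A440
function chordFrequencies(rootSemitonesFromA4:int, quality:String):Array
{
    var result:Array = [];
    for each (var interval:int in chordIntervals[quality])
    {
        result.push(440 * Math.pow(2, (rootSemitonesFromA4 + interval) / 12));
    }
    return result;
}
```

Feed the resulting array straight into something like the Polynote class, and “play me a C-minor 7th” becomes a one-liner.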
The second thing I want to do is to explore the voicing side (the sound of the note) more. My goal will be to put enough features in to get something that sounds like a piano (and many other common sounds).
Check out my latest demo here: http://www.blastanova.com/labs/flashamaphone/article3/!