It was a little while ago that I built an Adobe AIR desktop application that aggregated songs from music blogs to play on my radio show. To find out more, check out my previous post on the application here. I’ve since put a fair bit of work in, moving away from AIR and even the client side! I’ve switched it over to Node.js, rewriting the entire thing as a service!
Why do I keep on keeping on? Well – probably the biggest reason is that discovering music can be expensive and difficult these days. In fact, it’s pretty well summarized in a Twitter conversation that some Codebass DJs and a listener had the other day.
It’s a lot of work finding new music! Don’t get me wrong, it’s fun, but there are so many ways to go that it can be pretty hard keeping up. On my old radio show, Weird Ass Wednesday, I used to buy one or two albums a week to play a new single or two. Eventually, I just got so sick of my collection that I knew I had to end things.
But how to discover new tunes? Well, one of my favorite radio stations, KEXP, has a song of the day playlist. That’s pretty nice! I started looking for other song of the day playlists and found a few – just not too many. However, I also discovered that music blogs will post free MP3s. I hadn’t thought of this, but it makes sense. The undiscovered bands that I want to listen to will let music blogs post free and legal downloads to spread the word about their music. Even better, some of the blogs and publications with budgets will get bands to record studio sessions!
Going down this rabbit hole, I discovered too many blogs to count. How does one keep tabs on all of this great music? Should I run to each one every day to find new stuff? It would take too long!
That’s why I started Shark Attack. The original Adobe AIR-based UI would download my radio show at the press of a button. It took a little while, maybe 40 minutes or so, to parse all the feeds and grab all the songs, but it did work pretty well!
I wanted MORE though. I thought to myself, what if this wasn’t JUST a radio show but a daily updated music playlist? For this to happen, a new service would need to be built that operates continually. So it made sense to switch from Adobe AIR to a server-side solution that I could automate.
Additionally, some of the negative aspects of Node.js that people complain about don’t apply to me. The first is that it doesn’t have its own HTTP server. I’ve heard from Node.js’ detractors that ExpressJS is too young. Well, maybe it is (I haven’t used it yet) – but I don’t care because I don’t need an HTTP server. No, I’m building a daily library of music – but I’m not serving any webpages.
Anyway, the main music discovery project in Node.js consists of a custom-written subtask flow controller with 15 steps:
- Discover – Parse RSS feeds and HTML pages from radio stations, music blogs, magazines, and whatever else has good music. We’re looking for mp3s, YouTube videos, and Vimeo videos. We record this information in a big JSON structure
- Download – Use the download links we obtained during the discovery phase to download mp3s and videos locally to our server
- Transcode – Since we don’t play videos on our radio show, we convert any YouTube/Vimeo videos we downloaded to mp3s
- Inject Metadata – At this point, with all the information about the songs we have, we probably have a good idea what the song name, artist name, and track label are. We inject this information into the mp3 file itself
- Remove Dead Assets – Unfortunately, some of our songs will be corrupt, nonexistent, or even too long (remember we’re shooting for songs here, so removing 10+ minute assets is a safe bet). So, we remove these from our library
- Remove Duplicate Artists – Many times, a blog will throw 5-10 tracks from the same artist in one blog post. We don’t want this; we want variety, and we don’t want our playlist taken over by the same artists all the time.
- Remove Assets with Insufficient Metadata – Occasionally, we come by an mp3 that has no information recorded about it whatsoever. If this happens, I can’t tell people what it is, so we throw it out
- Record in Database – Any new artists that we discover are recorded in MongoDB, stamped with the time they were found.
- Record Delayed Tweet – At this point, we throw any new assets in a MongoDB queue of tweets to be sent
- Create Buy Links – Hit Amazon and iTunes to create affiliate purchase links to the songs
- Spotify Lookup – Query Spotify to create Spotify IDs for the tracks we can find
- File System Cleanup – Trash any assets we aren’t referencing at this point
- Package – Create a local JSON file of all the songs we discovered in this process
- Output Libraries – Here we take the master JSON song package, and slice and dice it for a bunch of different output formats. One will only contain the mp3s we found, another will only contain the videos we found, and yet another will only contain Spotify tracks
- Deploy – FTP all of our JSON files to a website
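To make the idea of a step-by-step flow controller a bit more concrete, here is a minimal sketch of how such a thing could be wired up. This is not the actual Blastanova code: the function names, the shape of the `library` object, and the `exists`/`duration` fields are all assumptions made up for illustration, and only the three "remove" steps are shown. It also assumes that "too long" means over 10 minutes and that duplicate artists are removed across the whole run rather than per blog post.

```javascript
// Hypothetical sketch of a sequential subtask flow controller.
// Each step takes the shared library object and returns a new one
// (or a Promise of one, for steps that do network or disk work).

const MAX_DURATION_SECONDS = 10 * 60; // assumption: "too long" = over 10 minutes

// Remove Dead Assets: drop files that are missing, empty, or too long.
function removeDeadAssets(library) {
  const assets = library.assets.filter(
    (a) => a.exists && a.duration > 0 && a.duration <= MAX_DURATION_SECONDS
  );
  return { ...library, assets };
}

// Remove Duplicate Artists: keep only the first track seen per artist.
// Assets with no artist are left alone here; the metadata step handles them.
function removeDuplicateArtists(library) {
  const seen = new Set();
  const assets = library.assets.filter((a) => {
    if (!a.artist) return true;
    const key = a.artist.trim().toLowerCase();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
  return { ...library, assets };
}

// Remove Assets with Insufficient Metadata: we need an artist and a title.
function removeInsufficientMetadata(library) {
  const assets = library.assets.filter((a) => Boolean(a.artist && a.title));
  return { ...library, assets };
}

// The controller itself: run each step in order, one at a time,
// feeding the output of one step into the next.
async function runSteps(steps, library) {
  let current = library;
  for (const step of steps) {
    console.log(`Running step: ${step.name}`);
    current = await step.run(current);
  }
  return current;
}

// The step list mirrors the order of the bullets above (filter steps only).
const steps = [
  { name: 'Remove Dead Assets', run: removeDeadAssets },
  { name: 'Remove Duplicate Artists', run: removeDuplicateArtists },
  { name: 'Remove Assets with Insufficient Metadata', run: removeInsufficientMetadata },
];
```

Calling `runSteps(steps, { assets: [...] })` resolves to the filtered library. Because every step has the same signature, adding a step such as Download or Transcode would just mean adding another entry to the list.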
So, of course, not all of these directly help the SharkAttack radio show on Codebass. Much of this drives my daily music playlist at http://play.blastanova.com. There you can listen to all the new discoveries every day, download a playlist as a zip file, load up what’s available on Spotify, and soon more! The important thing here, though, is that we have a pretty spiffy library of new GOOD music generated every time the service is run.
I run this Node.js task through Jenkins Continuous Integration every day at 7am, 2pm, and 9pm EST. It seems to take around 30 minutes to 2 hours to run depending on what it needs to download and transcode. Currently, I pull music from 41 sites.
I have a few more Node.js/Jenkins tasks that get run throughout the day. One of them is to pull from our Tweet queue every 45 minutes. If new songs and videos are discovered during the main process, I tweet them out, but space them out by at least 45 minutes. These end up on my Twitter account @blastanovamusic and on my Facebook page.
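The spacing rule is simple enough to sketch. The example below is only an illustration: it uses a plain in-memory array instead of the MongoDB queue, timestamps are plain millisecond numbers, and `pickNextTweet`, `markSent`, and the `queuedAt`/`sent` fields are hypothetical names, not anything from the real service.

```javascript
// Hypothetical sketch of the "at least 45 minutes apart" rule.
const MIN_SPACING_MS = 45 * 60 * 1000;

// Returns the oldest unsent item if enough time has passed since the
// last tweet, or null if it is too soon or the queue is empty.
// lastSentAt is null when nothing has been tweeted yet.
function pickNextTweet(queue, lastSentAt, now) {
  if (lastSentAt !== null && now - lastSentAt < MIN_SPACING_MS) {
    return null;
  }
  const pending = queue
    .filter((item) => !item.sent)
    .sort((a, b) => a.queuedAt - b.queuedAt); // oldest first
  return pending.length > 0 ? pending[0] : null;
}

// Marks an item as sent so it is never picked again.
function markSent(item, now) {
  item.sent = true;
  item.sentAt = now;
  return item;
}
```

A scheduled task would call `pickNextTweet` on every run, send the result if there is one, and then call `markSent`. Running the task more often than every 45 minutes would be harmless, since the function just returns `null` until the gap has passed.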
Another task is to build a playlist of favorites. When you favorite something on http://play.blastanova.com, your anonymous favorite goes to Google Analytics. I run this task every day, pulling the top 20 favorites from the past month from the Google Analytics API.
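For a rough picture of the ranking itself, here is a small sketch. It deliberately leaves out the Google Analytics call, and in practice the API may well hand back counts that are already aggregated. It assumes the favorites have been loaded into plain objects with hypothetical `songId` and `favoritedAt` (millisecond timestamp) fields, and it treats "1 month" as 30 days.

```javascript
// Hypothetical sketch: rank favorites from the last 30 days.
const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // assumption: 1 month = 30 days

function topFavorites(events, now, limit = 20) {
  const counts = new Map();
  for (const e of events) {
    // Ignore anything outside the window.
    if (e.favoritedAt < now - WINDOW_MS || e.favoritedAt > now) continue;
    counts.set(e.songId, (counts.get(e.songId) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([songId, count]) => ({ songId, count }))
    // Most favorited first; ties broken by songId so the order is stable.
    .sort((a, b) => b.count - a.count || a.songId.localeCompare(b.songId))
    .slice(0, limit);
}
```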
Last is the SharkAttack Radio Show task! This is a pretty simple task – I load up the JSON music library file and then figure out what music goes into my playlist based on a start date value and total duration value in my configuration. The backbone of the show is an XML script that includes blocks of music from specific sources mixed with some pre-recorded voice-overs made by me to introduce each block of music. When all of that is done, I export an m3u8 playlist for my DJ software along with all the songs and package them up in a zip file. I can then just download this file from my Jenkins workspace.
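The selection and export parts could look something like the sketch below. It skips the XML show script and the voice-overs entirely, and the field names (`discoveredAt`, `duration` in seconds, `file`) and both function names are assumptions for illustration. The playlist text uses the standard extended M3U layout: an `#EXTM3U` header, then an `#EXTINF` line and a file path per song.

```javascript
// Hypothetical sketch: pick songs for the show and write an m3u8 playlist.

// Take songs discovered on or after startDate, oldest first, and keep
// adding them as long as the running total fits in maxTotalSeconds.
function selectSongs(songs, startDate, maxTotalSeconds) {
  const candidates = songs
    .filter((s) => s.discoveredAt >= startDate)
    .sort((a, b) => a.discoveredAt - b.discoveredAt);
  const picked = [];
  let total = 0;
  for (const song of candidates) {
    if (total + song.duration > maxTotalSeconds) continue; // would overrun
    picked.push(song);
    total += song.duration;
  }
  return picked;
}

// Build the playlist text in extended M3U format.
function toM3U8(songs) {
  const lines = ['#EXTM3U'];
  for (const s of songs) {
    lines.push(`#EXTINF:${Math.round(s.duration)},${s.artist} - ${s.title}`);
    lines.push(s.file);
  }
  return lines.join('\n') + '\n';
}
```

The string from `toM3U8` can be written out as UTF-8 with `fs.writeFileSync('show.m3u8', text, 'utf8')`, where the file name is just an example.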
So, this gets played every Wednesday at 1pm EST on CodebassRadio! I also created a simplified version of my Blastanova playlist application to show my Shark Attack playlist. So there it is! SharkAttack…and it got much bigger than SharkAttack for me. With all the new avenues of digital music like Spotify, GrooveShark, iTunes, Last.fm, Pandora, etc., there still seems to be a need to find GOOD music in addition to playing it (which all those other services are good at). So Blastanova took on a life of its own – but thanks to my weekly radio show, I had lots of motivation to get this service working and keep it working!
Right now, the Node.js Blastanova Music Discovery Service is at 0.9.14, about to reach a 1.0 milestone. The theme of the 1.x versions will be to take what we’ve done to discover music and use it to discover INFORMATION about that music. I’d like to reach out and get information about the band, song lyrics, tour dates, and any other relevant information that helps listeners get to know the band and the song and helps them go out and buy it. At my 2.0 milestone, I’d like this to be less of a music playlist and more of an interactive music magazine that you can listen to as you flip through the pages. Of course, SharkAttack will get the benefits! Perhaps this music magazine can be tied in to what’s live and streaming. Whatever happens, I’m having fun learning about new music, and I hope you are too!