AudioRegent article published in Code4Lib journal
If anyone's reading and is interested: last week the Code4Lib Journal published an article of mine entitled "AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery".
The article is a little overview of AudioRegent and SimpleADL and how they are utilized at the University of Alabama Libraries, where I work.
from http://journal.code4lib.org/mission:
"The Code4Lib Journal exists to foster community and share information among those interested in the intersection of libraries, technology, and the future."
AudioRegent Installation (Xubuntu)
I just got a Vimeo account.
AudioRegent Installation (Xubuntu) from nitin arora on Vimeo.
A quick tutorial on installing and running AudioRegent 1.1 in Xubuntu 9.04.
AudioRegent’s documentation is available at:
http://blog.humaneguitarist.org/projects/audioregent/
The documentation includes a download link to the program files.
segmenting audio with AudioRegent, SoX and XML
For some reason I feel obligated to point out that I haven’t blogged in a while for a few reasons:
- Christmas break from school/work at the University of Alabama
- the desire not to blog for the sake of blogging
- and …
I’ve been working on something huge – at least for me. It’s a piece of software called AudioRegent that harnesses XML to create derivative "clips" of regions within WAV audio files. A region is simply a user-defined segment within an audio file, like a track on a Compact Disc.
Besides writing the program in Python, which I pretty much finished in December, I also had to develop the XML format that AudioRegent reads, which I call SimpleADL (Simple Audio Decision List). AudioRegent then makes the derivative audio clips by leveraging SoX (Sound eXchange), the command-line audio editor. AudioRegent and SimpleADL can also be used to sync audio to text, like transcripts.
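To give a rough idea of the approach, here is a minimal Python sketch. It is not AudioRegent's actual code, and the XML is a made-up stand-in rather than the real SimpleADL schema (the documentation covers that). The element names, attribute names and file names are all hypothetical. The only real-world piece assumed is SoX's `trim` effect, which takes a start position and a length in seconds.

```python
import xml.etree.ElementTree as ET

# Hypothetical region list; NOT the real SimpleADL format.
EXAMPLE_XML = """
<regions source="concert.wav">
  <region id="track01" start="0" end="95.5"/>
  <region id="track02" start="95.5" end="210"/>
</regions>
"""


def build_sox_commands(xml_text):
    """Return one SoX argument list per region found in the XML."""
    root = ET.fromstring(xml_text)
    source = root.get("source")
    commands = []
    for region in root.findall("region"):
        start = float(region.get("start"))
        end = float(region.get("end"))
        if end <= start:
            raise ValueError("region %s ends before it starts" % region.get("id"))
        out_name = "%s.wav" % region.get("id")
        # sox INPUT OUTPUT trim START LENGTH (both values in seconds)
        commands.append(
            ["sox", source, out_name, "trim", str(start), str(end - start)]
        )
    return commands


if __name__ == "__main__":
    for command in build_sox_commands(EXAMPLE_XML):
        print(" ".join(command))
```

The sketch only builds the command lines. To actually cut the clips, each list could be handed to `subprocess.run()` on a machine with SoX installed.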
Actually, the programming and devising SimpleADL were the easy part. The hard stuff was the documentation and deciding on a license for the software.
I tried to find a balance in documenting the software: being thorough without writing a novel. I’m not sure I succeeded, but I can always improve it with time.
I used the W3C’s Amaya editor to write the documentation in XHTML. Sure, you can use OpenOffice to export a document to XHTML, but man is it bloated and messy. Amaya writes really clean XHTML.
As for the license, I chose the BSD license. As I understand it, this allows anyone to use the source code at will in future open or closed-source applications as long as they maintain the credits for AudioRegent. I was tempted to use the Mozilla Public License (MPL) which, again from what I can tell, is similar to the BSD license except that any source derived from AudioRegent would have to stay open-source, though any peripheral code can be closed-source. I absolutely decided against the GNU General Public License, which is viral and imposes its philosophy perpetually on all subsequent code, even peripheral code. Some have even argued that it works against its own objectives and is less "open" than the MPL.
Now I realize that, practically speaking, a skilled programmer could write better code from scratch in 30 minutes as opposed to the 30 or so hours I needed, but I wanted to go about this quasi-professionally. And I learned more about licensing, which was cool.
Anyway, rather than try to explain the software itself and how to get it, I’d be better off pointing you to the documentation if you have any interest …
XSPF: a simple XML media playlist
Just a short post on the XSPF playlist format.
I’m not going to get into how to use it, but I do want to share a demo of how it could be used to let users more easily navigate an audio rendering of a poem, transcript, etc. using the JW Player.
The demo uses XSPF to allow a user to jump to the first 3 stanzas of The Midnight Ride of Paul Revere as read by Bridget Rafferty for LibriVox. Just use the fast forward button to jump to the next stanza.
Conversely, if the stanzas were segmented into multiple MP3s, etc. then the playlist could be used in the opposite manner. That is, the playlist could play the various segments back as one continuous stream, so the user doesn’t have to manually start the playback of each stanza/MP3.
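For the multiple-MP3 case, a playlist like that could be generated with a few lines of Python. This is only a sketch: the stanza titles and URLs are placeholders, and it uses just the basic XSPF elements (`playlist`, `trackList`, `track`, `title`, `location`) rather than anything specific to the JW Player.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(tracks):
    """Build an XSPF playlist string from (title, location) pairs."""
    # Serialize XSPF as the default namespace (no "ns0:" prefixes).
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element("{%s}playlist" % XSPF_NS, {"version": "1"})
    track_list = ET.SubElement(playlist, "{%s}trackList" % XSPF_NS)
    for title, location in tracks:
        track = ET.SubElement(track_list, "{%s}track" % XSPF_NS)
        ET.SubElement(track, "{%s}title" % XSPF_NS).text = title
        ET.SubElement(track, "{%s}location" % XSPF_NS).text = location
    return ET.tostring(playlist, encoding="unicode")


if __name__ == "__main__":
    # Placeholder file names, one MP3 per stanza.
    stanzas = [
        ("Stanza 1", "http://example.org/stanza1.mp3"),
        ("Stanza 2", "http://example.org/stanza2.mp3"),
        ("Stanza 3", "http://example.org/stanza3.mp3"),
    ]
    print(build_xspf(stanzas))
```

A player that understands XSPF would then play the tracks in the order they appear in the `trackList`.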
Here’s the XSPF playlist if you’d like to see it.
Here’s the demo.
lossy test conclusions?
found via the Hydrogen Audio forums:
The writer of this blog entry makes an interesting point regarding the double blind ABX listening test.
Essentially what they’re getting at is that the test evaluates what we hear, not necessarily the worth of the compression.
My personal concern is that if we bow to these tests that determine users don’t hear the difference between lossy and lossless audio formats (MP3 vs WAV, for example) some might use that to argue that, aside from initial recording/mastering, lossless audio is unnecessary – perhaps even for long-term archival storage.
The writer, I think, makes a valid point in that just because we don’t hear the difference, that doesn’t mean it can’t affect us.
Given that lossy audio – and even things like lossless-format 16-bit/44.1 kHz audio – achieves some of its size compression via tapering/rejection of high, "inaudible" frequencies it might be interesting to consider the post alongside this article:
Inaudible High-Frequency Sounds Affect Brain Activity: Hypersonic Effect
J Neurophysiol. 2000 Jun;83(6):3548-58.
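For context on the 16-bit/44.1 kHz figure mentioned above: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit), so anything above that has to be filtered out before or during conversion. The quick arithmetic, as a small Python sketch with function names of my own choosing:

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


if __name__ == "__main__":
    print(nyquist_hz(44100))          # 22050.0 Hz
    print(pcm_bitrate(44100, 16, 2))  # 1411200 bits per second (stereo)
```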