blog.humaneguitarist.org

discoveries in digital audio, music notation, and information encoding

Archive for the ‘digital audio’ Category

full-text searching of timed text and a farewell to Andy Roddick

It’s been a while since I had one of my “So, I’m home sick today and wrote this silly, little script” things.

Well, here’s another one while the antibiotics take root.

I’ve always wanted to do something with offering full-text search against timed-text files and allowing a user to click on a result and skip to the audio segment matching the returned line of timed-text, etc. Hulu has had a BETA version of this kind of thing for a while and I suspect others do too.

Well, today I just whipped up a little search API using PHP and MySQL. It’s a nice little start and super easy to do.

I made a database table using the timed-text data from my SAVS project, OpenOffice Calc, and phpMyAdmin. The text is from Shakespeare’s Sonnet 130 using a LibriVox recording (version #14, Miller). BTW, parsing DFXP or SRT files and throwing those into a table is easy, but it’s not within the scope of this little mock-up.
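
For the curious, here’s a rough JavaScript sketch of what that SRT parsing step might look like. The names `srtToRows`, `srtTimeToSeconds` and the `filePrefix` argument are made up for illustration, and I’m assuming plain SRT cues (number, time line, text) with times rounded down to whole seconds to match the table columns.

```javascript
// Hypothetical helper: convert SRT text into rows for the timed-text table.
function srtTimeToSeconds(stamp) {
  // SRT timestamps look like "00:00:13,000" (hours:minutes:seconds,millis).
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{1,3})$/.exec(stamp.trim());
  if (!match) return null;
  const [, h, m, s] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s); // whole seconds only
}

function srtToRows(srtText, filePrefix) {
  const rows = [];
  // Cues are separated by one or more blank lines.
  const blocks = srtText.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    // Line 0 is the cue number, line 1 is "start --> stop", the rest is text.
    if (lines.length < 3) continue;
    const [startStamp, stopStamp] = lines[1].split("-->");
    if (stopStamp === undefined) continue;
    const start = srtTimeToSeconds(startStamp);
    const stop = srtTimeToSeconds(stopStamp);
    if (start === null || stop === null) continue;
    rows.push({
      line_id: rows.length + 1,
      line_text: lines.slice(2).join(" "), // join multi-line cues
      start_time: start,
      stop_time: stop,
      file_prefix: filePrefix
    });
  }
  return rows;
}
```

Each returned object has the same five fields as the table, so it could be written out as CSV or fed to an INSERT statement.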

If I send a query for “mistress reeks” to the API as such:

http://blog.humaneguitarist.org/uploads/SAVS/currentVersion/search/?q=mistress%20reeks

… I get the following JSON response:

{
  "results":{
    "result":[
      {
        "text":"Than in the breath that from my mistress reeks.",
        "highlighted_text":"Than in the breath that from my <mark>mistress<\/mark> <mark>reeks<\/mark>.",
        "startTime":"34",
        "stopTime":"37",
        "source":"sonnet130_shakespeare_njm",
        "relevance":"4.04993200302124"
      },
      {
        "text":"My mistress, when she walks, treads on the ground:",
        "highlighted_text":"My <mark>mistress<\/mark>, when she walks, treads on the ground:",
        "startTime":"46",
        "stopTime":"49",
        "source":"sonnet130_shakespeare_njm",
        "relevance":"1.62977826595306"
      }
    ]
  }
}

Note that the text is returned in the “text” field and I’m also trying to return a “highlighted_text” field in which search terms are surrounded by the HTML5 “mark” tag. There’s also a relevance score … of sorts (pun!).

It needs a lot of work, but there’s enough data returned to launch an audio segment using some HTML5/JavaScript or some Flash or Silverlight API, etc. Hey, it ain’t too bad for a bad stomach and some sports-entertainment distractions.
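
Here’s a minimal sketch of the HTML5/JavaScript side, just to show that the “startTime” and “stopTime” values in a result are enough to play one segment. The function name `playResult` is hypothetical, and I’m assuming the audio element is already loaded with the file named in the result’s “source” field.

```javascript
// Hypothetical glue code: play only the segment described by one search result.
// "audio" is assumed to be an HTML5 <audio> element (or anything with the same
// currentTime/play/pause/addEventListener members) that is already loaded.
function playResult(result, audio) {
  const start = parseFloat(result.startTime);
  const stop = parseFloat(result.stopTime);

  function stopAtEnd() {
    // Pause once playback passes the end of the matched line.
    if (audio.currentTime >= stop) {
      audio.pause();
      audio.removeEventListener("timeupdate", stopAtEnd);
    }
  }

  audio.currentTime = start;
  audio.addEventListener("timeupdate", stopAtEnd);
  audio.play();
}
```

Calling something like this from a click handler on each result would skip to the matching line and stop when it’s over.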

Below, I’ll paste the CSV file I used to make the table, the PHP script … and a personal note about the best male American tennis professional of the last decade.

Here’s the CSV file from the spreadsheet application (note the “line_text” field is full-text indexed in the database):

"line_id";"line_text";"start_time";"stop_time";"file_prefix"
"1";"Coral is far more red than her lips' red:";"13";"17";"sonnet130_shakespeare_njm"
"2";"If snow be white, why then her breasts are dun;";"17";"21";"sonnet130_shakespeare_njm"
"3";"If hairs be wires, black wires grow on her head.";"21";"26";"sonnet130_shakespeare_njm"
"4";"I have seen roses damask'd, red and white,";"26";"29";"sonnet130_shakespeare_njm"
"5";"But no such roses see I in her cheeks;";"29";"32";"sonnet130_shakespeare_njm"
"6";"And in some perfumes is there more delight";"32";"34";"sonnet130_shakespeare_njm"
"7";"Than in the breath that from my mistress reeks.";"34";"37";"sonnet130_shakespeare_njm"
"8";"I love to hear her speak, yet well I know";"37";"40";"sonnet130_shakespeare_njm"
"9";"That music hath a far more pleasing sound:";"40";"43";"sonnet130_shakespeare_njm"
"10";"I grant I never saw a goddess go, --";"43";"46";"sonnet130_shakespeare_njm"
"11";"My mistress, when she walks, treads on the ground:";"46";"49";"sonnet130_shakespeare_njm"
"12";"And yet, by heaven, I think my love as rare";"49";"54";"sonnet130_shakespeare_njm"
"13";"As any she belied with false compare.";"54";"56";"sonnet130_shakespeare_njm"

Here’s the PHP script:

<?php
//GET search words from URL parameter
$searchWords = trim($_GET["q"]);

//prepare for highlighting keywords
$search_array= explode(" ", $searchWords);

//prepare for output
$output = array();

//connect to database
include_once("db_setup.php");

//run query
$searchWords = mysql_real_escape_string($searchWords);
$query = "SELECT *, MATCH(line_text) AGAINST(\"$searchWords\") AS relevance
FROM $table WHERE MATCH(line_text) AGAINST(\"$searchWords\" IN BOOLEAN mode)
ORDER BY relevance DESC";
$result = mysql_query($query);

if($result) {
    while($row = mysql_fetch_array($result)) {
      $line_text = $row["line_text"];
      $start_time = $row["start_time"];
      $stop_time = $row["stop_time"];
      $file_prefix = $row["file_prefix"];
      $relevance = $row["relevance"];

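      //highlight search words in line_text; escape the text first so that
      //the <mark> tags added here are not themselves escaped
      $highlighted_text = htmlspecialchars($line_text);
      foreach ($search_array as $word) {
        if ($word === "") { continue; } //skip empty terms caused by repeated spaces
        $pattern = "/" . preg_quote(htmlspecialchars($word), "/") . "/i";
        $highlighted_text = preg_replace($pattern, '<mark>$0</mark>', $highlighted_text);
      }

      $this_output = array("text" => htmlspecialchars($line_text),
      "highlighted_text" => $highlighted_text,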
      "startTime" => $start_time,
      "stopTime" => $stop_time,
      "source" => $file_prefix,
      "relevance" => $relevance);
      array_push($output, $this_output);
    }
}

//send JSON results
if (count($output) == 0) {
  $results = array("results" => "No results.");
}

else {
  $result = array("result" => $output);
  $results = array("results" => $result);
}

$response = json_encode($results);
include_once("indent_json.php");
header("Content-type: application/json; charset=UTF-8");
echo(indent_json($response));
?>

And here’s something more important.

As a huge tennis fan, I found today a melancholy one as Andy Roddick played his last match, having just lost a few moments ago to Juan Martin del Potro. The Wikipedia article on Roddick here already lists him as retired, but the important thing to remember about Roddick is that he achieved more with less than a lot of other players with more talent, and he was entertaining to watch, win or lose, in big matches.

Thanks for the memories!

--------------

Written by nitin

September 5th, 2012 at 6:17 pm

less is more, a SAVS update

Just a quick post before I find a movie to stream on Netflix and ride out my Sunday …

So, I've been working a tad on SAVS, aka the "Simple Audio/Verse Synchronizer". And the changes really have to do with the data model for the timed text and for the backend/technical requirements. It's now all done with HTML, CSS, and JavaScript – as it should be.

First, the data model for a line of timed text in what I'm calling "st2" or "SAVS timed text" is now like this:

  <span
    class="savs-st2"
    data-startTime="10"
    data-stopTime="13">My mistress' eyes are nothing like the sun
  </span>

Before, it was much clunkier, like this:

  <p onclick="seekTo(10)" id="1">
    <span class="savs-text">My mistress' eyes are nothing like the sun</span>
    <span class="savs-time">10</span>
  </p>

That's to say, now – using the HTML5 "data-" attribute – the demands for the HTML markup are far fewer given that the JavaScript file "savs.js" takes care of more.

Before, with the older markup model, there was no support for a stop time value and one also had to take the responsibility for adding several attributes related to calling JavaScript functions and for creating "id" attributes for both the corresponding <audio> or <video> element as well as for the timed text, etc.

I actually have thought about doing this as a jQuery plugin, but I'm not sure I see the point. Simply including the "savs.js" file is easier. By editing the "savs.css" file, one can control the look of their page. But I digress …

Now that the data model is different and the JavaScript file does more, one can generate a "SAVS compliant" HTML doc with whatever they want.

See, before I was thinking I'd write a PHP script that would build the page, etc, etc. but then I realized that "No, that's not my job." People should be able to store their timed data however they want, generate their HTML however they want, and only have to use the "savs.js" file and the "st2" data model to get this to work.

Sort of.

One also needs to give their HTML5 <audio> or <video> element an id of "savs-player" and also needs to put a tag somewhere in their HTML doc with an id of "savs-caption" a la:

<span class="savs-caption"></span>

That's where the captions go and it's currently required. If someone doesn't want to display captions, then they can just use CSS to hide that element.
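
To give a feel for how little a script needs from the "st2" markup, here's a small sketch of picking the caption for a given playback time. This is not the actual "savs.js" code; the function names are hypothetical and the only assumption is that each line exposes the "data-startTime" and "data-stopTime" attributes shown above.

```javascript
// Hypothetical sketch, not the real savs.js: pick the caption for a given time.
// Each "line" is anything with getAttribute() and textContent, such as a
// <span class="savs-st2"> element.
function findActiveLine(lines, currentTime) {
  for (const line of lines) {
    const start = parseFloat(line.getAttribute("data-startTime"));
    const stop = parseFloat(line.getAttribute("data-stopTime"));
    if (currentTime >= start && currentTime < stop) {
      return line;
    }
  }
  return null; // nothing is being spoken at this time
}

function captionFor(lines, currentTime) {
  const line = findActiveLine(lines, currentTime);
  return line ? line.textContent.trim() : "";
}
```

In a page, the lines could come from document.querySelectorAll(".savs-st2") and the returned string could be written into the caption element whenever the player's time changes.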

Anyway, I'm not explaining anything well since I'm in a rush to watch a movie and have a soda, so here's the latest demo and below is the original version shown via a screencast.

SAVS: a Simple Audio/Verse Synchronizer from nitin arora on Vimeo.

--------------

Written by nitin

March 11th, 2012 at 8:50 pm

Posted in digital audio,scripts

motivation be damned, just wait for someone else to do it and sleep more

So, a few weeks ago I said in these posts that I thought it was an oversight that the HTML5 audio and video elements didn't let one pass start and stop time parameters, allowing one to use specific portions of an audio or video file.

Well, I just saw on this page that Mozilla has implemented this functionality per the Media Fragments URI Specification. And it seems that Webkit is there, too, per this post from a few weeks ago. Actually, that post was published the same week as when I started fiddling with a pure JavaScript and "data-" attribute solution when I went home sick from work. Now I wish I had just napped more that day and waited for these implementations to arrive.
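
For reference, the temporal part of a media fragment is just a "#t=start,stop" suffix on the media URL, in seconds. Here's a tiny sketch of building one; the function name `withTimeFragment` is made up, and whether a given browser honors the fragment is something I'd still test rather than assume.

```javascript
// Build a Media Fragments URI with a temporal dimension, e.g. "file.ogg#t=10,13".
// The function name and arguments are made up for this example.
function withTimeFragment(src, start, stop) {
  const base = src.split("#")[0]; // drop any existing fragment
  const hasStart = start !== undefined && start !== null;
  const hasStop = stop !== undefined && stop !== null;
  if (!hasStart && !hasStop) return base;
  return base + "#t=" + (hasStart ? start : "") + (hasStop ? "," + stop : "");
}
```

So a source of "sonnet130_shakespeare_njm.ogg" with a start of 10 and a stop of 13 would become "sonnet130_shakespeare_njm.ogg#t=10,13".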

Know what's funny? I never saw these docs when I wrote the aforementioned posts. Perhaps that's because the pages I linked to above about Mozilla/Webkit were last updated within just a few days of mine. But anyway, I found them today because I saw someone came to my page searching for "<audio> starttime" and I wanted to see what else came up for that query so I "Googled" it. Sometimes it's educational and not just an exercise in vanity to know why people come to your site, huh?

ZZZ …

;)

--------------

Written by nitin

February 7th, 2012 at 9:11 am

Posted in digital audio

jAUs 3-D: no glasses required

So, I am slightly obsessive compulsive.

I worked a little more tonight on this jAUs thing to add support for start and stop time attributes in the HTML5 <audio> tag.

Video should work, too, with a little work, but I don't care about that right now.

What I did tonight was make it so that if a "stopTime" attribute is used, then after that point is reached the player will move the scrubber back to the original "startTime" value though at that point the playback is already paused.

If there is no "stopTime" attribute, then after the file's played itself to the end, the scrubber will move back to the "startTime" value. Again, playback is already paused at that point.

I've tested this with Firefox 9.0.1, Internet Explorer 9, Safari 4.0.5, Opera 11.60, and Chrome 16.0.912.75 on my Windows 7 (32-bit) laptop.

Running everything from my desktop, I had no problems, with one exception: if I used the "autoplay" attribute and set it to "true", Chrome didn't start the playback as it should. It seems that may be a Chrome problem that others are having, too.

Testing with the files uploaded to my hosted account was a different story. Opera seemed to need a page refresh before the scrubber would locate itself at the proper positions – though adding a pesky alert() at the beginning seemed to make Opera happy. Chrome and Safari seemed to take a few seconds to get situated, although they seemed to generally need a restart to move the scrubber to the right place for the last audio player. I didn't test the alert() thing for these two. Firefox did well although I hate the way Firefox moves the audio players around depending on whether the audio has been played or not. And that leaves IE9 … which, hands down, did the best. Maybe that's because of the exception I'd added for it as I mentioned in an earlier post.

So, there's still work to do and things to investigate.

Also, I haven't tested this with really long files or anything so I don't know how that would go. But then again, as I've heard others say, HTML5 media elements aren't really for long-form media anyway.

Oh yeah, one more thing. I did in fact change to "data-startTime" and "data-stopTime" to make the HTML legal HTML5.

Here's the HTML5 code itself, letting one see that the JavaScript has now been moved into a separate JS file.

<!DOCTYPE html>
<html>
  <head>
    <title>jAUs</title>
    <meta charset="UTF-8" />
  </head>
  <body>
  
  <script type="text/javascript" src="jAUs.js"></script>
   
  <p>This is the entire recording of Shakespeare's Sonnet 130, read by Nathan Miller for 
  <a href="http://librivox.org/sonnet-130-by-william-shakespeare/">LibriVox</a>.</p>
  <audio controls>
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>
  
  <p>"My mistress' eyes are nothing like the sun." - <em>start = 10; end = 13.</em></p>
  <audio class="jAUs" controls data-startTime="10" data-stopTime="13">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>
  
  <p>"End of poem."- <em>start = 57, no end specified.</em></p>
  <audio class="jAUs" controls data-startTime="57">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>
  
</body>
</html>

Oh, and here's the JavaScript file:

/*
***** Note: This software is still in ALPHA. Please refrain from using 
the code without first contacting Nitin Arora at nitaro74@gmail.com.
Thanks!
***** 

jAUs: JavaScript <audio> Shark.

Copyright (c) 2012 Nitin Arora. 

Permission is hereby granted, free of charge, to any person obtaining a 
copy of this software and associated documentation files (the 
"Software"), to deal in the Software without restriction, including 
without limitation the rights to use, copy, modify, merge, publish, 
distribute, sublicense, and/or sell copies of the Software, and to 
permit persons to whom the Software is furnished to do so, subject to 
the following conditions:

The above copyright notice and this permission notice shall be included 
in all copies or substantial portions of the Software. 

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 

jAUs is licensed under the MIT license:

http://www.opensource.org/licenses/mit-license.php.

*/

function jAUs(){
 
  audioTagArray = document.getElementsByClassName("jAUs");
 
  for (i=0;i<=audioTagArray.length-1;i++){
    var thisAudioTag = audioTagArray[i];
    jAUs_2(audioTagArray,thisAudioTag);
  }
}
 
function jAUs_2(audioTagArray,thisAudioTag){
  //if this is placed directly into jAUs() - i.e. not a separate function,
  //then this whole thing doesn't seem to work.
 
  if (navigator.appName == "Microsoft Internet Explorer"){
    thisAudioTag.onloadeddata = function(){
      var thisAudioTag_startTime = thisAudioTag.getAttribute("data-startTime");
      thisAudioTag.currentTime = thisAudioTag_startTime;
    }
  }
 
  else {
    var thisAudioTag_startTime = thisAudioTag.getAttribute("data-startTime");
    thisAudioTag.currentTime = thisAudioTag_startTime;
  }
 
  var thisAudioTag_stopTime = thisAudioTag.getAttribute("data-stopTime");
  var stopString = "jAUs_3(this.currentTime," + thisAudioTag_stopTime + "," + i + ");";
  //returns "jAUs_3(this.currentTime, 13, i);" where "i" is an int.
    
  thisAudioTag.setAttribute("ontimeupdate",stopString);
}
 
function jAUs_3(this_currentTime,thisAudioTag_stopTime,i){

  if (thisAudioTag_stopTime){
    //if there's a data-stopTime attribute then ...
    if (this_currentTime > thisAudioTag_stopTime){
      //... reset audio to data-startTime when data-stopTime is reached.
      audioTagArray[i].currentTime = audioTagArray[i].getAttribute("data-startTime"); 
      audioTagArray[i].pause();
    }
  }
  else if (audioTagArray[i].ended == true){
    //if there's no data-stopTime, move back to data-startTime when playback has ended.
    audioTagArray[i].currentTime = audioTagArray[i].getAttribute("data-startTime");
    audioTagArray[i].pause();
  }
}
   
window.onload = function(){  
  jAUs();
}

Update, January 12, 2012: Turns out "jAUs" is the name for some robotic SDK and I'm not too crazy about that name anyway. So, I'm leaning toward "m(AUj)ulate" (pronounced like 'modulate') which would stand for something like "My Untimely Little Audio Tag Extender". The word "untimely" being, of course, a play on the fact that time is what this is all about. The parenthetical bit refers to "audio" (AU) and JavaScript (j).

And, yes, I care much more about the name/acronym than the script itself.

:P

Update, January 15, 2012: OK, this is interesting. If, for the source of the audio file, I actually use the audio files on the Archive.org site for the LibriVox recordings like so …

  <audio class="jAUs" controls data-startTime="10" data-stopTime="13">
    <source src="http://www.archive.org/download/sonnet_130_librivox/sonnet130_shakespeare_njm_64kb.mp3" type="audio/mp3" />
    <source src="http://www.archive.org/download/sonnet_130_librivox/sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

… then this seems to be working OK in all the browsers. Chrome seems, per casual observation, the slowest in terms of getting the scrubber moved to the appropriate points, but I guess this is progress.

Related to all this stuff I've been messing with, I found this: Consistent event firing with HTML5 video – Dev.Opera. But here, too, they use an alert() to notify the user that metadata is loaded using "onloadedmetadata", but in my tests it seems like the alert() function itself was what was fixing the inability of some browsers to set the current time as my script was instructing.
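
If the real issue is timing, then one thing worth trying is to skip the browser check entirely and wait for the "loadedmetadata" event before setting the current time. Here's a sketch of that idea, not a tested fix: the function name `attachSegment` is hypothetical, and it assumes it gets an <audio> element carrying the "data-startTime" and "data-stopTime" attributes used above.

```javascript
// Hypothetical alternative to the browser check: wait for "loadedmetadata"
// before seeking, and watch "timeupdate" to enforce the stop time.
function attachSegment(audio) {
  const startAttr = audio.getAttribute("data-startTime");
  const stopAttr = audio.getAttribute("data-stopTime");
  const start = startAttr === null ? 0 : parseFloat(startAttr);
  const stop = stopAttr === null ? null : parseFloat(stopAttr);

  function seekToStart() {
    audio.currentTime = start;
  }

  // readyState >= 1 (HAVE_METADATA) means seeking is already allowed.
  if (audio.readyState >= 1) {
    seekToStart();
  } else {
    audio.addEventListener("loadedmetadata", seekToStart);
  }

  audio.addEventListener("timeupdate", function () {
    const pastStop = stop !== null && audio.currentTime > stop;
    if (pastStop || audio.ended) {
      audio.pause();
      seekToStart(); // move the scrubber back, as the script above does
    }
  });
}
```

Because each call keeps its own start and stop values in a closure, there's no need for the global array or for passing an index through an "ontimeupdate" string.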

--------------

Written by nitin

January 11th, 2012 at 7:55 pm

jAUs 2: just when you thought it was safe to go back to hating Internet Explorer

Yesterday, I posted this about trying to add "startTime" and "stopTime" attributes to the HTML5 <audio> tag using each of the major desktop browsers' native HTML5 audio player.

If anyone read that post, it's clear I ran into some problems with Internet Explorer 9 when all the other HTML5 browsers seemed to be fine with my JavaScript.

Well, I updated that post today to reflect a possible solution. Possible in the sense that it works, but I don't know if it's the best – or even a good – solution to the IE problem.

It basically involved checking for IE as the user's browser and using the HTML5 media events to find the event that would make IE wait until the right time before trying to access the current time of the audio element. I also referred to this page on Microsoft's site in terms of checking for IE.

You can read that update here: http://blog.humaneguitarist.org/2012/01/09/jaus-trying-to-add-a-starttime-attribute-to-the-audio-tag#update011012.

Hooray for anchor tags – and shark repellent.

--------------

Written by nitin

January 10th, 2012 at 3:27 pm

Posted in digital audio

jAUs: trying to add a startTime attribute to the audio tag

I woke up this morning feeling unwell, but I still went in to work for a few minutes to enable Remote Desktop as it had stopped working (was a Windows Firewall thing) …

Anyway, on the walk back home I started thinking about something I've wanted to play with for a while.

And that's seeing what it would take to add support for a "startTime" attribute for the HTML5 <audio> tag using a browser's built-in player. I think it's a real oversight that there isn't native support for passing the start and stop times in the <audio> and <video> tags themselves. Um, there isn't, right?

Adding support for a "stopTime" attribute would, I think, require more than I'm willing to think about right now (I should be napping) because obviously that entails checking the current time against the point at which one would want the media to stop.

But adding a simple (and, yes, non-standard) "startTime" attribute inside the <audio> tag seems pretty easy – if you aren't using IE. Ugh.

As you can see in the code below there's an alert() that returns a null, but without it IE (version 9) won't set the audio players to the values in the "startTime" attributes. The other "major" browsers do fine without it. And, really, it can't be there because it's annoying as hell.

But I think this still points to the problem of each browser having its own player. Take Safari for instance. It doesn't natively show the current or total time of the track. Chrome doesn't show the total time.

Still think Flash is a bad thing?

:P

<!DOCTYPE html>
<html>
  <head>
    <title>jAUs</title>
    <script type="text/javascript">
      function jAUs(){
        var audioTag = document.getElementsByClassName('jAUs');
        for (var i=0;i<audioTag.length;i++){
          var audioTag_startTime = audioTag[i].getAttribute('startTime');
          alert('');
          audioTag[i].currentTime = audioTag_startTime;
        }
      }
    </script>
  </head>
<body onload="jAUs()">

  <audio class="jAUs" controls startTime="10">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

  <br />

  <audio class="jAUs" controls startTime="25">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

</body>
</html>

ps: If this ever turns into a little project of mine, it needs a decently cool name to keep my interest. "JS" for JavaScript and "AU" for audio mashed up equals "jAUs". I hope that isn't already taken within the JavaScript/HTML5 domain.

Update, January 09, 2012: Now that an hour has passed and I finished watching "Thor & Loki: Blood Brothers", I figured out how to elegantly do the "stopTime" thing – well, at least I think it's better than what I was thinking earlier today!

So unlike in the code above, the JavaScript below only hits on one <audio> element (the one with an "id" value of "jAUs") and not all the ones with a "class" value of "jAUs". But the point is that this new code adds the "ontimeupdate" attribute to the <audio> tag and makes it run a function called stopper() that will stop the audio once the current time is greater than the designated stopping time.

Now, there's still work to do. For example, the code should test for the existence of the "stopTime" attribute before forcing the stopper() function to run, but that's no big deal. I also need to test doing this as above – i.e. hitting all <audio> tags with a "class" value of "jAUs" – and see how that works. And there's that pesky IE thing, too.

Anyway, this is working in the other "major" browsers 'far as I can tell.

<!DOCTYPE html>
<html>
  <head>
    <title>jAUs</title>
    <script type="text/javascript">
      function jAUs(){
        var audioTag = document.getElementById('jAUs');
        var audioTag_startTime = audioTag.getAttribute('startTime'); //returns "10"
        //alert(''); //no IE love this time, sorry!
        audioTag.currentTime = audioTag_startTime;
        var audioTag_stopTime = audioTag.getAttribute('stopTime');  //returns "20"
        var stopThis = "stopper(this.currentTime," + audioTag_stopTime + ");"; //returns "stopper(this.currentTime,20);"
        audioTag.setAttribute('ontimeupdate',stopThis); //sends current time and stopTime value to stopper()
      }
      function stopper(currentTime, stopTime){
        var audioTag = document.getElementById('jAUs');
        if (currentTime > stopTime){
          audioTag.pause(); //stops playback if the current time is greater than the stopTime value
        }
      }
    </script>
  </head>
  <body onload="jAUs()">

  <audio id="jAUs" controls startTime="10" stopTime="20">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

</body>
</html>

Update, January 09, 2012: I still don't like it, but it seems that the alert() line can go at the top of the jAUs() function and still allow all this to work on IE 9. At least that prevents an alert message for each <audio> tag within the page.

Update, January 10, 2012: Ah. Now, we might be getting somewhere. I don't know if it's just a band-aid fix, but if I add an "onloadeddata" bit for IE9 this actually seems to be working just fine. The problem was that IE didn't consider the <audio> element's properties to be accessible, thus giving me an "Invalid State Error" with a code number of 11 which is this, more or less: "An attempt was made to use an object that is not, or is no longer, usable." So, IE9 is taking a little longer than the other browsers in realizing what's what.

<!DOCTYPE html>
<html>
  <head>
    <title>jAUs</title>
    <script type="text/javascript">
      function jAUs(){
        audioTag = document.getElementById('jAUs'); //this is a global variable
        var audioTag_startTime = audioTag.getAttribute('startTime'); //returns "10"
        //alert(audioTag.readyState); //IE returns "0", other browsers "4"
        //alert(audioTag.readyState); //IE and others return "4"
   
        //needs to be for IE9 only ... I guess it waits until the audioTag is ready to go.
        if (navigator.appName == 'Microsoft Internet Explorer')
        {
          audioTag.onloadeddata = function() {
            audioTag.currentTime = audioTag_startTime;
          }
        }
        else {
          audioTag.currentTime = audioTag_startTime;
        }

        var audioTag_stopTime = audioTag.getAttribute('stopTime');  //returns "13"
        var stopThis = "stopper(this.currentTime," + audioTag_stopTime + ");"; 
        //returns "stopper(this.currentTime, 13);"

        audioTag.setAttribute('ontimeupdate',stopThis);
      }

      function stopper(currentTime, stopTime){
        if (currentTime > stopTime){
          audioTag.pause();
        }
      }
    </script>
  </head>
  <body onload="jAUs()">

  <audio id="jAUs" controls startTime="10" stopTime="13">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

</body>
</html>

Update, January 10, 2012: Hopefully (for you and me) this is the last update for a while. Hey, I'm still sick, jerk!

Anyway, this is a test for hitting on all <audio> tags with a class value of "jAUs".

<!DOCTYPE html>
<html>
  <head>
    <title>jAUs</title>
    <script type="text/javascript">
      function jAUs(){
     
        audioTagArray = document.getElementsByClassName('jAUs');
       
        for (i=0;i<=audioTagArray.length-1;i++){
          var thisAudioTag = audioTagArray[i];
          jAUs_2(audioTagArray,thisAudioTag);
        }
      }
     
      function jAUs_2(audioTagArray,thisAudioTag){
      //if this is placed directly into jAUs() - i.e. not a separate function,
      //then this whole thing doesn't seem to work.
     
        if (navigator.appName == 'Microsoft Internet Explorer'){
          thisAudioTag.onloadeddata = function(){
            var thisAudioTag_startTime = thisAudioTag.getAttribute('startTime');
            thisAudioTag.currentTime = thisAudioTag_startTime;
          }
        }
       
        else {
          var thisAudioTag_startTime = thisAudioTag.getAttribute('startTime');
          thisAudioTag.currentTime = thisAudioTag_startTime;
        }
     
        var thisAudioTag_stopTime = thisAudioTag.getAttribute('stopTime');
        var stopString = "jAUs_3(this.currentTime," + thisAudioTag_stopTime + "," + i + ");"; 
        //returns "jAUs_3(this.currentTime, 13, i);" where "i" is an int.

        thisAudioTag.setAttribute('ontimeupdate',stopString);
      }

      function jAUs_3(this_currentTime,thisAudioTag_stopTime,i){
     
        if (this_currentTime > thisAudioTag_stopTime){
          audioTagArray[i].pause();
        }
      }
    </script>
  </head>
  <body onload="jAUs()">

  <audio class="jAUs" controls startTime="10" stopTime="13">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

  <br />

  <audio class="jAUs" controls startTime="25" stopTime="26">
    <source src="sonnet130_shakespeare_njm.mp3" type="audio/mp3" />
    <source src="sonnet130_shakespeare_njm.ogg" type="audio/ogg" />
    Your browser does not support the audio element.
  </audio>

</body>
</html>

By the way, using the HTML5 "data-" prefix a la "data-startTime" would make the attribute(s) valid, though still not part of a standard per the <audio> tag specifically. But I guess the "data-" prefix is an acknowledgment of reality.
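
One wrinkle worth knowing if you go the "data-" route: in an HTML document the parser lowercases attribute names, so through the element's dataset property the value shows up as "dataset.starttime", not "dataset.startTime". Calling getAttribute() matches case-insensitively in HTML documents, so this little sketch sticks with that. The helper name `readTimes` is hypothetical, and it returns null for a missing or non-numeric value so the caller can test for the existence of a stop time.

```javascript
// Hypothetical helper: read the two custom attributes as numbers (or null).
function readTimes(element) {
  function numberOrNull(name) {
    const raw = element.getAttribute(name);
    if (raw === null || raw.trim() === "") return null;
    const value = parseFloat(raw);
    return Number.isNaN(value) ? null : value;
  }
  return {
    start: numberOrNull("data-startTime"),
    stop: numberOrNull("data-stopTime")
  };
}
```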

--------------

Written by nitin

January 9th, 2012 at 12:29 pm

Posted in digital audio

indexing and searching timed text with Solr

I'm still learning about Solr so maybe this post is much ado about nothing. But according to this nabble.com thread, one can't index a source XML document in Solr with its native XML structure intact and then in turn search that structure as one can in an XML database like BaseX.

For most things, that's fine. I mean for indexing titles, creators, and descriptions, etc. I just need to index the value of a given element like <title> so that I can search for that element's value.

But for timed text, it's different. Or at least, it can be.

Say I have this DFXP snippet for an audio file with an "id" value of "XYZ".

<p begin="10.0s" end="30.0s">Hello world!</p>

I would need the user to search for the string "Hello world!" or part of it but I would also need to index at least the value of the "begin" attribute so that I can pass that to a page that will play the file "XYZ" starting at the 10 second mark – if the user clicks on the "Hello world!" line in their search result. And I don't want the "10" second value to be something they search against since they might be searching for the string "10" within the text itself.

So I'm wondering how to do that with Solr.

Maybe when I learn more I'll discover a better way to do this, but for now I'm thinking I could do the following:

First, I would pretty much index the timed text twice in Solr.

<doc>
  <field name="id">XYZ</field>
...
  <field name="timedText-stripped">Hello world!</field>
  <field name="timedText">Hello World! {10}</field>
</doc>

After indexing the "id" of the audio file this would index:

  • just the text "Hello world!"
  • the text of "Hello world!" with the "begin" attribute value in curly quotes.

I guess this way the user could be made to search across the "timedText-stripped" field while, via the XSL that can be passed to Solr to display results, the "timedText" field could be displayed so that the text "Hello World!" links to whatever page will play file "XYZ" starting at the 10 second mark. Basically, by planting the "begin" value in curly braces, I can parse the string for the text and the "begin" value as separate things.

So, here's a really crappy XSL snippet that would do something like that. It assumes a variable "$id" exists that equals "XYZ", the identifier for the example audio file.

<xsl:for-each select="//field[@name='timedText']">
  <xsl:variable name="whole">
    <xsl:value-of select="."/>
    <!-- Gets entire element string -->
  </xsl:variable>
  <xsl:variable name="text">
    <xsl:value-of select="substring-before($whole,'{')"/>
    <!-- Gets text prior to seconds -->
  </xsl:variable>
  <xsl:variable name="begin">
    <xsl:value-of select="substring-before(substring-after($whole,'{'),'}')"/>
    <!-- Gets seconds value from end of string -->
  </xsl:variable>
  <a href="someMediaPlayer.php?id={$id}&amp;begin={$begin}">
    <xsl:value-of select="$text"/>
  </a>
  <!-- So, I'm saying that
  "someMediaPlayer.php?id=XYZ&begin=10"
  would launch a player that would start file XYZ at the 10 seconds mark.
  -->
</xsl:for-each>

The search output would be some HTML code like so:

<a href="someMediaPlayer.php?id=XYZ&amp;begin=10>Hello World!</a>

It seems weird to index something twice, more or less, but as user Erick says in the nabble.com thread, "You've gotta take off your DB hat and not worry about duplicating data."

But now as I write this, I'm wondering if I can't just index as follows:

  <field name="text">Hello world!</field>
  <field name="begin">10</field>

and trust that for each "text" field there will be a matching "begin" field, so that the two can simply be used in tandem to create the same HTML link as above. Sounds like I need to play around some more.

:)

Update, September 6, 2012: I wrote a related post to this yesterday in terms of searching across timed text with MySQL and in doing so I realized that the way I was thinking of doing it in Solr was off. Rather than doing it the way I outlined in the original post content (above) in which I was thinking to index all the timed text for a given recording in one Solr "doc" element, I think it makes much more sense to index each line in its own "doc" element as such:

<doc>
  <field name="id">someMediaPlayer.php?source=someFile.mp3&amp;begin=10&amp;end=30</field>
  ...
  <field name="startTime">10</field>
  <field name="stopTime">30</field> 
  <field name="timedText">Hello world!</field>
  <field name="source">someFile.mp3</field> 
</doc>

That way there's no need to post-parse any data fields to get the start and stop time. Moreover, rather than constructing the URL to launch that segment of audio at display time, you can just put the URL directly in the "id" field. You can always use Solr's built-in support for facets to facet off of the "source" field or some descriptive metadata like "title".
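Here's a minimal sketch of how the one-"doc"-per-line idea might be generated from timed text. It's Python, it assumes an un-namespaced DFXP-like snippet shaped like the example above (real DFXP files carry an XML namespace that this ignores), and the function names "seconds" and "dfxp_to_solr_add" are made up for illustration. It wraps the docs in the "add" element that Solr's XML update format expects.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

def seconds(value):
    """Turn a DFXP offset like '10.0s' into whole seconds (10)."""
    return int(float(value.rstrip("s")))

def dfxp_to_solr_add(dfxp_xml, source, player="someMediaPlayer.php"):
    """Build a Solr <add> message with one <doc> per timed-text line."""
    add = ET.Element("add")
    # Assumption: <p> elements are not namespaced, as in the snippet above.
    for p in ET.fromstring(dfxp_xml).iter("p"):
        begin, end = seconds(p.get("begin")), seconds(p.get("end"))
        query = urlencode({"source": source, "begin": begin, "end": end})
        fields = [
            ("id", f"{player}?{query}"),  # the launch URL doubles as the id
            ("startTime", str(begin)),
            ("stopTime", str(end)),
            ("timedText", "".join(p.itertext()).strip()),
            ("source", source),
        ]
        doc = ET.SubElement(add, "doc")
        for name, text in fields:
            ET.SubElement(doc, "field", name=name).text = text
    return ET.tostring(add, encoding="unicode")

sample = '<div><p begin="10.0s" end="30.0s">Hello world!</p></div>'
out = dfxp_to_solr_add(sample, "someFile.mp3")
print(out)
```

Since the start and stop times live in their own fields, nothing has to be parsed back out of the text at display time.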

I'll file the original post under the "thinking out loud yet poorly" category.

--------------

Written by nitin

October 16th, 2011 at 10:54 am

SAVS: a Simple Audio/Video Synchronizer

About a year ago I did some text to audio synchronization tests with HTML5 and Flash.

The tests were partially successful, but I think what really mattered is that I set four goals that I felt needed to be met before the word "synchronization" could truly be used:

  1. The user should be able to click on a line of text and hear the related media.
  2. The user should be able to "scrub" ahead on the media player and the text should follow.
  3. The page should report where in the document the user is.
  4. The page should automatically keep the media/text synchronized without user intervention.

Basically, I've seen a few people make it so that you could watch media while the transcript text was also on the page (scrollable as opposed to overlaid closed captions) and the user could click on a line and have the movie/audio skip ahead to that moment (goal #1). That's great and all, but that's not synchronization.

;)

Synchronization is a two-way street and I've been working this past week during what I'm calling "4 days of madness" to come up with a really simple solution to real synchronization. I did run across this really cool RadioLab page that achieves goal #1, but as much as I like it I want more features with less flash (as in "flash and dash" not Adobe Flash!) and less code.

No mistake: it looks fantastic and I also appreciate that they've got the text timed to clusters of a couple of words rather than by line, but the only thing I've seen that gets it all "right" per my perspective was a subscription resource by Alexander St. Press. It achieved all the goals above using a Flash player and the rest appeared to be done with Javascript and some jQuery smooth scrolling. It was also timed by clusters of words and not just by line or by paragraph. Of course, conceptually it's the same whether one marks up their text – in the temporal sense – by line or by word, but it's a little more work to do it by word.

Unfortunately, I've seen people do the opposite: they use a static unit of time like 60 seconds and only mark up the text every minute. That's taking the easy way out and it also misses the point entirely since it makes the text subservient to an arbitrary unit of time. Would it be acceptable if closed captioning and subtitles on your foreign films only showed up in large chunks every minute? I would hope not, and in the case of the former it would violate the spirit if not the letter of the "law" in regard to accessibility. If done right, you can use the same timed text file both to serve up captions and to show the full text on the page. It's more time and cost efficient to re-purpose the same data for two needs.

Anyway, let's get back to Alexander St. Press. I loved what I saw when my boss (I work at NC Live) showed it to me. I got really excited and said something like, "This is what I've been waiting to see!". In addition to the great and true syncing, they also had a feature that would let the user make and share clips, much the way you can on sites like NBC's Meet The Press. The Alexander St. Press site also allowed you to annotate that clip, which is a great feature for teachers and librarians, etc. Alexander St. Press also has this with their classical music streaming subscription service, which in the spirit of full disclosure I pay for. They ALSO had a cool timeline where you could see what I call "hot spots" – places where others had made clips. The idea, I guess, is that spots on the timeline with more clusters would indicate a particular point of interest. Nothing new, because you see that all the time with streaming sports like the US Open's site where you can go back and watch previous moments in matches and then "go live" at any time. But the difference is, of course, that Alexander St. Press was using user-contributed clips.

So long story short (or just not as long), in a few weeks I need to present these ideas to some people and talk about how we think these features could be useful for our users. And the more I struggled with how to talk about these concepts without a prototype the more I thought I would sound like a) I'm crazy and b) I'm full of hot air.

I decided that it was time to go back to some earlier tests of mine from early April and just build a prototype so we could just show it to people and not have to talk theoretical speak. I think it's generally easier to explain and convince people of the utility of software by showing it rather than telling it. Actions > words, right?

Well, early tests are working and only required me to add one line of Actionscript to our current Flash player, and only about 50 lines of Javascript code are needed to keep the text and media synced. The tests I did were for some PBS videos we purchased along with closed captioning files.
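For what it's worth, the core of keeping text and media in step boils down to two lookups, sketched here in Python rather than the Javascript the prototype actually uses. This isn't the prototype's code: the cue list and the names "cue_at" and "seek_target" are assumptions for illustration. In a browser the first lookup would be fed by the media element's "currentTime" whenever it fires a "timeupdate" event.

```python
from bisect import bisect_right

# Hypothetical cue list: (start_seconds, stop_seconds, text), sorted by start.
CUES = [
    (34, 37, "Than in the breath that from my mistress reeks."),
    (46, 49, "My mistress, when she walks, treads on the ground:"),
]
STARTS = [start for start, _stop, _text in CUES]

def cue_at(position):
    """Media -> text: index of the cue active at `position` seconds, or None."""
    i = bisect_right(STARTS, position) - 1  # last cue starting at or before position
    if i >= 0 and position < CUES[i][1]:
        return i
    return None  # before the first cue, or in a gap between cues

def seek_target(index):
    """Text -> media: clicking line `index` should seek the player here."""
    return CUES[index][0]

print(cue_at(35.2))    # inside the first cue
print(cue_at(40))      # in the gap between the two cues
print(seek_target(1))  # start time of the second cue
```

The first function covers scrubbing and automatic following (the player moves, the text catches up) and the second covers clicking a line (the text moves, the player catches up).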

I was so excited that it was finally working that I went home during those "4 days of madness" to write an HTML5 version which is virtually identical to the Flash version. It's got basic clip making features as well as a very basic tool inspired by this video score tutorial to make timed text files provided you have the audio and full text in hand. Eventually, I'll comment the code up and improve some options and post a download to the source for the HTML5 version. At work, we'll probably eventually offer the code as it's tweaked to meet our aesthetic needs, etc. As you'll see in the demo video below, I have no aesthetics!

I'll shut up now and leave you to the video if you're interested. I recommend watching it in HD so you can read the words on the page.

As my friend whom the HTML5 version is kinda named after likes to say:

More later …

SAVS: a Simple Audio/Verse Synchronizer from nitin arora on Vimeo.

Update, September 20, 2011: To avoid confusion as to what this does, I'm renaming this from "Simple Audio/Video Synchronizer" to "Simple Audio/Verse Synchronizer" or something …

:)

Update, October 16, 2011: Cool, I found one more thing that meets all the four goals at http://www.dinglabs.com. They're pitching it as a foreign language learning tool, but same difference. Also, that site led me to TranscriberAG, a tool for transcribing audio.

--------------

Written by nitin

September 5th, 2011 at 9:39 am

AudioRegent 1.3.1 released

I've updated AudioRegent to version 1.3.1.

You can read an overview of the software and get the download link to the new version here.

The only reason I updated the software is because, as I've mentioned before, I've been having problems with Windows (and only recently at that) in terms of calling executables from the command line.

What seems to have helped is to no longer pass a command as a string a la:

RunSoxString = SoxPath + " ./outWavs/" + OggArray[cnt] + ws + "--comment-file comment.txt ./outOggs/" + str(OggArray[cnt])[:-4] + "." + outputType + ws + SoxOptions
RunSox = subprocess.Popen([RunSoxString], shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
RunSox.wait() #wait until the subprocess finishes

Now, it seems I have to pass it as a Python list (aka an array):

import shlex
import subprocess

RunSoxString = SoxPath + " ./outWavs/" + OggArray[cnt] + ws + "--comment-file comment.txt ./outOggs/" + str(OggArray[cnt])[:-4] + "." + outputType + ws + SoxOptions
RunSoxList = shlex.split(RunSoxString) #split the command string into a list of arguments
#note: on POSIX systems, shell=True plus a list runs only the first item, so shell=True would have to go there
RunSox = subprocess.Popen(RunSoxList, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
RunSox.wait() #wait until the subprocess finishes
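A variation on the same idea, and not what AudioRegent 1.3.1 actually does: build the list directly instead of concatenating a string and splitting it again, and skip the shell altogether. The helper name "build_sox_args" and the file names are hypothetical. To keep the sketch runnable without SoX installed, the subprocess call at the end runs the Python interpreter instead.

```python
import subprocess
import sys

def build_sox_args(sox_path, in_wav, out_file, comment_file="comment.txt", sox_options=()):
    """Return the SoX command as a list, one element per argument."""
    return [sox_path, in_wav, "--comment-file", comment_file, out_file, *sox_options]

# Hypothetical file names; the space in them needs no quoting or escaping.
args = build_sox_args("sox", "./outWavs/track 01.wav", "./outOggs/track 01.ogg")
print(args)

# Stand-in for the real call: a list with no shell=True reaches the child
# process argument by argument, so names with spaces arrive in one piece.
demo = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "a name with spaces.wav"],
    capture_output=True, text=True, check=True,
)
print(demo.stdout.strip())
```

Because each argument is its own list element, nothing gets re-parsed by a shell on any platform.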

By the way, I totally haven't tested this new version enough to distribute it and I haven't tested it at all on a Linux box. But since no one's using it, I'm not too worried.

--------------

Written by nitin

July 28th, 2011 at 6:31 pm

Posted in digital audio,scripts

of ADLs and SMIL and stuff

Even more than usual – this post is me thinking out loud. So some of the stuff at the bottom might not make sense since it refers to some software of mine that really only I use.

This morning I played around a little with Kino, an open-source video editor, and Adobe Audition, Adobe's flagship audio editor – which is based on their acquisition of Cool Edit.

The reason I wanted to play around with Kino is because it can export the project timeline to SMIL. I was mainly interested in seeing if it could be used as a pseudo audio editor – the idea being it could be a quick and dirty SMIL exporter. Well, it doesn't seem to support importing audio formats. I couldn't get it to import WAV or OGG files. It's still a cool application though.

The session exports from Audition are, as expected, pretty dense. For people like me who work in libraries there are issues involved in terms of setting limits for how much can and should be done in digital audio "preservation" (funny, I don't remember ordering jam and bread …). Well, at least I think there need to be limits, lest libraries want to start being creators, too, and admit that in doing so they are donating material of their own editorial designs back onto themselves. Anyway, by imposing limits I'm not sure XML session exports of thousands of lines for simple edits are a good idea.

I'd like to see other session formats without downloading demos for all kinds of audio editing software (some more expensive packages don't even seem to offer demos). For a small fee, there's always AATranslator.

But getting back to SMIL, I'm wondering how to use it in conjunction with AudioRegent without writing more code into the application – for now.

It would seem pretty easy to create a SMIL to SimpleADL XSLT and set up a chain to create derivative files.

Specifically, say I have a source file called source.wav. And I have two SMIL files as such:

source-1.smil.xml

<?xml version="1.0"?>
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <seq>
      <audio src="source.wav" clipBegin="00:00:00.000" clipEnd="00:00:30.000"/>
    </seq>
  </body>
</smil>

and source-2.smil.xml

<?xml version="1.0"?>
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <seq>
      <audio src="source.wav" clipBegin="00:00:30.000" clipEnd="00:00:50.000"/>
    </seq>
    <seq>
      <audio src="source.wav" clipBegin="00:01:00.000" clipEnd="00:02:00.000"/>
    </seq>
  </body>
</smil>

For both, the assumption is that two clips are to be made from source.wav: source-1 and source-2.

All I'd need to do is then set up a chain as such:

  1. Do source-1.smil.xml to temp.adl.xml via XSLT.
  2. Have AudioRegent make source.ogg by pointing it, via the command line options, to the source file, source.wav, and the SimpleADL file, temp.adl.xml.
  3. Rename source.ogg to source-1.ogg – i.e. with the same prefix as the corresponding SMIL file.
  4. Do source-2.smil.xml to temp.adl.xml via XSLT, overwriting temp.adl.xml.
  5. Have AudioRegent make source.ogg by pointing it, via the command line options, to the source file, source.wav, and the SimpleADL file, temp.adl.xml.
  6. Rename source.ogg to source-2.ogg – i.e. with the same prefix as the corresponding SMIL file.

Here's what temp.adl.xml would look like initially (step 1):

<?xml version="1.0" encoding="UTF-8"?>
<audioDecisionList filename="source.wav">
  <region id="_01">
    <in unit="seconds">0</in>
    <duration unit="seconds">30</duration>
  </region>
  <outputAsTracks>false</outputAsTracks>
</audioDecisionList>

And then it would look like this during the second pass (step 4):

<?xml version="1.0" encoding="UTF-8"?>
<audioDecisionList filename="source.wav">
  <region id="_01">
    <in unit="seconds">30</in>
    <duration unit="seconds">20</duration>
  </region>
  <region id="_02">
    <in unit="seconds">60</in>
    <duration unit="seconds">60</duration>
  </region>
  <outputAsTracks>false</outputAsTracks>
</audioDecisionList>
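To check that the mapping is as easy as it looks, here's a rough Python sketch that stands in for the SMIL to SimpleADL XSLT (it's not the XSLT itself, and it's not part of AudioRegent). The function names "to_seconds" and "smil_to_simple_adl" are made up, and the SimpleADL structure is simply copied from the two examples above. It assumes every clip in a SMIL file points at the same source file.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "{http://www.w3.org/2001/SMIL20/Language}"

def to_seconds(clock):
    """Convert an 'HH:MM:SS.mmm' clock value to seconds."""
    hours, minutes, secs = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(secs)

def smil_to_simple_adl(smil_xml):
    """Turn each SMIL <audio> clip into a SimpleADL <region>."""
    clips = list(ET.fromstring(smil_xml).iter(SMIL_NS + "audio"))
    # Assumption: all clips share one source file, so the first "src" is used.
    adl = ET.Element("audioDecisionList", filename=clips[0].get("src"))
    for n, clip in enumerate(clips, start=1):
        start = to_seconds(clip.get("clipBegin"))
        stop = to_seconds(clip.get("clipEnd"))
        region = ET.SubElement(adl, "region", id=f"_{n:02d}")
        ET.SubElement(region, "in", unit="seconds").text = f"{start:g}"
        ET.SubElement(region, "duration", unit="seconds").text = f"{stop - start:g}"
    ET.SubElement(adl, "outputAsTracks").text = "false"
    return ET.tostring(adl, encoding="unicode")

source_2 = """<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <seq><audio src="source.wav" clipBegin="00:00:30.000" clipEnd="00:00:50.000"/></seq>
    <seq><audio src="source.wav" clipBegin="00:01:00.000" clipEnd="00:02:00.000"/></seq>
  </body>
</smil>"""
adl_xml = smil_to_simple_adl(source_2)
print(adl_xml)
```

The only real work is turning clipBegin and clipEnd into an "in" point and a duration; everything else is a straight element-for-element swap.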

By the way, since the SimpleADL files are temporary, I don't see why – rather than converting time format to seconds – I couldn't just use something like this:

<in unit="time">00:01:00.000</in>
<duration unit="time">00:01:00.000</duration>

or something …

--------------

Written by nitin

January 9th, 2011 at 12:02 pm
