blog.humaneguitarist.org

discoveries in digital audio, music notation, and information encoding

Archive for the ‘XQuery’ tag

LS-598 #2: XQuery problems and solutions

Just a quick morning post today …

For the last 10 days or so I've been struggling with some major problems that arose in trying to implement effective XQueries on my web demo.

  1. Dublin Core doesn't allow me to differentiate creator "types", so I was limited to searching across the DC:creator element for all creators, be they Composer, Lyricist, or Arranger. MusicXML does differentiate these types, so essentially Dublin Core was making me "dumb down" some information. I want people to be able to search creator specifically by their role: Composer, Lyricist, or Arranger.
  2. I needed a way to iterate an XQuery over all the MusicXML documents and I needed it to be relatively fast. A demo is a demo, but impatience is impatience and I just can't accept slow query processing.
  3. The XQuery processor I was using didn't support some XQuery functions that would allow a searcher to type in "Bach" and retrieve documents for which the creator was "J.S Bach", "Johann Sebastian Bach", "Bach, J.S", "P.D.Q Bach", etc. This really was limiting the search/query coolness factor and I wasn't at all happy about it.

Here are my solutions (details to follow in a few days or so):

  1. Ditch Dublin Core and switch to MODS, which does allow me to specify the role of a creator. Last week, I made a MusicXML to MODS XSL transformation for descriptive metadata and it's working well.
  2. Steal an idea from "Using XQuery on MusicXML Databases for Musicological Analysis": rather than iterate one query (say, for the number of notes in a piece) across multiple MusicXML docs, I just concatenate all the MusicXML documents. The original files are left alone, but a "super" MusicXML file gets created so that one can just query that one file, hence no need for lengthy iteration. I'm not sure how those fellows did it, but I just automated it via PHP using the following format:

<hyperMXML>
  <hypoMXML file="foo1.xml">
    1st MusicXML document
  </hypoMXML>
  <hypoMXML file="foo2.xml">
    2nd MusicXML document
  </hypoMXML>
</hyperMXML>
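For the curious, here is a minimal Python sketch of that concatenation. The demo itself used PHP and that script isn't shown here, so this is an illustration under assumptions, not the actual code: the function name build_hyper_mxml is hypothetical, and each input file is assumed to be a well-formed MusicXML document.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_hyper_mxml(paths):
    """Wrap each MusicXML document in a <hypoMXML> child of one <hyperMXML> root."""
    hyper = ET.Element("hyperMXML")
    for path in paths:
        path = Path(path)
        # Parsing keeps only the element tree, so each file's XML declaration
        # and DOCTYPE are left behind and only its root element is copied.
        score_root = ET.parse(path).getroot()
        hypo = ET.SubElement(hyper, "hypoMXML", {"file": path.name})
        hypo.append(score_root)
    return ET.ElementTree(hyper)
```

Something like build_hyper_mxml(["foo1.xml", "foo2.xml"]).write("concatMXML.xml", encoding="UTF-8", xml_declaration=True) would then save the "super" file while leaving the originals untouched.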

  3. Switch XQuery processors! I'll go into the ones that didn't support the function I needed another time, but I will say that BaseX did the trick. Below is the query that searches for creators with "Bach" somewhere in MusicXML's <creator> element. For the deliverable demo, I won't be querying these big MusicXML documents for simple descriptive metadata like Creator; that's what the MODS is for. But this is just an example. The "ftcontains" syntax is what allows for retrieval of values where "Bach" is somewhere within the element, but isn't necessarily equivalent to the entire element value.

for $i in doc("../temp/concat/concatMXML.xml")/hyperMXML/hypoMXML/score-partwise
where $i/identification/creator ftcontains "Bach"
return ($i/work/work-title)
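To make the matching behaviour concrete, here is a rough Python approximation of that query. It is only a sketch: it uses a plain case-insensitive substring test instead of real full-text matching, so unlike a tokenizing full-text search it would also match "Bach" inside a longer word. The sample document, its titles, and the function name titles_by_creator are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the concatenated <hyperMXML> layout described above.
SAMPLE = """<hyperMXML>
  <hypoMXML file="foo1.xml">
    <score-partwise>
      <work><work-title>Example Piece A</work-title></work>
      <identification><creator type="composer">J.S. Bach</creator></identification>
    </score-partwise>
  </hypoMXML>
  <hypoMXML file="foo2.xml">
    <score-partwise>
      <work><work-title>Example Piece B</work-title></work>
      <identification><creator type="composer">Anonymous</creator></identification>
    </score-partwise>
  </hypoMXML>
  <hypoMXML file="foo3.xml">
    <score-partwise>
      <work><work-title>Example Piece C</work-title></work>
      <identification><creator type="composer">Bach, J.S.</creator></identification>
    </score-partwise>
  </hypoMXML>
</hyperMXML>"""


def titles_by_creator(root, term):
    """Return work titles of scores with a <creator> containing term (case-insensitive)."""
    term = term.lower()
    titles = []
    for score in root.findall("hypoMXML/score-partwise"):
        creators = score.findall("identification/creator")
        if any(term in (c.text or "").lower() for c in creators):
            titles.extend(t.text for t in score.findall("work/work-title"))
    return titles


print(titles_by_creator(ET.fromstring(SAMPLE), "Bach"))
# prints ['Example Piece A', 'Example Piece C']
```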


This blog post is part of a semester-long investigation into digital encoding of symbolic music representation (SMR), its context in libraries, web-based delivery, preservation and metadata, and search and retrieval technologies.

--------------

Written by nitin

February 4th, 2010 at 8:29 am

running XQuery online

A while ago, I posted about my first experience with XQuery and how I'd used the .NET version of the Saxon processor on my local Windows machine.

Obviously, I want to extend that experience to running XQueries online. So far, I know this can easily be done with a native XML database server like eXist. That is to say, when I installed eXist on my local machine and made it go live: bam! – instant XML server with built-in XQuery functionality, accessible from anywhere.

Another way is to use one of the more popular web-scripting languages to execute XQuery syntax. I nearly killed myself this weekend trying to install the Zorba PHP binding for XQuery (i.e. run XQuery natively from within a PHP script). I just couldn't get all the dependencies successfully installed on my virtual install of Ubuntu Netbook Remix (BTW: I use Sun's VirtualBox for virtualization). Perhaps I'll be able to make it work another time.

Now, even though it makes all the sense in the world to stick with a native XML server like eXist if I want to make a large collection of searchable XML documents online (and I do), I'm feeling nonsensical. What I decided to try was to run my computer as a more traditional server using the X-Apache-MySQL-PHP model, specifically WampServer.

From there, I placed my sample document, "books.xml" and my "test.xquery" file from last time:

<ul>
{
for $x in doc("books.xml")/bookstore/book/title
order by $x
return <li>{data($x)}</li>
}
</ul>

in WAMP's "www" directory – i.e. the directory which is accessible from the browser, the place where one would put all their HTML files, etc. for the world to see.

Of course, I was still missing the actual XQuery processor at this point. What I tried was putting the relevant executables for Saxon in the "www" directory as well.

… Now, I'm sure this is totally unsafe or something, but I was just testing and I only make my computer "go live" as a server for short periods of time.

Anyway, from there I used PHP to call the Saxon processor and to display the results in the browser. It actually worked!

Here's the code:

<?php
echo "Hi. I'm going to use XQuery to list the books alphabetically.";
exec("query.exe test.xquery !indent=yes", $results);
foreach ($results as $value)
    {
       echo "$value";
    }
?> 

You can see that the PHP "exec" command calls the Saxon executable named "query.exe", which executes the "test.xquery" file.

It saves the results in a variable called "results" and then prints each value of the results in the browser.
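The same call-and-collect pattern can be sketched in Python. This is only an illustration, not part of the demo: the function name run_and_collect is hypothetical, and a small Python one-liner stands in for the Saxon executable so the sketch runs on any machine.

```python
import subprocess
import sys


def run_and_collect(command):
    """Run a command and return its standard output as a list of lines."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.splitlines()


# Stand-in command so the sketch runs anywhere; with Saxon in place it would
# instead be something like ["query.exe", "test.xquery", "!indent=yes"].
demo_command = [
    sys.executable,
    "-c",
    "print('<ul>'); print('<li>Learning XML</li>'); print('</ul>')",
]

for line in run_and_collect(demo_command):
    print(line)
```

Passing the command as a list means no shell is involved, which matters if any part of the command ever comes from user input.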

For now, I'm OK with this, but I need to eventually do the same thing with the Java version of Saxon, I suppose, if I'm ever going to run this on a Linux server. It shouldn't present any new hurdles, but I need to try it to make sure, of course.

If anyone out there has any thoughts or recommendations on a more elegant method to achieve these results – or on what security risks the approach I've outlined presents (since I didn't use a CGI-bin) – please speak up. I'm all ears.

:)

Update, November, 2009: No problems with using the Java version of Saxon. I did place it in a "bin" directory on my local server so that the Java file wouldn't reside in the directory that users have direct access to.

--------------

Written by nitin

November 8th, 2009 at 7:26 pm

Posted in scripts, XML

XQuery and MusicXML

Earlier today, I posted about my first experience with XQuery. I'd mentioned that I wanted to get my feet wet before I started trying to run queries on MusicXML documents.

Well, I'm an incredibly impatient person.

I couldn't wait to take a couple of simple queries for a test run, especially after reading the following paper from the 2008 International Conference on Music Information Retrieval hosted by ISMIR, the International Society for Music Information Retrieval:  

Using XQuery on MusicXML Databases for Musicological Analysis
Joachim Ganseman, Paul Scheunders and Wim D’haes

Now, I've known for a while that tests have been done using XQuery on MusicXML documents, but this paper was getting at something that's been on my mind for a long time now: the day we can have digital libraries of sheet music, not as image files, but as encoded documents, allowing musicians and the like to query music online in the same way that users of prose and literary documents now take for granted.

Anyway, on to my first XQuery and MusicXML experience …

For testing, I used a very silly little ditty I wrote called "MusicXML: I Heart Thee".

Here are its various manifestations:

The first query demonstrated in the paper (see page 3) is one to count the total notes in a digital library, in this case the Wikifonia collection of MusicXML docs.

I couldn't get it to work as written even after I adjusted the query to work on my test document. This is likely due to my own ignorance, but in the end it was a good thing because it forced me to write my own, simpler queries.

I'm using the Saxon query processor as described in my earlier post.

1. This query counts all the notes in my piece:

<ul>
{
for $i in doc("i_heart_thee.xml")/score-partwise
let $j := count($i/part/measure/note)
return $j
}
</ul>

A line-by-line translation:

  • Open an unordered list.

  • Open the query syntax with the "{" character.

  • Let there be a variable called "i" that will start at the root element, <score-partwise>, of the document "i_heart_thee.xml".

  • Let there be a variable, "j", that executes the Count function on "i" for the <note> element which is a child of <measure> and a grandchild of <part>.

  • Print the value of "j".

  • Close the query syntax with the "}" character.

  • Close the unordered list.

2. This query counts all the notes in the vocal part (there are 3 parts altogether: voice, guitar, bass):

<ul>
{
for $i in doc("i_heart_thee.xml")/score-partwise
let $j := count($i/part[@id='P1']/measure/note)
return $j
}
</ul>

A line-by-line translation:

  • Open an unordered list.

  • Open the query syntax with the "{" character.

  • Let there be a variable called "i" that will start at the root element, <score-partwise>, of the document "i_heart_thee.xml".

  • Let there be a variable, "j", that executes the Count function on "i" for the <note> element which is a child of <measure> and a grandchild of <part>, where the "id" attribute of <part> is equal to "P1". This is the vocal part of the score.

  • Print the value of "j".

  • Close the query syntax with the "}" character.

  • Close the unordered list.

If you run the first query you get the result "137", as in 137 notes, including rests – even the hidden rests in measures 1, 5, and 9 that exist because both voices in the guitar part have rests, though only one rest is displayed each time in the image version of the score.

If you run the second query, you get 43 notes including rests and the tied notes.

I'm sure there are ways to subtract rests and tied notes, but I have to start somewhere, right?
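As a rough idea of how that subtraction could go, here is a Python sketch of the same counts, plus two filtered ones. It runs on a tiny made-up score, not on "i_heart_thee.xml", and the helper names are hypothetical. It assumes rests are <note> elements with a <rest/> child and that the continuation of a tied note carries <tie type="stop"/>, which is how MusicXML encodes them.

```python
import xml.etree.ElementTree as ET

# Made-up two-part score: P1 has one pitched note and one rest;
# P2 has a note tied to a second note, followed by a rest.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>4</duration></note>
      <note><rest/><duration>4</duration></note>
    </measure>
  </part>
  <part id="P2">
    <measure number="1">
      <note><pitch><step>E</step><octave>3</octave></pitch><duration>4</duration><tie type="start"/></note>
      <note><pitch><step>E</step><octave>3</octave></pitch><duration>4</duration><tie type="stop"/></note>
      <note><rest/><duration>4</duration></note>
    </measure>
  </part>
</score-partwise>"""


def notes(root, part_id=None):
    """All <note> elements, optionally limited to one part (same paths as the queries)."""
    if part_id is None:
        return root.findall("part/measure/note")
    return root.findall(f"part[@id='{part_id}']/measure/note")


def is_rest(note):
    return note.find("rest") is not None


def is_tie_continuation(note):
    # A note that ends a tie continues an earlier note rather than starting a new one.
    return note.find("tie[@type='stop']") is not None


root = ET.fromstring(SAMPLE)
print(len(notes(root)))                                   # prints 5
print(len(notes(root, "P1")))                             # prints 2
print(sum(1 for n in notes(root) if not is_rest(n)))      # prints 3
print(sum(1 for n in notes(root)
          if not is_rest(n) and not is_tie_continuation(n)))  # prints 2
```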

:)

--------------

Written by nitin

September 12th, 2009 at 8:11 pm

Posted in music notation, XML

on using XQuery for the first time

Obviously, I've been playing around with XSLT lately. So naturally, the next logical step was to delve into XQuery, the XML query language de jure. Eventually I want to run queries on MusicXML documents, but I need to start small.

While the W3Schools tutorial on XQuery is a great introduction, there's one little problem.

It doesn't really tell you how to implement XQuery: i.e. how to actually run a query and retrieve results.

So after some random perusing and downloading, I, like the fool I am, was made aware by Dr. Michael Kay's "Learn XQuery in 10 Minutes: An XQuery Tutorial" that the Saxon processor I was already using for XSLT transformations had an XQuery engine built in.

That's to say that the .NET version of Saxon has 2 command line executables:

  1. Transform.exe, which I'd already used for XSLT transformations
  2. Query.exe, which allows one to run XQuery queries

So much for paying attention to what I download …

From there, it was a simple matter to use XQuery for the first time.

Here are the steps:

  1. I downloaded the books.xml file provided by W3Schools and placed it into the "bin" directory of Saxon on my drive. This is the same directory where the 2 aforementioned executables reside.
  2. Using the kick-tail text editor jEdit, I copied, pasted, and saved this query example from W3Schools as "test.xquery" (also saved in the "bin" directory):

<ul>
{
for $x in doc("books.xml")/bookstore/book/title
order by $x
return <li>{data($x)}</li>
}
</ul>

This query simply lists all the titles from "books.xml" in alphabetical order.

  3. Then, using jEdit's command line plug-in called "Console", I set Console to the Saxon "bin" directory where "query.exe", "books.xml", and "test.xquery" reside. The easiest way to set the directory in Console is to type:

cd "C:\Documents and Settings\nitin\Desktop\saxon\bin"

Of course, you might extract Saxon elsewhere, but the important thing is to type cd + opening quotation mark + full path to Saxon's "bin" folder + ending quotation mark.

  4. Now I was in the correct folder and could run the XQuery with the following command line syntax:

query test.xquery

And my results look like this:

I know what you're thinking: no line breaks! Sure, the computer doesn't care, but this is really hard for humans to read!

Yes, that's true. But I went ahead and pasted the following:

<?xml version="1.0" encoding="UTF-8"?><ul><li>Everyday Italian</li><li>Harry Potter</li><li>Learning XML</li><li>XQuery Kick Start</li></ul>

into a new document in jEdit anyway.

We're gonna take care of those line breaks now …

  5. One of the many great things about jEdit is the ability to run BeanShell commands, which, despite my attempts to sound authoritative, I only learned about roughly 30 minutes ago. This means that a search and replace can be done in jEdit using simple Java syntax to fix that line break issue. The first step is identifying where to insert the line break. I needed it in between > and <. Specifically, I needed a line break wherever a closing bracket is immediately followed by an opening bracket:

<?xml version="1.0" encoding="UTF-8"?><ul><li>Everyday Italian</li><li>Harry Potter</li><li>Learning XML</li><li>XQuery Kick Start</li></ul>

So I just invoked the jEdit search/replace box and did the following:

This simply says:

Find all instances of

"><"

and

Replace it with

">\n<"

- i.e. the text between the quotation marks. The \n is, by the way, the line break syntax.

When I hit "Replace All", this was the result:

<?xml version="1.0" encoding="UTF-8"?>
<ul>
<li>Everyday Italian</li>
<li>Harry Potter</li>
<li>Learning XML</li>
<li>XQuery Kick Start</li>
</ul>

Problem solved.

  6. Now I simply saved this document as "test.html" and opened it in a browser.
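The search-and-replace step above can also be sketched outside jEdit. Here is a minimal Python version, assuming the goal is simply a line break between every adjacent pair of tags; the function name break_between_tags is made up for illustration.

```python
def break_between_tags(markup):
    """Insert a line break wherever a closing bracket directly meets an opening bracket."""
    return markup.replace("><", ">\n<")


raw = ('<?xml version="1.0" encoding="UTF-8"?><ul><li>Everyday Italian</li>'
       '<li>Harry Potter</li><li>Learning XML</li><li>XQuery Kick Start</li></ul>')

print(break_between_tags(raw))
```

Run on the Saxon output from earlier, this prints the same seven lines that the jEdit "Replace All" produced.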

Anyway, that's my very simple start to XQuery, but I'm feeling pretty good about it nonetheless.

--------------

Written by nitin

September 12th, 2009 at 12:26 pm

Posted in XML
