blog.humaneguitarist.org

discoveries in digital audio, music notation, and information encoding

Archive for the ‘linked data’ tag

Full Metal Alchemyapi.com or “more term extraction crap and linky data crud”

As I mentioned before, I'm playing with the idea of using term-generating APIs to build facets in a Solr index project that I'm working on with some people.

The results seem really promising.

If I weren't in need of a nap before some more college basketball gets underway, I'd say more than I'm about to.

Instead, I'm going to do three quick things here:

  1. Provide a screenshot of the index UI using Calais "social tags" for facets.
    1. This is a local (my computer) copy of the index using a very small set of item metadata. That's to say we currently have about 37k items in the index, and I'm just using about 1k.
    2. I'm only using Calais tags if the "importance" attribute is equal to "1", so I'm leaving out tags Calais considers less relevant because, well, some of the terms generated were making me think "WTF?".
    3. Some of the terms with underscores like "War_Conflict" appear to be those used in the news industry and are potentially ones to throw out.
  2. Post a small Python script to make a call to Alchemyapi.com, which is similar to, and possibly better than, Calais.
  3. Post the Alchemyapi.com results XML document and talk a little about what I think it can be used for in our project.
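
The filtering rules from item 1 can be sketched roughly as below. This is only an illustration: it assumes the Calais tags have already been parsed into (name, importance) pairs, which is a hypothetical structure since the actual Calais response isn't shown here, and the sample tag names other than "War_Conflict" are made up.

```python
# Hypothetical input: Calais social tags already parsed into
# (name, importance) pairs. The real Calais response format is not shown here.
def select_facet_tags(tags, drop_underscored=True):
    """Keep tags whose importance is "1"; optionally drop underscore-style terms."""
    kept = []
    for name, importance in tags:
        if str(importance) != "1":
            continue  # Calais considers this tag less relevant
        if drop_underscored and "_" in name:
            continue  # e.g. "War_Conflict", likely a news-industry term
        kept.append(name)
    return kept

# Made-up sample data for illustration only.
sample_tags = [
    ("Episcopal Church", "1"),
    ("War_Conflict", "1"),
    ("Postcard", "2"),
]
print(select_facet_tags(sample_tags))
```

Keeping the underscore rule behind a flag makes it easy to compare the facets with and without those terms before deciding to throw them out.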

So, here's the Calais screenshot (you'll need to view the image at full-resolution to read it):

Calais Facets

Here's the Python script to call the Alchemyapi.com API:

import urllib, urllib2

#set API url and API key
url = 'http://access.alchemyapi.com/calls/text/TextGetRankedConcepts'
apikey = '' #your API key goes here
#get Alchemy API key from: http://www.alchemyapi.com/api/register.html

#set some text for the API
text = '''
Episcopal churches
Churches Cemeteries
Tombs and sepulchral monuments
Postcards--North Carolina.
Flat Rock (N.C.)
Henderson County (N.C.)
'''

#send data to API
params = urllib.urlencode({
  'apikey': apikey,
  'text': text,
  'showSourceText': '1', #shows the original text sent to the API
})
alchemyThis = urllib2.urlopen(url, params).read()

#view results
print alchemyThis

And here's the output for the code above:

<?xml version="1.0" encoding="UTF-8"?>
<results>
  <status>OK</status>
  <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
  <url/>
  <language>english</language>
  <text>Episcopal churches Churches Cemeteries Tombs and sepulchral monuments Postcards--North Carolina. Flat Rock (N.C.) Henderson County (N.C.)</text>
  <concepts>
    <concept>
      <text>North Carolina</text>
      <relevance>0.920839</relevance>
      <website>http://www.nc.gov</website>
      <dbpedia>http://dbpedia.org/resource/North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002b62d</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rvViyspwpEbGdrcN5Y29ycA</opencyc>
      <yago>http://mpii.de/yago/resource/North_Carolina</yago>
      <geonames>http://sws.geonames.org/4482348/</geonames>
    </concept>
    <concept>
      <text>Tomb</text>
      <relevance>0.837256</relevance>
      <geo>29.855 31.219</geo>
      <dbpedia>http://dbpedia.org/resource/Tomb</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007ff03</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rwQw2p5wpEbGdrcN5Y29ycA</opencyc>
    </concept>
    <concept>
      <text>Burial monuments and structures</text>
      <relevance>0.773605</relevance>
      <dbpedia>http://dbpedia.org/resource/Burial_monuments_and_structures</dbpedia>
    </concept>
    <concept>
      <text>Flat Rock, Henderson County, North Carolina</text>
      <relevance>0.718415</relevance>
      <geo>35.266666666666666 -82.45333333333333</geo>
      <website>http://villageofflatrock.org/</website>
      <dbpedia>http://dbpedia.org/resource/Flat_Rock,_Henderson_County,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000ebc28</freebase>
      <yago>http://mpii.de/yago/resource/Flat_Rock,_Henderson_County,_North_Carolina</yago>
    </concept>
    <concept>
      <text>Henderson County, North Carolina</text>
      <relevance>0.615825</relevance>
      <geo>35.34 -82.48</geo>
      <website>http://www.hendersoncountync.org</website>
      <dbpedia>http://dbpedia.org/resource/Henderson_County,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000a10b4</freebase>
      <yago>http://mpii.de/yago/resource/Henderson_County,_North_Carolina</yago>
    </concept>
    <concept>
      <text>Asheville, North Carolina</text>
      <relevance>0.610351</relevance>
      <website>http://www.ashevillenc.gov/</website>
      <dbpedia>http://dbpedia.org/resource/Asheville,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000eb2ac</freebase>
      <census>http://www.rdfabout.com/rdf/usgov/geo/us/nc/counties/buncombe_county/asheville</census>
      <yago>http://mpii.de/yago/resource/Asheville,_North_Carolina</yago>
      <geonames>http://sws.geonames.org/4453066/</geonames>
    </concept>
    <concept>
      <text>Episcopal Church in the United States of America</text>
      <relevance>0.610029</relevance>
      <dbpedia>http://dbpedia.org/resource/Episcopal_Church_in_the_United_States_of_America</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000000015f1b</freebase>
      <yago>http://mpii.de/yago/resource/Episcopal_Church_in_the_United_States_of_America</yago>
    </concept>
    <concept>
      <text>New York</text>
      <relevance>0.592008</relevance>
      <geo>43.0 -75.0</geo>
      <website>http://www.ny.gov</website>
      <dbpedia>http://dbpedia.org/resource/New_York</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000054dd5d</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rvViNs5wpEbGdrcN5Y29ycA</opencyc>
      <census>http://www.rdfabout.com/rdf/usgov/geo/us/ny</census>
      <yago>http://mpii.de/yago/resource/New_York</yago>
    </concept>
  </concepts>
</results>

As you can see, "New York" shows up, but its relevance is just under 60% (0.592008), so maybe 0.6 is a threshold to consider when indexing automated subject terms with Alchemyapi. That's just my theory, and only lots of testing will help determine what the threshold really is, if there is one at all.
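
Here's a minimal sketch of what applying such a cutoff might look like. The 0.6 value is just the guess from the paragraph above, not a tested setting, and the function name is mine. The sample XML is a trimmed copy of the output shown earlier; in the script above you'd pass in alchemyThis instead.

```python
import xml.etree.ElementTree as ET

def concepts_above(xml_text, threshold=0.6):
    """Return (text, relevance) pairs for concepts at or above the threshold."""
    root = ET.fromstring(xml_text)
    kept = []
    for concept in root.findall("./concepts/concept"):
        # Treat a missing <relevance> element as 0 so the concept is dropped.
        relevance = float(concept.findtext("relevance", default="0"))
        if relevance >= threshold:
            kept.append((concept.findtext("text"), relevance))
    return kept

# Trimmed copy of the API output shown above.
sample = """<results>
  <concepts>
    <concept><text>North Carolina</text><relevance>0.920839</relevance></concept>
    <concept><text>Episcopal Church in the United States of America</text><relevance>0.610029</relevance></concept>
    <concept><text>New York</text><relevance>0.592008</relevance></concept>
  </concepts>
</results>"""

for text, relevance in concepts_above(sample):
    print("%s (%.2f)" % (text, relevance))
```

With the cutoff at 0.6, "New York" is dropped while the Episcopal Church concept (0.610029) just barely survives, which shows how sensitive the results are to the exact number.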

As you can also see, this output has a lot of potential for linked data, since it points to relevant dbpedia pages and similar sources. One neat thing would be for the UI to pop up more information from these linked data sources when the user hovers over a facet: relevant websites, geo-coords mapped using the Google Maps API, definitions of the faceted term, similar concept visualizations, etc.
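
As a rough sketch of the data side of that idea, the snippet below pulls the coordinates and linked data URIs out of each concept so a UI could look them up by facet term. The function name and the shape of the returned dictionary are my own choices, and I'm assuming the geo element holds "latitude longitude", which is what the values above suggest. The sample XML is trimmed from the output above.

```python
import xml.etree.ElementTree as ET

# Child elements that describe the concept itself rather than link elsewhere.
NON_LINK_TAGS = ("text", "relevance", "geo")

def facet_details(xml_text):
    """Map each concept's text to its coordinates (if any) and its link URIs."""
    details = {}
    for concept in ET.fromstring(xml_text).findall("./concepts/concept"):
        geo = concept.findtext("geo")
        # Assumption: <geo> is "latitude longitude" separated by a space.
        coords = tuple(float(part) for part in geo.split()) if geo else None
        links = dict(
            (child.tag, child.text)
            for child in concept
            if child.tag not in NON_LINK_TAGS
        )
        details[concept.findtext("text")] = {"geo": coords, "links": links}
    return details

# Trimmed copy of the API output shown above.
sample = """<results>
  <concepts>
    <concept>
      <text>Burial monuments and structures</text>
      <relevance>0.773605</relevance>
      <dbpedia>http://dbpedia.org/resource/Burial_monuments_and_structures</dbpedia>
    </concept>
    <concept>
      <text>Henderson County, North Carolina</text>
      <relevance>0.615825</relevance>
      <geo>35.34 -82.48</geo>
      <website>http://www.hendersoncountync.org</website>
      <dbpedia>http://dbpedia.org/resource/Henderson_County,_North_Carolina</dbpedia>
    </concept>
  </concepts>
</results>"""

info = facet_details(sample)["Henderson County, North Carolina"]
print(info["geo"])
print(info["links"]["dbpedia"])
```

Not every concept has a geo element (see "Burial monuments and structures"), so anything that maps the coordinates would need to handle the missing case.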

That's all. Sleepy time and B-ball starts soon …

--------------

Written by nitin

March 25th, 2012 at 4:57 pm

these two apps are too cool: TeamViewer and dotNetRDF

TeamViewer and dotNetRDF have nothing in common except that I downloaded both this week and I'm glad I did.

TeamViewer is free for non-commercial use and lets me control my Samsung netbook with my Lenovo laptop. I have my Samsung hooked up to my stereo system and now I can start Pandora or ClassicsOnline on my Samsung by controlling it with my Lenovo.

As far as I know, I wouldn't have been able to do this with Windows Remote Desktop, given that I have Windows XP Home on the Samsung and Windows 7 Home on the Lenovo.

I could also use TeamViewer to transfer files across computers if I wanted.

dotNetRDF seems like a great open source tool to help me learn more about linked data and it's got a nice tool for running SPARQL queries called SparqlGUI.

--------------

Written by nitin

July 6th, 2011 at 9:20 pm

you are what you eat: junk food and linked data

I always feel a little strange just passing along something from another blog or posting clips, but this short video by Tim Berners-Lee is very cool.

You can just watch the video below, or better yet you can read a little more and see the video by going to:

http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/

--------------

Written by nitin

October 31st, 2010 at 10:37 am
