blog.humaneguitarist.org

discoveries in digital audio, music notation, and information encoding

Archive for the ‘faceting’ tag

Full Metal Alchemyapi.com or “more term extraction crap and linky data crud”

As I mentioned before, I'm playing with the idea of using term-generating APIs to build facets in a Solr index project that I'm working on with some people.

The results seem really promising.

If I wasn't in need of a nap before some more college basketball gets underway, I'd say more than I'm about to.

Instead, I'm going to do three quick things here:

  1. Provide a screenshot of the index UI using Calais "social tags" for facets.
    1. This is a local (my computer) copy of the index using a very small set of item metadata. That's to say we currently have about 37k items in the index, and I'm just using about 1k.
    2. I'm only using Calais tags if the "importance" attribute is equal to "1", so I'm leaving out tags Calais considers less relevant because, well, some of the terms generated were making me think "WTF?".
    3. Some of the terms with underscores like "War_Conflict" appear to be those used in the news industry and are potentially ones to throw out.
  2. Post a small Python script to make a call to Alchemyapi.com, which is similar to, and possibly better than, Calais.
  3. Post the Alchemyapi.com results XML document and talk a little about what I think it can be used for in our project.
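The two filtering rules from item 1 (only keep tags with an importance of "1", and maybe throw out the underscore terms) could be sketched roughly like this. It's only a sketch: the list of (tag, importance) pairs is invented sample data standing in for a parsed Calais response, and `keep_tag` is a hypothetical helper, not part of any Calais library. It's written for Python 3.

```python
# Hypothetical sample data: (social tag, importance) pairs as they might look
# after parsing a Calais response. The values are invented for illustration.
sample_tags = [
    ("Refrigerator", "1"),
    ("Home appliances", "1"),
    ("War_Conflict", "1"),
    ("Food storage", "2"),
]

def keep_tag(name, importance):
    """Keep a tag only if its importance is "1" and it has no underscore."""
    return importance == "1" and "_" not in name

# Build the facet list from the tags that pass both rules.
facets = [name for name, importance in sample_tags if keep_tag(name, importance)]
print(facets)
```

Whether dropping every underscore term is too blunt is something only more testing would show.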

So, here's the Calais screenshot (you'll need to view the image at full-resolution to read it):

[Screenshot: Calais Facets]

Here's the Python script to call the Alchemyapi.com API:

import urllib, urllib2

#set API url and API key
url = 'http://access.alchemyapi.com/calls/text/TextGetRankedConcepts'
apikey = '' #your API key goes here
#get Alchemy API key from: http://www.alchemyapi.com/api/register.html

#set some text for the API
text = '''
Episcopal churches
Churches Cemeteries
Tombs and sepulchral monuments
Postcards--North Carolina.
Flat Rock (N.C.)
Henderson County (N.C.)
'''

#send data to API
params = urllib.urlencode({
  'apikey': apikey,
  'text': text,
  'showSourceText': '1', #shows the original text sent to the API
})
alchemyThis = urllib2.urlopen(url, params).read()

#view results
print alchemyThis

And here's the output for the code above:

<?xml version="1.0" encoding="UTF-8"?>
<results>
  <status>OK</status>
  <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
  <url/>
  <language>english</language>
  <text>Episcopal churches Churches Cemeteries Tombs and sepulchral monuments Postcards--North Carolina. Flat Rock (N.C.) Henderson County (N.C.)</text>
  <concepts>
    <concept>
      <text>North Carolina</text>
      <relevance>0.920839</relevance>
      <website>http://www.nc.gov</website>
      <dbpedia>http://dbpedia.org/resource/North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002b62d</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rvViyspwpEbGdrcN5Y29ycA</opencyc>
      <yago>http://mpii.de/yago/resource/North_Carolina</yago>
      <geonames>http://sws.geonames.org/4482348/</geonames>
    </concept>
    <concept>
      <text>Tomb</text>
      <relevance>0.837256</relevance>
      <geo>29.855 31.219</geo>
      <dbpedia>http://dbpedia.org/resource/Tomb</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007ff03</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rwQw2p5wpEbGdrcN5Y29ycA</opencyc>
    </concept>
    <concept>
      <text>Burial monuments and structures</text>
      <relevance>0.773605</relevance>
      <dbpedia>http://dbpedia.org/resource/Burial_monuments_and_structures</dbpedia>
    </concept>
    <concept>
      <text>Flat Rock, Henderson County, North Carolina</text>
      <relevance>0.718415</relevance>
      <geo>35.266666666666666 -82.45333333333333</geo>
      <website>http://villageofflatrock.org/</website>
      <dbpedia>http://dbpedia.org/resource/Flat_Rock,_Henderson_County,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000ebc28</freebase>
      <yago>http://mpii.de/yago/resource/Flat_Rock,_Henderson_County,_North_Carolina</yago>
    </concept>
    <concept>
      <text>Henderson County, North Carolina</text>
      <relevance>0.615825</relevance>
      <geo>35.34 -82.48</geo>
      <website>http://www.hendersoncountync.org</website>
      <dbpedia>http://dbpedia.org/resource/Henderson_County,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000a10b4</freebase>
      <yago>http://mpii.de/yago/resource/Henderson_County,_North_Carolina</yago>
    </concept>
    <concept>
      <text>Asheville, North Carolina</text>
      <relevance>0.610351</relevance>
      <website>http://www.ashevillenc.gov/</website>
      <dbpedia>http://dbpedia.org/resource/Asheville,_North_Carolina</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000eb2ac</freebase>
      <census>http://www.rdfabout.com/rdf/usgov/geo/us/nc/counties/buncombe_county/asheville</census>
      <yago>http://mpii.de/yago/resource/Asheville,_North_Carolina</yago>
      <geonames>http://sws.geonames.org/4453066/</geonames>
    </concept>
    <concept>
      <text>Episcopal Church in the United States of America</text>
      <relevance>0.610029</relevance>
      <dbpedia>http://dbpedia.org/resource/Episcopal_Church_in_the_United_States_of_America</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000000015f1b</freebase>
      <yago>http://mpii.de/yago/resource/Episcopal_Church_in_the_United_States_of_America</yago>
    </concept>
    <concept>
      <text>New York</text>
      <relevance>0.592008</relevance>
      <geo>43.0 -75.0</geo>
      <website>http://www.ny.gov</website>
      <dbpedia>http://dbpedia.org/resource/New_York</dbpedia>
      <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000054dd5d</freebase>
      <opencyc>http://sw.opencyc.org/concept/Mx4rvViNs5wpEbGdrcN5Y29ycA</opencyc>
      <census>http://www.rdfabout.com/rdf/usgov/geo/us/ny</census>
      <yago>http://mpii.de/yago/resource/New_York</yago>
    </concept>
  </concepts>
</results>

As you can see, "New York" shows up but it has less than 60% relevance, so maybe that's a threshold to consider when indexing automated subject terms with Alchemyapi. That's just my theory and only lots of testing will help determine what the threshold really is – if there's one at all.
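To make the threshold idea concrete, here's a minimal sketch that parses a trimmed copy of the results above and keeps only concepts at or above a cutoff. The 0.6 default is just the guess from the paragraph above, not a tested value, and `concepts_above` is a name I made up for illustration. It uses Python 3 and the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the results document above (two concepts only).
alchemy_xml = """<results>
  <concepts>
    <concept>
      <text>North Carolina</text>
      <relevance>0.920839</relevance>
    </concept>
    <concept>
      <text>New York</text>
      <relevance>0.592008</relevance>
    </concept>
  </concepts>
</results>"""

def concepts_above(xml_text, threshold=0.6):
    """Return (text, relevance) pairs whose relevance meets the threshold."""
    root = ET.fromstring(xml_text)
    kept = []
    for concept in root.iter("concept"):
        relevance = float(concept.findtext("relevance"))
        if relevance >= threshold:
            kept.append((concept.findtext("text"), relevance))
    return kept

print(concepts_above(alchemy_xml))
```

With the cutoff at 0.6, "New York" drops out and "North Carolina" stays.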

As you can also see, there's a lot of potential for linked data with this output: links to data from relevant dbpedia pages, etc. One neat thing would be to make it so that if the user hovers over a facet, the UI pops up more information from these linked data sources, like relevant websites, mapped geo-coords using the Google Maps API, definitions of the faceted term, similar concept visualizations, etc.
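As a rough sketch of the first step toward that hover idea, the snippet below pulls the linked-data bits out of one `<concept>` element into a dictionary a UI could use. The function name `hover_info` is hypothetical, and I'm assuming the `<geo>` value is "latitude longitude" in that order, which is what the Flat Rock numbers suggest. Python 3, standard library only.

```python
import xml.etree.ElementTree as ET

# One concept copied (minus a couple of links) from the results above.
concept_xml = """<concept>
  <text>Flat Rock, Henderson County, North Carolina</text>
  <relevance>0.718415</relevance>
  <geo>35.266666666666666 -82.45333333333333</geo>
  <website>http://villageofflatrock.org/</website>
  <dbpedia>http://dbpedia.org/resource/Flat_Rock,_Henderson_County,_North_Carolina</dbpedia>
</concept>"""

def hover_info(concept):
    """Collect whatever linked-data fields a <concept> element carries."""
    info = {"label": concept.findtext("text")}
    geo = concept.findtext("geo")
    if geo:
        # Assumption: the two numbers are latitude then longitude.
        lat, lon = (float(part) for part in geo.split())
        info["lat"], info["lon"] = lat, lon
    for tag in ("website", "dbpedia", "freebase", "yago", "geonames"):
        value = concept.findtext(tag)
        if value:
            info[tag] = value
    return info

info = hover_info(ET.fromstring(concept_xml))
print(info)
```

Not every concept has every link (the "Burial monuments and structures" concept only has dbpedia), so missing fields are simply left out.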

That's all. Sleepy time and B-ball starts soon …

--------------

Written by nitin

March 25th, 2012 at 4:57 pm

make you some facets, boy!

As I mentioned the other day in this post, I've been working with some awesome people to harvest, index, and make searchable metadata for digital library collections from multiple institutions across the state of North Carolina, USA.

In the post I just linked to, I talked about the problems of inconsistent metadata across institutions and how that negatively impacts browsing via facets with Solr. I also wondered out loud about resolving/aligning small discrepancies via text analysis.

Well, another way to tackle this problem is – after harvesting the metadata but before indexing it – to "make" facet-able terms via some sort of term extraction. While at DrupalCon 2012 in Denver, CO this week I went to a presentation where the presenter mentioned a project he'd worked on pulling in RSS feeds. In passing, he mentioned using OpenCalais to make a tag cloud. I totally forgot I had an API key for OpenCalais!

Anyway, now I see there are lots of similar web services. Which one is best in terms of term extraction and which one allows the most API hits per day is a matter for another day, but today – in my hotel now that the conference has ended – I thought I'd do a little scripting to get me on the path to really testing this.

Using the soon-to-be deprecated Yahoo Term Extraction Web Service, I tested taking a sample Solr-compatible XML index file and sending the metadata in it to the service to retrieve new subject terms. While my test script doesn't do it here, the idea is that after retrieving these new terms from the API, the terms could be placed into the Solr-compatible index file. After indexing the updated file, these new terms could be exposed to the user as click-able facets.
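For what it's worth, the step the test script skips (placing the new terms into the index file) might look something like the sketch below. The `add_terms` function is hypothetical, the record is a shortened version of the sample, and the default field name `yahooTerm` is just the one used in the output further down. It's Python 3 with the standard library's ElementTree rather than lxml.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample Solr-compatible record.
solr_xml = """<add>
  <doc>
    <field name="title">Buy an electric refrigerator /</field>
    <field name="subject">Refrigerators.</field>
  </doc>
</add>"""

def add_terms(xml_text, terms, field_name="yahooTerm"):
    """Append one <field> per extracted term to the first <doc> element."""
    root = ET.fromstring(xml_text)
    doc = root.find("doc")
    for term in terms:
        field = ET.SubElement(doc, "field", {"name": field_name})
        field.text = term
    return ET.tostring(root, encoding="unicode")

updated = add_terms(solr_xml, ["electric refrigerators", "silent films"])
print(updated)
```

The updated XML could then be posted to Solr the same way as the original file.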

I'll have to test this with lots of real-world metadata from across our test-set of metadata to see if the term extraction service can be used to produce nicer facets with disparate metadata than what we currently see, but for now I just needed to write a play/test script.

Below, I've pasted the Python script and the output, which explains a little about what it's doing.

Actually, I've pasted the output first since people might not need or want to see the code. At the end, I've posted the "social tags" that OpenCalais would seem to generate for the same metadata – for comparison purposes.

The output:

Here's an XML file that can be indexed by Solr (it was generated via harvesting data from the Library of Congress using Python and XSL).

<add>
  <doc>
    <field name="identifier">http://hdl.loc.gov/loc.mbrsmi/amrlv.4007</field>
    <field name="title">[Theater commercial--electric refrigerators]. Buy an electric refrigerator /</field>
    <field name="creator">AFI/Kalinowski (Eugene) Collection (Library of Congress)</field>
    <field name="subject">Refrigerators.</field>
    <field name="subject">Advertising--Electric household appliances--Pennsylvania--Pittsburgh.</field>
    <field name="subject">Trade shows--Pennsylvania--Pittsburgh.</field>
    <field name="subject">Silent films.</field>
    <field name="subject">Pittsburgh (Pa.)--Manufactures.</field>
    <field name="description">Largely graphic commercial for electric refrigerators in general and a refrigerator show, presumably in Pittsburgh, in particular.</field>
  </doc>
 </add>

-----

After using the Yahoo term extraction service we can create more <field> elements.

<field name="yahooTerm">electric household appliances</field>
<field name="yahooTerm">electric refrigerators</field>
<field name="yahooTerm">electric refrigerator</field>
<field name="yahooTerm">library of congress</field>
<field name="yahooTerm">silent films</field>
<field name="yahooTerm">collection library</field>
<field name="yahooTerm">pittsburgh pa</field>
<field name="yahooTerm">pennsylvania</field>

-----

If we place those new terms into the original XML file and reindex the item, we'll have new facets to play with.

This is a *potential* solution for creating practical, useable, and consistent(?) facets for metadata harvested from different institutions that use different subject terms and internal taxonomies, etc.

I think the basic Yahoo term extractor is deprecated(?), but there are other options such as their newer Context Analysis API, OpenCalais, and AlchemyAPI.com, etc.

The script:

#####
## merge all <fields> into one string; place in "context" variable.
SolrXML = '''
<add>
  <doc>
    <field name="identifier">http://hdl.loc.gov/loc.mbrsmi/amrlv.4007</field>
    <field name="title">[Theater commercial--electric refrigerators]. Buy an electric refrigerator /</field>
    <field name="creator">AFI/Kalinowski (Eugene) Collection (Library of Congress)</field>
    <field name="subject">Refrigerators.</field>
    <field name="subject">Advertising--Electric household appliances--Pennsylvania--Pittsburgh.</field>
    <field name="subject">Trade shows--Pennsylvania--Pittsburgh.</field>
    <field name="subject">Silent films.</field>
    <field name="subject">Pittsburgh (Pa.)--Manufactures.</field>
    <field name="description">Largely graphic commercial for electric refrigerators in general and a refrigerator show, presumably in Pittsburgh, in particular.</field>
  </doc>
 </add>
'''

from lxml import etree # see: http://lxml.de/ for this library.

SolrXML_parsed = etree.XML(SolrXML)
SolrXML_combined = SolrXML_parsed.findall(".//field")
SolrXML_combined.pop(0) #remove <field name="identifier"> since we don't want
                        #a term generated from the URL; ideally this should be
                        #removed by having an attribute of "identifier" rather
                        #than by position, but this is just a test.

SolrXML_combinedList = []
for element in SolrXML_combined:
  SolrXML_combinedList.append(element.text)
context = (" ".join(SolrXML_combinedList))
#print context #test line


#####
## send XML example to Yahoo termExtraction service; print generated terms
## reference example: http://developer.yahoo.com/python/python-rest.html#post
import urllib, urllib2

url = 'http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction'
appid = 'YahooTermTest'

params = urllib.urlencode({
  'appid': appid,
  'context': context,
})

yahooResultsXML = urllib2.urlopen(url, params).read()
#print yahooResultsXML #test line

yahooResultsXML_parsed = etree.XML(yahooResultsXML)
newSolrTerms = ""
for yahooTerm in yahooResultsXML_parsed:
  newSolrTerms = newSolrTerms + "<field name=\"yahooTerm\">" + yahooTerm.text \
  + "</field>\n"
 
#####
## print what the script is trying to do and the results ...
print "Here's an XML file that can be indexed by Solr\
 (it was generated via harvesting data from the Library of Congress using Python and XSL)."
 
print SolrXML

print "-"*5 + "\n"

print "After using the Yahoo term extraction service we can create more\
 <field> elements.\n"
 
print newSolrTerms

print "-"*5 + "\n"

print "If we place those new terms into the original XML file and reindex the\
 item, we'll have new facets to play with.\n"

print "This is a *potential* solution for creating practical, useable, and\
 consistent(?) facets for metadata harvested from different institutions that use\
 different subject terms and internal taxonomies, etc.\n"

print "I think the basic Yahoo term extractor is deprecated(?), but there are\
 other options such as their newer Context Analysis API, OpenCalais, and\
 AlchemyAPI.com, etc."
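The comment in the script says the identifier field would ideally be dropped by its `name` attribute rather than by position. A minimal sketch of that, using the standard library's ElementTree instead of lxml, Python 3, and a shortened copy of the sample record:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample record used in the script above.
solr_xml = """<add>
  <doc>
    <field name="identifier">http://hdl.loc.gov/loc.mbrsmi/amrlv.4007</field>
    <field name="subject">Refrigerators.</field>
    <field name="subject">Silent films.</field>
  </doc>
</add>"""

root = ET.fromstring(solr_xml)

# Skip any field named "identifier", wherever it sits in the record.
texts = [field.text for field in root.iter("field")
         if field.get("name") != "identifier"]
context = " ".join(texts)
print(context)
```

This way the URL stays out of the text sent for term extraction even if the identifier isn't the first field.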

And here's what OpenCalais extracted as "social tags":

  • Business Finance
  • Entertainment Culture
  • Food storage
  • Food preservation
  • Home appliances
  • Pittsburgh
  • Refrigerator
--------------

Written by nitin

March 22nd, 2012 at 7:58 pm

facet mashing, a tragedy in 0.987 acts

Update, March 21, 2012: I'm at DrupalCon 2012 and after going to a session on node.js – which I've had in the back of my head as a potential replacement for Python for some metadata harvesting software I'm working on – I was reminded of OpenCalais which I haven't looked at in forever, probably because I wouldn't have understood it before. Anyway, maybe that's a solution to the issues I'm describing below in terms of generating some sort of browse-able facets. This is definitely something to look into.

Home sick again, so that means another meaningless contribution to the "blogosphere" …

So, I've been working with some folks on a project to make a single site search for digital collections across the state I work in.

We're using Solr for the index and OAI feeds for now even though the metadata harvesting software is agnostic of OAI and can support other feed types, etc. But that's not the point here …

The point is that metadata coming in from different places makes for a mess if you want to expose facets … and we might veer to not showing them because no one wants to get into the murky waters of trying to control for that across multiple places.

I think subject facets are still useful though because I like to "play around", to stumble in the dark, and just have fun.

But, of course, there's still the fact-of-the-matter that across multiple institutions you might see subjects from one place written as "Asheville, NC" and another as "Asheville, (N.C.)".

Well, that stinks. They are essentially the same thing, but would get exposed as two separate facets.

So, in the spirit of stumbling in the dark, last Saturday morning I worked on a preliminary little function in Python to try and merge strings like the Asheville example above.

The idea is that the function should present to the user the version that has more "votes", i.e. the one that has more matches in the current search results. So, if "Asheville, NC" appeared 10 times and "Asheville, (N.C.)" appeared 15 times in the user's search results, the function would display "Asheville, (N.C.)" to the user and say it has 25 matches. When the user clicks "Asheville, (N.C.)" a search would be launched for either "Asheville, (N.C.)" or "Asheville, NC". Essentially, the idea is to beautify the facets at the last possible moment (i.e. through a function in the user interface) so the user doesn't have to see the ugly reality of metadata from all over the place; it's also about rectifying things based on text similarity not on semantic similarity – which is another ballgame altogether.
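Here's a small sketch of just the "votes" part, assuming some other step has already decided which variants belong together. The `merge_variants` name and the `subject` field are assumptions for illustration; the query string uses the usual Solr/Lucene form of a field followed by OR-ed quoted phrases. Python 3.

```python
def merge_variants(variants, field="subject"):
    """variants: list of (label, count) pairs judged to be the same facet.

    Returns the label to display, the combined count, and a query string
    that matches any of the variants.
    """
    # max() returns the first maximal item, so ties go to the earlier variant.
    label, _ = max(variants, key=lambda pair: pair[1])
    total = sum(count for _, count in variants)
    clauses = " OR ".join('"%s"' % name for name, _ in variants)
    query = "%s:(%s)" % (field, clauses)
    return label, total, query

label, total, query = merge_variants(
    [("Asheville, NC", 10), ("Asheville, (N.C.)", 15)]
)
print(label, total)
print(query)
```

A real version would need to escape any double quotes inside the labels before building the query.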

The function uses some known string similarity methods. It's promising but there's still lots of work to do if I really decide to pursue this. And by "lots of work" I really mean seeing if someone with the proper computer science and linguistic background has already written a library for this kind of thing. And (adding this the day after I originally wrote this), I also need to play with s-match.

Anyway, the test code is below and the results are below that but I need to stop writing because I'm dropping out and need to take a nap.

:/

#####
def facetMasher(x,y):
  info = "Comparing \"%s\" with %s facets, against \"%s\" with %s facets." %(x[0],x[1],y[0],y[1])
  print info
 
  output = ""
 
  import Levenshtein #Windows32/Python 2.7 installer: http://sourceforge.net/projects/translate/files/python-Levenshtein/
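  #note: Levenshtein.jaro is the plain Jaro similarity, so the "Jaro-Winkler"
  #label printed below is a misnomer (the Winkler variant is
  #Levenshtein.jaro_winkler).
  lev = Levenshtein.jaro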
  myJaro = lev(x[0], y[0])
 
  lev2 = Levenshtein.distance
  myDist = lev2(x[0], y[0])
 
  print "Jaro-Winkler score: ", myJaro
  print "Levenshtein distance: ", myDist
  if myJaro > .95 or (myJaro > .75 and myDist < 10):
      if myDist > 1:
          totalFacets = x[1] + y[1]
          if (x[1] >= y[1]):
              mergedString = x[0]
          else:
              mergedString = y[0]
          output =  "Merging to \"%s\" with %s facets." %(mergedString, totalFacets)
  if output == "":
    output = "Keeping \"%s\" with %s facets, and \"%s\" with %s facets." %(x[0],x[1],y[0],y[1])

  print output
  print ("--\n")
      
##### tests ...
facetMasher (("Bibles",3),("bible",2)) #interesting ...
facetMasher (("Fibles",3),("fible",2))

facetMasher (("World War 1",3),("World War 2",2))

facetMasher (("Images",4),("image",3))
facetMasher (("Images",2),("movies",3))

facetMasher (("Asheville, NC",3),("Asheville (N.C.)",2))
facetMasher (("Asheville, (NC)",3),("Asheville (N.C.)",2))
facetMasher (("Granville County (N.C.)",120),("Granville County, N.C.",2))

facetMasher (("foo & bar",3),("foo and bar",2))

facetMasher (("United States--History--Civil War, 1861-1865",3),("United States--History--Civil War, 1861-1865--Correspondence",2))

facetMasher (("United States--History--World War II",3),("United States--History--World War I",2))
facetMasher (("United States--History--World War Two",3),("United States--History--World War 2",2))
facetMasher (("United States--History--World War Two",3),("United States--History--World War 1",2))
facetMasher (("United States--History--World War 1",3),("United States--History--World War 2",2))

And here are the results, below. It's interesting how "Bibles" vs. "bible" doesn't merge, yet "Fibles" and "fible" do. Also, there are some undesired results such as merging "United States--History--World War Two" with "United States--History--World War 1" because the algorithm still sucks.

Comparing "Bibles" with 3 facets, against "bible" with 2 facets.
Jaro-Winkler score:  0.738888888889
Levenshtein distance:  2
Keeping "Bibles" with 3 facets, and "bible" with 2 facets.
--

Comparing "Fibles" with 3 facets, against "fible" with 2 facets.
Jaro-Winkler score:  0.822222222222
Levenshtein distance:  2
Merging to "Fibles" with 5 facets.
--

Comparing "World War 1" with 3 facets, against "World War 2" with 2 facets.
Jaro-Winkler score:  0.939393939394
Levenshtein distance:  1
Keeping "World War 1" with 3 facets, and "World War 2" with 2 facets.
--

Comparing "Images" with 4 facets, against "image" with 3 facets.
Jaro-Winkler score:  0.822222222222
Levenshtein distance:  2
Merging to "Images" with 7 facets.
--

Comparing "Images" with 2 facets, against "movies" with 3 facets.
Jaro-Winkler score:  0.666666666667
Levenshtein distance:  4
Keeping "Images" with 2 facets, and "movies" with 3 facets.
--

Comparing "Asheville, NC" with 3 facets, against "Asheville (N.C.)" with 2 facets.
Jaro-Winkler score:  0.891025641026
Levenshtein distance:  5
Merging to "Asheville, NC" with 5 facets.
--

Comparing "Asheville, (NC)" with 3 facets, against "Asheville (N.C.)" with 2 facets.
Jaro-Winkler score:  0.936111111111
Levenshtein distance:  3
Merging to "Asheville, (NC)" with 5 facets.
--

Comparing "Granville County (N.C.)" with 120 facets, against "Granville County, N.C." with 2 facets.
Jaro-Winkler score:  0.955862977602
Levenshtein distance:  3
Merging to "Granville County (N.C.)" with 122 facets.
--

Comparing "foo & bar" with 3 facets, against "foo and bar" with 2 facets.
Jaro-Winkler score:  0.809553872054
Levenshtein distance:  3
Merging to "foo & bar" with 5 facets.
--

Comparing "United States--History--Civil War, 1861-1865" with 3 facets, against "United States--History--Civil War, 1861-1865--Correspondence" with 2 facets.
Jaro-Winkler score:  0.911111111111
Levenshtein distance:  16
Keeping "United States--History--Civil War, 1861-1865" with 3 facets, and "United States--History--Civil War, 1861-1865--Correspondence" with 2 facets.
--

Comparing "United States--History--World War II" with 3 facets, against "United States--History--World War I" with 2 facets.
Jaro-Winkler score:  0.990740740741
Levenshtein distance:  1
Keeping "United States--History--World War II" with 3 facets, and "United States--History--World War I" with 2 facets.
--

Comparing "United States--History--World War Two" with 3 facets, against "United States--History--World War 2" with 2 facets.
Jaro-Winkler score:  0.963449163449
Levenshtein distance:  3
Merging to "United States--History--World War Two" with 5 facets.
--

Comparing "United States--History--World War Two" with 3 facets, against "United States--History--World War 1" with 2 facets.
Jaro-Winkler score:  0.963449163449
Levenshtein distance:  3
Merging to "United States--History--World War Two" with 5 facets.
--

Comparing "United States--History--World War 1" with 3 facets, against "United States--History--World War 2" with 2 facets.
Jaro-Winkler score:  0.980952380952
Levenshtein distance:  1
Keeping "United States--History--World War 1" with 3 facets, and "United States--History--World War 2" with 2 facets.
--
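
To see where the "Bibles" vs. "Fibles" difference comes from, here's a plain-Python sketch of the textbook Jaro similarity. This is my own rendering for illustration, not the code inside python-Levenshtein, though for the three pairs tried below it gives the same scores as the output above. Python 3.

```python
def jaro(s1, s2):
    """Textbook Jaro similarity (case-sensitive), between 0.0 and 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are this close in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up out of order count as transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0

print(jaro("Bibles", "bible"))
print(jaro("Fibles", "fible"))
print(jaro("World War 1", "World War 2"))
```

Both pairs have four matching characters, since the capital first letter never matches. But in "Bibles" vs. "bible" the second "b" of "Bibles" pairs up with the first "b" of "bible", out of order with the "i", which costs a transposition and pulls the score under the 0.75 cutoff.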
--------------

Written by nitin

March 15th, 2012 at 11:57 am

pOAIndexter: grabbing and indexing online metadata

As per usual, a good bit of my computer-y stuff at home relates to something that's come up at work. And as usual, I'm pretty ignorant of what I'm getting myself into, but I don't mind.

The other week, my boss and I met with some great people at digitalnc.org and we started talking about the idea of having a super simple, lightweight approach to providing a one-stop-shop search interface for collections across the state – provided those collections expose their metadata somehow. For now, we talked about limiting this to people who do so with an OAI feed and grabbing that metadata. But eventually, this thing should be metadata agnostic – in the sense that it isn't about a metadata format, but just the data itself.

By the way, I guess "grabbing" and "feed" aren't what I typically see with OAI – about which I admittedly don't know much – but I don't care. Same difference.

Of course, there's nothing new to this. I guess one could use Blacklight or VuFind to do this kind of thing. But even though those are existing open source projects, I'm not sure that using them isn't overkill or that it won't in turn increase dependencies and maintenance overhead.

Actually, that's a topic for another time – I mean the idea that just because part of something is capable of doing what you want doesn't necessarily make it a better option than rolling one's own if using and updating said something entails more cost in the long run. Paved roads often get you there faster, but a willingness to get lost now and then is how you learn where all the really cool local bars are …

;)

Anyway, here's what I'm thinking. A small script would simply look at an XML setup file from which it would know which places to go grab metadata from, the type of feed, the last time the metadata was requested, and stuff like the resumptionToken if applicable. It would also store the appropriate XSL file to process the metadata with so that the metadata could be passed into Solr to be indexed and searchable. Anyone whose site doesn't provide metadata as XML could simply create a web service that does so, e.g. a RESTful MySQL to XML thingamajig. The outputted XML just needs to have an XSL that will facilitate passing it to Solr for that data to be part of the shared metadata store. And since XSL is the universal translator in this context, other metadata types such as RSS/ATOM feeds could be grabbed, too. All one needs to do is add to the XML config file so the script knows to retrieve metadata from that site and make sure there's an XSL file that can be used to facilitate passing the data into Solr. So in the end all this should take in terms of coding is a small script, one XML config file, and as many XSL files as needed.
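A very rough sketch of the setup-file idea is below. Every element and attribute name in the config is invented for illustration, as are the example URL and XSL file name; only the OAI-PMH request arguments (`verb`, `metadataPrefix`, `from`, `resumptionToken`) come from the protocol itself. Nothing is fetched here, the script just works out what it would request. Python 3.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical setup file: element and attribute names are invented here.
config_xml = """<sources>
  <source name="example" type="oai">
    <url>http://example.org/oai</url>
    <lastHarvest>2011-09-30</lastHarvest>
    <resumptionToken></resumptionToken>
    <xsl>oai_dc_to_solr.xsl</xsl>
  </source>
</sources>"""

def oai_request_url(source):
    """Build the next ListRecords request for one <source> entry."""
    base = source.findtext("url")
    token = source.findtext("resumptionToken")
    if token:
        # A resumptionToken is sent on its own, alongside the verb.
        params = [("verb", "ListRecords"), ("resumptionToken", token)]
    else:
        params = [("verb", "ListRecords"), ("metadataPrefix", "oai_dc")]
        last = source.findtext("lastHarvest")
        if last:
            params.append(("from", last))
    return base + "?" + urlencode(params)

for source in ET.fromstring(config_xml).iter("source"):
    print(source.get("name"), oai_request_url(source), source.findtext("xsl"))
```

Other feed types would need their own request-building function, picked by the `type` attribute.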

For fun and to start learning about Solr, I just manually grabbed some OAI metadata from CalTech yesterday – it was for some oral histories. And then I ran them through an XSL file and then posted them to Solr. Within no time I had a searchable, local metadata store to play around with (screenshot below). Since I was using all the defaults from the Solr tutorial I had to map the <dc:creator> field to things like manufacturer, since the default is set up for an electronics store.

[Screenshot: Solr screenshot]
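
I did the field mapping with XSL, but the same idea can be sketched in Python. Everything specific here is an assumption: the record is a simplified stand-in (real OAI records wrap the Dublin Core elements in more layers), the sample values are made up, and the target field names `name` and `manu` are my recollection of the Solr tutorial's example schema. Python 3, standard library only.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Assumed mapping from Dublin Core elements to Solr field names.
FIELD_MAP = {"title": "name", "creator": "manu"}

# Simplified, made-up record for illustration.
record_xml = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample oral history</dc:title>
  <dc:creator>Sample Interviewee</dc:creator>
</record>"""

def to_solr_add(xml_text):
    """Turn mapped Dublin Core elements into a Solr <add><doc> document."""
    record = ET.fromstring(xml_text)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for child in record:
        if not child.tag.startswith(DC):
            continue
        solr_name = FIELD_MAP.get(child.tag[len(DC):])
        if solr_name:
            field = ET.SubElement(doc, "field", {"name": solr_name})
            field.text = child.text
    return ET.tostring(add, encoding="unicode")

solr_doc = to_solr_add(record_xml)
print(solr_doc)
```

Elements without an entry in the mapping are simply skipped.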

BTW if we use this, at some point I won't be able to call it "pOAIndexter" but for now I can.

Since I don't know if I'll do this in Python or PHP and since OAI is what we'll work on first, I guess it stands for "Python or PHP OAI Indexer".

Yes, I'm a dork.

--------------

Written by nitin

October 2nd, 2011 at 11:20 am