blog.humaneguitarist.org

PubMed2XL 0.9.1 available

[Sun, 03 Apr 2011 22:40:17 +0000]
I've uploaded a new version of PubMed2XL, a Windows application that converts article lists from pubmed.gov [http://pubmed.gov/] into Microsoft Excel files. If you'd like to use the software you can download it for free.

For those who are interested, here's the changelog:

0.9.1 - worked with Björn Carlsson on a few things:
- added length checker for <getElement> so that abstracts greater than 32k characters would get truncated to the first 30k characters.
  - see: http://blog.humaneguitarist.org/2011/03/16/dealing-with-a-pubmed2xl-bug/
- added <getAttributeByElementPosition> element.
  - updated schema.
- removed code that displayed the "aboutMessage" variable on the command line if command line options are used.
  - This is because the diacritic in Mr. Carlsson's name caused encoding errors with the default Windows command prompt.
- added <hyperlinkSuffix> element so that alternate views of PubMed data could be passed via the URL.
  - updated schema.
  - For example, see this: http://www.ncbi.nlm.nih.gov/pubmed/21069543 then this: http://www.ncbi.nlm.nih.gov/pubmed/21069543?report=medline
  - The hyperlink suffix of ?report=medline changes the display!
  - For more information, see:
    - PubMed Help, NCBI Bookshelf. Retrieved November 13, 2010, from http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helppubmed&part=pubmedhelp&rendertype=table&id=pubmedhelp.T40
    - pm_workbook.pdf. Retrieved November 13, 2010, from http://www.nlm.nih.gov/pubs/manuals/pm_workbook.pdf (see page 135).
- updated py2exe "setup.py" to automatically name the command line/console version correctly (i.e. with the "-CL" suffix).
- removed "src" folder and placed Python files in same folder as .exe's.
_______________________________________________________________________
0.9.0 - this was the first version - that worked!
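Two of the changelog items, the length checker and the hyperlink suffix, are easy to picture in a few lines of Python. What follows is only a rough sketch and not PubMed2XL's actual code: the function names `fit_to_cell` and `build_hyperlink` and the constant names are made up for illustration, and the exact threshold of 32,767 characters (Excel's documented per-cell limit) is an assumption standing in for the changelog's "32k".

```python
# Illustrative sketch only; names and constants are assumptions,
# not taken from the PubMed2XL source.

EXCEL_CELL_LIMIT = 32767  # Excel's maximum characters per cell (assumed to be the "32k" in the changelog)
TRUNCATE_TO = 30000       # "the first 30k characters", per the changelog

PUBMED_BASE_URL = "http://www.ncbi.nlm.nih.gov/pubmed/"


def fit_to_cell(text, limit=EXCEL_CELL_LIMIT, keep=TRUNCATE_TO):
    """Return text unchanged if it fits in a cell, else only its first `keep` characters."""
    if len(text) > limit:
        return text[:keep]
    return text


def build_hyperlink(pmid, suffix=""):
    """Build a PubMed record URL, optionally appending a suffix such as '?report=medline'."""
    return PUBMED_BASE_URL + str(pmid) + suffix


if __name__ == "__main__":
    print(len(fit_to_cell("x" * 40000)))
    print(build_hyperlink(21069543, "?report=medline"))
```

Truncating to a value well under the limit, rather than right at it, leaves some headroom in case the text grows slightly when it is written out.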

COMMENTS

  1. Ilidio [2011-06-04 14:58:03]

    Hi, it works! Thanks,

  2. nitin [2011-06-03 22:27:46]

    Here's the download: http://blog.humaneguitarist.org/uploads/PubMed2XL/moreStyles/version_0.9x/ilidio_060311.xml

    You can download this and place it anywhere on your computer. I would recommend putting it in this folder: PubMed2XL-0.9.2styles

    Here's what to do:
    1) When you start PubMed2XL, go to TOOLS>SELECT STYLESHEET.
    2) Click on the file you downloaded.
    3) Then run PubMed2XL on your PubMed download. Note: in cases where the Abstract is all in one XML element, the results in that cell will be identical to the column called "Abstract - Background". Also, I assumed that the order is always Background, Methods, Results, Conclusion and that there are only 4.
    4) Let me know if this does/doesn't work. :-)

    The new/upcoming version of PubMed2XL should let you do almost anything you want for more complex data retrieval like MeSH terms, etc., as it will allow the user to take advantage of XSLT (http://www.w3schools.com/xsl/). I have this working, but currently the new version isn't backwards compatible with the current version's (0.9.1) stylesheets. So most of my remaining work is carefully copying and pasting some code from the current version into the new one.
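The positional assumption described in the note above (Background, Methods, Results, Conclusion, in that order, and only four) can be sketched in Python. This is not the stylesheet's own logic, just an illustration using the standard library's `xml.etree.ElementTree`. The sample XML text, the function name `abstract_columns`, and every column name other than "Abstract - Background" are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of a PubMed structured abstract.
SAMPLE = """<Abstract>
  <AbstractText Label="BACKGROUND">Why the study was done.</AbstractText>
  <AbstractText Label="METHODS">How it was done.</AbstractText>
  <AbstractText Label="RESULTS">What was found.</AbstractText>
  <AbstractText Label="CONCLUSIONS">What it means.</AbstractText>
</Abstract>"""

# Assumed fixed order, mirroring the assumption stated in the comment above.
SECTION_NAMES = ["Background", "Methods", "Results", "Conclusion"]


def abstract_columns(abstract_xml):
    """Map AbstractText elements to columns by position, not by their Label."""
    root = ET.fromstring(abstract_xml)
    # itertext() also picks up text nested in inline markup such as <i>.
    sections = ["".join(el.itertext()).strip() for el in root.findall("AbstractText")]
    row = {}
    for position, name in enumerate(SECTION_NAMES):
        # Missing sections become empty cells; a fifth section would be ignored.
        row["Abstract - " + name] = sections[position] if position < len(sections) else ""
    return row


if __name__ == "__main__":
    for column, value in abstract_columns(SAMPLE).items():
        print(column, "=>", value)
```

With an abstract held in a single element, everything lands in "Abstract - Background" and the other three columns stay empty. Reading by position is simple but fragile: a record whose sections come in a different order, or that has more than four, would be mapped to the wrong columns or cut short.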

  3. Ilidio [2011-06-03 18:06:00]

    Hi, I find your software very useful. I have searched thoroughly and I think this is currently the best tool to import PubMed results. Thank you!

    You can find an example of the case discussed in the previous post here: http://www.ncbi.nlm.nih.gov/pubmed/21346227 I imported it with PubMed2XL and can only get the first (background) section. There are other examples – I think that every abstract with several sections will have this issue, but I have not tested it on a large number of records. It would be great if you could post a stylesheet that could import all these sections – assuming that the issue is reproducible.

    The new feature (concatenation of data from several XML subfields) will significantly improve the user experience with PubMed2XL.

    Btw, it would be good to have a stylesheet that imports PubMed codes for each record – for example: publication types and MeSH terms. As a user, I could try to create a stylesheet to do this (I will try in my free time), but if this is something that several users would like to have, maybe it could be part of future packages?

    Thanks again for a great piece of software. Ilidio.

  4. nitin [2011-06-03 17:34:15]

    Thanks for the feedback. It's nice to know people are actually using the software. Can you send me an example of a PubMed record for which this is the case? Unless I've completely overlooked something, there should be a way to get the values you want in separate columns. I can make a PubMed2XL "stylesheet" that you can use to get the data in the columns you want and post it here for you to download.

    ps: the upcoming version (I'll post it in maybe a month, hopefully less) will allow you to concatenate values without having to do it in Excel - for example outputting all authors in the same cell, etc. Thanks Ilidio!

  5. Ilidio [2011-06-03 16:53:37]

    Hi, I noticed that recent versions now allow importing multiple authors (up to 3 by default). This is great! I wonder if something similar could be done for abstracts. Abstract text in some PubMed records is now divided into 3-4 XML sub-fields (intro/background, methods, results, conclusion). If PubMed2XL can import all of these fields separately, we could easily merge them in Excel. PubMed2XL is a great tool! Thanks,