pretty printing XML with Python, lxml, and XSLT

Last week or so I was doing some work with Python and lxml. And, like a lot of people it seems, I found that lxml's pretty printing wasn't really doing anything for me.
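For anyone who hasn't run into this: as far as I can tell, the usual cause is whitespace that is already sitting between the elements. Here's a minimal sketch of the symptom, assuming Python 3. The `clean` and `spaced` sample documents are made up for illustration.

```python
from lxml import etree

# No whitespace between elements: pretty_print is free to add indentation.
clean = etree.XML('<a><b><c/></b></a>')
clean_out = etree.tostring(clean, pretty_print=True, encoding='unicode')
print(clean_out)

# Existing line breaks are kept as text nodes, so the serializer leaves
# the layout alone rather than risk changing the data.
spaced = etree.XML('<a>\n<b><c/></b>\n</a>')
spaced_out = etree.tostring(spaced, pretty_print=True, encoding='unicode')
print(spaced_out)
```

The first document comes out indented, while the second keeps the layout it came in with.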

I couldn't find any native lxml solutions to make my XML look pretty. All I found were some functions on various code sites written by people to pretty print the XML using a bunch of regular expressions. Yuck.

So I thought, "Why not use XSLT to pretty print my XML?" and I found an XSL written by none other than Michael Kay on this page (see comment #4).

And it seems to work just fine as a function to return pretty XML, not to mention it's super short and sweet.

Anyway, here's an example of using the XSL for pretty printing.

from lxml import etree

def prettify(someXML):
  # someXML is an already parsed lxml element or tree;
  # the return value is the indented XML as a string
  #for more on lxml/XSLT see: http://lxml.de/xpathxslt.html#xslt-result-objects
  xslt_tree = etree.XML('''\
    <!-- XSLT taken from Comment 4 by Michael Kay found here:
    http://www.dpawson.co.uk/xsl/sect2/pretty.html#d8621e19 -->
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
      <xsl:strip-space elements="*"/>
      <xsl:template match="/">
        <xsl:copy-of select="."/>
      </xsl:template>
    </xsl:stylesheet>''')
  transform = etree.XSLT(xslt_tree)
  result = transform(someXML)
  # str() on an XSLT result serializes it according to xsl:output
  return str(result)

myXML = etree.XML('<a><b><c><d/></c></b></a>')
print(prettify(myXML))

The example above would output the following:

>>>
<?xml version="1.0"?>
<a>
  <b>
    <c>
      <d/>
    </c>
  </b>
</a>
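
The `xsl:strip-space` line is what lets this cope with input that already carries its own line breaks and stray indentation. Here's a rough sketch, assuming Python 3. The stylesheet is restated in compact form so the snippet runs on its own, and `STYLESHEET`, `PRETTY`, and `messy` are names made up for this example.

```python
from lxml import etree

# Same stylesheet as above, compiled once so it can be reused.
STYLESHEET = etree.XML('''\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>''')
PRETTY = etree.XSLT(STYLESHEET)

# Unevenly formatted input with whitespace-only text between the elements.
messy = etree.XML('<a>\n<b>   <c/>\n\n</b></a>')
pretty = str(PRETTY(messy))
print(pretty)
```

Compiling the transform once at module level also avoids rebuilding the stylesheet on every call, which is worth it if you're prettifying a lot of documents.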

By the way, I don't even need to see the XML I'm processing most of the time, so why all the pretty printing fuss?

Well, because it bothers me …

And all good XML should look like an X-wing starfighter. If it doesn't, you're probably doing something wrong or your schema just sucks.

It isn't called an X-wing for no reason.

:P
