blog.humaneguitarist.org

discoveries in digital audio, music notation, and information encoding

Archive for the ‘visualization’ tag

North Carolina grants, Google App Engine, and pie … mmm.


I took April off from blogging after realizing I was over blogging, as opposed to over logging.

I'll keep this short. Well, I'll try.

I'm shacked up in the apartment due to some unexpected circumstances and yesterday I decided to try and be a little productive and learn something I could potentially use in the workplace.

I learned a little about Google App Engine. I was drawn to it because of the Python support and because it gives me a free environment where I can deploy Python apps using the ever-elusive lxml library.

While I wrote some silly stuff using lxml and data available from the Business.gov API, I ended up uploading a simple app – if you can call it that – that parses a CSV file from North Carolina's (USA) NCOpenBook.

I didn't use the csv module because the CSV file I used has like three lines at the top that aren't headers (people: don't do that!). I don't know if there's a way to handle that with the csv module (there probably is) but I wasn't interested in digging around. Instead, I used a modified version of this code I wrote previously.
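For what it's worth, the csv module can cope with extra lines at the top if they are skipped before the reader sees them. Here is a minimal sketch, assuming exactly three non-header lines as described above. The sample text and the `read_rows` helper are made up for illustration and are not part of the app.

```python
import csv
import io
from itertools import islice

# Hypothetical sample mimicking the layout described above: three
# non-header lines, then the header row, then the data rows.
SAMPLE = (
    "Report title\n"
    "Generated for illustration only\n"
    "\n"
    '"Non-Profit Name (*)","Cumulative Total Award"\n'
    '"Example Org A","$1,500.00"\n'
    '"Example Org B","$250.00"\n'
)


def read_rows(text, skip=3):
    """Return the data rows as dicts, ignoring the first `skip` lines."""
    lines = io.StringIO(text)
    # islice drops the preamble; DictReader then treats the next
    # line it receives as the header row.
    reader = csv.DictReader(islice(lines, skip, None))
    return list(reader)


rows = read_rows(SAMPLE)
print(rows[0]["Non-Profit Name (*)"])
```

Because the csv module understands quoted fields, the comma inside "$1,500.00" does not split the cell and the header keys come back without stray quote characters.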

The CSV file lists grantees who've received funding from North Carolina and the app pulls out the top ten since 2007 based on cumulative grant totals. The app uses Google Chart Tools to make a pie chart of the top ten recipients. I'm not so sure about the colors in the pie chart – it's hard to see the difference between some of the colors associated with each grantee – but it's a simple start.
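The code in this post simply keeps the first ten data rows, so it relies on the file already being ordered by cumulative total. If that ordering could not be assumed, the top ten could be picked explicitly. Here is a rough sketch of that idea. The `parse_dollars` and `top_n` helpers and the figures are hypothetical and not taken from the app.

```python
import heapq
from decimal import Decimal


def parse_dollars(text):
    """Convert a string such as '$1,234.50' to a Decimal."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def top_n(grantees, n=10):
    """Return the n (name, total) pairs with the largest totals."""
    totals = [(name, parse_dollars(total)) for name, total in grantees]
    return heapq.nlargest(n, totals, key=lambda pair: pair[1])


# Hypothetical figures, for illustration only.
sample = [
    ("Example Org A", "$1,500.00"),
    ("Example Org B", "$250.00"),
    ("Example Org C", "$12,000.75"),
]
top_two = top_n(sample, n=2)
print(top_two)
```

Decimal is used here instead of float so that dollar amounts are not subject to binary rounding.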

Here's a screenshot:

Top Ten NC Grants by Grantee

... and here's the link to the app online: http://top-ten-nc-totals-by-grantee.appspot.com.

I've also pasted the app.yaml file, my Python code, and the Jinja/HTML template below if anyone's interested.

YAML:

application: top-ten-nc-totals-by-grantee
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /stylesheets
  static_dir: stylesheets
- url: /.*
  script: nctotals.app
 
libraries:
- name: jinja2
  version: latest

Python:

#import modules
import urllib
import webapp2

import jinja2
import os
     
jinja_environment = jinja2.Environment(
  loader=jinja2.FileSystemLoader(os.path.dirname(__file__)))

#####

#see: http://stackoverflow.com/a/2827664
class Object(object):
  pass

#my CSV parser
def csv2dict(fileName, delimiter):
  f = urllib.urlopen(fileName) #open file
  lines = f.read() #read file

  rows = lines.split("\n") #put lines in list

  #cut out the three non-header rows at top of this particular CSV file
  rows = rows[3:]

  #keep only the header row plus the first 10 data rows (there were too many damn rows in the CSV file!)
  rows = rows[:11]

  headers = rows[0].split(delimiter) #put header titles in list
  rows.pop(0) #remove header from "rows" list

  i = 0
  worksheet = {}
  for header in headers: #for each header, i.e. each column
    columnCells = []
    #print header #test line
    for row in rows: #for each non-header row in delimited file
      if row != "": #!!!you need to also add a test for lines that don't split on the delimiter (i.e. notes)
        rowCells = row.split(delimiter) #get cells in row
        columnCells.append(rowCells[i].strip()) #put column's cells in list
    worksheet[header] = columnCells #set header as KEY and set "columnCells" list as VALUE
    i = i + 1
 
  return worksheet

#####

class MainPage(webapp2.RequestHandler):
  def get(self):
    parsed = csv2dict("http://data.osbm.state.nc.us/openbook/comma_grant_cumulative_awards_and_annual_disbursements_by_grantee.csv", '","') #pass filename and delimiter
    
    topTen = range(0,len(parsed['"Non-Profit Name (*)'])) #i.e. range is 1 to 10, or 0 to 9 depending on your p.o.v.

    for i in topTen: #add attributes to each of the ten agencies in the CSV file
      topTen[i] = Object()
      topTen[i].name = parsed['"Non-Profit Name (*)'][i].replace('"','')
      topTen[i].total = parsed['Cumulative Total Award'][i]
      raw_total = parsed['Cumulative Total Award'][i]
      raw_total = raw_total.replace('$','')
      raw_total = raw_total.replace(',','')
      topTen[i].raw_total = raw_total
      
    #data for the Jinja template  
    template_values = {
      'topTen': topTen}

    template = jinja_environment.get_template('index.html')
    self.response.out.write(template.render(template_values)) #write data to the index.html template
  
app = webapp2.WSGIApplication([('/', MainPage)],debug=True)

Template:

<!DOCTYPE HTML>
<html>
  <head>
    <title>
      Top Ten NC Grants by Grantee (since 2007)
    </title>
    <link type="text/css" rel="stylesheet" href="/stylesheets/style.css" />
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>
    <script type="text/javascript">
      google.load('visualization', '1', {packages: ['imagepiechart']});
    </script>
    <script type="text/javascript">
      function drawVisualization() {
        // Create and populate the data table.
        var data = new google.visualization.DataTable();
        data.addColumn('string', 'name');
        data.addColumn('number', 'raw_total');
        data.addRows([
          {% for topper in topTen %}
          ["{{ topper.name }} - {{ topper.total }}", {{ topper.raw_total }}],
          {% endfor %}
        ]);
    
        // Create and draw the visualization.
        new google.visualization.ImagePieChart(document.getElementById('visualization')).
          draw(data, null);
      }
      google.setOnLoadCallback(drawVisualization);
    </script>
  </head>
  <body>
    <h3>Top Ten <a href="http://www.ncopenbook.gov/NCOpenBook/GrantsHome.jsp">NC Grants</a> by Grantee (cumulative totals since 2007)</h3>
    <p>see the source CSV file <a href="http://data.osbm.state.nc.us/openbook/comma_grant_cumulative_awards_and_annual_disbursements_by_grantee.csv">here</a></p>
    <div id="visualization"></div>
    <p>Made with:</p>
    <ul>
      <li><a href="https://developers.google.com/appengine/docs/python/gettingstartedpython27/">Google App Engine (Python 2.7)</a></li>
      <li><a href="https://developers.google.com/chart/">Google Chart Tools</a></li>
    </ul>
    <p>More info (blog post):</p>
    <ul>
      <li><a href="http://blog.humaneguitarist.org/2012/05/01/north-carolina-grants-google-app-engine-and-pie-mmm/">North Carolina grants, Google App Engine, and pie ... mmm.</a></li>
    </ul>
  </body>
</html>
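The handler strips double quotes from each name before it reaches the template, which keeps the JavaScript string literals intact. A different approach would be to build the rows in Python and serialize them with json.dumps, which escapes the labels and avoids the trailing comma after the last row. This is only a sketch. The `chart_rows` helper and the sample data are hypothetical.

```python
import json


def chart_rows(grantees):
    """Build the list passed to data.addRows() as a JSON string.

    `grantees` is an iterable of (name, display_total, raw_total) tuples.
    """
    rows = [
        ["%s - %s" % (name, display_total), float(raw_total)]
        for name, display_total, raw_total in grantees
    ]
    # json.dumps escapes quotes and backslashes inside the labels and
    # emits no trailing comma after the last row.
    return json.dumps(rows)


# Hypothetical data, for illustration only.
rows_json = chart_rows([('Org "Quoted" Name', "$1,500.00", "1500.00")])
print(rows_json)
```

The template loop could then be replaced by a single `data.addRows({{ rows_json }});` line, assuming autoescaping stays off as it is in the Jinja environment above.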
--------------


Written by nitin

May 1st, 2012 at 10:42 am

making a DOT graph for PHP include statements


A couple of months ago, I posted about my experience with making a Python dependency graph.

Of course, as the post states, I was originally looking for a way to make a graph showing the relationship among PHP files in regard to "include" statements.

Well, I'm home sick and after a few hours of trying to find an easy, out-of-box solution I gave up and rolled my own Python script to make me a DOT graph file.

I didn't have anything better to do.

:(

The results are pretty simplistic, but I'm happy enough with it for now.

The Python script takes three arguments: the directory in which the PHP files exist, whether to search recursively or not (0=no, 1=yes), and the name of the output file as such:

$ python makeDOT.py blog/wordpress 1 wordpressIncludes.dot

#####
#importing modules
import glob, re, sys, os, fnmatch
br = "\n"
tab = "\t"


#####
#exiting if all 3 arguments are not passed via command line
def fail():
    print ("ERROR: " + str(len(sys.argv)-1) + " of 3 required arguments provided.")
    sys.exit()


#####
#getting arguments passed via command line
   
#testing for root DIRECTORY string
try: myDir = sys.argv[1]
except: fail()

#testing for RECURSION boolean
try: myRec = sys.argv[2]
except: fail()

#testing for OUTPUT filename string
try: myFile = sys.argv[3]
except: fail()


#####
#making list of PHP files within DIRECTORY
if myRec == "0": #without recursion
    myDir2 = myDir + "/*.php"
    PHP_list = glob.glob(myDir2)
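elif myRec == "1": #with recursion
    PHP_list = []
    for dirname, dirnames, filenames in os.walk(myDir):
        for filename in filenames:
            if fnmatch.fnmatch(filename, "*.php"):
                match = os.path.join(dirname,filename)
                PHP_list.append(match)
else: #any other value would leave PHP_list undefined, so stop here
    print ("ERROR: recursion argument must be 0 or 1.")
    sys.exit()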

#make an empty list;
#tuples will go in the list;
#each tuple will contain a PHP filename and a PHP filename it includes
includeList = []

#iterate through each PHP file and place tuples in the list
for phpFile in PHP_list:
    with open(phpFile, "r") as fileOpen: #file is closed automatically
        #for each line in a PHP file
        for line in fileOpen:
            #first pass: include(), include_once(); second pass: require(), require_once()
            for keyword in ("include", "require"):
                m = re.match(r"(.*)" + keyword + r"(.*\()(.*)\)", line)
                if m:
                    matchFile = m.group(3)[1:-1] #drop the first and last characters (the quotes)
                    if matchFile[-4:] == ".php": #only PHP files
                        source = phpFile.replace("\\","/")
                        matchFile = matchFile.replace("\\","/")
                        matchFile = matchFile.replace("\"","")
                        matchFile = matchFile.replace('\'',"")
                        includeList.append([source[len(myDir)+1:], matchFile])


#####
#creating DOT file
dot = open(myFile, "w")

#writing to DOT file
dot.write("digraph {" + br)
for a,b in includeList:
    dot.write(tab)
    dot.write("\"")
    dot.write(a)
    dot.write("\"")
    dot.write(" -> ")
    dot.write("\"")
    dot.write(b)
    dot.write("\"")
    dot.write(";")
    dot.write(br)
dot.write("}")
dot.close()


#####
#exiting
sys.exit()
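The script above only catches includes written with parentheses and tests each line twice. As an alternative sketch, not the approach used in the script, a single pattern can cover include, include_once, require and require_once with or without parentheses. The `INCLUDE_RE`, `find_includes` and `to_dot` names are made up for illustration. Paths built from variables or string concatenation are still not handled.

```python
import re

# Matches include, include_once, require and require_once, with or
# without parentheses, followed by a quoted path ending in .php.
INCLUDE_RE = re.compile(
    r"""\b(?:include|require)(?:_once)?\s*\(?\s*['"]([^'"]+\.php)['"]"""
)


def find_includes(php_source):
    """Return the quoted .php paths named in include/require statements."""
    return INCLUDE_RE.findall(php_source)


def to_dot(edges):
    """Render (source, target) pairs as the text of a DOT digraph."""
    lines = ["digraph {"]
    for src, dst in edges:
        lines.append('\t"%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical PHP source, for illustration only.
sample = """<?php
include('header.php');
require_once "lib/db.php";
include_once( 'inc/util.php' );
echo "done";
"""

found = find_includes(sample)
print(to_dot([("index.php", target) for target in found]))
```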

I ran the Python script on the PHP scripts for MXMLiszt.

Then I used the "circo" layout engine in Graphviz – specifically the Gvedit.exe application – on this resultant DOT file.
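As an aside, the same layout engine is also available as a command-line program, so the rendering step could be scripted instead of done through the GUI. Here is a small sketch, assuming Graphviz's `circo` executable is on the PATH. The `circo_command` and `render` helpers and the output filename are hypothetical.

```python
import shutil
import subprocess


def circo_command(dot_path, png_path):
    """Build the argument list for rendering a DOT file with circo."""
    return ["circo", "-Tpng", dot_path, "-o", png_path]


def render(dot_path, png_path):
    """Run circo if it is on PATH; return True when it was invoked."""
    if shutil.which("circo") is None:
        return False  # Graphviz not installed or not on PATH
    subprocess.check_call(circo_command(dot_path, png_path))
    return True


cmd = circo_command("wordpressIncludes.dot", "wordpressIncludes.png")
print(" ".join(cmd))
```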

Here's the result:


--------------


Written by nitin

July 30th, 2011 at 1:03 pm

making my first dependency graph


I wanted a quick, easy way to generate a dependency graph for PHP include statements, but of course I actually did my Google searches for a Python dependency visualizer. The PHP can wait …

Anyway, I found this cool page that had some scripts that can make dependency graphs using Python import statements.

In a nutshell, here's what I did.

  1. Made a folder on my Desktop (I have Windows 7) called "python_visualizer".
  2. Downloaded py2depgraph.py and depgraph2dot.py to the folder.
  3. Downloaded and installed Graphviz.
    • On my system it installed to:
      • C:\Program Files\Graphviz2.26.3
  4. Put a copy of one of my Python scripts (renamed to "foo.py") in the "python_visualizer" folder.
  5. Opened the command line and did this:

$ python_visualizer>python py2depgraph.py foo.py | python depgraph2dot.py | "C:\Program Files\Graphviz2.26.3\bin\dot" -T png -o depgraph.png

Now, I did get a little warning message, shown below:

(dot.exe:7244): Pango-WARNING **: couldn't load font "Helvetica Not-Rotated 10", falling back to "Sans Not-Rotated 10", expect ugly output.

but it's no big deal, the PNG still got made … and I don't think it's that ugly!

But you can judge for yourself …

--------------


Written by nitin

May 7th, 2011 at 8:53 am
