Pyrex demo for baypiggies
Drew's BayPIGgies Pyrex presentation
At the 2004/11/11 BayPiggies meeting, I did some Pyrex demos.
My first demo was how to build a minimal Pyrex module that has a Python statement and a C call.
Then I talked about using Pyrex to optimize a simple Python program.
Initial Python program
#!/usr/bin/python from glob import glob import itertools class Bigrams: """original python version""" def __init__(self): self.pairs = {} def add(self,word): for i in range(len(word)-1): pair = word[i]+word[i+1] self.pairs[pair] = self.pairs.get(pair,0) + 1 def freqs(self): """get list of (pair,freq)""" return self.pairs.items() files = "/usr/include/*.h" b = Bigrams() for line in itertools.chain(*[file(x) for x in glob(files)]): for word in line.split(): b.add(word) freq_pair = [(v,k) for k,v in b.freqs()] freq_pair.sort() for fp in freq_pair[-8:]: print "%8d %s" % fpOn my box, I get this list of the most popular bigrams:
% cat /proc/cpuinfo | egrep "vendor|MHz" vendor_id : AuthenticAMD cpu MHz : 1400.081 vendor_id : AuthenticAMD cpu MHz : 1400.081 % time ./bigrams 12210 te 12486 fi 13053 nt 13267 ef 13780 er 13940 ne 17057 de 30403 in ./bigrams 4.19s user 0.07s system 92% cpu 4.604 total
Move to Pyrex
I can't wait 4 seconds for this stuff! So, I moved the Bigrams class into Pyrex and built an extension. That version runs in about 4.7sec. Pyrex translates (most) Python to C code that uses the Python API. Run times are usually unaffected. The big speedups come when you let Pyrex write some of your code in faster C by avoiding PyObjects; calling other C libs; etc.Try a Pyrex language extension
Then, I changed "for i in range(len(word)-1):" to "for i from 0 <= i < len(word)-1:" to demo one of Pyrex's bonus features. That loop syntax isn't related to translating normal Python to C; nor is it related to calling C functions. The for/from Pyrex loop uses a plain C integer as the loop index, avoiding a few Python API calls per iteration. This gets me a bit of a speedup. With the Pyrex int loop, the program takes 3.7sec.
Rewrite the algorithm
Now I am ready to tinker with the algorithm. I rewrite the Bigrams class in Pyrex, using a fixed-size C buffer to store the frequency counts. This removes the Python dict access, and I am also now looping over characters in a C string.cdef class Bigrams4: """pack results into a fixed array like a C programmer would""" cdef int pairs[65536] def __init__(self): for x in range(65536): self.pairs[x]=0 def add(self,word): cdef char *cword cdef unsigned short pairidx cword = word for i from 0 <= i < len(word)-1: pairidx = (cword[i]<<8) + cword[i+1] self.pairs[pairidx] = self.pairs[pairidx] + 1 def freqs(self): """get list of (pair,freq)""" letter_pairs = [] for x in range(65536): if self.pairs[x]>0: letter_pairs.append(("%s%s" % (chr(x >> 8), chr(x & 0xff)), self.pairs[x])) return letter_pairsNote 'cdef class', which means this is a new C type. Pyrex writes this C struct for the new type:
struct __pyx_obj_6bigram_Bigrams4 { PyObject_HEAD int (pairs[65536]); };My 'self.pairs' code, formerly an attribute lookup, is now a C struct member access:
Pyrex source: | self.pairs[pairidx] = self.pairs[pairidx] + 1 |
C emitted from the Pyrex compiler: |
/* "/home/drewp/projects/digrams/bigram.pyx":51 */ (((struct __pyx_obj_6bigram_Bigrams4 *)__pyx_v_self)->pairs[__pyx_v_pairidx]) = ((((struct __pyx_obj_6bigram_Bigrams4 *)__pyx_v_self)->pairs[__pyx_v_pairidx]) + 1); |
This runs in 1.2 sec, which I felt was good enough for my demo. Note that my rewritten algorithm could also be ported to plain Python. That version takes 3.7 sec-- the C datatypes made the big difference here.
Swig version
I also did a swig version which runs in .68 sec. It's probably faster because it doesn't use a class at all, because the final packing is faster, or because it doesn't contain the error checks that Pyrex includes. I haven't investigated the exact reason.Source
Get bigrams.tgz and run "python setup.py build_ext --inplace" to build the Pyrex extension (you'll need Pyrex installed). Run "make" to build the swig extension (you'll need swig).Then, try "time bigrams" after setting line 29 to the class version you want to run. Edit line 25 to adjust which files that are scanned. The file "bigrams5" contains the fixed-array version written in Python, which I didn't want to have cluttering the original bigrams program.
These source files also contain a demo of calling uuid_generate from libuuid. Remove all the uuid code if you're on a platform that doesn't have that lib, I guess.