Wednesday, 17 April 2013

one does not simply pickle a tree

Ahrrr... It took me 5 hours to figure it out.

Python's multiprocessing.Queue has to pickle every object you put on it so that it can be passed to another process (on this Python 2.6 it uses cPickle for speed). But if something cannot be pickled when you call Queue.put, all you see is a cryptic error message:


File "/opt/dirac/pro/Linux_x86_64_glibc-2.5/lib/python2.6/multiprocessing/queues.py", line 242, in _feed
    send(obj)
TypeError: expected string or Unicode object, NoneType found

Ugh! Not helping at all, but what to do except start debugging? Taking advantage of the opportunity, I also fixed some problems here and there (unpicklable locks or instance methods in several classes), but at the very, very end I got this:

pickle.PicklingError: Can't pickle : it's not found as __main__.copyelement

Huh? What the heck? Where is this one? My find and grep fellows stayed completely silent, so I started to dump symbols from the *.so files under PYTHONPATH, and guess what? I HAVE FOUND IT!

[volhcb13] /opt/dirac/pro > nm Linux_x86_64_glibc-2.5/lib/python2.6/lib-dynload/_elementtree.so | grep copyelement
0000000000209dd8 b elementtree_copyelement_obj




This little bastard was hiding in cElementTree! I've swapped it for its pure-Python twin (ElementTree). Ta-daam! Queue.put is working again. The slower parser is not a big deal for me, since the XML fragments I'm dealing with are rather small.
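For the record, the swap is a one-line import change. Here is a minimal sketch of it, not my actual code: the XML fragment is made up, and a plain pickle round trip stands in for what Queue.put has to do with the object before it can cross the process boundary.

```python
import pickle

# Was: import xml.etree.cElementTree as ET
import xml.etree.ElementTree as ET

# Hypothetical XML fragment, for illustration only.
fragment = ET.fromstring('<job id="42"><status>done</status></job>')

# Queue.put must pickle the object, so do the same thing by hand.
payload = pickle.dumps(fragment, pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

# The element survives the round trip with its attributes and children.
print("%s %s %s" % (restored.tag, restored.get("id"), restored.findtext("status")))
```

If the round trip works here, the same element should also go through Queue.put without complaints.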

5 hours! Five! FIVE HOURS lost...

So, dear children, please remember: when going multiprocessing, forget about cElementTree (and perhaps several other modules rewritten in C for speed) and use the pure-Python implementations.
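In hindsight, a cheap way to catch such surprises early is to try pickling a suspect object by hand before it ever reaches Queue.put, so the error comes from your own call and not from the queue's feeder thread. A rough sketch, assuming nothing beyond the standard library; check_picklable is a name I made up for this post, not part of multiprocessing:

```python
import pickle
import threading


def check_picklable(obj):
    """Return None if obj pickles cleanly, otherwise a short error string."""
    try:
        pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    except Exception as exc:
        # Pickling failures show up as different exception types
        # (PicklingError, TypeError, ...), so catch broadly and report.
        return "%s: %s" % (type(exc).__name__, exc)
    return None


# A plain dict of strings pickles fine.
print(check_picklable({"xml": "<a/>"}))

# A lock does not, which is one of the things I had to fix along the way.
print(check_picklable(threading.Lock()))
```

Running this over the attributes of an object that refuses to go through the queue should point to the culprit faster than staring at the traceback.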