Python running out of memory parsing XML using cElementTree.iterparse


A simplified version of my XML parsing function is here:

    import xml.etree.cElementTree as et

    def analyze(xml):
        it = et.iterparse(file(xml))
        count = 0
        for (ev, el) in it:
            count += 1
        print('count: {0}'.format(count))

This causes Python to run out of memory, which doesn't make a whole lot of sense. The only thing I'm storing is the count, an integer. Why is it doing this:

[Graph: memory and CPU usage over time while parsing, ending in a sudden drop]

See the sudden drop in memory and CPU usage at the end? That's Python crashing spectacularly. At least it gives me a MemoryError (depending on what else I'm doing in the loop, it gives me more random errors, like an IndexError) and a stack trace instead of a segfault. But why is it crashing?

The documentation tells you "parses an XML section into an element tree [my emphasis] incrementally", but it doesn't cover how to avoid retaining the elements you aren't interested in (which may be most of them). That is covered in this article by the effbot.
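
A rough sketch of the pattern described there: grab a reference to the root element, then clear elements (and the root's references to them) as you go. The tag name 'record' is a placeholder; substitute whatever elements your file actually contains.

    import xml.etree.cElementTree as et

    def count_records(xml_file):
        # Ask for 'start' events as well so we can grab a reference to the root.
        context = iter(et.iterparse(xml_file, events=('start', 'end')))
        event, root = next(context)   # first event is the start of the root element

        count = 0
        for event, elem in context:
            # 'record' is a hypothetical target tag, used here only for illustration
            if event == 'end' and elem.tag == 'record':
                count += 1
                elem.clear()    # discard this element's children, text and attributes
                root.clear()    # drop the references the root keeps to finished children
        print('count: {0}'.format(count))

Without the clear() calls, iterparse still builds the whole tree behind your back, which is why the simple counting loop above eventually exhausts memory.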

I recommend that anybody using .iterparse() should read this article by Liza Daly. It covers both lxml and [c]ElementTree.
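
The lxml variant of her fast_iter pattern looks roughly like the sketch below. It is not a drop-in solution: it assumes lxml is installed and again uses a hypothetical 'record' tag.

    from lxml import etree

    def fast_iter(context, func):
        # Call func on each matched element, then release the element and any
        # already-processed siblings so the tree never grows without bound.
        for event, elem in context:
            func(elem)
            elem.clear()
            while elem.getprevious() is not None:   # lxml-only API
                del elem.getparent()[0]
        del context

    def count_records(xml_file):
        counts = [0]
        def bump(elem):
            counts[0] += 1
        # lxml's iterparse can filter by tag; 'record' is again a placeholder.
        context = etree.iterparse(xml_file, events=('end',), tag='record')
        fast_iter(context, bump)
        print('count: {0}'.format(counts[0]))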

Previous coverage on SO:

Using Python iterparse for large XML files
Can Python xml ElementTree parse a very large XML file?
What is the fastest way to parse large XML docs in Python?

