Python running out of memory parsing XML using cElementTree.iterparse
A simplified version of my XML parsing function:

    import xml.etree.cElementTree as et

    def analyze(xml):
        it = et.iterparse(file(xml))
        count = 0
        for (ev, el) in it:
            count += 1
        print('count: {0}'.format(count))
This causes Python to run out of memory, which doesn't make a whole lot of sense. The only thing I'm storing is the count, an integer. Why is it doing this:

See the sudden drop in memory and CPU usage at the end? That's Python crashing spectacularly. At least it gives me a MemoryError (depending on what else I do in the loop, it gives me more random errors, like an IndexError) and a stack trace instead of a segfault. Why is it crashing?
The documentation tells you that it "parses an XML section into an element tree [my emphasis] incrementally", but doesn't cover how to avoid retaining the uninteresting elements (which may be most of them). That is covered in this article by the effbot.

I strongly recommend that anybody using .iterparse() should read this article by Liza Daly. It covers both lxml and [c]ElementTree.
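The pattern those articles describe is to clear each element once you're done with it, and to clear finished children off the root so the parser doesn't keep the whole tree alive. A minimal sketch of that pattern (this uses Python 3's xml.etree.ElementTree, into which cElementTree was merged; the count_elements function and the demo document are my own illustration, not from the question):

```python
import io
import xml.etree.ElementTree as ET

def count_elements(source, tag):
    """Count occurrences of `tag` without keeping the whole tree in memory."""
    count = 0
    context = ET.iterparse(source, events=('start', 'end'))
    # The first event is the 'start' of the root element; keep a reference
    # to the root so we can clear finished children off it.
    event, root = next(context)
    for event, el in context:
        if event == 'end' and el.tag == tag:
            count += 1
            el.clear()    # drop this element's children, attributes, and text
            root.clear()  # detach finished children from the root, or they accumulate
    return count

# Tiny in-memory demo document; a real use would pass a filename instead.
xml = '<root>' + '<item>x</item>' * 1000 + '</root>'
print(count_elements(io.StringIO(xml), 'item'))  # prints 1000
```

Without the two clear() calls, every parsed element stays attached to the tree rooted at root, which is exactly the unbounded growth the question describes.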
Previous coverage on SO:

Using Python iterparse for large XML files

Can Python xml ElementTree parse a very large XML file?

What is the fastest way to parse large XML docs in Python?