Embedhtml using a large amount of memory

Posted by surveygizmo  · 07-08-2012 - 20:20

Hi,
We use phpdocx to automatically generate Word documents for our customers based on their reports.

The problem is that embedHTML uses a large amount of memory in PHP (~500 MB) to generate a Word document that ends up being only about 80 KB.

While we are able to create this Word document, anything larger hits the server's memory limit. Is there some way to stream the document out to a file during embedHTML, or some way to limit the memory usage?

Here's our code below along with memory usage dumps:
[code] var_dump(6,memory_get_usage());

$this->endSession();//close the document div
$outputCode = '';
$outputCode .= $this->getBody();//get the html
unset($this->documentBuffer);

var_dump(7,memory_get_usage());

$file = $saveInPath.$fileName;
$docx = new CreateDocx();

var_dump(8,memory_get_usage());

$docx->embedHTML($outputCode, array('parseDivsAsPs' => false, 'downloadImages' => true, 'id' => 'bodyContent'));
unset($outputCode);

var_dump(9,memory_get_usage());

$docx->createDocx($file);

var_dump(10,memory_get_usage());[/code]

And here's the output:
int(6)
int(54476040)
int(7)
int(54476136)
int(8)
int(56217040)
int(9)
int(498491816)
int(10)
int(503589144)

Notice how memory usage jumps from about 56 MB to 498 MB between #8 and #9, that is, across the embedHTML call.
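For anyone who wants to reproduce this kind of measurement, the checkpoints can be wrapped in a small helper. This is only a sketch: `measure_step` is a hypothetical name and not part of phpdocx, and the workload below is a stand-in for the embedHTML call.

```php
<?php
// Hypothetical helper (not part of phpdocx): runs a callable and reports how
// much memory is still allocated afterwards, plus the peak seen so far.
function measure_step($label, $step)
{
    $before = memory_get_usage();
    $result = $step();
    $after = memory_get_usage();

    return array(
        'label'  => $label,
        'delta'  => $after - $before,       // bytes still held after the step
        'peak'   => memory_get_peak_usage(), // highest usage since script start
        'result' => $result,
    );
}

// Stand-in workload; in the code above the callable would wrap embedHTML().
$report = measure_step('build array', function () {
    return range(1, 100000);
});

printf("%s: +%d bytes (peak %d)\n", $report['label'], $report['delta'], $report['peak']);
```

Reporting the peak as well as the delta matters here, because a step can allocate far more than it still holds when it returns.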

Here's a link to the test doc file that was generated:
[url]http://surveygizmolibrary.s3.amazonaws.com/library/6175/testdoc.docx[/url]

Thanks!
-Chad

Posted by truth@proposaltech.com  · 18-12-2013 - 22:43

We had the same problem, and after many hours of trial and error I've cut the peak memory usage of phpdocx by more than half while generating exactly the same output. I generated the same 460 KB docx file three times, with the following peak memory usage:

1355 MB: 3.6 as downloaded from phpdocx.com

1072 MB: Accessing the font properties through a normal function instead of the magic __get function

631 MB: Also clearing the font properties when no longer needed

I tried other methods of reducing memory usage, but they saved only ~1% of the memory consumption and required far more changes to the code. (For example, I swapped out some array usage for SplFixedArray.) The changes I made to achieve the memory reduction can be easily applied by following the notes below. I hope the phpdocx team incorporates this in their next release (so everyone can save on memory consumption and so I don't have to patch my upgrade ;) ).



Add the following to lib/dompdfParser/include/frameparser.cls.php (e.g. near "function set_style"):

  // This representation of a DOM element will likely be part of a tree
  // and will continue to be stored after processing this subtree has
  // completed.  The style information consumes a large amount of memory
  // and may be cleared when no longer needed.
  function clear_style() {
    $this->_style = null;
  }





In lib/dompdfParser/include/parserhtml.cls.php, before each "return" in _render(), add:

    // Frame processing has completed.  Free memory.
    $frame->clear_style();
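To show why clearing the style helps, here is a self-contained sketch of the pattern. It does not use the phpdocx classes: `DemoFrame`, `demo_render`, and `demo_style` are hypothetical stand-ins, and the 100 KB padding string only simulates bulky style data.

```php
<?php
// Stand-in for the frame class in frameparser.cls.php. Everything here except
// the clear_style() idea is hypothetical and only illustrates the pattern.
class DemoFrame
{
    public $children = array();
    private $_style;

    function __construct(array $style) { $this->_style = $style; }
    function get_style() { return $this->_style; }

    // Drop the bulky style data; the frame itself stays in the tree.
    function clear_style() { $this->_style = null; }
}

// Depth-first "render": read what is needed from the style, recurse, then free it.
function demo_render(DemoFrame $frame, array &$out)
{
    $style = $frame->get_style();
    $out[] = $style['font_family'];
    unset($style); // release the local handle before recursing
    foreach ($frame->children as $child) {
        demo_render($child, $out);
    }
    $frame->clear_style();
}

// Each style carries roughly 100 KB of data.
function demo_style()
{
    return array('font_family' => 'Arial', 'padding' => str_repeat('x', 100000));
}

// Build a root frame with 50 children.
$root = new DemoFrame(demo_style());
for ($i = 0; $i < 50; $i++) {
    $root->children[] = new DemoFrame(demo_style());
}

$before = memory_get_usage();
$out = array();
demo_render($root, $out);
$after = memory_get_usage();

printf("before: %d bytes, after: %d bytes\n", $before, $after);
```

The tree of frames is still fully reachable after rendering, but memory usage drops because each frame has released its style once it was processed.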



Also in lib/dompdfParser/include/parserhtml.cls.php, replace

    try{$sTemp = $properties->$style;}

with

    try{$sTemp = $properties->nonmagic_get($style);}



In lib/dompdfParser/include/styleparser.cls.php, replace

  function __get($prop) {

with

  function __get($prop) {
    return $this->nonmagic_get($prop);
  }

  // The magic of __get leaks memory. Calling a normal function avoids the leak.
  function nonmagic_get($prop) {

The original body of __get then continues unchanged as the body of nonmagic_get.
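As a stand-alone illustration of the delegation pattern (not the actual phpdocx style class), the sketch below keeps `__get` working for existing callers while letting hot paths call the plain method directly. `DemoStyle` and its `$_props` array are hypothetical.

```php
<?php
// Hypothetical stand-in for the style class in styleparser.cls.php.
class DemoStyle
{
    private $_props = array('font_size' => '12pt', 'color' => 'black');

    // Property-style access keeps working for existing callers.
    function __get($prop)
    {
        return $this->nonmagic_get($prop);
    }

    // Callers that know about this method can skip the magic-method dispatch.
    function nonmagic_get($prop)
    {
        if (!isset($this->_props[$prop])) {
            throw new Exception("'$prop' is not a valid CSS property.");
        }
        return $this->_props[$prop];
    }
}

$style = new DemoStyle();
$name = 'font_size';

echo $style->$name, "\n";               // goes through __get
echo $style->nonmagic_get($name), "\n"; // direct call, same value
```

Both calls return the same value, so the change in parserhtml.cls.php is behavior-preserving; the memory saving reported above is a measurement from this post, not something this small sketch demonstrates.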



 


Posted by jorgelj  · 19-12-2013 - 09:17

Hello,



Thanks for your code; we're going to test it.



Regards.