Post #8000 - phpdocx

Posted by admin · 26-11-2021 - 08:54

Hello,

The only difference between the logs is that some images can't be downloaded, so the Word content for those images is not added to the document (line 1138 that you point out, and others).

The embedHTML and replaceVariableByHTML methods work in the same way as a web browser: if an image can't be read or downloaded, it is not added (PHP's file_get_contents is used to download images).

If we compare both logs, we can see that the following images are added correctly in both cases:

https://www.energiebericht.net/images/icons-doc/warmth.png
https://www.energiebericht.net/images/icons-doc/electricity.png
https://www.energiebericht.net/images/icons-doc/water.png

but the missing images are:

https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/warmthConsumption.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/electricityConsumption.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/waterConsumption.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/energy_indicator_27.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/energy_indicator_1.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/energy_indicator_2.png
https://www.energiebericht.net/cache/408/2020/pages/cobjects/8002/energy_indicator_3.png

These are all images from https://www.energiebericht.net/cache/ . Maybe the images in this remote folder are auto-generated (it looks like a cache folder) and are not available in specific cases (such as while they are being generated internally)? The problem is that PHP can't read these images, for a reason we don't know.

We have tested the HTML from your log using embedHTML, and in all cases the images are added correctly (all images can be read). The best approach would be to check the web server logs on the remote server (https://www.energiebericht.net/cache/ , where the images should exist); the web server logs should detail why the images can't be read or downloaded (404 not found, 403 access denied or other) when you run the script.

As the embedHTML and replaceVariableByHTML methods silence file_get_contents warnings when an image is downloaded (to avoid false warnings and to behave as web browsers do), you can edit HTML2WordML.php to debug this in more depth. In this file you can find the following line (around line 1400):

$photo = @file_get_contents($this->parseURL($nodo['attributes']['src']));

If you remove the @:

$photo = file_get_contents($this->parseURL($nodo['attributes']['src']));

and run the script in PHP CLI mode, you should get PHP information about the failure if the image download fails.

The remote web server logs should explain why the images can't be read (404, 403, 500...).
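
If you prefer not to edit HTML2WordML.php, a small standalone script can show how PHP itself sees each image URL. The following is only a sketch and not part of phpdocx: the function names lastHttpStatus and describeImageUrl are hypothetical, and it assumes that get_headers (which uses PHP's HTTP stream wrapper, like file_get_contents does for URLs) is allowed to make remote requests on the server where the script runs.

```php
<?php
// Hypothetical helper (not part of phpdocx): extract the final HTTP status
// code from a header list as returned by get_headers(). Redirects produce
// several status lines, so the last one found is returned.
function lastHttpStatus(array $headers): ?int
{
    $status = null;
    foreach ($headers as $header) {
        if (is_string($header) && preg_match('#^HTTP/\S+\s+(\d{3})#', $header, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: describe how PHP sees one image URL.
function describeImageUrl(string $url): string
{
    error_clear_last();
    // Warnings are silenced here, but the message is recovered below.
    $headers = @get_headers($url);
    if ($headers === false) {
        $error = error_get_last();
        return $url . ' => request failed: ' . ($error['message'] ?? 'unknown error');
    }
    $status = lastHttpStatus($headers);
    return $url . ' => HTTP ' . ($status ?? 'unknown');
}

// When URLs are passed on the command line, report on each of them.
foreach (array_slice($_SERVER['argv'] ?? [], 1) as $url) {
    echo describeImageUrl($url), PHP_EOL;
}
```

Saved, for example, as check_images.php (a hypothetical file name), it could be run in PHP CLI mode on the same server that generates the document, passing the missing image URLs as arguments. Any status other than 200 for the images under /cache/ would point to the same cause that the remote web server logs should confirm.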

Regards.
