Cookbook

Tips for converting HTML to Word

The embedHTML method and its counterpart for templates, replaceVariableByHTML, convert HTML with CSS to Word while preserving the contents and styles as closely as possible. To get output that closely matches the original HTML and to avoid errors, follow the good practices below.

Supported tags and styles

phpdocx supports nearly all the HTML tags and CSS styles that have an equivalent in MS Word.

On our website you can find the complete list of compatible tags and styles.

Besides these HTML tags and CSS styles, when importing HTML you can also assign existing Word styles to classes, ids, or specific tags with the 'wordStyles' option.
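As an illustration of the 'wordStyles' option, here is a minimal sketch. It assumes phpdocx is already loaded (the include path depends on your installation), that selectors are written as '.class', '#id' and '<tag>' keys, and that the Word style names used here exist in your document. The helper buildHtmlOptions and the style names are hypothetical and only serve the example.

```php
<?php
// Hypothetical helper: wraps a selector => Word style map in the
// options array that is passed to embedHTML.
function buildHtmlOptions(array $styleMap): array
{
    return ['wordStyles' => $styleMap];
}

$html = '<h1 class="title">Report</h1>'
      . '<p id="intro">Summary text.</p>'
      . '<table><tr><td>Cell</td></tr></table>';

// Assumption: these Word style names exist in the target document.
$options = buildHtmlOptions([
    '.title'  => 'Heading1',  // applied to elements with class="title"
    '#intro'  => 'Quote',     // applied to the element with id="intro"
    '<table>' => 'TableGrid', // applied to every table
]);

// Runs only when phpdocx has been loaded by your own bootstrap code.
if (class_exists('CreateDocx')) {
    $docx = new CreateDocx();
    $docx->embedHTML($html, $options);
    $docx->createDocx('word_styles_example');
}
```

If a style name is not present in the document, check the list of available styles before relying on the mapping.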

Tidy, incorrect tagging, accents and other non-ASCII characters

For a proper HTML import, all tags and styles must be correctly opened and closed; in other words, the code must be well structured. phpdocx uses the PHP extension Tidy (http://php.net/manual/en/book.tidy.php) to correct the HTML and generate valid tagging. You can install this extension on any operating system that runs PHP.

Warning

If the Tidy extension is not installed, errors may occur: CSS styles may appear as text in the document, the HTML may be imported incorrectly, or accents and other non-ASCII characters may not be displayed.
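To find out whether Tidy is available before importing, a check along these lines may help. It is only a sketch: phpdocx runs Tidy on its own, so the hypothetical helper repairHtmlFragment is meant for inspecting how Tidy would correct a fragment, not as a required step. The Tidy settings chosen here are assumptions for the example.

```php
<?php
// Reports whether the Tidy extension is loaded in the current PHP runtime.
function tidyIsAvailable(): bool
{
    return extension_loaded('tidy');
}

// Hypothetical helper: repairs a fragment with Tidy when possible,
// otherwise returns the input unchanged.
function repairHtmlFragment(string $html): string
{
    if (!tidyIsAvailable()) {
        return $html;
    }
    $config = [
        'show-body-only' => true, // return only the fragment itself
        'output-xhtml'   => true, // emit well-formed, closed tags
    ];
    $repaired = tidy_repair_string($html, $config, 'utf8');
    return $repaired === false ? $html : $repaired;
}

// Deliberately broken markup: <b> is never closed.
$broken = '<p>Café <b>bold text</p>';

if (!tidyIsAvailable()) {
    fwrite(STDERR, "Tidy extension not loaded; HTML import may produce errors.\n");
}
echo repairHtmlFragment($broken), "\n";
```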

Divide and Optimize

Although the HTML and CSS import is heavily optimized, transforming thousands of lines with varied tags and styles may affect performance.

To get the best possible performance, divide the code you are importing. For example, instead of adding a 10000-line HTML file with a single embedHTML call, you could split it into five HTML files and then call embedHTML for each one.
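The division can also be done in code rather than by hand. The following sketch assumes the input is an HTML fragment (no html or body wrapper of its own), that PHP's DOM extension is enabled, and that phpdocx is loaded by your own bootstrap code. The function splitHtmlIntoChunks is a hypothetical helper, not part of phpdocx; it cuts only between top-level elements so that each piece remains well formed.

```php
<?php
// Splits an HTML fragment into at most $chunkCount pieces, cutting only
// between top-level elements so every piece stays well formed.
function splitHtmlIntoChunks(string $html, int $chunkCount): array
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $dom = new DOMDocument();
    // The XML prolog tells libxml to read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8"><body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    $nodes = [];
    if ($body !== null) {
        foreach ($body->childNodes as $node) {
            $nodes[] = $dom->saveHTML($node); // serialize one top-level node
        }
    }
    if ($nodes === []) {
        return [];
    }

    // Group consecutive nodes so that no more than $chunkCount groups result.
    $size = (int) ceil(count($nodes) / max(1, $chunkCount));
    return array_map(
        function (array $group): string {
            return implode('', $group);
        },
        array_chunk($nodes, $size)
    );
}

$html = '<h1>Part 1</h1><p>First paragraph.</p>'
      . '<h1>Part 2</h1><p>Second paragraph.</p>';
$chunks = splitHtmlIntoChunks($html, 2);

// Runs only when phpdocx has been loaded.
if (class_exists('CreateDocx')) {
    $docx = new CreateDocx();
    foreach ($chunks as $chunk) {
        $docx->embedHTML($chunk); // one smaller import per chunk
    }
    $docx->createDocx('divided_import');
}
```

Cutting at top-level elements keeps each chunk self-contained; if your styles live in a shared style block, that block would need to accompany every chunk.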

This simple step can significantly reduce CPU and memory consumption.