Conversion plugin phpdocx

Preparing the documents for their conversion

Preparing the documents for conversion when using the LibreOffice method

To achieve the highest quality when transforming a document using the LibreOffice method, it is advisable to follow some good practices:

  • Use the latest available LibreOffice release.
  • Set table sizes manually when using the addTable or embedHTML methods. Instead of leaving the sizes on automatic values that depend on the content, give specific values for each table and cell.
  • To generate active links with addLink, embedHTML or replaceVariableByHTML when transforming DOCX to PDF, a custom character style (rStyle) that exists in the DOCX must be applied to the link. phpdocx adds a default rStyle (DefaultParagraphFontPHPDOCX) to links. With addLink, embedHTML or replaceVariableByHTML, this can be replaced by a custom style that already exists in the DOCX or by one created dynamically with createCharacterStyle.
  • Backgrounds will not be printed, so they won't appear in the final PDF.
  • Headers and footers are transformed best when their content is in a table. This allows each header and footer element to be placed correctly.
  • To hide table borders, remove them cell by cell. If you generate the DOCX from HTML with the embedHTML method, hide the border in each <td>.
  • Choose fonts available in the operating system where the conversion plugin is running. Linux, Windows and macOS allow adding new fonts easily. Fonts can also be embedded in the DOCX; see "working with fonts".
  • LibreOffice has some minor differences in default sizes and distances compared with MS Word. It is recommended to check these values; just open the document with LibreOffice to adjust them.
  • To achieve a conversion as close as possible to the original content, you can create the DOCX template with LibreOffice. LibreOffice 4.3 and higher can save documents in DOCX format (File > Save as > Microsoft Word 2007-2013 XML (.docx)). These files are fully compatible with MS Word.
  • Avoid absolute positions as far as possible; place the contents in the document with elements such as tables instead.
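
Two of the practices above (fixed table sizes, no absolute positioning) can be checked before conversion. The following is a minimal sketch in Python, using only the standard library and independent of phpdocx. It assumes the main part of the DOCX is word/document.xml, treats a table without w:tblW or with w:type="auto" as automatically sized, and treats every wp:anchor element as a floating (absolutely positioned) drawing. The function names preflight and preflight_docx are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard WordprocessingML and drawing namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def preflight(document_xml):
    """Count auto-width tables and floating drawings in a document.xml string."""
    root = ET.fromstring(document_xml)
    auto_tables = 0
    for tbl in root.iter(f"{{{W}}}tbl"):
        tbl_w = tbl.find(f"{{{W}}}tblPr/{{{W}}}tblW")
        # No w:tblW, or w:type="auto", means the width follows the content.
        if tbl_w is None or tbl_w.get(f"{{{W}}}type") == "auto":
            auto_tables += 1
    # wp:anchor marks floating drawings; wp:inline drawings flow with the text.
    floating = sum(1 for _ in root.iter(f"{{{WP}}}anchor"))
    return {"auto_width_tables": auto_tables, "floating_drawings": floating}


def preflight_docx(path):
    """Run the same check on a DOCX file (assumes word/document.xml exists)."""
    with zipfile.ZipFile(path) as docx:
        return preflight(docx.read("word/document.xml"))
```

A non-zero count does not mean the conversion will fail; it only points at tables and drawings worth reviewing in LibreOffice first.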

Supported OOXML tags and attributes when using the native method

phpdocx parses contents, styles, properties and other OOXML contents.

The currently parsed contents and styles include:

  • document (w:body)

    • background color (w:background) => w:color
    • border (w:pgBorders) => w:top, w:bottom, w:left, w:right: w:color, w:sz, w:val (nil, none, dashed, dotted), w:space
  • sections (w:sectPr)

    • size (w:pgSz) => w:w (width), w:h (height)
    • margin (w:pgMar) => w:top (margin-top), w:bottom (margin-bottom), w:left (margin-left), w:right (margin-right)
  • title and properties (cp:coreProperties)

    • title (dc:title) => title
    • author (dc:creator) => creator, author
    • subject (dc:subject) => subject
    • keywords (cp:keywords) => keywords
  • text strings (w:t) and text styles (w:rPr)

    • text (w:t)
    • bold (w:b)
    • color (w:color)
    • font family (w:rFonts)
    • font size (w:sz)
    • highlight (w:highlight)
    • italic (w:i)
    • line through (w:strike)
    • text decoration (w:u) => none or underline
    • vertical align (w:vertAlign)
  • paragraphs (w:pPr)

    • background color (w:shd)
    • bold (w:b)
    • color (w:color)
    • font family (w:rFonts)
    • font size (w:sz)
    • italic (w:i)
    • line height (w:spacing)
    • line through (w:strike)
    • page break (w:pageBreakBefore)
    • text align (w:jc) => left, justify, center, right
    • text decoration (w:u) => none or underline
    • text indent (w:firstLine)
  • images (w:drawing): png, jpg and other formats supported by web browsers

    • align (wp:positionH, wp:align) => right, left, center
    • border (a:ln, a:noFill)
    • height (wp:extent) => cy
    • link (a:hlinkClick) => r:id
    • width (wp:extent) => cx
  • lists (w:numPr)

    • type (w:numId) => w:val and w:ilvl (list-style-type: disc, decimal, lower-alpha, lower-roman, upper-alpha, upper-roman)
    • see the paragraph elements for other styles
    • some styles, such as color or font size, can be inherited by the list item content from the list item symbol. In this case, the content must have its own style
  • links

    • link (w:instrText) => HYPERLINK
  • form elements

    • checkbox (w:instrText)
    • input (w:instrText)
    • select (w:instrText)
  • styles (see the elements on this page for supported styles)

    • character/run (w:rPr)
    • paragraph (w:pPr)
    • list (w:pPr, w:numId, w:ilvl)
    • table (w:style)
    • styles file (w:styles) => character/run (w:rStyle), paragraph and list (w:pStyle), table
    • numbering file => list (w:abstractNum)
  • tables (w:tbl)

    • border (w:tblBorders) => w:top, w:right, w:bottom, w:left (width, style [solid, none], color)
    • layout (w:tblLayout) => w:type fixed
    • width (w:tblW) => w:type pct, dxa w:w
    • rowspan (w:vMerge) => w:val restart, continue (rowspan)
    • cell background color (w:shd) => w:fill
    • cell border (w:tcPr) => w:top, w:right, w:bottom, w:left (width, style [solid, none], color)
    • cell padding (w:tblCellMar) => w:top, w:right, w:bottom, w:left
    • colspan (w:gridSpan) => w:val (colspan)
    • all contents use a left alignment
  • other elements

    • break (w:br) => line and page


  • If an element is not parsed, its content does not disappear from the output. It only means that its associated OOXML properties are not taken directly into account. Its children and text content are still parsed and rendered with their corresponding styles in the PDF output.
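
Several attributes in the list above are stored in OOXML-specific units: page sizes and margins (w:pgSz, w:pgMar) and dxa widths are in twentieths of a point, font sizes (w:sz) are in half-points, and image extents (cx, cy on wp:extent) are in English Metric Units (914400 per inch). The sketch below, in plain Python and independent of phpdocx, shows the arithmetic for converting them to points. The function names are hypothetical.

```python
# Unit factors defined by the OOXML formats.
TWIPS_PER_POINT = 20        # w:pgSz, w:pgMar and dxa widths: twentieths of a point
HALF_POINTS_PER_POINT = 2   # w:sz: half-points
EMU_PER_POINT = 12700       # wp:extent cx/cy: 914400 EMU per inch / 72 points per inch


def twips_to_points(value):
    """Convert a twentieths-of-a-point value, such as w:pgSz w:w, to points."""
    return int(value) / TWIPS_PER_POINT


def half_points_to_points(value):
    """Convert a w:sz value to points."""
    return int(value) / HALF_POINTS_PER_POINT


def emu_to_points(value):
    """Convert a wp:extent cx or cy value to points."""
    return int(value) / EMU_PER_POINT
```

For example, a US Letter page width is stored as w:w="12240", which is 612 points, and w:sz w:val="24" is a 12 point font.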

phpdocx parses the supported contents and styles from any DOCX created with phpdocx, MS Word, LibreOffice or any other application, then transforms them to PDF with the custom version of TCPDF that it includes.
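
To estimate how much of a document's text styling falls outside the list above, the run properties in its XML can be compared against that list. This is a rough sketch in standard-library Python, not phpdocx code: the set SUPPORTED_RUN_PROPS is copied by hand from the text styles listed on this page (plus w:rStyle from the styles entry) and does not reflect the actual parser. The function name unsupported_run_props is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: transcribed from the list on this page, not from phpdocx itself.
SUPPORTED_RUN_PROPS = {
    "b", "color", "rFonts", "sz", "highlight",
    "i", "strike", "u", "vertAlign", "rStyle",
}


def unsupported_run_props(document_xml):
    """Count w:rPr children whose tag is not in SUPPORTED_RUN_PROPS."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for rpr in root.iter(f"{{{W}}}rPr"):
        for child in rpr:
            if not child.tag.startswith("{"):
                continue  # skip elements without a namespace
            namespace, _, local = child.tag[1:].partition("}")
            if namespace == W and local not in SUPPORTED_RUN_PROPS:
                counts[f"w:{local}"] += 1
    return dict(counts)
```

A property reported here is not lost content: as noted above, the text is still rendered, only without that specific property.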

phpdocx can also transform DOCX to PDF with DOMPDF instead of the default TCPDF in native conversions. To do so, download DOMPDF (phpdocx doesn't include it) and use the TransformDocAdvDOMPDF class.

Exporting the charts

As with pictures, existing charts in a DOCX document can be exported to PDF.

LibreOffice and MS Word come with support for Word charts, so the conversion runs directly with no extra configuration.

In the case of OpenOffice, it is necessary to install one of the following PHP libraries: JpGraph or ezComponents.

Unzip the chosen library in the lib directory with the name “jpgraph” or “ezcomponents”.

Running with JpGraph, you may encounter this error message:

  • JpGraphError::RaiseL(25128);//('The function imageantialias() is not available in your PHP installation. Use the GD version that comes with PHP and not the standalone version.')

This is due to an incompatibility with the PHP version.

To fix this, edit the gd_image.php file of JpGraph and comment out the line JpGraphError::RaiseL(25128); in the SetAntiAliasing method.
