Handle other xml namespaces

Posted by Aldryss  · 18-03-2020 - 09:22

Hi,

Our company uses phpdocx, and we encountered a problem while importing a DOCX file whose document.xml has the following root element declaration:

<w:document xmlns:w="http://purl.oclc.org/ooxml/wordprocessingml/main" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:m="http://purl.oclc.org/ooxml/officeDocument/math" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://purl.oclc.org/ooxml/officeDocument/relationships" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wp="http://purl.oclc.org/ooxml/drawingml/wordprocessingDrawing" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" mc:Ignorable="w14 w15 w16se w16cid wne wp14" w:conformance="strict">

The error occurs when extracting text from a DOCX document (using the Docx2Text class), on this line:

$bodyNode = $this->domDocument->getElementsByTagNameNS('http://schemas.openxmlformats.org/wordprocessingml/2006/main', 'body');

Could you handle other valid XML namespaces, so that these documents are also parsed by phpdocx?
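To illustrate the kind of fallback we have in mind, here is a minimal sketch using plain PHP DOM. It is not phpdocx code: the constant and function names are hypothetical, and it only shows looking up the body element under both the transitional namespace from the line above and the strict one from our document.

```php
<?php
// Hypothetical helper, not part of phpdocx: looks up <w:body> under either
// the transitional or the strict WordprocessingML namespace.
const WML_NAMESPACES = [
    'http://schemas.openxmlformats.org/wordprocessingml/2006/main', // transitional
    'http://purl.oclc.org/ooxml/wordprocessingml/main',             // strict
];

function findBodyNode(DOMDocument $doc): ?DOMElement
{
    foreach (WML_NAMESPACES as $ns) {
        $nodes = $doc->getElementsByTagNameNS($ns, 'body');
        if ($nodes->length > 0) {
            $node = $nodes->item(0);
            return $node instanceof DOMElement ? $node : null;
        }
    }
    return null; // no body element found in either namespace
}

// Usage with a minimal strict-variant document.
$xml = '<w:document xmlns:w="http://purl.oclc.org/ooxml/wordprocessingml/main" w:conformance="strict">'
     . '<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>';
$doc = new DOMDocument();
$doc->loadXML($xml);
$body = findBodyNode($doc);
echo $body !== null ? $body->textContent : 'no body found', PHP_EOL;
```

The same lookup would presumably be needed for every other element the text extraction reads, so this only covers the first failing call.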

Thanks a lot

Posted by admin  · 18-03-2020 - 09:32

Hello,

What license and version of phpdocx are you using?

Regards.

Posted by Aldryss  · 18-03-2020 - 15:26

Hi, thanks for the quick reply. It's Premium 9.5.

Posted by admin  · 18-03-2020 - 16:11

Hello,

The problem is that the DOCX uses the strict variant (w:conformance="strict"), which doesn't use the same XML namespaces as the transitional variant.

Strict variant support is a work in progress (it is currently in beta and being tested) and will be included in the next release of phpdocx. In the meantime, you can resave the DOCX in the transitional variant.
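To check in advance whether a file is affected, the root element of word/document.xml can be inspected. The following is only a minimal sketch using plain PHP DOM, not the phpdocx implementation; the constant and function names are hypothetical, and it assumes the contents of word/document.xml have already been read into a string.

```php
<?php
// Hypothetical helper, not part of phpdocx.
const STRICT_WML_NS = 'http://purl.oclc.org/ooxml/wordprocessingml/main';

// Returns true when the given document.xml content looks like the strict variant:
// the root element is in the strict namespace, or it carries a conformance="strict"
// attribute in its own namespace.
function isStrictDocumentXml(string $documentXml): bool
{
    if ($documentXml === '') {
        return false; // nothing to parse
    }
    $doc = new DOMDocument();
    if (!@$doc->loadXML($documentXml)) {
        return false; // not well-formed XML
    }
    $root = $doc->documentElement;
    if ($root === null) {
        return false;
    }
    if ($root->namespaceURI === STRICT_WML_NS) {
        return true;
    }
    return $root->getAttributeNS((string) $root->namespaceURI, 'conformance') === 'strict';
}

// Usage with two minimal root elements.
$strictXml = '<w:document xmlns:w="http://purl.oclc.org/ooxml/wordprocessingml/main" w:conformance="strict"/>';
$transitionalXml = '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"/>';
var_dump(isStrictDocumentXml($strictXml));
var_dump(isStrictDocumentXml($transitionalXml));
```

Files flagged this way are the ones that would need to be resaved as transitional before processing.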

Regards.

Posted by Aldryss  · 19-03-2020 - 17:23

Hi, do you have an ETA for this new release?

Posted by admin  · 19-03-2020 - 17:28

Hello,

The new class to work with strict variant DOCX is available to be used by all users with a Premium license and an active LUS.

If you have an active LUS, please send an e-mail to contact[at]phpdocx.com with the username or e-mail of the user who purchased the license, and tell us whether you are using the classic or the namespaces package. The dev team will send you the new class and a sample that illustrates how to use it.

Regards.