DOCX2TXT: how to get the text content of a Word docx file

  • Jun 19, 2012

It is quite common to need the contents of a Word docx file converted to plain text so that they can be parsed by other tools, inserted into a database, and so on.

So we decided to create a DOCX2TXT method in PHPDocX, so that our users would not need to rely on third-party scripts or develop such a feature for their own platforms.

The DOCX2TXT method is available in both the PRO and the FREE versions of the library, and it is very simple to use. See the DOCX2TXT example for a working demonstration.

The method also offers a few extra options that some users may find useful. For example, you can:

Extract chart data as tabulated data.
Extract lists and tables with minimal formatting (indents and cells are replaced by tabs).
Choose whether footnotes and endnotes are included in the resulting text file.
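
For readers curious about what this kind of conversion involves, here is a minimal sketch in Python. It is not the PHPDocX implementation and does not use the PHPDocX API: the function names (docx_to_text, part_lines, paragraph_text) and the include_notes parameter are hypothetical and exist only for illustration. It assumes a standard docx package whose body lives in word/document.xml, with optional word/footnotes.xml and word/endnotes.xml parts. It replaces table cells with tabs and can optionally append footnotes and endnotes; chart data and list indentation are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag notation
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraph_text(p):
    """Join the text runs of one w:p element; a w:tab element becomes a tab."""
    parts = []
    for node in p.iter():
        if node.tag == W + "t":
            parts.append(node.text or "")
        elif node.tag == W + "tab":
            parts.append("\t")
    return "".join(parts)


def part_lines(xml_bytes):
    """Return one line per paragraph; each table row becomes one tab-separated line."""
    root = ET.fromstring(xml_bytes)
    lines = []

    def walk(container):
        for child in container:
            if child.tag == W + "p":
                lines.append(paragraph_text(child))
            elif child.tag == W + "tbl":
                for row in child.findall(W + "tr"):
                    cells = [
                        " ".join(paragraph_text(p) for p in cell.iter(W + "p"))
                        for cell in row.findall(W + "tc")
                    ]
                    lines.append("\t".join(cells))
            else:
                # Descend into wrappers such as w:body or w:footnote
                walk(child)

    walk(root)
    return lines


def docx_to_text(source, include_notes=False):
    """Hypothetical helper: convert a docx (path or file object) to plain text."""
    with zipfile.ZipFile(source) as z:
        lines = part_lines(z.read("word/document.xml"))
        if include_notes:
            names = set(z.namelist())
            for part in ("word/footnotes.xml", "word/endnotes.xml"):
                if part in names:
                    # Skip the empty separator entries these parts usually contain
                    lines += [ln for ln in part_lines(z.read(part)) if ln.strip()]
    return "\n".join(lines)
```

Under these assumptions, calling docx_to_text("report.docx", include_notes=True) would return the body text followed by any footnotes and endnotes, where report.docx is a placeholder file name.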

We hope you find this new method useful.

Finally, there is also a new TXT2DOCX method (available only in the PRO version) that does just the opposite: it converts a plain text file directly into a Word document. That will be the subject of a separate blog post.