Cookbook

Corrupted DOCX

Internally, a DOCX file is a ZIP archive containing the files that make up the document, including XML, images and other binaries. If MS Word shows an error alert about corrupted content when opening a DOCX file, the cause is usually one of two issues:

  • The DOCX file is not valid and cannot be opened at all. This is the most common cause of corrupted documents, and it usually means that the web server or PHP is adding extra content at the beginning or the end of the file, e.g. when it is downloaded. You can see this additional content by opening the DOCX with a hex editor, which lets you trace its origin and prevent it from being added.
  • MS Word shows an error alert but usually lets you open the file after repairing its content. In this case some of the tags or attributes are wrong, for example a negative value in a field that only allows positive ones. It is advisable to comment out fragments of the code that generates the document until you find the line that causes the error.
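
As an alternative to a hex editor for the first issue, the stray bytes can also be located with a short script. The following is a minimal sketch in Python, not part of the library: the function names find_stray_bytes and report are hypothetical, and it assumes the file contains at least one ZIP entry and that the stray content itself does not contain a ZIP signature. It looks for the first local file header and for the end-of-central-directory record, and returns whatever lies outside them.

```python
import struct

LOCAL_HEADER_SIG = b"PK\x03\x04"  # marks the first entry of a non-empty ZIP
EOCD_SIG = b"PK\x05\x06"          # end-of-central-directory record
EOCD_FIXED_SIZE = 22              # size of that record without its comment


def find_stray_bytes(data):
    """Return (leading, trailing) bytes that lie outside the ZIP container."""
    start = data.find(LOCAL_HEADER_SIG)
    eocd = data.rfind(EOCD_SIG)
    if start == -1 or eocd == -1 or len(data) < eocd + EOCD_FIXED_SIZE:
        raise ValueError("no complete ZIP structure found")
    # The last two bytes of the fixed record hold the archive comment length.
    (comment_len,) = struct.unpack_from("<H", data, eocd + 20)
    end = eocd + EOCD_FIXED_SIZE + comment_len
    return data[:start], data[end:]


def report(path):
    """Print the first stray bytes found in the file at `path`, if any."""
    with open(path, "rb") as fh:
        leading, trailing = find_stray_bytes(fh.read())
    print("leading bytes: ", leading[:80])
    print("trailing bytes:", trailing[:80])
```

Calling report("output.docx") on a downloaded file (the file name is only an example) prints the extra bytes, if there are any. Leading or trailing whitespace, HTML fragments or PHP notices in that output typically point to the script or server layer that wrote them.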