I need to extract hyperlinks and their text from a .docx
Topic closed:
Please note this is an old forum thread. Information in this post may be out of date and/or erroneous.
Every phpdocx version includes new features and improvements. Previously unsupported features may have been added to newer releases, or past issues may have been corrected.
We encourage you to download the current phpdocx version and check the Documentation available.

Posted by admin  · 14-07-2025 - 07:04

Hello,

The current stable version of phpdocx doesn't include a direct method to extract URL links with their related text contents.
Please note that MS Word and other DOCX editors can generate links using two tags: hyperlinks (w:hyperlink) and fields (w:instrText).
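
To see which case applies to a given file, here is a minimal sketch that counts both kinds of links in the XML of word/document.xml. It relies only on PHP's bundled DOM extension rather than the phpdocx API, and countLinkKinds() is a hypothetical helper name, not a phpdocx method. It assumes each field instruction starts with HYPERLINK inside a single w:instrText run, it does not count simple fields (w:fldSimple), and it only looks at the XML string you pass in.

```php
<?php
// Sketch only: plain PHP DOM, not the phpdocx API.
// countLinkKinds() is a hypothetical helper name.
function countLinkKinds(string $documentXml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($documentXml);

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace(
        'w',
        'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    );

    // links stored as fields: the instruction text starts with HYPERLINK
    $fieldCount = 0;
    foreach ($xpath->query('//w:instrText') as $instruction) {
        if (preg_match('/^\s*HYPERLINK\b/i', $instruction->textContent)) {
            $fieldCount++;
        }
    }

    return array(
        // links stored as w:hyperlink elements
        'hyperlinks' => $xpath->query('//w:hyperlink')->length,
        'fields' => $fieldCount,
    );
}

// Usage (assumes document.docx sits next to the script):
// $zip = new ZipArchive();
// if ($zip->open('document.docx') === true) {
//     var_dump(countLinkKinds($zip->getFromName('word/document.xml')));
//     $zip->close();
// }
```

If 'fields' comes back as 0 for your document, the w:hyperlink case is the only one you need to handle.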

If your DOCX only contains w:hyperlink tags, you can extract the needed information using custom code:

$docx = new CreateDocxFromTemplate('document.docx');

// find every w:hyperlink element in the document
$referenceNode = array(
    'customQuery' => '//w:hyperlink',
);
$hyperlinksInfo = $docx->getDocxPathQueryInfo($referenceNode);

// load the relationships file, which maps each r:id to its target URL
$xmlRelsContent = $docx->getWordFiles('word/_rels/document.xml.rels');
$xmlUtilities = new XmlUtilities();
$contentRelsDOM = $xmlUtilities->generateDomDocument($xmlRelsContent);
$contentRelsXpath = new DOMXPath($contentRelsDOM);
$contentRelsXpath->registerNamespace('rel', 'http://schemas.openxmlformats.org/package/2006/relationships');

// pair the text of each hyperlink with the Target of its relationship
$hyperlinks = array();
foreach ($hyperlinksInfo['elements'] as $hyperlinkInfo) {
    $hyperlinkEntries = $contentRelsXpath->query('//rel:Relationship[@Id="'.$hyperlinkInfo->getAttribute('r:id').'"]');
    if ($hyperlinkEntries->length > 0) {
        $hyperlinks[] = array(
            'textContent' => $hyperlinkInfo->textContent,
            'target' => $hyperlinkEntries->item(0)->getAttribute('Target'),
        );
    }
}

var_dump($hyperlinks);
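
The code above only covers w:hyperlink tags. For links stored as fields, a rough sketch with plain PHP DOM (again not the phpdocx API) could walk the field markers in document order and collect the instruction and the visible text. extractFieldHyperlinks() is a hypothetical helper name. It assumes complex fields delimited by w:fldChar begin/separate/end, a quoted URL right after HYPERLINK, and no nested fields; internal links such as HYPERLINK \l "bookmark" are skipped.

```php
<?php
// Sketch only: plain PHP DOM, not the phpdocx API.
// extractFieldHyperlinks() is a hypothetical helper name.
function extractFieldHyperlinks(string $documentXml): array
{
    $wNs = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

    $dom = new DOMDocument();
    $dom->loadXML($documentXml);
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', $wNs);

    $links = array();
    $state = 'outside'; // outside | instruction | result
    $instruction = '';
    $text = '';

    // visit field markers, instruction text and visible text in document order
    $nodes = $xpath->query('//w:*[self::w:fldChar or self::w:instrText or self::w:t]');
    foreach ($nodes as $node) {
        if ($node->localName === 'fldChar') {
            $type = $node->getAttributeNS($wNs, 'fldCharType');
            if ($type === 'begin') {
                // a new field starts: reset the buffers
                $state = 'instruction';
                $instruction = '';
                $text = '';
            } elseif ($type === 'separate') {
                // what follows is the visible result of the field
                $state = 'result';
            } elseif ($type === 'end') {
                // keep the field only if it is an external HYPERLINK
                if (preg_match('/^\s*HYPERLINK\s+"([^"]*)"/i', $instruction, $match)) {
                    $links[] = array(
                        'textContent' => $text,
                        'target' => $match[1],
                    );
                }
                $state = 'outside';
            }
        } elseif ($node->localName === 'instrText' && $state === 'instruction') {
            // the instruction may be split across several runs
            $instruction .= $node->textContent;
        } elseif ($node->localName === 't' && $state === 'result') {
            $text .= $node->textContent;
        }
    }

    return $links;
}
```

The returned entries use the same 'textContent' and 'target' keys as the $hyperlinks array above, so both results can be merged with array_merge() if needed.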

We have opened a task for the dev team, and they have added support in the testing branch for extracting this information (from both hyperlinks and fields) using Indexer:

  • linksContents option to get URL and text content from hyperlinks.

Your phpdocx 15.5 license doesn't include LUS (https://www.phpdocx.com/support). If you upgrade to phpdocx 16 with LUS, you can access these changes from the testing branch.
If you upgrade your license, please write to contact[at]phpdocx.com and tell us whether you are using the classic or the namespaces package. We'll send you the updated Indexer class with a custom sample.

Regards.