
How to edit a Hyperlink in a Word Document using Apache POI?

So I've been browsing around the source code / documentation for POI (specifically XWPF) and I can't seem to find anything that relates to editing a hyperlink in a .docx. I only see functionality to get the information for the currently set hyperlink. My goal is to change the hyperlink in a .docx to link to "http://yahoo.com" from "http://google.com" as an example. Any help would be greatly appreciated. Thanks!

Doing this requires knowing how hyperlinks that refer to an external reference are stored in Microsoft Word documents, and how that storage is represented in XWPF of Apache POI.

The XWPFHyperlinkRun is the representation of a linked text run in an IRunBody. This text run, or even multiple text runs, is wrapped in an XML object of type CTHyperlink. The CTHyperlink contains a relation ID which points to a relation in the package relations part. That package relation contains the URI which is the hyperlink's target.
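To make that indirection concrete, here is a minimal sketch that uses only the JDK's XML parser, not Apache POI. The two XML strings are hand-written fragments in the shape of word/document.xml and word/_rels/document.xml.rels; the class name, the method name and the ID rId4 are made up for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HyperlinkRelationDemo {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";

    // Hand-written fragment shaped like word/document.xml: the hyperlink only carries an r:id.
    static final String DOCUMENT_XML =
        "<w:document xmlns:w=\"" + W_NS + "\" xmlns:r=\"" + R_NS + "\"><w:body><w:p>"
      + "<w:hyperlink r:id=\"rId4\"><w:r><w:t>Search</w:t></w:r></w:hyperlink>"
      + "</w:p></w:body></w:document>";

    // Hand-written fragment shaped like word/_rels/document.xml.rels: the URI lives here.
    static final String RELS_XML =
        "<Relationships xmlns=\"" + REL_NS + "\">"
      + "<Relationship Id=\"rId4\" Type=\"" + R_NS + "/hyperlink\""
      + " Target=\"http://google.com\" TargetMode=\"External\"/>"
      + "</Relationships>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Maps the text of each w:hyperlink to the Target of the relationship its r:id points to. */
    static Map<String, String> resolveTargets(String documentXml, String relsXml) throws Exception {
        // Index the relationships by their Id.
        Map<String, String> targetById = new HashMap<>();
        NodeList rels = parse(relsXml).getElementsByTagNameNS(REL_NS, "Relationship");
        for (int i = 0; i < rels.getLength(); i++) {
            Element rel = (Element) rels.item(i);
            targetById.put(rel.getAttribute("Id"), rel.getAttribute("Target"));
        }
        // Follow each hyperlink's r:id into that index.
        Map<String, String> result = new LinkedHashMap<>();
        NodeList links = parse(documentXml).getElementsByTagNameNS(W_NS, "hyperlink");
        for (int i = 0; i < links.getLength(); i++) {
            Element link = (Element) links.item(i);
            String relationId = link.getAttributeNS(R_NS, "id");
            result.put(link.getTextContent(), targetById.get(relationId));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolveTargets(DOCUMENT_XML, RELS_XML));
    }
}
```

The point of the sketch is that the document body never contains the URL itself. Changing the link target therefore means changing the relationship, not the run.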

Currently (Apache POI 5.2.2) XWPFHyperlinkRun provides access to an XWPFHyperlink, but this class is very rudimentary. It only has getters for the Id and the URI. It provides neither access to its XWPFHyperlinkRun and its IRunBody nor a setter for the target URI in the package relations part. Internally it does not even have access to the package relations part.

So, using only Apache POI classes, the only possibility currently is to delete the old XWPFHyperlinkRun and create a new one pointing to the new URI. But as the text runs also contain the text formatting, deleting them will also delete the text formatting. The formatting would have to be copied from the old XWPFHyperlinkRun to the new one before deleting the old one. That's inconvenient.

So the rudimentary XWPFHyperlink should be extended to provide a setter for the target URI in the package relations part. A new class XWPFHyperlinkExtended could look like so:

import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.openxml4j.opc.PackageRelationship;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

/**
 * Extended XWPF hyperlink class
 * Provides access to its Id, URI, XWPFHyperlinkRun, IRunBody.
 * Provides setting target URI in PackageRelationship.
*/
public class XWPFHyperlinkExtended {
    
 private String id;
 private String uri;
 private XWPFHyperlinkRun hyperlinkRun;
 private IRunBody runBody;
 private PackageRelationship rel;

 public XWPFHyperlinkExtended(XWPFHyperlinkRun hyperlinkRun, PackageRelationship rel) {
  this.id = rel.getId();
  this.uri = rel.getTargetURI().toString();
  this.hyperlinkRun = hyperlinkRun;
  this.runBody = hyperlinkRun.getParent();
  this.rel = rel;
 }

 public String getId() {
  return this.id;
 }
 public String getURI() {
  return this.uri;
 }
 public IRunBody getIRunBody() {
  return this.runBody; 
 }
 public XWPFHyperlinkRun getHyperlinkRun() {
  return this.hyperlinkRun; 
 }
 
 /**
  * Provides setting target URI in PackageRelationship.
  * The old PackageRelationship gets removed.
  * A new PackageRelationship gets added using the same Id.
 */
 public void setTargetURI(String uri) {
  this.runBody.getPart().getPackagePart().removeRelationship(this.getId());
  this.uri = uri;
  PackageRelationship rel = this.runBody.getPart().getPackagePart().addExternalRelationship(uri, XWPFRelation.HYPERLINK.getRelation(), this.getId());
  this.rel = rel;
 }
  
}

It does not extend XWPFHyperlink because that class is so rudimentary that it is not worth it. Furthermore, after setTargetURI the String uri needs to be updated, but XWPFHyperlink has no setter for it and the field is only accessible from inside the package.

The new XWPFHyperlinkExtended can be obtained from an XWPFHyperlinkRun like so:

 /**
  * If this HyperlinkRun refers to an external reference hyperlink,
  * return the XWPFHyperlinkExtended object for it.
  * May return null if no PackageRelationship found.
 */ 
 /*modifiers*/ XWPFHyperlinkExtended getHyperlink(XWPFHyperlinkRun hyperlinkRun) {
  try {
   for (org.apache.poi.openxml4j.opc.PackageRelationship rel : hyperlinkRun.getParent().getPart().getPackagePart().getRelationshipsByType(XWPFRelation.HYPERLINK.getRelation())) {   
    if (rel.getId().equals(hyperlinkRun.getHyperlinkId())) {
     return new XWPFHyperlinkExtended(hyperlinkRun, rel);
    }
   }
  } catch (org.apache.poi.openxml4j.exceptions.InvalidFormatException ifex) {
   // do nothing, simply do not return something      
  }
  return null;   
 }

Once we have that XWPFHyperlinkExtended we can set a new target URI using its method setTargetURI.
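What such a retargeting amounts to in the relationships part can be sketched with the JDK alone. This is an illustration, not how POI does it internally: POI removes the relationship and adds a new one with the same Id, while the sketch simply overwrites the Target attribute, which should leave the same kind of entry in the saved file. The class name, the method names and rId4 are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RelationshipRetargetDemo {

    static final String REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";
    static final String HYPERLINK_TYPE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink";

    // Hand-written fragment shaped like word/_rels/document.xml.rels.
    static final String RELS_XML =
        "<Relationships xmlns=\"" + REL_NS + "\">"
      + "<Relationship Id=\"rId4\" Type=\"" + HYPERLINK_TYPE + "\""
      + " Target=\"http://google.com\" TargetMode=\"External\"/>"
      + "</Relationships>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the Relationship element with the given Id, or null if there is none. */
    static Element findRelationship(Document rels, String id) {
        NodeList list = rels.getElementsByTagNameNS(REL_NS, "Relationship");
        for (int i = 0; i < list.getLength(); i++) {
            Element rel = (Element) list.item(i);
            if (id.equals(rel.getAttribute("Id"))) {
                return rel;
            }
        }
        return null;
    }

    /** Returns the relationships XML with the Target of relationship id replaced; the Id stays the same. */
    static String retarget(String relsXml, String id, String newTarget) throws Exception {
        Document rels = parse(relsXml);
        Element rel = findRelationship(rels, id);
        if (rel != null) {
            rel.setAttribute("Target", newTarget);
        }
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(rels), new StreamResult(out));
        return out.toString();
    }

    /** Returns the Target of relationship id, or null if there is none. */
    static String targetOf(String relsXml, String id) throws Exception {
        Element rel = findRelationship(parse(relsXml), id);
        return rel == null ? null : rel.getAttribute("Target");
    }

    public static void main(String[] args) throws Exception {
        String updated = retarget(RELS_XML, "rId4", "http://yahoo.com");
        System.out.println(targetOf(updated, "rId4")); // prints http://yahoo.com
    }
}
```

Because the Id is kept, nothing in the document body has to change: the hyperlink still references the same relation ID and simply resolves to the new URI.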


A further problem results from the fact that the XML object of type CTHyperlink can wrap multiple text runs, not only one. Then multiple XWPFHyperlinkRun objects are in one CTHyperlink and point to one target URI. For example this could look like:

... [this is a link to example.com ]->https://example.com ...

This results in 6 XWPFHyperlinkRuns in one CTHyperlink, all linking to https://example.com.

This leads to problems when the link text needs to be changed along with the link target. The text of all 6 text runs together is the link text. So which text run should be changed?

The best I have found is a method which sets the text of the first text run in the CTHyperlink.

 /**
  * Sets the text of the first text run in the CTHyperlink of this XWPFHyperlinkRun.
  * Tries solving the problem when a CTHyperlink contains multiple text runs.
  * Then the String value is set in first text run only. All other text runs are set empty.
 */ 
 /*modifiers*/ void setTextInFirstRun(XWPFHyperlinkRun hyperlinkRun, String value) {
  org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHyperlink ctHyperlink = hyperlinkRun.getCTHyperlink();
  for (int r = 0; r < ctHyperlink.getRList().size(); r++) {
   org.openxmlformats.schemas.wordprocessingml.x2006.main.CTR ctR = ctHyperlink.getRList().get(r);
   for (int t = 0; t < ctR.getTList().size(); t++) {
    org.openxmlformats.schemas.wordprocessingml.x2006.main.CTText ctText = ctR.getTList().get(t);
    if (r == 0 && t == 0) {
     ctText.setStringValue(value);  
    } else {
     ctText.setStringValue("");  
    }
   }
  }
 }

There the String value is set in the first text run only. All other text runs are set empty. The text formatting of the first text run remains.
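The effect on the underlying XML can be illustrated without POI. The following sketch runs on a hand-written w:hyperlink fragment with three runs, using only the JDK's DOM API. It is a simplified analogue of the method above, not the same code: it writes the new text into the first w:t element it finds and empties all others. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FirstRunTextDemo {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hand-written hyperlink with three runs; the second one is bold.
    static final String HYPERLINK_XML =
        "<w:hyperlink xmlns:w=\"" + W_NS + "\">"
      + "<w:r><w:t>this </w:t></w:r>"
      + "<w:r><w:rPr><w:b/></w:rPr><w:t>is </w:t></w:r>"
      + "<w:r><w:t>a link</w:t></w:r>"
      + "</w:hyperlink>";

    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
    }

    /** Returns the text of each w:r inside the hyperlink, in document order. */
    static List<String> runTexts(Element hyperlink) {
        List<String> texts = new ArrayList<>();
        NodeList runs = hyperlink.getElementsByTagNameNS(W_NS, "r");
        for (int i = 0; i < runs.getLength(); i++) {
            texts.add(runs.item(i).getTextContent());
        }
        return texts;
    }

    /** Puts value into the first w:t of the hyperlink and empties every other w:t. */
    static void setTextInFirstRun(Element hyperlink, String value) {
        NodeList textElements = hyperlink.getElementsByTagNameNS(W_NS, "t");
        for (int i = 0; i < textElements.getLength(); i++) {
            textElements.item(i).setTextContent(i == 0 ? value : "");
        }
    }

    public static void main(String[] args) throws Exception {
        Element hyperlink = parseRoot(HYPERLINK_XML);
        System.out.println(runTexts(hyperlink)); // three separate pieces of text
        setTextInFirstRun(hyperlink, "Yahoo");
        System.out.println(runTexts(hyperlink)); // only the first run carries text now
    }
}
```

The runs themselves stay in place, including the bold w:rPr of the second run. They just no longer contribute any visible text, which is why the formatting of the first run is the one that shows.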

I found a way to edit the URL of the link in an "indirect way" (copy the previous hyperlink, modify the URL, delete the previous hyperlink and add the new one to the paragraph).

Code is shown below:

import org.apache.poi.xwpf.usermodel.*;
import org.apache.xmlbeans.XmlObject;

private void editLinksOfParagraph(XWPFParagraph paragraph, XWPFDocument document) {
    for (int rIndex = 0; rIndex < paragraph.getRuns().size(); rIndex++) {
        XWPFRun run = paragraph.getRuns().get(rIndex);

        if (run instanceof XWPFHyperlinkRun) {
            // get the URL of the link in order to edit it
            XWPFHyperlink link = ((XWPFHyperlinkRun) run).getHyperlink(document);
            if (link == null) {
                continue; // no external target found for this run, e.g. a link to a bookmark
            }
            String linkURL = link.getURL();
            // copy the XML of the run; it holds the text and the formatting of the link
            XmlObject xmlObject = run.getCTR().copy();

            linkURL += "-edited-link"; // the edited URL, e.g. with an '-edited-link' suffix
            // remove the previous link from the paragraph
            paragraph.removeRun(rIndex);
            // add a new hyperlink run with the updated URL at the position of the deleted one
            XWPFHyperlinkRun hyperlinkRun = paragraph.insertNewHyperlinkRun(rIndex, linkURL);
            // restore the copied text and formatting in the new run
            hyperlinkRun.getCTR().set(xmlObject);
        }
    }
}

This works, but it needs some more steps to carry over the text formatting correctly:

    try (var fis = new FileInputStream(fileName);
            var doc = new XWPFDocument(fis)) {
        var pList = doc.getParagraphs();
        for (var p : pList) {
            var runs = p.getRuns();
            for (int i = 0; i < runs.size(); i++) {
                var r = runs.get(i);
                if (r instanceof XWPFHyperlinkRun) {
                    var run = (XWPFHyperlinkRun) r;
                    var link = run.getHyperlink(doc);
                    // print "text: URL" of the old link for checking
                    System.out.println(run.getText(0) + ": " + link.getURL());
                    // insert a new hyperlink run with the new URL in front of the old one
                    var run1 = p.insertNewHyperlinkRun(i, "http://google.com");
                    run1.setText(run.getText(0));
                    // remove the old link, which has moved to position i + 1
                    p.removeRun(i + 1);
                }
            }
        }
        try (var fos = new FileOutputStream(outFileName)) {
            doc.write(fos);
        }
    }

I'm using these libraries:

implementation 'org.apache.poi:poi:5.2.2'
implementation 'org.apache.poi:poi-ooxml:5.2.2'
