java - How to get the data of a text box in a docx file with docx4j

I have a project that needs to read all the content of a docx file, but I don't know how to get it. All I can get is the list of paragraphs. I want to get the data inside text boxes too. Here is my code:

List<Object> texts = getAllElementFromObject(document.getMainDocumentPart(), P.class);

I also tried getAllElementFromObject(document.getMainDocumentPart(), CTTextbox.class), but I still can't get the text box data.

My method getAllElementFromObject():

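    // Recursively collects every object of class toSearch that is reachable
    // through ContentAccessor nodes, starting at obj.
    public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
        List<Object> result = new ArrayList<Object>();
        // Unwrap JAXB wrappers so the class comparison sees the real object.
        if (obj instanceof JAXBElement) obj = ((JAXBElement<?>) obj).getValue();

        if (obj.getClass().equals(toSearch))
            result.add(obj);
        else if (obj instanceof ContentAccessor) {
            // Only ContentAccessor implementations are descended into.
            List<?> children = ((ContentAccessor) obj).getContent();
            for (Object child : children) {
                result.addAll(getAllElementFromObject(child, toSearch));
            }
        }
        return result;
    }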

A text box created in Word looks something like this:

    <w:p >
        <w:r >
            <w:pict>
                <v:shapetype o:spt="202.0" path="m,l,21600r21600,l21600,xe" coordsize="21600,21600" id="_x0000_t202">
                    <v:stroke joinstyle="miter"/>
                    <v:path gradientshapeok="t" o:connecttype="rect"/>
                </v:shapetype>
                <v:shape o:gfxdata="UEsDB..8EAABkcnMvZG93bnJldzAAAAhwUAAAAA" type="#_x0000_t202" style="position:absolute;margin-left:0;margin-top:0;width:186.95pt;height:110.55pt;z-index:251659264;visibility:visible;mso-wrap-style:square;mso-width-percent:400;mso-height-percent:200;mso-wrap-distance-left:9pt;mso-wrap-distance-top:0;mso-wrap-distance-right:9pt;mso-wrap-distance-bottom:0;mso-position-horizontal:center;mso-position-horizontal-relative:text;mso-position-vertical:absolute;mso-position-vertical-relative:text;mso-width-percent:400;mso-height-percent:200;mso-width-relative:margin;mso-height-relative:margin;v-text-anchor:top" id="Text Box 2" o:spid="_x0000_s1026">
                    <v:textbox style="mso-fit-shape-to-text:t">
                        <w:txbxContent>
                            <w:p >
                                <w:r>
                                    <w:t>foo</w:t>
                                </w:r>
                            </w:p>
                        </w:txbxContent>
                    </v:textbox>
                </v:shape>
            </w:pict>
        </w:r>
    </w:p>

The relevant objects here are:

  • org.docx4j.vml.CTTextbox
  • org.docx4j.wml.CTTxbxContent (which might contain a content control)

Your code isn't going to work because Pict doesn't implement ContentAccessor, so the recursion never descends into the w:pict element that holds the text box.
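
To see why the recursion stops, here is a small self-contained sketch that does not use docx4j at all. ContentAccessor, Container, Pict and Text below are hypothetical stand-ins written for this illustration, not the real docx4j classes; the only thing they share with the real ones is that the Pict stand-in does not implement ContentAccessor. The first finder repeats the logic of the method in the question, the second adds an explicit branch for Pict.

```java
import java.util.ArrayList;
import java.util.List;

public class PictTraversalDemo {

    // Hypothetical stand-ins for illustration only; NOT the docx4j types.
    interface ContentAccessor {
        List<Object> getContent();
    }

    static class Container implements ContentAccessor {
        private final List<Object> content = new ArrayList<>();
        public List<Object> getContent() { return content; }
    }

    // Like Pict in the answer, this stand-in does not implement ContentAccessor.
    static class Pict {
        private final List<Object> children = new ArrayList<>();
        List<Object> getChildren() { return children; }
    }

    static class Text {
        final String value;
        Text(String value) { this.value = value; }
    }

    // Same logic as the question's method: only ContentAccessor nodes are descended into.
    static List<Object> findViaContentAccessor(Object obj, Class<?> toSearch) {
        List<Object> result = new ArrayList<>();
        if (obj.getClass().equals(toSearch)) {
            result.add(obj);
        } else if (obj instanceof ContentAccessor) {
            for (Object child : ((ContentAccessor) obj).getContent()) {
                result.addAll(findViaContentAccessor(child, toSearch));
            }
        }
        return result;
    }

    // Variant with an explicit branch for the Pict stand-in.
    static List<Object> findWithPictBranch(Object obj, Class<?> toSearch) {
        List<Object> result = new ArrayList<>();
        if (obj.getClass().equals(toSearch)) {
            result.add(obj);
        } else if (obj instanceof ContentAccessor) {
            for (Object child : ((ContentAccessor) obj).getContent()) {
                result.addAll(findWithPictBranch(child, toSearch));
            }
        } else if (obj instanceof Pict) {
            for (Object child : ((Pict) obj).getChildren()) {
                result.addAll(findWithPictBranch(child, toSearch));
            }
        }
        return result;
    }

    // One paragraph holding a plain Text plus a Pict that wraps a second paragraph.
    static Container sampleTree() {
        Container paragraph = new Container();
        paragraph.getContent().add(new Text("outside"));
        Container boxParagraph = new Container();
        boxParagraph.getContent().add(new Text("foo"));
        Pict pict = new Pict();
        pict.getChildren().add(boxParagraph);
        paragraph.getContent().add(pict);
        return paragraph;
    }

    public static void main(String[] args) {
        Container tree = sampleTree();
        System.out.println(findViaContentAccessor(tree, Text.class).size()); // prints 1
        System.out.println(findWithPictBranch(tree, Text.class).size());     // prints 2
    }
}
```

The first finder misses "foo" because the Pict node fails both the class check and the ContentAccessor check, so nothing beneath it is ever visited.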

So instead, please try ClassFinder: https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/finders/ClassFinder.java
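
A minimal sketch of how ClassFinder is usually driven, assuming docx4j is on the classpath and that TraversalUtil (the walker that ClassFinder is a callback for) descends into the w:pict content. The class name TextBoxTextFinder and the helpers findAll and allText are hypothetical names chosen for this illustration. Searching for Text.class instead of CTTextbox.class is also an assumption: which VML wrapper objects the traversal reports may differ between docx4j versions, so it is worth checking the results against your own document.

```java
import java.util.List;

import org.docx4j.TraversalUtil;
import org.docx4j.finders.ClassFinder;
import org.docx4j.wml.Text;

public class TextBoxTextFinder {

    // Collects every object of the given class that TraversalUtil reaches
    // when walking the supplied content list.
    public static List<Object> findAll(List<Object> content, Class<?> type) {
        ClassFinder finder = new ClassFinder(type);
        // The TraversalUtil constructor performs the walk and calls the finder on each node.
        new TraversalUtil(content, finder);
        return finder.results;
    }

    // Concatenates the values of all w:t elements found in the content list.
    public static String allText(List<Object> content) {
        StringBuilder sb = new StringBuilder();
        for (Object o : findAll(content, Text.class)) {
            sb.append(((Text) o).getValue());
        }
        return sb.toString();
    }
}
```

With the variables from the question, a call might look like allText(document.getMainDocumentPart().getContent()); passing P.class to findAll instead should return the paragraphs the traversal reaches.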
