
Find a string in a list of .odt files and print the matching lines

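I'm trying to find a way to search for a word in a list of .odt files, that is, a word inside the .odt files themselves. I then want to see which files contain the word and the lines that match it (or at least a few words before and after it).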


for file in *.odt; do unzip -c "$file" | grep -iq "searched_word" && echo "$file"; done




the is the first line with searched_word blabla : /path/filename1.odt
the is the second line with searched_word blabla : /path/filename2.odt


Read the grep output into a variable and echo it in the same statement:

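# The braces keep read and echo in the same subshell; in bash, a variable set by
# "read" at the end of a pipeline is otherwise lost before echo runs.
grep -i "searched_word" | { read -r x && echo "$x:$file"; }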

$ cat filename1.odt
the is the first line with searched_word blabla
$ cat filename2.odt
gfdgj gdflgjdfl
the is the second line with searched_word blabla
fdg gdfgdf
$ for file in *.odt; do cat "$file" | grep -i "searched_word" | { read -r x && echo "$x:$file"; }; done
the is the first line with searched_word blabla:filename1.odt
the is the second line with searched_word blabla:filename2.odt


$ for file in *.odt; do cat "$file" | grep -i "QQQQQ" | { read -r x && echo "$x:$file"; }; done
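The read-and-echo approach above only reports the first matching line of each file. A minimal sketch of a variant that tags every matching line is shown here; `tag_matches` is a hypothetical helper name, and like the demonstration above it assumes the files are plain text (a real .odt would first need to be unpacked, for example with unzip).

```shell
# tag_matches WORD FILE
# Hypothetical helper: print every line of FILE that contains WORD
# (case-insensitive), each followed by ":FILE".
tag_matches() {
    local word=$1 file=$2 line
    grep -i -- "$word" "$file" | while IFS= read -r line; do
        printf '%s:%s\n' "$line" "$file"
    done
}

# Usage: for file in *.odt; do tag_matches "searched_word" "$file"; done
```

The loop runs in the pipeline's subshell, so nothing needs to survive after the pipeline ends; each line is printed as soon as it is read.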

One way to do it is to make grep print your file name even though you are reading from standard input. These are the relevant options:

   -H, --with-filename
          Print the file name for each match.  This is the default when there is more than one file to search.

   --label=LABEL
          Display input actually coming from standard input as input coming from file LABEL.  This is especially useful when implementing tools like zgrep, e.g., gzip -cd foo.gz | grep --label=foo -H something.  See also the -H option.

   -n, --line-number
          Prefix each line of output with the 1-based line number within its input file.

   -a, --text
          Process a binary file as if it were text; this is equivalent to the --binary-files=text option.

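So you just need to set --label="$file" -Ha -n to get output as if you had run grep directly on the file.

You need -H. It is an easy one to forget, but without it grep sees "only 1 input file" and therefore prints no label.

You may need -a if grep's heuristics decide that the input looks binary.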
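Put together, the options above could be used along these lines. This is a sketch rather than a tested recipe for every setup: `label_grep` and `search_odt` are hypothetical helper names, `--label` is assumed to be available (it is in GNU grep), and the text of each document is assumed to live in the `content.xml` member of the .odt archive, which `unzip -p` writes to standard output.

```shell
# label_grep LABEL WORD
# Hypothetical helper: search standard input for WORD (case-insensitive) and
# prefix each match with LABEL and the line number, as if grep had read a
# file named LABEL.
label_grep() {
    grep --label="$1" -H -a -n -i -- "$2"
}

# search_odt WORD FILE...
# Hypothetical helper: stream content.xml out of each .odt and search it.
search_odt() {
    local word=$1 file
    shift
    for file in "$@"; do
        unzip -p "$file" content.xml | label_grep "$file" "$word"
    done
}

# Usage: search_odt "searched_word" *.odt
```

Each hit is printed as `file:line:text`. Note that grep matches against the raw XML, so a hit includes the surrounding markup.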

Actually, why can't you just run grep directly? Some grep installations will decompress .gz files automatically.
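The question also asks for "a few words before and after" the match. Because `content.xml` is often stored as one very long line, printing whole matching lines can produce a lot of markup. One possible workaround, not part of the answer above, is to have grep print only the match plus a bounded number of characters on either side. `match_context` is a hypothetical helper, and it treats the word as an extended regular expression, so special characters in it would need escaping.

```shell
# match_context WORD [N]
# Hypothetical helper: read text on standard input and print each occurrence
# of WORD (case-insensitive) with up to N characters of context on each side.
# N defaults to 40.
match_context() {
    local word=$1 n=${2:-40}
    grep -a -o -i -E -- ".{0,$n}$word.{0,$n}"
}

# Usage: unzip -p filename1.odt content.xml | match_context "searched_word" 30
```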

A basic implementation in Java 1.8:

package app;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.Namespace;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

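/**
 *  OfficeSearch
 */
public class OfficeSearch {
    private final Set<String> searchSet = new HashSet<>();
    private static final OfficeSearch INSTANCE = new OfficeSearch();

    // main
    public static void main(String[] args) {
        INSTANCE.execute(args);
    }

    //  execute: args[0] is the directory, the remaining arguments are search terms
    private void execute(String[] args) {
        if (args.length > 1) {
            for (int i=1; i<args.length; i++) {
                // lower-cased because search() compares against lower-cased text
                searchSet.add(args[i].toLowerCase());
            }
            try {
                // Assumption: the body of this try block is missing from the source;
                // listing the directory is inferred from the Files/Path/Paths imports.
                Files.list(Paths.get(args[0]))
                     .map(Path::toFile)
                     .filter(this::is_odt)
                     .forEach(this::search);
            }
            catch (IOException e) {
                e.printStackTrace();
            }
        }
        else {
            System.out.println("Usage: OfficeSearch <directory> <search_term> [...]");
        }
    }

    //  is_odt: true for regular files whose extension is "odt" (any case)
    private boolean is_odt(File file) {
        if (file.isFile()) {
            final String name = file.getName();
            final int dotidx = name.lastIndexOf('.');
            if ((0 <= dotidx) && (dotidx < name.length() - 1)) {
                return name.substring(dotidx + 1).equalsIgnoreCase("odt");
            }
        }
        return false;
    }

    // search: scan the paragraphs and headings in content.xml for any search term
    private void search(File odt) {
        try (ZipFile zip = new ZipFile(odt)) {
            final ZipEntry content = zip.getEntry("content.xml");
            if (content != null) {
                final SAXBuilder builder = new SAXBuilder();
                final Document doc = builder.build(zip.getInputStream(content));
                final Element root = doc.getRootElement();
                final Namespace office_ns = root.getNamespace("office");
                final Namespace text_ns = root.getNamespace("text");
                final Element body = root.getChild("body", office_ns);
                if (body != null) {
                    boolean found = false;
                    for (Element e : body.getDescendants(Filters.element(text_ns))) {
                        if ("p".equals(e.getName()) ||
                            "h".equals(e.getName())) {
                            final String s = e.getValue().toLowerCase();
                            for (String p : searchSet) {
                                if (s.contains(p)) {
                                    // print the file name once, before its first match
                                    if (!found) {
                                        found = true;
                                        System.out.println("\n" + odt.toString());
                                    }
                                    // Assumption: the source is cut off here; printing the
                                    // matching text once per paragraph is a guess at the intent.
                                    System.out.println(e.getValue());
                                    break;
                                }
                            }
                        }
                    }
                }
            }
        }
        catch (IOException | JDOMException e) {
            e.printStackTrace();
        }
    }
}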


