
How can Java extract words from a text file?

I have a text file which contains data on one line, and I want to extract words from it.

The words I want to extract are: "id" and "token".

With Java I can read the file:

import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;

public class ReadStringFromFile
{
    public static void main(String[] args) throws IOException
    {
        File file = new File("test.txt");
        String string = FileUtils.readFileToString(file);
        System.out.println("Read in: " + string);
    }
}

As the text file is on one line, I do not know how to extract a value from the String.

You need to split the string.

In your case I assume the words are separated by whitespace, so string.split("\\s+"); should do the trick.
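As a rough illustration of the whitespace split, here is a minimal sketch. The class name `WhitespaceSplitDemo`, the helper `splitWords`, and the sample line are made up for this example; it also assumes the words really are separated by spaces or tabs, which may not hold for the actual file.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Splits on runs of whitespace. trim() comes first because a leading
    // space would otherwise produce an empty first element.
    static String[] splitWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // "".split(...) would return one empty string
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "  id 42   token\tabc  "; // hypothetical sample data
        System.out.println(Arrays.toString(splitWords(line)));
        // prints [id, 42, token, abc]
    }
}
```

If the file is actually JSON, splitting on whitespace will not isolate the keys and values, and a pattern-based or parser-based approach is probably a better fit.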

It looks like you're trying to parse some JSON. You could use a JSON parser (see http://www.json.org/java/) or, if your needs are simple, use a regex to extract the bits you want. Maybe something like:

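    import java.io.File;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.commons.io.FileUtils;

    public class ExtractIdAndToken {
        public static void main(String[] args) throws IOException {
            File file = new File("test.txt");
            // Assumes the file is UTF-8; the one-argument overload is deprecated
            String string = FileUtils.readFileToString(file, StandardCharsets.UTF_8);
            // Group 1 is a quoted key; group 2 is its value (quoted string, {...} object, or bare literal)
            Pattern re = Pattern.compile("(?:,|\\{)?\"([^:]*)\":(\"[^\"]*\"|\\{[^}]*\\}|[^},]*}?)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            Matcher m = re.matcher(string);

            // Create a map of all values
            Map<String, String> map = new HashMap<>();
            String id = "NOT_FOUND";
            String token = "NOT_FOUND";
            while (m.find()) {
                // Trim the key and strip quotes from the value once, then reuse both
                String key = m.group(1).trim();
                String value = m.group(2).replace("\"", "");
                map.put(key, value);
                if (key.equals("id")) {
                    id = value;
                }
                if (key.equals("token")) {
                    token = value;
                }
            }

            System.out.println("id = " + id + " : token = " + token);

            // or
            System.out.println(map);
        }
    }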
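When only a couple of known keys are needed, a narrower pattern built per key may be easier to follow than the catch-all regex. The following is a minimal sketch using only the standard library. The class `JsonFieldRegexDemo`, the helper `extractValue`, and the sample string are hypothetical, and the sketch assumes the values are plain strings or bare literals with no escaped quotes. It is not a JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonFieldRegexDemo {
    // Looks for "key": "value" or "key": bareValue anywhere in the text.
    // Limitations: escaped quotes inside values and object/array values
    // are not handled; use a real JSON parser for those.
    static Optional<String> extractValue(String json, String key) {
        Pattern p = Pattern.compile(
            "\"" + Pattern.quote(key) + "\"\\s*:\\s*(?:\"([^\"]*)\"|([^\\s,}\\]]+))");
        Matcher m = p.matcher(json);
        if (!m.find()) {
            return Optional.empty();
        }
        // Group 1 holds a quoted string value, group 2 an unquoted literal.
        return Optional.of(m.group(1) != null ? m.group(1) : m.group(2));
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real file contents are not shown in the question.
        String sample = "{\"id\":\"4ef2\",\"user\":{\"token\":\"abc123\",\"age\": 7}}";
        String id = extractValue(sample, "id").orElse("NOT_FOUND");
        String token = extractValue(sample, "token").orElse("NOT_FOUND");
        System.out.println("id = " + id + " : token = " + token);
        // prints id = 4ef2 : token = abc123
    }
}
```

Returning an `Optional` instead of a "NOT_FOUND" sentinel leaves the choice of fallback to the caller.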
