Translating words in a string using BufferedReader (Java)

Question

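I have been working on this for a few days now and I can't make any headway. I've tried using both Scanner and BufferedReader and had no luck.

Basically, I have a working method (shortenWord) that takes a String and shortens it according to a text file formatted like this: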

hello,lo
any,ne
anyone,ne1
thanks,thx

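It handles punctuation as well, so 'hello?' becomes 'lo?', and so on.

I need to be able to read in a String and translate each word individually, so "hello? any anyone thanks!" would become "lo? ne ne1 thx!", basically applying the method I already have to each word in the String. The code I have translates the first word but does nothing to the rest. I think it has something to do with how my BufferedReader is working.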

import java.io.*;

public class Shortener {
    private FileReader in ;
    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        try {
            in = new FileReader( "abbreviations.txt" );
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;
        String outWord = new String() ;
        BufferedReader abrv = new BufferedReader(in) ;

            // ends in punctuation
            if (punc.indexOf(finalchar) != -1 ) {
                String sub = inWord.substring(0, inWord.length()-1) ;
                outWord = sub + finalchar ;


            try {
                String line;
                while ( (line = abrv.readLine()) != null ) {
                    String[] lineArray = line.split(",") ;
                        if ( line.contains(sub) ) {
                            outWord = lineArray[1] + finalchar ;
                            }
                        }
                    }

            catch (IOException e) {
                System.out.println(e) ;
                }
            }

            // no punctuation
            else {
                outWord = inWord ;

                try {
                String line;

                    while( (line = abrv.readLine()) != null) {
                        String[] lineArray = line.split(",") ;
                            if ( line.contains(inWord) ) {
                                outWord = lineArray[1] ;
                            }
                        }
                    }

                catch (IOException ioe) {
                   System.out.println(ioe) ; 
                }
            }

        return outWord;
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }
}

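Any help, or even a nudge in the right direction, would be greatly appreciated.

Edit: I've tried closing the BufferedReader at the end of the shortenWord method. That just gives me an error on every word in the String after the first one, saying that the BufferedReader is closed.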

Answer 1

So I had a look at this. First off, if you have the option to change the format of your text file, I would change it to something like this (or XML):

 key1=value1
 key2=value2

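By doing so you can later use Java's Properties.load(Reader). That removes the need to parse the file by hand.

If by any chance you can't change the format, you'll have to parse it yourself. Something like the code below should do that and put the results into a Map called shortningRules, which can then be used later.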
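As a rough illustration of the Properties route, the sketch below loads key=value pairs from a Reader. The class name PropertiesRules, the helper method load, and the in-memory StringReader standing in for the reformatted abbreviations file are all assumptions made for the example; with a real file you would pass a FileReader instead.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class, not part of the original code.
class PropertiesRules {

    // Loads "key=value" lines from any Reader into a Properties object.
    static Properties load(Reader reader) {
        Properties rules = new Properties();
        try {
            rules.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rules;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a reformatted abbreviations file.
        String fileContents = "hello=lo\nany=ne\nanyone=ne1\nthanks=thx\n";
        Properties rules = load(new StringReader(fileContents));
        System.out.println(rules.getProperty("anyone"));             // ne1
        // Unknown words fall back to the supplied default.
        System.out.println(rules.getProperty("goodbye", "goodbye")); // goodbye
    }
}
```

The two-argument getProperty is handy here because a word without an abbreviation simply maps to itself.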

private void parseInput(FileReader reader) {
    try (BufferedReader br = new BufferedReader(reader)) {
        String line;
        while ((line = br.readLine()) != null) {
            String[] lineComponents = line.split(",");
            this.shortningRules.put(lineComponents[0], lineComponents[1]);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

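When it comes to actually shortening the message, I would probably opt for a regex approach, e.g. \bKEY\b (written "\\bKEY\\b" in a Java string literal), where KEY is the word you want shortened. \b is an anchor in regex that marks a word boundary: it matches the position between a word character and a non-word character, so it does not consume the surrounding whitespace or punctuation. The whole code for doing the shortening then becomes something like this: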

public void shortenMessage(String message) {
    for (Entry<String, String> entry : shortningRules.entrySet()) {
        message = message.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
    }
    System.out.println(message); //This should probably be a return statement instead of a sysout.
}
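To see what the \b anchors buy you, here is a small sketch comparing a plain replacement with a word-bounded one. The class and method names are made up for the example, and the Pattern.quote / Matcher.quoteReplacement calls are an addition not present in the code above; they are there so that keys or values containing regex metacharacters are treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class WordBoundaryDemo {

    // Replaces key only where it stands alone as a whole word.
    static String replaceWholeWord(String message, String key, String value) {
        String regex = "\\b" + Pattern.quote(key) + "\\b";
        return message.replaceAll(regex, Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String message = "any anyone";
        // Plain substring replacement also rewrites the start of "anyone".
        System.out.println(message.replace("any", "ne"));           // ne neone
        // With \b only the standalone word is replaced.
        System.out.println(replaceWholeWord(message, "any", "ne")); // ne anyone
    }
}
```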

Putting it all together gives you something like this; here I've added a main for testing purposes.
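A minimal sketch of how the two pieces above might fit together is given here. It is a reconstruction under assumptions rather than the answer's exact listing: the class name RegexShortener, the Reader-taking constructor, the LinkedHashMap, the skipping of malformed lines, the quoting calls, and the in-memory sample data are all choices made for illustration, and shortenMessage returns the result instead of printing it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; assembles parseInput and shortenMessage.
class RegexShortener {
    private final Map<String, String> shortningRules = new LinkedHashMap<>();

    RegexShortener(Reader reader) {
        parseInput(reader);
    }

    // Reads "word,abbreviation" lines once, up front, then closes the reader.
    private void parseInput(Reader reader) {
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(",");
                if (parts.length == 2) { // skip blank or malformed lines
                    shortningRules.put(parts[0].trim(), parts[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Applies every rule to the message, matching whole words only.
    String shortenMessage(String message) {
        for (Map.Entry<String, String> entry : shortningRules.entrySet()) {
            message = message.replaceAll(
                    "\\b" + Pattern.quote(entry.getKey()) + "\\b",
                    Matcher.quoteReplacement(entry.getValue()));
        }
        return message;
    }

    public static void main(String[] args) {
        // Stand-in for new FileReader("abbreviations.txt").
        Reader rules = new StringReader("hello,lo\nany,ne\nanyone,ne1\nthanks,thx\n");
        RegexShortener shortener = new RegexShortener(rules);
        System.out.println(shortener.shortenMessage("hello? any anyone thanks!"));
        // prints: lo? ne ne1 thx!
    }
}
```

Because the file is read once in the constructor, the message can be shortened any number of times without touching the reader again.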

Answer 2

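I think you can get a simpler solution using a HashMap. Read all the abbreviations into the map when the Shortener object is created, and just look a word up in it whenever you have one. The word will be the key and the abbreviation the value. Like so: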

import java.io.*;
import java.util.HashMap;

public class Shortener {

    //the map
    private HashMap<String, String> abbreviations;

    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        //initialize the map
        this.abbreviations = new HashMap<>();
        //try-with-resources closes the reader once the file has been read
        try (BufferedReader abrv = new BufferedReader(new FileReader("abbreviations.txt"))) {
            String line;
            while ((line = abrv.readLine()) != null) {
                String [] abv = line.split(",");
                //If a line does not contain exactly two items, the file is malformed
                if (abv.length != 2) {
                    throw new IllegalArgumentException("Malformed abbreviation file");
                }
                //populate the map with the word as key and abbreviation as value
                abbreviations.put(abv[0], abv[1]);
            }
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;

        // ends in punctuation
        if (punc.indexOf(finalchar) != -1) {
            String sub = inWord.substring(0, inWord.length() - 1);

            //Reference map
            String abv = abbreviations.get(sub);
            if (abv == null)
                return inWord;
            return new StringBuilder(abv).append(finalchar).toString();
        }

        // no punctuation
        else {
            //Reference map
            String abv = abbreviations.get(inWord);
            if (abv == null)
                return inWord;
            return abv;
        }
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }

    public static void main (String [] args) {
        Shortener s = new Shortener();
        s.shortenMessage("hello? any anyone thanks!");
    }
}

Output:

lo?
ne
ne1
thx!
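For context on why the version in the question only handled the first word: the FileReader shared across calls is a one-way stream, so once the first shortenWord call has looped to the end of the file, every later readLine() returns null straight away and the loop body never runs. The sketch below shows that behaviour with an in-memory StringReader standing in for the file and a single BufferedReader read twice; the class and method names are invented for the demonstration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
class ExhaustedReaderDemo {

    // Counts the lines still available from the reader, consuming them.
    static int countRemainingLines(BufferedReader reader) {
        int count = 0;
        try {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        BufferedReader abrv = new BufferedReader(new StringReader("hello,lo\nany,ne\n"));
        System.out.println(countRemainingLines(abrv)); // 2: the first pass sees every line
        System.out.println(countRemainingLines(abrv)); // 0: the reader is already at end of input
    }
}
```

Loading the file into a map once, as in the constructor above, sidesteps the problem because the reader is never needed again.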

Edit:

Going by atomman's answer, you can basically remove the shortenWord method by modifying the shortenMessage method as shown below:

public void shortenMessage(String inMessage) {
     // the \\b anchors keep "any" from matching inside "anyone"
     for (Entry<String, String> entry : this.abbreviations.entrySet())
         inMessage = inMessage.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());

     System.out.println(inMessage);
}
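One caveat with looping over the rules is that each replaceAll runs over the output of the previous one, so the result can depend on the map's iteration order if an abbreviation happens to be a key itself. As an alternative sketch (a different technique from the loop above, and not something either answer proposed), the message can be scanned once and each word looked up in the map. The class name, the \w+ definition of a word, and the sample rules are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, for illustration only.
class SinglePassShortener {
    // Assumption: a "word" is a run of letters, digits or underscores.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Scans the message once, swapping each word for its abbreviation if one exists.
    static String shorten(String message, Map<String, String> abbreviations) {
        Matcher m = WORD.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group();
            String replacement = abbreviations.getOrDefault(word, word);
            // quoteReplacement keeps '$' and '\' in abbreviations literal
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy whatever follows the last word
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> abbreviations = new HashMap<>();
        abbreviations.put("hello", "lo");
        abbreviations.put("any", "ne");
        abbreviations.put("anyone", "ne1");
        abbreviations.put("thanks", "thx");
        System.out.println(shorten("hello? any anyone thanks!", abbreviations));
        // prints: lo? ne ne1 thx!
    }
}
```

Since every word is replaced at most once, an abbreviation is never fed back through the rules.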

Answer 1
3 2015-04-02 11:25:14

Answer 2
2 (accepted) 2015-04-02 10:59:47
