
Translate words in a string using BufferedReader (Java)

I've been working on this for a few days now and I just can't make any headway. I've tried using Scanner and BufferedReader and had no luck.

Basically, I have a working method (shortenWord) that takes a String and shortens it according to a text file formatted like this:

hello,lo
any,ne
anyone,ne1
thanks,thx

It also accounts for punctuation, so 'hello?' becomes 'lo?' etc.

I need to be able to read in a String and translate each word individually, so "hello? any anyone thanks!" will become "lo? ne ne1 thx!", basically using the method I already have on each word in the String. The code I have will translate the first word but then does nothing to the rest. I think it's something to do with how my BufferedReader is working.

import java.io.*;

public class Shortener {
    private FileReader in ;
    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        try {
            in = new FileReader( "abbreviations.txt" );
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;
        String outWord = new String() ;
        BufferedReader abrv = new BufferedReader(in) ;

            // ends in punctuation
            if (punc.indexOf(finalchar) != -1 ) {
                String sub = inWord.substring(0, inWord.length()-1) ;
                outWord = sub + finalchar ;


            try {
                String line;
                while ( (line = abrv.readLine()) != null ) {
                    String[] lineArray = line.split(",") ;
                        if ( line.contains(sub) ) {
                            outWord = lineArray[1] + finalchar ;
                            }
                        }
                    }

            catch (IOException e) {
                System.out.println(e) ;
                }
            }

            // no punctuation
            else {
                outWord = inWord ;

                try {
                String line;

                    while( (line = abrv.readLine()) != null) {
                        String[] lineArray = line.split(",") ;
                            if ( line.contains(inWord) ) {
                                outWord = lineArray[1] ;
                            }
                        }
                    }

                catch (IOException ioe) {
                   System.out.println(ioe) ; 
                }
            }

        return outWord;
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }
}

Any help, or even a nudge in the right direction, would be much appreciated.

Edit: I've tried closing the BufferedReader at the end of the shortenWord method, and it just results in an error on every word in the String after the first one, saying that the BufferedReader is closed.

So I took a look at this. First of all, if you have the option to change the format of your text file, I would change it to something like this (or XML):

 key1=value1
 key2=value2

By doing this you could later use Java's Properties.load(Reader). This would remove the need for any manual parsing of the file.
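
As a rough illustration of the Properties route, here is a minimal sketch. The class name PropertiesAbbreviations and the inlined file contents are assumptions made for the example; with a real file you would pass a FileReader instead of the StringReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper class, not part of the original code.
class PropertiesAbbreviations {

    // Loads key=value pairs from any Reader and closes it afterwards.
    static Properties load(Reader source) throws IOException {
        Properties rules = new Properties();
        try (Reader r = source) {
            rules.load(r);
        }
        return rules;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file contents, inlined so the example runs without a file on disk.
        String contents = "hello=lo\nany=ne\nanyone=ne1\nthanks=thx\n";
        Properties rules = load(new StringReader(contents));

        System.out.println(rules.getProperty("anyone"));
        // The two-argument overload returns the default when no rule exists.
        System.out.println(rules.getProperty("goodbye", "goodbye"));
    }
}
```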

If by any chance you don't have the option to change the format, then you'll have to parse it yourself. Something like the code below would do that and put the results into a Map called shortningRules, which could then be used later.

private void parseInput(FileReader reader) {
    try (BufferedReader br = new BufferedReader(reader)) {
        String line;
        while ((line = br.readLine()) != null) {
            String[] lineComponents = line.split(",");
            this.shortningRules.put(lineComponents[0], lineComponents[1]);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

When it comes to actually shortening a message I would probably opt for a regex approach, e.g. \\bKEY\\b (written as a Java string literal), where KEY is the word you want shortened. \\b is an anchor in regex that marks a word boundary: it matches the position between a word character and a non-word character without consuming any text, so the surrounding spaces and punctuation are left untouched. The whole code for doing the shortening would then become something like this:

public void shortenMessage(String message) {
    for (Entry<String, String> entry : shortningRules.entrySet()) {
        message = message.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
    }
    System.out.println(message); //This should probably be a return statement instead of a sysout.
}

Putting it all together will give you something like this; here I've added a main for testing purposes.
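
The combined code is not reproduced on this page, so what follows is only a sketch of how the two pieces above might fit together, not the answerer's original. The class name RegexShortener, the use of a StringReader in place of the real abbreviations file, and the added Pattern.quote / Matcher.quoteReplacement calls (which guard against regex metacharacters in the file) are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class combining the parsing and the regex replacement.
class RegexShortener {
    private final Map<String, String> shortningRules = new LinkedHashMap<>();

    // Reads "word,abbreviation" lines once, up front, and closes the reader.
    RegexShortener(Reader source) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(",");
                if (parts.length == 2) { // skip blank or malformed lines
                    shortningRules.put(parts[0].trim(), parts[1].trim());
                }
            }
        }
    }

    // Replaces every whole-word occurrence of each key with its abbreviation.
    String shortenMessage(String message) {
        for (Map.Entry<String, String> entry : shortningRules.entrySet()) {
            String pattern = "\\b" + Pattern.quote(entry.getKey()) + "\\b";
            message = message.replaceAll(pattern, Matcher.quoteReplacement(entry.getValue()));
        }
        return message;
    }

    public static void main(String[] args) throws IOException {
        // Assumed contents of abbreviations.txt, inlined for the example.
        String rules = "hello,lo\nany,ne\nanyone,ne1\nthanks,thx\n";
        RegexShortener s = new RegexShortener(new StringReader(rules));
        System.out.println(s.shortenMessage("hello? any anyone thanks!")); // lo? ne ne1 thx!
    }
}
```

Because the file is read once in the constructor, every call to shortenMessage sees the full rule set, which avoids the exhausted-reader problem in the question.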

I think you can have a simpler solution using a HashMap. Read all the abbreviations into the map when the Shortener object is created, and just reference it once you have a word. The word will be the key and the abbreviation the value. Like this:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;

public class Shortener {

    private FileReader in;
    //the map
    private HashMap<String, String> abbreviations;

    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        //initialize the map
        this.abbreviations = new HashMap<>();
        try {
            in = new FileReader("abbreviations.txt" );
            BufferedReader abrv = new BufferedReader(in) ;
            String line;
            while ((line = abrv.readLine()) != null) {
                String [] abv = line.split(",");
                //If there is not two items in the file, the file is malformed
                if (abv.length != 2) {
                    throw new IllegalArgumentException("Malformed abbreviation file");
                }
                //populate the map with the word as key and abbreviation as value
                abbreviations.put(abv[0], abv[1]);
            }
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;

        // ends in punctuation
        if (punc.indexOf(finalchar) != -1) {
            String sub = inWord.substring(0, inWord.length() - 1);

            //Reference map
            String abv = abbreviations.get(sub);
            if (abv == null)
                return inWord;
            return new StringBuilder(abv).append(finalchar).toString();
        }

        // no punctuation
        else {
            //Reference map
            String abv = abbreviations.get(inWord);
            if (abv == null)
                return inWord;
            return abv;
        }
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }

    public static void main (String [] args) {
        Shortener s = new Shortener();
        s.shortenMessage("hello? any anyone thanks!");
    }
}

Output:

lo?
ne
ne1
thx!

Edit:

From atomman's answer, you can basically remove the shortenWord method by modifying the shortenMessage method like this:

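public void shortenMessage(String inMessage) {
     // Requires: import java.util.Map.Entry;
     // \\b anchors each key to whole words, so "any" does not match inside "anyone".
     for (Entry<String, String> entry : this.abbreviations.entrySet())
         inMessage = inMessage.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());

     System.out.println(inMessage);
}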
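
One thing to watch with a replaceAll loop over the map: if the keys are used as raw patterns, a short key such as any can also match inside a longer word such as anyone, and the result then depends on the order in which the rules are applied. The sketch below compares both variants. The class name BoundaryDemo and the apply helper are made up for illustration, and a LinkedHashMap is used only to make the iteration order predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of either answer.
class BoundaryDemo {

    // Applies every rule in order; wholeWords toggles the \b anchors.
    static String apply(Map<String, String> rules, String message, boolean wholeWords) {
        for (Map.Entry<String, String> e : rules.entrySet()) {
            String regex = wholeWords ? "\\b" + e.getKey() + "\\b" : e.getKey();
            message = message.replaceAll(regex, e.getValue());
        }
        return message;
    }

    public static void main(String[] args) {
        // Insertion order is kept, so "any" is applied before "anyone".
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("any", "ne");
        rules.put("anyone", "ne1");

        System.out.println(apply(rules, "any anyone", false)); // ne neone
        System.out.println(apply(rules, "any anyone", true));  // ne ne1
    }
}
```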
