
Translate words in a string using BufferedReader (Java)

I've been working on this for a few days now and I just can't make any headway. I've tried using Scanner and BufferedReader and had no luck.

Basically, I have a working method (shortenWord) that takes a String and shortens it according to a text file formatted like this:

hello,lo
any,ne
anyone,ne1
thanks,thx

It also accounts for punctuation so 'hello?' becomes 'lo?' etc.

I need to be able to read in a String and translate each word individually, so "hello? any anyone thanks!" will become "lo? ne ne1 thx!", basically using the method I already have on each word in the String. The code I have will translate the first word but then does nothing to the rest. I think it's something to do with how my BufferedReader is working.

import java.io.*;

public class Shortener {
    private FileReader in ;
    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        try {
            in = new FileReader( "abbreviations.txt" );
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;
        String outWord = new String() ;
        BufferedReader abrv = new BufferedReader(in) ;

            // ends in punctuation
            if (punc.indexOf(finalchar) != -1 ) {
                String sub = inWord.substring(0, inWord.length()-1) ;
                outWord = sub + finalchar ;


            try {
                String line;
                while ( (line = abrv.readLine()) != null ) {
                    String[] lineArray = line.split(",") ;
                        if ( line.contains(sub) ) {
                            outWord = lineArray[1] + finalchar ;
                            }
                        }
                    }

            catch (IOException e) {
                System.out.println(e) ;
                }
            }

            // no punctuation
            else {
                outWord = inWord ;

                try {
                String line;

                    while( (line = abrv.readLine()) != null) {
                        String[] lineArray = line.split(",") ;
                            if ( line.contains(inWord) ) {
                                outWord = lineArray[1] ;
                            }
                        }
                    }

                catch (IOException ioe) {
                   System.out.println(ioe) ; 
                }
            }

        return outWord;
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }
}

Any help, or even a nudge in the right direction would be so much appreciated.

Edit: I've tried closing the BufferedReader at the end of the shortenWord method and it just results in me getting an error on every word in the String after the first one saying that the BufferedReader is closed.

So I took a look at this. First of all, if you have the option to change the format of your text file, I would change it to something like this (or XML):

 key1=value1
 key2=value2

By doing this you could later use Java's Properties.load(Reader). This would remove the need for any manual parsing of the file.
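As a rough illustration of the Properties route, here is a minimal sketch. The class name PropertiesRulesDemo, the helper loadRules and the inline file contents are made up for this example; in real use you would pass a FileReader on your own key=value file instead of the StringReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesRulesDemo {

    // Hypothetical helper: loads key=value pairs from any Reader.
    static Properties loadRules(Reader reader) {
        Properties rules = new Properties();
        try (Reader r = reader) {
            rules.load(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rules;
    }

    public static void main(String[] args) {
        // Inline stand-in for a key=value version of the abbreviations file (an assumption).
        String fileContents = "hello=lo\nany=ne\nanyone=ne1\nthanks=thx\n";
        Properties rules = loadRules(new StringReader(fileContents));

        System.out.println(rules.getProperty("anyone"));
        // The two-argument form returns the fallback when the key is missing.
        System.out.println(rules.getProperty("goodbye", "goodbye"));
    }
}
```

Properties extends Hashtable, so the loaded rules can be iterated much like the Map used further down.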

If by any chance you don't have the option to change the format, then you'll have to parse it yourself. Something like the code below would do that and put the results into a Map called shortningRules, which can then be used later.

private void parseInput(FileReader reader) {
    try (BufferedReader br = new BufferedReader(reader)) {
        String line;
        while ((line = br.readLine()) != null) {
            String[] lineComponents = line.split(",");
            this.shortningRules.put(lineComponents[0], lineComponents[1]);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

When it comes to actually shortening a message, I would probably opt for a regex approach, e.g. \bKEY\b (written "\\bKEY\\b" in a Java string literal), where KEY is the word you want shortened. \b is an anchor in regex that marks a word boundary, so the pattern matches KEY only as a whole word and leaves the surrounding spaces and punctuation untouched. The whole code for doing the shortening would then become something like this:

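// Requires: import java.util.Map.Entry; import java.util.regex.Matcher; import java.util.regex.Pattern;
public void shortenMessage(String message) {
    for (Entry<String, String> entry : shortningRules.entrySet()) {
        // Quote key and value so regex metacharacters in the file are treated literally.
        String wholeWord = "\\b" + Pattern.quote(entry.getKey()) + "\\b";
        message = message.replaceAll(wholeWord, Matcher.quoteReplacement(entry.getValue()));
    }
    System.out.println(message); // This should probably be a return statement instead of a sysout.
}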
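To see why the \b anchors matter for rules such as any and anyone, here is a small sketch comparing a whole-word replacement with a plain one. WordBoundaryDemo and replaceWholeWord are names invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {

    // Hypothetical helper: replaces key only where it stands alone as a whole word.
    static String replaceWholeWord(String message, String key, String value) {
        String regex = "\\b" + Pattern.quote(key) + "\\b";
        return message.replaceAll(regex, Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String message = "hello? any anyone thanks!";

        // With word boundaries, "any" inside "anyone" is left alone.
        System.out.println(replaceWholeWord(message, "any", "ne"));
        // prints: hello? ne anyone thanks!

        // Without them, the "any" prefix of "anyone" is rewritten as well.
        System.out.println(message.replaceAll("any", "ne"));
        // prints: hello? ne neone thanks!
    }
}
```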

Putting it all together will give you something like this; here I've added a main for testing purposes.
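The combined listing is not reproduced in this copy of the answer, so the following is only a sketch of how the two pieces above might fit together, not the answerer's original code. The class name RegexShortener, the Reader-taking constructor and the inline rules string are assumptions made so that the example runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexShortener {
    private final Map<String, String> shortningRules = new HashMap<>();

    // Assumption: takes any Reader; pass new FileReader("abbreviations.txt") for the real file.
    RegexShortener(Reader reader) {
        parseInput(reader);
    }

    // Reads "word,abbreviation" lines once, up front, and closes the reader afterwards.
    private void parseInput(Reader reader) {
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] lineComponents = line.split(",");
                if (lineComponents.length == 2) { // skip blank or malformed lines
                    shortningRules.put(lineComponents[0], lineComponents[1]);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Returns the shortened message instead of printing it.
    String shortenMessage(String message) {
        for (Map.Entry<String, String> entry : shortningRules.entrySet()) {
            String wholeWord = "\\b" + Pattern.quote(entry.getKey()) + "\\b";
            message = message.replaceAll(wholeWord, Matcher.quoteReplacement(entry.getValue()));
        }
        return message;
    }

    public static void main(String[] args) {
        // Inline stand-in for abbreviations.txt, using the rules from the question.
        String rules = "hello,lo\nany,ne\nanyone,ne1\nthanks,thx\n";
        RegexShortener shortener = new RegexShortener(new StringReader(rules));
        System.out.println(shortener.shortenMessage("hello? any anyone thanks!"));
        // prints: lo? ne ne1 thx!
    }
}
```

Because the file is read exactly once in the constructor, every later call to shortenMessage sees the full rule set, which avoids the exhausted-reader problem in the question.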

I think you can have a simpler solution using a HashMap. Read all the abbreviations into the map when the Shortener object is created, and just look a word up in it whenever you need to. The word will be the key and the abbreviation the value. Like this:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;

public class Shortener {

    private FileReader in;
    //the map
    private HashMap<String, String> abbreviations;

    /*
     * Default constructor that will load a default abbreviations text file.
     */
    public Shortener() {
        //initialize the map
        this.abbreviations = new HashMap<>();
        try {
            in = new FileReader("abbreviations.txt" );
            BufferedReader abrv = new BufferedReader(in) ;
            String line;
            while ((line = abrv.readLine()) != null) {
                String [] abv = line.split(",");
                //If a line does not contain exactly two items, the file is malformed
                if (abv.length != 2) {
                    throw new IllegalArgumentException("Malformed abbreviation file");
                }
                //populate the map with the word as key and abbreviation as value
                abbreviations.put(abv[0], abv[1]);
            }
        }       

        catch ( Exception e ) {
            System.out.println( e );
        }
    }

    public String shortenWord( String inWord ) {
        String punc = new String(",?.!;") ;
        char finalchar = inWord.charAt(inWord.length()-1) ;

        // ends in punctuation
        if (punc.indexOf(finalchar) != -1) {
            String sub = inWord.substring(0, inWord.length() - 1);

            //Reference map
            String abv = abbreviations.get(sub);
            if (abv == null)
                return inWord;
            return new StringBuilder(abv).append(finalchar).toString();
        }

        // no punctuation
        else {
            //Reference map
            String abv = abbreviations.get(inWord);
            if (abv == null)
                return inWord;
            return abv;
        }
    }

    public void shortenMessage( String inMessage ) {
         String[] messageArray = inMessage.split("\\s+") ;
         for (String word : messageArray) {
            System.out.println(shortenWord(word));
        }
    }

    public static void main (String [] args) {
        Shortener s = new Shortener();
        s.shortenMessage("hello? any anyone thanks!");
    }
}

Output:

lo?
ne
ne1
thx!
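The listing above prints one word per line, while the question asks for a single string such as "lo? ne ne1 thx!". One way to get that, sketched here under the assumption that the abbreviations are already in a Map, is to collect the shortened words and join them with spaces. JoinedShortener and its static helpers are hypothetical names, and the empty-word guard is an addition that the code above does not have.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

class JoinedShortener {
    private static final String PUNCTUATION = ",?.!;";

    // Same lookup idea as shortenWord above, with a guard for empty input.
    static String shortenWord(Map<String, String> abbreviations, String inWord) {
        if (inWord.isEmpty()) {
            return inWord;
        }
        char finalChar = inWord.charAt(inWord.length() - 1);
        if (PUNCTUATION.indexOf(finalChar) != -1) {
            // Look up the word without its trailing punctuation, then put it back.
            String sub = inWord.substring(0, inWord.length() - 1);
            String abv = abbreviations.get(sub);
            return abv == null ? inWord : abv + finalChar;
        }
        return abbreviations.getOrDefault(inWord, inWord);
    }

    // Shortens each word and joins the results with single spaces.
    static String shortenMessage(Map<String, String> abbreviations, String inMessage) {
        StringJoiner joined = new StringJoiner(" ");
        for (String word : inMessage.trim().split("\\s+")) {
            joined.add(shortenWord(abbreviations, word));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Map<String, String> abbreviations = new HashMap<>();
        abbreviations.put("hello", "lo");
        abbreviations.put("any", "ne");
        abbreviations.put("anyone", "ne1");
        abbreviations.put("thanks", "thx");

        System.out.println(shortenMessage(abbreviations, "hello? any anyone thanks!"));
        // prints: lo? ne ne1 thx!
    }
}
```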

Edit:

From atomman's answer, you can basically remove the shortenWord method by modifying the shortenMessage method like this:

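// Requires: import java.util.Map.Entry;
public void shortenMessage(String inMessage) {
     for (Entry<String, String> entry : this.abbreviations.entrySet()) {
         // The \\b word boundaries keep "any" from also matching inside "anyone".
         inMessage = inMessage.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
     }

     System.out.println(inMessage);
}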

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.