Translate words in a string using BufferedReader (Java)
I've been working on this for a few days now and I just can't make any headway. I've tried using Scanner and BufferedReader and had no luck.
Basically, I have a working method (shortenWord) that takes a String and shortens it according to a text file formatted like this:
hello,lo
any,ne
anyone,ne1
thanks,thx
It also accounts for punctuation, so 'hello?' becomes 'lo?', etc.
I need to be able to read in a String and translate each word individually, so "hello? any anyone thanks!" will become "lo? ne ne1 thx!", basically using the method I already have on each word in the String. The code I have will translate the first word but then does nothing to the rest. I think it's something to do with how my BufferedReader is working.
import java.io.*;
public class Shortener {
private FileReader in ;
/*
* Default constructor that will load a default abbreviations text file.
*/
public Shortener() {
try {
in = new FileReader( "abbreviations.txt" );
}
catch ( Exception e ) {
System.out.println( e );
}
}
public String shortenWord( String inWord ) {
String punc = new String(",?.!;") ;
char finalchar = inWord.charAt(inWord.length()-1) ;
String outWord = new String() ;
BufferedReader abrv = new BufferedReader(in) ;
// ends in punctuation
if (punc.indexOf(finalchar) != -1 ) {
String sub = inWord.substring(0, inWord.length()-1) ;
outWord = sub + finalchar ;
try {
String line;
while ( (line = abrv.readLine()) != null ) {
String[] lineArray = line.split(",") ;
if ( line.contains(sub) ) {
outWord = lineArray[1] + finalchar ;
}
}
}
catch (IOException e) {
System.out.println(e) ;
}
}
// no punctuation
else {
outWord = inWord ;
try {
String line;
while( (line = abrv.readLine()) != null) {
String[] lineArray = line.split(",") ;
if ( line.contains(inWord) ) {
outWord = lineArray[1] ;
}
}
}
catch (IOException ioe) {
System.out.println(ioe) ;
}
}
return outWord;
}
public void shortenMessage( String inMessage ) {
String[] messageArray = inMessage.split("\\s+") ;
for (String word : messageArray) {
System.out.println(shortenWord(word));
}
}
}
Any help, or even a nudge in the right direction, would be very much appreciated.
Edit: I've tried closing the BufferedReader at the end of the shortenWord method, and it just results in an error on every word in the String after the first one, saying that the BufferedReader is closed.
So I took a look at this. First of all, if you have the option to change the format of your text file, I would change it to something like this (or XML):
key1=value1
key2=value2
By doing this you could later use Java's Properties.load(Reader). This would remove the need for any manual parsing of the file.
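As a rough illustration of the Properties route, here is a minimal sketch. The class name PropertiesDemo, the helper loadRules and the sample entries are made up for the example, and a StringReader stands in for the real file so that it runs on its own; a FileReader could be passed in the same way.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesDemo {
    // Loads key=value rules from any Reader (hypothetical helper, not from the original post).
    static Properties loadRules(Reader reader) {
        Properties rules = new Properties();
        try {
            rules.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rules;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a key=value abbreviations file.
        String fileContents = "hello=lo\nany=ne\nanyone=ne1\nthanks=thx\n";
        Properties rules = loadRules(new StringReader(fileContents));

        System.out.println(rules.getProperty("hello"));
        // The two-argument form returns the given default when the key is absent,
        // which is handy for leaving unknown words unchanged.
        System.out.println(rules.getProperty("goodbye", "goodbye"));
    }
}
```

The two-argument getProperty means a word without an abbreviation can simply fall back to itself.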
If by any chance you don't have the option to change the format, then you'll have to parse it yourself. Something like the code below would do that, and put the results into a Map called shortningRules, which could then be used later.
private void parseInput(FileReader reader) {
try (BufferedReader br = new BufferedReader(reader)) {
String line;
while ((line = br.readLine()) != null) {
String[] lineComponents = line.split(",");
this.shortningRules.put(lineComponents[0], lineComponents[1]);
}
} catch (IOException e) {
e.printStackTrace();
}
}
When it comes to actually shortening a message, I would probably opt for a regex approach, e.g. \\bKEY\\b (written as a Java string literal), where KEY is the word you want shortened. \\b is an anchor in regex that marks a word boundary: it matches the position between a word character and a non-word character, so it does not consume spaces or punctuation. The whole code for doing the shortening would then become something like this:
public void shortenMessage(String message) {
for (Entry<String, String> entry : shortningRules.entrySet()) {
message = message.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
}
System.out.println(message); //This should probably be a return statement instead of a sysout.
}
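To see what the word boundary buys you, here is a small sketch comparing a plain replacement with a whole-word one. The class and method names are illustrative only. The use of Pattern.quote and Matcher.quoteReplacement is an addition that the code above does not have; it is there on the assumption that keys or values might one day contain regex metacharacters or a `$`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Replaces every occurrence of key, even inside longer words.
    static String replacePlain(String message, String key, String value) {
        return message.replaceAll(key, value);
    }

    // Replaces key only where it stands as a whole word.
    static String replaceWholeWord(String message, String key, String value) {
        String pattern = "\\b" + Pattern.quote(key) + "\\b";
        return message.replaceAll(pattern, Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String message = "any anyone?";
        System.out.println(replacePlain(message, "any", "ne"));     // ne neone?
        System.out.println(replaceWholeWord(message, "any", "ne")); // ne anyone?
    }
}
```

Without the boundary, "anyone" is mangled as soon as the shorter key "any" is processed; with it, the longer word is left for its own rule.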
Putting it all together gives you a complete class; I've also added a main for testing purposes.
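A minimal sketch of how the two methods above might be assembled follows. It is not the answer's original listing. The class name ShortenerSketch, the Reader-based constructor and the length check are assumptions made for the example. shortenMessage returns the result instead of printing it, and main reads from a StringReader so the sketch runs without abbreviations.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ShortenerSketch {
    // Rules in file order: word -> abbreviation.
    private final Map<String, String> shortningRules = new LinkedHashMap<>();

    // Reads the rules once, up front; new FileReader("abbreviations.txt") would also work here.
    public ShortenerSketch(Reader source) {
        parseInput(source);
    }

    private void parseInput(Reader reader) {
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] lineComponents = line.split(",");
                // Assumption: silently skip blank or malformed lines.
                if (lineComponents.length == 2) {
                    shortningRules.put(lineComponents[0].trim(), lineComponents[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public String shortenMessage(String message) {
        for (Map.Entry<String, String> entry : shortningRules.entrySet()) {
            message = message.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
        }
        return message;
    }

    public static void main(String[] args) {
        Reader source = new StringReader("hello,lo\nany,ne\nanyone,ne1\nthanks,thx\n");
        ShortenerSketch shortener = new ShortenerSketch(source);
        System.out.println(shortener.shortenMessage("hello? any anyone thanks!"));
    }
}
```

One caveat of replacing rule by rule: if an abbreviation were itself a key in the file, it would be shortened again by a later rule.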
I think you can have a simpler solution using a HashMap. Read all the abbreviations into the map when the Shortener object is created, and just look them up once you have a word. The word will be the key and the abbreviation the value. Like this:
import java.io.*;
import java.util.HashMap;
public class Shortener {
private FileReader in;
//the map
private HashMap<String, String> abbreviations;
/*
* Default constructor that will load a default abbreviations text file.
*/
public Shortener() {
//initialize the map
this.abbreviations = new HashMap<>();
try {
in = new FileReader("abbreviations.txt" );
BufferedReader abrv = new BufferedReader(in) ;
String line;
while ((line = abrv.readLine()) != null) {
String [] abv = line.split(",");
//If a line does not contain exactly two items, the file is malformed
if (abv.length != 2) {
throw new IllegalArgumentException("Malformed abbreviation file");
}
//populate the map with the word as key and abbreviation as value
abbreviations.put(abv[0], abv[1]);
}
}
catch ( Exception e ) {
System.out.println( e );
}
}
public String shortenWord( String inWord ) {
String punc = new String(",?.!;") ;
char finalchar = inWord.charAt(inWord.length()-1) ;
// ends in punctuation
if (punc.indexOf(finalchar) != -1) {
String sub = inWord.substring(0, inWord.length() - 1);
//Reference map
String abv = abbreviations.get(sub);
if (abv == null)
return inWord;
return new StringBuilder(abv).append(finalchar).toString();
}
// no punctuation
else {
//Reference map
String abv = abbreviations.get(inWord);
if (abv == null)
return inWord;
return abv;
}
}
public void shortenMessage( String inMessage ) {
String[] messageArray = inMessage.split("\\s+") ;
for (String word : messageArray) {
System.out.println(shortenWord(word));
}
}
public static void main (String [] args) {
Shortener s = new Shortener();
s.shortenMessage("hello? any anyone thanks!");
}
}
Output:
lo?
ne
ne1
thx!
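For context on why reading the file once in the constructor helps: a Reader is consumed as it is read, so a second BufferedReader wrapped around the same underlying reader starts at end-of-file and finds nothing. That would explain why only the first word was translated in the question's code. A small sketch of the effect, with a StringReader standing in for the FileReader (an assumption, so the example runs without the file) and a made-up helper named countLines:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderExhaustionDemo {
    // Wraps the shared reader in a fresh BufferedReader and counts the lines it can still read.
    // The BufferedReader is deliberately not closed: closing it would close the shared
    // reader too, and any later read would fail with an IOException.
    static int countLines(Reader shared) {
        BufferedReader br = new BufferedReader(shared);
        int count = 0;
        try {
            while (br.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        Reader shared = new StringReader("hello,lo\nany,ne\n");
        System.out.println(countLines(shared)); // 2: the first pass reads everything
        System.out.println(countLines(shared)); // 0: the reader is already at end-of-file
    }
}
```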
Edit:
Following atomman's answer, you can basically remove the shortenWord method by modifying the shortenMessage method like this:
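// Requires: import java.util.Map.Entry;
public void shortenMessage(String inMessage) {
    // The word boundaries (\\b) keep "any" from matching inside "anyone".
    for (Entry<String, String> entry : this.abbreviations.entrySet()) {
        inMessage = inMessage.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
    }
    System.out.println(inMessage);
}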