Search a Unicode string in a file using Java
How do I search for a Unicode string in a file using Java? Below is the code that I have tried. It works for strings other than Unicode ones.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.io.*;
import java.util.*;

class file1
{
    public static void main(String arg[]) throws Exception
    {
        BufferedReader bfr1 = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Enter File name:");
        String str = bfr1.readLine();
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String s;
        int count = 0;
        int flag = 0;
        System.out.println("Enter the string to be found");
        s = br.readLine();
        BufferedReader bfr = new BufferedReader(new FileReader(str));
        String bfr2 = bfr.readLine();
        Pattern p = Pattern.compile(s);
        Matcher matcher = p.matcher(bfr2);
        while (matcher.find()) {
            count++;
        }
        System.out.println(count);
    }
}
Well, there are three potential sources of problems I can see:
FileReader, which uses the platform default encoding unless you pass it a charset (only possible since Java 11). What's the encoding of the file you're trying to read? Consider a FileInputStream wrapped in an InputStreamReader using an explicit encoding (e.g. UTF-8) which matches the file.
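As a rough sketch of the explicit-encoding approach, assuming the file really is UTF-8 (substitute whatever encoding the file actually uses): the class name `ExplicitEncodingSearch` and the helper `countMatches` are made up for illustration, and unlike the question's code this version scans every line rather than only the first.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
final class ExplicitEncodingSearch {

    // Decodes the stream with the given charset and counts regex matches on every line.
    static int countMatches(InputStream in, Charset charset, String regex) {
        Pattern pattern = Pattern.compile(regex);
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher matcher = pattern.matcher(line);
                while (matcher.find()) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the file path, args[1] the text to look for.
        // UTF-8 is an assumption here; it must match the file's real encoding.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(countMatches(in, StandardCharsets.UTF_8, args[1]));
        }
    }
}
```

If the charset passed in does not match the bytes on disk, non-ASCII characters decode to different chars and the pattern silently fails to match, which is consistent with the symptom described in the question.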
To debug the real values in strings, I would usually use something like this:
private static void dumpString(String text) {
    for (int i = 0; i < text.length(); i++) {
        char c = text.charAt(i);
        System.out.printf("%d: %4h (%c)", i, c, c);
        System.out.println();
    }
}
That way you can see the exact UTF-16 code unit in each char in the string.
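To illustrate why such a dump helps, here is a small sketch with two strings that usually render identically on screen but contain different chars. The class `CodeUnitDump` and method `hexUnits` are hypothetical names, and the sample strings are chosen only for demonstration; whether this is what happens in the asker's file is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
final class CodeUnitDump {

    // Returns each char of the text as a four-digit hex UTF-16 code unit.
    static List<String> hexUnits(String text) {
        List<String> units = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            units.add(String.format("%04x", (int) text.charAt(i)));
        }
        return units;
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9"; // e with acute accent as a single char
        String decomposed = "e\u0301"; // plain 'e' followed by a combining acute accent
        System.out.println(hexUnits(precomposed));           // prints [00e9]
        System.out.println(hexUnits(decomposed));            // prints [0065, 0301]
        System.out.println(precomposed.equals(decomposed));  // prints false
    }
}
```

Dumping both the search text and a line from the file this way shows whether they hold the same code units, which is something a console display cannot be trusted to reveal.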