
How to count occurrences of Polish characters in a .txt file

I have to prepare a .txt file and count how many times each letter of the alphabet occurs in it. I've found a very nice piece of code, but unfortunately it doesn't work with Polish characters like ą, ę, ć, ó, ż, ź. Even though I put them in the array, for some reason they are not found in the .txt file, so the output is 0.

Does anyone know why? Maybe I should count them differently, with a switch statement or something similar. Before anyone asks: yes, the .txt file is saved as UTF-8 :)

public static void main(String[] args) throws FileNotFoundException {
        int ch;
        BufferedReader reader;
        try {
            int counter = 0;

            for (char a : "AĄĆĘÓBCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray()) {
                reader = new BufferedReader(new FileReader("C:\\Users\\User\\Desktop\\pan.txt"));
                char toSearch = a;
                counter = 0;

                try {
                    while ((ch = reader.read()) != -1) {
                        if (a == Character.toUpperCase((char) ch)) {
                            counter++;
                        }
                    }

                } catch (IOException e) {
                    System.out.println("Error");
                    e.printStackTrace();
                }
                System.out.println(toSearch + " occurs " + counter);

            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

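It looks like your problem is related to the encoding and the default system charset. FileReader decodes the file with the platform default charset, which on Windows is often not UTF-8, so the multi-byte Polish letters are read as different characters and never match.

Try changing the reader variable to this, so the charset is stated explicitly: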

InputStreamReader reader = new InputStreamReader(new FileInputStream("C:\\Users\\User\\Desktop\\pan.txt"), "UTF-8");
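
To show how the explicit charset fits into the whole task, here is a minimal sketch that reads the file once as UTF-8 and counts every letter in a single pass, instead of reopening the file for each letter. The class name PolishCharCounter, the method countLetters and the fallback path "pan.txt" are made up for this illustration and are not from the question. Note that Files.newBufferedReader throws a MalformedInputException if the file turns out not to be valid UTF-8, and that the console may still display the letters incorrectly if its own encoding is not UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

public class PolishCharCounter {

    // Hypothetical helper: counts letters from any Reader in one pass.
    // Case is folded to upper case, so 'ą' and 'Ą' share one counter.
    public static Map<Character, Integer> countLetters(Reader in) throws IOException {
        Map<Character, Integer> counts = new TreeMap<>();
        int ch;
        while ((ch = in.read()) != -1) {
            char c = Character.toUpperCase((char) ch);
            if (Character.isLetter(c)) {
                counts.merge(c, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default path; pass the real file as the first argument.
        String path = args.length > 0 ? args[0] : "pan.txt";
        // The charset is given explicitly, so the platform default is irrelevant.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            countLetters(reader).forEach(
                (c, n) -> System.out.println(c + " occurs " + n));
        }
    }
}
```

The TreeMap orders its keys by character code, so the Polish letters are printed after Z rather than in Polish alphabetical order.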

Try this: I suggest using NIO. Here is code I wrote for you with RandomAccessFile and MappedByteBuffer, which can be faster for large files:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FileReadNio
{
public static void main(String[] args) throws IOException
{
    Map<Character, Integer> charCountMap = new HashMap<>();

    // try-with-resources closes the channel and the file even if reading fails
    try (RandomAccessFile rndFile = new RandomAccessFile("c:\\test123.txt", "r");
         FileChannel inChannel = rndFile.getChannel())
    {
        MappedByteBuffer buffer = inChannel.map(FileChannel.MapMode.READ_ONLY, 0, inChannel.size());

        // Decode the bytes as UTF-8 first: a letter such as ą takes two bytes,
        // so casting each byte to char would never produce it.
        CharBuffer chars = StandardCharsets.UTF_8.decode(buffer);

        while (chars.hasRemaining())
        {
            char c = chars.get();
            // Insert 1 for a new character, otherwise add 1 to its count
            charCountMap.merge(c, 1, Integer::sum);
        }
    }

    for (Map.Entry<Character, Integer> entry : charCountMap.entrySet())
    {
        System.out.printf("char: %s :: count=%d%n", entry.getKey(), entry.getValue());
    }
}
}
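
One thing to keep in mind with any byte-buffer approach: in UTF-8 the Polish letters take two bytes each, so counting bytes is not the same as counting characters. A small sketch of the difference follows; the class name Utf8DecodeDemo and the helper decodeUtf8 are made up for this illustration, and the letters are written as Unicode escapes so the result does not depend on how the source file itself is encoded.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8DecodeDemo {

    // Hypothetical helper: turns raw UTF-8 bytes into characters.
    public static CharBuffer decodeUtf8(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    }

    public static void main(String[] args) {
        // "\u0105\u0119" is the two-letter string "ąę".
        byte[] bytes = "\u0105\u0119".getBytes(StandardCharsets.UTF_8);

        // Each of the two letters is encoded as two bytes.
        System.out.println("bytes: " + bytes.length);               // prints "bytes: 4"

        // After decoding there are two characters again.
        System.out.println("chars: " + decodeUtf8(bytes).length()); // prints "chars: 2"
    }
}
```

Casting each of those four bytes to char would give four unrelated values, none of which equals ą or ę, which is why the bytes have to be decoded before counting.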
