import java.io.*;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;
public class TextReader {
    public static void main(String[] args) throws FileNotFoundException {
        // HashMap<String, Integer> hashmap = new HashMap<String, Integer>();
        TreeMap<String, Integer> hashmap = new TreeMap<String, Integer>();
        // get the file and put it into the file variable
        File file = new File("/Desktop/TextSampleWordCount.txt");
        // scan the file in, decoding it as UTF-8
        Scanner pinyintextfile = new Scanner(file, "UTF-8");
        while (pinyintextfile.hasNext()) {
            String word = pinyintextfile.next();
            if (hashmap.containsKey(word)) {
                // if the word is already in the map, increment its count
                int count = hashmap.get(word) + 1;
                hashmap.put(word, count);
            }
            else {
                // if the word is not in the map, create a new entry for it
                hashmap.put(word, 1);
            }
        }
        pinyintextfile.close();
        for (Map.Entry<String, Integer> entry : hashmap.entrySet()) {
            System.out.println(entry);
        }
    }
}
The program counts how many times each pinyin word occurs in the file. The problem is that when it prints the results, the accented characters come out as question marks:
Ch?ng =1
Zh?= 3
and so on. I tried looking up the problem, but nothing I found helped. I also referred to https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html and changed the charsetName, but the output still contains question marks. I am not sure what I'm doing wrong. Could it be my IDE?
The file looks like this:
Yù kè sù dǎo zhú zhǐ bì shè mù qiú zhēng zì běn yáng qī yán biǎo. Dú zhǒng lǎn yǐ wén yǒu zhèng cái shū cān sè luò shè láo zì yū xuě qián mù wàn. Yū bàn shè shí lǐ wài gèng ér jiāo xī qì shàn xiāng xiào. Wén sēn dé yì fā hù luòzhuǎn quán dào nián měi jì shì chūguò gé shū. Tài jué zhī néngshǒu sòng xiě qiú xù tū tóu jī shòu wèi zhì diào tú yù ān néng. Zhì fù qǐ jiè xíngshì jué zhǐ dǒng zhǔ sè shí yì jì. Dú shè hǎo rì jì zhì qì shǒu xué jí jūn yè zhì shè chēzuò xī zhōngyán míng. Tè shēng yì zhōng shè tóu néng gōng chūshān zuò shēn yàn. Lì fàn duō quán mǎ huà zhèng jì zhì kāngdìng wèn yǒng zǒng.
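Since the Scanner already reads the file as UTF-8, one thing that may be worth checking is the output side: System.out encodes text with the platform default charset, which might not be able to represent characters such as ì or è. Below is a minimal sketch of printing the counts through a writer that is explicitly set to UTF-8. The class name Utf8OutputSketch and the helper methods countWords and printCounts are made up for this illustration and are not part of the program above. It also assumes Java 8 or later (for Map.merge) and that the console or IDE is itself configured to display UTF-8, which this code cannot control.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class Utf8OutputSketch {

    // Hypothetical helper: counts whitespace-separated words,
    // using the same idea as the loop in the question.
    static TreeMap<String, Integer> countWords(String text) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                // add 1 to the existing count, or start at 1 for a new word
                counts.merge(scanner.next(), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Hypothetical helper: writes each entry to the target stream,
    // encoding the characters as UTF-8 instead of the platform default.
    static void printCounts(Map<String, Integer> counts, OutputStream target) {
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(target, StandardCharsets.UTF_8), true);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            out.println(entry);
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Assumption: the console displaying System.out is set to UTF-8.
        printCounts(countWords("Zhì shè zhì Zhì"), System.out);
    }
}
```

If the question marks remain with this approach, the console encoding of the IDE would be the next place to look, since bytes that are correctly encoded as UTF-8 can still be displayed incorrectly by a console that expects a different charset.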