
Add an element into a TreeSet: information retrieval

I want to build an inverted index. That means:

if terms appear across several documents, the result should list, for each term, the documents that contain it, like this:

term1 = [doc1], term2 = [doc2, doc3, doc4], ...

This is my code:

public class TP3 {    
    private static String DIRNAME = "/home/amal/Téléchargements/lemonde";    
    private static String STOPWORDS_FILENAME = "/home/amal/Téléchargements/lemonde/frenchST.txt";    

     public static TreeMap<String, TreeSet<String>> getInvertedFile(File dir, Normalizer normalizer) throws IOException {

        TreeMap<String, TreeSet<String>> st = new TreeMap<String, TreeSet<String>>();

        ArrayList<String> wordsInFile;
        ArrayList<String> words;
        String wordLC;

        if (dir.isDirectory()) {
            String[] fileNames = dir.list();    
            Integer number;         
            for (String fileName : fileNames) {   
                System.err.println("Analyse du fichier " + fileName);

                wordsInFile = new ArrayList<String>();
                words = normalizer.normalize(new File(dir, fileName));

                for (String word : words) {
                    wordLC = word.toLowerCase();

                    if (!wordsInFile.contains(word)) {
                        TreeSet<String> set = st.get(word);
                        set.add(fileName);
                    }
                }
            }
        }

        for (Map.Entry<String, TreeSet<String>> hit : st.entrySet()) {
            System.out.println(hit.getKey() + "\t" + hit.getValue());
        }
        return st;
    }
}

I get an error at:

set.add(fileName);

I don't know what the problem is. Please help me.

Your main issue is with these two lines:

if (!wordsInFile.contains(word)) {
    TreeSet<String> set = st.get(word);

You never put a set into st, so st.get(word) returns null and the following set.add(fileName) fails. Right after this line you should probably have something like:

if(set == null)
{
    set = new TreeSet<String>();
    st.put(word, set);
}

That should fix your current problem.
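For reference, here is a minimal sketch of the whole indexing loop with that fix folded in. It assumes Java 8 or later, where Map.computeIfAbsent does the "create the set if it is missing" step in one call. The class name InvertedIndexSketch, the method build, and the docs map are hypothetical: the map of document name to word list stands in for your Normalizer, whose code is not shown. The sketch also indexes the lower-cased word, which your code computes as wordLC but never uses; that is my guess at the intent, not something your question states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-alone example; not the original TP3 class.
class InvertedIndexSketch {

    // Builds term -> sorted set of document names.
    // `docs` maps a document name to its already-tokenized words
    // (an assumption replacing normalizer.normalize(file)).
    static TreeMap<String, TreeSet<String>> build(Map<String, List<String>> docs) {
        TreeMap<String, TreeSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
            for (String word : doc.getValue()) {
                String term = word.toLowerCase();
                // Create the set the first time a term is seen, then add the document.
                // A TreeSet ignores duplicates, so no per-file "seen" list is needed.
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("doc1", Arrays.asList("Chat", "chien"));
        docs.put("doc2", Arrays.asList("chat", "chat", "oiseau"));
        // Print one line per term: the term, a tab, then its documents.
        build(docs).forEach((term, files) -> System.out.println(term + "\t" + files));
    }
}
```

On a Java version older than 8, the explicit null check shown above does the same job.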

Hint for next time: this will be re-read by future users with the same problem, and it also represents YOU (someone in the future may read this question when interviewing you for a job!).

Spend some time formatting it with your readers in mind. Prune out comments and correct the indentation; don't just paste and run. Also post a little bit of the error stack trace; they are amazingly helpful! Had you posted it, it would have shown a NullPointerException. On that line there is really only one way to get an NPE, and it would have saved us from having to analyze your code.

PS: I edited your question so you could see the difference (and to keep it from being closed on you). The main problem with your formatting was the use of tabs. For programmers, tabs are... well, let's just say they only work in very controlled conditions. In this case it really helps to watch the preview pane (below your editing box) while you edit: scroll down before you submit to see what we will actually see.

