
Problems with Indexing by Latent Semantic Analysis

Whenever I try to run this Python script on Windows 7 Enterprise (64-bit) with Python 2.6.6 installed, I keep getting this error:

Problem signature:
Problem Event Name: APPCRASH
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4c73f7b6
Fault Module Name: _csr.pyd
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 4d6a645b
Exception Code: c0000005
Exception Offset: 000c05d4

I've tried reinstalling Python and all the modules that my program depends on (i.e. gensim, nltk, scipy and numpy).

I don't know if this is enough information for you guys, but please let me know!

lsi = models.LsiModel(corpus, num_topics = num_Topics)
index_lsi = similarities.MatrixSimilarity(lsi[corpus])

for k, v in dict_Queries.items():
        File.write("Check Key: " +k+ "\n")
        print "Running.... \n" 
        vec_bow = dict.doc2bow(v.split(), allow_update=True)

        # In the last iteration, the code below this line doesn't run. I think vec_lsi
        # is the source of the problem, but I don't know why.
        vec_lsi = lsi[vec_bow]

        #indexing the LSI
        sims = index_lsi[vec_lsi]
        sims = sorted(enumerate(sims), key = lambda item: -item[1])

        if cut_Off != 0:
            sims = sims[:cut_Off]

        for t in sims:

            dup_info = dict_tcs.get(t[0])

            if t[1] > 0.75:
                #print "Key: " + k + " Link: " + dup_info + "\n"
                File.write("Adding: "+str(t)+ " To LSI actual \n")
                # Create the list for this key on first use, then append to it.
                dict_Actual_LSI.setdefault(k, []).append(dup_info)
        print "Added\n"

In the last iteration, the code below the commented line doesn't run. I think vec_lsi is the source of the problem, but I don't know why.

Thanks

Exception code c0000005 means "access violation". This generally means that some piece of code tried to read from or write to a memory address that it didn't have permission to access. This might be due to a corrupted pointer, uninitialized memory or native code indexing out of the bounds of an array.

The fault is in the module _csr.pyd. This is a compiled extension module that ships with SciPy and implements operations on sparse matrices in compressed sparse row (CSR) format. This suggests that the error happens because SciPy's native code has somehow been pointed towards invalid memory. Without seeing your whole program it's hard to guess how this might have happened.
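One way native sparse-matrix code can end up indexing out of bounds is when a query vector carries a term id that is larger than the vocabulary the model was built with. This is only a guess about this crash, not something the error report confirms. In the question's code, doc2bow is called with allow_update=True, which lets gensim add previously unseen tokens to the dictionary, so a query could in principle produce such ids. A minimal sketch of a sanity check follows; the helper name out_of_range_ids and its num_terms argument are hypothetical, and the caller is assumed to know how many terms the model was trained on.

```python
def out_of_range_ids(vec_bow, num_terms):
    """Return the term ids in a bag-of-words vector that fall outside [0, num_terms).

    vec_bow   -- list of (term_id, count) pairs, the shape doc2bow returns
    num_terms -- vocabulary size the model was trained with (assumed known by the caller)
    """
    return [term_id for term_id, _count in vec_bow if not 0 <= term_id < num_terms]
```

If the returned list is non-empty for the query that crashes, building the query vector without allow_update=True would be worth trying.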

As a next step, you could try to pin down what's happening immediately before the crash by adding some print statements to your program - by printing out its progress you can narrow down where the crash is occurring. If you're lucky it might then become clear why SciPy is trying to access invalid memory.
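When a process dies with an access violation, anything still sitting in an output buffer is lost, so the last progress message you see may not be the last one that was written. A minimal sketch of a logging helper that flushes after every message is shown here; the name log_step is hypothetical, and the choice of stderr as the default stream is an assumption.

```python
import os
import sys


def log_step(message, stream=sys.stderr):
    """Write a progress message and push it out of Python's buffers immediately."""
    stream.write(message + "\n")
    stream.flush()
    # Ask the OS to commit the data as well. This only works for real files,
    # so streams without a usable file descriptor are skipped silently.
    try:
        os.fsync(stream.fileno())
    except (AttributeError, OSError, ValueError):
        pass
```

Calling it with the query key before each of the doc2bow, lsi[vec_bow] and index_lsi[vec_lsi] steps should show which key and which step were reached last before the crash.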
