Problems with Indexing by Latent Semantic Analysis

Whenever I try to run this Python script on Windows 7 Enterprise (64-bit) with Python 2.6.6 installed, I keep getting this error:

Problem signature:
Problem Event Name: APPCRASH
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4c73f7b6
Fault Module Name: _csr.pyd
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 4d6a645b
Exception Code: c0000005
Exception Offset: 000c05d4

I've tried reinstalling Python and all the modules that my program depends on (i.e. gensim, nltk, scipy and numpy).

I don't know if this is enough information, but please let me know!

lsi = models.LsiModel(corpus, num_topics = num_Topics)
index_lsi = similarities.MatrixSimilarity(lsi[corpus])

for k, v in dict_Queries.items():
        File.write("Check Key: " +k+ "\n")
        print "Running.... \n" 
        vec_bow = dict.doc2bow(v.split(), allow_update=True)

        # In the last iteration, the code below this line doesn't run. I think
        # vec_lsi is the source of the problem, but I don't know why.
        vec_lsi = lsi[vec_bow]

        #indexing the LSI
        sims = index_lsi[vec_lsi]
        sims = sorted(enumerate(sims), key = lambda item: -item[1])

        if not cut_Off == 0:
            sims = sims[0:cut_Off]
        else:
            pass

        for t in sims:

            dup_info = dict_tcs.get(t[0])

            if t[1] > 0.75:
                #print "Key: " + k + " Link: " + dup_info + "\n"
                File.write("Adding: "+str(t)+ " To LSI actual \n")
                if dict_Actual_LSI.has_key(k):
                    links = dict_Actual_LSI.get(k)
                    links.append(dup_info)
                else:
                    links = []
                    links.append(dup_info)
                    dict_Actual_LSI[k] = links
        print "Added\n"

In the last iteration, the code below the marked line doesn't run. I think vec_lsi is the source of the problem, but I don't know why.

Thanks

Exception code c0000005 means "access violation". This generally means that some piece of code tried to read from or write to a memory address that it didn't have permission to access. This might be due to a corrupted pointer, uninitialized memory, or native code indexing outside the bounds of an array.

The fault is in the module _csr.pyd. This is a part of SciPy that sounds like it's for manipulating sparse arrays (CSR is the compressed sparse row format). That suggests the error happens because SciPy has somehow been pointed towards invalid memory. Without seeing your whole program it's hard to guess how this might have happened.
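One condition that could lead native sparse-matrix code to index outside an array is a feature id larger than the code expects. The sketch below is only an assumption about what might be worth checking, not a diagnosis. It takes a bag-of-words vector as a list of (term_id, count) pairs and a term count num_terms, both hypothetical stand-ins for whatever your program actually holds, and reports the ids that fall outside the valid range.

```python
def out_of_range_ids(vec_bow, num_terms):
    """Return the term ids in vec_bow that are not in [0, num_terms)."""
    return [term_id for term_id, _count in vec_bow
            if term_id < 0 or term_id >= num_terms]


# Hypothetical data: a model built over 5 terms and a query containing id 7.
example_bow = [(0, 2), (3, 1), (7, 1)]
bad_ids = out_of_range_ids(example_bow, 5)
if bad_ids:
    print("ids outside the expected range: %s" % bad_ids)
```

If a check like this reports anything just before the crash, that would at least narrow down which input is involved.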

As a next step, you could try to pin down what's happening immediately before the crash by adding some print statements to your program. By printing out its progress you can narrow down where the crash is occurring. If you're lucky, it might then become clear why SciPy is trying to access invalid memory.
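A minimal sketch of that kind of progress printing, assuming a loop over (key, value) pairs like the one in the question. The helper names trace and run_with_trace are made up for illustration, and str.split stands in for the real per-query work. Each line is flushed immediately, so the last message written before a hard crash is not lost in a buffer.

```python
import sys


def trace(message, stream=None):
    """Write one progress line and flush so it survives a hard crash."""
    stream = stream if stream is not None else sys.stderr
    stream.write("[trace] %s\n" % message)
    stream.flush()


def run_with_trace(items, step, stream=None):
    """Apply step to each (key, value) pair, logging before and after."""
    results = {}
    for key, value in items:
        trace("start %s" % key, stream)
        results[key] = step(value)
        trace("done %s" % key, stream)
    return results


# Hypothetical usage: str.split stands in for the real per-query work.
queries = [("q1", "first query text"), ("q2", "second query")]
results = run_with_trace(queries, str.split)
```

A "start" line with no matching "done" line identifies the key that was being processed when the crash happened.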
