invalid index to scalar variable on spacy array enumerate

I'm trying to run the following sentence compression project locally: https://github.com/zhaohengyang/Generate-Parallel-Data-for-Sentence-Compression

I copied the files and installed all dependencies with conda. I made minor modifications, such as reading the data from a URL instead of the local disk, and bundled his parallel_data_gen.py into my single .py file.

However, when I run it I get:

Spacy library couldn't parse sentence into a tree. Please ignore this sentence pair

----------------59-------------------
reducing sentence: This year the Venezuelan government plans to continue its pace of land expropriations in order to move towards what it terms ``agrarian socialism''.
reducing headline: Venezuelan government to continue pace of land expropriations for ``agrarian socialism''
Traceback (most recent call last):
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 701, in <module>
    reduce_sentence(sample)
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 641, in reduce_sentence
    sentence_info = parse_info(sentence)
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 616, in parse_info
    heads = [index + item[0] for index, item in enumerate(doc.to_array([HEAD]))]
IndexError: invalid index to scalar variable.

I'm not sure how to fix this, as I'm fairly new to Python.

Here is the full code I'm running to reproduce the problem: https://gist.github.com/avidanyum/3edfbc96ea22807445ab5307830d41db

The internal snippet that fails:

def parse_info(sentence):
    doc = nlp(sentence)

    heads = [index + item[0] for index, item in enumerate(doc.to_array([HEAD]))]

And here is how I loaded nlp:

import spacy
print('if you didnt run: python -m spacy download en')
import spacy.lang.en
nlp = spacy.load('en')

More info about my env:

/home/user/home/user/dev/anaconda3/envs/pymachine/bin/python --version
Python 2.7.15 :: Anaconda, Inc.

Quick note: I am running spaCy 2.0 on Python 3.6, and I only ran a quick test on a sample sentence:

nlp = spacy.load('en_core_web_lg')

doc = nlp("Here is a test sentence for me to use.")

I get two errors running your code, both on the line you point to:

heads = [(index, item) for index, item in enumerate(doc.to_array([HEAD]))]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'HEAD' is not defined

This is because HEAD was never defined in my session (it is normally imported from spacy.attrs); to_array also accepts attribute names as strings. Changing it to this:

# Note that HEAD is now a string, rather than a variable
heads = [(index, item) for index, item in enumerate(doc.to_array(['HEAD']))]
heads
[(0, 3), (1, 1), (2, 1), (3, 0), (4, 18446744073709551615), (5, 1), (6, 18446744073709551614), (7, 18446744073709551612)]

solves the problem. You'll also notice that each item returned by enumerate is an int (a scalar), so it cannot be indexed. Get rid of the [0] in item[0] and that should fix your problem.
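The IndexError itself can be reproduced without spaCy. This is a minimal sketch, assuming NumPy is installed; the array named column is a hypothetical stand-in for the 1-D result of doc.to_array(['HEAD']) shown above, not output from the real pipeline:

```python
import numpy as np

# Hypothetical stand-in for the 1-D array of head offsets shown above.
column = np.array([3, 1, 1, 0], dtype=np.uint64)

try:
    # Iterating a 1-D array yields NumPy scalars, so item[0] cannot work.
    heads = [index + item[0] for index, item in enumerate(column)]
except IndexError as exc:
    message = str(exc)
    print(message)
```

Iterating a 2-D array would yield rows instead, and item[0] would then be valid, which may be what the original code expected.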

Your method with no errors:

def parse_info(sentence):
    doc = nlp(sentence)

    heads = [index + item for index, item in enumerate(doc.to_array(['HEAD']))]
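One caveat: the very large values in the output above (such as 18446744073709551615) look like negative offsets stored as unsigned 64-bit integers, since 2**64 - 1 is how -1 appears in that representation. Adding them to index directly would give huge numbers rather than token positions. Below is a rough sketch of one way to handle that, assuming HEAD holds the head's offset relative to the token, as the index + item arithmetic implies. The helper names to_signed_64 and absolute_heads are hypothetical and not part of spaCy:

```python
def to_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as a signed one."""
    value = int(value)  # also accepts NumPy integer scalars
    return value - 2**64 if value >= 2**63 else value


def absolute_heads(offsets):
    """Add each token's index to its signed head offset."""
    return [index + to_signed_64(offset) for index, offset in enumerate(offsets)]


# Offsets copied from the output above.
raw_offsets = [3, 1, 1, 0, 18446744073709551615, 1,
               18446744073709551614, 18446744073709551612]

heads = absolute_heads(raw_offsets)
print(heads)  # [3, 2, 3, 3, 3, 6, 4, 3]
```

Inside parse_info, the same helper could be applied to doc.to_array(['HEAD']) in place of the plain index + item sum.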
