I'm new with Python and NLTK When I test the following lines in the Python console
import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
text ="toto. titi. tutu"
tokens = tokenizer.tokenize(text)
print(tokens)
I get what I expect. But when I execute these lines from a file, for example with the command line > python tokenize.py
, I get errors:
C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\python.exe C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py
Traceback (most recent call last):
File "C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py", line 1, in <module>
import nltk.data
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\__init__.py", line 89, in <module>
from nltk.internals import config_java
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\internals.py", line 11, in <module>
import subprocess
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\subprocess.py", line 395, in <module>
import threading
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\threading.py", line 10, in <module>
from traceback import format_exc as _format_exc
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\traceback.py", line 3, in <module>
import linecache
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\linecache.py", line 10, in <module>
import tokenize
File "C:\Documents\Dvpt\SemanticAndOpenData\scholar\scholar.py\tokenize.py", line 2, in <module>
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\data.py", line 786, in load
resource_val = pickle.load(opened_resource)
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\__init__.py", line 63, in <module>
from nltk.tokenize.simple import (SpaceTokenizer, TabTokenizer, LineTokenizer,
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\simple.py", line 38, in <module>
from nltk.tokenize.api import TokenizerI, StringTokenizer
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\api.py", line 13, in <module>
from nltk.internals import overridden
ImportError: cannot import name 'overridden'
Process finished with exit code 1
And I'm stuck on the problem and I can't find a way to solve it. Thanks in advance for any useful proposal.
您需要将脚本命名为tokenize.py
以外的名称
The problem here is that you have named your script as tokenize.py . Try renaming the file to something like my_tokenizer.py . Actually what is happening is that when you are using
import tokenize
What it is doing is trying to import the current file itself and thus you are getting the errors.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.