ImportError in a simple NLTK example

Question

I'm new to Python and NLTK. When I run the following lines in the Python console:

import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
text = "toto. titi. tutu"
tokens = tokenizer.tokenize(text)
print(tokens)

I get what I expect. But when I run the same lines from a file, for example with the command python tokenize.py, I get errors:

C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\python.exe C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py
Traceback (most recent call last):
  File "C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py", line 1, in <module>
    import nltk.data
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\__init__.py", line 89, in <module>
    from nltk.internals import config_java
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\internals.py", line 11, in <module>
    import subprocess
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\subprocess.py", line 395, in <module>
    import threading
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\threading.py", line 10, in <module>
    from traceback import format_exc as _format_exc
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\traceback.py", line 3, in <module>
    import linecache
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\linecache.py", line 10, in <module>
    import tokenize
  File "C:\Documents\Dvpt\SemanticAndOpenData\scholar\scholar.py\tokenize.py", line 2, in <module>
    tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\data.py", line 786, in load
    resource_val = pickle.load(opened_resource)
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\__init__.py", line 63, in <module>
    from nltk.tokenize.simple   import (SpaceTokenizer, TabTokenizer, LineTokenizer,
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\simple.py", line 38, in <module>
    from nltk.tokenize.api import TokenizerI, StringTokenizer
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\api.py", line 13, in <module>
    from nltk.internals import overridden
ImportError: cannot import name 'overridden'
Process finished with exit code 1

I'm stuck on this problem and can't find a way to solve it. Thanks in advance for any suggestions.

Answer 1

You need to name your script something other than tokenize.py.

Answer 2

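The problem is that you have named your script tokenize.py, which is also the name of a standard-library module. Try renaming the file to something like my_tokenizer.py. What actually happens, as the traceback shows, is that the standard library's linecache module executes

import tokenize

Because the script's own directory comes first on the module search path, that statement picks up your file instead of the standard-library module. Your script is therefore run a second time while nltk is still only partly imported, and that is why you get the ImportError.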
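As a rough way to catch this kind of name clash before running a script, the check below compares a file name against the names of standard-library modules. It is only a sketch: the function name shadows_stdlib and the fallback list are made up for illustration, and sys.stdlib_module_names is only available on Python 3.10 and later, so on an older interpreter such as the 3.4 used in the question the small hard-coded fallback is what gets consulted.

```python
import os
import sys

# Hypothetical fallback for interpreters older than 3.10, which lack
# sys.stdlib_module_names. It only lists modules named in the traceback.
_FALLBACK_NAMES = frozenset(
    {"tokenize", "linecache", "traceback", "threading", "subprocess"}
)


def shadows_stdlib(script_path):
    """Return True if the script's file name matches a standard-library module name."""
    # "C:/some/dir/tokenize.py" -> "tokenize"
    stem = os.path.splitext(os.path.basename(script_path))[0]
    known = getattr(sys, "stdlib_module_names", _FALLBACK_NAMES)
    return stem in known


if __name__ == "__main__":
    for candidate in ("tokenize.py", "my_tokenizer.py"):
        print(candidate, shadows_stdlib(candidate))
```

A file for which this returns True can hide the standard-library module of the same name from every import in the program, including imports made inside the standard library itself, so renaming it is usually the simplest fix.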
