TypeError: coercing to Unicode: need string or buffer, file found in python

I want to stem words, so I imported PorterStemmer from nltk, but an error occurred at run time.

The error is:

TypeError: coercing to Unicode: need string or buffer, file found

My Python code is:

  import nltk;     
  from nltk.stem import PorterStemmer  
  stemmer=PorterStemmer()  
  file = open('C:/Python26/test.txt','r')  
  f=open("root.txt",'w')  
  with open(file,'r',-1) as rf:  
    lines = rf.readlines()  
    for word in lines:  
        root = stemmer.stem(word)  
        f.write(root+"\n")  
    f.close()  

Yes, I tried it and got an error which I couldn't understand. The output was:

  1.6.2
  Traceback (most recent call last):
    File "C:\Python26\check.py", line 10, in <module>
      with open(file,'r',-1) as rf:
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xf8 in position 6: ordinal not in range(128)

My code after your recommended change is:

  import nltk;
  import numpy;
  import numpy as np
  from StringIO import StringIO
  print numpy.__version__
  from nltk.stem import PorterStemmer
  stemmer=PorterStemmer()
  file = np.genfromtxt('C:/Python26/test.txt', delimiter=" ")
  f=open("root.txt",'w')
  with open(file,'r',-1) as rf:
    lines = rf.readlines()
    for word in lines:
        root = stemmer.stem(word)
        f.write(root+"\n")
    f.close()

and my dummy file looks like this:

walking
talked
oranges
books
You have already opened the file, and you are then passing that file object to with open(...), which expects a filename. Remove the file = open('C:/... line.

PS: You will be iterating over lines, not words.
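
To stem individual words rather than whole lines, each line can be split on whitespace first. This is only a minimal sketch: `stem_lines` is a hypothetical helper name, and `stem` stands for any callable that maps a word to its root, such as `PorterStemmer().stem`.

```python
def stem_lines(lines, stem):
    """Return the stem of every whitespace-separated word in `lines`.

    `lines` is any iterable of strings (an open file works), and `stem`
    is a callable such as nltk's PorterStemmer().stem (an assumption here;
    any word -> root function will do).
    """
    roots = []
    for line in lines:
        # split() with no arguments also discards the trailing newline
        for word in line.split():
            roots.append(stem(word))
    return roots
```

With nltk installed, this could be called as `stem_lines(rf, PorterStemmer().stem)` inside the `with open(...) as rf:` block, and each returned root written to the output file.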

You open the file on line 4 and then pass the resulting file object as the filename to another open() call on line 6. Just do:

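from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
# Pass the path straight to open(); no separate file = open(...) call is needed
with open("root.txt", 'w') as f:
    with open('C:/Python26/test.txt', 'r') as rf:
        for line in rf:
            # Strip the trailing newline so it is neither stemmed nor written twice
            root = stemmer.stem(line.strip())
            f.write(root + "\n")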
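
The original error can also be reproduced in isolation, which may make the cause easier to see. The sketch below uses a hypothetical helper, `reopen_error`, that opens a path and then deliberately passes the resulting file object to open() a second time, as the question's code does. The exact message differs between Python versions, but the exception type is TypeError in both Python 2 and Python 3.

```python
import os
import tempfile


def reopen_error(path):
    """Open `path`, then wrongly pass the file object to open() again.

    Returns the exception that open() raises, or None if none was raised.
    """
    handle = open(path, "r")
    try:
        open(handle, "r")  # wrong: open() expects a path, not a file object
    except TypeError as exc:
        return exc
    finally:
        handle.close()
    return None


# Create a small throwaway file so the example does not depend on C:/Python26
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as out:
    out.write("walking\n")

print(type(reopen_error(path)).__name__)  # prints "TypeError"
```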

It seems that the problem is with the parameters passed to a function, and I'm guessing it's in the line root = stemmer.stem(word).

Try using the function genfromtxt instead of open():

>>> import numpy as np
>>> from StringIO import StringIO
>>> np.genfromtxt('C:/Python26/test.txt', delimiter=",") #Whatever delimiter your file has.

That should fix the problem.
