I am working with Vader from the nltk package. I've imported my dataset following the vader tutorial:
list = []
for line in open("C:\Users\Luca\Desktop\Uni\Tesi\PythonTest\paolo.txt","r").readlines():
    for value in line.split(","):
        list.append(value)
Then I've created the function to remove punctuation:
def _words_only(self):
    text_mod = REGEX_REMOVE_PUNCTUATION.sub('', self.text)
    words_only = text_mod.split()
    words_only = [word for word in words_only if len(word) > 1]
    return words_only
But when I try to use the "words only" function, I get this error:
AttributeError Traceback (most recent call last)
<ipython-input-14-cbc12179c890> in <module>()
----> 1 _words_only(list)
<ipython-input-13-68a545bbbaa4> in _words_only(self)
1 def _words_only(self):
----> 2 text_mod = REGEX_REMOVE_PUNCTUATION.sub('', self.text)
3 words_only = text_mod.split()
AttributeError: 'list' object has no attribute 'text'
I am really new to Python. Is it a problem in the importing process or is it something else? Thanks for your help.
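You don't show where or how you created the function _words_only(), but the self argument indicates that you patterned it on a class method. In a method, self refers to an object that has a text attribute. You're evidently using it as a stand-alone function, so self is simply whatever you pass in (here a list, which has no text attribute), like this: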
_words_only(list)
I would advise you not to tackle classes yet if you can avoid it. Write your function like this:
def words_only(text):
    text_mod = REGEX_REMOVE_PUNCTUATION.sub('', text)
    words_only = text_mod.split()
    words_only = [word for word in words_only if len(word) > 1]
    return words_only
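For reference, here is a minimal self-contained sketch of that function. The question doesn't show how REGEX_REMOVE_PUNCTUATION is defined, so the definition below is an assumption (a pattern that matches any character in string.punctuation); the sample sentence is made up for illustration.

```python
import re
import string

# Assumption: the question does not show this definition. This pattern
# matches any single ASCII punctuation character.
REGEX_REMOVE_PUNCTUATION = re.compile("[%s]" % re.escape(string.punctuation))


def words_only(text):
    """Return the words of one string, without punctuation or 1-letter words."""
    text_mod = REGEX_REMOVE_PUNCTUATION.sub("", text)
    words = text_mod.split()
    return [word for word in words if len(word) > 1]


# Hypothetical sample input, not taken from the asker's file.
sample = "Hello, world! I am a test."
print(words_only(sample))
```

Note that the length filter also drops one-letter words such as "I" and "a", which may or may not be what you want.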
You should also know that your function is designed to process one string, not a list of them. In addition, don't use builtin names like list as variable names; you're asking for a very confusing error in a day or two. Use a more informative name, or an abbreviation like lst:
lines = []
...
some_words = words_only(lines[0])
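To illustrate the kind of confusing error that shadowing list can cause, here is a small sketch. The function name shadowing_demo is hypothetical, and the shadowing is kept inside a function so it doesn't affect the rest of your session.

```python
def shadowing_demo():
    """Show what happens after the name `list` is rebound to a list object."""
    list = []  # from here on, `list` is this empty list, not the builtin type
    try:
        # This looks like an ordinary call to the builtin, but it now tries
        # to call the empty list object above.
        list("abc")
    except TypeError as exc:
        return str(exc)
    return "no error"


print(shadowing_demo())
```

The error message mentions a 'list' object that is not callable, which is hard to interpret if you have forgotten that the name was reassigned earlier.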
Since you actually want to work with the list of lines, apply the revised function to each one like this:
filtered_lines = [ words_only(line) for line in lines ]
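Putting the pieces together, the whole process might look like the sketch below. It rests on some assumptions: REGEX_REMOVE_PUNCTUATION is defined as a pattern over string.punctuation (the question doesn't show it), load_values is a hypothetical helper name, and a temporary file with made-up contents stands in for paolo.txt.

```python
import os
import re
import string
import tempfile

# Assumption: not shown in the question; matches any ASCII punctuation character.
REGEX_REMOVE_PUNCTUATION = re.compile("[%s]" % re.escape(string.punctuation))


def words_only(text):
    text_mod = REGEX_REMOVE_PUNCTUATION.sub("", text)
    return [word for word in text_mod.split() if len(word) > 1]


def load_values(path):
    """Read a file and return its comma-separated values as a flat list."""
    values = []
    with open(path, "r") as infile:
        for line in infile:
            for value in line.split(","):
                values.append(value.strip())  # drop the trailing newline
    return values


# Hypothetical stand-in for the real data file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Great movie!,Not so good...\nI loved it.\n")
    demo_path = tmp.name

lines = load_values(demo_path)
filtered_lines = [words_only(line) for line in lines]
os.remove(demo_path)

print(filtered_lines)
```

Each element of filtered_lines is itself a list of words, one per value read from the file.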
If you had wanted to work with the entire contents of the file, you would read in your text like this:
myfile = open(r"C:\Users\Luca\Desktop\Uni\Tesi\PythonTest\paolo.txt","r")
text = myfile.read()
myfile.close()
some_words = words_only(text)
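Two details in that snippet are worth a sketch. The r prefix makes the path a raw string, so backslashes are not treated as escape sequences, and a with block is a common alternative to calling close() yourself. The helper name read_text and the temporary file below are hypothetical, used only so the example runs on its own.

```python
import os
import tempfile


def read_text(path):
    """Return the whole file as one string; the with block closes the file."""
    with open(path, "r") as infile:
        return infile.read()


# In a regular string literal "\t" is a tab character; in a raw string it
# stays as a backslash followed by the letter t.
plain = "C:\temp"
raw = r"C:\temp"
print(len(plain), len(raw))

# Hypothetical stand-in for the real data file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Great movie!\nI loved it.\n")
    demo_path = tmp.name

text = read_text(demo_path)
os.remove(demo_path)
print(repr(text))
```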