I want to use the string functions below (text.lower and re.findall) on a pandas Series instead of on the contents of a text file. I tried different ways of converting the Series to a list and then to a string, but had no luck, and I still cannot apply the function directly. Any help is much appreciated.
import re
from collections import Counter

def words(text):
    return re.findall(r'\w+', text.lower())
WORDS = Counter(words(open('some.txt').read()))
I think you need to use Series.apply with your function:
s = pd.Series(['Aasa dsad d','GTH rr','SSD'])
print (s)
0 Aasa dsad d
1 GTH rr
2 SSD
dtype: object
import re

def words(text):
    return re.findall(r'\w+', text.lower())
print (s.apply(words))
0 [aasa, dsad, d]
1 [gth, rr]
2 [ssd]
dtype: object
But in pandas it is better to use str.lower and str.findall, because they also work with NaNs:
print (s.str.lower().str.findall(r'\w+'))
0 [aasa, dsad, d]
1 [gth, rr]
2 [ssd]
dtype: object
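To get back to the Counter from the question, the per-row token lists can be flattened and counted. This is a minimal sketch rather than part of the original answer; the sample Series (including the NaN row) and the names tokens and WORDS are illustrative assumptions.

```python
from collections import Counter
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value
s = pd.Series(['Aasa dsad d', 'GTH rr', np.nan, 'SSD rr'])

# Vectorized lowercasing and tokenizing; the NaN row stays NaN
tokens = s.str.lower().str.findall(r'\w+')

# Drop the missing rows, flatten the lists of words, and count them
WORDS = Counter(chain.from_iterable(tokens.dropna()))
print(WORDS)
```

Dropping the missing rows before flattening matters here, because a NaN is a float and cannot be iterated like a list of words.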
Something like this?
from collections import Counter
import pandas as pd
series = pd.Series(['word', 'Word', 'WORD', 'other_word'])
counter = Counter(series.apply(lambda x: x.lower()))
print(counter)
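If only the counts of whole lowercased values are needed, value_counts may be a vectorized alternative to the Counter above. This is a sketch using the same sample data; the name counts is an assumption, and the result is a pandas Series rather than a Counter.

```python
import pandas as pd

series = pd.Series(['word', 'Word', 'WORD', 'other_word'])

# Lowercase every value, then count how often each distinct value occurs
counts = series.str.lower().value_counts()
print(counts)

# Individual counts can be looked up by label
print(counts['word'])
```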