AttributeError: 'int' object has no attribute 'split'
Data is:
print(df)
           Content  Page no
0  My name is mark        3
1  My name is jeff        3
2  My name is bill        3
The code is:
df['doc_len'] = df['Content'].apply(lambda words: len(words.split()))
The error it's returning is:
AttributeError Traceback (most recent call last)
<ipython-input-26-7d2f1de16b3d> in <module>
----> 1 df['doc_len'] = df['Content'].apply(lambda words: len(words.split()))
~\t5\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
4136 else:
4137 values = self.astype(object)._values
-> 4138 mapped = lib.map_infer(values, f, convert=convert_dtype)
4139
4140 if len(mapped) and isinstance(mapped[0], Series):
pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-26-7d2f1de16b3d> in <lambda>(words)
----> 1 df['doc_len'] = df['Content'].apply(lambda words: len(words.split()))
AttributeError: 'int' object has no attribute 'split'
The Content column has dtype object, yet the error says that an 'int' object has no attribute 'split'.
You can try str.split() followed by str.len():
df['doc_len']=df['Content'].str.split().str.len()
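An object column can hold any Python object, so the error suggests that at least one value in Content is an int rather than a str. The sketch below uses made-up data (the third value is a hypothetical stray integer, not taken from the question) to show how such rows might be located, and how the .str methods return NaN for them instead of raising:

```python
import pandas as pd

# Hypothetical data: an object column where one value is an int, not a str.
df = pd.DataFrame({"Content": ["My name is mark", "My name is jeff", 3]})

# Flag the rows whose value is not a string.
not_str = ~df["Content"].map(lambda v: isinstance(v, str))
print(df[not_str])

# The .str accessor skips non-string values and yields NaN for them,
# so this does not raise AttributeError the way words.split() does.
doc_len = df["Content"].str.split().str.len()
print(doc_len)

# If non-string values should be counted as text, cast to str first.
doc_len_cast = df["Content"].astype(str).str.split().str.len()
print(doc_len_cast)
```

Whether the non-string rows should become NaN or be cast to text depends on what those values mean in your data.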
Or use str.count() and then add 1:
df['doc_len']=df['Content'].str.count(' ')+1
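Note that the two approaches only agree when words are separated by exactly one space and there is no leading or trailing whitespace. A small comparison on made-up strings (not the question's data) illustrates the difference:

```python
import pandas as pd

# Hypothetical strings: clean text, text with a double and a trailing space, empty text.
s = pd.Series(["My name is mark", "My  name is jeff ", ""])

# Splitting on any run of whitespace counts actual words.
by_split = s.str.split().str.len()

# Counting spaces and adding 1 counts separators, not words.
by_count = s.str.count(" ") + 1

print(pd.DataFrame({"text": s, "by_split": by_split, "by_count": by_count}))
```

If the text may contain irregular whitespace, the str.split() version is probably the safer choice.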