I would like to build a classifier of tweets using Python 3. I got the following error:
AttributeError: 'DataFrame' object has no attribute 'id'
when I ran the following code:
train_df['id'] = train_df.id.apply(lambda x: int(x))
train_df['friends_count'] = train_df.friends_count.apply(lambda x: int(x))
train_df['followers_count'] = train_df.followers_count.apply(lambda x: 0 if x=='None' else int(x))
train_df['friends_count'] = train_df.friends_count.apply(lambda x: 0 if x=='None' else int(x))
The dataset (screenshot not included) is a CSV file. I would like to get a list of all the columns in the dataset rather than scrolling manually. I checked my pandas version and it seems to be up to date. I am pretty new to Python, so I hope you can help me figure out what I am doing wrong. Thanks.
EDIT:
AttributeError Traceback (most recent call last)
<ipython-input-32-13c1484f8b58> in <module>
2 train_df = df.copy()
3 #train_df['id'] = train_df.id.apply(lambda x: int(x))
----> 4 train_df['friends_count'] = train_df.friends_count.apply(lambda x: int(x))
5 train_df['followers_count'] = train_df.followers_count.apply(lambda x: 0 if x=='None' else int(x))
6 train_df['friends_count'] = train_df.friends_count.apply(lambda x: 0 if x=='None' else int(x))
/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
4374 IE10 404 0.08
4375 Chrome 200 0.02
-> 4376
4377 >>> df.reindex(new_index, fill_value='missing')
4378 http_status response_time
AttributeError: 'DataFrame' object has no attribute 'friends_count'
https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
Read this article.
You need to learn a bit more about pandas and how it works before the answer to this question would even be helpful.
Currently, your columns are simply labelled 0, 1, 2, ... so attribute access such as train_df.friends_count fails. You probably want to use the first row as the column names. To do that, assign the first data row to the columns in one of the following ways:
train_df.columns = train_df.iloc[0]
or
train_df = train_df.rename(columns=train_df.iloc[0])
Then you will be able to perform the operations you are attempting. You can also remove the leftover header row from the data. Note that drop returns a new DataFrame, so assign the result back:
train_df = train_df.drop(train_df.index[0])
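Putting the steps above together, here is a minimal sketch. The CSV text is a hypothetical stand-in, since the real file is not shown, and passing header=None is only an assumed way of reproducing the 0, 1, 2 column labels. It also prints the column list, which is what the question asked for, and uses pd.to_numeric as one possible replacement for the per-row lambdas.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file, which is not shown in the question.
csv_text = "id,friends_count,followers_count\n1,10,None\n2,None,5\n"

# Assumed reproduction of the problem: the header row is read as data,
# so the columns end up labelled 0, 1, 2.
raw_df = pd.read_csv(io.StringIO(csv_text), header=None)
print(raw_df.columns.tolist())  # list every column label without scrolling

# Promote the first row to column names, then drop that row from the data.
train_df = raw_df.copy()
train_df.columns = train_df.iloc[0]
train_df = train_df.drop(train_df.index[0]).reset_index(drop=True)
train_df.columns.name = None  # discard the leftover label from the promoted row
print(train_df.columns.tolist())

# Convert the count columns to integers; anything non-numeric
# (such as the string 'None' or a missing value) becomes 0.
for col in ["friends_count", "followers_count"]:
    train_df[col] = pd.to_numeric(train_df[col], errors="coerce").fillna(0).astype(int)

print(train_df)
```

If the file does have a header line, it may be simpler to let read_csv pick it up when loading (that is its default behaviour) than to repair the columns afterwards; which applies depends on how df was created, which the question does not show.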