Issue altering values in a pandas DataFrame in Python
I have a data set that has been loaded into a pandas DataFrame. When I print data.head(), it looks like this:
   G1  G2  G3  absences  failures  studytime  romantic  internet
0   5   6   6         6         0          2        no        no
1   5   5   6         4         0          2        no       yes
2   7   8  10        10         3          2        no       yes
3  15  14  15         2         0          3       yes       yes
4   6  10  10         4         0          2        no        no
I am attempting to create a linear regression model and want to convert the "yes" and "no" values to 1s and 0s in the romantic and internet columns.
The code I used:
df['romantic'].replace('yes', 0)
df['romantic'].replace('no', 1)
df['internet'].replace('yes', 0)
df['internet'].replace('no', 1)
This did not work, and it did not display an error of any sort either.
I tried to make a linear model with
data = df[["G1", "G2", "G3", "absences", "failures", "studytime", "romantic", "internet"]]
and it showed:
ValueError: could not convert string to float: 'yes'
This happened even though I thought I had converted them. Please help, thanks...
To convert both your columns of interest, run:
df.romantic = (df.romantic == 'yes').astype(int)
df.internet = (df.internet == 'yes').astype(int)
Note also that you wrote "convert the yes' and no's to 1s and 0s", so in your code sample you attempt to assign the values just the opposite way.
If you want to replace every 'yes' with 0 and every 'no' with 1, use:
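# replace() returns a new DataFrame, so assign the result back
df = df.replace({'yes': 0, 'no': 1})
# alternatively, treat the dictionary keys as regular expressions
df = df.replace({'yes': 0, 'no': 1}, regex=True)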
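As a quick check of why the original attempt appeared to do nothing, here is a minimal sketch showing that replace() leaves the DataFrame it is called on unchanged and returns a converted copy instead. The two-row frame and the names converted, unchanged, romantic_codes and internet_codes are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, made up for illustration.
df = pd.DataFrame({"romantic": ["no", "yes"], "internet": ["yes", "no"]})

# replace() returns a new DataFrame; df itself is left untouched.
converted = df.replace({"yes": 0, "no": 1})

unchanged = df["romantic"].tolist()              # still the original strings
romantic_codes = converted["romantic"].tolist()  # values from the returned copy
internet_codes = converted["internet"].tolist()
print(unchanged, romantic_codes, internet_codes)
```

Unless the returned object is assigned to a name (or back to df), the converted values are simply discarded, which would explain the missing error message.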
Try this. It replaces all occurrences of 'yes' with 0 and all occurrences of 'no' with 1.
mapper = {'yes':0,'no':1}
df.loc[:,'romantic'] = df['romantic'].map(mapper)
df.loc[:,'internet'] = df['internet'].map(mapper)
Use the map function for this job.
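One thing worth knowing about map with a dictionary: any value that is not a key in the dictionary comes back as NaN. The sketch below illustrates this on a made-up Series containing an unexpected "maybe"; the names s, mapped and n_unmapped are hypothetical.

```python
import pandas as pd

# Hypothetical column with one unexpected value ("maybe"), made up for illustration.
s = pd.Series(["yes", "no", "maybe"])
mapper = {"yes": 0, "no": 1}

mapped = s.map(mapper)

# Values missing from the mapper come back as NaN, so count them before modelling.
n_unmapped = int(mapped.isna().sum())
print(n_unmapped)
```

A nonzero count here suggests the column holds values other than 'yes' and 'no', which is probably worth checking before fitting a regression.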
You need to assign the result back when replacing:
import pandas as pd
df = pd.DataFrame({'romantic':['no','no','no','yes','no'], 'internet':['no','yes','yes','yes','no']})
df
df['romantic'] = df['romantic'].replace('yes', 0)
df['romantic'] = df['romantic'].replace('no', 1)
df['internet'] = df['internet'].replace('yes', 0)
df['internet'] = df['internet'].replace('no', 1)
print(df)
   romantic  internet
0         1         1
1         1         0
2         1         0
3         0         0
4         1         1
There are more ways to do this in Python:
import numpy as np
# Option 1: apply with a lambda
df['romantic'] = df['romantic'].apply(lambda x: 0 if x == 'yes' else 1)
df['internet'] = df['internet'].apply(lambda x: 0 if x == 'yes' else 1)
# Option 2: np.where
df['romantic'] = np.where(df['romantic'] == 'yes',0,1)
df['internet'] = np.where(df['internet'] == 'yes',0,1)
# Option 3: map with a dictionary
df['romantic'] = df['romantic'].map(dict(yes = 0, no = 1))
df['internet'] = df['internet'].map(dict(yes = 0, no = 1))
All yield the same result when applied to the original 'yes'/'no' column (pick one of them rather than running them in sequence).
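The claim that these approaches agree can be checked directly. The following is a minimal sketch that applies each of the three to a fresh copy of the same column and compares the outputs; the Series col mirrors the romantic column of the example DataFrame above, and the names via_apply, via_where, via_map, results and all_equal are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column matching the romantic column of the example DataFrame.
col = pd.Series(["no", "no", "no", "yes", "no"], name="romantic")

# Each approach starts from the same untouched string column.
via_apply = col.apply(lambda x: 0 if x == "yes" else 1)
via_where = pd.Series(np.where(col == "yes", 0, 1), index=col.index)
via_map = col.map({"yes": 0, "no": 1})

results = [via_apply.tolist(), via_where.tolist(), via_map.tolist()]
all_equal = results[0] == results[1] == results[2]
print(all_equal)
```

The agreement only holds while the column contains nothing but 'yes' and 'no'. With any other value, apply and np.where as written send it to 1, whereas map returns NaN for it.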