How do I save a .csv file after using pandas to convert string data to numeric data in specific columns?

Question

I wrote this program to convert the string data in the given columns into numerical data. The actual csv file is here.

> df.Sex[df.Sex == 'M'] = 1
> df.Sex[df.Sex == 'F'] = 0
>
> # changing ChestPainType of TA, ATA, NAP and ASY into 1, 2, 3 and 4
> df.ChestPainType[df.ChestPainType == 'TA'] = 1
> df.ChestPainType[df.ChestPainType == 'ATA'] = 2
> df.ChestPainType[df.ChestPainType == 'NAP'] = 3
> df.ChestPainType[df.ChestPainType == 'ASY'] = 4
>
> # changing ExerciseAngina of N = 0 and Y = 1
> df.ExerciseAngina[df.ExerciseAngina == 'N'] = 0
> df.ExerciseAngina[df.ExerciseAngina == 'Y'] = 1
>
> # changing RestingECG of Normal, ST and LVH into 1, 2 and 3
> df.RestingECG[df.RestingECG == 'Normal'] = 1
> df.RestingECG[df.RestingECG == 'ST'] = 2
> df.RestingECG[df.RestingECG == 'LVH'] = 3
>
> df.ST_Slope[df.ST_Slope == 'Up'] = 1
> df.ST_Slope[df.ST_Slope == 'Flat'] = 2
> df.ST_Slope[df.ST_Slope == 'Down'] = 3
>
> df.head()

and it shows the first 5 rows of the file.

But afterwards, when I try to print the correlations using this program:

pearsoncorr = df.corr(method = 'pearson')   #df = the .csv file i am working with. 

pearsoncorr

the output it shows me is this.

Here, I want to see the correlations of the new csv file with the changes I made earlier, which should be this file, and the expected correlation output should look almost like this, showing all the columns. But this correlation table shows me the correlations of this csv file.

The question is: how can I save the modified .csv file?

PS: I am new to this site, so I apologize for any errors I made, and I will be glad if you let me know how I can fix them.

Answer 1

You're most probably trying to set the values on a copy of a slice from the DataFrame:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See view-versus-a-copy and SettingWithCopyWarning in the pandas documentation for why this may be a problem.

When using df.Sex[df.Sex == 'M'] = 1, the original DataFrame may not have been altered. You can check this by calling df.info() and inspecting the Dtype column.

 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   Age             918 non-null    int64
 1   Sex             918 non-null    object
 2   ChestPainType   918 non-null    object
 3   RestingBP       918 non-null    int64
 ...

In the previously mentioned link from the pandas documentation, the recommended access method is .loc, either with a mask for multiple items (df.Sex == 'M' in your case) or with a fixed index for a single item:

df.loc[df.Sex == 'M', 'Sex'] = 1
df.loc[df.Sex == 'F', 'Sex'] = 0
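
One detail worth checking with the .loc approach: writing integers into a column that pandas stores as object usually leaves the Dtype as object, so an explicit conversion may still be needed before df.corr treats the column as numeric. The following is a minimal sketch under that assumption; the toy values are made up for illustration and are not taken from the actual csv file.

```python
import pandas as pd

# Toy data (made-up values). dtype=object mirrors the "object" Dtype
# that df.info() reports for the string columns in the question.
df = pd.DataFrame({
    "Age": [40, 49, 37],
    "Sex": pd.Series(["M", "F", "M"], dtype=object),
})

# Replace the labels in place on the original DataFrame via .loc
df.loc[df["Sex"] == "M", "Sex"] = 1
df.loc[df["Sex"] == "F", "Sex"] = 0

# Inspect the dtype after the replacement
print(df["Sex"].dtype)

# Convert explicitly so the column is stored as integers
df["Sex"] = df["Sex"].astype(int)
print(df["Sex"].dtype)
```

If astype(int) raises an error, some label in the column was probably not replaced.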

Another option is to use pandas map, which, in my opinion, better expresses the intent of the code.

df.Sex = df.Sex.map({'M':1, 'F':0})
df.ChestPainType = df.ChestPainType.map({'TA': 1, 'ATA': 2, 'NAP': 3, 'ASY': 4})
df.ExerciseAngina = df.ExerciseAngina.map({'N': 0, 'Y': 1})
...
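
One thing to keep in mind with map: any value that is missing from the dictionary becomes NaN, which also turns the column into floats. A small sketch of how one might check for that, using a made-up Series with a hypothetical unexpected label "??":

```python
import pandas as pd

chest_pain_codes = {"TA": 1, "ATA": 2, "NAP": 3, "ASY": 4}

# Toy Series; "??" is a hypothetical label that is not in the dictionary
s = pd.Series(["ATA", "NAP", "ASY", "TA", "??"], name="ChestPainType")

encoded = s.map(chest_pain_codes)

# Labels that had no entry in the dictionary end up as NaN in `encoded`
unmapped = s[encoded.isna()].unique()
print(encoded)
print("unmapped labels:", list(unmapped))
```

If the list of unmapped labels is empty, every value was covered by the dictionary.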

Checking the Dtype column again, we can confirm that the values are in fact integers, which allows you to use df.corr (or save the DataFrame with the new values to a csv with to_csv).

Data columns (total 12 columns):
#   Column          Non-Null Count  Dtype
---  ------          --------------  -----
0   Age             918 non-null    int64
1   Sex             918 non-null    int64
2   ChestPainType   918 non-null    int64
3   RestingBP       918 non-null    int64
...
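
To tie this back to the original question, here is a minimal sketch of saving the modified DataFrame and computing the correlations on the saved file. The toy values and the file name heart_encoded.csv are assumptions for illustration; in practice df would come from pd.read_csv on the actual file, and the output path would be wherever you want the new csv.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the real data (made-up values)
df = pd.DataFrame({
    "Age": [40, 49, 37, 48],
    "Sex": ["M", "F", "M", "F"],
    "ExerciseAngina": ["N", "N", "Y", "Y"],
})

# Encode the string columns as integers
df["Sex"] = df["Sex"].map({"M": 1, "F": 0})
df["ExerciseAngina"] = df["ExerciseAngina"].map({"N": 0, "Y": 1})

# Hypothetical output path; a temp directory keeps the sketch self-contained
out_path = os.path.join(tempfile.mkdtemp(), "heart_encoded.csv")

# index=False avoids writing the row index as an extra column
df.to_csv(out_path, index=False)

# Reload the saved file and compute the correlations on it
reloaded = pd.read_csv(out_path)
pearsoncorr = reloaded.corr(method="pearson")
print(pearsoncorr)
```

Because every column in the saved file is numeric, the correlation table should include all of them.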

Answer 1
0 2022-06-05 14:41:33
