
How do I save the .csv file after converting the string data in specific columns into numerical data using pandas?

I wrote this program to convert the string data in specific columns into numerical data. The actual csv file is here.

> df.Sex[df.Sex == 'M'] = 1
> df.Sex[df.Sex == 'F'] = 0
> # changing ChestPainType of TA, ATA, NAP and ASY into 1, 2, 3 and 4
> df.ChestPainType[df.ChestPainType == 'TA'] = 1
> df.ChestPainType[df.ChestPainType == 'ATA'] = 2
> df.ChestPainType[df.ChestPainType == 'NAP'] = 3
> df.ChestPainType[df.ChestPainType == 'ASY'] = 4
> # changing ExerciseAngina of N = 0 and Y = 1
> df.ExerciseAngina[df.ExerciseAngina == 'N'] = 0
> df.ExerciseAngina[df.ExerciseAngina == 'Y'] = 1
> # changing RestingECG of Normal, ST and LVH into 1, 2 and 3
> df.RestingECG[df.RestingECG == 'Normal'] = 1
> df.RestingECG[df.RestingECG == 'ST'] = 2
> df.RestingECG[df.RestingECG == 'LVH'] = 3
>
> df.ST_Slope[df.ST_Slope == 'Up'] = 1
> df.ST_Slope[df.ST_Slope == 'Flat'] = 2
> df.ST_Slope[df.ST_Slope == 'Down'] = 3
> df.head()

It shows the first 5 rows of the file.

But afterwards, when I try to print the correlations using this program:

pearsoncorr = df.corr(method = 'pearson')   #df = the .csv file i am working with. 

pearsoncorr

The output it shows me is this.

Here, I want to see the correlations of the new csv file I modified earlier, which should be this file, and the expected correlation output should look almost like this, showing all the columns. But this correlation table is showing me the correlations of this csv file.

The question is: how can I save the modified .csv file?

PS: I am new to this site, so I apologize for any mistakes I made, and I will be glad if you let me know how I can fix them.

You're most probably trying to set the values on a copy of a slice from the DataFrame:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See "view versus a copy" and SettingWithCopyWarning in the Pandas documentation for why this may be a problem.

When using df.Sex[df.Sex == 'M'] = 1, the original DataFrame may not have been altered. You can check this by calling df.info() and inspecting the Dtype column.

 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   Age             918 non-null    int64
 1   Sex             918 non-null    object
 2   ChestPainType   918 non-null    object
 3   RestingBP       918 non-null    int64
 ...

The Pandas documentation linked above recommends .loc as the access method, both for multiple items (using a mask, df.Sex == 'M' in your case) and for a single item using a fixed index:

df.loc[df.Sex == 'M', 'Sex'] = 1
df.loc[df.Sex == 'F', 'Sex'] = 0
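
One caveat with the .loc assignments above, shown as a minimal sketch on invented data: replacing values inside a column of strings typically leaves the column's dtype as object, so an explicit cast may still be needed before the column is treated as numeric. The sample values and the explicit dtype=object are assumptions made for illustration, not part of the original dataset.

```python
import pandas as pd

# Invented sample; dtype=object mimics a column of strings as loaded from a CSV.
df = pd.DataFrame({"Sex": pd.Series(["M", "F", "M"], dtype=object)})

# Mask-based assignment on the original DataFrame (no chained indexing).
df.loc[df["Sex"] == "M", "Sex"] = 1
df.loc[df["Sex"] == "F", "Sex"] = 0

# The values are integers now, but the column is still stored as object.
print(df["Sex"].dtype)

# Cast explicitly to get a numeric column.
df["Sex"] = df["Sex"].astype(int)
print(df["Sex"].dtype)
```

Printing the dtype before and after the cast makes the difference visible in whatever pandas version is installed.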

Another option would be to use Pandas map, which, in my opinion, better expresses the intent of the code.

df.Sex = df.Sex.map({'M':1, 'F':0})
df.ChestPainType = df.ChestPainType.map({'TA': 1, 'ATA': 2, 'NAP': 3, 'ASY': 4})
df.ExerciseAngina = df.ExerciseAngina.map({'N': 0, 'Y': 1})
...

Checking the Dtype column again, we can confirm that the values are in fact of integer type, allowing you to use df.corr (or save the DataFrame with the new values to a csv with to_csv).

Data columns (total 12 columns):
#   Column          Non-Null Count  Dtype
---  ------          --------------  -----
0   Age             918 non-null    int64
1   Sex             918 non-null    int64
2   ChestPainType   918 non-null    int64
3   RestingBP       918 non-null    int64
...
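
For the saving step itself, here is a minimal sketch of how it could look end to end. It assumes a tiny invented stand-in for the dataset and a hypothetical output file name (heart_numeric.csv) in a temporary directory; the real column list and path would differ.

```python
import os
import tempfile

import pandas as pd

# Tiny stand-in for the heart dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Age": [40, 49, 37, 48],
    "Sex": ["M", "F", "M", "F"],
    "ExerciseAngina": ["N", "N", "Y", "Y"],
})

# Replace the string codes with integers using map().
df["Sex"] = df["Sex"].map({"M": 1, "F": 0})
df["ExerciseAngina"] = df["ExerciseAngina"].map({"N": 0, "Y": 1})

# Hypothetical output path; index=False keeps the row index out of the file.
out_path = os.path.join(tempfile.mkdtemp(), "heart_numeric.csv")
df.to_csv(out_path, index=False)

# Reading the file back gives numeric columns, so corr() can include them all.
saved = pd.read_csv(out_path)
pearsoncorr = saved.corr(method="pearson")
print(pearsoncorr)
```

Passing index=False is a design choice here: without it, to_csv also writes the row index, which would come back as an extra unnamed column when the file is read again.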
