How do I save a .csv file after using pandas to convert string data in particular columns to numeric data?

Question

I wrote this program to convert string data into numeric data for the given columns. The actual csv file is here.

> df.Sex[df.Sex == 'M'] = 1
> df.Sex[df.Sex == 'F'] = 0
> # changing ChestPainType of TA, ATA, NAP and ASY into 1, 2, 3 and 4
> df.ChestPainType[df.ChestPainType == 'TA'] = 1
> df.ChestPainType[df.ChestPainType == 'ATA'] = 2
> df.ChestPainType[df.ChestPainType == 'NAP'] = 3
> df.ChestPainType[df.ChestPainType == 'ASY'] = 4
> # changing ExerciseAngina of N = 0 and Y = 1
> df.ExerciseAngina[df.ExerciseAngina == 'N'] = 0
> df.ExerciseAngina[df.ExerciseAngina == 'Y'] = 1
> # changing RestingECG of Normal, ST and LVH into 1, 2 and 3
> df.RestingECG[df.RestingECG == 'Normal'] = 1
> df.RestingECG[df.RestingECG == 'ST'] = 2
> df.RestingECG[df.RestingECG == 'LVH'] = 3
>
> df.ST_Slope[df.ST_Slope == 'Up'] = 1
> df.ST_Slope[df.ST_Slope == 'Flat'] = 2
> df.ST_Slope[df.ST_Slope == 'Down'] = 3
> df.head()

It shows the first 5 rows of the file as output.

But afterwards, when I try to print the correlation using this program:

pearsoncorr = df.corr(method = 'pearson')   #df = the .csv file i am working with. 

pearsoncorr

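The output it shows me looks like this.

Here, I want to see the correlation for the new csv file that I changed earlier, which should be this file, and the expected correlation output should show almost all columns, like this. But this correlation table shows me the correlation for this csv file.

The question is: how do I save the modified .csv file?

PS: I am new to this site, so I apologize if I have made any mistakes, and I would be glad if you let me know how to fix them.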

Answer 1

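You are most likely trying to set values on a copy of a slice from a DataFrame:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

You can read about why this can be a problem in the pandas documentation on view-versus-a-copy and SettingWithCopyWarning.

When you use df.Sex[df.Sex == 'M'] = 1, the original DataFrame may not have been changed. You can verify this with df.info() and look at the Dtype column.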

 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   Age             918 non-null    int64
 1   Sex             918 non-null    object
 2   ChestPainType   918 non-null    object
 3   RestingBP       918 non-null    int64
 ...
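
A quick way to see which columns are still text is to list the numeric and non-numeric columns side by side. This is only a sketch: the sample rows below are made up, and only the column names come from the question. Depending on the pandas version, df.corr() either silently leaves out non-numeric columns or raises an error on them, so any column listed as text here will not show up in the correlation table.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Age": [40, 49, 37, 48],
    "Sex": ["M", "F", "M", "F"],
    "ChestPainType": ["ATA", "NAP", "ATA", "ASY"],
    "RestingBP": [140, 160, 130, 138],
})

# Columns that pandas already treats as numeric.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Columns still stored as text; these need converting before they can be correlated.
text_cols = [c for c in df.columns if c not in numeric_cols]

print(df.dtypes)
print("numeric:", numeric_cols)
print("still text:", text_cols)
```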

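In the pandas documentation linked above, the recommended access method is .loc, both for multiple items (in your case using a mask, df.Sex == 'M') and for single items using a fixed index: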

df.loc[df.Sex == 'M', 'Sex'] = 1
df.loc[df.Sex == 'F', 'Sex'] = 0
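
One thing to watch with the .loc approach: it replaces the values, but a column that started out as object dtype stays object dtype, so an explicit conversion is probably still needed before df.corr() will pick it up. A minimal sketch, assuming made-up sample rows and a Sex column stored as object (as in the df.info() output above):

```python
import pandas as pd

# Hypothetical sample rows. The column is created as object dtype on purpose,
# matching the df.info() output shown in this answer.
df = pd.DataFrame({
    "Age": [40, 49, 37, 48],
    "Sex": pd.Series(["M", "F", "M", "F"], dtype="object"),
})

# Compute both masks first so the second one does not depend on the first assignment.
is_male = df["Sex"] == "M"
is_female = df["Sex"] == "F"

df.loc[is_male, "Sex"] = 1
df.loc[is_female, "Sex"] = 0

# .loc changed the values, but the column is still object dtype,
# so convert it explicitly before calling df.corr().
df["Sex"] = df["Sex"].astype(int)

print(df.dtypes)
```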

Another option is to use pandas map, which in my opinion expresses the intent of the code better.

df.Sex = df.Sex.map({'M':1, 'F':0})
df.ChestPainType = df.ChestPainType.map({'TA': 1, 'ATA': 2, 'NAP': 3, 'ASY': 4})
df.ExerciseAngina = df.ExerciseAngina.map({'N': 0, 'Y': 1})
...

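Checking the Dtype column again, we can confirm that the values are now actually integer types, which lets you use df.corr (or use to_csv to save the DataFrame with the new values to a csv).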

Data columns (total 12 columns):
#   Column          Non-Null Count  Dtype
---  ------          --------------  -----
0   Age             918 non-null    int64
1   Sex             918 non-null    int64
2   ChestPainType   918 non-null    int64
3   RestingBP       918 non-null    int64
...
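
To tie this back to the original question of saving the modified file, the whole round trip might look roughly like the sketch below. The sample rows and the output file name heart_numeric.csv are assumptions made for illustration; the mappings are the ones from the thread.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows; only the column names and codes come from the thread.
df = pd.DataFrame({
    "Age": [40, 49, 37, 48],
    "Sex": ["M", "F", "M", "F"],
    "ExerciseAngina": ["N", "N", "Y", "N"],
})

df["Sex"] = df["Sex"].map({"M": 1, "F": 0})
df["ExerciseAngina"] = df["ExerciseAngina"].map({"N": 0, "Y": 1})

# map() turns any value missing from the dictionary into NaN, so check for that.
assert not df.isna().any().any(), "some values were not covered by the mapping"

# Written to a temporary directory here; in practice pass your own output path.
# The file name is a placeholder, not one used in the question.
out_path = os.path.join(tempfile.mkdtemp(), "heart_numeric.csv")
df.to_csv(out_path, index=False)  # index=False avoids an extra unnamed index column

# Reload the saved file and compute the correlation on it.
reloaded = pd.read_csv(out_path)
pearsoncorr = reloaded.corr(method="pearson")
print(pearsoncorr)
```

Passing index=False is a design choice: without it, to_csv also writes the row index, which comes back as an extra unnamed column when the file is read again and would appear in the correlation table.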

Solution 1
0 2022-06-05 14:41:33
