Pandas: Applying a transformation to all character columns

Question

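I am using Python to apply some transformations to all character/string columns in a pandas DataFrame. The transformations are:

Convert everything to upper case
Trim whitespace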

I come from an R background, where this can be done with something like:


mydf <- mydf %>% 
  dplyr::mutate_if(is.character, toupper) %>%
  dplyr::mutate_if(is.character, trimws)

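In Python, I am at a loss. I have tried the following, which first identifies all character columns and then tries to trim the whitespace and upper-case them (in this case, species):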

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Create a sample dataset
iris = load_iris()

df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                  columns=iris['feature_names'] + ['target'])

df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)

# Make character columns upper case and then trim the white space
string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.upper())
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.strip())

df

I appreciate this may be a very basic question, and thank you in advance to anyone who takes the time to help.

Answer 1

You should be able to do this in one line with method chaining:

df.astype(str).apply(lambda x: x.str.upper().str.strip())

Output:

     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target    species
0                  5.1               3.5                1.4               0.2     0.0     SETOSA
1                  4.9               3.0                1.4               0.2     0.0     SETOSA
2                  4.7               3.2                1.3               0.2     0.0     SETOSA
3                  4.6               3.1                1.5               0.2     0.0     SETOSA
4                  5.0               3.6                1.4               0.2     0.0     SETOSA
...                ...               ...                ...               ...     ...        ...
145                6.7               3.0                5.2               2.3     2.0  VIRGINICA
146                6.3               2.5                5.0               1.9     2.0  VIRGINICA
147                6.5               3.0                5.2               2.0     2.0  VIRGINICA
148                6.2               3.4                5.4               2.3     2.0  VIRGINICA
149                5.9               3.0                5.1               1.8     2.0  VIRGINICA
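
Note that `astype(str)` in the one-liner converts every column to strings, including the numeric ones. If only the text columns should change, which is closer to what R's `mutate_if(is.character, ...)` does, a minimal sketch might look like the following. The sample frame and the names `text_cols`, `length`, and `note` are made up for illustration, and it assumes a pandas version in which `select_dtypes` accepts `"string"` (1.0 or later, as far as I know). The categorical `species` column is included explicitly, because it is neither `object` nor `string`.

```python
import pandas as pd

# Hypothetical sample frame (not the iris data from the question)
df = pd.DataFrame({
    "length": [5.1, 4.9, 6.3],
    "species": pd.Categorical(["  setosa", "setosa ", " virginica "]),
    "note": ["  ok ", "check", " ok"],
})

# Pick only text-like columns: object, pandas "string", and category
text_cols = df.select_dtypes(include=["object", "string", "category"]).columns

# Cast each selected column to str, then upper-case and strip it;
# numeric columns are not touched
df[text_cols] = df[text_cols].apply(
    lambda s: s.astype(str).str.upper().str.strip()
)

print(df.dtypes)
print(df)
```

The result is assigned back to `df`, since `apply` returns a new object rather than modifying the frame in place. Depending on the pandas version, missing values may be turned into literal text by `astype(str)`, so they may need separate handling.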

Answer 1
4 Accepted 2022-07-20 12:05:25