Python Pandas: Dataframe is not updating using string methods

I'm trying to update the strings in a .csv file that I am reading with Pandas. The .csv has a column named 'About', which contains the rows of data I want to manipulate.

I've already used the .str methods to update the column, but the changes are not reflected in the exported .csv file. Some of my code can be seen below.

import pandas as pd

df = pd.read_csv('data.csv')
df.About.str.lower() #About is the column I am trying to update
df.About.str.replace('[^a-zA-Z ]', '')
df.to_csv('newdata.csv')

You need to assign the output back to the column. Both operations can also be chained, because they work on the same column, About. And because the values are converted to lowercase first, the regex only needs to exclude lowercase letters rather than both cases:

df = pd.read_csv('data.csv')
# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
df.to_csv('newdata.csv', index=False)
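
To make the reason concrete: the .str methods return a new Series and leave the DataFrame untouched, so the result has to be assigned back. The following is a minimal sketch, assuming a small in-memory frame in place of data.csv; the variable names lowered, unchanged and updated are only for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
df = pd.DataFrame({'About': ['AaSD14%', 'SDD Aa']})

# Calling a .str method returns a new Series; df itself is not modified
lowered = df['About'].str.lower()
unchanged = df['About'].tolist()  # still the original strings

# Assigning the result back is what actually updates the column
df['About'] = df['About'].str.lower().str.replace('[^a-z ]', '', regex=True)
updated = df['About'].tolist()

print(unchanged)
print(updated)
```

Only after the assignment does a later df.to_csv(...) write the cleaned values.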

Sample:

df = pd.DataFrame({'About':['AaSD14%', 'SDD Aa']})

df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
print (df)
    About
0    aasd
1  sdd aa
import pandas as pd

columns = ['About']
data = ["ALPHA", "OMEGA", "ALpHOmGA"]
df = pd.DataFrame(data, columns=columns)
# Assign the result back to the column; regex=True makes the pattern a regex on pandas 2.0+
df.About = df.About.str.lower().str.replace('[^a-zA-Z ]', '', regex=True)
print(df)

OUTPUT:

      About
0     alpha
1     omega
2  alphomga

Example DataFrame:

>>> df
        About
0      JOHN23
1     PINKO22
2   MERRY jen
3  Soojan San
4      Remo55

Solution: another way, using a compiled regex with flags

>>> df.About.str.lower().str.replace(regex_pat, '', regex=True)
0          john
1         pinko
2     merry jen
3    soojan san
4          remo
Name: About, dtype: object

Explanation:

[^a-z]+ matches a single character not present in the list below:

a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive).

+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy).

$ asserts position at the end of a line.
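
The answer does not show how regex_pat is defined. Going by the explanation above, a plausible reconstruction is the pattern [^a-z]+$ compiled with re.compile; treat that definition as an assumption, not the answerer's exact code. A minimal sketch on the example DataFrame:

```python
import re
import pandas as pd

df = pd.DataFrame({'About': ['JOHN23', 'PINKO22', 'MERRY jen', 'Soojan San', 'Remo55']})

# Assumed pattern, pieced together from the explanation: a run of
# characters outside a-z, anchored at the end of the string
regex_pat = re.compile(r'[^a-z]+$')

# Lowercase first, then strip the trailing run of non-letters
cleaned = df['About'].str.lower().str.replace(regex_pat, '', regex=True)
print(cleaned.tolist())

# The anchor means only trailing characters are removed
partial = pd.Series(['a1b2']).str.replace(regex_pat, '', regex=True)
print(partial.tolist())
```

Because of the $ anchor, this variant keeps spaces and any non-letter characters in the middle of a value; use the [^a-z ] pattern from the first answer if those should be removed everywhere.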
