Check each row and column in a DataFrame and replace values with a user-defined function
import pandas as pd

df = pd.DataFrame({'0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
                   '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
                   '3': ["abc-aaa", "NaN", "abt-9ac", "xyz-9vc"]})
I have this DataFrame, and I want to check every row and every column for a specific pattern. For example, column '0' holds 4 values: "qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc".
For every value, I want to check whether it has the form xxx-Nxx, where N is any digit.
Example:
qwa-abc has the letter a at position 4, so nothing should happen. When the check reaches xyz-0vc, there is the digit 0 at position 4, so it should run a user-defined function and replace the whole value (xyz-0vc) with whatever that function returns.
NOTE: I tried str.replace, but it only supports a specific, predefined replacement string. Here the user function connects to a different system and fetches a string, so the replacement is not known in advance.
If you want to change all the cells in your DataFrame, you need to use df.apply along the row axis (axis=1), so your custom function needs to take a pd.Series as one of its parameters. In this example, row is the Series.
This generator function iterates over each cell in the row and checks whether the character at index 4 is numeric. If it is, the function yields the replacement value; otherwise it yields the value of the cell itself.
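def replace_value(row, value):
    for cell in row:
        # Skip missing values and strings too short to have an index 4 (e.g. "NaN")
        if isinstance(cell, str) and len(cell) > 4 and cell[4].isnumeric():
            yield value
        else:
            yield cell

# Reuse the row's own index so the result keeps the original column labels
df.apply(lambda x: pd.Series(list(replace_value(x, 'myvalue')), index=x.index), axis=1)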
You then apply your custom function row-wise (axis=1), wrap it in a lambda so you can pass additional arguments (value in this case), and call pd.Series on the iterator returned by the function.
Hope it makes sense.
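A cell-wise variant of the same idea can be sketched as follows, assuming the replacement comes from a function rather than a fixed string. Here fetch_replacement is a hypothetical stand-in for the call to the external system, and the length check is an added assumption so that short strings such as "NaN" are left alone.

```python
import pandas as pd

df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", "NaN", "abt-9ac", "xyz-9vc"],
})


def fetch_replacement(cell):
    # Hypothetical placeholder: the real function would query another system.
    return cell.upper()


def replace_cell(cell):
    # Only strings with a digit at index 4 are replaced; everything else passes through.
    if isinstance(cell, str) and len(cell) > 4 and cell[4].isdigit():
        return fetch_replacement(cell)
    return cell


# Map every column cell by cell; column labels and the index are preserved.
result = df.apply(lambda col: col.map(replace_cell))
print(result)
```

Mapping column by column avoids rebuilding each row as a new Series, so the original column labels carry over without extra work.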
You don't need a separate method. Try this:
In [1200]: df.loc[df['0'].str[4].str.isdigit(), '0'] = 'myvalue'

In [1201]: df
Out[1201]:
         0         1        3
0  qwa-abc   axc-0aa  abc-aaa
1  abd-xyz   abd-xyz      NaN
2  abt-Rac   abt-Rac  abt-9ac
3  myvalue  xyz-1avc  xyz-9vc
In [1242]: def check_digit(cols, new_val):
      ...:     for i in cols:
      ...:         df.loc[(df[i].str[4].str.isdigit()) & (df[i].notna()), i] = new_val
      ...:

In [1243]: check_digit(df.columns, 'myval')  # updates df in place, so no df.apply wrapper is needed

In [1244]: df
Out[1244]:
         0        1        3
0  qwa-abc    myval  abc-aaa
1  abd-xyz  abd-xyz      NaN
2  abt-Rac  abt-Rac    myval
3    myval    myval    myval
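The column loop above can also be written without an explicit loop. The following is a minimal sketch, not the answer's own method: it builds a boolean mask with Series.str.match and then uses DataFrame.mask to put a fixed value wherever the mask is True. The regular expression is an assumption that stands in for the "digit at index 4" check.

```python
import pandas as pd

df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", "NaN", "abt-9ac", "xyz-9vc"],
})

# str.match anchors at the start of the string: four arbitrary characters,
# then a digit. na=False makes missing values count as "no match".
mask = df.apply(lambda col: col.str.match(r".{4}\d", na=False))

# DataFrame.mask replaces the cells where the mask is True and returns a new frame.
result = df.mask(mask, "myval")
print(result)
```

Because DataFrame.mask returns a new DataFrame, df itself is left untouched, unlike the in-place df.loc assignment.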
This answer is based on @NomadMonad.
string_replacer() is a function that returns a new value for any input value that satisfies the condition.
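def replace_value(row, value):
    for cell in row:
        try:
            if pd.notna(cell) and cell[4].isnumeric():
                # Ask string_replacer for the replacement of this specific cell
                value = string_replacer(cell)
                yield value
            else:
                yield cell
        except Exception:
            # Report the problem and keep the cell, so the row keeps its length
            print(row, value)
            yield cell

if_df.apply(lambda x: pd.Series(replace_value(x, value)), axis=1)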
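As a possible alternative to the generator, Series.str.replace accepts a callable as the replacement when regex=True; the callable receives the regex match object. The sketch below assumes that a pattern matching the whole string is an acceptable way to express "digit at index 4". The body of string_replacer here is a hypothetical placeholder for the real lookup.

```python
import pandas as pd


def string_replacer(cell):
    # Hypothetical placeholder: the real function would fetch a value elsewhere.
    return "replaced:" + cell


df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", "NaN", "abt-9ac", "xyz-9vc"],
})

# Matches the entire string, but only when the character at index 4 is a digit.
pattern = r"^.{4}\d.*$"

# The callable is invoked once per matching cell with the match object;
# m.group(0) is the full original cell value.
result = df.apply(
    lambda col: col.str.replace(
        pattern, lambda m: string_replacer(m.group(0)), regex=True
    )
)
print(result)
```

Cells that do not match the pattern, including short strings such as "NaN", pass through unchanged, so no try/except is needed for them.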