
Check each row and column in a DataFrame and replace values with a user-defined function

import pandas as pd

df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", "NaN", "abt-9ac", "xyz-9vc"],
})

I have this DataFrame, and I want to check each row and each column for a specific kind of value. For example, column '0' has 4 values: "qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc".

For every value, I want to check whether it has the form xxx-<digit>xx, that is, whether the character right after the hyphen is a digit.

Example:

qwa-abc has the letter a at position 4, so nothing should happen. When the check reaches xyz-0vc, there is the digit 0 at position 4, so it should run the user-defined function and replace the whole value (xyz-0vc) with whatever that function returns.

NOTE: I tried str.replace, but it only supports a fixed, user-supplied replacement string. Here the user function connects to a different system and fetches a string, so the replacement is not known in advance.

If you want to change all the cells in your DataFrame, you can use DataFrame.apply over the row axis, so your custom function needs to take a pd.Series as one of its parameters. In this example, row is the Series.

This generator function iterates over each cell in the row and checks whether the character at index 4 is numeric. If it is, the function yields the replacement value; otherwise it yields the value of the cell itself.

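def replace_value(row, value):
    for cell in row:
        # Skip NaN and strings too short to have an index 4 (e.g. the literal "NaN")
        if isinstance(cell, str) and len(cell) > 4 and cell[4].isnumeric():
            yield value
        else:
            yield cell

# index=x.index keeps the original column labels in the result
df.apply(lambda x: pd.Series(replace_value(x, 'myvalue'), index=x.index), axis=1)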

You then apply your custom function row-wise (axis=1), wrap it in a lambda so you can pass additional arguments (value in this case), and call pd.Series on the iterator returned by the function.

Hope it makes sense.
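The example above substitutes a fixed string, while the question asks for a replacement produced by a user function. A minimal sketch of that variant follows. It assumes the missing cell is a real NaN rather than the string "NaN", and lookup_replacement is a hypothetical stand-in for the user function that would query the external system. It applies the check cell by cell with Series.map, one column at a time.

```python
import numpy as np
import pandas as pd

def lookup_replacement(cell):
    # Hypothetical stand-in for the user-defined function; a real one
    # would contact the external system and return the string it gets.
    return "replaced:" + cell

def replace_if_digit(cell, func):
    # Call func only for strings whose character at index 4 is a digit;
    # NaN and short strings are returned unchanged.
    if isinstance(cell, str) and len(cell) > 4 and cell[4].isdigit():
        return func(cell)
    return cell

df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", np.nan, "abt-9ac", "xyz-9vc"],
})

# Map every cell of every column through the check
replaced_df = df.apply(
    lambda col: col.map(lambda cell: replace_if_digit(cell, lookup_replacement))
)
print(replaced_df)
```

Because the user function is called once per matching cell, a slow external lookup would run as many times as there are matches.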

You don't need a separate method, try this:

In [1200]: df.loc[df['0'].str[4].str.isdigit(), '0'] = 'myvalue'

In [1201]: df
Out[1201]:
         0         1        3
0  qwa-abc   axc-0aa  abc-aaa
1  abd-xyz   abd-xyz      NaN
2  abt-Rac   abt-Rac  abt-9ac
3  myvalue  xyz-1avc  xyz-9vc

To do this in all columns, do this:

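In [1242]: def check_digit(cols, new_val):
      ...:     # Update each column in place where the character at index 4 is a digit
      ...:     for i in cols:
      ...:         df.loc[(df[i].str[4].str.isdigit()) & (df[i].notna()), i] = new_val
      ...:

In [1243]: check_digit(df.columns, 'myval')  # one call is enough; df.apply would repeat it for every row

In [1244]: df
Out[1244]: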
         0        1        3
0  qwa-abc    myval  abc-aaa
1  abd-xyz  abd-xyz      NaN
2  abt-Rac  abt-Rac    myval
3    myval    myval    myval
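As an alternative to updating the frame column by column, the same check can be built as a single boolean mask and applied in one step. This is only a sketch: it assumes the missing cell is a real NaN, uses Series.str.match with a regular expression in place of the str[4].str.isdigit() test, and returns a new DataFrame instead of modifying df in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    '0': ["qwa-abc", "abd-xyz", "abt-Rac", "xyz-0vc"],
    '1': ["axc-0aa", "abd-xyz", "abt-Rac", "xyz-1avc"],
    '3': ["abc-aaa", np.nan, "abt-9ac", "xyz-9vc"],
})

# True where the fifth character (index 4) is an ASCII digit.
# str.match anchors at the start of the string; na=False treats NaN as no match.
digit_mask = df.apply(lambda col: col.str.match(r'.{4}[0-9]', na=False))

# Replace the matching cells with a fixed value and leave the rest untouched
masked_df = df.mask(digit_mask, 'myval')
print(masked_df)
```

This form only fits a fixed replacement value; when the replacement depends on the cell, a per-cell function is still needed.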

This answer is based on @NomadMonad's answer.

string_replacer() is a function that returns a new value based on the input value that satisfies the condition.

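def replace_value(row, value):
    for cell in row:
        try:
            if pd.notna(cell) and cell[4].isnumeric():
                # Compute the replacement from the matching cell
                value = string_replacer(cell)
                yield value
            else:
                yield cell
        except Exception:
            # Log the problem and keep the cell so the row length is preserved
            print(row, value)
            yield cell

if_df.apply(lambda x: pd.Series(replace_value(x, value)), axis=1)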


Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address.
