
Sum values into a column if another column contains a string (Pandas)

I have a DataFrame with values for several attributes, and a JSON file that lists the attributes to sum only if a column in the DataFrame contains a given string.

| Product | store_location | attr1 | attr2 | attr3 | attr4 | global |
| ------- | -------------- | ----- | ----- | ----- | ----- | ------ |
| First   | NY-store1      | 3     | 5     |  2    | 2     |        |
| Second  | NY-store2      | 1     | 3     |  5    | 1     |        |
| Third   | NJ-store1      | 3     | 5     |  2    | 2     |        |
| Fourth  | PA-store1      | 1     | 3     |  5    | 1     |        |

The JSON file has this structure:

{
"positionEvaluation": [
    {
      "position": "Global",
      "sumElements": ["attr1", "attr2"],
      "gralSum": ["attr2", "attr3", "attr4"],
      "elementsProm": ["attr1", "attr2", "attr3", "attr4"]
    }
]
}

Obviously the real file has more attributes; this is only a demo. When a product's store_location contains the string 'NY', I want to sum the attributes listed in "sumElements" and divide by the length of "gralSum". If it contains another string such as 'NJ' or 'PA', I want to sum all the elements of "elementsProm" and divide by the length of that list.

Here is my code:

for p in range(len(js_positions["positionEvaluation"])):
    aux1_string = js_positions["positionEvaluation"][p]["position"]
    df[aux1_string] = 0
    if df['store_location'].str.contains('NY').any():
        for k in range(len(js_positions["positionEvaluation"][p]["sumElements"])):
            tmp = js_positions["positionEvaluation"][p]["sumElements"][k]
            df[aux1_string] = df[aux1_string] + df[tmp]

        df[aux1_string] = df[aux1_string] / len(js_positions["positionEvaluation"][p]["gralSum"])

    else:
        for k in range(len(js_positions["positionEvaluation"][p]["elementsProm"])):
            tmp = js_positions["positionEvaluation"][p]["elementsProm"][k]
            df[aux1_string] = df[aux1_string] + df[tmp]
        df[aux1_string] = df[aux1_string] / len(js_positions["positionEvaluation"][p]["elementsProm"])

Explicit lists:

sumElements = ["attr1", "attr2"]
gralSum = ["attr2", "attr3", "attr4"]
elementsProm = ["attr1", "attr2", "attr3", "attr4"]

Expected output:

| Product | store_location | attr1 | attr2 | attr3 | attr4 | global |
| ------- | -------------- | ----- | ----- | ----- | ----- | ------ |
| First   | NY-store1      | 3     | 5     |  2    | 2     |  2,66  |
| Second  | NY-store2      | 1     | 3     |  5    | 1     |  1,33  |
| Third   | NJ-store1      | 3     | 5     |  2    | 2     |    3   |
| Fourth  | PA-store1      | 1     | 3     |  5    | 1     |   2,5  |
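As a quick hand check of the rule against the table, here is a minimal sketch in plain Python. The dictionaries `first` and `fourth` are hypothetical names that simply hold the attribute values of the First and Fourth rows; they are not part of the original code. The 2,66 shown in the table corresponds to 8/3 = 2.666..., written with a decimal comma and truncated.

```python
# Attribute lists from the JSON structure in the question.
sum_elements = ["attr1", "attr2"]
gral_sum = ["attr2", "attr3", "attr4"]
elements_prom = ["attr1", "attr2", "attr3", "attr4"]

# Attribute values of two rows from the example table (hypothetical helper dicts).
first = {"attr1": 3, "attr2": 5, "attr3": 2, "attr4": 2}   # NY-store1
fourth = {"attr1": 1, "attr2": 3, "attr3": 5, "attr4": 1}  # PA-store1

# NY rule: sum of the sumElements attributes divided by the length of gralSum.
ny_value = sum(first[a] for a in sum_elements) / len(gral_sum)

# Other stores: sum of the elementsProm attributes divided by the list's length.
other_value = sum(fourth[a] for a in elements_prom) / len(elements_prom)

print(ny_value, other_value)
```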

IIUC, you want to sum different attributes depending on whether the string NY is in the store name?

For this you can use a boolean mask with np.where, together with sum and mean:

import numpy as np

sumElements = ["attr1", "attr2"]
gralSum = ["attr2", "attr3", "attr4"]
elementsProm = ["attr1", "attr2", "attr3", "attr4"]

# NY rows: sum of sumElements divided by len(gralSum); other rows: mean of elementsProm
df['global'] = np.where(df['store_location'].str.contains('NY'),
                        df[sumElements].sum(axis=1).div(len(gralSum)),
                        df[elementsProm].mean(axis=1))

Output:

  Product store_location  attr1  attr2  attr3  attr4    global
0   First      NY-store1      3      5      2      2  2.666667
1  Second      NY-store2      1      3      5      1  1.333333
2   Third      NJ-store1      3      5      2      2  3.000000
3  Fourth      PA-store1      1      3      5      1  2.500000
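The same idea can be driven directly from the JSON structure in the question. The following is only a sketch under a few assumptions: the JSON is inlined as a string instead of being read from a file, the DataFrame is rebuilt from the sample table, and the new column takes its name from the "position" field (so it is called Global here, not global). The key point is that the mask `is_ny` holds one boolean per row, whereas calling `.any()` on it would collapse it to a single boolean for the whole column.

```python
import json

import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Product": ["First", "Second", "Third", "Fourth"],
    "store_location": ["NY-store1", "NY-store2", "NJ-store1", "PA-store1"],
    "attr1": [3, 1, 3, 1],
    "attr2": [5, 3, 5, 3],
    "attr3": [2, 5, 2, 5],
    "attr4": [2, 1, 2, 1],
})

# Inline stand-in for the JSON file (assumption: the real file would be read with json.load).
js_positions = json.loads("""
{
  "positionEvaluation": [
    {
      "position": "Global",
      "sumElements": ["attr1", "attr2"],
      "gralSum": ["attr2", "attr3", "attr4"],
      "elementsProm": ["attr1", "attr2", "attr3", "attr4"]
    }
  ]
}
""")

# Row-wise boolean mask: True only where store_location contains "NY".
is_ny = df["store_location"].str.contains("NY")

for entry in js_positions["positionEvaluation"]:
    # NY rows: sum of sumElements divided by the length of gralSum.
    ny_value = df[entry["sumElements"]].sum(axis=1) / len(entry["gralSum"])
    # Other rows: mean over elementsProm.
    other_value = df[entry["elementsProm"]].mean(axis=1)
    # Pick the matching value per row and store it under the position name.
    df[entry["position"]] = np.where(is_ny, ny_value, other_value)

print(df)
```

With more entries in "positionEvaluation", the loop adds one column per entry without any per-attribute inner loop.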
