Iterate and update pandas dataframe simultaneously

I have a dataframe containing Level, Product ID and Cost. Level 1 indicates a main product, Level 2 indicates a sub-product, and each further increase in Level indicates a sub-product of a sub-product.

    Level    Product ID    Cost
0   1         111           12
1   1         112           15
.
.
.
25  1         294           32

I need to iterate over the above dataframe and check in the database whether the product with a given Product ID has any sub-products. For example, the product with Product ID 112 can have 2 sub-products with Product IDs 1121 and 1122. I then need to add these 2 sub-products to my dataframe.

Note: Product ID can be any number or string. It need not be a multiple of its base product ID.

Another condition is that a sub-product can have further sub-products. For example, sub-product 1122 can have 3 sub-products: 11221, 11222 and 11223.
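The expansion described above can be sketched without modifying the dataframe while iterating over it: collect rows in a plain list, depth-first, and build a new dataframe at the end. This is only a minimal sketch. The `SUB_PRODUCTS` dict and the `expand` function are hypothetical names, and the dict stands in for the real database lookup.

```python
import pandas as pd

# Hypothetical stand-in for the database: parent ID -> list of (child ID, cost).
SUB_PRODUCTS = {
    "112": [("1121", 8), ("1122", 7)],
    "1122": [("11221", 2), ("11222", 3), ("11223", 2)],
}

def expand(product_id, cost, level, rows):
    """Append this product, then each of its sub-products, depth-first."""
    rows.append({"Level": level, "Product ID": product_id, "Cost": cost})
    for child_id, child_cost in SUB_PRODUCTS.get(product_id, []):
        expand(child_id, child_cost, level + 1, rows)

# Only the main (Level 1) products are known up front.
top_level = pd.DataFrame(
    {"Level": [1, 1], "Product ID": ["111", "112"], "Cost": [12, 15]}
)

rows = []
for product_id, cost in zip(top_level["Product ID"], top_level["Cost"]):
    expand(product_id, cost, 1, rows)

expanded = pd.DataFrame(rows)
print(expanded)
```

Because the result is built from a list, the row index comes out as consecutive integers and no fractional index or re-sorting is needed.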

Also, if a product has sub-products, then the cost of the product should equal the sum of the costs of all its sub-products.
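One way to apply this cost rule is a single bottom-up pass over the rows. The sketch below assumes the rows are in depth-first order, so that a product's sub-products follow it directly with Level one greater. The function name `roll_up_costs` is hypothetical, and the parent costs are set to 0 as placeholders to show that they get recomputed.

```python
import pandas as pd

def roll_up_costs(df):
    """Return a copy where each parent's Cost is the sum of its direct sub-products.

    Assumes depth-first row order: children follow their parent
    immediately, with Level exactly one greater.
    """
    costs = df["Cost"].tolist()
    levels = df["Level"].tolist()
    pending = {}  # level -> running sum of costs still waiting for a parent
    for i in range(len(df) - 1, -1, -1):  # walk the rows bottom-up
        level = levels[i]
        if level + 1 in pending:
            # Children were seen below this row: their sum becomes this cost.
            costs[i] = pending.pop(level + 1)
        pending[level] = pending.get(level, 0) + costs[i]
    out = df.copy()
    out["Cost"] = costs
    return out

bom = pd.DataFrame({
    "Level": [1, 1, 2, 2, 3, 3, 3],
    "Product ID": ["111", "112", "1121", "1122", "11221", "11222", "11223"],
    "Cost": [12, 0, 8, 0, 2, 3, 2],  # parent costs left as 0 placeholders
})
rolled = roll_up_costs(bom)
print(rolled)
```

With the example data, 1122 becomes 2 + 3 + 2 = 7 and 112 becomes 8 + 7 = 15, matching the expected table.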

The final dataframe must look like this.

    Level    Product ID    Cost
0   1        111           12
1   1        112           15
2   2        1121          8
3   2        1122          7
4   3        11221         2
5   3        11222         3
6   3        11223         2
.
.
.
27  1        294           32

Can someone please help me achieve this? Below is the code that I tried.

current_level = 1  # assumed starting value; the original snippet did not initialize it
for i, _ in multi_bom_df.iterrows():
    if i == 0:
        multi_bom_df.at[i, 'Level'] = '1'
    else:
        multi_bom_df.at[i, 'Level'] = str(current_level)
        base_part_number = str(multi_bom_df.at[i, 'Name'])
        # Look up the sub-assemblies of this part in the database (Django ORM)
        sub_assemblies = models.MultiLevel.objects.filter(base_part=base_part_number)
        if sub_assemblies.exists():
            current_level += 1
            for index, record in enumerate(sub_assemblies):
                # Fractional index so the new rows sort directly after row i
                sub_index = i + (index + 1) / 10
                multi_bom_df.at[sub_index, 'Level'] = current_level
                multi_bom_df.at[sub_index, 'Product ID'] = record.sub_assembly_product_id
                multi_bom_df.at[sub_index, 'Cost'] = record.cost
            multi_bom_df.index = multi_bom_df.index.astype(float)
            multi_bom_df = multi_bom_df.sort_index()

Here are the directions for something like this. Take the product ID and turn it into a string. Put character 1 of the product ID into its own column "pid1", then do the same for the 2nd character ("pid2"), the 3rd ("pid3") and the 4th ("pid4"), giving N-1 columns.

df.groupby(['pid1','pid2','pid3','pid4']).agg({'cost':'sum'})

This will get you all the level 4 stuff (the sums of the level 5 things).

df.groupby(['pid1','pid2','pid3']).agg({'cost':'sum'})

This will get you all the level 3 stuff.
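Put together, the steps above might look like the following sketch. It assumes every row is a 5-character, level 5 product whose ID encodes its ancestors by prefix; the ID 11211 is a made-up example added so the two groupings differ.

```python
import pandas as pd

# Level 5 rows only. "11211" is a hypothetical ID used for illustration.
df = pd.DataFrame({
    "Product ID": ["11221", "11222", "11223", "11211"],
    "cost": [2, 3, 2, 8],
})

# One column per leading character: pid1 .. pid4
ids = df["Product ID"].astype(str)
for n in range(1, 5):
    df[f"pid{n}"] = ids.str[n - 1]

# Sum the level 5 costs under each 4-character prefix (level 4 products)
level4 = df.groupby(["pid1", "pid2", "pid3", "pid4"]).agg({"cost": "sum"})
# Sum under each 3-character prefix (level 3 products)
level3 = df.groupby(["pid1", "pid2", "pid3"]).agg({"cost": "sum"})

print(level4)
print(level3)
```

Here the prefix 1122 collects 2 + 3 + 2 = 7, and the prefix 112 collects all four rows for a total of 15.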

Note: this is a terrible format for this kind of data, because it assumes you only have <1000 products with subassemblies, or that a subassembly can't be used in other products.
