
Sum values depending on another column without pandas in Python

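I am trying to sum the Change_Point values per order, but I get the error unsupported operand type(s) for +: 'int' and 'str'. After I wrote Change_Point = int(float(row[6])), I got 'int' object is not iterable instead. I just want to sum Change_Point without pandas.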

import csv
import json

with open('sample.csv','r') as file:
    rows = csv.reader(file,delimiter='|')
    next(rows, None)
    y = []
    orders = {}
    for row in rows:
        PointRec_ID = row[0]        
        Opeartion = row[1]        
        Member_ID = row[2]                
        Order_ID = row[3]        
        Point_Valid_Date = row[4]        
        Point_Invalid_Date = row[5]        
        Change_Point = row[6]  
        Accumulative_Point = row[7]       
        if not Order_ID in orders :
            orders[Order_ID] = {
                'PointRec_ID': PointRec_ID,
                'Opeartion': Opeartion,
                'Member_ID': Member_ID,
                'Order_ID': Order_ID,
                'Point_Valid_Date': Point_Valid_Date,
                'Point_Invalid_Date': Point_Invalid_Date,
                'Change_Point': sum(Change_Point),
                'Accumulative_Point': Accumulative_Point,                
            }
        order = orders[Order_ID]
    for Order_ID in orders:        
        y.append(orders[Order_ID])
print(json.dumps(y))

Sample CSV:

PointRec_ID|Opeartion|Member_ID|Order_ID|Point_Valid_Date|Point_Invalid_Date|Change_Point|Accumulative_Point
20200819000001760|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|639|934
20200819000001761|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|0|934
20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|1|935
20200819000001763|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|89|90
20200819000001764|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|699|789
20200819000001765|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|789
20200819000001766|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1|790
20200819000001767|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|790
20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1169|1959

Expected result (if I output CSV):

20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|640|935
20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1958|1959

Any help would be appreciated.
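Both error messages in the question can be reproduced with a single field value, since sum() expects an iterable of numbers and csv.reader yields every field as a string. The snippet below is only a sketch of that behavior; the variable names are illustrative and not taken from the original code.

```python
# Illustration only: reproduce the two errors with one field value.
value = "639"  # csv.reader returns every field as a string

errors = []
for bad_argument in (value, int(float(value))):
    try:
        # sum("639") iterates over the characters and tries 0 + "6";
        # sum(639) fails earlier because an int cannot be iterated.
        sum(bad_argument)
    except TypeError as exc:
        errors.append(str(exc))

for message in errors:
    print(message)

# A running total avoids sum() on a single value:
# convert each field to int, then add it to the total.
total = 0
for field in ["639", "0", "1"]:
    total += int(field)
print(total)  # 640
```

The last loop is the pattern a per-order total needs: keep one number per Order_ID and add each converted field to it.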

You need to add each row's value to the entry that already exists for that order.

import csv
import json

with open('sample.csv','r') as file:
    rows = csv.reader(file,delimiter='|')
    y = [next(rows, None)]
    
    orders = {}
    for row in rows:
        PointRec_ID = row[0]        
        Opeartion = row[1]        
        Member_ID = row[2]                
        Order_ID = row[3]        
        Point_Valid_Date = row[4]        
        Point_Invalid_Date = row[5]        
        Change_Point = row[6]  
        Accumulative_Point = row[7] 
            
        if not Order_ID in orders :
            orders[Order_ID] = {
                'PointRec_ID': PointRec_ID,
                'Opeartion': Opeartion,
                'Member_ID': Member_ID,
                'Order_ID': Order_ID,
                'Point_Valid_Date': Point_Valid_Date,
                'Point_Invalid_Date': Point_Invalid_Date,
                'Change_Point': int(Change_Point),
                'Accumulative_Point': Accumulative_Point,                
            }
        else:
            orders[Order_ID]["Change_Point"] +=  int(Change_Point)
            
    for Order_ID in orders:        
        y.append(list(orders[Order_ID].values()))
print(y)
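The expected result in the question keeps the last row's PointRec_ID and Accumulative_Point for each order, while the code above keeps the first row's. Below is a minimal sketch of one way to get that shape, assuming later rows of an order should simply replace earlier ones. The helper name collapse_orders and the inline SAMPLE string are made up for illustration; in practice the lines would come from open('sample.csv', newline='').

```python
import csv
import io

# Inline copy of the question's sample so the sketch runs on its own.
SAMPLE = """\
PointRec_ID|Opeartion|Member_ID|Order_ID|Point_Valid_Date|Point_Invalid_Date|Change_Point|Accumulative_Point
20200819000001760|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|639|934
20200819000001761|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|0|934
20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|1|935
20200819000001763|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|89|90
20200819000001764|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|699|789
20200819000001765|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|789
20200819000001766|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1|790
20200819000001767|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|790
20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1169|1959
"""


def collapse_orders(lines):
    """Keep the last row per Order_ID, with Change_Point summed over all its rows."""
    reader = csv.DictReader(lines, delimiter='|')
    orders = {}
    for row in reader:
        change = int(row['Change_Point'])
        previous = orders.get(row['Order_ID'])
        if previous is not None:
            change += previous['Change_Point']  # carry the running total forward
        row['Change_Point'] = change
        orders[row['Order_ID']] = row  # a later row replaces the earlier one
    return reader.fieldnames, list(orders.values())


fieldnames, merged = collapse_orders(io.StringIO(SAMPLE))

# Write the merged rows back out in the same pipe-delimited layout, without a header.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter='|', lineterminator='\n')
writer.writerows(merged)
print(out.getvalue(), end='')
```

Reassigning an existing dictionary key keeps its original position, so the orders come out in the order they first appear in the file.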

Change Change_Point inside the for loop to Change_Point = [int(row[6])] + [int(i[6]) for i in rows]

import csv
import json

with open('sample.csv','r') as file:
    rows = csv.reader(file,delimiter='|')
    next(rows, None)
    y = []
    orders = {}
    for row in rows:
        PointRec_ID = row[0]        
        Opeartion = row[1]        
        Member_ID = row[2]                
        Order_ID = row[3]        
        Point_Valid_Date = row[4]        
        Point_Invalid_Date = row[5]        
        Change_Point = [int(row[6])] + [int(i[6]) for i in rows] # list of change
        #print(Change_Point)
        Accumulative_Point = row[7]       
        if not Order_ID in orders :
            orders[Order_ID] = {
                'PointRec_ID': PointRec_ID,
                'Opeartion': Opeartion,
                'Member_ID': Member_ID,
                'Order_ID': Order_ID,
                'Point_Valid_Date': Point_Valid_Date,
                'Point_Invalid_Date': Point_Invalid_Date,
                'Change_Point': sum(Change_Point),
                'Accumulative_Point': Accumulative_Point,                
            }
        order = orders[Order_ID]
    for Order_ID in orders:        
        y.append(orders[Order_ID])
print(json.dumps(y))

Output:

[{"PointRec_ID": "20200819000001760", "Opeartion": "Point gain", "Member_ID": "00100224165", "Order_ID": "AD031SA12016866", "Point_Valid_Date": "2020-08-23 16:00:00", "Point_Invalid_Date": "2021-08-23 16:00:00", "Change_Point": 2598, "Accumulative_Point": "934"}]
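The single entry with Change_Point 2598 in this output is the sum over every row of the sample, not a per-order total: the list comprehension over rows consumes the rest of the reader, so the outer loop ends after its first pass. The sketch below shows that behavior and then a per-order total using collections.defaultdict. The two-column SAMPLE is a trimmed, illustrative subset of the question's data, not the real file.

```python
import csv
import io
from collections import defaultdict

# Illustrative subset: only Order_ID and Change_Point, five of the nine rows.
SAMPLE = """\
Order_ID|Change_Point
AD031SA12016866|639
AD031SA12016866|0
AD031SA12016866|1
AD031SA12016867|89
AD031SA12016867|699
"""

# 1) Iterating the reader inside the loop body uses up every remaining row.
rows = csv.reader(io.StringIO(SAMPLE), delimiter='|')
next(rows, None)                      # skip the header
first = next(rows)                    # what the first loop pass would see
rest = [int(r[1]) for r in rows]      # consumes all remaining rows
leftover = list(rows)                 # nothing is left for a second pass
grand_total = int(first[1]) + sum(rest)
print(grand_total, leftover)

# 2) One running total per Order_ID instead.
rows = csv.reader(io.StringIO(SAMPLE), delimiter='|')
next(rows, None)
totals = defaultdict(int)
for order_id, change_point in rows:
    totals[order_id] += int(change_point)
print(dict(totals))
```

With defaultdict(int), a missing Order_ID starts at 0, so no separate check for the first row of an order is needed.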

