
df=df.groupby('Dates')['OrderQuantity'].sum() error

In my data set, I have a list of dates in one column and quantities in another. Some dates appear more than once, representing different orders made on the same day. I want to sum the quantities ordered on each day, so that each date appears once in the dates column, with the total number of items purchased that day in the quantity column. I am currently using df=df.groupby('Dates')['OrderQuantity'].sum(), but the first sum it finds is copied into every following row with a quantity > 0. Here is my code:

import pandas as pd

import numpy as np

df=pd.read_excel('stackoverflowexample.xlsx')

df=df.groupby('Dates')['OrderQuantity'].sum()

df.to_csv("materialrows.csv")

df=pd.read_csv("materialrows.csv")

array = np.zeros((11,2))

j=0
for i in df['Dates']:
    array[i][0] = i
    array[i][1] = df['OrderQuantity'][j]
    j+1

for i in range(1,15):
    if array[i][0] == 0:
        array[i][0] = array[i-1][0] + 1
    
x=pd.DataFrame(data = array, columns = ["Dates","OrderQuantity"])   

x=x.iloc[1:, :]
x=x['OrderQuantity']
print(x)

df=df.groupby('Dates')['OrderQuantity'].sum()
df.to_csv("materialrows.csv")

df=pd.read_csv("materialrows.csv")

array = np.zeros((11,2))

j=0
for i in df['Dates']:
    array[i][0] = i
    array[i][1] = df['OrderQuantity'][j]
    j+1

for i in range(1,15):
    if array[i][0] == 0:
        array[i][0] = array[i-1][0] + 1
        
y=pd.DataFrame(data = array, columns = ["Dates","OrderQuantity"])   
  
y=y.iloc[1:, :]

y=y['OrderQuantity']

print(y)

Here is what the 'stackoverflowexample' excel file looks like.

Dates OrderQuantity
1     3
1     4
2     3 
3     8
4     1
5     2
6     6 
7     1
7     2
7     5
8     1
9     2
10    2

Here is the current result of my code:

1    7
2    7
3    7
4    7
5    7
6    7
7    7
8    7
9    7
10   7

Here is the result I want:

    1    7
    2    3
    3    8
    4    1
    5    2
    6    6
    7    8
    8    1
    9    2
    10   2

Any help would be greatly appreciated!

df=df.groupby('Dates')['OrderQuantity'].sum() returns a Series. Passing as_index=False to groupby returns a DataFrame instead:

df = df.groupby('Dates', as_index=False)['OrderQuantity'].sum()
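
To make the difference concrete, here is a minimal sketch that rebuilds the question's sample table in memory (an assumption, instead of reading stackoverflowexample.xlsx) and compares the two calls. The names as_series and as_frame are made up for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Dates": [1, 1, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10],
    "OrderQuantity": [3, 4, 3, 8, 1, 2, 6, 1, 2, 5, 1, 2, 2],
})

# Default: the group keys become the index and the result is a Series.
as_series = df.groupby("Dates")["OrderQuantity"].sum()

# as_index=False: the group keys stay as a regular column in a DataFrame.
as_frame = df.groupby("Dates", as_index=False)["OrderQuantity"].sum()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame)
```

Both results hold the same per-date totals; only the container and the placement of Dates differ.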

Try re-running the cells in your notebook.
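
Separately, the repeated 7 in the question's output is consistent with the loop counter never advancing: j+1 evaluates an expression and discards the result, so j stays 0 and every row receives df['OrderQuantity'][0]. Writing j += 1 would advance it. The sketch below shows that, then a loop-free alternative. It assumes the goal is one row per date from 1 to 10, with 0 for any date that has no orders; the date range and the name totals are assumptions, not taken from the original post.

```python
import pandas as pd

# Why the counter in the question never moves:
j = 0
j + 1        # evaluated and discarded; j is still 0
assert j == 0
j += 1       # augmented assignment actually updates j
assert j == 1

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Dates": [1, 1, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10],
    "OrderQuantity": [3, 4, 3, 8, 1, 2, 6, 1, 2, 5, 1, 2, 2],
})

# Sum per date, then reindex over the assumed date range 1..10 so that
# any date without orders would appear with a total of 0.
totals = (
    df.groupby("Dates")["OrderQuantity"]
      .sum()
      .reindex(range(1, 11), fill_value=0)
)
print(totals)
```

With reindex there is no need for the NumPy array or the round trip through materialrows.csv.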
