I have a dataframe that looks like this (my input database on COVID cases)
data:
date state cases
0 20200625 NY 300
1 20200625 CA 250
2 20200625 TX 200
3 20200625 FL 100
5 20200624 NY 290
6 20200624 CA 240
7 20200624 TX 100
8 20200624 FL 80
...
Note that the "date" column in the above data is a number (not a datetime).
I want to turn it into a time series like this (desired output), with the dates as the index and each state's COVID cases as columns:
NY CA TX FL
20200625 300 250 200 100
20200626 290 240 100 80
...
So far I have only managed to create the skeleton of the output, with the following code:
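import pandas as pd

states = ['NY', 'CA', 'TX', 'FL']
days = [20200625, 20200626]
columns = states
# Empty frame with one column per state
positives = pd.DataFrame(columns=columns)
i = 0
for day in days:
    # Add one row per day, storing the day in a temporary "date" column
    positives.loc[i, "date"] = day
    i = i + 1
# Use the dates as the index and drop the index name
positives.set_index('date', inplace=True)
positives = positives.rename_axis(None)
print(positives)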
which returns:
NY CA TX FL
20200625.0 NaN NaN NaN NaN
20200626.0 NaN NaN NaN NaN
How can I get from the "data" dataframe the value of column "cases" where:
(i) the value in data["state"] equals the column header of "positives", and
(ii) the value in data["date"] equals the row index of "positives"?
You can do (here df is the "data" dataframe from the question):
df = df.set_index(['date', 'state']).unstack().reset_index()
# fix column names
df.columns = df.columns.get_level_values(1)
state CA FL NY TX
0 20200624 240.0 NaN 290.0 NaN
1 20200625 250.0 100.0 300.0 200.0
After this the date column has an empty string as its name. To use the dates as the index again, set the index on that column and then name it explicitly:
df = df.set_index("")
df.index.name = "date"
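A slightly shorter variant of the same idea, as a minimal sketch: selecting the "cases" column before unstacking gives single-level columns, so the column-name fix and the second set_index should not be needed. The sample frame and the name "wide" are assumptions that reproduce part of the question's data.

```python
import pandas as pd

# Assumed sample reproducing part of the question's data
data = pd.DataFrame({
    "date": [20200625, 20200625, 20200624, 20200624],
    "state": ["NY", "CA", "NY", "CA"],
    "cases": [300, 250, 290, 240],
})

# Pick the "cases" Series first, then move "state" from the index into columns
wide = data.set_index(["date", "state"])["cases"].unstack()

# Drop the "state" label that sits above the column headers
wide.columns.name = None

print(wide)
```

The dates stay in the index throughout, so the result is already indexed by "date".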
The transformation you are interested in is called a pivot. You can achieve this in Pandas as follows:
import pandas as pd

# Reproduce part of the data
data = pd.DataFrame({'date': [20200625, 20200625, 20200624, 20200624],
                     'state': ['NY', 'CA', 'NY', 'CA'],
                     'cases': [300, 250, 290, 240]})
data
# date state cases
# 0 20200625 NY 300
# 1 20200625 CA 250
# 2 20200624 NY 290
# 3 20200624 CA 240
# Pivot
data.pivot(index='date', columns='state', values='cases')
# state CA NY
# date
# 20200624 240 290
# 20200625 250 300
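Note that pivot sorts the state columns alphabetically and leaves the dates as plain integers. A possible follow-up, sketched under the assumption that the dates are always YYYYMMDD integers, restores the column order from the question and converts the index to real datetimes. The names "wide" and "states" are illustrative, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Assumed sample rebuilt from the question's data
data = pd.DataFrame({
    "date": [20200625] * 4 + [20200624] * 4,
    "state": ["NY", "CA", "TX", "FL"] * 2,
    "cases": [300, 250, 200, 100, 290, 240, 100, 80],
})

wide = data.pivot(index="date", columns="state", values="cases")

# Put the columns back in the order used in the question
states = ["NY", "CA", "TX", "FL"]
wide = wide[states]

# Convert the YYYYMMDD integers into a DatetimeIndex
wide.index = pd.to_datetime(wide.index.astype(str), format="%Y%m%d")

# Remove the "date" and "state" axis labels to match the desired output
wide.index.name = None
wide.columns.name = None

print(wide)
```

pivot raises a ValueError if the same (date, state) pair occurs more than once; in that case pivot_table with an explicit aggfunc (for example "sum") is the usual alternative.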