
Adding a column to pandas dataframe conditionally

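I am working on a personal project collecting data on Covid-19 cases. The dataset only shows the cumulative total of Covid-19 cases per state. I would like to add a column containing the new cases added that day. This is what I have so far: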

import pandas as pd
from datetime import date
from datetime import timedelta
import numpy as np

#read the CSV from github
hist_US_State = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv")

#some code to get yesterday's date and the day before which is needed later.
today = date.today()
yesterday = today - timedelta(days = 1)
yesterday = str(yesterday)
day_before_yesterday = today - timedelta(days = 2)
day_before_yesterday = str(day_before_yesterday)

#Extracting yesterday's and the day before cases and combine them in one dataframe
yesterday_cases = hist_US_State[hist_US_State["date"] == yesterday]
day_before_yesterday_cases = hist_US_State[hist_US_State["date"] == day_before_yesterday]

total_cases = pd.DataFrame()
total_cases = day_before_yesterday_cases.append(yesterday_cases)

#Adding a new column called "new_cases" and this is where I get into trouble.
total_cases["new_cases"] = yesterday_cases["cases"] - day_before_yesterday_cases["cases"]

Can you point out what I am doing wrong?

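Because you defined total_cases as a concatenation (via append) of yesterday_cases and day_before_yesterday_cases, its number of rows equals the sum of the rows of the other two dataframes. It looks like yesterday_cases and day_before_yesterday_cases both have 55 rows, so total_cases has 110 rows. Your last line is therefore trying to assign 55 values to a series of 110 values.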
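The row counts described above can be reproduced with a small sketch. The numbers below are made up (two states, two dates) and stand in for the real CSV; pd.concat is used in place of DataFrame.append, which was removed in pandas 2.0. The sketch also shows a related effect: pandas aligns Series arithmetic on index labels, so subtracting two slices that share no row labels yields only NaN.

```python
import pandas as pd

# Made-up stand-in for the NYT data: 2 states, 2 dates (assumption, not real figures).
hist = pd.DataFrame({
    "date": ["2020-06-01", "2020-06-01", "2020-06-02", "2020-06-02"],
    "state": ["Alabama", "Alaska", "Alabama", "Alaska"],
    "cases": [100, 10, 130, 12],
})

day_before = hist[hist["date"] == "2020-06-01"]  # row labels 0, 1
yesterday = hist[hist["date"] == "2020-06-02"]   # row labels 2, 3

# Concatenating stacks the rows, so the result has 2 + 2 = 4 rows.
total = pd.concat([day_before, yesterday])
print(len(day_before), len(yesterday), len(total))

# The two slices have no row labels in common, so after index
# alignment every element of the difference is NaN.
diff = yesterday["cases"] - day_before["cases"]
print(diff)
```

With the real data the same thing happens at a larger scale: 55 + 55 rows after concatenation, and no shared row labels between the two days.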

You may want either to reshape the data so that each date is its own column, or to work with arrays instead of dataframes.
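Both suggestions might look roughly like the following. This is only a sketch on made-up numbers, not the answerer's own code: the dates are hard-coded for illustration, and the array variant assumes both days contain the same states, which is why each slice is sorted by state before subtracting.

```python
import pandas as pd

# Made-up stand-in for the NYT data (assumption, not real figures).
hist = pd.DataFrame({
    "date": ["2020-06-01", "2020-06-01", "2020-06-02", "2020-06-02"],
    "state": ["Alabama", "Alaska", "Alabama", "Alaska"],
    "cases": [100, 10, 130, 12],
})

# Option 1: reshape so that each date is its own column, one row per state.
wide = hist.pivot(index="state", columns="date", values="cases")
wide["new_cases"] = wide["2020-06-02"] - wide["2020-06-01"]
print(wide)

# Option 2: subtract plain NumPy arrays, which ignores index labels.
# This is only valid if both slices list the same states in the same order.
day_before = hist[hist["date"] == "2020-06-01"].sort_values("state")
yesterday = hist[hist["date"] == "2020-06-02"].sort_values("state")
new_cases = yesterday["cases"].to_numpy() - day_before["cases"].to_numpy()
print(new_cases)
```

The pivoted form keeps the state name attached to each difference, which makes it harder to subtract mismatched rows by accident; the array form is shorter but relies entirely on row order.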

