How do I add rows to a DataFrame from a list in pandas?

Question

I have yearly information (COUNT) for countries stored in a DataFrame. However, some countries are missing in certain years.

Given a complete list of countries, what is the best way to add the missing ones under the corresponding years and fill the missing COUNT values with 0?

     DATE         COUNTRY  COUNTRY_ID  COUNT
0    1980   United States         840     42
42   1980  Czech Republic         203      2
95   1980         Hungary         348      1
96   1980   Great Britain         826      1
97   1980    South Africa         710      1
98   1982   United States         840     42
140  1982        Paraguay         600      2
       .
       .
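
For anyone who wants to try the answers locally, the rows shown above can be rebuilt roughly as follows. This is only a sketch of the visible sample: the elided rows are not reconstructed, and the index labels are copied from the printout.

```python
import pandas as pd

# Sample rows copied from the question; the elided rows are left out.
df = pd.DataFrame(
    {
        "DATE": [1980, 1980, 1980, 1980, 1980, 1982, 1982],
        "COUNTRY": ["United States", "Czech Republic", "Hungary",
                    "Great Britain", "South Africa", "United States", "Paraguay"],
        "COUNTRY_ID": [840, 203, 348, 826, 710, 840, 600],
        "COUNT": [42, 2, 1, 1, 1, 42, 2],
    },
    index=[0, 42, 95, 96, 97, 98, 140],  # index labels as printed in the question
)
print(df)
```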

Answer 1

One way to do this is to build every (DATE, COUNTRY) combination, reindex the DataFrame against that full index, and then fill in the missing values.

import pandas as pd

# Assume that we want all years, not just the ones seen
years = range(df['DATE'].min(), df['DATE'].max()+1)

# get all combinations
idx = pd.MultiIndex.from_product([years, df['COUNTRY'].unique()], names=['DATE', 'COUNTRY'])

# reindex by first putting DATE and COUNTRY into the index
df1 = df.set_index(['DATE', 'COUNTRY']).reindex(idx).reset_index()

# Fill back in missing IDs
country_map = df.set_index('COUNTRY')['COUNTRY_ID'].drop_duplicates()
df1['COUNTRY_ID'] = df1.COUNTRY.map(country_map)

# fill in 0 for COUNT and convert back to int
df1['COUNT'] = df1['COUNT'].fillna(0).astype(int)

    DATE         COUNTRY  COUNTRY_ID  COUNT
0   1980   United States         840     42
1   1980  Czech Republic         203      2
2   1980         Hungary         348      1
3   1980   Great Britain         826      1
4   1980    South Africa         710      1
5   1980        Paraguay         600      0
6   1981   United States         840      0
7   1981  Czech Republic         203      0
8   1981         Hungary         348      0
9   1981   Great Britain         826      0
10  1981    South Africa         710      0
11  1981        Paraguay         600      0
12  1982   United States         840     42
13  1982  Czech Republic         203      0
14  1982         Hungary         348      0
15  1982   Great Britain         826      0
16  1982    South Africa         710      0
17  1982        Paraguay         600      2
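
The code above takes the countries from the data itself. If a separate complete list is available, as the question describes, the same reindex idea might look like the sketch below. The `all_countries` mapping is hypothetical (Canada is added only to show a country absent from the data), and here only the years that actually occur are kept.

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "DATE": [1980, 1980, 1980, 1980, 1980, 1982, 1982],
    "COUNTRY": ["United States", "Czech Republic", "Hungary",
                "Great Britain", "South Africa", "United States", "Paraguay"],
    "COUNTRY_ID": [840, 203, 348, 826, 710, 840, 600],
    "COUNT": [42, 2, 1, 1, 1, 42, 2],
})

# Hypothetical complete list (name -> ID); "Canada" never appears in df
all_countries = {
    "United States": 840, "Czech Republic": 203, "Hungary": 348,
    "Great Britain": 826, "South Africa": 710, "Paraguay": 600,
    "Canada": 124,
}

# Only the years present in the data, paired with every listed country
years = sorted(df["DATE"].unique())
idx = pd.MultiIndex.from_product([years, list(all_countries)],
                                 names=["DATE", "COUNTRY"])

# Reindex COUNT against the full grid; fill_value=0 keeps the integer dtype
full = (df.set_index(["DATE", "COUNTRY"])["COUNT"]
          .reindex(idx, fill_value=0)
          .reset_index())

# Look the IDs up from the complete list rather than from df
full["COUNTRY_ID"] = full["COUNTRY"].map(all_countries)
full = full[["DATE", "COUNTRY", "COUNTRY_ID", "COUNT"]]
print(full)
```

Passing `fill_value=0` to `reindex` avoids the detour through NaN and the later cast back to int.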

Answer 2

Also consider a cross-join merge route (for those of us with a SQL mindset):

# ASSIGN KEY COLUMN
df['KEY'] = 1

# CREATE DF OF DATES RANGE
dates = pd.DataFrame({'DATE': list(range(df['DATE'].min(), df['DATE'].max() + 1)),
                      'COUNT': 0, 'KEY': 1})

# CROSS JOIN MERGE
mdf = df.merge(dates, on=['KEY'])

# REASSIGN COUNT
mdf.loc[mdf['DATE_x'] != mdf['DATE_y'], 'COUNT_x'] = 0

# CLEAN UP DF (COLS AND ROWS)
# Sort so the row holding the real COUNT comes first within each (DATE, COUNTRY)
# pair; drop_duplicates then keeps it rather than a zeroed copy.
mdf = mdf[['DATE_y', 'COUNTRY', 'COUNTRY_ID', 'COUNT_x']]\
           .rename(columns={'DATE_y':'DATE', 'COUNT_x':'COUNT'})\
           .sort_values(['DATE', 'COUNTRY', 'COUNT'], ascending=[True, True, False])\
           .drop_duplicates(['DATE', 'COUNTRY', 'COUNTRY_ID'])\
           .reset_index(drop=True)

#     DATE         COUNTRY  COUNTRY_ID  COUNT
# 0   1980  Czech Republic         203      2
# 1   1980   Great Britain         826      1
# 2   1980         Hungary         348      1
# 3   1980        Paraguay         600      0
# 4   1980    South Africa         710      1
# 5   1980   United States         840     42
# 6   1981  Czech Republic         203      0
# 7   1981   Great Britain         826      0
# 8   1981         Hungary         348      0
# 9   1981        Paraguay         600      0
# 10  1981    South Africa         710      0
# 11  1981   United States         840      0
# 12  1982  Czech Republic         203      0
# 13  1982   Great Britain         826      0
# 14  1982         Hungary         348      0
# 15  1982        Paraguay         600      2
# 16  1982    South Africa         710      0
# 17  1982   United States         840     42
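
On pandas 1.2 or newer, `merge` accepts `how='cross'`, so the helper KEY column is not strictly needed. A possible variant, sketched here on the sample rows from the question, builds the full year/country grid first and then left-joins the observed counts onto it. The names `years`, `countries`, `grid` and `filled` are illustrative only.

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "DATE": [1980, 1980, 1980, 1980, 1980, 1982, 1982],
    "COUNTRY": ["United States", "Czech Republic", "Hungary",
                "Great Britain", "South Africa", "United States", "Paraguay"],
    "COUNTRY_ID": [840, 203, 348, 826, 710, 840, 600],
    "COUNT": [42, 2, 1, 1, 1, 42, 2],
})

# One row per year in the full range, one row per distinct country
years = pd.DataFrame(
    {"DATE": list(range(df["DATE"].min(), df["DATE"].max() + 1))}
)
countries = df[["COUNTRY", "COUNTRY_ID"]].drop_duplicates()

# Cartesian product of years and countries (requires pandas >= 1.2)
grid = years.merge(countries, how="cross")

# Attach the observed counts; pairs with no match get NaN, then 0
filled = grid.merge(df, on=["DATE", "COUNTRY", "COUNTRY_ID"], how="left")
filled["COUNT"] = filled["COUNT"].fillna(0).astype(int)
print(filled)
```

Because each (DATE, COUNTRY) pair exists exactly once in the grid, no duplicate removal is needed afterwards.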


Answer 1
Score 1, accepted, 2017-02-07 23:39:07

Answer 2
Score 0, 2017-02-08 14:53:03
