
Inserting rows at specific indices in a DataFrame

I want to insert specific rows into a DataFrame. The DataFrame contains id, hourname and count columns. I want to insert a row for every hour (0-23) that has no data. This is my DataFrame:

      index  id                          hourname  count
           0  a                               0         1
           1  a                               4         1
           2  a                               14        1
           3  a                               15        3
           4  a                               17        1
           5  a                               20        1

and this is what I want to achieve:

      index  id                          hourname  count
           0  a                               0         1
           1  a                               1         0
           2  a                               2         0
           3  a                               3         0
           4  a                               4         1
           5  a                               5         0
           6  a                               6         0 
           7  a                               7         0
           8  a                               8         0
           9  a                               9         0
           10 a                               10        0
           11 a                               11        0
           12 a                               12        0
           13 a                               13        0
           14 a                               14        1
           15 a                               15        3
           16 a                               16        0
           17 a                               17        1
           18 a                               18        0
           19 a                               19        0
           20 a                               20        1
           21 a                               21        0
           22 a                               22        0
           23 a                               23        0

I read the data from a CSV file. Here is the file content (the file is named a.csv in the source code):

,id,hourname,count
0,a,0,1
1,a,4,1
2,a,14,1
3,a,15,3
4,a,17,1
5,a,20,1

and here is my source code:

import csv
import pandas as pd
import numpy as np

result4 = pd.read_csv("a.csv")
print(result4)
for i in range(0,23):
    if result4.loc[i, 'hourname'] != i:
        line = pd.DataFrame({"id": "a", "hourname": i, "count":0}, index=[i])
        result4 = result4.append(line, ignore_index=False)
    result4 = result4.sort_index().reset_index(drop=True)
print(result4)

I hope this answers your question.

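import pandas as pd

x = [0, 4, 14, 15, 17, 20]   # hours that have data
y = [1, 1, 1, 3, 1, 1]       # counts for those hours
k = list(range(0, 24))       # every hour of the day, 0..23
# Missing hours, sorted ascending so that each insert lands at its own index.
i = sorted(set(k) - set(x))
for itm in i:
    x.insert(itm, itm)       # hour itm belongs at position itm
    y.insert(itm, 0)         # no data for that hour, so the count is 0
data = {'id': len(x) * ['a'], 'hourname': x, 'count': y}
df = pd.DataFrame(data)
print(df)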

I simply created a list k containing the integers from 0 to 23, then took the difference between k and x as list i. After that, I iterated through i, inserting each of its items into x and a 0 at the matching index in y.
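The same set-difference idea can also be applied to an existing DataFrame instead of plain lists. The following is only a sketch, assuming a single id "a" as in the question; the names missing_hours, filler and full are made up for illustration.

```python
import pandas as pd

# Observed data from the question.
df = pd.DataFrame({
    "id": ["a"] * 6,
    "hourname": [0, 4, 14, 15, 17, 20],
    "count": [1, 1, 1, 3, 1, 1],
})

# Hours 0..23 that do not appear in the data, in ascending order.
missing_hours = sorted(set(range(24)) - set(df["hourname"]))

# One zero-count row per missing hour (assumption: the only id is "a").
filler = pd.DataFrame({"id": "a", "hourname": missing_hours, "count": 0})

# Combine, order by hour, and renumber the index 0..23.
full = (
    pd.concat([df, filler], ignore_index=True)
    .sort_values("hourname")
    .reset_index(drop=True)
)
print(full)
```

Sorting by hourname after the concat means the result does not depend on where each new row was placed.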

Try:

hours_df = pd.DataFrame({'hourname': range(0, 24)})  # hours 0 through 23
df = your_df.merge(hours_df, how='right', on='hourname')

This will give you all the hours. Then fill in the missing id and count:

df['id'] = df['id'].ffill()
df['count'] = df['count'].fillna(0).astype(int)  # the merge turns count into floats
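For reference, here is the merge approach as one self-contained script. It is a sketch, not a definitive version: the CSV text is an inline copy of a.csv from the question, read through io.StringIO so that no file is needed, and index_col=0 is an assumption about how the unnamed first column should be treated.

```python
import io
import pandas as pd

# Inline copy of a.csv from the question; the first, unnamed column is the saved index.
csv_text = """,id,hourname,count
0,a,0,1
1,a,4,1
2,a,14,1
3,a,15,3
4,a,17,1
5,a,20,1
"""
your_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# One row per hour of the day, 0 through 23.
hours_df = pd.DataFrame({"hourname": range(24)})

# A right merge keeps every hour; hours with no data get NaN in id and count.
merged = your_df.merge(hours_df, how="right", on="hourname")

# Fill the gaps: carry the id forward, set missing counts to 0, restore integer dtype.
merged["id"] = merged["id"].ffill()
merged["count"] = merged["count"].fillna(0).astype(int)

print(merged)
```

Note that ffill only recovers the id here because the data has a single id and a row for hour 0. With several ids, each id would need its own set of 24 hours.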


 