Inserting rows at specific indices in a DataFrame
I want to insert specific rows into a dataframe. The dataframe contains id, hourname and count columns. I want to insert a row for every hour (0-23) that has no data.
This is my dataframe:
index id hourname count
0 a 0 1
1 a 4 1
2 a 14 1
3 a 15 3
4 a 17 1
5 a 20 1
and this is what I want to achieve:
index id hourname count
0 a 0 1
1 a 1 0
2 a 2 0
3 a 3 0
4 a 4 1
5 a 5 0
6 a 6 0
7 a 7 0
8 a 8 0
9 a 9 0
10 a 10 0
11 a 11 0
12 a 12 0
13 a 13 0
14 a 14 1
15 a 15 3
16 a 16 0
17 a 17 1
18 a 18 0
19 a 19 0
20 a 20 1
21 a 21 0
22 a 22 0
23 a 23 0
I read the data from a csv file. Here is the file content (the file is named a.csv in the source code):
,id,hourname,count
0,a,0,1
1,a,4,1
2,a,14,1
3,a,15,3
4,a,17,1
5,a,20,1
and here is my source code:
import csv
import pandas as pd
import numpy as np
result4 = pd.read_csv("a.csv")
print(result4)
for i in range(0,23):
    if result4.loc[i, 'hourname'] != i:
        line = pd.DataFrame({"id": "a", "hourname": i, "count":0}, index=[i])
        result4 = result4.append(line, ignore_index=False)
        result4 = result4.sort_index().reset_index(drop=True)
print(result4)
I hope this answers your question:
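import pandas as pd
x = [0, 4, 14, 15, 17, 20]  # hours that have data
y = [1, 1, 1, 3, 1, 1]      # counts for those hours
k = list(range(0, 24))      # every hour, 0 to 23
# Missing hours; sorted() because set iteration order is not guaranteed
i = sorted(set(k) - set(x))
for itm in i:
    # Inserting in ascending order puts each hour at the index equal to its value
    x.insert(itm, itm)
    y.insert(itm, 0)
data = {'id': len(x) * ['a'], 'hourname': x, 'count': y}
df = pd.DataFrame(data)
print(df)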
I simply created a list k that contains the integers from 0 to 23, then took the difference between list k and list x as list i. After that, I iterated through list i, inserting each of its items into list x and a 0 at the same index in list y. This works provided the missing hours are processed in ascending order, because each hour then lands at the index equal to its value.
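The same set-difference idea can also be applied to the dataframe itself rather than to plain lists. The following is only a sketch under a few assumptions: the sample data from the question is typed in by hand instead of being read from a.csv, every row has id "a", and the names missing, filler and filled are made up for illustration. It uses pd.concat because DataFrame.append, which the question's code calls, was removed in pandas 2.0.

```python
import pandas as pd

# Sample data from the question, entered by hand instead of reading a.csv.
df = pd.DataFrame({
    "id": ["a"] * 6,
    "hourname": [0, 4, 14, 15, 17, 20],
    "count": [1, 1, 1, 3, 1, 1],
})

# Hours in 0..23 that do not appear in the data, in ascending order.
missing = sorted(set(range(24)) - set(df["hourname"]))

# One zero-count row per missing hour (assumes a single id, "a").
filler = pd.DataFrame({"id": "a", "hourname": missing, "count": 0})

# Add the filler rows, then order by hour and renumber the index.
filled = (
    pd.concat([df, filler], ignore_index=True)
    .sort_values("hourname")
    .reset_index(drop=True)
)
print(filled)
```

Sorting by hourname after the concat means the position of each new row never has to be worked out by hand, which is the part the original loop gets wrong.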
Try:
hours_df = pd.DataFrame({'hourname': range(0, 24)})  # range(0, 24) covers hours 0-23
df = your_df.merge(hours_df, how='right', on='hourname')
This will give you all the hours. Then fill in the missing id and count values:
df['id'] = df['id'].ffill()
# The merge leaves NaN in count, which makes the column float; cast back to int
df['count'] = df['count'].fillna(0).astype(int)
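To check the merge approach end to end, here is a minimal runnable sketch. It assumes the question's sample data stands in for your_df (typed in by hand, not read from a.csv) and that the hour range should be 0-23 inclusive. The final cast to int is an addition, made so that count matches the integer values in the desired output.

```python
import pandas as pd

# Stand-in for your_df: the sample data from the question.
your_df = pd.DataFrame({
    "id": ["a"] * 6,
    "hourname": [0, 4, 14, 15, 17, 20],
    "count": [1, 1, 1, 3, 1, 1],
})

# One row per hour of the day, 0 through 23.
hours_df = pd.DataFrame({"hourname": range(24)})

# A right join keeps every hour from hours_df, with NaN where your_df has no row.
df = your_df.merge(hours_df, how="right", on="hourname")

# Carry the id forward into the new rows and give them a count of zero.
df["id"] = df["id"].ffill()
df["count"] = df["count"].fillna(0).astype(int)
print(df)
```

Note that ffill only fills downward, so this relies on hour 0 being present in the data, as it is here. If the first hours could be missing, the id would need to be filled some other way.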