How to properly append a list to a DataFrame in a for loop
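I am trying to add each item from a list of strings (the lines of a file) to my DataFrame. Each line holds keys and values that are dumped into a list and parsed as JSON. The problem is that I cannot get pandas to build the DataFrame correctly from the list inside the loop (the code gets stuck in the for loop).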
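import json
import pandas as pd

df = pd.DataFrame()
df2 = pd.DataFrame()
all_list = []  # not defined in the original snippet; assumed to start empty

with open(log_file_path, "r") as file:
    for line in file:
        # Skip the first character of the line, then parse the rest as JSON
        line = json.loads(line[1:])
        items = line.items()
        all_list.append(list(items))  # the original appended the built-in `list` itself
        # Slow: the whole frame is copied on every iteration
        # (DataFrame.append was also removed in pandas 2.0)
        df = df.append(pd.DataFrame.from_records([line]))
        continue
print("work")
print(df)
print(df.head())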
This is what each line looks like:
line = {'protocol': 'https', 'instanceid': 'beacond-lga13-1349-12003', 'raw_data': 'i|200|122!i|200|114!i|200|117', 'source_ip': '90.227.61.0', 'ts': 1549434199, 'jobid': '1uxw9ir', 'geocode': 'SE', 'referer': 'https://sv.cam4.com/female', 'user_agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-G935F/G935FXXS3ERL4 Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36', 'appid': '157pr4o', 'app_version': 1536174158, 'asn': 3301}
I would collect everything in a list first and then build your DataFrame from it. For example:
import pandas as pd

# After collecting each list
lists = [['a', 'b'],
         ['c', 'd']]
# Pass your list of lists (and you can name the columns too if you like!)
pd.DataFrame(lists, columns=['col1', 'col2'])
Output:
  col1 col2
0    a    b
1    c    d
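Applied to the log file from the question, the same idea could look like the sketch below. It assumes each line is a JSON object preceded by one extra character (which is why the question slices with `line[1:]`). The in-memory `log_file`, its `>` prefix, and the shortened records are made up for illustration and stand in for `open(log_file_path, "r")`.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for open(log_file_path, "r"): two log lines, each
# with one leading character before the JSON payload, as in the question.
log_file = io.StringIO(
    '>{"protocol": "https", "source_ip": "90.227.61.0", "asn": 3301}\n'
    '>{"protocol": "https", "source_ip": "90.227.61.1", "asn": 3301}\n'
)

records = []
for raw in log_file:
    if not raw.strip():
        continue  # skip blank lines
    # Drop the leading character, then parse the rest as JSON
    records.append(json.loads(raw[1:]))

# Build the DataFrame once, after the loop, instead of growing it per line
df = pd.DataFrame(records)
print(df)
```

Appending to a plain Python list is cheap, so the cost of building the frame is paid only once at the end rather than on every line.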
I can read your line if I do this:
line = {'protocol': 'https', 'instanceid': 'beacond-lga13-1349-12003', 'raw_data': 'i|200|122!i|200|114!i|200|117', 'source_ip': '90.227.61.0', 'ts': 1549434199, 'jobid': '1uxw9ir', 'geocode': 'SE', 'referer': 'https://sv.cam4.com/female', 'user_agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-G935F/G935FXXS3ERL4 Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36', 'appid': '157pr4o', 'app_version': 1536174158, 'asn': 3301}
pd.DataFrame(line, index=[0])
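The `index=[0]` matters because every value in `line` is a scalar: without an index, pandas cannot tell how many rows to build and raises a `ValueError`. A minimal sketch, using a shortened version of the dict, which also shows that wrapping the dict in a list is another way to get a one-row frame:

```python
import pandas as pd

# Shortened version of the dict from the question (all values are scalars)
line = {"protocol": "https", "geocode": "SE", "asn": 3301}

try:
    pd.DataFrame(line)  # no index: pandas cannot infer the number of rows
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

with_index = pd.DataFrame(line, index=[0])  # one row, labelled 0
from_list = pd.DataFrame([line])            # list holding one dict: also one row

print(with_index)
print(from_list)
```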
You can also use a range for the index, index=range(0, len(items)):
lines = [{'protocol': 'https',
'instanceid': 'beacond-lga13-1349-12003',
'raw_data': 'i|200|122!i|200|114!i|200|117',
'source_ip': '90.227.61.0',
'ts': 1549434199,
'jobid': '1uxw9ir',
'geocode': 'SE',
'referer': 'https://sv.cam4.com/female',
'user_agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-G935F/G935FXXS3ERL4 Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36',
'appid': '157pr4o',
'app_version': 1536174158,
'asn': 3301},
{'protocol': 'https',
'instanceid': 'beacond-lga14-1349-12003',
'raw_data': 'i|200|122!i|200|114!i|200|117',
'source_ip': '90.227.61.1',
'ts': 1549434199,
'jobid': '1uxw9ir',
'geocode': 'SE',
'referer': 'https://sv.cam4.com/female',
'user_agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-G935F/G935FXXS3ERL4 Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/8.2 Chrome/63.0.3239.111 Mobile Safari/537.36',
'appid': '157pr4o',
'app_version': 1536174158,
'asn': 3301}]
pd.DataFrame(lines, index=list(range(0, len(lines))))
Output:
  protocol                instanceid                       raw_data    source_ip          ts  ...                     referer                                         user_agent    appid  app_version   asn
0    https  beacond-lga13-1349-12003  i|200|122!i|200|114!i|200|117  90.227.61.0  1549434199  ...  https://sv.cam4.com/female  Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-...  157pr4o   1536174158  3301
1    https  beacond-lga14-1349-12003  i|200|122!i|200|114!i|200|117  90.227.61.1  1549434199  ...  https://sv.cam4.com/female  Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-...  157pr4o   1536174158  3301

[2 rows x 12 columns]
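As a side note, the explicit `index` argument is optional for a list of dicts: by default pandas already labels the rows 0 to `len(lines) - 1`. A small sketch with shortened records to check this:

```python
import pandas as pd

# Shortened versions of the two records above
lines = [
    {"instanceid": "beacond-lga13-1349-12003", "source_ip": "90.227.61.0"},
    {"instanceid": "beacond-lga14-1349-12003", "source_ip": "90.227.61.1"},
]

explicit = pd.DataFrame(lines, index=list(range(0, len(lines))))
default = pd.DataFrame(lines)  # default index: 0, 1, ..., len(lines) - 1

# Both frames carry the same row labels
print(list(default.index) == list(explicit.index))
```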