Import a nested dict containing a list into a CSV
I am trying to write the following data to a CSV file:
{'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}
I almost succeeded with this code:
#!/bin/python3
## Import ##
# Official
import csv

### Main ###
if __name__ == '__main__':
    ## Variables
    csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
    final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
                  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}
    # Open the csv file in "write mode"
    with open(file_name, mode='w') as file:
        # Prepare the writer to add a dict into the csv file
        csv_writer = csv.DictWriter(file, fieldnames=headers)
        # Write the columns header into the csv file
        csv_writer.writeheader()
        # Write the dict into the file
        for key, val in nest_dict.items():
            row = {'FQDN': key}
            row.update(val)
            csv_writer.writerow(row)
The result is:
domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,['UNKNOWN'],"[443, 8443]"
But I want:
domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','443'
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','8443'
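See the difference? I have a "service" list (which needs no special handling here) and a "port" list. If the "port" column holds more than one port, I need to write a new row for each port in the list.
I am struggling to do this because I do not fully understand this piece of code: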
        # Write the dict into the file
        for key, val in nest_dict.items():
            row = {'FQDN': key}
            row.update(val)
            csv_writer.writerow(row)
Can you help me with this?
This gives the desired result with the given data:
import csv

### Main ###
if __name__ == '__main__':
    ## Variables
    csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
    final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
                  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}
    # Open the csv file in "write mode"
    with open('out.csv', mode='w') as file:
        # Prepare the writer to add a dict into the csv file
        csv_writer = csv.DictWriter(file, fieldnames=csv_headers)
        # Write the columns header into the csv file
        csv_writer.writeheader()
        # Write the dict into the file
        for key, val in final_data.items():
            row = {'FQDN': key}
            # Assume that service is always a list of one value and replace it with the one value
            # it contains.
            val['service'] = val.pop('service')[0]
            row.update(val)
            # Since the value of port is quoted it will be a string, but we want a list. Remove the
            # value of 'port' from the dict and put it in 'port_string' (= '[443, 8443]')
            port_string = val.pop('port')
            # Remove the opening and closing brackets from the port_string (= '443, 8443').
            port_string = port_string.replace('[', '')
            port_string = port_string.replace(']', '')
            # Now we can split the string into a python list (= ['443', ' 8443'])
            port_list = port_string.split(',')
            # Write a csv row for each value in the port list, stripping the leftover spaces
            for port in port_list:
                row['port'] = port.strip()
                csv_writer.writerow(row)
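(By the way, the code in the original post did not run. This code includes the edits needed to make it run.)
Note that because the value of 'port' is quoted (unlike the value of 'service'), it is read in as a string and must first be converted to a list. If you remove the single quotes around [443, 8443], the port part of the code simplifies to: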
            port_list = val.pop('port')
            # Write a csv row for each value in the port list
            for port in port_list:
                row['port'] = port
                csv_writer.writerow(row)
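If the quotes cannot be removed at the source, the standard library's `ast.literal_eval` is an alternative to stripping the brackets by hand. The following is a minimal sketch, assuming the port value is always either a list or a string holding a Python list literal; `parse_ports` is a hypothetical helper name, not part of the original code:

```python
import ast


def parse_ports(value):
    """Return the ports as a list, given a list or a string such as '[443, 8443]'."""
    if isinstance(value, str):
        # literal_eval only accepts Python literals (lists, numbers, strings, ...),
        # so unlike eval() it does not execute arbitrary code
        value = ast.literal_eval(value)
    return list(value)


print(parse_ports('[443, 8443]'))  # [443, 8443]
print(parse_ports([443, 8443]))    # [443, 8443]
```

With this approach the ports come back as integers rather than strings, which makes no difference once `csv` writes them out.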
Another potential issue is 'service'. It is a list, so can it hold more than one value? If so, the code needs to be modified to handle that.
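If 'service' can hold several values and each one should get its own row, one option is to pair every service with every port using `itertools.product`. This is only a sketch under that assumption, since the question does not say how multiple services should be written; `expand_row` is a hypothetical helper, the service names are made up for illustration, and 'port' is assumed to already be a real list:

```python
import itertools


def expand_row(fqdn, record):
    """Yield one flat dict per (service, port) combination."""
    # Copy every column except the two list-valued ones
    base = {k: v for k, v in record.items() if k not in ('service', 'port')}
    base['FQDN'] = fqdn
    for service, port in itertools.product(record['service'], record['port']):
        yield {**base, 'service': service, 'port': port}


# Illustrative record with two services and two ports (not from the original data)
record = {'domain': 'foo.com', 'service': ['HTTP', 'HTTPS'], 'port': [443, 8443]}
for row in expand_row('test.foo.com', record):
    print(row['service'], row['port'])
```

Each yielded dict can be passed straight to `csv_writer.writerow`.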
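Finally, the code shown here could be more Pythonic, but I wanted to keep it as readable as possible for a beginner. Once it works exactly as needed, it can be made more Pythonic.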
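For reference, a more compact version might look like the sketch below. It is one possible arrangement rather than the definitive one: `iter_rows` and `write_csv` are hypothetical names, 'service' is still assumed to hold exactly one value, and the output goes to an in-memory `io.StringIO` buffer instead of a file so the result can be printed directly.

```python
import ast
import csv
import io

CSV_HEADERS = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']


def iter_rows(data):
    """Yield one CSV row (as a dict) per port for every entry in data."""
    for fqdn, record in data.items():
        ports = record['port']
        if isinstance(ports, str):
            ports = ast.literal_eval(ports)  # '[443, 8443]' -> [443, 8443]
        for port in ports:
            # Later keys win, so FQDN, service and port override the copies in record
            yield {**record, 'FQDN': fqdn, 'service': record['service'][0], 'port': port}


def write_csv(data, file):
    """Write the header and all expanded rows to an open text file object."""
    writer = csv.DictWriter(file, fieldnames=CSV_HEADERS)
    writer.writeheader()
    writer.writerows(iter_rows(data))


final_data = {'test.foo.com': {'domain': 'foo.com', 'FQDN': 'test.foo.com', 'AS': 'AS1111',
                               'ressource_type': 'A', 'nb_ip': '1', 'IP': '1.1.1.1',
                               'service': ['UNKNOWN'], 'port': '[443, 8443]'}}
buffer = io.StringIO()
write_csv(final_data, buffer)
print(buffer.getvalue())
```

Unlike the step-by-step version, this builds a new dict for each row and leaves `final_data` unmodified.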
So here is the final code.
I am not sure this is the best way to do it, but it looks good enough to me.
#!/bin/python3
## Import ##
# Official
import csv

### Main ###
if __name__ == '__main__':
    ## Variables
    file_name = 'out.csv'
    csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
    # 'port' must be a real list (no quotes around it), otherwise the loop below
    # would iterate over the characters of a string
    final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
                  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': [443, 8443]}}
    # Open the csv file in "write mode"
    with open(file_name, mode='w') as file:
        # Prepare the writer to add a dict into the csv file
        csv_writer = csv.DictWriter(file, fieldnames=csv_headers)
        # Write the columns header into the csv file
        csv_writer.writeheader()
        for key, val in final_data.items():
            # Start the row with the outer dict key as the FQDN
            row = {'FQDN': key}
            # Update the row with all columns values
            row.update(val)
            # Join the service list into one string: several elements are separated by a space,
            # a single element is written as-is
            row['service'] = ' '.join(val['service'])
            # Write a row for each value in the port list
            for port in val['port']:
                row['port'] = port
                csv_writer.writerow(row)
Resulting output:
domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','443'
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','8443'
Please don't upvote my answer; I posted it for reference only. All credit should go to @bartonstanley.