
List to pandas dataframe

I have a list like the following:

['[[["3200","house_number"],["northline ave","road"],["ste 360","unit"],["greensboro","city"],["27408","postcode"],["7611","house_number"],["ncus","road"]]]\n',
 '[[["1530","house_number"],["jamacha rd","road"],["ste pel","unit"],["ca","road"],["jon","city"],["ca","state"],["92019","postcode"],["us","country"]]]\n',
 '[[["625","house_number"],["westport pkwy","road"],["grapevine","city"],["76051","postcode"],["txus","city"]]]\n',
 '[[["609 principale stpaul de illeauxnoix quebec ca nadaus","house"]]]\n',
 '[[["734","house_number"],["warmiinsterunited states of ameri caus","house"]]]\n',
 '[[["595","house_number"],["market street","road"],["suite 2500","unit"],["san francisco","city"],["ca","state"],["94105us","postcode"]]]\n',
 '[[["40","house_number"],["first plaza 4 th flooralbuquerque","road"],["87102","house_number"],["nmus","road"]]]\n',
 '[[["519","house_number"],["regents gate","road"],["drhenderson","city"],["nv","state"],["89012","postcode"],["us","country"]]]\n',
 '[[["400","house_number"],["garden city plz","road"],["ste 510","unit"],["garden city","suburb"],["nyus","city"]]]\n']

I have an empty dataframe (df2):

df2=pd.DataFrame(columns=['house','category','near','house_number','road','unit','level','staircase','entrance','po_box','postcode','suburb','city_district', 'city','island', 'state_district', 'state', 'country_region', 'country', 'world_region'])

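I want to index the list into the dataframe based on the keys in the list; if a record has no value for a label, that cell can stay empty. I tried this using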

df = df.reindex(df2.columns, fill_value="")

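However, I get an error saying the labels should be unique. As you can see from the list, the road label appears twice in a record, when it should appear only once. So I would like to concatenate all the values that share the same key and then reindex.

Please help me concatenate the values by key and put them into the predefined dataframe df2.

Thanks in advance.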

Try this:

import ast
import pandas as pd

lst = ['[[["3200","house_number"],["northline ave","road"],["ste 360","unit"],["greensboro","city"],["27408","postcode"],["7611","house_number"],["ncus","road"]]]\n',
 '[[["1530","house_number"],["jamacha rd","road"],["ste pel","unit"],["ca","road"],["jon","city"],["ca","state"],["92019","postcode"],["us","country"]]]\n',
 '[[["625","house_number"],["westport pkwy","road"],["grapevine","city"],["76051","postcode"],["txus","city"]]]\n',
 '[[["609 principale stpaul de illeauxnoix quebec ca nadaus","house"]]]\n',
 '[[["734","house_number"],["warmiinsterunited states of ameri caus","house"]]]\n',
 '[[["595","house_number"],["market street","road"],["suite 2500","unit"],["san francisco","city"],["ca","state"],["94105us","postcode"]]]\n',
 '[[["40","house_number"],["first plaza 4 th flooralbuquerque","road"],["87102","house_number"],["nmus","road"]]]\n',
 '[[["519","house_number"],["regents gate","road"],["drhenderson","city"],["nv","state"],["89012","postcode"],["us","country"]]]\n',
 '[[["400","house_number"],["garden city plz","road"],["ste 510","unit"],["garden city","suburb"],["nyus","city"]]]\n']

cols = ['house','category','near','house_number','road','unit','level','staircase','entrance','po_box','postcode','suburb','city_district', 'city','island', 'state_district', 'state', 'country_region', 'country', 'world_region']

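# Parse each string; every string wraps a single record in an outer list, so flatten one level.
lst = [item for x in lst for item in ast.literal_eval(x)]
# Map label -> value per record; a repeated label keeps only its last value.
df_dict = {idx: {v[1]: v[0] for v in ls} for idx, ls in enumerate(lst)}
# Rows come from the dict keys; labels missing from a record become NaN.
df = pd.DataFrame.from_dict(df_dict, orient="index", columns=cols)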
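
Note that the dict comprehension above keeps only the last value when a label repeats within a record, whereas the question asks for the values to be concatenated. A minimal sketch of one way to do that follows. The helper name merge_pairs, the space separator, the two shortened sample rows and the three-column subset are assumptions made for illustration; they are not part of the original answer.

```python
import ast
from collections import defaultdict

import pandas as pd

# Two shortened sample rows in the same format as the question's list (illustrative only).
raw = [
    '[[["3200","house_number"],["northline ave","road"],["7611","house_number"],["ncus","road"]]]\n',
    '[[["625","house_number"],["grapevine","city"],["txus","city"]]]\n',
]
# A subset of the question's columns, to keep the output small.
cols = ["house_number", "road", "city"]


def merge_pairs(pairs, sep=" "):
    """Hypothetical helper: join the values of repeated labels in order of appearance."""
    merged = defaultdict(list)
    for value, label in pairs:
        merged[label].append(value)
    return {label: sep.join(values) for label, values in merged.items()}


# Each string holds an outer list wrapping one record of [value, label] pairs.
records = [merge_pairs(rec) for line in raw for rec in ast.literal_eval(line)]

# Labels absent from a record become NaN, which is then replaced by an empty string.
df = pd.DataFrame(records, columns=cols).fillna("")
print(df)
```

With the full list from the question, the same steps should apply with cols set to the complete column list; whether a space is the right separator for merged address parts depends on the data.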


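Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original post's address. For any questions, contact: yoyou2525@163.com.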
