
python - How to transform key-value pairs in Pandas dataframe

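I received this dataset, which contains real estate data as key-value pairs in a .csv format.

If I drop the first row, I can load it with Pandas and get a dataframe that looks like this: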

ID 1  [{'key'": '"floor'"  '"value'": '"2. Floor'"}  {'"key'": '"available_date'"  '"value'": '"nach Vereinbarung'"}
ID 2  [{'key'": '"floor'"  '"value'": '"1. Floor'"}  {'"key'": '"living_space'"  '"value'": 81.0}
ID 3  [{'key'": '"living_space'"  '"value'": 240.0}  {'"key'": '"construction_year'"  '"value'": 2012}
ID 4  [{'key'": '"living_space'"  '"value'": 280.0}  {'"key'": '"construction_year'"  '"value'": 1851}

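However, I don't know how to work with key-value pairs in Python, so I would like to transform this data into a Pandas dataframe with the "keys" as headers and their respective values in each row, like this: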

ID    floor     available_date     living_space  construction_year
ID 1  2. Floor  nach Vereinbarung
ID 2  1. Floor                     81
ID 3                               240.0         2012
ID 4                               280.0         1851
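
The target layout above can be sketched on data that has already been parsed. This is only an illustration under assumptions: it presumes each row is available as a Python list of {'key': ..., 'value': ...} dicts, and the names `rows` and `flatten` are hypothetical. The sample values are taken from the example above.

```python
import pandas as pd

# Hypothetical, already-parsed input: one list of key/value dicts per listing.
rows = {
    "ID 1": [{"key": "floor", "value": "2. Floor"},
             {"key": "available_date", "value": "nach Vereinbarung"}],
    "ID 3": [{"key": "living_space", "value": 240.0},
             {"key": "construction_year", "value": 2012}],
}


def flatten(pairs):
    """Turn [{'key': k, 'value': v}, ...] into {k: v, ...}."""
    return {pair["key"]: pair["value"] for pair in pairs}


# One flat dict per listing; pandas aligns the keys into columns
# and leaves NaN where a listing has no value for a key.
records = [{"ID": row_id, **flatten(pairs)} for row_id, pairs in rows.items()]
wide = pd.DataFrame(records).set_index("ID")
print(wide)
```

Getting from the raw .csv text to such lists is the harder part, because of the nested quoting in the file.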

I found a lot of instructions on how to convert a Pandas dataframe into key-value pairs, but nothing for the other way around...

Thanks in advance.
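
For reference, the direction asked about here (key-value pairs to columns) is what `DataFrame.pivot` does, while `melt` goes the other way. A minimal sketch, assuming the pairs are already in a long table with one row per ID/key pair. The `long` frame below is hypothetical and reuses sample values from the question.

```python
import pandas as pd

# Hypothetical long-format table: one row per (ID, key) pair.
long = pd.DataFrame({
    "ID": ["ID 1", "ID 1", "ID 2", "ID 2"],
    "key": ["floor", "available_date", "floor", "living_space"],
    "value": ["2. Floor", "nach Vereinbarung", "1. Floor", 81.0],
})

# Key-value pairs -> one column per key (missing combinations become NaN).
wide = long.pivot(index="ID", columns="key", values="value")

# And back again: columns -> key-value pairs, dropping the NaN gaps.
back = (
    wide.reset_index()
    .melt(id_vars="ID", var_name="key", value_name="value")
    .dropna(subset=["value"])
)
print(wide)
print(back)
```

`pivot` raises an error if the same ID/key pair occurs twice, so this only fits data where each key appears at most once per listing.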

Update

The content of my data looks like this:

print(df.head(10))
           [{'key'": '"floor'"   '"value'": '"3. Stock'"}        {'"key'": '"living_space'"    '"value'": 50.0}      {'"key'": '"available_date'"  ... Unnamed: 49 Unnamed: 50 Unnamed: 51 Unnamed: 52 Unnamed: 53
0          [{'key'": '"floor'"   '"value'": '"2. Stock'"}        {'"key'": '"living_space'"   '"value'": 113.0}   {'"key'": '"construction_year'"  ...         NaN         NaN         NaN         NaN         NaN
1          [{'key'": '"floor'"   '"value'": '"1. Stock'"}        {'"key'": '"living_space'"    '"value'": 52.0}   {'"key'": '"construction_year'"  ...         NaN         NaN         NaN         NaN         NaN
..                         ...                        ...                               ...                 ...                               ...  ...         ...         ...         ...         ...         ...
8   [{'key'": '"living_space'"          '"value'": 240.0}   {'"key'": '"construction_year'"    '"value'": 2012}      {'"key'": '"available_date'"  ...         NaN         NaN         NaN         NaN         NaN
9   [{'key'": '"living_space'"          '"value'": 280.0}   {'"key'": '"construction_year'"    '"value'": 1851}      {'"key'": '"available_date'"  ...         NaN         NaN         NaN         NaN         NaN

[10 rows x 54 columns]

Update

The content of the .csv looks like this (for the first 2 observations):

1,[{'key'"": '""floor'"""," '""value'"": '""3. Stock'""}"," {'""key'"": '""living_space'"""," '""value'"": 50.0}"," {'""key'"": '""available_date'"""," '""value'"": '""01.04.2022'""}"," {'""key'"": '""useful_area'"""," '""value'"": 60.0}"," {'""key'"": '""pets_allowed'"""," '""value'"": true}"," {'""key'"": '""child_friendly'"""," '""value'"": true}"," {'""key'"": '""balcony'"""," '""value'"": true}"," {'""key'"": '""parking_outdoor'"""," '""value'"": true}"," {'""key'"": '""lift'"""," '""value'"": true}"," {'""key'"": '""cable_tv'"""," '""value'"": true}]""","[{'date'"": '""2022-02-25'"""," '""price_amount'"": 1550}]"""

2,[{'key'"": '""floor'"""," '""value'"": '""2. Stock'""}"," {'""key'"": '""living_space'"""," '""value'"": 113.0}"," {'""key'"": '""construction_year'"""," '""value'"": 2022}"," {'""key'"": '""available_date'"""," '""value'"": '""01.04.2022'""}"," {'""key'"": '""wheelchair_accessible'"""," '""value'"": true}"," {'""key'"": '""child_friendly'"""," '""value'"": true}"," {'""key'"": '""balcony'"""," '""value'"": true}"," {'""key'"": '""parking_indoor'"""," '""value'"": true}"," {'""key'"": '""lift'"""," '""value'"": true}]""","[{'date'"": '""2022-02-27'"""," '""price_amount'"": 2990}]"""

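The data seems to have been scraped from an online real estate marketplace. I think it is also relevant to state that each observation has a different number of features.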
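
The differing number of features per observation can be checked directly once the rows are parsed. A minimal standard-library sketch, assuming the two observations above have already been turned into flat dicts. The `listings` variable is hypothetical; its keys and values are copied from the two rows shown.

```python
from collections import Counter

# Hypothetical parsed form of the two observations shown above.
listings = [
    {"floor": "3. Stock", "living_space": 50.0, "available_date": "01.04.2022",
     "useful_area": 60.0, "pets_allowed": True, "child_friendly": True,
     "balcony": True, "parking_outdoor": True, "lift": True, "cable_tv": True},
    {"floor": "2. Stock", "living_space": 113.0, "construction_year": 2022,
     "available_date": "01.04.2022", "wheelchair_accessible": True,
     "child_friendly": True, "balcony": True, "parking_indoor": True, "lift": True},
]

# Number of features present in each observation.
feature_counts = [len(listing) for listing in listings]

# How often each key occurs across all observations.
key_frequency = Counter(key for listing in listings for key in listing)

# The union of all keys is the set of columns the final dataframe needs.
all_keys = sorted(key_frequency)

print(feature_counts)
print(all_keys)
```

Keys that occur in only some observations end up as missing values in those rows once the data is put into a dataframe.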

A possible solution is the following:

Content of the file "data.csv"

1,"[{'key'"": '""floor'"""," '""value'"": '""3. Stock'""}"," {'""key'"": '""living_space'"""," '""value'"": 50.0}"," {'""key'"": '""available_date'"""," '""value'"": '""01.04.2022'""}"," {'""key'"": '""useful_area'"""," '""value'"": 60.0}"," {'""key'"": '""pets_allowed'"""," '""value'"": true}"," {'""key'"": '""child_friendly'"""," '""value'"": true}"," {'""key'"": '""balcony'"""," '""value'"": true}"," {'""key'"": '""parking_outdoor'"""," '""value'"": true}"," {'""key'"": '""lift'"""," '""value'"": true}"," {'""key'"": '""cable_tv'"""," '""value'"": true}]""","[{'date'"": '""2022-02-25'"""," '""price_amount'"": 1550}]"""
2,"[{'key'"": '""floor'"""," '""value'"": '""2. Stock'""}"," {'""key'"": '""living_space'"""," '""value'"": 113.0}"," {'""key'"": '""construction_year'"""," '""value'"": 2022}"," {'""key'"": '""available_date'"""," '""value'"": '""01.04.2022'""}"," {'""key'"": '""wheelchair_accessible'"""," '""value'"": true}"," {'""key'"": '""child_friendly'"""," '""value'"": true}"," {'""key'"": '""balcony'"""," '""value'"": true}"," {'""key'"": '""parking_indoor'"""," '""value'"": true}"," {'""key'"": '""lift'"""," '""value'"": true}]""","[{'date'"": '""2022-02-27'"""," '""price_amount'"": 2990}]"""

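import json

import pandas as pd

# Read the file and turn every line into JSON-compatible text:
# drop the CSV double quotes, turn single quotes into double quotes,
# and remove the square brackets around the nested lists.
with open("data.csv", "r", encoding="utf-8") as file:
    data = file.read().replace('"', '').replace("'", '"').replace("[", '').replace("]", '').splitlines()

# Wrap each non-empty line in brackets so it parses as one JSON list: [id, {...}, {...}, ...]
data_dict = [json.loads("[" + d + "]") for d in data if d.strip()]

data_all = []

for list_item in data_dict:
    data_prepared = {}
    for idx, item in enumerate(list_item):
        if idx == 0:
            # the first element of each line is the row id
            data_prepared["id"] = item
        elif 'key' in item:
            # {'key': k, 'value': v} becomes column k with value v
            data_prepared[item['key']] = item['value']
        else:
            # other dicts (date, price_amount) are merged in unchanged
            data_prepared.update(item)
    data_all.append(data_prepared)

# Create the dataframe; keys missing from a row become NaN and are shown as "-"
df = pd.DataFrame(data_all)
df = df.fillna("-")
df = df.replace(True, 'Yes')
df = df.replace(False, 'No')
df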

Returns

[image: resulting dataframe]
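
To see what the chain of `replace` calls does, it can help to run it on a single line. The sketch below wraps the same steps in a hypothetical helper, `parse_line`, and applies it to a shortened version of the first row of data.csv (only the `floor` and `lift` pairs are kept). It uses only the standard library.

```python
import json


def parse_line(line):
    """Parse one raw line of data.csv into a flat dict (hypothetical helper)."""
    cleaned = (
        line.replace('"', '')    # drop the CSV double quotes
            .replace("'", '"')   # single quotes become JSON double quotes
            .replace("[", "")    # remove the brackets of the nested lists
            .replace("]", "")
    )
    # After cleaning, the line reads: id, {...}, {...}, ...
    items = json.loads("[" + cleaned + "]")
    record = {"id": items[0]}
    for item in items[1:]:
        if "key" in item:
            record[item["key"]] = item["value"]
        else:
            record.update(item)
    return record


# Shortened version of the first row of data.csv.
line = '''1,"[{'key'"": '""floor'"""," '""value'"": '""3. Stock'""}"," {'""key'"": '""lift'"""," '""value'"": true}]""","[{'date'"": '""2022-02-25'"""," '""price_amount'"": 1550}]"""'''

record = parse_line(line)
print(record)
```

Because the cleaning is plain string replacement, it relies on no value containing quotes or square brackets. A value with an apostrophe, for example, would make `json.loads` fail.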
