简体   繁体   中英

How can I convert a list of dictionaries and nested lists into a pandas Dataframe for use in a markdown table generator?

I have the following list of dictionaries (with nested lists within):

compatibility = [
  {'region': 'NTSC-U',
   'stats': [
     {'os': 'Windows', 'state': 'Playable'}, 
     {'os': 'Linux', 'state': 'Playable'},
     {'os': 'Mac', 'state': 'N/A'}
   ]}, 
   {'region': 'PAL',
    'stats': [
      {'os': 'Windows', 'state': 'Playable'},
      {'os': 'Linux', 'state': 'N/A'},
       {'os': 'Mac', 'state': 'N/A'}
    ]}, 
    {'region': 'NTSC-J',
     'stats': [
       {'os': 'Windows', 'state': 'N/A'}, 
       {'os': 'Linux', 'state': 'N/A'},
       {'os': 'Mac', 'state': 'N/A'}
   ]}
]

I'm trying to use pytablewriter and pandas to generate a markdown table for this data but I can't seem to wrap my head around dealing with the unique values and column ordering for the dataframe.

Here's my function with an example of how this output would look:

def generate_table():
    compatibility = find_compatibility()
    writer = MarkdownTableWriter()
    writer.from_dataframe(
        pd.DataFrame({'NTSC-U': ['Playable', 'Playable', 'N/A'], 'PAL': ['Playable', 'N/A', 'N/A'], 
                      'NTSC-J': ['N/A', 'N/A', 'N/A']}, index=['Windows', 'Linux', 'Mac']),
        add_index_column=True,
    )
    writer.write_table()

How would I go about generating this? I've gotten as far as thinking about sets to pull the unique OS names but that doesn't seem to work with a list of dicts with nested lists. (And since the data structure sometimes only contains one region, it would need to be dynamic).

Any help is much appreciated!

You can start by normalizing your json , selecting 'stats' as the record path.

df = pd.json_normalize(compatibility, "stats", ["region"])
print(df)
        os     state  region
0  Windows  Playable  NTSC-U
1    Linux  Playable  NTSC-U
2      Mac       N/A  NTSC-U
3  Windows  Playable     PAL
4    Linux       N/A     PAL
5      Mac       N/A     PAL
6  Windows       N/A  NTSC-J
7    Linux       N/A  NTSC-J
8      Mac       N/A  NTSC-J

Then you just use pivot to get the desired output.

df = df.pivot("os", "region", values="state")
print(df)
region  NTSC-J    NTSC-U       PAL
os                                
Linux      N/A  Playable       N/A
Mac        N/A       N/A       N/A
Windows    N/A  Playable  Playable

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM