Combine two lists into a list of dicts with keys and values and transform it into a DataFrame

I have two lists:

keys = ['a', 'b', 'c', 'd']
values = ["1", "2", "3", "4", "5", "6", "7", "8"]

The items in my keys list should become the keys of the final dicts, and the items in my values list should be the values for them.

The output should look like this:

result : [{
    "a": "1",
    "b": "2",
    "c": "3",
    "d": "4"
    },
    {
    "a": "5",
    "b": "6",
    "c": "7",
    "d": "8"
    }]

Later I want to run json_normalize on it to get a table:

+---+---+---+---+
| a | b | c | d |
+---+---+---+---+
| 1 | 2 | 3 | 4 |
| 5 | 6 | 7 | 8 |
+---+---+---+---+

I need a solution where the length of the values list is dynamic, so it should work with 8 values as well as with, for example, 64 values.

I tried to zip() the lists in combination with itertools.cycle:

values = ["1","2","3","4","5","6","7","8"]
keys = ["A","B","C"]

from itertools import cycle
zip_list = zip(cycle(keys), values)

result = set(zip_list)
result

Here is the result:

{('A', '1'),
 ('A', '4'),
 ('A', '7'),
 ('B', '2'),
 ('B', '5'),
 ('B', '8'),
 ('C', '3'),
 ('C', '6')}

I can't use this solution: first, I can't run json_normalize() on it to turn it quickly and easily into a dataframe, and second, the set comes out reordered, and I don't want the order changed. How can I achieve my goal another way?
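A small sketch of the ordering issue described above, for illustration only: a set makes no promise about order (and would also collapse repeated pairs), while a list keeps the pairs exactly as zip() produced them. It reuses the keys and values from the attempt; the name pairs is just a placeholder. The result is still a flat list of pairs rather than one dict per row, so it does not solve the problem by itself.

```python
from itertools import cycle

keys = ["A", "B", "C"]
values = ["1", "2", "3", "4", "5", "6", "7", "8"]

# A list preserves the order in which zip() yields the pairs;
# a set gives no ordering guarantee and drops duplicate pairs.
pairs = list(zip(cycle(keys), values))
print(pairs[:4])  # [('A', '1'), ('B', '2'), ('C', '3'), ('A', '4')]
```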

I'd suggest you approach this in a simpler way: split the list values into chunks of len(keys) items, and use the result to build the dataframe:

import pandas as pd

n = len(keys)
# split values into consecutive chunks of len(keys) items
l = [values[i:i + n] for i in range(0, len(values), n)]
# [['1', '2', '3', '4'], ['5', '6', '7', '8']]
pd.DataFrame(l, columns=keys)

   a  b  c  d
0  1  2  3  4
1  5  6  7  8

Just for fun, here's an itertools-based one, following the idea in your approach (though I strongly suggest the first method, since it is simpler and easier to read):

import pandas as pd
from itertools import repeat, islice
keys = ['a', 'b', 'c', 'd']
values = ["1", "2", "3", "4", "5", "6", "7", "8"]
i_values = iter(values)
r = len(values)//len(keys)

d = [dict(zip(k, islice(i_values, len(k)))) for k in repeat(keys, r)]
# [{'a': '1', 'b': '2', 'c': '3', 'd': '4'}, {'a': '5',...

print(pd.DataFrame(d))
   a  b  c  d
0  1  2  3  4
1  5  6  7  8
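A related iterator trick, shown here as a sketch rather than as part of the answer above: handing the same iterator to zip() len(keys) times makes each tuple take consecutive items. Like the floor division in r, it silently drops an incomplete trailing group. The names it and groups are placeholders.

```python
keys = ['a', 'b', 'c', 'd']
values = ["1", "2", "3", "4", "5", "6", "7", "8"]

# The list holds len(keys) references to the SAME iterator, so zip()
# pulls consecutive items into each tuple; a short final group is dropped.
it = iter(values)
groups = zip(*[it] * len(keys))

d = [dict(zip(keys, group)) for group in groups]
print(d)
# [{'a': '1', 'b': '2', 'c': '3', 'd': '4'}, {'a': '5', 'b': '6', 'c': '7', 'd': '8'}]
```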

It's pretty straightforward with the pandas.DataFrame constructor:

import pandas as pd

keys = ['a', 'b', 'c', 'd']
values = ["1", "2", "3", "4", "5", "6", "7", "8"]

step = len(keys)
df = pd.DataFrame([dict(zip(keys, values[i:i+step])) for i in range(0, len(values), step)])
print(df)

The output:

   a  b  c  d
0  1  2  3  4
1  5  6  7  8
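One thing to be aware of with dict(zip(...)): if the number of values is not a multiple of len(keys), zip stops at the shorter input, so the last dict simply has fewer keys. If you would rather have every key present, a possible variant (a sketch, using a made-up six-item input) is itertools.zip_longest, which fills the missing values with None:

```python
from itertools import zip_longest

keys = ['a', 'b', 'c', 'd']
values = ["1", "2", "3", "4", "5", "6"]  # hypothetical input: last row is incomplete

step = len(keys)
rows = [
    dict(zip_longest(keys, values[i:i + step]))  # missing values become None
    for i in range(0, len(values), step)
]
print(rows[-1])  # {'a': '5', 'b': '6', 'c': None, 'd': None}
```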


 