
Restructure an array of arrays and combine same terms

I'm trying to write a function that takes an array of arrays and restructures it into a different form, subject to certain conditions. For example, let's say:

array = [
    ["City1","Spanish", "163"],
    ["City1", "French", "194"],
    ["City2","English", "1239"],
    ["City2","Spanish", "1389"],
    ["City2", "French", "456"]
]

So I want to create a new array whose rows are the cities, sorted alphabetically, and whose columns are the languages (sorting the columns is optional); any nulls should be replaced by 0. For example, the output for the above array should be:

[
[0, 163, 194],
[1239, 1389, 456]
]

I wrote this method, but I'm not sure if it makes sense logically. It is definitely hard-coded, and I am trying to make it work for any input in the above format.

import numpy as np

new_array = [[]]
x = 'City1'
y = 'City2'

def solution(arr):
    for row in arr:
        if row[0]==x:
            new_array[-1].append(row[2])
        else:
            x = x + 1
            c.append([row[2]])
solution(array)

I know I need to fix the syntax, and also write a loop for sorting things alphabetically. Any help on this would be appreciated. I would like to understand how to iterate through an array like this, perform different operations on it, and restructure it into the new format.

If performance is not your overriding concern, you can use Pandas with Categorical Data and groupby. This works because, by default, groupby with categoricals uses the Cartesian product of the categorical series:

import pandas as pd

# construct dataframe
df = pd.DataFrame(array, columns=['city', 'language', 'value'])

# convert to categories
for col in ['city', 'language']:
    df[col] = df[col].astype('category')

# groupby.first or groupby.sum works if you have unique combinations;
# observed=False is passed explicitly so unobserved city/language pairs are kept
res = df.sort_values(['city', 'language'])\
        .groupby(['city', 'language'], observed=False).first().fillna(0).reset_index()

print(res)

    city language value
0  City1  English     0
1  City1   French   194
2  City1  Spanish   163
3  City2  English  1239
4  City2   French   456
5  City2  Spanish  1389
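
To see the Cartesian-product behaviour in isolation, here is a small sketch that counts rows per group with the `observed` argument of `groupby` set each way. The names `rows`, `demo`, `all_pairs` and `seen_pairs` are made up for illustration; the data is the array from the question.

```python
import pandas as pd

# Same rows as the question's array; ("City1", "English") is missing.
rows = [
    ["City1", "Spanish", "163"],
    ["City1", "French", "194"],
    ["City2", "English", "1239"],
    ["City2", "Spanish", "1389"],
    ["City2", "French", "456"],
]
demo = pd.DataFrame(rows, columns=["city", "language", "value"])
for col in ["city", "language"]:
    demo[col] = demo[col].astype("category")

# observed=False: one group for every city/language pair of categories
all_pairs = demo.groupby(["city", "language"], observed=False).size()

# observed=True: only the pairs that actually occur in the data
seen_pairs = demo.groupby(["city", "language"], observed=True).size()

print(len(all_pairs), len(seen_pairs))
print(all_pairs.loc[("City1", "English")])
```

With two cities and three languages, `all_pairs` has six entries (the missing pair has size 0), while `seen_pairs` has only the five pairs present in the input. That extra entry is what lets `fillna(0)` put a 0 in the City1/English slot.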

Then, for your desired list-of-lists output:

res_lst = res.groupby('city')['value'].apply(list).tolist()
res_lst = [list(map(int, x)) for x in res_lst]

print(res_lst)

[[0, 194, 163], [1239, 456, 1389]]
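
If you would rather not depend on pandas, the same restructuring can be sketched in plain Python with a dictionary keyed by (city, language). This is a minimal sketch, not the only way to do it: `restructure` is a name invented here, it assumes every row has exactly three fields with an integer-like string as the third, and it sums the values when a city/language pair appears more than once.

```python
def restructure(rows):
    """Turn [city, language, value] rows into one list of ints per city."""
    # Accumulate a total per (city, language) pair; duplicates are summed.
    totals = {}
    for city, language, value in rows:
        key = (city, language)
        totals[key] = totals.get(key, 0) + int(value)

    # Rows are cities and columns are languages, both sorted alphabetically.
    cities = sorted({city for city, _ in totals})
    languages = sorted({language for _, language in totals})

    # Pairs that never occurred fall back to 0.
    return [
        [totals.get((city, language), 0) for language in languages]
        for city in cities
    ]


array = [
    ["City1", "Spanish", "163"],
    ["City1", "French", "194"],
    ["City2", "English", "1239"],
    ["City2", "Spanish", "1389"],
    ["City2", "French", "456"],
]

print(restructure(array))
# [[0, 194, 163], [1239, 456, 1389]]
```

The columns come out in the order English, French, Spanish, matching the pandas result above.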


 