How to flatten numpy.ndarray data in Python

Question

I have numpy.ndarray data that looks like the sample below, and I want to flatten it so that I can manipulate it. Please find my sample data below:

sample_data=[list([{'region': 'urn:li:region:9194', 'followerCounts': {'organicFollowerCount': 157, 'paidFollowerCount': 0}}, {'region': 'urn:li:region:7127', 'followerCounts': {'organicFollowerCount': 17, 'paidFollowerCount': 0}}])]

I have tried the following code, but no luck yet:

sample.flatter()

The desired output is as follows:

region                organicFollowerCount   paidFollowerCount
urn:li:region:9194    157                    0
urn:li:region:7127    17                     0

Can anyone help me achieve this, please?

Answer 1

Here is an approach that uses pd.json_normalize:

import pandas as pd

# note that `sample data` has been modified into a list of dictionaries
sample_data = [
    {'region': 'urn:li:region:9194', 
     'followerCounts': {'organicFollowerCount': 157, 'paidFollowerCount': 0}}, 
    {'region': 'urn:li:region:7127', 
     'followerCounts': {'organicFollowerCount': 17, 'paidFollowerCount': 0}}
]

Now, convert each item in the list to a data frame:

dfs = list()

# convert one dict at a time into a data frame, using json_normalize()
for sd in sample_data:
    t = pd.json_normalize(sd)
    dfs.append(t)

# convert list of dataframes into a single data frame, 
#   and change column labels
t = pd.concat(dfs).rename(columns={
    'followerCounts.organicFollowerCount': 'organicFollowerCount',
    'followerCounts.paidFollowerCount': 'paidFollowerCount'
}).set_index('region')

print(t)


                    organicFollowerCount  paidFollowerCount
region                                                     
urn:li:region:9194                   157                  0
urn:li:region:7127                    17                  0

As @thehumaneraser noted, this format is not ideal, but we can't always influence the format of the data we receive.
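As a shorter variant of the loop above, pd.json_normalize also accepts a list of dictionaries, so the per-item loop and pd.concat can be skipped. The sketch below assumes the data still has the nested shape from the question (a list wrapping a list of dicts); if it really arrives as an object-dtype ndarray, calling .tolist() on it first should produce the same nested list, though that is an assumption about the asker's data rather than something shown in the question.

```python
import pandas as pd

# Same nested shape as in the question: a list wrapping a list of dicts.
sample_data = [[
    {'region': 'urn:li:region:9194',
     'followerCounts': {'organicFollowerCount': 157, 'paidFollowerCount': 0}},
    {'region': 'urn:li:region:7127',
     'followerCounts': {'organicFollowerCount': 17, 'paidFollowerCount': 0}},
]]

# Unwrap the outer list, then normalize all records in one call.
records = sample_data[0]
df = pd.json_normalize(records)

# Drop the 'followerCounts.' prefix that json_normalize adds to nested keys.
df.columns = [name.split('.')[-1] for name in df.columns]

print(df.to_string(index=False))
```

Stripping the prefix with split('.') avoids spelling out every column in a rename mapping, at the cost of possible name clashes if two nested groups share a key.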

Answer 2

You are not going to be able to flatten this data the way you want with NumPy's flatten method. That method simply takes a multi-dimensional ndarray and collapses it to one dimension; it does not unpack nested dictionaries. See the NumPy documentation for ndarray.flatten for details.
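To make the point concrete, here is a small sketch of what flatten does and does not do. The arrays grid and records are made-up illustrations, not the asker's data.

```python
import numpy as np

# flatten() collapses the dimensions of a numeric array into one.
grid = np.array([[1, 2, 3], [4, 5, 6]])
flat = grid.flatten()
print(grid.shape, '->', flat.shape)

# On an object array holding dicts, flatten() only changes the array shape.
# The dictionaries inside are left exactly as they were, still nested.
records = np.array([[{'a': {'b': 1}}, {'a': {'b': 2}}]], dtype=object)
flat_records = records.flatten()
print(records.shape, '->', flat_records.shape)
print(flat_records[0])
```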

A couple of other things. First of all, your sample data above is not an ndarray; it is just a Python list. And since you call list() inside square brackets, it is actually a nested list of dictionaries. This is really not a good way to store this information, and with such a convoluted format you leave yourself very few options for nicely "flattening" it into the table you desire.

If you have many rows like this, I would do the following:

import numpy as np

headers = ["region", "organicFollowerCount", "paidFollowerCount"]
data = [headers]
for row in sample_data[0]:  # Subindexing here because it is unwisely a nested list
    formatted_row = []
    formatted_row.append(row["region"])
    formatted_row.append(row["followerCounts"]["organicFollowerCount"])
    formatted_row.append(row["followerCounts"]["paidFollowerCount"])
    data.append(formatted_row)
data = np.array(data)

This gives you an ndarray laid out like the table you want, but it is still an ugly solution. Really, this is a highly impractical presentation of data and you should ditch it for a better one.
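One caveat worth illustrating: because the header row and the region column are strings, NumPy stores every cell of such a mixed array as a string, so the counts can no longer be used for arithmetic directly. The sketch below shows this and one possible workaround, keeping the regions and the counts in separate arrays. The names table, regions, and counts are illustrative only.

```python
import numpy as np

data = [
    ["region", "organicFollowerCount", "paidFollowerCount"],
    ["urn:li:region:9194", 157, 0],
    ["urn:li:region:7127", 17, 0],
]

# Mixing strings and integers yields a Unicode string array.
table = np.array(data)
print(table.dtype.kind)   # 'U' means Unicode string
print(repr(table[1, 1]))  # the count is now a string

# Possible workaround: keep labels and numbers in separate arrays.
regions = np.array([row[0] for row in data[1:]])
counts = np.array([row[1:] for row in data[1:]], dtype=np.int64)
print(counts.sum(axis=0))  # column totals work on the integer array
```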

One last thing: don't use camel case. That is standard practice for some languages like Java, but not for Python. Instead of organicFollowerCount, use organic_follower_count, and so on.
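If the camel-case keys come from an external source and cannot be changed there, they can be renamed while flattening. The following is a minimal sketch using only the standard library; to_snake_case is a hypothetical helper written for this example, not a function from NumPy or pandas.

```python
import re

def to_snake_case(name):
    """Insert '_' before each capital that follows a lowercase letter or digit."""
    return re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', name).lower()

records = [
    {'region': 'urn:li:region:9194',
     'followerCounts': {'organicFollowerCount': 157, 'paidFollowerCount': 0}},
    {'region': 'urn:li:region:7127',
     'followerCounts': {'organicFollowerCount': 17, 'paidFollowerCount': 0}},
]

# Build one flat dict per record, renaming the nested keys on the way.
flat_rows = [
    {'region': row['region'],
     **{to_snake_case(key): value
        for key, value in row['followerCounts'].items()}}
    for row in records
]

for flat_row in flat_rows:
    print(flat_row)
```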
