Convert a pandas data frame filled with values in curly brackets to numpy array

I have a pandas DataFrame whose values are written in curly brackets. I want to convert it to a DataFrame with the same values, but with each curly-bracket value converted to a NumPy array. This is an example of one row of my DataFrame:

0, 5, '{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
   '{{{0., 0.}, {1., 0.}}, {{0.3333333333333333, 0.}, {0., 1.}}}',
   '{{{0., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
   '{0., 0.041666666666666664, 0., 0., 0.}', '{0., 0., 2., 1.}'

I want this row of the DataFrame to look like this:

0, 5, array([[[1., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
   array([[[0., 0.], [1., 0.]], [[0.3333333333333333, 0.], [0., 1.]]]),
   array([[[0., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
   array([0., 0.041666666666666664, 0., 0., 0.]), array([0., 0., 2., 1.])

Okay, I took the liberty of assuming those curly brackets in your original DataFrame are strings.

You can use a combination of a lambda expression and ast.literal_eval(x).

import ast
import numpy as np
import pandas as pd

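# df is your existing DataFrame. Every cell goes through the same steps:
# cast to str, swap braces for brackets, parse with ast.literal_eval,
# then wrap the result in an object-dtype array.
# Note: DataFrame.applymap is deprecated since pandas 2.1 in favor of DataFrame.map.
df = df.applymap(lambda x: np.array(ast.literal_eval(str(x).replace('{', '[').replace('}', ']')),
                                    dtype=object))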

This expression applies a function that first converts each value to a string. It then replaces '{' with '[' and '}' with ']', and finally uses ast.literal_eval to convert the string to a list. The np.array call is there if you really want a NumPy array, but it isn't necessary.
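As a rough sketch of the same idea, the variant below converts only the cells that are brace strings and leaves plain numbers such as 0 and 5 untouched, which is closer to the output the question asks for. The helper name braces_to_array and the column names are made up for illustration, and dtype=float is an assumption that every brace value holds only numbers.

```python
import ast

import numpy as np
import pandas as pd


def braces_to_array(value):
    """Convert a '{...}' string to a float ndarray; return other values unchanged."""
    if isinstance(value, str) and value.lstrip().startswith("{"):
        as_list_literal = value.replace("{", "[").replace("}", "]")
        return np.array(ast.literal_eval(as_list_literal), dtype=float)
    return value


# One row modeled on the question's sample (column names are hypothetical).
df = pd.DataFrame({
    "a": [0],
    "b": [5],
    "c": ["{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}"],
    "d": ["{0., 0.041666666666666664, 0., 0., 0.}"],
})

# Convert column by column with Series.map, so the code does not depend on
# DataFrame.applymap or DataFrame.map being available.
converted = df.copy()
for name in converted.columns:
    converted[name] = converted[name].map(braces_to_array)

print(converted.loc[0, "c"].shape)
print(converted.loc[0, "d"])
```

Checking for a leading brace keeps the integer columns as they are instead of wrapping them in zero-dimensional arrays.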

From another answer:

With ast.literal_eval you can safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, booleans, and None.
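A small illustration of that behavior, using only the standard library, may help. It shows a bracketed string being parsed into a list, a non-literal expression being rejected, and what happens if the braces are left in place: Python reads a flat brace literal as a set, which silently drops duplicate values.

```python
import ast

# A brace string from the question, rewritten with square brackets.
text = "{0., 0., 2., 1.}".replace("{", "[").replace("}", "]")
parsed = ast.literal_eval(text)
print(parsed)  # a plain Python list of floats

# Anything that is not a pure literal (here, a function call) is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)

# Without the bracket replacement, the flat brace string is parsed as a set,
# so the repeated 0. collapses into a single element.
as_set = ast.literal_eval("{0., 0., 2., 1.}")
print(len(as_set))
```

This is why both answers swap the braces for brackets before parsing.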

You can simply:

  1. Replace every { with [ and every } with ], then use Python's eval function to convert the string into a Python list.
  2. Create a np.array() from that list.

import numpy as np
import pandas as pd

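data = pd.Series(['0', '5', '{{1.,0.},{0.,0.},{0.,0.}}', '2', '{{4.5, 5}, {0.3, 0.6}}', '200'])
# Brace strings become arrays; every other value is cast to float.
# eval executes arbitrary code, so use it only on input you trust.
data = data.apply(lambda x: np.array(eval(str(x).replace('{', '[').replace('}', ']'))) if '{' in str(x) else float(x))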
print(data)
