Convert a column in a data frame that has values as a list of lists as numeric

I have a data frame that looks like this:

    date         B  C   D   E
0   04/06/2019  258 994 761 [1, 46, 36, 7457, 456]
1   05/06/2019  748 181 565 [3453, 45]
2   07/06/2019  185 876 107 [4976, 46, 57, 7, 3]
3   08/06/2019  241 386 728 [4, 6457, 4]
4   09/06/2019  516 579 596 [65]

I would like to convert df['E'] to a numeric data type. My goal is to plot the maximum value of E and the average value over time.

I already tried to convert it using:

df['E'].infer_objects()
df['E'].astype(np.int16)

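But neither attempt worked.

Try this. Note that the result has to be assigned back, since apply does not modify the column in place:

import numpy as np
df['E'] = df['E'].apply(lambda x: np.array(x, dtype=np.int32))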

For the max and mean, apply to the existing column E and store the results in new columns:

df['E_max'] = df['E'].apply(lambda x: x.max())
df['E_mean'] = df['E'].apply(lambda x: x.mean())
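A self-contained sketch of this apply-based approach, using only the date and E columns from the question, might look like the following. The frame is rebuilt by hand here, and the intermediate name `arrays` is an illustrative choice, not part of the original answer. Note that the column of arrays is still object dtype as far as pandas is concerned; only the derived columns are numeric.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df = pd.DataFrame({
    "date": ["04/06/2019", "05/06/2019", "07/06/2019", "08/06/2019", "09/06/2019"],
    "E": [[1, 46, 36, 7457, 456], [3453, 45], [4976, 46, 57, 7, 3], [4, 6457, 4], [65]],
})

# Each cell becomes a NumPy array; the Series itself remains object dtype.
arrays = df["E"].apply(lambda x: np.array(x, dtype=np.int32))

# Reduce each array to a scalar and store the results in new numeric columns.
df["E_max"] = arrays.apply(lambda a: a.max())
df["E_mean"] = arrays.apply(lambda a: a.mean())

print(df[["date", "E_max", "E_mean"]])
```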

infer_objects isn't really for what you think it's for. From the docs:

Attempts soft conversion of object-dtyped columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.

This only converts an object column when its values can be represented as numeric or as some other dtype recognized by pandas. A column of Python lists cannot, so it is left as object.
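A small check illustrates this behaviour. The frame `demo` and its column names are hypothetical, made up for this sketch: one object column holds plain integers, the other holds lists.

```python
import pandas as pd

# Hypothetical frame: "ints" is forced to object dtype, "lists" is object by nature.
demo = pd.DataFrame({
    "ints": pd.Series([1, 2, 3], dtype=object),
    "lists": pd.Series([[1, 46], [3453, 45], [65]]),
})

converted = demo.infer_objects()

# The integer column is soft-converted; the list column is left unchanged.
print(converted.dtypes)
```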


If you want E to be truly numeric in the eyes of pandas, you need to expand each entry of each list into its own column, so that you are storing actual numeric types and not Python objects. Rows with shorter lists are padded with NaN.

E = pd.DataFrame(df.E.to_numpy().tolist())

      0       1     2       3      4
0     1    46.0  36.0  7457.0  456.0
1  3453    45.0   NaN     NaN    NaN
2  4976    46.0  57.0     7.0    3.0
3     4  6457.0   4.0     NaN    NaN
4    65     NaN   NaN     NaN    NaN

Now that you have this reference frame, you can compute mean and max directly on it. Vectorized methods will be much faster than an approach using apply.

df.assign(**E.agg(['mean', 'max'], axis=1))

         date    B    C    D                       E    mean     max
0  04/06/2019  258  994  761  [1, 46, 36, 7457, 456]  1599.2  7457.0
1  05/06/2019  748  181  565              [3453, 45]  1749.0  3453.0
2  07/06/2019  185  876  107    [4976, 46, 57, 7, 3]  1017.8  4976.0
3  08/06/2019  241  386  728            [4, 6457, 4]  2155.0  6457.0
4  09/06/2019  516  579  596                    [65]    65.0    65.0
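Since the stated goal is to plot both values over time, the expansion approach can be sketched end to end as below. This is only one way to arrange it: the column names `E_mean` and `E_max`, the names `stats` and `series`, and the day/month/year date format are assumptions made for this sketch, not something the question confirms.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["04/06/2019", "05/06/2019", "07/06/2019", "08/06/2019", "09/06/2019"],
    "B": [258, 748, 185, 241, 516],
    "C": [994, 181, 876, 386, 579],
    "D": [761, 565, 107, 728, 596],
    "E": [[1, 46, 36, 7457, 456], [3453, 45], [4976, 46, 57, 7, 3], [4, 6457, 4], [65]],
})

# Expand the ragged lists into a rectangular numeric frame (short rows get NaN).
E = pd.DataFrame(df["E"].tolist(), index=df.index)

# Row-wise reductions; NaN padding is skipped by default.
stats = df.assign(E_mean=E.mean(axis=1), E_max=E.max(axis=1))

# Assumption: the dates are written as day/month/year.
stats["date"] = pd.to_datetime(stats["date"], format="%d/%m/%Y")
series = stats.set_index("date")[["E_mean", "E_max"]]

print(series)
```

With matplotlib installed, calling `series.plot()` should then draw both lines against the date index.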
