How can I convert a list to a numpy array for filtering elements?

I have a list of floats and I would like to convert it to a numpy array so that I can use numpy.where() to get the indices of the elements that are greater than 0.0 (not zero).

I tried this, but with no luck:

import numpy as np

arr = np.asarray(enumerate(grade_list))
g_indices = np.where(arr[1] > 0)[0]

Edit:

Is dtype=float needed?

You don't need the enumerate():

arr = np.asarray(grade_list)
g_indices = np.where(arr > 0)[0]
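For what it's worth, here is a small sketch of why the original attempt fails. The sample values in grade_list are made up for illustration, since the question does not show the real data.

```python
import numpy as np

# Hypothetical sample data; the real grade_list is not shown in the question.
grade_list = [0.0, 1.5, -2.0, 3.0]

# enumerate() returns an iterator, not a sequence, so numpy wraps the whole
# object in a 0-dimensional array of dtype=object instead of unpacking it.
wrapped = np.asarray(enumerate(grade_list))
print(wrapped.shape, wrapped.dtype)  # () object

# Converting the list itself gives a proper 1-D float array.
arr = np.asarray(grade_list)
g_indices = np.where(arr > 0)[0]
print(g_indices)  # [1 3]
```

Indexing the 0-dimensional array with arr[1] is what raises the error in the question.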

You are over-complicating it:

import numpy as np

grade_list_as_array = np.array(grade_list)
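Regarding the dtype=float question in the edit: as far as I can tell it is not needed here, because numpy infers the dtype from the list elements and the comparison works for integer and float arrays alike. A minimal sketch with made-up values (the variable names are mine):

```python
import numpy as np

# Made-up sample lists, not the asker's data.
float_grades = np.array([0.0, 1.5, -2.0, 3.0])
int_grades = np.array([0, 1, -2, 3])

print(float_grades.dtype)                           # float64
print(np.issubdtype(int_grades.dtype, np.integer))  # True

# Passing dtype=float explicitly only matters if you want to force a
# conversion, for example to turn an integer list into floats.
forced = np.array([0, 1, -2, 3], dtype=float)
print(forced.dtype)  # float64

# The comparison gives the same indices in every case.
print(np.where(float_grades > 0)[0])  # [1 3]
print(np.where(int_grades > 0)[0])    # [1 3]
print(np.where(forced > 0)[0])        # [1 3]
```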

You don't need numpy arrays to filter lists.

List comprehensions

List comprehensions are a really powerful tool for writing short, readable code:

grade_list = [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 0, -1, 6, 3]
indices = [index for index, grade in enumerate(grade_list) if grade > 0.0]
print(indices)

gives [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13]. This is a standard Python list. It can be converted to a numpy array afterwards, if necessary.
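A short sketch of that conversion, reusing the list from above. The names indices_np and positive_grades are only illustrative:

```python
import numpy as np

grade_list = [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 0, -1, 6, 3]
indices = [index for index, grade in enumerate(grade_list) if grade > 0.0]

# Convert the plain list of indices to a numpy array.
indices_np = np.array(indices)

# The same pattern can collect the values instead of the positions.
positive_grades = [grade for grade in grade_list if grade > 0.0]
print(positive_grades)  # [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 6, 3]
```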

Numpy

If you really want to use numpy.where, you should skip the enumerate:

import numpy
grade_list = [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 0, -1, 6, 3]
grade_list_np = numpy.array(grade_list)
indices = numpy.where(grade_list_np > 0.0)[0]
print(indices)

gives [ 0  1  2  3  4  5  6  7  8  9 12 13].
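If it helps, the comparison can also be kept as a boolean mask, which gives both the indices and the values. This is only a sketch of an alternative spelling; numpy.flatnonzero is a different function from the one used above, although it returns the same indices for a 1-D array:

```python
import numpy

grade_list = [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 0, -1, 6, 3]
grade_list_np = numpy.array(grade_list)

mask = grade_list_np > 0.0         # boolean array, True where the grade is positive
indices = numpy.flatnonzero(mask)  # same result as numpy.where(mask)[0]
values = grade_list_np[mask]       # the positive grades themselves

print(indices)
print(values)
```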

Performance comparison

If you only need this for a small list (e.g. fewer than 100 elements), the list comprehension is the fastest way to do it. For a list of length 1000, numpy's where is slightly faster than the plain list comprehension and roughly twice as fast as using a list comprehension first and then converting the result to a numpy array (total seconds for 100000 calls each):

numpy.where (|L| = 1000): 13.5045940876
list_comprehension_np (|L| = 1000): 27.2982738018
list_comprehension (|L| = 1000): 15.2280910015

These results were created with the following script:

#! /usr/bin/env python
# -*- coding: utf-8 -*-

import random
import timeit
import numpy


def filtered_list_comprehension(grade_list):
    return [index for index, grade in enumerate(grade_list) if grade > 0.3]


def filtered_list_comprehension_np(grade_list):
    return numpy.array([index for index, grade in enumerate(grade_list)
                        if grade > 0.3])


def filtered_numpy(grade_list):
    grade_list_np = numpy.array(grade_list)
    return numpy.where(grade_list_np > 0.3)[0]

list_elements = 1000
grade_list = [random.random() for i in range(list_elements)]

res = timeit.timeit('filtered_numpy(grade_list)',
                    number=100000,
                    setup="from __main__ import grade_list, filtered_numpy")
print("numpy.where (|L| = %i): %s" % (list_elements, str(res)))
res = timeit.timeit('filtered_list_comprehension_np(grade_list)',
                    number=100000,
                    setup="from __main__ import grade_list, filtered_list_comprehension_np")
print("list_comprehension_np (|L| = %i): %s" % (list_elements, str(res)))
res = timeit.timeit('filtered_list_comprehension(grade_list)',
                    number=100000,
                    setup="from __main__ import grade_list, filtered_list_comprehension")
print("list_comprehension (|L| = %i): %s" % (list_elements, str(res)))
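Before comparing timings it may be worth checking that the three variants return the same indices. A small sketch with condensed versions of the three functions from the script above, so that it runs on its own; the seed and the list size are arbitrary choices of mine:

```python
import random
import numpy


def filtered_list_comprehension(grade_list):
    return [index for index, grade in enumerate(grade_list) if grade > 0.3]


def filtered_list_comprehension_np(grade_list):
    return numpy.array(filtered_list_comprehension(grade_list))


def filtered_numpy(grade_list):
    return numpy.where(numpy.array(grade_list) > 0.3)[0]


random.seed(0)  # arbitrary seed, only for reproducibility
grade_list = [random.random() for _ in range(1000)]

expected = filtered_list_comprehension(grade_list)
same = (expected == filtered_list_comprehension_np(grade_list).tolist()
        and expected == filtered_numpy(grade_list).tolist())
print(same)  # True
```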

The enumerate is superfluous. If you truly have a list of floats, this will work:

import numpy as np 
arr = np.array(grade_list)
g_indices = np.where(arr > 0)[0]

Since 0.0 evaluates to False in a boolean context, technically you can leave off the > 0 too, but only if the list contains no negative values: np.where(arr) returns the indices of all non-zero elements, negative ones included.
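A quick check of that shortcut, using a hypothetical list that contains both a zero and a negative value:

```python
import numpy as np

# Hypothetical list with a zero and a negative value.
arr = np.array([1.5, 0.0, -2.0, 3.0])

positive = np.where(arr > 0)[0]
non_zero = np.where(arr)[0]

print(positive)  # [0 3]    strictly positive elements
print(non_zero)  # [0 2 3]  every non-zero element, including -2.0
```

So the two forms only agree when nothing in the list is below zero.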

But if you have a nested list, or a list of tuples, the code above won't work as-is. We may need to know more about your list.
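To illustrate the nested case with made-up data: a list of lists becomes a 2-D array, and np.where then returns one index array per dimension, so taking [0] only gives the row numbers.

```python
import numpy as np

# Made-up nested list; the asker's actual data is unknown.
nested = [[0.0, 1.5], [2.0, -1.0]]
arr = np.array(nested)

rows, cols = np.where(arr > 0)
print(rows)  # [0 1]
print(cols)  # [1 0]

# Pair them up to get (row, column) positions.
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(0, 1), (1, 0)]
```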

Try converting the enumerate to a list first.

I did:

np.asarray(list(enumerate([1, 2, 3])))
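If you go this route, the result is a 2-D array of (index, value) pairs, so the filter has to pick the right column. A rough sketch with a small made-up list:

```python
import numpy as np

pairs = np.asarray(list(enumerate([1, 0, 3])))
print(pairs.shape)  # (3, 2)

# Column 0 holds the indices, column 1 the values.
g_indices = pairs[pairs[:, 1] > 0, 0]
print(g_indices)  # [0 2]

# Note: with float grades the index column is upcast to float as well,
# which is one more reason to skip enumerate altogether.
```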

You want to use np.array, not np.asarray, and you don't need enumerate:

import numpy as np

grade_list = [0, 1, 2, 3, 2, 1, 2, 3, 1, 0, 2, 4]
arr = np.array(grade_list)

g_indices = np.where(arr > 0)[0]

print(g_indices)

>>> [ 1  2  3  4  5  6  7  8 10 11]
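As far as I know, the two functions only behave differently when the input is already an array, so for a plain list either one should do. A small sketch of the difference (the variable existing is just an illustrative name):

```python
import numpy as np

grade_list = [0, 1, 2, 3]
existing = np.array(grade_list)

# For a plain list both functions build a new array with the same contents.
print(np.array_equal(np.array(grade_list), np.asarray(grade_list)))  # True

# For an existing ndarray, np.asarray returns the same object,
# while np.array makes a copy by default.
print(np.asarray(existing) is existing)  # True
print(np.array(existing) is existing)    # False
```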

