
Separating a 2D array based on the first item in each subarray

I have a long 2D numpy array, like so:

[['State1' 15]
 ['State1' 19]
 ['State1' 26]
 ['State2' 3]
 ['State1' 9]
 ...
 ['State2' 3]]

where only two states are possible as the first element. I want to separate this 2D array into two different arrays, one for each state, with only the numeric information in each (I need this for a boxplot), but I am not sure how to separate them. I've tried a list comprehension, but it returns a long list of True and False values rather than the values themselves:

st1 = [state[0] == "State1" for state in joined] #joined is the array shown above

How could I do this, ideally in a more concise way?

Edit:

My problem with filter() is that it returns the whole arrays, and I don't know how to make it return only the second entry:

normal = list(filter(lambda x: x[0] == "State1", joined))

[array(['State1', '14.4659'], dtype='<U9'), array(['State1', '20.8356'], dtype='<U9'), array(['State1', '5.3358'], dtype='<U9'), array(['State1', '1.9017'],...]

Here's one way using a defaultdict:

from collections import defaultdict

my_list = [['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3], 
           ['State2', 9], ['State2', 3]]

d = defaultdict(list)
for l in my_list:
    d[l[0]].append(l)

print(list(d.values()))

[[['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3]],
 [['State2', 9], ['State2', 3]]]
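Since the question asks for only the numeric values of each state, one possible variation on the loop above is to append the second element instead of the whole pair. This is only a sketch and not part of the original answer; the name `values_by_state` is made up for illustration.

```python
from collections import defaultdict

my_list = [['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3],
           ['State2', 9], ['State2', 3]]

# Map each state name to the list of its numeric values only.
values_by_state = defaultdict(list)
for state, value in my_list:
    values_by_state[state].append(value)

print(dict(values_by_state))
# {'State1': [15, 19, 26, 3], 'State2': [9, 3]}
```

The two lists, `values_by_state['State1']` and `values_by_state['State2']`, hold one group of numbers each, which is the shape the question wants for a boxplot.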

I actually found another way to do it using numpy, as follows:

import numpy as np

index = np.where(joined[:, 0] == "State1")[0]  # row indices where the first column is "State1"
normals = joined[index][:, 1]  # second column of those rows (still strings here)
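The same selection can probably be written more compactly with a boolean mask, skipping np.where altogether. The sketch below assumes `joined` is a string array like the one printed in the question (dtype `<U9`), so the values are converted with `astype(float)`. The small sample array and the name `others` are made up for illustration.

```python
import numpy as np

# Small stand-in for the `joined` array from the question (all entries are strings).
joined = np.array([['State1', '15'], ['State1', '19'], ['State2', '3'], ['State1', '9']])

mask = joined[:, 0] == "State1"          # boolean array, one entry per row
normals = joined[mask, 1].astype(float)  # second column of the matching rows, as floats
others = joined[~mask, 1].astype(float)  # the remaining rows, i.e. "State2" here

print(normals)
print(others)
```

Because only two states occur, inverting the mask with `~mask` gives the second group without a second comparison.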

I am not clear whether you are working with arrays or lists but, assuming that you want to split values from lists inside a list, you can keep it simple with plain Python: read the values by iterating over the position of each list. This solution does not need any imports:

Assuming that your data does not contain any values other than "State1" and "State2":

#Creating your list
l = [['State1', 11], ['State2', 13], ['State1', 2], ['State2', 5], ['State1', 7], ['State1', 3]]

Solution with Python

#Creating empty lists for saving separated values
states=[]
values=[]
#Reading states & values from your data
for x in range(len(l)):
    stat=l[x][0]
    val=l[x][1]
    states.append(stat)
    values.append(val)

#Showing the data contained in each list: states & values
print(states)
print(values)

Output

#First list
['State1', 'State2', 'State1', 'State2', 'State1', 'State1']
#Second list
[11, 13, 2, 5, 7, 3]

If you want to filter the list for one of the states using True/False values as you describe above, try this:

from itertools import compress
bool_list=[l[state][0]=='State1' for state in range(len(l))]
st1=list(compress(l, bool_list))
print(st1)

Output

#Filtered data by State1
[['State1', 11], ['State1', 2], ['State1', 7], ['State1', 3]]
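If only the numbers are needed for each state, as the question describes, a list comprehension with a condition may be enough on its own, with no boolean list in between. This is a minimal sketch using the same `l` as above; the names `st1_values` and `st2_values` are made up for illustration.

```python
l = [['State1', 11], ['State2', 13], ['State1', 2], ['State2', 5], ['State1', 7], ['State1', 3]]

# Keep only the second entry of each pair whose first entry matches the state.
st1_values = [value for state, value in l if state == 'State1']
st2_values = [value for state, value in l if state == 'State2']

print(st1_values)  # [11, 2, 7, 3]
print(st2_values)  # [13, 5]
```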
