
Separating a 2D array based on the first item in each subarray

I have a long 2D numpy array, like so:

[['State1' 15]
 ['State1' 19]
 ['State1' 26]
 ['State2' 3]
 ['State1' 9]
 ...
 ['State2' 3]]

where only two states are possible as the first element. I want to separate this 2D array into two arrays, one for each state, containing only the numeric values (I need this for a boxplot), but I am not sure how to do the split. I tried a list comprehension, but it returns a long list of True and False values rather than the values themselves:

st1 = [state[0] == "State1" for state in joined] #joined is the array shown above

How could I do this, ideally in a more concise way?

Edit:

My problem with filter() is that it returns the whole rows, and I don't know how to make it return only the second entry:

normal = list(filter(lambda x: x[0] == "State1", joined))

[array(['State1', '14.4659'], dtype='<U9'), array(['State1', '20.8356'], dtype='<U9'), array(['State1', '5.3358'], dtype='<U9'), array(['State1', '1.9017'],...]

Here's one way using a defaultdict:

from collections import defaultdict

my_list = [['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3], 
           ['State2', 9], ['State2', 3]]

d = defaultdict(list)
for l in my_list:
    d[l[0]].append(l)

print(list(d.values()))

[[['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3]],
 [['State2', 9], ['State2', 3]]]
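
Since the question asks for only the numeric values of each state, the same defaultdict idea can be adapted to append the second element of each row instead of the whole row. A minimal sketch, reusing the sample list from this answer (the name values_by_state is just illustrative):

```python
from collections import defaultdict

# Same sample data as in the answer above
my_list = [['State1', 15], ['State1', 19], ['State1', 26], ['State1', 3],
           ['State2', 9], ['State2', 3]]

values_by_state = defaultdict(list)
for state, value in my_list:
    # Keep only the numeric part of each row
    values_by_state[state].append(value)

print(dict(values_by_state))
```

The lists in values_by_state['State1'] and values_by_state['State2'] could then be passed to a boxplot call.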

I actually found another way to do it using numpy, as follows:

import numpy as np

index = np.where(joined[:, 0] == "State1")[0]  # row indices whose first column is "State1"
normals = joined[index][:, 1]  # second column of those rows
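
A boolean mask can do the same selection without np.where. The sketch below assumes joined is a string array like the one printed in the question's edit (dtype '<U9'), so the selected values are converted with astype(float) before plotting. The sample data here is made up for illustration.

```python
import numpy as np

# Hypothetical sample shaped like the string array in the question
joined = np.array([['State1', '14.4659'], ['State2', '3.0'],
                   ['State1', '20.8356'], ['State2', '9.5']])

mask = joined[:, 0] == 'State1'           # True for rows whose first column is State1
state1 = joined[mask, 1].astype(float)    # numeric values for State1
state2 = joined[~mask, 1].astype(float)   # remaining rows, i.e. State2

print(state1)
print(state2)
```

Negating the mask with ~ only works as a State2 selector because exactly two states are possible, as stated in the question.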

It is not clear to me whether you are working with arrays or lists, but assuming you want to split the values out of a list of lists, plain Python is enough: iterate over the rows and read each position. This solution needs no imports.

This assumes your data contains no states other than "State1" and "State2".

#Creating your list
l = [['State1', 11], ['State2', 13], ['State1', 2], ['State2', 5], ['State1', 7], ['State1', 3]]

Solution with Python

#Creating empty lists for saving the separated values
states = []
values = []
#Reading the state and the value from each row of your data
for stat, val in l:
    states.append(stat)
    values.append(val)

#Showing the data contained in each list: states & values
print(states)
print(values)

Output

#First list
['State1', 'State2', 'State1', 'State2', 'State1', 'State1']
#Second list
[11, 13, 2, 5, 7, 3]
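
The same split into states and values can also be sketched with zip, which transposes the rows into columns. This is an alternative to the loop above, not a change to it:

```python
l = [['State1', 11], ['State2', 13], ['State1', 2], ['State2', 5], ['State1', 7], ['State1', 3]]

# zip(*l) transposes the rows into two tuples: all states, then all values
states, values = (list(column) for column in zip(*l))

print(states)
print(values)
```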

If you want to filter the list for one of the states using True/False values, as you describe above, try this:

from itertools import compress
bool_list = [row[0] == 'State1' for row in l]
st1=list(compress(l, bool_list))
print(st1)

Output

#Filtered data by State1
[['State1', 11], ['State1', 2], ['State1', 7], ['State1', 3]]
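
The comprehension in the question returned booleans because the comparison was used as the output expression. Moving the comparison into an if clause and putting the value in the output position gives the numeric values per state directly. A minimal sketch on the same list l (the names st1_values and st2_values are illustrative):

```python
l = [['State1', 11], ['State2', 13], ['State1', 2], ['State2', 5], ['State1', 7], ['State1', 3]]

# The condition goes after `if`; the leading expression decides what is kept
st1_values = [value for state, value in l if state == 'State1']
st2_values = [value for state, value in l if state == 'State2']

print(st1_values)
print(st2_values)
```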


 