
Concatenate two lists with respect to pandas dataframe index

I have two lists containing predictions made on two clusters of the same file. The clusters don't occur sequentially, so I took the index values of each cluster and created two separate lists. I apply the cluster-specific trained model to each list. After prediction, however, I am unable to merge the lists back into the original order.

df_A = df_A.loc[running_index_A.values]
df_B = df_B.loc[running_index_B.values]
pred_cluster_A = modelA.predict(df_A)
pred_cluster_B = modelB.predict(df_B)

Now both the predictions should be arranged with respect to the running indexes A and B.

You may want to use the zip() function for that:

gatherList = list(zip(pred_cluster_A, pred_cluster_B))
## returns something like: [(clustA_val1, clustB_val1), (clustA_val2, clustB_val2)]

Then you can load it into a pandas DataFrame:

df = pd.DataFrame(gatherList)
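To make the behaviour of zip() concrete, here is a small sketch with made-up values (the two lists below are hypothetical, not the asker's data). Note that zip() pairs elements by position and stops at the shorter input, so on its own it does not take the running indexes into account.

```python
import pandas as pd

# Hypothetical predictions for two clusters of different sizes
pred_cluster_A = [0.1, 0.2, 0.3]
pred_cluster_B = [0.9, 0.8]

# zip() pairs elements by position and stops at the shorter input,
# so the third value of pred_cluster_A is dropped here
gather_list = list(zip(pred_cluster_A, pred_cluster_B))

# One row per pair, one column per cluster
df = pd.DataFrame(gather_list, columns=["pred_A", "pred_B"])
```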

You could join the lists of indexes and the lists of predictions:

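# Convert to lists first: "+" on NumPy arrays adds element-wise instead of concatenating
index_sum = list(running_index_A.values) + list(running_index_B.values)
pred_sum = list(pred_cluster_A) + list(pred_cluster_B)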

and then link them in a dict

index_to_pred = dict(zip(index_sum, pred_sum))

and then sort the dict by key (i.e. by index):

index_to_pred_sorted = sorted(index_to_pred.items(), key=lambda kv: kv[0])  # returns a list of (index, prediction) tuples
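The steps of this approach can be put together as a minimal, self-contained sketch. The arrays below are hypothetical stand-ins for running_index_A.values, running_index_B.values and the outputs of modelA.predict / modelB.predict, and it assumes every index value is unique across the two clusters.

```python
import numpy as np

# Hypothetical stand-ins for running_index_A.values / running_index_B.values
index_A = np.array([0, 2, 5])
index_B = np.array([1, 3, 4])

# Hypothetical stand-ins for modelA.predict(df_A) / modelB.predict(df_B)
pred_cluster_A = np.array([10, 12, 15])
pred_cluster_B = np.array([21, 23, 24])

# Concatenate (rather than add) the indexes and the predictions
index_sum = np.concatenate([index_A, index_B])
pred_sum = np.concatenate([pred_cluster_A, pred_cluster_B])

# Map each original row index to its prediction, then sort by index
index_to_pred = dict(zip(index_sum.tolist(), pred_sum.tolist()))
index_to_pred_sorted = sorted(index_to_pred.items(), key=lambda kv: kv[0])

# Keep only the predictions, now in the original row order
preds_in_order = [pred for _, pred in index_to_pred_sorted]
```

If the same index value could appear in both clusters, the dict would silently keep only the last prediction for it, so this sketch only fits the non-overlapping case described in the question.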

If you use Series, pd.concat with ignore_index=True joins them end to end and renumbers the result sequentially:

import pandas as pd

s1 = pd.Series(['a', 'b'])
s2 = pd.Series(['c', 'd'])
result = pd.concat([s1, s2], ignore_index=True)

print(result)
# 0    a
# 1    b
# 2    c
# 3    d
# dtype: object
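To restore the original row order rather than just append, one possible variation is to attach each cluster's running index to its predictions and sort on that index after concatenating. This is a sketch under assumptions: the index values and predictions below are made up, and the names s_A, s_B and merged are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for running_index_A / running_index_B
index_A = pd.Index([0, 2, 5])
index_B = pd.Index([1, 3, 4])

# Hypothetical stand-ins for the two models' predictions
pred_A = np.array(["a", "b", "c"])
pred_B = np.array(["x", "y", "z"])

# Label each prediction with the row it came from
s_A = pd.Series(pred_A, index=index_A)
s_B = pd.Series(pred_B, index=index_B)

# Keep the original labels (no ignore_index) and sort by them
merged = pd.concat([s_A, s_B]).sort_index()
```

Here merged is indexed by the original row labels, so it can be assigned back to the source DataFrame as a new column if the labels match.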
