Get index of the column with max value in every row without (...,) in pandas (Python)

I worked on this in Jupyter.
I'd like to know if there's a way to find the location (column index) of the max value in every row of the table.
For example, it looks like this:

import pandas as pd

yo1 = [1,3,7]
yo2 = [2,4,5,6,8]
yo3 = [0.1,0.3,0.7]
yo4 = [0.2,0.4,0.5,0.6,0.8]

yoo = []
for x in yo3:
    vvv = []
    for y in yo4:
        dot = x*y
        na = x+x
        nb = y+y
        prod = dot/(na+nb)
        vvv.append(prod)
    yoo.append(vvv)
yooo = pd.DataFrame(yoo, columns=(yo2), index=[yo1])
print(yooo)

(yes, it is cosine similarity)

output:
      2         4         5         6         8
1  0.033333  0.040000  0.041667  0.042857  0.044444
3  0.060000  0.085714  0.093750  0.100000  0.109091
7  0.077778  0.127273  0.145833  0.161538  0.186667

Then, I want to get the index of the column with the max value in every row. I used this:

go = yooo.idxmax().reset_index()
go.columns=['column', 'get']
go

output:
    column  get
0   2       (7,)
1   4       (7,)
2   5       (7,)
3   6       (7,)
4   8       (7,)

but my desired output is:

output:
    column  get
0   2       7
1   4       7
2   5       7
3   6       7
4   8       7

I've tried replacing the '(' with an empty string:

go['get']=go['get'].str.replace('(','')

and used lstrip-rstrip:

go['get']=go['get'].map(lambda x: x.lstrip('(').rstrip(',)'))

and also this one:

top_n=1
get = pd.DataFrame({n: yooo[col].nlargest(top_n).index.tolist() for n, col in enumerate(yooo)}).T

None of them worked well :(

Help me.. How do I solve this, and could you explain it to me? Thank you!

Your real problem is in the DataFrame constructor for 'yooo': by wrapping the list yo1 in [] you pass a 2D list (a list of lists) as the index, which creates a pd.MultiIndex, hence the tuples (7,). That is also why the string methods did not help: the values in 'get' are tuples, not strings. Use this instead:

 yooo = pd.DataFrame(yoo, columns=(yo2), index=yo1)

 yooo.idxmax()

Output:

2    7
4    7
5    7
6    7
8    7
dtype: int64
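
To see the cause more directly, here is a minimal sketch that builds the same kind of frame both ways and compares the index types. The `data` table and the names `wrapped`, `plain`, and `unwrapped` are made up for illustration and are not from the question. The last step shows one possible way to unwrap the 1-tuples if the frame with the MultiIndex already exists, assuming every value is a 1-tuple.

```python
import pandas as pd

yo1 = [1, 3, 7]
yo2 = [2, 4, 5, 6, 8]

# Illustrative stand-in values (hypothetical); the column maxima all sit in the last row.
data = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.2, 0.3, 0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5, 0.6, 0.7],
]

# index=[yo1] is a list containing one list, so pandas builds a one-level MultiIndex.
wrapped = pd.DataFrame(data, columns=yo2, index=[yo1])
# index=yo1 is a flat list, so pandas builds an ordinary Index.
plain = pd.DataFrame(data, columns=yo2, index=yo1)

print(isinstance(wrapped.index, pd.MultiIndex))  # True
print(isinstance(plain.index, pd.MultiIndex))    # False

wrapped_result = wrapped.idxmax()  # labels come back as 1-tuples such as (7,)
plain_result = plain.idxmax()      # labels come back as plain integers

# If the wrapped frame cannot be rebuilt, take the first element of each tuple.
unwrapped = wrapped_result.map(lambda label: label[0])
print(unwrapped)
```

Fixing the constructor is usually the cleaner option, since every later lookup on `wrapped` would otherwise also need tuple labels.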

And further, to get a dataframe with column names:

yooo.idxmax().rename_axis('column').rename('get').reset_index()

Output:

   column  get
0       2    7
1       4    7
2       5    7
3       6    7
4       8    7
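
Note that `idxmax()` with its default `axis=0` returns, for each column, the row label holding that column's maximum, which matches the desired output above. If what you need is literally the column with the max value in every row, as the title says, `axis=1` may be what you want. The sketch below uses a small made-up `table` (not the question's data) so that the two directions give different answers.

```python
import pandas as pd

# Hypothetical table chosen so the row-wise and column-wise maxima differ.
table = pd.DataFrame(
    [[0.1, 0.9, 0.3],
     [0.8, 0.2, 0.4],
     [0.5, 0.6, 0.7]],
    index=[1, 3, 7],
    columns=[2, 4, 5],
)

# axis=0 (default): for each column, the row label of its maximum.
per_column = table.idxmax()
# axis=1: for each row, the column label of its maximum.
per_row = table.idxmax(axis=1)
# Positional column index (0-based) of each row's maximum, via NumPy.
per_row_position = table.to_numpy().argmax(axis=1)

print(per_column.tolist())        # [3, 1, 7]
print(per_row.tolist())           # [4, 2, 5]
print(per_row_position.tolist())  # [1, 0, 2]
```

`idxmax` returns labels, while `argmax` on the underlying array returns positions; which one counts as "the index" depends on how the result is used afterwards.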
