Analyzing a DataFrame based on multiple conditions

Question

names   Class   Category    label
ram     A        Red        one
ravi    A        Red        two
gopal   B        Green      three
Sri     C        Red        four      

my_list1=["Category"]
my_list2=["Class"]

I need to get the combination counts between these two columns.

I am trying to get the combinations of some selected columns. my_list2 may even contain more than one column.

 I tried:
 df[my_list1].value_counts()

It works fine for a single column, but I want to count the combinations of the columns in my_list2 together with the column in my_list1.

My desired output is:

output_df,
 Value     Counts
 Red.A      2
 Red.C      1
 Green.B    1

Answer 1

I think you need to join both lists first, then build a Series of the row-wise joined strings, and finally call value_counts:

print (df)
   names Class Category  label Class1
0    ram     A      Red    one      E
1   ravi     A      Red    two      G
2  gopal     B    Green  three      B

my_list1=["Category"]
my_list2=["Class", "Class1"]


# Store the result under a new name so df stays available for the steps below
out = df[my_list1 + my_list2].apply('.'.join, axis=1).value_counts()
print (out)
Red.A.E      1
Red.A.G      1
Green.B.B    1
dtype: int64

Detail:

print (df[my_list1 + my_list2])
  Category Class Class1
0      Red     A      E
1      Red     A      G
2    Green     B      B

print (df[my_list1 + my_list2].apply('.'.join, axis=1))
0      Red.A.E
1      Red.A.G
2    Green.B.B
dtype: object
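As a minimal self-contained sketch, the same approach can be applied to the four-row frame from the question. The frame is rebuilt by hand here, and the final step that names the columns Value and Counts is an assumption about how the desired output_df should look, not part of the original answer:

```python
import pandas as pd

# Data from the question, rebuilt by hand
df = pd.DataFrame({
    "names": ["ram", "ravi", "gopal", "Sri"],
    "Class": ["A", "A", "B", "C"],
    "Category": ["Red", "Red", "Green", "Red"],
    "label": ["one", "two", "three", "four"],
})

my_list1 = ["Category"]
my_list2 = ["Class"]

# Join the selected columns row by row, then count each combined string
counts = df[my_list1 + my_list2].apply('.'.join, axis=1).value_counts()

# Assumed layout of output_df: two columns named Value and Counts
output_df = counts.rename_axis("Value").reset_index(name="Counts")
print(output_df)
```

Red.A is counted twice and Red.C and Green.B once each. The order of the two tied rows may differ between pandas versions.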

Answer 2

You can use str.cat like this:

In [5410]: my_list1 = ["Category"]
      ...: my_list2 = ["Class", "Class1"]

In [5411]: df[my_list1+my_list2].apply(lambda x: x.str.cat(sep='.'), axis=1).value_counts()
Out[5411]:
Green.B.B    1
Red.A.E      1
Red.A.G      1
dtype: int64

Also:

In [5516]: pd.Series('.'.join(x) for x in df[my_list1 + my_list2].values).value_counts()
Out[5516]:
Green.B.B    1
Red.A.E      1
Red.A.G      1
dtype: int64

Or

In [5517]: pd.Series(map('.'.join, df[my_list1 + my_list2].values)).value_counts()
Out[5517]:
Green.B.B    1
Red.A.E      1
Red.A.G      1
dtype: int64
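All of the variants above rely on '.'.join, which only accepts strings. The sketch below uses hypothetical data with a numeric Year column (not in the original question) and casts to str before joining. It also shows groupby(...).size() as a possible alternative that counts combinations without building strings; that alternative is a suggestion, not something either answer proposes:

```python
import pandas as pd

# Hypothetical data: "Year" is numeric, so '.'.join would raise TypeError on it
df = pd.DataFrame({
    "Category": ["Red", "Red", "Green"],
    "Class": ["A", "A", "B"],
    "Year": [2016, 2016, 2017],
})

cols = ["Category", "Class", "Year"]

# Cast every selected column to str so each row can be joined
counts = df[cols].astype(str).apply('.'.join, axis=1).value_counts()
print(counts)

# Alternative: count the combinations directly; the result is indexed
# by a MultiIndex of (Category, Class, Year) instead of joined strings
sizes = df.groupby(cols).size()
print(sizes)
```

The groupby form keeps the original dtypes in the index, which can be more convenient when the counts are processed further rather than displayed.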

Answer 1
1 2017-10-24 10:22:11

Answer 2
1 2017-10-24 10:32:44
