
How to generate permutations without repetition

I have a table that looks like the one below:

Loc  ID    filter  P1
A    ABC1  GHY     55.6
A    DFT1  FGH     67.8
B    HJH5  GHY     67
C    HKL   BHY     78
B    GTY   FGH     60

I want the output as below. Basically, I want the records with the same filter to be in one row.

Filter  ID    Loc  P1    m_ID  m_Loc  m_p1  total
GHY     ABC1  A    55.6  HJH5  B      67    122.6
FGH     DFT1  A    67.8  GTY   B      60    127.8

Is this achievable using itertools in Python? If yes, can someone please suggest how we can do it?

Here's a solution using lead and row_number that I think is a little nicer.

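-- t2 numbers the rows inside each filter; the inner query adds the next
-- row's values with lead(); the outer query keeps the first row per filter.
-- Ordering by id (instead of filter) makes "first" and "next" deterministic.
select filter
      ,id
      ,loc
      ,p1
      ,m_id
      ,m_loc
      ,m_p1
from (
      with t2 as (
            select row_number() over (partition by filter order by id) as rn
                  ,*
            from t
      )
      select rn, filter, id, loc, p1
            ,lead(id)  over (partition by filter order by id) as m_id
            ,lead(loc) over (partition by filter order by id) as m_loc
            ,lead(p1)  over (partition by filter order by id) as m_p1
      from t2
     ) t
where rn = 1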

filter  id    loc  p1    m_id  m_loc  m_p1
BHY     HKL   C    78    null  null   null
FGH     DFT1  A    67.8  GTY   B      60
GHY     ABC1  A    55.6  HJH5  B      67

Fiddle
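
Since the question asks about itertools, the same "first row plus the row after it" idea can be sketched in plain Python. This is only an illustration, not part of the SQL answer: the function name first_with_next and the tuple layout (Loc, ID, filter, P1) are assumptions made for the example, and sorting by ID is an assumed definition of which row counts as "first".

```python
from itertools import groupby
from operator import itemgetter

# Sample rows copied from the question: (Loc, ID, filter, P1).
rows = [
    ("A", "ABC1", "GHY", 55.6),
    ("A", "DFT1", "FGH", 67.8),
    ("B", "HJH5", "GHY", 67.0),
    ("C", "HKL", "BHY", 78.0),
    ("B", "GTY", "FGH", 60.0),
]


def first_with_next(rows):
    """Return one record per filter: the first row plus the row after it."""
    # groupby only merges adjacent items, so sort by (filter, ID) first.
    ordered = sorted(rows, key=itemgetter(2, 1))
    result = []
    for flt, group in groupby(ordered, key=itemgetter(2)):
        group = list(group)
        loc, id_, _, p1 = group[0]
        # Rough equivalent of lead(): the next row in the group, if any.
        nxt = group[1] if len(group) > 1 else (None, None, None, None)
        m_loc, m_id, _, m_p1 = nxt
        result.append((flt, id_, loc, p1, m_id, m_loc, m_p1))
    return result


for record in first_with_next(rows):
    print(record)
```

As in the SQL version, a filter that occurs only once (BHY here) keeps its row and gets empty m_ values.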

There should be a better solution to this question, but here is one based on what you did. I used left join so as not to lose filters that only appear once, and then I used group by to consolidate the results. Note that max and min are evaluated per column, so the values in one result row can come from different source rows: in the FGH row of the result, id GTY is paired with p1 67.8, which belongs to DFT1.

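-- Self-join: pair every row with the other rows that share its filter.
-- The left join keeps filters that occur only once (their t2 side is null).
select t1.filter
      ,max(t1.id)  as id
      ,max(t1.loc) as loc
      ,max(t1.p1)  as p1
      ,min(t2.id)  as m_id
      ,min(t2.loc) as m_loc
      ,min(t2.p1)  as m_p1
from t as t1 left join t as t2 on t2.filter = t1.filter and t2.id <> t1.id
group by t1.filter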

filter  id    loc  p1    m_id  m_loc  m_p1
BHY     HKL   C    78    null  null   null
FGH     GTY   B    67.8  DFT1  A      60
GHY     HJH5  B    67    ABC1  A      55.6

Fiddle
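
The self-join pairs each row with every other row of the same filter, in both orders. In Python, itertools.combinations yields each pair only once and keeps the values of each record together, which is closer to the output asked for in the question, including the total column. The following is a minimal sketch under assumptions: the name pair_by_filter and the tuple layout (Loc, ID, filter, P1) are made up for the example, and rows are sorted by ID so the pairing is reproducible.

```python
from itertools import combinations, groupby
from operator import itemgetter

# Sample rows copied from the question: (Loc, ID, filter, P1).
rows = [
    ("A", "ABC1", "GHY", 55.6),
    ("A", "DFT1", "FGH", 67.8),
    ("B", "HJH5", "GHY", 67.0),
    ("C", "HKL", "BHY", 78.0),
    ("B", "GTY", "FGH", 60.0),
]


def pair_by_filter(rows):
    """Pair up records that share a filter, each unordered pair once."""
    # groupby only merges adjacent items, so sort by (filter, ID) first.
    ordered = sorted(rows, key=itemgetter(2, 1))
    paired = []
    for flt, group in groupby(ordered, key=itemgetter(2)):
        # combinations(..., 2) yields nothing for a group with a single row.
        for (loc, id_, _, p1), (m_loc, m_id, _, m_p1) in combinations(group, 2):
            # Round to one decimal to hide floating-point noise in the sum.
            total = round(p1 + m_p1, 1)
            paired.append((flt, id_, loc, p1, m_id, m_loc, m_p1, total))
    return paired


for record in pair_by_filter(rows):
    print(record)
```

Unlike the left join, this drops filters that occur only once (BHY), which matches the desired output in the question. A filter with three or more rows would produce one record per pair.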

If pandas can be used, you can achieve a flexible solution with the following:

Definition of the data:

import pandas as pd

# Sample data from the question (stray trailing space in 'B ' removed).
df=pd.DataFrame({'Loc': {0: 'A', 1: 'A', 2: 'B', 3: 'C', 4: 'B'},
 'ID': {0: 'ABC1', 1: 'DFT1', 2: 'HJH5', 3: 'HKL', 4: 'GTY'},
 'filter': {0: 'GHY', 1: 'FGH', 2: 'GHY', 3: 'BHY', 4: 'FGH'},
 'P1': {0: 55.6, 1: 67.8, 2: 67.0, 3: 78.0, 4: 60.0}})

Creation of the repetitive columns:

cols=["{}_{}".format(N, c) for N in range(0,df.groupby('filter').count()['ID'].max()) for c in df.columns]

Here, I first find the maximum number of repetitions required by looking at the largest number of occurrences of any filter value, df.groupby('filter').count()['ID'].max(). The remaining code just formats the column names by adding a leading number.
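
To make the naming scheme concrete, here is a small sketch of the same computation without pandas. It is only an illustration: collections.Counter stands in for the groupby count, and the names columns, filters and max_repeats are assumptions introduced for the example.

```python
from collections import Counter

# Column names and filter values copied from the sample data.
columns = ["Loc", "ID", "filter", "P1"]
filters = ["GHY", "FGH", "GHY", "BHY", "FGH"]

# Largest number of rows that share one filter value (2 for this data).
max_repeats = max(Counter(filters).values())

# One block of column names per repetition, prefixed with its number.
cols = ["{}_{}".format(n, c) for n in range(max_repeats) for c in columns]
print(cols)
```

For the sample data this gives two blocks of four names, 0_Loc through 0_P1 followed by 1_Loc through 1_P1.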

Creation of a new dataframe with filter as index and the generated columns cols as columns:

df_new=pd.DataFrame(index=set(df['filter']), columns=cols)

Now we have to fill in the data:

for fil in df_new.index:
    values=[val for row in df[df['filter']==fil].values for val in row]
    df_new.loc[fil,df_new.columns[:len(values)]]=values

Here two things are done: First, the rows selected by the filter name fil are flattened into a single list by [val for row in df[df['filter']==fil].values for val in row]. Then, these values are filled into the dataframe starting at the left.
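
The flatten-and-fill step can also be written with itertools, which is what the question asked about. The sketch below is an assumption-laden illustration rather than a replacement for the pandas code: the function name widen, the tuple layout (Loc, ID, filter, P1) and the use of None for empty cells are choices made for the example.

```python
from itertools import chain

# Sample rows copied from the question: (Loc, ID, filter, P1).
rows = [
    ("A", "ABC1", "GHY", 55.6),
    ("A", "DFT1", "FGH", 67.8),
    ("B", "HJH5", "GHY", 67.0),
    ("C", "HKL", "BHY", 78.0),
    ("B", "GTY", "FGH", 60.0),
]


def widen(rows, width):
    """Map each filter to its rows flattened left to right, padded to width."""
    wide = {}
    for flt in sorted({row[2] for row in rows}):
        matching = [row for row in rows if row[2] == flt]
        # chain.from_iterable flattens the matching rows into one list.
        values = list(chain.from_iterable(matching))
        # Fill from the left and pad the remaining cells with None.
        wide[flt] = values + [None] * (width - len(values))
    return wide


# Two repetitions of the four columns are enough for the sample data.
for flt, values in widen(rows, 8).items():
    print(flt, values)
```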

The result is as expected:

    0_Loc   0_ID    0_filter    0_P1    1_Loc   1_ID    1_filter    1_P1
GHY     A   ABC1    GHY     55.6    B   HJH5    GHY     67.0
BHY     C   HKL     BHY     78.0    NaN     NaN     NaN     NaN
FGH     A   DFT1    FGH     67.8    B   GTY     FGH     60.0
