Create a pandas DataFrame of a specific shape from a list of items

Question

I have a pandas dataframe of a specific size, say:

   ID  FACTOR    LEVEL
  160  SPM       P                       
  159  SPM2      S                         
  851  ABS       K                        
  415  ABS       P                       
  202  ABS       P 
  205  ABS2      Q
  207  AQE       T

I also have a list of, say, two items: X = ['GAB', 'YER']

I want to distribute the items of this list into a new column, say NewCol, so that the column has as many rows as the dataframe. In this case I have 7 rows and 2 items, and the integer quotient 7 // 2 is 3, so I want to put the first item in 3 rows and the next one in the remaining 4 rows. The output should look like:

         ID  FACTOR    LEVEL  NewCol
        160  SPM       P        GAB               
        159  SPM2      S        GAB                 
        851  ABS       K        GAB               
        415  ABS       P        YER               
        202  ABS       P        YER
        205  ABS2      Q        YER
        207  AQE       T        YER

So far I have only been able to create a two-element dataframe using

 df_s = pd.DataFrame(X)

which gives me

     0
    GAB
    YER

However, I am not able to build a series of the same length as the dataframe from the values in X, nor can I find a way to distribute them. I am still working on it, but any help or hints would be appreciated.

Answer 1

Use np.repeat and assign the result to a new column:

import numpy as np
import pandas as pd
# Repeat each item len(df) // len(X) times; the result can be shorter than df
arr = np.repeat(X, len(df) // len(X))
df['NewCol'] = pd.Series(arr, index=df.index[:len(arr)])
df

    ID FACTOR LEVEL NewCol
0  160    SPM     P    GAB
1  159   SPM2     S    GAB
2  851    ABS     K    GAB
3  415    ABS     P    YER
4  202    ABS     P    YER
5  205   ABS2     Q    YER
6  207    AQE     T    NaN

If you want to fill the trailing NaN, forward-fill the column:

df['NewCol'] = df['NewCol'].ffill()
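
The repeat and forward-fill steps above can also be folded into one call by passing per-item counts to np.repeat, so the last item absorbs the remainder and no NaN appears. This is only a minimal sketch on the question's data; the helper name spread_items is hypothetical and not part of pandas or NumPy, and it assumes X is non-empty.

```python
import numpy as np
import pandas as pd


def spread_items(items, n_rows):
    """Repeat each item n_rows // len(items) times; the last item also takes the remainder."""
    q = n_rows // len(items)
    counts = [q] * len(items)
    counts[-1] += n_rows - q * len(items)  # leftover rows go to the last item
    return np.repeat(items, counts)


# Data copied from the question
df = pd.DataFrame({
    "ID": [160, 159, 851, 415, 202, 205, 207],
    "FACTOR": ["SPM", "SPM2", "ABS", "ABS", "ABS", "ABS2", "AQE"],
    "LEVEL": ["P", "S", "K", "P", "P", "Q", "T"],
})
X = ["GAB", "YER"]

df["NewCol"] = spread_items(X, len(df))
print(df)
```

Because the result always has exactly len(df) elements, it can be assigned directly without building an intermediate Series or aligning on the index.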

Answer 2

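A different idea, using clip:

import numpy as np
import pandas as pd
n = len(X)
m = len(df) // n
# clip_upper was removed in pandas 1.0; clip(upper=...) is its replacement
s = pd.Series(np.arange(len(df)) // m, index=df.index).clip(upper=n - 1)
df['New'] = s.map(dict(zip(s.unique(), X)))
df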
Out[278]: 
    ID FACTOR LEVEL  New
0  160    SPM     P  GAB
1  159   SPM2     S  GAB
2  851    ABS     K  GAB
3  415    ABS     P  YER
4  202    ABS     P  YER
5  205   ABS2     Q  YER
6  207    AQE     T  YER
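
The same clipping idea can be written without the intermediate map, by using the clipped group number as a positional index into X. This is a sketch rather than a definitive version: it reuses the question's data and this answer's column name New, and it assumes the dataframe has at least as many rows as X has items (otherwise m would be 0).

```python
import numpy as np
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "ID": [160, 159, 851, 415, 202, 205, 207],
    "FACTOR": ["SPM", "SPM2", "ABS", "ABS", "ABS", "ABS2", "AQE"],
    "LEVEL": ["P", "S", "K", "P", "P", "Q", "T"],
})
X = ["GAB", "YER"]

n = len(X)
m = len(df) // n  # rows per item, before the remainder

# Group number per row, capped at the last valid position in X
group = np.minimum(np.arange(len(df)) // m, n - 1)

# Index into X positionally; a plain array assignment ignores df's index labels
df["New"] = np.asarray(X)[group]
print(df)
```

Assigning a NumPy array rather than a Series sidesteps index alignment, so this also works when df does not have the default 0..n-1 index.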

Answer 1: score 3, accepted, 2019-05-21 23:52:48

Answer 2: score 2, 2019-05-22 00:37:01
