pandas - python - using columns to add value to new column

I haven't been able to figure this out. Let's say I have a pandas dataframe (port_info) that looks like this:

     chass    olt  port         BW
0        1      1     1      80000
1        1      1     2     212000
2        1      1     3     926600
3        1      1     4      50000
4        1      1     5     170000
5        1      1     6     840000
6        1      1     7     320000
7        1      1     8     500000
8        1      1     9     270000
9        1      1    10     100000
10       1      2     1     420000
11       1      2     2      60000
12       1      2     3     480000
13       1      2     4      90000
14       1      2     5          0
15       1      2     6     520000
16       1      2     7     840000
17       1      2     8     900000
18       1      2     9     110000
19       1      2    10          0

I want to add a column depending on how many ports there are per olt per chassis. If there are more than 8 ports per olt per chass, then add a value of 1 to every row for that olt for that chass. Otherwise, add a value of 10 to every row for that olt for that chass.

In the end, I need a new column (port_info.BW_cap) that has a value for each port dependent on how many ports there are in that olt in that chass.

So far I have this to check the max port per olt:

test = pd.DataFrame(port_info.groupby(['chass','olt'])['port'].max()).reset_index()

That gets me a minimalist dataframe that looks like this:

chass  olt
1      1      10
       2      10
       3      10
       4      10
       5      10
       6      10
       7      10
       8      10
       11     10
       12     10
       13     10
       14     10
       15     10
       16     10
       17     10
       18     10

What's the best way to have pandas go through every row in the initial dataframe, look up the matching row in the minimalist dataframe to find the max port for that chass/olt combo, and then add a value to that row under a new column named 'BW_cap' that depends on that max?

So in the end, something that looks like this:

       chass  olt  port       BW    BW_cap
0        1    1     1    80000        1
1        1    1     2   212000        1
2        1    1     3   926600        1
3        1    1     4    50000        1
4        1    1     5   170000        1

I think I get what you want. You only need the last three lines of the code below. You were close: you can just join your groupby max result back to the original dataframe.

One thing to note: saying "there are more than 8 ports per chass/olt combination" is different from saying "the max port is > 8". This matters if your ports aren't always numbered consecutively from 1 to 10. If a chass/olt combination has 3, 6, and 9 as its ports, that's only 3 ports, but the max is 9.

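import random

import pandas as pd

random.seed(123)

# Build a random sample frame with the same columns as port_info.
df = pd.DataFrame({
        'chass':[random.randint(1, 10) for x in range(200)],
        'olt':[random.randint(1, 10) for x in range(200)],
        'port':[random.randint(1, 10) for x in range(200)],
        'BW':[random.randint(0, 1000000) for x in range(200)]})

# Per chass/olt group: BW_cap is 1 if the max port exceeds 8, else 10.
# The per-group value is then merged back onto every row of df.
g = df.groupby(['chass', 'olt']).apply(lambda x: 1 if x.port.max() > 8 else 10).reset_index()
g.columns = ['chass', 'olt', 'BW_cap']
df = pd.merge(df, g, on=['chass', 'olt'])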
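
If the rule really is about how many ports an olt has rather than its highest port number, something like the following might work instead. It is only a sketch: the sample data is made up to include an olt with ports 3, 6, 9, and the BW_cap_by_max column is a hypothetical name added purely for comparison. It uses groupby with transform, which hands each group's result back to every row of that group, so no merge is needed.

```python
import pandas as pd

# Hypothetical sample: olt 1 has ports 1..10, olt 2 has only ports 3, 6, 9.
port_info = pd.DataFrame({
    "chass": [1] * 13,
    "olt": [1] * 10 + [2] * 3,
    "port": list(range(1, 11)) + [3, 6, 9],
    "BW": [0] * 13,
})

groups = port_info.groupby(["chass", "olt"])["port"]

# transform() returns one value per original row, aligned to port_info's index.
port_count = groups.transform("nunique")  # distinct ports per chass/olt
max_port = groups.transform("max")        # highest port number per chass/olt

# Count-based rule: 1 if the olt has more than 8 ports, otherwise 10.
port_info["BW_cap"] = (port_count > 8).map({True: 1, False: 10})

# Max-based rule, shown only to compare with the merge approach above.
port_info["BW_cap_by_max"] = (max_port > 8).map({True: 1, False: 10})

print(port_info)
```

On this sample the two rules agree for olt 1 but disagree for olt 2, which has only 3 ports but a max port of 9, so it is worth deciding which rule you actually mean.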
