
Iterate through a list and add a 1 or 0 to the corresponding columns in a data frame

I am creating a user interface with the code below:

import time

import pandas as pd

data_pool = {'corpus': ['aa', 'bb', 'cc', 'dd', 'ee'],
             'zero_level_name': ['a', 'b', 'c', 'd', 'e'],
             'time': ['', '', '', '', ''],
             'labels': ['', '', '', '', '']}

data_pool = pd.DataFrame(data_pool)

print(data_pool)

data_pool[['label1', 'label2', 'label3', 'label4']] = ''

index = 0
listA = ['label1', 'label2', 'label3', 'label4']
number_of_instances = int(input('Please, enter the number of texts you want to annotate today: '))

while index < number_of_instances:
    row = data_pool.loc[index]
    print("Enter labels for the text or enter 2 to go back to previous:")
    print()
    print('index:', index)
    print()
    print('text :\n\n\n', row['corpus'],'\n\n\n')
    start = time.time()
    label = input(": ")
    end = time.time()
    label.lower()
    if label == '2':
        index -= 1
        if index < 0:
            print('There is no previous row')
    else:
        label = label.split(',')
        label = [i.strip().lower() for i in label]
        for i in label:
            if i not in listA:
                print('Invalid input, try again')
                index -= 1
            else:
                if i == 'label1':
                    data_pool.loc[index, 'label1'] = 1
                elif i == 'label2':
                    data_pool.loc[index, 'label2'] = 1
                elif i == 'label3':
                    data_pool.loc[index, 'label3'] = 1
                elif i == 'label4':
                    data_pool.loc[index, 'label4'] = 1
                data_pool.loc[index, 'zero_level_name'] = label
                data_pool.loc[index, 'time'] = end-start
                break
        index += 1


print(data_pool)

With this code I get this dataframe:

  corpus zero_level_name      time labels label1 label2 label3 label4
0     aa        [label1]  5.372776             1                     
1     bb        [label2]  3.291902                    1              
2     cc               c                                             
3     dd               d                                             
4     ee               e   

With this code I can assign 1 to column 'label1' every time I input the string 'label1', assign 1 to column 'label2' every time I input the string 'label2', and so on. However, if I input two labels at once, my code does not assign 1 to both columns. I want to be able to do this. For instance, if I input 'label1, label2', my output should look something like this:

   corpus           zero_level_name      time  labels  label1  label2  label3  label4
0      aa          [label1, label2]  5.372776               1       1
1      bb          [label2, label3]  3.291902                       1       1
2      cc  [label2, label3, label4]     3.548                       1       1       1
3      dd                         d
4      ee                         e

How can I achieve this goal?

Note that your break exits the for loop as soon as the first valid label has been handled, so even if you enter two valid labels, only the first one is ever written. Try removing it and see if that solves your problem :)

(Or maybe you wanted the break to exit the while loop, but I'm not sure I understand the logic there.)
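To make the idea concrete, here is a minimal, non-interactive sketch of the per-row step with no early break. It is only one possible way to write it, not the asker's final code. The helper name apply_labels is hypothetical. The sketch also makes a few assumptions that differ from the question: the label columns start at 0 instead of an empty string, time starts as NaN, and the entered labels are stored as a comma-joined string rather than a list.

```python
import pandas as pd

VALID_LABELS = ['label1', 'label2', 'label3', 'label4']


def apply_labels(df, index, raw, elapsed, valid=VALID_LABELS):
    """Hypothetical helper: set 1 in every label column named in `raw`.

    `raw` is the comma-separated text typed by the user. Returns False and
    leaves the row untouched if any of the entered labels is not valid.
    """
    labels = [part.strip().lower() for part in raw.split(',')]
    labels = list(dict.fromkeys(labels))  # drop duplicates, keep order
    # Validate the whole input first so a bad label never half-fills a row.
    if any(lab not in valid for lab in labels):
        return False
    # Column names equal the label names, so one assignment covers them all.
    df.loc[index, labels] = 1
    # Assumption: store the labels as one string instead of a list in a cell.
    df.loc[index, 'zero_level_name'] = ', '.join(labels)
    df.loc[index, 'time'] = elapsed
    return True


data_pool = pd.DataFrame({
    'corpus': ['aa', 'bb', 'cc', 'dd', 'ee'],
    'zero_level_name': ['a', 'b', 'c', 'd', 'e'],
})
data_pool['time'] = float('nan')   # assumption: NaN instead of ''
data_pool[VALID_LABELS] = 0        # assumption: 0 instead of ''

apply_labels(data_pool, 0, 'label1, label2', 5.37)
apply_labels(data_pool, 1, 'Label2,label3', 3.29)
print(data_pool)
```

In the while loop from the question, a False return would mean "print the error and ask again for the same index", and a True return would mean "move on to index + 1".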

