
Loop through a DataFrame with user input for each selection criterion

The two data structures I am working with are a DataFrame and a list. For each item in the list (index 0, 1, ..., n), I need to print the rows whose Cluster column matches that item. I then need to take user input and write it to a new column.

import pandas as pd

sample_data = {'Cluster': ['A', 'B', 'A', 'B'],
               'column2': [21, 32, 14, 45]}
sample_dataframe = pd.DataFrame(data=sample_data)
lst = ['A', 'B']

Desired DataFrame after taking user input (the y and n values are what the user will enter):

expected_output = {'Cluster': ['A', 'B', 'A', 'B'],
                   'column2': ['Top', 'Bottom', 'Top', 'Bottom'],
                   'NEW_COLUMN': ['y', 'n', 'y', 'n']}
expected_output = pd.DataFrame(data=expected_output)

I was thinking of some sort of loop like:

for i in lst:
    if i == column value:
        print all rows that match
        and take user input to form a new column

I haven't been able to put together the logic for this yet.
Any help would be greatly appreciated. The user input for the new column should be put into every row that matches the list item. For example, the new DataFrame has the user input 'y' where the Cluster is 'A' and the user input 'n' where the Cluster is 'B'.

I think this should work.

import pandas as pd

sample_data = {'Cluster': ['A', 'B', 'A', 'B'],
               'column2': [21, 32, 14, 45]}
sample_dataframe = pd.DataFrame(data=sample_data)
lst = ['A', 'B']

# Create a dictionary to save the user input, keyed by cluster
input_dict = {}

for i in lst:

    # Select the rows that belong to the current cluster
    matches = sample_dataframe[sample_dataframe['Cluster'] == i]

    # Skip clusters from the list that do not occur in the DataFrame
    if matches.empty:
        continue

    print(matches)
    # Get the user input and store it in the dictionary
    input_dict[i] = input(f"Input for cluster {i}: ")

# Create the new column by looking up each row's cluster in the dictionary
sample_dataframe['input'] = sample_dataframe['Cluster'].map(input_dict)
print(sample_dataframe)

For better understanding: a dictionary is a collection of key/value pairs. In this case the cluster is the key and the user input is the value. To generate the new column, I just map the Cluster column with the dictionary. A closer look at the map method may help. I hope the code is understandable by and large; if not, please ask for clarification.
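To make the dictionary-plus-map idea easier to try without typing at a prompt, here is a minimal sketch that wraps the same loop in a function. The names `label_clusters` and `ask` are hypothetical and not part of the original answer; `ask` is assumed to be any callable that takes a prompt string and returns the reply, defaulting to the built-in `input`. The sample data adds a cluster 'C' that is not in the list, to show that `map` leaves rows without a dictionary entry as NaN.

```python
import pandas as pd


def label_clusters(df, clusters, ask=input):
    """Return a copy of df with an 'input' column filled once per cluster.

    `ask` is a hypothetical hook: a callable that receives a prompt string
    and returns the reply. It defaults to the built-in input().
    """
    answers = {}
    for cluster in clusters:
        rows = df[df['Cluster'] == cluster]
        if rows.empty:
            continue  # cluster not present, so there is nothing to ask about
        print(rows)
        answers[cluster] = ask(f"Input for cluster {cluster}: ")

    result = df.copy()
    # map() looks up each Cluster value in the dictionary;
    # values that have no key in the dictionary become NaN.
    result['input'] = result['Cluster'].map(answers)
    return result


df = pd.DataFrame({'Cluster': ['A', 'B', 'A', 'C'],
                   'column2': [21, 32, 14, 45]})

# Scripted replies stand in for a user typing 'y' and then 'n'.
replies = iter(['y', 'n'])
labelled = label_clusters(df, ['A', 'B'], ask=lambda prompt: next(replies))
print(labelled)
```

In interactive use, calling `label_clusters(df, ['A', 'B'])` without the `ask` argument would prompt on the console once per cluster, as the loop in the answer does. Working on a copy is a design choice of this sketch, so the original DataFrame stays unchanged.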
