For loop in a dataframe and keep each iteration in Python

I'm creating a template to process SurveyMonkey surveys into a Tableau-ready format. I'm breaking the surveys down by question type. I want to automate the script as much as possible, so I'm trying to use a for loop for each question type.

For our purposes, let's stick to the Ranking question type.

Let's say I have a dataframe like this:

import pandas as pd

d = {'Respondent ID': [123, 234, 345], 'rank question 1': [3, 5, 4], 'rank question 2': [1, 6, 7]}
df = pd.DataFrame(data=d)
df

I want the final dataframe to look like this:

rankfinal = {'Respondent ID': [123, 234, 345, 123, 234, 345], 'answer': [3, 5, 4, 1, 6, 7], 'question': ['rank question 1', 'rank question 1', 'rank question 1', 'rank question 2', 'rank question 2', 'rank question 2']}
rank1 = pd.DataFrame(data=rankfinal)
rank1

I've made several attempts, but here is my best:

ranking = [1, 2]  # These are the column positions in the original survey dataframe

hold = []
for i in range(len(ranking)):
    hold.append(i)

respondent_id = []
questions = []
answers = []

for i in hold:
    if len(hold) < 1:
        print('No Ranking Questions! Moving on...')
    else:
        # Respondent_ID is not defined in this snippet; presumably it is set earlier in the full script
        respondent_id.append(Respondent_ID)
        questions.append(df.columns[ranking[i]])
        answers.append(df.iloc[1:, ranking[i]])

While the code works, I don't think I can do anything with the outputs to get them into a single dataframe. I've always struggled with loops, so hopefully you can help me get this project done.

Thanks in advance.
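As a minimal sketch of keeping each iteration and ending up with one dataframe: build a small DataFrame per loop pass, collect them in a list, and stack them once at the end with `pd.concat`. The sample data and the `ranking` positions are taken from the question; the name `frames` is made up for illustration, and all rows are used here (the question's `iloc[1:]` skips the first row, which the sample data does not need).

```python
import pandas as pd

df = pd.DataFrame({
    'Respondent ID': [123, 234, 345],
    'rank question 1': [3, 5, 4],
    'rank question 2': [1, 6, 7],
})
ranking = [1, 2]  # column positions of the ranking questions, as in the question

frames = []  # hypothetical name: one small DataFrame per loop iteration
for pos in ranking:
    frames.append(pd.DataFrame({
        'Respondent ID': df['Respondent ID'],
        'answer': df.iloc[:, pos],
        'question': df.columns[pos],  # a single name, repeated for every row
    }))

# Stack all iterations into a single DataFrame with a fresh 0..n-1 index
rank1 = pd.concat(frames, ignore_index=True)
print(rank1)
```

Note that `pd.concat` raises a `ValueError` when given an empty list, so an empty `ranking` would need its own check before this call.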

I would approach this problem by consolidating the rank questions.

  1. Loop through all "rank question" columns and consolidate their values.

    • You will end up with a list [3, 5, 4, 1, 6, 7]
  2. Duplicate the Respondent ID field n times, where n is the number of "rank question" columns.

    • You will obtain a list [123, 234, 345, 123, 234, 345]
  3. Create a list in which each "rank question" field name is repeated once per row.

    • You will obtain a list [rank question 1, ..., rank question 2, ...]
  4. Finally, collect these lists in a dictionary (JSON-like) and pass it to the pandas DataFrame constructor.
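The four steps above could be sketched roughly as follows. This is only an illustration on the sample data from the question; selecting the columns by the prefix "rank question" and the variable names are assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Respondent ID': [123, 234, 345],
    'rank question 1': [3, 5, 4],
    'rank question 2': [1, 6, 7],
})

# Assumed selection rule: every column whose name starts with "rank question"
rank_cols = [c for c in df.columns if c.startswith('rank question')]

answers = []
respondent_ids = []
questions = []
for col in rank_cols:
    answers.extend(df[col].tolist())                      # step 1: consolidate the values
    respondent_ids.extend(df['Respondent ID'].tolist())   # step 2: repeat the IDs once per column
    questions.extend([col] * len(df))                     # step 3: repeat the name once per row

# step 4: pass the lists to the DataFrame constructor as a dictionary
rank1 = pd.DataFrame({
    'Respondent ID': respondent_ids,
    'answer': answers,
    'question': questions,
})
print(rank1)
```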

I worked out a solution I'm mostly happy with:

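rank_fun = {}

if len(ranking) < 1:
    print('No Ranking Questions! Moving on...')

else:
    for i in ranking:
        # Pair the respondent IDs with one ranking column (the first row is skipped)
        rank_fun[i] = pd.concat([Respondent_ID, df.iloc[1:, i]], axis=1)
        rank_fun[i]['question'] = rank_fun[i].columns[1]
        # Use a common column name so the per-question frames line up when stacked
        rank_fun[i] = rank_fun[i].rename(columns={rank_fun[i].columns[1]: "answer"})

rank1 = pd.DataFrame()

for i in ranking:
    # Accumulate every question; DataFrame.append was removed in pandas 2.0
    rank1 = pd.concat([rank1, rank_fun[i]])

rank1['answer option'] = "Rank"
# Drop rows whose answer contains the string "nan" (the .str accessor requires string values)
rank1 = rank1[rank1['answer'].str.contains("nan") == False]

rank1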

My only annoyance now is that it throws an error when there are no ranking questions. Any ideas?
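One possible way to avoid the error, sketched under assumptions: wrap the reshaping in a function that returns early with an empty, correctly labelled DataFrame when there are no ranking questions. The function name `stack_ranking` and its parameters are hypothetical, and `DataFrame.melt` is swapped in for the loop here; it is not the method used in the code above.

```python
import pandas as pd

def stack_ranking(df, ranking, id_col='Respondent ID'):
    """Return the ranking columns in long format (hypothetical helper)."""
    columns = [id_col, 'answer', 'question']
    if len(ranking) == 0:
        print('No Ranking Questions! Moving on...')
        # Early return: an empty frame with the expected columns, so later code does not fail
        return pd.DataFrame(columns=columns)
    rank_cols = [df.columns[i] for i in ranking]
    long_df = df.melt(id_vars=[id_col], value_vars=rank_cols,
                      var_name='question', value_name='answer')
    return long_df[columns]

df = pd.DataFrame({
    'Respondent ID': [123, 234, 345],
    'rank question 1': [3, 5, 4],
    'rank question 2': [1, 6, 7],
})

rank1 = stack_ranking(df, [1, 2])   # two ranking questions
no_rank = stack_ranking(df, [])     # no ranking questions: prints the message, returns an empty frame
```

Because the guard returns before any column lookup, the steps that follow (such as adding an `answer option` column) can run on the empty result without raising.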
