How do I turn a series of Pandas data frame rows into a single column with multiple values?

Question

Right now I have an Excel sheet in the following format, which I have converted into a Pandas data frame in Python:

      Name            Column2      Unnamed: 2   Datatype   Definition
0   Entity   Accounts Receivable                                    
1    term1                                      char       term1
2    term2                                      numeric    term2
3    term3                                      char       term3
4   Entity      Accounts Payable                                    
5    term4                                      char       term4
6    term5                                      char       term5
7    term6                                      varchar    term6
8    term7                                      numeric    term7

I'm attempting to write code that will automatically populate the empty cells in Column2 with the corresponding 'Entity' value next to each term name. So term1, term2, and term3 would be 'Accounts Receivable', and term4, term5, term6, and term7 would be 'Accounts Payable'.

This is the code I've written so far:

   df = pd.read_excel('test.xlsx')

   df = df.replace(np.nan,'')

   values = df.values.tolist()

   ent_list = []

   for values[0] in values:
       if values[0][0] == 'Entity':
           ent_list.append(values[0][1])

   for j in range(len(values)):
       for e in range(len(ent_list)):
           while values[j][1] != ent_list[e]:
               values[j][1] = ent_list[e]
               break
           e += 1

When I print out 'values' though, I get the following:

[['Entity', 'Accounts Payable', '', '', ''], 
 ['term1', 'Accounts Payable', '', 'char', 'term1'], 
 ['term2', 'Accounts Payable', '', 'numeric', 'term2'], 
 ['term3', 'Accounts Payable', '', 'char', 'term3'], 
 ['Entity', 'Accounts Payable', '', '', ''], 
 ['term4', 'Accounts Payable', '', 'char', 'term4'], 
 ['term5', 'Accounts Payable', '', 'char', 'term5'], 
 ['term6', 'Accounts Payable', '', 'varchar', 'term6'], 
 ['term7', 'Accounts Payable', '', 'numeric', 'term7']]

Ideally it should look like this:

[['Entity', 'Accounts Receivable', '', '', ''], 
 ['term1', 'Accounts Receivable', '', 'char', 'term1'], 
 ['term2', 'Accounts Receivable', '', 'numeric', 'term2'], 
 ['term3', 'Accounts Receivable', '', 'char', 'term3'], 
 ['Entity', 'Accounts Payable', '', '', ''], 
 ['term4', 'Accounts Payable', '', 'char', 'term4'], 
 ['term5', 'Accounts Payable', '', 'char', 'term5'], 
 ['term6', 'Accounts Payable', '', 'varchar', 'term6'], 
 ['term7', 'Accounts Payable', '', 'numeric', 'term7']]

Is there a way to achieve this using the method I am currently using? I have to imagine this is possible with VBA, but I'm honestly more comfortable using Python. I'm going to keep revising this code but am genuinely stumped, as I am not too experienced.

I know I could do it manually, but that would take too long: these reports need to be generated every so often and usually include between 40,000 and 70,000 rows, so I would much prefer to automate this.

Answer 1

df = df.fillna(method = 'ffill')
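
A few caveats on the one-liner above, with a sketch that may help. The forward fill only works while the empty cells are still NaN, so it has to run before `df.replace(np.nan, '')`. Filling the whole frame also copies Datatype and Definition values down into the next Entity row, which the desired output does not show. Recent pandas releases also deprecate the `method` argument of `fillna` in favor of `ffill()`. The sketch below fills only Column2. The inline data frame is a hypothetical stand-in for `pd.read_excel('test.xlsx')`, and `carry_entity_forward` is an illustrative helper name, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_excel('test.xlsx'); empty cells are NaN.
# dtype=object keeps every column as generic Python objects for this demo.
df = pd.DataFrame({
    "Name": ["Entity", "term1", "term2", "term3",
             "Entity", "term4", "term5", "term6", "term7"],
    "Column2": ["Accounts Receivable", np.nan, np.nan, np.nan,
                "Accounts Payable", np.nan, np.nan, np.nan, np.nan],
    "Unnamed: 2": [np.nan] * 9,
    "Datatype": [np.nan, "char", "numeric", "char",
                 np.nan, "char", "char", "varchar", "numeric"],
    "Definition": [np.nan, "term1", "term2", "term3",
                   np.nan, "term4", "term5", "term6", "term7"],
}, dtype=object)

# Variant 1 (pandas): forward-fill Column2 only, while the gaps are still NaN,
# and replace the remaining NaN values with empty strings afterwards.
filled = df.copy()
filled["Column2"] = filled["Column2"].ffill()
filled = filled.fillna("")
values = filled.values.tolist()


# Variant 2 (plain lists, closer to the loop in the question): remember the
# most recent Entity value and write it into every following term row.
def carry_entity_forward(rows):
    current = ""
    for row in rows:
        if row[0] == "Entity":
            current = row[1]   # a new Entity block starts here
        else:
            row[1] = current   # term row: inherit the current Entity
    return rows


rows = carry_entity_forward(df.fillna("").values.tolist())
```

Both variants should produce the list shown under "Ideally it should look like this". The list version also shows why the original loop ended up with 'Accounts Payable' everywhere: it tried every entry of `ent_list` for every row, so the last entry always won.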

Answer 1 posted 2018-09-08 18:39:07
