
How to turn a series of Pandas dataframe rows into one column with multiple values?

Right now I have an Excel sheet in the following format, which I have converted into a Pandas data frame in Python:

      Name            Column2      Unnamed: 2   Datatype   Definition
0   Entity   Accounts Receivable                                    
1    term1                                      char       term1
2    term2                                      numeric    term2
3    term3                                      char       term3
4   Entity      Accounts Payable                                    
5    term4                                      char       term4
6    term5                                      char       term5
7    term6                                      varchar    term6
8    term7                                      numeric    term7

I'm attempting to write code that will automatically populate the empty cells in Column2 with the value from the nearest 'Entity' row above each term. So term1, term2, and term3 would be 'Accounts Receivable', and term4, term5, term6, and term7 would be 'Accounts Payable'.

This is the code I've written so far:

   import numpy as np
   import pandas as pd

   df = pd.read_excel('test.xlsx')

   df = df.replace(np.nan,'')

   values = df.values.tolist()

   ent_list = []

   for values[0] in values:
       if values[0][0] == 'Entity':
           ent_list.append(values[0][1])

   for j in range(len(values)):
       for e in range(len(ent_list)):
           while values[j][1] != ent_list[e]:
               values[j][1] = ent_list[e]
               break
           e += 1

When I print out 'values' though, I get the following:

[['Entity', 'Accounts Payable', '', '', ''], 
 ['term1', 'Accounts Payable', '', 'char', 'term1'], 
 ['term2', 'Accounts Payable', '', 'numeric', 'term2'], 
 ['term3', 'Accounts Payable', '', 'char', 'term3'], 
 ['Entity', 'Accounts Payable', '', '', ''], 
 ['term4', 'Accounts Payable', '', 'char', 'term4'], 
 ['term5', 'Accounts Payable', '', 'char', 'term5'], 
 ['term6', 'Accounts Payable', '', 'varchar', 'term6'], 
 ['term7', 'Accounts Payable', '', 'numeric', 'term7']]

Ideally it should look like this:

[['Entity', 'Accounts Receivable', '', '', ''], 
 ['term1', 'Accounts Receivable', '', 'char', 'term1'], 
 ['term2', 'Accounts Receivable', '', 'numeric', 'term2'], 
 ['term3', 'Accounts Receivable', '', 'char', 'term3'], 
 ['Entity', 'Accounts Payable', '', '', ''], 
 ['term4', 'Accounts Payable', '', 'char', 'term4'], 
 ['term5', 'Accounts Payable', '', 'char', 'term5'], 
 ['term6', 'Accounts Payable', '', 'varchar', 'term6'], 
 ['term7', 'Accounts Payable', '', 'numeric', 'term7']]

Is there a way to achieve this using the method I am currently using? I imagine this is possible with VBA, but I'm honestly more comfortable using Python. I'm going to keep revising this code, but I am genuinely stumped as I am not very experienced.

I know I could do it manually, but that would take too long: these reports need to be generated every so often and usually include between 40,000 and 70,000 rows, so I would much prefer to automate this.
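
On whether the list-based method can be made to work: one reading of the loop above is that `for values[0] in values` rebinds the first list element on every pass, and the second loop then assigns every entry of `ent_list` to every row in turn, so the last entity ('Accounts Payable') always wins. A minimal sketch of the same idea that instead remembers the most recent entity while walking the rows once; the sample rows are retyped from the table above, and `fill_entities` is a hypothetical helper name, not something from the original code:

```python
def fill_entities(rows):
    """Return a copy of rows with column index 1 set to the latest 'Entity' value."""
    current = ''  # entity name from the most recent 'Entity' row
    filled = []
    for row in rows:
        row = list(row)  # copy so the caller's rows are not mutated
        if row[0] == 'Entity':
            current = row[1]  # a new group starts here
        else:
            row[1] = current  # a term row inherits the current entity
        filled.append(row)
    return filled


# Sample data retyped from the question (blank cells already replaced by '')
values = [
    ['Entity', 'Accounts Receivable', '', '', ''],
    ['term1', '', '', 'char', 'term1'],
    ['term2', '', '', 'numeric', 'term2'],
    ['term3', '', '', 'char', 'term3'],
    ['Entity', 'Accounts Payable', '', '', ''],
    ['term4', '', '', 'char', 'term4'],
    ['term5', '', '', 'char', 'term5'],
    ['term6', '', '', 'varchar', 'term6'],
    ['term7', '', '', 'numeric', 'term7'],
]

result = fill_entities(values)
```

This makes a single pass over the rows, so it should scale to the 40,000 to 70,000 rows mentioned without nested loops.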

# Forward-fill NaN cells from the row above (fillna(method='ffill') is deprecated in recent pandas)
df = df.ffill()
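
A slightly fuller sketch of the forward-fill idea, under two assumptions worth stating. First, forward fill only touches NaN, so it has to run before the question's `df.replace(np.nan, '')` step. Second, filling the whole frame would also copy the previous term's Datatype and Definition onto the blank 'Entity' rows, so this version fills Column2 only. The small frame below is retyped from the question (with the Definition and Unnamed columns left out for brevity) and stands in for `pd.read_excel('test.xlsx')`:

```python
import numpy as np
import pandas as pd

# Sample frame retyped from the question; blank cells are NaN, as read_excel would give
df = pd.DataFrame({
    'Name': ['Entity', 'term1', 'term2', 'term3',
             'Entity', 'term4', 'term5', 'term6', 'term7'],
    'Column2': ['Accounts Receivable', np.nan, np.nan, np.nan,
                'Accounts Payable', np.nan, np.nan, np.nan, np.nan],
    'Datatype': [np.nan, 'char', 'numeric', 'char',
                 np.nan, 'char', 'char', 'varchar', 'numeric'],
})

# Forward-fill only Column2 so blank Datatype cells on 'Entity' rows stay blank
df['Column2'] = df['Column2'].ffill()

# Replace the remaining NaN values with '' only after the fill
df = df.fillna('')

values = df.values.tolist()
```

If the 'Entity' header rows are not wanted in the final report, they could be dropped afterwards with `df[df['Name'] != 'Entity']`.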
