How do I split a Pandas DataFrame column into multiple columns when the column holds variable-length strings?

Question

I have a Pandas DataFrame that was created by reading a table from a PDF with tabula. The PDF isn't parsed perfectly, so a few table columns end up merged into a single column in the resulting DataFrame. The issue is that one of the table columns in the PDF is text, so it sometimes consists of one word and sometimes of two. Example:

            Col_1  Col_2
0       Hello X Y      A
1 Hello world Q R      B
2          Hi S T      C

I would like to split Col_1 into 3 columns. I'm not sure how to do this, given that the first new column would sometimes consist of one word, as in rows 0 and 2, and sometimes of two words, as in row 1.

I have tried splitting the strings of Col_1 with df['Col_1'].str.split(' ', n=4, expand=True), but this starts splitting from the beginning of the string (from the left), whereas I suppose the splitting should be done from the right.
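To make the problem concrete, here is a minimal sketch of what a left-to-right split does to the example data. The DataFrame is rebuilt by hand from the table above, and n=2 is an assumption chosen to match the three target columns (the attempt above used 4).

```python
import pandas as pd

# Example data rebuilt by hand from the table in the question
df = pd.DataFrame({
    "Col_1": ["Hello X Y", "Hello world Q R", "Hi S T"],
    "Col_2": ["A", "B", "C"],
})

# Split on spaces from the left, making at most 2 splits (3 pieces)
left = df["Col_1"].str.split(" ", n=2, expand=True)
print(left)
```

For row 1 the two-word text is broken apart and the last two values stay glued together in the final piece, which is why counting splits from the left does not work here.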

Answer 1

You can try using str.rsplit:

Splits string around given separator/delimiter, starting from the right.

df['Col_1'].str.rsplit(' ', n=2, expand=True)

Output:

             0  1  2
0        Hello  X  Y
1  Hello world  Q  R
2           Hi  S  T

As a full DataFrame:

df['Col_1'].str.rsplit(' ', n=2, expand=True).add_prefix('nCol_').join(df)

Output:

        nCol_0 nCol_1 nCol_2            Col_1 Col_2
0        Hello      X      Y        Hello X Y     A
1  Hello world      Q      R  Hello world Q R     B
2           Hi      S      T           Hi S T     C
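For anyone who wants to run this end to end, here is a self-contained sketch of the same rsplit approach. The DataFrame is rebuilt by hand from the question's example, the column names text, code_1 and code_2 are hypothetical and used only for illustration, and dropping the original Col_1 is an optional choice that the answer above does not make.

```python
import pandas as pd

# Example data rebuilt by hand from the question
df = pd.DataFrame({
    "Col_1": ["Hello X Y", "Hello world Q R", "Hi S T"],
    "Col_2": ["A", "B", "C"],
})

# Split from the right, at most 2 splits, so the last two tokens
# become their own columns and everything before them stays together
parts = df["Col_1"].str.rsplit(" ", n=2, expand=True)

# Hypothetical column names, chosen only for this illustration
parts.columns = ["text", "code_1", "code_2"]

# Join back on the index, dropping the original merged column
result = parts.join(df.drop(columns="Col_1"))
print(result)
```

Passing n as a keyword keeps the call working on recent pandas releases, where only the separator is accepted positionally.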

Solution 1
1 2021-12-03 02:54:33
