How to split a column into two or more columns in Python using str.split or a regular expression?

Question

How can I split this column into 2 or more columns? I used str.split('/', 2), but it only removed the '/' and did not split the data into 2 columns.

X
East Bound: 6900 / West Bound: 7700
East Bound: 7800 / West Bound: 8700
North Bound: 5000 / South Bound: 4900
North Bound: 7000 / South Bound: 9000
East Bound: 4900 / West Bound: 9700

What I want is:

First Direction	Second direction
East Bound: 6900	West Bound: 7700
East Bound: 7800	West Bound: 8700
North Bound: 5000	South Bound: 4900
North Bound: 7000	South Bound: 9000
East Bound: 4900	West Bound: 9700

Even better would be four column headers, one for each of the four cardinal directions, filled with the values from the first table, such as:

North	South	East	West
0	0	6900	7700
0	0	7800	8700
5000	4900	0	0
7000	9000	0	0
0	0	4900	9700

If I have read the documentation correctly, I believe this can be done with regex patterns, but is there an efficient and concise way to do it?

Here is the original df for use: df = ['East Bound: 6900 / West Bound: 7700', 'East Bound: 7800 / West Bound: 8700', 'North Bound: 5000 / South Bound: 4900', 'North Bound: 7000 / South Bound: 9000', 'East Bound: 4900 / West Bound: 9700']

Answer 1

For Q1, you can try .str.split with expand=True:

df[['First Direction', 'Second direction']] = df['X'].str.split(' / ', expand=True)

print(df)

                                       X     First Direction    Second direction
0    East Bound: 6900 / West Bound: 7700   East Bound: 6900     West Bound: 7700
1    East Bound: 7800 / West Bound: 8700   East Bound: 7800     West Bound: 8700
2  North Bound: 5000 / South Bound: 4900  North Bound: 5000    South Bound: 4900
3  North Bound: 7000 / South Bound: 9000  North Bound: 7000    South Bound: 9000
4    East Bound: 4900 / West Bound: 9700   East Bound: 4900     West Bound: 9700
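The reason the original attempt appeared to only remove the '/' is most likely that, without expand=True, .str.split returns a single Series whose cells are Python lists. A minimal sketch of the difference follows. It assumes the data lives in a DataFrame column named X, built here from the sample strings in the question; the names as_lists and as_columns are made up for this illustration.

```python
import pandas as pd

# Assumption: the sample strings from the question, stored in a column named 'X'.
df = pd.DataFrame({'X': [
    'East Bound: 6900 / West Bound: 7700',
    'East Bound: 7800 / West Bound: 8700',
    'North Bound: 5000 / South Bound: 4900',
    'North Bound: 7000 / South Bound: 9000',
    'East Bound: 4900 / West Bound: 9700',
]})

# Default behaviour: still one column, but each cell is now a list of pieces.
as_lists = df['X'].str.split(' / ')

# expand=True: each piece becomes its own column of a new DataFrame.
as_columns = df['X'].str.split(' / ', expand=True)
as_columns.columns = ['First Direction', 'Second direction']

print(as_lists.iloc[0])
print(as_columns)
```

Splitting on ' / ' (with the surrounding spaces) rather than '/' also avoids stray whitespace at the edges of each piece.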

For Q2, you can try converting each value in the X column to a dictionary, then expanding those dictionaries into separate columns:

out = df['X'].apply(lambda x: dict([direction.split(': ') for direction in x.split(' / ')])).apply(pd.Series)

print(out)

  East Bound West Bound North Bound South Bound
0       6900       7700         NaN         NaN
1       7800       8700         NaN         NaN
2        NaN        NaN        5000        4900
3        NaN        NaN        7000        9000
4       4900       9700         NaN         NaN
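To get closer to the layout asked for in the question (zeros instead of NaN, integer values, and columns named North, South, East, West), one possible variation is sketched below. The helper parse_row and the name wide are made up for this illustration, and the DataFrame is rebuilt from the sample strings in the question.

```python
import pandas as pd

# Assumption: the sample strings from the question, stored in a column named 'X'.
df = pd.DataFrame({'X': [
    'East Bound: 6900 / West Bound: 7700',
    'East Bound: 7800 / West Bound: 8700',
    'North Bound: 5000 / South Bound: 4900',
    'North Bound: 7000 / South Bound: 9000',
    'East Bound: 4900 / West Bound: 9700',
]})


def parse_row(row):
    """Turn 'East Bound: 6900 / West Bound: 7700' into {'East': 6900, 'West': 7700}."""
    result = {}
    for part in row.split(' / '):
        name, value = part.split(': ')
        result[name.replace(' Bound', '')] = int(value)
    return result


wide = (
    pd.DataFrame([parse_row(row) for row in df['X']], index=df.index)
    .reindex(columns=['North', 'South', 'East', 'West'])  # fix the column order
    .fillna(0)     # directions missing from a row become 0
    .astype(int)   # NaN forced floats; convert back to integers
)

print(wide)
```

Converting the values to int while parsing keeps the columns numeric, so the later fillna(0) does not mix strings and numbers.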

Answer 2

My approach would be to use Series.str.extractall with a specific pattern to get the direction and the amount, convert the amount to a suitable type (I've just gone for integer here), then use pivot_table, filling in with zeros where appropriate, e.g.:

out = (
    df['X'].str.extractall(r'(?P<bound>North|South|West|East) (?:Bound): (?P<n>\d+)')
    .astype({'n': int})
    .pivot_table(index=pd.Grouper(level=0), columns='bound', values='n', fill_value=0)
)

This'll give you:

bound  East  North  South  West
0      6900      0      0  7700
1      7800      0      0  8700
2         0   5000   4900     0
3         0   7000   9000     0
4      4900      0      0  9700

This retains your original DataFrame's index, so you can merge or join the result back onto the original DataFrame at some point.
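As a rough sketch of why the index survives and how the join back could look: extractall returns one row per match, indexed by the original row label plus a match number, and pivoting on level 0 collapses that back to one row per original row. The names matches and combined are made up for this illustration, the DataFrame is rebuilt from the sample strings in the question, and depending on the pandas version the pivoted values may come back as floats rather than integers.

```python
import pandas as pd

# Assumption: the sample strings from the question, stored in a column named 'X'.
df = pd.DataFrame({'X': [
    'East Bound: 6900 / West Bound: 7700',
    'East Bound: 7800 / West Bound: 8700',
    'North Bound: 5000 / South Bound: 4900',
    'North Bound: 7000 / South Bound: 9000',
    'East Bound: 4900 / West Bound: 9700',
]})

# One row per regex match; the index has two levels:
# (label of the original row, number of the match within that row).
matches = df['X'].str.extractall(
    r'(?P<bound>North|South|West|East) (?:Bound): (?P<n>\d+)'
)
print(matches)

# Pivot on index level 0 so the result is indexed like the original df.
out = (
    matches.astype({'n': int})
    .pivot_table(index=pd.Grouper(level=0), columns='bound', values='n', fill_value=0)
)

# Because the index matches, a plain join lines the rows up again.
combined = df.join(out)
print(combined)
```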

2 answers

Answer 1
1 2022-06-29 19:00:22

Answer 2
1 2022-06-29 19:41:43
