How to split the contents of a column into multiple columns inside a polars dataframe
If I have a string column, namely 'Cabin', in my dataframe, containing values as shown below:
Series: 'Cabin' [str]
[
"B/0/P"
"F/0/S"
"A/0/S"
"A/0/S"
"F/1/S"
]
I want to know the process of splitting the 'Cabin' column into multiple columns as shown below:
A | B | C |
---|---|---|
str | i8 | str |
"B" | 0 | "P" |
"F" | 0 | "S" |
"A" | 1 | "S" |
"C" | 1 | "S" |
I did the initial splitting operation on the column with train.select(pl.col("Cabin").str.split(by="/")).to_series() to get:
Series: 'Cabin' [list]
[
["B", "0", "P"]
["F", "0", "S"]
["A", "0", "S"]
["A", "0", "S"]
["F", "1", "S"]
]
So I want to know the next steps to get my desired output as shown above.
You are getting close. You could either index into this list to create new columns, or use split_exact to create a struct instead.
>>> s = pl.Series("Cabin", ["B/0/P", "F/0/S", "A/0/S"])
>>> train = s.to_frame()
>>> train
shape: (3, 1)
┌───────┐
│ Cabin │
│ --- │
│ str │
╞═══════╡
│ B/0/P │
├╌╌╌╌╌╌╌┤
│ F/0/S │
├╌╌╌╌╌╌╌┤
│ A/0/S │
└───────┘
Indexing into the list (add further expressions with get(1) and get(2) for the remaining parts):
>>> train.with_columns(pl.col("Cabin").str.split("/").list.get(0))
shape: (3, 1)
┌───────┐
│ Cabin │
│ --- │
│ str │
╞═══════╡
│ B │
├╌╌╌╌╌╌╌┤
│ F │
├╌╌╌╌╌╌╌┤
│ A │
└───────┘
Split-exact solution:
>>> train.select(pl.col("Cabin").str.split_exact("/", 2)).unnest("Cabin")
shape: (3, 3)
┌─────────┬─────────┬─────────┐
│ field_0 ┆ field_1 ┆ field_2 │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪═════════╪═════════╡
│ B ┆ 0 ┆ P │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ F ┆ 0 ┆ S │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ A ┆ 0 ┆ S │
└─────────┴─────────┴─────────┘