Create rows based on a column

Question

I want to create a row based on a column.

For example, I have the following data frame:

| lookup_name | alt_name | inventory | location |
|-------------|----------|-----------|----------|
| Honda       | Car      | 1         | au       |
| Apple       | Fruit    | 1         | us       |

I want to convert it to the following:

| lookup_name | inventory | location |
|-------------|-----------|----------|
| Honda       | 1         | au       |
| Car         | 1         | au       |
| Apple       | 1         | us       |
| Fruit       | 1         | us       |

The alt_name column is removed, and the inventory and location values are copied onto each new lookup_name row.

Answer 1

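from pyspark.sql.functions import array, col, explode

# Sample data; assumes an active SparkSession named `spark`.
data = [
    ('Honda', 'Car', 1, 'au'),
    ('Apple', 'Fruit', 1, 'us'),
]

df = spark.createDataFrame(data, ['lookup_name', 'alt_name', 'inventory', 'location'])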

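(
    df
        # Pack both name columns into one array, then emit one row per element.
        .withColumn('lookup_name', explode(array(col('lookup_name'), col('alt_name'))))
        # The alternative name now lives in lookup_name, so drop the old column.
        .drop('alt_name')
        .show(10, False)
)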
# +-----------+---------+--------+
# |lookup_name|inventory|location|
# +-----------+---------+--------+
# |Honda      |1        |au      |
# |Car        |1        |au      |
# |Apple      |1        |us      |
# |Fruit      |1        |us      |
# +-----------+---------+--------+

array(col('lookup_name'), col('alt_name')) combines the two columns into a single array column, for example ['Honda', 'Car']:

df.withColumn('lookup_name', array(col('lookup_name'), col('alt_name'))).show(10, False)
# +--------------+--------+---------+--------+
# |lookup_name   |alt_name|inventory|location|
# +--------------+--------+---------+--------+
# |[Honda, Car]  |Car     |1        |au      |
# |[Apple, Fruit]|Fruit   |1        |us      |
# +--------------+--------+---------+--------+

pyspark.sql.functions.explode(col) returns a new row for each element in the given array or map.
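
To make the two steps concrete without a Spark session, here is a minimal plain-Python sketch of what array followed by explode does to these rows. It only models the row-level behaviour; `explode_names` is a hypothetical helper name chosen for illustration and is not part of PySpark.

```python
# Plain-Python model of array(...) followed by explode(...); no Spark needed.
rows = [
    {"lookup_name": "Honda", "alt_name": "Car", "inventory": 1, "location": "au"},
    {"lookup_name": "Apple", "alt_name": "Fruit", "inventory": 1, "location": "us"},
]


def explode_names(rows):
    """Emit one output row per name, copying the other columns unchanged."""
    result = []
    for row in rows:
        # Step 1 (array): gather both name columns into one list.
        names = [row["lookup_name"], row["alt_name"]]
        # Step 2 (explode): produce one output row per list element.
        for name in names:
            # Step 3 (drop): alt_name is simply not copied to the output row.
            result.append({
                "lookup_name": name,
                "inventory": row["inventory"],
                "location": row["location"],
            })
    return result


exploded = explode_names(rows)
for row in exploded:
    print(row)
```

Each input row yields two output rows, in the order the names were placed in the list, which matches the Honda, Car, Apple, Fruit ordering in the output shown earlier.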

Solution 1
0 2022-08-11 00:47:06
