
Create Rows based on Column

I want to create additional rows based on a column.

For example, I have the following data frame.

| lookup_name | alt_name | inventory | location |
|-------------|----------|-----------|----------|
| Honda       | Car      | 1         | au       |
| Apple       | Fruit    | 1         | us       |


I want to convert it to the following:

| lookup_name | inventory | location |
|-------------|-----------|----------|
| Honda       | 1         | au       |
| Car         | 1         | au       |
| Apple       | 1         | us       |
| Fruit       | 1         | us       |

The alt_name column is removed, and the inventory and location values are copied to each new lookup_name row.

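from pyspark.sql.functions import array, col, explode

# Assumes an active SparkSession is available as `spark`.
data = [
    ('Honda', 'Car', 1, 'au'),
    ('Apple', 'Fruit', 1, 'us'),
]

df = spark.createDataFrame(data, ['lookup_name', 'alt_name', 'inventory', 'location'])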

(
    df
        .withColumn('lookup_name', explode(array(col('lookup_name'), col('alt_name'))))
        .drop('alt_name')
        .show(10, False)
)
# +-----------+---------+--------+
# |lookup_name|inventory|location|
# +-----------+---------+--------+
# |Honda      |1        |au      |
# |Car        |1        |au      |
# |Apple      |1        |us      |
# |Fruit      |1        |us      |
# +-----------+---------+--------+

array(col('lookup_name'), col('alt_name')) combines the two columns into a single array column per row, e.g. ['Honda', 'Car']:

df.withColumn('lookup_name', array(col('lookup_name'), col('alt_name'))).show(10, False)
# +--------------+--------+---------+--------+
# |lookup_name   |alt_name|inventory|location|
# +--------------+--------+---------+--------+
# |[Honda, Car]  |Car     |1        |au      |
# |[Apple, Fruit]|Fruit   |1        |us      |
# +--------------+--------+---------+--------+

pyspark.sql.functions.explode(col) returns a new row for each element in the given array or map, so each two-element array above becomes two rows.
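To make the two steps concrete, here is a minimal plain-Python sketch of what array followed by explode does to each row. It only models the behavior on ordinary tuples and is not how Spark implements it; the helper name `array_then_explode` is hypothetical and not part of PySpark.

```python
# Plain-Python model of array(...) followed by explode(...); no Spark needed.
# `array_then_explode` is a hypothetical helper, not a PySpark function.

def array_then_explode(rows):
    """Yield one output row per name in (lookup_name, alt_name)."""
    for lookup_name, alt_name, inventory, location in rows:
        # Step 1: array(col('lookup_name'), col('alt_name')) -> one list per row
        names = [lookup_name, alt_name]
        # Step 2: explode -> one row per list element; other columns are copied
        for name in names:
            yield (name, inventory, location)


data = [
    ('Honda', 'Car', 1, 'au'),
    ('Apple', 'Fruit', 1, 'us'),
]

result = list(array_then_explode(data))
for row in result:
    print(row)
```

Note that in this sketch a missing alt_name (None) would still produce a row with a None name. As far as I know Spark behaves the same way for a null element inside the array, so filter those rows out afterwards if they are not wanted.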

