Get position of substring after a specific position in PySpark
I have a table like this:
+----+---------------------+
| id | word                |
+----+---------------------+
| 1  | today is a nice day |
| 2  | hello world         |
| 3  | he is good          |
| 4  | is it raining?      |
+----+---------------------+
I want to get the position of the substring "is", but only when it occurs after the third position:
+----+---------------------+-----------------+
| id | word                | substr_position |
+----+---------------------+-----------------+
| 1  | today is a nice day | 7               |
| 2  | hello world         | 0               |
| 3  | he is good          | 4               |
| 4  | is it raining?      | 0               |
+----+---------------------+-----------------+
Any help?
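You can use the locate function in Spark.
It returns the 1-based position of the first occurrence of a substring in a string column, searching from a given position, and returns 0 if the substring is not found.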
from pyspark.sql.functions import locate, col

# Search for "is" starting at position 3 (1-based); rows with no match get 0
df.withColumn("substr_position", locate("is", col("word"), pos=3)).show()
+---+-------------------+---------------+
| id|               word|substr_position|
+---+-------------------+---------------+
|  1|today is a nice day|              7|
|  2|        hello world|              0|
|  3|         he is good|              4|
|  4|     is it raining?|              0|
+---+-------------------+---------------+
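To see why rows 2 and 4 come out as 0, it may help to mimic locate in plain Python. This is only a sketch of the semantics described above, not Spark's implementation: the helper name locate_py is made up for illustration, and it assumes pos >= 1.

```python
def locate_py(substr, s, pos=1):
    """Hypothetical plain-Python stand-in for locate (assumes pos >= 1).

    Returns the 1-based position of the first occurrence of substr in s,
    searching from 1-based position pos, or 0 if there is no such match.
    """
    if pos < 1:
        raise ValueError("this sketch only covers pos >= 1")
    # str.find is 0-based and returns -1 on failure, so adding 1
    # gives a 1-based position, with 0 meaning "not found".
    return s.find(substr, pos - 1) + 1


rows = [
    (1, "today is a nice day"),
    (2, "hello world"),
    (3, "he is good"),
    (4, "is it raining?"),
]

# Same search as in the answer: look for "is" starting at position 3.
result = [(row_id, word, locate_py("is", word, pos=3)) for row_id, word in rows]

for row in result:
    print(row)
```

In row 4 the only "is" starts at position 1, before the search begins, so the result is 0. Note that the search includes position 3 itself; if a match starting exactly at position 3 should be excluded, pos=4 would be the value to pass.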