
How to enumerate the rows of a DataFrame in Spark Scala?

I have a dataframe (renderDF) like this:

+------+---+-------+
|   uid|sid|renders|
+------+---+-------+
| david|  0|      0|
|rachel|  1|      0|
|rachel|  3|      0|
|rachel|  2|      0|
|   pep|  2|      0|
|   pep|  0|      1|
|   pep|  1|      1|
|rachel|  0|      1|
|  rick|  1|      1|
|  ross|  0|      3|
|  rick|  0|      3|
+------+---+-------+

I want to use a window function to achieve this result:

+------+---+-------+-----------+
|   uid|sid|renders|row_number |    
+------+---+-------+-----------+
| david|  0|      0|        1  |
|rachel|  1|      0|        2  |
|rachel|  3|      0|        3  |
|rachel|  2|      0|        4  |
|   pep|  2|      0|        5  |
|   pep|  0|      1|        6  |
|   pep|  1|      1|        7  |
|rachel|  0|      1|        8  |
|  rick|  1|      1|        9  |
|  ross|  0|      3|       10  |
|  rick|  0|      3|       11  |
+------+---+-------+-----------+

I tried:

val windowRender = Window.partitionBy('sid).orderBy('Renders)
renderDF.withColumn("row_number", row_number() over windowRender)

But it doesn't do what I need. Is the partition my problem?

Try this:

val dfWithRownumber = renderDF.withColumn("row_number",
  row_number().over(Window.partitionBy(lit(1)).orderBy("renders")))
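
A minimal, self-contained sketch of the same approach, assuming Spark 2.x or later running locally (the SparkSession setup and app name are illustrative; the sample data is copied from the question):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{lit, row_number}

val spark = SparkSession.builder().master("local[*]").appName("renderRowNumbers").getOrCreate()
import spark.implicits._

// Sample data copied from the question
val renderDF = Seq(
  ("david", 0, 0), ("rachel", 1, 0), ("rachel", 3, 0), ("rachel", 2, 0),
  ("pep", 2, 0), ("pep", 0, 1), ("pep", 1, 1), ("rachel", 0, 1),
  ("rick", 1, 1), ("ross", 0, 3), ("rick", 0, 3)
).toDF("uid", "sid", "renders")

// Partitioning on a constant places every row in the same window partition,
// so row_number() numbers all rows of the DataFrame, ordered by "renders".
val windowAll = Window.partitionBy(lit(1)).orderBy("renders")

val dfWithRownumber = renderDF.withColumn("row_number", row_number().over(windowAll))
dfWithRownumber.show(false)

Two things to keep in mind: a window over a single partition moves all the data to one executor, which can be slow on large DataFrames, and because several rows share the same renders value, the numbering of ties is not deterministic unless you add more columns to orderBy.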
