
How to build a multi-row formula using Cloud Data Fusion

I am trying to build a new column that has the running total of a specific column. Are there directives available to do this? Any suggestions on how to accomplish this?

Wrangler in Cloud Data Fusion (CDF) has no multi-row directives for this scenario. A running total of a specific column can be computed with the Window Aggregation plugin instead: https://cdap.atlassian.net/wiki/spaces/DOCS/pages/760381517/Window+Aggregation+Analytics+Spark

For example, use new_col:Accumulate(specific_col, 1, false) as the aggregate function.

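Make sure you define the right partition column if you need a rolling sum per group of rows. If you need a total over the whole table, create a dummy column with a constant value so that the whole table is treated as one partition. In that case, make sure the table is not huge, because all rows then end up in a single partition.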
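To make the partitioning advice concrete, here is a minimal plain-Python sketch of the result a running total produces, per partition or over the whole table. It is not the plugin's implementation or its configuration syntax. The function name add_running_total, the sample rows, and the column names id, region and amount are all hypothetical, and an explicit ordering column is assumed because a running total is only meaningful for a defined row order.

```python
from collections import defaultdict
from operator import itemgetter
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def add_running_total(
    rows: List[Row],
    value_col: str,
    new_col: str,
    order_col: str,
    partition_col: Optional[str] = None,
) -> List[Row]:
    """Return copies of rows, sorted by order_col, with a running sum in new_col.

    Hypothetical helper for illustration only. With partition_col set, the sum
    restarts for each distinct partition value. With partition_col=None, every
    row shares one constant key, which mirrors the "dummy column" approach.
    """
    totals: Dict[Any, Any] = defaultdict(int)
    result: List[Row] = []
    for row in sorted(rows, key=itemgetter(order_col)):
        key = row[partition_col] if partition_col is not None else "__all__"
        totals[key] += row[value_col]
        result.append({**row, new_col: totals[key]})
    return result


# Made-up sample data, not taken from the question.
sample_rows = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "west", "amount": 5},
    {"id": 3, "region": "east", "amount": 7},
    {"id": 4, "region": "west", "amount": 1},
]

# Rolling sum per group of rows (partitioned by region).
per_region = add_running_total(sample_rows, "amount", "running", "id", "region")

# Rolling sum over the whole table (single partition).
whole_table = add_running_total(sample_rows, "amount", "running", "id")

for row in per_region:
    print(row)
```

The single-partition call keeps one accumulator for every row, which is why the whole-table variant only suits tables of modest size.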

