
How to run jobs in parallel in PySpark?

I am trying to run jobs in parallel. Can you please help me with how to do this?

Example:

Job       Depends on
A         (independent)
B         (independent)
C         A
D         B

You can see here that Jobs A and B are independent, so they will run at the same time. C and D depend on A and B respectively, so each of them will run after its parent job completes. Suppose A takes 10 min and B takes 15 min; then C should start immediately after A completes.

Can we create logic for this scenario? Please let me know if you need more information.

I am not sure what orchestration tool you are using, but you can create a job something like the one below; this is what I follow:

Create a rule-based job such that C is triggered whenever A has new data (and likewise D whenever B has new data).
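If you don't have an external orchestration tool and want to do this directly from the PySpark driver (as the question asks), one common alternative is to submit each independent chain from its own Python thread: Spark actions issued from separate threads on the same SparkSession run concurrently. Below is a minimal sketch of that idea, not the rule-based setup described above; the job_a/job_b/job_c/job_d functions are hypothetical placeholders for your real jobs.

from concurrent.futures import ThreadPoolExecutor
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parallel-jobs").getOrCreate()

def job_a():
    # placeholder work for independent job A
    return spark.range(10).count()

def job_b():
    # placeholder work for independent job B
    return spark.range(20).count()

def job_c(a_result):
    # C depends on A, so it is only called with A's finished result
    return a_result * 2

def job_d(b_result):
    # D depends on B
    return b_result * 2

def chain_a_then_c():
    # C starts immediately after A completes, regardless of B
    return job_c(job_a())

def chain_b_then_d():
    return job_d(job_b())

with ThreadPoolExecutor(max_workers=2) as pool:
    # the two chains run in parallel; order is preserved within each chain
    fut_ac = pool.submit(chain_a_then_c)
    fut_bd = pool.submit(chain_b_then_d)
    print(fut_ac.result(), fut_bd.result())

If A takes 10 min and B takes 15 min, the A→C chain does not wait for B; total runtime is roughly the longer of the two chains rather than the sum of all four jobs.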
