Assert a value of a specific cell in spark df in python
What is the easiest way of asserting specific cell values in pyspark dataframes?
+---------+--------+
|firstname|lastname|
+---------+--------+
|    James|   Smith|
|     Anna|    null|
|    Julia|Williams|
|    Maria|   Jones|
|      Jen|   Brown|
|     Mike|Williams|
+---------+--------+
I want to assert the existence of the values null and "Jen" in their respective rows/columns in this data frame.
So I can use something like:
assert df['firstname'][4] == "Jen"
assert df['lastname'][1] == None
From what I found, using collect() is the way to go (roughly the equivalent of iloc[] on a Pandas DataFrame):
rows = df.collect()  # collect once; each collect() call runs a full Spark job
assert rows[4]['firstname'] == 'Jen'
assert rows[1]['lastname'] is None
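For intuition: collect() pulls the whole DataFrame to the driver as a list of pyspark.sql.Row objects, and each Row supports field access by name, so the result indexes much like a list of dicts (SQL NULL comes back as Python None). A minimal, Spark-free sketch of those indexing semantics, with plain dicts standing in for Row and the data copied from the table above:

```python
# Stand-in for the list that df.collect() would return; plain dicts
# mimic Row's by-name field access (this is an illustration, not
# actual pyspark output objects).
rows = [
    {"firstname": "James", "lastname": "Smith"},
    {"firstname": "Anna",  "lastname": None},      # SQL null -> Python None
    {"firstname": "Julia", "lastname": "Williams"},
    {"firstname": "Maria", "lastname": "Jones"},
    {"firstname": "Jen",   "lastname": "Brown"},
    {"firstname": "Mike",  "lastname": "Williams"},
]

# Index by row position first, then by column name.
assert rows[4]["firstname"] == "Jen"
assert rows[1]["lastname"] is None
```

Note that collect() materializes every row on the driver, so for large DataFrames it is worth filtering down (or taking only the rows you need) before collecting.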