How to filter a dataframe based on values in pyspark/python?

Question

I have a dataframe like the one below. I want to read the dataframe, filter the records based on start time, and store them in different dataframes.

INPUT DF

name      start_time
AA        2022-11-16
AAA       2022-11-15
BBB       2022-11-14

For example, I need to store each record based on its start time: all records with a start time of the 16th should go into one dataframe, those of the 15th into another, and so on.

OUTPUT DF

df1 = ["Store 2022-11-16 record"]
df2 = ["Store 2022-11-15 record"]
df3 = ["Store 2022-11-14 record"]

Answer 1

Well, this is technically a duplicate, but I don't know how to report that. I think this works:

import pandas as pd

df = pd.DataFrame({"name": ["AA", "AAA", "BBB"],
                   "start_time": ["2022-11-16", "2022-11-15", "2022-11-14"]})

# Split into one DataFrame per distinct start_time, keyed by the date string
dfs = dict(tuple(df.groupby('start_time')))

dfs

You can then select each DataFrame by its start time:

print(dfs['2022-11-14'])

  name  start_time
2  BBB  2022-11-14
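
For comparison, the same split can also be written as an explicit filter per distinct date, which is closer to the "filter" wording in the question. This is only a minimal sketch, assuming the same pandas DataFrame as in the answer; the name frames_by_date is illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["AA", "AAA", "BBB"],
    "start_time": ["2022-11-16", "2022-11-15", "2022-11-14"],
})

# Build one filtered DataFrame per distinct start_time using a boolean mask.
# frames_by_date is a hypothetical name chosen for this sketch.
frames_by_date = {
    date: df[df["start_time"] == date]
    for date in df["start_time"].unique()
}

# Pull out the individual DataFrames the question asks for.
df1 = frames_by_date["2022-11-16"]
df2 = frames_by_date["2022-11-15"]
df3 = frames_by_date["2022-11-14"]

print(df1)
```

Keeping the results in a dictionary rather than in separately numbered variables means the code does not need to change when new dates appear. The question also mentions PySpark; there the same idea would typically be expressed with DataFrame.filter on each distinct value, which this pandas sketch does not cover.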

Solution 1
0 2022-11-17 13:58:18
