Python MapReduce to find daily max, min, mean and variance in temperature for each weather station from hourly data
I have a data file of hourly weather data. A screenshot of the data is attached below.
The relevant columns are
The data is hourly, so there are 24 rows per day for every station. I need to find the daily max, min, mean and variance of "Dry Bulb Temp" for all the weather stations. I can solve this using numpy or other libraries, but I am not allowed to use any package that provides statistics. I MUST use the MapReduce framework to complete this task. I am not familiar with it and can't find help in other questions. How do I approach this?
Thanks
Use pandas:
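import pandas as pd

# df is the input dataframe shown further down.
# Group by station and day, then aggregate the temperature column.
out = (df.groupby(['WbanNumber', 'YearMonthDay'])['Dry Bulb Temp']
         .agg(['max', 'min', 'mean', 'var'])
         .reset_index())
print(out)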
# Output
   WbanNumber  YearMonthDay  max  min       mean        var
0        3011      20070101   49   30  38.583333  29.644928
1        3011      20070102   49   30  39.583333  28.514493
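A note on the var column: pandas computes the sample variance by default (ddof=1). The standard identity below is not part of the original answer; it rewrites that variance in terms of a count, a sum and a sum of squares, which are quantities that can be accumulated one row at a time.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
    = \frac{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}{n-1}
```

For the 24 readings of 20070101 in the input dataframe, n = 24, the sum is 926 and the sum of squares is 36410, so the mean is 926 / 24 ≈ 38.583333 and s² = (36410 - 926² / 24) / 23 ≈ 29.644928, which matches the first output row.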
My input dataframe:
WbanNumber YearMonthDay Dry Bulb Temp
0 3011 20070101 46
1 3011 20070101 34
2 3011 20070101 49
3 3011 20070101 45
4 3011 20070101 30
5 3011 20070101 30
6 3011 20070101 43
7 3011 20070101 40
8 3011 20070101 37
9 3011 20070101 40
10 3011 20070101 44
11 3011 20070101 36
12 3011 20070101 45
13 3011 20070101 40
14 3011 20070101 30
15 3011 20070101 34
16 3011 20070101 41
17 3011 20070101 33
18 3011 20070101 43
19 3011 20070101 40
20 3011 20070101 39
21 3011 20070101 32
22 3011 20070101 36
23 3011 20070101 39
24 3011 20070102 32
25 3011 20070102 39
26 3011 20070102 47
27 3011 20070102 34
28 3011 20070102 30
29 3011 20070102 40
30 3011 20070102 48
31 3011 20070102 35
32 3011 20070102 34
33 3011 20070102 40
34 3011 20070102 42
35 3011 20070102 38
36 3011 20070102 36
37 3011 20070102 32
38 3011 20070102 40
39 3011 20070102 41
40 3011 20070102 38
41 3011 20070102 38
42 3011 20070102 45
43 3011 20070102 39
44 3011 20070102 48
45 3011 20070102 44
46 3011 20070102 41
47 3011 20070102 49
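Since the question rules out statistics packages and asks for MapReduce, here is a minimal pure-Python sketch of the map / shuffle / reduce pattern for the same statistics. It is only an illustration under stated assumptions: each record is a (WbanNumber, YearMonthDay, Dry Bulb Temp) tuple with a numeric temperature, the function names (mapper, combine, reducer, map_reduce) are hypothetical, and no real MapReduce framework is involved. Adapting it to Hadoop Streaming or a similar tool would need extra input and output handling.

```python
from collections import defaultdict
from functools import reduce


def mapper(record):
    """Map one hourly record to ((station, day), partial summary).

    The partial summary is (count, sum, sum of squares, min, max).
    Assumes record = (station, day, temperature) with a numeric temperature.
    """
    station, day, temp = record
    t = float(temp)
    return (station, day), (1, t, t * t, t, t)


def combine(a, b):
    """Merge two partial summaries. Associative, so it could also act as a combiner."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2],
            min(a[3], b[3]), max(a[4], b[4]))


def reducer(key, partials):
    """Fold all partial summaries for one key into the final daily statistics."""
    n, total, total_sq, lo, hi = reduce(combine, partials)
    mean = total / n
    # Sample variance (divides by n - 1), to match pandas' default.
    var = (total_sq - n * mean * mean) / (n - 1) if n > 1 else float("nan")
    return key, {"max": hi, "min": lo, "mean": mean, "var": var}


def map_reduce(records):
    """Run map, group by key (the shuffle step), then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        key, value = mapper(record)
        groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    # The 24 readings for station 3011 on 20070101 from the input dataframe.
    day1 = [46, 34, 49, 45, 30, 30, 43, 40, 37, 40, 44, 36,
            45, 40, 30, 34, 41, 33, 43, 40, 39, 32, 36, 39]
    records = [(3011, 20070101, t) for t in day1]
    for key, stats in map_reduce(records).items():
        print(key, stats)
```

The mapper emits a small summary per row instead of the raw value so that summaries can be merged in any order; for this day the result should agree with the first row of the pandas output up to floating-point rounding. The sum-of-squares formula can lose precision when values are very large relative to their spread, which is unlikely to matter for temperatures in this range.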