Pandas: selecting a column in a DataFrame, e.g. row[1]['Column']

Question

I don't understand this line of code:

minimum.append(min(j[1]['Data_Value']))

...specifically:

j[1]['Data_Value']

I know the full code finds the minimum value and stores it in a list called minimum, but what does the j[1] do there? I've tried using other numbers to figure it out, but I get an error. Is it selecting the index or something?

Full code below. Thanks!

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook

df1 = pd.read_csv('./data/C2A2_data/BinnedCsvs_d400/ed157460d30113a689e487b88dcbef1f5d64cbd8bb7825f5f485013d.csv')

minimum = []
maximum = []
month = []
df1 = df1[~(df1['Date'].str.endswith(r'02-29'))]
times1 = pd.DatetimeIndex(df1['Date'])


df = df1[times1.year != 2015]
times = pd.DatetimeIndex(df['Date'])
for j in df.groupby([times.month, times.day]):
    minimum.append(min(j[1]['Data_Value']))
    maximum.append(max(j[1]['Data_Value']))

Answer 1

Explanation

df.groupby(...) returns a GroupBy object, and iterating over it yields tuples of (key, dataframe). The key is the value of the groupby key for that group. See the example below.

Looping over these j's means looping over these tuples.

j[0] refers to the group "key".
j[1] takes the dataframe component of that tuple. ['Data_Value'] then selects a column of that dataframe.
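The tuple structure described above can also be unpacked directly in the for statement, which avoids the numeric indices altogether. The following is a minimal sketch on made-up data (the "day" column and its values are hypothetical, not taken from the question's file). It also shows why other indices fail: each tuple has exactly two elements, so only 0 and 1 are valid and j[2] raises an IndexError.

```python
import pandas as pd

# Hypothetical toy data standing in for the question's temperature file.
df = pd.DataFrame({
    "day": ["01-01", "01-01", "01-02"],
    "Data_Value": [10, -5, 7],
})

minimum = []
for key, group in df.groupby("day"):
    # key is the group label (what j[0] held);
    # group is the sub-DataFrame for that label (what j[1] held).
    minimum.append(group["Data_Value"].min())

# Each item produced by the iteration is a plain two-element tuple.
first = next(iter(df.groupby("day")))
print(type(first).__name__, len(first))
print(minimum)
```

Naming the two parts (key, group) tends to make the loop body easier to read than j[0] and j[1].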

Example

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': [2, 4, 6]})
df_grouped = df.groupby('a')

# Each j is a (key, dataframe) tuple
for j in df_grouped:
    print(f"Groupby key (col a): {j[0]}")
    print("dataframe:")
    print(j[1])

Yields:

Groupby key (col a): 1
dataframe:
   a  b
0  1  2
1  1  4
Groupby key (col a): 2
dataframe:
   a  b
2  2  6

More readable solution

Another, more convenient way to get the min/max of Data_Value for every month-day combination is this:

data_value_summary = df \
    .groupby([times.month, times.day]) \
    .agg({'Data_Value': ['min', 'max']}) \
    ['Data_Value']  # selecting 'Data_Value' drops the outer header level, leaving columns 'min' and 'max'

minimum = data_value_summary['min']
maximum = data_value_summary['max']
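To try the aggregation approach above without the original CSV, here is a small self-contained sketch. The four rows of sample data are invented for illustration. It selects the column before aggregating, which is a variant of the snippet above rather than the answer's exact code, and it converts the results to plain lists so they match the minimum and maximum lists in the question.

```python
import pandas as pd

# Hypothetical sample; the real file has many readings per date.
df = pd.DataFrame({
    "Date": ["2005-01-01", "2006-01-01", "2005-01-02", "2006-01-02"],
    "Data_Value": [3, -2, 8, 5],
})
times = pd.DatetimeIndex(df["Date"])

# Group by (month, day), then compute min and max of Data_Value per group.
summary = (
    df.groupby([times.month, times.day])["Data_Value"]
      .agg(["min", "max"])
)

# One entry per month-day combination, in sorted group order.
minimum = summary["min"].tolist()
maximum = summary["max"].tolist()
print(minimum, maximum)
```

Selecting the column first means the result has a single header row, so no extra indexing step is needed afterwards.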
