
Pandas: selecting columns in a DataFrame question - e.g. row[1]['Column']

I don't understand this line of code

minimum.append(min(j[1]['Data_Value']))

...specifically

j[1]['Data_Value']

I know the full code returns the minimum value and stores it in a list called minimum, but what does the j[1] do there? I've tried using other numbers to figure it out but get an error. Is it selecting the index or something?

Full code below. Thanks!

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook

df1 = pd.read_csv('./data/C2A2_data/BinnedCsvs_d400/ed157460d30113a689e487b88dcbef1f5d64cbd8bb7825f5f485013d.csv')

minimum = []
maximum = []
month = []
df1 = df1[~(df1['Date'].str.endswith(r'02-29'))]
times1 = pd.DatetimeIndex(df1['Date'])


df = df1[times1.year != 2015]
times = pd.DatetimeIndex(df['Date'])
for j in df.groupby([times.month, times.day]):
    minimum.append(min(j[1]['Data_Value']))
    maximum.append(max(j[1]['Data_Value']))

Explanation

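DataFrame.groupby returns a GroupBy object, not a list. Iterating over that object yields tuples of the form (key, dataframe). The key is the value of the groupby key for that group, and the dataframe holds that group's rows. See the example below.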

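Looping over the grouped object therefore means that each j is one of these tuples. A tuple of this kind has exactly two elements, so j[0] and j[1] are the only valid positions. Any index of 2 or higher raises an IndexError, which is why trying other numbers failed.

  • j[0] is the group "key". The question's code groups by two keys, so there j[0] is itself a (month, day) tuple.
  • j[1] is the dataframe component of that tuple. ['Data_Value'] then selects a column of that dataframe.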
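
The same loop can also be written with tuple unpacking, which avoids positional indexing altogether. Below is a minimal sketch on made-up data; the toy frame and the names key, group and first are illustrative assumptions, not part of the question's code.

```python
import pandas as pd

# Hypothetical toy data standing in for the question's CSV file.
toy = pd.DataFrame({
    'Date': ['2005-01-01', '2006-01-01', '2005-01-02'],
    'Data_Value': [10, -5, 7],
})
times = pd.DatetimeIndex(toy['Date'])
grouped = toy.groupby([times.month, times.day])

# Unpack each (key, dataframe) tuple instead of writing j[0] and j[1].
minimum = []
for key, group in grouped:
    # key is a (month, day) tuple; group holds the rows for that day
    minimum.append(group['Data_Value'].min())
print(minimum)

# Each item really is a 2-tuple, so position 2 does not exist.
first = next(iter(grouped))
print(len(first))
try:
    first[2]
except IndexError as err:
    print("IndexError:", err)
```

Here group['Data_Value'].min() plays the role of min(j[1]['Data_Value']) in the original loop.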

Example

df = pd.DataFrame({'a': [1, 1, 2], 'b': [2, 4, 6]})
df_grouped = df.groupby('a')

for j in df_grouped:
    print(f"Groupby key (col a): {j[0]}")
    print("dataframe:")
    print(j[1])

Yields:

Groupby key (col a): 1
dataframe:
   a  b
0  1  2
1  1  4
Groupby key (col a): 2
dataframe:
   a  b
2  2  6

More readable solution

Another, more readable way to get the min/max of Data_Value for every month-day combination is to let pandas aggregate each group directly:

data_value_summary = (
    df.groupby([times.month, times.day])['Data_Value']  # select the column first, so the result has a single header row
    .agg(['min', 'max'])  # the strings name the pandas aggregations, instead of passing the min/max builtins
)

minimum = data_value_summary['min']
maximum = data_value_summary['max']
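
As a sanity check, the aggregated result can be compared with the loop-based lists on a small made-up frame. This is only a sketch; the toy data and the names summary, loop_min and loop_max are assumptions for illustration, not the question's CSV.

```python
import pandas as pd

# Hypothetical toy data: two years of readings for two calendar days.
toy = pd.DataFrame({
    'Date': ['2005-01-01', '2006-01-01', '2005-01-02', '2006-01-02'],
    'Data_Value': [10, -5, 7, 12],
})
times = pd.DatetimeIndex(toy['Date'])

# Loop version, in the style of the question.
loop_min, loop_max = [], []
for _, group in toy.groupby([times.month, times.day]):
    loop_min.append(group['Data_Value'].min())
    loop_max.append(group['Data_Value'].max())

# Aggregation version: one row per (month, day), columns 'min' and 'max'.
summary = toy.groupby([times.month, times.day])['Data_Value'].agg(['min', 'max'])

print(summary)
print(summary['min'].tolist() == loop_min)
print(summary['max'].tolist() == loop_max)
```

Note that the aggregated columns are pandas Series indexed by (month, day) rather than plain lists; calling .tolist() on them gives lists like those built by the loop.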
