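Slicing a dataframe that matches multiple conditions in Pandas
I am trying to slice a dataframe so that it returns the TrueHit column in ascending order for the rows where yearID is 2010 and G is greater than 100. My code does everything except apply the filter: instead of creating the subset for 2010 and 130, it performs all the operations on the whole dataframe.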
import pandas as pd
import numpy as np
url = 'https://raw.githubusercontent.com/maniac73/Baseball/master/Truehit.csv'
df = pd.read_csv(url)
df.query('yearID == 2010 and G > 100')
df.replace([np.inf, -np.inf], np.nan, inplace=True)
df.dropna(inplace=True)
df.sort_values('TrueHit', ascending=False)
Sample output below:
playerID yearID stint teamID lgID G AB R H 2B 3B HR RBI SB CS BB SO IBB HBP SH SF GIDP 1B TrueHit
66039 66039 perrypa02 1988 2 CHN NL 35 1 1 1 0 0 1 2.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0 5.0
87039 87039 rogered01 2005 1 BAL AL 8 1 4 1 0 0 1 2.0 0.0 2.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0 5.0
78779 78779 motagu01 1999 1 MON NL 51 1 1 1 0 0 1 3.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0 5.0
90648 90648 hernafe02 2008 1 SEA AL 31 1 1 1 0 0 1 4.0 0.0 0.0 0 0.0 0.0 0.0 1.0 0.0 0.0 0 5.0
61956 61956 quirkja01 1984 2 CLE AL 1 1 1 1 0 0 1 1.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 0 5.0
Maybe something like this?
import pandas as pd
import numpy as np
url = 'https://raw.githubusercontent.com/maniac73/Baseball/master/Truehit.csv'
df = pd.read_csv(url)
df.replace([np.inf, -np.inf], np.nan, inplace=True)
df.dropna(inplace=True)
df.sort_values('TrueHit', ascending=False, inplace=True)
df[(df["yearID"] == 2010) & (df["G"] > 100)]
output:
Unnamed: 0 playerID yearID stint teamID lgID G AB R H 2B 3B HR RBI SB CS BB SO IBB HBP SH SF GIDP 1B TrueHit
94089 94089 thomeji01 2010 1 MIN AL 108 276 48 78 16 2 25 59.0 0.0 0.0 60 82.0 4.0 2.0 0.0 2.0 8.0 35 1.358696
94148 94148 vottojo01 2010 1 CIN NL 150 547 106 177 36 2 37 113.0 16.0 5.0 91 125.0 8.0 7.0 0.0 3.0 11.0 102 1.248629
93202 93202 dunnad01 2010 1 WAS NL 158 558 85 145 36 2 38 103.0 0.0 1.0 77 199.0 10.0 9.0 0.0 4.0 10.0 69 1.222222
93145 93145 custja01 2010 1 OAK AL 112 349 50 95 19 0 13 52.0 2.0 2.0 68 127.0 0.0 5.0 0.0 3.0 6.0 63 1.214900
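For what it's worth, the likely reason the code in the question ignores the filter is that `df.query(...)` returns a new DataFrame and leaves `df` untouched, so the result has to be assigned (or chained) for the later steps to see it. The same goes for `sort_values` without `inplace=True`. Below is a minimal sketch of that idea. It uses a small made-up frame instead of downloading Truehit.csv, so the rows and values are assumptions for illustration only. Note also that the question text says "ascending" while the posted code passes `ascending=False`; the sketch keeps `ascending=False`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows standing in for Truehit.csv (all values are made up).
df = pd.DataFrame({
    "playerID": ["a01", "b01", "c01", "d01", "e01"],
    "yearID": [2010, 2010, 2010, 1988, 2010],
    "G": [150, 108, 35, 120, 112],
    "TrueHit": [1.25, 1.36, 5.0, 0.9, np.inf],
})

# query() returns a filtered copy; without an assignment the result is discarded
# and df still holds every row.
df.query("yearID == 2010 and G > 100")
rows_after_bare_query = len(df)

# Chain the steps and keep the result: clean, filter, then sort the subset.
subset = (
    df.replace([np.inf, -np.inf], np.nan)   # treat infinities as missing
      .dropna()                             # drop rows with missing values
      .query("yearID == 2010 and G > 100")  # keep only the matching rows
      .sort_values("TrueHit", ascending=False)
)
print(subset)
```

The boolean-mask form `df[(df["yearID"] == 2010) & (df["G"] > 100)]` selects the same rows as the `query` call; the parentheses around each comparison are needed because `&` binds more tightly than `==` and `>`.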