Shifting down rows of specific columns from a specific index in Python
I am scraping multiple tables from multiple pages of a website. The issue is that a row is missing from the initial table.
Basically, this is how the dataframe looks:
             mar2018  feb2018  jan2018  dec2017  nov2017               oct2017  sep2017  aug2017
balls faced      345      561      295        0      645  balls faced      200       58        0
runs scored      156      281      183        0      389  runs scored       50       20        0
strike rate     52.3     42.6     61.1        0     52.2  strike rate       25       34        0
dot balls        223      387      173        0      476  dot balls        125       34        0
fours              8       12       19        0       22  sixes              2        0        0
doubles           20       38       16        0       36  fours              4        2        0
notout             2        0        0        0        4  doubles            2        0        0
                                                          notout             4        2        0
The row 'sixes' is missing on the first page and present on the subsequent pages. So I am trying to move the rows from 'fours' to 'notout' down by one position and leave NaNs in row 4 for the first 5 columns, from mar2018 to nov2017.
I tried the following code, but it isn't working. It moves the values horizontally rather than vertically downward:
df.iloc[4][0:6] = df.iloc[4][0:6].shift(1)
and also
df2 = pd.DataFrame(index = 4)
df = pd.concat([df.iloc[:], df2, df.iloc[4:]]).reset_index(drop=True)
did not work either. I also tried:
df['mar2018'] = df['mar2018'].shift(1)
But this moves all the values of that column down by one row.
So I was wondering: is it possible to shift down the rows of specific columns starting from a specific index?
I think you need to reindex both DataFrames by the union of all index values, computed with numpy.union1d:
idx = np.union1d(df1.index, df2.index)
df1 = df1.reindex(idx)
df2 = df2.reindex(idx)
print (df1)
mar2018 feb2018 jan2018 dec2017 nov2017
balls faced 345.0 561.0 295.0 0.0 645.0
dot balls 223.0 387.0 173.0 0.0 476.0
doubles 20.0 38.0 16.0 0.0 36.0
fours 8.0 12.0 19.0 0.0 22.0
notout 2.0 0.0 0.0 0.0 4.0
runs scored 156.0 281.0 183.0 0.0 389.0
sixes NaN NaN NaN NaN NaN
strike rate 52.3 42.6 61.1 0.0 52.2
print (df2)
oct2017 sep2017 aug2017
balls faced 200 58 0
dot balls 125 34 0
doubles 2 0 0
fours 4 2 0
notout 4 2 0
runs scored 50 20 0
sixes 2 0 0
strike rate 25 34 0
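Note that numpy.union1d returns the labels sorted, which is why the rows above come out in alphabetical order rather than in scraped order. If the original row order matters, one possible variation is to build the union with Index.union(..., sort=False) instead. The sketch below is only an illustration: the two small frames are hypothetical stand-ins for the scraped tables, and it assumes the row labels are already the index of each frame.

```python
import pandas as pd

# Hypothetical miniatures of the two scraped tables (row labels as the index).
df1 = pd.DataFrame(
    {"mar2018": [345, 223, 8, 2], "feb2018": [561, 387, 12, 0]},
    index=["balls faced", "dot balls", "fours", "notout"],
)
df2 = pd.DataFrame(
    {"oct2017": [200, 125, 2, 4, 4], "sep2017": [58, 34, 0, 2, 2]},
    index=["balls faced", "dot balls", "sixes", "fours", "notout"],
)

# Unsorted union: df2's labels in their own order, then labels found only in df1.
idx = df2.index.union(df1.index, sort=False)

# Reindexing inserts an all-NaN 'sixes' row into df1 at the right position.
df1_aligned = df1.reindex(idx)
df2_aligned = df2.reindex(idx)

# With identical indexes the frames can be placed side by side.
combined = pd.concat([df1_aligned, df2_aligned], axis=1)
print(combined)
```

Starting the union from the frame that has the complete set of rows (df2 here) is what puts 'sixes' between 'dot balls' and 'fours'.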
If you have multiple DataFrames in a list, use functools.reduce to build the union of their indexes and a list comprehension to reindex each one:
from functools import reduce
dfs = [df1, df2]
idx = reduce(np.union1d, [x.index for x in dfs])
dfs1 = [df.reindex(idx) for df in dfs]
print (dfs1)
[ mar2018 feb2018 jan2018 dec2017 nov2017
balls faced 345.0 561.0 295.0 0.0 645.0
dot balls 223.0 387.0 173.0 0.0 476.0
doubles 20.0 38.0 16.0 0.0 36.0
fours 8.0 12.0 19.0 0.0 22.0
notout 2.0 0.0 0.0 0.0 4.0
runs scored 156.0 281.0 183.0 0.0 389.0
sixes NaN NaN NaN NaN NaN
strike rate 52.3 42.6 61.1 0.0 52.2, oct2017 sep2017 aug2017
balls faced 200 58 0
dot balls 125 34 0
doubles 2 0 0
fours 4 2 0
notout 4 2 0
runs scored 50 20 0
sixes 2 0 0
strike rate 25 34 0]
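If the row labels are not available as an index and a literal positional shift is preferred, shift can be applied to just a slice of rows and columns and assigned back. This is a minimal sketch under assumptions: the frame below is a hypothetical miniature with a default integer index, and the names cols and start are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature: the first-page columns are one row short (trailing NaN).
df = pd.DataFrame({
    "mar2018": [345, 156, 52.3, 223, 8, 20, 2, np.nan],
    "feb2018": [561, 281, 42.6, 387, 12, 38, 0, np.nan],
    "oct2017": [200, 50, 25, 125, 2, 4, 2, 4],
})

cols = ["mar2018", "feb2018"]  # columns to shift (assumed names)
start = 4                      # first row label to move down

# shift(1) on the sliced block moves its values down one row and leaves NaN
# in the first row of the slice; the assignment aligns on row and column labels.
df.loc[start:, cols] = df.loc[start:, cols].shift(1)
print(df)
```

The last value of each shifted slice is dropped, so this only fits when the trailing row of those columns is empty, as it is here. Columns outside cols and rows above start are left untouched.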