
Data cleaning using pandas from excel file in Python

I am trying to clean a DataFrame so that it looks like the picture linked here: 1. I figured out how to remove the top values (the first two rows of the Excel file are empty), but I cannot figure out how to remove the values from row 5 downwards as well. I tried the skiprows=range(...) argument, but I am already using it to skip the first rows of the Excel file. The Excel file has 298 rows in total, so I want to eliminate rows 5 to 298 from the DataFrame.

The code I have used so far:

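from pathlib import Path

import numpy as np
import pandas as pd

src_file = Path.cwd() / 'a1_data1.xlsx'

# Read only column B, skipping the empty rows at the top of the sheet
df = pd.read_excel(src_file, header=1, usecols='B', skiprows=range(0, 1))
df = np.round(df, 1)  # round to one decimal place
df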

Does anybody know how to do this?

Thanks, Adrian

df = pd.read_excel(src_file, header=1, usecols='B', skiprows = range(0,1))
df = np.round(df, 1) 
df = df.head(5)
df

This will do it: head(5) returns the first 5 rows.
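A minimal sketch of what head(5) keeps, run on a made-up 298-row column instead of the real a1_data1.xlsx (the column name "value" and the data are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the column read from the Excel file (298 rows).
df = pd.DataFrame({"value": np.linspace(0.0, 29.7, 298)})
df = df.round(1)

first_five = df.head(5)   # rows at positions 0..4
same_rows = df.iloc[:5]   # positional slice that selects the same rows

print(first_five)
```

If the rest of the sheet is never needed, read_excel also accepts an nrows argument that limits how many data rows are parsed, which may avoid the extra step.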

df.head(5) # for First Five Rows
df.tail(5) # for Last Five Rows
df = pd.concat([df.iloc[:4,:], df.iloc[298:,:]], ignore_index=True, sort=False)

This eliminates the rows between 5 and 298. I am not sure whether you want to keep the 5th row or not, so adjust the slice bounds yourself.
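To see how the slice bounds behave, here is a small sketch on a made-up 300-row frame (the data and the column name "value" are assumptions, not the asker's file). It also shows df.drop as an alternative way to remove the same positional range. Note that if the frame has 298 rows or fewer, df.iloc[298:] is empty and the result is just the first slice.

```python
import pandas as pd

# Hypothetical frame with 300 rows so that the tail slice is not empty.
df = pd.DataFrame({"value": range(300)})

# Keep positions 0..3 and 298 onwards; positions 4..297 are dropped.
kept = pd.concat([df.iloc[:4, :], df.iloc[298:, :]], ignore_index=True, sort=False)

# Alternative: drop the same positional range, then renumber the index.
dropped = df.drop(df.index[4:298]).reset_index(drop=True)

# To keep the fifth row as well, widen the first slice by one.
keep_fifth = pd.concat([df.iloc[:5, :], df.iloc[298:, :]], ignore_index=True)

print(kept)
```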
