Data cleaning using pandas from an Excel file in Python

I am trying to clean a DataFrame so that it looks like the picture added here: 1. I figured out how to remove the upper values (the first two rows of the Excel file are empty), but I cannot figure out how to remove the values from row 5 downwards as well. I have tried the skiprows=range(...) argument, but I am already using it to skip the first rows of the Excel file. The Excel file has 298 rows in total, so I want to eliminate all the rows from 5 to 298 in the DataFrame.

The code I have used until now:

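from pathlib import Path

import numpy as np
import pandas as pd

src_file = Path.cwd() / 'a1_data1.xlsx'

# Read column B only, skipping the empty rows at the top of the sheet
df = pd.read_excel(src_file, header=1, usecols='B', skiprows=range(0, 1))
df = np.round(df, 1)
df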

Does anybody know how to do this?

Thanks, Adrian

df = pd.read_excel(src_file, header=1, usecols='B', skiprows=range(0, 1))
df = np.round(df, 1)
df = df.head(5)
df

This will do it: head(5) gives you the first 5 rows.
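To make the effect of head(5) concrete, here is a minimal sketch that uses a made-up 298-row frame in place of the Excel data. The column name `value` and the numbers are invented for illustration and are not from the original file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the column read from the Excel file: 298 rows.
df = pd.DataFrame({"value": np.arange(298) * 0.123})
df = df.round(1)

# Keep only the first five rows; these three forms select the same rows.
first_five = df.head(5)
same_by_position = df.iloc[:5]
same_by_drop = df.drop(df.index[5:])

print(first_five)
```

If the remaining rows are never needed, pd.read_excel also accepts an nrows argument, which limits how many rows are parsed in the first place.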

df.head(5) # for First Five Rows
df.tail(5) # for Last Five Rows
df = pd.concat([df.iloc[:4,:], df.iloc[298:,:]], ignore_index=True, sort=False)

This eliminates all the rows from 5 to 298. I am not sure whether you want to keep the 5th row or not, so please adjust the slice yourself.
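A small sketch of how the slice bounds behave, assuming a hypothetical frame with exactly 298 rows (the `value` column is invented for illustration). iloc[:4] keeps positions 0 to 3, and iloc[298:] is empty when there are no rows past position 297, so the concat only appends something if the frame is longer than 298 rows.

```python
import pandas as pd

# Hypothetical 298-row frame standing in for the Excel data.
df = pd.DataFrame({"value": range(298)})

# Rows at positions 0-3 are kept; positions 4-297 are dropped.
# With exactly 298 rows, df.iloc[298:] is empty, so nothing is appended.
trimmed = pd.concat(
    [df.iloc[:4, :], df.iloc[298:, :]], ignore_index=True, sort=False
)

# Widen the first slice to iloc[:5] to keep the fifth row as well.
trimmed_keep_fifth = pd.concat(
    [df.iloc[:5, :], df.iloc[298:, :]], ignore_index=True, sort=False
)

print(len(trimmed), len(trimmed_keep_fifth))
```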


 