Is there a way to remove a row in Excel with Python if a certain cell contains a "0"?
I am looking for a quick way to edit an Excel equipment list with Python. Currently I am looking at an equipment list with many line items containing a "0" in the quantity column.
I would like the rows that have a quantity of "0" to be deleted.
Example:
from this:
1 | Pots | 3 | 10.99
2 | Pans | 0 | 16.99
3 | Spoons | 1 | 11.99
4 | Forks | 7 | 0.99
5 | Knives | 0 | 20.99
6 | Lids | 0 | 12.99
7 | Spatulas| 2 | 5.99
8 | Tongs | 8 | 6.99
9 | Grill | 1 | 12.99
to this:
1 | Pots | 3 | 10.99
3 | Spoons | 1 | 11.99
4 | Forks | 7 | 0.99
7 | Spatulas| 2 | 5.99
8 | Tongs | 8 | 6.99
9 | Grill | 1 | 12.99
(No need to renumber the "Item No." column.)
I am still learning Python. I know how to create a DataFrame with pandas and remove rows given certain conditions, but I am not sure how to import an existing Excel file and remove rows based on a cell condition.
# Here is what I have done so far
import pandas as pd
d = {
    'Equipment': ['Pots', 'Pans', 'Spoons', 'Forks', 'Knives', 'Lids',
                  'Spatulas', 'Tongs', 'Grill', 'Skewers'],
    'QTY': [3, 0, 1, 7, 0, 0, 2, 8, 1, 0]}
df = pd.DataFrame(d, columns=['Equipment', 'QTY'])
df[df.QTY != 0]
Essentially, I am looking to develop a script that removes line items with a quantity of 0.
You almost had it:
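import pandas as pd
df = pd.read_excel("file.xlsx")  # load the first sheet into a DataFrame
df = df[df.QTY != 0]  # keep only the rows whose QTY is not 0
df.to_excel("file.xlsx", index=False)  # overwrites file.xlsx; index=False omits the index column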
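If the quantities might be stored as text (for example the string "0" rather than the number 0), the comparison `df.QTY != 0` would not match them. Here is a minimal sketch that coerces the column to numbers first, assuming the column is named `QTY` as above. The function names `drop_zero_qty` and `clean_workbook` are hypothetical helpers, not part of pandas:

```python
import pandas as pd


def drop_zero_qty(df: pd.DataFrame, column: str = "QTY") -> pd.DataFrame:
    """Return a copy of df without the rows whose quantity is zero."""
    # Coerce text such as "0" to numbers; cells that cannot be parsed become NaN.
    qty = pd.to_numeric(df[column], errors="coerce")
    # NaN != 0 is True, so rows with blank or non-numeric quantities are kept.
    return df[qty != 0].copy()


def clean_workbook(src: str, dst: str, column: str = "QTY") -> None:
    """Read src, drop zero-quantity rows, and write the result to dst."""
    df = pd.read_excel(src)
    drop_zero_qty(df, column).to_excel(dst, index=False)
```

Calling `clean_workbook("file.xlsx", "file_cleaned.xlsx")` (both file names are placeholders) would write the filtered rows to a new file and leave the original untouched. Note that pandas writes a fresh workbook, so cell formatting and formulas from the source sheet are not carried over.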
There are a few ways to do it:
import pandas as pd
df = {
    'Equipment': ['Pots', 'Pans', 'Spoons', 'Forks', 'Knives', 'Lids',
                  'Spatulas', 'Tongs', 'Grill', 'Skewers'],
    'QTY': [3, 0, 1, 7, 0, 0, 2, 8, 1, 0]}
df = pd.DataFrame(df, columns=['Equipment', 'QTY'])
# CPU times: user 2 µs, sys: 1 µs, total: 3 µs
# Wall time: 5.48 µs
df = df[df.QTY != 0]
# CPU times: user 2 µs, sys: 1 µs, total: 3 µs
# Wall time: 5.25 µs
df = df.loc[df['QTY'] != 0]
The difference becomes more noticeable as the number of rows increases (here, 1,000,000 rows):
times = 100000
df = {
    'Equipment': ['Pots', 'Pans', 'Spoons', 'Forks', 'Knives', 'Lids',
                  'Spatulas', 'Tongs', 'Grill', 'Skewers'] * times,
    'QTY': [3, 0, 1, 7, 0, 0, 2, 8, 1, 0] * times}
df = pd.DataFrame(df, columns=['Equipment', 'QTY'])
# CPU times: user 4 µs, sys: 1 µs, total: 5 µs
# Wall time: 7.63 µs
df = df[df.QTY != 0]
# CPU times: user 1e+03 ns, sys: 0 ns, total: 1e+03 ns
# Wall time: 4.77 µs
df2 = df.loc[df['QTY'] != 0]
To export the result to Excel, you can do:
df.to_excel("output.xlsx", index=False)
I ran these tests on Google Colab.
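Single-run timings in the microsecond range, like the ones above, are hard to compare, so it may be worth repeating the measurement. Here is a rough sketch using the standard-library `timeit` module on the same synthetic data. The `candidates` dictionary and the repeat counts are arbitrary choices for illustration, and results will vary with the machine and pandas version:

```python
import timeit

import pandas as pd

times = 100_000
df = pd.DataFrame({
    "Equipment": ["Pots", "Pans", "Spoons", "Forks", "Knives", "Lids",
                  "Spatulas", "Tongs", "Grill", "Skewers"] * times,
    "QTY": [3, 0, 1, 7, 0, 0, 2, 8, 1, 0] * times,
})

# Both candidates filter the same, unmodified DataFrame.
candidates = {
    "df[df.QTY != 0]": lambda: df[df.QTY != 0],
    "df.loc[df['QTY'] != 0]": lambda: df.loc[df["QTY"] != 0],
}

for label, func in candidates.items():
    # Best of 3 repeats, 5 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    print(f"{label:<26} {best * 1e3:.2f} ms per call")

# Sanity check: the two expressions select exactly the same rows.
assert df[df.QTY != 0].equals(df.loc[df["QTY"] != 0])
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the two expressions are as close as they are here.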