
Is there a way to remove a row in Excel with Python if a certain cell contains a "0"?

I am looking for a quick way to edit an Excel equipment list with Python. The list has many line items with a "0" in the quantity column.

I would like the rows that have a QTY of "0" to be deleted.

Example:

  from this:

Item No. | Equipment | QTY | Price

1    |   Pots    |  3  | 10.99 
2    |   Pans    |  0  | 16.99 
3    |   Spoons  |  1  | 11.99 
4    |   Forks   |  7  |  0.99 
5    |   Knives  |  0  | 20.99 
6    |   Lids    |  0  | 12.99 
7    |   Spatulas|  2  |  5.99 
8    |   Tongs   |  8  |  6.99 
9    |   Grill   |  1  | 12.99




  to this:

Item No. | Equipment | QTY | Price

1    |   Pots    |  3  | 10.99 
3    |   Spoons  |  1  | 11.99 
4    |   Forks   |  7  |  0.99 
7    |   Spatulas|  2  |  5.99 
8    |   Tongs   |  8  |  6.99 
9    |   Grill   |  1  | 12.99 

(No need to renumber the "Item No." Column)

I am still learning Python. I know how to create a DataFrame with pandas and remove rows that meet certain conditions, but I am not sure how to import an existing Excel file and remove rows based on the value of a certain cell.

# Here is what I have done so far

import pandas as pd

d = {
    'Equipment':['Pots','Pans','Spoons','Forks','Knives','Lids',
            'Spatulas','Tongs','Grill','Skewers'],
    'QTY':[3,0,1,7,0,0,2,8,1,0]}

df = pd.DataFrame(d,columns=['Equipment','QTY'])

df[df.QTY != 0]

Essentially, I am looking to develop a script where I can remove line items that have a qty of 0.

You almost had it:

import pandas as pd

df = pd.read_excel("file.xlsx")

df = df[df.QTY != 0]

df.to_excel("file.xlsx", index=False)
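If the quantity cells are stored as text (the question quotes the value as "0"), a plain `!= 0` comparison would not match them. The following is a minimal sketch under that assumption; the helper names `drop_zero_qty` and `clean_workbook` are hypothetical, and writing to a separate output file instead of overwriting the source is a design choice, not something the answer above requires.

```python
import pandas as pd


def drop_zero_qty(df: pd.DataFrame, column: str = "QTY") -> pd.DataFrame:
    """Return a copy of df without the rows whose `column` equals zero."""
    # Coerce to numbers so that a text cell such as "0" also matches;
    # cells that cannot be parsed become NaN and are kept.
    qty = pd.to_numeric(df[column], errors="coerce")
    return df[qty != 0].copy()


def clean_workbook(src: str, dst: str, column: str = "QTY") -> None:
    """Read src, drop the zero-quantity rows, and write the result to dst."""
    df = pd.read_excel(src)  # reads the first sheet by default
    drop_zero_qty(df, column).to_excel(dst, index=False)


# Small in-memory demonstration (no Excel file needed).
sample = pd.DataFrame({
    "Equipment": ["Pots", "Pans", "Spoons", "Knives"],
    "QTY": [3, "0", 1, 0],  # one zero stored as text, one as a number
})
cleaned = drop_zero_qty(sample)
print(cleaned["Equipment"].tolist())  # ['Pots', 'Spoons']
```

With a real workbook, the call would look like `clean_workbook("file.xlsx", "file_cleaned.xlsx")`, which leaves the original file untouched.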

There are a few ways to do it:

import pandas as pd
df = {
    'Equipment':['Pots','Pans','Spoons','Forks','Knives','Lids',
            'Spatulas','Tongs','Grill','Skewers'],
    'QTY':[3,0,1,7,0,0,2,8,1,0]}

df = pd.DataFrame(df, columns=['Equipment','QTY'])

Method 1

# CPU times: user 2 µs, sys: 1 µs, total: 3 µs
# Wall time: 5.48 µs
df = df[df.QTY != 0]

Method 2

# CPU times: user 2 µs, sys: 1 µs, total: 3 µs
# Wall time: 5.25 µs
df = df.loc[df['QTY'] != 0]

The same comparison with a much larger number of rows:

times = 100000
df = {
    'Equipment':['Pots','Pans','Spoons','Forks','Knives','Lids',
            'Spatulas','Tongs','Grill','Skewers']*times,
    'QTY':[3,0,1,7,0,0,2,8,1,0]*times}

df = pd.DataFrame(df, columns=['Equipment','QTY'])

Method 1

# CPU times: user 4 µs, sys: 1 µs, total: 5 µs
# Wall time: 7.63 µs
df = df[df.QTY != 0]

Method 2

# CPU times: user 1e+03 ns, sys: 0 ns, total: 1e+03 ns
# Wall time: 4.77 µs
df2 = df.loc[df['QTY'] != 0]

To export the result to Excel you can do:

df.to_excel("output.xlsx", index=False)

I ran these tests on Google Colab.
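Timings of a few microseconds for a million-row filter may mean that the notebook magic measured something other than the filter itself; this is an assumption, since the original cells are not shown. As a rough cross-check, here is a sketch that times both methods with the standard `timeit` module on the same, unmodified DataFrame. The function names `method_1` and `method_2` are only for illustration, and the actual numbers will vary by machine.

```python
import timeit

import pandas as pd

times = 100_000
df = pd.DataFrame({
    "Equipment": ["Pots", "Pans", "Spoons", "Forks", "Knives", "Lids",
                  "Spatulas", "Tongs", "Grill", "Skewers"] * times,
    "QTY": [3, 0, 1, 7, 0, 0, 2, 8, 1, 0] * times,
})


def method_1(frame):
    # Boolean mask with attribute access
    return frame[frame.QTY != 0]


def method_2(frame):
    # Boolean mask through .loc
    return frame.loc[frame["QTY"] != 0]


# Each method filters the original df, so neither sees pre-filtered data.
runs = 5
t1 = min(timeit.repeat(lambda: method_1(df), number=runs, repeat=3)) / runs
t2 = min(timeit.repeat(lambda: method_2(df), number=runs, repeat=3)) / runs
print(f"Method 1: {t1 * 1e3:.2f} ms per call")
print(f"Method 2: {t2 * 1e3:.2f} ms per call")

# Both methods should return exactly the same rows.
same = method_1(df).equals(method_2(df))
print("Same result:", same)
```

Both expressions build the same boolean mask, so any difference between them is likely to be small compared with the cost of reading and writing the Excel file.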
