
Importing CSV to SQL with Pandas, how do I ignore empty rows?

Still learning Python, so bear with me. I use the following script to import a CSV file into a local SQL database. My problem is that the CSV file usually has a bunch of empty rows at the end, and I get primary key errors on import. What's the best way to handle this? If I manually edit the CSV in a text editor and delete all the rows that are just ,,,,,,,,,,,,,,,,,,,,,,,,,,, it works perfectly.

Bonus question: is there an easy way to iterate through all the .csv files in a directory, and then delete or move them after they've been processed?

import pandas as pd

data = pd.read_csv(r'C:\Bookings.csv')
df = pd.DataFrame(data, columns= ['BookingKey','BusinessUnit','BusinessUnitKey','DateTime','Number','Reference','ExternalId','AmountTax','AmountTotal','AmountPaid','AmountOpen','AmountTotalExcludingTax','BookingFee','MerchantFee','ProcessorFee','NumberOfPersons','Status','StatusDateTime','StartTime','EndTime','PlannedCheckinTime','ActualCheckinTime','Attendance','AttendanceDatetime','OnlineBookingCheckedDatetime','Origin','CustomerKey'])
df = df.fillna(value=0)
print(df)

import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      r'Server=D3VBUP\SQLEXPRESS;'
                      'Database=BRIQBI;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()

for row in df.itertuples():
    cursor.execute('''
                INSERT INTO BRIQBI.dbo.Bookings (BookingKey,BusinessUnit,BusinessUnitKey,DateTime,Number,Reference,ExternalId,AmountTax,AmountTotal,AmountPaid,AmountOpen,AmountTotalExcludingTax,BookingFee,MerchantFee,ProcessorFee,NumberOfPersons,Status,StatusDateTime,StartTime,EndTime,PlannedCheckinTime,ActualCheckinTime,Attendance,AttendanceDatetime,OnlineBookingCheckedDatetime,Origin,CustomerKey)
                VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                ''',
                row.BookingKey,
                row.BusinessUnit,
                row.BusinessUnitKey,
                row.DateTime,
                row.Number,
                row.Reference,
                row.ExternalId,
                row.AmountTax,
                row.AmountTotal,
                row.AmountPaid,
                row.AmountOpen,
                row.AmountTotalExcludingTax,
                row.BookingFee,
                row.MerchantFee,
                row.ProcessorFee,
                row.NumberOfPersons,
                row.Status,
                row.StatusDateTime,
                row.StartTime,
                row.EndTime,
                row.PlannedCheckinTime,
                row.ActualCheckinTime,
                row.Attendance,
                row.AttendanceDatetime,
                row.OnlineBookingCheckedDatetime,
                row.Origin,
                row.CustomerKey
                )
conn.commit()

Ended up being really easy. I added the dropna function so any rows that have no data in them get dropped.

df = df.dropna(how='all')

Now off to find out how to iterate through multiple files in a directory and move them to another location.
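For that part, a minimal sketch along these lines should work, assuming the exports sit in a folder such as C:\Imports with a processed subfolder (both names are just placeholders), and with insert_bookings standing in for the pyodbc insert loop above:

from pathlib import Path
import shutil

import pandas as pd


def insert_bookings(df):
    """Placeholder for the pyodbc insert loop shown in the question."""
    ...


# Hypothetical folder names - adjust to wherever the exports actually land.
source_dir = Path(r'C:\Imports')
processed_dir = source_dir / 'processed'
processed_dir.mkdir(exist_ok=True)

for csv_path in source_dir.glob('*.csv'):
    df = pd.read_csv(csv_path)
    df = df.dropna(how='all')   # drop the trailing ",,,,," rows
    df = df.fillna(value=0)

    insert_bookings(df)

    # Move the file out of the way so it is not picked up again on the next run.
    shutil.move(str(csv_path), str(processed_dir / csv_path.name))

Swapping shutil.move for Path.unlink would delete the files instead of archiving them.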
