Pandas dataframe only reading first value, NaN for everything else

I am trying to read a CSV with pandas and then insert the rows into a SQL table. The data looks correct when I print(data), but once I pass it into a new DataFrame, only the first column keeps its values and every other column is filled with NaN. Code and output below:

data = pd.read_csv(localFilePath)
print(data)
df = pd.DataFrame(data, columns= ['Date','EECode','LastName','FirstName', \
           'HomeDepartmentCode','HomeDepartmentDesc','PayClass','InPunchTime', \
           'OutPunchTime','DepartmentCode','DepartmentDesc','JobCodesCode', \
           'JobCodesDesc','TeamCode','TeamDesc','EarnCode'])
print(df)

for row in df.itertuples():
    SQLInsert = ('''
                INSERT INTO [Reporting].[dbo].[Paycom_Missing_Punch] 
                (Date, EECode, LastName, FirstName, HomeDepartmentCode, 
                HomeDepartmentDesc, PayClass, InPunchTime, OutPunchTime, 
                DepartmentCode, DepartmentDesc, JobCodesCode, JobCodesDesc, 
                TeamCode, TeamDesc, EarnCode)
                VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                '''
                )
    args = row.Date, row.EECode, row.LastName, row.FirstName, \
                row.HomeDepartmentCode, row.HomeDepartmentDesc, row.PayClass, row.InPunchTime, \
                row.OutPunchTime, row.DepartmentCode, row.DepartmentDesc, row.JobCodesCode, \
                row.JobCodesDesc, row.TeamCode, row.TeamDesc, row.EarnCode
                          
    #print(SQLInsert) 
    #print(args)
    cursor.execute(SQLInsert, args)     
conn.commit()

Output when I print(data):

         Date  EE Code  ...               Team Desc Earn Code
0  01/21/2021     1435  ...             Indiana DWD       NaN
1  01/21/2021     1435  ...             Indiana DWD       NaN
2  01/22/2021     1180  ...             Supervisors       NaN
3  01/21/2021     1664  ...  Technical Support Desk       NaN
4  01/21/2021     1078  ...             Supervisors       NaN

Output once I add it to the DataFrame:

         Date  EECode  LastName  ...  TeamCode  TeamDesc  EarnCode
0  01/21/2021     NaN       NaN  ...       NaN       NaN       NaN
1  01/21/2021     NaN       NaN  ...       NaN       NaN       NaN
2  01/22/2021     NaN       NaN  ...       NaN       NaN       NaN
3  01/21/2021     NaN       NaN  ...       NaN       NaN       NaN
4  01/21/2021     NaN       NaN  ...       NaN       NaN       NaN

I assume the problem is how I am passing the values to the dataframe, but from everything I have read or seen, the way I am doing it looks correct.

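The problem is how you build df. pd.read_csv already returns a DataFrame, so data is one. When you pass that DataFrame to pd.DataFrame(data, columns=[...]), pandas does not rename anything: it selects columns by label. Your CSV headers contain spaces ('EE Code', 'Team Desc', 'Earn Code'), so 'Date' is the only name in your list that matches an existing column, and every other requested column is created empty and filled with NaN. To fix it, read the CSV once and assign the new column names directly. The list must contain exactly one name per column in the file, in file order: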

col_names = ['Date','EECode','LastName','FirstName',
           'HomeDepartmentCode','HomeDepartmentDesc','PayClass','InPunchTime',
           'OutPunchTime','DepartmentCode','DepartmentDesc','JobCodesCode',
           'JobCodesDesc','TeamCode','TeamDesc','EarnCode']

df = pd.read_csv(localFilePath)
df.columns = col_names
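
To see the behaviour in isolation, here is a minimal sketch that reproduces the NaN columns and then applies the fix. The three-column CSV text and the 'Last Name' header are made up for illustration (the real file has more columns, and only some of its headers are visible in the question). The second fix, passing names= together with header=0 to read_csv, is an alternative not mentioned above that should give the same result.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV. The headers contain
# spaces, like 'EE Code' in the print(data) output from the question.
csv_text = (
    "Date,EE Code,Last Name\n"
    "01/21/2021,1435,Smith\n"
    "01/22/2021,1180,Jones\n"
)
col_names = ["Date", "EECode", "LastName"]

data = pd.read_csv(io.StringIO(csv_text))

# Reproduces the problem: columns= selects by label on an existing DataFrame.
# Only "Date" exists in data, so the other two columns come back as NaN.
broken = pd.DataFrame(data, columns=col_names)

# Fix 1: rename by position after reading.
# col_names must have exactly one entry per column, in file order.
renamed = data.copy()
renamed.columns = col_names

# Fix 2 (alternative): give the names while reading. header=0 tells pandas
# that row 0 is the file's own header and should be replaced by col_names.
direct = pd.read_csv(io.StringIO(csv_text), names=col_names, header=0)

print(broken)
print(renamed)
```

If the new list has a different number of entries than the file has columns, the assignment to df.columns raises a ValueError, which is a useful check that the names line up with the data.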
