简体   繁体   中英

When inserting CSV data into AWS RDS MySQL DB getting errorMessage 1054 "Unknown column in 'field list'. Please let me know how to fix this error

This is the AWS Lambda function lambda_function.py code in which I am working on to read the CSV data from AWS S3 bucket and to insert it into AWS RDS MySQL database.

import json
import boto3
import sys
import logging
import rds_config
import pymysql
import csv

s3_client = boto3.client("s3")

#rds settings
rds_host  = "myrdsdatabase-xxxxx-amazonaws.com"
name = rds_config.db_username
password = rds_config.db_password
db_name = rds_config.db_name

logger = logging.getLogger()
logger.setLevel(logging.INFO)

try:
    conn = pymysql.connect(rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)
except pymysql.MySQLError as e:
    logger.error("ERROR: Unexpected error: Could not connect to MySQL instance.")
    logger.error(e)
    sys.exit()

logger.info("SUCCESS: Connection to RDS MySQL instance succeeded")

def lambda_handler(event, context):
    
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    s3_file_name = event['Records'][0]['s3']['object']['key']
    csv_file = s3_client.get_object(Bucket=bucket_name,Key=s3_file_name)
    record_list = csv_file['Body'].read().decode("utf-8").split('\n')
    print('RAW CSV Data:')
    print(record_list)
    csv_reader = csv.reader(record_list, delimiter=',', quotechar='"')
    colheader, *rows = csv_reader
    print('Data to be inserted into RDS MySQL:')
    for r in rows:
        empID = r[0]
        empName = r[1]
        print(empID+' '+empName)    
    
    """
    This function fetches content from MySQL RDS instance
    """
    result = []
    
    with conn.cursor() as cur:
        #cur.execute("create table Employee ( EmpID int NOT NULL, EmpName varchar(255) NOT NULL, PRIMARY KEY (EmpID))")
        #cur.execute("Drop table Employee")
        #cur.execute("Delete from Employee")
        #cur.execute('insert into Employee (EmpID, EmpName) values(0,"Test")')
        #cur.execute("select * from Employee")
        #conn.commit()
        #cur.close()
        
        for r in rows:
            empID = r[0]
            empName = r[1]
            cur.execute('insert into Employee (EmpID, EmpName) values('+empID+','+empName+')')
            cur.execute("select * from Employee")
            conn.commit()
            cur.close()
            
        for row in cur:
            result.append(list(row))
        print ("Data from AWS RDS MySQL...")
        print (result)

Output:-

Response:
{
  "errorMessage": "(1054, \"Unknown column 'Tom' in 'field list'\")",
  "errorType": "InternalError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 62, in lambda_handler\n    cur.execute('insert into Employee (EmpID, EmpName) values('+empID+','+empName+')')\n",
    "  File \"/var/task/pymysql/cursors.py\", line 165, in execute\n    result = self._query(query)\n",
    "  File \"/var/task/pymysql/cursors.py\", line 321, in _query\n    conn.query(q)\n",
    "  File \"/var/task/pymysql/connections.py\", line 860, in query\n    self._affected_rows = self._read_query_result(unbuffered=unbuffered)\n",
    "  File \"/var/task/pymysql/connections.py\", line 1061, in _read_query_result\n    result.read()\n",
    "  File \"/var/task/pymysql/connections.py\", line 1349, in read\n    first_packet = self.connection._read_packet()\n",
    "  File \"/var/task/pymysql/connections.py\", line 1018, in _read_packet\n    packet.check_error()\n",
    "  File \"/var/task/pymysql/connections.py\", line 384, in check_error\n    err.raise_mysql_exception(self._data)\n",
    "  File \"/var/task/pymysql/err.py\", line 107, in raise_mysql_exception\n    raise errorclass(errno, errval)\n"
  ]
}

Request ID:
"6b8f70cd-dc9a-458c-9024-c4ecdb9a913e"

Function logs:
START RequestId: 6b8f70cd-dc9a-458c-9024-c4ecdb9a913e Version: $LATEST
RAW CSV Data:
['EmpID,EmpName\r', '1,Tom\r', '2,Jack\r', '3,Ross\r', '4,William']
Data to be inserted into RDS MySQL:
1 Tom
2 Jack
3 Ross
4 William
[ERROR] InternalError: (1054, "Unknown column 'Tom' in 'field list'")
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 62, in lambda_handler
    cur.execute('insert into Employee (EmpID, EmpName) values('+empID+','+empName+')')
  File "/var/task/pymysql/cursors.py", line 165, in execute
    result = self._query(query)
  File "/var/task/pymysql/cursors.py", line 321, in _query
    conn.query(q)
  File "/var/task/pymysql/connections.py", line 860, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/var/task/pymysql/connections.py", line 1061, in _read_query_result
    result.read()
  File "/var/task/pymysql/connections.py", line 1349, in read
    first_packet = self.connection._read_packet()
  File "/var/task/pymysql/connections.py", line 1018, in _read_packet
    packet.check_error()
  File "/var/task/pymysql/connections.py", line 384, in check_error
    err.raise_mysql_exception(self._data)
  File "/var/task/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
END RequestId: 6b8f70cd-dc9a-458c-9024-c4ecdb9a913e
REPORT RequestId: 6b8f70cd-dc9a-458c-9024-c4ecdb9a913e  Duration: 223.12 ms Billed Duration: 300 ms Memory Size: 128 MB Max Memory Used: 80 MB  Init Duration: 488.20 ms    

Note:- I am able to insert hard code values into the RDS MySQL database but when trying to insert CSV file values then only getting this error.

Please let me know in case any further information required which can be helpful to debug this error and also to find the root cause.

Thanks in Advance!!!

There was a syntax problem in cur.execute call. Now after changing the syntax I am successfully able to insert the CSV data into RDS MySQL database.

Correction:

cur.execute("insert into Employee (EmpID, EmpName) values(%s,%s)", (empID,empName))

I believe the issue stems from this line:

cur.execute('insert into Employee (EmpID, EmpName) values('+empID+','+empName+')')

The employee name is a string but it is not being quoted in the resulting SQL statement. The way it currently is will lead to SQL statements like:

INSERT INTO Employee (EmpID, EmpName) VALUES (1, Tom);

Where what you want is:

INSERT INTO Employee (EmpID, EmpName) VALUES (1, 'Tom'); (note the quotes around Tom)

You can either add these quotes to the cur.execute call:

cur.execute('insert into Employee (EmpID, EmpName) values (' + empID + ', "' + empName + '")')

Or use the syntax described here: How to use variables in SQL statement in Python?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM