Error when storing data to sqlite3 using scrapy

I am getting the following error when storing data in sqlite3:

  File "/Users/qasimbutt/PycharmProjects/IntegratedProject/spider_backend/spider_backend/pipelines.py", line 50, in process_item
    self.cursor.execute("insert into raw_tbl values(?,?,?) ",
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

I was able to store the data in MongoDB without any trouble, but since I switched to sqlite3 I get this error. I have tried a few changes, but nothing has worked.

I am trying to fetch data such as the following, which I doubt can be persisted in sqlite3 as it is:

 ERROR: Error processing {'desc': (('View the profiles of professionals named "Royce" on ',),),
 'title': (('10,800+ "Royce" profiles | LinkedIn',),),
 'url': ('nz.linkedin.com › pub › dir › Royce › '
         'nz-9194-Auckland,-New-Zealand',)}
Traceback (most recent call last):

Following is my pipeline.

import sqlite3


class SpiderBackendPipeline(object):
   # def process_item(self, item, spider):
   #     return item

   #def process_item(self, item, spider):
   #    self.collection.insert(dict(item))
       # print ("Pipeline",+ item['title'][0])
   #    return item

#Adding new sqlite3 connection
    def __init__(self):
        self.db_connection()
        self.create_tbl()
        pass

    def db_connection(self):
        self.connection = sqlite3.connect("rawdata.db")
        self.cursor = self.connection.cursor()
        print("DB STATUS:DB Connection established")

    def create_tbl(self):
        self.cursor.execute("""
                            drop table if exists raw_tbl
                            """)
        self.cursor.execute("""create table raw_tbl (
                            title TEXT,
                            desc TEXT,
                            url TEXT
                            )""")
        print("DB STATUS:Table created")

    def process_item(self, item, spider):
      #  self.storedb(item)
      #  return item
        #print("Pipeline" +item['title'][0])
        #print("DB STATUS: Successfully processed data")
        self.cursor.execute("insert into raw_tbl values(?,?,?) ",
                            (item['title'][0],
                             item['desc'][0],
                             item['url'][0],
                             )
                            )
        print("DB STATUS:Data stored in db")
        self.connection.commit()
        return item

#  def storedb(self):
 #       self.cursor.execute("""insert into raw_tbl values(?,?,?) """,
 #                           (item['title'][0],
 #                            item['desc'][0],
 #                            item['url'][0],
 #                           )
 #                         )
 #       print("DB STATUS:Data stored in db")
 #       self.connection.commit()
        #self.connection.close()
 #       print("DB STATUS:Closing db connection")

    def closedb(self):
        self.connection.close()

I would appreciate it if you could help me understand what I am missing here. Thanks a lot.

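It is good practice in SQLite to name the columns you are inserting into. SQLite accepts the statement without them when a value is supplied for every column, but naming them guards against column-order mistakes. For example: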

insert into mytable(col1,col2) values(val1,val2); 

Hope that helps. The rest of the code seems fine.
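
As a minimal sketch of the named-column form with Python's sqlite3 module, the snippet below uses an in-memory database and made-up values (both are assumptions for illustration, not taken from the question). The column name desc is quoted because it is also an SQL keyword.

```python
import sqlite3

# In-memory database so the example leaves no file behind.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
# "desc" is also an SQL keyword, so it is quoted here to be safe.
cursor.execute('create table raw_tbl (title TEXT, "desc" TEXT, url TEXT)')

# Column names are listed explicitly; values still go through ? placeholders.
cursor.execute(
    'insert into raw_tbl (title, "desc", url) values (?, ?, ?)',
    ("a title", "a description", "example.com/page"),  # hypothetical values
)
connection.commit()

rows = cursor.execute('select title, "desc", url from raw_tbl').fetchall()
connection.close()
```

Keeping the ? placeholders means the values are still bound by the driver rather than pasted into the SQL text.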

The error message you get is quite meaningful:

sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

This means that you are trying to bind a parameter whose type sqlite3 cannot store. Out of the box it accepts only None, int, float, str, and bytes, so a tuple or list, for example, is rejected. As the others already pointed out, explicitly naming the target columns in INSERT statements helps. Next, debug which crawled item it fails on; you will probably see that it tries to save something that is not a str. In the item shown in the question, title is nested two levels deep, so item['title'][0] is still a tuple rather than a string. Alternatively, you can cast every parameter to str before saving it, but I would rather suggest finding out what exactly you are trying to save.
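
To make the debugging step concrete, here is a small sketch that reproduces the binding failure with the item from the question and then stores it after unwrapping the nested tuples. The helper first_text is a hypothetical name introduced for this example, and the in-memory database is an assumption. The exact exception class and message can differ between Python versions, so the sketch catches the common base class sqlite3.Error.

```python
import sqlite3


def first_text(value):
    """Unwrap nested tuples/lists until a non-sequence value is reached."""
    while isinstance(value, (tuple, list)):
        if not value:
            return None  # an empty field is stored as NULL
        value = value[0]
    return value


# The item as logged in the question.
item = {
    "desc": (('View the profiles of professionals named "Royce" on ',),),
    "title": (('10,800+ "Royce" profiles | LinkedIn',),),
    "url": ("nz.linkedin.com › pub › dir › Royce › "
            "nz-9194-Auckland,-New-Zealand",),
}

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute('create table raw_tbl (title TEXT, "desc" TEXT, url TEXT)')

# item["title"][0] is still a tuple, which sqlite3 cannot bind.
try:
    cursor.execute(
        "insert into raw_tbl values (?, ?, ?)",
        (item["title"][0], item["desc"][0], item["url"][0]),
    )
    bind_failed = False
except sqlite3.Error:
    bind_failed = True

# After unwrapping, every parameter is a plain str and the insert succeeds.
cursor.execute(
    'insert into raw_tbl (title, "desc", url) values (?, ?, ?)',
    (first_text(item["title"]), first_text(item["desc"]), first_text(item["url"])),
)
connection.commit()
stored = cursor.execute('select title, "desc", url from raw_tbl').fetchone()
connection.close()
```

Printing type(item['title'][0]) inside process_item is a quick way to confirm the same thing in the running spider. The longer-term fix is probably in the spider itself, so that each field is yielded as a plain string.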

Try:

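from html import escape

def process_item(self, item, spider):
    # title and desc arrive as (('text',),) and url as ('text',); escape() needs a str
    values = (escape(item['title'][0][0]), escape(item['desc'][0][0]), escape(item['url'][0]))
    self.cursor.execute('insert into raw_tbl (title, "desc", url) values (?, ?, ?)', values)
    self.connection.commit()
    return item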
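
Putting the pieces together, a pipeline along the following lines might work. This is only a sketch under assumptions: the class name SqlitePipeline and the db_path parameter are hypothetical, and the indexing assumes items shaped exactly like the one in the question. It uses Scrapy's open_spider and close_spider pipeline hooks so that the connection is actually closed when the crawl ends. The closedb method in the question is never called by Scrapy.

```python
import sqlite3


class SqlitePipeline:
    """Hypothetical variant of SpiderBackendPipeline; names are illustrative."""

    def __init__(self, db_path="rawdata.db"):
        self.db_path = db_path

    def open_spider(self, spider):
        # Called by Scrapy once when the spider starts.
        self.connection = sqlite3.connect(self.db_path)
        self.cursor = self.connection.cursor()
        self.cursor.execute("drop table if exists raw_tbl")
        self.cursor.execute('create table raw_tbl (title TEXT, "desc" TEXT, url TEXT)')

    def close_spider(self, spider):
        # Called by Scrapy once when the spider finishes.
        self.connection.close()

    def process_item(self, item, spider):
        # Assumes title/desc are nested two levels deep and url one level.
        row = (
            str(item["title"][0][0]),
            str(item["desc"][0][0]),
            str(item["url"][0]),
        )
        self.cursor.execute(
            'insert into raw_tbl (title, "desc", url) values (?, ?, ?)', row
        )
        self.connection.commit()
        return item


# Stand-alone check without Scrapy: the spider argument is unused, so None is passed.
pipeline = SqlitePipeline(":memory:")
pipeline.open_spider(None)
pipeline.process_item(
    {"title": (("a title",),), "desc": (("a description",),), "url": ("example.com",)},
    None,
)
saved = pipeline.cursor.execute('select title, "desc", url from raw_tbl').fetchall()
pipeline.close_spider(None)
```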
