
Import large XML file into MySQL using Python

I need to import a large XML file into MySQL using Python. My XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<items>
  <item>
    <productid>2321</productid>
    <price>5600</price>
    <name>Product name 1...</name>
    <description>Some desc. for product 1</description>
    <pictures>
      <picture>http://www.server.com/images/1.jpg</picture>
      <picture>http://www.server.com/images/2.jpg</picture>
      <picture>http://www.server.com/images/3.jpg</picture>
    </pictures>
  </item>
  <item>
    <productid>2322</productid>
    <price>100</price>
    <name>Product name 2...</name>
    <description>Some desc. for product 2</description>
    <pictures>
      <picture>http://www.server.com/images/1_1.jpg</picture>
      <picture>http://www.server.com/images/2_1.jpg</picture>
      <picture>http://www.server.com/images/3_1.jpg</picture>
    </pictures>
  </item>
</items>

I used this code:

import xml.etree.ElementTree as ET

import mysql.connector

conn = mysql.connector.connect(host='localhost', user='user', password='123456', database='my_shop')

if conn:
    print ("Connected Successfully")
else:
    print ("Connection Not Established")

tree = ET.parse('shop.xml')
root = tree.getroot()

for product in root.findall("item"):
    proid = product.find('productid').text
    price = product.find('price').text
    name = product.find('name').text
    desc = product.find('description').text

    query = "INSERT INTO roming(`productid`, `price`, `name`, `description`) VALUES (%s, %s, %s, %s)"
    cursor = conn.cursor()
    cursor.execute(query,(proid, price, name, desc))
    conn.commit()
    print("Data inserted successfully.")

conn.close()

This code works, but I don't know how to import the pictures, because each item has three pictures. I need to import either only the first picture for each item, or all pictures separated by a semicolon (;).

Also, I'm not sure whether this code will work for a large XML file.

First, you can find pictures similarly to the way you find other details:

url = product.find('pictures').findall('picture')[0].text

Then you need to decide whether you want to store just a link to the picture or the picture itself as a binary object. If you choose the first option, simply add a picture column of type VARCHAR to your table and extend the query accordingly:

query = "INSERT INTO roming(`productid`, `price`, `name`, `description`, `picture`) VALUES (%s, %s, %s, %s, %s)"
cursor = conn.cursor()
cursor.execute(query, (proid, price, name, desc, url))
conn.commit()
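
The question also asks about storing all pictures of an item separated by a semicolon. A minimal sketch of that variant follows, assuming the same XML layout as in the question. The helper names `picture_urls` and `joined_pictures` and the shortened `SAMPLE` document are made up for illustration; the resulting string could be passed in place of `url` in the query above, provided the column is wide enough (a TEXT column may be safer than a short VARCHAR).

```python
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the question's XML (illustrative only).
SAMPLE = """<items>
  <item>
    <productid>2321</productid>
    <pictures>
      <picture>http://www.server.com/images/1.jpg</picture>
      <picture>http://www.server.com/images/2.jpg</picture>
    </pictures>
  </item>
  <item>
    <productid>2322</productid>
  </item>
</items>"""


def picture_urls(product):
    """Return every non-empty <picture> URL under <pictures>, in document order."""
    return [
        p.text.strip()
        for p in product.findall("pictures/picture")
        if p.text and p.text.strip()
    ]


def joined_pictures(product, sep=";"):
    """Join all picture URLs with `sep`; return None when the item has no pictures."""
    urls = picture_urls(product)
    return sep.join(urls) if urls else None


root = ET.fromstring(SAMPLE)
for product in root.findall("item"):
    # The second value is what would go into the picture column.
    print(product.findtext("productid"), joined_pictures(product))
```

Returning None for an item without pictures lets the driver insert NULL rather than an empty string; whether that is the desired behaviour depends on the table definition.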

If, on the other hand, you need to store the actual picture, create a picture column of type BLOB in the database (choose the right size variant, such as BLOB, MEDIUMBLOB or LONGBLOB, depending on the quality of the images). Then you need to read the image as a Binary Large OBject, that is, as raw bytes:

from urllib.request import urlopen
img = urlopen(url).read()
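
For a long import, a single failed or oversized download should probably not abort the whole run. The sketch below wraps the download in a hypothetical helper, `fetch_image`, that applies a timeout and a size cap. The cap defaults to the MEDIUMBLOB limit (2^24 - 1 bytes), which is only an assumption about the column type, and the 10-second timeout is an arbitrary choice.

```python
from urllib.request import urlopen

# Assumed column type: MEDIUMBLOB holds at most 2**24 - 1 bytes
# (a plain BLOB holds at most 65,535 bytes).
MEDIUMBLOB_MAX = 2**24 - 1


def fetch_image(url, timeout=10, max_bytes=MEDIUMBLOB_MAX):
    """Return the downloaded bytes, or None if the download fails or is too large."""
    try:
        with urlopen(url, timeout=timeout) as response:
            # Read one byte more than allowed so an oversized body can be detected.
            data = response.read(max_bytes + 1)
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts; ValueError covers malformed URLs.
        return None
    return data if len(data) <= max_bytes else None
```

With this helper, `img = fetch_image(url)` yields either the bytes to insert or None, which is stored as NULL if the column allows it.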

And use similar query:

query = "INSERT INTO roming(`productid`, `price`, `name`, `description`, `picture`) VALUES (%s, %s, %s, %s, %s)"
cursor = conn.cursor()
cursor.execute(query, (proid, price, name, desc, img))
conn.commit()
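
On the large-file concern: `ET.parse` builds the whole tree in memory, and the loop in the question commits after every row. One possible alternative, sketched here under the assumption that the table has the `picture` column from the queries above, is to stream the file with `ET.iterparse` and insert rows in batches with `cursor.executemany`. The function names `iter_rows` and `import_items` and the batch size of 1000 are illustrative choices, not part of the original answer.

```python
import io
import xml.etree.ElementTree as ET

INSERT_SQL = (
    "INSERT INTO roming(`productid`, `price`, `name`, `description`, `picture`) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def iter_rows(source):
    """Yield one (productid, price, name, description, first_picture) tuple per <item>."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "item":
            continue
        yield (
            elem.findtext("productid"),
            elem.findtext("price"),
            elem.findtext("name"),
            elem.findtext("description"),
            elem.findtext("pictures/picture"),  # first picture, or None if absent
        )
        # Drop the item's children and text; only an empty <item> stays attached to the root.
        elem.clear()


def import_items(conn, source, batch_size=1000):
    """Insert rows in batches, committing once per batch; return the number of rows sent."""
    cursor = conn.cursor()
    batch, total = [], 0
    for row in iter_rows(source):
        batch.append(row)
        if len(batch) >= batch_size:
            cursor.executemany(INSERT_SQL, batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        cursor.executemany(INSERT_SQL, batch)
        conn.commit()
        total += len(batch)
    cursor.close()
    return total


# Parsing demo on an in-memory document; no database is needed for this part.
demo = io.BytesIO(
    b"<items><item><productid>2321</productid><price>5600</price>"
    b"<name>Product name 1...</name><description>Some desc. for product 1</description>"
    b"<pictures><picture>http://www.server.com/images/1.jpg</picture></pictures>"
    b"</item></items>"
)
for row in iter_rows(demo):
    print(row)
```

With the connection from the question, the call would be `import_items(conn, 'shop.xml')`. Note that `iterparse` still requires a well-formed document, so the closing `</items>` tag must be present.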
