简体   繁体   中英

Using python to generate XML from SQL tables

I have some XML as follows:

<dd>
    <persson>
        <name>sam</name>
        <tel>9748</tel>
    </persson>
    <cat>
        <name>frank</name>
    </cat>
</dd>

I parsed it into two SQL tables, one for the tags and one for the pcdata. The start and stop columns represent the position the tag appears at and ends.

Tags:
 start | stop |  tag   
-------+------+--------
     3 |    5 | name
     6 |    8 | tel
     2 |    9 | persson
    11 |   13 | name
    10 |   14 | cat
     1 |   15 | dd
(6 rows)

Pcdata:
 pos | pcdata 
-----+--------
   4 | sam
   7 | 9748
  12 | frank
(3 rows)

Now i'd like to parse this database back into XML in the original form. I want to write a function that takes both tables and writes the XML out in a file. I'm using python and psycopg2 to do this.

H'mmm having decoded your "columns":

<dd><persson><name>sam</name><tel>9748</tel></persson>
1   2        3     4  5      6    7   8     9                    
<cat><name>frank</name></cat></dd>
10   11    12   13     14    15

I have some questions for you: How did you do that? Why did you do that? Exactly what do you want to achieve? Note that your question title is rather misleading -- "SQL tables" are merely where you have parked your peculiar representation of the data.

Here's some pseudocode to do what you want to do:

pieces = []
result = cursor.execute("select * from tags;")
for start, step, tag in result:
    pieces.append((start, "<" + tag + ">"))
    pieces.append((stop, "</" + tag + ">"))
result = cursor.execute("select * from pcdata;")
for pos, pcdata in result:
    pieces.append((pos, pcdata))
pieces.sort()
xml_stream = "".join(piece[1] for piece in pieces)
your_file_object.write(xml_stream)

In answer to the question about whether the above would put the "positions" in the output stream: No it won't; the following snippet shows it working. The positions are used only to get the soup sorted into the correct order. In the "join", piece[0] refers to the position, but it isn't used, only piece[1] which is the required text.

>>> pieces
[(3, '<name>'), (4, 'sam'), (5, '</name>')]
>>> ''.join(piece[1] for piece in pieces)
'<name>sam</name>'

Relenting on the SQL comment-question:

Although shown with SQLite, this is bog-standard SQL. If your database doesn't grok || as the concatenation operator, try + .

Question that you forgot to ask: "How do I get a <?xml blah-blah ?> thingie up the front?". Answer: See below.

console-prompt>sqlite3
SQLite version 3.6.14
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table tags (start int, stop int, tag text);
sqlite> insert into tags values(3,5,'name');
sqlite> insert into tags values(6,8,'tel');
sqlite> insert into tags values(2,9,'persson');
sqlite> insert into tags values(11,13,'name');
sqlite> insert into tags values(10,14,'cat');
sqlite> insert into tags values(1,15,'dd');
sqlite> create table pcdata (pos int, pcdata text);
sqlite> insert into pcdata values(4,'sam');
sqlite> insert into pcdata values(7,'9748');
sqlite> insert into pcdata values(12,'frank');
sqlite> select datum from (
   ...>     select 0 as posn, '<?xml version="1.0" encoding="UTF-8"?>' as datum
   ...>     union
   ...>     select start as posn, '<' || tag || '>' as datum from tags
   ...>     union
   ...>     select stop as posn, '</' || tag || '>' as datum from tags
   ...>     union
   ...>     select pos as posn, pcdata as datum from pcdata
   ...>     )
   ...> order by posn;
<?xml version="1.0" encoding="UTF-8"?>
<dd>
<persson>
<name>
sam
</name>
<tel>
9748
</tel>
</persson>
<cat>
<name>
frank
</name>
</cat>
</dd>
sqlite>

The simple answer is don't. If you are using Postgres 8.3 or above, build the XML using SQL. It will be much easier.

http://www.postgresql.org/docs/current/static/functions-xml.html

First off, if Postgres includes mechanisms to create XML for you, use them.

Second off, don't use string manipulation to create XML unless you really know what you're doing. And even then, don't. Just concatenating string values from your database will produce poorly-formed XML if any column contains an ampersand, for instance.

Unless you will be processing too much data to fit in memory, use John Machin's approach to parse the data into elements and lxml.etree to create the actual XML elements.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM