
Writing a custom XML file for the Wordpress Importer using lxml

Okay, so here is my current situation:

My knowledge of XML and lxml isn't very good yet, since I have rarely used XML files until now. So please tell me if something in my approach is really stupid. ;-)

I want to feed my Wordpress installation a custom XML file, using the Wordpress importer. The default format can be seen here: XML File

Now, there are some tags that look like this:

<wp:author>

I am not a hundred percent sure, but as far as I learned today, the wp: part of the tag is the namespace prefix.
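To illustrate the prefix/namespace distinction described above, here is a minimal sketch using Python's built-in xml.etree.ElementTree (chosen only for illustration; lxml exposes tags in the same `{uri}name` form). The `channel` wrapper and the snippet itself are made up for the example; only the namespace URI comes from the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI as used in the question.
WP_NS = "http://wordpress.org/export/1.2/"

# Hypothetical snippet: the prefix "wp" is bound to the URI on the parent element.
snippet = '<channel xmlns:wp="%s"><wp:author/></channel>' % WP_NS

channel = ET.fromstring(snippet)
author = channel[0]

# The parser stores the full namespace URI, not the "wp" prefix.
print(author.tag)
```

The prefix is only a shorthand in the serialized file; the element's real name is the URI plus the local name `author`.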

When I tried to use lxml to create those tags, I did this:

author = etree.Element("wp:author")

This caused an error, because I am not allowed to write wp:author, only author. I used Google, looked at the lxml website, and came up with this:

from lxml.builder import ElementMaker

# Bind the "wp" prefix to the WordPress export namespace.
WP = ElementMaker(namespace="http://wordpress.org/export/1.2/",
                  nsmap={'wp': "http://wordpress.org/export/1.2/"})
author = WP("author")

Output:

<wp:author xmlns:wp="http://wordpress.org/export/1.2/"/>

Well, better. The xmlns:wp belongs to the namespace stuff, as I learned today. But I don't want the xmlns:wp stuff to appear, because it doesn't appear in their XML file. I looked up how Wordpress itself exports its content, and they do it like this:

echo '<wp:author_id>' . $author->ID . '</wp:author_id>';

Now my question: is it better to do the same as they do, or should I stick to lxml, as long as there is a way to get a tag without the xmlns:wp stuff? Using lxml to create XML files seems to be the better approach, because it is (normally) pretty easy and easier to read.
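One practical difference between echoing strings and building the tree through a library is escaping. The following is a rough sketch using the standard-library xml.etree.ElementTree as a stand-in (lxml is not required to run it); the `title` tag and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

raw = "Tom & Jerry <3"  # sample text containing XML special characters

# String concatenation, in the style of the PHP echo above: nothing is escaped.
by_hand = "<title>" + raw + "</title>"
try:
    ET.fromstring(by_hand)
    by_hand_ok = True
except ET.ParseError:
    by_hand_ok = False  # the bare "&" and "<" make the document ill-formed

# Building the element through the library escapes the text on serialization.
title = ET.Element("title")
title.text = raw
built = ET.tostring(title, encoding="unicode")

print(by_hand_ok)
print(built)
```

With hand-written output, every value has to be escaped manually before it is concatenated; a tree builder does that as part of serialization.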

I already tried objectify.deannotate, cleanup_namespaces and similar suggestions, but none of them work. I hope some of you have an answer, either suggesting a solution to my problem using lxml or saying it is better to do it the way the Wordpress people do!

If I have overlooked a similar question that has already been answered, I am really sorry; please tell me so.

Thank you
Vaelor

Here is my advice: take a step back from lxml and consider Python's built-in support for XML processing, a module called xml.etree.ElementTree. Import it in a REPL like this:

import xml.etree.ElementTree as ET

and play with it for a while. Here is the Python documentation for the module: http://goo.gl/8FVto

Building an element is as simple as this:

a = ET.Element('wp:author')
ET.dump(a)

Then add some sub-elements. It's all in the docs.
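As a minimal sketch of adding sub-elements while keeping the wp: prefix bound to its namespace, the following uses namespace-qualified names and registers the prefix once. The rss/channel wrapper, the helper `wp()` and the sample value are assumptions made for illustration; only the namespace URI and the author/author_id tags come from the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the question; "wp" is the prefix we want in the output.
WP_NS = "http://wordpress.org/export/1.2/"
ET.register_namespace("wp", WP_NS)


def wp(tag):
    """Hypothetical helper: return the namespace-qualified name for a wp: tag."""
    return "{%s}%s" % (WP_NS, tag)


# Assumed document layout: an rss root with a channel child.
rss = ET.Element("rss", {"version": "2.0"})
channel = ET.SubElement(rss, "channel")

author = ET.SubElement(channel, wp("author"))
ET.SubElement(author, wp("author_id")).text = "1"  # sample value

xml_text = ET.tostring(rss, encoding="unicode")
print(xml_text)
```

Serialized this way, the xmlns:wp declaration is written once on the root element and the nested tags appear as plain `<wp:author>`, without a declaration of their own. In lxml, passing an `nsmap` to the root element should have a comparable effect, though that variant is not shown here.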
