Edit XML file with Python
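I have an XML file that was auto-generated by Informatica BDM, and editing its values has proven quite complicated for me. I made several attempts with xml.etree.ElementTree, but got no results. Here is an excerpt of the file: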
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.informatica.com/Parameterization/1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema"
version="2.0"><!--Specify deployed application specific parameters here.--><!--
<application name="app_2">
<mapping name="M_kafka_hdfs"/>
</application>--><project name="V2">
<folder name="Streaming">
<mapping name="M_kafka_hdfs">
<parameter name="P_s_spark_executor_cores">4</parameter>
<parameter name="P_s_spark_executor_memory">8G</parameter>
<parameter name="P_s_spark_sql_shuffle_partitions">108</parameter>
<parameter name="P_s_spark_network_timeout">180000</parameter>
<parameter name="P_s_spark_executor_heartbeatInterval">6000</parameter>
<parameter name="P_i_maximum_rows_read">0</parameter>
<parameter name="P_s_checkpoint_directory">checkpoint</parameter>
</mapping>
</folder>
</project>
</root>
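My goal is to be able to change a parameter, for example from <parameter name="P_s_spark_executor_memory">8G</parameter>
to <parameter name="P_s_spark_executor_memory">16G</parameter>
So far I can only access the attribute values, not the content of the elements, and I cannot edit them either: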
import xml.etree.ElementTree as ET
treexml = ET.parse('autogenerated.xml')
for element in treexml.iter():
    dict_keys={}
    if element.keys():
        for name, value in element.items():
            dict_keys[name]=value
            print(dict_keys[name])
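For context, the loop above only reads attributes: `element.items()` yields attribute name/value pairs, whereas the content between the opening and closing tags is held in `element.text`. A minimal sketch of the difference, using a small inline sample (an assumption for illustration, not the real file):

```python
import xml.etree.ElementTree as ET

# Inline sample standing in for the real file (assumption for illustration)
SAMPLE = (
    '<root xmlns="http://www.informatica.com/Parameterization/1.0">'
    '<parameter name="P_s_spark_executor_memory">8G</parameter>'
    '</root>'
)

root = ET.fromstring(SAMPLE)
contents = {}
for element in root.iter():
    name = element.get('name')         # attribute value, or None if absent
    if name is not None:
        contents[name] = element.text  # the text between the tags
print(contents)
```

Assigning to `element.text` changes the content in memory; the tree still has to be written back out for the change to reach a file.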
The idea is to be able to overwrite any parameter like this:
xml["parameter"]["P_s_spark_sql_shuffle_partitions"] = 64
and have it written to the file as <parameter name="P_s_spark_sql_shuffle_partitions">64</parameter>
Try this code:
import xml.etree.ElementTree as ET
name_space = 'http://www.informatica.com/Parameterization/1.0'
ET.register_namespace('', name_space)
treexml = ET.parse(r'c:\test\test.xml')
# get all elements with 'parameter' tags (it is necessary to specify the namespace prefix)
params = treexml.getroot().findall(f'.//{{{name_space}}}parameter')
# make the dict with names as keys and previously found elements as value
xml = {el.attrib['name']: el for el in params}
# set the text of the "P_s_spark_sql_shuffle_partitions"
xml["P_s_spark_sql_shuffle_partitions"].text = str(64)
# write out the xml
treexml.write(r'c:\test\testOut.xml')
Output (c:\test\testOut.xml):
<root xmlns="http://www.informatica.com/Parameterization/1.0" version="2.0"><project name="V2">
<folder name="Streaming">
<mapping name="M_kafka_hdfs">
<parameter name="P_s_spark_executor_cores">4</parameter>
<parameter name="P_s_spark_executor_memory">8G</parameter>
<parameter name="P_s_spark_sql_shuffle_partitions">64</parameter>
<parameter name="P_s_spark_network_timeout">180000</parameter>
<parameter name="P_s_spark_executor_heartbeatInterval">6000</parameter>
<parameter name="P_i_maximum_rows_read">0</parameter>
<parameter name="P_s_checkpoint_directory">checkpoint</parameter>
</mapping>
</folder>
</project>
</root>
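Note that the output above no longer contains the XML declaration or the comments from the original file: the default parser discards comments, and `write()` omits the declaration unless asked for it. If those matter, something along these lines may help. This is only a sketch, assuming Python 3.8 or later (for `insert_comments`); `set_parameter` is a hypothetical helper name, and the inline sample stands in for the real file.

```python
import io
import xml.etree.ElementTree as ET

NAME_SPACE = 'http://www.informatica.com/Parameterization/1.0'
ET.register_namespace('', NAME_SPACE)


def set_parameter(source, target, name, value):
    """Hypothetical helper: set the text of one <parameter>, keeping comments."""
    # insert_comments=True (Python 3.8+) keeps comments in the parsed tree
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    tree = ET.parse(source, parser=parser)
    for el in tree.getroot().iter(f'{{{NAME_SPACE}}}parameter'):
        if el.get('name') == name:
            el.text = str(value)
            break
    else:
        # no parameter with that name was found
        raise KeyError(name)
    # ask explicitly for the XML declaration
    tree.write(target, encoding='UTF-8', xml_declaration=True)


# Usage with in-memory data (assumption for illustration);
# file paths work the same way for source and target.
sample = io.BytesIO(
    b'<root xmlns="http://www.informatica.com/Parameterization/1.0">'
    b'<!--note-->'
    b'<parameter name="P_s_spark_executor_memory">8G</parameter>'
    b'</root>'
)
out = io.BytesIO()
set_parameter(sample, out, 'P_s_spark_executor_memory', '16G')
result = out.getvalue().decode('UTF-8')
print(result)
```

Raising `KeyError` for an unknown name is a design choice here, so that a typo in a parameter name does not silently produce an unchanged file.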