How do I add a blank cell inside a column of a Pandas dataframe?
I'm a beginner to Python and to using dataframes through Pandas. I'm trying to extract a table from an XML file into an Excel file using xml.dom.minidom. This is what the original table should look like (notice the blank entry under 'Bike'):
VEHICLE  BRAND
Car      Mercedes
Bike     Kawasaki
         Ducati
Truck    Ram
I am trying to extract this table from the given XML file:
<Info_Collection>
<Info car="Car">
<V_Collection>
<Brand type="Mercedes"/>
</V_Collection>
</Info>
<Info car="Bike">
<V_Collection>
<Brand type="Kawasaki"/>
<Brand type="Ducati"/>
</V_Collection>
</Info>
<Info car="Truck">
<V_Collection>
<Brand type="Ram"/>
</V_Collection>
</Info>
</Info_Collection>
This is the code that I am using:
import xml.dom.minidom
import pandas as pd

def main():
    x1 = []
    x2 = []
    doc = xml.dom.minidom.parse('xml_file')
    t1 = doc.getElementsByTagName("Info")
    t2 = doc.getElementsByTagName("Brand")
    for a in t1:
        x1.append(a.getAttribute("car"))
    for a in t2:
        x2.append(a.getAttribute("type"))
    # Pad the shorter column at the end until both lists have the same length
    while len(x1) != len(x2):
        x1.append("")
    boDF = pd.DataFrame({'VEHICLE': x1, 'BRAND': x2})
    # 'writer' is an Excel writer object created elsewhere (not shown in the post)
    boDF.to_excel(writer, sheet_name='Sheet1', index=0, startrow=1)
    writer.save()

if __name__ == "__main__":
    main()
After running it, the output table is as follows:
VEHICLE  BRAND
Car      Mercedes
Bike     Kawasaki
Truck    Ducati
         Ram
Could someone kindly help me figure out how to insert a blank cell between 'Bike' and 'Truck'? I tried running both for loops together and comparing the list lengths, so that a blank entry would be added to the first column whenever they differed. However, I cannot get it to work. I know that the while loop in my code adds blank entries to the end of the first column, but I cannot figure out how to add one anywhere inside the column.
See below.
The idea is to create a CSV file, a format that is usually associated with Excel. The XML parsing is done by 'ElementTree', the XML parsing library built into Python. When you double-click the CSV file, it will usually be opened by Excel and you will get the table you are looking for.
Note that the XML you posted was not valid and had to be fixed (mismatched tag names: v_Collection vs. V_Collection).
import xml.etree.ElementTree as ET
xml = '''<Info_Collection>
<Info car="Car">
<V_Collection>
<Brand type="Mercedes"/>
</V_Collection>
</Info>
<Info car="Bike">
<V_Collection>
<Brand type="Kawasaki"/>
<Brand type="Ducati"/>
</V_Collection>
</Info>
<Info car="Truck">
<V_Collection>
<Brand type="Ram"/>
</V_Collection>
</Info>
</Info_Collection>'''
data = [['VEHICLE', 'BRAND']]
root = ET.fromstring(xml)
info_list = root.findall('.//Info')
cars = set()  # vehicles that have already been written once
for info_entry in info_list:
    car = info_entry.attrib['car']
    for brand in [brand.attrib['type'] for brand in info_entry.findall('.//Brand')]:
        # Write the vehicle name only on its first row; leave it blank afterwards
        data.append([car if car not in cars else '', brand])
        cars.add(car)
# Write the rows as comma-separated lines
with open('cars.csv', 'w') as out:
    for entry in data:
        out.write(','.join(entry) + '\n')
Output (cars.csv):
VEHICLE,BRAND
Car,Mercedes
Bike,Kawasaki
,Ducati
Truck,Ram
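Since the question uses xml.dom.minidom, the same idea can also be sketched with that library. This is only a rough variant, not the asker's or the answerer's exact code: the names XML_TEXT and rows_from_xml are made up for illustration, and the XML is assumed to be the corrected version shown above. The key point is to look up the Brand elements inside each Info element, rather than collecting all Info and all Brand elements separately.

```python
import csv
import io
import xml.dom.minidom

# Assumed input: the corrected XML from the question, inlined for the example.
XML_TEXT = """<Info_Collection>
  <Info car="Car">
    <V_Collection><Brand type="Mercedes"/></V_Collection>
  </Info>
  <Info car="Bike">
    <V_Collection>
      <Brand type="Kawasaki"/>
      <Brand type="Ducati"/>
    </V_Collection>
  </Info>
  <Info car="Truck">
    <V_Collection><Brand type="Ram"/></V_Collection>
  </Info>
</Info_Collection>"""


def rows_from_xml(xml_text):
    """Return [vehicle, brand] rows; the vehicle is blank after its first brand.

    Hypothetical helper, written for this example only.
    """
    doc = xml.dom.minidom.parseString(xml_text)
    rows = []
    for info in doc.getElementsByTagName("Info"):
        vehicle = info.getAttribute("car")
        # Search for Brand elements only inside the current Info element.
        for i, brand in enumerate(info.getElementsByTagName("Brand")):
            rows.append([vehicle if i == 0 else "", brand.getAttribute("type")])
    return rows


rows = rows_from_xml(XML_TEXT)

# Build the CSV text in memory; csv.writer takes care of quoting if needed.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["VEHICLE", "BRAND"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

If an Excel file is preferred over CSV, the same rows could presumably be passed to Pandas, for example pd.DataFrame(rows, columns=["VEHICLE", "BRAND"]).to_excel("cars.xlsx", index=False), where the file name is a placeholder and an Excel engine such as openpyxl needs to be installed.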