How do I add a blank cell inside a column of a Pandas dataframe?

I'm a beginner at Python and at using dataframes through Pandas. I'm trying to extract a table from an XML file into an Excel file using xml.dom.minidom. This is what the original table should look like (notice the blank entry under 'Bike'):

VEHICLE   BRAND
Car       Mercedes
Bike      Kawasaki
          Ducati
Truck     Ram

I am trying to extract this table from the given XML file:

<Info_Collection>
    <Info car="Car">
        <V_Collection>
            <Brand type="Mercedes"/>
        </V_Collection>
    </Info>
    <Info car="Bike">
        <V_Collection>
            <Brand type="Kawasaki"/>
            <Brand type="Ducati"/>
        </V_Collection>
    </Info>
    <Info car="Truck">
        <V_Collection>
            <Brand type="Ram"/>
        </V_Collection>
    </Info>
</Info_Collection>

This is the code that I am using:

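import xml.dom.minidom

import pandas as pd

def main():
    x1 = []
    x2 = []
    doc = xml.dom.minidom.parse('xml_file')
    t1 = doc.getElementsByTagName("Info")
    t2 = doc.getElementsByTagName("Brand")
    for a in t1:
        x1.append(a.getAttribute("car"))
    for a in t2:
        x2.append(a.getAttribute("type"))
    # Pad the shorter column at the end so both lists have the same length
    while len(x1) != len(x2):
        x1.append("")
    boDF = pd.DataFrame({'VEHICLE': x1, 'BRAND': x2})
    # The post does not show how writer was created; the file name here is assumed
    writer = pd.ExcelWriter('output.xlsx')
    boDF.to_excel(writer, sheet_name='Sheet1', index=False, startrow=1)
    writer.close()

if __name__ == "__main__":
    main()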

After running it, the output table is as follows:

VEHICLE   BRAND
Car       Mercedes
Bike      Kawasaki
Truck     Ducati
          Ram

Could someone kindly help me figure out how to insert a blank cell between 'Bike' and 'Truck'? I tried to run both for loops concurrently and compare the lengths of the two lists, adding a blank entry to the first column whenever they were not equal. However, I cannot get it to work. I know that the while loop in my code adds blanks to the end of the first column, but I cannot figure out how to add one anywhere inside the column.

See below.

The idea is to create a CSV file, a format that is usually associated with Excel. The XML parsing is done by Python's built-in XML parsing library, 'ElementTree'. When you double-click the CSV file, it will usually be opened by Excel and you will get the table you are looking for.

Note that the XML as you posted it was not valid and had to be fixed (v_Collection vs. V_Collection: the opening and closing tag names must match, including case).

import xml.etree.ElementTree as ET

xml = '''<Info_Collection>
    <Info car="Car">
        <V_Collection>
            <Brand type="Mercedes"/>
        </V_Collection>
    </Info>
    <Info car="Bike">
        <V_Collection>
            <Brand type="Kawasaki"/>
            <Brand type="Ducati"/>
        </V_Collection>
    </Info>
    <Info car="Truck">
        <V_Collection>
            <Brand type="Ram"/>
        </V_Collection>
    </Info>
</Info_Collection>'''

data = [['VEHICLE', 'BRAND']]  # header row
root = ET.fromstring(xml)
cars = set()  # vehicles that have already been written once
for info_entry in root.findall('.//Info'):
    car = info_entry.attrib['car']
    for brand_entry in info_entry.findall('.//Brand'):
        # Show the vehicle name only on its first row; leave later rows blank
        data.append([car if car not in cars else '', brand_entry.attrib['type']])
        cars.add(car)
# Write the rows as comma-separated lines
with open('cars.csv', 'w') as out:
    for entry in data:
        out.write(','.join(entry) + '\n')

Output (cars.csv):

VEHICLE,BRAND
Car,Mercedes
Bike,Kawasaki
,Ducati
Truck,Ram
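
If the result needs to stay in a Pandas dataframe, as the question asks, the same idea can be sketched as follows. This is only a sketch under a few assumptions: the XML has the structure shown above, and the names XML and build_frame are made up for this example. It builds one row per Brand while walking each Info element, so the blank lands right where the extra brand is instead of at the end of the column.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Same data as in the question, compacted (XML is a name assumed for this sketch)
XML = """<Info_Collection>
    <Info car="Car"><V_Collection><Brand type="Mercedes"/></V_Collection></Info>
    <Info car="Bike"><V_Collection><Brand type="Kawasaki"/><Brand type="Ducati"/></V_Collection></Info>
    <Info car="Truck"><V_Collection><Brand type="Ram"/></V_Collection></Info>
</Info_Collection>"""


def build_frame(xml_text):
    """Hypothetical helper: one row per Brand, VEHICLE shown only on the first row of each Info."""
    rows = []
    for info in ET.fromstring(xml_text).iter("Info"):
        for i, brand in enumerate(info.iter("Brand")):
            # Only the first brand of an Info element carries the vehicle name
            vehicle = info.get("car") if i == 0 else ""
            rows.append({"VEHICLE": vehicle, "BRAND": brand.get("type")})
    return pd.DataFrame(rows, columns=["VEHICLE", "BRAND"])


df = build_frame(XML)
print(df.to_string(index=False))
```

The resulting frame can then be passed to to_excel the same way as in the question. Pairing each brand with its parent Info element is what the two separate getElementsByTagName loops in the question lose, since they collect vehicles and brands independently.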
