Converting a suds object into a dataframe with Pandas

I have a list that looks like this:

`[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }, (deliveryObject){
   id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113403"
   status = "sent"
   type = "transactional"
   }
]`

When I call type() on it, Python tells me it is a list.

When I convert it to a dataframe with pd.DataFrame(df), the result looks like this:

[Image: my list as a dataframe]

Can anyone help me here? The dataframe is supposed to have column names such as "Id", "Start", "messageId" etc., but they just appear as the first element of each observation instead, with the columns named 0, 1, 2 etc.

Any help is appreciated, thank you!

OK, this doesn't look pretty, but it works. I converted your list into a string:

import re
import pandas as pd

x = """[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }, (deliveryObject){
   id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113403"
   status = "sent"
   type = "transactional"
   }
]"""

Then I used regex to (somehow) turn it into a list of dictionaries:

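a = re.sub(' =', ':', x)
a = re.sub(r'\(deliveryObject\)', '', a)

# Quote the field names so they become dictionary keys
for key in ['id', 'start', 'messageId', 'status', 'type']:
    a = re.sub(key, '\'' + key + '\'', a)

# Put a comma between the key/value pairs
a = re.sub("(?<=[\"0])\n(?= +?[\'])", '\n,', a)
# Close and open the quotes around the unquoted date values
a = re.sub('(?<=[0])\n(?=,)', '\"\n', a)
a = re.sub('(?<=[:]) (?=[0-9])', ' \"', a)
# Wrap every value in a one-element list
a = re.sub(r'(?<= )"(?=[\w])', '["', a)
a = re.sub(r'(?<=[\w])"(?=\n)', '"]', a)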

Now you have a list of dictionaries. The first row looks like this:

import ast

# literal_eval only accepts Python literals, so it is safer than eval here
list_of_dict = ast.literal_eval(a)
df = pd.DataFrame(list_of_dict[0])
print(df.head())

                                     id                start                         messageId status    type
0  0bf003ee0000000000000000000002a11cb6  2019-01-02 09:30:00  68027b94b892396ed29581cde9ad07ff   sent  normal

Add the rest of the dictionaries from list_of_dict.
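One way to add the remaining rows is sketched below. It assumes list_of_dict has the shape produced by the regex step above, where every value is wrapped in a one-element list; the literal is typed out here only so the snippet runs on its own.

```python
import pandas as pd

# Same shape as the regex step produces: each value is a one-element list.
list_of_dict = [
    {"id": ["0bf003ee0000000000000000000002a11cb6"],
     "start": ["2019-01-02 09:30:00"],
     "messageId": ["68027b94b892396ed29581cde9ad07ff"],
     "status": ["sent"],
     "type": ["normal"]},
    {"id": ["0bf0BE3ABFFDF8744952893782139E82793B"],
     "start": ["2018-12-29 23:00:00"],
     "messageId": ["0bc403eb0000000000000000000000113404"],
     "status": ["sent"],
     "type": ["transactional"]},
    {"id": ["0bf0702D03CB42D848CBB0B0AF023A87FA65"],
     "start": ["2018-12-29 23:00:00"],
     "messageId": ["0bc403eb0000000000000000000000113403"],
     "status": ["sent"],
     "type": ["transactional"]},
]

# Build a one-row frame per dictionary, then stack them into one frame.
frames = [pd.DataFrame(d) for d in list_of_dict]
df = pd.concat(frames, ignore_index=True)
print(df)
```

ignore_index=True renumbers the rows 0, 1, 2 instead of repeating index 0 for every row.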

Please feel free to improve my regex; I know it looks bad.
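A possible simplification, offered only as a sketch: instead of rewriting the text until it is a valid Python literal, pull the "name = value" pairs out of each {...} block directly. It assumes every field sits on its own line and that values never contain braces or line breaks. The names text, pair and records are made up for this example.

```python
import re
import pandas as pd

text = """[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }
]"""

# One "name = value" pair per line; the double quotes around the value are optional.
pair = re.compile(r'^\s*(\w+) = "?(.*?)"?\s*$', re.MULTILINE)

records = []
# Each {...} block holds the fields of one object.
for block in re.findall(r"\{(.*?)\}", text, flags=re.DOTALL):
    records.append(dict(pair.findall(block)))

df = pd.DataFrame(records)
print(df)
```

All values come out as strings, including start. This still parses the printed representation, so it is only worth using when the original objects are no longer available.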

If this is for Bronto, using its SOAP API through suds, then deliveryObject is just a suds object.

You can do:

import pandas as pd
from suds.client import Client

# list_of_deliveryObjects is the list returned by the SOAP call, i.e. the
# [(deliveryObject){...}, (deliveryObject){...}, ...] list shown in the question.

# Client.dict() turns each suds object into a plain dict
data = [Client.dict(suds_object) for suds_object in list_of_deliveryObjects]
df = pd.DataFrame(data)
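To see why the conversion to dicts matters, here is a small sketch that runs without suds. FakeDeliveryObject is a hypothetical stand-in, not part of suds. It assumes, as the output in the question suggests, that iterating over a suds object yields (name, value) pairs.

```python
import pandas as pd


class FakeDeliveryObject:
    """Hypothetical stand-in for a suds object: iterating yields (name, value) pairs."""

    def __init__(self, **fields):
        self._fields = fields

    def __iter__(self):
        return iter(self._fields.items())


objects = [
    FakeDeliveryObject(id="0bf003ee0000000000000000000002a11cb6",
                       status="sent", type="normal"),
    FakeDeliveryObject(id="0bf0BE3ABFFDF8744952893782139E82793B",
                       status="sent", type="transactional"),
]

# Treating each object as a plain sequence keeps the (name, value) pairs as
# cell contents, so the columns end up labelled 0, 1, 2.
raw = pd.DataFrame([list(obj) for obj in objects])
print(raw)

# Turning each object into a dict first makes the names the column labels.
fixed = pd.DataFrame([dict(obj) for obj in objects])
print(fixed)
```

With real suds objects, Client.dict(suds_object) plays the role that dict(obj) plays in this sketch.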

I did this:

import pandas as pd

lst = [{
    'id': "0bf003ee0000000000000000000002a11cb6",
    'start': "2019-01-02 09:30:00",
    'messageId': "68027b94b892396ed29581cde9ad07ff",
    'status': "sent",
    'type': "normal"
}, {
    'id': "0bf0BE3ABFFDF8744952893782139E82793B",
    'start': "2018-12-29 23:00:00",
    'messageId': "0bc403eb0000000000000000000000113404",
    'status': "sent",
    'type': "transactional"
}, {
    'id': "0bf0702D03CB42D848CBB0B0AF023A87FA65",
    'start': "2018-12-29 23:00:00",
    'messageId': "0bc403eb0000000000000000000000113403",
    'status': "sent",
    'type': "transactional"
}]
df = pd.DataFrame(lst)
df

and got this (see the attached image too):

                                     id                             messageId                start  status           type
0  0bf003ee0000000000000000000002a11cb6      68027b94b892396ed29581cde9ad07ff  2019-01-02 09:30:00    sent         normal
1  0bf0BE3ABFFDF8744952893782139E82793B  0bc403eb0000000000000000000000113404  2018-12-29 23:00:00    sent  transactional
2  0bf0702D03CB42D848CBB0B0AF023A87FA65  0bc403eb0000000000000000000000113403  2018-12-29 23:00:00    sent  transactional

[Image: result]
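The output above lists the columns alphabetically rather than in the order the keys were written, and start is stored as plain text. If either matters, one option is to pass the column order explicitly and parse the dates afterwards. This is a sketch; the name wanted is made up for the example.

```python
import pandas as pd

lst = [
    {'id': "0bf003ee0000000000000000000002a11cb6",
     'start': "2019-01-02 09:30:00",
     'messageId': "68027b94b892396ed29581cde9ad07ff",
     'status': "sent",
     'type': "normal"},
    {'id': "0bf0BE3ABFFDF8744952893782139E82793B",
     'start': "2018-12-29 23:00:00",
     'messageId': "0bc403eb0000000000000000000000113404",
     'status': "sent",
     'type': "transactional"},
]

# columns= fixes the column order regardless of how the dicts are ordered.
wanted = ['id', 'start', 'messageId', 'status', 'type']
df = pd.DataFrame(lst, columns=wanted)

# Parse the text timestamps into real datetime values.
df['start'] = pd.to_datetime(df['start'])
print(df.dtypes)
```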
