Converting a suds object into a dataframe with Pandas
I have a list that looks like this:
`[(deliveryObject){
id = "0bf003ee0000000000000000000002a11cb6"
start = 2019-01-02 09:30:00
messageId = "68027b94b892396ed29581cde9ad07ff"
status = "sent"
type = "normal"
}, (deliveryObject){
id = "0bf0BE3ABFFDF8744952893782139E82793B"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113404"
status = "sent"
type = "transactional"
}, (deliveryObject){
id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113403"
status = "sent"
type = "transactional"
}
]`
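When I call type() on it, Python tells me it is a list.
When I convert it to a DataFrame with pd.DataFrame(df), the result is not what I expected.
Can anyone help me out? The DataFrame should have column names such as "Id", "Start", "messageId", and so on, but instead they show up as the first element of each observation, and the columns are named 0, 1, 2, etc.
Any help is appreciated, thanks!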
OK, this doesn't look pretty, but it works. I converted your list into a string:
import re
import pandas as pd
x = """[(deliveryObject){
id = "0bf003ee0000000000000000000002a11cb6"
start = 2019-01-02 09:30:00
messageId = "68027b94b892396ed29581cde9ad07ff"
status = "sent"
type = "normal"
}, (deliveryObject){
id = "0bf0BE3ABFFDF8744952893782139E82793B"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113404"
status = "sent"
type = "transactional"
}, (deliveryObject){
id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113403"
status = "sent"
type = "transactional"
}
]"""
Then I used regular expressions to massage it into a list of dictionaries:
a = re.sub(' =', ':', x)
a = re.sub(r'\(deliveryObject\)', '', a)
# Wrap each field name in single quotes so it becomes a dict key
for key in ['id', 'start', 'messageId', 'status', 'type']:
    a = re.sub(key, "'" + key + "'", a)
a = re.sub("(?<=[\"0])\n(?= +?[\'])", '\n,', a)
a = re.sub('(?<=[0])\n(?=,)', '\"\n', a)
a = re.sub('(?<=[:]) (?=[0-9])', ' \"', a)
a = re.sub('(?<= )\"(?=[\w])', '[\"', a)
a = re.sub('(?<=[\w])\"(?=\n)', '\"]', a)
Now you have a list of dictionaries. The first row looks like this:
list_of_dict = eval(a)
df = pd.DataFrame(list_of_dict[0])
print(df.head())
id start messageId status type
0 0bf003ee0000000000000000000002a11cb6 2019-01-02 09:30:00 68027b94b892396ed29581cde9ad07ff sent normal
Add the remaining dictionaries from list_of_dict in the same way.
Feel free to improve my regular expressions; I know they look terrible.
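If eval feels risky, a shorter route may be to pull the name = value pairs out of each block directly. The following is only a sketch: it assumes every field sits on its own line in the form shown in the question, and `parse_records` and `FIELD` are names made up for this example. The sample text is shortened to three fields per object.

```python
import re

# Shortened sample of the printed suds list from the question.
text = """[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   status = "sent"
 }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   status = "sent"
 }]"""

# One "name = value" pair per line; the quotes around the value are optional.
FIELD = re.compile(r'^\s*(\w+) = "?(.*?)"?\s*$', re.MULTILINE)


def parse_records(dump):
    """Return one dict per '{ ... }' block found in the dump."""
    records = []
    # Everything between '{' and the next '}' is the body of one object.
    for body in re.findall(r'\{(.*?)\}', dump, flags=re.DOTALL):
        records.append(dict(FIELD.findall(body)))
    return records


records = parse_records(text)
print(records[0])
```

Passing `records` to `pd.DataFrame` should then give one row per object, with the field names as columns. All values come out as strings, including `start`.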
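If this is for Bronto, and you are using its SOAP API through suds, then deliveryObject is just a suds object.
You can do:
from suds.client import Client
import pandas as pd
# The list is shown below in its printed form; in practice it comes back from a suds call.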
list_of_deliveryObjects = [(deliveryObject){
id = "0bf003ee0000000000000000000002a11cb6"
start = 2019-01-02 09:30:00
messageId = "68027b94b892396ed29581cde9ad07ff"
status = "sent"
type = "normal"
}, (deliveryObject){
id = "0bf0BE3ABFFDF8744952893782139E82793B"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113404"
status = "sent"
type = "transactional"
}, (deliveryObject){
id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
start = 2018-12-29 23:00:00
messageId = "0bc403eb0000000000000000000000113403"
status = "sent"
type = "transactional"
}
]
data = [Client.dict(suds_object) for suds_object in list_of_deliveryObjects]
df = pd.DataFrame(data)
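For anyone who wants to try the idea without a live SOAP connection, here is a small sketch. As far as I know, suds objects yield (name, value) pairs when iterated, which is why `dict()` works on them and presumably also why the DataFrame in the question ended up with columns 0, 1, 2. `FakeDelivery` is a hypothetical stand-in written for this example; it is not part of suds.

```python
import datetime as dt

import pandas as pd


class FakeDelivery:
    """Hypothetical stand-in for a suds object: iterating yields (name, value) pairs."""

    def __init__(self, **fields):
        self._fields = fields

    def __iter__(self):
        return iter(self._fields.items())


deliveries = [
    FakeDelivery(id="0bf003ee0000000000000000000002a11cb6",
                 start=dt.datetime(2019, 1, 2, 9, 30),
                 status="sent", type="normal"),
    FakeDelivery(id="0bf0BE3ABFFDF8744952893782139E82793B",
                 start=dt.datetime(2018, 12, 29, 23, 0),
                 status="sent", type="transactional"),
]

# Turn every object into a plain dict first, then build the frame from the dicts.
records = [dict(d) for d in deliveries]
df = pd.DataFrame(records)
print(df)
```

With real suds objects, `Client.dict(obj)` from the snippet above plays the role of `dict(d)` here.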
I did this:
import pandas as pd
lst =[{
'id':"0bf003ee0000000000000000000002a11cb6",
'start' : "2019-01-02 09:30:00",
'messageId': "68027b94b892396ed29581cde9ad07ff",
'status' : "sent",
'type' : "normal"
},{
'id' : "0bf0BE3ABFFDF8744952893782139E82793B",
'start' : "2018-12-29 23:00:00",
'messageId' : "0bc403eb0000000000000000000000113404",
'status' : "sent",
'type' : "transactional"
}, {
'id' : "0bf0702D03CB42D848CBB0B0AF023A87FA65",
'start' : "2018-12-29 23:00:00",
'messageId' : "0bc403eb0000000000000000000000113403",
'status' : "sent",
'type' : "transactional"
}]
df = pd.DataFrame(lst)
df
And got this:
id messageId start status type
0 0bf003ee0000000000000000000002a11cb6 68027b94b892396ed29581cde9ad07ff 2019-01-02 09:30:00 sent normal
1 0bf0BE3ABFFDF8744952893782139E82793B 0bc403eb0000000000000000000000113404 2018-12-29 23:00:00 sent transactional
2 0bf0702D03CB42D848CBB0B0AF023A87FA65 0bc403eb0000000000000000000000113403 2018-12-29 23:00:00 sent transactional
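Two small additions that may help here: the `start` values are plain strings in this version, and the output above lists the columns alphabetically. A sketch, assuming you want the original field order and real timestamps (shortened to two rows):

```python
import pandas as pd

lst = [
    {"id": "0bf003ee0000000000000000000002a11cb6",
     "start": "2019-01-02 09:30:00",
     "messageId": "68027b94b892396ed29581cde9ad07ff",
     "status": "sent", "type": "normal"},
    {"id": "0bf0BE3ABFFDF8744952893782139E82793B",
     "start": "2018-12-29 23:00:00",
     "messageId": "0bc403eb0000000000000000000000113404",
     "status": "sent", "type": "transactional"},
]

# Passing columns= fixes the column order explicitly.
df = pd.DataFrame(lst, columns=["id", "start", "messageId", "status", "type"])

# Parse the strings into datetime64 values so they can be sorted or filtered by date.
df["start"] = pd.to_datetime(df["start"])
print(df.dtypes)
```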