Python re: extract items within curly brackets

I have a large dataset in my SQL database, with rows such as:

("Successfully confirmed payment - {'PAYMENTINFO_0_TRANSACTIONTYPE': ['expresscheckout'], 'ACK': ['Success'], 'PAYMENTINFO_0_PAYMENTTYPE': ['instant'], 'PAYMENTINFO_0_RECEIPTID': ['1037-5147-8706-9322'], 'PAYMENTINFO_0_REASONCODE': ['None'], 'SHIPPINGOPTIONISDEFAULT': ['false'], 'INSURANCEOPTIONSELECTED': ['false'], 'CORRELATIONID': ['1917b2c0e5a51'], 'PAYMENTINFO_0_TAXAMT': ['0.00'], 'PAYMENTINFO_0_TRANSACTIONID': ['3U4531424V959583R'], 'PAYMENTINFO_0_ACK': ['Success'], 'PAYMENTINFO_0_PENDINGREASON': ['authorization'], 'PAYMENTINFO_0_AMT': ['245.40'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITY': ['Eligible'], 'PAYMENTINFO_0_ERRORCODE': ['0'], 'TOKEN': ['EC-82295469MY6979044'], 'VERSION': ['95.0'], 'SUCCESSPAGEREDIRECTREQUESTED': ['true'], 'BUILD': ['7507921'], 'PAYMENTINFO_0_CURRENCYCODE': ['GBP'], 'TIMESTAMP': ['2013-08-29T09:15:59Z'], 'PAYMENTINFO_0_SECUREMERCHANTACCOUNTID': ['XFQALBN3EBE8S'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITYTYPE': ['ItemNotReceivedEligible,UnauthorizedPaymentEligible'], 'PAYMENTINFO_0_ORDERTIME': ['2013-08-29T09:15:59Z'], 'PAYMENTINFO_0_PAYMENTSTATUS': ['Pending']}", 1L, datetime.datetime(2013, 8, 29, 11, 15, 59))

I use the following regex to pull the data out of the first item, the part that is within curly brackets:

paypal_meta_re = re.compile(r"""\{(.*)\}""").findall

This works as expected, but when I try to remove the square brackets from the dictionary values, I get an error.

Here is my code:

paypal_meta = get_paypal(order_id)
paypal_msg_re = paypal_meta_re(paypal_meta[0])
print type(paypal_msg_re), len(paypal_msg_re)
paypal_str = ''.join(map(str, paypal_msg_re))
print paypal_str, type(paypal_str)
paypal = ast.literal_eval(paypal_str)
paypal_dict = {}
for k, v in paypal.items():
    paypal_dict[k] = str(v[0])
if paypal_dict:
    namespace['payment_gateway'] = { 'paypal' : paypal_dict}

And here is the traceback:

Traceback (most recent call last):
  File "users.py", line 383, in <module>
    orders = get_orders(user_id, mongo_user_id, address_book_list)
  File "users.py", line 290, in get_orders
    paypal = ast.literal_eval(paypal_str)
  File "/usr/local/Cellar/python/2.7.2/lib/python2.7/ast.py", line 49, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/local/Cellar/python/2.7.2/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    'PAYMENTINFO_0_TRANSACTIONTYPE': ['expresscheckout'], 'ACK': ['Success'], 'PAYMENTINFO_0_PAYMENTTYPE': ['instant'], 'PAYMENTINFO_0_RECEIPTID': ['2954-8480-1689-8177'], 'PAYMENTINFO_0_REASONCODE': ['None'], 'SHIPPINGOPTIONISDEFAULT': ['false'], 'INSURANCEOPTIONSELECTED': ['false'], 'CORRELATIONID': ['5f22a1dddd174'], 'PAYMENTINFO_0_TAXAMT': ['0.00'], 'PAYMENTINFO_0_TRANSACTIONID': ['36H74806W7716762Y'], 'PAYMENTINFO_0_ACK': ['Success'], 'PAYMENTINFO_0_PENDINGREASON': ['authorization'], 'PAYMENTINFO_0_AMT': ['86.76'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITY': ['PartiallyEligible'], 'PAYMENTINFO_0_ERRORCODE': ['0'], 'TOKEN': ['EC-6B957889FK3149915'], 'VERSION': ['95.0'], 'SUCCESSPAGEREDIRECTREQUESTED': ['true'], 'BUILD': ['6680107'], 'PAYMENTINFO_0_CURRENCYCODE': ['GBP'], 'TIMESTAMP': ['2013-07-02T13:02:50Z'], 'PAYMENTINFO_0_SECUREMERCHANTACCOUNTID': ['XFQALBN3EBE8S'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITYTYPE': ['ItemNotReceivedEligible'], 'PAYMENTINFO_0_ORDERTIME': ['2013-07-02T13:02:49Z'], 'PAYMENTINFO_0_PAYMENTSTATUS': ['Pending']
                                   ^
SyntaxError: invalid syntax

Whereas if I split the string instead, using:

msg, paypal_msg = paypal_meta[0].split(' - ')
paypal = ast.literal_eval(paypal_msg)
paypal_dict = {}
for k, v in paypal.items():
    paypal_dict[k] = str(v[0])
if paypal_dict:
    namespace['payment_gateway'] = { 'paypal' : paypal_dict}
insert = orders_dbs.save(namespace)
return insert

This works, but I can't use it, because some of the returned records don't split cleanly, so the result is not accurate.

Basically, I want to take the items in the curly brackets, remove the square brackets from the values, and then create a new dictionary from that.

You need to include the curly braces in the captured group; your regex omits them, so ast.literal_eval() receives the bare key-value pairs rather than a dictionary literal:

paypal_meta_re = re.compile(r"""({.*})""").findall

Note that the parentheses are now around the {...}.
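As a minimal sketch of the corrected extraction, the snippet below runs the braces-included pattern against a shortened, made-up sample row and then flattens the single-item lists. The helper name extract_paypal and the sample string are assumptions for illustration, not part of the original code, and the snippet is written for Python 3 rather than the Python 2.7 shown in the traceback.

```python
import ast
import re

# The capture group now includes the braces, so each match is a
# complete dict literal that ast.literal_eval() can parse.
paypal_meta_re = re.compile(r"""({.*})""").findall


def extract_paypal(message):
    """Return {key: first value as str}, or {} when no {...} is found.

    Hypothetical helper, named here only for illustration.
    """
    matches = paypal_meta_re(message)
    if not matches:
        return {}
    parsed = ast.literal_eval(matches[0])
    # Each value is a one-element list; keep only its first item.
    return {key: str(values[0]) for key, values in parsed.items()}


# Shortened, made-up sample with two of the keys from the question.
sample = ("Successfully confirmed payment - "
          "{'ACK': ['Success'], 'PAYMENTINFO_0_AMT': ['245.40']}")
result = extract_paypal(sample)
print(result)
```

Because .* is greedy, the match runs from the first { to the last } on the line, which is what is wanted when the row holds a single dictionary.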

Alternatively, if there is always a message and one dash before the dictionary, you can use str.partition() to split that off:

paypal_msg = paypal_meta[0].partition(' - ')[-1]

or limit your splitting with str.split() to just once:

paypal_msg = paypal_meta[0].split(' - ', 1)[-1]
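The two calls behave the same when the separator is present, but they differ when a record has no ' - ' at all, which may matter for the records that "don't split". A small sketch, using made-up sample strings that are not taken from the real dataset:

```python
# Made-up records: one with the "message - dict" shape, one without a prefix.
with_prefix = "Successfully confirmed payment - {'ACK': ['Success']}"
without_prefix = "{'ACK': ['Success']}"

# Separator present: both approaches return everything after the first ' - '.
part_a = with_prefix.partition(' - ')[-1]
split_a = with_prefix.split(' - ', 1)[-1]

# Separator absent: partition() yields an empty tail,
# while split() returns the whole string unchanged.
part_b = without_prefix.partition(' - ')[-1]
split_b = without_prefix.split(' - ', 1)[-1]

print(repr(part_a), repr(split_a))
print(repr(part_b), repr(split_b))
```

With partition() an empty result signals that the separator was missing, so such records can be detected and handled separately; with split() the unprefixed record passes through as-is.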

Try to avoid putting Python structures like that into the database; instead, store JSON in a separate column rather than a string dump of the object.
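A rough sketch of what that could look like with the standard json module. The variable names and the shortened response dictionary are assumptions for illustration; how the string is written to the database depends on your schema and driver and is not shown.

```python
import json

# Shortened, made-up response with the same shape as in the question.
paypal_response = {'ACK': ['Success'], 'PAYMENTINFO_0_AMT': ['245.40']}

# Serialize to a JSON string; this is the value that would go into
# its own column instead of being appended to the log message.
stored = json.dumps(paypal_response, sort_keys=True)

# Reading it back needs neither a regex nor ast.literal_eval().
restored = json.loads(stored)
flat = {key: values[0] for key, values in restored.items()}

print(stored)
print(flat)
```

Keeping the message text and the payload in separate columns means no parsing of the combined string is needed when the record is read back.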
