Unicode to dictionary (unicode contains apostrophe punctuation)

Question

I have read the following Unicode string from a CSV file:

line = u"{u'There's Still Time': u'foo'}"

I would like to convert this to a dictionary so that I can access it as follows:

line["There's Still Time"] 
Output: 'foo'

Please help.

Answer 1

Because the string contains an unescaped apostrophe, you'll have to do some pre-processing before you even attempt to parse it into a dict. Assuming that all strings within the target dict are unicode (so every opening quote is prefixed with u) and that every closing quote is followed immediately by a delimiter (in the regex below: a period, a closing brace, a colon, a comma, or whitespace), you can search for all apostrophes that fall into neither of these two categories and escape them. Then you can use ast.literal_eval() to parse the result into a dict, something like:

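import ast
import re

# Match an apostrophe that is neither preceded by the u prefix (an opening
# quote) nor followed by a delimiter (a closing quote).
APOSTROPHE_ESCAPE = re.compile(r"(?<!u)'(?![.}:,\s])")

line = u"{u'There's Still Time': u'foo'}"
# Escape the stray apostrophes, then parse the now-valid literal.
your_dict = ast.literal_eval(APOSTROPHE_ESCAPE.sub(r"\'", line))

print(your_dict)  # {u"There's Still Time": u'foo'} (Python 3 omits the u prefixes)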
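
If you need to do this for every row of the CSV, the two steps above can be wrapped in a small helper. This is only a sketch: the name parse_dict_line is made up for illustration, and it relies on the same assumptions about the input as the regex itself.

```python
import ast
import re

# Same pattern as above: an apostrophe that is neither an opening quote
# (preceded by u) nor a closing quote (followed by a delimiter).
APOSTROPHE_ESCAPE = re.compile(r"(?<!u)'(?![.}:,\s])")


def parse_dict_line(line):
    """Hypothetical helper: escape stray apostrophes, then parse the literal."""
    escaped = APOSTROPHE_ESCAPE.sub(r"\'", line)
    return ast.literal_eval(escaped)


record = parse_dict_line(u"{u'There's Still Time': u'foo'}")
print(record["There's Still Time"])  # foo
```

The lookup then works exactly as asked for in the question.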

Keep in mind, though, that something as simple as:

line = u"{u'There'}s Still Time': u'foo'}"

will throw it off, because the apostrophe after There is followed by } and is therefore taken for a closing quote. Sure, it would be an illegal dictionary in the source as well, but keep these limitations in mind and adjust your pre-processing regex accordingly.
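
To see where the heuristic breaks down, here is a rough sketch that returns None instead of raising when the escaped text still isn't a valid literal. The helper name try_parse and the second failing input (an apostrophe directly after a u, as in You're) are my own illustrations, not something from the question.

```python
import ast
import re

APOSTROPHE_ESCAPE = re.compile(r"(?<!u)'(?![.}:,\s])")


def try_parse(line):
    """Hypothetical helper: return the parsed dict, or None if parsing fails."""
    try:
        return ast.literal_eval(APOSTROPHE_ESCAPE.sub(r"\'", line))
    except (SyntaxError, ValueError):
        # The regex left an apostrophe unescaped, so the text is not a literal.
        return None


# Works: the apostrophe in There's is followed by a letter and gets escaped.
ok = try_parse(u"{u'There's Still Time': u'foo'}")

# Fails: the apostrophe is followed by }, so it looks like a closing quote.
bad_brace = try_parse(u"{u'There'}s Still Time': u'foo'}")

# Fails: the apostrophe is preceded by u, so it looks like an opening quote.
bad_prefix = try_parse(u"{u'You're late': u'foo'}")

print(ok, bad_brace, bad_prefix)
```

If inputs like these can occur in your data, the regex needs tightening or the source should be fixed to emit properly escaped values.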

solution1
2 2018-08-01 22:26:59
