How to remove unwanted characters from a JSON file

Question

I scraped some data into a JSON file, but the saved data contains some unwanted characters, for example:

"quote_text": "\u201cThe world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.\u201d", "author": "Albert Einstein", "tags": [ "change", "deep-thoughts", "thinking", "world"

So how can I remove these \u201c type characters from the file in Python?

Answer 1

Replace method:

If you have only one or two characters to remove, I suggest using the string .replace() method.

For example, on the quote_text key:

# str.replace returns a new string, so assign the result back
your_dict['quote_text'] = your_dict['quote_text'].replace('\u201c', '')

Regex:

If you need to remove many different characters, I suggest using regular expressions (Python's re module).
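
As a minimal sketch of the regex route, a character class can strip several unwanted characters in one pass. The sample string and the choice of characters (left and right curly double quotes) are assumptions for illustration, not part of the original answer.

```python
import re

# Sample value; in the JSON file the quotes appear escaped as \u201c and \u201d.
quote = "\u201cThe world as we have created it is a process of our thinking.\u201d"

# Character class matching the left and right curly double quotes.
CURLY_QUOTES = re.compile("[\u201c\u201d]")

# Replace every match with the empty string.
cleaned = CURLY_QUOTES.sub("", quote)
print(cleaned)
```

More characters can be added inside the square brackets as needed.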

More:

If you wish to apply your function to every value in the dictionary, you can use a dict comprehension:

d2 = {k: f(v) for k, v in d1.items()}

Here d1 is your original dictionary and f is your function.

In our example it would be:

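# Only strings have .replace(); other values (such as the tags list) are kept unchanged
d2 = {k: v.replace('\u201c', '') if isinstance(v, str) else v for k, v in d1.items()}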
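
The comprehension above only deals with top-level values, while scraped data usually also holds nested lists such as tags. The following is a rough sketch of a recursive variant; the helper name strip_chars, its unwanted parameter, and the sample JSON text are all hypothetical and chosen only for illustration.

```python
import json

def strip_chars(value, unwanted=("\u201c", "\u201d")):
    """Recursively remove the unwanted characters from every string in value."""
    if isinstance(value, str):
        for ch in unwanted:
            value = value.replace(ch, "")
        return value
    if isinstance(value, list):
        return [strip_chars(item, unwanted) for item in value]
    if isinstance(value, dict):
        return {key: strip_chars(item, unwanted) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged

# Sample JSON text; the doubled backslashes keep the \u escapes literal,
# the way they appear inside a saved file.
raw = '{"quote_text": "\\u201cIt cannot be changed.\\u201d", "author": "Albert Einstein", "tags": ["change", "world"]}'

data = json.loads(raw)
cleaned = strip_chars(data)
print(json.dumps(cleaned, indent=2))
```

With a real file, json.load and json.dump on an open file object would take the place of json.loads and json.dumps.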

Answer 2

If you want to remove multiple characters, you can use a list of the characters to remove:

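text = '{ "work": "\u201cfun\u201c", "foo": ["bar", "baz"] }'
# Each entry must be a single character, so the quote is written as the escape '\u201c'
remove_chars = ['\u201c', 'b', 'f']
# Keep every character that is not in the removal list
new_text = ''.join(ch for ch in text if ch not in remove_chars)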

To replace unwanted characters, build a dictionary that holds the substitutions, then apply it:

subs = {
    '\u201c': "'",
    'z': 't',
}
text = '{ "work": "\u201cfun\u201c", "foo": ["bar", "baz"] }'
# Look up each character; fall back to the character itself when there is no substitution
letter_list = [subs.get(ch, ch) for ch in text]
new_text = ''.join(letter_list)
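
The same substitution idea can also be sketched with the built-in str.maketrans and str.translate, which is a different technique from the list comprehension above. The sample text and the chosen replacements are assumptions for illustration.

```python
# Map each unwanted character to its replacement.
# A value of None would delete the character instead of replacing it.
table = str.maketrans({"\u201c": "'", "\u201d": "'"})

text = "\u201cfun\u201d and \u201cgames\u201d"

# translate applies the whole table in a single pass over the string.
new_text = text.translate(table)
print(new_text)  # 'fun' and 'games'
```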

Answer 3

Assume the dictionary is named d. The text contains several non-ASCII characters such as \u201c and \u201d. If you want to remove all non-ASCII characters at once, you can do something like this:

One-liner:

d['quote_text'].encode("ascii", "ignore").decode('utf-8')

Explanation in detail:

The line below drops every character that cannot be encoded as ASCII and returns the result as bytes.

remov_unicode_char = d['quote_text'].encode("ascii", "ignore")

To convert the bytes back into a string, decode them.

convert_str = remov_unicode_char.decode("utf-8")

Now you can check the result by printing it.

print(convert_str)

Output:

The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.
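
One thing to keep in mind is that encoding to ASCII with "ignore" drops every non-ASCII character, not only the curly quotes. The short sketch below compares it with a targeted replace; the sample string with accented letters is an assumption made up for illustration.

```python
# Sample text with curly quotes and accented letters.
text = "\u201cCaf\u00e9 culture\u201d by Zo\u00eb"

# Drops everything outside ASCII, including the accented letters.
ascii_only = text.encode("ascii", "ignore").decode("ascii")
print(ascii_only)  # Caf culture by Zo

# Removes only the two curly quote characters and keeps the accents.
targeted = text.replace("\u201c", "").replace("\u201d", "")
print(targeted)  # Café culture by Zoë
```

If the scraped text may contain names or words with accents, the targeted approach is probably the safer choice.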

Answer 1
0 2021-06-30 18:14:01

Answer 2
0 2021-06-30 18:23:40

Answer 3
0 accepted 2021-06-30 18:45:57
