I have a JSON file containing a dictionary of about 1500 emoji characters, and I want to import it into my Python code. I read the file and converted it to a Python dictionary, but now I have only 143 records. How can I import all the emoji into my code? This is my code:
import sys
import ast
file = open('emojidescription.json','r').read()
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
emoji_dictionary = ast.literal_eval(file.translate(non_bmp_map))
#word = word.replaceAll(",", " ");
keys = list(emoji_dictionary["emojis"][0].keys())
values = list(emoji_dictionary["emojis"][0].values())
file_write = open('output.txt','a')
print(len(keys))
for i in range(len(keys)):
    try:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],values[i][0])
    except Exception as e:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],'')
    #file.write()
    #print(keys[i],values[i])
    print(content)
file_write.close()
This is my input sample
{
"emojis": [
{
"👨🎓": ["Graduate"],
"©": ["Copy right"],
"®": ["Registered"],
"👨👩👧": ["family"],
"👩❤️💋👩": ["love"],
"™": ["trademark"],
"👨❤👨": ["love"],
"⌚": ["time"],
"⌛": ["wait"],
"⭐": ["star"],
"🐘": ["Elephant"],
"🐕": ["Cat"],
"🐜": ["ant"],
"🐔": ["cock"],
"🐓": ["cock"],
This is my result, and the 143 denotes number of emoji.
143
word = word.replace(" ", "family")
word = word.replace("Ⓜ", "")
word = word.replace("♥", "")
word = word.replace("♠", "")
word = word.replace("⌛", "wait")
I'm not sure why you're seeing only 143 records from an input of 1500 (your sample doesn't seem to display this behavior).
The setup doesn't seem to do anything useful, but what you're doing boils down to (simplified and skipping lots of details):
d = ..read json as python dict.
keys = list(d.keys())      # list() is needed to index by position in Python 3
values = list(d.values())
for i in range(len(keys)):
    key = keys[i]
    value = values[i]
and that should be completely correct. There are better ways to do this in Python, however, like using the zip function:
d = ..read json as python dict.
keys = d.keys()
values = d.values()
for key, value in zip(keys, values):  # zip picks pair-wise elements
    ...
or simply asking the dict for its items:
for key, value in d.items():
    ...
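As a quick check that the three loop styles above are interchangeable, here is a minimal sketch. The dictionary `d` is a made-up three-entry sample based on the question's input, not the real file.

```python
# Hypothetical sample data, modelled on the question's input format.
d = {"⌚": ["time"], "⌛": ["wait"], "⭐": ["star"]}

# 1. Index-based loop over parallel key/value lists.
keys = list(d.keys())
values = list(d.values())
by_index = [(keys[i], values[i]) for i in range(len(keys))]

# 2. zip pairs the keys and values element by element.
by_zip = list(zip(d.keys(), d.values()))

# 3. items() yields the (key, value) pairs directly.
by_items = list(d.items())

print(by_index == by_zip == by_items)
```

All three produce the same pairs in the same order, so `d.items()` is the shortest way to write the loop.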
The json module makes reading and writing json much simpler (and safer), and using the idiom from above the problem reduces to this:
import json
emojis = json.load(open('emoji.json', 'rb'))
with open('output.py', 'wb') as fp:
    for k,v in emojis['emojis'][0].items():
        val = u'word = word.replace("{0}", "{1}")\n'.format(k, v[0] if v else "")
        fp.write(val.encode('u8'))
Why do you replace all emojis with 0xfffd in the lines:
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
emoji_dictionary = ast.literal_eval(file.translate(non_bmp_map))
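Just don't do this! Every character outside the Basic Multilingual Plane is turned into the same replacement character U+FFFD, so distinct emoji keys become identical and collapse into a single entry when the dictionary is built. That is most likely why so few records survive.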
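To illustrate how that translation step can lose records, here is a small sketch. It reuses the `non_bmp_map` table from the question; the three-entry `source` dictionary and its labels are made up for the demonstration.

```python
import sys

# Same translation table as in the question: every code point above U+FFFF
# is mapped to U+FFFD (the replacement character).
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

# Hypothetical sample: three distinct emoji, each a single non-BMP code point.
source = {"\U0001F418": "Elephant", "\U0001F415": "Dog", "\U0001F41C": "ant"}

# After translation all three keys are the same string, so the dict keeps one.
collapsed = {key.translate(non_bmp_map): value for key, value in source.items()}

print(len(source), len(collapsed))
```

The three input keys end up as one `"\ufffd"` key. Only emoji inside the BMP (such as ⌛ or ♥) keep their identity, which would match the kind of output shown in the question.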
Using json:
import json
with open('emojidescription.json', encoding="utf8") as emojis:
    emojis = json.load(emojis)
with open('output.txt','a', encoding="utf8") as output:
    for emoji, text in emojis["emojis"][0].items():
        text = "" if not text else text[0]
        output.write('word = word.replace("{0}", "{1}")\n'.format(emoji, text))
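The same logic can be tried without any files, which makes it easy to confirm that no keys are lost. This is only a sketch: the inline `sample` string is a hypothetical stand-in for `emojidescription.json`, and it assumes the file has the structure shown in the question.

```python
import json

# Hypothetical inline stand-in for the contents of the JSON file.
sample = '{"emojis": [{"🐘": ["Elephant"], "🐜": ["ant"], "Ⓜ": []}]}'

mapping = json.loads(sample)["emojis"][0]

# Build one replace statement per emoji; an empty list gives an empty string.
lines = [
    'word = word.replace("{0}", "{1}")'.format(emoji, text[0] if text else "")
    for emoji, text in mapping.items()
]

print(len(mapping))
for line in lines:
    print(line)
```

Because `json.loads` leaves the emoji untouched, every key stays distinct and the number of generated lines equals the number of entries in the input.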