I'm building a dictionary of words that contain characters such as č, ě, á (Czech alphabet). When I print these words to the console before adding them to the dictionary, they are displayed correctly. The problem is that when I add a word to the dictionary and then print the dictionary, the word appears in the wrong encoding. Here is a screenshot of my console: the first row is the output of print name and the second row is the output of print dict.
These words / sentences should be the same.
Info:
PyCharm IDE, Python 2.7.8, default encoding: "utf-8"
Thank you for your advice!
EDIT: Attaching the code ('url' is the url of the web page):
    import pickle
    import re
    import urllib2

    # Assuming BeautifulSoup 4, which accepts a parser name such as 'xml'.
    from bs4 import BeautifulSoup


    def getSoup(url):
        # Download the page and parse it.
        req = urllib2.Request(url)
        response = urllib2.urlopen(req)
        page = response.read()
        soup = BeautifulSoup(page, 'xml')
        return soup


    a=0
    klubyDict = dict()
    index = getSoup("url")
    all = index.findAll('A')
    for i in all:
        okres = getSoup("http://url%s"%(i['HREF']))
        kluby = okres.findAll('A')
        # print(kluby[0]['HREF'])
        print "Okrsok...%s"%(i.text)
        for klub in kluby:
            klubHtml = getSoup("http://url%s"%(klub['HREF']))
            name = klub.text
            print name
            emailTag = klubHtml.find('td',text=re.compile("Email:"))
            email = emailTag.text[7:]
            if len(name)>0:
                klubyDict[name]=email if len(email)>0 else "email nezadany"
    print klubyDict
    print "Saving to file..."
    with open('futbaloveKluby','wb') as f:
        pickle.dump(klubyDict,f)
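To check whether the pickling step itself could be damaging the strings, a round trip can be tested in isolation. This is a minimal sketch with a hypothetical sample entry standing in for klubyDict; it is written to run on both Python 2.7 and Python 3 and uses an in-memory round trip instead of the 'futbaloveKluby' file.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import pickle

# Hypothetical sample entry standing in for klubyDict.
kluby = {"FK Čáslav": "email nezadany"}

# Serialize to bytes and load back, as pickle.dump / pickle.load do via a file.
restored = pickle.loads(pickle.dumps(kluby))

# The restored keys and values compare equal to the originals.
assert restored == kluby
```

If the assertion holds, the text survives pickling unchanged, which suggests looking at how the values are displayed or written rather than at how they are stored.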
EDIT2: Adding the data into the Excel file:

    # -*- coding: utf-8 -*-
    import cPickle as pickle

    import xlsxwriter

    dict = dict()
    workbook = xlsxwriter.Workbook('Futbal.xlsx')
    worksheet = workbook.add_worksheet()
    with open('futbaloveKluby','rb') as f:
        dict = pickle.load(f)
    colKlub = 0
    colEmail = 1
    row = 0
    # One row per club: name in the first column, email in the second.
    for klub in dict.keys():
        worksheet.write(row,colKlub, klub)
        worksheet.write(row,colEmail, dict[klub])
        row += 1
    workbook.close()
    # print table.text    (left commented out: `table` is not defined in the posted code)
The main thing is that after this code, I write the values of this dictionary into an Excel table using xlsxwriter. When I open the Excel file, I see the wrong characters.
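One way to rule out byte strings as the cause is to make sure every key and value is a text (unicode) string before it reaches worksheet.write. The sketch below is an assumption about what might help, not a confirmed fix: to_text and rows_for_sheet are hypothetical helpers, the sample data is made up, and UTF-8 is assumed as the encoding of any byte strings. It leaves out xlsxwriter itself so that it runs with the standard library alone on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string, decoding byte strings."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def rows_for_sheet(kluby):
    """Yield (row, name, email) tuples with both strings as text."""
    for row, (name, email) in enumerate(kluby.items()):
        yield row, to_text(name), to_text(email)


# Hypothetical entry whose key and value are UTF-8 encoded byte strings.
sample = {"FK Čáslav".encode("utf-8"): b"info@example.com"}
rows = list(rows_for_sheet(sample))
```

Each tuple could then be passed on as worksheet.write(row, colKlub, name) and worksheet.write(row, colEmail, email) in place of the raw dictionary entries.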
Add this import:

    import codecs

and use this code for your name:

    name = klub.text
    print name.decode('utf-8')
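It may also be worth checking whether the data is wrong at all. In Python 2, printing a dictionary shows the repr() of each key and value, so non-ASCII characters appear as escape sequences even when the strings are intact; and if name is already a unicode object, which is what BeautifulSoup 4 returns for .text, it does not need decoding. The sketch below assumes that situation. The sample data and the format_dict helper are hypothetical, and the code is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import json

# Hypothetical sample data standing in for klubyDict.
kluby = {"FK Čáslav": "email nezadany"}


def format_dict(d):
    """Build one 'name: email' line per entry from the strings themselves."""
    return "\n".join(
        "%s: %s" % (name, email) for name, email in sorted(d.items())
    )


# print(kluby) would show repr() of each key and value; on Python 2 that
# means escapes such as u'FK \u010c\xe1slav', although the data is intact.
listing = format_dict(kluby)
print(listing)

# json.dumps with ensure_ascii=False is another way to get a readable dump.
dumped = json.dumps(kluby, ensure_ascii=False)
print(dumped)
```

Printing the entries one by one, or dumping the dictionary with ensure_ascii=False, displays the characters themselves instead of their escaped representation.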