With the code below I tried to extract a product description from a URL; the description contains special characters.
from bs4 import BeautifulSoup
import urllib.request
import pandas as pd
html = urllib.request.urlopen('http://uk.rs-online.com/web/p/piezoelectric-miniature-speakers/7868948/').read()
soup = BeautifulSoup(html)
description = soup.find(itemprop="name").string.strip()
description
pd.DataFrame([description]).to_csv('file.csv')
When I viewed the scraped data in the CSV file, I found that the special characters had been replaced with question marks.
How can I get those special characters into the CSV file?
Thank you in advance for your suggestions.
Choose a proper encoding and the special characters will appear in the file. I tested it with utf8 and all the special characters displayed correctly.
from bs4 import BeautifulSoup
import urllib.request
import pandas as pd
html = urllib.request.urlopen('http://uk.rs-online.com/web/p/piezoelectric-miniature-speakers/7868948/').read()
soup = BeautifulSoup(html)
description = soup.find(itemprop="name").string.strip()
pd.DataFrame([description]).to_csv('file.csv', encoding='utf8')
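To illustrate why the choice of encoding matters, here is a minimal sketch using only the standard library. The sample string is made up and merely stands in for the scraped description; whether this is exactly what happened in the original case is an assumption. It shows one common way question marks can appear: characters that the target encoding cannot represent get substituted, whereas UTF-8 can represent all of them.

```python
# Hypothetical sample text standing in for the scraped description.
text = "Ø 20 mm, 4 kΩ"

# A limited encoding (ASCII here) cannot represent 'Ø' or 'Ω'.
# With errors="replace", each such character becomes a question mark.
lossy = text.encode("ascii", errors="replace")

# UTF-8 can represent every character, so the round trip is lossless.
restored = text.encode("utf-8").decode("utf-8")

print(lossy)             # b'? 20 mm, 4 k?'
print(restored == text)  # True
```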
Also make sure that you open the file with the correct encoding in your editor.
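If the viewer does not let you pick an encoding, or guesses it wrongly, one option that may help is writing the file as 'utf-8-sig', which is UTF-8 with a leading byte order mark (BOM) that some spreadsheet programs use to detect the encoding. The sketch below is only an illustration: the sample string and the temporary file path are assumptions, not part of the original code.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample text standing in for the scraped description.
description = "Piezo speaker, 4 kΩ, Ø 20 mm, ±10 %"

# Write to a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "file.csv")

# 'utf-8-sig' is UTF-8 preceded by a byte order mark (BOM).
pd.DataFrame([description]).to_csv(path, encoding="utf-8-sig")

# Check that the file really starts with the UTF-8 BOM bytes.
with open(path, "rb") as f:
    has_bom = f.read().startswith(b"\xef\xbb\xbf")

# Read the file back with the same encoding to confirm nothing was lost.
restored = pd.read_csv(path, encoding="utf-8-sig", index_col=0).iloc[0, 0]

print(has_bom)
print(restored == description)
```

Reading the file back with the same encoding is a quick way to tell whether the data on disk is intact and only the viewer is displaying it incorrectly.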