How to replace a string that contains quotation marks in Python?
I have a CSV file that contains HTML code.
import pandas as pd
import numpy as np
import csv
import seaborn as sns
import re
import os
pd.set_option("display.max_rows",1000000000)
pd.set_option("display.max_columns",1000000000)
dirs = os.listdir('DataCollectionCA/')
for i in dirs:
    if os.path.splitext(i)[1] == ".csv":
        print(i)
dirss = 'DataCollectionCA/'
print("<div class=\"\"ContentGrid\"\">")
df = pd.read_csv(dirss+"7197409.csv") # load the data
df_num = len(df) # count the number of rows
print(df_num)
real_df_num = df_num+1
with open ('719740999999.csv', 'a' ,newline='', encoding="utf-8") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['互動作者','發表時間','互動內容'])
for post in range(1,real_df_num):
    with open (dirss+'7197409.csv', newline='', encoding="utf-8") as csvfile:
        reader = csv.reader(csvfile)
        column0 = [row[0] for row in reader]
        for i, rows in enumerate(column0):
            if i == post:
                row000 = rows
    with open (dirss+'7197409.csv', newline='', encoding="utf-8") as csvfile:
        reader = csv.reader(csvfile)
        column1 = [row[1] for row in reader]
        for j, rows in enumerate(column1):
            if j == post:
                row001 = rows
    with open (dirss+'7197409.csv', newline='', encoding="utf-8") as csvfile:
        reader = csv.reader(csvfile)
        column2 = [row[2] for row in reader]
        for k, rows in enumerate(column2):
            if k == post:
                row002 = rows
    author = row000
    res_time = row001
    original_html_code = row002
    new_html_code_01 = original_html_code.replace('"<div class=""ContentGrid"">', " ")
    new_html_code_02 = new_html_code_01.replace('<br>', " ")
    print(new_html_code_02)
    print("======")
    with open ('719740999999.csv', 'a' ,newline='', encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([author,res_time,new_html_code_02])
I want to use Python to replace the following string (it is HTML code):
"<div class=""ContentGrid"">
<img data-icons="":~("" src=""
"" onload=""DrawImage(this)"" width=""300"" height=""617"">
and so on.
I tried the following code, but it failed. I want to replace the string with a blank.
new_html_code_02 = re.sub('<div class=\"\"ContentGrid\"\">', " ", new_html_code_01)
new_html_code_02 = re.sub('<div class=""ContentGrid"">', " ", new_html_code_01)
The new file still shows these strings. I don't know how to solve this.
I'm not exactly sure what you want, but the second replacement statement you tried worked for me. You don't need to escape the quotes ("). If you only want to replace static expressions, you don't even need regex; you could also use the replace() method of Python's string type:
import re
html = '<div class=""ContentGrid"">' \
'<img data-icons="":~("" src=""' \
'"" onload=""DrawImage(this)"" width=""300"" height=""617"">'
new_html = re.sub('<div class=""ContentGrid"">', " ", html)
print(new_html)
new_html = html.replace('<div class=""ContentGrid"">', " ")
print(new_html)
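One possible reason the replacement has no effect in the question's script, assuming the HTML is read with csv.reader as shown there: with the default dialect, csv.reader turns each doubled quote ("") inside a quoted field back into a single quote, and drops the outer quotes, so the parsed string no longer contains the doubled-quote text being searched for. The sketch below uses a made-up one-row sample (the author, time, and content values are hypothetical, not taken from the real file) and also reads the rows in a single pass instead of reopening the file for every row.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file.
raw_csv = (
    'author,time,content\n'
    'alice,2021-01-01,"<div class=""ContentGrid"">hello<br>world"\n'
)

rows = list(csv.reader(io.StringIO(raw_csv)))
parsed = rows[1][2]

# The doubled quotes exist only in the file, not in the parsed field.
print('<div class=""ContentGrid"">' in parsed)  # False
print('<div class="ContentGrid">' in parsed)    # True


def clean_html(text):
    """Replace the wrapper div and <br> tags with a blank."""
    text = text.replace('<div class="ContentGrid">', " ")
    return text.replace("<br>", " ")


# Single pass: clean the third column of every data row and write it out.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(rows[0])
for author, res_time, html in rows[1:]:
    writer.writerow([author, res_time, clean_html(html)])

cleaned = clean_html(parsed)
print(repr(cleaned))
```

If this is what is happening in your file, searching for the single-quoted form of the tag should work with either str.replace() or re.sub().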