
Remove New Line Feed But Only Between Quotes

I have the following code:

output = requests.get(url=url, auth=oauth, headers=headers, data=payload)
output_data = output.content

type(output_data)
<class 'bytes'>

output_data

Squeezed text (3632 lines)

Looking at the squeezed text, I have some values like the following:

Steve likes to walk his dog. Steve says to John "I like \n Pineapple, oranges, \n and pizza.\n" and then he went to bed \n.
John likes his beer cold.\n
Sally likes her teeth brushed with a bottle of jack.\n

How can I remove the \n character, but only when it is enclosed in double quotes, so that my result looks like this:

Steve likes to walk his dog. Steve says to John "I like Pineapple, oranges, and pizza." and then he went to bed \n.
John likes his beer cold.\n
Sally likes her teeth brushed with a bottle of jack.\n

I know how to remove the \n character in general, but I am not sure how to remove it only from values enclosed in double quotes.

Here is my attempt:

I found this and used this code:

my_text = re.sub(r'"\\n"','',my_text)

But it does not seem to work.
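One likely reason, shown as a small sketch: the pattern r'"\\n"' matches only the exact four-character sequence of a quote, a literal backslash, the letter n, and a closing quote. It does not match a real newline that sits somewhere inside a longer quoted string. The sample strings below are made up for illustration and are not the real response data.

```python
import re

# Illustrative text with real newlines inside a quoted span.
sample = 'He said "I like \n pizza.\n" and left.\n'

# The attempted pattern finds nothing here, so the text comes back unchanged.
attempt = re.sub(r'"\\n"', '', sample)
unchanged = attempt == sample

# The pattern only fires on a quote, a literal backslash, an n, and a quote.
literal = 'x "\\n" y'
stripped = re.sub(r'"\\n"', '', literal)

print(unchanged)
print(repr(stripped))
```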

I may be overcomplicating this, but something like the following might work:

parts = content.split("\"")
for i, part in enumerate(parts):
    if i % 2:
        parts[i] = part.replace("\n", "")
content = "\"".join(parts)
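The split approach above can be wrapped into a self-contained function. This is only a sketch: it assumes the double quotes are balanced and never escaped, and the name strip_newlines_in_quotes and the sample string are hypothetical.

```python
def strip_newlines_in_quotes(content: str) -> str:
    # After splitting on '"', the text that was inside quotes sits at the
    # odd indices, provided the quotes are balanced and never escaped.
    parts = content.split('"')
    for i, part in enumerate(parts):
        if i % 2:
            parts[i] = part.replace("\n", "")
    return '"'.join(parts)


# Illustrative input, not the real response data.
sample = 'He said "I like \n pizza.\n" and left.\n'
result = strip_newlines_in_quotes(sample)
print(repr(result))
```

Newlines outside the quotes are left alone. The spaces that surrounded each removed newline stay in place, so a double space can remain inside the quoted text.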

Figured it out.

Steps:

  1. Convert the bytes to a string
  2. Create a pattern for the regex
  3. Use the regex to reformat the value.

Step 1:

my_text = my_text.decode("utf-8")

Step 2:

import re
pattern = re.compile(r'".*?"', re.DOTALL)  # non-greedy quoted span; DOTALL lets . match newlines

Step 3:

my_text = pattern.sub(lambda m: m.group().replace('\n', ''), my_text)

This solved my problem.
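Put together, the three steps could look like the sketch below. The bytes literal raw is a hypothetical stand-in for output.content from the question, and the pattern assumes balanced, unescaped double quotes.

```python
import re

# Hypothetical stand-in for output.content (a bytes object).
raw = (
    b'Steve says "I like \n Pineapple, \n and pizza.\n" and went to bed \n.\n'
    b'John likes his beer cold.\n'
)

# Step 1: decode the bytes so a str pattern can be applied.
text = raw.decode("utf-8")

# Step 2: match each quoted span non-greedily; DOTALL lets "." cross newlines.
quoted = re.compile(r'".*?"', re.DOTALL)

# Step 3: drop newlines inside each matched span only.
cleaned = quoted.sub(lambda m: m.group().replace("\n", ""), text)
print(repr(cleaned))
```

Newlines outside the quotes survive. The spaces around each removed newline are kept, so the quoted text can end up with double spaces; collapsing them would need an extra step.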
