
Remove specific references like author from scientific publication

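I have seen this earlier question: "python remove references from scientific paper". It is similar to what I want to do, but I still cannot work out exactly how. Suppose my string contains a reference such as: Poteete et al. (2010). How can I remove it from the string using a regular expression in Python?

What I tried is similar to the previous question, but maybe I am missing something: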

import re

sentence = "Moreover, we elaborate on how these methods have led to improved insights into the theoretical framework proposed by  Poteete et al. (2010)"
sentence = re.sub(r'(?:[\w \.])+[0-9]{4}','',sentence)

Any ideas? Thank you very much for your help.

If the name starts with an uppercase character A-Z:

[A-Z]\w*(?: +\w+)*\. \(\d{4}\)
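  • [A-Z]\w* matches one character A-Z followed by optional word characters
  • (?: +\w+)* optionally repeats 1+ spaces followed by 1+ word characters
  • \. matches a literal dot
  • \(\d{4}\) matches 4 digits between parentheses

Instead of matching a literal space, you can also use \s, but note that it can also match a newline.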
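To illustrate the note about \s, here is a minimal sketch. The sample text is hypothetical and assumes the citation has been wrapped across a line break; the second pattern simply swaps the literal spaces for \s.

```python
import re

# Hypothetical sample: the citation is split by a newline between "et" and "al."
text = "as proposed by Poteete et\nal. (2010)"

space_only = r'[A-Z]\w*(?: +\w+)*\. \(\d{4}\)'
any_whitespace = r'[A-Z]\w*(?:\s+\w+)*\.\s\(\d{4}\)'

# A literal space cannot cross the line break, so nothing is found.
print(re.search(space_only, text))  # None

# \s also matches the newline, so the wrapped citation is found.
print(repr(re.search(any_whitespace, text).group()))
```

Whether matching across newlines is desirable depends on how the text was extracted, so treat the \s variant as an option rather than a default.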


import re

sentence = "Moreover, we elaborate on how these methods have led to improved insights into the theoretical framework proposed by  Poteete et al. (2010)"
sentence = re.sub(r'[A-Z]\w*(?: +\w+)*\. \(\d{4}\)', '', sentence)
print(sentence)

Output

Moreover, we elaborate on how these methods have led to improved insights into the theoretical framework proposed by  
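One thing to watch for with this pattern: (?: +\w+)* accepts any run of words, so a match can start at an earlier capitalized word when only words and spaces sit between it and the citation. The sketch below uses a hypothetical sentence to show this, plus a narrower pattern that requires a literal "et al"; that narrower pattern is an assumption about the citation style, not part of the answer above.

```python
import re

pattern = r'[A-Z]\w*(?: +\w+)*\. \(\d{4}\)'

# Hypothetical sentence: "The" is capitalized, and only words and spaces
# separate it from the citation.
sentence = "The framework proposed by Poteete et al. (2010) is used"

# The match starts at "The", so the leading words are removed as well.
print(repr(re.sub(pattern, '', sentence)))

# Narrower alternative (assumed): one capitalized name followed by "et al."
narrow = r'[A-Z]\w* +et +al\. \(\d{4}\)'
print(repr(re.sub(narrow, '', sentence)))
```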

import re

s = "Moreover, we elaborate on how these methods have led to improved insights into the theoretical framework proposed by Poteete et al. (2010) and Someone et al. (2010) and something"

print(re.sub(r'[A-Z][a-z]+\set al\.\s\([0-9]{4}\)', '', s))

Output:

Moreover, we elaborate on how these methods have led to improved insights into the theoretical framework proposed by  and  and something

Here you detect "et al." followed by a year in parentheses and preceded by a name whose first letter is uppercase, and then remove it.
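Removing the citation leaves doubled spaces behind, as the output shows. A minimal sketch of one way to tidy that up follows; strip_citations is a hypothetical helper name, and collapsing runs of spaces is an assumption about what the cleaned text should look like.

```python
import re

# Same pattern as above, compiled once for reuse.
CITATION = re.compile(r'[A-Z][a-z]+\set al\.\s\([0-9]{4}\)')

def strip_citations(text):
    """Remove 'Name et al. (YYYY)' citations and tidy leftover spacing."""
    without = CITATION.sub('', text)
    # Collapse runs of spaces or tabs left behind, then trim both ends.
    return re.sub(r'[ \t]{2,}', ' ', without).strip()

s = "framework proposed by Poteete et al. (2010) and Someone et al. (2010) and something"
print(strip_citations(s))
```

The remaining text still reads oddly ("by and and something"), which is expected: the regex removes the citation, not the words that introduced it.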

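Needless to say, these operations are mostly custom: they depend on your particular paper, the way the editor formats references, and so on. So it is bound to be painful either way.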
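As a rough illustration of that customization, the sketch below combines a few author-year styles into one alternation. The styles, the NAME sub-pattern, and the placeholder author names are all assumptions; a real paper will likely need its own list.

```python
import re

# Assumed shape of a surname: a capital letter, then word characters,
# apostrophes or hyphens.
NAME = r"[A-Z][\w'-]+"

# Hypothetical citation styles; extend or reorder for the paper at hand.
PATTERNS = [
    rf"{NAME} et al\. \(\d{{4}}\)",            # Poteete et al. (2010)
    rf"{NAME} (?:and|&) {NAME} \(\d{{4}}\)",   # Smith and Jones (2004)
    rf"{NAME} \(\d{{4}}\)",                    # Doe (1990)
]
CITATION = re.compile("|".join(PATTERNS))

text = "See Poteete et al. (2010), Smith and Jones (2004) and Doe (1990)."
print(repr(CITATION.sub("", text)))
```

Keeping each style on its own line makes it easier to add or drop formats as you find them in the text.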
