
Python Libraries - Publication Citation Splits

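I have a bunch of citation strings that I want to split into individual citations. Below is an example I took from the OWL citation site. I have a mix of citation styles: MLA, APA, and so on. Is there a Python library or other application that can split these strings into the elements of a list? Because of the variety of citation styles, I am trying to avoid regular expressions. I also tried splitting on "\n", but some of my strings have no "\n" separator, so you can see the problem. I am wondering whether there is a better way to capture them. I am not looking to capture names, dates, titles, etc.; I already found a library that can do that. I only need to separate the strings. Any help would be greatly appreciated. Thanks!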

Input string - sample

Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.

Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.

Output - sample

['Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.',
'Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.']

Try split, then use filter to remove the empty elements:

string = '''Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.

Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.'''

result = list(filter(None, string.split('\n')))

Output:

['Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.', 'Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.']
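One caveat with the approach above: filter(None, ...) only drops truly empty strings, so a line made up of spaces or tabs would survive as its own element. A minimal sketch of a variant that strips each line first is shown here; the function name split_citations and the shortened sample text are made up for illustration and are not part of the original answer.

```python
def split_citations(text):
    """Split a block of text into one citation per non-blank line."""
    # Strip each line so that whitespace-only lines are dropped as well,
    # and so that leading/trailing spaces do not end up in the results.
    return [line.strip() for line in text.split('\n') if line.strip()]


# Shortened, hypothetical sample with a whitespace-only line in the middle.
sample = (
    'Dean, Cornelia. "Executive on a Mission."\n'
    '   \n'
    '\n'
    'Ebert, Roger. Review of An Inconvenient Truth.  '
)

citations = split_citations(sample)
print(citations)
```

This still assumes that every citation sits on its own line; it does not help with strings that have no newline separator at all.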

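If you want to split the string s on newline characters (\n), you can use the string method splitlines() with a list comprehension to filter out the empty elements: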

[i for i in s.splitlines() if i]
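One practical difference between the two answers shows up when the text uses Windows line endings (\r\n), for example after being copied from a file saved on Windows. The sketch below compares both approaches on a short made-up input; the variable names and sample text are assumptions for illustration only.

```python
# Hypothetical input using Windows-style line endings.
windows_text = 'First citation.\r\n\r\nSecond citation.\r\n'

# split('\n') leaves the '\r' behind, so the "blank" line becomes '\r'
# and is not removed by filter(None, ...).
with_split = list(filter(None, windows_text.split('\n')))

# splitlines() treats '\r\n' as a single line boundary.
with_splitlines = [i for i in windows_text.splitlines() if i]

print(with_split)
print(with_splitlines)
```

If the source of the citation strings is not known in advance, splitlines() is probably the safer default of the two.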

