
How to use Stanford CoreNLP java implementation for coreference resolution

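I am trying to understand the output of the CoreNLP coreference resolution system.

Below is an example input and output pair obtained from the rule-based system:

Input sentences: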

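His maternal great-grandfather was Henry Percy, 4th Earl of Northumberland, and his wife Maud Herbert, Countess of Northumberland. His maternal grandmother was a daughter of Sir Robert Spencer and Eleanor Beaufort. Eleanor was a daughter of Edmund Beaufort, 2nd Duke of Somerset and Eleanor Beauchamp. She was a granddaughter of Richard de Beauchamp, 13th Earl of Warwick and Elizabeth Berkeley.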

The command I used to get the output:

./corenlp.sh -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt -outputFormat json
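
For readers following along, here is a minimal sketch of how the json_output object used below could be obtained in Python. It assumes the command wrote its result to a file named input.txt.json in the working directory; that file name is an assumption, so adjust the path if your run put the output elsewhere. The helper name load_output is made up for this illustration.

```python
import json
from pathlib import Path


def load_output(path):
    """Parse a CoreNLP JSON output file and return it as a dict.

    `path` is whatever file the command produced; "input.txt.json"
    is only an assumed default name, not something the post confirms.
    """
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)
```

With that in place, json_output = load_output("input.txt.json") would give the dictionary whose 'corefs' key is inspected in the rest of the question.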

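First, I don't understand what the keys mean. What do these numbers represent? Is this documented anywhere? I could only find information about the XML output format here.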

> json_output['corefs'].keys()

dict_keys(['1', '2', '3', '4', '6', '7', '9', '10', '11', '12', '15', '16', '17', '18', '19', '20', '22', '23', '24', '25', '26', '29', '30', '31'])

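Second, do all the values in the dictionary above represent the different clusters found in the input? In other words, can I say that len(json_output['corefs'].keys()) clusters were found in the input?

EDIT

In case you want to see the output, I am sharing it below.

Output (I set outputFormat to json; below I share only the 'corefs' key of the full output):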

> json_output['corefs']

{'1': [{'id': 1, 'text': 'Henry Percy', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 5, 'endIndex': 7, 'headIndex': 6, 'sentNum': 1, 'position': [1, 4], 'isRepresentativeMention': True}],
 '2': [{'id': 2, 'text': '4th', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'UNKNOWN', 'animacy': 'UNKNOWN', 'startIndex': 8, 'endIndex': 9, 'headIndex': 8, 'sentNum': 1, 'position': [1, 5], 'isRepresentativeMention': True}],
 '3': [{'id': 3, 'text': 'Northumberland', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'NEUTRAL', 'animacy': 'INANIMATE', 'startIndex': 11, 'endIndex': 12, 'headIndex': 11, 'sentNum': 1, 'position': [1, 6], 'isRepresentativeMention': True}, {'id': 5, 'text': 'Northumberland', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'NEUTRAL', 'animacy': 'INANIMATE', 'startIndex': 21, 'endIndex': 22, 'headIndex': 21, 'sentNum': 1, 'position': [1, 10], 'isRepresentativeMention': False}],
 '4': [{'id': 4, 'text': 'Maud Herbert', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 16, 'endIndex': 18, 'headIndex': 17, 'sentNum': 1, 'position': [1, 9], 'isRepresentativeMention': True}],
 '6': [{'id': 6, 'text': 'His maternal great-grandfather', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 4, 'headIndex': 3, 'sentNum': 1, 'position': [1, 1], 'isRepresentativeMention': False}, {'id': 8, 'text': 'Henry Percy, 4th Earl of Northumberland, and his wife Maud Herbert, Countess of Northumberland', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 5, 'endIndex': 22, 'headIndex': 9, 'sentNum': 1, 'position': [1, 3], 'isRepresentativeMention': True}, {'id': 13, 'text': 'His', 'type': 'PRONOMINAL', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 2, 'headIndex': 1, 'sentNum': 2, 'position': [2, 2], 'isRepresentativeMention': False}],
 '7': [{'id': 7, 'text': 'His', 'type': 'PRONOMINAL', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 2, 'headIndex': 1, 'sentNum': 1, 'position': [1, 2], 'isRepresentativeMention': True}],
 '9': [{'id': 9, 'text': 'Northumberland, and his wife Maud Herbert, Countess of Northumberland', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'NEUTRAL', 'animacy': 'INANIMATE', 'startIndex': 11, 'endIndex': 22, 'headIndex': 11, 'sentNum': 1, 'position': [1, 7], 'isRepresentativeMention': True}],
 '10': [{'id': 10, 'text': 'Maud Herbert, Countess of Northumberland', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 16, 'endIndex': 22, 'headIndex': 19, 'sentNum': 1, 'position': [1, 8], 'isRepresentativeMention': True}],
 '11': [{'id': 11, 'text': 'Robert Spencer', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 9, 'endIndex': 11, 'headIndex': 10, 'sentNum': 2, 'position': [2, 6], 'isRepresentativeMention': True}],
 '12': [{'id': 12, 'text': 'His maternal grandmother', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 4, 'headIndex': 3, 'sentNum': 2, 'position': [2, 1], 'isRepresentativeMention': True}, {'id': 14, 'text': 'a daughter of Sir Robert Spencer and Eleanor Beaufort', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 5, 'endIndex': 14, 'headIndex': 6, 'sentNum': 2, 'position': [2, 3], 'isRepresentativeMention': False}],
 '15': [{'id': 15, 'text': 'Sir Robert Spencer and Eleanor Beaufort', 'type': 'LIST', 'number': 'PLURAL', 'gender': 'UNKNOWN', 'animacy': 'ANIMATE', 'startIndex': 8, 'endIndex': 14, 'headIndex': 13, 'sentNum': 2, 'position': [2, 4], 'isRepresentativeMention': True}],
 '16': [{'id': 16, 'text': 'Sir', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'INANIMATE', 'startIndex': 8, 'endIndex': 9, 'headIndex': 8, 'sentNum': 2, 'position': [2, 5], 'isRepresentativeMention': True}],
 '17': [{'id': 17, 'text': 'Eleanor', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 2, 'headIndex': 1, 'sentNum': 3, 'position': [3, 1], 'isRepresentativeMention': True}, {'id': 21, 'text': 'a daughter of Edmund Beaufort, 2nd Duke of Somerset and Eleanor Beauchamp', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 3, 'endIndex': 16, 'headIndex': 4, 'sentNum': 3, 'position': [3, 2], 'isRepresentativeMention': False}, {'id': 27, 'text': 'She', 'type': 'PRONOMINAL', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 2, 'headIndex': 1, 'sentNum': 4, 'position': [4, 1], 'isRepresentativeMention': False}, {'id': 28, 'text': 'a granddaughter of Richard de Beauchamp, 13th Earl of Warwick and Elizabeth Berkeley', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'FEMALE', 'animacy': 'ANIMATE', 'startIndex': 3, 'endIndex': 17, 'headIndex': 4, 'sentNum': 4, 'position': [4, 2], 'isRepresentativeMention': False}],
 '18': [{'id': 18, 'text': 'Edmund Beaufort', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 6, 'endIndex': 8, 'headIndex': 7, 'sentNum': 3, 'position': [3, 4], 'isRepresentativeMention': True}],
 '19': [{'id': 19, 'text': '2nd', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'UNKNOWN', 'animacy': 'UNKNOWN', 'startIndex': 9, 'endIndex': 10, 'headIndex': 9, 'sentNum': 3, 'position': [3, 5], 'isRepresentativeMention': True}],
 '20': [{'id': 20, 'text': 'Somerset', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'NEUTRAL', 'animacy': 'INANIMATE', 'startIndex': 12, 'endIndex': 13, 'headIndex': 12, 'sentNum': 3, 'position': [3, 7], 'isRepresentativeMention': True}],
 '22': [{'id': 22, 'text': 'Edmund Beaufort, 2nd Duke of Somerset and Eleanor Beauchamp', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'NEUTRAL', 'animacy': 'ANIMATE', 'startIndex': 6, 'endIndex': 16, 'headIndex': 10, 'sentNum': 3, 'position': [3, 3], 'isRepresentativeMention': True}],
 '23': [{'id': 23, 'text': 'Somerset and Eleanor Beauchamp', 'type': 'LIST', 'number': 'PLURAL', 'gender': 'UNKNOWN', 'animacy': 'ANIMATE', 'startIndex': 12, 'endIndex': 16, 'headIndex': 15, 'sentNum': 3, 'position': [3, 6], 'isRepresentativeMention': True}],
 '24': [{'id': 24, 'text': 'Richard de Beauchamp', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 6, 'endIndex': 9, 'headIndex': 8, 'sentNum': 4, 'position': [4, 3], 'isRepresentativeMention': True}],
 '25': [{'id': 25, 'text': '13th', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'UNKNOWN', 'animacy': 'UNKNOWN', 'startIndex': 10, 'endIndex': 11, 'headIndex': 10, 'sentNum': 4, 'position': [4, 6], 'isRepresentativeMention': True}],
 '26': [{'id': 26, 'text': 'Warwick', 'type': 'PROPER', 'number': 'UNKNOWN', 'gender': 'UNKNOWN', 'animacy': 'INANIMATE', 'startIndex': 13, 'endIndex': 14, 'headIndex': 13, 'sentNum': 4, 'position': [4, 8], 'isRepresentativeMention': True}],
 '29': [{'id': 29, 'text': 'Richard de Beauchamp, 13th Earl of Warwick and Elizabeth Berkeley', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 6, 'endIndex': 17, 'headIndex': 8, 'sentNum': 4, 'position': [4, 4], 'isRepresentativeMention': True}],
 '30': [{'id': 30, 'text': '13th Earl of Warwick and Elizabeth Berkeley', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 10, 'endIndex': 17, 'headIndex': 11, 'sentNum': 4, 'position': [4, 5], 'isRepresentativeMention': True}],
 '31': [{'id': 31, 'text': 'Warwick and Elizabeth Berkeley', 'type': 'LIST', 'number': 'PLURAL', 'gender': 'UNKNOWN', 'animacy': 'ANIMATE', 'startIndex': 13, 'endIndex': 17, 'headIndex': 16, 'sentNum': 4, 'position': [4, 7], 'isRepresentativeMention': True}]}

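Each list represents a cluster of mentions, and each entry in a list is a distinct mention. I would not expect even the current state-of-the-art coreference systems to perform well on your example. I suggest running a simpler example, such as "Joe Smith ate his lunch." That should hopefully show a link between the two mentions.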
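
To make the "one list per cluster" reading concrete, here is a small Python sketch that counts clusters and separates single-mention clusters from real coreference chains. The function name summarize_clusters is made up for this illustration, and the sample data is an abbreviated copy of clusters '3' and '4' from the question with most fields left out.

```python
def summarize_clusters(corefs):
    """Split coreference clusters into singletons and multi-mention chains.

    Returns two lists of (cluster_key, representative_text, mention_count).
    """
    singletons, chains = [], []
    for cluster_key, mentions in corefs.items():
        if not mentions:
            continue  # defensive: skip an empty cluster rather than fail
        # Prefer the mention flagged as representative; fall back to the first.
        representative = next(
            (m for m in mentions if m.get("isRepresentativeMention")),
            mentions[0],
        )
        target = chains if len(mentions) > 1 else singletons
        target.append((cluster_key, representative["text"], len(mentions)))
    return singletons, chains


# Abbreviated sample taken from the question's output (other fields omitted).
sample = {
    "3": [
        {"id": 3, "text": "Northumberland", "isRepresentativeMention": True},
        {"id": 5, "text": "Northumberland", "isRepresentativeMention": False},
    ],
    "4": [{"id": 4, "text": "Maud Herbert", "isRepresentativeMention": True}],
}

singletons, chains = summarize_clusters(sample)
print(len(sample), "clusters,", len(chains), "with more than one mention")
```

So len(json_output['corefs']) does count clusters, but most of them are singletons; only the entries with more than one mention express an actual coreference link. As an observation on the output in the question (not something I have checked against the documentation), each key coincides with the id of the first mention in its cluster, and the numbers missing from the keys (5, 8, 13, 14, 21, 27, 28) all appear as ids of later mentions inside other clusters.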

Edit: I just ran this example and got this JSON (showing the link between "Joe Smith" and "his"):

{'1': [{'id': 1, 'text': 'Joe Smith', 'type': 'PROPER', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 1, 'endIndex': 3, 'headIndex': 2, 'sentNum': 1, 'position': [1, 1], 'isRepresentativeMention': True}, {'id': 3, 'text': 'his', 'type': 'PRONOMINAL', 'number': 'SINGULAR', 'gender': 'MALE', 'animacy': 'ANIMATE', 'startIndex': 4, 'endIndex': 5, 'headIndex': 4, 'sentNum': 1, 'position': [1, 3], 'isRepresentativeMention': False}],
 '2': [{'id': 2, 'text': 'his lunch', 'type': 'NOMINAL', 'number': 'SINGULAR', 'gender': 'UNKNOWN', 'animacy': 'INANIMATE', 'startIndex': 4, 'endIndex': 6, 'headIndex': 5, 'sentNum': 1, 'position': [1, 2], 'isRepresentativeMention': True}]}
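
If it helps, the link in that output can be pulled out with a few lines of Python. This is only a sketch: the helper names coref_links and mention_tokens are invented here, the token list is written out by hand rather than read from the CoreNLP output, and the indexing convention (startIndex is 1-based, endIndex is exclusive) is inferred from the numbers in the output rather than taken from documentation.

```python
def coref_links(corefs):
    """Yield (mention_text, representative_text) for each non-representative mention."""
    for mentions in corefs.values():
        representative = next(
            (m for m in mentions if m["isRepresentativeMention"]), None
        )
        if representative is None:
            continue  # no flagged representative, so nothing to link to
        for mention in mentions:
            if mention is not representative:
                yield mention["text"], representative["text"]


def mention_tokens(tokens, mention):
    """Slice one sentence's tokens, assuming a 1-based start and exclusive end."""
    return tokens[mention["startIndex"] - 1 : mention["endIndex"] - 1]


# Hand-written tokenization of the example sentence (an assumption).
tokens = ["Joe", "Smith", "ate", "his", "lunch", "."]

# Abbreviated copy of the output above, keeping only the fields used here.
corefs = {
    "1": [
        {"id": 1, "text": "Joe Smith", "startIndex": 1, "endIndex": 3,
         "isRepresentativeMention": True},
        {"id": 3, "text": "his", "startIndex": 4, "endIndex": 5,
         "isRepresentativeMention": False},
    ],
    "2": [
        {"id": 2, "text": "his lunch", "startIndex": 4, "endIndex": 6,
         "isRepresentativeMention": True},
    ],
}

print(list(coref_links(corefs)))
print(mention_tokens(tokens, corefs["2"][0]))
```

Cluster '2' contributes no link because it holds a single mention; only cluster '1' ties "his" back to "Joe Smith".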
