
What exactly are the UMLS and SNOMED-CT vocabularies used by cTAKES?

I am very new to cTAKES and, looking through the docs, I am curious about what exactly the UMLS and SNOMEDCT "vocabularies" are. The user installation docs don't really say; they simply tell you to apply for the UMLS license, and the language around the UMLS Metathesaurus does not divulge much more about the structure of the data being accessed. E.g., is it some online API service? Is it a set of files that comes with the cTAKES download and can only be unlocked with a valid UMLS password that is checked against an online DB?

Info on what the UMLS Metathesaurus and SNOMEDCT are can be found here ( https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/index.html ) and here ( https://www.ncbi.nlm.nih.gov/books/NBK9676/ , specifically https://www.ncbi.nlm.nih.gov/books/NBK9684/ ):

The Metathesaurus is a very large, multi-purpose, and multi-lingual [relational?] vocabulary database that contains information about biomedical and health related concepts, their various names, and the relationships among them. Designed for use by system developers...

...The Metathesaurus contains concepts, concept names, and other attributes from more than 100 terminologies, classifications, and thesauri, some in multiple editions.

I'm not sure exactly how cTAKES implements its use of the UMLS Metathesaurus (if anyone knows, please enlighten me). I assume that it accesses some API for a relational database, given that you need to add UMLS credentials to the example scripts that come with the cTAKES download (see https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+User+Install+Guide#cTAKES4.0UserInstallGuide-(Recommended)AddUMLSaccessrights ).

...You may select from two relational formats: the Rich Release Format (RRF), introduced in 2004, and the Original Release Format (ORF).
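To make the "relational format" idea more concrete, here is a minimal Python sketch of reading rows from MRCONSO.RRF, the Metathesaurus file that holds concept names. RRF files are pipe-delimited text with a trailing `|` on every row, and the column order below follows the UMLS Reference Manual. The sample rows, identifiers, and the helper names `read_mrconso` and `names_by_concept` are made up for illustration; this is not how cTAKES itself loads its dictionary.

```python
import io

# Column order of MRCONSO.RRF as listed in the UMLS Reference Manual.
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# Hypothetical rows: the identifiers and strings are invented, only the
# layout (18 pipe-delimited fields plus a trailing "|") mirrors real data.
SAMPLE_MRCONSO = (
    "C9999991|ENG|P|L0000001|PF|S0000001|Y|A0000001||||SNOMEDCT_US|PT|000001|Example disorder|0|N||\n"
    "C9999991|ENG|S|L0000002|PF|S0000002|N|A0000002||||MSH|MH|D000001|Example disease|0|N||\n"
    "C9999992|ENG|P|L0000003|PF|S0000003|Y|A0000003||||SNOMEDCT_US|PT|000002|Example procedure|0|N||\n"
)


def read_mrconso(handle):
    """Yield one dict per MRCONSO.RRF row read from a text file handle."""
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        values = line.split("|")
        # Every row ends with "|", so split() leaves one extra empty item.
        if values[-1] == "":
            values = values[:-1]
        if len(values) != len(MRCONSO_FIELDS):
            raise ValueError(f"expected {len(MRCONSO_FIELDS)} fields, got {len(values)}")
        yield dict(zip(MRCONSO_FIELDS, values))


def names_by_concept(rows, source, language="ENG"):
    """Group term strings (STR) by concept id (CUI) for one source vocabulary (SAB)."""
    grouped = {}
    for row in rows:
        if row["SAB"] == source and row["LAT"] == language:
            grouped.setdefault(row["CUI"], []).append(row["STR"])
    return grouped


if __name__ == "__main__":
    rows = read_mrconso(io.StringIO(SAMPLE_MRCONSO))
    for cui, names in names_by_concept(rows, "SNOMEDCT_US").items():
        print(cui, names)
```

With a licensed UMLS download, the same reader could be pointed at the real file, e.g. `open("MRCONSO.RRF", encoding="utf-8")`; the path here is an assumption about where the file was unpacked.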

(I think) these Metathesaurus release files are what power the UIMA analysis engines that cTAKES uses to process text:

UIMA is an architecture in which basic building blocks called Analysis Engines (AEs) are composed in order to analyze a document [...] How Annotators represent and share their results is an important part of the UIMA architecture. To enable composition and reuse, UIMA defines a Common Analysis Structure (CAS) precisely for these purposes. The CAS is an object-based container that manages and stores typed objects having properties and values (source: https://www.ibm.com/developerworks/data/downloads/uima/#How-does-it-work ).
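As a rough illustration of the idea in that quote, the following Python sketch models a CAS-like container and two analysis engines that share results through it. It is a toy under stated assumptions, not the UIMA or cTAKES API: `ToyCas`, `Annotation`, `token_engine`, `concept_engine`, and the one-entry dictionary with its placeholder concept id are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed object with properties: a type name, a text span, and features."""
    type_name: str
    begin: int
    end: int
    features: dict = field(default_factory=dict)


@dataclass
class ToyCas:
    """Toy stand-in for a CAS: the document text plus all annotations on it."""
    text: str
    annotations: list = field(default_factory=list)

    def add(self, annotation):
        self.annotations.append(annotation)

    def select(self, type_name):
        # Returns a new list, so engines may add annotations while iterating.
        return [a for a in self.annotations if a.type_name == type_name]

    def covered_text(self, annotation):
        return self.text[annotation.begin:annotation.end]


# Hypothetical one-entry dictionary standing in for a vocabulary lookup table.
TOY_DICTIONARY = {"aspirin": "C-HYPOTHETICAL-1"}


def token_engine(cas):
    """First engine: mark every run of word characters as a Token."""
    for match in re.finditer(r"\w+", cas.text):
        cas.add(Annotation("Token", match.start(), match.end()))


def concept_engine(cas):
    """Second engine: reuse Token annotations to find dictionary terms."""
    for token in cas.select("Token"):
        word = cas.covered_text(token).lower()
        if word in TOY_DICTIONARY:
            cas.add(Annotation("ConceptMention", token.begin, token.end,
                               {"cui": TOY_DICTIONARY[word]}))


def run_pipeline(text, engines):
    """Compose engines by passing the same container through each in order."""
    cas = ToyCas(text)
    for engine in engines:
        engine(cas)
    return cas


if __name__ == "__main__":
    cas = run_pipeline("Patient took aspirin daily.", [token_engine, concept_engine])
    for mention in cas.select("ConceptMention"):
        print(cas.covered_text(mention), mention.features)
```

The point of the design is that the second engine never calls the first one directly; it only reads what the first engine left in the shared container, which is what lets engines be recombined.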
