
Build financial table from EDGAR XBRL files

Originally posted 2022-01-19 14:26:17. Tags: xbrl, edgar

When companies submit their reports to the SEC, a number of files are made available, e.g. the latest 10-K of AAPL. From these, even the SEC website (and many others) produce the tables as structured data.

What I would like to do is reproduce this myself, but I got stuck. Can somebody point me to a detailed, step-by-step description of how to do it?

SO users usually ask for more specific questions and a list of the things the OP tried, so here is what I tried and what I understand:

The six files at the bottom (8-12 and 15 in the example) contain all the data used. Basically, 8-12 are submitted by the company, and 15 is an extract from the inline XBRL of the filing itself (1 in the example).
The extract file (15) lists all the XBRL instances and all the contexts. It is very clear.
The XSD file (8) has a list of all the forms and all company-specific elements. The former are given in link:roleType blocks, each with a definition and a list of the linkbases where the form appears (although sometimes they do not appear). The latter are given as <xs:element>s.
The presentation file (12) has the same list of tables.
The definition (10) and label (11) linkbases should give some more details, e.g. the company-specific label of a certain data item.
The calculation linkbase is not really needed (I guess); it is more a validation that the totals are indeed calculated as indicated.

What I do not understand, though:

What is the right approach to building the tables from these files? Is it to go through the forms in the XSD/PRE files and find the data for them in the extract file, or the other way round?
However hard I tried, I could not find the link (with all the locators and arcs) between a data point in the extract file and a label in the LAB file. For a human it is "easy", but for a machine the names are always slightly different: (a) loc_XYZ changes to lab_XYZ; (b) a name "XYZ" has its own version and an "XYZAbstract" version; (c) names like XYZ have numbers attached to them, e.g. XY_123. So I cannot establish the link between the "two ends".

This is why I would like a step-by-step explanation, like:

Take file... first. There, iterate through the <...> tags. For every tag, find a <...> tag in file... where attribute... is equal to the... attribute of the iterated tag. Etc.

Thanks,

PS (I am not interested in available software and services that already do this, nor in specific libraries to call. I would simply like to extract the information using the plain text files.)

1 Answer

If you're looking to process XBRL without re-using existing XBRL software, then the best place to start would be the XBRL Specifications. In particular, the section on XLink in XBRL will explain how XBRL linkbases work, including the labels used in the xlink:from and xlink:to attributes. The short answer is that those attributes just contain arbitrary identifiers that reference the xlink:label attribute of an element elsewhere in the file.
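To make the xlink:from / xlink:to mechanics concrete, here is a minimal sketch using only Python's standard XML parser. The tiny label linkbase in SAMPLE_LAB and the function name concept_labels are made up for illustration; a real LAB file is much larger. The point is that loc_... and lab_... are never compared by name: each arc is resolved by looking up its xlink:from and xlink:to values among the xlink:label attributes inside the same extended link, and the concept itself is identified by the fragment of the locator's xlink:href.

```python
import xml.etree.ElementTree as ET

LINK = "{http://www.xbrl.org/2003/linkbase}"
XLINK = "{http://www.w3.org/1999/xlink}"

# Made-up miniature label linkbase (assumption: not taken from any real filing).
SAMPLE_LAB = """<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:labelLink xlink:type="extended" xlink:role="http://www.xbrl.org/2003/role/link">
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_Revenue"
        xlink:label="loc_demo_Revenue_123"/>
    <link:label xlink:type="resource" xlink:label="lab_demo_Revenue_123"
        xlink:role="http://www.xbrl.org/2003/role/label" xml:lang="en-US">Net sales</link:label>
    <link:labelArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/concept-label"
        xlink:from="loc_demo_Revenue_123" xlink:to="lab_demo_Revenue_123"/>
  </link:labelLink>
</link:linkbase>"""


def concept_labels(linkbase_xml):
    """Map each concept id (the href fragment) to {label role: label text}."""
    root = ET.fromstring(linkbase_xml)
    result = {}
    for ext in root.iter(LINK + "labelLink"):
        # xlink:label values are only meaningful inside one extended link.
        locs = {}    # xlink:label of a locator -> concept id from its href
        labels = {}  # xlink:label of a resource -> [(role, text), ...]
        for loc in ext.findall(LINK + "loc"):
            href = loc.get(XLINK + "href")
            locs[loc.get(XLINK + "label")] = href.rpartition("#")[2]
        for lab in ext.findall(LINK + "label"):
            key = lab.get(XLINK + "label")
            labels.setdefault(key, []).append((lab.get(XLINK + "role"), lab.text))
        for arc in ext.findall(LINK + "labelArc"):
            concept = locs.get(arc.get(XLINK + "from"))
            if concept is None:
                continue  # arc points at a locator we did not find
            for role, text in labels.get(arc.get(XLINK + "to"), []):
                result.setdefault(concept, {})[role] = text
    return result


print(concept_labels(SAMPLE_LAB))
```

The href fragment is the id attribute of the xs:element in the schema. In EDGAR filings it typically looks like prefix_LocalName, but that is a convention I am assuming here, so check the id in the XSD if a lookup fails. The same locator and arc resolution applies to the presentation, definition and calculation linkbases, only with different arc element names.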

The specific question of how you construct financial tables from the XBRL data is tricky. An XBRL report does not contain any explicit information that associates facts in the report with tables. You can build a list of the concepts in a section of a financial report from the presentation linkbase, but you'll often find that you have more facts than expected using those concepts. For example, if you build a list of concepts from the Balance Sheet section, you'll often find that you have facts that use those concepts but with additional dimensions, because they were tagged from a note providing a breakdown of that concept.
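As a rough illustration of that first step, the sketch below collects the concepts of one presentation role and then keeps only the facts whose context carries no dimension. Everything in it is an assumption made for the example: the sample XML, the role URI, the function names, and the ns_to_prefix mapping used to turn a fact's namespace into the prefix_LocalName form seen in locator hrefs. It is not the SEC's rendering logic, just a naive filter.

```python
import xml.etree.ElementTree as ET

LINK = "{http://www.xbrl.org/2003/linkbase}"
XLINK = "{http://www.w3.org/1999/xlink}"
XBRLI = "{http://www.xbrl.org/2003/instance}"
XBRLDI = "{http://xbrl.org/2006/xbrldi}"

# Made-up presentation linkbase with a single hypothetical role.
SAMPLE_PRE = """<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:presentationLink xlink:type="extended" xlink:role="http://example.com/role/BalanceSheet">
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_AssetsAbstract" xlink:label="loc_a"/>
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_Cash" xlink:label="loc_b"/>
    <link:presentationArc xlink:type="arc" order="1"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"
        xlink:from="loc_a" xlink:to="loc_b"/>
  </link:presentationLink>
</link:linkbase>"""

# Made-up instance: the same concept reported once plainly and once with a dimension.
SAMPLE_INSTANCE = """<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"
    xmlns:xbrldi="http://xbrl.org/2006/xbrldi" xmlns:demo="http://example.com/demo">
  <xbrli:context id="c1">
    <xbrli:entity><xbrli:identifier scheme="http://www.sec.gov/CIK">0000000000</xbrli:identifier></xbrli:entity>
    <xbrli:period><xbrli:instant>2021-12-31</xbrli:instant></xbrli:period>
  </xbrli:context>
  <xbrli:context id="c2">
    <xbrli:entity><xbrli:identifier scheme="http://www.sec.gov/CIK">0000000000</xbrli:identifier>
      <xbrli:segment>
        <xbrldi:explicitMember dimension="demo:RegionAxis">demo:EuropeMember</xbrldi:explicitMember>
      </xbrli:segment>
    </xbrli:entity>
    <xbrli:period><xbrli:instant>2021-12-31</xbrli:instant></xbrli:period>
  </xbrli:context>
  <xbrli:unit id="usd"><xbrli:measure>iso4217:USD</xbrli:measure></xbrli:unit>
  <demo:Cash contextRef="c1" unitRef="usd" decimals="0">100</demo:Cash>
  <demo:Cash contextRef="c2" unitRef="usd" decimals="0">40</demo:Cash>
</xbrli:xbrl>"""


def concepts_in_role(pre_xml, role_uri):
    """List concept ids referenced by locators in one presentation role."""
    concepts = []
    for ext in ET.fromstring(pre_xml).iter(LINK + "presentationLink"):
        if ext.get(XLINK + "role") != role_uri:
            continue
        for loc in ext.findall(LINK + "loc"):
            name = loc.get(XLINK + "href").rpartition("#")[2]
            if name not in concepts:
                concepts.append(name)
    return concepts


def undimensioned_facts(instance_xml, concept_ids, ns_to_prefix):
    """Return (concept id, context id, value) for facts without dimensions."""
    root = ET.fromstring(instance_xml)
    plain = set()
    for ctx in root.findall(XBRLI + "context"):
        # A context is "plain" if it contains no xbrldi member elements.
        if not any(e.tag.startswith(XBRLDI) for e in ctx.iter()):
            plain.add(ctx.get("id"))
    facts = []
    for el in root:
        ctx_ref = el.get("contextRef")
        if ctx_ref is None:
            continue  # contexts, units and other non-fact children
        ns, _, local = el.tag[1:].partition("}")
        concept_id = f"{ns_to_prefix.get(ns, '?')}_{local}"
        if concept_id in concept_ids and ctx_ref in plain:
            facts.append((concept_id, ctx_ref, el.text))
    return facts


wanted = concepts_in_role(SAMPLE_PRE, "http://example.com/role/BalanceSheet")
print(undimensioned_facts(SAMPLE_INSTANCE, set(wanted), {"http://example.com/demo": "demo"}))
```

Abstract concepts such as the hypothetical demo_AssetsAbstract never have facts; they only serve as headings in the presentation tree. Dropping every dimensioned fact is too blunt for sections that are themselves dimensional (for example a statement of equity), which is part of why a heuristic is needed in practice.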

The SEC uses a heuristic-based approach to organising facts into tables. This process is documented in section 6.24 of the EDGAR Filer Manual.