
Extracting domain names from a DNS zone file

I am trying to extract all domain names from the COM and NAME DNS zone files. These zone files contain all DNS entries, and there seems to be little information available about their structure.

Do all registered domains have NS entries, even those that are not actively used? Which record type(s) should I use to extract the domain names?

The zone files are very large, and sorting them would be a bad idea. So it would be easier if I could use a single DNS record type to extract the domain names. I found this Python script (I don't know Python) on GitHub which uses only NS entries. Is that logically correct?

Could someone with experience please comment?

The format of DNS zone files is defined in RFC 1035 (section 5) and RFC 1034 (section 3.6.1). You can find many details on Wikipedia: https://en.wikipedia.org/wiki/Zone_file

The zone file contains only the published domain names, that is, those that have at least one nameserver and are not under the clientHold or serverHold statuses (see http://www.icann.org/epp#clientHold and http://www.icann.org/epp#serverHold ). In short, it does NOT contain all registered domain names.

The .COM zone file is indeed huge. In any case, you need to match on the NS record lines and deduplicate the domain names, since a domain with several nameservers appears once per NS record. There are multiple strategies to do that, depending on your constraints.
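As a rough illustration of the "match NS lines, then deduplicate" approach, here is a minimal Python sketch. It is not the GitHub script mentioned in the question. The function name `extract_domains` and its `origin` parameter are made up for this example, and it assumes one record per line in RFC 1035 master-file layout (owner, optional TTL, optional class, type, data), with owner names either fully qualified or relative to `$ORIGIN`. Real zone files may differ in details, so check it against a sample of your own data first.

```python
def extract_domains(lines, origin="com."):
    """Yield each distinct owner name that has at least one NS record.

    `lines` is any iterable of zone-file lines (for example an open file),
    so the file is streamed and never loaded or sorted as a whole.
    """
    origin = origin.lower().rstrip(".") + "."
    seen = set()          # deduplication; holds every distinct domain in memory
    last_owner = None     # RFC 1035: a line starting with blank reuses the owner

    for raw in lines:
        line = raw.split(";", 1)[0].rstrip()   # drop comments and newline
        if not line:
            continue
        if line.startswith("$"):               # directives such as $ORIGIN, $TTL
            parts = line.split()
            if parts[0].upper() == "$ORIGIN" and len(parts) > 1:
                origin = parts[1].lower().rstrip(".") + "."
            continue

        fields = line.split()
        if line[0] in " \t":
            owner = last_owner
        else:
            owner = fields.pop(0).lower()
            if owner == "@":
                owner = origin
            elif not owner.endswith("."):      # relative name: append the origin
                owner = owner + "." + origin
            last_owner = owner

        # Skip an optional numeric TTL and the IN class to find the record type.
        rtype = None
        for token in fields[:3]:
            token = token.upper()
            if token.isdigit() or token == "IN":
                continue
            rtype = token
            break

        # Ignore the zone apex itself (the NS records of the TLD).
        if rtype == "NS" and owner and owner != origin and owner not in seen:
            seen.add(owner)
            yield owner.rstrip(".")
```

To use it, open the zone file and iterate over the result, writing each name out as it is produced. The `set` is the simplest deduplication strategy but it keeps every distinct name in memory. If all NS records for a given name happen to be adjacent in your file (worth verifying, not guaranteed), comparing against the previous owner name is enough and needs almost no memory.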

Note that many online providers already do this work for you and can provide the domain names directly if that is all you are interested in. Some also provide differential content, that is, the changes from one day to the next.
