BigQuery multiple join using clause
I need to get the BGP AS details for the IP addresses in a table. The table contains SrcAddr and DstAddr, as shown in table1.
table1

SrcAddr | DstAddr | Bytes |
---|---|---|
1.1.1.1 | 8.8.8.8 | 1005 |
Table2 contains the BGP AS number details.
Table2

IPaddr | Organization | network_bin | mask |
---|---|---|---|
1.1.1.0/24 | Cloudflare | asdjqowiq | 24 |
8.8.8.0/24 | Google | asdqwrqsd | 24 |
I want to build a final table like the one below.
Table3

SrcAddr | SrcAS | DstAddr | DstAS | Bytes |
---|---|---|---|---|
1.1.1.1 | Cloudflare | 8.8.8.8 | Google | 1005 |
I used the query below, based on https://cloudplatform.googleblog.com/2014/03/geoip-geolocation-with-google-bigquery.html, and was able to get the src_as field but could not resolve dst_as. Can someone help me with this?
WITH source_of_ip_addresses AS (
SELECT SamplerAddress, REGEXP_REPLACE(SrcAddr, 'xxx', '0') srcip, REGEXP_REPLACE(DstAddr, 'xxx', '0') dstip
FROM `fluentd.netflow_message`
WHERE SrcAddr IS NOT null
GROUP BY 1,2,3
)
SELECT *, srcip, src_as,
FROM (
SELECT srcip, network_bin, mask, autonomous_system_organization as src_as
FROM (
SELECT *, NET.SAFE_IP_FROM_STRING(source_of_ip_addresses.srcip) & NET.IP_NET_MASK(4, mask) network_bin ,
FROM source_of_ip_addresses, UNNEST(GENERATE_ARRAY(9,32)) mask
WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(srcip)) = 4
)
JOIN `fluentd.asn_block_processed` USING (network_bin, mask)
)
Just repeat the same process. It is also more convenient to use a WITH clause instead of nested queries, because that makes the code simpler to repeat. Something like below. I obviously don't have access to your tables, so I cannot check the syntax. There will likely be duplicate columns that you'll need to remove by using explicit column names rather than *.
WITH source_of_ip_addresses AS (
SELECT SamplerAddress, REGEXP_REPLACE(SrcAddr, 'xxx', '0') srcip, REGEXP_REPLACE(DstAddr, 'xxx', '0') dstip
FROM `fluentd.netflow_message`
WHERE SrcAddr IS NOT null
GROUP BY 1,2,3
), source_with_masks AS (
-- one row per candidate prefix length (9 to 32) for each source address
SELECT SamplerAddress, srcip, dstip, mask,
  NET.SAFE_IP_FROM_STRING(srcip) & NET.IP_NET_MASK(4, mask) network_bin
FROM source_of_ip_addresses, UNNEST(GENERATE_ARRAY(9,32)) mask
WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(srcip)) = 4
), source_processed AS (
-- keep only the resolved organization so network_bin and mask can be recomputed for dstip
SELECT SamplerAddress, srcip, dstip, autonomous_system_organization AS src_as
FROM source_with_masks
JOIN `fluentd.asn_block_processed` USING (network_bin, mask)
), dest_with_masks AS (
-- same as above, with dstip instead of srcip
SELECT SamplerAddress, srcip, src_as, dstip, mask,
  NET.SAFE_IP_FROM_STRING(dstip) & NET.IP_NET_MASK(4, mask) network_bin
FROM source_processed, UNNEST(GENERATE_ARRAY(9,32)) mask
WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(dstip)) = 4
), dest_processed AS (
SELECT SamplerAddress, srcip, src_as, dstip, autonomous_system_organization AS dst_as
FROM dest_with_masks
JOIN `fluentd.asn_block_processed` USING (network_bin, mask)
)
SELECT * FROM dest_processed
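
To sanity-check the chained lookup without touching the real tables, here is a minimal self-contained sketch in BigQuery Standard SQL. The CTE names (flows, asn_blocks, src_with_masks, src_resolved, dst_with_masks) and the inline rows are made up for illustration; they only mimic table1 and Table2 from the question. It assumes network_bin holds the network address as BYTES and that the organization column is called autonomous_system_organization, as in the question's query.

```sql
-- Hypothetical inline data standing in for table1 and Table2.
WITH flows AS (
  SELECT '1.1.1.1' AS SrcAddr, '8.8.8.8' AS DstAddr, 1005 AS Bytes
), asn_blocks AS (
  SELECT NET.IP_FROM_STRING('1.1.1.0') AS network_bin, 24 AS mask,
         'Cloudflare' AS autonomous_system_organization
  UNION ALL
  SELECT NET.IP_FROM_STRING('8.8.8.0'), 24, 'Google'
), src_with_masks AS (
  -- Expand each flow into one row per candidate prefix length.
  SELECT SrcAddr, DstAddr, Bytes, mask,
         NET.SAFE_IP_FROM_STRING(SrcAddr) & NET.IP_NET_MASK(4, mask) AS network_bin
  FROM flows, UNNEST(GENERATE_ARRAY(9, 32)) AS mask
  WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(SrcAddr)) = 4
), src_resolved AS (
  -- Only the (network_bin, mask) pair present in asn_blocks survives the join.
  SELECT SrcAddr, DstAddr, Bytes, autonomous_system_organization AS SrcAS
  FROM src_with_masks
  JOIN asn_blocks USING (network_bin, mask)
), dst_with_masks AS (
  -- Repeat the expansion for the destination address.
  SELECT SrcAddr, SrcAS, DstAddr, Bytes, mask,
         NET.SAFE_IP_FROM_STRING(DstAddr) & NET.IP_NET_MASK(4, mask) AS network_bin
  FROM src_resolved, UNNEST(GENERATE_ARRAY(9, 32)) AS mask
  WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(DstAddr)) = 4
)
SELECT SrcAddr, SrcAS, DstAddr,
       autonomous_system_organization AS DstAS, Bytes
FROM dst_with_masks
JOIN asn_blocks USING (network_bin, mask)
```

With these sample rows the query should produce the single row shown in Table3. Note that both lookups are inner joins, so a flow whose source or destination has no matching block is dropped. If those rows need to be kept, a LEFT JOIN on each lookup is one option, though I have not tested that against the real data.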