
PGP Encryption in U-SQL and/or Azure Data Lake Store

Without spinning up a VM instance, is it possible to add PGP encryption to data already in Azure Data Lake Store? In theory this should be possible with a registered C# assembly (DLL) in U-SQL, but that would require treating files as blobs (or as text), and I'm not sure how one would do that from U-SQL.

The use case is to take data from the lake, encrypt it with PGP/GPG using a public key, and then land the encrypted data in an ADLS location for pickup by an external team (subsequent egress from ADLS).

Any ideas?

You can write a custom extractor and a custom outputter that perform the decryption and encryption. At an abstract level, this would most likely look something like this:

  • Extractor:

     AtomicFileProcessing = true
     d = decrypt(input.baseStream)
     for each row in d.Split do
         output row
     end   // or whatever the right processing is

  • Outputter:

     AtomicFileProcessing = true
     serialize rows into output stream
     encrypt output stream and write to output
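
The data flow of the two operators above can be modeled in a few lines. U-SQL user-defined operators are written in C#, so the Python below is only a language-neutral sketch of the shape, not U-SQL code. The names `extract`, `output`, and `xor_stand_in` are hypothetical, and `xor_stand_in` is a reversible placeholder standing in for a real PGP library call; it is not encryption.

```python
import io
from typing import BinaryIO, Callable, Iterable, Iterator, List

# Assumption: encrypt/decrypt are modeled as whole-buffer bytes -> bytes functions.
Transform = Callable[[bytes], bytes]


def xor_stand_in(data: bytes) -> bytes:
    """Reversible placeholder for PGP encrypt/decrypt. NOT real encryption."""
    return bytes(b ^ 0x5A for b in data)


def extract(stream: BinaryIO, decrypt: Transform, delimiter: str = ",") -> Iterator[List[str]]:
    """Extractor side: decrypt the whole file, split it, emit one row per line."""
    plaintext = decrypt(stream.read()).decode("utf-8")
    for line in plaintext.splitlines():
        if line:  # skip empty lines
            yield line.split(delimiter)


def output(rows: Iterable[List[str]], stream: BinaryIO, encrypt: Transform,
           delimiter: str = ",") -> None:
    """Outputter side: serialize all rows, then encrypt and write once."""
    buffer = io.StringIO()
    for row in rows:
        buffer.write(delimiter.join(row) + "\n")
    stream.write(encrypt(buffer.getvalue().encode("utf-8")))


if __name__ == "__main__":
    encrypted_input = io.BytesIO(xor_stand_in(b"1,alice\n2,bob\n"))
    rows = list(extract(encrypted_input, xor_stand_in))
    encrypted_output = io.BytesIO()
    output(rows, encrypted_output, xor_stand_in)
    print(rows)
```

Both functions see the file as a single unit, which is the role `AtomicFileProcessing = true` plays in the outline: an encrypted file cannot be split at arbitrary byte offsets and handed to independent workers.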

Note that the examples section of our U-SQL GitHub page has some examples that show how to operate on data at the base-stream level.

If you can, though, avoid loading more than 500 MB of data into main memory. It would therefore be good if the encryption/decryption could be done in a streaming fashion.
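
To illustrate what "streaming" means for the memory limit, here is a minimal Python sketch that pushes a file through a transform one chunk at a time, so that only a single chunk is held in memory regardless of file size. Everything in it is an assumption for illustration: `transform_stream`, `XorStandIn`, and the 4 MiB chunk size are hypothetical, and `XorStandIn` merely mimics the update/finalize shape that streaming ciphers commonly expose. It is not PGP and not secure.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary choice for this sketch


class XorStandIn:
    """Placeholder transform with an update/finalize shape. NOT PGP, NOT secure."""

    def __init__(self, key: int = 0x5A) -> None:
        self._key = key

    def update(self, chunk: bytes) -> bytes:
        # Transform one chunk independently of all others.
        return bytes(b ^ self._key for b in chunk)

    def finalize(self) -> bytes:
        # A real cipher would typically flush buffered or trailing data here.
        return b""


def transform_stream(src: BinaryIO, dst: BinaryIO, cipher,
                     chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst through cipher, reading at most chunk_size bytes at a time.

    Returns the number of input bytes processed.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # end of stream
            break
        dst.write(cipher.update(chunk))
        total += len(chunk)
    dst.write(cipher.finalize())
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"id,name\n1,alice\n2,bob\n")
    target = io.BytesIO()
    processed = transform_stream(source, target, XorStandIn(), chunk_size=5)
    print(processed, len(target.getvalue()))
```

The same loop shape applies inside an extractor or outputter if the chosen PGP library exposes a stream-based interface; whether it does is something to check against that library's documentation.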

