Hive UDF to convert binary (UTF8) into a Base64 string

I have a binary Thrift field stored in a Parquet file. Parquet writes it as binary (UTF8), and I want to convert it into a Base64 string using a Hive UDF. This should be very basic, but I don't know why my code doesn't work. Here's what I've tried:

import java.util.Base64;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Base64Encode extends UDF {
  public Text evaluate(Text bin) {
    if (bin != null) {
      String encoded = new String(Base64.getEncoder().encode(bin.getBytes()));
      if (encoded != null) {
        return new Text(encoded);
      }
    }
    return null;
  }
}

You don't need to create your own UDF for this task; Hive already defines several built-in functions that cover it. In your question you say that Parquet stores the data as binary, but your example code takes a parameter of type Text.

If your parameter is already binary, just use:

base64(bin_field)
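
For intuition, here is a minimal plain-Java sketch of the same kind of conversion on raw bytes, using only `java.util.Base64` from the JDK. The names `BinaryToBase64` and `encodeValid` are hypothetical, and this is not Hive's own implementation of `base64`. The method takes an explicit length because Hadoop's `Text.getBytes()` returns the backing array, of which only the first `getLength()` bytes are valid; encoding the whole array, as the UDF in the question does, may pick up stale trailing bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryToBase64 {
    // Encode only the first `length` bytes of `buffer` (hypothetical helper).
    public static String encodeValid(byte[] buffer, int length) {
        byte[] valid = Arrays.copyOf(buffer, length);
        return Base64.getEncoder().encodeToString(valid);
    }

    public static void main(String[] args) {
        // Simulate a reused buffer: 8 bytes of capacity, only 5 of them valid.
        byte[] buffer = new byte[8];
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(data, 0, buffer, 0, data.length);

        // Encodes the 5 valid bytes only.
        System.out.println(encodeValid(buffer, data.length));
        // Encodes all 8 bytes, including the 3 unused trailing zeros.
        System.out.println(Base64.getEncoder().encodeToString(buffer));
    }
}
```

The two printed strings differ, which is the kind of mismatch a UDF can produce if it ignores the valid length of its input.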

Otherwise, if it is in text format and you want to convert it to binary UTF-8 and then to Base64, combine:

base64(encode(text_field, 'UTF-8'))
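
The same two steps, text to UTF-8 bytes and then bytes to Base64, can be sketched in plain Java to check expected values outside Hive. The names `TextToBase64` and `encodeText` are hypothetical, and returning `null` for `null` input is an assumption made to mimic SQL-style NULL handling, not something taken from Hive's source.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class TextToBase64 {
    // Step 1: text -> UTF-8 bytes. Step 2: bytes -> Base64 string.
    public static String encodeText(String text) {
        if (text == null) {
            return null; // assumption: propagate null like a SQL NULL
        }
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(utf8);
    }

    public static void main(String[] args) {
        System.out.println(encodeText("hello")); // prints aGVsbG8=
    }
}
```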
