What is the correct way to read a float16 column from an Arrow file in Java?

I am trying to establish an IPC pipeline. I have a Python program that writes an .arrow file and saves it in memory, where it is picked up by a Java application. The application reads the file schema and performs the relevant operations. Currently I am having trouble with a column of the float16 data type. My Python program writes this column something like this:

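# sample
import numpy as np
import pyarrow as pa
from pyarrow import fs

type_float16 = pa.float16()  # half-precision Arrow type
float16column = pa.array([np.float16(0) for _ in range(5)], type=type_float16)
item_table = pa.table([float16column], ['samplefloat16columnname'])
local = fs.LocalFileSystem()
with local.open_output_stream("output.arrow") as file:
    # write the table in the Arrow IPC file format
    with pa.RecordBatchFileWriter(file, item_table.schema) as writer:
        writer.write_table(item_table)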

Now when I try to read this file from the Java application using the program below (and mind you, this is the only column throwing an error):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ipc.ArrowFileReader;
import org.apache.arrow.vector.ipc.SeekableReadChannel;
import org.apache.arrow.vector.ipc.message.ArrowBlock;

public void read(String path) throws IOException {
    File arrowFile = new File(path);
    // try-with-resources closes the stream, allocator and reader
    try (FileInputStream fileInputStream = new FileInputStream(arrowFile);
         RootAllocator allocator = new RootAllocator(Integer.MAX_VALUE);
         ArrowFileReader arrowFileReader = new ArrowFileReader(
                 new SeekableReadChannel(fileInputStream.getChannel()), allocator)) {
        List<ArrowBlock> arrowBlocks = arrowFileReader.getRecordBlocks();
        for (ArrowBlock rbBlock : arrowBlocks) {
            if (!arrowFileReader.loadRecordBatch(rbBlock)) { // load the batch
                throw new IOException("Expected to read record batch");
            }
            // do something with the loaded batch
        }
    }
}

I see this error:

Exception in thread "main" java.lang.UnsupportedOperationException: NYI: FloatingPoint(HALF)

I am not very proficient in Java, but I am guessing this has something to do with incompatible data types between the two. Does anyone know the correct way of doing this?

P.S. Reading the same Arrow file using Python seems to work fine.

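Java has no 16-bit floating-point primitive. The smallest floating-point type it supports is float, which is 4 bytes and is represented in Arrow Java as float4 (Float4Vector). The "NYI" in the error stands for "not yet implemented": the Arrow Java version in use cannot read a FloatingPoint(HALF) column, so one workaround is to write the column as float32 on the Python side.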
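If the 16-bit values do reach Java as raw bits (for example, if the writer stores them in a 16-bit integer column, which is an assumption here and not something the question does), they can be widened to float by hand following the IEEE 754 binary16 layout of 1 sign bit, 5 exponent bits and 10 mantissa bits. The following is a minimal sketch of that conversion; the class name HalfFloat and the method halfToFloat are hypothetical helpers, not part of Arrow Java.

```java
public class HalfFloat {
    /**
     * Converts the raw bits of an IEEE 754 binary16 (half-precision) value
     * to a Java float. Hypothetical helper, not an Arrow Java API.
     */
    public static float halfToFloat(short bits) {
        int h = bits & 0xFFFF;          // treat the short as unsigned
        int sign = (h >>> 15) & 0x1;    // 1 sign bit
        int exp = (h >>> 10) & 0x1F;    // 5 exponent bits, bias 15
        int mant = h & 0x3FF;           // 10 mantissa bits

        int outBits;
        if (exp == 0) {
            if (mant == 0) {
                // signed zero
                outBits = sign << 31;
            } else {
                // subnormal half: shift the mantissa until the implicit
                // leading bit appears, adjusting the float exponent
                int e = 127 - 15 + 1;
                while ((mant & 0x400) == 0) {
                    mant <<= 1;
                    e--;
                }
                mant &= 0x3FF; // drop the now-implicit leading bit
                outBits = (sign << 31) | (e << 23) | (mant << 13);
            }
        } else if (exp == 0x1F) {
            // infinity (mant == 0) or NaN (mant != 0)
            outBits = (sign << 31) | 0x7F800000 | (mant << 13);
        } else {
            // normal number: re-bias the exponent from 15 to 127
            outBits = (sign << 31) | ((exp - 15 + 127) << 23) | (mant << 13);
        }
        return Float.intBitsToFloat(outBits);
    }
}
```

For example, HalfFloat.halfToFloat((short) 0x3C00) yields 1.0f. On JDK 20 and later, the standard library method Float.float16ToFloat(short) performs the same widening, so the manual version is mainly useful on older runtimes.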
