Reading from BigQuery and storing data to Google Cloud Storage (special character issue)

Reference: Can Google Dataflow use an existing VM instead of a temporarily created one?

The code runs, but when it saves the BigQuery response to Google Cloud Storage, all Japanese characters are corrupted.

PCollectionTuple QVCollections = rows
        .apply("FilterEmptyRows", ParDo.of(new FilterEmptyRowDoFn("TransactionId", "TransactionDateTime")))
        .apply("CreateQVFiles", ParDo.of(new TransactionToQVFilesDoFnJP())
                .withOutputTags(BobShare.QVHeaders, TupleTagList.of(BobShare.QVEvents).and(BobShare.QVPayments)));

QVCollections.get(BobShare.QVEvents).apply("WriteQVEvents", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "events_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_EVENTS).withSuffix(".csv"));
QVCollections.get(BobShare.QVPayments).apply("WriteQVPayments", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "payments_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_PAYMENTS).withSuffix(".csv"));
QVCollections.get(BobShare.QVHeaders).apply("WriteQVHeaders", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "header_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_TRANSACTION).withSuffix(".csv"));

From what I have found, .withCoder(StringUtf8Coder.of()) needs to be used.

In addition, this is what I tried (but it only works locally, with DirectRunner):

private static void uploadBlob(String project, String bucket, String filename, String localfile) {
    String listFromCsv = readCsvFromLocalStorage(localfile);

    Storage storage = StorageOptions.newBuilder().setProjectId(project).build().getService();
    BlobId blobId = BlobId.of(bucket, filename);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("application/json").setContentEncoding(UTF_8).build();
    try {
        storage.create(blobInfo, listFromCsv.getBytes(UTF_8));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
}


private static String readCsvFromLocalStorage(String fileName) {
    StringBuilder builder = new StringBuilder();
    Path pathToFile = Paths.get(fileName);

    try (BufferedReader br = Files.newBufferedReader(pathToFile,
            StandardCharsets.UTF_8)) {

        // read the first line from the text file
        String line = br.readLine();

        // loop until all lines are read
        while (line != null) {
            builder.append(line).append("\n");
            line = br.readLine();
        }

    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

    return builder.toString();
}

private static void deleteLocalFile (String fileName)
{
    try {
        if (new File(fileName).delete()) {
            System.out.println(fileName + " deleted.");
        } else {
            System.out.println(fileName + " could not be deleted.");
        }
    } catch (Exception e)
    {
        System.out.println(fileName + " could not be deleted.");
        e.printStackTrace();
    }
}  

This is what the data looks like (corrupted): JAPANESE CHARACTERS
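
Corruption of this kind often appears when UTF-8 bytes are decoded with a different charset somewhere along the way. Whether that is what happens in this pipeline is an assumption, but the effect is easy to reproduce with the standard library alone. In the minimal sketch below, the class name MojibakeDemo, the helper decodeUtf8BytesAs, and the sample string are hypothetical and not part of the code above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those bytes with the given charset.
    // Decoding with anything other than UTF-8 garbles non-ASCII characters.
    static String decodeUtf8BytesAs(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Sample Japanese text, written with Unicode escapes so the source
        // file itself does not depend on the compiler's encoding setting.
        String original = "\u65E5\u672C\u8A9E";

        // The default charset can differ between a local machine and remote workers.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("decoded as UTF-8:      "
                + decodeUtf8BytesAs(original, StandardCharsets.UTF_8));
        System.out.println("decoded as ISO-8859-1: "
                + decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1));
    }
}
```

Printing Charset.defaultCharset() both locally and on the workers may help here: a difference between the two would be one possible explanation for code that only behaves correctly with DirectRunner.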

Any suggestions? Anything at all... (((

You need to replace

BufferedReader br = Files.newBufferedReader(pathToFile, StandardCharsets.UTF_8)

with

BufferedReader br = Files.newBufferedReader(pathToFile, Charset.forName("UTF-8"))
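
To try the suggestion in isolation, the following minimal sketch writes a small CSV containing Japanese text to a temporary file as UTF-8 and reads it back with each of the two charset expressions. The class name Utf8FileRoundTrip, the helpers readAll and roundTrip, and the sample data are hypothetical. On a standard JVM both expressions should resolve to the same UTF-8 charset, so if the corruption persists after the swap, other places where bytes and strings are converted without an explicit charset may be worth checking.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileRoundTrip {
    // Reads the whole file with an explicit charset, ending every line with "\n".
    static String readAll(Path file, Charset charset) {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader br = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = br.readLine()) != null) {
                builder.append(line).append("\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return builder.toString();
    }

    // Writes the text to a temporary file as UTF-8, reads it back with the
    // given charset, and deletes the temporary file afterwards.
    static String roundTrip(String text, Charset readCharset) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".csv");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readAll(tmp, readCharset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample CSV with Japanese text, written with Unicode escapes.
        String csv = "id,name\n1,\u65E5\u672C\u8A9E\n";
        String viaConstant = roundTrip(csv, StandardCharsets.UTF_8);
        String viaLookup = roundTrip(csv, Charset.forName("UTF-8"));
        System.out.println("constant preserves text: " + viaConstant.equals(csv));
        System.out.println("lookup matches constant: " + viaLookup.equals(viaConstant));
    }
}
```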

