
"Error: Illegal token 'string'" when trying to parse protocol buffer schema retrieved from Google Cloud PubSub

If you are reading this, I hope you are familiar with Google Cloud PubSub, PubSub Topics and Schemas for those Topics. :)

When defining a schema for a topic in GC pubsub, you have two choices for syntax: AVRO and Protocol Buffer. I've been successful at using AVRO, but when trying to use Protocol Buffer I'm getting an error that I'm not sure how to fix.

Here's the schema in proto2 syntax:

syntax = "proto2";

message ProtocolBuffer {
  string event_name = 1;
  string user_id = 2;
}

That's pretty close to what you get out of the box when you start creating a schema in GC pubsub and pick Protocol Buffer. I assume that GC pubsub doesn't like proto3 format, since it defaults to proto2.

I have a NodeJS based Cloud Function that is invoked when a document is created in GC Firestore. My goal is to get the data from Firestore and into BigQuery.

Here's my code for the Cloud Function:

const Firestore = require('@google-cloud/firestore');
const { PubSub } = require('@google-cloud/pubsub');
const protobuf = require('protobufjs');

const firestore = new Firestore();
const pubsub = new PubSub();

exports.publishToBigQuery = async (event, context) => {
    console.log("event", JSON.stringify(event, 2, null));
    console.log("context", JSON.stringify(context, 2, null));
    const affectedDoc = firestore.doc(`messages/${context.params.documentId}`);

    try {
        const documentSnapshot = await affectedDoc.get();
        if (documentSnapshot.exists) {
            const firestoreData = documentSnapshot.data();

            const topic = pubsub.topic('firestore-document-created-with-proto-schema');

            const schema = pubsub.schema('event-pb-bq');
            const info = await schema.get();
            console.log('info', info);

            let root = new protobuf.Root();
            const type = protobuf.parse(info.definition);
            console.log('type', type);
            const ProtocolBuffer = type.root.lookupType('ProtocolBuffer');
            console.log('ProtocolBuffer', ProtocolBuffer);
            const message = ProtocolBuffer.create(firestoreData);
            console.log('message', message);
            const data = Buffer.from(message.toJSON());
            console.log('data', data);

            const value = await topic.publishMessage({data});
            console.log("Message published", value);
        } else {
            console.log("Document doesn't exist", JSON.stringify(affectedDoc));
        }
    } catch (error) {
        console.error("Error when fetching document", error);
    };
};

I get this error on the const type = protobuf.parse(info.definition); line. I have no idea if the later lines are correct. They are guesses. If the schema source can't be parsed then I'm stuck.

Here's the log output, ending with the error stack trace:

event {"oldValue":{},"updateMask":{},"value":{"createTime":"2022-09-09T16:08:59.107887Z","fields":{"event_name":{"stringValue":"fridayeventname"},"user_id":{"stringValue":"fridayuserid"}},"name":"projects/myproject/databases/(default)/documents/messages/jH9W7SQj2aLh7eK8lRCl","updateTime":"2022-09-09T16:08:59.107887Z"}}
context {"eventId":"a00ecca0-0740-4cf8-94bf-15828af8e180-0","eventType":"providers/cloud.firestore/eventTypes/document.create","notSupported":{},"params":{"documentId":"jH9W7SQj2aLh7eK8lRCl"},"resource":"projects/myproject/databases/(default)/documents/messages/jH9W7SQj2aLh7eK8lRCl","timestamp":"2022-09-09T16:08:59.107887Z"}
info {
 name: 'projects/myproject/schemas/event-pb-bq',
 type: 'PROTOCOL_BUFFER',
 definition: 'syntax = "proto2";\n' +
 '\n' +
 'message ProtocolBuffer {\n' +
 ' string event_name = 1;\n' +
 ' string user_id = 2;\n' +
 '}\n'
}
Error when fetching document Error: illegal token 'string' (line 4)
 at illegal (/workspace/node_modules/protobufjs/src/parse.js:96:16)
 at parseType_block (/workspace/node_modules/protobufjs/src/parse.js:347:31)
 at ifBlock (/workspace/node_modules/protobufjs/src/parse.js:290:17)
 at parseType (/workspace/node_modules/protobufjs/src/parse.js:308:9)
 at parseCommon (/workspace/node_modules/protobufjs/src/parse.js:261:17)
 at Object.parse (/workspace/node_modules/protobufjs/src/parse.js:829:21)
 at exports.publishToBigQuery (/workspace/index.js:26:35)

I couldn't find an example anywhere that would retrieve the schema source from PubSub and then use that to format the pubsub message. Anyone have any ideas?

Thanks.

The issue is that Pub/Sub's schema validation is too permissive. The schema definition provided is not valid proto2, because none of its fields is marked optional, repeated, or required. The protobuf parser for Node catches this, while Pub/Sub's validator implicitly treats the fields as optional.

If you change the schema to the following, it should work:

 syntax = "proto2";

 message ProtocolBuffer {
   optional string event_name = 1;
   optional string user_id = 2;
 }

For follow-up on improvements to the validator in this case, you can see the issue entered for it.

