Receiving extra bytes in UART communication

Question

I have been involved in a project where I need to extract some data from a device and display it on a PC to be checked. The device I am receiving data from sends a string which includes the device ID, current mode, temperature reading and battery reading. For ease, these are separated by commas. An example of a string would be:

01,03,66661242,28

So this would mean the device ID is 1, the mode is mode 3, the temperature reading is 36.6 (this is in ASCII little-endian format), and the battery level is 4.0 V (sent as ASCII hex, so 0x28 = 40, and divided by 10).

I have no control over the format in which the data is sent.

I am using an STM32F091RC Nucleo board for this and the code I have is:

#include "mbed.h"

Serial pc(PA_2, PA_3);
Serial Unit (PA_9, PA_10, 9600);                                                // 9600 baud rate - no parity - 1 stop bit

//Input pins
DigitalIn START(PB_8, PullUp);

void GetData();
void CheckData();

char Data[100];

int deviceId;
int Mode;
float TempReading;
float battReading;

unsigned char Ascii2Hex (unsigned char data)
{
    if (data > '9')data += 9;   // add offset if value > 9
    return (data &= 0x0F);
}

unsigned char Ascii2Char(unsigned char Offset)
{
    unsigned char Ans;
    Ans = Ascii2Hex(Data[Offset]);
    Ans = Ans<<4;
    Ans += Ascii2Hex(Data[Offset+1]);
    return(Ans);
}

float Ascii2Float(unsigned char Offset)
{
    float Bob;
    unsigned char Ans;
    Ans = Ascii2Hex(Data[Offset+6]);
    Ans = Ans<<4;
    Ans += Ascii2Hex(Data[Offset+7]);
    ((unsigned char*)&Bob)[3]= Ans;
    Ans = Ascii2Hex(Data[Offset+4]);
    Ans = Ans<<4;
    Ans += Ascii2Hex(Data[Offset+5]);
    ((unsigned char*)&Bob)[2]= Ans;
    Ans = Ascii2Hex(Data[Offset+2]);
    Ans = Ans<<4;
    Ans += Ascii2Hex(Data[Offset+3]);
    ((unsigned char*)&Bob)[1]= Ans;
    Ans = Ascii2Hex(Data[Offset]);
    Ans = Ans<<4;
    Ans += Ascii2Hex(Data[Offset+1]);
    ((unsigned char*)&Bob)[0]= Ans;
    return(Bob);
}

void DecodeString()
{
    char x;

    //numbers in brackets is where the data starts in the string
    deviceId = Ascii2Char(0);
    Mode = Ascii2Char(3);
    TempReading = Ascii2Float(6);
    x = Ascii2Char(15);
    battReading = (float)x/10;
    GetData();
}

void GetData()
{
    Unit.scanf("%s,",Data);  // scan the incoming data on the RX line
    pc.printf("%s,\n\r",Data);
    pc.printf("Device ID = %i\n\r", deviceId);
    pc.printf("Mode = %i\n\r", Mode);
    pc.printf("Temp = %.1f\n\r", TempReading);
    pc.printf("Bat = %.1f\n\n\r", battReading);
}

int main()
{
    while(1) {
        if(START == 0) {
            wait(0.1);
            DecodeString();
        }
    }
}

When I first start up and press the button to get the data, the string I receive has an extra 0 at the front: 001,03,66661242,28

This means the decoded data is incorrect, because everything has shifted by one character. If I press the button again, I get the correct string, but the printed data is still incorrect. Another press and everything works fine, and it will continue working until the Nucleo board is reset. An example of the received strings and displayed data from my serial monitor is:

001,03,33331342,28,
Device ID = 0
Mode = 0
Temp = 0.0
Bat = 0.0

01,03,CDCC1242,28,
Device ID = 0
Mode = 192
Temp = 0.0
Bat = 19.4

01,03,CDCC1242,28,
Device ID = 1
Mode = 3
Temp = 36.7
Bat = 4.0

I am not an expert coder; I am very much a beginner. The bit of code that decodes the string was given to me by the engineer who designed the device that sends the data string. I have asked for assistance, but because of working from home and people being very busy with other things, and because this isn't a pressing issue, help is limited.

I have tried adding some delays in various places (such as after the original scanf and before printing), and I have also tried calling scanf 3 times, just as an experiment to see if I can bypass the incorrect data, but none of these have helped. I have tried using different UART pins (the 64-pin STM32F091RC has 6 UARTs available) but I still get the same result. I have also changed the data buffer length from 100 to 17, as that is the number of bytes I am expecting to receive, but it still makes no difference.

I have made sure that all devices share a common GND and double-checked all the hardware connections.

All I want to do is receive the correct data the first time and display the correct result the first time, but I can't seem to get it working.

EDIT

I have now tried adding a few extra lines. I am using strlen to count the number of bytes in the string. If it is more than 17, I retry. This has eliminated the first issue, but the first set of decoded data is still displayed incorrectly:

String Length = 18

String Length = 17

01,03,66661242,28,
Device ID = 0
Mode = 192
Temp = 0.0
Bat = 19.4

String Length = 17

01,03,66661242,28,
Device ID = 1
Mode = 3
Temp = 36.6
Bat = 4.0

Is there any way to make sure the data is decoded correctly the first time, or that the data is read correctly the first time, instead of needing a workaround?

Answer 1

You don't seem to have any message delimiters to indicate the start or end of a message stream. I assume that this is because you are only receiving ASCII data.

One option I would look at is to use strtok to split the data into strings (using ',' as the delimiter).

Test that you have 4 strings returned in your array.

Then for the first block just use atoi to convert it into an integer. Doing this, "001" and "01" should both convert to 1.

Ideally you should check the format of the message on reception, in case you haven't received a full message, but from what I can see here so far that isn't really necessary. Just check the format of each string: for example, if a string contains non-numeric characters when it shouldn't, discard the data up to and including that point.
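The field check described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names is_hex_field and record_looks_valid are made up here, a record is assumed to fit in 31 characters, and every field is assumed to consist of hex digits (as in the example strings in the question).

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if s is non-empty and consists only of hex digits, else 0. */
static int is_hex_field(const char *s)
{
    if (*s == '\0') return 0;
    for (; *s != '\0'; s++) {
        if (!isxdigit((unsigned char)*s)) return 0;
    }
    return 1;
}

/* Split a copy of `line` on commas and check that it has exactly four
 * fields, each made up of hex digits. Returns 1 if the record looks
 * valid, 0 otherwise. (Hypothetical helper, not part of any library.) */
int record_looks_valid(const char *line)
{
    char copy[32];
    int count = 0;

    if (strlen(line) >= sizeof copy) return 0;  /* too long to be a record */
    strcpy(copy, line);                         /* strtok modifies its input */

    for (char *tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (!is_hex_field(tok)) return 0;
        count++;
    }
    return count == 4;
}
```

Note that strtok skips empty fields and ignores a trailing comma, so "01,03,66661242,28," still counts as four fields, while a truncated record such as "01,03,CD9," does not.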

edit

I haven't understood how Temp is encoded, but here is some example code. Note that Temp is decoded incorrectly in this code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char input[] = "001,03,66661242,28";

    char* pstr = strtok(input,",");
    int count =0;
    int ID =0;
    int Mode =0;
    double Temp =0.0;
    float Volt = 0.0;

    while(pstr!=NULL)
    {
        switch(count)
        {
            case 0:
                ID = atoi(pstr);
            break;
            case 1:
                Mode = atoi(pstr);
            break;

            case 2:
                Temp = strtod(pstr, NULL);
            break;

            case 3 :
                Volt = strtol(pstr, NULL, 16) / 10.0f;  // floating-point division keeps the tenths
            break;

        }
        printf("%s\n", pstr);
        pstr = strtok(NULL,",");
        count++;

    }

    if(count == 4)
    {
        printf("ID = %d\n", ID);
        printf("Mode = %d\n", Mode);
        printf("Temp = %.1f\n", Temp);
        printf("Voltage = %.1f\n", Volt);
    }
    else
    {
        printf("Error");

    }

}
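For the temperature field, the question describes it as "ASCII little endian", and the example 66661242 decoding to 36.6 is consistent with the eight hex digits of a 32-bit IEEE 754 float sent least significant byte first (0x42126666 is about 36.6). Under that assumption, a decoder might look like the sketch below. The function names are hypothetical, and the code assumes float is a 32-bit IEEE 754 type, which holds on the STM32 and on typical PCs.

```c
#include <stdint.h>
#include <string.h>

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode 8 hex digits (least significant byte first) into a float.
 * Returns 0 on success, -1 if the field is malformed.
 * Assumes float is a 32-bit IEEE 754 type. */
int decode_le_hex_float(const char *field, float *out)
{
    uint32_t bits = 0;

    if (strlen(field) != 8) return -1;

    for (int i = 0; i < 4; i++) {
        int hi = hex_digit_value(field[2 * i]);
        int lo = hex_digit_value(field[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        /* byte i of the string is byte i of the value, lowest first */
        bits |= (uint32_t)((hi << 4) | lo) << (8 * i);
    }
    memcpy(out, &bits, sizeof *out);  /* reinterpret the bit pattern */
    return 0;
}
```

In the strtok loop above, case 2 could then call decode_le_hex_float(pstr, &t) with a float t instead of strtod, and treat a return value of -1 as a bad record.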

Answer 2

This answers your problem:

The way your code is written, GetData() prints the "Device ID", "Mode", "Temp" and "Bat" values from the previously acquired data, not from the most recently acquired data. That's why your first set of data is all zeros: on the first time through, none of those variables has been assigned yet, and since they are statically allocated global data they start out as all zeros. The second time through, it prints the results decoded from the first reading, which gives you the wrong value for "Device ID" because of the extra zero at the start of the byte stream. Finally, the third time through, it prints the data decoded from the second reading, which was good.

If you just rearrange some of your code, it will print the data based on the most recent sample. I haven't tried compiling and running it, but this seems to me a likely good rewrite, which combines your DecodeString() and GetData() functions into one function:

void DecodeString()
{
    char x;

    // scan the incoming data on the RX line
    Unit.scanf("%s,",Data);

    // translate the data from ASCII into int/float native types
    // numbers in brackets is where the data starts in the string
    deviceId = Ascii2Char(0);
    Mode = Ascii2Char(3);
    TempReading = Ascii2Float(6);
    x = Ascii2Char(15);
    battReading = (float)x/10;

    // print the original unprocessed input string
    pc.printf("%s,\n\r",Data);

    // print what we translated
    pc.printf("Device ID = %i\n\r", deviceId);
    pc.printf("Mode = %i\n\r", Mode);
    pc.printf("Temp = %.1f\n\r", TempReading);
    pc.printf("Bat = %.1f\n\n\r", battReading);
}

You may also get better results if, at startup, you flush your incoming data stream (read and discard any existing buffered garbage), which may resolve your issue with the extra character on your first read. But if there's a startup race condition between the two ends of your comms link, you may find that (perhaps occasionally) your receiver begins processing the first sample in the middle of the packet's characters, and perhaps that flush operation will have discarded the first part of the packet. While an initial flush is a good idea, it's even more important that you have a robust way of validating each packet.

This is additional commentary about your situation:

In his comment, MartinJames is correct, though perhaps a bit blunt. Serial data streams without well-defined packet protocols are notoriously unreliable, and data logging over such an interface is likely to produce erroneous data, which can have serious consequences if you're doing research or engineering on the resulting dataset. A more robust message system might start each "packet" with a known character or character pair, as a helpful resync mechanism: if your byte stream gets out of sync, the resync character (or pair) helps you get back in sync quickly and easily. In your case, since you're reading ASCII data, that's a '\n' or "\r\n", so from that standpoint you're good, as long as you actually do something to start and stop each data sample on those boundaries. What happens if you receive a data sample like this?

01,03,CDCC1242,28,
01,03,CDCC1240,27,
01,03,CDCC1241,29,
01,03,CDCC1243,28,
01,03,CDCC123F,2A,
01,03,CD9,
01,03,CDCC1241,29,
01,03,CDCC1241,29,
01,0yĔñvśÄ“3,CDCC1243,28,
01,03,CDCC123F,2A,
01,03,CDCC1242,29,

Will your code be able to re-sync after the sample that's missing several characters? What about the one that has garbage in it? Your code needs to be able to break the serial stream apart into chunks, each beginning at one delimiter character (or pair) and ending right before the next one in the stream. It should then examine the characters in between, verify that they "make sense" in some manner, and be capable of rejecting any sample that doesn't check out OK. What it does then may depend on the needs of your end consumer of the data: perhaps you can just throw out the sample and still be OK. Or maybe you should repeat the last good sample until you get to the next good one. Or perhaps you should wait until the next good one and then linearly interpolate to find reasonable estimates of what the data should have been between those good samples.
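As a rough sketch of the chunking step, the receiver can be fed one byte at a time and only hand over a record when it sees a line ending. Everything here is an assumption for illustration: the type and function names are invented, the 32-byte buffer is an arbitrary limit, and it relies on the device actually ending each record with '\r', '\n' or both.

```c
#include <stddef.h>

#define REC_BUF_SIZE 32   /* hypothetical limit; a record here is ~18 bytes */

typedef struct {
    char   buf[REC_BUF_SIZE];
    size_t len;        /* number of characters collected so far */
    int    overflow;   /* set when the current line is too long to keep */
} line_reader;

/* Feed one received byte. Returns 1 when a complete line is available in
 * r->buf (NUL-terminated, without the line ending), otherwise 0. */
int line_reader_feed(line_reader *r, char c)
{
    if (c == '\r' || c == '\n') {
        int ready = (r->len > 0 && !r->overflow);
        r->buf[r->len] = '\0';
        r->len = 0;            /* next byte starts a fresh record */
        r->overflow = 0;
        return ready;
    }
    if (r->len < REC_BUF_SIZE - 1) {
        r->buf[r->len++] = c;
    } else {
        r->overflow = 1;       /* discard the whole line when it ends */
    }
    return 0;
}
```

A zero-initialised line_reader is a valid starting state. Each time line_reader_feed returns 1, the caller would validate r->buf before decoding it; a truncated or garbled line then costs one sample at most, because the reader resynchronises on the next line ending.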

In any case, as I said, you need some way to validate each data sample. If the "packet" (data sample) length can vary, then each packet should contain some indication of the number of bytes in it, so if you get more or fewer, you know that packet is bad. (Also, if the length is unreasonable, you know the data is bad, and you don't let your data collection algorithm be fooled by a bad byte into believing that your next packet is 1.8 gigabytes long, which would probably crash your program since your receive buffer isn't that big.) Finally, there should be some sort of checksum over all the data in the packet; a 16-bit additive checksum would work, but a CRC would be better. By generating this packet overhead metadata on the sending end and verifying it on the receiving end, you guarantee (at least with some high probability) the validity of your dataset.

But as you said, you have no control over the format of the transmitted data. And that's a shame; as MartinJames said, whoever designed the protocol didn't seem to understand the unreliability of simple serial byte streams. Since you can't change that, you'll just have to do your best to find some heuristics to validate the data. Perhaps you make your code remember the last 5 samples in an array, and compare each new one to the last 5 assumed-valid samples; if you get a value that's outside the bounds of a reasonable change from the preceding samples, you throw it out and wait for the next one. Or come up with your own heuristics. Just make sure your heuristics don't result in you invalidating all future samples if the actual measured value changes too fast.
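To make the checksum idea concrete, here is a minimal sketch of both options. Note that the device in the question does not send any checksum, so this only illustrates what a sender and receiver would each compute if the protocol could be changed. The function names are made up, and the CRC variant shown is the common CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF), chosen here purely as an example.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit additive checksum: sum of all bytes, wrapping at 16 bits. */
uint16_t checksum16_add(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * bits processed most significant first, no final XOR. */
uint16_t crc16_ccitt_false(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

The additive sum cannot detect bytes that arrive in the wrong order, because addition does not care about position; the CRC can, which is one reason it is the better choice.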
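One possible shape for the "last 5 samples" heuristic is sketched below. It is only an example under stated assumptions: the names are invented, a new value is compared against the mean of the stored samples, and the escape hatch (accept the value and restart the history after 3 rejections in a row) is one arbitrary way to avoid rejecting everything forever when the real value has moved quickly.

```c
#define HISTORY_LEN 5   /* number of accepted samples remembered */
#define MAX_REJECTS 3   /* hypothetical: rejections in a row before giving in */

typedef struct {
    float history[HISTORY_LEN];
    int   count;     /* number of valid entries in history */
    int   next;      /* index of the entry to overwrite next */
    int   rejects;   /* consecutive rejected samples */
} sample_filter;

/* Mean of the stored samples; only called when count > 0. */
static float history_mean(const sample_filter *f)
{
    float sum = 0.0f;
    for (int i = 0; i < f->count; i++) sum += f->history[i];
    return sum / (float)f->count;
}

/* Returns 1 if `value` is accepted (and stored), 0 if it is rejected as
 * implausible. max_step is the largest believable distance from the mean
 * of the recent accepted samples. */
int sample_filter_accept(sample_filter *f, float value, float max_step)
{
    if (f->count > 0 && f->rejects < MAX_REJECTS) {
        float diff = value - history_mean(f);
        if (diff < 0.0f) diff = -diff;
        if (diff > max_step) {
            f->rejects++;
            return 0;
        }
    } else if (f->rejects >= MAX_REJECTS) {
        /* Too many rejections in a row: assume the value really moved. */
        f->count = 0;
        f->next = 0;
    }
    f->rejects = 0;
    f->history[f->next] = value;
    f->next = (f->next + 1) % HISTORY_LEN;
    if (f->count < HISTORY_LEN) f->count++;
    return 1;
}
```

A zero-initialised sample_filter accepts its first sample unconditionally. For the temperature readings in the question, a max_step of a degree or two might be a reasonable starting point, but the right threshold depends on how fast the measured value can really change.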
