How can I get floating-point numbers from raw data using Python or bash?

Question

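I have a file generated by an automated testing/fuzzing tool called AFL. The file represents a set of input data that triggers a bug in the program under test.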

I know this file should contain exactly 7 floating-point numbers, but if I read it with cat, I get this:

6.5
06.5
088.1
16.5
08.3
12.6
0.88.1
16.5
08.3
12.6
0.7@��25

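Clearly the list above has more than 7 floats and even contains unrecognizable characters, so I assume this is some kind of raw data. How can I write a Python script (or a bash command line) to recover the values in their original form, which in this case is 7 floats?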

For reference, I can write a C program that does the job:

#include <stdio.h>


int
main(void)
{
  double x0, x1, x2, x3, x4, x5, x6;

  if (scanf("%lf %lf %lf %lf %lf %lf %lf", &x0, &x1, &x2, &x3, &x4, &x5, &x6) != 7) return 2;

  printf ("%g,%g,%g,%g,%g,%g,%g\n",   x0, x1, x2, x3, x4, x5, x6);

  return 0;
}

Running the C program on the input above does produce 7 floats, "6.5,6.5,88.1,16.5,8.3,12.6,0.88", but I am looking for a simpler, perhaps more elegant Python/bash solution. Any ideas?

Answer 1

The best way to solve this is to use a loop and make it robust: check everything. Here is a simple example:

# Get a list of legal characters
allowed_chars = "1,2,3,4,5,6,7,8,9,0,.".split(",")
# list of lines that have been edited
legalized_lines = []

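# Open the raw data file; errors="ignore" drops bytes that cannot be decoded
# as text instead of raising UnicodeDecodeError
with open("path/to/file.extension", "r", errors="ignore") as file: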

    # Get all the lines in the file as a list
    lines = file.read().splitlines()

    # Loop through each line and check if it contains any illegal characters
    for line in lines:

        legalized_line = ""
        point_count = 0

        for char in line:

            if char in allowed_chars:

                legalized_line += char

        # Count the decimal points left in the cleaned line
        for char in legalized_line:

            if char == ".":

                point_count += 1

        if point_count > 1:

            # Reverse the string and remove every point except the first one
            legalized_line = legalized_line[::-1]
            legalized_line = legalized_line.replace(".", "", point_count - 1)
            legalized_line = legalized_line[::-1]

        # Skip lines with no digits left (blank or junk-only lines)
        if legalized_line.strip("."):

            legalized_lines.append(float(legalized_line))

for line in legalized_lines:

    print(line)
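
For comparison, here is a shorter sketch that tries to mimic what the C program in the question does: read the file as bytes and take the first 7 number-like tokens. The function name first_floats and the pattern are my own assumptions, not part of AFL or of the original program. Unlike scanf, it skips over junk instead of stopping at it, and it ignores signs and exponents, so treat it as an approximation.

```python
import re

# Matches an unsigned decimal such as "6.5", "088." or ".5" in raw bytes.
# Assumption: signs and exponents (e.g. "-1e3") are not needed for this data.
FLOAT_RE = re.compile(rb"\d+(?:\.\d*)?|\.\d+")


def first_floats(data, count=7):
    """Return the first `count` numbers found in `data`, scanning left to right."""
    tokens = FLOAT_RE.findall(data)[:count]
    # Every token is pure ASCII digits and dots, so decoding cannot fail.
    return [float(tok.decode("ascii")) for tok in tokens]


# Sample bytes modelled on the cat output in the question; the two bytes
# after "@" stand in for the unrecognizable characters.
sample = b"6.5\n06.5\n088.1\n16.5\n08.3\n12.6\n0.88.1\n16.5\n08.3\n12.6\n0.7@\xff\xfe25\n"
print(first_floats(sample))  # [6.5, 6.5, 88.1, 16.5, 8.3, 12.6, 0.88]
```

To run it on the real file, open it in binary mode ("rb") and pass the result of read() to first_floats. Binary mode avoids decoding errors on the non-text bytes.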
