How to get floats from raw data using python or bash?

Question

I have a file generated by AFL, an automated testing/fuzzing tool. The file holds one set of input data that triggers a bug in the program under test.

I know this file is supposed to contain exactly 7 floating-point numbers, but when I print it with cat, I get this:

6.5
06.5
088.1
16.5
08.3
12.6
0.88.1
16.5
08.3
12.6
0.7@��25

The list above clearly has more than 7 floats and even contains unrecognized characters, so I suppose this is some sort of raw data. How can I write a Python script (or a bash command line) to recover the original values, which in this case are 7 floating-point numbers?

For reference, I can write a C program that does the job, like this:

#include <stdio.h>


int
main(void)
{
  double x0, x1, x2, x3, x4, x5, x6;

  if (scanf("%lf %lf %lf %lf %lf %lf %lf", &x0, &x1, &x2, &x3, &x4, &x5, &x6) != 7) return 2;

  printf ("%g,%g,%g,%g,%g,%g,%g\n",   x0, x1, x2, x3, x4, x5, x6);

  return 0;
}

Running the C program on the input above does produce 7 floating-point numbers, "6.5,6.5,88.1,16.5,8.3,12.6,0.88", but I am looking for a simpler, maybe more elegant Python/bash solution. Any ideas?

Answer 1

One robust way to approach this is to loop over the lines, keep only the characters that can be part of a number, and deal with extra decimal points. Here's a quick example:

# Characters that may legally appear in a number
allowed_chars = set("0123456789.")
# Floats recovered from the file, one per usable line
legalized_lines = []

# Open the raw data file; errors="ignore" drops bytes that cannot be decoded
with open("path/to/file.extension", "r", errors="ignore") as file:

    # Loop through each line and drop any illegal characters
    for line in file.read().splitlines():

        legalized_line = "".join(char for char in line if char in allowed_chars)

        # Keep only the first decimal point if there is more than one
        point_count = legalized_line.count(".")

        if point_count > 1:

            # Reverse the string so that replace() removes the trailing points
            legalized_line = legalized_line[::-1]
            legalized_line = legalized_line.replace(".", "", point_count - 1)
            legalized_line = legalized_line[::-1]

        # Skip lines that have no digits left after cleaning
        if legalized_line.strip("."):

            legalized_lines.append(float(legalized_line))

for value in legalized_lines:

    print(value)
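
The loop above works line by line, so it yields one value per line rather than the 7 values the C program prints. If the goal is to reproduce what scanf does in the question, a rough sketch is to read number-shaped prefixes one after another and stop at the first thing that is not a number. The function name read_floats, the regular expression, and the two placeholder bytes \xff\xfe in the sample are assumptions made for illustration (the real bytes behind the unrecognized characters are not known), and the pattern covers only plain decimal notation, not everything %lf accepts (for example inf, nan, or hexadecimal floats).

```python
import re

# Optional leading whitespace, then: optional sign, digits with an optional
# fraction (or a bare fraction), and an optional exponent.
FLOAT_PREFIX = re.compile(rb"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")


def read_floats(data, count=7):
    """Parse up to `count` floats from the start of `data` (bytes).

    Parsing stops early at the first position that does not start a number.
    """
    values = []
    pos = 0
    while len(values) < count:
        match = FLOAT_PREFIX.match(data, pos)
        if match is None:
            break
        values.append(float(match.group(1).decode("ascii")))
        pos = match.end()
    return values


# Sample bytes modeled on the cat output in the question; \xff\xfe is a
# placeholder for the unknown non-text bytes.
sample = (
    b"6.5\n06.5\n088.1\n16.5\n08.3\n12.6\n0.88.1\n"
    b"16.5\n08.3\n12.6\n0.7@\xff\xfe25\n"
)

print(",".join(f"{v:g}" for v in read_floats(sample)))
```

To run this on the real file, the sample can be replaced with the file's contents read in binary mode, for example open("path/to/file.extension", "rb").read(), which avoids decoding errors on the non-text bytes.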
