
String List conversion to float python

I have a file with lines of numbers representing vectors. I am trying to convert it into a list of lists of floats. Right now, my problem is that my code takes only the first value of each line. I have tried to loop over each index, but I get the error "could not convert string to float".

Here is my code:

with open(input_file) as f:
    content = f.readlines()
content = [x.strip() for x in content]
input_val_arr = list(map(float, [i.split(' ', 1)[0] for i in content]))

input format:

0.03518 -0.02543 ... (dim = 100)

0.0025865 -0.01867 ....

...

(dim = ALOT)

Desired output:

[[0.03518, -0.02543, ...],
 [0.0025865, -0.01867, ...],
 ...
]

I have tried to change my code to:

with open(input_file) as f:
    content = f.readlines()
input_val_arr = []
for index in range(x_dim):
    temp_list = list(map(float, [i.split(' ', 1)[index] for i in content]))
    input_val_arr.append(temp_list)

And I get the following error: ValueError: could not convert string to float: '-0.02543 0.0025865 ...'

Use a regex to extract all the floats from each line of your file, then use map to convert them to float objects.

Ex:

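import re

res = []
with open(filename, "r") as infile:
    for line in infile:  # iterate lazily instead of loading every line with readlines()
        # Raw string so that \d and \. reach the regex engine unchanged
        data = re.findall(r"-?\d+\.\d+", line)
        if data:  # skip lines that contain no decimal numbers
            floatData = list(map(float, data))
            res.append(floatData)
print(res)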

Output:

[[0.03518, -0.02543], [0.0025865, -0.01867]]
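One caveat with the pattern above: it only matches numbers that have digits on both sides of a decimal point, so integers are skipped and a value in scientific notation such as 3.518e-02 is cut down to 3.518. If your file may contain such values, a broader pattern along these lines could be used. The name FLOAT_RE and the sample line are illustrative assumptions, not part of the original answer.

```python
import re

# Optional sign, then digits with an optional fraction (or a bare ".5"),
# then an optional exponent such as e-02
FLOAT_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

line = "3.518e-02 -2.543e-02 7 .5 1e3"  # made-up sample line

narrow = re.findall(r"-?\d+\.\d+", line)  # pattern from the answer above
broad = FLOAT_RE.findall(line)

print(narrow)  # ['3.518', '-2.543']
print(broad)   # ['3.518e-02', '-2.543e-02', '7', '.5', '1e3']
print([float(tok) for tok in broad])
```

Note that the broader pattern will also pick digits out of ordinary words, so if the file is nothing but whitespace-separated numbers, splitting on whitespace is probably the simpler route.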

Another simple and quick approach:

line = "0.0025865 -0.01867"
values = list(map(float, line.split()))
print(values)

Output:

[0.0025865, -0.01867]
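For what it's worth, the ValueError in the question appears to come from the second argument of split(' ', 1): it limits the split to a single cut, so everything after the first space stays in one string. A minimal sketch with a made-up three-value line:

```python
line = "0.03518 -0.02543 0.0025865"  # made-up sample line

# maxsplit=1 cuts only once, so the remainder stays glued together
limited = line.split(' ', 1)
print(limited)  # ['0.03518', '-0.02543 0.0025865']

# float() cannot parse the glued remainder
try:
    float(limited[1])
    failed = False
except ValueError:
    failed = True
print(failed)  # True

# split() with no arguments cuts on every run of whitespace
values = [float(tok) for tok in line.split()]
print(values)  # [0.03518, -0.02543, 0.0025865]
```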

If you want a list of lists, where each inner list represents one line from the file, then something like this will work:

result = []

for i in range(5):
    line = "0.0025865 -0.01867\n"
    values = list(map(float, line.split()))
    result.append(values)

print(result)

Output:

[[0.0025865, -0.01867], [0.0025865, -0.01867], [0.0025865, -0.01867], [0.0025865, -0.01867], [0.0025865, -0.01867]]

Here for simplicity, I used a single input called line 5 times, but in your case, the line will come from your file.

Here we assume that line is a line from a file and that it contains only numeric values. You should extend the code to handle corner cases such as blank lines or non-numeric tokens.
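Reading from an actual file and handling a couple of those corner cases might look roughly like this. It is only a sketch: read_vectors is a hypothetical helper name, and the temporary file stands in for your input file. Blank lines are skipped, and a token that is not a number raises a ValueError that names the offending line.

```python
import os
import tempfile


def read_vectors(path):
    """Read whitespace-separated floats, one vector per line."""
    vectors = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            tokens = line.split()  # also drops the trailing newline
            if not tokens:         # skip blank lines
                continue
            try:
                vectors.append([float(tok) for tok in tokens])
            except ValueError as exc:
                # Re-raise with the line number to make bad input easy to find
                raise ValueError(f"line {lineno}: {exc}") from exc
    return vectors


# Demo with a throwaway file containing a blank line in the middle
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0.03518 -0.02543\n\n0.0025865 -0.01867\n")
result = read_vectors(tmp.name)
os.remove(tmp.name)
print(result)  # [[0.03518, -0.02543], [0.0025865, -0.01867]]
```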

Let's say this is your file:

0.03518 -0.02543 0.5469 0.538

The values are separated by spaces.

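with open(input_file, "r") as f:
    content = f.read()

# split() with no argument splits on any whitespace, including the trailing newline
content = content.split()
# float() parses each token; unlike eval(), it cannot execute arbitrary code
content = [float(elt) for elt in content]

# Output: [0.03518, -0.02543, 0.5469, 0.538]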

If you have several lines as shown:

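with open(input_file, "r") as f:
    content = f.readlines()

# One list of tokens per line, then one list of floats per line
content = [line.split() for line in content]
content = [[float(x) for x in elt] for elt in content]

Calling split() with no argument also discards the trailing newline, so no separate strip("\n") is needed.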

I know you are asking about plain Python, but what you want to achieve can be done simply with NumPy. np.loadtxt has multiple options that help with processing flat files of numerical data.

import numpy as np
numbers = np.loadtxt('file_name.txt')
numbers

The output will look like this:

[[ 3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02,
        -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,
         3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02,
        -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,
         3.518e+03, -2.543e-02,  3.518e-02, -2.543e-02],
       [ 3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02,
        -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,
         3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02,
        -2.543e-02,  3.518e-02, -2.543e-02,  3.518e-02, -2.543e-02,
         3.518e+03, -2.543e-02,  3.518e-02, -2.543e-02]]
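Since the question asks for a list of lists rather than an array, the result of np.loadtxt can be converted with tolist(). A small sketch, using an in-memory io.StringIO as a stand-in for the real file (the sample values are taken from the question); ndmin=2 is passed so that a file with a single line still yields a 2-D result:

```python
import io

import numpy as np

# Stand-in for the real file; np.loadtxt also accepts a filename
sample = io.StringIO("0.03518 -0.02543\n0.0025865 -0.01867\n")

numbers = np.loadtxt(sample, ndmin=2)  # 2-D array even if there is one row
as_lists = numbers.tolist()            # nested Python lists of floats

print(as_lists)  # [[0.03518, -0.02543], [0.0025865, -0.01867]]
```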
