Problems reading files with letters and numbers

I'm trying to work with a file that contains a header, a set of numbers separated by double spaces, and some text at the end (as shown in the image below).

[Image: screenshot of the file]

My goal is to extract these numbers so that I can build a graph with them. Another problem is that the program that writes the file uses a comma as its decimal separator, while Python uses a period.

I feel like this is pretty easy to do, but my stupidity limits me.

I can't give you a precise answer since we don't have access to the exact file, so I'm using this as sample.txt (check Pastebin here).

So the code goes:

# read the whole file into a list of lines; the with block closes the file afterwards
with open("sample.txt", "r") as f:
    file_lines = f.read().splitlines()

# the header is on the 2nd line; split takes the separator as its first argument
header_line = file_lines[1]
headers = header_line.split("  ")

# the column numbers are on the 3rd line
# strip removes spaces from the start and end, e.g. "                1  2  3"
numbers_line = file_lines[2].strip().split("  ")

# in my example the data starts at the 4th line and ends at the 8th line (inclusive)
data_line_start = 4
data_line_end = 8
data_lines = file_lines[data_line_start - 1:data_line_end]
# remove spaces from the start and end of every data line
data_lines = [j.strip() for j in data_lines]
# data_lines is now a list of strings such as
# '0.03592  0.04902  0.0248  0.0327  0.0520  0.0318'

# we still need to split each line on the double space and convert to float;
# replace(",", ".") also covers files that use a comma as the decimal separator
data_array = []
for data_line in data_lines:
    data_line_formatted = [float(k.replace(",", ".")) for k in data_line.split("  ")]
    data_array.append(data_line_formatted)

print("HEADERS")
print(headers)
print("NUMBERS LINE")
print(numbers_line)
print("DATA ARRAY")
print(data_array)

OUTPUT:

HEADERS
['Plate:', 'PLate1', '1.3', 'PlateFormat', 'EndPoint', 'Absorbance', 'Reduced', 'FALSE', '1', '1', '410', '1', '12', '96', '1', '5']
NUMBERS LINE
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
DATA ARRAY
[[0.03592, 0.04902, 0.0248, 0.0327, 0.052, 0.0318], [0.0553, 0.06602, 0.0548, 0.0232, 0.071, 0.0782], [0.08413, 0.04402, 0.0348, 0.0654, 0.0612, 0.0428], [0.0543, 0.06202, 0.0148, 0.0732, 0.081, 0.0882], [0.0443, 0.04102, 0.0343, 0.0556, 0.0652, 0.0928]]
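
For the graph mentioned in the question, it is often more convenient to have one list per column rather than one list per row. A minimal sketch, reusing the first two rows of the output above (the name `columns` is only illustrative):

```python
# two rows copied from the DATA ARRAY output above
data_array = [
    [0.03592, 0.04902, 0.0248, 0.0327, 0.052, 0.0318],
    [0.0553, 0.06602, 0.0548, 0.0232, 0.071, 0.0782],
]

# zip(*rows) groups the i-th element of every row, i.e. it transposes the table
columns = [list(col) for col in zip(*data_array)]

for index, column in enumerate(columns, start=1):
    print(index, column)
```

Each entry of `columns` can then be handed to whatever plotting library you prefer as one series.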

You can use the open() function to open the file, read it into a list of lines stored in the file_lines variable, and then use a few Python string methods (strip, split) to format the data. The script above might not fit your file exactly, but you can adapt it to your needs. Let me know if it helped you.
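
Since the question mentions a decimal comma and some text after the numbers, here is a minimal sketch of a more tolerant variant. It assumes every data row consists only of numbers separated by whitespace. `parse_numeric_rows` is a hypothetical helper name, and the `sample` lines are made up for illustration, not taken from the real file. Lines that cannot be converted (the header, the trailing text) are simply skipped.

```python
def parse_numeric_rows(lines):
    """Return a list of float rows, skipping lines that are not purely numeric."""
    rows = []
    for line in lines:
        # split() with no argument splits on any run of whitespace
        fields = line.split()
        if not fields:
            continue  # blank line
        try:
            # accept both "0,0359" and "0.0359" by normalising the separator
            row = [float(field.replace(",", ".")) for field in fields]
        except ValueError:
            continue  # header or text line
        rows.append(row)
    return rows


# made-up sample: a header, two data rows with decimal commas, trailing text
sample = [
    "Plate:  PLate1  1.3  PlateFormat",
    "  0,03592  0,04902  0,0248",
    "  0,0553  0,06602  0,0548",
    "some text at the end",
]
print(parse_numeric_rows(sample))
```

Note that a line of column indices such as "1  2  3" is also purely numeric, so it would be returned as a row as well; drop it by position if your file has one.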
