I have a file which looks like the following:
@ junk
...
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
... ...
3.0 -100.102473082342
&
@ junk
...
I am interested only in the two columns of numbers given between the @ and & characters. These characters may appear anywhere else in the file but never inside the number block.
I want to create two lists, one with the first column and one with the second column.
List1 = [1.0, 1.1,..., 3.0]
List2 = [-100.102487081243, -100.102497023421,..., -100.102473082342]
I've been using shell scripting to prep these files for a simpler Python script that builds the lists. However, I'm trying to migrate these processes over to Python for a more consistent application. Any ideas? I have limited experience with Python and file handling.
Edit: I should mention, this number block appears in two places in the file. Both number blocks are identical.
Edit2: A general function would be most satisfactory for this as I will put it into a custom library.
Current Efforts
I currently use a shell script to trim out everything but the number block into two separate columns. From there it is trivial for me to use the following function:
def ReadLL(infile):
    List = open(infile).read().splitlines()
    intL = [int(i) for i in List]
    return intL
by calling it from my main
import sys
import eLIBc
infile = sys.argv[1]
sList = eLIBc.ReadLL(infile)
The problem is knowing how to extract the number block from the original file with Python rather than using shell scripting.
You want to loop over the file itself, and set a flag for when you find the first line without a @ character, after which you can start collecting numbers. Break off reading when you find the & character on a line.
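def readll(infile):
    with open(infile) as data:
        floatlist1, floatlist2 = [], []
        reading = False
        for line in data:
            if not reading:
                if '@' not in line:
                    # first line without '@': the number block starts here
                    reading = True
                else:
                    continue
            if '&' in line:
                # end of the first block
                return floatlist1, floatlist2
            # build a list so indexing also works on Python 3,
            # where map() returns an iterator
            numbers = [float(x) for x in line.split()]
            floatlist1.append(numbers[0])
            floatlist2.append(numbers[1])
    # no '&' line was found: return whatever was collected
    return floatlist1, floatlist2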
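So the above sets reading to False, and only when a line without '@' is found is that set to True. While reading is True, a line containing & makes the function return the two lists collected so far; any other line is assumed to hold two whitespace-separated float values, which are appended to their respective lists.
By returning, the function ends, with the file closed automatically. Only the first block is read; the rest of the file is simply ignored.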
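The same skip-then-collect logic can also be sketched with itertools, which avoids the explicit flag. This is only an illustration, not part of the original answer: the name read_first_block and the in-memory sample are hypothetical, and every line inside the block is assumed to hold exactly two whitespace-separated numbers.

```python
from itertools import dropwhile, takewhile


def read_first_block(lines):
    """Return two lists of floats from the first number block in `lines`."""
    # Skip the leading lines that contain '@'.
    body = dropwhile(lambda line: '@' in line, lines)
    # Keep lines up to, but not including, the first one containing '&'.
    block = takewhile(lambda line: '&' not in line, body)
    col1, col2 = [], []
    for line in block:
        first, second = line.split()  # assumes exactly two fields per line
        col1.append(float(first))
        col2.append(float(second))
    return col1, col2


# Hypothetical sample mirroring the layout described in the question.
sample = """@ junk
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
3.0 -100.102473082342
&
@ junk
1.0 -100.102487081243
"""

first_col, second_col = read_first_block(sample.splitlines())
print(first_col)
print(second_col)
```

Because the function accepts any iterable of lines, an open file object can be passed in directly instead of the sample string.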
Try this out:
with open("i.txt") as fp:
    lines = fp.readlines()

data = False
List1 = []
List2 = []
for line in lines:
    if line[0] not in ['&', '@']:
        print(line)
        line = line.split()
        List1.append(line[0])
        List2.append(line[1])
        data = True  # we are inside the number block
    elif data:
        break  # first marker line after the block: stop

print(List1)
print(List2)
This should give you the first block of numbers.
Input:
@ junk
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
3.0 -100.102473082342
&
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
Output:
['1.0', '1.1', '3.0']
['-100.102487081243', '-100.102497023421', '-100.102473082342']
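Note that the lists in the output above hold strings, while the question asks for numbers. A minimal sketch of the conversion, assuming every entry is a valid float literal (the names floats1 and floats2 are only illustrative):

```python
# Lists as produced by the loop above (values are still strings).
List1 = ['1.0', '1.1', '3.0']
List2 = ['-100.102487081243', '-100.102497023421', '-100.102473082342']

# float() raises ValueError on anything that is not a valid number,
# so a stray junk line would be reported rather than silently kept.
floats1 = [float(x) for x in List1]
floats2 = [float(x) for x in List2]

print(floats1)
print(floats2)
```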
Update
If you need both blocks, then use this:
with open("i.txt") as fp:
    lines = fp.readlines()

List1 = []
List2 = []
for line in lines:
    if line[0] not in ['&', '@']:
        print(line)
        line = line.split()
        List1.append(line[0])
        List2.append(line[1])

print(List1)
print(List2)
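The loop above appends both blocks to the same two lists. For a library function, it may be more convenient to keep each block separate. The following is a rough sketch under the same assumption as the code above (marker lines start with @ or &); the name read_blocks and the sample text are hypothetical.

```python
def read_blocks(lines):
    """Return a list of (col1, col2) pairs, one pair per number block."""
    blocks = []
    col1, col2 = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped[0] in ('@', '&'):
            # A marker line closes the block being collected, if any.
            if col1:
                blocks.append((col1, col2))
                col1, col2 = [], []
            continue
        first, second = stripped.split()  # assumes exactly two fields
        col1.append(float(first))
        col2.append(float(second))
    if col1:
        # The input ended without a closing marker line.
        blocks.append((col1, col2))
    return blocks


# Hypothetical sample with two blocks, as in the Input shown earlier.
sample = """@ junk
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
3.0 -100.102473082342
&
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
"""

blocks = read_blocks(sample.splitlines())
for first_col, second_col in blocks:
    print(first_col, second_col)
```

Since the function only iterates over its argument, it can be called on an open file object as well, and blocks[0] gives the first block when the blocks are identical.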