Fortran read of data with * to signify similar data

Question

My data looks like this

-3442.77 -16749.64 893.08 -3442.77 -16749.64 1487.35 -3231.45 -16622.36 902.29

.....

159*2539.87 10*0.00 162*2539.87 10*0.00

which means I start with either 7 or 8 reals per line and then (towards the end) have 159 values of 2539.87, followed by 10 values of 0, followed by 162 values of 2539.87, and so on. This seems to be a space-saving method, as previous versions of this file format had a regular 6 reals per line.

I am already reading the data into a string because of not knowing whether there are 7 or 8 numbers per line. I can therefore easily spot lines that contain *. But what then? I suppose I have to identify the location of each * and then identify the integer number before and real value after before assigning to an array. Am I missing anything?

Answer 1

Read the line and split it into tokens delimited by whitespace. In every token that contains a *, replace the * with a space. Then read one or two values from the token, depending on whether there was an asterisk or not. Sample code follows:

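REAL, DIMENSION(big) :: data    ! "big" is a placeholder: any size large enough for all values
CHARACTER(LEN=40) :: token      ! holds one whitespace-delimited token
INTEGER :: iptr, count, idx
REAL :: val

iptr = 1                        ! next free position in "data"
DO WHILE (there_are_tokens_left)  ! placeholder condition, depends on your tokenizer
  ... ! Get the next token into "token"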
  idx = INDEX(token, "*")
  IF (idx == 0) THEN
    READ(token, *) val
    count = 1
  ELSE
    ! Replace "*" with space and read two values from the string
    token(idx:idx) = " "
    READ(token, *) count, val
  END IF
  data(iptr:iptr+count-1) = val  ! Add "val" "count" times to the list of values
  iptr = iptr + count
END DO

Here I have arbitrarily set the length of the token to be 40 characters. Adjust it according to what you expect to find in your input files.
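
For reference, here is a minimal self-contained sketch of the same approach with the tokenizing step filled in. The tokenizer (built on the VERIFY and INDEX intrinsics), the sample line, the bound max_vals and the names vals and nrep are assumptions made for illustration; they are not part of the code above.

```fortran
program expand_rle
  implicit none
  integer, parameter :: max_vals = 1000   ! assumed upper bound on the number of values
  character(len=80)  :: line              ! one input line (sample data, set below)
  character(len=40)  :: token             ! one whitespace-delimited token
  real    :: vals(max_vals), val
  integer :: nvals, pos, first, last, idx, nrep

  line = '-3442.77 -16749.64 3*893.08 2*0.00'
  nvals = 0
  pos = 1
  do while (pos <= len_trim(line))
    ! Skip blanks: VERIFY returns the offset of the first non-blank character
    first = pos + verify(line(pos:), ' ') - 1
    ! The token ends just before the next blank, or at the end of the line
    last = index(line(first:), ' ')
    if (last == 0) then
      last = len(line)
    else
      last = first + last - 2
    end if
    token = line(first:last)
    pos = last + 1

    idx = index(token, '*')
    if (idx == 0) then
      read(token, *) val                  ! plain value
      nrep = 1
    else
      token(idx:idx) = ' '                ! "3*893.08" becomes "3 893.08"
      read(token, *) nrep, val
    end if

    if (nvals + nrep > max_vals) stop 'too many values for vals'
    vals(nvals+1:nvals+nrep) = val        ! store val nrep times
    nvals = nvals + nrep
  end do

  print '(A,I0)', 'values read: ', nvals
  print '(7F10.2)', vals(1:nvals)
end program expand_rle
```

With the sample line this stores 7 values: the two plain ones, three copies of 893.08 and two zeros. In a real reader the loop body would run once per line read from the file, with nvals carried over between lines.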

BTW, for the sake of completeness, this method of compressing something by replacing repeating values with value/repetition-count pairs is called run-length encoding (RLE).

Answer 2

Your input data may have been written in a form suitable for list directed input (where the format specification in the READ statement is simply *). List directed input supports the r*c form that you see, where r is a repeat count and c is the constant to be repeated.

If the total number of input items is known in advance (perhaps it is fixed for that program, perhaps it is defined by earlier entries in the file) then reading the file is as simple as:

REAL :: data(size_of_data)
READ (unit, *) data

For example, to read the last line shown in your example on its own, size_of_data would need to be 341, from 159+10+162+10.
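
A small sketch to check this, reading that one line from a character variable (an internal file) instead of a real unit. The variable names line and vals are just for illustration.

```fortran
program list_directed_repeat
  implicit none
  character(len=80) :: line
  real :: vals(341)          ! 159 + 10 + 162 + 10

  line = '159*2539.87 10*0.00 162*2539.87 10*0.00'

  ! List directed input expands each r*c item into r copies of c
  read(line, *) vals

  print '(A,F8.2)', 'vals(159) = ', vals(159)   ! last of the first run of 2539.87
  print '(A,F8.2)', 'vals(160) = ', vals(160)   ! first of the ten zeros
  print '(A,F8.2)', 'vals(170) = ', vals(170)   ! start of the second run of 2539.87
  print '(A,F8.2)', 'vals(341) = ', vals(341)   ! last of the trailing zeros
end program list_directed_repeat
```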

With list directed input the data can span across multiple records (multiple lines) - you don't need to know how many items are on each line in advance - just how many appear in the next "block" of data.
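
To illustrate the point about records, the following sketch writes three lines of different lengths to a scratch file and reads them back with a single READ. The scratch file and the sample numbers are made up for the demonstration, and NEWUNIT= assumes a Fortran 2008 compiler.

```fortran
program list_directed_multiline
  implicit none
  real    :: vals(12)
  integer :: u

  ! Stand-in for the real data file: 3, 4 and 5 values per line
  open(newunit=u, status='scratch', action='readwrite')
  write(u, '(A)') '1.0 2.0 3.0'
  write(u, '(A)') '4.0 5.0 6.0 7.0'
  write(u, '(A)') '3*8.5 2*0.0'
  rewind(u)

  ! One READ consumes as many records as it needs to fill all 12 items
  read(u, *) vals
  close(u)

  print '(6F6.1)', vals
end program list_directed_multiline
```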

List directed input has a few other "features" like this, which is why it is generally not a good idea to use it to parse "arbitrary" input that hasn't been written with it in mind. Use an explicit format specification instead (which may require creating the format specification on the fly to match the width of the input field if that is not known ahead of time).
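
Two of those features, shown as a small sketch: consecutive commas denote a null value, and a slash ends the input early. In both cases the affected list items simply keep whatever value they had before the READ. The sample strings and the sentinel value -1.0 are chosen only for illustration.

```fortran
program list_directed_pitfalls
  implicit none
  character(len=40) :: line
  real :: a(3)

  ! Two consecutive commas mean "null value": a(2) is left untouched
  a = -1.0
  line = '1.0,,3.0'
  read(line, *) a
  print '(3F6.1)', a        ! a(2) is still -1.0

  ! A slash terminates the input: a(2) and a(3) are left untouched
  a = -1.0
  line = '1.0 / 3.0'
  read(line, *) a
  print '(3F6.1)', a        ! only a(1) was read
end program list_directed_pitfalls
```

Neither case raises an error, so input that happens to contain a stray comma or slash can be silently misread.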

If you don't know (or cannot calculate) the number of items in advance of the READ statement then you will need to do the parsing of the line yourself.
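
One possible middle ground, sketched here under the assumption that items are separated by blanks only: scan the line yourself just to count how many values it expands to, allocate accordingly, and then let list directed input do the expansion. The function count_values and the sample line are hypothetical.

```fortran
program count_then_read
  implicit none
  character(len=200) :: line
  real, allocatable  :: vals(:)
  integer :: n

  line = '-3442.77 -16749.64 3*893.08 2*0.00'
  n = count_values(line)
  allocate(vals(n))
  read(line, *) vals                 ! list directed input expands the r*c items

  print '(A,I0)', 'values on line: ', n
  print '(7F10.2)', vals

contains

  ! Number of values a blank-separated line expands to (r*c counts as r)
  integer function count_values(text) result(n)
    character(len=*), intent(in) :: text
    integer :: pos, first, last, star, nrep

    n = 0
    pos = 1
    do while (pos <= len_trim(text))
      first = pos + verify(text(pos:), ' ') - 1   ! start of the next token
      last = index(text(first:), ' ')             ! blank after the token, if any
      if (last == 0) then
        last = len(text)
      else
        last = first + last - 2
      end if
      star = index(text(first:last), '*')
      if (star == 0) then
        n = n + 1
      else
        read(text(first:first+star-2), *) nrep    ! the repeat count before the *
        n = n + nrep
      end if
      pos = last + 1
    end do
  end function count_values

end program count_then_read
```

For the sample line the count is 7. Lines that use commas, slashes or quoted strings would need a more careful scanner than this one.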
