Pandas read_fwf alternating line formats

Question

I have a positional (fixed-width) text file in which each record is split across two lines.

Column 1Column 2Column 3
Text
11      12      13
text for 1
21      22      23
text for 2
31      32      33
text for 3
41      42      43
text for 4
51      52      53
text for 5

I'm trying to get this into a dataframe like

Column 1Column 2Column 3 Text
11      12      13       text for 1
21      22      23       text for 2
31      32      33       text for 3
41      42      43       text for 4
51      52      53       text for 5

I'm testing without the column headers

import pandas as pd

# fn is the path to the positional text file
cols = [(0, 8), (8, 16), (16, None), (0, 50)]
rs = pd.read_fwf(fn, colspecs=cols, header=None)

gives me:

    0   1   2   3
0   11  12  13.0    11 12 13
1   text for    1   NaN text for 1
2   21  22  23.0    21 22 23
3   text for    2   NaN text for 2

Is there any way to alternate the formats of the lines?

Answer 1

You can take every other row and join its text into one string for a new Text column, like this:

# df is assumed to be the DataFrame already loaded with pd.read_fwf (the load step is not shown)
data = df.values.tolist()[::2][1:]
df = df[1:]
df = pd.DataFrame(df.values.tolist()[::2], columns=df.columns)
df['Text'] = [' '.join([str(x) for x in i[:-1]]) for i in data]
df = df.drop('3', axis=1)
print(df)

Result:

  Column 1Column  2Column          Text
0     11      12     13.0  text for 1.0
1     21      22     23.0  text for 2.0
2     31      32     33.0  text for 3.0
3     41      42     43.0  text for 4.0
4     51      52     53.0  text for 5.0

The first line of code builds a list from every other row of df (the text rows), skipping the first one. The second line removes the first row from df. The third line builds a new dataframe from every other remaining row (the numeric rows). The fourth line creates the Text column with a list comprehension that joins all but the last value of each row in data. The fifth line drops the column labelled '3', which is no longer needed, and the sixth line prints the result.
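
Another option is to split the lines before parsing, so that read_fwf only ever sees the numeric lines. The sketch below is a minimal example under some assumptions: the first two lines of the file are the headers, the remaining lines strictly alternate between a numeric line and a text line, and the column widths are the ones from the question. The names raw, numeric_lines and text_lines are illustrative only, and the sample data is copied inline so the example runs on its own.

```python
import io

import pandas as pd

# Sample data copied from the question. With a real file you could use
# open(fn).read() instead (fn being the path from the question).
raw = """\
Column 1Column 2Column 3
Text
11      12      13
text for 1
21      22      23
text for 2
31      32      33
text for 3
41      42      43
text for 4
51      52      53
text for 5
"""

lines = raw.splitlines()
body = lines[2:]  # assumption: the first two lines are the column headers

numeric_lines = body[0::2]  # 1st, 3rd, 5th, ... data lines: fixed-width numbers
text_lines = body[1::2]     # 2nd, 4th, 6th, ... data lines: free text

# Parse only the numeric lines with the fixed-width reader.
df = pd.read_fwf(
    io.StringIO("\n".join(numeric_lines)),
    colspecs=[(0, 8), (8, 16), (16, None)],
    header=None,
    names=["Column 1", "Column 2", "Column 3"],
)

# Attach the text lines as a fourth column, one per numeric line.
df["Text"] = [line.strip() for line in text_lines]
print(df)
```

Because the text rows never pass through read_fwf, the third column has no missing values to accommodate, and the text is never cut up by the numeric column boundaries. If the file does not alternate strictly, the two lists will differ in length and the assignment to df["Text"] will raise an error, which is a useful early warning.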

solution1
1 2020-12-16 07:46:02