I have a text file like this:
MAX_POWER SPEED ETDWPNO ETAWPNO OPTIMIZED BUDGET
100 20.0 000 000 MaxSpeed 00000000.00
ETD_YEAR ETD_MONTH ETD_DAY ETD_HOUR ETD_MINUTE ETA_YEAR ETA_MONTH ETA_DAY ETA_HOUR ETA_MINUTE
2013 03 03 08 00 2013 03 03 08 00
NAME LAT LON LEG_TYPE TURN_RADIUS CHN_LIMIT PLANNED_SPEED SPEED_MIN SPEED_MAX COURSE LENGTH DO_PLAN HFO_PLAN HFO_LEFT DO_LEFT ETA_DAY ETA_TIME
BERTH 34 28.343 N 133 27.147 E RHUMBLINE 00.8 00185 000.0 000.0 000.0 000.0 00000.00 00000.0 00000.0 00000 00000 0000.00.00 00:00
CHANNEL 34 28.005 N 133 26.887 E RHUMBLINE 00.3 00110 006.0 000.0 012.5 212.5 00000.32 00000.0 00000.0 00000 00000 0000.00.00 00:00
FAIRWAY 34 22.671 N 133 26.773 E RHUMBLINE 00.3 00100 008.0 000.0 012.5 181.0 00005.35 00000.0 00000.0 00000 00000 0000.00.00 00:00
HAKAMA S 34 21.016 N 133 27.444 E RHUMBLINE 00.3 00231 011.3 000.0 012.5 161.4 00001.74 00000.0 00000.0 00000 00000 0000.00.00 00:00
MU SHIMA 34 17.485 N 133 30.836 E RHUMBLINE 00.3 00231 011.3 000.0 012.5 141.4 00004.41 00000.0 00000.0 00000 00000 0000.00.00 00:00
BISAN SE 34 17.571 N 133 37.128 E RHUMBLINE 00.3 00233 011.3 000.0 012.5 089.1 00005.34 00000.0 00000.0 00000 00000 0000.00.00 00:00
BISAN SE 34 17.557 N 133 40.198 E RHUMBLINE 00.3 00231 011.3 000.0 012.5 090.3 00002.45 00000.0 00000.0 00000 00000 0000.00.00 00:00
BISAN SE 34 18.594 N 133 42.000 E RHUMBLINE 00.3 00231 011.3 000.0 012.5 055.3 00001.89 00000.0 00000.0 00000 00000 0000.00.00 00:00
BISAN SE 34 20.873 N 133 47.007 E RHUMBLINE 00.3 00231 011.3 000.0 012.5 061.2 00004.74 00000.0 00000.0 00000 00000 0000.00.00 00:00
While reading this file:
data = read_csv("D:/waypoints/route/"+file[0],sep="\t", header=None, engine='python')
I got this error:
ParserError: Expected 12 fields in line 5, saw 20
I tried skipping the first four rows and that worked, but I don't want to take that approach; I don't want to skip any rows.
Can this whole file be read into a DataFrame, or into multiple DataFrames based on the number of columns?
Any help would be appreciated.
Here is the beginning of a solution:
df = pd.read_csv("file.csv", sep="\t", header=None, engine='python', names=['col' + str(x) for x in range(30)])
You have to use the names option with at least as many names as the widest row has fields, or you will still get the ParserError. I chose 30 columns, col0 through col29; to be safe you could use 100 or more.
Columns that are entirely NaN can be dropped afterwards, or you can chain the call onto the first command:
df = df.dropna(axis=1, how='all')
This is the only approach I see for reading a text file with a variable number of columns into a pandas DataFrame.
After that you can work on the DataFrame and select the rows you want.
Result:
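If you do want separate DataFrames per column count, as the question asks, one option is to split the lines yourself before handing them to pandas. A minimal sketch, assuming a tab-separated file like the one shown (the inline sample text here is a stand-in, not the actual route file):

```python
import pandas as pd

# Stand-in for the route file: two lines with 2 fields, two with 3.
text = "A\tB\n1\t2\nX\tY\tZ\nP\tQ\tR\n"

# Group rows by how many tab-separated fields they contain.
rows_by_width = {}
for line in text.splitlines():
    fields = line.split("\t")
    rows_by_width.setdefault(len(fields), []).append(fields)

# One DataFrame per distinct column count.
dfs = {n: pd.DataFrame(rows) for n, rows in rows_by_width.items()}
```

This keeps every row (nothing is skipped) and avoids the padded-NaN columns entirely, at the cost of doing the tokenizing outside read_csv.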
col0 col1 col2 col3 ... col17 col18 col19 col20
0 MAX_POWER SPEED ETDWPNO ETAWPNO ... NaN NaN None None
1 100 20.0 000 000 ... NaN NaN None None
2 ETD_YEAR ETD_MONTH ETD_DAY ETD_HOUR ... NaN NaN None None
3 2013 03 03 08 ... NaN NaN None None
4 NAME LAT LON LEG_TYPE ... NaN NaN None None
5 BERTH 34 28.343 N ... 0.0 0.0 0000.00.00 00:00
6 CHANNEL 34 28.005 N ... 0.0 0.0 0000.00.00 00:00
7 FAIRWAY 34 22.671 N ... 0.0 0.0 0000.00.00 00:00
8 HAKAMA S 34 21.016 N ... 0.0 0.0 0000.00.00 00:00
9 MU SHIMA 34 17.485 N ... 0.0 0.0 0000.00.00 00:00
10 BISAN SE 34 17.571 N ... 0.0 0.0 0000.00.00 00:00
11 BISAN SE 34 17.557 N ... 0.0 0.0 0000.00.00 00:00
12 BISAN SE 34 18.594 N ... 0.0 0.0 0000.00.00 00:00
13 BISAN SE 34 20.873 N ... 0.0 0.0 0000.00.00 00:00
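To pull just the waypoint table out of the padded DataFrame, one option is to locate the row whose first field is NAME and slice from there, promoting that row to column headers. A sketch, using a tiny stand-in frame that mirrors the layout of the result above:

```python
import pandas as pd

# Stand-in for the padded frame produced by read_csv with generous names.
df = pd.DataFrame([
    ["MAX_POWER", "SPEED", None],
    ["100", "20.0", None],
    ["NAME", "LAT", "LON"],
    ["BERTH", "34", "28.343"],
    ["CHANNEL", "34", "28.005"],
], columns=["col0", "col1", "col2"])

# Find the row that starts the waypoint section.
start = df.index[df["col0"] == "NAME"][0]

# Everything below it is waypoint data; the NAME row becomes the header.
waypoints = df.iloc[start + 1:].reset_index(drop=True)
waypoints.columns = df.iloc[start]
```

The same slicing idea works for the other header/value pairs at the top of the file, giving one small DataFrame per section.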