
Joining many tables in one CSV file that mix vertical and horizontal information

I need salinity/temperature data from the NDBC. I have managed to download it, but the CSV file contains a separate table for each buoy. Each table holds data for various depths (I only want the shallowest depth), and the lat/long coordinates and the time/day/month/year sit above it in a vertical layout.

Is there any way to create one table with the date/time, coordinates, temperature and salinity data of each buoy?

Code in R or any other language would be welcome (except VBA, as the file is too large for Excel!).

Thanks in advance!!

Below is a copy of the first two buoys:

Latitude                    ,,        25.5590,decimal degrees,,
Longitude                   ,,       -66.0020,decimal degrees,,
Year                        ,,           2003,,,
Month                       ,,             11,,,
Day                         ,,              3,,,
Time                        ,,           2.97,decimal hours (UT),,
VARIABLES ,Depth     ,F,O,Temperatur ,F,O,Salinity   ,F,O,Oxygen     ,F,O,,
UNITS     ,m         , , ,degrees C ,, , ,PSS       ,, , ,ml/l      ,, , ,,
Prof-Flag ,          ,0, ,          ,0, ,          ,0, ,          ,0, ,,
         1,      2.58,0,2,   27.4173,0,2,   36.5551,0,2,     4.577,0,2,
         2,     23.64,0,2,   27.4678,0,2,   36.6834,0,2,     4.581,0,2,


----------------------------------------------------------------------------

Latitude                    ,,        26.2110,decimal degrees,,
Longitude                   ,,       -66.0072,decimal degrees,,
Year                        ,,           2003,,,
Month                       ,,             11,,,
Day                         ,,              3,,,
Time                        ,,           10.0,decimal hours (UT),,
VARIABLES ,Depth     ,F,O,Temperatur ,F,O,Salinity   ,F,O,Oxygen     ,F,O,,
UNITS     ,m         , , ,degrees C ,, , ,PSS       ,, , ,ml/l      ,, , ,,
Prof-Flag ,          ,0, ,          ,0, ,          ,0, ,          ,0, ,,
         1,      3.18,0,2,   27.5938,0,2,   36.8218,0,2,     4.563,0,2,
         2,     25.33,0,2,   27.6006,0,2,   36.8357,0,2,     4.554,0,2,

Is there any way to create one table with the date/time, coordinates, temp and salinity data of each buoy?

In Python, there is almost always a way. I wouldn't really call your data CSV (although its elements are technically separated by commas). However, if the data are always laid out the same way, you can massage them into a form that is easier to work with.

Given the data for a single buoy as an example:

Latitude                    ,,        25.5590,decimal degrees,,
Longitude                   ,,       -66.0020,decimal degrees,,
Year                        ,,           2003,,,
Month                       ,,             11,,,
Day                         ,,              3,,,
Time                        ,,           2.97,decimal hours (UT),,
VARIABLES ,Depth     ,F,O,Temperatur ,F,O,Salinity   ,F,O,Oxygen     ,F,O,,
UNITS     ,m         , , ,degrees C ,, , ,PSS       ,, , ,ml/l      ,, , ,,
Prof-Flag ,          ,0, ,          ,0, ,          ,0, ,          ,0, ,,
         1,      2.58,0,2,   27.4173,0,2,   36.5551,0,2,     4.577,0,2,
         2,     23.64,0,2,   27.4678,0,2,   36.6834,0,2,     4.581,0,2,

With the text above stored in a string named data, we can use a series of steps (consolidated here into one list comprehension) to get it into an understandable format. The following line splits on each comma, strips whitespace and newline characters, and keeps anything that's not an empty string.

data = [t.strip().replace('\n', '') for t in data.split(',') if t.strip() != '']

This gives us a flat list from which you can pick the values you want.

['Latitude', '25.5590', 'decimal degrees', 'Longitude', '-66.0020', 'decimal degrees', 'Year', '2003', 'Month', '11', 'Day', '3', 'Time', '2.97', 'decimal hours (UT)', 'VARIABLES', 'Depth', 'F', 'O', 'Temperatur', 'F', 'O', 'Salinity', 'F', 'O', 'Oxygen', 'F', 'O', 'UNITS', 'm', 'degrees C', 'PSS', 'ml/l', 'Prof-Flag', '0', '0', '0', '0', '1', '2.58', '0', '2', '27.4173', '0', '2', '36.5551', '0', '2', '4.577', '0', '2', '2', '23.64', '0', '2', '27.4678', '0', '2', '36.6834', '0', '2', '4.581', '0', '2']

Then you can access each value by its index.

print("{0}: {1}".format(data[0], data[1]))
print("{0}: {1}".format(data[3], data[4]))

Output:

Latitude: 25.5590
Longitude: -66.0020
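Fixed indices are easy to get wrong, so here is a minimal sketch that looks values up by their label instead and picks the shallowest depth level from the same flat list. The function names are made up for illustration. It assumes that no cell in the data rows is blank, so each depth level contributes exactly 13 tokens after the four Prof-Flag values.

```python
def header_value(tokens, label):
    """Return the number that follows a header label such as 'Latitude'."""
    return float(tokens[tokens.index(label) + 1])


def shallowest_level(tokens, row_width=13, flag_count=4):
    """Return (depth, temperature, salinity) for the smallest depth.

    Assumption: every data row has exactly row_width non-empty tokens
    and the Prof-Flag line carries flag_count flags.
    """
    start = tokens.index('Prof-Flag') + 1 + flag_count
    rows = [tokens[i:i + row_width] for i in range(start, len(tokens), row_width)]
    # Row layout: level, depth, F, O, temperature, F, O, salinity, F, O, oxygen, F, O
    levels = [(float(r[1]), float(r[4]), float(r[7])) for r in rows if len(r) == row_width]
    return min(levels)  # tuples compare on depth first


# Token list for the first buoy, as produced by the list comprehension.
tokens = ['Latitude', '25.5590', 'decimal degrees',
          'Longitude', '-66.0020', 'decimal degrees',
          'Year', '2003', 'Month', '11', 'Day', '3',
          'Time', '2.97', 'decimal hours (UT)',
          'VARIABLES', 'Depth', 'F', 'O', 'Temperatur', 'F', 'O',
          'Salinity', 'F', 'O', 'Oxygen', 'F', 'O',
          'UNITS', 'm', 'degrees C', 'PSS', 'ml/l',
          'Prof-Flag', '0', '0', '0', '0',
          '1', '2.58', '0', '2', '27.4173', '0', '2', '36.5551', '0', '2', '4.577', '0', '2',
          '2', '23.64', '0', '2', '27.4678', '0', '2', '36.6834', '0', '2', '4.581', '0', '2']

print(header_value(tokens, 'Latitude'), header_value(tokens, 'Longitude'))
print(shallowest_level(tokens))
```

Taking the minimum depth rather than the first row avoids relying on the levels being sorted.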

This example only works if the data are laid out identically every time. Because empty fields are dropped, a blank value anywhere in a record would shift every index after it.

If you like the result, apply something like the above to each buoy, using the horizontal line to delimit the buoy records.
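Since the file is large, one option is to read it line by line instead of loading it whole. The sketch below is only one possible way to do that: it keeps the comma positions intact (so blank cells do not shift anything), treats a line of dashes as the end of a buoy record, and writes one row per buoy with the shallowest depth level. The function names, column names and file paths are assumptions for illustration, and it assumes the real file follows the layout of the sample in the question.

```python
import csv

HEADER_LABELS = ('Latitude', 'Longitude', 'Year', 'Month', 'Day', 'Time')
COLUMNS = HEADER_LABELS + ('Depth', 'Temperature', 'Salinity')


def iter_shallowest(lines):
    """Yield one dict per buoy record, keeping only its shallowest level."""
    header, best = {}, None
    for cells in csv.reader(lines):
        cells = [c.strip() for c in cells]
        label = cells[0] if cells else ''
        if label.startswith('-----'):
            # Dashed line: the current record is complete.
            if header and best:
                yield {**header, **best}
            header, best = {}, None
        elif label in HEADER_LABELS:
            # Header lines look like "Latitude ,, 25.5590,decimal degrees,,"
            header[label] = cells[2]
        elif label.isdigit():
            # Numbered depth level: level, depth, F, O, temperature, F, O, salinity, ...
            depth = float(cells[1])
            if best is None or depth < float(best['Depth']):
                best = {'Depth': cells[1], 'Temperature': cells[4], 'Salinity': cells[7]}
    # The last record has no dashed line after it.
    if header and best:
        yield {**header, **best}


def write_table(in_path, out_path):
    """Stream in_path and write the combined table to out_path."""
    with open(in_path, newline='') as src, open(out_path, 'w', newline='') as dst:
        writer = csv.DictWriter(dst, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(iter_shallowest(src))
```

A call such as write_table('buoys.csv', 'shallowest.csv') (both file names are placeholders) would then produce a regular table that R, pandas or a spreadsheet can open directly.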

===========================

UPDATE: a slight adjustment to the list comprehension gives cleaner output, and the full script below builds a dictionary with all buoys.

final_data = {}
buoy_number = 1

data = """Latitude                    ,,        25.5590,decimal degrees,,
Longitude                   ,,       -66.0020,decimal degrees,,
Year                        ,,           2003,,,
Month                       ,,             11,,,
Day                         ,,              3,,,
Time                        ,,           2.97,decimal hours (UT),,
VARIABLES ,Depth     ,F,O,Temperatur ,F,O,Salinity   ,F,O,Oxygen     ,F,O,,
UNITS     ,m         , , ,degrees C ,, , ,PSS       ,, , ,ml/l      ,, , ,,
Prof-Flag ,          ,0, ,          ,0, ,          ,0, ,          ,0, ,,
         1,      2.58,0,2,   27.4173,0,2,   36.5551,0,2,     4.577,0,2,
         2,     23.64,0,2,   27.4678,0,2,   36.6834,0,2,     4.581,0,2,


----------------------------------------------------------------------------

Latitude                    ,,        26.2110,decimal degrees,,
Longitude                   ,,       -66.0072,decimal degrees,,
Year                        ,,           2003,,,
Month                       ,,             11,,,
Day                         ,,              3,,,
Time                        ,,           10.0,decimal hours (UT),,
VARIABLES ,Depth     ,F,O,Temperatur ,F,O,Salinity   ,F,O,Oxygen     ,F,O,,
UNITS     ,m         , , ,degrees C ,, , ,PSS       ,, , ,ml/l      ,, , ,,
Prof-Flag ,          ,0, ,          ,0, ,          ,0, ,          ,0, ,,
         1,      3.18,0,2,   27.5938,0,2,   36.8218,0,2,     4.563,0,2,
         2,     25.33,0,2,   27.6006,0,2,   36.8357,0,2,     4.554,0,2,"""

raw = [t for t in data.split('----------------------------------------------------------------------------')]

for thing in raw:
    final_data[u'buoy_{0}'.format(buoy_number)] = [t.strip().replace('\n', '') for t in thing.split(',') if t.strip() != '']
    buoy_number += 1

for buoy in final_data:
    print(final_data[buoy])

Output:

['Latitude', '25.5590', 'decimal degrees', 'Longitude', '-66.0020', 'decimal degrees', 'Year', '2003', 'Month', '11', 'Day', '3', 'Time', '2.97', 'decimal hours (UT)', 'VARIABLES', 'Depth', 'F', 'O', 'Temperatur', 'F', 'O', 'Salinity', 'F', 'O', 'Oxygen', 'F', 'O', 'UNITS', 'm', 'degrees C', 'PSS', 'ml/l', 'Prof-Flag', '0', '0', '0', '0', '1', '2.58', '0', '2', '27.4173', '0', '2', '36.5551', '0', '2', '4.577', '0', '2', '2', '23.64', '0', '2', '27.4678', '0', '2', '36.6834', '0', '2', '4.581', '0', '2']
['Latitude', '26.2110', 'decimal degrees', 'Longitude', '-66.0072', 'decimal degrees', 'Year', '2003', 'Month', '11', 'Day', '3', 'Time', '10.0', 'decimal hours (UT)', 'VARIABLES', 'Depth', 'F', 'O', 'Temperatur', 'F', 'O', 'Salinity', 'F', 'O', 'Oxygen', 'F', 'O', 'UNITS', 'm', 'degrees C', 'PSS', 'ml/l', 'Prof-Flag', '0', '0', '0', '0', '1', '3.18', '0', '2', '27.5938', '0', '2', '36.8218', '0', '2', '4.563', '0', '2', '2', '25.33', '0', '2', '27.6006', '0', '2', '36.8357', '0', '2', '4.554', '0', '2']
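The question also asks for a date/time column. As a rough sketch, the Year, Month, Day and decimal-hour Time values in each list can be combined into a single datetime. The helper names below are hypothetical, and the code assumes all four fields are present and valid in every record.

```python
from datetime import datetime, timedelta


def profile_timestamp(year, month, day, decimal_hours):
    """Combine the date fields and decimal hours (UT) into one datetime."""
    midnight = datetime(int(year), int(month), int(day))
    return midnight + timedelta(hours=float(decimal_hours))


def timestamp_from_tokens(tokens):
    """Read the date fields from a flat token list such as final_data['buoy_1']."""
    def value(label):
        return tokens[tokens.index(label) + 1]
    return profile_timestamp(value('Year'), value('Month'), value('Day'), value('Time'))


# Header part of the first buoy's token list.
tokens = ['Latitude', '25.5590', 'decimal degrees',
          'Longitude', '-66.0020', 'decimal degrees',
          'Year', '2003', 'Month', '11', 'Day', '3',
          'Time', '2.97', 'decimal hours (UT)']

print(timestamp_from_tokens(tokens).isoformat())
```

The result is a naive datetime. The source labels the time as UT, so attach a UTC timezone if downstream code needs one.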
