
How to transform a pandas DataFrame with irregular coordinates into an xarray Dataset

I'm working with a pandas DataFrame in Python, but to plot my data as a map I have to transform it into an xarray Dataset, since the library I'm using to plot (salem) works best with that class. The problem is that the grid of my data isn't regular, so I haven't been able to create the Dataset.

My DataFrame has the latitude and longitude, as well as the value at each point:

              lon        lat      value
0     -104.936302 -51.339233   7.908411
1     -104.827377 -51.127686   7.969049
2     -104.719154 -50.915470   8.036676
3     -104.611641 -50.702595   8.096765
4     -104.504814 -50.489056   8.163690
...           ...        ...        ...
65995  -32.911377  15.359591  25.475702
65996  -32.957718  15.579139  25.443994
65997  -33.004040  15.798100  25.429346
65998  -33.050335  16.016472  25.408105
65999  -33.096611  16.234255  25.383844

[66000 rows x 3 columns]

To create the Dataset using lat and lon as coordinates and fill all of the missing values with NaN, I tried the following:

ds = xr.Dataset({
    'ts': xr.DataArray(
                data   = value,   # enter data here
                dims   = ['lon','lat'],
                coords = {'lon': lon, 'lat':lat},
                attrs  = {
                    '_FillValue': np.nan,
                    'units'     : 'K'
                    }
                )},
        attrs = {'attr': 'RegCM output'}
    )
ds

But I got the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [41], in <cell line: 1>()
      1 ds = xr.Dataset({
----> 2     'ts': xr.DataArray(
      3                 data   = value,   # enter data here
      4                 dims   = ['lon','lat'],
      5                 coords = {'lon': lon, 'lat':lat},
      6                 attrs  = {
      7                     '_FillValue': np.nan,
      8                     'units'     : 'K'
      9                     }
     10                 )},
     11         attrs = {'example_attr': 'this is a global attribute'}
     12     )
     14 # ds = xr.Dataset(
     15 #     data_vars=dict(
     16 #         variable=(["lon", "lat"], value)
   (...)
     25 #                      }
     26 # )
     27 ds

File ~\anaconda3\lib\site-packages\xarray\core\dataarray.py:406, in DataArray.__init__(self, data, coords, dims, name, attrs, indexes, fastpath)
    404 data = _check_data_shape(data, coords, dims)
    405 data = as_compatible_data(data)
--> 406 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    407 variable = Variable(dims, data, attrs, fastpath=True)
    408 indexes = dict(
    409     _extract_indexes_from_coords(coords)
    410 )  # needed for to_dataset

File ~\anaconda3\lib\site-packages\xarray\core\dataarray.py:123, in _infer_coords_and_dims(shape, coords, dims)
    121     dims = tuple(dims)
    122 elif len(dims) != len(shape):
--> 123     raise ValueError(
    124         "different number of dimensions on data "
    125         f"and dims: {len(shape)} vs {len(dims)}"
    126     )
    127 else:
    128     for d in dims:

ValueError: different number of dimensions on data and dims: 1 vs 2

I would really appreciate any insights to solve this.

If you really need a rectangularly gridded dataset, you have to resample your data onto a regular grid (rasterio, pyresample, etc. provide useful functionality for that). However, if you just want to plot the data, this is not necessary!
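
If you do want to go the resampling route, here is a minimal sketch. It uses scipy.interpolate.griddata instead of the libraries named above, purely to keep the example short. The synthetic DataFrame, the 0.5 degree spacing and the choice of linear interpolation are all assumptions for illustration, not something taken from the question's data:

```python
import numpy as np
import pandas as pd
import xarray as xr
from scipy.interpolate import griddata

# Synthetic stand-in for the question's DataFrame (scattered lon/lat points).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "lon": rng.uniform(-105.0, -32.0, 2000),
    "lat": rng.uniform(-52.0, 17.0, 2000),
})
df["value"] = 280.0 + 0.1 * df["lat"] + 0.05 * df["lon"]

# Regular target grid; the 0.5 degree spacing is an arbitrary choice.
lon_reg = np.arange(df["lon"].min(), df["lon"].max(), 0.5)
lat_reg = np.arange(df["lat"].min(), df["lat"].max(), 0.5)
lon2d, lat2d = np.meshgrid(lon_reg, lat_reg)  # both shaped (lat, lon)

# Linear interpolation onto the grid; cells outside the convex hull
# of the input points are filled with NaN (griddata's default fill_value).
gridded = griddata(
    points=(df["lon"].to_numpy(), df["lat"].to_numpy()),
    values=df["value"].to_numpy(),
    xi=(lon2d, lat2d),
    method="linear",
)

# The data is now 2-D, so it can carry the two dimensions lat and lon.
ds = xr.Dataset(
    {"ts": (("lat", "lon"), gridded, {"units": "K"})},
    coords={"lon": lon_reg, "lat": lat_reg},
    attrs={"attr": "RegCM output"},
)
```

Note that this interpolates the values, so the result is an approximation of the original model output rather than a relabelling of it.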

I'm not sure about salem (I've never used it), but I've tried my best to simplify plotting of irregularly sampled data in the visualization library I'm developing, EOmaps!

You could get a "contour-plot"-like appearance if you use a Delaunay triangulation to visualize the data:

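import pandas as pd
from eomaps import Maps

# load the DataFrame with the columns "lon", "lat" and "value"
df = pd.read_csv("... path-to df.csv ...", index_col=0)

m = Maps()
m.add_feature.preset.coastline()
# crs=4326: the coordinates are plain lon/lat values
m.set_data(df, x="lon", y="lat", crs=4326, parameter="value")
# connect the irregular points with triangles instead of regridding
m.set_shape.delaunay_triangulation()
m.plot_map()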
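
As for the ValueError in the question: the value column is a 1-D array (one entry per point), so xarray cannot label it with the two dimensions lon and lat. If you only need the data inside an xarray object without regridding, one option is to keep a single dimension and attach lon and lat as coordinates along it. Below is a minimal sketch of that idea; the four sample points and the dimension name "points" are made up for illustration, and I have not checked whether salem can plot a Dataset laid out like this:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for the question's columns: one value per point.
lon = np.array([-104.94, -104.83, -104.72, -104.61])
lat = np.array([-51.34, -51.13, -50.92, -50.70])
value = np.array([7.91, 7.97, 8.04, 8.10])

# Reproduce the error: 1-D data cannot be given two dimension names.
try:
    xr.DataArray(data=value, dims=["lon", "lat"], coords={"lon": lon, "lat": lat})
    error_message = None
except ValueError as err:
    error_message = str(err)

# Keep one dimension ("points" is an arbitrary name) and store
# lon and lat as coordinates that vary along that dimension.
ds = xr.Dataset(
    {"ts": ("points", value, {"units": "K"})},
    coords={"lon": ("points", lon), "lat": ("points", lat)},
    attrs={"attr": "RegCM output"},
)
```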

