
How to import netCDF4 file with xarray when index names have multiple dimensions?

When I try to import netCDF4 files using xarray, I get the following error:

MissingDimensionsError: 'name' has more than 1-dimension and the same name as one of its dimensions ('time', 'name'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

However, I can successfully import these data using the netCDF4 Python library and get the data I need from it. The problem is that this method is very slow, so I was looking for something faster and wanted to try xarray. Here is an example file, and the code that is giving me the error in question.

from netCDF4 import Dataset
#import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np         
#import seaborn as sns
from tkinter import Tk

from tkinter.filedialog import askdirectory
import os
import xarray as xr

#use this function to get a directory name where the files are
def get_dat():
    root = Tk()
    root.withdraw()
    root.focus_force()
    root.attributes("-topmost", True)      #makes the dialog appear on top
    filename = askdirectory()      # ask the user to pick a directory
    root.destroy()
    root.quit()
    return filename

directory=get_dat()

#loop through files in directory and read the netCDF4 files
for filename in os.listdir(directory):     #loop through files in user's dir
    if filename.endswith(".nc"):     #all my files are .nc not .nc4
        runstart = pd.Timestamp.now()     #pd.datetime was removed in pandas 2.0
        #I get the error right here
        rootgrp3 = xr.open_dataset(directory+'/'+filename)
        #more stuff happens here with the data, but this stuff works

The issue is still currently valid. The problem arises when a coordinate has multiple dimensions and has the same name as one of those dimensions.

As an example, the result.nc output files produced by the GOTM model have this problem for the coordinates z and zi:

dimensions:
    time = UNLIMITED ; // (4018 currently)
    lon = 1 ;
    lat = 1 ;
    z = 218 ;
    zi = 219 ;
variables:
    ... 
    float z(time, z, lat, lon) ;
    float zi(time, zi, lat, lon) ;
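The rule behind the error can be sketched in a few lines of plain Python: a variable is rejected when it has more than one dimension and shares its name with one of them. The helper name find_conflicting_variables and the hand-written gotm_header dict are illustrative assumptions, not part of xarray or GOTM; the "time" entry is added only as a contrasting 1-D coordinate.

```python
def find_conflicting_variables(variables):
    """Return the names of variables that trigger the error described above.

    `variables` maps each variable name to the tuple of its dimension names,
    for example as copied by hand from an ncdump-style header.
    """
    return [
        name
        for name, dims in variables.items()
        if len(dims) > 1 and name in dims
    ]


# Hypothetical hand-made description of the GOTM header shown above.
gotm_header = {
    "time": ("time",),                    # ordinary 1-D coordinate: allowed
    "z": ("time", "z", "lat", "lon"),     # multi-dimensional and named like a dimension
    "zi": ("time", "zi", "lat", "lon"),   # same problem
}

conflicts = find_conflicting_variables(gotm_header)
print(conflicts)  # ['z', 'zi']
```

Only z and zi are flagged, so these are the variables that need to be renamed before xarray can open the file.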

It has been proposed to add a 'rename_var' kwarg to xr.open_dataset() as a workaround, but it hasn't been implemented yet, to my knowledge.

The quick workaround I use is to call ncrename (part of NCO) from Python where needed.

In my case:

 os.system('ncrename -v z,z_coord -v zi,zi_coord result.nc resultxr.nc')

This allows

 r2 = xr.open_dataset(testdir+'resultxr.nc')

while

 r = xr.open_dataset(testdir+'result.nc')

was failing.
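As a possible variation on the os.system call, the same ncrename command can be built as an argument list and run through subprocess, which avoids shell quoting issues and raises an error if the command fails. This is only a sketch: the helper names build_ncrename_command and rename_variables are made up for illustration, and it assumes ncrename is installed and on PATH.

```python
import shutil
import subprocess


def build_ncrename_command(src, dst, renames):
    """Build the ncrename argument list for an {old_name: new_name} mapping."""
    cmd = ["ncrename"]
    for old, new in renames.items():
        cmd += ["-v", f"{old},{new}"]   # -v renames a variable, as in the command above
    return cmd + [src, dst]


def rename_variables(src, dst, renames):
    """Run ncrename; raises if NCO is not installed or the command fails."""
    if shutil.which("ncrename") is None:
        raise FileNotFoundError("ncrename (NCO) was not found on PATH")
    subprocess.run(build_ncrename_command(src, dst, renames), check=True)


# Same renames as in the os.system example above.
cmd = build_ncrename_command(
    "result.nc", "resultxr.nc", {"z": "z_coord", "zi": "zi_coord"}
)
print(" ".join(cmd))
```

Calling rename_variables("result.nc", "resultxr.nc", {"z": "z_coord", "zi": "zi_coord"}) would then write resultxr.nc, which can be opened with xr.open_dataset as shown above.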
