How to organize a Python GIS-project with multiple analysis steps?

I just started to use ArcPy to analyse geo-data with ArcGIS. The analysis has several steps, which are to be executed one after the other.

Here is some pseudo-code:

import arcpy

# create a masking variable
mask1 = "mask.shp"

# create a list of raster files
files_to_process = ["raster1.tif", "raster2.tif", "raster3.tif"]

# step 1 (e.g. clipping of each raster to study extent)
for index, item in enumerate(files_to_process):
    raster_i = "temp/ras_tem_" + str(index) + ".tif"
    arcpy.Clip_management(item, '#', raster_i, mask1)

# step 2 (e.g. change projection of raster files)

# step 3 (e.g. calculate some statistics for each raster)


This code works amazingly well so far. However, the raster files are big and some steps take quite long to execute (5-60 minutes). Therefore, I would like to execute those steps only if the input raster data changes. From the GIS-workflow point of view, this shouldn't be a problem, because each step saves a physical result on the hard disk, which is then used as input by the next step.
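One possible way to run a step only when its input has changed is to compare file modification times, as a build tool would. The sketch below is an illustration under assumptions, not part of the original post: the helper names `is_up_to_date` and `run_if_needed` are hypothetical, and modification time is only a heuristic for "the data changed".

```python
import os


def is_up_to_date(source_path, result_path):
    """Return True if result_path exists and is at least as new as source_path."""
    if not os.path.exists(result_path):
        return False
    return os.path.getmtime(result_path) >= os.path.getmtime(source_path)


def run_if_needed(source_path, result_path, step_function):
    """Call step_function(source_path, result_path) only when the result is stale.

    Returns True if the step was run and False if it was skipped.
    """
    if is_up_to_date(source_path, result_path):
        print("skipping %s (up to date)" % result_path)
        return False
    step_function(source_path, result_path)
    return True
```

Here `step_function` would be a small wrapper of your own around the actual geoprocessing call, for example a function that takes the input and output paths and calls `arcpy.Clip_management` with them.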

I guess if I want to temporarily disable, e.g., step 1, I could simply put a # in front of every line of this step. However, in the real analysis, each step might have many lines of code, so I would prefer to move the code of each step into a separate file (e.g. "step1.py", "step2.py", ...) and then execute each file.

I experimented with execfile(step1.py), but received the error NameError: global name 'files_to_process' is not defined. It seems that the variables defined in the main script are not automatically passed to scripts called by execfile.
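If the steps do stay in separate script files, one way to make the shared variables explicit is the standard library's `runpy.run_path`, which accepts an `init_globals` dictionary. This is only a sketch: the helper name `run_step` is made up for illustration, and it assumes the step script reads `files_to_process` as a global name.

```python
import runpy


def run_step(script_path, **shared):
    """Execute a script file with the given names pre-defined as globals.

    Returns the script's global namespace as a dict, so values the script
    created can be read back by the caller.
    """
    return runpy.run_path(script_path, init_globals=shared)
```

A call such as `run_step("step1.py", files_to_process=files_to_process)` would then hand the list to the script explicitly instead of relying on the caller's scope.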

I also tried this, but I received the same error as above.

I'm a total Python newbie (as you might have figured out from my misuse of Python-related expressions), and I would be very thankful for any advice on how to organize such a GIS project.

I think what you want to do is build each step into a function. These functions can be stored in the same script file or in their own module that gets loaded with the import statement (just like arcpy). The pseudo-code would be something like this:

# file 1: steps.py
def step1(input_files):
    # step 1 code goes here
    print('step 1 complete')

def step2(input_files):
    # step 2 code goes here
    output = None  # placeholder for whatever step 2 produces
    print('step 2 complete')
    return output  # optionally return a derivative here

# ...and so on

Then, in a second file in the same directory, you can import the module and call the functions, passing the rasters as inputs.

# file 2: analyze.py
import steps
files_to_process = ["raster1.tif", "raster2.tif", "raster3.tif"]

steps.step1(files_to_process)
#steps.step2(files_to_process) # uncomment this when you're ready for step 2

Now you can selectively call different steps of your code, and disabling a step only requires commenting out one line instead of a whole chunk of code. Hopefully I understood your question correctly.
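As a variation on commenting lines in and out, the steps to run could be chosen by name. The following is a minimal sketch, not something from the original answer: `PIPELINE` and `run_pipeline` are hypothetical names, and the two step functions are stand-ins defined inline so the example runs on its own (in practice they would be imported from steps.py).

```python
def step1(input_files):
    # stand-in for the real step 1
    print('step 1 complete')


def step2(input_files):
    # stand-in for the real step 2
    print('step 2 complete')


# Ordered (name, function) pairs describing the whole workflow.
PIPELINE = [("step1", step1), ("step2", step2)]


def run_pipeline(input_files, enabled):
    """Run only the steps whose names are in `enabled`, in pipeline order.

    Returns the list of step names that were executed.
    """
    executed = []
    for name, func in PIPELINE:
        if name in enabled:
            func(input_files)
            executed.append(name)
    return executed
```

With this layout, `run_pipeline(files_to_process, ["step2"])` would skip step 1 and run only step 2, and the set of enabled steps lives in one place.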
