How to organize a Python GIS project with multiple analysis steps?

I just started to use ArcPy to analyse geo-data with ArcGIS. The analysis has different steps, which are to be executed one after the other.

Here is some pseudo-code:

import arcpy

# create a masking variable
mask1 = "mask.shp"    

# create a list of raster files
files_to_process = ["raster1.tif", "raster2.tif", "raster3.tif"]

# step 1 (e.g. clipping of each raster to study extent)
for index, item in enumerate(files_to_process):
    raster_i = "temp/ras_tem_" + str(index) + ".tif"
    arcpy.Clip_management(item, '#', raster_i, mask1)

# step 2 (e.g. change projection of raster files)
...

# step 3 (e.g. calculate some statistics for each raster)
...

etc.

This code works amazingly well so far. However, the raster files are big and some steps take quite long to execute (5-60 minutes). Therefore, I would like to execute those steps only if the input raster data changes. From the GIS-workflow point of view, this shouldn't be a problem, because each step saves a physical result on the hard disk, which is then used as input by the next step.

I guess if I want to temporarily disable e.g. step 1, I could simply put a # in front of every line of this step. However, in the real analysis, each step might have a lot of lines of code, and I would therefore prefer to outsource the code of each step into a separate file (e.g. "step1.py", "step2.py", ...), and then execute each file.

I experimented with execfile(step1.py), but received the error NameError: global name 'files_to_process' is not defined. It seems that the variables defined in the main script are not automatically passed to scripts called by execfile.

I also tried this, but I received the same error as above.

I'm a total Python newbie (as you might have figured out by the misuse of any Python-related expressions), and I would be very thankful for any advice on how to organize such a GIS project.

I think what you want to do is build each step into a function. These functions can be stored in the same script file or in their own module that gets loaded with the import statement (just like arcpy). The pseudo-code would be something like this:

# file 1: steps.py
def step1(input_files):
    # step 1 code goes here
    print('step 1 complete')

def step2(input_files):
    # step 2 code goes here
    output = input_files  # placeholder: replace with whatever step 2 produces
    print('step 2 complete')
    return output  # optionally return a derivative here

# ...and so on
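
As a rough sketch of what a filled-in step could look like, here is step 1 wrapped around the clipping loop from the question. The `mask`, `out_dir` and `clip` parameters are my own additions, not part of the original code. `clip` is a hypothetical hook that defaults to `arcpy.Clip_management`, so the function can be tried out without ArcGIS by passing in a stand-in. It returns the output paths so the next step can use them as its input.

```python
# steps.py (sketch): step 1 filled in with the clipping loop from the question
import os


def step1(input_files, mask, out_dir="temp", clip=None):
    """Clip every raster to the mask and return the list of output paths.

    `clip` is a hypothetical hook: any callable taking the same positional
    arguments as arcpy.Clip_management. It defaults to the real ArcPy tool.
    """
    if clip is None:
        import arcpy  # imported lazily so this module loads without ArcGIS
        clip = arcpy.Clip_management

    outputs = []
    for index, item in enumerate(input_files):
        raster_i = os.path.join(out_dir, "ras_tem_{}.tif".format(index))
        clip(item, "#", raster_i, mask)  # same argument order as in the question
        outputs.append(raster_i)
    print("step 1 complete")
    return outputs
```

With this shape, the main script can chain steps by passing the return value of one into the next, for example `clipped = steps.step1(files_to_process, "mask.shp")`.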

Then, in a second file in the same directory, you can import the module and call the functions, passing the rasters as your inputs.

#file 2: analyze.py
import steps
files_to_process = ["raster1.tif", "raster2.tif", "raster3.tif"]

steps.step1(files_to_process)

#steps.step2(files_to_process) # uncomment this when you're ready for step 2

Now you can selectively call different steps of your code, and it only requires commenting out one line instead of a whole chunk of code. Hopefully I understood your question correctly.
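
Since the question also asks to rerun a step only when its input rasters change, one possible extension is to compare file modification times instead of commenting lines in and out. This is only a sketch under assumptions: `needs_update` and `run_if_needed` are hypothetical helper names, the input list is assumed to be non-empty, and a newer timestamp is treated as "changed", which is a heuristic rather than a real content check.

```python
import os


def needs_update(inputs, outputs):
    """Return True if any output is missing or older than the newest input."""
    if not outputs or not all(os.path.exists(path) for path in outputs):
        return True
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return newest_input > oldest_output


def run_if_needed(step, inputs, outputs):
    """Call step(inputs) only when its outputs are stale.

    Returns True if the step ran and False if it was skipped.
    """
    if needs_update(inputs, outputs):
        step(inputs)
        return True
    name = getattr(step, "__name__", "step")
    print("skipping {}: outputs are up to date".format(name))
    return False
```

In analyze.py this could be used as `run_if_needed(steps.step1, files_to_process, expected_outputs)`, where `expected_outputs` is a list, which you would build yourself, of the files that step 1 writes.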
