
glob.glob sorting - not as expected

I'm reading in some files from a directory using glob.glob. The files are named like this: 1.bmp

The file names continue in this pattern: 1.bmp, 2.bmp, 3.bmp ... and so on.

This is the code that I currently have. It technically does sort, but not in the order I expected:

files = sorted(glob.glob('../../Documents/ImageAnalysis.nosync/sliceImage/*.bmp'))

This method sorts as follows:

../../Documents/ImageAnalysis.nosync/sliceImage/84.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/85.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/86.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/87.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/88.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/89.bmp

../../Documents/ImageAnalysis.nosync/sliceImage/9.bmp

../../Documents/ImageAnalysis.nosync/sliceImage/90.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/91.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/92.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/93.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/94.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/95.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/96.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/97.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/98.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/99.bmp

In the output above I have highlighted the problem. The file names sort well in places (for example, 90.bmp through 99.bmp are completely fine), but between 89.bmp and 90.bmp there is the file 9.bmp. It obviously shouldn't be there; it should be near the start.

The output that I'm expecting looks like this:

1.bmp
2.bmp
3.bmp
4.bmp
5.bmp
6.bmp
...
10.bmp
11.bmp
12.bmp
13.bmp
...

and so on until the end of the files.

Is this possible to do with glob?

That is because the files are sorted by their names, which are strings, and strings are sorted in lexicographic order. Check [Python.Docs]: Sorting HOW TO for more sorting related details.
For things to work as you'd expect, the "faulty" file 9.bmp should be named 09.bmp (this applies to all such single-digit files). If you had more than 100 files, things would be even clearer (and the desired file names would be 009.bmp, 035.bmp).
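To make the lexicographic behaviour concrete, here is a minimal sketch. The names in the list are hypothetical stand-ins for the real directory listing, and the padding width of 3 is an arbitrary choice for illustration.

```python
# Hypothetical sample names standing in for the real directory listing.
names = ["1.bmp", "9.bmp", "10.bmp", "89.bmp", "90.bmp"]

# A plain string sort compares character by character,
# so "9.bmp" lands after "89.bmp" and before "90.bmp".
lexicographic = sorted(names)

# Zero-pad the numeric part to a fixed width (3 is an assumption here);
# with equal-length numbers, string order and numeric order agree.
padded = sorted(name.split(".")[0].zfill(3) + ".bmp" for name in names)

print(lexicographic)
print(padded)
```

The first list reproduces the ordering from the question; the second shows why renaming the files with leading zeros would make a plain sorted call sufficient.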

Anyway, there is an alternative, provided that all of the files follow the naming pattern: convert each file's base name (without extension, check [Python.Docs]: os.path - Common pathname manipulations) to an int, and sort based on that by passing key to [Python.Docs]: sorted(iterable, *, key=None, reverse=False):

files = sorted(glob.glob("../../Documents/ImageAnalysis.nosync/sliceImage/*.bmp"), key=lambda x: int(os.path.splitext(os.path.basename(x))[0]))
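The same key can be tried without touching the file system. The sketch below is an illustration only: the helper name numeric_stem and the sample paths are hypothetical, and the list stands in for whatever glob.glob would return.

```python
import os


def numeric_stem(path):
    """Return the base name without its extension as an int, e.g. 'dir/9.bmp' -> 9."""
    return int(os.path.splitext(os.path.basename(path))[0])


# Hypothetical paths standing in for the output of glob.glob.
paths = [
    "sliceImage/89.bmp",
    "sliceImage/9.bmp",
    "sliceImage/90.bmp",
    "sliceImage/10.bmp",
    "sliceImage/1.bmp",
]

files = sorted(paths, key=numeric_stem)
print(files)
```

A name whose stem is not a number (say, notes.bmp) would make int raise ValueError, which is why this approach depends on every file following the naming pattern.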

Not with glob.glob. It returns a list that is either unsorted or ordered according to the rules of the underlying system.

What you need to do is pass a suitable key function to sorted that defines the ordering you want, instead of comparing the paths as plain text strings. Something like (untested code):

import glob
import os

def mysorter(x):
    # Split into directory, base name and extension (ext keeps its leading dot)
    path, fn = os.path.split(x)
    fn, ext = os.path.splitext(fn)
    if fn.isdigit():
        fnn = int(fn)
        fn = f'{fnn:08}'  # left pad with zeros
    return f'{path}/{fn}{ext}'

Then

results = sorted(glob.glob(...), key=mysorter)
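The zero-padding in mysorter only applies when the whole base name is digits. If some names mix text and numbers, a commonly used alternative is a "natural sort" key that compares each run of digits as an integer. This is not taken from either answer; the function name natural_key and the sample paths are hypothetical, so treat it as a sketch.

```python
import re


def natural_key(path):
    """Split the path into alternating text and digit runs; digit runs become ints."""
    # re.split with a capturing group returns text at even indices
    # and the captured digit runs at odd indices.
    parts = re.split(r"(\d+)", path)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


# Hypothetical paths mixing purely numeric and prefixed names.
paths = [
    "img/slice10.bmp",
    "img/slice9.bmp",
    "img/9.bmp",
    "img/10.bmp",
    "img/slice1.bmp",
]

results = sorted(paths, key=natural_key)
print(results)
```

Because text and numbers always sit at alternating positions in the key, two keys never compare a string against an int. Note that digits in directory names are treated numerically as well.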
