
Natural sorting with R differs on deployment (maybe OS/Locale issue)

I am using the package "naturalsort", found here: https://github.com/kos59125/naturalsort. As far as I know, natural sorting is not implemented well anywhere else in R, so I was happy to find this package.

I use the function naturalsort to sort file names the way Windows Explorer does, which works great locally.

But when I use it in my production environment, deployed with Docker on Google Cloud Run, the sort order changes. I don't know whether this is due to a change in locale (I am from Denmark) or to OS differences between my Windows PC and the Docker/Google Cloud Run deployment.

I have created an example ready to be run in R:

######## Code start ###########
require(plumber)
require(naturalsort) #for name sorting

#* Retrieve sorted string list
#* @get /sortstrings
#* @param nothing
function(nothing) {
  
  print(nothing)
  
  test <- c("0.jpg", "file (4_5_1).jpeg", "1 tall thin image.jpeg",
            "8.jpeg", "8.jpg", "file (2.1.2).jpeg", "file (0).jpeg", "3.jpeg",
            "file (1).jpeg", "file (2.1.1).jpeg", "file (0) (3).jpeg", "file (2).jpeg",
            "file (2.1).jpeg", "file (4_5).jpeg", "file (4).jpeg", "file (39).jpeg")
  
  print("Direct sort")
  print(naturalsort(text = test))
  
  sorted_strings <- naturalsort(text = test)
  
  return(sorted_strings) 
}
######## Code end ###########

I would expect it to sort the file names as shown below, which it does locally, both when run directly in a script and when called through the plumber API:

c("0.jpg", 
  "1 tall thin image.jpeg", 
  "3.jpeg", 
  "8.jpeg", 
  "8.jpg", 
  "file (0) (3).jpeg", 
  "file (0).jpeg", 
  "file (1).jpeg", 
  "file (2).jpeg", 
  "file (2.1).jpeg", 
  "file (2.1.1).jpeg", 
  "file (2.1.2).jpeg", 
  "file (4).jpeg", 
  "file (4_5).jpeg", 
  "file (4_5_1).jpeg", 
  "file (39).jpeg"
  )

But instead, in production, it sorts them like this:

c("0.jpg",
"1 tall thin image.jpeg",
"3.jpeg",
"8.jpeg",
"8.jpg",
"file (0) (3).jpeg",
"file (0).jpeg",
"file (1).jpeg",
"file (2.1.1).jpeg",
"file (2.1.2).jpeg",
"file (2.1).jpeg",
"file (2).jpeg",
"file (4_5_1).jpeg",
"file (4_5).jpeg",
"file (4).jpeg",
"file (39).jpeg")

This is not how Windows Explorer sorts them.

Try fixing the collating sequence prior to the naturalsort call. It varies by locale and can affect how strings are compared (and therefore sorted).

## Get initial value
lcc <- Sys.getlocale("LC_COLLATE")

## Use fixed value
Sys.setlocale("LC_COLLATE", "C")

sorted_strings <- naturalsort(text = test)

## Restore initial value
Sys.setlocale("LC_COLLATE", lcc)
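
If the sorting code can fail partway through, the restore step above would be skipped. As a minimal sketch, the same idea can be wrapped in a small helper that restores the collation with on.exit. Note that with_c_collate is a hypothetical name introduced here for illustration; it is not part of naturalsort or base R. The example uses base sort() only, so it runs without extra packages:

```r
# Hypothetical helper (not from naturalsort or base R): evaluate `expr`
# with LC_COLLATE temporarily set to "C", then restore the old value.
with_c_collate <- function(expr) {
  old <- Sys.getlocale("LC_COLLATE")
  # on.exit runs even if `expr` throws an error, so the session's
  # collation is always put back.
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  # `expr` is a lazily evaluated argument, so it is only computed here,
  # after the locale has been switched.
  force(expr)
}

# Usage: in the C locale, strings are compared byte by byte,
# and ")" sorts before "." in that ordering.
with_c_collate(sort(c("file (2.1).jpeg", "file (2).jpeg")))
```

In the plumber endpoint, the call would presumably become with_c_collate(naturalsort(text = test)), assuming naturalsort relies on the session's collation as this answer suggests.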

You can find some details in ?sort, ?Comparison, and ?locales.
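
To check whether collation really is what differs between the two environments, a quick diagnostic along these lines could be run both locally and in the container. This is only a sketch using base R: it compares the locale-aware sort with sort(method = "radix"), which, per ?sort, does not respect the locale for character vectors. The sample file names are taken from the question; the variable names are made up for illustration.

```r
# A few of the file names from the question whose order changed
x <- c("file (2.1).jpeg", "file (2).jpeg", "file (4_5).jpeg", "file (4).jpeg")

# Which collation is this R session using?
cat("LC_COLLATE:", Sys.getlocale("LC_COLLATE"), "\n")

# Default sort for character vectors uses the session's collation
locale_order <- sort(x)

# Radix sort ignores the locale and orders strings as in the C locale
byte_order <- sort(x, method = "radix")

print(locale_order)
print(byte_order)

# TRUE means the session's collation changes the order of these names
collation_matters <- !identical(locale_order, byte_order)
print(collation_matters)
```

If the two environments report different LC_COLLATE values, or collation_matters differs between them, that points to the locale rather than the package itself.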
