NGINX location block regex and proxy_pass

Question

I hope all of you are well.

I am a beginner with NGINX and I am trying to understand the following NGINX config file block. I would be really grateful if someone could help me understand it.

location ~ ^/search/google(/.*)?$ {
  set $proxy_uri $1$is_args$args;
  proxy_pass http://google.com$proxy_uri;
}

From the following SO answer ( https://stackoverflow.com/a/59846239 ), I understand that:

For the location ~ ^/search/google(/.*)?$
- ~ means that it will perform a regex search (case sensitive)
- ^/search/google means that the route should start with /search/google (e.g. http://<ip or domain>/search/google). Is there any difference if we have a trailing / at the end (e.g. http://<ip or domain>/search/google/ instead of http://<ip or domain>/search/google)?
- (/.*)?$ is the part that I'm a bit confused about.
  - Why use a () group in this case? What's the common use case for a group?
  - Why use ? in this case? Doesn't .* already match zero or more of any character, so why do we still need ? here?
  - Can we simply remove () and ?, as in /search/google/.*$, and get the same behavior as the original?
set $proxy_uri $1$is_args$args;
- I understand that we are setting a user-defined variable called proxy_uri
- What will $1 be replaced with? Sometimes people also include $2 and so on.
- I think $is_args$args means that if there's a query string (e.g. http://<ip or domain>/search/google?fruit=apple), $is_args$args will be replaced with ?fruit=apple
proxy_pass http://google.com$proxy_uri
- I would assume it just redirects the user to http://google.com$proxy_uri, the same as an HTTP 301 redirect?

Thank you very much in advance!

Answer 1

As a non-native English speaker, I thought someone would answer your question in better English than mine, but since no one has done so in the last five days, I will try to do it myself.

~ means that it will perform a regex search (case sensitive)

I think the more correct term is "perform matching against a regex pattern".

^/search/google means that the route should start with /search/google (e.g. http://<ip or domain>/search/google). Is there any difference if we have a trailing / at the end (e.g. http://<ip or domain>/search/google/ instead of http://<ip or domain>/search/google)?

This will be answered below.

Why use a () group in this case? What's the common use case for a group?

This is a numbered capturing group. The part of the string matched by this group can be referenced later as $1. A second numbered capture group, if present in the regex pattern, can be referenced as $2, and so on. There are also named capture groups, which let you use your own variable name instead of $1, $2, etc. A good example of using named capture groups is given in this ServerFault thread.
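As a rough illustration of the named form, here is a sketch of the block from the question rewritten with a named capture group. The variable name `suffix` is an assumption chosen for this example; it does not appear in the original config.

```nginx
# Hypothetical variant of the block from the question:
# the optional "/<any suffix>" part is captured as $suffix instead of $1.
location ~ ^/search/google(?<suffix>/.*)?$ {
  # $suffix is empty when the request path is exactly /search/google
  set $proxy_uri $suffix$is_args$args;
  proxy_pass http://google.com$proxy_uri;
}
```

A named capture tends to be easier to read than $1 once a pattern contains more than one group.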

By the way, the answer you are referencing mentions numbered capture groups (but not named capture groups).

Why use ? in this case? Doesn't .* already match zero or more of any character, so why do we still need ? here?

Did you notice that our capture group is (/.*), not (.*)? This way it will match /search/google/<any suffix> but not /search/googles etc. The question mark makes this capturing group optional, so /search/google will match our regex pattern too.

Can we simply remove () and ?, as in /search/google/.*$, and get the same behavior as the original?

No, as we need that $1 value later. If you have understood all of the above correctly, you should see that it can be either /<any suffix> or an empty string.
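To make the matching behaviour concrete, here is the block from the question annotated with a few sample request paths. The paths are made up for illustration, and the comments only describe what the pattern itself does with each of them.

```nginx
location ~ ^/search/google(/.*)?$ {
  # Request path            Match?   $1
  # /search/google          yes      (empty)
  # /search/google/         yes      /
  # /search/google/a/b      yes      /a/b
  # /search/googles         no
  # /search/google-maps     no
  set $proxy_uri $1$is_args$args;
  proxy_pass http://google.com$proxy_uri;
}

# Without the group and the question mark:
#   location ~ ^/search/google/.*$ { ... }
# /search/google (no trailing slash) would no longer match,
# and there would be no capture to use as $1.
```

This also covers the trailing slash question: both /search/google and /search/google/ match, and the only difference is whether $1 is empty or /.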

What will $1 be replaced with? Sometimes people also include $2 and so on.

Already answered above.

I think $is_args$args means that if there's a query string (e.g. http://<ip or domain>/search/google?fruit=apple), $is_args$args will be replaced with ?fruit=apple

Yes, exactly.
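As a sketch of how the pieces are assembled, the comments below trace the variables for two sample requests. The /maps suffix is a made-up example path, not something from the original config.

```nginx
location ~ ^/search/google(/.*)?$ {
  # Request: /search/google/maps?fruit=apple
  #   $1         = /maps
  #   $is_args   = ?
  #   $args      = fruit=apple
  #   $proxy_uri = /maps?fruit=apple
  #
  # Request: /search/google/maps  (no query string)
  #   $is_args and $args are both empty
  #   $proxy_uri = /maps
  set $proxy_uri $1$is_args$args;
  proxy_pass http://google.com$proxy_uri;
}
```

$is_args exists so that the ? separator is only added when there is actually a query string to append.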

I would assume it just redirects the user to http://google.com$proxy_uri, the same as an HTTP 301 redirect?

Totally wrong. With proxy_pass, nginx itself requests the upstream server and relays the response to the client, whereas a 301 redirect tells the client to make a new request to the other URL. The difference is briefly described here, although that answer doesn't mention that you can additionally modify the response before sending it to the client (for example, using the sub_filter module).
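A minimal side-by-side sketch of the two behaviours may help. The location paths /redirect-example and /proxy-example/ are hypothetical, and the sub_filter lines assume nginx was built with ngx_http_sub_module (--with-http_sub_module).

```nginx
# Redirect: nginx answers the client with a 301 status and a Location header.
# The client then makes a second request to google.com itself.
location /redirect-example {
  return 301 http://google.com$request_uri;
}

# Proxy: nginx requests google.com on the client's behalf and relays
# the response. The URL seen by the client does not change.
location /proxy-example/ {
  proxy_pass http://google.com/;

  # Optional: rewrite strings in the response body before it is sent
  # to the client. This only works on uncompressed responses.
  sub_filter 'http://google.com/' '/proxy-example/';
  sub_filter_once off;
}
```

In the redirect case the client ends up talking to google.com directly. In the proxy case it only ever talks to nginx, which is why nginx can modify the response.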

Answer 1 (accepted, 2021-10-12 14:07:11)