How to stop Varnish from caching Sitemap?

I'm running a WordPress blog on Nginx and Varnish. I'm using the following configuration for Varnish:

# This is a basic VCL configuration file for varnish.  See the vcl(7)
# man page for details on VCL syntax and semantics.
# 
# Default backend definition.  Set this to point to your content
# server.
# 
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 600s;
    .first_byte_timeout = 600s;
    .between_bytes_timeout = 600s;
    .max_connections = 800;
}


acl purge {
        "localhost";
}

sub vcl_recv {
    set req.grace = 2m;

  # Set X-Forwarded-For header for logging in nginx
  remove req.http.X-Forwarded-For;
  set    req.http.X-Forwarded-For = client.ip;


  # Remove has_js and CloudFlare/Google Analytics __* cookies.
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js)=[^;]*", "");
  # Remove a ";" prefix, if present.
  set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");



# Either the admin pages or the login
if (req.url ~ "/wp-(login|admin|cron)") {
        # Don't cache, pass to backend
        return (pass);
}


# Remove the wp-settings-1 cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");

# Remove the wp-settings-time-1 cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");

# Remove the wp test cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");

# Static content unique to the theme can be cached (so no user uploaded images)
# The reason I don't take the wp-content/uploads is because of cache size on bigger blogs
# that would fill up with all those files getting pushed into cache
if (req.url ~ "wp-content/themes/" && req.url ~ "\.(css|js|png|gif|jp(e)?g)") {
    unset req.http.cookie;
}

# Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
if (req.url ~ "/wp-content/uploads/") {
    return (pass);
}

# Check the cookies for wordpress-specific items
if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
        # A wordpress specific cookie has been set
    return (pass);
}



    # allow PURGE from localhost
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }


    # Force lookup if the request is a no-cache request from the client
    if (req.http.Cache-Control ~ "no-cache") {
        return (pass);
    }


# Try a cache-lookup
return (lookup);

}

sub vcl_fetch {
    #set obj.grace = 5m;
    set beresp.grace = 2m;

}

sub vcl_hit {
        if (req.request == "PURGE") {
                purge;
                error 200 "Purged.";
        }
}

sub vcl_miss {
        if (req.request == "PURGE") {
                purge;
                error 200 "Purged.";
        }
}

I've followed the tutorial mentioned here.

Everything works fine, but I'm using the Yoast SEO plugin to regenerate the sitemap dynamically after every new post. It generates a sitemap index named sitemap_index.xml that links to the other sitemaps (for posts, pages, authors, etc.). This is also working fine.

  1. The problem is: how can I prevent Varnish from caching my sitemaps?
  2. How can I prevent Varnish from interfering with Google Analytics? It shouldn't stop GA from giving me correct reports.

I'm new to Varnish. Can someone please guide me on how to modify the config?

UPDATE:

Will it work if I include the following in sub vcl_recv?

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

Please remove these lines!

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

Returning (pass) is a workaround, but it's not how you want to use Varnish. Varnish is there to cache pages and content such as sitemap_index.xml.

You have already implemented a PURGE mechanism in your VCL, so the simplest way to handle the sitemap_index.xml issue is to PURGE it!

The basic principle is that sitemap_index.xml should stay cached as long as no new post has been made. Then, every time a new post is created, you have to tell Varnish that sitemap_index.xml is no longer valid by sending the HTTP request below (pasted from the official documentation (1)):

PURGE /sitemap_index.xml HTTP/1.0
Host: example.com

So I guess you have a choice: edit your module manually, or use the Varnish HTTP Purge WordPress plugin (2), probably with some manual hacking as well.
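If the purge hook is not wired up yet, or a purge occasionally fails, capping the TTL of sitemap responses limits how long a stale sitemap can be served. This is only a complementary sketch in the Varnish 3 syntax used in the question, not part of the purge approach itself; the URL pattern (meant to match Yoast-style names such as sitemap_index.xml and post-sitemap.xml) and the 5-minute value are assumptions.

```vcl
sub vcl_fetch {
    # Assumed pattern: any path segment containing "sitemap" and ending
    # in .xml, optionally followed by a query string.
    if (req.url ~ "sitemap[^/]*\.xml(\?.*)?$") {
        # Assumed value: keep sitemap objects for at most 5 minutes,
        # so a missed PURGE heals itself quickly.
        set beresp.ttl = 5m;
    }
}
```

With this in place the sitemap is still served from cache between posts, and an explicit PURGE remains the way to refresh it immediately.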

  1. https://www.varnish-cache.org/docs/3.0/tutorial/purging.html#http-purges

  2. http://wordpress.org/plugins/varnish-http-purge/

Will it work if I include the following in sub vcl_recv?

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

This will work. Place it near the top of the function. Keep in mind, though, that it will prevent caching of all .xml files and all .xml.gz files. Granted, most of the .xml and .xml.gz files you serve are probably sitemaps, but it is worth considering in case they are not.
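If other XML files should stay cacheable, the match can be narrowed to sitemap URLs only. A minimal sketch in the same Varnish 3 syntax; the pattern is an assumption based on Yoast-style names (sitemap_index.xml, post-sitemap.xml, and so on) and may need adjusting for your site.

```vcl
sub vcl_recv {
    # Assumed pattern: the last path segment contains "sitemap" and ends
    # in .xml or .xml.gz, optionally followed by a query string.
    # Other .xml files fall through and are handled as before.
    if (req.url ~ "sitemap[^/]*\.xml(\.gz)?(\?.*)?$") {
        return (pass);
    }
}
```

As with the broader rule, this check needs to run before the final return (lookup) in vcl_recv, so placing it near the top is the safest choice.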

I can't give you the exact syntax, but you should pipe* the request for the sitemap.

*pipe: match the request in your VCL and always send it straight to the backend server instead of serving it from cache.
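A rough sketch of what that could look like in the Varnish 3 syntax from the question; the URL pattern is an assumption based on Yoast-style sitemap names. Note that pipe hands the connection to the backend as a raw byte stream, so for an ordinary HTTP request like this one, return (pass) bypasses the cache just as well and is usually the simpler choice.

```vcl
sub vcl_recv {
    # Assumed pattern: sitemap_index.xml, post-sitemap.xml, etc.
    if (req.url ~ "sitemap[^/]*\.xml(\.gz)?(\?.*)?$") {
        return (pipe);
    }
}

sub vcl_pipe {
    # Ask the backend to close the connection after this response, so
    # later requests on the same client connection are not piped too.
    set bereq.http.Connection = "close";
}
```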
