Dynamic/Static Separation with the Varnish Cache
1. Introduction
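A Web cache stores copies of Web resources (HTML, JS, CSS, images, ...) between the Web server and the client (browser). The cache answers an incoming request and then keeps a copy of the response in its local store. When the next request for the same URL arrives, the cache decides, according to its caching policy, whether to answer directly from the stored copy or to send the request to the backend server again; the decision depends on whether the cached copy has expired and whether the requested content has changed. memcached, covered earlier, is also a cache, but it is a general-purpose in-memory key-value store that the application has to query explicitly, so it is a poor fit for caching HTTP responses in front of a Web server. Varnish is a high-speed HTTP cache built for that role and is used by many large websites. Effective caching reduces the load on the backend hosts, speeds up responses to user requests, and improves the user experience.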
2. How Varnish Works and Related Configuration
Varnish architecture diagram:
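How it works: Varnish consists mainly of a management process and a child process. The management process provides the command-line interface, compiles VCL, monitors Varnish, and checks whether the child process is still alive. The child process runs the worker threads, writes the cache log, checks whether cached objects have expired, and handles the rest of the request processing.
VCL (Varnish Configuration Language): Varnish's domain-specific configuration language. It is built around a state engine. The states are related to one another but isolated from each other; each state's subroutine uses return to leave the current state and enter the next one, and each state handles a different stage of request processing.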
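The way return moves a request from one state to the next can be sketched with a minimal VCL 4.0 file. The backend address and the /admin path are assumptions chosen only for illustration; they do not come from the setup described in this article.

```vcl
vcl 4.0;

# Hypothetical backend, present only so that the file compiles.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Leave vcl_recv and enter vcl_pass: this request skips the cache.
    if (req.url ~ "^/admin") {
        return (pass);
    }
    # Leave vcl_recv and enter vcl_hash: the object is looked up in the cache.
    # Because of this explicit return, the built-in vcl_recv is not run.
    return (hash);
}
```

If a subroutine ends without an explicit return, Varnish continues with the built-in version of the same subroutine, which then picks the next state.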
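VCL processing flow diagram:
Request flow: requests are either cacheable or not cacheable. For a cacheable request, Varnish checks whether it is a cache hit. On a hit it answers from the local cache; on a miss it fetches the result from the backend host. If the response is publicly cacheable, Varnish stores a copy in the cache and then returns it to the client; private data is not cached and is returned to the client directly.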
Data flow:
vcl_recv-->vcl_hash-->
1)vcl_hit-->vcl_deliver
2)vcl_hit-->vcl_pass-->vcl_backend_fetch
vcl_miss-->vcl_pass
vcl_miss-->vcl_backend_fetch
vcl_purge-->vcl_synth
vcl_pipe-->done
vcl_backend_fetch-->vcl_backend_response
vcl_backend_fetch-->vcl_backend_error
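One way to watch which of these paths a request takes is to record the state it passes through in a header. The sketch below is only an illustration: the header name X-VCL-Path and the backend address are assumptions, and none of the subroutines returns explicitly, so the built-in VCL still decides the actual transitions.

```vcl
vcl 4.0;

# Hypothetical backend, present only so that the file compiles.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Each client-side state writes its name into a request header.
sub vcl_hit {
    set req.http.X-VCL-Path = "hit";
}

sub vcl_miss {
    set req.http.X-VCL-Path = "miss";
}

sub vcl_pass {
    set req.http.X-VCL-Path = "pass";
}

# Copy the recorded state into the response so the client can see it.
sub vcl_deliver {
    set resp.http.X-VCL-Path = req.http.X-VCL-Path;
}
```

Requesting the same URL twice and comparing the X-VCL-Path response header shows whether the second request was served from the cache.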
Example configuration:
sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);    # any method other than those listed above is piped straight to the backend host
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);    # methods other than GET/HEAD bypass the cache and are fetched from the backend
    }
    if (req.http.Authorization || req.http.Cookie) {    # requests carrying Authorization or Cookie headers go to the backend host
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);    # everything else is handed to vcl_hash for a cache lookup, which determines the next step
}
Handling the backend response:
sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
         beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
Test:
backend default {    # backend host that requests are sent to
    .host = "10.1.4.6";
    .port = "80";
}
sub vcl_recv {    # requests for /test.html skip the cache lookup
    if (req.url ~ "^/test\.html$") {
        return (pass);
    }
}
sub vcl_deliver {    # add an X-Cache header with hit or miss and the server IP address
    if (obj.hits > 0) {
        set resp.http.X-Cache = "Hit via" + " " + server.ip;
    } else {
        set resp.http.X-Cache = "Miss via" + " " + server.ip;
    }
}
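Built-in VCL variables available once a request arrives (where VCL 4.0, used in the configurations in this article, renamed a variable, the 4.0 name is given in parentheses):
Variable | Meaning |
req.backend | Backend host that the request is sent to (VCL 4.0: req.backend_hint) |
server.ip | IP address of the server |
client.ip | IP address of the client |
req.request | Request method, e.g. GET, HEAD, POST (VCL 4.0: req.method) |
req.url | Requested URL |
req.proto | HTTP protocol version of the client request |
req.http.header | The named HTTP header of the request |
req.restarts | Number of times the request has been restarted; the default maximum is 4 |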
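Built-in variables available when Varnish fetches from the backend host:
Variable | Meaning |
bereq.request | Request method, e.g. GET or HEAD (VCL 4.0: bereq.method) |
bereq.url | Requested URL |
bereq.proto | HTTP protocol version of the request |
bereq.http.header | The named HTTP header of the request |
beresp.ttl | Lifetime of the cached object, i.e. how long it is kept in the cache, in seconds |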
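Built-in variables available after the content has been obtained from the cache or the backend host:
Variable | Meaning |
obj.status | HTTP status code of the returned content, e.g. 200, 302, 504 |
obj.cacheable | Whether the returned content can be cached: a response with status 200, 203, 300, 301, 302, 404 or 410 and a non-zero lifetime is cacheable (older Varnish releases only; VCL 4.0 uses beresp.uncacheable instead) |
obj.valid | Whether the HTTP response is valid (older Varnish releases only) |
obj.response | HTTP status message of the returned content (VCL 4.0: obj.reason) |
obj.proto | HTTP protocol version of the returned content |
obj.ttl | Remaining lifetime of the returned content, i.e. its cache time, in seconds |
obj.lastuse | Time elapsed since the object was last requested, in seconds (removed in VCL 4.0) |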
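Built-in variables available when responding to the client:
Variable | Meaning |
resp.status | HTTP status code returned to the client |
resp.proto | HTTP protocol version returned to the client |
resp.http.header | The named HTTP header returned to the client |
resp.response | HTTP status message returned to the client (VCL 4.0: resp.reason) |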
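As a rough illustration of where these variables can be used, the sketch below touches one or two of them in each stage, using the VCL 4.0 names. The backend address, the /status path, the 30s TTL and the X-* header names are all assumptions made up for the example.

```vcl
vcl 4.0;

# Hypothetical backend, present only so that the file compiles.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Request stage: req.method, req.url and client.ip are available.
    if (req.method == "GET" && req.url ~ "^/status") {
        set req.http.X-Client-IP = client.ip;
    }
}

sub vcl_backend_response {
    # Backend stage: bereq.* describes the request sent to the backend,
    # beresp.* describes the response that came back.
    if (beresp.status == 404) {
        set beresp.ttl = 30s;    # keep "not found" answers only briefly
    }
}

sub vcl_deliver {
    # Delivery stage: resp.* is what the client receives.
    set resp.http.X-Cache-Hits = obj.hits;
    set resp.http.X-Restarts = req.restarts;
}
```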
3. Varnish Load Balancing and Dynamic/Static Separation in Practice
The test environment is as follows:
Host | IP | Role |
Varnish | 10.1.10.65/16 | Varnish Server |
web1 | 10.1.10.66/16 | httpd server |
web2 | 10.1.10.67/16 | httpd server |
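Note: to see the two backend hosts being load balanced, the path under test must be configured to bypass the cache so that every request is fetched from a backend host; the configuration below does this for /login.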
#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.
# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;
import directors;    # load the directors module
# Default backend definition. Set this to point to your content server.
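probe check {              # health check definition
    .url = "/";            # URL path to probe
    .window = 5;           # number of most recent probes taken into account
    .threshold = 4;        # how many of those probes must succeed for the backend to count as healthy
    .interval = 2s;        # time between two health checks
    .timeout = 1s;         # probe timeout
}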
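backend websrv1 {              # backend host websrv1
    .host = "10.1.10.66";      # IP address of the backend host
    .port = "80";              # port the backend host listens on
    .probe = check;            # use the health check defined above
}
backend websrv2 {              # backend host websrv2
    .host = "10.1.10.67";      # IP address of the backend host
    .port = "80";              # port the backend host listens on
    .probe = check;            # use the health check defined above
}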
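sub vcl_init {                                # create the backend group
    new websrv = directors.round_robin();     # scheduling method for the group; the directors module also offers random, hash and fallback
    websrv.add_backend(websrv1);              # add the backend host to the group
    websrv.add_backend(websrv2);              # add the backend host to the group
}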
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    set req.backend_hint = websrv.backend();    # schedule every request across the load-balanced backend group
    if (req.url ~ "(?i)^/login") {              # requests under /login skip the cache and go straight to a backend host
        return (pass);
    }
    if (req.method == "PURGE") {                # custom PURGE method
        return (purge);
    }
}
sub vcl_purge {    # runs after a purge; returns the status code and message of your choice
    return (synth(200, "Purged."));
}
sub vcl_backend_response {
# Happens after we have read the response headers from the backend.
#
# Here you clean the response headers, removing silly Set-Cookie headers
# and other mistakes your backend does.
}
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
    if (obj.hits > 0) {    # custom response header reporting hit or miss
        set resp.http.X-Cache = "Hit via " + server.ip;
    } else {
        set resp.http.X-Cache = "Miss via " + server.ip;
    }
}
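Two variations on the configuration above, given as a sketch rather than as part of the tested setup: the random director from the same directors module, which takes a weight for each backend, and an ACL that restricts who may send PURGE. The ACL name purgers, its address ranges, the weights 3.0 and 1.0, and the 405 response are assumptions made for illustration.

```vcl
vcl 4.0;
import directors;

backend websrv1 {
    .host = "10.1.10.66";
    .port = "80";
}

backend websrv2 {
    .host = "10.1.10.67";
    .port = "80";
}

# Hypothetical list of clients that are allowed to purge.
acl purgers {
    "127.0.0.1";
    "10.1.0.0"/16;
}

sub vcl_init {
    # The random director picks a backend at random, in proportion to
    # the weights: websrv1 should receive about three requests out of four.
    new websrv = directors.random();
    websrv.add_backend(websrv1, 3.0);
    websrv.add_backend(websrv2, 1.0);
}

sub vcl_recv {
    set req.backend_hint = websrv.backend();
    if (req.method == "PURGE") {
        # Refuse PURGE from any client outside the ACL.
        if (client.ip !~ purgers) {
            return (synth(405, "Purging not allowed."));
        }
        return (purge);
    }
}
```

Without such an ACL, any client that can reach Varnish is able to evict objects from the cache.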
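Once the configuration is written, varnish_reload_vcl compiles and applies it; varnishadm can do the same. The load-balancing result is shown below:
The configuration for dynamic/static separation is as follows: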
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.
# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;
import directors;
probe check {
    .url = "/";
    .window = 5;
    .threshold = 4;
    .interval = 2s;
    .timeout = 1s;
}
backend websrv1 {
    .host = "10.1.10.66";
    .port = "80";
    .probe = check;
}
backend websrv2 {
    .host = "10.1.10.67";
    .port = "80";
    .probe = check;
}
sub vcl_init {
    new websrv = directors.round_robin();
    websrv.add_backend(websrv1);
    websrv.add_backend(websrv2);
}
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    if (req.url ~ "(?i)\.php$") {          # send requests ending in .php to websrv1
        set req.backend_hint = websrv1;
    } else {
        set req.backend_hint = websrv2;    # send everything else to websrv2
    }
    if (req.url ~ "(?i)^/login") {
        return (pass);
    }
    if (req.method == "PURGE") {
        return (purge);
    }
}
sub vcl_purge {
    return (synth(200, "Purged."));
}
sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        if (bereq.url ~ "(?i)\.(jpg|css)$") {
            set beresp.ttl = 3600s;          # lengthen the cache TTL for .jpg images and .css stylesheets
            unset beresp.http.Set-Cookie;    # drop the tracking cookie set on these static files
        }
    }
}
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "Hit via " + server.ip;
    } else {
        set resp.http.X-Cache = "Miss via " + server.ip;
    }
}
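Two details can keep static files from being split off and cached as intended: a URL with a query string such as /index.php?id=1 does not match \.php$, and a request that carries a Cookie header is passed to the backend by the built-in vcl_recv. The sketch below shows one possible way to handle both. The list of static extensions is an assumption, and removing cookies is only safe if the static files do not depend on them.

```vcl
vcl 4.0;

backend websrv1 {
    .host = "10.1.10.66";
    .port = "80";
}

backend websrv2 {
    .host = "10.1.10.67";
    .port = "80";
}

sub vcl_recv {
    # Match .php with or without a trailing query string.
    if (req.url ~ "(?i)\.php(\?.*)?$") {
        set req.backend_hint = websrv1;    # dynamic content
    } else {
        set req.backend_hint = websrv2;    # static content
    }

    # Hypothetical list of static extensions: drop the Cookie header so
    # the built-in vcl_recv does not pass these requests to the backend.
    if (req.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)(\?.*)?$") {
        unset req.http.Cookie;
    }
}
```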
Test results:
Dynamic pages are sent to websrv1.
Static pages are sent to websrv2, which gives the dynamic/static separation.
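Summary: Varnish looks objects up in its cache by hashing the URL. When Varnish receives a user request and the backend server answers it, the response goes through a series of processing steps and a copy is stored on the Varnish server. When the client requests the same content again, and the cached data has not expired and the content has not changed, Varnish answers directly from the cache, which greatly reduces the load on the backend hosts.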