LinuxQuestions.org
Old 05-12-2014, 10:46 PM   #1
bluethundr
Member
 
Registered: Jun 2003
Location: Summit, NJ
Distribution: CentOS 5.4
Posts: 124

Rep: Reputation: 15
Varnish Web Back End is Sick


Varnish is reporting one of the web back ends it load balances as 'Sick', and it won't direct any web traffic to that host.

Here is the output showing that Varnish considers this back end host to be 'sick':

Code:
[root@varnish1:~] #varnishadm -T 127.0.0.1:6082 debug.health -S /etc/varnish/secret
Backend web1 is Healthy
Current states  good: 10 threshold:  8 window: 10
Average responsetime of good probes: 0.001247
Oldest                                                    Newest
================================================================
4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Happy

Backend web2 is Sick
Current states  good:  0 threshold:  8 window: 10
Average responsetime of good probes: 0.000000
Oldest                                                    Newest
================================================================
---------------------------------------------------------------- Happy
Here's more evidence from varnishlog that Varnish considers this host 'sick':

Code:
[root@varnish1:~] #varnishlog | grep web2
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
    0 Backend_health - web2 Still sick ------- 0 8 10 0.000000 0.000000
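For reading those Backend_health lines, here is a small helper, offered as a sketch only. It assumes the Varnish 3.x field order seen above (backend name, state, flag string, good count, threshold, window, then two response times) and that the flag letters match the rows printed by debug.health (4 = Good IPv4, X = Good Xmit, R = Good Recv, H = Happy). The function name decode_backend_health is made up for illustration.

```shell
#!/bin/sh
# decode_backend_health: read varnishlog lines on stdin and summarise each
# Backend_health record. Hypothetical helper; the field positions are
# assumed from the Varnish 3.x log lines quoted in this thread.
decode_backend_health() {
  awk '$2 == "Backend_health" {
    name = $4; state = $5 " " $6; flags = $7
    good = $8; threshold = $9; window = $10
    # A "4" or "6" in the flag string is assumed to mean the TCP connect worked.
    connected = (flags ~ /[46]/) ? "yes" : "no"
    happy     = (flags ~ /H/)    ? "yes" : "no"
    printf "%s: %s, good=%s/%s (needs %s), connected=%s, happy=%s\n",
           name, state, good, window, threshold, connected, happy
  }'
}
```

Usage would be something like `varnishlog | decode_backend_health`. If that reading of the flags is right, an all-dash flag string means the probe never got as far as a successful connect to web2, which is a different failure from a bad HTTP response.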
But I am at a loss to understand why, because the web host appears fine to me, based both on curling the probe URL I have set up in the VCL file and on browsing to the sites hosted on that machine in a web browser.

Curling the probe URL on the sick web host:

Code:
[root@varnish1:~] #curl -i http://web2.mywebsite.com/favicon.ico
HTTP/1.1 200 OK
Date: Tue, 13 May 2014 02:04:17 GMT
Server: Apache/2.2.23 (CentOS)
Last-Modified: Sun, 22 Dec 2013 00:53:19 GMT
ETag: "2a8003-47e-4ee14efeebdc0"
Accept-Ranges: bytes
Content-Length: 1150
Content-Type: text/plain; charset=UTF-8
Now doing the same thing via IP:
Code:
[root@varnish1:~] #curl -i http://10.10.1.98/favicon.ico
HTTP/1.1 200 OK
Date: Tue, 13 May 2014 02:05:29 GMT
Server: Apache/2.2.23 (CentOS)
Last-Modified: Sat, 10 May 2014 05:06:09 GMT
ETag: "1c5461-47e-4f904ac13b240"
Accept-Ranges: bytes
Content-Length: 1150
Content-Type: text/plain; charset=UTF-8
Wget also indicates that the 'sick' host is in fact working. First by IP:

Code:
[root@varnish1:~] #wget -O /dev/null -S http://10.10.1.98/favicon.ico
--2014-05-12 21:50:55--  http://10.10.1.98/favicon.ico
Connecting to 10.10.1.98:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Tue, 13 May 2014 01:50:56 GMT
  Server: Apache/2.2.23 (CentOS)
  Last-Modified: Sat, 10 May 2014 05:06:09 GMT
  ETag: "1c5461-47e-4f904ac13b240"
  Accept-Ranges: bytes
  Content-Length: 1150
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/plain; charset=UTF-8
Length: 1150 (1.1K) [text/plain]
Saving to: “/dev/null”

100%[=========================================================================================>] 1,150       --.-K/s   in 0s

2014-05-12 21:50:55 (79.8 MB/s) - “/dev/null” saved [1150/1150]
And I can also wget the sick node by hostname:
Code:
[root@varnish1:~] #wget -O /dev/null -S http://web2.mywebsite.com/favicon.ico
--2014-05-12 22:20:06--  http://web2.mywebsite.com/favicon.ico
Resolving web1.mywebsite.com... 10.10.1.98
Connecting to web1.mywebsite.com|10.10.1.98|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Tue, 13 May 2014 02:20:07 GMT
  Server: Apache/2.2.23 (CentOS)
  Last-Modified: Sun, 22 Dec 2013 00:53:19 GMT
  ETag: "2a8003-47e-4ee14efeebdc0"
  Accept-Ranges: bytes
  Content-Length: 1150
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/plain; charset=UTF-8
Length: 1150 (1.1K) [text/plain]
Saving to: “/dev/null”

100%[=========================================================================================>] 1,150       --.-K/s   in 0s

2014-05-12 22:20:06 (78.2 MB/s) - “/dev/null” saved [1150/1150]
And now the same thing to the 'healthy' web node (web1) by IP:

Code:
[root@varnish1:~] #wget -O /dev/null -S http://10.10.1.94/favicon.ico
--2014-05-12 21:54:06--  http://10.10.1.94/favicon.ico
Connecting to 10.10.1.94:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Tue, 13 May 2014 01:54:05 GMT
  Server: Apache/2.2.15 (CentOS)
  Last-Modified: Wed, 05 Mar 2014 19:27:01 GMT
  ETag: "e123b-47e-4f3e1013feb40"
  Accept-Ranges: bytes
  Content-Length: 1150
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: image/vnd.microsoft.icon
Length: 1150 (1.1K) [image/vnd.microsoft.icon]
Saving to: “/dev/null”

100%[=========================================================================================>] 1,150       --.-K/s   in 0s

2014-05-12 21:54:06 (149 MB/s) - “/dev/null” saved [1150/1150]
Here I'm trying to replicate the type of requests Varnish would be making to both backends.
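One difference from the real probe is that the curl and wget runs above have no time limit, while the probe in the VCL gives the backend only 34ms. Below is a rough sketch for checking the timing from the Varnish box. It assumes the probe sends a plain GET with the Host header taken from .host plus Connection: close, which is my understanding of the Varnish 3 defaults and not something verified in this thread. The function names within_timeout and probe_like_request are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: compare a backend's response time with the probe's
# .timeout (34ms in the VCL posted in this thread).

# within_timeout SECONDS LIMIT -> exit status 0 if SECONDS <= LIMIT
within_timeout() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t + 0 <= limit + 0) }'
}

# probe_like_request HOST PATH [LIMIT]
# Sends a GET shaped like the assumed probe request and reports the timing.
# Requesting by IP makes curl send "Host: <that IP>", like .host would.
probe_like_request() {
  host=$1; path=$2; limit=${3:-0.034}
  result=$(LC_ALL=C curl -s -o /dev/null -H 'Connection: close' \
             -w '%{http_code} %{time_connect} %{time_total}' \
             "http://$host$path") || { echo "request failed"; return 1; }
  # Split "code connect total" into $1 $2 $3
  set -- $result
  if within_timeout "$3" "$limit"; then verdict="within"; else verdict="OVER"; fi
  echo "status=$1 connect=$2s total=$3s ($verdict ${limit}s timeout)"
}
```

For example, run `probe_like_request 10.10.1.98 /favicon.ico` and `probe_like_request 10.10.1.94 /favicon.ico` a few times each from varnish1 and compare the totals against the 34ms limit.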

I also took a look at Apache on the host that Varnish reports as sick, and it seems to be running fine:

Code:
[root@beta:/var/www/jf-current] #apachectl -S
VirtualHost configuration:
wildcard NameVirtualHosts and _default_ servers:
*:443                  is a NameVirtualHost
         default server beta.mywebsite.com (/etc/httpd/conf.d/002_jf_beta_ssl.conf:78)
         port 443 namevhost beta.mywebsite.com (/etc/httpd/conf.d/002_jf_beta_ssl.conf:78)
*:80                   is a NameVirtualHost
         default server ref.mywebsite.com (/etc/httpd/conf.d/001_ref.mywebsite.com.conf:1)
         port 80 namevhost ref.mywebsite.com (/etc/httpd/conf.d/001_ref.mywebsite.com.conf:1)
         port 80 namevhost qa.mywebsite.com (/etc/httpd/conf.d/002_qa.mywebsite.com.conf:1)
         port 80 namevhost beta.mywebsite.com (/etc/httpd/conf.d/003_beta.mywebsite.com.conf:1)
         port 80 namevhost beta-test.mywebsite.com (/etc/httpd/conf.d/004_beta-test.mywebsite.com.conf:1)
         port 80 namevhost admin.mywebsite.com (/etc/httpd/conf.d/005_admin.mywebsite.com.conf:1)
         port 80 namevhost admin.mywebsite.com (/etc/httpd/conf.d/10_admin.mywebsite.com.conf:1)
         port 80 namevhost beta-test.mywebsite.com (/etc/httpd/conf.d/10_beta-test.mywebsite.com.conf:1)
         port 80 namevhost beta.mywebsite.com (/etc/httpd/conf.d/10_beta.mywebsite.com.conf:1)
         port 80 namevhost qa.mywebsite.com (/etc/httpd/conf.d/10_qa.mywebsite.com.conf:1)
         port 80 namevhost ref.mywebsite.com (/etc/httpd/conf.d/10_ref.mywebsite.com.conf:1)
Syntax OK
And I can verify that Apache is listening on both ports 80 and 443 (as if the curl, wget and apachectl commands weren't enough to convince you):

Code:
[root@beta:~] #lsof -i :80 -i :443
COMMAND   PID   USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
httpd   31757   root    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31757   root    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31763 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31763 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31764 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31764 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31765 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31765 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31765 apache   33u  IPv6 96900727      0t0  TCP beta.mywebsite.com:http->server.logistica-solutions.com:46767 (CLOSE_WAIT)
httpd   31766 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31766 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31767 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31767 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31768 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31768 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31769 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31769 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
httpd   31770 apache    3u  IPv6 96842102      0t0  TCP *:http (LISTEN)
httpd   31770 apache    5u  IPv6 96842106      0t0  TCP *:https (LISTEN)
My probe and host definitions in the Varnish default.vcl look like this:

Code:
# Health probe shared by both backends
probe favicon {
  .url = "/favicon.ico";
  .timeout = 34ms;
  .interval = 1s;
  .window = 10;
  .threshold = 8;
}

backend web1 {
  .host = "10.10.1.94";
  .port = "80";
  .probe = favicon;
}

backend web2 {
  .host = "10.10.1.98";
  .port = "80";
  .probe = favicon;
}

# Random director weighting both backends equally
director www random {
  { .backend = web1; .weight = 2; }
  { .backend = web2; .weight = 2; }
}
There's more in the VCL file, but that's all that strikes me as relevant. I'll also try to attach the whole VCL file in case anything not covered above helps.

Please ANY help, advice, wisdom or guesses you can share on this one would be wonderfully received. Heck, even if you have Ghost Stories to tell, tell them here. I just don't want to hear crickets on this one because this thing is driving me over the edge!

Thanks
Tim
Attached Files
File Type: txt default.vcl.txt (2.2 KB, 3 views)
 
Old 05-14-2014, 03:24 PM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 3,774
Blog Entries: 1

Rep: Reputation: 1339
So for the same object you are getting two different timestamps.

Last-Modified: Sun, 22 Dec 2013 00:53:19 GMT

Last-Modified: Sat, 10 May 2014 05:06:09 GMT


I'm wondering why.
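One possible explanation, and only a guess: the request by IP carries a Host header that matches none of the named vhosts, so Apache would answer it from the default server on port 80 (ref.mywebsite.com in the apachectl -S output), while the request by hostname may land on a different vhost with its own copy of favicon.ico. A sketch for comparing the two follows. The helper name header_value is made up for illustration.

```shell
#!/bin/sh
# header_value NAME: print the value of response header NAME from
# `curl -i` / `curl -I` output on stdin. Hypothetical helper.
header_value() {
  awk -v name="$1" 'BEGIN { FS = ": *"; want = tolower(name) }
    tolower($1) == want {
      sub(/\r$/, "")          # drop the trailing carriage return
      sub(/^[^:]*: */, "")    # drop the header name, keep the value
      print; exit
    }'
}
```

For example, compare the output of `curl -sI http://10.10.1.98/favicon.ico | header_value Last-Modified` with `curl -sI -H 'Host: web2.mywebsite.com' http://10.10.1.98/favicon.ico | header_value Last-Modified`. If the values differ, the two requests are being served from different files.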

Last edited by szboardstretcher; 05-14-2014 at 03:26 PM.
 
  


All times are GMT -5.
