For the past week or two I’ve been running Varnish in front of the websites on my server. Varnish is marketed as a “web application accelerator”. To be honest, the project’s homepage doesn’t really describe what it does. For that you have to dig a little deeper into the website; the about page actually advises you to read the Wikipedia article.
Basically it’s an HTTP reverse proxy that caches the responses to requests in virtual memory. It can also do load balancing and supports health checking of your backends (i.e. the actual apps generating and serving your pages).
So why cache HTTP responses in memory? Several reasons, actually:
- Your webapp will most likely take more time to generate a page than it takes to serve a static copy of it. Don’t believe me? Benchmark a WordPress site.
- Since your webapp will most likely generate the same page for the same route over and over, regenerating it every time is a waste of CPU. With Varnish caching the response, the app only has to do this once. Without Varnish (or any other caching in your server stack) you can do some caching on the client side, but that still requires every visitor to request the page from your app at least once.
- Continuing from the previous point, imagine getting Slashdotted. Since Varnish serves a cached copy of the page, it takes the load off your app, and your server’s resources won’t get eaten up by all your new simultaneous visitors.
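To make the points above a bit more concrete, here is a minimal Python sketch of the idea. It is not how Varnish is implemented; `TinyCache`, `slow_backend` and the 120-second TTL are names and values I made up purely for illustration.

```python
import time


class TinyCache:
    """Toy in-memory cache keyed by URL (illustration only, not Varnish)."""

    def __init__(self, backend, ttl=120.0):
        self.backend = backend  # callable that generates a page for a URL
        self.ttl = ttl          # seconds a cached copy stays fresh (assumed value)
        self.store = {}         # url -> (expiry time, body)

    def get(self, url):
        now = time.monotonic()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not touched
        body = self.backend(url)  # cache miss: let the app generate the page
        self.store[url] = (now + self.ttl, body)
        return body


backend_calls = []


def slow_backend(url):
    # Hypothetical stand-in for the web app; records every call it receives.
    backend_calls.append(url)
    return "<html>page for " + url + "</html>"


cache = TinyCache(slow_backend)
for _ in range(1000):  # a thousand visitors asking for the same page
    cache.get("/blog/varnish")

print(len(backend_calls))  # prints 1
```

A thousand requests reach the cache, but the app only generates the page once.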
This all sounds very nice, and it is, but there are some caveats. When Varnish decides whether it can serve a request from the cache, it looks not only at the requested URL but also at the request headers. Most websites set quite a few Cookies in the visitor’s browser, and those are sent back to your server as part of every request. When Varnish sees a Cookie header in the request, its default configuration forwards the request straight to your app and doesn’t serve the page from the cache.
Luckily you can tell Varnish to ignore Cookies in certain cases, so that it caches the page and serves the cached copy. The Cookies you can usually ignore are those set by third-party scripts such as Google Analytics. Your app (server) probably doesn’t need them, so they can be ignored. The client-side JavaScript that requires these Cookies still has access to them.
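In Varnish itself this kind of rule lives in its configuration, but the logic is easy to sketch in Python. The snippet below is only an illustration under a few assumptions: `strip_tracking_cookies` and `is_cacheable` are hypothetical helper names, and the prefix list assumes the Google Analytics cookie names (`__utm…`, `_ga`, `_gid`) are the only ones you want to drop.

```python
# Assumption: cookies whose names start with these prefixes are only used
# by client-side analytics scripts and are never read by the app itself.
TRACKING_PREFIXES = ("__utm", "_ga", "_gid")


def strip_tracking_cookies(cookie_header):
    """Return the Cookie header value with the tracking cookies removed."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name = pair.split("=", 1)[0]
        if not name.startswith(TRACKING_PREFIXES):
            kept.append(pair)
    return "; ".join(kept)


def is_cacheable(cookie_header):
    """A request can be served from cache if no relevant cookies remain."""
    return strip_tracking_cookies(cookie_header) == ""


print(strip_tracking_cookies("__utma=1.2.3; _ga=GA1.2.3; sessionid=abc"))
print(is_cacheable("__utmz=x; _gid=y"))
print(is_cacheable("sessionid=abc"))
```

Only the header the server sees is changed; the Cookies stay in the browser, so the analytics script keeps working.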
Pages with user-specific content, such as greeting messages, are also something you don’t want Varnish to cache. Since this usually requires a Cookie, Varnish won’t cache the page anyway. If you identify users some other way, though, you have to take care of this yourself: you don’t want Mike to see “Welcome back, John” when he visits your website after John. Dynamic parts of a page can be handled with Edge Side Includes (ESI), which are similar to Server Side Includes (SSI). With ESI Varnish still caches your page, but the dynamic content is inserted wherever the ESI statements sit at the moment the page is served.
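As a rough picture of what ESI does, here is a small Python sketch. It only mimics the idea of filling in `<esi:include>` tags at serve time; the `/greeting` fragment URL and the `render` and `fragment_for` helpers are hypothetical, and real ESI processing in Varnish handles far more than this one tag form.

```python
import re

# Matches the self-closing form <esi:include src="..."/> only.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def render(cached_page, fetch_fragment):
    """Replace every ESI include tag with the fragment fetched for its src."""
    return ESI_INCLUDE.sub(lambda m: fetch_fragment(m.group(1)), cached_page)


# The page itself is cached once, with a placeholder for the dynamic part.
cached_page = '<h1>My blog</h1><p><esi:include src="/greeting"/></p>'


def fragment_for(user):
    """Hypothetical stand-in for the app generating a per-user fragment."""
    def fetch(src):
        if src == "/greeting":
            return "Welcome back, " + user
        return ""
    return fetch


print(render(cached_page, fragment_for("John")))
print(render(cached_page, fragment_for("Mike")))
```

The cached page is shared between both visitors, while the greeting is fetched separately for each of them, so Mike never sees John’s name.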
As I’m still new to Varnish myself, I’d like to hear about your experience. If I’ve got something wrong here, please let me know so I can correct it. If I learn some neat Varnish tricks in the future, I’ll blog about them here. For now I’m very pleased with running Varnish in front of my HTTP server, and I feel pretty confident in case I get Slashdotted (unlikely though).