PHP Caching on the Edge
The goal is to never generate the same response twice.
This is another ‘notes post’ – nothing too organized here (for now).
Changing the HTTP headers with php:
Caching spec – read it on the night you feel like you can’t sleep.
1. HTTP expiration
Is the data fresh? Only when the cached version is stale do we go to the server to get new data.
HTTP Headers for Expiration
2. HTTP validation
Last-Modified / If-Modified-Since
ETag / If-None-Match
(!) HTTP cache headers only work with ‘safe’ HTTP methods (GET/HEAD) – meaning, methods that won’t change the application state.
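A minimal sketch of the Last-Modified / If-Modified-Since check, assuming the page knows the timestamp of its last change. The function names are made up for this note; only the header names come from the list above.

```php
<?php
// Hypothetical helper names; only the header names come from the notes.

// Cache validation only applies to safe methods.
function isSafeMethod(string $method): bool
{
    return in_array(strtoupper($method), ['GET', 'HEAD'], true);
}

// Formats a Unix timestamp as an HTTP date, e.g. for Last-Modified.
function httpDate(int $timestamp): string
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// True when the resource has not changed since the date the client sent
// in If-Modified-Since, so a 304 with an empty body is enough.
function isNotModifiedSince(int $lastModified, ?string $ifModifiedSince): bool
{
    if ($ifModifiedSince === null) {
        return false;
    }
    $since = strtotime($ifModifiedSince);
    if ($since === false) {
        return false; // unparsable date: treat it as a cache miss
    }
    return $lastModified <= $since;
}
```

In a page script the client's date would come from `$_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null`. When the check passes, `http_response_code(304)` with no body should be enough; otherwise send `Last-Modified: ` followed by `httpDate(...)` along with the full response.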
Expires – gives the date/time after which the response is considered stale.
$expires = gmdate('D, d M Y H:i:s', time() + 5) . ' GMT'; // expire in 5 sec
header('Expires: ' . $expires);
* With an expiration of less than a few days you might hit problems, because the clock on the web server and the clock on the client may not show the same time.
* The HTTP/1.1 spec says that you should not send an Expires date more than 1 year in the future. WHY?
A better way:
* Use Cache-Control
header('Cache-Control: max-age=5');
(!) So use Expires only for things you want to cache for a very long time (more than 48h).
In other cases, use Cache-Control.
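The rule of thumb above could be sketched like this. `expirationHeaders()` is a hypothetical helper, the 48h threshold is simply the number from this note, and the one-year cap follows the spec remark above. It returns the header lines instead of sending them, so the caller decides when to pass them to `header()`.

```php
<?php
// Hypothetical helper: picks Expires for long lifetimes and
// Cache-Control: max-age for everything else.
function expirationHeaders(int $ttlSeconds, ?int $now = null): array
{
    $now = $now ?? time();

    // Threshold taken from the note above (more than 48h => Expires).
    if ($ttlSeconds > 48 * 3600) {
        // Stay within one year, as the HTTP/1.1 spec asks.
        $ttl = min($ttlSeconds, 365 * 24 * 3600);
        return ['Expires: ' . gmdate('D, d M Y H:i:s', $now + $ttl) . ' GMT'];
    }

    return ['Cache-Control: max-age=' . $ttlSeconds];
}
```

Usage would be something like `foreach (expirationHeaders(5) as $line) { header($line); }`. The second parameter only exists so the clock can be fixed when checking the output.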
Usage of ETag
You can compute the ETag on your own or use a tool that does the work for you (some good framework).
* Expiration wins over validation – expiration is checked first.
* Expiration allows you to scale, as fewer requests hit your server.
Validation saves bandwidth.
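Computing the ETag on your own might look like the sketch below. Hashing the body with md5 is just one possible choice, and both function names are hypothetical. The comparison splits If-None-Match on commas, which is naive but enough for tags that contain no commas.

```php
<?php
// Hypothetical helper: derive an ETag from the response body.
// md5 is used only as a cheap fingerprint here, not for security.
function makeEtag(string $body): string
{
    return '"' . md5($body) . '"';
}

// True when one of the tags in If-None-Match equals our ETag.
function etagMatches(string $etag, ?string $ifNoneMatch): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false;
    }
    if (trim($ifNoneMatch) === '*') {
        return true; // "*" matches any current representation
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Ignore the weak-validator prefix for this comparison.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}
```

In a page script the client's value would be `$_SERVER['HTTP_IF_NONE_MATCH'] ?? null`. On a match, answer with `http_response_code(304)` and no body; otherwise send `header('ETag: ' . $etag)` with the full response. Note that the page is still generated in order to hash it, which is why validation saves bandwidth rather than server work.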
PHP and cache
$_SESSION['foo'] = 'bla';
(!) By default – when you start a session (which sets a cookie)… PHP sets no-cache/no-store/must-revalidate for YOU
Because if you have a cookie, you don’t want to cache the page.
It’s more a ‘safe’ default. So no one else could take this information from you.
header('Cache-Control: private, max-age=5'); // or 'public, max-age=5'
private – prevents any proxy (shared cache) from caching this page; only the browser may cache it.
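A small sketch of choosing between private and public, plus one way to keep the session while sending your own header. The function names are hypothetical. `session_cache_limiter('')` turns off PHP's automatic cache headers and has to be called before `session_start()`.

```php
<?php
// Hypothetical helper: pick the Cache-Control value for a response.
function cacheControlValue(bool $userSpecific, int $maxAge): string
{
    $scope = $userSpecific ? 'private' : 'public';
    return $scope . ', max-age=' . $maxAge;
}

// Sketch: start a session without PHP's automatic no-cache headers,
// then send our own Cache-Control instead.
function startSessionWithOwnCaching(bool $userSpecific, int $maxAge): void
{
    session_cache_limiter('');   // must be called before session_start()
    session_start();
    header('Cache-Control: ' . cacheControlValue($userSpecific, $maxAge));
}
```

Opting out of the safe default is only reasonable when the page really contains nothing tied to the session; when in doubt, pass `true` so the response stays private.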
Types of Caches
1. Browser cache – for example it’s good for images.
2. Proxy cache – inside a company/organization. It is still on the ‘client side’ and it masks all the clients that sit behind it.
We can find it in big companies and lots of ISPs. It’s public.
3. Gateway cache (= reverse proxy, HTTP accelerator or surrogate cache) – This is just like a proxy BUT it is on the server side.
A shared cache on the server side. It makes your site more scalable and reliable, and improves performance.
In practice, most caches avoid anything with: a restrictive Cache-Control, Cookie, WWW-Authenticate, POST/PUT, 302/307 status codes.
Cache-Control in PHP saves us with a safe default of ‘private’.
A gateway cache won’t cache anything ‘private’ or carrying a cookie – so how can we use Google Analytics and still have a cache?
Use Varnish… or any other proxy that removes and re-adds the cookies before the request hits the server.
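To make the cookie idea concrete, here is the filtering logic written in PHP. This is an illustration only: in practice the rule lives in the proxy's own configuration, not in the application. The function name is hypothetical and the cookie name prefixes (`__utm`, `_ga`, `_gid`) are an assumption about which analytics cookies are in play.

```php
<?php
// Illustration only: a proxy would apply this to the Cookie request header
// before looking up its cache. Cookie name prefixes are assumptions.
function stripAnalyticsCookies(string $cookieHeader): string
{
    $kept = [];
    foreach (explode(';', $cookieHeader) as $pair) {
        $pair = trim($pair);
        if ($pair === '') {
            continue;
        }
        $name = explode('=', $pair, 2)[0];
        // Drop client-side analytics cookies; the application never reads them.
        if (preg_match('/^(__utm|_ga|_gid)/', $name) === 1) {
            continue;
        }
        $kept[] = $pair;
    }
    return implode('; ', $kept);
}
```

If the result is an empty string, the request carries no cookie that matters to the application, so the proxy can treat it as cacheable.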
If you want your code to run behind a reverse proxy that supports ESI tags, like Akamai or Varnish, these headers are used:
Surrogate-Capability: abc="Surrogate/1.0 ESI/1.0" <– sent by the proxy with the request, to announce that it understands ESI.
Surrogate-Control: content="ESI/1.0" <– sent by your application with the response; this lets the reverse proxy parse the tags.
(!) You need to set these features in Varnish (they are not there by default).
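On the application side, the ESI handshake could be sketched as below. Both function names are hypothetical, and the check is a simple substring test on the Surrogate-Capability value rather than a full parser.

```php
<?php
// True when the proxy advertised ESI support in Surrogate-Capability.
function proxySupportsEsi(?string $surrogateCapability): bool
{
    return $surrogateCapability !== null
        && strpos($surrogateCapability, 'ESI/1.0') !== false;
}

// Builds an ESI include tag for a fragment URL (the URL is up to the app).
function esiInclude(string $src): string
{
    return '<esi:include src="' . htmlspecialchars($src, ENT_QUOTES) . '" />';
}
```

In a page script the capability would be read from `$_SERVER['HTTP_SURROGATE_CAPABILITY'] ?? null`. When it is present, send `header('Surrogate-Control: content="ESI/1.0"')` and print `esiInclude('/sidebar')` where the fragment belongs (the `/sidebar` path is just an example); without a capable proxy, render the fragment inline instead.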
Symfony2 – has a reverse proxy that is written in PHP. You might want to use it if you can’t use Varnish (it’s free – but let’s say your hosting company won’t install it).
At the end of the day, you want to hit the application as little as possible.
You can do it with HTTP headers and ESI.
Last but not least, please remember that the one who picks his targets carefully, acts quietly, and achieves his objectives wins wars.