How to check Cloudflare cache status programmatically
Imagine that your small web app, which lets users download medium-to-large ZIP files, suddenly starts receiving a huge amount of traffic thanks to a lucky Reddit post, and in particular a lot of downloads.
The server is on a small tier with very limited bandwidth: a peak of downloads after a new file release effectively turns into a DDoS for your web app.
How can you improve the web app's performance without buying a bigger VPS tier?
A quick solution is to rely on a CDN such as Cloudflare.
Cloudflare's caching system is very powerful, and its configuration is very simple.
However, the default configuration doesn't cache some file types (such as .ZIP). Here is the list of file types that are cached by default:
https://support.cloudflare.com/hc/en-us/articles/200172516-What-file-extensions-does-CloudFlare-cache-for-static-content-
If you need to cache every file in a specific section of your website, you need to configure a custom page rule. Here is an example for static HTML files, easily adaptable to ZIP files:
- Log into your Cloudflare account.
- From the dropdown menu on the top left, select your domain.
- Click the Page Rules app in the top menu.
- The first step is creating a pattern and then applying a rule to that pattern. You'll need to find or create a way to differentiate static versus dynamic content by the URL. Some possibilities could be creating a directory for static content, appending a unique file extension to static pages, or adding a query parameter to mark content as static. Here are three examples of patterns you could create for each of those options:
*example.com/static/* [/static/ subdirectory for static HTML pages]
*example.com/*.shtml [.shtml file extension to signify HTML that is static]
*example.com/*?*static=true* [adding static=true query parameter]
You'll want to design the pattern to only describe pages you know are static.
- Click Cache everything in the Custom caching dropdown menu.
- Click Add rule.
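Before saving a rule, it can help to check which URLs a pattern would cover. The following is only a rough sketch: it assumes that `*` stands for any sequence of characters and that everything else is matched literally, which approximates but does not reproduce Cloudflare's own matcher. The function name `matchesPageRulePattern` and the sample URLs are hypothetical.

```php
<?php
// Hypothetical helper: approximates Page Rule matching by treating "*" as
// "any sequence of characters" and every other character literally.
function matchesPageRulePattern(string $pattern, string $url): bool
{
    // Escape regex metacharacters, then turn the escaped "*" back into ".*"
    $regex = str_replace('\*', '.*', preg_quote($pattern, '#'));
    return preg_match('#^' . $regex . '$#', $url) === 1;
}

// Placeholder URLs, used only for illustration
var_dump(matchesPageRulePattern('*example.com/static/*', 'https://www.example.com/static/release.zip')); // bool(true)
var_dump(matchesPageRulePattern('*example.com/static/*', 'https://www.example.com/account/login'));      // bool(false)
```

A check like this only helps to spot patterns that are too broad or too narrow; the real behaviour should still be verified against the live site.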
But even in this case, if a newly released file is quite big, the time that elapses between the publication of the download link and the moment Cloudflare caches the file can still put the server under strain.
For this case, I came up with a simple solution: enable the download link only once the new file has been cached by Cloudflare.
Indeed, the Cloudflare support site has an interesting section about the CF-Cache-Status response header:
It is possible to check if Cloudflare is caching my site or a specific file by checking the responses shown in the "CF-Cache-Status" header
When a file is covered by caching, the CDN adds this header to report the caching status, specifically:
HIT: resource in cache, served from CDN cache
MISS: resource not in cache, served from origin server
EXPIRED: resource was in cache but has since expired, served from origin server
STALE: resource is in cache but is expired, served from CDN cache because another visitor's request has caused the CDN to fetch the resource from the origin server. This is a very uncommon occurrence and will only impact visitors that want the page right when it expires.
IGNORED: resource is cacheable but not in cache because it hasn't met the threshold (number of requests, usually 3), served from origin server. Will become a HIT once it passes the threshold.
REVALIDATED: a stale representation of the resource was in cache, but it was revalidated by checking with an If-Modified-Since header.
UPDATING: the cache is currently populating for that resource and the response was served stale from the existing cached item. This status is typically only seen when large and/or very popular resources are being added to the cache.
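The list above can be condensed into a small lookup that tells whether a response was answered from the CDN cache or from the origin server. This is a minimal sketch based only on the descriptions above; the names `SERVED_FROM_CDN` and `isServedFromCdn` are hypothetical.

```php
<?php
// Statuses that, per the list above, are answered from Cloudflare's cache
const SERVED_FROM_CDN = ['HIT', 'STALE', 'REVALIDATED', 'UPDATING'];

// Hypothetical helper: true when the status says the CDN cache served the response.
function isServedFromCdn(?string $status): bool
{
    if ($status === null) {
        return false; // header absent: the resource is not covered by caching
    }
    // Normalize case and whitespace before the strict lookup
    return in_array(strtoupper(trim($status)), SERVED_FROM_CDN, true);
}
```

MISS, EXPIRED and IGNORED fall through to `false`, since in those cases the origin server still has to deliver the file.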
Here is an example output produced with curl:
$ curl -I https://mywebappurl/download/hugefile.zip
HTTP/2 200
date: Wed, 16 May 2018 15:34:45 GMT
content-type: text/html; charset=UTF-8
set-cookie: __cfduid=d2c1b7df0d1b091c5027483dcabe74ecb1526484885; expires=Thu, 16-May-19 15:34:45 GMT; path=/; domain=.mywebappurl; HttpOnly; Secure
cf-cache-status: HIT
expires: Thu, 16 May 2019 15:34:45 GMT
cache-control: public, max-age=31536000
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 41bee907b83096f6-FRA
In this example, the file is correctly cached.
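If the raw headers are already at hand, for instance captured from curl by a deployment script, the status can be extracted with a few lines of string handling. The sketch below assumes the usual "name: value" layout with one header per line; `extractCacheStatus` is a hypothetical name and the sample string is a shortened stand-in for the output above.

```php
<?php
// Hypothetical helper: pulls the cf-cache-status value out of raw response
// headers such as the output of `curl -I`. Returns null when it is absent.
function extractCacheStatus(string $rawHeaders): ?string
{
    foreach (preg_split('/\r\n|\r|\n/', $rawHeaders) as $line) {
        // Split "name: value" on the first colon only
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), 'cf-cache-status') === 0) {
            return strtoupper(trim($parts[1]));
        }
    }
    return null;
}

$sample = "HTTP/2 200\r\nserver: cloudflare\r\ncf-cache-status: HIT\r\n";
echo extractCacheStatus($sample), "\n"; // prints HIT
```

The header name is compared case-insensitively because servers and tools do not always agree on capitalization.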
So, with a simple PHP snippet it is possible to trigger the caching of a file and check its state, using the get_headers function:
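function checkCDNStatus($url) {
    // get_headers() sends a GET request, which also prompts Cloudflare to cache the file
    $headers = get_headers($url, true);
    if ($headers === false) return null; // request failed
    // Header names are case-insensitive, so normalize the keys before the lookup
    $headers = array_change_key_case($headers, CASE_LOWER);
    return $headers["cf-cache-status"] ?? null;
}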
Now it is simple to enable the file download only once the checkCDNStatus function returns "HIT":
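// Keep the link inactive until Cloudflare reports the file as cached
if (checkCDNStatus($downloadurl) !== "HIT") {
    echo "<a href=\"#\">Download</a>";
} else {
    // Escape the URL before placing it in the href attribute
    echo "<a href=\"" . htmlspecialchars($downloadurl) . "\">Download</a>";
}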
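Since a resource may need a few requests before it reports HIT (see the IGNORED status), a release script could also warm the cache by repeating the check a limited number of times. The following is a sketch under stated assumptions, not a tested production routine: `waitUntilCached`, its parameters and their default values are hypothetical, and the status lookup is passed in as a callable so that any function returning the status string can be used.

```php
<?php
// Hypothetical helper: calls $fetchStatus($url) until it reports "HIT" or the
// attempts run out. $fetchStatus is any callable returning the status string.
function waitUntilCached(string $url, callable $fetchStatus, int $maxAttempts = 5, int $delaySeconds = 2): bool
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        if ($fetchStatus($url) === 'HIT') {
            return true;
        }
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds); // give the CDN time to store the file
        }
    }
    return false;
}

// Usage, assuming checkCDNStatus() from above is defined:
// $ready = waitUntilCached($downloadurl, 'checkCDNStatus');
```

Capping the number of attempts keeps the script from looping forever when the file is not cacheable at all, for example because no page rule covers it.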