Posts Tagged: Cache

A Web cache sits between one or more Web servers (also known as origin servers) and a client or many clients, and watches requests come by, saving copies of the responses — like HTML pages, images and files (collectively known as representations) — for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again.

When do cached items become invalid?

There are a number of reasons why a browser could make a conditional or unconditional request for an item that is already in the cache.

Item is expired

The cached data is no longer fresh according to Cache-Control or Expires.

These specify the “freshness lifetime” of a resource, that is, the time period during which the browser can use the cached resource without checking to see if a new version is available from the web server. They are “strong caching headers” that apply unconditionally; that is, once they’re set and the resource is downloaded, the browser will not issue any GET requests for the resource until the expiry date or maximum age is reached.
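As a rough illustration of the freshness check described above, here is a minimal sketch. The function names, the lower-case header object, and the `storedAtMs`/`nowMs` timestamps are assumptions made for this example, not any browser's actual implementation. It also ignores directives such as no-cache and no-store, as well as the Age header.

```javascript
// Hypothetical helper: extract max-age (in seconds) from a Cache-Control value.
function parseMaxAge(cacheControl) {
  if (!cacheControl) return null;
  const match = /(?:^|,)\s*max-age\s*=\s*"?(\d+)"?/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: is a cached response still within its freshness lifetime?
// `headers` uses lower-case names; `storedAtMs` and `nowMs` are epoch milliseconds.
function isFresh(headers, storedAtMs, nowMs) {
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  const maxAge = parseMaxAge(headers["cache-control"]);
  if (maxAge !== null) {
    // max-age takes precedence over Expires when both are present.
    return ageSeconds < maxAge;
  }
  if (headers["expires"]) {
    const expiresMs = Date.parse(headers["expires"]);
    // An unparseable Expires value is treated as already expired.
    return !Number.isNaN(expiresMs) && nowMs < expiresMs;
  }
  // No explicit lifetime: heuristics would apply, which this sketch leaves out.
  return false;
}
```

For example, a response stored with `Cache-Control: public, max-age=3600` would count as fresh for the first hour after it was stored and as stale afterwards.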

The browser applied its heuristic and decided that the item should be revalidated by the server, based on

Last-Modified and ETag. These specify some characteristic about the resource that the browser checks to determine if the files are the same. In the Last-Modified header, this is always a date. In the ETag header, this can be any value that uniquely identifies a resource (file versions or content hashes are typical).

Last-Modified is a “weak” caching header in that the browser applies a heuristic to determine whether to fetch the item from cache or not. (The heuristics are different among different browsers.) However, these headers allow the browser to efficiently update its cached resources by issuing conditional GET requests when the user explicitly reloads the page. Conditional GETs don’t return the full response unless the resource has changed at the server, and thus have lower latency than full GETs.
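To give a feel for what such a heuristic can look like, here is a small sketch. It assumes the rule of thumb suggested in the HTTP caching specification (treat a response as fresh for about 10% of the time elapsed between its Last-Modified and Date values). Real browsers differ, and the function name and the `fraction` parameter are made up for this example.

```javascript
// Hypothetical heuristic: freshness lifetime in seconds, derived from how long
// the resource had been unchanged when the response was generated.
function heuristicFreshnessSeconds(dateHeader, lastModifiedHeader, fraction = 0.1) {
  const dateMs = Date.parse(dateHeader);
  const lastModifiedMs = Date.parse(lastModifiedHeader);
  // Missing or inconsistent dates: do not assume any freshness at all.
  if (Number.isNaN(dateMs) || Number.isNaN(lastModifiedMs) || lastModifiedMs > dateMs) {
    return 0;
  }
  return Math.floor(((dateMs - lastModifiedMs) / 1000) * fraction);
}
```

With this rule, a resource last modified ten days before it was served would be reused from cache for roughly one day before the browser revalidates it.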


When is data in the cache invalidated?

Here are some interesting statistics on how the cache is used nowadays:

  • 55% of resources don’t specify a max-age value
  • 46% of the resources without any max-age remained unchanged over a 2 week period
  • some of the most popular resources on the Web are only cacheable for an hour or two
  • 40-60% of daily users to your site don’t have your resources in their cache
  • 30% of users have a full cache
  • for users with a full cache, the median time to fill their cache is 4 hours of active browsing

There are a number of reasons why a browser could make a conditional request for an item that is already in the cache:

  • The cached item is no longer fresh according to Cache-Control or Expires
  • The cached item was delivered with a VARY header
  • The containing page was navigated to via META REFRESH
  • JavaScript in the page called reload on the location object, passing TRUE for bReloadSource
  • The request was for a cross-host HTTPS resource on browser startup
  • The user refreshed the page

Behind Refresh Button

Today I’d like to continue the browser cache topic and discuss one of the ways a user can invalidate cached resources. I’m talking about the F5 versus CTRL+F5 shortcuts (versus other ways to “refresh” the page) in modern browsers. So what is the difference between pressing F5 and CTRL+F5 in a browser window?

HTTP Request Types

First, let’s step back for a moment and take a look into HTTP theory.

Web browsers make two types of requests over HTTP(S) – conditional requests and unconditional requests.

An unconditional request is made when the browser does not have a cached copy of the resource available locally. In this case, the server is expected to return the resource with an HTTP/200 OK response. If the response’s headers allow it, the client may cache this response in order to reuse it later.

If the browser later needs a resource that is available locally (in the cache), that resource’s headers are checked to determine whether it is still fresh. If the cached copy is still fresh, no server request is made at all.

If the browser needs a resource that is in the cache but has already expired (it is older than its max-age or past its Expires date), the client makes a conditional request to find out whether the cached copy can be reused or should be replaced with a fresh version of the resource. The conditional request contains an If-Modified-Since and/or If-None-Match header that tells the server which version of the content the browser already has in its cache. If the browser’s version is still valid, the server returns an HTTP/304 Not Modified response with headers and no body; otherwise it returns an HTTP/200 OK response with the new version of the resource.
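The exchange can be sketched roughly as follows. Both functions and the shapes of the `cached` and `resource` objects are hypothetical, and the ETag comparison is simplified to an exact string match (no lists of tags, no weak validators).

```javascript
// Client side (hypothetical): turn the validators stored with a cached copy
// into conditional request headers.
function buildConditionalHeaders(cached) {
  const headers = {};
  if (cached.etag) headers["If-None-Match"] = cached.etag;
  if (cached.lastModified) headers["If-Modified-Since"] = cached.lastModified;
  return headers;
}

// Server side (hypothetical): decide between 304 Not Modified and 200 OK.
function respondTo(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders["If-None-Match"];
  const ifModifiedSince = requestHeaders["If-Modified-Since"];
  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // If-None-Match takes precedence over If-Modified-Since.
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    notModified = Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince);
  }
  return notModified
    ? { status: 304, body: null }
    : { status: 200, body: resource.body };
}
```

A request with no validators at all falls through to the 200 branch, which matches the unconditional case described earlier.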


How to check if browser caching is disabled

As you may know, there are several ways to “tell” the browser or proxy server what to cache and what not to cache. They’re reliable and commonly supported across browsers, but if there is no proxy cache in the middle and the user has disabled the browser’s cache, then your client cache policy will be ignored. In this case you may want to adjust your cache policy to be more effective in a cache-less environment.

Before we start, I’d like to point out that I’ll test my approach in IE8, IE9, FF 9 and Chrome 18. We should also assume that there are no proxy caches or reverse proxy caches on the way to the server. All results of my experiment will be sent to the JavaScript console.

CSS caching and optimizations

So, let’s consider the ways to check browser (or proxy) cache availability. The main idea is simple: load some mutating resource a couple of times and see whether the copies are equal or not. By “mutating resource” I mean a .css or .js file, or any other resource, whose content is different each time the user requests it from the server. For example, we may have a .css file with the following content:

.cachedetect{ width: [incremental number here]px; }

where [incremental number here] is replaced with a different number each time the resource is requested. Then we apply the .cachedetect style to some div element on the page and measure its width using JavaScript. Finally, we can tell whether the div’s width changed after the browser downloaded the .css file for the second time.
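The idea can be sketched in a few lines of JavaScript. Everything here is illustrative: `nextCacheDetectCss`, `widthFromCss` and `cacheLooksEnabled` are made-up names, the in-memory counter stands in for whatever the server would really use, and in a browser the two widths would come from measuring the div (for example via `offsetWidth`) rather than from parsing the CSS text.

```javascript
// Server side (hypothetical): return a different stylesheet body on every request.
let requestCounter = 0;
function nextCacheDetectCss() {
  requestCounter += 1;
  return `.cachedetect{ width: ${requestCounter}px; }`;
}

// Stand-in for measuring the styled div: read the width out of the CSS text.
function widthFromCss(css) {
  const match = /\.cachedetect\s*\{\s*width:\s*(\d+)px/.exec(css);
  return match ? Number(match[1]) : null;
}

// If the second load yields the same width, the second copy most likely
// came from a cache rather than from the server.
function cacheLooksEnabled(firstWidth, secondWidth) {
  return firstWidth === secondWidth;
}
```

If both loads really reach the server, the widths differ and `cacheLooksEnabled` returns false; if the second load is served from cache, the widths match.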

But such a .css file is a bad approach, because browsers (except IE9) have implemented an optimization that avoids loading the same .css file twice if its address is the same. In other words, the following HTML leads to only a single HTTP request to the server:

<link type="text/css" rel="stylesheet" href="cssdetect.css">
<link type="text/css" rel="stylesheet" href="cssdetect.css">

Amusingly, if the first .css request is made from HTML markup and the second one from JavaScript, then even IE9 won’t download it a second time.
