When do cached items become invalid?

There are a number of reasons why a browser might make a conditional or unconditional request for an item that is already in the cache.

Item is expired

The cached data is no longer fresh according to Cache-Control or Expires.

These specify the “freshness lifetime” of a resource, that is, the time period during which the browser can use the cached resource without checking to see if a new version is available from the web server. They are “strong caching headers” that apply unconditionally; that is, once they’re set and the resource is downloaded, the browser will not issue any GET requests for the resource until the expiry date or maximum age is reached.
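As a rough illustration of how a freshness lifetime can be derived from these two headers, here is a minimal Python sketch. The function names `freshness_lifetime` and `is_fresh` are hypothetical, the headers are assumed to arrive as a plain dictionary, and only `max-age` and `Expires` are handled; real browsers consider more directives (for example `no-cache` and `no-store`).

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime as a timedelta, or None if unspecified."""
    # Cache-Control: max-age takes precedence over Expires.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip('"')
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    # Otherwise fall back to Expires minus Date.
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
            return max(expires - date, timedelta(0))
        except (TypeError, ValueError):
            # An unparseable date (e.g. "Expires: 0") is treated as already expired.
            return timedelta(0)
    return None


def is_fresh(headers, age):
    """True if a cached response of the given age can be used without a request."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and age < lifetime
```

With `Cache-Control: max-age=3600`, a response that has been in the cache for ten minutes is still fresh, while one that has been there for two hours is not.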

Browser heuristic decided that the item should be revalidated

The browser applied its heuristic and decided that the item should be revalidated with the server, based on Last-Modified and ETag. These headers specify some characteristic of the resource that the browser checks to determine whether the files are the same. In the Last-Modified header, this is always a date. In the ETag header, this can be any value that uniquely identifies a resource (file versions or content hashes are typical).

Last-Modified is a “weak” caching header in that the browser applies a heuristic to determine whether to fetch the item from cache or not. (The heuristics are different among different browsers.) However, these headers allow the browser to efficiently update its cached resources by issuing conditional GET requests when the user explicitly reloads the page. Conditional GETs don’t return the full response unless the resource has changed at the server, and thus have lower latency than full GETs.
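The heuristic and the conditional GET can be sketched as follows. This is only an illustration: the 10% fraction is the typical value suggested by the HTTP caching specification (RFC 7234), not what any particular browser does, and the function names `heuristic_lifetime` and `conditional_headers` are hypothetical.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_lifetime(headers, fraction=0.1):
    """Estimate a freshness lifetime as a fraction of the time since Last-Modified."""
    try:
        date = parsedate_to_datetime(headers["Date"])
        last_modified = parsedate_to_datetime(headers["Last-Modified"])
        age_at_fetch = date - last_modified
    except (KeyError, TypeError, ValueError):
        # Missing or unparseable dates: no heuristic freshness at all.
        return timedelta(0)
    return max(age_at_fetch * fraction, timedelta(0))


def conditional_headers(cached_headers):
    """Build the request headers for a conditional GET from a cached response."""
    request_headers = {}
    if "ETag" in cached_headers:
        request_headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        request_headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return request_headers
```

If the server answers such a request with 304 Not Modified, the browser keeps using the cached body; a 200 response carries a new body that replaces it.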

User refresh

The user refreshed the page with Ctrl+F5 or another shortcut that ignores the cache and reloads the page.

Cache is full

The cache became full and the browser decided to perform a cache cleanup. The browser uses a Least Recently Used algorithm to determine which (and how many) items should be removed from the cache. So there is a chance that a valid item will be removed from the cache; the next time the user accesses the page, the removed item will be requested from the server again (an unconditional request).

Unfortunately, there is no strict standard for implementing the Least Recently Used algorithm, so each browser has its own version with some tricks and hacks:

Internet Explorer’s cache scavenger is kicked off as the cache limit (size or object count) is approached. Its goal is to remove the least valuable 10% (by size) of the objects in the cache. It does this by enumerating the objects in the cache.

Cache entries that have been used more than once receive an increasing number of reuse points; to get more than 10,000 points, the resource must have been reused over a period longer than 12 hours. This helps prevent resources that were reused frequently but only within a short period (e.g. when you’re browsing around a site you rarely visit) from getting the same amount of credit as a resource that is reused frequently across a long period (e.g. a script on a site that you visit every day).

We also take the MIME type of the resource into account; script, CSS, and HTML/XHTML resources receive full credit, while other resource types (e.g. images, audio, etc) receive only half of their allocated points.
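To make the scoring idea concrete, here is a small Python sketch of a scavenger in the spirit of the description above. It is not Internet Explorer's actual implementation: the entry fields (`url`, `size`, `points`, `mime`), the list of full-credit MIME types, and the function names are all assumptions made for illustration.

```python
# Assumed set of types that receive full credit (script, CSS, HTML/XHTML).
FULL_CREDIT_TYPES = {
    "text/html",
    "application/xhtml+xml",
    "text/css",
    "text/javascript",
    "application/javascript",
}


def effective_points(entry):
    """Halve the reuse points of entries that are not script, CSS, or HTML."""
    if entry["mime"] in FULL_CREDIT_TYPES:
        return entry["points"]
    return entry["points"] / 2


def scavenge(entries, fraction=0.10):
    """Return URLs to evict: the least valuable entries, up to about 10% of total size."""
    target = fraction * sum(entry["size"] for entry in entries)
    evicted, freed = [], 0
    for entry in sorted(entries, key=effective_points):
        if freed >= target:
            break
        evicted.append(entry["url"])
        freed += entry["size"]
    return evicted
```

Under these assumptions, an image with 1,000 reuse points is evicted before a stylesheet with 800 points, because the image only counts for 500.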

JavaScript reload

JavaScript on the page called location.reload(true), where true indicates that the browser must reload the document from the server.

From the Mozilla docs:

reload(forceget) – reload the document from the current URL. forceget is a boolean which, when it is true, causes the page to always be reloaded from the server. If it is false or not specified, the browser may reload the page from its cache.

For those who are interested, here are more than 500 ways to reload a page using JavaScript.

Vary header

The cached item was delivered with a Vary header.

The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation. For uncacheable or stale responses, the Vary field value advises the user agent about the criteria that were used to select the representation.
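A minimal Python sketch of that check is shown below, assuming the cache stored the request headers that produced the cached response alongside the response's Vary value. The function name `vary_matches` and the dictionary-based header representation are hypothetical, and header values are compared only after trimming whitespace, which is stricter than what real caches may do.

```python
def _normalize(headers):
    """Lower-case header names and trim values so lookups are case-insensitive."""
    return {name.lower(): value.strip() for name, value in headers.items()}


def vary_matches(vary, cached_request_headers, new_request_headers):
    """Return True if the new request selects the same cached representation."""
    if vary.strip() == "*":
        # "Vary: *" means the response can never be reused without revalidation.
        return False
    cached = _normalize(cached_request_headers)
    new = _normalize(new_request_headers)
    for field in vary.split(","):
        name = field.strip().lower()
        if name and cached.get(name) != new.get(name):
            return False
    return True
```

For example, a response cached with `Vary: Accept-Encoding` for a request that sent `Accept-Encoding: gzip` would not be reused for a later request that sends `Accept-Encoding: br`.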

META REFRESH

The containing page was navigated to via META REFRESH.

From the spec:

If the navigation was started either as a result of a meta refresh, or the location.reload() method, or other equivalent actions, let the navigation type be TYPE_RELOAD.

(Here are a few tips on how to disable meta-refresh redirects in your browser.)

Invalidation After Updates or Deletions

Invalidation can happen as a side effect of another request that passes through the cache. For example, if the URL associated with a cached response subsequently receives a POST, PUT, or DELETE request, the cached response will be invalidated.
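The behavior can be sketched with a toy in-memory cache. The class `TinyCache` and its method names are hypothetical; the rule that only a non-error response (2xx or 3xx) triggers invalidation follows the HTTP caching specification (RFC 7234), and real caches may also invalidate URLs named in Location and Content-Location headers.

```python
UNSAFE_METHODS = {"POST", "PUT", "DELETE"}


class TinyCache:
    """Toy URL-keyed cache that drops entries after unsafe requests."""

    def __init__(self):
        self._store = {}

    def put(self, url, response):
        self._store[url] = response

    def get(self, url):
        return self._store.get(url)

    def observe(self, method, url, status):
        """Invalidate the stored response after a successful unsafe request."""
        if method.upper() in UNSAFE_METHODS and 200 <= status < 400:
            self._store.pop(url, None)
```

After `cache.observe("PUT", url, 200)`, the next `cache.get(url)` returns `None`, so the item has to be fetched from the server again.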