a brief history of 404
404 is an HTTP status code. Every time you visit a web page, your computer (the ‘client’) requests data from a server using HTTP, the Hypertext Transfer Protocol. Before the web page is even displayed in your browser, the web server has sent the HTTP response headers, which begin with a status line containing the status code. Your browser, for its part, has already sent the server its own headers, which contain a lot more information about you than you think!
For a normal web page, the status is 200 OK. You don’t see this because the server proceeds to send you the contents of the page. It’s only when you encounter an error that you see the actual status code, such as 404 Not Found.
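To make this concrete, here is a minimal Python sketch that pulls the status code out of a raw HTTP response. The two sample responses are written by hand for illustration (they are not captured from a real server), and the function name parse_status_line is our own invention.

```python
def parse_status_line(raw_response: bytes) -> tuple[str, int, str]:
    """Split the first line of an HTTP response into version, code and reason."""
    # The status line is everything before the first CRLF.
    status_line = raw_response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    version, _, rest = status_line.partition(" ")
    code, _, reason = rest.partition(" ")
    return version, int(code), reason


# Hypothetical responses, hand-written for this example.
ok = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
missing = b"HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n\r\n"

print(parse_status_line(ok))       # ('HTTP/1.1', 200, 'OK')
print(parse_status_line(missing))  # ('HTTP/1.1', 404, 'Not Found')
```

In both cases the status arrives before any page content, which is why the browser can decide what to show you before it has read the body.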
HTTP status codes date from 1992, when they appeared in an early draft of the HTTP specification; the original HTTP 0.9 protocol had no status line at all, and the World Wide Web Consortium (W3C) was not founded until 1994. They were defined by Tim Berners-Lee, the same person who single-handedly invented the web and the first web browser in 1990. We at the 404 Research Lab like to think of him as The Man Who Made All Of This Possible.
Berners-Lee based the HTTP status codes on FTP status codes, which were already well established by 1990; the official FTP spec (RFC 959) is dated 1985, although FTP had been in use much longer.
Let’s dissect 404.
The first 4 indicates a client error. The server is saying that you’ve done something wrong, such as misspelling the URL or requesting a page which is no longer there. By contrast, a 5xx error indicates a server-side problem, which may be transient; if you try the request again, it may work.
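The middle 0 refers to a general syntax error, a meaning inherited from the FTP reply codes. This could indicate a spelling mistake. In HTTP itself, only the first digit defines the class of a response; the last two digits have no categorization role.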
The last 4 just indicates the specific error in the group of 40x, which also includes 400: Bad Request, 401: Unauthorized, etc.
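The first-digit grouping can be sketched in a few lines of Python. This is only an illustration: the describe function and the CLASSES table are our own names, the 1xx to 3xx class labels come from the HTTP specification rather than from the text above, and the reason phrases are looked up in Python's standard http.HTTPStatus enum.

```python
from http import HTTPStatus

# Class of a response, keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return a one-line description such as '404 Not Found (client error)'."""
    status_class = CLASSES.get(code // 100, "unknown class")
    try:
        reason = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one that Python's standard library knows about.
        reason = "unregistered code"
    return f"{code} {reason} ({status_class})"


for code in (200, 400, 401, 404, 500):
    print(describe(code))
```

Note that the class is decided by the first digit alone, so even a code the library has never heard of (say, 499) still lands in the client error group.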
It’s been said that 404 was named after a room at CERN (if you read about Tim Berners-Lee above, you’ll know that that’s where the web began) where the original web servers were located. However, Tom S. tells us:
“Having visited CERN myself, I can tell you that Room 404 is not on the fourth floor – the CERN office numbering system doesn’t work like that – the first digit usually refers to the *building* number (ie. building 4), and the second two to the office number. But, strangely, there is no room “04” in building “4”, the offices start at “410” and work upwards – don’t ask me why. Sorry to disappoint you all, but there is no Room 404 in CERN – it simply doesn’t exist, and certainly hasn’t been preserved as “the place where the web began”. In fact, there *is* a display about this, including a model of the first NeXT server, but the whole “Room 404” thing is just a myth.”
According to the HTTP specification, 404 Not Found is only supposed to be used in cases where the server cannot find the requested location and is unsure of its status. If a page has been permanently deleted, the server is supposed to return 410: Gone to indicate a permanent change. But has anyone ever seen 410? It must be 404…
You can find a detailed explanation here.
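For the curious, the difference between 404 and 410 might look like this on the server side. This is a sketch under assumptions, not how any particular web server does it: the status_for function and the two sets of paths are hypothetical, and a real server would need some record of which pages were deliberately removed.

```python
from http import HTTPStatus


def status_for(path: str, live_pages: set[str], removed_pages: set[str]) -> HTTPStatus:
    """Pick a status for a requested path (hypothetical helper)."""
    if path in live_pages:
        return HTTPStatus.OK
    if path in removed_pages:
        # The server knows this page was deleted on purpose: 410 Gone.
        return HTTPStatus.GONE
    # The server has no idea what happened to this path: 404 Not Found.
    return HTTPStatus.NOT_FOUND


# Hypothetical site contents.
live = {"/index.html", "/about.html"}
removed = {"/old-news.html"}

for path in ("/index.html", "/old-news.html", "/abuot.html"):
    status = status_for(path, live, removed)
    print(path, status.value, status.phrase)
```

Most servers never keep the list of removed pages, which is probably why 410 is such a rare sight.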
If you have access to the log files for your website, take a look at them. You’ll find that one of the fields is the HTTP status code. Check whether anyone visiting your site got a 404. If you notice consistent errors, see what the referring document is. Do you have a broken link on your site? Does another site link to you with a misspelled URL? These are things you can correct easily, which will help prevent 404 errors on your website. For more tips on 404 errors, visit 404 Pros.
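If you would rather not read the logs by eye, a short script can do the counting. The sketch below assumes your server writes the Apache-style "combined" log format (request, status, size, then the referrer in quotes); your own logs may differ, so treat the regular expression as a starting point. The sample lines, addresses and the find_404s name are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: "METHOD PATH PROTOCOL" STATUS SIZE "REFERRER" ...
LOG_LINE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)"'
)


def find_404s(lines):
    """Count 404 responses, grouped by (requested path, referring document)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[(match.group("path"), match.group("referrer"))] += 1
    return hits


# Hypothetical log lines, hand-written for this example.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBrowser/1.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /abuot.html HTTP/1.1" 404 512 "http://example.com/index.html" "ExampleBrowser/1.0"',
    '203.0.113.9 - - [10/Oct/2023:13:57:12 +0000] "GET /abuot.html HTTP/1.1" 404 512 "http://example.com/index.html" "ExampleBrowser/1.0"',
    '198.51.100.7 - - [10/Oct/2023:14:02:45 +0000] "GET /old-page.html HTTP/1.1" 404 512 "http://other.example/links.html" "ExampleBrowser/1.0"',
]

for (path, referrer), count in find_404s(sample_log).most_common():
    print(f"{count} x {path} <- {referrer}")
```

Here the repeated /abuot.html entry points back to a page on the same site, which suggests a broken link you can fix yourself; the /old-page.html entry comes from another site, so the remedy there is a redirect or a friendly note to its owner.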