The Request Sent Bad Data; What’s the Response? – Hacker Noon

Looking at how the RFC’s definition of HTTP 4xx status codes has evolved

Reading about 4xx status codes might seem dull, but they have been a source of (religious?) debate over the years. Much of the ambiguity has since been clarified (“changed” might be a better word) by the Internet Engineering Task Force (IETF), so I will look at what the standard said, what it says now, and the consequences for you, my fellow developer.

Pick your poison

Here’s a pseudo test case, using a plausible but unspecified request library:

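const body = {
  title: "Ethel the Aardvark goes Quantity Surveying",
  ISBN: "no clue" // well-formed string, but not a valid ISBN
};
// `request` and `url` belong to the unspecified request library
request(url, body, { "Content-Type": "application/json" })
  .catch(failure => { /* ... */ });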

Everything is correct: the headers are right and the data is well-formed. However, one of the values is not valid (ISBN), and the server can’t process the request. What are the options?

  1. Return 400 Bad Request status
  2. Return some other 4xx status
  3. Return 200 OK, with an error encoded in the response body

What the standard once said about 400 status code

The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

Circa 1999. Note that it specifies “malformed syntax”. In the ISBN example shown, there was no malformed syntax; the problem is just a bad data value. For this reason, people objected to sending a 400 back for bad data, since the standard was very specific about what 400 meant.

Later on, status code 422 was added. This status code was about the semantics of the request, not its syntax. If the request was well-formed but its data could not be processed, then a 422 response seemed to adhere better to the standard.
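To make the syntax-versus-semantics distinction concrete, here is a minimal sketch of how a server might classify the body from the test case above. The names `classifyBookRequest` and `isPlausibleIsbn13` are hypothetical, and the ISBN check is a simplified assumption (ISBN-13 only), not something the standard prescribes.

```javascript
// Hypothetical helper (assumption): accepts ISBN-13 only, ignoring hyphens and spaces.
function isPlausibleIsbn13(value) {
  const digits = String(value).replace(/[-\s]/g, "");
  if (!/^\d{13}$/.test(digits)) return false;
  // ISBN-13 checksum: digits weighted 1,3,1,3,... must sum to a multiple of 10.
  const sum = [...digits].reduce(
    (acc, ch, i) => acc + Number(ch) * (i % 2 === 0 ? 1 : 3),
    0
  );
  return sum % 10 === 0;
}

// Hypothetical classifier: which status would each reading of the standard pick?
function classifyBookRequest(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch {
    // Malformed syntax: the case the 1999 wording of 400 describes.
    return { status: 400, reason: "malformed JSON" };
  }
  if (body === null || typeof body !== "object" || !isPlausibleIsbn13(body.ISBN)) {
    // Well-formed, but the data cannot be processed: the 422 reading.
    return { status: 422, reason: "invalid or missing ISBN" };
  }
  return { status: 200, reason: "ok" };
}
```

Under this sketch, the "no clue" ISBN from the test case parses fine as JSON and only fails the second check, which is exactly why the 1999 wording of 400 felt like a poor fit for it.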

Alas, times have changed, and sending a 422 in response to bad data is seemingly frowned upon.

What the standard now says

Circa 2014. Here’s the latest text:

The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).

This is a broader definition than the original, not restricted to syntax errors, but it doesn’t mention data errors specifically. What to make of that? One clue can be found by taking a look at what it says about custom status codes:

HTTP status codes are extensible. HTTP clients are not required to
understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, a client MUST
understand the class of any status code, as indicated by the first
digit, and treat an unrecognized status code as being equivalent to
the x00 status code of that class, with the exception that a
recipient MUST NOT cache a response with an unrecognized status code.

So your client code doesn’t have to know how to handle every status code, just the class (first digit) of the status code. Okay, but what do you do with that first digit?

For example, if an unrecognized status code of 471 is received by a client, the client can assume that there was something wrong with its request and treat the response as if it had received a 400 (Bad Request) status code. The response message will usually contain a representation that explains the status.

So here are the relevant clues. The first is that there is no 471 status code registered with the Internet Assigned Numbers Authority (IANA), so the spec is not speaking of a “standard” status code. The second is that it goes on to say that if a status code is unrecognized, the client should handle it like a 400 status code.

The implication here is that your 400 status code handling logic is the same logic used for unrecognized status codes. Therefore, the 400 status code handler logic of your client is your fallback, generic, all-purpose handler. Any 4xx error that occurs should be capable of being dealt with by your 400 status code handling logic (even if less than ideally).
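A minimal sketch of that fallback rule on the client side might look like the following. The `pickHandler` function and the `handlers` table are hypothetical names for illustration; the only part taken from the standard is the rule that an unrecognized code is treated as the x00 code of its class.

```javascript
// Hypothetical dispatcher: a known code gets its specific handler,
// anything else falls back to the x00 handler of its class.
function pickHandler(status, handlers) {
  if (handlers[status]) return handlers[status];
  const classFallback = Math.floor(status / 100) * 100; // e.g. 471 -> 400
  return handlers[classFallback];
}

// Illustrative handler table (assumption): each handler just returns a label.
const handlers = {
  400: () => "generic client error",
  404: () => "not found",
  500: () => "generic server error",
};
```

With this table, `pickHandler(471, handlers)` returns the same function as `pickHandler(400, handlers)`, so the 400 handler ends up being the catch-all for every 4xx code the client has no specific logic for.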

So, can I define my own 4xx status codes then?

You can, but there should be a reason to do so. Needless to say, all error statuses, custom or otherwise, should be documented in your API.

The only good reason to use a non-standard code (and there are plenty of them) is that you expect it to be handled differently from a 400 error. In other words, you are using a special status code to (hopefully) trigger better handling behavior in the client.

What if I later find out that my custom status code conflicts with some other custom status code defined elsewhere?

You define what your server’s status codes mean, and you’ve documented them for every request you support. You never, ever, ever have your service act as a proxy for another service’s status codes! That would be bad. If you need to relay information to a client from another service that uses a status code for other reasons, then pass that information in the error message (using whatever documented status code is appropriate), or log it.
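As a rough illustration of relaying information without proxying the status code, the sketch below wraps an upstream failure in the service's own documented error. The function `relayUpstreamError`, its parameters, and the shape of the returned object are all assumptions made for this example.

```javascript
// Hypothetical translation layer: the upstream status travels in the body,
// while the response status is one this service has documented itself.
function relayUpstreamError(ownStatus, upstream) {
  return {
    status: ownStatus, // chosen by this service, never copied from upstream
    body: {
      message: `Request failed because ${upstream.service} rejected it`,
      upstream: {
        service: upstream.service,
        status: upstream.status, // relayed as information only
        message: upstream.message,
      },
    },
  };
}
```

A caller might use it as `relayUpstreamError(400, { service: "inventory", status: 471, message: "unknown SKU" })`. The client then sees a documented 400 and can still read the upstream 471 from the body if it cares to.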

What about option #3: 200 OK with error body?

I’ll give you my opinion, based on personal frustration:

Was the request really okay? If not, why say it was by passing back a 200 OK and then say, in the body of the response, that it wasn’t? You contradict yourself by doing that, and you make the client do extra work.

The other issue is: What is the shape of your error response? Is it a simple string? An object? Is it reasonable to expect the client to have custom code to parse and display YOUR error body, as well as B’s and C’s and D’s services’ error bodies? Isn’t this non-standard? Wasn’t the whole point of your not using a 4xx code to stick to the standards (as you interpreted them)?
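The extra work can be sketched as follows. The service names and the three error-body shapes are invented for illustration; the point is only that each service hiding errors behind 200 OK needs its own branch.

```javascript
// Hypothetical client check: with proper status codes the first line is enough;
// with 200-plus-error-body, every service needs its own body inspection.
function isFailure(serviceName, status, body) {
  if (status >= 400) return true; // the status code already says so
  switch (serviceName) {
    case "A": // assumed shape: plain string starting with "ERROR"
      return typeof body === "string" && body.startsWith("ERROR");
    case "B": // assumed shape: object with an `error` field
      return body != null && body.error !== undefined;
    case "C": // assumed shape: object with `success: false`
      return body != null && body.success === false;
    default:
      return false; // unknown shape: the failure goes unnoticed
  }
}
```

Note the `default` branch: a service whose error shape the client has not been taught is silently treated as a success, which cannot happen when the failure is signalled by a 4xx status.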

400 is okay in most cases

A 400 status code handler is the fallback handler. As per the standard, it has to handle all 4xx status codes that are not handled elsewhere. Those may be registered status codes, or custom.

The one exception could be one of prioritization. It may be that the usual data-related 400 status is due to a data validation error, caught by the server, but missed by the client. Something to fix, but perhaps not urgent. On the other hand, there might be some condition occurring on the server (SQL syntax error, for instance), that needs immediate attention from a database administrator. A more specific status code, sent to and handled by the client, may alert the user to call the help desk (or, have the server send a 5xx code instead).
