3-HTTP Messages

Please indicate the source: http://blog.csdn.net/gaoxiangnumber1

Welcome to my github: https://github.com/gaoxiangnumber1

3.1 The Flow of Messages

HTTP messages are the blocks of data sent between HTTP applications. These blocks of data begin with some text meta-information describing the message contents and meaning, followed by optional data. These messages flow between clients, servers, and proxies.

3.1.1 Messages Commute Inbound to the Origin Server

HTTP uses the terms inbound and outbound to describe transactional direction. Messages travel inbound to the origin server, and when their work is done, they travel outbound back to the user agent(Figure 3-1).

3.1.2 Messages Flow Downstream

All messages flow downstream, regardless of whether they are request messages or response messages(Figure 3-2). The sender of any message is upstream of the receiver.

3.2 The Parts of a Message

Each message contains either a request from a client or a response from a server(Figure 3-3). A message consists of three parts: a start line describing the message, a block of headers containing attributes, and an optional body containing data.
The start line and headers are ASCII text, broken up by lines. Each line ends with a two-character end-of-line sequence: a carriage return(‘\r’, ASCII 13) followed by a line-feed character(‘\n’, ASCII 10). This end-of-line sequence is written “CRLF”. Robust applications also should accept just a line-feed character as the end of a line.
The body is an optional chunk of data that can contain text or binary data or can be empty.

3.2.1 Message Syntax

All HTTP messages fall into two types: request messages(that request an action from a web server) and response messages(that carry results of a request back to a client). Both request and response messages have the same basic message structure. Figure 3-4 shows request and response messages to get a GIF image.

Format of a request message:

<method> <request-URL> <version>
<headers>

<entity-body>

Format of a response message(note that the syntax differs only in the start line):

<version> <status> <reason-phrase>
<headers>

<entity-body>

method
The action that the client wants the server to perform on the resource.
request-URL
A complete URL naming the requested resource, or the path component of the URL. If you are talking directly to the server, the path component of the URL is usually okay as long as it is the absolute path to the resource(the server can assume itself as the host/port of the URL).
version
The version of HTTP that the message is using. Its format:
HTTP/<major>.<minor>
major and minor both are integers.
status
A three-digit number describing what happened during the request. The first digit of each code describes the general class of status(“success,” “error,” etc.). An exhaustive list of status codes defined in the HTTP specification and their meanings is provided later in this chapter.
reason-phrase
A human-readable version of the numeric status code, consisting of all the text until the end-of-line sequence. The reason phrase is meant solely for human consumption, so, response lines containing “HTTP/1.0 200 NOT OK” and “HTTP/1.0 200 OK” should be treated as equivalent success indications.
headers
Zero or more headers, each of which is a name, followed by a colon(:), followed by optional whitespace, followed by a value, followed by a CRLF. The headers are terminated by a blank line(CRLF), marking the end of the list of headers and the beginning of the entity body. Some versions of HTTP, such as HTTP/1.1, require certain headers to be present for the request or response message to be valid.
entity-body
The entity body contains a block of arbitrary data. Not all messages contain entity bodies, so sometimes a message terminates with a bare CRLF.
Figure 3-5 demonstrates hypothetical request and response messages.

A set of HTTP headers should always end in a blank line(bare CRLF), even if there are no headers and even if there is no entity body. Historically, many clients and servers omitted the final CRLF if there was no entity body. To interoperate with these popular implementations, clients and servers should accept messages that end without the final CRLF.
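
The three-part structure described above can be sketched in a few lines of Python. This is only an illustration: the function name parse_message is an assumption, not part of any library, and the sketch does not handle header continuation lines. It accepts bare line feeds and a missing final CRLF, in the tolerant spirit recommended above.

```python
import re

def parse_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body).

    Hypothetical helper for illustration only; accepts CRLF or bare LF.
    """
    # The first blank line separates the header block from the entity body.
    match = re.search(rb"\r?\n\r?\n", raw)
    if match:
        head, body = raw[:match.start()], raw[match.end():]
    else:
        # Tolerate messages that omit the final blank line (no entity body).
        head, body = raw.rstrip(b"\r\n"), b""
    lines = re.split(rb"\r?\n", head)
    start_line = lines[0].decode("ascii")
    headers = []
    for line in lines[1:]:
        # name, colon, optional whitespace, value
        name, _, value = line.decode("ascii").partition(":")
        headers.append((name, value.strip()))
    return start_line, headers, body
```

Returning the headers as a list of pairs rather than a dictionary keeps their order and allows repeated header names.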

3.2.2 Start Lines

All HTTP messages begin with a start line. The start line of a request message says what to do; the start line of a response message says what happened.

3.2.2.1 Request line

The start line for a request message(request line) contains:
<method> <request-URL> <version>
1. A method describing what operation the server should perform.
2. A request URL describing the resource on which to perform the method.
3. An HTTP version telling the server what dialect of HTTP the client is speaking.
  All these fields are separated by whitespace.
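
As a rough sketch, the request line can be split on whitespace into its three fields. The name parse_request_line is hypothetical, and the sketch rejects anything that does not have exactly three fields, so it would not accept an HTTP/0.9 request, which carries no version.

```python
def parse_request_line(line: str):
    """Return (method, request_url, version) from a request line.

    Illustrative helper, not a library function.
    """
    parts = line.split()  # fields are separated by whitespace
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, request_url, version = parts
    return method, request_url, version
```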

3.2.2.2 Response line

Response messages carry status information and any resulting data from an operation back to a client. The start line for a response message(response line) contains:
<version> <status> <reason-phrase>
1. HTTP version that the response message is using.
2. A numeric status code.
3. A textual reason phrase describing the status of the operation.
  All these fields are separated by whitespace.
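
A response line can be parsed in a similar way, with one difference: the reason phrase is all the text up to the end of the line and may itself contain spaces, so only the first two fields are split off. The sketch below is an assumption-laden illustration (parse_response_line is a made-up name), not a complete parser.

```python
def parse_response_line(line: str):
    """Return (version, status, reason_phrase) from a response start line.

    Illustrative helper, not a library function.
    """
    # Split off only the first two fields; the remainder is the reason phrase.
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError(f"malformed response line: {line!r}")
    version, status_text = parts[0], parts[1]
    # The status code is a three-digit number.
    if len(status_text) != 3 or not (status_text.isascii() and status_text.isdigit()):
        raise ValueError(f"bad status code: {status_text!r}")
    reason = parts[2] if len(parts) == 3 else ""
    return version, int(status_text), reason
```

Because the reason phrase is meant only for humans, callers should branch on the numeric status and never on the phrase.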

3.2.2.3 Methods

The method begins the start line of requests, telling the server what to do. The HTTP specifications have defined a set of common request methods. Table 3-1.

Not all servers implement all seven of the methods in Table 3-1. Because HTTP was designed to be extensible, other servers may implement their own request methods in addition to these; such additional methods are called extension methods.

3.2.2.4 Status codes

As methods tell the server what to do, status codes tell the client what happened. Status codes are returned in the start line of each response message. Both a numeric and a human-readable status are returned. The different status codes are grouped into classes by their three-digit numeric codes. Table 3-2.

Table 3-3 lists the most common status codes.
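
Because the class of a status code is determined by its first digit, the class can be derived with integer division. A minimal sketch, using the class names from Section 3.4 (the names STATUS_CLASSES and status_class are illustrative):

```python
# Class names follow the section headings 3.4.1 through 3.4.5.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code: int) -> str:
    """Map a three-digit status code to its general class."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return STATUS_CLASSES[code // 100]
```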

3.2.2.5 Reason phrases

The reason phrase is the last component of the start line of the response and it provides a textual explanation of the status code. Reason phrases are paired one-to-one with status codes. The HTTP specification does not provide any hard and fast rules for what reason phrases should look like.

3.2.2.6 Version numbers

Version numbers appear in both request and response message start lines in the format HTTP/x.y. They provide a means for HTTP applications to tell each other what version of the protocol they conform to.
Version numbers are intended to provide applications speaking HTTP with a clue about each other’s capabilities and the format of the message. An HTTP Version 1.2 application communicating with an HTTP Version 1.1 application should know that it should not use any new 1.2 features.
The version number indicates the highest version of HTTP that an application supports. This can lead to confusion between applications because HTTP/1.0 applications interpret a response with HTTP/1.1 in it to indicate that the response is a 1.1 response, when in fact that’s just the level of protocol used by the responding application.
Each number in the version(e.g., “1” and “0” in HTTP/1.0) is treated as a separate number. When comparing HTTP versions, each number must be compared separately in order to determine which is the higher version. E.g., HTTP/2.22 is a higher version than HTTP/2.3 because 22 is a larger number than 3.
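
The comparison rule can be sketched by parsing the version into a pair of integers and comparing the pairs. The helper name parse_version is an assumption for illustration.

```python
def parse_version(version: str):
    """Parse 'HTTP/<major>.<minor>' into a (major, minor) tuple of integers."""
    prefix, _, numbers = version.partition("/")
    if prefix != "HTTP":
        raise ValueError(f"not an HTTP version: {version!r}")
    major, _, minor = numbers.partition(".")
    return int(major), int(minor)

# Tuples compare element by element, so 22 is correctly seen as larger than 3.
newer = parse_version("HTTP/2.22") > parse_version("HTTP/2.3")
```

Comparing the text after the slash as a decimal fraction would get this wrong, since 2.22 is smaller than 2.3 as a number.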

3.2.3 Headers

HTTP header fields add additional information to request and response messages. They are lists of name/value pairs. E.g., the following header line assigns 19 to the Content-Length header field:
Content-length: 19

3.2.3.1 Header classifications

The HTTP specification defines several header fields. Applications are free to invent their own home-brewed headers. HTTP headers are classified into:
　　General headers: Can appear in both request and response messages.
　　Request headers: Provide more information about the request.
　　Response headers: Provide more information about the response.
　　Entity headers: Describe body size and contents, or the resource itself.
　　Extension headers: New headers that are not defined in the specification.
HTTP header syntax: [name][:][optional whitespace][field value][“\r\n”]. Table 3-4 lists some common header examples. Appendix C summarizes all the headers.

3.2.3.2 Header continuation lines

Long header lines can be made more readable by breaking them into multiple lines, preceding each extra line with at least one space or tab character. E.g.:

HTTP/1.0 200 OK
Content-Type: image/gif
Content-Length: 8572
Server: Test Server
    Version 1.0
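
In the example above, the complete value of the Server header is “Test Server Version 1.0”. A parser has to rejoin such continuation lines before splitting names from values. A minimal sketch of that step, assuming the header block has already been split into lines (unfold_headers is a hypothetical name):

```python
def unfold_headers(lines):
    """Join continuation lines (those starting with a space or tab)
    onto the header line that precedes them."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t") and unfolded:
            # Continuation: append to the previous header, separated by one space.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded
```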

3.2.4 Entity Bodies

The third part of an HTTP message is the optional entity body. Entity bodies are the payload of HTTP messages: they are the things that HTTP was designed to transport.

3.2.5 Version 0.9 Messages

Figure 3-6. HTTP/0.9 messages consisted of requests and responses, but the request contained merely the method and the request URL, and the response contained only the entity. No version information, no status code or reason phrase, and no headers were included.

3.3 Methods

Not all of the methods in Table 3-1 are implemented by every server. To be compliant with HTTP Version 1.1, a server need implement only the GET and HEAD methods for its resources.
Even when servers do implement all of these methods, the methods have restricted uses. E.g., servers that support DELETE or PUT would not want anyone to be able to delete or store resources. These restrictions generally are set up in the server’s configuration, so they vary from site to site and from server to server.

3.3.1 Safe Methods

HTTP defines a set of methods that are called safe methods. The GET and HEAD methods are said to be safe, meaning that no action should occur on the server as a result of an HTTP request that uses them. By contrast, consider shopping online at Joe’s Hardware and clicking the “submit purchase” button. Clicking the button submits a POST request with your credit card information, and an action is performed on the server on your behalf. In this case, the action is your credit card being charged for your purchase.
There is no guarantee that a safe method won’t cause an action to be performed. Safe methods are meant to allow HTTP application developers to let users know when an unsafe method that may cause some action to be performed is being used. In this example, your web browser may pop up a warning message letting you know that you are making a request with an unsafe method and something might happen on the server(e.g., your credit card being charged).
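
A client could implement the warning described above with a simple membership test. This is only a sketch; the names SAFE_METHODS and should_warn_user are hypothetical, and a real browser's policy would be more involved.

```python
# GET and HEAD are the methods described as safe in this section.
SAFE_METHODS = frozenset({"GET", "HEAD"})

def should_warn_user(method: str) -> bool:
    """Return True when the request uses an unsafe method, meaning
    some action may be performed on the server."""
    return method not in SAFE_METHODS
```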

3.3.2 GET

GET is used to ask a server to send a resource. HTTP/1.1 requires servers to implement this method. Figure 3-7.

3.3.3 HEAD

The HEAD method behaves exactly like the GET method, but the server returns only the headers in the response; no entity body is ever returned. This allows a client to inspect the headers for a resource without having to get the resource. Using HEAD, you can:
1. Find out about a resource(e.g., determine its type) without getting it.
2. See if an object exists, by looking at the status code of the response.
3. Test if the resource has been modified, by looking at the headers.
Server developers must ensure that the headers returned are exactly those that a GET request would return. The HEAD method also is required for HTTP/1.1 compliance. Figure 3-8.

3.3.4 PUT

The PUT method writes documents to a server. Some publishing systems let you create web pages and install them on a web server using PUT(Figure 3-9).

The semantics of the PUT method are for the server to take the body of the request and either use it to create a new document named by the requested URL or, if that URL already exists, use the body to replace it.

3.3.5 POST

The POST method was designed to send input data to the server and it is often used to support HTML forms. The data from a filled-in form typically is sent to the server, which then marshals it off to where it needs to go(e.g., to a server gateway program, which then processes it). Figure 3-10 shows a client making an HTTP request sending form data to a server with the POST method.

POST is used to send data to a server; PUT is used to deposit data into a resource on the server(e.g., a file).

3.3.6 TRACE

When a client makes a request, the request may travel through firewalls, proxies, gateways, or other applications. Each of these can modify the original HTTP request. The TRACE method allows clients to see how their request looks when it finally reaches the server.
A TRACE request initiates a “loopback” diagnostic at the destination server. The server at the final leg of the trip bounces back a TRACE response, with the request message it received in the body of its response. A client can then see how its original message was modified along the request/response chain of any intervening HTTP applications(Figure 3-11).

The TRACE method is used for diagnostics; i.e., verifying that requests are going through the request/response chain as intended. It’s also used for seeing the effects of proxies and other applications on your requests.
TRACE has the drawback of assuming that intervening applications will treat different types of requests(i.e., different methods) the same. Many HTTP applications do different things depending on the method, e.g., a proxy might pass a POST request directly to the server but attempt to send a GET request to another HTTP application(such as a web cache). TRACE does not provide a mechanism to distinguish methods. Generally, intervening applications make the call as to how they process a TRACE request.
No entity body can be sent with a TRACE request. The entity body of the TRACE response contains the request that the responding server received.
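
The server side of the loopback can be sketched as a function that wraps the request bytes it received in a response body. The function name and the sample request in the usage note are assumptions for illustration; the message/http content type is the one RFC 2616 specifies for TRACE responses.

```python
def build_trace_response(received_request: bytes) -> bytes:
    """Echo the request exactly as received in the body of a 200 response."""
    head = b"\r\n".join([
        b"HTTP/1.1 200 OK",
        b"Content-Type: message/http",
        b"Content-Length: " + str(len(received_request)).encode("ascii"),
        b"",  # these two empty items yield the blank line
        b"",  # that terminates the header block
    ])
    return head + received_request
```

If a proxy added a Via header on the way in, for example, that header shows up in the echoed body, which is how the client learns what changed in transit.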

3.3.7 OPTIONS

The OPTIONS method asks the server to tell us about the various supported capabilities of the web server. You can ask a server about what methods it supports in general or for particular resources.
This provides a means for client applications to determine how best to access various resources without actually having to access them. Figure 3-12.

3.3.8 DELETE

The DELETE method asks the server to delete the resources specified by the request URL. The client application is not guaranteed that the delete is carried out. This is because the HTTP specification allows the server to override the request without telling the client. Figure 3-13.

3.3.9 Extension Methods

Extension methods are methods that are not defined in the HTTP/1.1 specification. They provide developers with a means of extending the capabilities of the HTTP services their servers implement on the resources that the servers manage. Common extension methods are listed in Table 3-5. These methods are all part of the WebDAV HTTP extension(Chapter 19) that helps support publishing of web content to web servers over HTTP.

Not all extension methods are defined in a formal specification. If you define an extension method, it is likely not to be understood by most HTTP applications. Likewise, your HTTP applications could run into extension methods, used by other applications, that they do not understand.
In these cases, it is best to be tolerant of extension methods. Proxies should try to relay messages with unknown methods through to downstream servers if they are capable of doing that without breaking end-to-end behavior. Otherwise, they should respond with a 501 Not Implemented status code.
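
The tolerant behavior recommended for proxies can be sketched as a small decision function. The names below are hypothetical, and whether a proxy can relay an unknown method without breaking end-to-end behavior is reduced here to a single boolean that a real proxy would have to work out for itself.

```python
# The seven methods covered in Sections 3.3.2 through 3.3.8.
KNOWN_METHODS = frozenset(
    {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}
)

def proxy_action(method: str, can_relay_unknown: bool) -> str:
    """Decide what a proxy does with a request, based on its method."""
    if method in KNOWN_METHODS or can_relay_unknown:
        return "relay"
    return "501 Not Implemented"
```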

3.4 Status Codes

Table 3-2.

3.4.1 100-199: Informational Status Codes

Table 3-6 lists the defined informational status codes.

100 Continue status code: It’s intended to optimize the case where an HTTP client application has an entity body to send to a server but wants to check that the server will accept the entity before it sends it.

3.4.1.1 Clients and 100 Continue

If a client is sending an entity to a server and is willing to wait for a 100 Continue response before it sends the entity, the client needs to send an Expect request header(Appendix C) with the value 100-continue. If the client is not sending an entity, it shouldn’t send a 100-continue Expect header, because this will only confuse the server into thinking that the client might be sending an entity.
A client application should use 100-continue only to avoid sending a server a large entity that the server will not be able to handle or use. Clients that send an Expect header for 100-continue should not wait forever for the server to send a 100 Continue response. After some timeout, the client should send the entity.
In practice, client implementors also should be prepared to deal with unexpected 100 Continue responses. Some errant HTTP applications send this code inappropriately.

3.4.1.2 Servers and 100 Continue

If a server receives a request with the Expect header and 100-continue value, it should respond with either the 100 Continue response or an error code(Table 3-9). Servers should never send a 100 Continue status code to clients that do not send the 100-continue expectation. But some errant servers do this.
If the server receives some(or all) of the entity before it has had a chance to send a 100 Continue response, it does not need to send this status code, because the client already has decided to continue. When the server is done reading the request, it still needs to send a final status code for the request(it can skip the 100 Continue status).
If a server receives a request with a 100-continue expectation and it decides to end the request before it has read the entity body(e.g., because an error has occurred), it should not just send a response and close the connection, since this can prevent the client from receiving the response(Section 4.7.4.2).

3.4.1.3 Proxies and 100 Continue

When a proxy receives from a client a request that contains the 100-continue expectation:
1. If the proxy either knows that the next-hop server(Chapter 6) is HTTP/1.1 compliant or does not know what version the next-hop server is compliant with, it should forward the request with the Expect header in it.
2. If it knows that the next-hop server is compliant with a version of HTTP earlier than 1.1, it should respond with the 417 Expectation Failed error.
If a proxy decides to include an Expect header and 100-continue value in its request on behalf of a client that is compliant with HTTP/1.0 or earlier, it should not forward the 100 Continue response(if it receives one from the server) to the client, because the client won’t know what to make of it.
It can pay for proxies to maintain state about next-hop servers and the versions of HTTP they support(at least for servers that have received requests), so they can better handle requests received with a 100-continue expectation.
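
The proxy rules above can be summarized as two small decisions. In this sketch, versions are (major, minor) tuples, None stands for "version unknown", and all function names are assumptions made for illustration.

```python
def proxy_expect_action(next_hop_version):
    """Decide how to handle a client request carrying 'Expect: 100-continue'.

    next_hop_version is a (major, minor) tuple, or None if unknown.
    """
    if next_hop_version is None or next_hop_version >= (1, 1):
        return "forward request with Expect header"
    return "respond 417 Expectation Failed"

def relay_100_continue(client_version, proxy_added_expect: bool) -> bool:
    """Decide whether a 100 Continue from the server is passed to the client."""
    # A client speaking HTTP/1.0 or earlier never asked for 100 Continue
    # and won't know what to make of it.
    if proxy_added_expect and client_version < (1, 1):
        return False
    return True
```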

3.4.2 200-299: Success Status Codes

Servers have an array of status codes to indicate success, matched up with different types of requests. Table 3-7.

3.4.3 300-399: Redirection Status Codes

The redirection status codes either tell clients to use alternate locations for the resources they’re interested in or provide an alternate response instead of the content. If a resource has moved, a redirection status code and an optional Location header can be sent to tell the client that the resource has moved and where it can now be found(Figure 3-14). This allows browsers to go to the new location transparently, without bothering their human users.

Some of the redirection status codes can be used to validate an application’s local copy of a resource with the origin server. E.g., an HTTP application can check if the local copy of its resource is still up-to-date or if the resource has been modified on the origin server. Figure 3-15.

The client sends an If-Modified-Since header saying to get the document only if it has been modified since October 1997. The document has not changed since this date, so the server replies with a 304 status code instead of the contents.
It’s good practice for responses to non-HEAD requests that include a redirection status code to include an entity with a description and links to the redirected URL(the first response message in Figure 3-14). Table 3-8.
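
The server's side of the If-Modified-Since exchange described above can be sketched as follows. The function name and the dates in the usage note are illustrative assumptions; parsedate_to_datetime comes from Python's standard email.utils module and is used here to parse the HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 200 if the resource changed after the given date, else 304.

    last_modified must be a timezone-aware datetime.
    """
    since = parsedate_to_datetime(if_modified_since)
    return 200 if last_modified > since else 304
```

For instance, with a header value of "Fri, 31 Oct 1997 19:43:31 GMT" and a document last changed in September 1997, the function returns 304, so the server can skip sending the contents.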

The 302, 303, and 307 status codes overlap a bit. There is some nuance to how these status codes are used, most of which stems from differences in the ways that HTTP/1.0 and HTTP/1.1 applications treat these status codes.
When an HTTP/1.0 client makes a POST request and receives a 302 redirect status code in response, it will follow the redirect URL in the Location header with a GET request to that URL.
The HTTP/1.1 specification uses the 303 status code to get this same behavior: servers send the 303 status code to redirect a client’s POST request to be followed with a GET request.
To get around the confusion, the HTTP/1.1 specification says to use the 307 status code in place of the 302 status code for temporary redirects to HTTP/1.1 clients. Servers can then save the 302 status code for use with HTTP/1.0 clients.
What this means is that servers need to check a client’s HTTP version to select which redirect status code to send in a redirect response.
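
A server following this advice might choose the temporary-redirect code from the version in the request line. A minimal sketch, with the version given as a (major, minor) tuple and a hypothetical function name:

```python
def temporary_redirect_status(client_version) -> int:
    """Pick the temporary-redirect status code for a client's HTTP version."""
    # HTTP/1.1 clients get 307; 302 is kept for HTTP/1.0 clients.
    return 307 if client_version >= (1, 1) else 302
```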

3.4.4 400-499: Client Error Status Codes

Sometimes a client sends something that a server can’t handle, such as a badly formed request message or a request for a URL that does not exist.
Many of the client errors are dealt with by your browser, without it ever bothering you. A few, like 404, might pass through. Table 3-9.

3.4.5 500-599: Server Error Status Codes

Sometimes a client sends a valid request, but the server itself has an error. This could be a client running into a limitation of the server or an error in one of the server’s components, such as a gateway resource.
Proxies often run into problems when trying to talk to servers on a client’s behalf. Proxies issue 5XX server error status codes to describe the problem(Chapter 6). Table 3-10 lists the defined server error status codes.

3.5 Headers

Headers and methods work together to determine what clients and servers do. Appendix C summarizes all these headers in more detail.
There are headers that are specific for each type of message and headers that are more general in purpose, providing information in both request and response messages. Headers fall into five main classes.
General headers: These are generic headers used by both clients and servers. E.g., the Date header allows both sides to indicate the time and date at which the message was constructed: Date: Thu, 03 Oct 1974 02:16:00 GMT
Request headers: Request headers are specific to request messages. They provide extra information to servers, such as what type of data the client is willing to receive. E.g., the following Accept header tells the server that the client will accept any media type that matches its request: Accept: */*
Response headers: Response messages have their own set of headers that provide information to the client(e.g., what type of server the client is talking to). E.g., the following Server header tells the client that it is talking to a Version 1.0 Hut server: Server: Hut/1.0
Entity headers: Entity headers refer to headers that deal with the entity body. E.g., the following Content-Type header lets the application know that the data is an HTML document in the asian-1 character set: Content-Type: text/html; charset=asian-1
Extension headers: Extension headers are nonstandard headers that have been created by application developers but not yet added to the sanctioned HTTP specification. HTTP programs need to tolerate and forward extension headers, even if they don’t know what the headers mean.

3.5.1 General Headers

General headers provide information about a message regardless of its type. Table 3-11 lists the general informational headers.

[4] Date lists the acceptable date formats for the Date header.
[5] Chunked transfer codings are discussed in Section 15.6.3.1.

3.5.1.1 General caching headers

HTTP/1.0 introduced headers that allowed HTTP applications to cache local copies of objects instead of always fetching them directly from the origin server. The latest version of HTTP has a set of cache parameters. Table 3-12.

[6] Pragma technically is a request header. It was never specified for use in responses. Because of its common misuse as a response header, many clients and proxies will interpret Pragma as a response header, but the precise semantics are not well defined. In any case, Pragma is deprecated in favor of Cache-Control.

3.5.2 Request Headers

Request headers are headers that make sense only in a request message. They give information about who or what is sending the request, where the request originated, or what the preferences and capabilities of the client are. Servers can use the information the request headers give them about the client to try to give the client a better response. Table 3-13 lists the request informational headers.

[7] Client-IP and the UA-* headers are not defined in RFC 2616 but are implemented by many HTTP client applications.
[8] An RFC 822 email address format.
[9] While implemented by some clients, the UA-* headers can be considered harmful. Content, specifically HTML, should not be targeted at specific client configurations.

3.5.2.1 Accept headers

Accept headers tell servers the client’s preferences and capabilities: what they want, what they can use, and what they don’t want. Servers can use this information to make good decisions about what to send. Table 3-14.

[10] See Section 15.6.2 for more on the TE header.

3.5.2.2 Conditional request headers

Clients can put restrictions on a request by using conditional request headers: e.g., if the client already has a copy of a document, it asks a server to send the document only if it is different from the copy the client already has. Table 3-15.

[11] See Chapter 7 for more on entity tags. The tag is an identifier for a version of the resource.
[12] See Section 15.9 for more on the Range header.

3.5.2.3 Request security headers

HTTP attempts to make transactions secure by requiring clients to authenticate themselves before getting access to certain resources(more in Chapter 14). Table 3-16 lists the request security headers.

[13] The Cookie header is not defined in RFC 2616; it is discussed in Chapter 11.

3.5.2.4 Proxy request headers

Table 3-17 lists the proxy request headers.

[14] See Section 6.6.2.1.

3.5.3 Response Headers

Response headers provide clients with extra information, such as who is sending the response and the capabilities of the responder. These headers help the client deal with the response and make better requests in the future. Table 3-18.

[15] Implies that the response has traveled through an intermediary, possibly from a proxy cache.
[16] The Public header is defined in RFC 2068 but does not appear in the HTTP definition(RFC 2616).
[17] The Title header is not defined in RFC 2616; see the original HTTP/1.0 draft definition(http://www.w3.org/Protocols/HTTP/HTTP2.html).

3.5.3.1 Negotiation headers

HTTP/1.1 provides servers and clients with the ability to negotiate for a resource if multiple representations are available. Table 3-19.

3.5.3.2 Response security headers

Table 3-20 lists the response security headers.

[18] Set-Cookie and Set-Cookie2 are extension headers that are covered in Chapter 11.

3.5.4 Entity Headers

Because both request and response messages can contain entities, these headers can appear in either type of message.
Entity headers tell the receiver of the message what it’s dealing with. Table 3-21.

3.5.4.1 Content headers

The content headers provide information about the content of the entity, revealing its type, size, and other information useful for processing it. For instance, a web browser can look at the content type returned and know how to display the object. Table 3-22.

[19] The Content-Base header is not defined in RFC 2616.

3.5.4.2 Entity caching headers

The general caching headers provide directives about how or when to cache. The entity caching headers provide information about the entity being cached, e.g., information needed to validate whether a cached copy of the resource is still valid and hints about how better to estimate when a cached resource may no longer be valid. Table 3-23.

[20] Entity tags are basically identifiers for a particular version of a resource.

3.6 For More Information
