HTTP file upload RFC
Instead of providing the client with a slot the service MAY respond with an error if the requested file size is too large. In addition the entity MAY inform the requester about the maximum file size.
For any other type of error the service SHOULD respond with appropriate error types to indicate temporary or permanent errors. The retry element MUST include an attribute 'stamp' which indicates the time at which the requesting entity may try again.
There is no further XMPP communication required between the upload service and the client. If the upload fails for whatever reason, the client MAY request a new slot.
Note: This section is not normative; it may be updated when general web security recommendations change in the future. It is recommended to run the HTTP upload domain used for GET requests in appropriate isolation from other HTTP-based services, so that user-generated, malicious scripts cannot be executed in the context of those services.
Isolation techniques can include, but are not limited to, setting the Content-Security-Policy. The provided policy will prohibit a browser from executing any active content from the HTTP upload domain (default-src 'none') and forbid embedding it from other pages (frame-ancestors 'none').
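As a minimal sketch of the policy described above, the following Python helper builds the response header an upload service might attach to GET responses. The function name `isolation_headers` is hypothetical and not part of any specification; only the two directives come from the text.

```python
def isolation_headers() -> dict:
    """Return response headers carrying the restrictive policy (hypothetical helper)."""
    directives = [
        "default-src 'none'",      # block all active content from the upload domain
        "frame-ancestors 'none'",  # forbid embedding the content in other pages
    ]
    return {"Content-Security-Policy": "; ".join(directives)}
```

A server would merge this dictionary into the headers of every file it serves from the upload domain.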
More information on Content-Security-Policy can be found on infosec. So let's assume for the time being that you're working with some reasonable non-IIS server. How do you really deal with file upload? It turns out to be easy. First, you design your form so that it will actually do an upload.
In short, do this: If that's what you actually want, this is pretty useful. However, the RFC leaves behavior in this situation undefined, so you shouldn't rely on any particular behavior. I haven't looked to see what IE does in this situation. Undoubtedly something different. So this much information I already knew going into my horrible project, or at least knew of it. That's why I assumed that the server end was just as simple. And as I mentioned, in Perl it isn't much more difficult than retrieving normal posted data already is.
Oh, Microsoft has a solution of sorts, called the something-or-other manager, and IIS 5. OK, so when this post gets to the server, what does it look like?
In particular, use of external profiling information to determine the exact mapping is not permitted. Note: This use of the term "character set" is more commonly referred to as a "character encoding." HTTP character sets are identified by case-insensitive tokens. Implementors should be aware of IETF character set requirements [38] [41].
See section 3. Content codings are primarily used to allow a document to be compressed or otherwise usefully transformed without losing the identity of its underlying media type and without loss of information. Frequently, the entity is stored in coded form, transmitted directly, and only decoded by the recipient. Although the value describes the content-coding, what is more important is that it indicates what decoding mechanism will be required to remove the encoding.
Initially, the registry contains the following tokens: gzip, an encoding format produced by the file compression program "gzip" (GNU zip) as described in RFC 1952 [25]. Use of program names for the identification of encoding formats is not desirable and is discouraged for future encodings. Their use here is representative of historical practice, not good design.
New content-coding value tokens SHOULD be registered; to allow interoperability between clients and servers, specifications of the content coding algorithms needed to implement a new value SHOULD be publicly available and adequate for independent implementation, and conform to the purpose of content coding defined in this section.
This differs from a content coding in that the transfer-coding is a property of the message, not of the original entity. Whenever a transfer-coding is applied to a message-body, the set of transfer-codings MUST include "chunked", unless the message is terminated by closing the connection.
When the "chunked" transfer-coding is used, it MUST be the last transfer-coding applied to the message-body. These rules allow the recipient to determine the transfer-length of the message section 4. Transfer-codings are analogous to the Content-Transfer-Encoding values of MIME [7], which were designed to enable safe transport of binary data over a 7-bit transport service. However, safe transport has a different focus for an 8bit-clean transfer protocol.
In HTTP, the only unsafe characteristic of message-bodies is the difficulty in determining the exact body length section 7. Initially, the registry contains the following tokens: "chunked" section 3.
A server which receives an entity-body with a transfer-coding it does not understand SHOULD return 501 (Unimplemented), and close the connection.
This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.
The chunked encoding is ended by any chunk whose size is zero, followed by the trailer, which is terminated by an empty line. The trailer allows the sender to include additional HTTP header fields at the end of the message. The Trailer header field can be used to indicate which header fields are included in a trailer see section In other words, the origin server is willing to accept the possibility that the trailer fields might be silently discarded along the path to the client.
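The decoding rules described above can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a reference implementation: the function name `decode_chunked` is hypothetical, the whole message-body is assumed to be in memory already, and chunk-extensions are skipped rather than interpreted.

```python
def decode_chunked(data: bytes):
    """Decode a chunked message-body; return (body, raw trailer lines)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The chunk-size is hexadecimal; anything after ';' is a chunk-extension.
        size = int(data[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk ends the encoding
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    trailers = []
    while True:
        line_end = data.index(b"\r\n", pos)
        if line_end == pos:
            break  # an empty line terminates the trailer
        trailers.append(data[pos:line_end])
        pos = line_end + 2
    return bytes(body), trailers
```

Returning the trailer lines separately lets the caller decide whether to merge them into the header fields or discard them.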
It avoids a situation where compliance with the protocol would have necessitated a possibly infinite buffer on the proxy. An example process for decoding a Chunked-Body is presented in the appendix.
The type, subtype, and parameter attribute names are case-insensitive.
Parameter values might or might not be case-sensitive, depending on the semantics of the parameter name.
The presence or absence of a parameter might be significant to the processing of a media-type, depending on its definition within the media type registry. The media type registration process is outlined in RFC [ 17 ]. Use of non-registered media types is discouraged. An entity-body transferred via HTTP messages MUST be represented in the appropriate canonical form prior to its transmission except for "text" types, as defined in the next paragraph.
When in canonical form, media subtypes of the "text" type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body.
In addition, if the text is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. If an entity-body is encoded with a content-coding, the underlying data MUST be in a form defined above prior to being encoded. The "charset" parameter is used with some media types to define the character set section 3.
When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP.
All multipart types share a common syntax, as defined in section 5. These restrictions exist in order to preserve the self-delimiting nature of a multipart message-body, wherein the "end" of the message-body is indicated by the ending multipart boundary.
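To make the role of the boundary concrete, here is a minimal Python sketch that builds a single-file multipart/form-data body, the format a browser produces for an upload form. The function name `encode_multipart` and its parameters are hypothetical, and the sketch assumes ASCII field and file names without quotes; a production encoder would need to escape them.

```python
import uuid
from typing import Optional, Tuple

def encode_multipart(field: str, filename: str, content: bytes,
                     content_type: str = "application/octet-stream",
                     boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Return (Content-Type header value, body) for one uploaded file."""
    boundary = boundary or uuid.uuid4().hex  # random boundary unless one is given
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("ascii")
    # The ending boundary (with trailing "--") marks the end of the message-body.
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + content + tail
```

The boundary must not occur inside the file content; a random value makes a collision unlikely but does not rule it out.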
In general, HTTP treats a multipart message-body no differently than any other media type: strictly as payload. Most fields using product tokens also allow sub-products which form a significant part of the application to be listed, separated by white space.
By convention, the products are listed in order of their significance for identifying the application. A weight is normalized to a real number in the range 0 through 1, where 0 is the minimum and 1 the maximum value. Computer languages are explicitly excluded. The name space of language tags is administered by the IANA.
Example tags include: en, en-US, en-cockney, i-cherokee, x-pig-latin. The last three tags above are not registered tags; all but the last are examples of tags which could be registered in future.
The definition of how entity tags are used and compared as cache validators is given elsewhere in the specification. An entity tag consists of an opaque quoted string, possibly prefixed by a weakness indicator.
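A short Python sketch of the entity tag syntax may help: it splits a tag into its weakness indicator and opaque string, then shows the two comparison modes. The function names are hypothetical, and the sketch assumes the common `W/` spelling of the weakness indicator.

```python
def parse_etag(value: str):
    """Split an entity tag into (is_weak, opaque_tag)."""
    value = value.strip()
    weak = value.startswith("W/")  # weakness indicator prefix
    if weak:
        value = value[2:]
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError("entity tag must be a quoted string")
    return weak, value[1:-1]

def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b

def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque strings match, weakness is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```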
A weak entity tag can only be used for weak comparison. An entity tag MUST be unique across all versions of all entities associated with a particular resource. The use of the same entity tag value in conjunction with entities obtained by requests on different URIs does not imply the equivalence of those entities. An entity can be broken down into subranges according to various structural units. Both types of message consist of a start-line, zero or more header fields also known as "headers" , an empty line i.
Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT. Applications ought to follow "common form", where one is known or indicated, when generating HTTP constructs, since there might exist some implementations that fail to accept anything beyond the common forms. The order in which header fields with differing field names are received is not significant.
However, it is "good practice" to send general-header fields first, followed by request-header or response-header fields, and ending with the entity-header fields. Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.
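The combining rule can be sketched in a few lines of Python. The function name `combine_fields` is hypothetical, and the sketch assumes every repeated field is one that is defined as a comma-separated list; it does not check that.

```python
def combine_fields(headers):
    """Merge repeated header fields into one value each, preserving order."""
    combined = {}
    for name, value in headers:
        key = name.lower()  # field names are case-insensitive
        if key in combined:
            # Append each subsequent field-value to the first, separated by a comma.
            combined[key] += ", " + value
        else:
            combined[key] = value
    return combined
```

For example, `combine_fields([("Accept", "text/html"), ("accept", "text/plain")])` yields a single `accept` entry whose value lists both media types in the order received.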
The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.
The message-body differs from the entity-body only when a transfer-coding has been applied, as indicated by the Transfer-Encoding header field. Transfer-Encoding is a property of the message, not of the entity.
However, section 3. The rules for when a message-body is allowed in a message differ for requests and responses.
The presence of a message-body in a request is signaled by the inclusion of a Content-Length or Transfer-Encoding header field in the request's message-headers.
For response messages, whether or not a message-body is included with a message is dependent on both the request method and the response status code section 6. All other responses do include a message-body, although it MAY be of zero length. When a message-body is included with a message, the transfer-length of that body is determined by one of the following (in order of precedence):
1. Any response message which "MUST NOT" include a message-body (such as the 1xx, 204, and 304 responses and any response to a HEAD request) is always terminated by the first empty line after the header fields, regardless of the entity-header fields present in the message.
If a Transfer-Encoding header field section If a Content-Length header field section This media type MUST NOT be used unless the sender knows that the recipient can parse it; the presence in a request of a Range header with multiple byte-range specifiers from a 1. A range header might be forwarded by a 1. By the server closing the connection. Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
If a request contains a message-body and a Content-Length is not given, the server SHOULD respond with 400 (bad request) if it cannot determine the length of the message, or with 411 (length required) if it wishes to insist on receiving a valid Content-Length. These header fields apply only to the message being transmitted.
However, new or experimental header fields may be given the semantics of general header fields if all parties in the communication recognize them to be general-header fields. Unrecognized header fields are treated as entity-header fields. The elements are separated by SP characters. The method is case-sensitive. The return code of the response always notifies the client whether a method is currently allowed on a resource, since the set of allowed methods can change dynamically.
An origin server SHOULD return the status code 405 (Method Not Allowed) if the method is known by the origin server but not allowed for the requested resource, and 501 (Not Implemented) if the method is unrecognized or not implemented by the origin server.
The proxy is requested to forward the request or service it from a valid cache, and return the response.
Note that the proxy MAY forward the request on to another proxy or directly to the server specified by the absoluteURI. In order to avoid request loops, a proxy MUST be able to recognize all of its server names, including any aliases, local variations, and the numeric IP address. The most common form of Request-URI is that used to identify a resource on an origin server or gateway.
For example, a client wishing to retrieve the resource above directly from the origin server would create a TCP connection to port 80 of the host "www. The Request-URI is transmitted in the format specified in section 3. Note: The "no rewrite" rule prevents the proxy from changing the meaning of the request when the origin server is improperly using a non-reserved URI character for a reserved purpose. But see section If the host as determined by rule 1 or 2 is not a valid host on the server, the response MUST be a Bad Request error message.
These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation. However, new or experimental header fields MAY be given the semantics of request-header fields if all parties in the communication recognize them to be request-header fields.
These codes are fully defined in a later section. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user.
The client is not required to examine or display the Reason-Phrase. The last two digits do not have any categorization role. The reason phrases listed here are only recommendations -- they MAY be replaced by local equivalents without affecting the protocol. HTTP applications are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, applications MUST understand the class of any status code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 status code of that class, with the exception that an unrecognized response MUST NOT be cached.
For example, if an unrecognized status code of 431 is received by the client, it can safely assume that there was something wrong with its request and treat the response as if it had received a 400 status code. In such cases, user agents SHOULD present to the user the entity returned with the response, since that entity is likely to include human-readable information which will explain the unusual status.
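The class fallback rule is easy to express in code. In this Python sketch the function name `effective_status` and the `known` set are hypothetical; a real client would populate the set with the status codes it actually implements.

```python
def effective_status(code: int, known: set) -> int:
    """Map an unrecognized status code to the x00 code of its class."""
    if code in known:
        return code
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return (code // 100) * 100  # the first digit selects the class
```

A caller should also remember that a response handled through this fallback must not be cached.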
These header fields give information about the server and about further access to the resource identified by the Request-URI. However, new or experimental header fields MAY be given the semantics of response-header fields if all parties in the communication recognize them to be response-header fields.
An entity consists of entity-header fields and an entity-body, although some responses will only include the entity-headers. In this section, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity. The entity-body is obtained from the message-body by decoding any Transfer-Encoding that might have been applied to ensure safe and proper transfer of the message.
Content-Encoding may be used to indicate any additional content codings applied to the data, usually for the purpose of data compression, that are a property of the requested resource. There is no default encoding.
Section 4. The use of inline images and other associated data often requires a client to make multiple requests of the same server in a short amount of time. Analysis of these performance problems and results from a prototype implementation are available [26] [30]. Persistent HTTP connections have a number of advantages: - By opening and closing fewer TCP connections, CPU time is saved in routers and hosts (clients, servers, proxies, gateways, tunnels, or caches), and memory used for TCP protocol control blocks can be saved in hosts.
Pipelining allows a client to make multiple requests without waiting for each response, allowing a single TCP connection to be used much more efficiently, with much lower elapsed time.
Clients using future versions of HTTP might optimistically try a new feature, but if communicating with an older server, retry with old semantics after an error is reported. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field section If the server chooses to close the connection immediately after sending the response, it SHOULD send a Connection header including the connection-token close.
In case the client does not want to maintain a connection for more than that request, it SHOULD send a Connection header including the connection-token close.
If either the client or the server sends the close token in the Connection header, that request becomes the last one for the connection. In order to remain persistent, all messages on the connection MUST have a self-defined message length i.
A server MUST send its responses to those requests in the same order that the requests were received. Clients which assume persistent connections and pipeline immediately after connection establishment SHOULD be prepared to retry their connection if the first pipelined attempt fails. Clients MUST also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses. Otherwise, a premature termination of the transport connection could lead to indeterminate results.
A client wishing to send a non-idempotent request SHOULD wait to send that request until it has received the response status for the previous request. The proxy server MUST signal persistent connections separately with its clients and the origin servers or other proxy servers that it connects to.
Each persistent connection applies to only one transport link. Proxy servers might make this a higher value since it is likely that the client will be making more connections through the same server.
The use of persistent connections places no requirements on the length or existence of this time-out for either the client or the server. Clients and servers SHOULD both constantly watch for the other side of the transport close, and respond to it as appropriate. If a client or server does not detect the other side's close promptly it could cause unnecessary resource drain on the network. A client, server, or proxy MAY close the transport connection at any time.
For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.
This means that clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent see section 9.
Confirmation by user-agent software with semantic understanding of the application MAY substitute for user confirmation. Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. These guidelines are intended to improve HTTP response times and avoid congestion. The latter technique can exacerbate network congestion. If the body is being sent using a "chunked" encoding section 3.
In some cases, it might either be inappropriate or highly inefficient for the client to send the body if the server will reject the message without looking at the body. Because of the presence of older implementations, the protocol allows ambiguous situations in which a client may send "Expect: 100-continue" without receiving either a 417 (Expectation Failed) status or a 100 (Continue) status.
Therefore, when a client sends this header field to an origin server (possibly via a proxy) from which it has never seen a 100 (Continue) status, the client SHOULD NOT wait for an indefinite period before sending the request body. Otherwise, the client might not reliably receive the response message.
However, this requirement is not to be construed as preventing a server from defending itself against denial-of-service attacks, or from badly broken client implementations. This requirement overrides the general rule for forwarding of 1xx responses. If the client does retry this request, it MAY use the following "binary exponential backoff" algorithm to be assured of obtaining a reliable response:
1. Initiate a new connection to the server
2. Transmit the request-headers
3. Initialize a variable R to the estimated round-trip time to the server
4. Compute T = R * (2**N), where N is the number of previous retries of this request
5. Wait either for an error response from the server, or for T seconds (whichever comes first)
6. If no error response is received, after T seconds transmit the body of the request
7. If client sees that the connection is closed prematurely, repeat from step 1 until the request is accepted, an error response is received, or the user becomes impatient and terminates the retry process.
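The wait time used in this algorithm follows the rule given in RFC 2616 section 8.2.4: T = R * 2**N, where R is the estimated round-trip time and N is the number of previous retries of the request. The Python sketch below only computes the schedule of waits; the function name `backoff_waits` is hypothetical, and connection handling is left out on purpose.

```python
def backoff_waits(rtt_seconds: float, attempts: int):
    """Return the wait T before sending the body on each attempt.

    Attempt number n (starting at 0) has had n previous retries,
    so its wait is rtt_seconds * 2**n.
    """
    return [rtt_seconds * (2 ** n) for n in range(attempts)]
```

With a round-trip estimate of 0.5 seconds, four attempts would wait 0.5, 1, 2, and 4 seconds respectively, so the wait doubles with every retry.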
Although this set can be expanded, additional methods cannot be assumed to share the same semantics for separately extended clients and servers. The Host request-header field section These methods ought to be considered "safe".
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
A sequence is idempotent if a single execution of the entire sequence always yields a result that is not changed by a reexecution of all, or part, of that sequence. For example, a sequence is non-idempotent if its result depends on a value that is later modified in the same sequence. A sequence that never has side effects is idempotent, by definition provided that no concurrent operations are being executed on the same set of resources.
Responses to this method are not cacheable. A server that does not support such an extension MAY discard the request body. A response SHOULD include any header fields that indicate optional features implemented by the server and applicable to that resource (e.g., Allow).
Content negotiation MAY be used to select the appropriate response format. If the Max-Forwards field-value is an integer greater than zero, the proxy MUST decrement the field-value when it forwards the request. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.
A conditional GET method requests that the entity be transferred only under the circumstances described by the conditional header field(s). The conditional GET method is intended to reduce unnecessary network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring data already held by the client.
A partial GET requests that only part of the entity be transferred, as described in section The partial GET method is intended to reduce unnecessary network usage by allowing partially-retrieved entities to be completed without transferring data already held by the client. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself.
This method is often used for testing hypertext links for validity, accessibility, and recent modification. POST is designed to allow a uniform method to cover the following functions: - Annotation of existing resources; - Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles; - Providing a block of data, such as the result of submitting a form, to a data-handling process; - Extending a database through an append operation.
The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database. In this case, either OK or No Content is the appropriate response status, depending on whether or not the response includes an entity that describes the result. Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields.
However, the See Other response can be used to direct the user agent to retrieve a cacheable resource. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI.
If a new resource is created, the origin server MUST inform the user agent via the Created response. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations.
If the server desires that the request be applied to a different URI, it MUST send a 301 (Moved Permanently) response. For example, an article might have a URI for identifying "the current version" which is separate from the URI identifying each particular version.
This method MAY be overridden by human intervention or other means on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. A successful response SHOULD be OK if the response includes an entity describing the status, Accepted if the action has not yet been enacted, or No Content if the action has been enacted but the response does not include an entity.
The final recipient is either the origin server or the first proxy or gateway to receive a Max-Forwards value of zero (0) in the request. TRACE allows the client to see what is being received at the other end of the request chain and use that data for testing or diagnostic information. The value of the Via header field is of particular interest, since it acts as a trace of the request chain. Use of the Max-Forwards header field allows the client to limit the length of the request chain, which is useful for testing a chain of proxies forwarding messages in an infinite loop.
SSL tunneling [ 44 ]. There are no required headers for this class of status code. A client MUST be prepared to accept one or more 1xx status responses prior to a regular response, even if the client does not expect a Continue status message. Unexpected 1xx status responses MAY be ignored by a user agent. Proxies MUST forward 1xx responses, unless the connection between the proxy and its client has been closed, or unless the proxy itself requested the generation of the 1xx response.
For example, if a proxy adds an "Expect: 100-continue" field when it forwards a request, then it need not forward the corresponding 100 (Continue) response(s).
This interim response is used to inform the client that the initial part of the request has been received and has not yet been rejected by the server. The client SHOULD continue by sending the remainder of the request or, if the request has already been completed, ignore this response. The server MUST send a final response after the request has been completed. See section 8.
The server will switch protocols to those defined by the response's Upgrade header field immediately after the empty line which terminates the response.
For example, switching to a newer version of HTTP is advantageous over older versions, and switching to a real-time, synchronous protocol might be advantageous when delivering resources that use such features.
The information returned with the response is dependent on the method used in the request, for example:
GET: an entity corresponding to the requested resource is sent in the response;
HEAD: the entity-header fields corresponding to the requested resource are sent in the response without any message-body;
POST: an entity describing or containing the result of the action.
The newly created resource can be referenced by the URI(s) returned in the entity of the response, with the most specific URI for the resource given by a Location header field.
The response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field.
The origin server MUST create the resource before returning the status code. A response MAY contain an ETag response header field indicating the current value of the entity tag for the requested variant just created, see section The request might or might not eventually be acted upon, as it might be disallowed when processing actually takes place.
There is no facility for re-sending a status code from an asynchronous operation such as this. The response is intentionally non-committal. Its purpose is to allow a server to accept a request for some other process perhaps a batch-oriented process that is only run once per day without requiring that the user agent's connection to the server persist until the process is completed. The entity returned with this response SHOULD include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.
The set presented MAY be a subset or superset of the original version. For example, including local annotation information about the resource might result in a superset of the metainformation known by the origin server. Use of this response code is not required and is only appropriate when the response would otherwise be OK. This response is primarily intended to allow input for actions to take place without causing a change to the user agent's active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent's active view.
This response is primarily intended to allow input for actions to take place via user input, followed by a clearing of the form in which the input is given so that the user can easily initiate another input action.
If the response is the result of an If-Range request that used a weak validator, the response MUST NOT include other entity-headers; this prevents inconsistencies between cached entity-bodies and updated headers.
Otherwise, the response MUST include all of the entity-headers that would have been returned with a 200 (OK) response to the same request.
Note: previous versions of this specification recommended a maximum of five redirections. Content developers should be aware that there might be clients that implement such a fixed limitation. Unless it was a HEAD request, the response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate.
The entity format is specified by the media type given in the Content-Type header field. Depending upon the format and the capabilities of the user agent, selection of the most appropriate choice MAY be performed automatically. However, this specification does not define any standard for such automatic selection. This response is cacheable unless indicated otherwise. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible.
If the status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.
This response is only cacheable if indicated by a Cache-Control or Expires header field. However, most existing user agent implementations treat 302 as if it were a 303 response, performing a GET on the Location field-value regardless of the original request method.
The status codes 303 and 307 have been added for servers that wish to make unambiguously clear which kind of reaction is expected of the client. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource.