RFC 2616 HTTP/1.1 Notational Conventions and Generic Grammar

   name = definition
      The name of a rule is simply the name itself (without any
      enclosing "<" and ">") and is separated from its definition by the
      equal "=" character. White space is only significant in that
      indentation of continuation lines is used to indicate a rule
      definition that spans more than one line. Certain basic rules are
      in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle
      brackets are used within definitions whenever their presence will
      facilitate discerning the use of rule names.

   "literal"
      Quotation marks surround literal text. Unless stated otherwise,
      the text is case-insensitive.

   rule1 | rule2
      Elements separated by a bar ("|") are alternatives, e.g., "yes |
      no" will accept yes or no.

   (rule1 rule2)
      Elements enclosed in parentheses are treated as a single element.
      Thus, "(elem (foo | bar) elem)" allows the token sequences "elem
      foo elem" and "elem bar elem".

   *rule
      The character "*" preceding an element indicates repetition. The
      full form is "<n>*<m>element" indicating at least <n> and at most
      <m> occurrences of element. Default values are 0 and infinity so
      that "*(element)" allows any number, including zero; "1*element"
      requires at least one; and "1*2element" allows one or two.

   [rule]
      Square brackets enclose optional elements; "[foo bar]" is
      equivalent to "*1(foo bar)".
   
   N rule
      Specific repetition: "<n>(element)" is equivalent to
      "<n>*<n>(element)"; that is, exactly <n> occurrences of (element).
      Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three
      alphabetic characters.

   #rule
      A construct "#" is defined, similar to "*", for defining lists of
      elements. The full form is "<n>#<m>element" indicating at least
      <n> and at most <m> elements, each separated by one or more commas
      (",") and OPTIONAL linear white space (LWS). This makes the usual
      form of lists very easy; a rule such as
         ( *LWS element *( *LWS "," *LWS element ))
      can be shown as
         1#element
      Wherever this construct is used, null elements are allowed, but do
      not contribute to the count of elements present. That is,
      "(element), , (element) " is permitted, but counts as only two
      elements. Therefore, where at least one element is required, at
      least one non-null element MUST be present. Default values are 0
      and infinity so that "#element" allows any number, including zero;
      "1#element" requires at least one; and "1#2element" allows one or
      two.
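The "#rule" list convention above can be sketched in Python. This is a minimal illustration, not a full header parser: the function names split_list and list_matches are hypothetical, and the sketch assumes that elements never contain a comma inside a quoted-string.

```python
from typing import List, Optional


def split_list(field_value: str) -> List[str]:
    """Split a comma-separated list, dropping null elements."""
    elements = []
    for part in field_value.split(","):
        part = part.strip(" \t\r\n")  # remove the optional LWS around an element
        if part:                      # null elements do not count
            elements.append(part)
    return elements


def list_matches(field_value: str, minimum: int = 0,
                 maximum: Optional[int] = None) -> bool:
    """Check "<minimum>#<maximum>element"; defaults are 0 and infinity."""
    count = len(split_list(field_value))
    return count >= minimum and (maximum is None or count <= maximum)
```

With these definitions, list_matches(" , ,", minimum=1) is false, because a list of only null elements does not satisfy "1#element".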

   ; comment
      A semi-colon, set off some distance to the right of rule text,
      starts a comment that continues to the end of line. This is a
      simple way of including useful notes in parallel with the
      specifications.

   implied *LWS
      The grammar described by this specification is word-based. Except
      where noted otherwise, linear white space (LWS) can be included
      between any two adjacent words (token or quoted-string), and
      between adjacent words and separators, without changing the
      interpretation of a field. At least one delimiter (LWS and/or
      separators) MUST exist between any two tokens (for the definition
      of "token" below), since they would otherwise be interpreted as a
      single token.
   
   OCTET          = <any 8-bit sequence of data>
   CHAR           = <any US-ASCII character (octets 0 - 127)>
   UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
   LOALPHA        = <any US-ASCII lowercase letter "a".."z">
   ALPHA          = UPALPHA | LOALPHA
   DIGIT          = <any US-ASCII digit "0".."9">
   CTL            = <any US-ASCII control character
                    (octets 0 - 31) and DEL (127)>
   CR             = <US-ASCII CR, carriage return (13)>
   LF             = <US-ASCII LF, linefeed (10)>
   SP             = <US-ASCII SP, space (32)>
   HT             = <US-ASCII HT, horizontal-tab (9)>
   <">            = <US-ASCII double-quote mark (34)>
   
   CRLF           = CR LF
   
   LWS            = [CRLF] 1*( SP | HT )
   
   TEXT           = <any OCTET except CTLs, but including LWS>
   
   HEX            = "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
   
   token          = 1*<any CHAR except CTLs or separators>
   separators     = "(" | ")" | "<" | ">" | "@"
                          | "," | ";" | ":" | "\" | <">
                          | "/" | "[" | "]" | "?" | "="
                          | "{" | "}" | SP | HT
   
   comment        = "(" *( ctext | quoted-pair | comment ) ")"
   ctext          = <any TEXT excluding "(" and ")">
   
   quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
   qdtext         = <any TEXT except <">>
   
   quoted-pair    = "\" CHAR
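As an illustration of the token, separators, quoted-string and quoted-pair rules above, here is a rough Python sketch. The names is_token and unquote are hypothetical, and the sketch works on already-decoded str values rather than raw octets.

```python
# The separators rule, written out as a set of characters (SP and HT included).
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_token(s: str) -> bool:
    """True if s is one or more CHARs that are neither CTLs nor separators."""
    if not s:
        return False
    for ch in s:
        code = ord(ch)
        if code > 127 or code < 32 or code == 127 or ch in SEPARATORS:
            return False
    return True


def unquote(s: str) -> str:
    """Strip the enclosing <"> of a quoted-string and resolve quoted-pairs."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i, end = 1, len(s) - 1
    while i < end:
        ch = s[i]
        if ch == "\\":            # quoted-pair = "\" CHAR
            i += 1
            if i >= end:
                raise ValueError("dangling backslash")
            out.append(s[i])
        elif ch == '"':           # qdtext excludes an unescaped <">
            raise ValueError("unescaped quote inside quoted-string")
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```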

WSP basic header syntax:
         Text-string = [Quote] *TEXT End-of-string
        ; If the first character in the TEXT is in the range of 128-255, a Quote character must precede it.
        ; Otherwise the Quote character must be omitted. The Quote is not part of the contents.
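The Text-string rule can be sketched as follows. The function names are hypothetical, and the decoder is deliberately lenient: it drops a leading Quote octet without checking that the next octet is in the range 128-255.

```python
from typing import Tuple

QUOTE = 0x7F          # Quote = <Octet 127>
END_OF_STRING = 0x00  # End-of-string = <Octet 0>


def encode_text_string(text: bytes) -> bytes:
    """Add a Quote only when the first octet is 128-255, then terminate with 0."""
    if END_OF_STRING in text:
        raise ValueError("text must not contain a zero octet")
    prefix = bytes([QUOTE]) if text and text[0] >= 128 else b""
    return prefix + text + bytes([END_OF_STRING])


def decode_text_string(data: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return (contents, offset of the octet after End-of-string)."""
    end = data.index(END_OF_STRING, offset)  # raises ValueError if unterminated
    start = offset
    if start < end and data[start] == QUOTE:
        start += 1                           # the Quote is not part of the contents
    return data[start:end], end + 1
```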
        
        Token-text = Token End-of-string
        
        Quoted-string = <Octet 34> *TEXT End-of-string
        ; The TEXT encodes an RFC2616 Quoted-string with the enclosing quotation-marks <"> removed
        
        Extension-media = *TEXT End-of-string
        ; This encoding is used for media values, which have no well-known binary encoding
        
        Short-integer = OCTET
        ; Integers in range 0-127 shall be encoded as a one octet value with the most significant bit set
        ; to one (1xxx xxxx) and with the value in the remaining least significant bits.
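A minimal sketch of the Short-integer encoding described above (the function names are hypothetical):

```python
def encode_short_integer(value: int) -> bytes:
    """Encode 0-127 as one octet with the most significant bit set."""
    if not 0 <= value <= 127:
        raise ValueError("Short-integer holds values 0-127 only")
    return bytes([0x80 | value])


def decode_short_integer(octet: int) -> int:
    """Recover the value from the seven least significant bits."""
    if not octet & 0x80:
        raise ValueError("most significant bit is not set")
    return octet & 0x7F
```

For example, the value 5 becomes the single octet 0x85.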

        Long-integer = Short-length Multi-octet-integer
        ; The Short-length indicates the length of the Multi-octet-integer
        
        Multi-octet-integer = 1*30 OCTET
        ; The content octets shall be an unsigned integer value
        ; with the most significant octet encoded first (big-endian representation).
        ; The minimum number of octets must be used to encode the value.
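The Long-integer rule (a Short-length followed by a big-endian Multi-octet-integer of minimal size) might be implemented along these lines. The function names are hypothetical.

```python
from typing import Tuple


def encode_long_integer(value: int) -> bytes:
    """Short-length octet followed by the minimal big-endian representation."""
    if value < 0:
        raise ValueError("value must be unsigned")
    length = max(1, (value.bit_length() + 7) // 8)  # at least one content octet
    if length > 30:
        raise ValueError("value needs more than 30 octets")
    return bytes([length]) + value.to_bytes(length, "big")


def decode_long_integer(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the octet after the integer)."""
    length = data[offset]
    if not 1 <= length <= 30:
        raise ValueError("not a valid Short-length for a Long-integer")
    body = data[offset + 1:offset + 1 + length]
    if len(body) != length:
        raise ValueError("truncated Multi-octet-integer")
    return int.from_bytes(body, "big"), offset + 1 + length
```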
        
        Uintvar-integer = 1*5 OCTET
        ; The encoding is the same as the one defined for uintvar in Section 8.1.2.
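Section 8.1.2 is not reproduced here, so the following sketch rests on an assumption about uintvar: each octet carries seven payload bits, the most significant bit marks that another octet follows, and the most significant group comes first. That reading agrees with the 0.333 example given under Q-value (433 encodes as 0x83 0x31). The function names are hypothetical.

```python
from typing import Tuple


def encode_uintvar(value: int) -> bytes:
    """Encode an unsigned 32-bit value in 1 to 5 octets (assumed uintvar layout)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in five uintvar octets")
    groups = [value & 0x7F]                   # last octet: continuation bit clear
    value >>= 7
    while value:
        groups.append(0x80 | (value & 0x7F))  # earlier octets: continuation bit set
        value >>= 7
    return bytes(reversed(groups))


def decode_uintvar(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the octet after the uintvar)."""
    value = 0
    for count in range(5):
        octet = data[offset + count]          # IndexError if the input is truncated
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            return value, offset + count + 1
    raise ValueError("uintvar longer than five octets")
```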

        Constrained-encoding = Extension-media | Short-integer
        ; This encoding is used for token values, which have no well-known binary encoding, or when
        ; the assigned number of the well-known encoding is small enough to fit into Short-integer.
        
        Quote = <Octet 127>
        
        End-of-string = <Octet 0>
        
        Value-length = Short-length | (Length-quote Length)
        ; Value length is used to indicate the length of the value to follow
        Short-length = <Any octet 0-30>
        Length-quote = <Octet 31>
        Length = Uintvar-integer
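A sketch of the Value-length rule: lengths 0-30 are sent as a single Short-length octet, and anything larger is sent as Length-quote followed by a Uintvar-integer. The helper _uintvar is a hypothetical stand-in that assumes the usual uintvar layout (seven payload bits per octet, continuation flag in the top bit, most significant group first).

```python
LENGTH_QUOTE = 31  # Length-quote = <Octet 31>


def _uintvar(value: int) -> bytes:
    """Assumed uintvar layout; see the lead-in above."""
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(0x80 | (value & 0x7F))
        value >>= 7
    return bytes(reversed(groups))


def encode_value_length(length: int) -> bytes:
    """Short-length for 0-30, otherwise Length-quote plus a uintvar Length."""
    if not 0 <= length <= 0xFFFFFFFF:
        raise ValueError("length out of range")
    if length <= 30:
        return bytes([length])
    return bytes([LENGTH_QUOTE]) + _uintvar(length)
```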
        
        No-value = <Octet 0>
        ; Used to indicate that the parameter actually has no value,
        ; e.g., as the parameter "bar" in ";foo=xxx; bar; baz=xyzzy".

        Text-value = No-value | Token-text | Quoted-string

        Integer-value = Short-integer | Long-integer
        
        Date-value = Long-integer
        ; The encoding of dates shall be done in number of seconds from
        ; 1970-01-01, 00:00:00 GMT.

        Delta-seconds-value = Integer-value

        Q-value = 1*2 OCTET
        ; The encoding is the same as in Uintvar-integer, but with restricted size. When quality factor 0
        ; and quality factors with one or two decimal digits are encoded, they shall be multiplied by 100
        ; and incremented by one, so that they encode as a one-octet value in range 1-100,
        ; i.e., 0.1 is encoded as 11 (0x0B) and 0.99 is encoded as 100 (0x64). Three-decimal quality
        ; factors shall be multiplied by 1000 and incremented by 100, and the result shall be encoded
        ; as a one-octet or two-octet uintvar, e.g., 0.333 shall be encoded as 0x83 0x31.
        ; Quality factor 1 is the default value and shall never be sent.
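The Q-value arithmetic can be checked with a short sketch. It takes the quality factor as a decimal string and uses Decimal to avoid binary floating-point rounding. The function names are hypothetical, and the two-octet case assumes the usual uintvar layout (seven payload bits per octet, continuation flag in the top bit).

```python
from decimal import Decimal


def encode_q_value(q: str) -> bytes:
    """Encode a quality factor such as "0.5" or "0.333" in one or two octets."""
    value = Decimal(q)
    if not 0 <= value < 1:
        raise ValueError("only 0 <= q < 1 is encoded; q=1 is the default, never sent")
    if value * 100 == (value * 100).to_integral_value():
        number = int(value * 100) + 1        # up to two decimals: 1-100
    elif value * 1000 == (value * 1000).to_integral_value():
        number = int(value * 1000) + 100     # three decimals: 101-1099
    else:
        raise ValueError("more than three decimal digits")
    if number < 0x80:
        return bytes([number])
    return bytes([0x80 | (number >> 7), number & 0x7F])


def decode_q_value(data: bytes) -> Decimal:
    """Inverse of encode_q_value for a one-octet or two-octet input."""
    number = 0
    for octet in data[:2]:
        number = (number << 7) | (octet & 0x7F)
        if not octet & 0x80:
            break
    if 1 <= number <= 100:
        return Decimal(number - 1) / 100
    if 101 <= number <= 1099:
        return Decimal(number - 100) / 1000
    raise ValueError("not a valid Q-value")
```

This reproduces the examples in the comment above: "0.1" gives 0x0B, "0.99" gives 0x64, and "0.333" gives 0x83 0x31.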

        Version-value = Short-integer | Text-string
        ; The three most significant bits of the Short-integer value are interpreted to encode a major
        ; version number in the range 1-7, and the four least significant bits contain a minor version
        ; number in the range 0-14.  If there is only a major version number, this is encoded by
        ; placing the value 15 in the four least significant bits. If the version to be encoded fits these
        ; constraints, a Short-integer must be used, otherwise a Text-string shall be used.
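A sketch of the Version-value rule follows. The function name is hypothetical, and the "major.minor" text used for the Text-string fallback is an assumption, since the excerpt does not specify the textual form.

```python
from typing import Optional


def encode_version_value(major: int, minor: Optional[int] = None) -> bytes:
    """Pack the version into a Short-integer when it fits, else use a Text-string."""
    if 1 <= major <= 7 and (minor is None or 0 <= minor <= 14):
        low = 15 if minor is None else minor      # 15 means "major version only"
        return bytes([0x80 | (major << 4) | low])  # 1mmm nnnn
    # Fallback: Text-string (assumed "major.minor" form) ending in End-of-string.
    # The text starts with an ASCII digit, so no Quote octet is needed.
    text = str(major) if minor is None else "{}.{}".format(major, minor)
    return text.encode("ascii") + b"\x00"
```

For example, version 1.2 packs into the single octet 0x92, while 8.1 does not fit and falls back to text.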
        
        Uri-value = Text-string
        ; The URI value should be encoded per [RFC2616], but the service user may use a different format.
        
        Parameter = Typed-parameter | Untyped-parameter

        Typed-parameter =  Well-known-parameter-token Typed-value
        ; the actual expected type of the value is implied by the well-known parameter

        Well-known-parameter-token = Integer-value
        ; the code values used for parameters are specified in the Assigned Numbers appendix

        Typed-value = Compact-value | Text-value
        ; In addition to the expected type, there may be no value.
        ; If the value cannot be encoded using the expected type, it shall be encoded as text.

        Compact-value = Integer-value | Date-value | Delta-seconds-value
                      | Q-value | Version-value | Uri-value

        Untyped-parameter = Token-text Untyped-value
        ; the type of the value is unknown, but it shall be encoded as an integer, if that is possible.

        Untyped-value = Integer-value | Text-value
