lobin

HTTP Protocol Analysis

 
Reference: Hypertext Transfer Protocol -- HTTP/1.1
https://www.iteye.com/blog/lobin-620196

 

 

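HTTP is a text protocol, so an HTTP message is a text message, but parsing one is not as simple as that suggests.

HTTP messages

An HTTP message is either a request or a response.

    HTTP-message = Request | Response ; HTTP/1.1 messages

General form of an HTTP message

An HTTP message has three parts: a start line, the HTTP headers, and the message body.

    generic-message = start-line
                      *(message-header CRLF)
                      CRLF
                      [ message-body ]
    start-line      = Request-Line | Status-Line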

HTTP headers

    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>

Message body

    message-body = entity-body
                 | <entity-body encoded as per Transfer-Encoding>

    entity-body = *OCTET

 

 

HTTP request

    Request = Request-Line              ; Section 5.1
              *(( general-header        ; Section 4.5
               | request-header         ; Section 5.3
               | entity-header ) CRLF)  ; Section 7.1
              CRLF
              [ message-body ]          ; Section 4.3

    Request-Line = Method SP Request-URI SP HTTP-Version CRLF

HTTP response

    Response = Status-Line              ; Section 6.1
               *(( general-header       ; Section 4.5
                | response-header       ; Section 6.2
                | entity-header ) CRLF) ; Section 7.1
               CRLF
               [ message-body ]         ; Section 7.2

    Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

 

 

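Parsing HTTP messages

Given the general form of an HTTP message, the usual way to parse one is sequentially: the start line first, then the headers one by one, then the message body. This looks natural and simple, but it is also the clumsiest approach, and it turns out to make parsing each individual header awkward.

Step 1: parse the start line, which is terminated by CRLF.

Step 2: parse the 1st header; every header is also terminated by CRLF.

Step 3: parse the 2nd header.

Step 4: parse the 3rd header.

... ...

Step n: parse the last header.

Step n+1: parse the CRLF that separates the headers from the message body. It marks the end of the header section and the start of the body.

Last step: parse the message body.

Each of these steps looks simple.

On closer inspection, though: in a request the start line is the request line, which is covered under "Parsing HTTP requests". For steps 2 through n, each header is indeed followed by a CRLF, but the header definition shows that, although a header is just a key-value pair, field-value has its own set of rules and may itself contain CRLF. A CRLF alone therefore cannot be taken as the end of a header. And if a single CRLF cannot delimit one header, the end of the whole header section is hard to determine this way too. Parsing a single header is covered under "Parsing HTTP headers".

The other case is a message with no headers, which is easy to parse: the start line (the request line of a request or the status line of a response) ends with CRLF, and the CRLF separating headers from body follows immediately, giving two consecutive CRLFs.

In fact, two consecutive CRLFs appear whether or not there are headers, because every header ends with CRLF and the separator CRLF comes right after the last one.

Finally there is the message body, which is just a sequence of octets. It looks simple too and needs no parsing for now.

The alternative is to first split the message into the start line, the whole header section, and the message body, rather than working through it sequentially as above.

Although the sequential approach is awkward, analysing it yields a useful fact: the header section as a whole is easy to extract. There is then no need to get stuck on parsing each header correctly before the rest of the parsing can continue.

Step 1: parse the start line, which is terminated by CRLF.

Step 2: extract the whole header section.

Last step: parse the message body.

Either way, whether with the sequential approach or with this one, every individual header eventually has to be parsed out.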
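The second approach can be sketched as follows. This is a minimal illustration, not a complete parser: the class MessageSplitter, its Parts holder and the method names are hypothetical, and the input is assumed to already contain the full header section.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MessageSplitter {
    /** Hypothetical holder for the three parts of a message. */
    static final class Parts {
        final String startLine;
        final String headerBlock;
        final byte[] body;

        Parts(String startLine, String headerBlock, byte[] body) {
            this.startLine = startLine;
            this.headerBlock = headerBlock;
            this.body = body;
        }
    }

    /** Returns the index of the first CRLF CRLF in data, or -1 if there is none. */
    static int indexOfBlankLine(byte[] data) {
        for (int i = 0; i + 3 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n'
                    && data[i + 2] == '\r' && data[i + 3] == '\n') {
                return i;
            }
        }
        return -1;
    }

    /** Splits a message into start line, unparsed header block, and body bytes. */
    static Parts split(byte[] data) {
        int end = indexOfBlankLine(data);
        if (end < 0) {
            throw new IllegalArgumentException("header section is not complete");
        }
        // ISO-8859-1 maps every byte to exactly one char, so no byte is lost.
        String head = new String(data, 0, end, StandardCharsets.ISO_8859_1);
        int eol = head.indexOf("\r\n");
        String startLine = eol < 0 ? head : head.substring(0, eol);
        // The header block keeps its inner CRLFs but not the final one.
        String headerBlock = eol < 0 ? "" : head.substring(eol + 2);
        byte[] body = Arrays.copyOfRange(data, end + 4, data.length);
        return new Parts(startLine, headerBlock, body);
    }

    public static void main(String[] args) {
        byte[] raw = "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        Parts p = split(raw);
        System.out.println(p.startLine);
        System.out.println(p.headerBlock);
        System.out.println(p.body.length);
    }
}
```

The header block is deliberately left as one string here; splitting it into individual headers is a separate step.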

 

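Parsing HTTP headers

    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>

A header has the form field-name ":" [ field-value ], that is, a key-value pair.

field-value is optional, so the simplest header is:

field-name:

The main difficulty in parsing a header is the field-value, which consists of any number of field-content or LWS elements:

    field-value = *( field-content | LWS )

Because the repetition may be empty, this rule also allows the bare "field-name:" form.

Two cases are worth considering first. (The examples below write "field-name=" followed by the elements of the field-value; on the wire the separator is ":".)

Case 1

    field-value = *(LWS)

    LWS = [CRLF] 1*( SP | HT )

The CRLF is optional.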

field-name=SP

field-name=HT

 

field-name=CRLF SP

field-name=CRLF HT

 

field-name=SP SP SP

field-name=HT HT HT

 

field-name=SP HT SP HT SP HT

 

field-name=CRLF SP SP SP

field-name=CRLF HT HT HT

 

field-name=CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP

field-name=CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT

 

field-name=CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT

 

Judging from these examples, every one of them has the same effect as

field-name:

but they make header parsing more troublesome.

 

Case 2

    field-value   = *(field-content)
    field-content = <the OCTETs making up the field-value
                    and consisting of either *TEXT or combinations
                    of token, separators, and quoted-string>

In case 2 the field-value consists either of TEXT or of a combination of tokens, separators and quoted-strings, so it splits into two sub-cases.

Case 2.1

    TEXT = <any OCTET except CTLs,
            but including LWS>

    OCTET = <any 8-bit sequence of data>

    CTL = <any US-ASCII control character
           (octets 0 - 31) and DEL (127)>

 

field-name=a

 

field-name=a SP

field-name=a HT

 

field-name=a CRLF SP

field-name=a CRLF HT

 

field-name=a SP SP SP

field-name=a HT HT HT

 

field-name=a SP HT SP HT SP HT

 

field-name=a CRLF SP SP SP

field-name=a CRLF HT HT HT

 

field-name=a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP

field-name=a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT

 

field-name=a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT

 

 

field-name=a SP a 

field-name=a HT a 

 

field-name=a CRLF SP a 

field-name=a CRLF HT a 

 

field-name=a SP SP SP a 

field-name=a HT HT HT a 

 

field-name=a SP HT SP HT SP HT a 

 

field-name=a CRLF SP SP SP a 

field-name=a CRLF HT HT HT a 

 

field-name=a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a 

field-name=a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a 

 

field-name=a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a 

 

 

 

field-name=a SP a SP

field-name=a HT a HT

 

field-name=a CRLF SP a CRLF SP 

field-name=a CRLF HT a CRLF HT 

 

field-name=a SP SP SP a SP SP SP 

field-name=a HT HT HT a HT HT HT 

 

field-name=a SP HT SP HT SP HT a SP HT SP HT SP HT 

 

field-name=a CRLF SP SP SP a CRLF SP SP SP 

field-name=a CRLF HT HT HT a CRLF HT HT HT 

 

field-name=a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP 

field-name=a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT 

 

field-name=a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT 

 

 

field-name=a SP a SP a

field-name=a HT a HT a

 

field-name=a CRLF SP a CRLF SP a 

field-name=a CRLF HT a CRLF HT a 

 

field-name=a SP SP SP a SP SP SP a 

field-name=a HT HT HT a HT HT HT a 

 

field-name=a SP HT SP HT SP HT a SP HT SP HT SP HT a 

 

field-name=a CRLF SP SP SP a CRLF SP SP SP a 

field-name=a CRLF HT HT HT a CRLF HT HT HT a 

 

field-name=a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a 

field-name=a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a 

 

field-name=a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a 

 

 

field-name=SP a 

field-name=HT a 

 

field-name=CRLF SP a 

field-name=CRLF HT a 

 

field-name=SP SP SP a 

field-name=HT HT HT a 

 

field-name=SP HT SP HT SP HT a 

 

field-name=CRLF SP SP SP a 

field-name=CRLF HT HT HT a 

 

field-name=CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a 

field-name=CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a 

 

field-name=CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a 

 

 

field-name=SP a SP  

field-name=HT a HT 

 

field-name=CRLF SP a CRLF SP 

field-name=CRLF HT a CRLF HT 

 

field-name=SP SP SP a SP SP SP 

field-name=HT HT HT a HT HT HT 

 

field-name=SP HT SP HT SP HT a SP HT SP HT SP HT 

 

field-name=CRLF SP SP SP a CRLF SP SP SP 

field-name=CRLF HT HT HT a CRLF HT HT HT 

 

field-name=CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP 

field-name=CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT 

 

field-name=CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT 

 

 

field-name=SP a SP a  

field-name=HT a HT a 

 

field-name=CRLF SP a CRLF SP a 

field-name=CRLF HT a CRLF HT a 

 

field-name=SP SP SP a SP SP SP a 

field-name=HT HT HT a HT HT HT a 

 

field-name=SP HT SP HT SP HT a SP HT SP HT SP HT a 

 

field-name=CRLF SP SP SP a CRLF SP SP SP a 

field-name=CRLF HT HT HT a CRLF HT HT HT a 

 

field-name=CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a CRLF SP SP SP CRLF SP SP SP CRLF SP SP SP a 

field-name=CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a CRLF HT HT HT CRLF HT HT HT CRLF HT HT HT a 

 

field-name=CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a CRLF SP HT SP HT CRLF SP HT SP HT CRLF SP HT SP HT a 

 

Case 2.2

    token = 1*<any CHAR except CTLs or separators>

    separators = "(" | ")" | "<" | ">" | "@"
               | "," | ";" | ":" | "\" | <">
               | "/" | "[" | "]" | "?" | "="
               | "{" | "}" | SP | HT

    quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
    qdtext        = <any TEXT except <">>
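The token rule can be checked character by character. The sketch below is only an illustration of the grammar quoted above; the class and method names are hypothetical, and CHAR is taken to be US-ASCII 0 to 127 as in the specification's basic rules.

```java
final class Tokens {
    // The separators from the grammar: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    /** A token character is a CHAR that is neither a CTL nor a separator. */
    static boolean isTokenChar(char c) {
        boolean isCtl = c <= 31 || c == 127;
        return c < 128 && !isCtl && SEPARATORS.indexOf(c) < 0;
    }

    /** token = 1*<token char>, so the empty string is not a token. */
    static boolean isToken(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!isTokenChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isToken("Content-Length"));
        System.out.println(isToken("X Token"));
    }
}
```

Since field-name is defined as a token, a check like this is enough to validate a header name.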

 

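All of the parsing approaches above are forward parsing: the message as a whole, and each step such as the start line, the 1st header, the CRLF, the whole header section, and the body, is parsed from front to back. Parsing headers this way cannot rely on CRLF alone to find the end of a header, because the field-value may contain LWS, and LWS may include a CRLF. When LWS does contain a CRLF, however, that CRLF is always followed by SP or HT, whereas the CRLF that ends a header is never followed by SP or HT.

So the end of a header can be detected by looking at what follows a CRLF: if SP or HT follows, the header continues; otherwise the header ends there.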
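The rule above can be sketched as a small splitter. This is an illustration under assumptions: the class name HeaderLines is hypothetical, the input is the header section without its terminating blank line, and each fold is replaced by a single SP (which RFC 2616 permits for LWS between field-content).

```java
import java.util.ArrayList;
import java.util.List;

final class HeaderLines {
    /**
     * Splits a header section into logical headers. A CRLF followed by SP or HT
     * is folding and continues the current header; any other CRLF ends it.
     */
    static List<String> split(String block) {
        List<String> headers = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < block.length()) {
            if (block.startsWith("\r\n", i)) {
                boolean folded = i + 2 < block.length()
                        && (block.charAt(i + 2) == ' ' || block.charAt(i + 2) == '\t');
                i += 2;
                if (folded) {
                    // Replace the fold (CRLF plus the run of SP/HT) with one SP.
                    current.append(' ');
                    while (i < block.length()
                            && (block.charAt(i) == ' ' || block.charAt(i) == '\t')) {
                        i++;
                    }
                } else {
                    headers.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(block.charAt(i));
                i++;
            }
        }
        if (current.length() > 0) {
            headers.add(current.toString()); // last header had no trailing CRLF
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(split("X-Token: a\r\n\tb\r\nHost: localhost\r\n"));
    }
}
```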

 

 

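In fact, once the whole header section has been extracted, parsing each header in reverse makes single-header parsing very simple.

Reverse parsing

Parsing HTTP messages covers both request and response messages.

Parsing HTTP requests

Parsing HTTP responses

Incremental parsing

Incremental parsing suits HTTP messages sent or returned over the network, both requests received from a client or UA and responses returned by a server. Each read from the receive buffer may yield only part of a message, and the total length of the message is not known in advance, so parsing incrementally works better.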
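A minimal sketch of the idea, assuming the caller passes in whatever each read returned: the class IncrementalHead and its methods are hypothetical names, and the sketch only detects the end of the header section (CRLF CRLF), even when that sequence is split across reads.

```java
import java.io.ByteArrayOutputStream;

final class IncrementalHead {
    private static final byte[] END = {'\r', '\n', '\r', '\n'};

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int matched = 0;      // how many bytes of CRLF CRLF have matched so far
    private int headLength = -1;  // start line + headers + CRLF CRLF, once known

    /** Feeds one chunk; returns true once the end of the header section was seen. */
    boolean feed(byte[] chunk, int off, int len) {
        for (int i = off; i < off + len; i++) {
            byte b = chunk[i];
            buffer.write(b);
            if (headLength >= 0) {
                continue; // headers are complete: just keep buffering body bytes
            }
            if (b == END[matched]) {
                matched++;
            } else {
                // A CR can always start a new match; anything else resets it.
                matched = (b == '\r') ? 1 : 0;
            }
            if (matched == END.length) {
                headLength = buffer.size();
            }
        }
        return headLength >= 0;
    }

    /** Length of the start line and header section, or -1 if not yet complete. */
    int headLength() {
        return headLength;
    }

    /** Everything received so far, headers and any body bytes. */
    byte[] bytes() {
        return buffer.toByteArray();
    }
}
```

A caller would keep calling feed after each socket read until it returns true, and only then parse the first headLength() bytes.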

 

Parsing HTTP messages with regular expressions
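As one possible illustration of this approach, the sketch below matches only the Request-Line (Method SP Request-URI SP HTTP-Version) with java.util.regex. The class name is hypothetical, the line is assumed to have had its CRLF removed already, and \S+ is a deliberately loose stand-in for the Request-URI grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RequestLineRegex {
    // Group 1: Method (a token). Group 2: Request-URI (loosely, any non-space run).
    // Groups 3 and 4: major and minor digits of HTTP-Version.
    private static final Pattern REQUEST_LINE = Pattern.compile(
            "^([!#$%&'*+\\-.^_`|~0-9A-Za-z]+) (\\S+) HTTP/([0-9]+)\\.([0-9]+)$");

    /** Returns {method, uri, version} or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = REQUEST_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3) + "." + m.group(4)};
    }

    public static void main(String[] args) {
        String[] parts = parse("GET /start.jpg HTTP/1.1");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

A regular expression handles the start line comfortably; folded header values are harder to express this way, which is why the header rules deserve separate treatment.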

 

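Parsing HTTP messages by lexical analysis

The HTTP specification defines a generic grammar, made up of a series of rules, for request and response messages. See "Generic Grammar" in the HTTP specification.

HTTP messages can therefore also be parsed with the usual lexical-analysis techniques.

Some rules in the HTTP specification

The definition of HTTP looks rigorous, yet at times it feels too casual. It spells out many details, but the closer one looks at them the vaguer they become, especially in the description of messages. The overall impression is loose.

The *rule rule in the HTTP specification

The HTTP specification has this rule:

*rule

It is used heavily throughout the specification. A * in front of an element, as in *(element), means the element is repeated n times, where n is 0, 1, 2, ... with no upper bound.

Its full form is:

<n>*<m>element

which means at least n and at most m occurrences of element.

For example:

1*( SP | HT )

means one or more SP or HT characters. SP is the space character and HT is the horizontal tab.

For example:

*(message-header CRLF)

means n occurrences of (message-header CRLF), where n is 0, 1, 2, ... with no upper bound.

For example:

*( field-content | LWS )

means n occurrences of field-content or LWS, where n is 0, 1, 2, ... with no upper bound.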
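The repetition rule maps naturally onto regular-expression quantifiers: *element corresponds to *, 1*element to +, and <n>*<m>element to {n,m}. As a small sketch, here is LWS = [CRLF] 1*( SP | HT ) written that way. The class name is hypothetical.

```java
import java.util.regex.Pattern;

final class LwsPattern {
    // [CRLF]        -> (?:\r\n)?   optional group
    // 1*( SP | HT ) -> [ \t]+      one or more of SP or HT
    static final Pattern LWS = Pattern.compile("(?:\\r\\n)?[ \\t]+");

    /** True if the whole string is exactly one LWS element. */
    static boolean isLws(String s) {
        return LWS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLws("\r\n \t"));
        System.out.println(isLws("\r\n"));
    }
}
```

Note that a bare CRLF is not LWS, because at least one SP or HT must follow it.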

 

 

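The field-value rule for request/response header fields

Consider the message structure that HTTP defines for requests and responses. It looks like a simple text transfer protocol, but it is actually fairly complex and full of details. The definition of a header alone is long and intricate, for example the description of the field-value of a request or response header field:

    field-value = *( field-content | LWS )

In practice this is fairly complex. What one usually sees is a plain text string, but the details involve SP, HT, LWS, CR, LF, CRLF, repetition (*), multi-line values and folding. Once these are combined, repeated, folded, and combined again, the rules become quite involved.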

 

 

 

 

Request chain and response chain

                            ----------request chain-------->
user agent(UA)----------------v-------------------origin server(O)
                            <--------response chain--------

                            -----------------request chain--------------->
user agent(UA)---------- A ---------- B ---------- C ----------origin server(O)
                            <---------------response chain---------------

 

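Requests

According to the HTTP specification RFC 2616, the simplest possible request is at least:

Method SP Request-URI SP HTTP-Version CRLF
CRLF

It has no headers and no message body. Method SP Request-URI SP HTTP-Version CRLF is the request line. The CRLF after it marks the end of the (empty) header section, and together with the CRLF that ends the request line itself that makes two CRLFs.

Some older implementations, however, may send only:

Method SP Request-URI SP HTTP-Version CRLF

or

Method SP Request-URI CRLF

When parsing a request, the request line ends at the CRLF that follows Method SP Request-URI SP HTTP-Version.

When extracting the header section (the whole section, not a single header): if there are no headers, the request line is immediately followed by another CRLF. If there are headers, the request line is followed by the name of the first header instead. The next CRLF cannot be taken as the end of the header section, because every header ends with a CRLF, and with folding a single header may contain several CRLFs. Instead, on reaching a CRLF, check the next two bytes: only if they are also CRLF has the end of the header section been reached.

Some examples:

Method SP Request-URI SP HTTP-Version CRLF
CRLF

No headers; the request line is followed directly by another CRLF.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:CRLF
CRLF

One X-Token header with no value.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:SP SP SP SP CRLF
CRLF

One X-Token header with no value.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:HT HT HT HT CRLF
CRLF

One X-Token header with no value.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:SP HT SP HT CRLF
CRLF

One X-Token header with no value.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:CRLF SP CRLF
CRLF

One X-Token header with no value, but with folding.

Method SP Request-URI SP HTTP-Version CRLF
X-Token:CRLF HT CRLF
CRLF

One X-Token header with no value, but with folding.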

Method SP Request-URI SP HTTP-Version CRLF

X-Token: CRLF HT HT HT HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF HT CRLF

CRLF

One X-Token header with no value, but with folding.

 

See also this related article: https://lobin.iteye.com/blog/2438193

A simple example:

GET / HTTP/1.1

User-Agent: curl/7.30.0

Host: localhost

Accept: */*

 

 

 

A more complex example:

GET / HTTP/1.1

Host: localhost

Connection: keep-alive

Cache-Control: max-age=0

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

Upgrade-Insecure-Requests: 1

User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/49.0.2623.112 Safari/537.36

Accept-Encoding: gzip, deflate, sdch

Accept-Language: zh-CN,zh;q=0.8

 

 

 

 

 

 

A file upload request example:

POST /upload?version=1.1 HTTP/1.1
User-Agent: Jakarta Commons-HttpClient/3.1
Host: localhost:8081
Content-Length: 7209
Content-Type: multipart/form-data; boundary=kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F

--kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F
Content-Disposition: form-data; name="newSecret"
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit

c5a0f4e6-af9e-4100-91d9-cc04dcb0d14d
--kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F
Content-Disposition: form-data; name="fileName"
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ABC装饰模式.bmp
--kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F
Content-Disposition: form-data; name="file"; filename="????.bmp"; filename="装饰模式.bmp"
Content-Type: application/octet-stream; charset=utf-8
Content-Transfer-Encoding: binary

[binary PNG image data omitted]
--kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F--
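In the upload above, each part starts with "--" followed by the boundary from the Content-Type header, and the body ends with the same delimiter plus a trailing "--". A minimal sketch of extracting the boundary is shown below; the class name is hypothetical, and splitting on ";" is a simplification that assumes the boundary itself contains no semicolon.

```java
final class Boundary {
    private static final String KEY = "boundary=";

    /** Returns the boundary parameter of a Content-Type value, or null if absent. */
    static String of(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are matched case-insensitively.
            if (p.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String value = p.substring(KEY.length());
                // Strip surrounding double quotes if the value is a quoted-string.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String b = of("multipart/form-data; boundary=kyyDslUM6Gm9renr5e6Hh_lUgJyRhHpKe5F");
        System.out.println("--" + b);        // delimiter before each part
        System.out.println("--" + b + "--"); // closing delimiter
    }
}
```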

 

 

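An implementation of HTTP involves these components: client, UA, and the server-side components cache, proxy, gateway and origin server.

client

UA

server

cache

proxy

gateway

Writing HTTP applications therefore includes writing the client side, that is the client and the UA. The UA is usually a browser, although a browser also renders pages and interprets scripts, which is not part of an HTTP client; it can equally be an ordinary client program. Client-side proxies and client-side caches belong here too. Besides the client side there is the HTTP server, which also covers HTTP cache servers, HTTP proxy servers and HTTP gateway servers.

Writing an HTTP server

An HTTP server usually also includes the following modules, each of which can also be a standalone server: an HTTP cache server, a proxy server, a gateway server.

Writing an HTTP cache server

Writing an HTTP proxy server

Writing an HTTP gateway server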

 

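HTTP/1.1 defines the "close" connection option for the sender to signal that the connection will be closed after the response completes.

Connection: close

If a client's request contains the Connection: Keep-Alive header, the client intends to keep the connection open and will not close it on its own after receiving the response. The server may still choose to close the connection right after responding, in which case its response should include the Connection: close header.

Connection: Keep-Alive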
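How a receiver might act on these options can be sketched as below. The class and method names are hypothetical. The sketch assumes that HTTP/1.1 connections are persistent unless "close" is present, and, as an assumption that goes beyond the text above, that an HTTP/1.0 peer must ask for persistence with "keep-alive".

```java
import java.util.Locale;

final class Persistence {
    /**
     * Decides whether to keep the connection open after the current message.
     * connectionHeader is the Connection field-value, or null if the header is absent.
     */
    static boolean keepOpen(int major, int minor, String connectionHeader) {
        boolean close = false;
        boolean keepAlive = false;
        if (connectionHeader != null) {
            // Connection carries a comma-separated list of case-insensitive tokens.
            for (String raw : connectionHeader.split(",")) {
                String token = raw.trim().toLowerCase(Locale.ROOT);
                if (token.equals("close")) {
                    close = true;
                } else if (token.equals("keep-alive")) {
                    keepAlive = true;
                }
            }
        }
        if (close) {
            return false; // "close" always wins
        }
        boolean http11OrLater = major > 1 || (major == 1 && minor >= 1);
        return http11OrLater || keepAlive;
    }

    public static void main(String[] args) {
        System.out.println(keepOpen(1, 1, "Keep-Alive"));
        System.out.println(keepOpen(1, 1, "close"));
    }
}
```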

 

Request methods in the HTTP specification

    Method           = "OPTIONS"                ; Section 9.2
                     | "GET"                    ; Section 9.3
                     | "HEAD"                   ; Section 9.4
                     | "POST"                   ; Section 9.5
                     | "PUT"                    ; Section 9.6
                     | "DELETE"                 ; Section 9.7
                     | "TRACE"                  ; Section 9.8
                     | "CONNECT"                ; Section 9.9
                     | extension-method
    extension-method = token

RFC 1945 (HTTP/1.0) defines only three request methods: GET, HEAD and POST.

HTTP/1.1 actually exists in two versions: RFC 2068 and the revised HTTP/1.1 of RFC 2616.

RFC 2068 (HTTP/1.1) adds OPTIONS, PUT, DELETE and TRACE to the methods of RFC 1945 (HTTP/1.0).

RFC 2616 (HTTP/1.1) adds CONNECT to the methods of RFC 2068, so RFC 2616 defines 8 request methods in total.

 

 

OPTIONS

Example:

Request:

OPTIONS /test?data=aaaa HTTP/1.1\r\n
Host: localhost\r\n
\r\n

Response:

HTTP/1.1 200 

Allow: GET,HEAD,POST,PUT,PATCH,DELETE,OPTIONS

Content-Length: 0

Date: Sat, 23 Mar 2019 19:14:13 GMT

 

GET

 

Here is an example of wget downloading an HTML document:

$ wget -d http://localhost:81

DEBUG output created by Wget 1.21 on darwin16.7.0.

 

Reading HSTS entries from /Users/admin/.wget-hsts

Converted file name 'index.html' (UTF-8) -> 'index.html' (US-ASCII)

--2022-05-25 22:25:39--  http://localhost:81/

Resolving localhost... ::1, 127.0.0.1

Caching localhost => ::1 127.0.0.1

Connecting to localhost|::1|:81... Closed fd 5

failed: Connection refused.

Connecting to localhost|127.0.0.1|:81... connected.

Created socket 5.

Releasing 0x00007fd11cf00c90 (new refcount 1).

 

---request begin---

GET / HTTP/1.1

User-Agent: Wget/1.21

Accept: */*

Accept-Encoding: identity

Host: localhost:81

Connection: Keep-Alive

 

---request end---

HTTP request sent, awaiting response... 

---response begin---

HTTP/1.1 200 OK

Server: nginx/1.22.0

Date: Wed, 25 May 2022 14:25:39 GMT

Content-Type: text/html

Content-Length: 672

Last-Modified: Wed, 25 May 2022 12:16:25 GMT

Connection: keep-alive

ETag: "628e1e19-2a0"

Accept-Ranges: bytes

 

---response end---

200 OK

Registered socket 5 for persistent reuse.

Length: 672 [text/html]

Saving to: 'index.html'

 

index.html          100%[===================>]     672  --.-KB/s    in 0s      

 

2022-05-25 22:25:39 (42.7 MB/s) - 'index.html' saved [672/672]

 

 

 

Here is an example of wget downloading an image:

$ wget -d http://localhost:81/start.jpg

DEBUG output created by Wget 1.21 on darwin16.7.0.

 

Reading HSTS entries from /Users/admin/.wget-hsts

Converted file name 'start.jpg' (UTF-8) -> 'start.jpg' (US-ASCII)

--2022-05-25 20:27:23--  http://localhost:81/start.jpg

Resolving localhost... ::1, 127.0.0.1

Caching localhost => ::1 127.0.0.1

Connecting to localhost|::1|:81... Closed fd 5

failed: Connection refused.

Connecting to localhost|127.0.0.1|:81... connected.

Created socket 5.

Releasing 0x00007fbe1fd00bc0 (new refcount 1).

 

---request begin---

GET /start.jpg HTTP/1.1

User-Agent: Wget/1.21

Accept: */*

Accept-Encoding: identity

Host: localhost:81

Connection: Keep-Alive

 

---request end---

HTTP request sent, awaiting response... 

---response begin---

HTTP/1.1 200 OK

Server: nginx/1.22.0

Date: Wed, 25 May 2022 12:27:23 GMT

Content-Type: image/jpeg

Content-Length: 61932

Last-Modified: Wed, 25 May 2022 12:18:25 GMT

Connection: keep-alive

ETag: "628e1e91-f1ec"

Accept-Ranges: bytes

 

---response end---

200 OK

Registered socket 5 for persistent reuse.

Length: 61932 (60K) [image/jpeg]

Saving to: 'start.jpg'

 

start.jpg           100%[===================>]  60.48K  --.-KB/s    in 0s      

 

 

2022-05-25 20:27:23 (120 MB/s) - 'start.jpg' saved [61932/61932]

 

 

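Here is one more example, which opens an HTTP connection through a Java URL and requests an image:

        // Needs: import java.io.IOException; import java.io.InputStream;
        //        import java.net.MalformedURLException; import java.net.URL;
        //        import java.net.URLConnection;
        // Log.d is kept from the original and is assumed to be a logging helper.
        String host = "192.168.0.3";
        int port = 16191;

        String url = "http://" + host + ":" + port + "/start.jpg";
        try {
            URL httpUrl = new URL(url);
            URLConnection conn = httpUrl.openConnection();

            byte[] bytes = new byte[512];
            int offset = 0, nbytes;
            // Read the whole body, doubling the buffer whenever it fills up.
            try (InputStream is = conn.getInputStream()) {
                while ((nbytes = is.read(bytes, offset, bytes.length - offset)) > 0) {
                    offset += nbytes;
                    if (offset >= bytes.length) {
                        byte[] newBytes = new byte[bytes.length * 2];
                        System.arraycopy(bytes, 0, newBytes, 0, bytes.length);
                        bytes = newBytes;
                    }
                }
            }

            // Hex dump: 16 bytes per line, with an extra gap after every 8 bytes.
            for (int i = 0; i < offset; i++) {
                // Mask with 0xFF so that bytes >= 0x80 do not print as ffffffxx.
                System.out.print(Integer.toHexString(bytes[i] & 0xFF) + " ");
                if ((i + 1) % 8 == 0) {
                    System.out.print(" ");
                }
                if ((i + 1) % 16 == 0) {
                    System.out.println();
                }
            }

            System.out.println(new String(bytes, 0, offset));
        } catch (MalformedURLException e) {
            // Caught before IOException, which is its superclass.
            Log.d(getClass().getName(), "Invalid url: " + url);
        } catch (IOException e) {
            e.printStackTrace();
        }

The code above sends an HTTP request like this: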

GET /start.jpg HTTP/1.1

User-Agent: Java/11.0.11

Host: 192.168.0.3:16191

Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2

Connection: keep-alive

 

 

 

There are also conditional GET and partial GET.

 

conditional GET

 

partial GET

 

HEAD

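The HEAD method is identical to GET except that the server must not return a message body in the response. Apart from that, the metainformation in the header fields of a HEAD response should be the same as in the response to the corresponding GET request.

Request:

HEAD /test?data=aaaa HTTP/1.1\r\n
Host: localhost\r\n
\r\n

Response: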

HTTP/1.1 200 

Content-Type: text/plain;charset=UTF-8

Content-Length: 4

Date: Sat, 23 Mar 2019 16:59:49 GMT

 

POST

Responses to POST are not cacheable unless the response includes Cache-Control or Expires header fields. A 303 response, however, can direct the user agent (UA) to retrieve a cacheable resource.

Example

Request:

POST /user/create2.jsp HTTP/1.1\r\n
Host: localhost\r\n
Content-Length:19\r\n
Content-Type: application/x-www-form-urlencoded\r\n
\r\n

Response:

HTTP/1.1 200 

Set-Cookie: JSESSIONID=5D18E9CF6362460F5B9835619A867B7C; Path=/; HttpOnly

Content-Type: text/html;charset=UTF-8

Content-Length: 57

Date: Sat, 23 Mar 2019 19:00:02 GMT

 

 

 

<html>

<body>

<h2>Create Success!</h2>

</body>

 

 

</html>

 

 

Request:

POST /user/create.jsp HTTP/1.1\r\n
Host: localhost\r\n
Content-Length:19\r\n
Content-Type: application/x-www-form-urlencoded\r\n
\r\n

Response:

HTTP/1.1 302 
Set-Cookie: JSESSIONID=BA2ED490B80165EB6555F81EB130A6EB; Path=/; HttpOnly
Location: http://localhost/index.html
Content-Type: text/html;charset=UTF-8
Content-Length: 0
Date: Sat, 23 Mar 2019 18:56:31 GMT

 

 

 

Request (captured from a browser session):

 

POST /user/create.jsp HTTP/1.1

Host: localhost:8081

Connection: keep-alive

Content-Length: 19

Cache-Control: max-age=0

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

Origin: http://localhost:8081

Upgrade-Insecure-Requests: 1

User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36

Content-Type: application/x-www-form-urlencoded

Referer: http://localhost:8081/user/create.html

Accept-Encoding: gzip, deflate

Accept-Language: zh-CN,zh;q=0.8

Cookie: JSESSIONID=D04DB72DCC214F53A54F12339337FE90

 

 

username=&password=

 

Response:

HTTP/1.1 302 

Set-Cookie: JSESSIONID=BB22DC186C3D7F929F4918BA62CD372E; Path=/; HttpOnly

Location: http://localhost:8081/index.html

Content-Type: text/html;charset=UTF-8

Content-Length: 0

Date: Sat, 23 Mar 2019 19:04:51 GMT

 

PUT

 

 

DELETE

 

 

TRACE

 

 

CONNECT

 

 

Some HTTP implementations also support PATCH.

PATCH

Example

Request:

PATCH /test?data=aaaa HTTP/1.1\r\n
Host: localhost\r\n
\r\n

Response:

HTTP/1.1 200 

Content-Type: text/plain;charset=UTF-8

Content-Length: 4

Date: Sat, 23 Mar 2019 19:16:44 GMT

 

aaaa

 

 

 

 

 

 

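Status line, message headers and message body of a response in the HTTP specification

Status line and message headers of an HTTP response

An HTTP response consists of a status line, message headers, a message body, and the entity it carries.

After processing a request, the HTTP server sends a response message back to the client:

    Response = Status-Line              ; Section 6.1
               *(( general-header       ; Section 4.5
                | response-header       ; Section 6.2
                | entity-header ) CRLF) ; Section 7.1
               CRLF
               [ message-body ]         ; Section 7.2

The response message begins with a status line:

    Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

For example:

HTTP/1.1 100 Continue
HTTP/1.1 200 OK
HTTP/1.1 304 Not Modified
HTTP/1.1 404 Not Found
HTTP/1.1 500 Internal Server Error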
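The Status-Line rule can be sketched as a small parser. The class StatusLine and its fields are hypothetical names; the sketch assumes the CRLF has already been removed and tolerates an empty Reason-Phrase, as in the "HTTP/1.1 200 " responses captured elsewhere in this article.

```java
final class StatusLine {
    final String version;
    final int code;
    final String reason;

    private StatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    /** Parses HTTP-Version SP Status-Code SP Reason-Phrase (without the CRLF). */
    static StatusLine parse(String line) {
        int sp1 = line.indexOf(' ');
        if (sp1 < 0 || !line.startsWith("HTTP/")) {
            throw new IllegalArgumentException("not a status line: " + line);
        }
        int sp2 = line.indexOf(' ', sp1 + 1);
        String codeText = sp2 < 0 ? line.substring(sp1 + 1) : line.substring(sp1 + 1, sp2);
        // Status-Code is exactly three digits.
        if (!codeText.matches("[0-9]{3}")) {
            throw new IllegalArgumentException("bad status code: " + codeText);
        }
        // The reason phrase may contain spaces, so take everything after the second SP.
        String reason = sp2 < 0 ? "" : line.substring(sp2 + 1);
        return new StatusLine(line.substring(0, sp1), Integer.parseInt(codeText), reason);
    }

    public static void main(String[] args) {
        StatusLine s = parse("HTTP/1.1 404 Not Found");
        System.out.println(s.version + " | " + s.code + " | " + s.reason);
    }
}
```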

 

 

Next comes a set of response header fields: general headers, response headers and entity headers.

    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>

Message body

    message-body = entity-body
                 | <entity-body encoded as per Transfer-Encoding>

Here is an example of wget requesting an image and getting a 404 response:

$ wget -d http://localhost:81/start1.jpg

DEBUG output created by Wget 1.21 on darwin16.7.0.

 

Reading HSTS entries from /Users/admin/.wget-hsts

Converted file name 'start1.jpg' (UTF-8) -> 'start1.jpg' (US-ASCII)

--2022-05-26 00:10:51--  http://localhost:81/start1.jpg

Resolving localhost... ::1, 127.0.0.1

Caching localhost => ::1 127.0.0.1

Connecting to localhost|::1|:81... Closed fd 5

failed: Connection refused.

Connecting to localhost|127.0.0.1|:81... connected.

Created socket 5.

Releasing 0x00007fc577600a30 (new refcount 1).

 

---request begin---

GET /start1.jpg HTTP/1.1

User-Agent: Wget/1.21

Accept: */*

Accept-Encoding: identity

Host: localhost:81

Connection: Keep-Alive

 

---request end---

HTTP request sent, awaiting response... 

---response begin---

HTTP/1.1 404 Not Found

Server: nginx/1.22.0

Date: Wed, 25 May 2022 16:10:51 GMT

Content-Type: text/html

Content-Length: 153

Connection: keep-alive

 

---response end---

404 Not Found

Registered socket 5 for persistent reuse.

Skipping 153 bytes of body: [<html>

<head><title>404 Not Found</title></head>

<body>

<center><h1>404 Not Found</h1></center>

<hr><center>nginx/1.22.0</center>

</body>

</html>

] done.

2022-05-26 00:10:51 ERROR 404: Not Found.

 

 

 

A Baidu example:

This is the response to a request for https://www.baidu.com/, an HTTPS example.

HTTP/1.1 200 OK

Accept-Ranges: bytes

Cache-Control: no-cache

Connection: Keep-Alive

Content-Length: 14722

Content-Type: text/html

Date: Wed, 27 Feb 2019 09:14:09 GMT

Etag: "5c653bc8-3982"

Last-Modified: Thu, 14 Feb 2019 09:58:32 GMT

P3p: CP=" OTI DSP COR IVA OUR IND COM "

Pragma: no-cache

Server: BWS/1.1

Set-Cookie: BAIDUID=D75E7EE2E2BAA148E6917C73134D337B:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com

Set-Cookie: BIDUPSID=D75E7EE2E2BAA148E6917C73134D337B; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com

Set-Cookie: PSTM=1551258849; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com

Vary: Accept-Encoding

X-Ua-Compatible: IE=Edge,chrome=1

 

<!DOCTYPE html><!--STATUS OK-->

<html>

<head>

<meta http-equiv="content-type" content="text/html;charset=utf-8">

<meta http-equiv="X-UA-Compatible" content="IE=Edge">

<link rel="dns-prefetch" href="//s1.bdstatic.com"/>

<link rel="dns-prefetch" href="//t1.baidu.com"/>

<link rel="dns-prefetch" href="//t2.baidu.com"/>

<link rel="dns-prefetch" href="//t3.baidu.com"/>

<link rel="dns-prefetch" href="//t10.baidu.com"/>

<link rel="dns-prefetch" href="//t11.baidu.com"/>

<link rel="dns-prefetch" href="//t12.baidu.com"/>

<link rel="dns-prefetch" href="//b1.bdstatic.com"/>

<title>百度一下,你就知道</title>

<link href="https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/home/css/index.css" rel="stylesheet" type="text/css" />

<!--[if lte IE 8]><style index="index" >#content{height:480px\9}#m{top:260px\9}</style><![endif]-->

<!--[if IE 8]><style index="index" >#u1 a.mnav,#u1 a.mnav:visited{font-family:simsun}</style><![endif]-->

<script>var hashMatch = document.location.href.match(/#+(.*wd=[^&].+)/);if (hashMatch && hashMatch[0] && hashMatch[1]) {document.location.replace("http://"+location.host+"/s?"+hashMatch[1]);}var ns_c = function(){};</script>

<script>function h(obj){obj.style.behavior='url(#default#homepage)';var a = obj.setHomePage('//www.baidu.com/');}</script>

<noscript><meta http-equiv="refresh" content="0; url=/baidu.html?from=noscript"/></noscript>

<script>window._ASYNC_START=new Date().getTime();</script>

</head>

<body link="#0000cc"><div id="wrapper" style="display:none;"><div id="u"><a href="//www.baidu.com/gaoji/preferences.html"  onmousedown="return user_c({'fm':'set','tab':'setting','login':'0'})">搜索设置</a>|<a id="btop" href="/"  onmousedown="return user_c({'fm':'set','tab':'index','login':'0'})">百度首页</a>|<a id="lb" href="https://passport.baidu.com/v2/?login&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2F" onclick="return false;"  onmousedown="return user_c({'fm':'set','tab':'login'})">登录</a><a href="https://passport.baidu.com/v2/?reg&regType=1&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2F"  onmousedown="return user_c({'fm':'set','tab':'reg'})" target="_blank" class="reg">注册</a></div><div id="head"><div class="s_nav"><a href="/" class="s_logo" onmousedown="return c({'fm':'tab','tab':'logo'})"><img src="//www.baidu.com/img/baidu_jgylogo3.gif" width="117" height="38" border="0" alt="到百度首页" title="到百度首页"></a><div class="s_tab" id="s_tab"><a href="http://news.baidu.com/ns?cl=2&rn=20&tn=news&word=" wdfield="word"  onmousedown="return c({'fm':'tab','tab':'news'})">新闻</a>&#12288;<b>网页</b>&#12288;<a href="http://tieba.baidu.com/f?kw=&fr=wwwt" wdfield="kw"  onmousedown="return c({'fm':'tab','tab':'tieba'})">贴吧</a>&#12288;<a href="http://zhidao.baidu.com/q?ct=17&pn=0&tn=ikaslist&rn=10&word=&fr=wwwt" wdfield="word"  onmousedown="return c({'fm':'tab','tab':'zhidao'})">知道</a>&#12288;<a href="http://music.baidu.com/search?fr=ps&key=" wdfield="key"  onmousedown="return c({'fm':'tab','tab':'music'})">音乐</a>&#12288;<a href="http://image.baidu.com/i?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&word=" wdfield="word"  onmousedown="return c({'fm':'tab','tab':'pic'})">图片</a>&#12288;<a href="http://v.baidu.com/v?ct=301989888&rn=20&pn=0&db=0&s=25&word=" wdfield="word"   onmousedown="return c({'fm':'tab','tab':'video'})">视频</a>&#12288;<a href="http://map.baidu.com/m?word=&fr=ps01000" wdfield="word"  onmousedown="return c({'fm':'tab','tab':'map'})">地图</a>&#12288;<a 
href="http://wenku.baidu.com/search?word=&lm=0&od=0" wdfield="word"  onmousedown="return c({'fm':'tab','tab':'wenku'})">文库</a>&#12288;<a href="//www.baidu.com/more/"  onmousedown="return c({'fm':'tab','tab':'more'})">更多»</a></div></div><form id="form" name="f" action="/s" class="fm" ><input type="hidden" name="ie" value="utf-8"><input type="hidden" name="f" value="8"><input type="hidden" name="rsv_bp" value="1"><span class="bg s_ipt_wr"><input name="wd" id="kw" class="s_ipt" value="" maxlength="100"></span><span class="bg s_btn_wr"><input type="submit" id="su" value="百度一下" class="bg s_btn" onmousedown="this.className='bg s_btn s_btn_h'" onmouseout="this.className='bg s_btn'"></span><span class="tools"><span id="mHolder"><div id="mCon"><span>输入法</span></div><ul id="mMenu"><li><a href="javascript:;" name="ime_hw">手写</a></li><li><a href="javascript:;" name="ime_py">拼音</a></li><li class="ln"></li><li><a href="javascript:;" name="ime_cl">关闭</a></li></ul></span><span class="shouji"><strong>推荐&nbsp;:&nbsp;</strong><a href="http://w.x.baidu.com/go/mini/8/10000020" onmousedown="return ns_c({'fm':'behs','tab':'bdbrowser'})">百度浏览器,打开网页快2秒!</a></span></span></form></div><div id="content"><div id="u1"><a href="http://news.baidu.com" name="tj_trnews" class="mnav">新闻</a><a href="http://www.hao123.com" name="tj_trhao123" class="mnav">hao123</a><a href="http://map.baidu.com" name="tj_trmap" class="mnav">地图</a><a href="http://v.baidu.com" name="tj_trvideo" class="mnav">视频</a><a href="http://tieba.baidu.com" name="tj_trtieba" class="mnav">贴吧</a><a href="https://passport.baidu.com/v2/?login&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2F" name="tj_login" id="lb" onclick="return false;">登录</a><a href="//www.baidu.com/gaoji/preferences.html" name="tj_settingicon" id="pf">设置</a><a href="//www.baidu.com/more/" name="tj_briicon" id="bri">更多产品</a></div><div id="m"><p id="lg"><img src="//www.baidu.com/img/bd_logo.png" width="270" height="129"></p><p id="nv"><a 
href="http://news.baidu.com">新&nbsp;闻</a> <b>网&nbsp;页</b> <a href="http://tieba.baidu.com">贴&nbsp;吧</a> <a href="http://zhidao.baidu.com">知&nbsp;道</a> <a href="http://music.baidu.com">音&nbsp;乐</a> <a href="http://image.baidu.com">图&nbsp;片</a> <a href="http://v.baidu.com">视&nbsp;频</a> <a href="http://map.baidu.com">地&nbsp;图</a></p><div id="fm"><form id="form1" name="f1" action="/s" class="fm"><span class="bg s_ipt_wr"><input type="text" name="wd" id="kw1" maxlength="100" class="s_ipt"></span><input type="hidden" name="rsv_bp" value="0"><input type=hidden name=ch value=""><input type=hidden name=tn value="baidu"><input type=hidden name=bar value=""><input type="hidden" name="rsv_spt" value="3"><input type="hidden" name="ie" value="utf-8"><span class="bg s_btn_wr"><input type="submit" value="百度一下" id="su1" class="bg s_btn" onmousedown="this.className='bg s_btn s_btn_h'" onmouseout="this.className='bg s_btn'"></span></form><span class="tools"><span id="mHolder1"><div id="mCon1"><span>输入法</span></div></span></span><ul id="mMenu1"><div class="mMenu1-tip-arrow"><em></em><ins></ins></div><li><a href="javascript:;" name="ime_hw">手写</a></li><li><a href="javascript:;" name="ime_py">拼音</a></li><li class="ln"></li><li><a href="javascript:;" name="ime_cl">关闭</a></li></ul></div><p id="lk"><a href="http://baike.baidu.com">百科</a> <a href="http://wenku.baidu.com">文库</a> <a href="http://www.hao123.com">hao123</a><span>&nbsp;|&nbsp;<a href="//www.baidu.com/more/">更多&gt;&gt;</a></span></p><p id="lm"></p></div></div><div id="ftCon"><div id="ftConw"><p id="lh"><a id="seth" onClick="h(this)" href="/" onmousedown="return ns_c({'fm':'behs','tab':'homepage','pos':0})">把百度设为主页</a><a id="setf" href="//www.baidu.com/cache/sethelp/index.html" onmousedown="return ns_c({'fm':'behs','tab':'favorites','pos':0})" target="_blank">把百度设为主页</a><a onmousedown="return ns_c({'fm':'behs','tab':'tj_about'})" href="http://home.baidu.com">关于百度</a><a onmousedown="return ns_c({'fm':'behs','tab':'tj_about_en'})" 
href="http://ir.baidu.com">About Baidu</a></p><p id="cp">&copy;2018&nbsp;Baidu&nbsp;<a href="/duty/" name="tj_duty">使用百度前必读</a>&nbsp;京ICP证030173号&nbsp;<img src="http://s1.bdstatic.com/r/www/cache/static/global/img/gs_237f015b.gif"></p></div></div><div id="wrapper_wrapper"></div></div><div class="c-tips-container" id="c-tips-container"></div>

<script>window.__async_strategy=2;</script>

<script>var bds={se:{},su:{urdata:[],urSendClick:function(){}},util:{},use:{},comm : {domain:"http://www.baidu.com",ubsurl : "http://sclick.baidu.com/w.gif",tn:"baidu",queryEnc:"",queryId:"",inter:"",templateName:"baidu",sugHost : "http://suggestion.baidu.com/su",query : "",qid : "",cid : "",sid : "",indexSid : "",stoken : "",serverTime : "",user : "",username : "",loginAction : [],useFavo : "",pinyin : "",favoOn : "",curResultNum:"",rightResultExist:false,protectNum:0,zxlNum:0,pageNum:1,pageSize:10,newindex:0,async:1,maxPreloadThread:5,maxPreloadTimes:10,preloadMouseMoveDistance:5,switchAddMask:false,isDebug:false,ishome : 1},_base64:{domain : "http://b1.bdstatic.com/",b64Exp : -1,pdc : 0}};var name,navigate,al_arr=[];var selfOpen = window.open;eval("var open = selfOpen;");var isIE=navigator.userAgent.indexOf("MSIE")!=-1&&!window.opera;var E = bds.ecom= {};bds.se.mon = {'loadedItems':[],'load':function(){},'srvt':-1};try {bds.se.mon.srvt = parseInt(document.cookie.match(new RegExp("(^| )BDSVRTM=([^;]*)(;|$)"))[2]);document.cookie="BDSVRTM=;expires=Sat, 01 Jan 2000 00:00:00 GMT"; }catch(e){}</script>

<script>if(!location.hash.match(/[^a-zA-Z0-9]wd=/)){document.getElementById("ftCon").style.display='block';document.getElementById("u1").style.display='block';document.getElementById("content").style.display='block';document.getElementById("wrapper").style.display='block';setTimeout(function(){try{document.getElementById("kw1").focus();document.getElementById("kw1").parentNode.className += ' iptfocus';}catch(e){}},0);}</script>

<script type="text/javascript" src="https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/jquery/jquery-1.10.2.min_f2fb5194.js"></script>

<script>(function(){var index_content = $('#content');var index_foot= $('#ftCon');var index_css= $('head [index]');var index_u= $('#u1');var result_u= $('#u');var wrapper=$("#wrapper");window.index_on=function(){index_css.insertAfter("meta:eq(0)");result_common_css.remove();result_aladdin_css.remove();result_sug_css.remove();index_content.show();index_foot.show();index_u.show();result_u.hide();wrapper.show();if(bds.su&&bds.su.U&&bds.su.U.homeInit){bds.su.U.homeInit();}setTimeout(function(){try{$('#kw1').get(0).focus();window.sugIndex.start();}catch(e){}},0);if(typeof initIndex=='function'){initIndex();}};window.index_off=function(){index_css.remove();index_content.hide();index_foot.hide();index_u.hide();result_u.show();result_aladdin_css.insertAfter("meta:eq(0)");result_common_css.insertAfter("meta:eq(0)");result_sug_css.insertAfter("meta:eq(0)");wrapper.show();};})();</script>

<script>window.__switch_add_mask=1;</script>

<script type="text/javascript" src="https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/global/js/instant_search_newi_redirect1_20bf4036.js"></script>

<script>initPreload();$("#u,#u1").delegate("#lb",'click',function(){try{bds.se.login.open();}catch(e){}});if(navigator.cookieEnabled){document.cookie="NOJS=;expires=Sat, 01 Jan 2000 00:00:00 GMT";}</script>

<script>$(function(){for(i=0;i<3;i++){u($($('.s_ipt_wr')[i]),$($('.s_ipt')[i]),$($('.s_btn_wr')[i]),$($('.s_btn')[i]));}function u(iptwr,ipt,btnwr,btn){if(iptwr && ipt){iptwr.on('mouseover',function(){iptwr.addClass('ipthover');}).on('mouseout',function(){iptwr.removeClass('ipthover');}).on('click',function(){ipt.focus();});ipt.on('focus',function(){iptwr.addClass('iptfocus');}).on('blur',function(){iptwr.removeClass('iptfocus');}).on('render',function(e){var $s = iptwr.parent().find('.bdsug');var l = $s.find('li').length;if(l>=5){$s.addClass('bdsugbg');}else{$s.removeClass('bdsugbg');}});}if(btnwr && btn){btnwr.on('mouseover',function(){btn.addClass('btnhover');}).on('mouseout',function(){btn.removeClass('btnhover');});}}});</script>

<script type="text/javascript" src="https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/home/js/bri_7f1fa703.js"></script>

<script>(function(){var _init=false;window.initIndex=function(){if(_init){return;}_init=true;var w=window,d=document,n=navigator,k=d.f1.wd,a=d.getElementById("nv").getElementsByTagName("a"),isIE=n.userAgent.indexOf("MSIE")!=-1&&!window.opera;(function(){if(/q=([^&]+)/.test(location.search)){k.value=decodeURIComponent(RegExp["\x241"])}})();(function(){var u = G("u1").getElementsByTagName("a"), nv = G("nv").getElementsByTagName("a"), lk = G("lk").getElementsByTagName("a"), un = "";var tj_nv = ["news","tieba","zhidao","mp3","img","video","map"];var tj_lk = ["baike","wenku","hao123","more"];un = bds.comm.user == "" ? "" : bds.comm.user;function _addTJ(obj){addEV(obj, "mousedown", function(e){var e = e || window.event;var target = e.target || e.srcElement;if(target.name){ns_c({'fm':'behs','tab':target.name,'un':encodeURIComponent(un)});}});}for(var i = 0; i < u.length; i++){_addTJ(u[i]);}for(var i = 0; i < nv.length; i++){nv[i].name = 'tj_' + tj_nv[i];}for(var i = 0; i < lk.length; i++){lk[i].name = 'tj_' + tj_lk[i];}})();(function() {var links = {'tj_news': ['word', 'http://news.baidu.com/ns?tn=news&cl=2&rn=20&ct=1&ie=utf-8'],'tj_tieba': ['kw', 'http://tieba.baidu.com/f?ie=utf-8'],'tj_zhidao': ['word', 'http://zhidao.baidu.com/search?pn=0&rn=10&lm=0'],'tj_mp3': ['key', 'http://music.baidu.com/search?fr=ps&ie=utf-8'],'tj_img': ['word', 'http://image.baidu.com/i?ct=201326592&cl=2&nc=1&lm=-1&st=-1&tn=baiduimage&istype=2&fm=&pv=&z=0&ie=utf-8'],'tj_video': ['word', 'http://video.baidu.com/v?ct=301989888&s=25&ie=utf-8'],'tj_map': ['wd', 'http://map.baidu.com/?newmap=1&ie=utf-8&s=s'],'tj_baike': ['word', 'http://baike.baidu.com/search/word?pic=1&sug=1&enc=utf8'],'tj_wenku': ['word', 'http://wenku.baidu.com/search?ie=utf-8']};var domArr = [G('nv'), G('lk'),G('cp')],kw = G('kw1');for (var i = 0, l = domArr.length; i < l; i++) {domArr[i].onmousedown = function(e) {e = e || window.event;var target = e.target || e.srcElement,name = target.getAttribute('name'),items = 
links[name],reg = new RegExp('^\\s+|\\s+\x24'),key = kw.value.replace(reg, '');if (items) {if (key.length > 0) {var wd = items[0], url = items[1],url = url + ( name === 'tj_map' ? encodeURIComponent('&' + wd + '=' + key) : ( ( url.indexOf('?') > 0 ? '&' : '?' ) + wd + '=' + encodeURIComponent(key) ) );target.href = url;} else {target.href = target.href.match(new RegExp('^http:\/\/.+\.baidu\.com'))[0];}}name && ns_c({'fm': 'behs','tab': name,'query': encodeURIComponent(key),'un': encodeURIComponent(bds.comm.user || '') });};}})();};if(window.pageState==0){initIndex();}})();document.cookie = 'IS_STATIC=1;expires=' + new Date(new Date().getTime() + 10*60*1000).toGMTString();</script>

</body></html>

 

Here is an example of my own.

This is the response to a request for http://localhost:8081/test?data=aaaa on my own server, over plain HTTP:

HTTP/1.1 200 
Content-Type: text/plain;charset=UTF-8
Content-Length: 4
Date: Wed, 27 Feb 2019 09:20:18 GMT

aaaa
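The response above can be taken apart in the order the grammar gives: status line, header fields up to the empty line, then exactly Content-Length bytes of body. The following is only a minimal sketch under stated assumptions: `SimpleResponseParser` and its fields are hypothetical names that do not appear in the original code, folded (multi-line) header values are not handled, and a repeated header name overwrites the earlier value. Note that the status line in this example has an empty reason phrase, which the sketch accepts.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original client code.
final class SimpleResponseParser {

    String version;
    int statusCode;
    String reasonPhrase;
    final Map<String, String> headers = new LinkedHashMap<>();
    byte[] body = new byte[0];

    static SimpleResponseParser parse(InputStream in) throws IOException {
        SimpleResponseParser r = new SimpleResponseParser();
        String statusLine = readLine(in);
        if (statusLine == null) {
            throw new IOException("empty response");
        }
        // Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2) {
            throw new IOException("malformed status line: " + statusLine);
        }
        r.version = parts[0];
        r.statusCode = Integer.parseInt(parts[1]);
        r.reasonPhrase = parts.length == 3 ? parts[2] : "";

        // Header fields run until the first empty line.
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IOException("malformed header: " + line);
            }
            String name = line.substring(0, colon).toLowerCase(Locale.ROOT);
            r.headers.put(name, line.substring(colon + 1).trim());
        }

        // Read exactly Content-Length bytes; other body framings are ignored here.
        String length = r.headers.get("content-length");
        if (length != null) {
            byte[] buf = new byte[Integer.parseInt(length)];
            int off = 0;
            while (off < buf.length) {
                int n = in.read(buf, off, buf.length - off);
                if (n == -1) {
                    throw new IOException("body truncated");
                }
                off += n;
            }
            r.body = buf;
        }
        return r;
    }

    // Reads one line terminated by CRLF (a bare LF is tolerated); null at end of stream.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        boolean readAny = false;
        int c;
        while ((c = in.read()) != -1) {
            readAny = true;
            if (c == '\n') {
                break;
            }
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return readAny ? sb.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = ("HTTP/1.1 200 \r\n"
                + "Content-Type: text/plain;charset=UTF-8\r\n"
                + "Content-Length: 4\r\n"
                + "Date: Wed, 27 Feb 2019 09:20:18 GMT\r\n"
                + "\r\n"
                + "aaaa").getBytes(StandardCharsets.US_ASCII);
        SimpleResponseParser r = parse(new ByteArrayInputStream(raw));
        System.out.println(r.statusCode + " " + new String(r.body, StandardCharsets.UTF_8));
    }
}
```

Reading the body by Content-Length rather than until end of stream means the parser does not depend on the server closing the connection.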

 

If the following HTTP request message is sent:

GET /test?data=aaaa\r\n

the server returns:

aaaa

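Here only the message content was sent back to the client: no status line and none of the usual response headers, which is puzzling at first sight. The likely explanation is that a request line without a protocol version has the form of an HTTP/0.9 Simple-Request (RFC 1945), and the reply to a Simple-Request is the entity body alone. In this example the request carried no header fields at all, only the request line GET /test?data=aaaa\r\n, and even that is incomplete because the protocol name and version are missing. The complete request line would be: GET /test?data=aaaa HTTP/1.1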
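The difference between the two request lines can be made explicit in code. The sketch below splits a request line on single spaces and reports whether the HTTP-Version part is present. `RequestLine` and `isSimpleRequest` are hypothetical names chosen for this illustration, and the strict single-space split is a simplifying assumption.

```java
// Hypothetical helper for illustration; not part of the original client code.
final class RequestLine {

    final String method;
    final String requestUri;
    final String httpVersion; // null when the request line carries no version

    private RequestLine(String method, String requestUri, String httpVersion) {
        this.method = method;
        this.requestUri = requestUri;
        this.httpVersion = httpVersion;
    }

    // Request-Line = Method SP Request-URI SP HTTP-Version CRLF
    static RequestLine parse(String line) {
        // Drop the trailing CRLF if the caller passed it along.
        String s = line.endsWith("\r\n") ? line.substring(0, line.length() - 2) : line;
        String[] parts = s.split(" ");
        if (parts.length == 2) {
            // Method and URI only: the version-less form discussed above.
            return new RequestLine(parts[0], parts[1], null);
        }
        if (parts.length == 3 && parts[2].startsWith("HTTP/")) {
            return new RequestLine(parts[0], parts[1], parts[2]);
        }
        throw new IllegalArgumentException("malformed request line: " + s);
    }

    boolean isSimpleRequest() {
        return httpVersion == null;
    }

    public static void main(String[] args) {
        System.out.println(parse("GET /test?data=aaaa\r\n").isSimpleRequest());
        System.out.println(parse("GET /test?data=aaaa HTTP/1.1\r\n").httpVersion);
    }
}
```

A server that supports both forms would branch on `isSimpleRequest()` to decide whether to send a status line and headers before the body.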

 

Java HTTP & HTTPS client code, built directly on Socket with no third-party libraries

 

HttpInputStream
 
import java.io.IOException;
import java.io.InputStream;

// Thin wrapper that delegates every call to the underlying (socket) input stream.
public class HttpInputStream extends InputStream {

    private final InputStream inputStream;

    public HttpInputStream(InputStream inputStream) {
        this.inputStream = inputStream;
    }

    @Override
    public int read() throws IOException {
        return inputStream.read();
    }

    @Override
    public int read(byte[] b) throws IOException {
        return inputStream.read(b);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return inputStream.read(b, off, len);
    }

    @Override
    public long skip(long n) throws IOException {
        return inputStream.skip(n);
    }

    @Override
    public int available() throws IOException {
        return inputStream.available();
    }

    @Override
    public void close() throws IOException {
        inputStream.close();
    }

}
 
 
HttpOutputStream
 
import java.io.IOException;
import java.io.OutputStream;

// Thin wrapper that delegates every call to the underlying (socket) output stream.
public class HttpOutputStream extends OutputStream {

    private final OutputStream outputStream;

    public HttpOutputStream(OutputStream outputStream) {
        this.outputStream = outputStream;
    }

    @Override
    public void write(int b) throws IOException {
        outputStream.write(b);
    }

    @Override
    public void write(byte[] b) throws IOException {
        outputStream.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        outputStream.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        outputStream.flush();
    }

    @Override
    public void close() throws IOException {
        outputStream.close();
    }
}
 
 
HttpConnection
 
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import javax.net.ssl.SSLSocketFactory;

// Opens a plain or TLS socket to host:port and exposes its wrapped streams.
public class HttpConnection {

    private final String host;
    private final int port;
    private final boolean secure; // true: connect through the default SSLSocketFactory
    private Socket socket;
    private InputStream is;
    private OutputStream os;

    public HttpConnection(String host, int port) {
        this(host, port, false);
    }

    public HttpConnection(String host, int port, boolean secure) {
        this.host = host;
        this.port = port;
        this.secure = secure;
    }

    // Connects the socket and wraps its streams; call once before using the streams.
    public void open() throws IOException {
        if (secure) {
            socket = SSLSocketFactory.getDefault().createSocket(host, port);
        } else {
            socket = new Socket(host, port);
        }
        is = new HttpInputStream(socket.getInputStream());
        os = new HttpOutputStream(socket.getOutputStream());
    }

    public static HttpConnection getConnection(String host, int port) throws IOException {
        HttpConnection connection = new HttpConnection(host, port);
        connection.open();
        return connection;
    }

    public static HttpConnection getConnection(String host, int port, boolean secure) throws IOException {
        HttpConnection connection = new HttpConnection(host, port, secure);
        connection.open();
        return connection;
    }

    public InputStream getInputStream() {
        return is;
    }

    public OutputStream getOutputStream() {
        return os;
    }

    // Releases the socket; the streams are unusable afterwards.
    public void close() throws IOException {
        if (socket != null) {
            socket.close();
        }
    }
}
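One point worth knowing about the secure branch: a socket obtained from `SSLSocketFactory.getDefault()` validates the certificate chain, but by default it does not check that the certificate matches the host name unless an endpoint identification algorithm is set. The sketch below shows one way to switch that check on. `TlsHostnameCheck` is a hypothetical helper name, and the assumption is that it is applied to the socket returned by `createSocket(host, port)` before the first read or write, since the TLS handshake starts lazily.

```java
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;

// Hypothetical helper for illustration; not part of the original client code.
final class TlsHostnameCheck {

    // Enables HTTPS-style host name verification for the upcoming handshake.
    static void enable(SSLSocket socket) {
        SSLParameters params = socket.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        socket.setSSLParameters(params);
    }
}
```

In `open()` this would amount to casting the created socket to `SSLSocket` and passing it to `TlsHostnameCheck.enable` before the streams are used.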
 
HTTP client code:
 
// Requires: java.io.*, java.net.UnknownHostException, java.nio.charset.StandardCharsets, org.junit.Test
@Test
public void http() throws IOException {
    HttpConnection connection = HttpConnection.getConnection("www.baidu.com", 80);
    try {
        // Header fields are ASCII; name the charset instead of relying on the platform default.
        OutputStreamWriter writer = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.US_ASCII);
        //writer.write("GET http://m.alaemall.com/default.shtml HTTP/1.0\r\n");
        //writer.write("GET /default.shtml HTTP/1.0\r\n");
        //writer.write("GET http://m.alaemall.com/default.shtml HTTP/1.1\r\n");
        //writer.write("GET /default.shtml HTTP/1.1\r\n");
        writer.write("GET / HTTP/1.1\r\n");
        //writer.write("Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n");
        //writer.write("Accept-Encoding:gzip,deflate,sdch\r\n");
        //writer.write("Accept-Language:zh-CN,zh;q=0.8\r\n");
        //writer.write("Cache-Control:max-age=0\r\n");
        //writer.write("Connection:keep-alive\r\n");
        //writer.write("Cookie:JSESSIONID=98F8CB4369EAD78CC321B2E5BD344191\r\n");
        //writer.write("Host:m.alaemall.com\r\n");
        //writer.write("Host:Host:m.alaemall.com\r\n");
        //writer.write("Host:m.alaepay.com\r\n");
        writer.write("Host:www.baidu.com\r\n");
        //writer.write("Host: \r\n");
        writer.write("\r\n"); // the empty line ends the header section
        writer.flush();
        InputStream is = connection.getInputStream();
        // Collect raw bytes first so a multi-byte character split across two reads is not corrupted.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] b = new byte[512];
        int nb;
        // Reads until the server closes the connection. With HTTP/1.1 keep-alive this can
        // block until the server's idle timeout unless "Connection: close" is sent.
        while ((nb = is.read(b)) != -1) {
            buffer.write(b, 0, nb);
        }
        System.out.println(buffer.toString("UTF-8"));
    } catch (UnknownHostException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        connection.getInputStream().close(); // closing the socket's stream also closes the socket
    }
}
 
HTTPS client code:
 
// Requires: java.io.*, java.net.UnknownHostException, java.nio.charset.StandardCharsets, org.junit.Test
@Test
public void https() throws IOException {
    // Same request as above, but over TLS on port 443.
    HttpConnection connection = HttpConnection.getConnection("www.baidu.com", 443, true);
    try {
        // Header fields are ASCII; name the charset instead of relying on the platform default.
        OutputStreamWriter writer = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.US_ASCII);
        //writer.write("GET http://m.alaemall.com/default.shtml HTTP/1.0\r\n");
        //writer.write("GET /default.shtml HTTP/1.0\r\n");
        //writer.write("GET http://m.alaemall.com/default.shtml HTTP/1.1\r\n");
        //writer.write("GET /default.shtml HTTP/1.1\r\n");
        writer.write("GET / HTTP/1.1\r\n");
        //writer.write("Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n");
        //writer.write("Accept-Encoding:gzip,deflate,sdch\r\n");
        //writer.write("Accept-Language:zh-CN,zh;q=0.8\r\n");
        //writer.write("Cache-Control:max-age=0\r\n");
        //writer.write("Connection:keep-alive\r\n");
        //writer.write("Cookie:JSESSIONID=98F8CB4369EAD78CC321B2E5BD344191\r\n");
        //writer.write("Host:m.alaemall.com\r\n");
        //writer.write("Host:Host:m.alaemall.com\r\n");
        //writer.write("Host:m.alaepay.com\r\n");
        writer.write("Host:www.baidu.com\r\n");
        //writer.write("Host: \r\n");
        writer.write("\r\n"); // the empty line ends the header section
        writer.flush();
        InputStream is = connection.getInputStream();
        // Collect raw bytes first so a multi-byte character split across two reads is not corrupted.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] b = new byte[512];
        int nb;
        // Reads until the server closes the connection. With HTTP/1.1 keep-alive this can
        // block until the server's idle timeout unless "Connection: close" is sent.
        while ((nb = is.read(b)) != -1) {
            buffer.write(b, 0, nb);
        }
        System.out.println(buffer.toString("UTF-8"));
    } catch (UnknownHostException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        connection.getInputStream().close(); // closing the socket's stream also closes the socket
    }
}
 
Response:
 
HTTP/1.1 200 OK
Server: bfe/1.0.8.18
Date: Thu, 29 Sep 2016 15:38:28 GMT
Content-Type: text/html
Content-Length: 14720
Connection: keep-alive
Last-Modified: Tue, 30 Aug 2016 09:16:00 GMT
Vary: Accept-Encoding
Set-Cookie: BAIDUID=FCC74F1440586BD1DE5DDDCCE6B4ABB5:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com
Set-Cookie: BIDUPSID=FCC74F1440586BD1DE5DDDCCE6B4ABB5; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com
Set-Cookie: PSTM=1475163508; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com
P3P: CP=" OTI DSP COR IVA OUR IND COM "
X-UA-Compatible: IE=Edge,chrome=1
Pragma: no-cache
Cache-control: no-cache
Accept-Ranges: bytes
Set-Cookie: __bsi=14310701737733550345_00_263_N_N_1_0301_002F_N_N_N_0; expires=Thu, 29-Sep-16 15:38:33 GMT; domain=www.baidu.com; path=/
 
<!DOCTYPE html><!--STATUS OK-->
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=Edge">
... ...
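The response above contains several Set-Cookie header fields. A parser that stores headers in a plain name-to-value map would keep only one of them, and Set-Cookie values cannot safely be joined with commas because the expires date itself contains a comma. The sketch below collects repeated fields into lists instead. `HeaderMultiMap` is a hypothetical name, the input is assumed to be the header section only (without the status line), and folded header values are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original client code.
final class HeaderMultiMap {

    // Parses CRLF-separated header lines into lower-cased name -> list of values.
    static Map<String, List<String>> parse(String headerBlock) {
        Map<String, List<String>> headers = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r\n")) {
            if (line.isEmpty()) {
                break; // the empty line marks the end of the header section
            }
            // Only the first colon separates name from value; later colons belong to the value.
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("malformed header: " + line);
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            headers.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String block = "Content-Type: text/html\r\n"
                + "Set-Cookie: PSTM=1475163508; path=/; domain=.baidu.com\r\n"
                + "Set-Cookie: BAIDUID=FCC74F1440586BD1DE5DDDCCE6B4ABB5:FG=1; path=/; domain=.baidu.com\r\n"
                + "\r\n";
        Map<String, List<String>> headers = parse(block);
        for (String cookie : headers.get("set-cookie")) {
            System.out.println(cookie);
        }
    }
}
```

Header names are lower-cased on insertion because field names are case-insensitive, so `Cache-control` and `Cache-Control` end up under the same key.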

 
