
Setting HTTP Request Headers with HttpClient


Monitoring a POST request with Firebug shows the following HTTP request headers:

Accept	application/json, text/javascript, */*
Accept-Encoding	gzip, deflate
Accept-Language	en-us,en;q=0.5
Cache-Control	no-cache
Content-Length	432
Content-Type	application/x-www-form-urlencoded; charset=UTF-8
Host	www.huaxixiang.com
Pragma	no-cache
Proxy-Connection	keep-alive
Referer	http://www.huaxixiang.com/CrossPriceDetail.action
User-Agent	Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko/20100101 Firefox/11.0
X-Requested-With	XMLHttpRequest

 

 

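HttpClient can imitate a browser when it requests a page and loads the HTML behind a URL, so that the site's content loads properly and the client is not restricted.

To show what the request headers look like, I added a small test project, LoginTest, with a single page, index.jsp, whose main code prints the HTTP header information of the incoming request.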

1. index.jsp

 

out.println("Protocol: " + request.getProtocol()); 
out.println("Scheme: " + request.getScheme()); 
out.println("Server Name: " + request.getServerName() ); 
out.println("Server Port: " + request.getServerPort()); 
out.println("Protocol: " + request.getProtocol()); 
out.println("Server Info: " + getServletConfig().getServletContext().getServerInfo()); 
out.println("Remote Addr: " + request.getRemoteAddr());
out.println("Remote Host: " + request.getRemoteHost()); 
out.println("Character Encoding: " + request.getCharacterEncoding()); 
out.println("Content Length: " + request.getContentLength()); 
out.println("Content Type: "+ request.getContentType()); 
out.println("Auth Type: " + request.getAuthType()); 
out.println("HTTP Method: " + request.getMethod()); 
out.println("Path Info: " + request.getPathInfo()); 
out.println("Path Trans: " + request.getPathTranslated()); 
out.println("Query String: " + request.getQueryString()); 
out.println("Remote User: " + request.getRemoteUser()); 
out.println("Session Id: " + request.getRequestedSessionId()); 
out.println("Request URI: " + request.getRequestURI()); 
out.println("Servlet Path: " + request.getServletPath()); 
out.println("Accept: " + request.getHeader("Accept")); 
out.println("Host: " + request.getHeader("Host")); 
out.println("Referer : " + request.getHeader("Referer")); 
out.println("Accept-Language : " + request.getHeader("Accept-Language")); 
out.println("Accept-Encoding : " + request.getHeader("Accept-Encoding")); 
out.println("User-Agent : " + request.getHeader("User-Agent")); 
out.println("Connection : " + request.getHeader("Connection")); 
out.println("Cookie : " + request.getHeader("Cookie")); 
out.println("Created : " + session.getCreationTime()); 
out.println("LastAccessed : " + session.getLastAccessedTime()); 
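The JSP above prints a fixed list of headers and needs a servlet container. To repeat the experiment without Tomcat, a rough stand-in can be sketched with the JDK's built-in `com.sun.net.httpserver.HttpServer` and `HttpURLConnection`. This is not the author's setup: the class name `HeaderEcho` and the methods `start` and `fetch` are made up for illustration, and the page only echoes the method, the URI and whatever headers arrive.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

class HeaderEcho {

    /** Starts an echo server on a free loopback port; the caller must stop it. */
    static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            StringBuilder body = new StringBuilder();
            body.append("HTTP Method: ").append(exchange.getRequestMethod()).append('\n');
            body.append("Request URI: ").append(exchange.getRequestURI()).append('\n');
            // Echo every request header that reached the server.
            for (Map.Entry<String, List<String>> h : exchange.getRequestHeaders().entrySet()) {
                body.append(h.getKey()).append(": ")
                    .append(String.join(", ", h.getValue())).append('\n');
            }
            byte[] bytes = body.toString().getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    /** Requests the echo page with one extra header and returns the response body. */
    static String fetch(int port, String headerName, String headerValue) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + "/index.jsp").toURL().openConnection();
        conn.setRequestProperty(headerName, headerValue);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start();
        try {
            int port = server.getAddress().getPort();
            System.out.print(fetch(port, "Referer", "http://www.yourdomain.com/xxxAction.html"));
        } finally {
            server.stop(0);
        }
    }
}
```

Because the handler loops over all received headers, it also shows headers the client adds on its own, which the fixed JSP list would miss.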

 

2. Loading http://127.0.0.1:8080/LoginTest/index.jsp in the IE browser returns the following:

 

Protocol: HTTP/1.1 
Scheme: http 
Server Name: 127.0.0.1 
Server Port: 8080 
Protocol: HTTP/1.1 
Server Info: Apache Tomcat/6.0.18 
Remote Addr: 127.0.0.1 
Remote Host: 127.0.0.1 
Character Encoding: null 
Content Length: -1 
Content Type: null 
Auth Type: null 
HTTP Method: GET 
Path Info: null 
Path Trans: null 
Query String: null 
Remote User: null 
Session Id: E2C384C095E34AD355684EB554517FB1 
Request URI: /LoginTest/index.jsp 
Servlet Path: /index.jsp 
Accept: */* 
Host: 127.0.0.1:8080 
Referer : null 
Accept-Language : en-us 
Accept-Encoding : gzip, deflate 
User-Agent : Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.3; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E) 
Connection : Keep-Alive 
Cookie : JSESSIONID=E2C384C095E34AD355684EB554517FB1 
Created : 1322294859981 
LastAccessed : 1322294859981

 

3. Loading http://127.0.0.1:8080/LoginTest/index.jsp with HttpClient, without setting any header information, returns the following:

 

Protocol: HTTP/1.1
Scheme: http
Server Name: 127.0.0.1
Server Port: 8080
Protocol: HTTP/1.1
Server Info: Apache Tomcat/6.0.18
Remote Addr: 127.0.0.1
Remote Host: 127.0.0.1
Character Encoding: null
Content Length: -1
Content Type: null
Auth Type: null
HTTP Method: GET
Path Info: null
Path Trans: null
Query String: null
Remote User: null
Session Id: null
Request URI: /LoginTest/index.jsp
Servlet Path: /index.jsp
Accept: null
Host: 127.0.0.1:8080
Referer : null
Accept-Language : null
Accept-Encoding : null
User-Agent : Apache-HttpClient/4.1.1 (java 1.5)
Connection : Keep-Alive
Cookie : null
Created : 1322293843369
LastAccessed : 1322293843369

 

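Analysis: this request simply loads the page and does not use a CookieStore to manage cookies automatically, so no Cookie or Session Id appears above. Unlike the browser request, Accept, Accept-Language and Accept-Encoding are not set either, and the User-Agent is HttpClient's default rather than a browser's.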
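What a cookie store adds can be sketched without any network traffic. The example below uses the JDK's `java.net.CookieManager` as a stand-in for HttpClient's CookieStore (an assumption made so the code runs without extra libraries). The class name `CookieReplayDemo`, the helper `cookiesFor` and the session id `ABC123` are invented for illustration.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.List;
import java.util.Map;

class CookieReplayDemo {

    /** Returns the Cookie header values the manager would attach to a request for uri. */
    static List<String> cookiesFor(CookieManager manager, URI uri) throws IOException {
        return manager.get(uri, Map.of()).getOrDefault("Cookie", List.of());
    }

    public static void main(String[] args) throws IOException {
        URI page = URI.create("http://127.0.0.1:8080/LoginTest/index.jsp");
        CookieManager manager = new CookieManager(); // in-memory store, default policy

        // Nothing stored yet: a request built now would carry no Cookie header.
        System.out.println("before: " + cookiesFor(manager, page));

        // Simulate the Set-Cookie header of a first response (made-up session id).
        manager.put(page, Map.of("Set-Cookie", List.of("JSESSIONID=ABC123; Path=/LoginTest")));

        // The stored cookie is now offered for the next request to the same page.
        System.out.println("after: " + cookiesFor(manager, page));
    }
}
```

A client that never stores the first response's Set-Cookie stays in the "before" state on every request, which matches the `Cookie : null` and `Session Id: null` lines in step 3.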

When crawling a site with HttpClient, the Host, Referer, User-Agent, Connection and Cookie headers all need care, as do the crawl rate and the entry URL.

 

4. Code to set the header information on the HttpClient request:

 

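import java.util.HashMap;
import java.util.Map;
import org.apache.http.client.methods.HttpRequestBase;

Map<String, String> headers = new HashMap<String, String>();
headers.put("Referer", p.url);
headers.put("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6 Greatwqs");
headers.put("Accept","text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
headers.put("Accept-Language","zh-cn,zh;q=0.5");
headers.put("Host","www.yourdomain.com");
headers.put("Accept-Charset","ISO-8859-1,utf-8;q=0.7,*;q=0.7");
// A map holds one value per key, so this replaces the Referer set above.
headers.put("Referer", "http://www.yourdomian.com/xxxAction.html");
HttpRequestBase httpget = ......
// HttpRequestBase has no setHeaders(Map); setHeader(name, value) is called per entry.
for (Map.Entry<String, String> header : headers.entrySet()) {
    httpget.setHeader(header.getKey(), header.getValue());
}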
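The same "map of headers applied to a request" idea can be tried with nothing but the JDK. The sketch below swaps Apache HttpClient for `java.net.HttpURLConnection` so it runs without dependencies. The names `HeaderMapDemo` and `prepare` are hypothetical, and the connection is only prepared, never opened. Host is left out here because it is normally derived from the URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderMapDemo {

    /** Creates a connection object (without connecting) and applies every header entry. */
    static HttpURLConnection prepare(String url, Map<String, String> headers) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            // setRequestProperty replaces any value already set for that name.
            conn.setRequestProperty(e.getKey(), e.getValue());
        }
        return conn;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.6) "
                + "Gecko/20100625 Firefox/3.6.6 Greatwqs");
        headers.put("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        headers.put("Accept-Language", "zh-cn,zh;q=0.5");
        headers.put("Referer", "http://www.yourdomian.com/xxxAction.html");

        HttpURLConnection conn = prepare("http://127.0.0.1:8080/LoginTest/index.jsp", headers);
        // Show what has been queued for sending; no request is made here.
        conn.getRequestProperties().forEach((name, values) -> System.out.println(name + ": " + values));
    }
}
```

Keeping the headers in a map makes it easy to reuse one browser-like profile across many requests and to override a single entry, such as Referer, per URL.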

 

5. With the headers set, HttpClient requests http://127.0.0.1:8080/LoginTest/index.jsp and gets back the following HTML:

 

Protocol: HTTP/1.1
Scheme: http
Server Name: www.yourdomain.com
Server Port: 80
Protocol: HTTP/1.1
Server Info: Apache Tomcat/6.0.18
Remote Addr: 127.0.0.1
Remote Host: 127.0.0.1
Character Encoding: null
Content Length: -1
Content Type: null
Auth Type: null
HTTP Method: GET
Path Info: null
Path Trans: null
Query String: null
Remote User: null
Session Id: null
Request URI: /LoginTest/index.jsp
Servlet Path: /index.jsp
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Host: www.yourdomain.com
Referer : http://www.yourdomian.com/xxxAction.html
Accept-Language : zh-cn,zh;q=0.5
Accept-Encoding : null
User-Agent : Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6 Greatwqs
Connection : Keep-Alive
Cookie : null
Created : 1322294148709
LastAccessed : 1322294148709
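In this output, Server Name and Server Port read www.yourdomain.com and 80, although the request still went to 127.0.0.1:8080. That is consistent with the Servlet API, where `getServerName()` and `getServerPort()` are documented as taken from the Host header when one is present. Below is a simplified model of that derivation, not Tomcat's actual code. `HostHeaderDemo` and `serverNameAndPort` are made-up names, and IPv6 literals are not handled.

```java
class HostHeaderDemo {

    /** Splits a Host header into {name, port}, defaulting the port from the scheme. */
    static String[] serverNameAndPort(String hostHeader, String scheme) {
        int colon = hostHeader.lastIndexOf(':');
        if (colon < 0) {
            // No explicit port in the header: fall back to the scheme's default.
            String defaultPort = "https".equals(scheme) ? "443" : "80";
            return new String[] { hostHeader, defaultPort };
        }
        return new String[] { hostHeader.substring(0, colon), hostHeader.substring(colon + 1) };
    }

    public static void main(String[] args) {
        // The first value is the Host seen in steps 2 and 3, the second the one forced in step 4.
        for (String host : new String[] { "127.0.0.1:8080", "www.yourdomain.com" }) {
            String[] parts = serverNameAndPort(host, "http");
            System.out.println("Host: " + host
                    + " -> Server Name: " + parts[0] + ", Server Port: " + parts[1]);
        }
    }
}
```

So a forged Host header changes what the application believes its own name and port are, which matters for any page that builds absolute links or redirects from those values.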

 
