HttpClient cannot fetch content from some websites when no User-Agent is set

log4j:WARN No appenders could be found for logger (org.apache.commons.httpclient.HttpClient).

log4j:WARN Please initialize the log4j system properly.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="zh-CN" dir="ltr">

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

    <style type="text/css">

      .clearfix:after {

        content: ".";

        display: block;

        height: 0;

        clear: both;

        visibility: hidden;

      }

      .clearfix {

        display:block;

      }

      .left {

        float: left;

      }

      h1 {font-size: 20px;color: #6293BB;}

      p  {font-size: 14px;color: #6293BB;}

    </style>

  </head>

  <body>

    <div style="padding:50px 0 0 300px">

      <h1>Your request has been denied</h1>

    </div>

    <div class="clearfix">

      <div class="left" style="padding-left:120px">

        <img src="/images/filenotfound.jpg" width="128" height="128" />

      </div>

      <div class="left" style="width:700px;padding:30px 0 0 30px">

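        <p>You may be using a web crawler to scrape pages from the ITeye website!</p>

        <p>ITeye does not allow malicious crawling of its pages with web crawlers. Please stop this crawling immediately!</p>

        <p>If your crawler is not malicious and you would like ITeye to permit it, please contact the ITeye administrator for authorization: webmaster<img src='/images/email.gif' alt="Email" />iteye.com</p>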

      </div>

    </div>

    <div style="padding:20px 0 0 500px">

    </div>

  </body>

</html>

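import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

HttpClient httpClient = new HttpClient();
GetMethod getMethod = new GetMethod("http://www.iteye.com/");

// Set the User-Agent. Without it, the site rejects the request
// and returns the "request denied" page shown above.
// Two browser User-Agent strings are defined; only the IE one is applied below.
String firefoxUserAgent = "Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.9.1.2) Gecko/20090803 Fedora/3.5.2-2.fc11 Firefox/3.5.2";
String ieUserAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB5; .NET CLR 1.1.4322; .NET CLR 2.0.50727; Alexa Toolbar; MAXTHON 2.0)";

// Register the User-Agent on the client's parameters.
httpClient.getParams().setParameter(HttpMethodParams.USER_AGENT, ieUserAgent);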
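For comparison, the same idea can be sketched without Commons HttpClient, using the `java.net.http` client that ships with JDK 11 and later. This is only an illustration under assumptions: the class name `UserAgentRequest` and the methods `build` and `fetch` are hypothetical, and whether a given User-Agent string is accepted depends on the target site.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class, not part of the original post.
class UserAgentRequest {
    // Browser-like User-Agent string copied from the snippet above.
    static final String BROWSER_UA =
        "Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.9.1.2) Gecko/20090803 Fedora/3.5.2-2.fc11 Firefox/3.5.2";

    // Builds a GET request that carries an explicit User-Agent header.
    static HttpRequest build(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    // Sends the request and returns the response body (requires network access).
    static String fetch(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only builds the request and shows the header it would send.
        HttpRequest request = build("http://www.iteye.com/", BROWSER_UA);
        System.out.println(request.headers().firstValue("User-Agent").orElse("(none)"));
    }
}
```

Calling `UserAgentRequest.fetch(request)` would then send the request with the header attached; `main` deliberately stops short of that so the sketch runs offline.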
