Simulating a Renren login with HttpClient
Reposted from: http://www.iteye.com/topic/638206
Goal:
Use HttpClient 4.0.1 to log in to Renren (人人网) and scrape data from a specific page.
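Summary and notes:
- An HttpClient (DefaultHttpClient) instance represents one session. Within that session HttpClient manages cookies automatically (you can also control them from your own code).
- Within one session, before issuing a new POST or GET you generally need to call abort() on the previous request so that its connection is released; otherwise an exception is thrown.
- Some sites redirect (302, 303) after a successful login, and Renren is one of them. HttpClient does not follow a redirect that answers a POST, so you have to read the Location response header yourself and send another request to that address to get the final data.
- Don't run the scraper too often. Most sites have some anti-abuse mechanism, and Renren locks the account if it is accessed too frequently.
- Use a recording tool to capture the request parameters the browser sends at login. Here I used Badboy and exported the recording as a JMeter file; in JMeter you can then see the list of parameters sent at login together with their values.
- Renren's login flow is fairly simple. The next post will cover a site that is harder to deal with.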
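The redirect note above can be sketched with nothing but the JDK. The snippet below is only an illustration: `RedirectSketch`, `isRedirect`, `nextUrl` and the `/home` path are hypothetical names invented here, not part of the original post or of HttpClient. It assumes the server may send either an absolute or a relative `Location` value and resolves it against the URL that was requested.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not from the original post.
final class RedirectSketch {

    // Covers the codes the post mentions (302, 303) plus 301 and 307.
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302
                || statusCode == 303 || statusCode == 307;
    }

    // Returns the URL to request next, or null when the response is not a
    // redirect or carries no usable Location header.
    static String nextUrl(String requestUrl, int statusCode, String location) {
        if (!isRedirect(statusCode) || location == null || location.trim().isEmpty()) {
            return null;
        }
        // URI.resolve leaves absolute locations untouched and resolves
        // relative ones against the request URL.
        return URI.create(requestUrl).resolve(location.trim()).toString();
    }

    public static void main(String[] args) {
        String loginUrl = "http://www.renren.com/PLogin.do";
        // "/home" is a made-up relative Location used only for this demo.
        System.out.println(nextUrl(loginUrl, 302, "/home"));
        System.out.println(nextUrl(loginUrl, 302,
                "http://blog.renren.com/blog/304317577/449470467"));
        System.out.println(nextUrl(loginUrl, 200, null));
    }
}
```

With HttpClient, the status code would come from `response.getStatusLine().getStatusCode()` and the header value from `response.getFirstHeader("Location")`; checking the status first avoids treating a failed login as a redirect.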
Code:
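import java.util.ArrayList;
import java.util.List;

import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.HTTP;

public class RenRen {
    // The configuration items
    private static String userName = "YourMailinRenren";
    private static String password = "YourPassword";
    private static String redirectURL = "http://blog.renren.com/blog/304317577/449470467";

    // Don't change the following URL
    private static String renRenLoginURL = "http://www.renren.com/PLogin.do";

    // The HttpClient is used in one session
    private HttpResponse response;
    private DefaultHttpClient httpclient = new DefaultHttpClient();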

    // Posts the login form; the response (headers included) is kept in the field above
    private boolean login() {
        HttpPost httpost = new HttpPost(renRenLoginURL);
        // All the parameters post to the web site
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("origURL", redirectURL));
        nvps.add(new BasicNameValuePair("domain", "renren.com"));
        nvps.add(new BasicNameValuePair("isplogin", "true"));
        nvps.add(new BasicNameValuePair("formName", ""));
        nvps.add(new BasicNameValuePair("method", ""));
        nvps.add(new BasicNameValuePair("submit", "登录"));
        nvps.add(new BasicNameValuePair("email", userName));
        nvps.add(new BasicNameValuePair("password", password));
        try {
            httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));
            response = httpclient.execute(httpost);
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        } finally {
            // Release the connection so the next request in this session can run
            httpost.abort();
        }
        return true;
    }

    private String getRedirectLocation() {
        Header locationHeader = response.getFirstHeader("Location");
        if (locationHeader == null) {
            return null;
        }
        return locationHeader.getValue();
    }

    private String getText(String redirectLocation) {
        HttpGet httpget = new HttpGet(redirectLocation);
        // Create a response handler
        ResponseHandler<String> responseHandler = new BasicResponseHandler();
        String responseBody = "";
        try {
            responseBody = httpclient.execute(httpget, responseHandler);
        } catch (Exception e) {
            e.printStackTrace();
            responseBody = null;
        } finally {
            httpget.abort();
            // This is the last request, so the whole session is shut down here
            httpclient.getConnectionManager().shutdown();
        }
        return responseBody;
    }

    public void printText() {
        if (login()) {
            String redirectLocation = getRedirectLocation();
            if (redirectLocation != null) {
                System.out.println(getText(redirectLocation));
            }
        }
    }

    public static void main(String[] args) {
        RenRen renRen = new RenRen();
        renRen.printText();
    }
}
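To compare the request built in login() with the parameters recorded by Badboy/JMeter, it can help to see roughly what the form body looks like on the wire. The following is a minimal JDK-only sketch, not the HttpClient implementation: `FormBodySketch` and `encode` are hypothetical names, `user@example.com` is a placeholder address, and the sketch assumes UTF-8 form encoding as requested by `HTTP.UTF_8` in the code above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; in the real code
// UrlEncodedFormEntity does this job.
final class FormBodySketch {

    // Joins the parameters as name=value pairs separated by '&',
    // percent-encoding both names and values as UTF-8.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("domain", "renren.com");
        params.put("submit", "\u767B\u5F55"); // the two characters of the login button label
        params.put("email", "user@example.com"); // placeholder, not a real account
        System.out.println(encode(params));
    }
}
```

If the recorded request and the encoded body differ in a parameter name or in the encoding of the non-ASCII `submit` value, that is usually the first place to look when the login does not redirect.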