Java HttpClient + Jsoup: Building a Forum Flooding Tool So You Never Fear a Fire Again


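I have wanted to write a small auto-reply program for a long time but never got around to it. With some free time recently, I finally looked into it. I am a beginner with only a partial understanding of HTTP, so I had to ask the almighty Google for help. After I explained my idea, Google said "this kid is all right, he is contributing to fire prevention!" and bestowed the following tools on me: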

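1. HttpClient 4.3.1 (GA)

The main features of HttpClient are listed below; see the HttpClient home page for the full details.

  • Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
  • Supports automatic redirects
  • Supports HTTPS
  • Supports proxy servers, etc.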

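2. Jsoup

The main features of jsoup are:

  • Parses HTML from a URL, a file, or a string
  • Finds and extracts data using DOM traversal or CSS selectors
  • Manipulates HTML elements, attributes, and text
  • Uses a selector syntax almost identical to jQuery's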

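Enough talk, let's get to the point. The HttpClient source package contains an examples folder with basic usage samples, which are enough to get started. Find ClientFormLogin.java; its comments explain the details clearly. In short, it simulates HTTP requests and stores the cookies.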
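
To make the "simulate a request, keep the cookies" idea concrete, here is a minimal sketch that uses only the JDK's `java.net.CookieManager` rather than HttpClient's cookie store. It does no networking: the login URL, the `sid=abc123` cookie, and the class and method names are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class CookieJarSketch {
    // Feeds a (simulated) login response's Set-Cookie header into a cookie jar,
    // then asks the jar which Cookie header values it would attach to a later request.
    static List<String> cookiesForNextRequest(String loginUrl, String setCookie, String nextUrl) {
        try {
            CookieManager jar = new CookieManager();
            Map<String, List<String>> responseHeaders =
                    Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookie));
            jar.put(URI.create(loginUrl), responseHeaders);

            Map<String, List<String>> requestHeaders =
                    jar.get(URI.create(nextUrl), Collections.<String, List<String>>emptyMap());
            List<String> cookies = requestHeaders.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URLs and cookie value, for illustration only.
        System.out.println(cookiesForNextRequest(
                "http://example.com/login", "sid=abc123; Path=/", "http://example.com/forum"));
    }
}
```

HttpClient's `BasicCookieStore` plays the same role as the jar here: once a client is built with it, cookies set by one response are sent automatically with later requests to the same site.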

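Test site: http://bbs.dakele.com/

This site handles login in a special way, so it may differ somewhat from a standard DZ (Discuz!) forum; adjust as needed.

I analyzed the site with Chrome's built-in Inspect Element tool, which took quite a bit of time.

Login page: http://passport.dakele.com/login.do?product=bbs

If you enter a wrong username and password, you will see that the actual login endpoint is http://passport.dakele.com/logon.do. Note the difference between "login" and "logon" (i vs. o); I missed it at first and thought I was seeing ghosts.

The error response looks like this (the message means "incorrect account or password"):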

      {"err_msg":"帐号或密码错误"}
    

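Entering correct credentials returns a response that includes a redirect link.

Opening that redirect link directly has the same effect as a normal login.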
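
As a rough illustration of telling the two responses apart, the sketch below pulls a string field out of a flat JSON object with a regular expression. It is deliberately naive (no escapes, no nesting) and is not what the post's code does; that code uses json-lib. The class and method names are made up, and the error text is replaced by an ASCII stand-in.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LoginResponseSketch {
    // Naive lookup of a string-valued field in a flat JSON object.
    // Not a real JSON parser: it ignores escape sequences and nesting.
    static String field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // A response that carries err_msg is treated as a failed login.
    static boolean isLoginError(String json) {
        return field(json, "err_msg") != null;
    }

    public static void main(String[] args) {
        // ASCII stand-in for the site's real error message.
        String failure = "{\"err_msg\":\"wrong account or password\"}";
        System.out.println(isLoginError(failure) + ": " + field(failure, "err_msg"));
    }
}
```

For anything beyond a quick check, a proper JSON library is the safer choice.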

Getting the redirect link:

      
    private LoginResult getRedirectUrl() {
        LoginResult loginResult = null;
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpPost httpost = new HttpPost(LOGINURL);
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", "passport.dakele.com");
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");

        // Form fields sent by the login page
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("product", "bbs"));
        nvps.add(new BasicNameValuePair("surl", "http://bbs.dakele.com/"));
        nvps.add(new BasicNameValuePair("username", "yourname")); // username
        nvps.add(new BasicNameValuePair("password", "yourpass")); // password
        nvps.add(new BasicNameValuePair("remember", "0"));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        CloseableHttpResponse response2 = null;
        try {
            response2 = httpClient.execute(httpost);
            if (response2.getStatusLine().getStatusCode() == 200) {
                HttpEntity entity = response2.getEntity();
                String entityString = EntityUtils.toString(entity);
                // Wrap the single JSON object in [] so json-lib can map it to a LoginResult[]
                JSONArray jsonArray = JSONArray.fromObject("[" + entityString + "]");
                JsonConfig jsonConfig = new JsonConfig();
                jsonConfig.setArrayMode(JsonConfig.MODE_OBJECT_ARRAY);
                jsonConfig.setRootClass(LoginResult.class);
                LoginResult[] results = (LoginResult[]) JSONSerializer.toJava(jsonArray, jsonConfig);
                if (results.length == 1) {
                    loginResult = results[0];
                }
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                // execute() may have thrown before response2 was assigned
                if (response2 != null) {
                    response2.close();
                }
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return loginResult;
    }
      
    

Login code:

      
    public boolean login() {
        boolean flag = false;
        LoginResult loginResult = getRedirectUrl();
        // getRedirectUrl() returns null when the login request failed
        if (loginResult != null && "true".equals(loginResult.getResult())) {
            cookieStore = new BasicCookieStore();
            globalClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
            HttpGet httpGet = new HttpGet(loginResult.getRedirect());
            httpGet.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            httpGet.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
            httpGet.setHeader("Connection", "keep-alive");
            httpGet.setHeader("Host", HOST);
            httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
            try {
                // Visiting the redirect link fills cookieStore; close the response to release the connection
                globalClient.execute(httpGet).close();
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
            List<Cookie> cookies2 = cookieStore.getCookies();
            if (cookies2.isEmpty()) {
                log.error("cookie is empty");
            } else {
                for (int i = 0; i < cookies2.size(); i++) {
                    log.debug(cookies2.get(i).toString());
                }
                // The original never set the flag; having session cookies is treated as success
                flag = true;
            }
        }
        return flag;
    }
      
    

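At this point the login has succeeded and we can do the things only a logged-in account can do. What, you don't know what that is? Putting out fires, of course.

First, get the URLs of the threads to reply to. The list pages follow a regular pattern, so I did not write automatic discovery and simply wrote a loop (@1):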

      
    for (int i = 1; i < 200; i++) {
        String basurl = "http://bbs.dakele.com/forum-43-" + i + ".html";
        log.info(basurl);
        List<String> urls = dakele.getThreadURLs(basurl);
        for (String url : urls) {
            // log.info(url);
            ReplayContent content = dakele.preReplay(url);
            if (content != null) {
                log.info(content.getUrl());
                log.info(content.getMessage());
                // dakele.replay(content);
                // Thread.sleep(15300);
            }
        }
    }
      
    

Extracting the thread URLs from a list page:

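    String html = EntityUtils.toString(entity);
    // HOST is passed as the base URI so that "abs:href" can resolve relative links
    Document document = Jsoup.parse(html, HOST);
    // Title links of the normal (non-sticky) threads in the list table
    Elements elements = document.select("tbody[id^=normalthread_] > tr > td.new > a.xst");
    for (int i = 0; i < elements.size(); i++) {
        Element e = elements.get(i);
        urList.add(e.attr("abs:href"));
    }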
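
The `abs:href` attribute key asks jsoup to resolve a link against the document's base URI. The sketch below shows the same kind of resolution with the JDK's `java.net.URI` alone, as an illustration of the idea rather than jsoup's internals; the relative link `thread-100-1-1.html` and the class name are hypothetical.

```java
import java.net.URI;

class AbsHrefSketch {
    // Resolves an href found in a page against that page's base URI.
    static String absolute(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical relative link found on a list page.
        System.out.println(absolute("http://bbs.dakele.com/forum-43-1.html", "thread-100-1-1.html"));
    }
}
```

Note that resolution only works when the base includes a scheme such as `http://`; a bare host name is not enough.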
      
    

Inside the thread to reply to, get the action URL of the form to submit and build the reply content:

      
    public ReplayContent preReplay(String url) {
        ReplayContent content = null;
        HttpGet get = new HttpGet(url);
        get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        get.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        get.setHeader("Connection", "keep-alive");
        get.setHeader("Host", HOST);
        get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        CloseableHttpResponse response = null;
        try {
            response = globalClient.execute(get);
            HttpEntity entity = response.getEntity();
            String html = EntityUtils.toString(entity);
            Document document = Jsoup.parse(html, HOST);
            Element postForm = document.getElementById("fastpostform");
            // "您现在无权发帖" is the forum's "you are not allowed to post" notice
            if (postForm != null && !postForm.toString().contains("您现在无权发帖")) {
                content = new ReplayContent();
                content.setUrl(url);

                log.debug(postForm.attr("abs:action"));
                content.setAction(postForm.attr("abs:action"));

                // Concatenate the text of every post in the thread, with HTML tags stripped
                Elements teElements = document.select("td[id^=postmessage_]");
                String message = "";
                for (int i = 0; i < teElements.size(); i++) {
                    String temp = teElements.get(i).html().replaceAll("(?is)<.*?>", "");
                    // "发表于" ("posted at") marks a quoted-post header; keep only the last token
                    if (temp.contains("发表于")) {
                        String[] me = temp.split("\\s+");
                        temp = me[me.length - 1];
                    }
                    message += temp.replaceAll("\\s+", "");
                }
                log.debug(message);

                /* Earlier approach: take only the last reply
                Element messageElement = document.select("td[id^=postmessage_]").last();
                // String message = messageElement.html().replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll("<[^>]*>", "").replaceAll("[(/>)<]", "");
                String message = messageElement.html().replaceAll("(?is)<.*?>", "");
                */
                if (message.contains("发表于")) {
                    String[] me = message.split("\\s+");
                    message = me[me.length - 1];
                }
                // Drop &nbsp; and the words "上传" (upload), "附件" (attachment), "下载" (download)
                content.setMessage(message.replaceAll("&nbsp;", "").replaceAll("上传", "").replaceAll("附件", "").replaceAll("下载", ""));

                // Copy the hidden form fields that must be posted back
                Elements inputs = postForm.getElementsByTag("input");
                for (Element input : inputs) {
                    log.debug(input.attr("name") + ":" + input.attr("value"));
                    if (input.attr("name").equals("posttime")) {
                        content.setPosttime(input.attr("value"));
                    } else if (input.attr("name").equals("formhash")) {
                        content.setFormhash(input.attr("value"));
                    } else if (input.attr("name").equals("usesig")) {
                        content.setUsesig(input.attr("value"));
                    } else if (input.attr("name").equals("subject")) {
                        content.setSubject(input.attr("value"));
                    }
                }
            } else {
                // Reply form missing, or no permission to post
                log.warn("您现在无权发帖:" + url);
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Release the connection even if parsing failed
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return content;
    }
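
The text clean-up inside `preReplay` is easy to try in isolation. The sketch below reuses the same regular expressions on a made-up HTML snippet; the class name, method names, and sample strings are illustrative and not part of the original project.

```java
class PostTextSketch {
    // Strips tags with the same non-greedy pattern the post uses, then removes
    // &nbsp; entities and all whitespace.
    static String toPlainText(String html) {
        return html.replaceAll("(?is)<.*?>", "")
                   .replaceAll("&nbsp;", "")
                   .replaceAll("\\s+", "");
    }

    // Returns the last whitespace-separated token, which is how the post
    // discards a quoted-post header and keeps only the trailing text.
    static String lastToken(String text) {
        String[] parts = text.trim().split("\\s+");
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("<div>Hello <b>world</b>&nbsp;</div>"));
        System.out.println(lastToken("someone posted at 2014-01-09 thanks"));
    }
}
```

Regex-based stripping is fine for a quick experiment, but jsoup's own `Element.text()` is usually a more robust way to get plain text, since it also decodes entities.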
      
    

We have the URL and we have the content, so now it is time to open the floodgates:

      
    public void replay(ReplayContent content) {
        HttpPost httpost = new HttpPost(content.getAction());
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", HOST);
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");

        // Post back the hidden fields collected in preReplay() plus the message
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("posttime", content.getPosttime()));
        nvps.add(new BasicNameValuePair("formhash", content.getFormhash()));
        nvps.add(new BasicNameValuePair("usesig", content.getUsesig()));
        nvps.add(new BasicNameValuePair("subject", content.getSubject()));
        nvps.add(new BasicNameValuePair("message", content.getMessage()));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        // The response must be consumed; otherwise the connection is never released
        // back to the pool and later requests block. I missed this at first and got stuck here.
        CloseableHttpResponse response2 = null;
        try {
            response2 = globalClient.execute(httpost);
            // log.info(content.getAction());
            // log.info(content.getMessage());
            HttpEntity entity = response2.getEntity();
            EntityUtils.consume(entity);
            // Debugging aid: dump the response to a file instead of consuming it
            // BufferedWriter bw = new BufferedWriter(new FileWriter("d:/tt1.html"));
            // bw.write(EntityUtils.toString(response2.getEntity()));
            // bw.flush();
            // bw.close();
            // System.out.println(EntityUtils.toString(response2.getEntity()));
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (response2 != null) {
                try {
                    response2.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
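
For readers wondering what a URL-encoded form body looks like on the wire, here is a small JDK-only sketch that builds one with `java.net.URLEncoder`. It approximates what `UrlEncodedFormEntity` produces and is not HttpClient's implementation; the class name and the field values are invented for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBodySketch {
    // Joins name=value pairs with '&', percent-encoding both sides as UTF-8.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is guaranteed to exist on every Java platform
            throw new IllegalStateException(ex);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("formhash", "abc123");      // hypothetical value
        fields.put("message", "hello world");  // hypothetical value
        System.out.println(encode(fields));
    }
}
```

A `LinkedHashMap` keeps the fields in insertion order, which makes the output easy to compare with what the browser's network panel shows.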
      
    

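Of course, this only works on forums without a CAPTCHA; for forums that use one, you will have to find another way.

Flooding is harmful: after a round of bombardment, this was the result (screenshot: QQ截图20140109224028).

For the reply content, at first I simply took the last comment in the current thread and posted it as the reply, and got a warning! I then switched to extracting keywords with the IK tokenizer (IK分词); that code was copied from elsewhere.

Reference links:

Shortcomings: no multithreading, and not thoroughly tested.

The code is being tidied up and will be provided soon.

Future plans: add check-in and task features, and replace the @1 loop with automatic discovery.

This is my first post; please point out anything that falls short.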

------------------------------------------

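Download: http://pan.baidu.com/s/1jGjwA5g

I tidied up the code this morning and am sharing it now. It is a packaged MyEclipse project; after extracting it you can import it directly.

Change the username and password in IKFenci.java and it can be run directly.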
