hzywy

Dissecting the Baidu search box (online lookup)


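   Everyone knows how to use Baidu, and you have probably noticed that as soon as you type anything into the search box, related suggestions appear automatically. Some people call this autocomplete; for this demo, "online lookup" is the closer description.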

   Method that queries Baidu's online lookup and parses the response:

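import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;

private List<String> onlineSearch(String content) throws IOException {
        // Fall back to a default keyword when none is supplied.
        if (content == null)
            content = "java";
//        Alternative endpoint, left commented out:
//        String path = "http://nssug.baidu.com/su?wd="
//                + URLEncoder.encode(content, "UTF-8")
//                + "&prod=mp3&oe=utf-8&callback=undefined";
        String path = "http://suggestion.baidu.com/su?wd="
                + URLEncoder.encode(content, "UTF-8")
                + "&p=3&cb=window.bdsug.sug";
        System.out.println("path = " + path);
        URL url = new URL(path);
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        // The Referer must be forged here; otherwise the request is sent back
        // to the home page. Analysis shows this has nothing to do with cookies.
        con.setRequestProperty("User-Agent",
                "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon;)");
        con.setRequestProperty("Accept-Encoding", "deflate");
        con.setRequestProperty("referer", "http://nssug.baidu.com");
        con.setDoInput(true);
        con.connect();
        String body;
        try {
            if (con.getResponseCode() != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status: "
                        + con.getResponseCode());
            }
            // Read the whole response into memory instead of a temporary file.
            try (InputStream is = con.getInputStream();
                 ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
                byte[] b = new byte[1024 * 5];
                int length;
                while ((length = is.read(b)) != -1) {
                    buffer.write(b, 0, length);
                }
                // The response is decoded as GBK.
                body = buffer.toString("GBK");
            }
        } finally {
            con.disconnect();
        }

        // The response is a callback call: window.bdsug.sug(<payload>);
        // Take everything between the first "(" and the last ")" so that
        // parentheses inside a suggestion do not cut the payload short.
        int start = body.indexOf('(');
        int end = body.lastIndexOf(')');
        if (start < 0 || end <= start) {
            throw new IOException("Unexpected response: " + body);
        }
        String result = body.substring(start + 1, end);

        // Parse into Gson's own tree types rather than casting to HashMap,
        // because the concrete Map class Gson returns is not guaranteed.
        JsonObject object = new Gson().fromJson(result, JsonObject.class);
        JsonArray suggestions = object.getAsJsonArray("s");
        List<String> list = new ArrayList<String>();
        if (suggestions != null) {
            for (JsonElement element : suggestions) {
                list.add(element.getAsString());
            }
        }
        return list;
    }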
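
The callback-unwrapping step can be tried on its own, without a network call. The following is only a minimal sketch: the class name `JsonpSuggestions`, the method `unwrap`, and the sample response string are hypothetical and made up for illustration; a real response may look different.

```java
class JsonpSuggestions {

    /**
     * Returns the text between the first '(' and the last ')' of a
     * callback-style response such as callbackName(payload);
     */
    static String unwrap(String jsonp) {
        int start = jsonp.indexOf('(');
        int end = jsonp.lastIndexOf(')');
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("Not a callback response: " + jsonp);
        }
        return jsonp.substring(start + 1, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample; one suggestion deliberately contains parentheses.
        String sample =
                "window.bdsug.sug({q:\"java\",p:false,s:[\"java (jdk)\",\"java io\"]});";
        System.out.println(unwrap(sample));
    }
}
```

Using the first opening and the last closing parenthesis keeps the payload intact even when a suggestion itself contains parentheses. The unwrapped text can then be handed to a JSON parser such as Gson.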
Test class:

public static void main(String[] args) throws IOException {

        OnlineTest test = new OnlineTest();

        // Passing null makes onlineSearch fall back to the default keyword "java".
        List<String> list = test.onlineSearch(null);
        for (String suggestion : list) {
            System.out.println(suggestion);
        }
    }
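
The test above needs a live network connection, but the request-building step can be checked offline. This is a minimal sketch: the class name `SuggestionUrl` and the method `build` are hypothetical, while the endpoint and query parameters are copied from the method above. The keyword is written with Unicode escapes so the example does not depend on the source file encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SuggestionUrl {

    /** Builds the suggestion request URL for a keyword, percent-encoding it as UTF-8. */
    static String build(String keyword) throws UnsupportedEncodingException {
        return "http://suggestion.baidu.com/su?wd="
                + URLEncoder.encode(keyword, "UTF-8")
                + "&p=3&cb=window.bdsug.sug";
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "\u767e\u5ea6" is the two-character word "Baidu" in Chinese.
        System.out.println(build("\u767e\u5ea6"));
        // URLEncoder uses form encoding, so a space becomes "+".
        System.out.println(build("java io"));
    }
}
```

Encoding the keyword matters mainly for non-ASCII input such as Chinese, which would otherwise produce an invalid URL.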
