Fetching web page content with HttpClient

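1. To download the content at a remote URL, you can use HttpClient. The relevant code is collected below.
    It also solves the problem of garbled Chinese text.
    Method 1: convert the encoding while reading the stream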
    // Needs java.io.BufferedReader, IOException, InputStream, InputStreamReader,
    // and UnsupportedEncodingException.
    public String convertStreamToString(InputStream is) throws UnsupportedEncodingException {
        // Decode the bytes as GBK so that Chinese text is not garbled.
        BufferedReader reader = new BufferedReader(new InputStreamReader(is, "gbk"));
        StringBuilder sb = new StringBuilder();
        String line = null;
        try {
            while ((line = reader.readLine()) != null) {
                sb.append(line).append("\n");
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                reader.close(); // also closes the underlying stream
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return sb.toString();
    }
    // Download the content (needs HttpClient, HttpException, and methods.GetMethod
    // from org.apache.commons.httpclient)
    private String urlContent(String urlString) throws HttpException, IOException {
        HttpClient client = new HttpClient();
        // e.g. "http://www.tianya.cn/publicforum/articleslist/0/no20.shtml"
        GetMethod get = new GetMethod(urlString);
        try {
            client.executeMethod(get);
            System.out.print(get.getResponseCharSet()); // charset reported for the response
            InputStream iStream = get.getResponseBodyAsStream();
            return convertStreamToString(iStream);
        } finally {
            get.releaseConnection(); // release the connection even if the request fails
        }
    }
    The GET method is enough to download the content of a web page.
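
The hardcoded "gbk" works only for pages that really are GBK-encoded. As a rough sketch, assuming nothing beyond the JDK, the code below shows one possible way to take the charset from a Content-Type value and decode the stream with it, falling back to a default when no usable charset is declared. The class name CharsetDecodeSketch and the helpers charsetFrom and readAll are hypothetical names made up for this illustration. They are not part of HttpClient or of the code above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.Locale;

public class CharsetDecodeSketch {

    // Hypothetical helper: extract the charset name from a Content-Type value
    // such as "text/html; charset=GBK". Returns the fallback when the header
    // is missing, declares no charset, or names one this JVM does not support.
    static String charsetFrom(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    if (!name.isEmpty() && Charset.isSupported(name)) {
                        return name;
                    }
                } catch (IllegalArgumentException e) {
                    // Illegal charset name: ignore it and keep looking.
                }
            }
        }
        return fallback;
    }

    // Read the whole stream line by line, decoding bytes with the given charset.
    // Line terminators are normalized to "\n", as in the original method.
    static String readAll(InputStream is, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(is, charsetName))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated GBK response body; "\u4e2d\u6587" is the two-character word for "Chinese".
        byte[] body = "\u4e2d\u6587".getBytes("GBK");
        String charset = charsetFrom("text/html; charset=GBK", "ISO-8859-1");
        System.out.println(readAll(new ByteArrayInputStream(body), charset));
    }
}
```

In the download method, the Content-Type value would come from the response headers, and the fallback could stay "gbk" for sites known to use it. Pages that declare their encoding only in an HTML meta tag are not covered by this sketch.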