Here I simply experimented with a raw HTTP stream:
- // Needs: java.io.ByteArrayOutputStream, java.io.IOException, java.io.InputStream,
- //        java.net.HttpURLConnection, java.net.URL
- public String getHttpContent(String htmlUrl) throws IOException,
-         InterruptedException {
-     InputStream is = null;
-     HttpURLConnection urlConn = null;
-     ByteArrayOutputStream baos = new ByteArrayOutputStream();
-     try {
-         URL url = new URL(htmlUrl);
-         urlConn = (HttpURLConnection) url.openConnection();
-         urlConn.setConnectTimeout(20000); // milliseconds
-         urlConn.setReadTimeout(20000);    // milliseconds
-         is = urlConn.getInputStream();
-         // Collect the raw bytes first; decode only once the whole body is buffered.
-         byte[] buf = new byte[512];
-         int n;
-         while ((n = is.read(buf)) != -1) {
-             baos.write(buf, 0, n);
-         }
-     } finally {
-         // Release the stream and the connection even if reading fails.
-         if (is != null) {
-             is.close();
-         }
-         if (urlConn != null) {
-             urlConn.disconnect();
-         }
-     }
-     // Decode explicitly as GB2312.
-     return new String(baos.toByteArray(), "GB2312");
- }
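The method above is actually quite simple. At first my colleague read the response with a BufferedReader, but reading Strings out directly that way was a problem: the text was decoded incorrectly. We then read the response into a ByteArrayOutputStream, took the bytes, and decoded them ourselves, and that worked. Last night, though, I found a much simpler way; we really had taken a long detour. Here it is: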
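- // Needs Apache HttpClient 4.x: org.apache.http.HttpResponse, org.apache.http.HttpStatus,
- //        org.apache.http.client.HttpClient, org.apache.http.client.methods.HttpGet,
- //        org.apache.http.impl.client.DefaultHttpClient, org.apache.http.util.EntityUtils
- public String getHttpContent(String htmlUrl) throws Exception {
-     HttpClient hc = new DefaultHttpClient();
-     try {
-         HttpGet get = new HttpGet(htmlUrl);
-         HttpResponse rp = hc.execute(get);
-         if (rp.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
-             // EntityUtils decodes with the charset declared in the Content-Type header.
-             return EntityUtils.toString(rp.getEntity()).trim();
-         } else {
-             return null;
-         }
-     } finally {
-         // Release the connections held by this client.
-         hc.getConnectionManager().shutdown();
-     }
- }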
These Apache classes really are convenient to use; I should study them more.
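The decoding problem mentioned earlier can be reproduced without any network access. The post does not say exactly how the BufferedReader was set up, so the sketch below assumes the reader was decoding with a charset other than the page's GB2312 (ISO-8859-1 is used here purely for illustration). `CharsetDemo` and `readAll` are hypothetical names, and the example assumes the JDK in use ships the GB2312 charset.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class CharsetDemo {
    // Hypothetical helper: read the whole stream through a reader
    // that is bound to an explicitly named charset.
    public static String readAll(InputStream in, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, Charset.forName(charsetName)));
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[512];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two Chinese characters, written as escapes so the source file
        // encoding cannot interfere with the demonstration.
        String original = "\u4e2d\u6587";
        // Stand-in for the bytes a GB2312-encoded page would send.
        byte[] raw = original.getBytes(Charset.forName("GB2312"));

        String right = readAll(new ByteArrayInputStream(raw), "GB2312");
        String wrong = readAll(new ByteArrayInputStream(raw), "ISO-8859-1");

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false
    }
}
```

If this assumption matches what happened, a BufferedReader would also have worked once its InputStreamReader was given the page's charset explicitly; buffering the bytes and decoding them afterwards achieves the same thing by a different route.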