Page analysis: using Java to fetch nationwide bidding data from the National Public Resources Trading Platform (全國公共資源交易平臺)

Task: fetch construction bidding records for the whole country, then open each detail page, scrape its HTML, and save it locally.

  1. Open the site and analyze the page.

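2. Get the province/city/district cascade. The browser's developer console showed no request to the backend for the cascade data, so I suspected it was hard-coded in the JS.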

 

In the page source we can see the markup of the province-level menu.

                        <select id="provinceId">
                            <option value="0">不限</option>
                            <option value="110000">北京</option>
                            <option value="120000">天津</option>
                            <option value="130000">河北</option>
                            <option value="140000">山西</option>
                            <option value="150000">內蒙古</option>
                            <option value="210000">遼寧</option>
                            <option value="220000">吉林</option>
                            <option value="230000">黑龍江</option>
                            <option value="310000">上海</option>
                            <option value="320000">江蘇</option>
                            <option value="330000">浙江</option>
                            <option value="340000">安徽</option>
                            <option value="350000">福建</option>
                            <option value="360000">江西</option>
                            <option value="370000">山東</option>
                            <option value="410000">河南</option>
                            <option value="420000">湖北</option>
                            <option value="430000">湖南</option>
                            <option value="440000">廣東</option>
                            <option value="450000">廣西</option>
                            <option value="460000">海南</option>
                            <option value="500000">重慶</option>
                            <option value="510000">四川</option>
                            <option value="520000">貴州</option>
                            <option value="530000">雲南</option>
                            <option value="540000">西藏</option>
                            <option value="610000">陝西</option>
                            <option value="620000">甘肅</option>
                            <option value="630000">青海</option>
                            <option value="640000">寧夏</option>
                            <option value="650000">新疆</option>
                            <option value="660000">兵團</option>
                        </select>

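Searching the JS for provinceId returns too much code and little of use. Searching for the Beijing code 110000 (北京) instead turns up the complete city list, hard-coded in the JS, so we copy it out.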
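Once the city data has been copied out, the codes for a single province can be looked up before any request is made. The following is a minimal sketch, assuming the copied data has the shape {"370000":[{"id":"...","name":"..."}, ...]} that appears in the final code. The class name CityCodes and the method citiesOf are hypothetical, and a regular expression stands in for a real JSON library (the final code uses fastjson instead).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CityCodes {
    // Matches one {"id":"...","name":"..."} entry of the copied data
    private static final Pattern ENTRY =
            Pattern.compile("\\{\"id\":\"(\\d+)\",\"name\":\"([^\"]+)\"\\}");

    /** Returns id -> name for one province in source order; empty if the province is absent. */
    static Map<String, String> citiesOf(String json, String provinceId) {
        Map<String, String> cities = new LinkedHashMap<>();
        String key = "\"" + provinceId + "\":[";
        int start = json.indexOf(key);
        if (start < 0) {
            return cities;
        }
        int end = json.indexOf(']', start);
        if (end < 0) {
            return cities;
        }
        // Only scan the array that belongs to this province
        Matcher m = ENTRY.matcher(json.substring(start + key.length(), end));
        while (m.find()) {
            cities.put(m.group(1), m.group(2));
        }
        return cities;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the hard-coded data
        String sample = "{\"370000\":[{\"id\":\"370000\",\"name\":\"省本級\"},"
                + "{\"id\":\"370100\",\"name\":\"濟南市\"}],"
                + "\"110000\":[{\"id\":\"110101\",\"name\":\"東城區\"}]}";
        System.out.println(citiesOf(sample, "370000"));
    }
}
```

The regex approach only works because the copied data is flat and regular; for anything less predictable a JSON parser is the safer choice.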

 

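 3. Taking Shandong province as the example, fetch its construction bidding records.

  Analysis of the page shows that the list is loaded from this backend URL: http://deal.ggzy.gov.cn/ds/deal/dealList_find.jsp.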

 

static String sendPost(String url,String area,String page) throws UnsupportedEncodingException {
        try {
            // Sleep between calls so we are not blocked for requesting too fast
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        httpPost.addHeader("Accept","application/json");
        httpPost.addHeader("Content-Type","application/x-www-form-urlencoded; charset=UTF-8");
        httpPost.addHeader("User-Agent","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        Date currentDate = new Date();
        Calendar c = Calendar.getInstance();
        c.setTime(currentDate);
        c.add(Calendar.DATE, - 9);
        Date d = c.getTime();
        String day = format.format(d);

        // Request parameters (only some of them have been analyzed)
        nvps.add(new BasicNameValuePair("TIMEBEGIN_SHOW",day));
        nvps.add(new BasicNameValuePair("TIMEEND_SHOW",new SimpleDateFormat("yyyy-MM-dd").format(new Date())));

        // Analysis of the front-end request shows TIMEBEGIN and TIMEEND span 10 days (here: today minus 9 days through today)
        nvps.add(new BasicNameValuePair("TIMEBEGIN",day));
        nvps.add(new BasicNameValuePair("TIMEEND",new SimpleDateFormat("yyyy-MM-dd").format(currentDate)));

        nvps.add(new BasicNameValuePair("SOURCE_TYPE","1"));
        nvps.add(new BasicNameValuePair("DEAL_TIME","01"));
        nvps.add(new BasicNameValuePair("DEAL_CLASSIFY","01"));
        nvps.add(new BasicNameValuePair("DEAL_STAGE","0101"));

        // Province code for Shandong
        nvps.add(new BasicNameValuePair("DEAL_PROVINCE","370000"));

        // City code
        nvps.add(new BasicNameValuePair("DEAL_CITY",area));

        nvps.add(new BasicNameValuePair("DEAL_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("BID_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("DEAL_TRADE","0"));
        nvps.add(new BasicNameValuePair("isShowAll","1"));
        nvps.add(new BasicNameValuePair("PAGENUMBER",page));
        nvps.add(new BasicNameValuePair("FINDTXT",""));
        httpPost.setEntity(new UrlEncodedFormEntity(nvps, "utf-8"));
        String res = "";
        HttpResponse response = null;
        try {
            response = httpClient.execute(httpPost);
            res = EntityUtils.toString(response.getEntity(), "utf-8");
        } catch (Exception e) {
            e.printStackTrace();
        }
       return res;
    }
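The date window and the form body built in sendPost can also be written with java.time and plain string encoding. This is only a sketch: the class name DealQuery and its two helpers are hypothetical, the parameter names are the ones from the request above, and URLEncoder.encode(String, Charset) assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class DealQuery {
    /** Returns {begin, end} as yyyy-MM-dd; begin is 9 days before end, as in sendPost. */
    static String[] dateWindow(LocalDate today) {
        return new String[] { today.minusDays(9).toString(), today.toString() };
    }

    /** Encodes the parameters as an application/x-www-form-urlencoded body. */
    static String formBody(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String[] window = dateWindow(LocalDate.of(2020, 6, 29));
        Map<String, String> params = new LinkedHashMap<>();
        params.put("TIMEBEGIN", window[0]);
        params.put("TIMEEND", window[1]);
        params.put("DEAL_PROVINCE", "370000"); // Shandong
        params.put("DEAL_CITY", "370100");     // Jinan
        params.put("PAGENUMBER", "1");
        // prints TIMEBEGIN=2020-06-20&TIMEEND=2020-06-29&DEAL_PROVINCE=370000&DEAL_CITY=370100&PAGENUMBER=1
        System.out.println(formBody(params));
    }
}
```

Passing the date in as a parameter keeps the window calculation testable, since it no longer depends on the current clock.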

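4. Process the response and open each page to trigger the click event.

   The list endpoint returns JSON; the url field of each entry is the detail page to open.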

        {
            "classify":"01",
            "title":"王馬社區南片老舊小區綜合改造提升工程設計-採購-施工(EPC)總承包",
            "timeShow":"2020-06-29",
            "stageName":"信息類型",
            "platformName":"杭州市電子招投標平臺",
            "classifyShow":"工程建設",
            "tradeShow":"",
            "districtShow":"浙江",
            "url":"http://www.ggzy.gov.cn/information/html/a/330000/0102/202005/28/0033b5bf6b41dcd34309ae1ed58280fa6244.shtml",
            "stageShow":"開標記錄",
            "titleShow":"王馬社區南片老舊小區綜合改造提升工程設計-採購-施工(EPC)總承包"
        }
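To make the shape of one entry concrete, the sketch below pulls single string fields out of such an entry. It is an illustration only: ListEntry and field are hypothetical names, the regular expression assumes flat string values without escaped quotes, and the final code maps the response with fastjson instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListEntry {
    /** Returns the string value of a field, or null when the field is missing. */
    static String field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened copy of the entry shown above
        String entry = "{\"classify\":\"01\",\"timeShow\":\"2020-06-29\","
                + "\"url\":\"http://www.ggzy.gov.cn/information/html/a/330000/0102/202005/28/"
                + "0033b5bf6b41dcd34309ae1ed58280fa6244.shtml\"}";
        System.out.println(field(entry, "url"));
        System.out.println(field(entry, "timeShow"));
    }
}
```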

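 After opening the URL we find that the page lands on the bid-opening record (開標記錄) tab, but what we want is the tender announcement, so a click event has to be triggered first.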

private void addTargetUrl(Spider spider) {


        System.setProperty("webdriver.chrome.driver",
                "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe");


        // A browser window opens by default; run Chrome headless to keep it hidden
        ChromeOptions chromeOptions=new ChromeOptions();
        chromeOptions.addArguments("--headless");
        WebDriver driver = new ChromeDriver(chromeOptions);

        driver.manage().window().maximize();

        // tenderInfos holds every detail URL collected from the list API
        for(TenderInfo tenderInfo: tenderInfos){
            try {
                String url = tenderInfo.getUrl();
                driver.get(tenderInfo.getUrl());
                // Locate the tender / prequalification announcement tab
                WebElement element = driver.findElement(By.xpath("//li[@id='t_0101']"));
                // Click it
                element.click();
                // Read the URL of the tender / prequalification announcement
                WebElement element1 = driver.findElement(By.xpath("//div[@id='show0101']"));
                String substring = url.substring( url.lastIndexOf("/")+1,url.length()-6);
                String targetUrl = element1.findElement(By.xpath("//iframe[@id='iframe0101']")).getAttribute("src");

                // Append the list entry's id to the announcement URL so the list record and the page can be matched up later
                targetUrl+="?url="+substring;
                spider.addUrl(targetUrl);
                // Sleep 3 seconds to avoid being blocked
                Thread.sleep(3000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        // Close the helper browser once every URL has been queued
        driver.quit();

    }
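The id handling that ties a list entry to its announcement page is easy to lose among the Selenium calls, so here it is in isolation. This is a sketch under assumptions: RecordId and its methods are hypothetical names, the detail URL is assumed to end in .shtml, the iframe address is assumed to carry no query string of its own, and the example.com address is a placeholder.

```java
class RecordId {
    /** File name of the detail URL without its ".shtml" suffix. */
    static String fromDetailUrl(String detailUrl) {
        String file = detailUrl.substring(detailUrl.lastIndexOf('/') + 1);
        String suffix = ".shtml";
        return file.endsWith(suffix) ? file.substring(0, file.length() - suffix.length()) : file;
    }

    /** Appends the id as ?url=<id>, the way addTargetUrl tags the iframe address. */
    static String tag(String iframeSrc, String id) {
        return iframeSrc + "?url=" + id;
    }

    /** Recovers the id from a tagged URL: everything after the last '='. */
    static String fromTaggedUrl(String taggedUrl) {
        return taggedUrl.substring(taggedUrl.lastIndexOf('=') + 1);
    }

    public static void main(String[] args) {
        String detail = "http://www.ggzy.gov.cn/information/html/a/330000/0102/202005/28/"
                + "0033b5bf6b41dcd34309ae1ed58280fa6244.shtml";
        String id = fromDetailUrl(detail);
        // Placeholder iframe address, not a real one from the site
        String tagged = tag("http://example.com/frame.html", id);
        System.out.println(id);
        System.out.println(fromTaggedUrl(tagged).equals(id));
    }
}
```

Checking for the suffix explicitly, rather than cutting a fixed six characters, keeps the id intact if a URL ever arrives with a different extension.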

 5. With the final URL in hand, open it with webmagic, scrape the content, and save it locally.

 

String pageUrl = page.getUrl().toString();
        Html pageHtml = page.getHtml();
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        String substring = pageUrl.substring( pageUrl.lastIndexOf("=")+1,pageUrl.length());
        String xpath = pageHtml.xpath("//div[@id=\"mycontent\"]").toString();
        String html = "<!DOCTYPE html>\n" +
                "<html lang=\"en\">\n" +
                "<head>\n" +
                "    <meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=0\">\n"+"<script src=\"jquery-1.6.4.min.js\"></script>"+
                "    <meta\n" +
                "      http-equiv=\"X-UA-Compatible\"\n" +
                "      content=\"IE=edge,chrome=1\"\n" +
                "      charset=\"utf-8\"\n" +
                "    />\n" +
                "</head>\n" +
                "<body>";
        html+=xpath;

        html+="</body>\n" +
                "</html>";


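        File fp=new File("F:\\zfpackage\\"+substring+".html");
        // try-with-resources closes the writer and avoids a NullPointerException when the file cannot be opened;
        // the file is written as UTF-8 to match the charset declared in the page
        try (PrintWriter pfp = new PrintWriter(fp, "UTF-8")) {
            pfp.print(html);
        } catch (IOException e) {
            e.printStackTrace();
        }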
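The save step can also be written with java.nio, which creates the target directory when it is missing. The sketch below is an assumption-laden variant, not the code used in this post: PageSaver, wrap, and save are hypothetical names, the HTML shell is reduced to a charset declaration, and IOException is rethrown unchecked for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PageSaver {
    /** Wraps the extracted fragment in a minimal HTML shell. */
    static String wrap(String fragment) {
        return "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n"
                + "    <meta charset=\"utf-8\">\n</head>\n<body>"
                + fragment + "</body>\n</html>";
    }

    /** Writes <dir>/<id>.html as UTF-8, creating the directory if needed. */
    static Path save(Path dir, String id, String fragment) {
        try {
            Files.createDirectories(dir);
            Path target = dir.resolve(id + ".html");
            Files.write(target, wrap(fragment).getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for F:\zfpackage
        Path dir = Files.createTempDirectory("zfpackage");
        Path saved = save(dir, "demo", "<div id=\"mycontent\">hello</div>");
        System.out.println(saved);
    }
}
```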

 

 

Final code:




import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.magic.demo.ConsolePipeline;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Spider;
import us.codecraft.webmagic.downloader.selenium.SeleniumDownloader;
import us.codecraft.webmagic.processor.PageProcessor;
import us.codecraft.webmagic.selector.Html;

import java.io.*;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;

/**
 * Fetches the list of all project documents
 */
public class TenderInfoWebmagic implements PageProcessor {


    // webmagic Site configuration
        private Site site = Site
                .me()
                .setSleepTime(30000)
                // .setCycleRetryTimes(5) would retry failed requests
                .setUserAgent(
                        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");

    // List endpoint for bidding records
    private static String BIAO_URL = "http://deal.ggzy.gov.cn/ds/deal/dealList_find.jsp";

    // Detail-page URLs of all bidding records
    private static List<TenderInfo> tenderInfos = new ArrayList<TenderInfo>();


    private static String provinceInfo = "";

    // Second-level (city) data, keyed by province code
    private static String addressInfo = "{\"120000\":[{\"id\":\"120000\",\"name\":\"省本級\"},{\"id\":\"120101\",\"name\":\"和平區\"},{\"id\":\"120102\",\"name\":\"河東區\"},{\"id\":\"120103\",\"name\":\"河西區\"},{\"id\":\"120104\",\"name\":\"南開區\"},{\"id\":\"120105\",\"name\":\"河北區\"},{\"id\":\"120106\",\"name\":\"紅橋區\"},{\"id\":\"120110\",\"name\":\"東麗區\"},{\"id\":\"120111\",\"name\":\"西青區\"},{\"id\":\"120112\",\"name\":\"津南區\"},{\"id\":\"120113\",\"name\":\"北辰區\"},{\"id\":\"120114\",\"name\":\"武清區\"},{\"id\":\"120115\",\"name\":\"寶坻區\"},{\"id\":\"120116\",\"name\":\"濱海新區\"},{\"id\":\"120117\",\"name\":\"寧河區\"},{\"id\":\"120118\",\"name\":\"靜海區\"},{\"id\":\"120119\",\"name\":\"薊州區\"}],\"450000\":[{\"id\":\"450000\",\"name\":\"省本級\"},{\"id\":\"450100\",\"name\":\"南寧市\"},{\"id\":\"450200\",\"name\":\"柳州市\"},{\"id\":\"450300\",\"name\":\"桂林市\"},{\"id\":\"450400\",\"name\":\"梧州市\"},{\"id\":\"450500\",\"name\":\"北海市\"},{\"id\":\"450600\",\"name\":\"防城港市\"},{\"id\":\"450700\",\"name\":\"欽州市\"},{\"id\":\"450800\",\"name\":\"貴港市\"},{\"id\":\"450900\",\"name\":\"玉林市\"},{\"id\":\"451000\",\"name\":\"百色市\"},{\"id\":\"451100\",\"name\":\"賀州市\"},{\"id\":\"451200\",\"name\":\"河池市\"},{\"id\":\"451300\",\"name\":\"來賓市\"},{\"id\":\"451400\",\"name\":\"崇左市\"}],\"140000\":[{\"id\":\"140000\",\"name\":\"省本級\"},{\"id\":\"140100\",\"name\":\"太原市\"},{\"id\":\"140200\",\"name\":\"大同市\"},{\"id\":\"140300\",\"name\":\"陽泉市\"},{\"id\":\"140400\",\"name\":\"長治市\"},{\"id\":\"140500\",\"name\":\"晉城市\"},{\"id\":\"140600\",\"name\":\"朔州市\"},{\"id\":\"140700\",\"name\":\"晉中市\"},{\"id\":\"140800\",\"name\":\"運城市\"},{\"id\":\"140900\",\"name\":\"忻州市\"},{\"id\":\"141000\",\"name\":\"臨汾市\"},{\"id\":\"141100\",\"name\":\"呂梁市\"}],\"630000\":[{\"id\":\"630000\",\"name\":\"省本級\"},{\"id\":\"630100\",\"name\":\"西寧市\"},{\"id\":\"630200\",\"name\":\"海東市\"},{\"id\":\"632200\",\"name\":\"海北藏族自治州\"},{\"id\":\"632300\",\"name\":\"黃南藏族自治州\"},{\"id\":\"632500\",\"name\":\"海南藏族自治州\"},{\"id\":\"632600\",\"name\":\"果洛藏族自治州\"},{\
"id\":\"632700\",\"name\":\"玉樹藏族自治州\"},{\"id\":\"632800\",\"name\":\"海西蒙古族藏族自治州\"}],\"440000\":[{\"id\":\"440000\",\"name\":\"省本級\"},{\"id\":\"440100\",\"name\":\"廣州市\"},{\"id\":\"440200\",\"name\":\"韶關市\"},{\"id\":\"440300\",\"name\":\"深圳市\"},{\"id\":\"440400\",\"name\":\"珠海市\"},{\"id\":\"440500\",\"name\":\"汕頭市\"},{\"id\":\"440600\",\"name\":\"佛山市\"},{\"id\":\"440700\",\"name\":\"江門市\"},{\"id\":\"440800\",\"name\":\"湛江市\"},{\"id\":\"440900\",\"name\":\"茂名市\"},{\"id\":\"441200\",\"name\":\"肇慶市\"},{\"id\":\"441300\",\"name\":\"惠州市\"},{\"id\":\"441400\",\"name\":\"梅州市\"},{\"id\":\"441500\",\"name\":\"汕尾市\"},{\"id\":\"441600\",\"name\":\"河源市\"},{\"id\":\"441700\",\"name\":\"陽江市\"},{\"id\":\"441800\",\"name\":\"清遠市\"},{\"id\":\"441900\",\"name\":\"東莞市\"},{\"id\":\"442000\",\"name\":\"中山市\"},{\"id\":\"445100\",\"name\":\"潮州市\"},{\"id\":\"445200\",\"name\":\"揭陽市\"},{\"id\":\"445300\",\"name\":\"雲浮市\"}],\"430000\":[{\"id\":\"430000\",\"name\":\"省本級\"},{\"id\":\"430100\",\"name\":\"長沙市\"},{\"id\":\"430200\",\"name\":\"株洲市\"},{\"id\":\"430300\",\"name\":\"湘潭市\"},{\"id\":\"430400\",\"name\":\"衡陽市\"},{\"id\":\"430500\",\"name\":\"邵陽市\"},{\"id\":\"430600\",\"name\":\"岳陽市\"},{\"id\":\"430700\",\"name\":\"常德市\"},{\"id\":\"430800\",\"name\":\"張家界市\"},{\"id\":\"430900\",\"name\":\"益陽市\"},{\"id\":\"431000\",\"name\":\"郴州市\"},{\"id\":\"431100\",\"name\":\"永州市\"},{\"id\":\"431200\",\"name\":\"懷化市\"},{\"id\":\"431300\",\"name\":\"婁底市\"},{\"id\":\"433100\",\"name\":\"湘西土家族苗族自治州\"}],\"620000\":[{\"id\":\"620000\",\"name\":\"省本級\"},{\"id\":\"620100\",\"name\":\"蘭州市\"},{\"id\":\"620200\",\"name\":\"嘉峪關市\"},{\"id\":\"620300\",\"name\":\"金昌市\"},{\"id\":\"620400\",\"name\":\"白銀市\"},{\"id\":\"620500\",\"name\":\"天水市\"},{\"id\":\"620600\",\"name\":\"武威市\"},{\"id\":\"620700\",\"name\":\"張掖市\"},{\"id\":\"620800\",\"name\":\"平涼市\"},{\"id\":\"620900\",\"name\":\"酒泉市\"},{\"id\":\"621000\",\"name\":\"慶陽市\"},{\"id\":\"621100\",\"name\":\"定西市\"},{\"id\":\"621200\",\"name\":\"隴南市\"},{\"id\":\"622900\",
\"name\":\"臨夏回族自治州\"},{\"id\":\"623000\",\"name\":\"甘南藏族自治州\"}],\"640000\":[{\"id\":\"640000\",\"name\":\"省本級\"},{\"id\":\"640100\",\"name\":\"銀川市\"},{\"id\":\"640200\",\"name\":\"石嘴山市\"},{\"id\":\"640300\",\"name\":\"吳忠市\"},{\"id\":\"640400\",\"name\":\"固原市\"},{\"id\":\"640500\",\"name\":\"中衛市\"}],\"230000\":[{\"id\":\"230000\",\"name\":\"省本級\"},{\"id\":\"230100\",\"name\":\"哈爾濱市\"},{\"id\":\"230200\",\"name\":\"齊齊哈爾市\"},{\"id\":\"230300\",\"name\":\"雞西市\"},{\"id\":\"230400\",\"name\":\"鶴崗市\"},{\"id\":\"230500\",\"name\":\"雙鴨山市\"},{\"id\":\"230600\",\"name\":\"大慶市\"},{\"id\":\"230700\",\"name\":\"伊春市\"},{\"id\":\"230800\",\"name\":\"佳木斯市\"},{\"id\":\"230900\",\"name\":\"七臺河市\"},{\"id\":\"231000\",\"name\":\"牡丹江市\"},{\"id\":\"231100\",\"name\":\"黑河市\"},{\"id\":\"231200\",\"name\":\"綏化市\"},{\"id\":\"232700\",\"name\":\"大興安嶺地區\"}],\"410000\":[{\"id\":\"410000\",\"name\":\"省本級\"},{\"id\":\"410100\",\"name\":\"鄭州市\"},{\"id\":\"410200\",\"name\":\"開封市\"},{\"id\":\"410300\",\"name\":\"洛陽市\"},{\"id\":\"410400\",\"name\":\"平頂山市\"},{\"id\":\"410500\",\"name\":\"安陽市\"},{\"id\":\"410600\",\"name\":\"鶴壁市\"},{\"id\":\"410700\",\"name\":\"新鄉市\"},{\"id\":\"410800\",\"name\":\"焦作市\"},{\"id\":\"410900\",\"name\":\"濮陽市\"},{\"id\":\"411000\",\"name\":\"許昌市\"},{\"id\":\"411100\",\"name\":\"漯河市\"},{\"id\":\"411200\",\"name\":\"三門峽市\"},{\"id\":\"411300\",\"name\":\"南陽市\"},{\"id\":\"411400\",\"name\":\"商丘市\"},{\"id\":\"411500\",\"name\":\"信陽市\"},{\"id\":\"411600\",\"name\":\"周口市\"},{\"id\":\"411700\",\"name\":\"駐馬店市\"},{\"id\":\"419001\",\"name\":\"濟源市\"}],\"330000\":[{\"id\":\"330000\",\"name\":\"省本級\"},{\"id\":\"330100\",\"name\":\"杭州市\"},{\"id\":\"330200\",\"name\":\"寧波市\"},{\"id\":\"330300\",\"name\":\"溫州市\"},{\"id\":\"330400\",\"name\":\"嘉興市\"},{\"id\":\"330500\",\"name\":\"湖州市\"},{\"id\":\"330600\",\"name\":\"紹興市\"},{\"id\":\"330700\",\"name\":\"金華市\"},{\"id\":\"330800\",\"name\":\"衢州市\"},{\"id\":\"330900\",\"name\":\"舟山市\"},{\"id\":\"331000\",\"name\":\"台州市\"},{\"id\":\"331100\",\"
name\":\"麗水市\"}],\"510000\":[{\"id\":\"510000\",\"name\":\"省本級\"},{\"id\":\"510100\",\"name\":\"成都市\"},{\"id\":\"510300\",\"name\":\"自貢市\"},{\"id\":\"510400\",\"name\":\"攀枝花市\"},{\"id\":\"510500\",\"name\":\"瀘州市\"},{\"id\":\"510600\",\"name\":\"德陽市\"},{\"id\":\"510700\",\"name\":\"綿陽市\"},{\"id\":\"510800\",\"name\":\"廣元市\"},{\"id\":\"510900\",\"name\":\"遂寧市\"},{\"id\":\"511000\",\"name\":\"內江市\"},{\"id\":\"511100\",\"name\":\"樂山市\"},{\"id\":\"511300\",\"name\":\"南充市\"},{\"id\":\"511400\",\"name\":\"眉山市\"},{\"id\":\"511500\",\"name\":\"宜賓市\"},{\"id\":\"511600\",\"name\":\"廣安市\"},{\"id\":\"511700\",\"name\":\"達州市\"},{\"id\":\"511800\",\"name\":\"雅安市\"},{\"id\":\"511900\",\"name\":\"巴中市\"},{\"id\":\"512000\",\"name\":\"資陽市\"},{\"id\":\"513200\",\"name\":\"阿壩藏族羌族自治州\"},{\"id\":\"513300\",\"name\":\"甘孜藏族自治州\"},{\"id\":\"513400\",\"name\":\"涼山彝族自治州\"}],\"210000\":[{\"id\":\"210000\",\"name\":\"省本級\"},{\"id\":\"210100\",\"name\":\"瀋陽市\"},{\"id\":\"210200\",\"name\":\"大連市\"},{\"id\":\"210300\",\"name\":\"鞍山市\"},{\"id\":\"210400\",\"name\":\"撫順市\"},{\"id\":\"210500\",\"name\":\"本溪市\"},{\"id\":\"210600\",\"name\":\"丹東市\"},{\"id\":\"210700\",\"name\":\"錦州市\"},{\"id\":\"210800\",\"name\":\"營口市\"},{\"id\":\"210900\",\"name\":\"阜新市\"},{\"id\":\"211000\",\"name\":\"遼陽市\"},{\"id\":\"211100\",\"name\":\"盤錦市\"},{\"id\":\"211200\",\"name\":\"鐵嶺市\"},{\"id\":\"211300\",\"name\":\"朝陽市\"},{\"id\":\"211400\",\"name\":\"葫蘆島市\"}],\"530000\":[{\"id\":\"530000\",\"name\":\"省本級\"},{\"id\":\"530100\",\"name\":\"昆明市\"},{\"id\":\"530300\",\"name\":\"曲靖市\"},{\"id\":\"530400\",\"name\":\"玉溪市\"},{\"id\":\"530500\",\"name\":\"保山市\"},{\"id\":\"530600\",\"name\":\"昭通市\"},{\"id\":\"530700\",\"name\":\"麗江市\"},{\"id\":\"530800\",\"name\":\"普洱市\"},{\"id\":\"530900\",\"name\":\"臨滄市\"},{\"id\":\"532300\",\"name\":\"楚雄彝族自治州\"},{\"id\":\"532500\",\"name\":\"紅河哈尼族彝族自治州\"},{\"id\":\"532600\",\"name\":\"文山壯族苗族自治州\"},{\"id\":\"532800\",\"name\":\"西雙版納傣族自治州\"},{\"id\":\"532900\",\"name\":\"大理白族自治州\"},{\"id\":\"53310
0\",\"name\":\"德宏傣族景頗族自治州\"},{\"id\":\"533300\",\"name\":\"怒江傈僳族自治州\"},{\"id\":\"533400\",\"name\":\"迪慶藏族自治州\"}],\"130000\":[{\"id\":\"130000\",\"name\":\"省本級\"},{\"id\":\"130100\",\"name\":\"石家莊市\"},{\"id\":\"130200\",\"name\":\"唐山市\"},{\"id\":\"130300\",\"name\":\"秦皇島市\"},{\"id\":\"130400\",\"name\":\"邯鄲市\"},{\"id\":\"130500\",\"name\":\"邢臺市\"},{\"id\":\"130600\",\"name\":\"保定市\"},{\"id\":\"130700\",\"name\":\"張家口市\"},{\"id\":\"130800\",\"name\":\"承德市\"},{\"id\":\"130900\",\"name\":\"滄州市\"},{\"id\":\"131000\",\"name\":\"廊坊市\"},{\"id\":\"131100\",\"name\":\"衡水市\"}],\"340000\":[{\"id\":\"340000\",\"name\":\"省本級\"},{\"id\":\"340100\",\"name\":\"合肥市\"},{\"id\":\"340200\",\"name\":\"蕪湖市\"},{\"id\":\"340300\",\"name\":\"蚌埠市\"},{\"id\":\"340400\",\"name\":\"淮南市\"},{\"id\":\"340500\",\"name\":\"馬鞍山市\"},{\"id\":\"340600\",\"name\":\"淮北市\"},{\"id\":\"340700\",\"name\":\"銅陵市\"},{\"id\":\"340800\",\"name\":\"安慶市\"},{\"id\":\"341000\",\"name\":\"黃山市\"},{\"id\":\"341100\",\"name\":\"滁州市\"},{\"id\":\"341200\",\"name\":\"阜陽市\"},{\"id\":\"341300\",\"name\":\"宿州市\"},{\"id\":\"341500\",\"name\":\"六安市\"},{\"id\":\"341600\",\"name\":\"亳州市\"},{\"id\":\"341700\",\"name\":\"池州市\"},{\"id\":\"341800\",\"name\":\"宣城市\"}],\"500000\":[{\"id\":\"500000\",\"name\":\"省本級\"},{\"id\":\"500101\",\"name\":\"萬州區\"},{\"id\":\"500102\",\"name\":\"涪陵區\"},{\"id\":\"500103\",\"name\":\"渝中區\"},{\"id\":\"500104\",\"name\":\"大渡口區\"},{\"id\":\"500105\",\"name\":\"江北區\"},{\"id\":\"500106\",\"name\":\"沙坪壩區\"},{\"id\":\"500107\",\"name\":\"九龍坡區\"},{\"id\":\"500108\",\"name\":\"南岸區\"},{\"id\":\"500109\",\"name\":\"北碚區\"},{\"id\":\"500110\",\"name\":\"綦江區\"},{\"id\":\"500111\",\"name\":\"大足區\"},{\"id\":\"500112\",\"name\":\"渝北區\"},{\"id\":\"500113\",\"name\":\"巴南區\"},{\"id\":\"500114\",\"name\":\"黔江區\"},{\"id\":\"500115\",\"name\":\"長壽區\"},{\"id\":\"500116\",\"name\":\"江津區\"},{\"id\":\"500117\",\"name\":\"合川區\"},{\"id\":\"500118\",\"name\":\"永川區\"},{\"id\":\"500119\",\"name\":\"南川區\"},{\"id\":\"500120\",\"name\":\
"璧山區\"},{\"id\":\"500151\",\"name\":\"銅梁區\"},{\"id\":\"500152\",\"name\":\"潼南區\"},{\"id\":\"500153\",\"name\":\"榮昌區\"},{\"id\":\"500154\",\"name\":\"開州區\"},{\"id\":\"500155\",\"name\":\"梁平區\"},{\"id\":\"500156\",\"name\":\"武隆區\"},{\"id\":\"500229\",\"name\":\"城口縣\"},{\"id\":\"500230\",\"name\":\"豐都縣\"},{\"id\":\"500231\",\"name\":\"墊江縣\"},{\"id\":\"500233\",\"name\":\"忠縣\"},{\"id\":\"500235\",\"name\":\"雲陽縣\"},{\"id\":\"500236\",\"name\":\"奉節縣\"},{\"id\":\"500237\",\"name\":\"巫山縣\"},{\"id\":\"500238\",\"name\":\"巫溪縣\"},{\"id\":\"500240\",\"name\":\"石柱土家族自治縣\"},{\"id\":\"500241\",\"name\":\"秀山土家族苗族自治縣\"},{\"id\":\"500242\",\"name\":\"酉陽土家族苗族自治縣\"},{\"id\":\"500243\",\"name\":\"彭水苗族土家族自治縣\"}],\"350000\":[{\"id\":\"350000\",\"name\":\"省本級\"},{\"id\":\"350100\",\"name\":\"福州市\"},{\"id\":\"350200\",\"name\":\"廈門市\"},{\"id\":\"350300\",\"name\":\"莆田市\"},{\"id\":\"350400\",\"name\":\"三明市\"},{\"id\":\"350500\",\"name\":\"泉州市\"},{\"id\":\"350600\",\"name\":\"漳州市\"},{\"id\":\"350700\",\"name\":\"南平市\"},{\"id\":\"350800\",\"name\":\"龍巖市\"},{\"id\":\"350900\",\"name\":\"寧德市\"}],\"320000\":[{\"id\":\"320000\",\"name\":\"省本級\"},{\"id\":\"320100\",\"name\":\"南京市\"},{\"id\":\"320200\",\"name\":\"無錫市\"},{\"id\":\"320300\",\"name\":\"徐州市\"},{\"id\":\"320400\",\"name\":\"常州市\"},{\"id\":\"320500\",\"name\":\"蘇州市\"},{\"id\":\"320600\",\"name\":\"南通市\"},{\"id\":\"320700\",\"name\":\"連雲港市\"},{\"id\":\"320800\",\"name\":\"淮安市\"},{\"id\":\"320900\",\"name\":\"鹽城市\"},{\"id\":\"321000\",\"name\":\"揚州市\"},{\"id\":\"321100\",\"name\":\"鎮江市\"},{\"id\":\"321200\",\"name\":\"泰州市\"},{\"id\":\"321300\",\"name\":\"宿遷市\"}],\"220000\":[{\"id\":\"220000\",\"name\":\"省本級\"},{\"id\":\"220100\",\"name\":\"長春市\"},{\"id\":\"220200\",\"name\":\"吉林市\"},{\"id\":\"220300\",\"name\":\"四平市\"},{\"id\":\"220400\",\"name\":\"遼源市\"},{\"id\":\"220500\",\"name\":\"通化市\"},{\"id\":\"220600\",\"name\":\"白山市\"},{\"id\":\"220700\",\"name\":\"松原市\"},{\"id\":\"220800\",\"name\":\"白城市\"},{\"id\":\"222400\",\"name\":\"延邊朝鮮族自治州\"
}],\"310000\":[{\"id\":\"310000\",\"name\":\"省本級\"},{\"id\":\"310101\",\"name\":\"黃浦區\"},{\"id\":\"310104\",\"name\":\"徐彙區\"},{\"id\":\"310105\",\"name\":\"長寧區\"},{\"id\":\"310106\",\"name\":\"靜安區\"},{\"id\":\"310107\",\"name\":\"普陀區\"},{\"id\":\"310109\",\"name\":\"虹口區\"},{\"id\":\"310110\",\"name\":\"楊浦區\"},{\"id\":\"310112\",\"name\":\"閔行區\"},{\"id\":\"310113\",\"name\":\"寶山區\"},{\"id\":\"310114\",\"name\":\"嘉定區\"},{\"id\":\"310115\",\"name\":\"浦東新區\"},{\"id\":\"310116\",\"name\":\"金山區\"},{\"id\":\"310117\",\"name\":\"松江區\"},{\"id\":\"310118\",\"name\":\"青浦區\"},{\"id\":\"310120\",\"name\":\"奉賢區\"},{\"id\":\"310151\",\"name\":\"崇明區\"}],\"650000\":[{\"id\":\"650000\",\"name\":\"省本級\"},{\"id\":\"650100\",\"name\":\"烏魯木齊市\"},{\"id\":\"650200\",\"name\":\"克拉瑪依市\"},{\"id\":\"652100\",\"name\":\"吐魯番市\"},{\"id\":\"652200\",\"name\":\"哈密市\"},{\"id\":\"652300\",\"name\":\"昌吉回族自治州\"},{\"id\":\"652700\",\"name\":\"博爾塔拉蒙古自治州\"},{\"id\":\"652800\",\"name\":\"巴音郭楞蒙古自治州\"},{\"id\":\"652900\",\"name\":\"阿克蘇地區\"},{\"id\":\"653000\",\"name\":\"克孜勒蘇柯爾克孜自治州\"},{\"id\":\"653100\",\"name\":\"喀什地區\"},{\"id\":\"653200\",\"name\":\"和田地區\"},{\"id\":\"654000\",\"name\":\"伊犁哈薩克自治州\"},{\"id\":\"654200\",\"name\":\"塔城地區\"},{\"id\":\"654300\",\"name\":\"阿勒泰地區\"},{\"id\":\"659001\",\"name\":\"石河子市\"},{\"id\":\"659002\",\"name\":\"阿拉爾市\"},{\"id\":\"659003\",\"name\":\"圖木舒克市\"},{\"id\":\"659004\",\"name\":\"五家渠市\"}],\"150000\":[{\"id\":\"150000\",\"name\":\"省本級\"},{\"id\":\"150100\",\"name\":\"呼和浩特市\"},{\"id\":\"150200\",\"name\":\"包頭市\"},{\"id\":\"150300\",\"name\":\"烏海市\"},{\"id\":\"150400\",\"name\":\"赤峯市\"},{\"id\":\"150500\",\"name\":\"通遼市\"},{\"id\":\"150600\",\"name\":\"鄂爾多斯市\"},{\"id\":\"150700\",\"name\":\"呼倫貝爾市\"},{\"id\":\"150800\",\"name\":\"巴彥淖爾市\"},{\"id\":\"150900\",\"name\":\"烏蘭察布市\"},{\"id\":\"152200\",\"name\":\"興安盟\"},{\"id\":\"152500\",\"name\":\"錫林郭勒盟\"},{\"id\":\"152900\",\"name\":\"阿拉善盟\"}],\"610000\":[{\"id\":\"610000\",\"name\":\"省本級\"},{\"id\":\"610100\",\"name\":\"西安市\"}
,{\"id\":\"610200\",\"name\":\"銅川市\"},{\"id\":\"610300\",\"name\":\"寶雞市\"},{\"id\":\"610400\",\"name\":\"咸陽市\"},{\"id\":\"610500\",\"name\":\"渭南市\"},{\"id\":\"610600\",\"name\":\"延安市\"},{\"id\":\"610700\",\"name\":\"漢中市\"},{\"id\":\"610800\",\"name\":\"榆林市\"},{\"id\":\"610900\",\"name\":\"安康市\"},{\"id\":\"611000\",\"name\":\"商洛市\"}],\"540000\":[{\"id\":\"540000\",\"name\":\"省本級\"},{\"id\":\"540100\",\"name\":\"拉薩市\"},{\"id\":\"542100\",\"name\":\"昌都市\"},{\"id\":\"542200\",\"name\":\"山南市\"},{\"id\":\"542300\",\"name\":\"日喀則市\"},{\"id\":\"542400\",\"name\":\"那曲市\"},{\"id\":\"542500\",\"name\":\"阿里地區\"},{\"id\":\"542600\",\"name\":\"林芝市\"}],\"360000\":[{\"id\":\"360000\",\"name\":\"省本級\"},{\"id\":\"360100\",\"name\":\"南昌市\"},{\"id\":\"360200\",\"name\":\"景德鎮市\"},{\"id\":\"360300\",\"name\":\"萍鄉市\"},{\"id\":\"360400\",\"name\":\"九江市\"},{\"id\":\"360500\",\"name\":\"新餘市\"},{\"id\":\"360600\",\"name\":\"鷹潭市\"},{\"id\":\"360700\",\"name\":\"贛州市\"},{\"id\":\"360800\",\"name\":\"吉安市\"},{\"id\":\"360900\",\"name\":\"宜春市\"},{\"id\":\"361000\",\"name\":\"撫州市\"},{\"id\":\"361100\",\"name\":\"上饒市\"}],\"420000\":[{\"id\":\"420000\",\"name\":\"省本級\"},{\"id\":\"420100\",\"name\":\"武漢市\"},{\"id\":\"420200\",\"name\":\"黃石市\"},{\"id\":\"420300\",\"name\":\"十堰市\"},{\"id\":\"420500\",\"name\":\"宜昌市\"},{\"id\":\"420600\",\"name\":\"襄陽市\"},{\"id\":\"420700\",\"name\":\"鄂州市\"},{\"id\":\"420800\",\"name\":\"荊門市\"},{\"id\":\"420900\",\"name\":\"孝感市\"},{\"id\":\"421000\",\"name\":\"荊州市\"},{\"id\":\"421100\",\"name\":\"黃岡市\"},{\"id\":\"421200\",\"name\":\"咸寧市\"},{\"id\":\"421300\",\"name\":\"隨州市\"},{\"id\":\"422800\",\"name\":\"恩施土家族苗族自治州\"},{\"id\":\"429004\",\"name\":\"仙桃市\"},{\"id\":\"429005\",\"name\":\"潛江市\"},{\"id\":\"429006\",\"name\":\"天門市\"},{\"id\":\"429021\",\"name\":\"神農架林區\"}],\"520000\":[{\"id\":\"520000\",\"name\":\"省本級\"},{\"id\":\"520100\",\"name\":\"貴陽市\"},{\"id\":\"520200\",\"name\":\"六盤水市\"},{\"id\":\"520300\",\"name\":\"遵義市\"},{\"id\":\"520400\",\"name\":\"安順市\"},{\"id\":\"
520500\",\"name\":\"畢節市\"},{\"id\":\"520600\",\"name\":\"銅仁市\"},{\"id\":\"522300\",\"name\":\"黔西南布依族苗族自治州\"},{\"id\":\"522600\",\"name\":\"黔東南苗族侗族自治州\"},{\"id\":\"522700\",\"name\":\"黔南布依族苗族自治州\"}],\"370000\":[{\"id\":\"370000\",\"name\":\"省本級\"},{\"id\":\"370100\",\"name\":\"濟南市\"},{\"id\":\"370200\",\"name\":\"青島市\"},{\"id\":\"370300\",\"name\":\"淄博市\"},{\"id\":\"370400\",\"name\":\"棗莊市\"},{\"id\":\"370500\",\"name\":\"東營市\"},{\"id\":\"370600\",\"name\":\"煙臺市\"},{\"id\":\"370700\",\"name\":\"濰坊市\"},{\"id\":\"370800\",\"name\":\"濟寧市\"},{\"id\":\"370900\",\"name\":\"泰安市\"},{\"id\":\"371000\",\"name\":\"威海市\"},{\"id\":\"371100\",\"name\":\"日照市\"},{\"id\":\"371300\",\"name\":\"臨沂市\"},{\"id\":\"371400\",\"name\":\"德州市\"},{\"id\":\"371500\",\"name\":\"聊城市\"},{\"id\":\"371600\",\"name\":\"濱州市\"},{\"id\":\"371700\",\"name\":\"菏澤市\"}],\"110000\":[{\"id\":\"110000\",\"name\":\"省本級\"},{\"id\":\"110101\",\"name\":\"東城區\"},{\"id\":\"110102\",\"name\":\"西城區\"},{\"id\":\"110105\",\"name\":\"朝陽區\"},{\"id\":\"110106\",\"name\":\"豐臺區\"},{\"id\":\"110107\",\"name\":\"石景山區\"},{\"id\":\"110108\",\"name\":\"海淀區\"},{\"id\":\"110109\",\"name\":\"門頭溝區\"},{\"id\":\"110111\",\"name\":\"房山區\"},{\"id\":\"110112\",\"name\":\"通州區\"},{\"id\":\"110113\",\"name\":\"順義區\"},{\"id\":\"110114\",\"name\":\"昌平區\"},{\"id\":\"110115\",\"name\":\"大興區\"},{\"id\":\"110116\",\"name\":\"懷柔區\"},{\"id\":\"110117\",\"name\":\"平谷區\"},{\"id\":\"110118\",\"name\":\"密雲區\"},{\"id\":\"110119\",\"name\":\"延慶區\"}],\"460000\":[{\"id\":\"460000\",\"name\":\"省本級\"},{\"id\":\"460100\",\"name\":\"海口市\"},{\"id\":\"460200\",\"name\":\"三亞市\"},{\"id\":\"460300\",\"name\":\"三沙市\"},{\"id\":\"469001\",\"name\":\"五指山市\"},{\"id\":\"469002\",\"name\":\"瓊海市\"},{\"id\":\"469003\",\"name\":\"儋州市\"},{\"id\":\"469005\",\"name\":\"文昌市\"},{\"id\":\"469006\",\"name\":\"萬寧市\"},{\"id\":\"469007\",\"name\":\"東方市\"},{\"id\":\"469021\",\"name\":\"定安縣\"},{\"id\":\"469022\",\"name\":\"屯昌縣\"},{\"id\":\"469023\",\"name\":\"澄邁縣\"},{\"id\":\"469024\",\"n
ame\":\"臨高縣\"},{\"id\":\"469025\",\"name\":\"白沙黎族自治縣\"},{\"id\":\"469026\",\"name\":\"昌江黎族自治縣\"},{\"id\":\"469027\",\"name\":\"樂東黎族自治縣\"},{\"id\":\"469028\",\"name\":\"陵水黎族自治縣\"},{\"id\":\"469029\",\"name\":\"保亭黎族苗族自治縣\"},{\"id\":\"469030\",\"name\":\"瓊中黎族苗族自治縣\"}]}";

    public static void main(String[] args) throws UnsupportedEncodingException, InterruptedException {


        // Initialise the city data; here we take every city in Shandong (370000)
        List<AddressInfo> addressInfos = JSONObject.parseObject(addressInfo).getJSONArray("370000").toJavaList(AddressInfo.class);


        for(AddressInfo addressInfo:addressInfos){

            // Start crawling this region
            System.out.println(addressInfo.getName()+" start");

            // Fetch the first page of bidding records for the region
            String tenderAdderssInfo = sendPost(BIAO_URL,addressInfo.getId(),"1");
            DeaListResponse deaListResponse = JSONObject.parseObject(tenderAdderssInfo, DeaListResponse.class);
            // No data
            if(deaListResponse==null || deaListResponse.getTtlrow()==0){
                System.out.println(addressInfo.getName()+" no data");
                continue;
            }
            List<ToubiaoList> data = deaListResponse.getData();
            // Process the records here, ready to be saved to our database
            handlerZhaobiao(data,addressInfo.getName());

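            // More than one page: fetch pages 2..ttlpage. Each page goes into its own variable so the loop bound stays that of the first response.
            if(deaListResponse.getTtlpage()>1){
                for(int i=2;i<=deaListResponse.getTtlpage();i++){
                    System.out.println(addressInfo.getName()+" page "+i);
                    tenderAdderssInfo = sendPost(BIAO_URL,addressInfo.getId(),i+"");
                    DeaListResponse pageResponse = JSONObject.parseObject(tenderAdderssInfo, DeaListResponse.class);
                    if(pageResponse==null || pageResponse.getData()==null){
                        continue;
                    }
                    // Handle the records of the page just fetched, not the first page again
                    handlerZhaobiao(pageResponse.getData(),addressInfo.getName());
                }
            }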

            System.out.println(addressInfo.getName()+"結束");
        }

        System.out.println("All tender list data: "+JSONObject.toJSONString(tenderInfos));


        // Use WebMagic to crawl the detail pages; initialize WebMagic
        TenderInfoWebmagic tenderInfoWebmagic = new TenderInfoWebmagic();
        Spider spider = Spider.create(tenderInfoWebmagic)
                .addPipeline(new ConsolePipeline())
                .setDownloader(new SeleniumDownloader("C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe"));

        tenderInfoWebmagic.addTargetUrl(spider);
        //spider.addUrl("https://www.baidu.com/");
        spider.run();
    }

    private void addTargetUrl(Spider spider) {

        System.setProperty("webdriver.chrome.driver",
                "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe");

        // A browser window opens by default; run Chrome headless to suppress it
        ChromeOptions chromeOptions=new ChromeOptions();
        chromeOptions.addArguments("--headless");
        WebDriver driver = new ChromeDriver(chromeOptions);

        driver.manage().window().maximize();

        try {
            // tenderInfos holds every detail URL collected from the list API
            for(TenderInfo tenderInfo: tenderInfos){
                try {
                    String url = tenderInfo.getUrl();
                    driver.get(url);
                    // Locate the "tender / prequalification announcement" tab
                    WebElement element = driver.findElement(By.xpath("//li[@id='t_0101']"));
                    // Click it
                    element.click();
                    // Read the announcement URL from the iframe inside the panel
                    WebElement element1 = driver.findElement(By.xpath("//div[@id='show0101']"));
                    String substring = url.substring( url.lastIndexOf("/")+1,url.length()-6);
                    // The leading "." keeps the XPath relative to element1; "//" alone searches the whole page
                    String targetUrl = element1.findElement(By.xpath(".//iframe[@id='iframe0101']")).getAttribute("src");

                    // Append the list entry's id to the announcement URL to tie the list row to the page
                    targetUrl+="?url="+substring;
                    spider.addUrl(targetUrl);
                    // Sleep 3 seconds to avoid being blocked
                    Thread.sleep(3000);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        } finally {
            // Close the browser and stop the chromedriver process
            driver.quit();
        }

    }

    static String sendPost(String url,String area,String page) throws UnsupportedEncodingException {
        try {
            // Sleep so that we are not blocked for calling too fast
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        httpPost.addHeader("Accept","application/json");
        httpPost.addHeader("Content-Type","application/x-www-form-urlencoded; charset=UTF-8");
        httpPost.addHeader("User-Agent","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        Date currentDate = new Date();
        Calendar c = Calendar.getInstance();
        c.setTime(currentDate);
        c.add(Calendar.DATE, - 9);
        Date d = c.getTime();
        String day = format.format(d);

        // Request parameters worked out by inspecting the site's requests
        nvps.add(new BasicNameValuePair("TIMEBEGIN_SHOW",day));
        nvps.add(new BasicNameValuePair("TIMEEND_SHOW",format.format(currentDate)));

        // Inspecting the front-end requests shows that TIMEBEGIN and TIMEEND cover a 10-day window
        // (today and the 9 days before it)
        nvps.add(new BasicNameValuePair("TIMEBEGIN",day));
        nvps.add(new BasicNameValuePair("TIMEEND",format.format(currentDate)));

        nvps.add(new BasicNameValuePair("SOURCE_TYPE","1"));
        nvps.add(new BasicNameValuePair("DEAL_TIME","01"));
        nvps.add(new BasicNameValuePair("DEAL_CLASSIFY","01"));
        nvps.add(new BasicNameValuePair("DEAL_STAGE","0101"));

        // Province code for Shandong
        nvps.add(new BasicNameValuePair("DEAL_PROVINCE","370000"));

        // City code
        nvps.add(new BasicNameValuePair("DEAL_CITY",area));

        nvps.add(new BasicNameValuePair("DEAL_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("BID_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("DEAL_TRADE","0"));
        nvps.add(new BasicNameValuePair("isShowAll","1"));
        nvps.add(new BasicNameValuePair("PAGENUMBER",page));
        nvps.add(new BasicNameValuePair("FINDTXT",""));
        httpPost.setEntity(new UrlEncodedFormEntity(nvps, "utf-8"));
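        String res = "";
        try {
            HttpResponse response = httpClient.execute(httpPost);
            res = EntityUtils.toString(response.getEntity(), "utf-8");
        } catch (Exception e) {
            // On failure the empty string is returned and the caller treats it as "no data"
            e.printStackTrace();
        } finally {
            // Release the connection held by this one-shot client
            httpClient.getConnectionManager().shutdown();
        }
        return res;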
    }


    public static void  handlerZhaobiao( List<ToubiaoList> data,String cityName){

        System.out.println(cityName+" data: "+JSONObject.toJSONString(data));
        // Nothing to process when the page carries no rows
        if(data==null){
            return;
        }
        Date currentDate = new Date();
        for(ToubiaoList toubiaoList : data){
            // Detail page URL
            String url = toubiaoList.getUrl();
            TenderInfo tenderInfo = new TenderInfo();
            tenderInfo.setUrl(url);
            // "省" is the Chinese suffix for "province"
            tenderInfo.setAddress(toubiaoList.getDistrictShow()+"省"+cityName);
            tenderInfo.setId(url.substring( url.lastIndexOf("/")+1,url.length()-6));
            // Content (the listing title)
            tenderInfo.setContent(toubiaoList.getTitle());
            //tenderInfo.setContract_type("工程招標");   // "engineering tender"
            //tenderInfo.setAnnouncement("招標公告");    // "tender announcement"
            // Release time (set to the crawl date here)
            tenderInfo.setRelease_time(new SimpleDateFormat("yyyy-MM-dd").format(currentDate));
            //tenderInfo.setCreator("f8133aa5ac26efe3da55e6a6882688d7");
            //tenderInfo.setCreate_time(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(currentDate));
            tenderInfos.add(tenderInfo);
        }
        System.out.println(JSONObject.toJSONString(tenderInfos));
    }

    @Override
    public void process(Page page) {
        String pageUrl = page.getUrl().toString();
        Html pageHtml = page.getHtml();
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        String substring = pageUrl.substring( pageUrl.lastIndexOf("=")+1,pageUrl.length());
        String xpath = pageHtml.xpath("//div[@id=\"mycontent\"]").toString();
        String html = "<!DOCTYPE html>\n" +
                "<html lang=\"en\">\n" +
                "<head>\n" +
                "    <meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=0\">\n"+"<script src=\"jquery-1.6.4.min.js\"></script>"+
                "    <meta\n" +
                "      http-equiv=\"X-UA-Compatible\"\n" +
                "      content=\"IE=edge,chrome=1\"\n" +
                "      charset=\"utf-8\"\n" +
                "    />\n" +
                "</head>\n" +
                "<body>";
        html+=xpath;

        html+="</body>\n" +
                "</html>";


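        File fp=new File("F:\\zfpackage\\"+substring+".html");
        // try-with-resources always closes the writer, and a failed open no longer
        // leads to a NullPointerException; write UTF-8 to match the charset declared in the head
        try (PrintWriter pfp = new PrintWriter(fp, "UTF-8")) {
            pfp.print(html);
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            e.printStackTrace();
        }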
    }

    @Override
    public Site getSite() {
        return site;
    }
}
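
The paging loop in `main` follows a common pattern: fetch page 1, read the total page count, then request pages 2 through the last page and process each page's own rows. The sketch below isolates that pattern so it can be checked without the network. `PagingSketch`, `PageResult` and `fetchAll` are hypothetical names introduced only for illustration; `PageResult` stands in for `DeaListResponse`, and the `IntFunction` stands in for `sendPost` plus JSON parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

public class PagingSketch {

    /** Hypothetical stand-in for DeaListResponse: one page of rows plus the total page count. */
    public static final class PageResult {
        public final int totalPages;
        public final List<String> rows;

        public PageResult(int totalPages, List<String> rows) {
            this.totalPages = totalPages;
            this.rows = rows;
        }
    }

    /**
     * Fetches page 1, reads the total page count once, then walks pages 2..totalPages.
     * A null page (for example a failed request) is skipped instead of aborting the run.
     */
    public static List<String> fetchAll(IntFunction<PageResult> fetchPage) {
        List<String> all = new ArrayList<>();
        PageResult first = fetchPage.apply(1);
        if (first == null) {
            return all;
        }
        all.addAll(first.rows);
        // Captured once so that later responses cannot change the loop bound
        int totalPages = first.totalPages;
        for (int page = 2; page <= totalPages; page++) {
            PageResult next = fetchPage.apply(page);
            if (next != null) {
                all.addAll(next.rows); // rows of this page, not of page 1
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake three-page source standing in for the real HTTP call
        IntFunction<PageResult> fake = page ->
                new PageResult(3, Arrays.asList("row-" + page + "a", "row-" + page + "b"));
        System.out.println(fetchAll(fake));
    }
}
```

Passing the fetch step in as a function keeps the loop testable with a fake source, which makes off-by-one mistakes in the page bound easy to catch.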







public class ToubiaoList {

    private String classify;

    private String title;

    private String timeShow;

    private String stageName;

    private String  platformName;

    private String classifyShow;

    private String tradeShow;

    private String districtShow;

    private String url;

    private String stageShow;

    private String titleShow;

    public String getClassify() {
        return classify;
    }

    public void setClassify(String classify) {
        this.classify = classify;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTimeShow() {
        return timeShow;
    }

    public void setTimeShow(String timeShow) {
        this.timeShow = timeShow;
    }

    public String getStageName() {
        return stageName;
    }

    public void setStageName(String stageName) {
        this.stageName = stageName;
    }

    public String getPlatformName() {
        return platformName;
    }

    public void setPlatformName(String platformName) {
        this.platformName = platformName;
    }

    public String getClassifyShow() {
        return classifyShow;
    }

    public void setClassifyShow(String classifyShow) {
        this.classifyShow = classifyShow;
    }

    public String getTradeShow() {
        return tradeShow;
    }

    public void setTradeShow(String tradeShow) {
        this.tradeShow = tradeShow;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getStageShow() {
        return stageShow;
    }

    public void setStageShow(String stageShow) {
        this.stageShow = stageShow;
    }

    public String getTitleShow() {
        return titleShow;
    }

    public void setTitleShow(String titleShow) {
        this.titleShow = titleShow;
    }

    public String getDistrictShow() {
        return districtShow;
    }

    public void setDistrictShow(String districtShow) {
        this.districtShow = districtShow;
    }
}
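
The `sendPost` method in the crawler class builds its `TIMEBEGIN` / `TIMEEND` values with `Calendar` and `SimpleDateFormat`. As a possible alternative, the same inclusive window can be computed with `java.time`, which is immutable and thread-safe. This is only a sketch: `QueryWindow` and `window` are hypothetical names, and the ten-day span is taken from the comment in `sendPost`, not from any documentation of the site.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class QueryWindow {

    // ISO_LOCAL_DATE prints dates as yyyy-MM-dd, the same pattern sendPost uses
    private static final DateTimeFormatter DAY = DateTimeFormatter.ISO_LOCAL_DATE;

    /** Returns {begin, end} covering `days` calendar days that end on `today`, inclusive. */
    public static String[] window(LocalDate today, int days) {
        LocalDate begin = today.minusDays(days - 1L);
        return new String[] { DAY.format(begin), DAY.format(today) };
    }

    public static void main(String[] args) {
        String[] w = window(LocalDate.of(2019, 7, 10), 10);
        System.out.println(w[0] + " .. " + w[1]); // prints "2019-07-01 .. 2019-07-10"
    }
}
```

Taking `today` as a parameter rather than calling `LocalDate.now()` inside the method means both ends of the window come from one clock reading and the method can be tested with a fixed date.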



public class TenderInfo implements java.io.Serializable {


    private static final long serialVersionUID = 1L;
    private String id;
    private String content;
    private String contract_type;
    private String announcement;
    private String release_time;
    private String address;
    private String creator;
    private String create_time;
    private String modified;
    private String modify_time;
    private String version;
    private String url;

    public static long getSerialVersionUID() {
        return serialVersionUID;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public String getContract_type() {
        return contract_type;
    }

    public void setContract_type(String contract_type) {
        this.contract_type = contract_type;
    }

    public String getAnnouncement() {
        return announcement;
    }

    public void setAnnouncement(String announcement) {
        this.announcement = announcement;
    }

    public String getRelease_time() {
        return release_time;
    }

    public void setRelease_time(String release_time) {
        this.release_time = release_time;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    public String getCreator() {
        return creator;
    }

    public void setCreator(String creator) {
        this.creator = creator;
    }

    public String getCreate_time() {
        return create_time;
    }

    public void setCreate_time(String create_time) {
        this.create_time = create_time;
    }

    public String getModified() {
        return modified;
    }

    public void setModified(String modified) {
        this.modified = modified;
    }

    public String getModify_time() {
        return modify_time;
    }

    public void setModify_time(String modify_time) {
        this.modify_time = modify_time;
    }

    public String getVersion() {
        return version;
    }

    public void setVersion(String version) {
        this.version = version;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}
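
The `id` stored in `TenderInfo` is cut out of the detail URL with `url.substring(url.lastIndexOf("/")+1, url.length()-6)`, then appended to the announcement URL as `?url=<id>` and read back in `process` by taking everything after the last `=`. The sketch below shows that round trip in isolation. It assumes the detail URLs end in a file extension (the `length()-6` suggests a six-character one such as `.shtml`, but that is a guess), and `DetailUrlIds` with its methods are hypothetical helpers, not part of the original crawler. The example URLs are placeholders.

```java
public class DetailUrlIds {

    /** Last path segment without its extension, e.g. ".../abc123.shtml" gives "abc123". */
    public static String idFromDetailUrl(String url) {
        String last = url.substring(url.lastIndexOf('/') + 1);
        int dot = last.lastIndexOf('.');
        // Cut at the dot instead of a fixed length so other extensions also work
        return dot < 0 ? last : last.substring(0, dot);
    }

    /** Appends the id as a "url" parameter, using '&' when a query string already exists. */
    public static String tag(String targetUrl, String id) {
        return targetUrl + (targetUrl.contains("?") ? "&" : "?") + "url=" + id;
    }

    /** Recovers the id the same way process() does: everything after the last '='. */
    public static String idFromTaggedUrl(String taggedUrl) {
        return taggedUrl.substring(taggedUrl.lastIndexOf('=') + 1);
    }

    public static void main(String[] args) {
        String id = idFromDetailUrl("http://example.com/a/b/abc123.shtml");
        String tagged = tag("http://example.com/notice.html", id);
        System.out.println(tagged);                  // prints "http://example.com/notice.html?url=abc123"
        System.out.println(idFromTaggedUrl(tagged)); // prints "abc123"
    }
}
```

Choosing between `?` and `&` matters if the iframe `src` already carries a query string; a second `?` would otherwise produce a malformed URL.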


 
