Python crawler: fetching some commonly used scripts from oracle-base

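The following Python script scrapes the commonly used database scripts published at http://www.oracle-base.com/dba. I was short on time, so the code is fairly rough. I am sharing it anyway in the hope that we can all improve together.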

# -*- coding: utf-8 -*-
#---------------------------------------
#   Program:  fetch the scripts from http://www.oracle-base.com/dba
#   Version:  0.1
#   Author:   carefree
#   Date:     2014-03-12
#   Language: Python 2.7
#   Purpose:  fetch the scripts from http://www.oracle-base.com/dba
#---------------------------------------

import os
import re
import urllib2


class GetScript:
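    # Declare the attributes used by the spider
    def __init__(self):
        self.baseUrl = 'http://www.oracle-base.com/dba/'
        self.initUrl = 'http://www.oracle-base.com/dba/scripts.php'  # page scraped for the script list (test page)
        self.results = []         # scripts fetched so far
        self.ls = os.linesep      # line terminator
        self.category = ''        # script category
        self.filename = ''        # script file name
        self.content = ''         # script content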

    # Fetch the page that lists the script names and categories
    def get_data(self):
        result = urllib2.urlopen(self.initUrl)   # open the page
        self.deal_data(result.read())

    # Parse the page content
    def deal_data(self, myPage):
        # Match links such as
        # <li><a href="script.php?category=script_creation&file=table_constraints_ddl.sql">table_constraints_ddl.sql</li>
        result = re.findall(r'script\.php\?category=\S+?\.sql"', myPage)
        for x in result:
            # Extract the category and the file name
            a = x.find('&')
            self.category = x[20:a]          # 20 == len('script.php?category=')
            self.filename = x[x.find('=', a + 1) + 1:-1]
            self.get_script()

    def get_script(self):
        url = self.baseUrl + self.category + '/' + self.filename
        # Download the script content
        self.content = urllib2.urlopen(url).read()

        # Write the downloaded script to a file
        self.write_tofile()

    # Write the downloaded script to <category>/<filename>
    def write_tofile(self):
        fname = os.path.join(self.category, self.filename)
        # Create the category directory on first use
        if not os.path.exists(self.category):
            os.makedirs(os.path.join(os.getcwd(), self.category))
        if os.path.exists(fname):
            # Never overwrite a file that is already there
            print "ERROR: file '%s' already exists!" % os.path.join(os.getcwd(), fname)
        else:
            fobj = open(fname, 'wb')   # binary mode keeps the script's bytes unchanged
            fobj.write(self.content)
            fobj.close()
            print 'File written to ' + os.path.join(os.getcwd(), fname)

#print 'Data written to ' + fname + ', processing complete'

# Test code
if __name__ == '__main__':
    #print os.getcwd()
    mySpider = GetScript()
    mySpider.get_data()
    print 'Processing completed!'
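
The deal_data method above cuts the category out of each match at a fixed offset of 20 characters, which only works while the link starts with exactly script.php?category=. A possible alternative is to let the regular expression capture both fields directly. The sketch below is only an illustration and is not part of the original script: it assumes Python 3, the names LINK_RE and parse_script_links are hypothetical, the second sample link is made up, and accepting &amp; as the separator is an assumption about how the page might encode the ampersand.

```python
import re

# Hypothetical helper, not part of the original script: extract
# (category, filename) pairs with named capture groups instead of slicing.
LINK_RE = re.compile(
    r'script\.php\?category=(?P<category>[^&"\s]+)'
    r'&(?:amp;)?file=(?P<filename>[^&"\s]+\.sql)"'
)


def parse_script_links(html):
    """Return a list of (category, filename) tuples found in html."""
    return [(m.group('category'), m.group('filename'))
            for m in LINK_RE.finditer(html)]


# Sample markup: the first link is the one quoted in deal_data's comment,
# the second is an invented example that uses the &amp; form.
sample = (
    '<li><a href="script.php?category=script_creation'
    '&file=table_constraints_ddl.sql">table_constraints_ddl.sql</li>'
    '<li><a href="script.php?category=monitoring'
    '&amp;file=sessions.sql">sessions.sql</li>'
)

links = parse_script_links(sample)
for category, filename in links:
    print(category, filename)
```

Each tuple could then be handed to get_script in place of the sliced values. The pattern runs on a plain string, so it can be tried on a saved copy of the page without touching the network.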
