Basic Oozie Operations

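Overview of the four components:
Oozie Workflow: defines and runs MapReduce, Hive, and Pig jobs in a specified order.
Oozie Coordinator: triggers workflows automatically based on conditions such as events or the availability of system resources (typically a time schedule or the arrival of input data).
Oozie Bundle: defines and runs a "bundle" of applications, providing a way to manage a set of Coordinator applications together as a batch.
Oozie SLA (Service Level Agreement): records and tracks the execution of workflow applications.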

Download: wget http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6.tar.gz
References:
   http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6/
   https://cwiki.apache.org/confluence/display/OOZIE

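Installing Oozie
   1. Download Oozie and ext-2.2.zip.
   2. Set the environment variables:
      export OOZIE_HOME=/home/hadoop/bigdater/oozie-4.0.0-cdh5.3.6
      export PATH=$PATH:$OOZIE_HOME/bin
   3. Edit conf/oozie-site.xml.
      The main changes specify the metadata store (database) and the list of services.
   4. Optionally adjust parameters in conf/oozie-env.sh, such as the port number. The default port is 11000.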

   5. Create a libext folder in the Oozie root directory and copy the MySQL driver JAR into it.
      Command: cp ~/bigdater/hive-0.13.1-cdh5.3.6/lib/mysql-connector-java-5.1.31.jar ./libext/
      or
      cp ~/bigdater/softs/mysql-connector-java-5.1.31.jar ./libext/
   6. Create the database schema. When the command finishes, the Oozie database and its tables appear in MySQL.
      Command: ooziedb.sh create -sqlfile oozie.sql -run
   7. Configure the Hadoop proxy user (already done if it was set when installing Hive):
      hadoop.proxyuser.hadoop.hosts and hadoop.proxyuser.hadoop.groups
   8. Create the shared library folder on HDFS:
      Command: oozie-setup.sh sharelib create -fs hdfs://hh:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz

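   9. Build the WAR file:
      Command: addtowar.sh -inputwar ./oozie.war -outputwar ./oozie-server/webapps/oozie.war -hadoop 2.5.0 $HADOOP_HOME -jars ./libext/mysql-connector-java-5.1.31.jar -extjs ../softs/ext-2.2.zip
      or
      put the Hadoop JARs, the MySQL driver JAR, and the ExtJS archive into the libext folder and run oozie-setup.sh prepare-war, which also builds the WAR file.
   10. Start Oozie: oozied.sh run or oozied.sh start (the former runs in the foreground, the latter in the background).
   11. Open the web UI, or check the status from the command line: oozie admin -oozie http://hh:11000/oozie -status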
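
   Step 7 names the two proxy-user properties without showing where they go. A minimal sketch of the matching core-site.xml entries follows, assuming Oozie runs as the OS user hadoop (as the property names suggest). The wildcard values are an assumption suited to a single-node test setup; on a shared cluster they would normally be narrowed to specific hosts and groups.

```xml
<!-- core-site.xml (Hadoop side), inside the existing <configuration> element -->
<!-- Assumption: "*" allows the hadoop user to impersonate users from any host and any group. -->
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
```

   Hadoop usually needs to be restarted (or its proxy-user settings refreshed) before a change like this takes effect.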


   ----------------------------------------------------------
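   Workflow template
   <workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:[version]">
       <credentials>...</credentials> <!-- Optional: credential definitions, typically used for authentication-related settings -->
       <start/> <!-- Required: the start node of the workflow -->
       <!-- Control nodes begin -->
       <decision>...</decision> <!-- Optional: a switch-case style decision node -->
       <fork>...</fork> <!-- Optional: control node that begins parallel execution -->
       <join>...</join> <!-- Optional: control node that ends parallel execution -->
       <kill>...</kill> <!-- Optional: kill node -->
       <!-- Control nodes end -->
       <action name="[NODE-NAME]">....</action> <!-- A concrete action node -->
       <end/> <!-- Required: the end node -->
    </workflow-app>

    Note: a workflow may contain only one credentials node, one start node, and one end node; every other node type may appear multiple times. The start and end nodes are required; all other nodes are optional.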
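
    To make the template concrete, here is a minimal sketch of a workflow with a single shell action. The names demo-wf, shell-node, fail, and end are hypothetical, and the schema versions (workflow 0.4, shell-action 0.1) are assumptions that should be checked against the XSDs shipped with your Oozie release. The ${jobTracker}, ${nameNode}, and ${queueName} variables are expected to come from the job properties file.

```xml
<!-- workflow.xml: illustrative only; node names are hypothetical -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
    <!-- Required start node: transfers control to the first action -->
    <start to="shell-node"/>

    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <!-- Runs: echo hello-oozie -->
            <exec>echo</exec>
            <argument>hello-oozie</argument>
        </shell>
        <!-- Every action declares where to go on success and on failure -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <!-- Required end node -->
    <end name="end"/>
</workflow-app>
```

    The file would be uploaded to the HDFS directory named by oozie.wf.application.path before the job is submitted.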
    -----------------------------------------
    The job.template.properties file
    nameNode=<Hadoop setting fs.defaultFS (fs.default.name in older versions)>
    jobTracker=<Hadoop JobTracker address; on Hadoop 2 this is YARN's yarn.resourcemanager.address, default 0.0.0.0:8032>
    queueName=<name of the Hadoop queue to run in, default: default>

    oozie.wf.application.path=<HDFS directory of the workflow application>
    oozie.coord.application.path=<HDFS directory of the coordinator application>
    oozie.bundle.application.path=<HDFS directory of the bundle application>
    ------------------------------------------
    oozie-site.xml configuration

    <?xml version="1.0"?>
    <configuration>
        <property>
          <name>oozie.services</name>
          <value>
             org.apache.oozie.service.JobsConcurrencyService,
             org.apache.oozie.service.SchedulerService,
             org.apache.oozie.service.InstrumentationService,
             org.apache.oozie.service.MemoryLocksService,
             org.apache.oozie.service.CallableQueueService,
             org.apache.oozie.service.UUIDService,
             org.apache.oozie.service.ELService,
             org.apache.oozie.service.AuthorizationService,
             org.apache.oozie.service.UserGroupInformationService,
             org.apache.oozie.service.HadoopAccessorService,
             org.apache.oozie.service.URIHandlerService,
             org.apache.oozie.service.DagXLogInfoService,
             org.apache.oozie.service.SchemaService,
             org.apache.oozie.service.LiteWorkflowAppService,
             org.apache.oozie.service.JPAService,
             org.apache.oozie.service.StoreService,
             org.apache.oozie.service.CoordinatorStoreService,
             org.apache.oozie.service.SLAStoreService,
             org.apache.oozie.service.DBLiteWorkflowStoreService,
             org.apache.oozie.service.CallbackService,
             org.apache.oozie.service.ActionService,
             org.apache.oozie.service.ShareLibService,
             org.apache.oozie.service.ActionCheckerService,
             org.apache.oozie.service.RecoveryService,
             org.apache.oozie.service.PurgeService,
             org.apache.oozie.service.CoordinatorEngineService,
             org.apache.oozie.service.BundleEngineService,
             org.apache.oozie.service.DagEngineService,
             org.apache.oozie.service.CoordMaterializeTriggerService,
             org.apache.oozie.service.StatusTransitService,
             org.apache.oozie.service.PauseTransitService,
             org.apache.oozie.service.GroupsService,
             org.apache.oozie.service.ProxyUserService,
             org.apache.oozie.service.XLogStreamingService,
             org.apache.oozie.service.JvmPauseMonitorService
          </value>
       </property>
       <property>
          <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
          <value>*=/home/hadoop/bigdater/hadoop-2.5.0-cdh5.3.6/etc/hadoop</value>
       </property>
       <property>
          <name>oozie.service.JPAService.create.db.schema</name>
          <value>true</value>
       </property>

       <property>
          <name>oozie.service.JPAService.jdbc.driver</name>
          <value>com.mysql.jdbc.Driver</value>
       </property>

       <property>
          <name>oozie.service.JPAService.jdbc.url</name>
          <value>jdbc:mysql://hh:3306/oozie?createDatabaseIfNotExist=true</value>
       </property>

       <property>
          <name>oozie.service.JPAService.jdbc.username</name>
          <value>hive</value>
       </property>

       <property>
          <name>oozie.service.JPAService.jdbc.password</name>
          <value>hive</value>
       </property>

       <property>
          <name>oozie.processing.timezone</name>
          <value>GMT+0800</value>
       </property>

       <!-- Extensions required to run shell and hive actions -->
       <property>
          <name>oozie.service.SchemaService.wf.ext.schemas</name>
          <value>oozie-sla-0.1.xsd,shell-action-0.1.xsd,hive-action-0.2.xsd</value>
       </property>

       <property>
          <name>oozie.service.ActionService.executor.ext.classes</name>
          <value>
             org.apache.oozie.action.hadoop.ShellActionExecutor,
             org.apache.oozie.action.hadoop.HiveActionExecutor
          </value>
       </property>

       <!-- Whether Oozie checks the run interval (frequency) of coordinator jobs -->
       <property>
          <name>oozie.service.coord.check.maximum.frequency</name>
          <value>false</value>
       </property>
    </configuration>
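
    Two of the settings above shape how coordinator definitions are written. As I understand the Oozie documentation, setting oozie.processing.timezone to GMT+0800 means dates are written with a +0800 suffix rather than a trailing Z, and setting oozie.service.coord.check.maximum.frequency to false lets a coordinator run more often than the default 5-minute minimum. The sketch below illustrates both under those assumptions; demo-coord, the dates, and the workflowAppUri property are hypothetical.

```xml
<!-- coordinator.xml: illustrative only; the name, dates, and ${workflowAppUri} are hypothetical -->
<!-- Dates use the +0800 suffix to match oozie.processing.timezone=GMT+0800. -->
<!-- A 1-minute frequency assumes oozie.service.coord.check.maximum.frequency=false. -->
<coordinator-app name="demo-coord"
                 frequency="${coord:minutes(1)}"
                 start="2015-01-01T00:00+0800"
                 end="2015-01-02T00:00+0800"
                 timezone="Asia/Shanghai"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <!-- HDFS path of the workflow application to trigger -->
            <app-path>${workflowAppUri}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

    A coordinator like this is submitted with a properties file that sets oozie.coord.application.path instead of oozie.wf.application.path.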