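Overview of the four main service components:
  Oozie Workflow: defines and executes MapReduce, Hive, and Pig jobs in a specified order.
  Oozie Coordinator: runs workflows automatically when conditions such as events or the availability of system resources are met.
  Oozie Bundle: defines and executes a "bundle" of applications, providing a way to manage a group of Coordinator applications together as a batch.
  Oozie SLA (Service Level Agreement): records and tracks the execution of workflow applications.

Download:
  wget http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6.tar.gz
References:
  http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6/
  https://cwiki.apache.org/confluence/display/OOZIE

Installing Oozie
1. Download Oozie and ext-2.2.zip.
2. Set the environment variables:
     export OOZIE_HOME=/home/hadoop/bigdater/oozie-4.0.0-cdh5.3.6
     export PATH=$PATH:$OOZIE_HOME/bin
3. Edit conf/oozie-site.xml. The main changes specify the metadata database and the list of services.
4. Optionally adjust parameters in conf/oozie-env.sh, for example the port number. The default port is 11000.
5. Create a libext folder in the Oozie root directory and copy the MySQL driver jar into libext.
   Command:
     cp ~/bigdater/hive-0.13.1-cdh5.3.6/lib/mysql-connector-java-5.1.31.jar ./libext/
   or
     cp ~/bigdater/softs/mysql-connector-java-5.1.31.jar ./libext/
6. Create the database schema. When the command finishes, the database and its tables appear in MySQL.
   Command:
     ooziedb.sh create -sqlfile oozie.sql -run
7. Configure the Hadoop proxy user (already done if Hive was installed earlier):
     hadoop.proxyuser.hadoop.hosts and hadoop.proxyuser.hadoop.groups
8. Create the shared library folder on HDFS:
     oozie-setup.sh sharelib create -fs hdfs://hh:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
9. Create the war file:
     addtowar.sh -inputwar ./oozie.war -outputwar ./oozie-server/webapps/oozie.war -hadoop 2.5.0 $HADOOP_HOME -jars ./libext/mysql-connector-java-5.1.31.jar -extjs ../softs/ext-2.2.zip
   Alternatively, put the Hadoop jars, the MySQL jar, and the ext archive into the libext folder and run oozie-setup.sh prepare-war, which also creates the war file.
10. Start the server: oozied.sh run or oozied.sh start (the former runs in the foreground, the latter in the background).
11. Open the web UI and check the status:
     oozie admin -oozie http://hh:11000/oozie -status
----------------------------------------------------------
Workflow template
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:[version]">
    <credentials>...</credentials> <!-- optional: credential definitions, typically used for authentication and authorization settings -->
    <start/> <!-- required: the start node of the workflow -->
    <!-- control nodes begin -->
    <decision>...</decision> <!-- optional: a switch-case style decision -->
    <fork>...</fork> <!-- optional: control node that starts parallel execution -->
    <join>...</join> <!-- optional: control node that ends parallel execution -->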
    <kill>...</kill> <!-- optional: kill node -->
    <!-- control nodes end -->
    <action name="[NODE-NAME]">....</action> <!-- a concrete action node -->
    <end/> <!-- end node -->
</workflow-app>
Note: there can be only one credentials node, one start node, and one end node; every other node type may appear multiple times. The start and end nodes are required; all other nodes are optional.
-----------------------------------------
The job.template.properties file
nameNode=<Hadoop setting: fs.defaultFS / fs.default.name>
jobTracker=<Hadoop JobTracker setting; in Hadoop 2 this is YARN's yarn.resourcemanager.address, default 0.0.0.0:8032>
queueName=<name of the Hadoop execution queue, default is default>
oozie.wf.application.path=<HDFS directory of the workflow application>
oozie.coord.application.path=<HDFS directory of the coordinator application>
oozie.bundle.application.path=<HDFS directory of the bundle application>
------------------------------------------
oozie-site.xml configuration
<?xml version="1.0"?>
<configuration>
    <property>
        <name>oozie.services</name>
        <value>
            org.apache.oozie.service.JobsConcurrencyService,
            org.apache.oozie.service.SchedulerService,
            org.apache.oozie.service.InstrumentationService,
            org.apache.oozie.service.MemoryLocksService,
            org.apache.oozie.service.CallableQueueService,
            org.apache.oozie.service.UUIDService,
            org.apache.oozie.service.ELService,
            org.apache.oozie.service.AuthorizationService,
            org.apache.oozie.service.UserGroupInformationService,
            org.apache.oozie.service.HadoopAccessorService,
            org.apache.oozie.service.URIHandlerService,
            org.apache.oozie.service.DagXLogInfoService,
            org.apache.oozie.service.SchemaService,
            org.apache.oozie.service.LiteWorkflowAppService,
            org.apache.oozie.service.JPAService,
            org.apache.oozie.service.StoreService,
            org.apache.oozie.service.CoordinatorStoreService,
            org.apache.oozie.service.SLAStoreService,
            org.apache.oozie.service.DBLiteWorkflowStoreService,
            org.apache.oozie.service.CallbackService,
            org.apache.oozie.service.ActionService,
            org.apache.oozie.service.ShareLibService,
            org.apache.oozie.service.ActionCheckerService,
            org.apache.oozie.service.RecoveryService,
            org.apache.oozie.service.PurgeService,
            org.apache.oozie.service.CoordinatorEngineService,
            org.apache.oozie.service.BundleEngineService,
            org.apache.oozie.service.DagEngineService,
            org.apache.oozie.service.CoordMaterializeTriggerService,
            org.apache.oozie.service.StatusTransitService,
            org.apache.oozie.service.PauseTransitService,
            org.apache.oozie.service.GroupsService,
            org.apache.oozie.service.ProxyUserService,
            org.apache.oozie.service.XLogStreamingService,
            org.apache.oozie.service.JvmPauseMonitorService
        </value>
    </property>
    <property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=/home/hadoop/bigdater/hadoop-2.5.0-cdh5.3.6/etc/hadoop</value>
    </property>
    <property>
        <name>oozie.service.JPAService.create.db.schema</name>
        <value>true</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.driver</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.url</name>
        <value>jdbc:mysql://hh:3306/oozie?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.username</name>
        <value>hive</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.password</name>
        <value>hive</value>
    </property>
    <property>
        <name>oozie.processing.timezone</name>
        <value>GMT+0800</value>
    </property>
    <!-- settings required to run shell (and hive) actions -->
    <property>
        <name>oozie.service.SchemaService.wf.ext.schemas</name>
        <value>oozie-sla-0.1.xsd,shell-action-0.1.xsd,hive-action-0.2.xsd</value>
    </property>
    <property>
        <name>oozie.service.ActionService.executor.ext.classes</name>
        <value>
            org.apache.oozie.action.hadoop.ShellActionExecutor,
            org.apache.oozie.action.hadoop.HiveActionExecutor
        </value>
    </property>
    <!-- whether Oozie checks the run interval (frequency) of coordinator jobs -->
    <property>
        <name>oozie.service.coord.check.maximum.frequency</name>
        <value>false</value>
    </property>
</configuration>
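Step 7 of the installation names the two Hadoop proxy-user properties without showing where they go. A minimal sketch of the corresponding entries in Hadoop's core-site.xml could look like the following. The wildcard values are an assumption made for illustration (they let the hadoop user impersonate users from any host and any group); a real cluster would normally restrict them.

```xml
<?xml version="1.0"?>
<!-- Sketch of core-site.xml entries for the "hadoop" proxy user. -->
<!-- The "*" values are assumptions for illustration, not taken from the original notes. -->
<configuration>
    <property>
        <!-- hosts from which the hadoop user may impersonate other users -->
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
    </property>
    <property>
        <!-- groups whose members the hadoop user may impersonate -->
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
</configuration>
```

Hadoop typically needs to reload or restart its daemons before changed proxy-user settings take effect.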
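To make the workflow template more concrete, here is a minimal sketch of a workflow.xml containing a single shell action, which the shell-action-0.1.xsd schema and ShellActionExecutor entries in oozie-site.xml are there to support. The workflow name shell-demo-wf, the node names, the workflow schema version 0.4, and the echo command are all assumptions chosen for illustration. The ${nameNode}, ${jobTracker}, and ${queueName} variables are expected to come from a properties file based on job.template.properties.

```xml
<!-- Illustrative sketch only: names, schema version 0.4, and the command are assumptions. -->
<workflow-app name="shell-demo-wf" xmlns="uri:oozie:workflow:0.4">
    <!-- the single start node points at the first action -->
    <start to="shell-node"/>

    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <!-- command to run and its argument -->
            <exec>echo</exec>
            <argument>hello-oozie</argument>
            <capture-output/>
        </shell>
        <!-- transitions for success and failure -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <!-- the single end node -->
    <end name="end"/>
</workflow-app>
```

Assuming this file is uploaded to the HDFS directory named by oozie.wf.application.path, the job could be submitted with oozie job -oozie http://hh:11000/oozie -config job.properties -run.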
Basic Oozie operations