http://blog.csdn.net/russell_tao/article/details/7090033
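The first commercial service I maintained, back in 2004, created its daemon process by forking twice. A couple of days ago I came across many posts online, as well as some Unix books, arguing that a single fork is enough to create a daemon. Each view has its merits, but what exactly is the extra fork for?
A process is a task. Look at task_struct, the data structure the kernel uses to track a process. Two of its members matter here: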
    struct task_struct {
        volatile long state;
        int exit_state;
        ...
    };
The values these members take are defined in include/linux/sched.h:
    #define TASK_RUNNING            0
    #define TASK_INTERRUPTIBLE      1
    #define TASK_UNINTERRUPTIBLE    2
    #define __TASK_STOPPED          4
    #define __TASK_TRACED           8
    /* in tsk->exit_state */
    #define EXIT_ZOMBIE             16
    #define EXIT_DEAD               32
    /* in tsk->state again */
    #define TASK_DEAD               64
    #define TASK_WAKEKILL           128
    #define TASK_WAKING             256
    #define TASK_STATE_MAX          512
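As you can see, besides the familiar running, interruptible, uninterruptible and stopped states, there is also a ZOMBIE state. What is this state about?
It exists because every process in Linux belongs to a single tree. The root of the tree is the init process, which is started at the end of system initialization and has pid 1; every other process is its descendant. Every process except init has a parent, and the parent is responsible for creating (fork) and reclaiming (wait4) the child processes it asks for. The tree is also robust: if a parent exits while its child is still running, the child does not stay orphaned, because Linux has init adopt it and become its new parent. This is also where daemons come from, since one of the requirements of a daemon is that init be its parent.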
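The adoption step described above can be observed from user space. The following is a minimal sketch, not code from the original post: `orphan_is_reparented` is a hypothetical helper name, and the pipe is only there so the grandchild can report back. It assumes a POSIX system. It checks only that the reported parent changes, because on systems that use a child subreaper the new parent may be a process other than pid 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a grandchild saw its parent change
 * after its original parent exited, 0 if it did not, -1 on error.
 */
int orphan_is_reparented(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        close(fds[0]);
        /* Recorded before the fork, so the grandchild knows its first parent. */
        pid_t original_parent = getpid();
        pid_t grandchild = fork();
        if (grandchild == -1)
            _exit(1);
        if (grandchild > 0)
            _exit(0);               /* parent exits, orphaning the grandchild */

        /* Grandchild: poll for up to about 5 seconds for a new parent. */
        char result = '0';
        struct timespec pause = { 0, 10 * 1000 * 1000 };   /* 10 ms */
        for (int i = 0; i < 500; i++) {
            if (getppid() != original_parent) {
                result = '1';
                break;
            }
            nanosleep(&pause, NULL);
        }
        if (write(fds[1], &result, 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    int status;
    waitpid(child, &status, 0);     /* reap the intermediate process */

    char result = '0';
    ssize_t n = read(fds[0], &result, 1);   /* blocks until the grandchild reports */
    close(fds[0]);
    if (n != 1)
        return -1;
    return result == '1';
}
```

Calling `orphan_is_reparented()` from a `main` function and printing the result should show 1 on a typical Linux system: the grandchild's `getppid()` stops returning the pid of the process that forked it as soon as that process exits.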
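When a process terminates, exit first cleans up its memory, files and other resources, and the process then enters the ZOMBIE state. Its parent is expected to call wait4 to reclaim the task_struct. But if the parent never calls wait4 to release the child's task_struct, there is a problem: who reclaims it? Nobody, until the parent itself terminates; init then adopts the zombie and calls wait4 to reclaim the process descriptor. If the parent keeps running, the zombie holds on to its task_struct and pid indefinitely, and no signal sent with kill can release it. This is dangerous, because a server can accumulate so many zombies that the machine goes down.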
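The zombie lifecycle described above can be checked with a short program. This is a sketch under stated assumptions, not code from the original post: `struct zombie_report` and `observe_zombie` are hypothetical names, and it uses the POSIX calls `waitid` with `WNOWAIT` (which waits for the child to exit without reaping it) and `waitpid` in place of the lower-level wait4.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical report structure, for illustration only. */
struct zombie_report {
    int exists_as_zombie;   /* pid still exists after exit, before reaping */
    int survives_sigkill;   /* still waitable after being sent SIGKILL */
    int exit_code;          /* exit code collected by waitpid */
    int exists_after_wait;  /* pid still exists after waitpid */
};

/* Returns 0 on success and fills *out, or -1 on error. */
int observe_zombie(struct zombie_report *out)
{
    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(7);           /* child terminates at once */

    /* Block until the child has exited; WNOWAIT leaves it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)child, &info, WEXITED | WNOWAIT) == -1)
        return -1;

    /* Signal 0 only tests for existence: the zombie still owns its pid. */
    out->exists_as_zombie = (kill(child, 0) == 0);

    /* SIGKILL is accepted but has nothing left to kill. */
    kill(child, SIGKILL);
    info.si_pid = 0;
    out->survives_sigkill =
        (waitid(P_PID, (id_t)child, &info,
                WEXITED | WNOHANG | WNOWAIT) == 0 && info.si_pid == child);

    /* Only a wait call by the parent releases the process descriptor. */
    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;
    out->exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    out->exists_after_wait = (kill(child, 0) == 0);
    return 0;
}
```

On a typical Linux system the report should show that the pid exists while the child is a zombie, that SIGKILL changes nothing, and that the pid disappears only after `waitpid` returns with the child's original exit code.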
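Now for the kernel code. A process exits through sys_exit (a C program reaches it when main returns), which calls do_exit. do_exit first releases the resources the process was using and then calls exit_notify, which hands the exiting process's children over to init, notifies the parent of the exiting process, and puts the process in the ZOMBIE state: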
kernel/exit.c
    static void exit_notify(struct task_struct *tsk)
    {
        int state;
        struct task_struct *t;
        struct list_head ptrace_dead, *_p, *_n;

        if (signal_pending(tsk) && !tsk->signal->group_exit
            && !thread_group_empty(tsk)) {
            /*
             * This occurs when there was a race between our exit
             * syscall and a group signal choosing us as the one to
             * wake up. It could be that we are the only thread
             * alerted to check for pending signals, but another thread
             * should be woken now to take the signal since we will not.
             * Now we'll wake all the threads in the group just to make
             * sure someone gets all the pending signals.
             */
            read_lock(&tasklist_lock);
            spin_lock_irq(&tsk->sighand->siglock);
            for (t = next_thread(tsk); t != tsk; t = next_thread(t))
                if (!signal_pending(t) && !(t->flags & PF_EXITING)) {
                    recalc_sigpending_tsk(t);
                    if (signal_pending(t))
                        signal_wake_up(t, 0);
                }
            spin_unlock_irq(&tsk->sighand->siglock);
            read_unlock(&tasklist_lock);
        }

        write_lock_irq(&tasklist_lock);

        /*
         * This does two things:
         *
         * A. Make init inherit all the child processes
         * B. Check to see if any process groups have become orphaned
         *    as a result of our exiting, and if they have any stopped
         *    jobs, send them a SIGHUP and then a SIGCONT. (POSIX 3.2.2.2)
         */
        INIT_LIST_HEAD(&ptrace_dead);
        forget_original_parent(tsk, &ptrace_dead);
        BUG_ON(!list_empty(&tsk->children));
        BUG_ON(!list_empty(&tsk->ptrace_children));

        /*
         * Check to see if any process groups have become orphaned
         * as a result of our exiting, and if they have any stopped
         * jobs, send them a SIGHUP and then a SIGCONT. (POSIX 3.2.2.2)
         *
         * Case i: Our father is in a different pgrp than we are
         * and we were the only connection outside, so our pgrp
         * is about to become orphaned.
         */
        t = tsk->real_parent;

        if ((process_group(t) != process_group(tsk)) &&
            (t->signal->session == tsk->signal->session) &&
            will_become_orphaned_pgrp(process_group(tsk), tsk) &&
            has_stopped_jobs(process_group(tsk))) {
            __kill_pg_info(SIGHUP, (void *)1, process_group(tsk));
            __kill_pg_info(SIGCONT, (void *)1, process_group(tsk));
        }

        /* Let father know we died
         *
         * Thread signals are configurable, but you aren't going to use
         * that to send signals to arbitary processes.
         * That stops right now.
         *
         * If the parent exec id doesn't match the exec id we saved
         * when we started then we know the parent has changed security
         * domain.
         *
         * If our self_exec id doesn't match our parent_exec_id then
         * we have changed execution domain as these two values started
         * the same after a fork.
         *
         */
        if (tsk->exit_signal != SIGCHLD && tsk->exit_signal != -1 &&
            (tsk->parent_exec_id != t->self_exec_id ||
             tsk->self_exec_id != tsk->parent_exec_id)
            && !capable(CAP_KILL))
            tsk->exit_signal = SIGCHLD;

        /* If something other than our normal parent is ptracing us, then
         * send it a SIGCHLD instead of honoring exit_signal. exit_signal
         * only has special meaning to our real parent.
         */
        if (tsk->exit_signal != -1 && thread_group_empty(tsk)) {
            int signal = tsk->parent == tsk->real_parent ? tsk->exit_signal : SIGCHLD;
            do_notify_parent(tsk, signal);
        } else if (tsk->ptrace) {
            do_notify_parent(tsk, SIGCHLD);
        }

        state = EXIT_ZOMBIE;
        if (tsk->exit_signal == -1 && tsk->ptrace == 0)
            state = EXIT_DEAD;
        tsk->exit_state = state;

        /*
         * Clear these here so that update_process_times() won't try to deliver
         * itimer, profile or rlimit signals to this task while it is in late exit.
         */
        tsk->it_virt_value = 0;
        tsk->it_prof_value = 0;

        write_unlock_irq(&tasklist_lock);

        list_for_each_safe(_p, _n, &ptrace_dead) {
            list_del_init(_p);
            t = list_entry(_p, struct task_struct, ptrace_list);
            release_task(t);
        }

        /* If the process is dead, release it - nobody will wait for it */
        if (state == EXIT_DEAD)
            release_task(tsk);

        /* PF_DEAD causes final put_task_struct after we schedule. */
        preempt_disable();
        tsk->flags |= PF_DEAD;
    }
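As you can see, this kernel code is thoroughly commented. The lines to focus on are the call to forget_original_parent, the call to do_notify_parent, and the assignment state = EXIT_ZOMBIE. forget_original_parent also resets the parent of every child of the exiting process, handing them over to init.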
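Now back to the question of why a daemon forks twice. The underlying assumption is that the parent still has work of its own to do after creating the daemon; creating the daemon is not its sole purpose in life. Suppose the parent creates the daemon with a single fork and then blocks while doing something else. The daemon keeps running, and the parent neither exits nor calls wait4. If the daemon then exits, whether normally or abnormally, it becomes a ZOMBIE.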
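What if it forks twice? The parent forks a child, the child forks a grandchild to act as the daemon, and the child exits immediately. init adopts the daemon, so whatever the parent does, and however long it blocks, no longer has anything to do with the daemon; the parent only needs to wait once for the short-lived child. When the daemon eventually exits, init reclaims it. A daemon created with two forks is therefore safe from being left behind as a zombie.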
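The two-fork sequence can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code of the service mentioned at the start: `spawn_detached` and `write_marker` are hypothetical names, and the steps a production daemon usually adds (such as calling setsid, changing directory and redirecting standard I/O) are left out to keep the focus on the two forks.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run work(arg) in a grandchild process.
 * Returns 0 in the caller once the intermediate child has been
 * reaped, or -1 on failure.
 */
int spawn_detached(void (*work)(void *), void *arg)
{
    pid_t child = fork();               /* first fork */
    if (child == -1)
        return -1;

    if (child == 0) {
        pid_t grandchild = fork();      /* second fork */
        if (grandchild == -1)
            _exit(1);
        if (grandchild > 0)
            _exit(0);                   /* intermediate child exits at once */

        work(arg);                      /* grandchild: the daemon body */
        _exit(0);
    }

    /* Reap the intermediate child so that it does not linger as a zombie. */
    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Example worker, also hypothetical: writes one byte to the fd in *arg. */
void write_marker(void *arg)
{
    int fd = *(int *)arg;
    char c = 'x';
    if (write(fd, &c, 1) != 1)
        _exit(1);
}
```

A caller could create a pipe, pass the address of its write end to `spawn_detached(write_marker, ...)`, and read the byte back. After `spawn_detached` returns, the caller has no children left to wait for, so it can block for as long as it likes without leaving a zombie behind when the worker exits.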