A Brief Look at C++11 std::thread

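Starting with C++11, the language and its standard library provide threads and the synchronization primitives that go with them. This post records my understanding of the std::thread source code.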

First, the reference description of the thread class:

std::thread
C++ Thread support library
Defined in header <thread>
class thread;
(since C++11)

The class thread represents a single thread of execution. Threads allow multiple functions to execute concurrently.
Threads begin execution immediately upon construction of the associated thread object (pending any OS scheduling delays), starting at the top-level function provided as a constructor argument. The return value of the top-level function is ignored and if it terminates by throwing an exception, std::terminate is called. The top-level function may communicate its return value or an exception to the caller via std::promise or by modifying shared variables (which may require synchronization, see std::mutex and std::atomic)
std::thread objects may also be in the state that does not represent any thread (after default construction, move from, detach, or join), and a thread of execution may be not associated with any thread objects (after detach).
No two std::thread objects may represent the same thread of execution; std::thread is not CopyConstructible or CopyAssignable, although it is MoveConstructible and MoveAssignable.

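In other words:

1. The class thread represents a single thread of execution. Threads allow multiple functions to execute concurrently.

2. A thread starts executing as soon as its thread object is constructed (subject to OS scheduling delays). Its entry point is the callable passed to the constructor: a function, a function object, a lambda expression, and so on.

3. The return value of the thread function is ignored. If the function exits by throwing an exception, std::terminate is called, which ends the whole program, not just that thread.

4. The thread function can pass its return value or an exception back to the caller through std::promise, or by modifying shared variables, which may need synchronization such as std::mutex or std::atomic.

5. A std::thread object may be in a state where it represents no thread at all (after default construction, being moved from, detach, or join). Conversely, a thread of execution that has been detached is no longer associated with any std::thread object.

6. No two std::thread objects represent the same thread of execution: std::thread cannot be copy-constructed (CopyConstructible) or copy-assigned (CopyAssignable), but it does have move semantics, that is, move construction (MoveConstructible) and move assignment (MoveAssignable).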
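
The object states described in points 5 and 6 can be checked directly with joinable(). The following is a minimal sketch; the helper name thread_state_demo is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: walks std::thread objects through the states
// listed above and checks joinable() at each step.
bool thread_state_demo()
{
	std::thread empty;                      // default-constructed: represents no thread
	assert(!empty.joinable());

	std::thread worker([] {});              // starts running as soon as it is constructed
	assert(worker.joinable());

	std::thread owner = std::move(worker);  // ownership moves; a copy would not compile
	assert(!worker.joinable());             // the moved-from object represents no thread
	assert(owner.joinable());

	owner.join();                           // after join() the object is empty again
	assert(!owner.joinable());
	return true;
}
```

Calling thread_state_demo() from main runs through all four states; uncommenting a line such as `std::thread copy = owner;` would fail to compile because the copy constructor is deleted.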

Not knowing where else to begin, let's go through the six points above one by one:

Point 1: nothing to add; this is the whole reason for having threads.
Point 2: start with some code:


		#include <iostream>
		#include <string>
		#include <thread>

		// plain function
		void printFun(std::string s)
		{
			std::cout << "in pure function " << s << std::endl;
		}
		
		
		// function object
		class PrintObj
		{
		public:
			void operator()(std::string s) const
			{
				std::cout << "in function object: " << s << std::endl;
			}
		};
		
		// lambda expression
		auto lambFun = [](std::string s) {
			std::cout << "in lambda function: " << s << std::endl;
		};
		
		int main()
		{
			std::string s = "Hello World!";

			/* Alternative: keep the threads in a std::vector (needs <vector>).
			std::vector<std::thread> threads;
			threads.emplace_back(printFun, s);
			threads.emplace_back(PrintObj(), s);
			threads.emplace_back(lambFun, s);
			for (auto& t : threads)
				t.join(); // join each thread exactly once
			*/

			std::thread th_1(printFun, s);
			std::thread th_2(PrintObj(), s);
			std::thread th_3(lambFun, s);
			th_1.join();
			th_2.join();
			th_3.join();
			return 0;
		}

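Each of the three threads prints one line. Because they run concurrently, the order of the lines is not fixed, and their output may even interleave.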

With that code in mind, let's look at the source of std::thread:

		
		// The data that std::thread wraps. Its member _Thr has type _Thrd_imp_t,
		// which holds two fields: the underlying OS handle and the thread id.
		typedef unsigned int _Thrd_id_t;
		typedef struct
			{	/* thread identifier for Win32 */
			void *_Hnd;	/* Win32 HANDLE */
			_Thrd_id_t _Id;
			} _Thrd_imp_t;
		
		// Macros that operate on _Thrd_imp_t, mainly for use by thread.
		#define _Thr_val(thr) thr._Id
		#define _Thr_set_null(thr) (thr._Hnd = nullptr, thr._Id = 0)
		#define _Thr_is_null(thr) (thr._Id == 0)

		class thread
			{	// class for observing and managing threads
		public:
			class id; // nested class that wraps the thread id type _Thrd_id_t
		
			typedef void *native_handle_type;
		
			thread() noexcept
				{	// construct with no thread
				_Thr_set_null(_Thr); // handle = nullptr, id = 0: the object represents no thread; mostly useful as the target of a later move assignment
				}
		
			//(1)
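			// The most important constructor. It perfect-forwards the callable _Fx and the arguments _Ax into a heap-allocated tuple,
			// then calls _Launch with &_Thr and that tuple to start the thread.
			// Note that the arguments are decay-copied into the tuple for the new thread to use; this is explained in detail below.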
			template<class _Fn,
				class... _Args,
				class = enable_if_t<!is_same_v<remove_cv_t<remove_reference_t<_Fn>>, thread>>>
				explicit thread(_Fn&& _Fx, _Args&&... _Ax)
				{	// construct with _Fx(_Ax...)
				_Launch(&_Thr,
					_STD make_unique<tuple<decay_t<_Fn>, decay_t<_Args>...> >(
						_STD forward<_Fn>(_Fx), _STD forward<_Args>(_Ax)...));
				}
		
			//(2)
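			// Destructor. It first checks joinable(): if the thread has already been joined or detached, it does nothing.
			// If the object is still joinable when it is destroyed, std::terminate() is called and the whole program ends.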
			~thread() noexcept
				{	// clean up
				if (joinable())
					_STD terminate();
				}
		
			// move constructor
			thread(thread&& _Other) noexcept
				: _Thr(_Other._Thr)
				{	// move from _Other
				_Thr_set_null(_Other._Thr);
				}
		
			// move assignment
			thread& operator=(thread&& _Other) noexcept
				{	// move from _Other
				return (_Move_thread(_Other));
				}
		
			// Note: the copy constructor and copy assignment are deleted, so ownership of a thread can only be moved, never copied.
			thread(const thread&) = delete;
			thread& operator=(const thread&) = delete;
		
			void swap(thread& _Other) noexcept
				{	// swap with _Other
				_STD swap(_Thr, _Other._Thr);
				}
		
			//(3)
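			// joinable() really tests (thr._Id != 0): a thread object is joinable only while its thread id is non-zero.
			// After default construction the id is 0, so the object is not joinable. join() and detach() both call
			// _Thr_set_null(_Thr), which resets the id to 0 and makes the object non-joinable.
			// In other words, a given thread of execution can be joined or detached only once.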
			_NODISCARD bool joinable() const noexcept
				{	// return true if this thread can be joined
				return (!_Thr_is_null(_Thr));
				}
		
			void join();
		
			// checks joinable() first and throws if the thread is not joinable
			void detach()
				{	// detach thread
				if (!joinable())
					_Throw_Cpp_error(_INVALID_ARGUMENT);
				_Thrd_detachX(_Thr);
				_Thr_set_null(_Thr);
				}
		
			_NODISCARD id get_id() const noexcept;
		
			_NODISCARD static unsigned int hardware_concurrency() noexcept
				{	// return number of hardware thread contexts
				return (_Thrd_hardware_concurrency());
				}
		
			_NODISCARD native_handle_type native_handle()
				{	// return Win32 HANDLE as void *
				return (_Thr._Hnd);
				}
		
		private:
			thread& _Move_thread(thread& _Other)
				{	// move from _Other
				if (joinable())
					_STD terminate();
				_Thr = _Other._Thr;
				_Thr_set_null(_Other._Thr);
				return (*this);
				}
		
			_Thrd_t _Thr; // the only data member (see the definition at the top): holds the handle and the id
			};


		// A thin wrapper around _Thrd_id_t: it turns the plain unsigned int into a class type.
		class thread::id
			{	// thread id
		public:
			id() noexcept
				: _Id(0)
				{	// id for no thread
				}
		
			// helper used to implement std::cout << id
			template<class _Ch,
				class _Tr>
				basic_ostream<_Ch, _Tr>& _To_text(
					basic_ostream<_Ch, _Tr>& _Str)
				{	// insert representation into stream
				return (_Str << _Id);
				}
		
		private:
			id(_Thrd_id_t _Other_id)
				: _Id(_Other_id)
				{	// construct from unique id
				}
		
			_Thrd_id_t _Id;
		
			friend thread::id thread::get_id() const noexcept;
			friend thread::id this_thread::get_id() noexcept;
			friend bool operator==(thread::id _Left, thread::id _Right) noexcept;
			friend bool operator<(thread::id _Left, thread::id _Right) noexcept;
			friend hash<thread::id>;
			};

Notes (1), (2) and (3) in the source above are explained below:

1. Argument copying. Start with an example:


		int num = 0; // global variable
		void changeNum(const int& n)
		{
			int& m = const_cast<int&>(n);
			++m;
			std::cout << "in changeNum: m == " << m << std::endl;
			std::cout << "in changeNum: num == " << num << std::endl;
		}
		
		int main()
		{
			std::thread th(changeNum, num);
			th.join();
			std::cout << "in main: num = " << num << std::endl;
			return 0;
		}

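The global num is not changed by ++m in the child thread, even though changeNum takes its parameter by reference. The reason is that std::thread copies its arguments internally, no matter how the thread function declares its parameters. The code shows why: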


	template<class _Fn,
			class... _Args,
			class = enable_if_t<!is_same_v<remove_cv_t<remove_reference_t<_Fn>>, thread>>>
			explicit thread(_Fn&& _Fx, _Args&&... _Ax)
			{	// construct with _Fx(_Ax...)
			_Launch(&_Thr,
				_STD make_unique<tuple<decay_t<_Fn>, decay_t<_Args>...> >(
					_STD forward<_Fn>(_Fx), _STD forward<_Args>(_Ax)...));
			}

	template<class _Ty>
		struct decay
		{	// determines decayed version of _Ty
		using _Ty1 = remove_reference_t<_Ty>;
	
		using type = conditional_t<is_array_v<_Ty1>,
			add_pointer_t<remove_extent_t<_Ty1>>,
			conditional_t<is_function_v<_Ty1>,
				add_pointer_t<_Ty1>,
				remove_cv_t<_Ty1>>>;
		};
	
	
	template<class _Ty>
		using decay_t = typename decay<_Ty>::type;
	
	template<class _Ty,
		class... _Types,
		enable_if_t<!is_array_v<_Ty>, int> = 0>
		_NODISCARD inline unique_ptr<_Ty> make_unique(_Types&&... _Args)
		{	// make a unique_ptr
		return (unique_ptr<_Ty>(new _Ty(_STD forward<_Types>(_Args)...)));
		}


	template<class _This,
		class... _Rest>
		class tuple<_This, _Rest...>
			: private tuple<_Rest...>
		{	// recursive tuple definition
	public:
		typedef _This _This_type;
		typedef tuple<_Rest...> _Mybase;
	
		template<class _Tag,
			class _This2,
			class... _Rest2,
			enable_if_t<is_same_v<_Tag, _Exact_args_t>, int> = 0>
			constexpr tuple(_Tag, _This2&& _This_arg, _Rest2&&... _Rest_arg)
			: _Mybase(_Exact_args_t{}, _STD forward<_Rest2>(_Rest_arg)...),
				_Myfirst(_STD forward<_This2>(_This_arg))
			{	// construct from one arg per element
			}
	.........................

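As shown above, the arguments passed to std::thread, _Fx and _Ax, are first packed by std::make_unique into a unique_ptr that points to a tuple, and that pointer is then handed to _Launch. The key is the set of template arguments used to instantiate the tuple, decay_t<_Fn>, decay_t<_Args>...: they are what cause the arguments to be copied. First, what decay_t does: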

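decay_t<_Ty> works as follows.
It first strips any reference from _Ty, giving a new type _Ty1. Then:
a. If _Ty1 is an array type, it removes one extent and adds a pointer.
	e.g. if _Ty is int[5], then decay_t<_Ty> is int*.
b. If _Ty1 is not an array type: if it is a function type, the result is a pointer to that function; otherwise const and volatile are removed.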
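
These rules can be verified at compile time. The sketch below uses the standard std::decay_t (C++14) rather than the internal implementation quoted above; decay_rules_hold is a made-up helper name.

```cpp
#include <string>
#include <type_traits>

// Each static_assert mirrors one of the decay rules described above.
static_assert(std::is_same<std::decay_t<int&>, int>::value,
	"references are removed");
static_assert(std::is_same<std::decay_t<const std::string&>, std::string>::value,
	"reference and const are removed");
static_assert(std::is_same<std::decay_t<int[5]>, int*>::value,
	"an array decays to a pointer to its element type");
static_assert(std::is_same<std::decay_t<void(int)>, void(*)(int)>::value,
	"a function type decays to a function pointer");

// Hypothetical helper: if this file compiles, every rule above holds.
constexpr bool decay_rules_hold() { return true; }
```

If any of the four rules were wrong, the file would simply not compile.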

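A tuple<decay_t<_Fn>, decay_t<_Args>...> built this way holds every element by value. Each element is initialized from the forwarded argument, as in _Myfirst(_STD forward<_This2>(_This_arg)) in the member initializer list of the tuple constructor. For an lvalue argument this invokes the element's copy constructor (for an rvalue, its move constructor), and that is where the copy comes from.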

In short, the arguments passed to std::thread are copied (or moved) into a std::tuple that the new thread owns.
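
One way to observe the copy is to pass an object that counts its own copy constructions. This is only a sketch: CopyCounter and argument_is_copied are names invented for the illustration, and the exact number of moves an implementation performs is not assumed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical type for illustration: counts how often it is copy-constructed.
struct CopyCounter
{
	static std::atomic<int> copies;
	CopyCounter() = default;
	CopyCounter(const CopyCounter&) { ++copies; }
	CopyCounter(CopyCounter&&) noexcept {}
};
std::atomic<int> CopyCounter::copies{0};

// Returns true if the thread function received a different object
// than the one the caller passed in.
bool argument_is_copied()
{
	CopyCounter original;
	const CopyCounter* original_address = &original;
	bool same_object = true;

	// original is an lvalue, so std::thread stores its own copy of it,
	// even though the lambda takes its parameter by const reference.
	std::thread th(
		[&same_object, original_address](const CopyCounter& c) {
			same_object = (&c == original_address);
		},
		original);
	th.join(); // join() synchronizes, so reading same_object afterwards is safe

	assert(CopyCounter::copies >= 1); // at least one copy was made
	return !same_object;
}
```

The reference parameter binds to the stored copy, not to original, which is why changes made through it never reach the caller's variable.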

While we are here, consider a counter-example that does not copy: the implementation of std::call_once. Its prototype and purpose are:

std::call_once
C++ Thread support library
Defined in header <mutex>

template< class Callable, class... Args >
void call_once( std::once_flag& flag, Callable&& f, Args&&... args );
(since C++11)

Executes the Callable object f exactly once, even if called concurrently, from several threads.
call_once invokes std::forward<Callable>(f) with the arguments std::forward<Args>(args)... (as if by std::invoke). Unlike the std::thread constructor or std::async, the arguments are not moved or copied because they don't need to be transferred to another thread of execution.

std::call_once guarantees that the function f passed to it runs only once (typically for initialization), even when several threads call it concurrently.

Its basic usage looks like this:


	std::once_flag flag1;
	int num = 0;
	void simple_do_once()
	{
		std::call_once(flag1, [](int& x) { 
			++x;
			std::cout << "Simple example: called once\n"; },num);
	}
	
	int main()
	{
		// Four threads run simple_do_once, but the function passed to call_once executes only once,
		// and the global variable num really is modified.
		std::thread st1(simple_do_once);
		std::thread st2(simple_do_once);
		std::thread st3(simple_do_once);
		std::thread st4(simple_do_once);
		st1.join();
		st2.join();
		st3.join();
		st4.join();
		std::cout << "num = " << num << std::endl;
		return 0;
	}

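The program prints "Simple example: called once" exactly once, followed by num = 1.

Earlier we saw that std::thread copies its arguments internally, yet here the argument is clearly not copied. Why? Look at the code: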


	template<class _Fn,
		class... _Args> inline
		void (call_once)(once_flag& _Flag, _Fn&& _Fx, _Args&&... _Ax)
		{	// call _Fx(_Ax...) once
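		// The key is here: the tuple is no longer instantiated with decay_t<_Fn> but with _Fn&&.
		// By the reference-collapsing rules behind perfect forwarding, an lvalue argument is stored
		// as an lvalue reference and an rvalue argument as an rvalue reference, so nothing is copied.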
		typedef tuple<_Fn&&, _Args&&..., exception_ptr&> _Tuple;
		typedef make_index_sequence<1 + sizeof...(_Args)> _Seq;
	
		exception_ptr _Exc;
		_Tuple _Tup(_STD forward<_Fn>(_Fx), _STD forward<_Args>(_Ax)..., _Exc);
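
The difference between the two tuple instantiations can be reproduced with a plain int. The following is a minimal sketch of the idea, not the library's own code; tuple_reference_vs_copy is a made-up helper name.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical helper: contrasts a tuple of references with a tuple of decayed types.
int tuple_reference_vs_copy()
{
	int value = 0;

	// call_once style: the element type is a reference, so it aliases value.
	std::tuple<int&> by_reference(value);
	++std::get<0>(by_reference);       // value itself becomes 1

	// std::thread style: the element type is decayed, so the tuple owns a copy.
	std::tuple<std::decay_t<int&>> by_copy(value);
	++std::get<0>(by_copy);            // only the copy becomes 2

	assert(std::get<0>(by_copy) == 2);
	return value;                      // still 1
}
```

The function returns 1: the increment through the reference tuple reached value, while the increment through the decayed tuple did not.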

That covers how std::thread copies its arguments. But what if the function passed to std::thread really does need reference parameters? Look at this code:


	void changeNum(int m, int& n, std::string& s)
	{
		++m;
		++n;
		s.append("__append");
	}
	
	int main()
	{
		int num_1 = 0;
		int num_2 = 10;
		std::string str{ "Hello" };
		// wrap every argument that must be passed by reference in std::ref
		std::thread th(changeNum, num_1, std::ref(num_2), std::ref(str));
		th.join();
		std::cout << "in main: num_1 = " << num_1 << std::endl;
		std::cout << "in main: num_2 = " << num_2 << std::endl;
		std::cout << "in main: str = " << str << std::endl;
		return 0;
	}

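This prints num_1 = 0, num_2 = 11 and str = Hello__append: only the arguments wrapped in std::ref are modified.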

So wrapping the arguments in std::ref produces the result we wanted. Why, and how does std::ref work?


		// FUNCTION TEMPLATES ref AND cref
	template<class _Ty>
		_NODISCARD inline reference_wrapper<_Ty> ref(_Ty& _Val) noexcept
		{	// create reference_wrapper<_Ty> object
		return (reference_wrapper<_Ty>(_Val));
		}
	
	template<class _Ty>
		void ref(const _Ty&&) = delete;
	
	template<class _Ty>
		_NODISCARD inline reference_wrapper<_Ty> ref(reference_wrapper<_Ty> _Val) noexcept
		{	// create reference_wrapper<_Ty> object
		return (_STD ref(_Val.get()));
		}

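This function has three overloads, from top to bottom: one that takes an lvalue reference, a deleted one that takes an rvalue reference, and one that takes an existing reference_wrapper and produces a new reference_wrapper that still refers to the same outer lvalue. Our example uses the first, lvalue overload. Why is the rvalue overload deleted? The purpose of std::ref is to let the callee modify a variable so that the rest of the program can see the change. A change made to a temporary could never be observed elsewhere, and the wrapper would be left pointing at an object that no longer exists once the temporary is destroyed, so that overload is simply deleted.
As the code shows, std::ref wraps _Val in a reference_wrapper object, so let's see how reference_wrapper is implemented: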


	template<class _Ty>
		class reference_wrapper
			: public _Weak_types<_Ty>::type
		{	// stand-in for an assignable reference
	public:
		static_assert(is_object_v<_Ty> || is_function_v<_Ty>,
			"reference_wrapper<T> requires T to be an object type or a function type.");
	
		using type = _Ty;
	
		template<class _Uty,
			enable_if_t<conjunction_v<
				negation<is_same<remove_cv_t<remove_reference_t<_Uty>>, reference_wrapper>>,
				_Refwrap_has_ctor_from<_Ty, _Uty>>, int> = 0>
			reference_wrapper(_Uty&& _Val)
				_NOEXCEPT_COND(_NOEXCEPT_OPER(_Refwrap_ctor_fun<_Ty>(_STD declval<_Uty>())))
				{	// construct
				_Ty& _Ref = _STD forward<_Uty>(_Val);
				_Ptr = _STD addressof(_Ref);
				}
	
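		// Note this conversion function: at the point where the target function is actually called,
		// this implicit conversion lets its reference parameter bind to the outer variable. This is arguably the core of the class.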
		operator _Ty&() const noexcept
			{	// return reference
			return (*_Ptr);
			}
	
		_NODISCARD _Ty& get() const noexcept
			{	// return reference
			return (*_Ptr);
			}
	
		template<class... _Types>
			auto operator()(_Types&&... _Args) const
			-> decltype(_STD invoke(get(), _STD forward<_Types>(_Args)...))
			{	// invoke object/function
			return (_STD invoke(get(), _STD forward<_Types>(_Args)...));
			}
	
	private:
		_Ty * _Ptr; // the wrapper actually stores the address of the outer variable
		};

That explains what std::ref does. There is also std::cref, which works the same way except that reference_wrapper is instantiated with const T.
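
The behaviour of the wrapper can also be tried without any thread involved. A small sketch using only std::ref, std::cref and std::reference_wrapper from <functional>; add_one and reference_wrapper_demo are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical callee that expects a real reference.
void add_one(int& n) { ++n; }

int reference_wrapper_demo()
{
	int value = 41;

	auto wrapped = std::ref(value);    // std::reference_wrapper<int>, stores &value
	add_one(wrapped);                  // implicit conversion to int&: value becomes 42

	auto copy = wrapped;               // copying the wrapper copies the pointer, not the int
	copy.get() += 1;                   // value becomes 43

	auto read_only = std::cref(value); // std::reference_wrapper<const int>
	static_assert(std::is_same<decltype(read_only),
		std::reference_wrapper<const int>>::value, "cref wraps a const T");
	assert(read_only.get() == 43);

	// std::ref(42);                   // would not compile: the rvalue overload is deleted
	return value;
}
```

Because the wrapper is itself an ordinary copyable object, std::thread can decay-copy it into its tuple like any other argument, while the pointer inside still refers to the caller's variable.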

2. The std::thread destructor. Start with an example:


	void thr_fun()
	{
		while (true)
		{
			using namespace std::chrono_literals;
			std::this_thread::sleep_for(1s);
			std::cout << "thread-id: " << std::this_thread::get_id()<< std::endl;
		}
	}
	int main()
	{
		std::thread thr{thr_fun};
		std::this_thread::sleep_for(std::chrono::seconds(3));
		// deliberately not joining the child thread here
		return 0;
	}

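Running this ends with a "Debug Error!" dialog. The cause is that main returns while thr is still joinable: leaving main destroys the std::thread object, its destructor sees that joinable() is true and calls terminate(), which ends the program. So always join or detach a thread before its std::thread object is destroyed.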
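
A common way to avoid this is to let a small RAII object do the join. The class below is a sketch under that assumption: JoinGuard is a hypothetical name, not a C++11 library type (C++20 later added std::jthread, which joins in its destructor).

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical RAII wrapper: joins the owned thread in its destructor
// if the thread is still joinable.
class JoinGuard
{
public:
	explicit JoinGuard(std::thread t) : thread_(std::move(t)) {}
	~JoinGuard()
	{
		if (thread_.joinable())
			thread_.join();
	}
	JoinGuard(const JoinGuard&) = delete;
	JoinGuard& operator=(const JoinGuard&) = delete;

private:
	std::thread thread_;
};

// Returns true if the worker thread has finished by the time the scope is left.
bool guarded_thread_finishes()
{
	std::atomic<bool> done{false};
	{
		JoinGuard guard{std::thread([&done] { done = true; })};
		// no explicit join() here
	} // guard's destructor joins, so std::terminate is not triggered
	return done;
}
```

Note that a guard like this would block forever on the endless loop in thr_fun above; it only helps for threads that eventually return.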

3. Whether a thread is joinable. Start with an example:


	// mutex that serializes console output
	std::mutex io_mutex;
	void thr_fun()
	{
		using namespace std::chrono_literals;
		std::this_thread::sleep_for(1s);
		std::lock_guard<std::mutex> lk(io_mutex);
		std::cout << "thread-id: " << std::this_thread::get_id() << std::endl;
	}
	int main()
	{
		std::thread thr_1{ thr_fun };
		std::thread thr_2{ thr_fun };
		std::thread thr_3;
		thr_1.join();
		//thr_2.detach();
		thr_2.join();
		std::cout << "thr_1 thread: " << thr_1.joinable() << std::endl;
		std::cout << "thr_2 thread: " << thr_2.joinable() << std::endl;
		std::cout << "thr_3 thread: " << thr_3.joinable() << std::endl;
		return 0;
	}

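All three joinable() calls print 0: thr_1 and thr_2 have already been joined, and thr_3 was default-constructed.

Together with the source shown earlier, this example shows that a std::thread is joinable exactly when its associated id is non-zero.
Default construction leaves the id at 0, and join() and detach() both reset it to 0. join() and detach() also check joinable() first and call _Throw_Cpp_error(_INVALID_ARGUMENT) if the check fails.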
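
From the caller's side, that error surfaces as a std::system_error. A minimal sketch of joining the same thread twice; join_twice_throws is a made-up helper name.

```cpp
#include <cassert>
#include <system_error>
#include <thread>

// Returns true if the second join() reports invalid_argument.
bool join_twice_throws()
{
	std::thread t([] {});
	t.join();                  // first join succeeds; t is now non-joinable
	assert(!t.joinable());

	try
	{
		t.join();              // second join on a non-joinable object
	}
	catch (const std::system_error& e)
	{
		return e.code() == std::errc::invalid_argument;
	}
	return false;
}
```

Checking joinable() before calling join() or detach() avoids the exception altogether.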

That concludes point 2.

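Point 3:
a. The return value of the thread function is ignored.
answer: As with Thread in Java, the return value of the function cannot be obtained from std::thread; it simply provides no interface for that, as the source shown earlier confirms. C++11 does offer higher-level facilities, similar to Java's Callable and FutureTask, for communicating between threads; those are covered later.

b. If the thread function throws an exception, std::terminate is called and the program ends.
answer: The code speaks for itself: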


	void thr_fun(bool flag)
	{
		if (flag == true)
			throw std::runtime_error("thr_fun failed"); // needs <stdexcept>; the exception escapes the thread function, so std::terminate is called
		std::cout << "thread-id: " << std::this_thread::get_id() << std::endl;
	}
	int main()
	{
		std::thread thr_1(thr_fun,true);
		thr_1.join();
		return 0;
	}
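
Both halves of point 3 can be worked around with std::promise and std::future, which the quoted description mentions. The sketch below is one possible arrangement, not the only one; divide and run_divide are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical worker: reports either a value or an exception through the promise.
void divide(std::promise<int> result, int a, int b)
{
	try
	{
		if (b == 0)
			throw std::invalid_argument("division by zero");
		result.set_value(a / b);
	}
	catch (...)
	{
		// Caught inside the thread, so nothing escapes and std::terminate is not called.
		result.set_exception(std::current_exception());
	}
}

// Runs divide on its own thread and hands the outcome back to the caller.
int run_divide(int a, int b)
{
	std::promise<int> result;
	std::future<int> answer = result.get_future();
	std::thread th(divide, std::move(result), a, b); // a promise is move-only
	th.join();
	return answer.get(); // returns the value or rethrows the stored exception
}
```

run_divide(10, 2) returns 5, while run_divide(1, 0) rethrows the std::invalid_argument in the calling thread instead of terminating the program.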

That is all for this post. The next one covers thread synchronization in C++11, mainly:

Mutex types:
std::mutex, std::timed_mutex, std::shared_mutex, std::shared_timed_mutex
Condition variables:
std::condition_variable, std::cv_status
Lock management (RAII):
std::lock_guard, std::scoped_lock, std::unique_lock, std::shared_lock
Locking strategies:
std::defer_lock_t, std::try_to_lock_t, std::adopt_lock_t
Generic locking algorithms:
std::lock, std::try_lock
Futures and promises:
std::promise, std::packaged_task, std::future, std::async, std::future_status

This is my first article, so please bear with me!
