Trying out cluster mode in PP (Parallel Python), a distributed computing framework for Python

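【Background】

This post uses the Parallel Python library (PP for short).
The tutorials I found online only test it in single-machine, multi-process mode, so I decided to see how it performs for distributed computation on a cluster.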


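【Conclusion】
I ran a distributed computation on a cluster of two physical machines, one with 4 cores and one with 2 cores.
The final speedup was about 5.1, computed as (60 + 27) / 17 ≈ 5.1: the job time summed over both nodes (about 60 s on the local machine plus 27 s on the remote one) divided by roughly 17 s of wall-clock time. That is equivalent to about 5.1 CPU cores, so the CPUs of both machines were used fairly fully.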

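【Procedure】

The setup guide is here:
http://www.parallelpython.com/content/view/15/30/#QUICKCLUSTERS

Each compute node needs Python and the pp library installed.
Running pip install pp installs the pp library.
After installation there is a ppserver.py in the scripts directory. Start it with:
python ppserver.py  -p 35000 -i 192.168.1.104 -s "123456"
This command makes the compute node listen on port 35000, with 123456 as the secret (password) and 192.168.1.104 as the node's own IP address.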
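
Before pointing the master node at a compute node, it can help to confirm that the port is reachable at all. The following is a minimal sketch using only the standard library. The helper name node_is_reachable is hypothetical and not part of pp, and the check only proves that a TCP connection can be opened; it does not verify the secret.

```python
import socket


def node_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the command above running on the compute node, calling node_is_reachable("192.168.1.104", 35000) from the master node should return True. A False result usually points at a firewall, a wrong interface address, or a ppserver.py that is not running.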

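When the master node creates its pp.Server instance, it must be given the list of compute node addresses (ip:port) and the same secret, so that the two sides match.

When the master node runs jobs, it sends them to the compute nodes using dynamic load balancing.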
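
To make the idea of dynamic load balancing concrete, here is a toy model in which each job goes to whichever worker becomes free first. This is an illustrative simplification and not pp's actual scheduler. The function name simulate_dynamic_dispatch, the job costs, and the worker speeds are all assumptions chosen for the example.

```python
import heapq


def simulate_dynamic_dispatch(job_costs, worker_speeds):
    """Greedy model: hand each job to the worker that is free earliest.

    Returns (jobs_per_worker, makespan). Ties go to the lower worker index.
    """
    # Heap of (time_when_free, worker_index).
    free_at = [(0.0, i) for i in range(len(worker_speeds))]
    heapq.heapify(free_at)
    counts = [0] * len(worker_speeds)
    makespan = 0.0
    for cost in job_costs:
        start, i = heapq.heappop(free_at)
        finish = start + cost / worker_speeds[i]
        counts[i] += 1
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, i))
    return counts, makespan


# 16 equal jobs on 6 equally fast workers (think 4 local + 2 remote cores).
counts, makespan = simulate_dynamic_dispatch([1.0] * 16, [1.0] * 6)
print(counts, makespan)
```

In this model no worker is assigned a fixed share up front, so a faster or earlier-free worker simply ends up taking more jobs. Real runs also include network transfer and uneven job times, which the sketch ignores.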


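I tried the first demo provided on the official site:
http://www.parallelpython.com/content/view/17/31/

I made the test inputs larger and ran the distributed computation on two physical machines, one with 4 cores and one with 2 cores.
The final speedup was about 5.1, computed as (60 + 27) / 17 ≈ 5.1. That is equivalent to about 5.1 CPU cores, so the CPUs of both machines were used fairly fully. The script I ran: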

#!/usr/bin/python
# File: sum_primes.py
# Author: Vitalii Vanovschi
# Desc: This program demonstrates parallel computations with pp module
# It calculates the sum of prime numbers below a given integer in parallel
# Parallel Python Software: http://www.parallelpython.com
 
import math, sys, time, datetime
import pp
 
def isprime(n):
    """Returns True if n is prime and False otherwise"""
    if not isinstance(n, int):
        raise TypeError("argument passed to is_prime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True
 
def sum_primes(n):
    """Calculates sum of all primes below given integer n"""
    return sum([x for x in xrange(2,n) if isprime(x)])
 
print """Usage: python sum_primes.py [ncpus]
    [ncpus] - the number of workers to run in parallel, 
    if omitted it will be set to the number of processors in the system
"""
 
# tuple of all parallel python servers to connect with
#ppservers = ()
ppservers = ("192.168.1.104:35000",)
#ppservers=("*",)
 
if len(sys.argv) > 1:
    ncpus = int(sys.argv[1])
    # Creates jobserver with ncpus workers
    job_server = pp.Server(ncpus, ppservers=ppservers, secret="123456")
else:
    # Creates jobserver with automatically detected number of workers
    job_server = pp.Server(ppservers=ppservers, secret="123456")
 
print "Starting pp with", job_server.get_ncpus(), "workers"
 
# Submit a job of calculating sum_primes(100) for execution.
# sum_primes - the function
# (100,) - tuple with arguments for sum_primes
# (isprime,) - tuple with functions on which function sum_primes depends
# ("math",) - tuple with module names which must be imported before sum_primes execution
# Execution starts as soon as one of the workers will become available
job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))
 
# Retrieves the result calculated by job1
# The value of job1() is the same as sum_primes(100)
# If the job has not been finished yet, execution will wait here until result is available
result = job1()
 
print "Sum of primes below 100 is", result
 
start_time = time.time()
 
# The following submits 16 jobs and then retrieves the results
inputs = (500000, 500100, 500200, 500300, 500400, 500500, 500600, 500700, 500000, 500100, 500200, 500300, 500400, 500500, 500600, 500700)
#inputs = (1000000, 1000100, 1000200, 1000300, 1000400, 1000500, 1000600, 1000700)
jobs = [(input, job_server.submit(sum_primes,(input,), (isprime,), ("math",))) for input in inputs]
for input, job in jobs:
    print datetime.datetime.now()
    print "Sum of primes below", input, "is", job()
 
print "Time elapsed: ", time.time() - start_time, "s"
job_server.print_stats()
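
The script above is written for Python 2 (print statements, xrange). As a cross-check that does not depend on pp, the two functions can be rewritten serially in Python 3 syntax. This is a sketch for verifying results only. It replaces the math.sqrt bound with an equivalent i * i <= n loop, which is my own simplification and not the demo's code.

```python
def isprime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def sum_primes(n):
    """Return the sum of all primes strictly below n."""
    return sum(x for x in range(2, n) if isprime(x))


print(sum_primes(100))  # 1060, the value shown in the recorded run output
```

The larger inputs from the script can be checked the same way against the recorded output, although a serial run takes noticeably longer.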


Run output:

c:\Python27\python.exe test_pp_official.py
Usage: python sum_primes.py [ncpus]
    [ncpus] - the number of workers to run in parallel,
    if omitted it will be set to the number of processors in the system

Starting pp with 4 workers
Sum of primes below 100 is 1060
2016-08-28 19:07:26.579000
Sum of primes below 500000 is 9914236195
2016-08-28 19:07:33.032000
Sum of primes below 500100 is 9917236483
2016-08-28 19:07:33.035000
Sum of primes below 500200 is 9922237979
2016-08-28 19:07:33.296000
Sum of primes below 500300 is 9926740220
2016-08-28 19:07:33.552000
Sum of primes below 500400 is 9930743046
2016-08-28 19:07:33.821000
Sum of primes below 500500 is 9934746636
2016-08-28 19:07:34.061000
Sum of primes below 500600 is 9938250425
2016-08-28 19:07:37.199000
Sum of primes below 500700 is 9941254397
2016-08-28 19:07:37.202000
Sum of primes below 500000 is 9914236195
2016-08-28 19:07:41.640000
Sum of primes below 500100 is 9917236483
2016-08-28 19:07:41.742000
Sum of primes below 500200 is 9922237979
2016-08-28 19:07:41.746000
Sum of primes below 500300 is 9926740220
2016-08-28 19:07:41.749000
Sum of primes below 500400 is 9930743046
2016-08-28 19:07:41.752000
Sum of primes below 500500 is 9934746636
2016-08-28 19:07:41.756000
Sum of primes below 500600 is 9938250425
2016-08-28 19:07:43.846000
Sum of primes below 500700 is 9941254397
Time elapsed:  17.2770001888 s
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         6 |         35.29 |      27.4460 |     4.574333 | 192.168.1.104:35000
        11 |         64.71 |      60.2950 |     5.481364 | local
Time elapsed since server creation 17.2849998474
0 active tasks, 4 cores
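
The speedup of about 5.1 quoted in the conclusion can be recomputed from the statistics table above. The sketch below parses the two data rows and divides the summed job time by the elapsed wall-clock time. The helper names parse_job_stats and speedup are hypothetical, and the parser assumes the pipe-separated layout exactly as printed here.

```python
def parse_job_stats(table_text):
    """Return {server: (job_count, job_time_sum)} from the stats rows."""
    stats = {}
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows have 5 cells and start with an integer job count;
        # this skips the header row.
        if len(cells) != 5 or not cells[0].isdigit():
            continue
        stats[cells[4]] = (int(cells[0]), float(cells[2]))
    return stats


def speedup(stats, wall_seconds):
    """Summed job time across all servers divided by wall-clock time."""
    total_job_time = sum(t for _, t in stats.values())
    return total_job_time / wall_seconds


TABLE = """
 job count | % of all jobs | job time sum | time per job | job server
         6 |         35.29 |      27.4460 |     4.574333 | 192.168.1.104:35000
        11 |         64.71 |      60.2950 |     5.481364 | local
"""

stats = parse_job_stats(TABLE)
print(round(speedup(stats, 17.277), 1))  # 5.1
```

Strictly speaking this ratio is the average number of busy cores during the run. It equals the speedup only under the assumption that a serial run would take the sum of the individual job times.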


————————————————
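Copyright notice: This is an original article by CSDN blogger "delacroix_xu", published under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/delacroix_xu/article/details/52347391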
