TensorFlow: Using the Profiling Tool to Check Variable Placement

Original

2018-09-17 03:15

I found a better way: you only need to set the session config:

import tensorflow as tf

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

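The console output then shows which device each variable and op was placed on. With allow_soft_placement set to True, TensorFlow may also fall back to the CPU for variables and ops that were requested on the GPU but cannot be placed there.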
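As a minimal sketch of how this config can be combined with explicit placement, the snippet below pins the embedding table to the CPU with tf.device and keeps placement logging on. The tf.compat.v1 alias is an assumption so that the snippet also runs on TensorFlow 2.x; on the TensorFlow 1.x release this post was written against, the plain tf.* names from the config above would be used instead. The shapes are the toy values from the program later in this post.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow version that provides compat.v1
tf1.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Pin the (potentially large) embedding table to the CPU explicitly.
    with tf.device('/cpu:0'):
        embedding_var = tf1.get_variable('embedding_var', shape=[200, 2], dtype=tf.float32)
    # Leave the small weight matrix to automatic placement.
    w = tf1.get_variable('w', shape=[2, 10], dtype=tf.float32)
    out = tf.matmul(tf.nn.embedding_lookup(embedding_var, [0]), w)

    config = tf1.ConfigProto(allow_soft_placement=True, log_device_placement=True)
    config.gpu_options.allow_growth = True
    with tf1.Session(config=config) as sess:
        sess.run(tf1.global_variables_initializer())
        result = sess.run(out)  # placement lines are logged to the console

# For an explicitly placed variable, the requested device is recorded on the variable.
print('device of embedding_var: %s' % embedding_var.device)
```

The placement log is the more reliable check, since the device attribute only reflects what was requested explicitly, not what the placer chose for unpinned variables.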

The original post follows:

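When TensorFlow places variables automatically, it is not obvious whether a given variable ends up on the CPU or the GPU.
Usually this is not a problem, but when I tried to allocate a large variable, I ran out of memory!
Only then did I discover that the embedding variable had been placed on the GPU.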

A search showed that TensorFlow ships with a profiling tool; see: [url]http://stackoverflow.com/questions/37751739/tensorflow-code-optimization-strategy[/url]

The log produced by the program at the end of this post looks like this:
[img]http://dl2.iteye.com/upload/attachment/0124/0510/e620e71e-7fcc-310c-9293-c2d7f05e00ed.png[/img]
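The trace file written by the program at the end of this post is plain JSON in the Chrome trace event format, so it can also be inspected without opening Chrome. The following is a rough sketch, not part of the original program: ops_by_device is a hypothetical helper, and it assumes that each timed event carries the graph node name under args['name'] and that process labels correspond to devices. If your trace is laid out differently, adjust the lookups.

```python
import json
from collections import defaultdict


def ops_by_device(trace_path):
    """Group node names by the process label found in a Chrome trace file."""
    with open(trace_path) as f:
        events = json.load(f)['traceEvents']

    # Metadata events ('ph' == 'M') named 'process_name' map a pid to a readable label.
    pid_to_label = {
        e['pid']: e['args']['name']
        for e in events
        if e.get('ph') == 'M' and e.get('name') == 'process_name'
    }

    placement = defaultdict(set)
    for e in events:
        if e.get('ph') != 'X':  # 'X' marks a complete (timed) event
            continue
        # Assumption: the node name is stored under args['name'];
        # fall back to the event name if it is missing.
        node = e.get('args', {}).get('name', e.get('name'))
        label = pid_to_label.get(e['pid'], 'pid %s' % e['pid'])
        placement[label].add(node)
    return {label: sorted(nodes) for label, nodes in placement.items()}

# Usage, with the path used by the program in this post:
#     for label, nodes in sorted(ops_by_device('/tmp/timeline.json').items()):
#         print(label, nodes)
```

Any label mentioning a GPU that lists an embedding_var node suggests the table was touched on the GPU during the traced step.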

Without further ado, here is the code:

# coding=utf-8
'''
Test TensorFlow's profiling tool.
The tool can also be used to check where variables are placed.

Reference: http://stackoverflow.com/questions/37751739/tensorflow-code-optimization-strategy

Created on Mar 30, 2017
@author: colinliang
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from tensorflow.python.client import timeline
if __name__ == '__main__':
    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
    run_metadata = tf.RunMetadata()
    sess = tf.Session()
    print(u'The CUPTI .so files have moved, so running this program as-is raises an exception! See http://blog.csdn.net/rtygbwwwerr/article/details/51605835')
    print(u'Workaround: I copied the 3 corresponding files from /usr/local/cuda/extras/CUPTI/lib64 into /usr/local/cuda/lib64')
    from tensorflow.python.ops import partitioned_variables
    partitioner = partitioned_variables.variable_axis_size_partitioner(max_shard_bytes=512)  # note: this is a size in bytes, not a number of floats
    dim = 2

#     with tf.device('/cpu:0'):
#         embedding_var=tf.get_variable('embedding_var', shape=[600,dim],partitioner=partitioner, dtype=tf.float32)
    embedding_var = tf.get_variable('embedding_var', shape=[200, dim], partitioner=partitioner, dtype=tf.float32) 

    w = tf.get_variable('w', shape=[dim, 10], dtype=tf.float32)

#     tf.PartitionedVariable  # with a partitioner, the variable is not a tf.Variable but a tf.PartitionedVariable made up of a group of tf.Variable objects
    sess.run(tf.global_variables_initializer())
    print('embedding_var: %s' % embedding_var)
#     print('device of embedding_var: %s'% embedding_var.device)  # for a single variable the device can be printed like this, but it only works for variables whose device was specified explicitly

    r = tf.nn.embedding_lookup(embedding_var, [0])
    r = tf.matmul(r, w)
    print(r)
    sess.run(r, options=run_options, run_metadata=run_metadata)

#     print(run_metadata)
    tl = timeline.Timeline(run_metadata.step_stats)
    ctf = tl.generate_chrome_trace_format()
    tracing_log='/tmp/timeline.json'
    with open(tracing_log, 'w') as f:
        f.write(ctf)
    print('DONE')
    print('Open chrome://tracing in Chrome, then load %s to view the run log' % tracing_log)
    exit(0)
    #############################################


'''
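    Part of the output of print(run_metadata) is shown below.
    embedding_var was partitioned into shards of 400 bytes each (50 x 2 float32 values), not the 512 bytes we specified; max_shard_bytes is only an upper bound.
    allocator_name is "gpu_bfc", which means the variable was allocated on the GPU!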
'''

#     node_stats {
#       node_name: "embedding_var/part_0"
#       all_start_micros: 1490861057718593
#       op_end_rel_micros: 2
#       all_end_rel_micros: 6
#       memory {
#         allocator_name: "gpu_bfc"
#       }
#       output {
#         tensor_description {
#           dtype: DT_FLOAT
#           shape {
#             dim {
#               size: 50
#             }
#             dim {
#               size: 2
#             }
#           }
#           allocation_description {
#             requested_bytes: 400
#             allocated_bytes: 512
#             allocator_name: "gpu_bfc"
#             allocation_id: 24
#             has_single_reference: true
#             ptr: 1117060600064
#           }
#         }
#       }
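The 400-byte shards in the dump above follow from simple arithmetic. The sketch below mirrors, as far as I understand it, the calculation that variable_axis_size_partitioner performs; plan_axis_shards is a hypothetical helper written for illustration, not a TensorFlow API, and element_size=4 assumes float32.

```python
import math


def plan_axis_shards(shape, max_shard_bytes, element_size=4, axis=0):
    """Return (num_shards, slices_per_shard_list) for splitting `shape` along `axis`."""
    num_elements = 1
    for d in shape:
        num_elements *= d
    # Bytes occupied by one slice along the partitioned axis (one row for axis=0).
    bytes_per_slice = num_elements // shape[axis] * element_size
    # At least one slice per shard, even if a single slice exceeds the limit.
    slices_per_shard = max(1, max_shard_bytes // bytes_per_slice)
    num_shards = int(math.ceil(shape[axis] / float(slices_per_shard)))
    # Spread the slices as evenly as possible; the first `extra` shards get one more.
    base, extra = divmod(shape[axis], num_shards)
    sizes = [base + 1 if i < extra else base for i in range(num_shards)]
    return num_shards, sizes


# The values from the program in this post: shape [200, 2], float32, 512-byte limit.
num_shards, sizes = plan_axis_shards([200, 2], max_shard_bytes=512)
shard_bytes = [rows * 2 * 4 for rows in sizes]
print(num_shards, sizes, shard_bytes)
```

At most 64 rows of 8 bytes fit under the 512-byte limit, so 200 rows need 4 shards, and splitting them evenly gives 50 rows, or 400 bytes, per shard. That agrees with the `size: 50` and `requested_bytes: 400` fields in the dump.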
