ShuffleNetV2 flower recognition model quantization: dump fails after loading pretrained weights

Following the method the instructor gave earlier (link below), the dump operation used to run without problems.

  • Below is a screenshot of the operation succeeding on March 12.



  • Running the same operation now produces the error below. I have tried both versions 1.2 and 1.3.
   28 03:30:44[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: caught deadly signal 11(Segmentation fault)
28 03:30:44[mgb] ERR 
backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f0a59bb8980]
python3(PyList_Size+0x4) [0x558213b44b74]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c61551) [0x7f09c1761551]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c5774e) [0x7f09c175774e]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c58256) [0x7f09c1758256]
python3(_PyMethodDef_RawFastCallKeywords+0xe9) [0x558213b84579]
python3(_PyCFunction_FastCallKeywords+0x21) [0x558213b84811]
python3(_PyEval_EvalFrameDefault+0x4834) [0x558213bf8d94]
python3(_PyEval_EvalCodeWithName+0x2e8) [0x558213b31ef8]
python3(_PyFunction_FastCallKeywords+0x325) [0x558213b83ea5]

28 03:30:44[mgb] ERR caught exception in comp node finalize; what(): assertion `m_allocated.size() == m_free.size()' failed at /home/code/src/core/impl/comp_node/comp_node.cpp:120: void mgb::CompNode::EventPool::assert_all_freed()

backtrace:
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c05be6) [0x7f09c1705be6]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb15__assert_fail__EPKciS1_S1_S1_z+0x10f) [0x7f09c17d424f]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1cd7fe8) [0x7f09c17d7fe8]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c77e11) [0x7f09c1777e11]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb22CompNodeDepedentObject8callbackEv+0x4b) [0x7f09c17d87cb]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb16comp_node_detail15DepedentObjList25invoke_callback_and_cleanEv+0xb0) [0x7f09c17d8a50]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb8CompNode8finalizeEv+0x9) [0x7f09c17d8e89]
/lib/x86_64-linux-gnu/libc.so.6(+0x43161) [0x7f0a597f8161]
/lib/x86_64-linux-gnu/libc.so.6(+0x4325a) [0x7f0a597f825a]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da668d) [0x7f09c18a668d]

28 03:30:44[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: std::terminate() called
28 03:30:44[mgb] ERR 
backtrace:
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da66de) [0x7f09c18a66de]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libstdc++.so.6(+0xacf69) [0x7f09bde89f69]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libstdc++.so.6(+0xacfab) [0x7f09bde89fab]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x18579df) [0x7f09c13579df]
/lib/x86_64-linux-gnu/libc.so.6(+0x43161) [0x7f0a597f8161]
/lib/x86_64-linux-gnu/libc.so.6(+0x4325a) [0x7f0a597f825a]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da668d) [0x7f09c18a668d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f0a59bb8980]
python3(PyList_Size+0x4) [0x558213b44b74]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c61551) [0x7f09c1761551]

28 03:30:44[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: std::terminate() called
28 03:30:44[mgb] ERR 
backtrace:
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da66de) [0x7f09c18a66de]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libstdc++.so.6(+0xacf69) [0x7f09bde89f69]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libstdc++.so.6(+0xac3c7) [0x7f09bde893c7]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libstdc++.so.6(__gxx_personality_v0+0x348) [0x7f09bde89bfa]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libgcc_s.so.1(+0xcadc) [0x7f0a55bc6adc]
/home/megstudio/.miniconda/envs/xuan/bin/../lib/libgcc_s.so.1(_Unwind_Resume+0x62) [0x7f0a55bc6f7a]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x185cacb) [0x7f09c135cacb]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1cd7fe8) [0x7f09c17d7fe8]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c78549) [0x7f09c1778549]
/lib/x86_64-linux-gnu/libc.so.6(+0x43161) [0x7f0a597f8161]
  • Could you tell me what is causing this?

I tried this in the same environment and the dump works normally.

Please provide the Python traceback for the error above.
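One possible way to capture the Python-level traceback requested here is the standard-library `faulthandler` module, which writes the Python stack of each thread when the process receives a fatal signal such as SIGSEGV. The sketch below is an assumption about how it could be applied to this case, not something confirmed in this thread: the log file name `dump_crash.log` and the helper `enable_crash_traceback` are hypothetical. Because MegEngine prints its own crash messages, whether the faulthandler output still appears may depend on the order in which the handlers are installed.

```python
import faulthandler

# Hypothetical log path; any writable location works.
CRASH_LOG_PATH = "dump_crash.log"


def enable_crash_traceback(path=CRASH_LOG_PATH):
    """Ask Python to write the Python stack to `path` on fatal signals (e.g. SIGSEGV)."""
    # faulthandler keeps using this file object until it is disabled,
    # so the file is returned to the caller instead of being closed here.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Call this at the very top of the script, before the model is built.
crash_log = enable_crash_traceback()

# ... the rest of the script (model construction, tracing, infer_func.dump(...)) follows here ...
```

Running the script as `python3 -X faulthandler script.py` enables the same handler without code changes, with the traceback going to stderr instead of a file.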

When running our own model, the same problem occurred again during the dump operation:

14 13:57:17[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: caught deadly signal 11(Segmentation fault)
14 13:57:17[mgb] ERR 
backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f8aec1ca980]
/home/megstudio/.miniconda/envs/xuan/bin/python(PyList_Size+0x4) [0x56165451bb74]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c61551) [0x7f8a36307551]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c5774e) [0x7f8a362fd74e]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c58256) [0x7f8a362fe256]
/home/megstudio/.miniconda/envs/xuan/bin/python(_PyMethodDef_RawFastCallDict+0x125) [0x56165452afa5]
/home/megstudio/.miniconda/envs/xuan/bin/python(_PyCFunction_FastCallDict+0x21) [0x56165452b241]
/home/megstudio/.miniconda/envs/xuan/bin/python(_PyEval_EvalFrameDefault+0x5d1b) [0x5616545d127b]
/home/megstudio/.miniconda/envs/xuan/bin/python(_PyEval_EvalCodeWithName+0xa99) [0x5616545096a9]
/home/megstudio/.miniconda/envs/xuan/bin/python(_PyFunction_FastCallKeywords+0x387) [0x56165455af07]

14 13:57:17[mgb] ERR caught exception in comp node finalize; what(): assertion `m_allocated.size() == m_free.size()' failed at /home/code/src/core/impl/comp_node/comp_node.cpp:120: void mgb::CompNode::EventPool::assert_all_freed()

backtrace:
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c05be6) [0x7f8a362abbe6]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb15__assert_fail__EPKciS1_S1_S1_z+0x10f) [0x7f8a3637a24f]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1cd7fe8) [0x7f8a3637dfe8]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c77e11) [0x7f8a3631de11]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb22CompNodeDepedentObject8callbackEv+0x4b) [0x7f8a3637e7cb]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb16comp_node_detail15DepedentObjList25invoke_callback_and_cleanEv+0xb0) [0x7f8a3637ea50]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(_ZN3mgb8CompNode8finalizeEv+0x9) [0x7f8a3637ee89]
/lib/x86_64-linux-gnu/libc.so.6(+0x43161) [0x7f8aebe0a161]
/lib/x86_64-linux-gnu/libc.so.6(+0x4325a) [0x7f8aebe0a25a]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da668d) [0x7f8a3644c68d]

14 13:57:17[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: std::terminate() called
14 13:57:17[mgb] ERR 
backtrace:
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da66de) [0x7f8a3644c6de]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6(+0xacf69) [0x7f8ae95b7f69]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6(+0xacfab) [0x7f8ae95b7fab]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x18579df) [0x7f8a35efd9df]
/lib/x86_64-linux-gnu/libc.so.6(+0x43161) [0x7f8aebe0a161]
/lib/x86_64-linux-gnu/libc.so.6(+0x4325a) [0x7f8aebe0a25a]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1da668d) [0x7f8a3644c68d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f8aec1ca980]
/home/megstudio/.miniconda/envs/xuan/bin/python(PyList_Size+0x4) [0x56165451bb74]
/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x1c61551) [0x7f8a36307551]

14 13:57:17[mgb] ERR megbrain is about to die abruptly; you can set MGB_WAIT_TERMINATE and rerun to wait for gdb attach: std::terminate() called

Code:

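# Imports assumed from context; res_mcnn (the model) and test_dataloader
# are defined elsewhere in our project.
import megengine as mge
from megengine.jit import trace

@trace(symbolic=True, capture_as_const=True)
def infer_func(img):
    res_mcnn.eval()
    et_dmap = res_mcnn(img)
    return et_dmap

# Accumulate the absolute error between predicted and ground-truth density-map sums.
mae = 0
for i, (img, gt_dmap) in enumerate(test_dataloader):
    img = mge.tensor(img, dtype="float32")
    gt_dmap = mge.tensor(gt_dmap, dtype="float32")
    et_dmap = infer_func(img)
    gt_dmap_num = gt_dmap.sum().item()
    et_dmap_num = et_dmap.sum().item()
    mae += abs(et_dmap.sum() - gt_dmap.sum()).item()
    del img, gt_dmap, et_dmap
# The two counts printed here come from the last test batch only.
print("gt_dmap_num_test={},et_dmap_num_test={}".format(gt_dmap_num, et_dmap_num))
print("MAE={}".format(mae / len(test_dataloader)))

# Serialize the traced graph (this is the call that crashes), then save the weights.
infer_func.dump('./save/res_mcnn_quantized.mge', arg_names=["data"])
mge.save(res_mcnn.state_dict(), './save/res_mcnn_quantized.pkl')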

Please provide the complete code and the model so we can reproduce this on our side.
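Since one environment crashes and another does not, it may also help to attach a short environment report to the reproduction package. The following is only a sketch under that assumption: `collect_environment_report` is a hypothetical helper, not part of MegEngine, and it reads the `megengine.__version__` attribute only if the package can be imported.

```python
import platform
import sys


def collect_environment_report():
    """Return a dict describing the interpreter, OS, and MegEngine version (if importable)."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        import megengine  # imported lazily so the helper also runs where it is missing
        report["megengine"] = str(getattr(megengine, "__version__", "unknown"))
    except ImportError:
        report["megengine"] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_environment_report().items():
        print("{}: {}".format(key, value))
```

Pasting this output next to the code and model makes it easier to compare the failing setup with one where the dump succeeds.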