MegEngine Failed to load cuda API library


**[Title] MegEngine Failed to load cuda API library**
After installing python3.8 + MegEngine in the official cuda runtime image, running it raises an error.

  • MegEngine version: 1.5
  • CPU model:
  • GPU model: GTX 1070Ti
  • System environment: docker image nvidia/cuda:11.4.1-runtime-ubuntu18.04
  • Python version: 3.8.12


  • Steps to reproduce: install python3.8 in the cuda runtime image, run pip install megengine, then import megengine; the import reports the error below
  • Log output:
import megengine as mge

err: Failed to load cuda API library

err: failed to load cuda func: cuCtxGetCurrent
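To narrow this down, it may help to check whether the CUDA driver library is visible inside the container at all. The sketch below is not MegEngine code: it assumes the message comes from a failed dynamic load of the driver library, which on Linux is normally named `libcuda.so.1`, and it simply tries to open that library with `ctypes` and look up the `cuCtxGetCurrent` symbol named in the log. The function name `probe_cuda_driver` is made up for illustration.

```python
import ctypes


def probe_cuda_driver(lib_name="libcuda.so.1", symbol="cuCtxGetCurrent"):
    """Return (library_loaded, symbol_found) for the CUDA driver library.

    lib_name is an assumption: libcuda.so.1 is the usual driver library
    name on Linux, but MegEngine may search for it differently.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        # The dynamic loader could not find or open the library.
        return (False, False)
    # ctypes raises AttributeError for a missing symbol, so hasattr is enough.
    return (True, hasattr(lib, symbol))


if __name__ == "__main__":
    loaded, found = probe_cuda_driver()
    print(f"libcuda loaded: {loaded}, cuCtxGetCurrent found: {found}")
```

If this reports that the library cannot be loaded, the container most likely has no driver libraries mounted, which matches a container started without GPU mapping.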

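This was initially confirmed to be caused by the GPU not being mapped into the container. After enabling GPU mapping with --gpus=all, nvidia-smi inside the container prints: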

Tue Sep 14 07:01:11 2021       
| NVIDIA-SMI 450.51       Driver Version: 450.51       CUDA Version: 11.4     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 107...  Off  | 00000000:04:00.0 Off |                  N/A |
| 27%   26C    P8     5W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
|   1  GeForce GTX 107...  Off  | 00000000:08:00.0 Off |                  N/A |
| 27%   26C    P8     5W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
|   2  GeForce GTX 107...  Off  | 00000000:86:00.0 Off |                  N/A |
| 27%   24C    P8     5W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
|   3  GeForce GTX 107...  Off  | 00000000:8A:00.0 Off |                  N/A |
| 28%   26C    P8     5W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

However, loading a GPU model still raises an error:

RuntimeError: assertion `locator.device >= 0 && locator.device < nr_gpu' failed at ../../../../../../src/core/impl/comp_node/cuda/comp_node.cpp:831: static mgb::CompNode::Impl* mgb::CudaCompNode::load_cuda(const mgb::CompNode::Locator&, const mgb::CompNode::Locator&)
extra message: request gpu0 out of valid range [0, 0)
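
The valid range `[0, 0)` in the assertion means the GPU count seen by MegEngine (`nr_gpu`) is 0, even though nvidia-smi lists four cards. One way to cross-check this outside MegEngine is to ask the CUDA driver API directly. The sketch below calls `cuInit` and `cuDeviceGetCount` from `libcuda.so.1` through `ctypes`; whether MegEngine obtains its count through the same calls is an assumption, and `cuda_device_count` is a made-up helper name.

```python
import ctypes


def cuda_device_count(lib_name="libcuda.so.1"):
    """Return (status, count) from the CUDA driver API.

    status is the CUresult code of the last driver call (0 means success),
    or None if the driver library could not be loaded at all.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        return (None, 0)
    # cuInit must succeed before any other driver API call.
    status = lib.cuInit(0)
    if status != 0:
        return (status, 0)
    count = ctypes.c_int(0)
    status = lib.cuDeviceGetCount(ctypes.byref(count))
    return (status, count.value if status == 0 else 0)


if __name__ == "__main__":
    status, count = cuda_device_count()
    print(f"driver status: {status}, device count: {count}")
```

A non-zero status from `cuInit` here would point at a driver-level problem inside the container rather than at MegEngine itself, and the numeric code can be looked up in the CUDA driver API documentation.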