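To help get your issue resolved quickly, please follow the template below:
【megengine.core._imperative_rt.core2.AsyncError: An async error is reported, raised at BatchNorm2d in the model's forward function】
(Describe your issue concisely and precisely, e.g. "int8 model, visible error when extracting features multiple times")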
【Version and environment information】
- MegEngine version: 1.9.1
- CPU model: __
- GPU model: NVIDIA 2080Ti
- System environment: Ubuntu 18.04, 64-bit, Brain++ environment
- Python version: 3.6.13
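【Model information】
- Algorithm: (please provide the algorithm source code; briefly describe any special implementation)
- Performance comparison: (current speed vs. previous speed, tensor shapes, etc.)
- Model file location: (please provide the location of the model file)
【Load_and_run LOG】
- Please provide the Load_and_run log that reproduces the issue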
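【If reporting an error, please provide the following reproduction information】
- Reproduction steps: Ported the PyTorch version of the code to MegEngine; the error is raised at BatchNorm2d in the model's forward.
- Log output:
The above exception was the direct cause of the following exception: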
Traceback (most recent call last):
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/distributed/launcher.py", line 58, in _run_wrapped
    ret = func(*args, **kwargs)
  File "/data/UnsupMVS/KD-MVS-release-megengine/train_unsup.py", line 239, in main
    train(model, model_loss, optimizer, gm, TrainImgLoader, TestImgLoader, start_epoch, logger, args)
  File "/data/UnsupMVS/KD-MVS-release-megengine/train_unsup.py", line 85, in train
    loss, scalar_outputs, image_outputs = train_sample(model, model_loss, optimizer, gm, sample, args)
  File "/data/UnsupMVS/KD-MVS-release-megengine/train_unsup.py", line 127, in train_sample
    outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/module.py", line 149, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/UnsupMVS/KD-MVS-release-megengine/models/cas_mvsnet.py", line 368, in forward
    var_reg=self.var_regression if self.share_cr else self.var_regression[stage_idx])
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/module.py", line 149, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/UnsupMVS/KD-MVS-release-megengine/models/cas_mvsnet.py", line 220, in forward
    log_var = var_reg(ref_feature)
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/module.py", line 149, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/UnsupMVS/KD-MVS-release-megengine/models/cas_mvsnet.py", line 192, in forward
    x = self.conv0(x)
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/module.py", line 149, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/UnsupMVS/KD-MVS-release-megengine/models/module.py", line 222, in forward
    return F.relu(self.bn(self.conv(x)))
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/module.py", line 149, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/batchnorm.py", line 77, in forward
    self._check_input_ndim(inp)
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/module/batchnorm.py", line 325, in _check_input_ndim
    if len(inp.shape) != 4:
  File "/home/dingyikang/anaconda3/envs/kdmvs-megengin/lib/python3.6/site-packages/megengine/tensor.py", line 112, in shape
    shape = super().shape
megengine.core._imperative_rt.core2.AsyncError: An async error is reported. See above for the actual cause. Hint: This is where it is reported, not where it happened. You may call `megengine.config.async_level = 0` to get better error reporting.
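The hint at the end of the log suggests turning off asynchronous execution, so that the exception surfaces at the operator that actually failed instead of at the next point where a shape is read. A minimal sketch of how that could be scoped to one suspicious call, assuming `megengine.config.async_level` can be read as well as assigned; the helper name `sync_execution` is hypothetical and not part of MegEngine:

```python
from contextlib import contextmanager


@contextmanager
def sync_execution(config, level=0):
    """Temporarily set config.async_level and restore the old value afterwards.

    `config` is expected to be the `megengine.config` module (assumption:
    its `async_level` attribute is readable as well as writable).
    """
    previous = config.async_level
    config.async_level = level
    try:
        yield
    finally:
        # Restore the previous level even if the wrapped code raises.
        config.async_level = previous


# Possible usage inside train_sample (names taken from the traceback):
#
#   import megengine
#   with sync_execution(megengine.config):
#       outputs = model(sample_cuda["imgs"],
#                       sample_cuda["proj_matrices"],
#                       sample_cuda["depth_values"])
```

For a quick check, assigning `megengine.config.async_level = 0` once at the top of the training script, exactly as the message says, has the same effect for the whole run.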
- Key code snippet:
import megengine.functional as F
import megengine.module as nn


class UncertaintyNet(nn.Module):
    def __init__(self, in_channels):
        super(UncertaintyNet, self).__init__()
        self.inplanes = in_channels
        # Channels: C -> 4C -> 8C -> C -> 1, spatial size unchanged (3x3, stride 1, pad 1)
        self.conv0 = ConvBnReLU(self.inplanes, 4 * self.inplanes, 3, 1, 1)
        self.conv1 = ConvBnReLU(4 * self.inplanes, 8 * self.inplanes, 3, 1, 1)
        self.conv2 = ConvBnReLU(8 * self.inplanes, self.inplanes, 3, 1, 1)
        self.var = nn.Conv2d(self.inplanes, 1, 3, 1, 1)

    def forward(self, x):
        x = self.conv0(x)
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.var(x)
        # Drop the single output channel: (N, 1, H, W) -> (N, H, W)
        x = F.squeeze(x, axis=1)
        return x


class ConvBnReLU(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, pad=1):
        super(ConvBnReLU, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, padding=pad, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        return F.relu(self.bn(self.conv(x)))

The error is raised at self.bn.
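Since the traceback stops at the `len(inp.shape) != 4` check in BatchNorm2d, it can help to write down the shapes each layer of the snippet should see. The following is a rough, framework-free sketch of that bookkeeping; the function names and the example input shape are hypothetical, and it only applies the standard convolution output-size formula to the layer parameters shown above. Because the error is asynchronous, the real cause may well lie in an earlier operator rather than in these shapes.

```python
def conv2d_out_size(size, kernel_size=3, stride=1, padding=1):
    """Standard convolution output size along one spatial dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def trace_uncertainty_net(input_shape):
    """Return the expected output shape after each layer of UncertaintyNet.

    input_shape must be 4D (N, C, H, W), which is what BatchNorm2d expects.
    """
    if len(input_shape) != 4:
        raise ValueError("expected a 4D input (N, C, H, W), got %r" % (input_shape,))
    n, c, h, w = input_shape
    shapes = []
    # conv0, conv1, conv2, var: each is a 3x3 conv with stride 1 and padding 1
    for out_channels in (4 * c, 8 * c, c, 1):
        h, w = conv2d_out_size(h), conv2d_out_size(w)
        shapes.append((n, out_channels, h, w))
    # F.squeeze(x, axis=1) removes the single remaining channel
    shapes.append((n, h, w))
    return shapes


if __name__ == "__main__":
    # Hypothetical input: batch 2, 8 channels, 64 x 80 feature map
    for shape in trace_uncertainty_net((2, 8, 64, 80)):
        print(shape)
```

Comparing these expected shapes with `x.shape` printed before `self.conv0(x)` (with async execution disabled) is one way to rule the snippet in or out as the source of the failure.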