About dataset splitting

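The examples I have seen all split the dataset into a training set and a test set. Does that mean MegEngine does not need a separate validation set?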

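How you set up the datasets depends on what your model training actually needs. If you want an additional validation set, you can follow the way the test set is written and add one more Dataset/DataLoader for it.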
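
As a rough illustration of "one more Dataset", here is a minimal sketch of a map-style dataset with one instance per split. The class name `SplitDataset` and the toy samples are hypothetical. The sketch assumes the custom dataset subclasses `megengine.data.dataset.Dataset` and implements `__getitem__` and `__len__`; the import fallback is only there so the snippet also runs where MegEngine is not installed.

```python
try:
    from megengine.data.dataset import Dataset
except ImportError:
    # Fallback so the sketch still runs without MegEngine installed.
    Dataset = object


class SplitDataset(Dataset):
    """Hypothetical map-style dataset holding one pre-made split in memory."""

    def __init__(self, samples, labels):
        super().__init__()
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = list(samples)
        self.labels = list(labels)

    def __getitem__(self, index):
        # Return one (sample, label) pair.
        return self.samples[index], self.labels[index]

    def __len__(self):
        return len(self.samples)


# Toy data standing in for three splits prepared ahead of time.
train_set = SplitDataset(["a", "b", "c", "d"], [0, 1, 0, 1])
val_set = SplitDataset(["e", "f"], [1, 0])
test_set = SplitDataset(["g", "h"], [0, 1])
```

Each instance can then get its own sampler and DataLoader, in the same way the test set does.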

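I split the dataset in advance, so the training, test, and validation sets are all fixed. In this case I only want to use that fixed validation portion for validation during training. How should I do that? :sob: :sob: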

You can refer to this: Using Data to build the input pipeline - MegEngine 1.13.1 documentation

from megengine.data import DataLoader, RandomSampler, SequentialSampler
from megengine.data.transform import ToMode

dataset = YourImageDataset("/path/to/image/folder")

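# random_split is a placeholder: implement this function yourself to randomly
# split your dataset. If your splits are already fixed, build one dataset per split instead.
train_set, val_set, test_set = random_split(dataset)

# B is your batch size, e.g. 128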
train_dataloader = DataLoader(train_set,
      sampler=RandomSampler(train_set, batch_size=B),
      transform=ToMode('CHW'),
)
val_dataloader = DataLoader(val_set,
      sampler=SequentialSampler(val_set, batch_size=B),
      transform=ToMode('CHW'),
)
test_dataloader = DataLoader(test_set,
      sampler=SequentialSampler(test_set, batch_size=B),
      transform=ToMode('CHW'),
)

# epochs is the number of training epochs you choose
for epoch in range(epochs):
    for images, targets in train_dataloader:
        pass  # train one epoch
    for images, targets in val_dataloader:
        pass  # validate after training
    for images, targets in test_dataloader:
        pass  # test after training
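
The `random_split` call above is left for the reader to implement. A minimal sketch of one possible version follows, assuming the dataset supports `len()` and integer indexing. The `Subset` wrapper, the `fractions` and `seed` parameters, and the toy data are all hypothetical and not part of MegEngine. The same `Subset` wrapper also covers a split that was fixed in advance: pass it the stored index list instead of shuffled indices.

```python
import random

try:
    from megengine.data.dataset import Dataset
except ImportError:
    # Fallback so the sketch still runs without MegEngine installed.
    Dataset = object


class Subset(Dataset):
    """Hypothetical view of `dataset` restricted to a fixed list of indices."""

    def __init__(self, dataset, indices):
        super().__init__()
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, index):
        # Map the position inside the subset back to the original dataset.
        return self.dataset[self.indices[index]]

    def __len__(self):
        return len(self.indices)


def random_split(dataset, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the indices once with a fixed seed and cut them into three parts."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * len(indices))
    n_val = int(fractions[1] * len(indices))
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]  # everything that is left
    return Subset(dataset, train_idx), Subset(dataset, val_idx), Subset(dataset, test_idx)


# Toy stand-in for a real dataset: 100 integer "samples".
data = list(range(100))
train_set, val_set, test_set = random_split(data)

# A split fixed in advance: reuse the same stored indices on every run.
fixed_val_set = Subset(data, [3, 14, 15])
```

Because the seed is fixed, the three subsets stay the same across runs, so the validation subset always contains the same samples.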
