VGG

DD-Ranking supports VGG implementations from both DC and torchvision.

We provide the following interface to initialize a VGG model:

dd_ranking.utils.get_vgg(model_name: str, im_size: tuple, channel: int, num_classes: int, depth: int, batchnorm: bool, use_torchvision: bool, pretrained: bool, model_path: str)

Parameters

  • model_name (str): Name of the model. Please navigate to models for the model naming convention in DD-Ranking.
  • im_size (tuple): Image size.
  • channel (int): Number of channels of the input image.
  • num_classes (int): Number of classes.
  • depth (int): Depth of the network.
  • batchnorm (bool): Whether to use batch normalization.
  • use_torchvision (bool): Whether to use torchvision to initialize the model.
  • pretrained (bool): Whether to load pretrained weights.
  • model_path (str): Path to the pretrained model weights.
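
A minimal usage sketch is given below. The concrete values (a depth-11 VGG with batch normalization on 32x32 RGB images with 10 classes), the model name string, and passing None as model_path are illustrative assumptions, not documented defaults; please check the models page for the exact naming convention.

from dd_ranking.utils import get_vgg

# Illustrative call: a depth-11 VGG with batch normalization for 32x32 RGB
# images and 10 classes, built via torchvision without pretrained weights.
# The model name and the None model_path are assumptions for this sketch.
model = get_vgg(
    model_name="VGG11",
    im_size=(32, 32),
    channel=3,
    num_classes=10,
    depth=11,
    batchnorm=True,
    use_torchvision=True,
    pretrained=False,
    model_path=None,
)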
NOTE
When using torchvision VGG on image sizes smaller than 224 x 224, we make the following modifications: the five stride-2 max-pooling layers in the VGG feature extractor reduce a 32x32 input to a 1x1 feature map and a 64x64 input to a 2x2 feature map, so the first fully connected layer of the classifier is resized accordingly.

For 32x32 image size:

model.classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(512 * 1 * 1, 4096)),
    ('relu1', nn.ReLU(True)),
    ('drop1', nn.Dropout()),
    ('fc2', nn.Linear(4096, 4096)),
    ('relu2', nn.ReLU(True)),
    ('drop2', nn.Dropout()),
    ('fc3', nn.Linear(4096, num_classes)),
]))

For 64x64 image size:

model.classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(512 * 2 * 2, 4096)),
    ('relu1', nn.ReLU(True)),
    ('drop1', nn.Dropout()),
    ('fc2', nn.Linear(4096, 4096)),
    ('relu2', nn.ReLU(True)),
    ('drop2', nn.Dropout()),
    ('fc3', nn.Linear(4096, num_classes)),
]))
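
As a quick sanity check of these sizes, the snippet below (using a plain torchvision VGG11 purely as an example, independent of DD-Ranking) runs dummy inputs through the convolutional feature extractor and prints the resulting shapes:

import torch
from torchvision.models import vgg11

model = vgg11(weights=None)
for size in (32, 64):
    out = model.features(torch.randn(1, 3, size, size))
    print(f"{size}x{size} input -> feature map of shape {tuple(out.shape)}")
    # 32x32 -> (1, 512, 1, 1), i.e. 512 * 1 * 1 inputs to fc1
    # 64x64 -> (1, 512, 2, 2), i.e. 512 * 2 * 2 inputs to fc1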