VGG
DD-Ranking supports the VGG implementations from both DC and torchvision.
We provide the following interface to initialize a VGG model (a usage sketch follows the parameter list below):
dd_ranking.utils.get_vgg(model_name: str, im_size: tuple, channel: int, num_classes: int, depth: int, batchnorm: bool, use_torchvision: bool, pretrained: bool, model_path: str) [SOURCE]
Parameters
- model_name(str): Name of the model. Please refer to the models page for the model naming convention in DD-Ranking.
- im_size(tuple): Image size.
- channel(int): Number of channels of the input image.
- num_classes(int): Number of classes.
- depth(int): Depth of the network (standard VGG depths are 11, 13, 16, and 19).
- batchnorm(bool): Whether to use batch normalization.
- use_torchvision(bool): Whether to use torchvision to initialize the model.
- pretrained(bool): Whether to load pretrained weights.
- model_path(str): Path to the pretrained model weights.
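A minimal call might look like the following sketch. The model name string ("VGG11") and the dataset settings are assumptions for illustration only; check the models page for the exact naming convention, and passing None for model_path is assumed to be acceptable when pretrained is False.

from dd_ranking.utils import get_vgg

# Hypothetical example: VGG-11 with batch norm for 32x32, 3-channel, 10-class images.
model = get_vgg(
    model_name="VGG11",      # assumed name; see the models page for the convention
    im_size=(32, 32),
    channel=3,
    num_classes=10,
    depth=11,
    batchnorm=True,
    use_torchvision=False,
    pretrained=False,
    model_path=None,         # no checkpoint to load since pretrained=False
)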
NOTE
When using torchvision VGG on image sizes smaller than 224 x 224, we make the following modifications to the classifier. Because VGG's five max-pooling stages downsample the input by a factor of 32, a 32x32 input yields 1x1 feature maps (hence 512 * 1 * 1 input features) and a 64x64 input yields 2x2 feature maps (hence 512 * 2 * 2).
For 32x32 image size:
from collections import OrderedDict
import torch.nn as nn

# Replace the default classifier: the flattened feature size is 512 * 1 * 1 for 32x32 inputs.
model.classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(512 * 1 * 1, 4096)),
    ('relu1', nn.ReLU(True)),
    ('drop1', nn.Dropout()),
    ('fc2', nn.Linear(4096, 4096)),
    ('relu2', nn.ReLU(True)),
    ('drop2', nn.Dropout()),
    ('fc3', nn.Linear(4096, num_classes)),
]))
For 64x64 image size:
# Same imports as above; the flattened feature size is 512 * 2 * 2 for 64x64 inputs.
model.classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(512 * 2 * 2, 4096)),
    ('relu1', nn.ReLU(True)),
    ('drop1', nn.Dropout()),
    ('fc2', nn.Linear(4096, 4096)),
    ('relu2', nn.ReLU(True)),
    ('drop2', nn.Dropout()),
    ('fc3', nn.Linear(4096, num_classes)),
]))
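As an illustration of what this modification amounts to (not DD-Ranking's internal code), the sketch below rebuilds the classifier of a torchvision VGG-11-BN for 32x32 inputs. To make the forward pass actually run at this resolution, the sketch also assumes the default 7x7 adaptive average pooling is bypassed, which the note above does not spell out.

from collections import OrderedDict

import torch
import torch.nn as nn
from torchvision.models import vgg11_bn

num_classes = 10                 # assumed for illustration
model = vgg11_bn(weights=None)   # torchvision VGG-11 with batch norm, no pretrained weights

# Assumption: bypass the default AdaptiveAvgPool2d((7, 7)) so the 1x1 feature
# maps produced by 32x32 inputs reach the classifier unchanged.
model.avgpool = nn.Identity()

# The classifier replacement from the note above (32x32 case).
model.classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(512 * 1 * 1, 4096)),
    ('relu1', nn.ReLU(True)),
    ('drop1', nn.Dropout()),
    ('fc2', nn.Linear(4096, 4096)),
    ('relu2', nn.ReLU(True)),
    ('drop2', nn.Dropout()),
    ('fc3', nn.Linear(4096, num_classes)),
]))

out = model(torch.randn(2, 3, 32, 32))   # -> shape [2, num_classes]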