It is very hard to make a fair comparison among different object
detectors. There is no straight answer on which model is the best. For
real-life applications, we make choices to balance accuracy and speed.
Besides the detector type, we need to be aware of other choices that
impact performance:
- Feature extractors (VGG16, ResNet, Inception, MobileNet).
- Output strides for the extractor.
- Input image resolutions.
- Matching strategy and IoU threshold (how predictions are excluded in calculating loss).
- Non-max suppression IoU threshold.
- Hard example mining ratio (positive vs. negative anchor ratio).
- The number of proposals or predictions.
- Boundary box encoding.
- Data augmentation.
- Training dataset.
- Use of multi-scale images in training or testing (with cropping).
- Which feature map layer(s) are used for object detection.
- Localization loss function.
- Deep learning software platform used.
- Training configurations including batch size, input image resize, learning rate, and learning rate decay.
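
To make one of these knobs concrete, here is a minimal sketch of how the non-max suppression IoU threshold alone changes what a detector reports. It is not taken from any particular detector or library: the `iou` and `nms` helpers, the `(x1, y1, x2, y2)` box format, and the sample boxes and scores are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too heavily.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical predictions: two heavily overlapping boxes and one far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

strict = nms(boxes, scores, iou_threshold=0.5)  # suppresses the second box
loose = nms(boxes, scores, iou_threshold=0.7)   # keeps all three boxes
```

The first two boxes overlap with an IoU of about 0.68, so the same predictions yield two detections at a threshold of 0.5 and three at 0.7. Two papers that report results under different thresholds are therefore not measuring quite the same thing.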