Posts

Showing posts from 2021

Vision Transformers for Dense Prediction (ICCV 2021): State of the art accuracy on depth estimation and semantic segmentation (realtime >30 FPS)

We introduce dense vision transformers, an architecture that uses vision transformers in place of convolutional networks as the backbone for dense prediction tasks. We assemble tokens from various stages of the vision transformer into image-like representations at various resolutions and progressively combine them into full-resolution predictions using a convolutional decoder. The transformer backbone processes representations at a constant and relatively high resolution and has a global receptive field at every stage. These properties allow the dense vision transformer to provide finer-grained and more globally coherent predictions than fully convolutional networks. Our experiments show that this architecture yields substantial improvements on dense prediction tasks, especially when a large amount of training data is available…

Scaled YOLO v4 (CVPR2021): Absolute Top-1 neural network for object detection on MS COCO dataset

Scaled YOLO v4 is the best neural network for object detection on the MS COCO dataset.

Scaled YOLO v4 (CVPR 2021) outperforms the following neural networks in accuracy: Google EfficientDet D7x / DetectoRS or SpineNet-190 (self-trained on extra data), Amazon Cascade-RCNN ResNeSt200, Microsoft RepPoints v2, Facebook RetinaNet SpineNet-190, and many others.

Scaled YOLOv4 is both more accurate and faster than the following neural networks: Google EfficientDet D0-D7x, Google SpineNet S49s to S143, Baidu PaddlePaddle PP-YOLO, and many others.

Scaled YOLO v4 is a series of neural networks built on top of the improved and scaled YOLOv4 network. Our neural network was trained from scratch, without pre-trained weights (ImageNet or any other). The YOLOv4-tiny neural network reaches 1774 FPS on an RTX 2080 Ti gaming GPU when using TensorRT + tkDNN (batch = 4, FP16).

Read the full article: https://arxiv.org/abs/2011.08036
Read the CVPR 2021 paper: https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Scaled-YOLOv4_Scaling_C