DeiT

DeiT (Data-efficient Image Transformers) is a Vision Transformer (ViT) variant that reaches high image-classification accuracy by training with a distillation token, which substantially reduces the training time and data required.

DeiT, introduced by Hugo Touvron et al., addresses the data hunger of the original Vision Transformer (ViT): it delivers competitive, convolution-free performance without massive external pre-training datasets. The core innovation is a **distillation token**, an extra learned token used in a teacher-student setup: it is trained to match the teacher's predictions, which transfers inductive bias (often from a ConvNet teacher) to the ViT architecture. This allows training on ImageNet-1k alone, in less than three days on a single machine. The distilled model reaches up to **85.2% top-1 accuracy** on ImageNet, making high-performance vision transformers more accessible and resource-efficient.
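
To make the distillation-token idea concrete, here is a minimal sketch in plain Python of a hard-label distillation objective for a single image. It assumes the model exposes two sets of logits, one from the class-token head and one from the distillation-token head, and that the two loss terms are weighted equally. The function names (`hard_distillation_loss`, `fused_prediction`) and the equal weighting are illustrative assumptions, not an API from any library.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    # Negative log-probability assigned to the target class.
    return -math.log(softmax(logits)[target])


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    # Hypothetical helper: the class-token head learns from the true label,
    # the distillation-token head learns from the teacher's hard prediction.
    teacher_label = argmax(teacher_logits)
    return (0.5 * cross_entropy(cls_logits, label)
            + 0.5 * cross_entropy(dist_logits, teacher_label))


def fused_prediction(cls_logits, dist_logits):
    # Assumed inference rule: add the softmax outputs of both heads.
    p_cls, p_dist = softmax(cls_logits), softmax(dist_logits)
    return argmax([a + b for a, b in zip(p_cls, p_dist)])


loss = hard_distillation_loss(
    cls_logits=[0.0, 0.0, 0.0],
    dist_logits=[0.0, 0.0, 0.0],
    teacher_logits=[0.1, 2.0, -1.0],
    label=0,
)
pred = fused_prediction([2.0, 0.0, 0.0], [0.0, 3.0, 0.0])
```

In practice the same computation would be done on batched tensors in a framework such as PyTorch; the scalar version here only shows how the two heads receive different targets.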

https://huggingface.co/docs/transformers/model_doc/deit

1 project · 1 city

Related technologies

DenseNet121 (1), Google Colab (12), Python (613), PyTorch (262)

Recent Talks & Demos

Transformers para Daño Sísmico (Transformers for Seismic Damage)

Quito Apr 24

PyTorch Python