CLIP
Learning Transferable Visual Models From Natural Language Supervision Review
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Review
Attention Is All You Need Review
End-to-End Object Detection with Transformers Review
EfficientDet: Scalable and Efficient Object Detection Review