Talks and presentations

#17 Towards Open World Object Detection

January 30, 2022

Research Paper Explanation Video, YouTube, channel - Sahil Khose

The paper is about Continual Object Detection: identifying unknown objects in the environment and then learning newly labeled classes in an online manner, without re-training on the entire dataset. Link for the video

#8 On Tiny Episodic Memories in Continual Learning

July 25, 2021

Research Paper Explanation Video, YouTube, channel - Sahil Khose

This paper shows how tiny episodic memories help in continual learning: repeated training on even a tiny memory of past tasks does not harm generalization; on the contrary, it improves it! Link for the video
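
Concretely, the approach studied is experience replay with a very small memory buffer. Below is a minimal sketch of that idea, assuming a standard PyTorch classifier; names such as MEM_SIZE, reservoir_update, and train_step are illustrative placeholders, not the paper's code.

```python
import random
import torch
import torch.nn.functional as F

MEM_SIZE = 100   # "tiny" episodic memory: a fixed, small number of past examples
memory = []      # stores (x, y) pairs from previously seen data
seen = 0         # number of stream examples observed so far

def reservoir_update(x, y):
    """Keep a uniform random sample of the stream in a fixed-size buffer."""
    global seen
    for xi, yi in zip(x, y):
        if len(memory) < MEM_SIZE:
            memory.append((xi, yi))
        else:
            j = random.randint(0, seen)
            if j < MEM_SIZE:
                memory[j] = (xi, yi)
        seen += 1

def train_step(model, opt, x, y, replay_batch=10):
    """One online step: the current batch plus a few examples replayed from memory."""
    if memory:
        mem_x, mem_y = zip(*random.sample(memory, min(replay_batch, len(memory))))
        xb = torch.cat([x, torch.stack(mem_x)])
        yb = torch.cat([y, torch.stack(mem_y)])
    else:
        xb, yb = x, y
    opt.zero_grad()
    loss = F.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
    reservoir_update(x, y)   # only current examples are written into the memory
    return loss.item()
```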

#5 CPC: Data-Efficient Image Recognition with Contrastive Predictive Coding

July 08, 2021

Research Paper Explanation Video, YouTube, channel - Sahil Khose

DeepMind’s breakthrough paper: ‘Contrastive Predictive Coding 2.0’ (CPC v2). With just 2% of the ImageNet data, CPC v2 crushes AlexNet’s 59.3% Top-1 and 81.8% Top-5 accuracies, reaching 60.4% and 83.9%; with just 1% of the data it achieves 78.3% Top-5 accuracy, outperforming a supervised classifier trained on 5x more data. Trained on all the available images (100%), it not only beats fully supervised systems by 3.2% Top-1 accuracy, it still manages to outperform those supervised models with just 50% of the ImageNet data! Link for the video
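
For context, CPC learns representations by predicting one part of the input from another and scoring those predictions with an InfoNCE-style contrastive loss. A minimal sketch of such a loss is below; the shapes and names are illustrative, not the paper's code (CPC v2 itself predicts patch representations from a context network over image patches).

```python
import torch
import torch.nn.functional as F

def info_nce(predictions, targets, temperature=0.1):
    """InfoNCE-style loss: row i of `targets` is the positive for row i of
    `predictions`; every other row in the batch serves as a negative."""
    p = F.normalize(predictions, dim=-1)   # (B, d)
    t = F.normalize(targets, dim=-1)       # (B, d)
    logits = p @ t.T / temperature         # (B, B) similarity matrix
    labels = torch.arange(p.size(0))       # the matching row is the positive
    return F.cross_entropy(logits, labels)
```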

#3 XCiT: Cross-Covariance Image Transformers (Facebook AI)

June 23, 2021

Research Paper Explanation Video, YouTube, channel - Sahil Khose

After dominating Natural Language Processing, Transformers have recently taken over Computer Vision with the advent of Vision Transformers. However, the attention mechanism’s quadratic complexity in the number of tokens means that Transformers do not scale well to high-resolution images. XCiT is a new Transformer architecture built around XCA, a transposed version of attention that reduces the complexity from quadratic to linear in the number of tokens, and, at least on image data, it appears to perform on par with other models. What does this mean for the field? Is this even a transformer? What really matters in deep learning? Link for the video
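
For intuition, here is a rough single-head comparison of cross-covariance attention (XCA) with vanilla attention, showing why the cost becomes linear in the number of tokens N. This is a simplification, not the official XCiT implementation, which adds multi-head handling, a learned temperature, and extra local-interaction and class-attention blocks.

```python
import torch
import torch.nn.functional as F

def xca(q, k, v, temperature=1.0):
    """Cross-covariance attention sketch. q, k, v: (N, d) token embeddings."""
    # Normalise along the token dimension so the d x d cross-covariance is well scaled.
    q = F.normalize(q, dim=0)
    k = F.normalize(k, dim=0)
    attn = torch.softmax((k.T @ q) * temperature, dim=-1)  # (d, d): feature-feature attention
    return v @ attn                                        # (N, d): cost O(N * d^2), linear in N

def vanilla_attention(q, k, v):
    """Standard token-token attention for comparison: cost O(N^2 * d)."""
    d = q.shape[-1]
    attn = torch.softmax(q @ k.T / d ** 0.5, dim=-1)       # (N, N) grows quadratically with tokens
    return attn @ v
```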

#1 A Deep Multi-Modal Explanation Model for Zero-Shot Learning (XZSL)

June 13, 2021

Research Paper Explanation Video, YouTube, channel - Sahil Khose

Zero-shot learning (ZSL) has attracted significant attention due to its ability to classify images from unseen classes. The paper addresses a new and challenging task, explainable zero-shot learning (XZSL), which aims to generate visual and textual explanations that support the classification decision. Link for the video