Posts by Collection

portfolio

publications

BERT based Transformers lead the way in Extraction of Health Information from Social Media

Published in SMM4H, NAACL 2021, 2021

Sidharth Ramesh, Abhiraj Tiwari, Parthivi Choubey, Saisha Kashyap, Sahil Khose, Kumud Lakara, Nishesh Singh, Ujjwal Verma

This paper describes our submission to the Social Media Mining for Health (SMM4H) 2021 shared tasks. We participated in two tasks: (1) classification, extraction and normalization of adverse drug effect (ADE) mentions in English tweets (Task-1) and (2) classification of COVID-19 tweets containing symptoms (Task-6). We ranked first in Task-1a and second in Task-1b and Task-6.

Download here

XCI-Sketch: Extraction of Color Information from Images for Generation of Colored Outlines and Sketches

Published in 1. ML for Creativity and Design, 2. Deep Generative Models and Downstream Applications, 3. CtrlGen: Controllable Generative Modeling in Language and Vision, and 4. New in ML workshop, NeurIPS 2021, 2021

Harsh Rathod, Manisimha Varma, Parna Chowdhury, Sameer Saxena, V Manushree, Ankita Ghosh, Sahil Khose

Sketches are a medium to convey a visual scene from an individual’s creative perspective. The addition of color substantially enhances the overall expressivity of a sketch. This paper proposes two methods to mimic human-drawn colored sketches by utilizing the Contour Drawing Dataset. Our first approach renders colored outline sketches by applying image processing techniques aided by k-means color clustering. The second method uses a generative adversarial network to develop a model that can generate colored sketches from previously unobserved images. We assess the results obtained through quantitative and qualitative evaluations.

Download here
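The first method relies on k-means color clustering to recover a small palette of dominant colors from an image. A rough, self-contained numpy sketch of that idea (a minimal Lloyd's algorithm over RGB pixels; the function names, deterministic initialization, and toy image are illustrative assumptions, not the paper's exact pipeline):

```python
import numpy as np

def kmeans_palette(pixels, k, iters=10):
    """Minimal Lloyd's k-means over flattened RGB pixels.
    Deterministic init: k pixels spread evenly through the array."""
    init = np.linspace(0, len(pixels) - 1, k).astype(int)
    centroids = pixels[init].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest centroid.
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned pixels.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def quantize(image, k=2):
    """Repaint every pixel with its cluster's dominant color."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)
    centroids, labels = kmeans_palette(pixels, k)
    return centroids[labels].reshape(h, w, c).astype(image.dtype)

# Toy 8x8 image: a red half and a blue half.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, :4] = (200, 30, 30)
img[:, 4:] = (30, 30, 200)
out = quantize(img, k=2)
print(np.unique(out.reshape(-1, 3), axis=0).shape[0])  # prints 2
```

The recovered palette could then be composited onto the extracted outline to produce a colored sketch.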

A Studious Approach to Semi-Supervised Learning

Published in ICBINB, NeurIPS 2021, 2021

Sahil Khose, Shruti Jain, V Manushree

This paper is an ablation study of distillation in a semi-supervised setting. Distillation not only reduces the number of parameters in the model but does so while improving performance over the baseline supervised model and generalizing better. We find that the fewer the labels, the more this approach benefits from a smaller student network. This brings forward the potential of distillation as an effective way to enhance performance in semi-supervised computer vision tasks while maintaining deployability.

Download here
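A minimal numpy sketch of the kind of distillation objective such a study ablates, in the Hinton-style soft-target formulation: temperature-softened KL divergence between teacher and student outputs, optionally mixed with cross-entropy on the few labeled examples. The temperature, mixing weight, and function names here are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels=None, T=4.0, alpha=0.9):
    """KL(teacher_T || student_T) scaled by T^2, plus an optional
    cross-entropy term when ground-truth labels are available."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kd = np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)) * T * T
    if labels is None:
        return kd  # unlabeled batch: soft targets only
    log_p = np.log(softmax(student_logits))
    ce = -np.mean(log_p[np.arange(len(labels)), labels])
    return alpha * kd + (1 - alpha) * ce

teacher = np.array([[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]])
student = np.array([[1.5, 0.8, -0.5], [0.0, 2.5, 0.4]])
print(distillation_loss(student, teacher) > 0)  # prints True
```

In the semi-supervised regime, the unlabeled images contribute only the soft-target term, which is what lets a small student learn from the teacher's predictions on data with no labels.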

Transformer based ensemble for emotion detection

Published in WASSA, ACL 2022, 2022

Aditya Kane, Shantanu Patankar, Sahil Khose, Neeraja Kirtane

Detecting emotions in language is important for complete interaction between humans and machines. This paper describes our contribution to the WASSA 2022 shared task on emotion detection. Given an essay text, the goal is to identify one of seven emotions: sadness, surprise, neutral, anger, fear, disgust, or joy. We use an ensemble of ELECTRA and BERT models to tackle this problem, achieving an F1 score of 62.76%. Our codebase (this https URL) and our WandB project (this https URL) are publicly available.

Download here
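The abstract does not spell out the ensembling rule; one common choice is a weighted average of the per-class probabilities from the two models, followed by an argmax over the seven labels. A toy sketch of that scheme (the weights, helper name, and example probabilities are assumptions for illustration):

```python
import numpy as np

EMOTIONS = ["sadness", "surprise", "neutral", "anger", "fear", "disgust", "joy"]

def ensemble_predict(prob_electra, prob_bert, weights=(0.5, 0.5)):
    """Weighted average of each model's per-class probabilities,
    then argmax over the seven emotion labels."""
    probs = weights[0] * prob_electra + weights[1] * prob_bert
    return [EMOTIONS[i] for i in probs.argmax(axis=1)]

# Toy per-class probabilities for two essays (one row per essay).
p_e = np.array([[0.60, 0.10, 0.10, 0.10, 0.05, 0.03, 0.02],
                [0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.40]])
p_b = np.array([[0.50, 0.20, 0.10, 0.10, 0.05, 0.03, 0.02],
                [0.05, 0.05, 0.10, 0.10, 0.10, 0.10, 0.50]])
print(ensemble_predict(p_e, p_b))  # prints ['sadness', 'joy']
```

Averaging probabilities rather than hard votes lets a confident model outweigh an uncertain one on borderline essays.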

talks

#1 A Deep Multi-Modal Explanation Model for Zero-Shot Learning (XZSL)

Zero-shot learning (ZSL) has attracted significant attention due to its capabilities of classifying new images from unseen classes. In this paper, we propose to address a new and challenging task, namely explainable zero-shot learning (XZSL), which aims to generate visual and textual explanations to support the classification decision. Link for the video

#3 XCiT: Cross-Covariance Image Transformers (Facebook AI)

After dominating Natural Language Processing, Transformers have taken over Computer Vision recently with the advent of Vision Transformers. However, the attention mechanism’s quadratic complexity in the number of tokens means that Transformers do not scale well to high-resolution images. XCiT is a new Transformer architecture, containing XCA, a transposed version of attention, reducing the complexity from quadratic to linear, and at least on image data, it appears to perform on par with other models. What does this mean for the field? Is this even a transformer? What really matters in deep learning? Link for the video
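A rough numpy sketch of the transposed-attention idea: instead of forming the n x n token-token attention matrix, XCA-style attention works over a d x d cross-covariance of normalized features, so the cost grows linearly in the number of tokens n. The normalization details and names below are simplified assumptions, not XCiT's exact formulation (which also uses a learned temperature and block-diagonal heads):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_covariance_attention(q, k, v):
    """Attention over the d x d cross-covariance of normalized Q and K.
    No n x n matrix is ever formed, so cost is linear in token count."""
    qn = q / np.linalg.norm(q, axis=0, keepdims=True)  # normalize per channel
    kn = k / np.linalg.norm(k, axis=0, keepdims=True)
    attn = softmax(qn.T @ kn)  # (d, d): channels attend to channels
    return v @ attn.T          # (n, d) output

n, d = 16, 4  # 16 tokens, 4 channels
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = cross_covariance_attention(q, k, v)
print(out.shape)  # prints (16, 4)
```

Doubling the token count (e.g. a higher-resolution image) only doubles the work here, whereas standard self-attention would quadruple it.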

#5 CPC: Data-Efficient Image Recognition with Contrastive Predictive Coding

DeepMind's breakthrough paper 'Contrastive Predictive Coding 2.0' (CPC 2). With just 2% of the ImageNet data, CPC 2 surpasses AlexNet's Top-1 and Top-5 accuracies of 59.3% and 81.8%, reaching 60.4% and 83.9%; with just 1% of the data it achieves 78.3% Top-5 accuracy, outperforming a supervised classifier trained on 5x more data. Trained on all available images (100%), it outperforms fully supervised systems by 3.2% Top-1 accuracy, and it still beats those supervised models when given only 50% of the ImageNet data. Link for the video

#17 Towards Open World Object Detection

The paper addresses continual object detection: identifying unknown objects in the environment and then learning newly labeled classes in an online manner, without retraining on the entire dataset. Link for the video