Content-aware multi-level guidance for interactive instance segmentation

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

Abstract

In interactive instance segmentation, users give feedback to iteratively refine segmentation masks. The user-provided clicks are transformed into guidance maps, which provide the network with the necessary cues on the whereabouts of the object of interest. Guidance maps used in current systems are purely distance-based and are either too localized or non-informative. We propose a novel transformation of user clicks to generate content-aware guidance maps that leverage the hierarchical structural information present in an image. Using our guidance maps, even the most basic FCNs are able to outperform existing approaches that require state-of-the-art segmentation networks pre-trained on large-scale segmentation datasets. We demonstrate the effectiveness of our proposed transformation strategy through comprehensive experiments in which we significantly raise the state of the art on four standard interactive segmentation benchmarks.
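
As context for the "purely distance-based" guidance the abstract critiques, below is a minimal Python sketch (using NumPy and SciPy) of how prior click-based systems typically encode user clicks as a Euclidean distance map. The function name, truncation value, and click format are illustrative assumptions; this is the baseline encoding, not the paper's content-aware transformation, which additionally exploits the image's hierarchical structure.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def distance_guidance_map(clicks, height, width, truncate=255.0):
        # Baseline distance-based encoding: each pixel stores its Euclidean
        # distance to the nearest user click, truncated to bound the range.
        # clicks: list of (row, col) click coordinates (illustrative format).
        mask = np.ones((height, width), dtype=bool)
        for r, c in clicks:
            mask[r, c] = False                   # zeros mark click locations
        dist = distance_transform_edt(mask)      # distance to nearest zero
        return np.minimum(dist, truncate)

    # Example: guidance map for two positive clicks on a 480x640 image.
    guidance = distance_guidance_map([(120, 200), (300, 420)], 480, 640)

In such systems, separate maps are typically built for positive and negative clicks and concatenated with the RGB image as extra input channels to the segmentation network.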

BibTeX

@INPROCEEDINGS{majumder-2019-content,
     author = {Majumder, Soumajit and Yao, Angela},
      title = {Content-aware multi-level guidance for interactive instance segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
       year = {2019},
      month = jun,
   abstract = {In interactive instance segmentation, users give feedback to iteratively refine segmentation masks.
               The user-provided clicks are transformed into guidance maps, which provide the network with the
               necessary cues on the whereabouts of the object of interest. Guidance maps used in current systems
               are purely distance-based and are either too localized or non-informative. We propose a novel
               transformation of user clicks to generate content-aware guidance maps that leverage the hierarchical
               structural information present in an image. Using our guidance maps, even the most basic FCNs are
               able to outperform existing approaches that require state-of-the-art segmentation networks
               pre-trained on large-scale segmentation datasets. We demonstrate the effectiveness of our proposed
               transformation strategy through comprehensive experiments in which we significantly raise the state
               of the art on four standard interactive segmentation benchmarks.}
}