I am very confused about the "mask" in detr.py. Could you please explain what it is? Also, if the input images are all resized to the same size, then we don't need the "mask", right?
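def downsample_masks(self, masks, x):
    # The resize op needs a numeric dtype, so convert the boolean masks to int32.
    masks = tf.cast(masks, tf.int32)
    # resize_nearest_neighbor expects a 4-D tensor, so add a trailing channel axis.
    masks = tf.expand_dims(masks, -1)
    # Resize the masks to the spatial size of x (dims 1 and 2 of its shape).
    masks = tf.compat.v1.image.resize_nearest_neighbor(
        masks, tf.shape(x)[1:3], align_corners=False, half_pixel_centers=False)
    # Drop the channel axis again and convert back to boolean.
    masks = tf.squeeze(masks, -1)
    masks = tf.cast(masks, tf.bool)
    return masks

def call(self, inp, training=False, post_process=False):
    # inp is a pair: the image batch and its masks.
    x, masks = inp
    x = self.backbone(x, training=training)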
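To make the question concrete, here is a minimal pure-Python sketch of what `downsample_masks` appears to do: pick, for each cell of the smaller feature map, the mask value at the nearest source pixel. The helper name `downsample_mask_nearest`, the toy 4x8 mask, and the convention that `True` marks real pixels and `False` marks padding are all assumptions for illustration; the snippet above does not say which polarity detr.py uses.

```python
def downsample_mask_nearest(mask, out_h, out_w):
    """Nearest-neighbor downsample of a 2-D boolean mask (list of lists).

    Hypothetical helper: mimics the index rule src = floor(dst * in / out),
    which is what a nearest-neighbor resize without half-pixel centers uses.
    """
    in_h, in_w = len(mask), len(mask[0])
    out = []
    for i in range(out_h):
        src_i = min((i * in_h) // out_h, in_h - 1)
        row = []
        for j in range(out_w):
            src_j = min((j * in_w) // out_w, in_w - 1)
            row.append(mask[src_i][src_j])
        out.append(row)
    return out


# Toy 4x8 mask. Assumption: True = real image pixel, False = padding.
# Here the real image occupies the top-left 3x6 region of the padded canvas.
mask = [[(r < 3 and c < 6) for c in range(8)] for r in range(4)]

# Downsample to a 2x4 "feature map" resolution.
small = downsample_mask_nearest(mask, 2, 4)
for row in small:
    print(row)
```

If this reading is right, the mask only carries information when images in a batch are padded to a common size; with every image already at the same size it would hold a single constant value everywhere. That is my guess, not something the snippet confirms.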