Abstract:
We present a classification-based approach to next-best-view selection and show how a supervisory signal for this task can plausibly be obtained. The proposed approach is end-to-end trainable and aims to achieve the best possible 3D reconstruction quality from an actively selected second view, given a passively chosen initial view. The proposed model consists of two stages, a classifier and a reconstructor network, trained directly from ground-truth voxels rather than from exhaustively selected ground-truth best view pairs. At test time, the proposed method assumes no prior knowledge of the underlying 3D shape when selecting the next best view. We demonstrate the method's effectiveness through detailed experiments on synthetic and real images and show that it provides better reconstruction quality than existing state-of-the-art 3D reconstruction and next-best-view prediction techniques. © 2022 IEEE.
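
The two-stage flow described above can be illustrated with a minimal sketch of the test-time pipeline. It is not the authors' implementation: the function names, the callable interfaces (`classifier`, `capture_view`, `reconstructor`), the argmax selection rule, and the use of voxel IoU as the quality measure are all assumptions made for illustration. The abstract does not specify any of them, nor how the two stages are trained end to end.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # placeholder 2D image type (assumption)
Voxels = List[int]         # flattened binary occupancy grid (assumption)


def select_next_view(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring candidate view."""
    return max(range(len(scores)), key=lambda i: scores[i])


def reconstruct_with_active_view(
    initial_view: Image,
    classifier: Callable[[Image], Sequence[float]],
    capture_view: Callable[[int], Image],
    reconstructor: Callable[[Image, Image], Voxels],
) -> Voxels:
    """Hypothetical test-time flow: classify, acquire, reconstruct."""
    # Stage 1: score candidate viewpoints from the initial view alone,
    # so no knowledge of the underlying 3D shape is required.
    scores = classifier(initial_view)
    next_view_index = select_next_view(scores)
    # Acquire the actively selected second view.
    second_view = capture_view(next_view_index)
    # Stage 2: reconstruct voxels from the pair of views.
    return reconstructor(initial_view, second_view)


def voxel_iou(predicted: Sequence[int], ground_truth: Sequence[int]) -> float:
    """Intersection over union of two binary occupancy grids."""
    intersection = sum(1 for p, g in zip(predicted, ground_truth) if p and g)
    union = sum(1 for p, g in zip(predicted, ground_truth) if p or g)
    # Two empty grids are treated as a perfect match.
    return intersection / union if union else 1.0
```

In this sketch, ground-truth voxels appear only in `voxel_iou`, which would be used for evaluation or for a training loss. The selection step itself sees nothing but the initial image, which matches the stated test-time assumption.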