
A Point Set Generation Network for 3D Object ...


Deep learning for geometric object synthesis. In general, the problem of predicting geometries in an end-to-end fashion remains largely unexplored. In particular, our output, a 3D point set, is still not a typical object in the deep learning community. A point set contains orderless samples from a metric-measure space. Therefore, equivalent ...


Transcription of A Point Set Generation Network for 3D Object ...

A Point Set Generation Network for 3D Object Reconstruction from a Single Image

Haoqiang Fan, Institute for Interdisciplinary Information Sciences, Tsinghua University
Hao Su, Leonidas Guibas, Computer Science Department, Stanford University

Abstract

Generation of 3D data by deep neural networks has been attracting increasing attention in the research community. The majority of extant works resort to regular representations such as volumetric grids or collections of images; however, these representations obscure the natural invariance of 3D shapes under geometric transformations, and also suffer from a number of other issues.

In this paper we address the problem of 3D reconstruction from a single image, generating a straightforward form of output: point cloud coordinates. Along with this problem arises a unique and interesting issue: the ground-truth shape for an input image may be ambiguous. Driven by this unorthodox output form and the inherent ambiguity in the ground truth, we design an architecture, loss function, and learning paradigm that are novel and effective. Our final solution is a conditional shape sampler, capable of predicting multiple plausible 3D point clouds from an input image.

In experiments, not only can our system outperform state-of-the-art methods on single-image-based 3D reconstruction benchmarks, but it also shows strong performance for 3D shape completion and promising ability in making multiple plausible predictions.

Introduction

As we try to duplicate the successes of current deep convolutional architectures in the 3D domain, we face a fundamental representational issue. Extant deep net architectures for both discriminative and generative learning in the signal domain are well-suited to data that is regularly sampled, such as images, audio, or video.

However, most common 3D geometry representations, such as 2D meshes or point clouds, are not regular structures and do not easily fit into architectures that exploit such regularity for weight sharing, etc. That is why the majority of extant works on using deep nets for 3D data resort to either volumetric grids or collections of images (2D views of the geometry).

* equal contribution

Figure (panels: Input; Reconstructed 3D point cloud): A 3D point cloud of the complete object can be reconstructed from a single image. Each point is visualized as a small sphere. The reconstruction is viewed at two viewpoints (0° and 90° along azimuth). A segmentation mask is used to indicate the scope of the object in the image.

Such representations, however, lead to difficult trade-offs between sampling resolution and net efficiency. Furthermore, they enshrine quantization artifacts that obscure natural invariances of the data under rigid motions, etc.

In this paper we address the problem of generating the 3D geometry of an object based on a single image of that object. We explore generative networks for 3D geometry based on a point cloud representation. A point cloud representation may not be as efficient in representing the underlying continuous 3D geometry as a CAD model using geometric primitives or even a simple mesh, but for our purposes it has many advantages.
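To make the resolution trade-off and the quantization issue noted above for volumetric grids concrete, here is a minimal sketch. The grid resolutions, the point count of 1024, the cell spacing, and all function names are illustrative assumptions, not values or code taken from the paper.

```python
def dense_grid_cells(resolution):
    """Cells in a dense cubic occupancy grid with `resolution` cells per axis."""
    return resolution ** 3

def point_cloud_values(num_points):
    """Scalar coordinates stored for `num_points` points in 3D."""
    return 3 * num_points

def quantize(point, cell):
    """Snap a point to the nearest grid node of spacing `cell`."""
    return tuple(round(c / cell) * cell for c in point)

def translate(point, offset):
    """Apply a translation (one kind of rigid motion) to a single point."""
    return tuple(c + o for c, o in zip(point, offset))

# Grid size grows cubically with resolution; a point list grows linearly.
for resolution in (32, 64, 128):
    print(resolution, dense_grid_cells(resolution))
print("1024 points:", point_cloud_values(1024))

# Quantization does not commute with a rigid motion (hypothetical numbers).
p = (0.3, 0.0, 0.0)
shift = (0.4, 0.0, 0.0)
moved_then_snapped = quantize(translate(p, shift), 1.0)
snapped_then_moved = translate(quantize(p, 1.0), shift)
print(moved_then_snapped, snapped_then_moved)
```

Doubling the resolution multiplies the number of grid cells by eight, and the two orders of "snap" and "move" give different results, which is one simple way a grid can hide an invariance that the raw coordinates still have.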

A point cloud is a simple, uniform structure that is easier to learn, as it does not have to encode multiple primitives or combinatorial connectivity patterns. In addition, a point cloud allows simple manipulation when it comes to geometric transformations and deformations, as connectivity does not have to be updated. Our pipeline infers the point positions in a 3D frame determined by the input image and the inferred viewpoint position.

Given this unorthodox network output, one of our challenges is how to measure loss during training, as the same geometry may admit different point cloud representations at the same degree of approximation.

Unlike the usual L2 type losses, we use the solution of a transportation problem based on the Earth Mover's distance (EMD), effectively solving an assignment problem. We exploit an approximation to the EMD to provide speed as well as ensure differentiability for end-to-end training.

Our approach effectively attempts to solve the ill-posed problem of 3D structure recovery from a single projection using certain learned priors. The network has to estimate depth for the visible parts of the image and hallucinate the rest of the object geometry, assessing the plausibility of several different completions.
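The assignment view of the Earth Mover's distance mentioned above can be sketched for tiny inputs. This sketch assumes two point sets of equal size with unit mass per point, so that the transportation problem reduces to finding the cheapest one-to-one matching. It enumerates all matchings by brute force, which is exact but only feasible for a handful of points; it is not the approximation scheme the authors use, and the function names are hypothetical.

```python
import itertools
import math

def emd_bruteforce(a, b):
    """Exact EMD between equal-size point sets: minimum over all
    one-to-one matchings of the summed Euclidean distances."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size point sets")
    best = math.inf
    for perm in itertools.permutations(range(len(b))):
        cost = sum(math.dist(a[i], b[j]) for i, j in enumerate(perm))
        best = min(best, cost)
    return best

def ordered_l2(a, b):
    """Sum of squared distances between points paired by list position."""
    return sum(math.dist(p, q) ** 2 for p, q in zip(a, b))

shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reordered = [shape[1], shape[2], shape[0]]          # same set, different order
lifted = [(x, y, z + 0.5) for x, y, z in shape]     # every point moved by 0.5

print(ordered_l2(shape, reordered))   # position-wise loss penalizes reordering
print(emd_bruteforce(shape, reordered))
print(emd_bruteforce(shape, lifted))
```

A position-wise L2 loss reports a nonzero error for the reordered copy even though it describes the same set, while the matching-based distance is zero there and grows only when the points actually move.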

From a statistical perspective, it would be ideal if we could fully characterize the landscape of the ground truth space, or be able to sample plausible candidates accordingly. If we view this as a regression problem, then it has a rather unique and interesting feature arising from inherent object ambiguities in certain views. These are situations where there are multiple, equally good 3D reconstructions of a 2D image, making our problem very different from classical regression/classification settings, where each training sample has a unique ground truth annotation.
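A one-dimensional toy example shows why multiple equally good answers make this setting unlike ordinary regression. The two target values and the candidate grid below are made-up numbers for illustration only; nothing here comes from the paper's experiments.

```python
def mse(prediction, targets):
    """Mean squared error of one prediction against several valid targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)

# Two equally plausible "ground truths" for the same input (hypothetical).
targets = [-1.0, 1.0]

# Search a coarse grid of candidate predictions for the lowest average error.
candidates = [i / 10 for i in range(-20, 21)]
best = min(candidates, key=lambda p: mse(p, targets))

print(best, mse(best, targets))
print(mse(targets[0], targets), mse(targets[1], targets))
```

Under a squared-error loss, the prediction with the lowest average error is the mean of the valid answers, which here coincides with neither of them. In this toy case, a single deterministic regressor trained this way would be pushed toward an in-between output rather than any one plausible answer.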

In such settings the proper loss definition can be crucial in getting the most meaningful result.

Our final algorithm is a conditional sampler, which samples plausible 3D point clouds from the estimated ground truth space given an input image. Experiments on both synthetic and real world data verify the effectiveness of our method. Our contributions can be summarized as follows:

- We use deep learning techniques to study the point set generation problem;
- On the task of 3D reconstruction from a single image, we apply our point set generation network and significantly outperform the state of the art;
- We systematically explore issues in the architecture and loss function design for the point generation network;
- We discuss and address the ground-truth ambiguity issue for 3D reconstruction from a single image.

Code demonstrating our system can be [...]

Related Work

3D reconstruction from single images. While most research focuses on multi-view geometry such as SFM and SLAM [10, 9], ideally, one expects that 3D can be reconstructed from the abundant single-view images. Under this setting, however, the problem is ill-posed and priors must be incorporated. Early work such as ShapeFromX [12, 1] made strong assumptions over the shape or the environment lighting conditions.