Im2SurfTex: Surface Texture Generation via Neural Backprojection of Multi-View Images

1University of Cyprus
2CYENS CoE
3Technical University of Crete


Given a text prompt and an untextured 3D shape, Im2SurfTex generates a texture for it by learning to backproject images produced by text-to-image (T2I) diffusion models to the shape's texture space. Left: Im2SurfTex diminishes artifacts on surfaces with self-occlusions and complex geometry, preserving fine details where alternatives like Paint3D struggle with backprojection issues, such as the guard grill's texture bleeding onto the candle inside the lantern. Right: Im2SurfTex prevents seam formation on high-curvature surfaces and seamlessly blends multiple views. In contrast, other approaches, such as MatAtlas, often introduce texture discontinuities, as seen on the apple, or fail to resolve multi-view inconsistencies, leading to visible artifacts, as in the teapot.



Abstract


We present Im2SurfTex, a method that generates textures for input 3D shapes by learning to aggregate multi-view image outputs produced by 2D image diffusion models onto the shapes’ texture space. Unlike existing texture generation techniques that use ad hoc backprojection and averaging schemes to blend multi-view images into textures, often resulting in texture seams and artifacts, our approach employs a trained, feedforward neural module to boost texture coherency. The key ingredient of our module is neural attention combined with appropriate positional encodings of image pixels, based on their corresponding 3D point positions, normals, and surface-aware coordinates encoded via geodesic distances within surface patches. These encodings capture texture correlations between neighboring surface points, ensuring better texture continuity. Experimental results show that our module improves texture quality, achieving superior performance in high-resolution texture generation.
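To make the encoding concrete, below is a minimal PyTorch sketch of how per-pixel geometric attributes (3D point position, surface normal, and geodesic-patch coordinates) could be turned into positional encodings. The sinusoidal scheme, band count, and function names are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def fourier_encode(x, num_bands=6):
    """Sinusoidal encoding of raw geometric attributes
    (a common choice; the paper's exact encoding may differ)."""
    # x: (..., D) raw values, e.g. a 3D position, normal, or geodesic coords
    freqs = 2.0 ** torch.arange(num_bands, dtype=x.dtype, device=x.device)
    ang = x.unsqueeze(-1) * freqs              # (..., D, num_bands)
    enc = torch.cat([ang.sin(), ang.cos()], dim=-1)
    return enc.flatten(-2)                     # (..., D * 2 * num_bands)

def encode_pixel(position, normal, geodesic_uv):
    """Concatenate encodings of a pixel's corresponding 3D point position,
    surface normal, and surface-aware (geodesic-patch) coordinates."""
    return torch.cat([fourier_encode(position),   # (..., 3) -> (..., 36)
                      fourier_encode(normal),     # (..., 3) -> (..., 36)
                      fourier_encode(geodesic_uv) # (..., 2) -> (..., 24)
                      ], dim=-1)
```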


A gallery of 3D shapes across various categories, showcasing high-resolution, seamless, and coherent textures generated by Im2SurfTex.


Method Overview


(Top) The Im2SurfTex pipeline uses depth images and a text prompt to generate a set of candidate views (RGB images) for a given shape. The views are aggregated through a learned backprojection module that incorporates geometric information, such as 3D locations, normals, and angles between normals and view vectors, as well as geodesic neighborhood information (bottom right) of the shape points corresponding to pixels in the generated RGB images. The backprojection module integrates several cross-attention blocks (bottom left) that infer texel features and colors from the appearance and geometric information gathered from relevant, non-background pixels across all available views. As some texels may remain uncolored, an inpainting and high-definition (HD) module is applied to refine the texture map, following Paint3D.
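For illustration, the following PyTorch sketch shows one cross-attention block in the spirit of the learned backprojection module: texel queries attend over appearance-and-geometry features of non-background pixels pooled across all views. All dimensions, names, and the exact block layout here are hypothetical assumptions.

```python
import torch
import torch.nn as nn

class TexelCrossAttention(nn.Module):
    """One cross-attention block of a learned backprojection module
    (a sketch; dimensions and layout are assumed, not the paper's)."""
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim),
                                nn.GELU(),
                                nn.Linear(4 * dim, dim))

    def forward(self, texel_feats, pixel_feats, background_mask):
        # texel_feats:     (B, T, dim) one query per texel to be colored
        # pixel_feats:     (B, P, dim) appearance + geometric encodings of
        #                  pixels gathered across all candidate views
        # background_mask: (B, P) True where a pixel is background (ignored)
        q = self.norm_q(texel_feats)
        kv = self.norm_kv(pixel_feats)
        out, _ = self.attn(q, kv, kv, key_padding_mask=background_mask)
        texel_feats = texel_feats + out     # residual attention update
        return texel_feats + self.ff(texel_feats)
```

In practice several such blocks would be stacked, with a final head mapping the refined texel features to RGB colors, matching the caption's description of inferring texel features and colors from relevant pixels.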



Paper


Im2SurfTex: Surface Texture Generation via Neural Backprojection of Multi-View Images

Yiangos Georgiou, Marios Loizou, Melinos Averkiou and Evangelos Kalogerakis

PDF
BibTeX