3D to 2D: Using Image Segmentation to Automate Rotoscoping

Implemented in Python

View Source Code Here

The objective of this research project is to use neural network techniques, specifically a U-Net model, to build an image segmentation tool that extracts the shape, background, and four distinct shade values from an input image of a 3-dimensional scene. This project works toward automating rotoscoping in the field of animation by transforming a 3D visual into a 2D representation.
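
As a concrete illustration of the architecture, below is a minimal U-Net sketch in PyTorch. The six-class output (background, shape, and four shade values), the layer widths, and the two-level depth are assumptions made for illustration, not the exact configuration used in this project.

```python
# Minimal U-Net sketch in PyTorch. NUM_CLASSES = 6 (background, shape,
# and 4 shade values) is an assumed reading of the label set; the
# project's real architecture and class count may differ.
import torch
import torch.nn as nn

NUM_CLASSES = 6  # assumption: background + shape + 4 shade values

def double_conv(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=NUM_CLASSES):
        super().__init__()
        # Contracting path: downsample while widening feature channels.
        self.enc1 = double_conv(in_ch, 64)
        self.enc2 = double_conv(64, 128)
        self.enc3 = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        # Expanding path: upsample and fuse with encoder skip connections.
        self.up2 = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec1 = double_conv(128, 64)
        # 1x1 convolution maps features to per-pixel class logits.
        self.head = nn.Conv2d(64, n_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)              # full resolution
        e2 = self.enc2(self.pool(e1))  # 1/2 resolution
        e3 = self.enc3(self.pool(e2))  # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)           # (N, n_classes, H, W)

# Usage: per-pixel class prediction for one dummy 3-channel frame.
model = UNet()
frame = torch.randn(1, 3, 256, 256)
logits = model(frame)
labels = logits.argmax(dim=1)          # (1, 256, 256) class map
```

The skip connections that concatenate encoder features into the decoder are what let the network recover sharp boundaries after downsampling, which matters here because clean shape outlines are the whole point of the segmented rotoscope output.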


Background: Rotoscoping is a widely used technique in animation in which an artist traces over a video frame by frame to create a 2D sequence. A frequent first step in rotoscoping, as in many artistic practices, is to break the 3D figure down into its key shapes before moving on to finer detail. In a larger animation project that runs minutes or even hours at the standard 30 frames per second, this frame-by-frame drawing process becomes exceptionally tedious and time-consuming. Some tools within the field ease this manual process; however, it remains an extremely time-consuming task. The motivation for this project is to explore how a trained neural network can streamline this process and reduce the manual effort required from animators. Successfully automating this step would mark a significant advancement in the rotoscoping pipeline, giving animators a more efficient tool for creating animation from live-action or 3D sequences while letting them focus on creative refinement.
Fig. 1: Input sequence
Fig. 2: Output segmented sequence