Hi!
I'm Yang Zhou

    I am a research scientist in the Dynamic Media Organization (DMO) at Adobe Research.

I completed my Ph.D. in the Computer Graphics Research Group at UMass Amherst, advised by Prof. Evangelos Kalogerakis. I obtained my master's degree from Georgia Institute of Technology, and my master's and bachelor's degrees from Shanghai Jiao Tong University, advised by Prof. Weiyao Lin.

I work in the areas of computer graphics, computer vision, and deep learning. In particular, I am interested in using deep learning techniques to help artists, stylists, and animators create better designs. My research interests include digital humans, character animation, audio-visual learning, image/video translation, and rigging/skinning.

    Download CV



[NEW!] [Oct. 2022] 1 paper accepted by SIGGRAPH Asia 2022.

[NEW!] [Oct. 2022] 2 papers accepted by ECCV 2022.

► [Jun. 2022] 2 papers accepted by CVPR 2022.

► [May 2021] Started my new journey at Adobe Research as a full-time research scientist.

► [Mar. 2021] Gave a talk on deep learning architectures for character animation at the Intelligent Graphics Lab, Chinese Academy of Sciences.

► [Nov. 2020] Our summer intern project #OnTheBeatSneak was presented at Adobe MAX 2020 (Sneak Peek). [Quick Look] [Full YouTube Link] [Press]

► [Aug. 2020] Our paper MakeItTalk accepted by SIGGRAPH Asia 2020. [Video]

► [Apr. 2020] Our paper RigNet accepted by SIGGRAPH 2020. [Video]

► [Nov. 2019] Our summer intern project #SweetTalkSneak was presented at Adobe MAX 2019 (Sneak Peek). [YouTube Link] [Press]

► [Aug. 2019] Our paper on Animation Skeleton Prediction accepted by 3DV 2019.

► [Jul. 2019] Our paper SceneGraphNet accepted by ICCV 2019.

► [Jun. 2019] Joined Adobe CIL (Seattle) as a summer intern.

► [Jun. 2018] Joined Wayfair Next Research as a summer intern and fall co-op intern.

► [Apr. 2018] Our paper VisemeNet accepted by SIGGRAPH 2018. [Video]



MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds 2021-2022

Z. Xu, Yang Zhou, Y. Li, E. Kalogerakis
SIGGRAPH Asia 2022

We present MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters. Our method is also able to animate the 3D meshes according to the captured point cloud motion. At the heart of our approach lies a deep neural network that encodes motion cues from the point clouds into features that are informative about the articulated parts of the performing character. These features guide the inference of an appropriate skeletal rig for the input mesh, which is then animated based on the input point cloud motion. Our method can rig and animate diverse characters, including humanoids, quadrupeds, and toys with varying articulations. It is designed to account for occluded regions in the input point cloud sequences and any mismatches in the part proportions between the input mesh and captured character.

[Project Page] [Paper] [Code]

Skeleton-free Pose Transfer for Stylized 3D Characters 2021-2022

Z. Liao, J. Yang, J. Saito, G. Pons-Moll, Yang Zhou
ECCV 2022

We present the first method that automatically transfers poses between stylized 3D characters without skeletal rigging. In contrast to previous attempts to learn pose transformations on fixed or topology-equivalent skeleton templates, our method focuses on a novel scenario: handling skeleton-free characters with diverse shapes, topologies, and mesh connectivities. The key idea of our method is to represent the characters in a unified articulation model so that the pose can be transferred through the corresponding parts. To achieve this, we propose a novel pose transfer network that jointly predicts the character skinning weights and deformation transformations to articulate the target character to match the desired pose. Our method is trained in a semi-supervised manner, drawing on existing character data with paired/unpaired poses and stylized shapes. It generalizes well to unseen stylized characters and inanimate objects.

[Project Page] [Paper] [Code]

Learning Visibility for Robust Dense Human Body Estimation 2021-2022

C. Yao, J. Yang, D. Ceylan, Y. Zhou, Yang Zhou, M. Yang
ECCV 2022

Estimating 3D human pose and shape from 2D images is a crucial yet challenging task. In this work, we learn dense human body estimation that is robust to partial observations. We explicitly model the visibility of human joints and vertices along the x, y, and z axes separately. Visibility along the x and y axes helps distinguish out-of-frame cases, while visibility along the depth axis corresponds to occlusions (either self-occlusions or occlusions by other objects). We obtain pseudo ground truths of visibility labels from dense UV correspondences and train a neural network to predict visibility along with 3D coordinates. We show that visibility can serve as 1) an additional signal to resolve depth-ordering ambiguities of self-occluded vertices and 2) a regularization term when fitting a human body model to the predictions.

[Paper] [Code]

Audio-driven Neural Gesture Reenactment with Video Motion Graphs 2020-2022

Yang Zhou, J. Yang, D. Li, J. Saito, D. Aneja, E. Kalogerakis
CVPR 2022

Human speech is often accompanied by body gestures, including arm and hand gestures. We present a method that reenacts a high-quality video with gestures matching a target speech audio. The key idea of our method is to split and re-assemble clips from a reference video through a novel video motion graph encoding valid transitions between clips. To seamlessly connect different clips in the reenactment, we propose a pose-aware video blending network which synthesizes video frames around the stitched frames between two clips. Moreover, we develop an audio-based gesture search algorithm to find the optimal order of the reenacted frames. Our system generates reenactments that are consistent with both the audio rhythms and the speech content.

[Project Page] [Paper] [Code]

APES: Articulated Part Extraction from Sprite Sheets 2021-2022

Z. Xu, M. Fisher, Yang Zhou, D. Aneja, R. Dudhat, L. Yi, E. Kalogerakis
CVPR 2022

Rigged puppets are one of the most prevalent representations for creating 2D character animations. Creating these puppets requires partitioning characters into independently moving parts. In this work, we present a method to automatically identify such articulated parts from a small set of character poses shown in a sprite sheet, which is an illustration of the character that artists often draw before puppet creation. Our method is trained to infer articulated body parts, e.g., head, torso, and limbs, that can be re-assembled to best reconstruct the given poses. Our results demonstrate significantly better performance than alternatives, both qualitatively and quantitatively.

[Project Page] [Paper]

MakeItTalk: Speaker-Aware Talking-Head Animation 2019-2020

Yang Zhou, X. Han, E. Shechtman, J. Echevarria, E. Kalogerakis, D. Li
SIGGRAPH Asia 2020

We present a method that generates expressive talking heads from a single facial image with audio as the only input. Our method first disentangles the content and speaker information in the input audio signal. The audio content robustly controls the motion of the lips and nearby facial regions, while the speaker information determines the specifics of facial expressions and the rest of the talking-head dynamics. Our method is able to synthesize photorealistic videos of entire talking heads with a full range of motion, and can also animate artistic paintings, sketches, 2D cartoon characters, Japanese manga, and stylized caricatures in a single unified framework.

[Project Page] [Paper] [Video] [New Video!] [Code]

RigNet: Neural Rigging for Articulated Characters 2018-2019

Z. Xu, Yang Zhou, E. Kalogerakis, C. Landreth, K. Singh
SIGGRAPH 2020

We present RigNet, an end-to-end automated method for producing animation rigs from input character models. Given an input 3D model representing an articulated character, RigNet predicts a skeleton that matches animators' expectations in joint placement and topology. It also estimates surface skin weights based on the predicted skeleton. Our method is based on a deep architecture that directly operates on the mesh representation without making assumptions on shape class and structure.

[Project Page] [Video] [Code] [Paper]

SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation 2018-2019

Yang Zhou, Z. While, E. Kalogerakis
International Conference on Computer Vision (ICCV), 2019

We propose a neural message passing approach to augment an input 3D indoor scene with new objects matching their surroundings. Given an input, potentially incomplete, 3D scene and a query location, our method predicts a probability distribution over object types that fit well in that location. Our distribution is predicted through passing learned messages in a dense graph whose nodes represent objects in the input scene and edges represent spatial and structural relationships.

[Project Page] [Paper] [Code]

Predicting Animation Skeletons for 3D Articulated Models via Volumetric Nets 2018-2019

Z. Xu, Yang Zhou, E. Kalogerakis, K. Singh
International Conference on 3D Vision (3DV), 2019

We present a learning method for predicting animation skeletons for input 3D models of articulated characters. In contrast to previous approaches that fit pre-defined skeleton templates or predict fixed sets of joints, our method produces an animation skeleton tailored for the structure and geometry of the input 3D model.

[Project Page] [Code]

Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55 2017

L. Yi, L. Shao, M. Savva, H. Huang, Yang Zhou, et al.
International Conference on Computer Vision Workshops (ICCVW), 2017

ShapeNet is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D shapes. We collaborated with the ShapeNet team to help build the training and testing datasets for “Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55”. In particular, we helped check for geometry duplicates in the ShapeNet Core dataset.

[3D Shape Reconstruction and Segmentation Task Page] [Paper] [ShapeNet Duplicate Check]

A Tube-and-Droplet-based Approach for Representing and Analyzing Motion Trajectories 2014-2016

W. Lin, Yang Zhou, H. Xu, J. Yan, M. Xu, J. Wu, Z. Liu
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 39(8), pp. 1489-1503, 2017

We address the problem of representing motion trajectories in a highly informative way and, consequently, utilizing this representation for trajectory analysis. We apply our tube-and-droplet representation to trajectory analysis applications including trajectory clustering, trajectory classification & abnormality detection, and 3D action recognition.

[Project Page] [Paper] [Dataset] [Code]

Unsupervised Trajectory Clustering via Adaptive Multi-Kernel-based Shrinkage 2014-2015

H. Xu, Yang Zhou, W. Lin, H. Zha
International Conference on Computer Vision (ICCV), pp. 4328-4336, 2015

We introduce an adaptive multi-kernel-based estimation process to estimate the 'shrunk' positions and speeds of trajectory points. This kernel-based estimation effectively leverages both multiple structural cues within a trajectory and the local motion patterns across multiple trajectories, such that the discriminative power of the shrunk points is properly increased.


Representing and recognizing motion trajectories: a tube and droplet approach 2013-2014

Yang Zhou, W. Lin, H. Su, J. Wu, J. Wang, Y. Zhou
ACM Intl. Conf. on Multimedia (MM), pp. 1077-1080, 2014

This paper addresses the problem of representing and recognizing motion trajectories. We propose a 3D tube which can effectively embed both motion and scene-related information of a motion trajectory, and a droplet-based method which can suitably capture the characteristics of the 3D tube for activity recognition.




Adobe, Inc. | Media Intelligence Lab

May 2021 | Research Scientist

Working on digital-human-related projects.

Adobe, Inc. | Media Intelligence Lab

June 2020 | Research Intern

Collaborated with researchers on 3D facial/skeletal animation based on deep learning approaches.

Our intern project #OnTheBeatSneak was presented at Adobe MAX 2020 (Sneak Peek).

[Quick Look] [Full YouTube Link] [Press]

Adobe, Inc. | Creative Intelligence Lab

June 2019 | Research Intern

Collaborated with researchers on audio-driven facial animation and lip-sync technologies for cartoon and real human faces, based on deep learning approaches.

Our intern project #SweetTalk was presented at Adobe MAX 2019 (Sneak Peek).

[YouTube Link] [Press]

Wayfair, Inc. | Wayfair Next Research

June 2018 | Research Intern

Working on 3D scene synthesis based on deep learning approaches.

NetEase Games, Inc.

June 2015 | Management Trainee

Working on mobile game design, with a focus on profit models and user experience.

Contact Me

The best way to reach me is to send an email.