Publications

Highlights

(For a full list see below)

Play and Learn: Using Video Games to Train Computer Vision Models

Video games are a compelling source of annotated data as they can readily provide fine-grained groundtruth for diverse tasks. However, it is not clear whether the synthetically generated data has enough resemblance to the real-world images to improve the performance of computer vision models in practice. We present experiments assessing the effectiveness on real-world data of systems trained on synthetic RGB images that are extracted from a video game. We collected over 60000 synthetic samples from a modern video game with similar conditions to the real-world CamVid and Cityscapes datasets. We provide several experiments to demonstrate that the synthetically generated RGB images can be used to improve the performance of deep neural networks on both image segmentation and depth estimation. These results show that a convolutional network trained on synthetic data achieves a similar test error to a network that is trained on real-world data for dense image classification. Furthermore, the synthetically generated RGB images can provide similar or better results compared to the real-world datasets if a simple domain adaptation technique is applied. Our results suggest that collaboration with game developers for an accessible interface to gather data is potentially a fruitful direction for future work in computer vision.

A. Shafaei, J. J. Little, Mark Schmidt

BMVC (2016)

Real-Time Human Motion Capture with Multiple Depth Cameras

Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike the previous work on 3d pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real-time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley MHAD dataset.

A. Shafaei, J. J. Little

CRV (2016)

Full List

Modular Generative Adversarial Networks
B. Zhao, B. Chang, Z. Jie and L. Sigal
European Conference on Computer Vision (ECCV), 2018

Probabilistic Video Generation using Holistic Attribute Control
J. He, A. Lehrmann, J. Marino, G. Mori and L. Sigal
European Conference on Computer Vision (ECCV), 2018

A Neural Multi-sequence Alignment TeCHnique (NeuMATCH)
P. Dogan, B. Li, L. Sigal and M. Gross
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Show Me a Story: Towards Coherent Neural Story Illustration
H. Ravi, L. Wang, C. Muniz, L. Sigal, D. Metaxas and M. Kapadia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Predicting Personality from Book Preferences with User-Generated Content Labels
N. Annalyn, M. W. Bos, L. Sigal and B. Li
IEEE Transactions on Affective Computing (TAC), 2018

Where should cameras look at soccer games: improving smoothness using the overlapped hidden Markov model
J Chen and J J. Little
Computer Vision and Image Understanding (2017)

Story Albums: Creating Fictional Stories from Personal Photograph Sets
O. Radiano, Y. Graber, M. Mahler, L. Sigal and A. Shamir
Computer Graphics Forum, Volume 36, 2017

Non-parametric Structured Outputs Networks
A. Lehrmann and L. Sigal
Neural Information Processing Systems (NIPS), 2017

Visual Reference Resolution using Attention Memory for Visual Dialog
P. H. Seo, A. Lehrmann, B. Han and L. Sigal
Neural Information Processing Systems (NIPS), 2017

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
F. Xiao, L. Sigal and Y. J. Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Learn How to Choose: Independent Detectors versus Composite Visual Phrases

Winter Conference on Applications of Computer Vision (WACV), 2017

Learning Online Smooth Predictions for Realtime Camera Planning using Recurrent Decision Trees
J Chen, H M. Le, P Carr, Y Yue, J J. Little
Computer Vision and Pattern Recognition (2016)

Real-time Physics-based Motion Capture with Sparse Sensors
S. Andrews, I. Huerta, T. Komura, L. Sigal and K. Mitchell
European Conference on Visual Media Production (CVMP), 2016

Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization
B. Xu, Y. Fu, Y.-G. Jiang, B. Li and L. Sigal
IEEE Transactions on Affective Computing (TAC), 2016

Learning Language-Visual Embedding for Movie Understanding with Natural-Language
A. Torabi, N. Tandon and L. Sigal
arXiv:1609.08124, 2016

Semi-supervised Vocabulary-informed Learning
Y. Fu and L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Learning Activity Progression in LSTMs for Activity Detection and Early Detection
S. Ma, L. Sigal and S. Sclaroff
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Harnessing Object and Scene Semantics for Large-Scale Video Understanding
Z. Wu, Y. Fu, Y.-G. Jiang and L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Video Emotion Recognition with Transferred Deep Feature Encodings
B. Xu, Y. Fu, Y.-G. Jiang, B. Li and L. Sigal
ACM International Conference in Multimedia Retrieval (ICMR), 2016

Knowledge Transfer with Interactive Learning of Semantics Relationships
J. Choi, S. Hwang, L. Sigal and L. Davis

Exploiting View-Specific Appearance Similarities Across Classes for Zero-shot Pose Prediction: A Metric Learning Approach
A. Kuznetsova, S. Hwang, B. Rosenhahn and L. Sigal
AAAI Conference on Artificial Intelligence (AAAI), 2016

Learning to Generate Posters of Scientific Papers
Y. Qiang, Y. Fu, Y. Guo, Z.-H. Zhou and L. Sigal
AAAI Conference on Artificial Intelligence (AAAI), 2016

Play and Learn: Using Video Games to Train Computer Vision Models
A. Shafaei, J. J. Little, Mark Schmidt
BMVC (2016)

Real-Time Human Motion Capture with Multiple Depth Cameras
A. Shafaei, J. J. Little
CRV (2016)

Storyline Representation of Egocentric Videos with an Application to Story-based Search
B. Xiong, G. Kim and L. Sigal
IEEE International Conference on Computer Vision (ICCV), 2015

Learning from Synthetic Data Using a Stacked Multichannel Autoencoder
X. Zhang, Y. Fu, S. Jiang, L. Sigal and G. Agam
IEEE International Conference on Machine Learning and Applications (ICMLA), 2015

Cross-Domain Matching with Squared-Loss Mutual Information
M. Yamada, L. Sigal, M. Raptis, M. Toyoda, Y. Chang and M. Sugiyama
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015

A Perceptual Control Space for Garment Simulation
L. Sigal, M. Mahler, S. Diaz, K. McIntosh, E. Carter, T. Richards and J. Hodgins
ACM Transactions on Graphics (Proc. SIGGRAPH), 2015

Discovering Collective Narratives of Theme Parks from Large Collections of Visitors Photo Streams
G. Kim and L. Sigal
KDD 2015

Hierarchical Maximum-Margin Clustering
G.-T. Zhou, S. Hwang, M. Schmidt, L. Sigal and G. Mori
arXiv:1502.01827, 2015

Joint Photo Stream and Blog Post Summarization and Exploration
G. Kim, S. Moon, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Ranking and Retrieval of Image Sequences from Multiple Paragraph Queries
G. Kim, S. Moon, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Space-Time Tree Ensemble for Action Recognition
S. Ma, L. Sigal, S. Sclaroff
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Expanding Object Detector’s Horizon: Incremental Learning Framework for Object Detection in Videos
A. Kuznetsova, S.-J. Hwang, B. Rosenhahn, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Learning to Select and Order Vacation Photographs
F. Sadeghi, J. R. Tena, A. Farhadi, L. Sigal
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015

Family Member Identification from Photo Collections
Q. Dai, P. Carr, L. Sigal, D. Hoiem
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015

Unlabelled 3D Motion Examples Improve Cross View Action Recognition
A. Gupta, A. Shafaei, J. J. Little and R. J. Woodham
BMVC (2014)

A Unified Semantic Embedding: Relating Taxonomies and Attributes
S.-J. Hwang, L. Sigal
Neural Information Processing Systems (NIPS), 2014

Parameterizing Object Detectors in the Continuous Pose Space
K. He, L. Sigal, S. Sclaroff
European Conference on Computer Vision (ECCV), 2014

Nonparametric Clustering with Distance Dependent Hierarchies
S. Ghosh, M. Raptis, L. Sigal, E. Sudderth
Conference on Uncertainty in Artificial Intelligence (UAI), 2014

Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction
G. Kim, L. Sigal, E. P. Xing
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014

Domain Adaptation for Structured Regression
M. Yamada, Y. Chang and L. Sigal
International Journal of Computer Vision (IJCV), Special Issue on Domain Adaptation for Vision Applications, 2014

High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
M. Yamada, W. Jitkrittum, L. Sigal, E. P. Xing and M. Sugiyama
Neural Computation (NC), 26(1):185-207, 2014

Covariate Shift Adaptation for Discriminative 3D Pose Estimation
M. Yamada, L. Sigal and M. Raptis
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2013

Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
N. Shapovalova, M. Raptis, L. Sigal, G. Mori
Neural Information Processing Systems (NIPS), 2013

From Subcategories to Visual Composites: A Multi-Level Framework for Object Detection
T. Lan, M. Raptis, L. Sigal, G. Mori
IEEE International Conference on Computer Vision (ICCV), 2013

Poselet Key-framing: A Model for Human Activity Recognition
M. Raptis, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013

Dynamical Simulation Priors for Human Motion Tracking
M. Vondrak, L. Sigal and O. C. Jenkins
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 35(1):52-65, 2013

Canonical Locality Preserving Latent Variable Model for Discriminative Pose Inference
Y. Tian, L. Sigal, F. De la Torre and Y. Jia
Image and Vision Computing (IVC), 31(3):223-230, 2013

Destination Flow for Crowd Simulation
S. Pellegrini, J. Gall, L. Sigal, L. van Gool
Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams (ARTEMIS’12), 2012

No Bias Left Behind: Covariate Shift Adaptation for Discriminative 3D Pose Estimation
M. Yamada, L. Sigal, M. Raptis
European Conference on Computer Vision (ECCV), 2012

Multi-linear Data-Driven Dynamic Hair Model with Efficient Hair-Body Collision Handling
P. Guan, L. Sigal, V. Reznitskaya, J. K. Hodgins
ACM/Eurographics Symposium on Computer Animation (SCA), 2012

Video-based 3D Motion Capture through Biped Control
M. Vondrak, L. Sigal, J. K. Hodgins and O. C. Jenkins
ACM Transactions on Graphics (Proc. SIGGRAPH), 2012

Human Context: Modeling human-human interactions for monocular 3D pose estimation
M. Andriluka and L. Sigal
VII Conference on Articulated Motion and Deformable Objects (AMDO), 2012

Social Roles in Hierarchical Models for Human Activity Recognition
T. Lan, L. Sigal and G. Mori
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

Human attributes from 3D pose tracking
M. Livne, L. Sigal, N. Troje and D. Fleet
Computer Vision and Image Understanding (CVIU), 116:648-660, 2012

Shared kernel information embedding for discriminative inference
R. Memisevic, L. Sigal and D. Fleet
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 34(4):778-790, 2012

Loose-limbed People: Estimating Human Pose and Motion using Non-parametric Belief Propagation
L. Sigal, M. Isard, H. Haussecker and M. J. Black
International Journal of Computer Vision (IJCV), 98(1):15-48, 2012

Recognizing Character-directed Utterances in Multi-child Interactions
H. Hajishirzi, J. Lehman, K. Kumatani, L. Sigal, and J. Hodgins
Late-breaking report section of Human Robot Interaction (HRI), 2012

Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines
M. Zeiler, G. Taylor, L. Sigal, I. Matthews and R. Fergus
Neural Information Processing Systems (NIPS), 2011

Visual Analysis of Humans: Looking at People
T. Moeslund, A. Hilton, V. Krüger and L. Sigal
ISBN 978-0-85729-996-3. To be published by Springer Verlag in October 2011

Benchmark Datasets for Pose Estimation and Tracking
M. Andriluka, L. Sigal and M. J. Black
Visual Analysis of Humans: Looking at People, T. Moeslund, A. Hilton, V. Krüger and L. Sigal (Eds), ISBN 978-0-85729-996-3. To be published by Springer Verlag in October 2011

Human Pose Estimation
L. Sigal
Encyclopedia of Computer Vision, Springer, 2011

Motion Capture from Body-Mounted Cameras
T. Shiratori, H. S. Park, L. Sigal, Y. Sheikh and J. K. Hodgins
ACM Transactions on Graphics (Proc. SIGGRAPH), July 2011

Inferring 3D Body Pose Using Variational Semi-parametric Regression
Y. Tian, Y. Jia, Y. Shi, Y. Liu, J. Hao and L. Sigal
IEEE International Conference on Image Processing (ICIP), 2011

Latent Gaussian Mixture Regression for Human Pose Estimation
Y. Tian, L. Sigal, H. Badino, F. De la Torre and Y. Liu
Asian Conference on Computer Vision (ACCV), 2010

Human Attributes from 3D Pose Tracking
L. Sigal, D. Fleet, N. Troje, M. Livne
European Conference on Computer Vision, ECCV 2010

Stable Spaces for Real-time Clothing
E. de Aguiar, L. Sigal, A. Treuille and J. K. Hodgins
ACM Trans. Graphics (Proc. SIGGRAPH), July 2010

Dynamical Binary Latent Variable Models for 3D Human Pose Tracking
G. Taylor, L. Sigal, D. Fleet, G. Hinton
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010

HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion
L. Sigal, A. Balan and M. J. Black
International Journal of Computer Vision (IJCV), Special Issue on Evaluation of Articulated Human Motion and Pose Estimation, 2010

Estimating Contact Dynamics
M. Brubaker, L. Sigal, D. Fleet
IEEE International Conference on Computer Vision, ICCV 2009

Dynamics and Control of Multibody Systems
M. Vondrak, L. Sigal and O. C. Jenkins
Motion Control, A. Lazinica (Eds), ISBN 978-953-7619-X-X, 2009

Shared Kernel Information Embedding for Discriminative Inference
L. Sigal, R. Memisevic, D. Fleet
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009

Video-Based People Tracking
M. Brubaker, L. Sigal and D. Fleet
Handbook on Ambient Intelligence and Smart Environments, H. Nakashima, H. Aghajan, and J.C. Augusto (Eds), Springer Verlag, 2009

Physical Simulation for Probabilistic Motion Tracking
M. Vondrak, L. Sigal and O. C. Jenkins
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008