In this paper, we examine the problem of learning a representation of image transformations specific to a complex object class, such as faces. Learning such a representation for a specific object class would allow us to perform improved, pose-invariant visual verification, such as unconstrained face verification. We build on the method of using factored higher-order Boltzmann machines to model such image transformations. This approach will potentially enable us to use the model as one component of a larger deep architecture, allowing us to exploit the feature information in an ordinary deep network to better model transformations and to infer pose estimates from the hidden representation.

We focus on applying these higher-order Boltzmann machines to the NORB 3D objects data set and the Labeled Faces in the Wild face data set. We first show two different approaches to using this method on these object classes, demonstrating that while some useful transformation information can be extracted, the simple direct application of these models to higher-resolution, complex object classes is ultimately insufficient to achieve improved visual verification performance. Instead, we believe that this method should be integrated into a larger deep architecture, and we show initial results using the higher-order Boltzmann machine as the second layer of a deep architecture, above a first-layer convolutional RBM.
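To make the core mechanism concrete, the following is a minimal sketch of hidden-unit inference in a factored higher-order (gated) Boltzmann machine, where factors multiply the projections of an image pair so that the mapping units encode the transformation between them. All dimensions, weight names, and the random initialization are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper):
D, F, K = 64, 32, 16   # pixels per image, factors, mapping (hidden) units

# Factored three-way weights: one projection matrix per group of units,
# replacing a full D x D x K tensor of pairwise connections.
Wx = rng.normal(scale=0.1, size=(D, F))  # "before" image -> factors
Wy = rng.normal(scale=0.1, size=(D, F))  # "after" image  -> factors
Wh = rng.normal(scale=0.1, size=(K, F))  # mapping units  -> factors
bh = np.zeros(K)                         # hidden biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_mapping_units(x, y):
    """p(h_k = 1 | x, y): each factor multiplies the two image
    projections, and the products drive the mapping units."""
    fx = x @ Wx               # factor activities from the input image
    fy = y @ Wy               # factor activities from the output image
    return sigmoid(Wh @ (fx * fy) + bh)

x = rng.normal(size=D)        # "before" image (flattened)
y = rng.normal(size=D)        # transformed "after" image
h = infer_mapping_units(x, y)
print(h.shape)                # probabilities of the K mapping units
```

Because the images interact only through shared factors, the parameter count grows linearly in the number of pixels rather than quadratically, which is what makes applying such models to larger images plausible in the first place.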
Available at: http://works.bepress.com/erik_learned_miller/57/