In human object recognition, converging evidence shows that subjects' performance depends on their familiarity with an object's appearance, and that the extent of this dependence scales with inter-object similarity: the more similar the objects, the stronger the dependence and the more dominant the two-dimensional (2D), image-based information. The degree to which three-dimensional (3D), model-based information is also used, however, remains strongly debated. The authors previously showed that no model built from independent 2D templates, even one allowing 2D rotations in the image plane, can account for human performance in discriminating novel object views. Here the authors derive an analytic formulation of a Bayesian model that achieves the best possible performance under 2D affine transformations and demonstrate that this model, too, cannot account for human performance in 3D object discrimination. Relative to this model, human statistical efficiency is higher for novel views than for learned views, suggesting that human observers use some 3D structural information.
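The efficiency comparison described above rests on two standard ingredients: statistical efficiency, conventionally defined as the squared ratio of human to ideal sensitivity (d'), and the family of 2D affine transformations (x → Ax + t) under which the ideal observer matches templates. A minimal sketch of both, using hypothetical d' values and an arbitrary rotation-plus-shear transform chosen purely for illustration:

```python
import numpy as np

def statistical_efficiency(d_prime_human, d_prime_ideal):
    """Efficiency = (d'_human / d'_ideal)^2."""
    return (d_prime_human / d_prime_ideal) ** 2

def affine_transform(points, A, t):
    """Apply the 2D affine map x -> A x + t to an (N, 2) array of points."""
    return points @ A.T + t

# Hypothetical sensitivities, for illustration only (not data from the study)
eff = statistical_efficiency(d_prime_human=1.2, d_prime_ideal=3.0)
print(round(eff, 2))  # 0.16

# A square template warped by a rotation composed with a shear, then translated
template = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
shear = np.array([[1.0, 0.3],
                  [0.0, 1.0]])
A = rotation @ shear
t = np.array([2.0, -1.0])
warped = affine_transform(template, A, t)
```

An ideal observer in this setting searches over all such (A, t) pairs for the best template match, which is what makes its performance an upper bound for any purely 2D strategy.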
Bibliographical note

Funding Information:
DK was supported by a grant from the National Science Foundation, contract number SBR-9631682. We thank Ronen Basri, David Jacobs, David Knill, Michael Langer, Pascal Mamassian, Bosco Tjan, Daphna Weinshall, the anonymous reviewers and in particular, John Oliensis, for many helpful discussions. Weinshall pointed out to us the Werman–Weinshall theorem. Part of this work was presented at the Hong Kong International Workshop on ‘Theoretical Aspects of Neural Computation,’ Hong Kong University of Science and Technology, 1997; European Conference on Visual Perception (ECVP), Helsinki, Finland, 1997; ‘Neural Information Processing’ (NIPS), Denver, Colorado, 1997; and ‘International Conference on Computer Vision’ (ICCV), Mumbai, India, 1998.
- Affine transformation
- Ideal observer
- Object recognition
- Object representation
- Template matching