Hypergraph-based multilevel matrix approximation for text information retrieval

Haw Ren Fang, Yousef Saad

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In Latent Semantic Indexing (LSI), a collection of documents is often pre-processed to form a sparse term-document matrix, followed by a computation of a low-rank approximation to the data matrix. A multilevel framework based on hypergraph coarsening is presented which exploits the hypergraph that is canonically associated with the sparse term-document matrix representing the data. The main goal is to reduce the cost of the matrix approximation without sacrificing accuracy. Because coarsening by multilevel hypergraph techniques is a form of clustering, the proposed approach can be regarded as a hybrid of factorization-based LSI and clustering-based LSI. Experimental results indicate that our method achieves good improvement of the retrieval performance at a reduced cost.
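For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of conventional factorization-based LSI: a rank-k truncated SVD of the term-document matrix, with the query projected into the same k-dimensional space and documents ranked by cosine similarity. It is not the multilevel hypergraph method of the paper, which is not reproduced here. The function name `lsi_scores`, the toy matrix, and the use of a dense NumPy SVD are assumptions made for illustration; a real collection would call for a sparse solver.

```python
import numpy as np

def lsi_scores(A, q, k):
    """Cosine similarity between query q and each document in a rank-k LSI space.

    A: (terms x documents) term-document matrix; q: (terms,) query vector.
    Hypothetical helper for illustration only.
    """
    # Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]
    # Column j of diag(sk) @ Vtk is document j in the reduced space.
    docs = sk[:, None] * Vtk
    # Project the query onto the same k-dimensional subspace.
    q_hat = Uk.T @ q
    # Cosine similarity; the floor on the denominator avoids division by zero.
    denom = np.linalg.norm(docs, axis=0) * np.linalg.norm(q_hat)
    return (q_hat @ docs) / np.maximum(denom, 1e-12)

# Toy collection: 6 terms x 4 documents, forming two unrelated topics.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)

# Query containing only term 1, which occurs in document 0 alone.
q = np.zeros(6)
q[1] = 1.0

scores = lsi_scores(A, q, k=2)
raw_overlap = A.T @ q  # plain term matching, for comparison
```

In this toy example, plain term matching gives document 1 no credit because it does not contain the query term, whereas the rank-2 scores place it alongside document 0, since both share term 0. Documents 2 and 3, from the other topic, score approximately zero.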

Original language: English (US)
Title of host publication: CIKM'10 - Proceedings of the 19th International Conference on Information and Knowledge Management and Co-located Workshops
Pages: 1597-1600
Number of pages: 4
State: Published - 2010
Event: 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10 - Toronto, ON, Canada
Duration: Oct 26 2010 → Oct 30 2010

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10
Country/Territory: Canada
City: Toronto, ON
Period: 10/26/10 → 10/30/10

Keywords

  • Latent Semantic Indexing
  • Low-rank matrix approximation
  • Multilevel hypergraph partitioning
  • Text information retrieval
