The design, implementation, and use of the Ngram statistics package

Satanjeev Banerjee, Ted Pedersen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

125 Scopus citations

Abstract

The Ngram Statistics Package (NSP) is a flexible and easy-to-use software tool that supports the identification and analysis of Ngrams, sequences of N tokens in online text. We have designed and implemented NSP to be easy to customize to particular problems and yet remain general enough to serve a broad range of needs. This paper provides an introduction to NSP while raising some general issues in Ngram analysis, and summarizes several applications where NSP has been successfully employed. NSP is written in Perl and is freely available under the GNU Public License.
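To make the abstract's definition concrete, the sketch below counts Ngrams as runs of N consecutive tokens. It is an illustration only and not NSP itself, which is written in Perl. The function name `count_ngrams`, the lowercasing, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count every run of n consecutive tokens in text.

    Assumption for this sketch: tokens are lowercased,
    whitespace-separated strings.
    """
    tokens = text.lower().split()
    # Zip the token list against itself shifted by 1 .. n-1 positions,
    # so each tuple holds n consecutive tokens. zip stops at the shortest
    # slice, so texts with fewer than n tokens yield no Ngrams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


counts = count_ngrams("the cat sat on the mat and the cat slept", 2)
print(counts.most_common(1))
```

In this sample sentence the bigram ("the", "cat") occurs twice and every other bigram occurs once. Frequency counts of this kind are the raw material for the Ngram analysis the abstract describes.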

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Alexander Gelbukh
Publisher: Springer Verlag
Pages: 370-381
Number of pages: 12
ISBN (Print): 3540005323
State: Published - 2003
Event: 4th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2003 - Mexico City, Mexico
Duration: Feb 16 2003 to Feb 22 2003

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2588
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2003
Country/Territory: Mexico
City: Mexico City
Period: 2/16/03 to 2/22/03

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2003.
