Background: The poly Q polymorphism in the AIB1 (amplified in breast cancer) gene is usually assessed by fragment length analysis, which does not reveal the actual sequence variation. The purpose of this study was to investigate the sequence variation of the poly Q encoding region in breast cancer cell lines at the single-molecule level, and to determine whether this sequence variation is related to AIB1 gene amplification.
Methods: The polymorphic poly Q encoding region of the AIB1 gene was investigated at the single-molecule level by PCR cloning and sequencing. Amplification of the AIB1 gene in various breast cancer cell lines was studied by real-time quantitative PCR.
Results: Significant amplification (5- to 23-fold) of the AIB1 gene was found in 2 of 9 (22%) ER-positive cell lines (BT-474 and MCF-7, but not BT-20, ZR-75-1, T47D, BT483, MDA-MB-361, MDA-MB-468, or MDA-MB-330). The AIB1 gene was not amplified in any of the ER-negative cell lines. Different passages of MCF-7 cells and their derivatives retained the AIB1 amplification. When the cells were selected for hormone independence (LCC1) and resistance to 4-hydroxytamoxifen (4-OH TAM) (LCC2 and R27), ICI 182,780 (LCC9), or 4-OH TAM, KEO and LY 117018 (LY-2), the AIB1 copy number decreased, but the gene remained highly amplified. Sequencing of the poly Q encoding region of the AIB1 gene did not reveal specific patterns that could be correlated with AIB1 gene amplification. However, about 72% of the breast cancer cell lines had at least one underrepresented (<20%) additional poly Q encoding sequence pattern derived from the original allele, presumably as a result of somatic instability. Although MCF-7 cells and all of their variants shared the predominant poly Q encoding sequence pattern of the original cell line, (CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA, a number of altered poly Q encoding sequences were found in the MCF-7 derivatives.
Conclusion: These data suggest that the poly Q encoding region of the AIB1 gene is somatically unstable in breast cancer cell lines. However, neither the instability nor the sequence characteristics appear to be associated with the level of gene amplification.
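
Illustration: the compact repeat notation used for the predominant MCF-7 pattern, (CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA, can be expanded into its nucleotide sequence and split into codons. The following is a minimal Python sketch for reading that notation, not part of the study's methods; the helper names `expand_repeat_notation` and `split_codons` are hypothetical and introduced here only for the example. Both CAG and CAA encode glutamine (Q), so every codon in the expanded pattern contributes one residue to the poly Q tract.

```python
import re
from collections import Counter


def expand_repeat_notation(pattern: str) -> str:
    """Expand compact repeat notation, e.g. '(CAG)3CAA' -> 'CAGCAGCAGCAA'.

    Hypothetical helper: '(UNIT)n' means UNIT repeated n times; bases
    outside parentheses are copied through unchanged.
    """
    expanded = re.sub(
        r"\(([ACGT]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        pattern,
    )
    # Anything other than plain bases left over means the notation was malformed.
    if not re.fullmatch(r"[ACGT]+", expanded):
        raise ValueError(f"unrecognised repeat notation: {pattern!r}")
    return expanded


def split_codons(sequence: str) -> list[str]:
    """Split an in-frame nucleotide sequence into consecutive 3-base codons."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


# Predominant MCF-7 pattern as written in the Results.
MCF7_PATTERN = "(CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA"

sequence = expand_repeat_notation(MCF7_PATTERN)
codons = split_codons(sequence)
counts = Counter(codons)

print(len(sequence))   # 78 nucleotides
print(len(codons))     # 26 codons, all encoding glutamine
print(counts["CAG"])   # 19
print(counts["CAA"])   # 7
```

Fragment length analysis would report only the total length of such a tract, so two alleles with the same number of codons but a different arrangement of CAG and CAA would be indistinguishable, whereas comparing the expanded codon lists distinguishes them.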