Sequence similarity-driven proteomics in organisms with unknown genomes by LC-MS/MS and automated de novo sequencing

Proteomics. 2007 Jul;7(14):2318-29. doi: 10.1002/pmic.200700003.

Abstract

LC-MS/MS analysis on a linear ion trap LTQ mass spectrometer was combined with data processing, stringent database searching, and sequence-similarity database searching tools in a layered manner to identify proteins in organisms with unsequenced genomes. Highly specific stringent searches (MASCOT) were applied as a first-layer screen to identify either known proteins (i.e. those present in a database) or unknown proteins sharing identical peptides with related database sequences. Once the confidently matched spectra were removed, the remainder was filtered against a nonannotated library of background spectra, which removed spectra of common protein and chemical contaminants from the dataset. The cleaned spectral dataset was then subjected to rapid batch de novo interpretation by PepNovo software, followed by an MS BLAST sequence-similarity search that used multiple redundant and partially accurate candidate peptide sequences. Importantly, a single dataset was acquired at uncompromised sensitivity, with no need for manual selection of MS/MS spectra for subsequent de novo interpretation. This approach enabled completely automated identification of novel proteins that would otherwise have been missed by conventional database searches.
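
The layered workflow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `Spectrum` class, the `layered_identification` function, and the four callables are hypothetical names introduced here. The callables stand in for the stringent search (MASCOT), the background spectral library filter, the de novo interpreter (PepNovo), and the sequence-similarity search (MS BLAST); none of them reflects the real interface of those tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical minimal MS/MS spectrum record (illustration only)."""
    scan_id: int
    precursor_mz: float


def layered_identification(
    spectra: Iterable[Spectrum],
    stringent_match: Callable[[Spectrum], Optional[str]],
    is_background: Callable[[Spectrum], bool],
    de_novo: Callable[[Spectrum], List[str]],
    similarity_search: Callable[[Dict[int, List[str]]], Dict[int, str]],
) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Apply the layers in order and return (stringent hits, similarity hits).

    All four callables are assumed placeholders for external tools.
    """
    # Layer 1: stringent search; confidently matched spectra are set aside.
    stringent_hits: Dict[int, str] = {}
    unmatched: List[Spectrum] = []
    for spectrum in spectra:
        hit = stringent_match(spectrum)
        if hit is not None:
            stringent_hits[spectrum.scan_id] = hit
        else:
            unmatched.append(spectrum)

    # Layer 2: drop spectra that match the background (contaminant) library.
    cleaned = [s for s in unmatched if not is_background(s)]

    # Layer 3: batch de novo interpretation; each spectrum may yield several
    # redundant, partially accurate candidate sequences.
    candidates = {s.scan_id: de_novo(s) for s in cleaned}

    # Layer 4: sequence-similarity search over all candidates at once.
    similarity_hits = similarity_search(candidates)
    return stringent_hits, similarity_hits
```

Passing the tools in as functions keeps the sketch independent of any particular software; in practice each layer would wrap a call to the corresponding program and parse its output.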

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algal Proteins / chemistry
  • Algal Proteins / metabolism
  • Amino Acid Sequence
  • Chlorophyta
  • Chromatography, Liquid / instrumentation
  • Chromatography, Liquid / methods*
  • Computer Simulation
  • Databases, Protein
  • Genome*
  • Membrane Proteins / chemistry
  • Membrane Proteins / metabolism
  • Molecular Sequence Data
  • Proteomics / instrumentation
  • Proteomics / methods*
  • Sequence Homology, Amino Acid*
  • Software
  • Tandem Mass Spectrometry / instrumentation
  • Tandem Mass Spectrometry / methods*

Substances

  • Algal Proteins
  • Membrane Proteins