proteins are ubiquitously expressed in eukaryotic cells (Melchior
F et al., 2003), and have been implicated in regulating
various cellular processes, e.g. stress response (Huang
TT et al., 2003), cell-cycle progression
S et al., 2001; Pinsky
BA et al., 2002; Seeler
JS et al., 2003), and gene expression (Muller
S et al., 2004).
SUMO proteins belong to the superfamily
of ubiquitin-like modifiers (UBLs) (Schwartz
DC et al., 2003). The SUMO family comprises three
members in mammals: SUMO-1, SUMO-2 and SUMO-3 (Saitoh
H et al., 2000). Recently, a fourth member,
SUMO-4, was discovered in humans (Bohren
KM et al., 2004). SUMO proteins
are highly conserved from yeast to human. Baker's yeast has only one
SUMO protein, Smt3.
Only 12 SUMO substrates had been verified
before 2000 (Melchior
F, 2000). More than 60 SUMO substrates have now been discovered
JS et al., 2003), and the search for new SUMO substrates has become
an active area of research.
Conventional experimental approaches
can identify SUMO substrates, but they are tedious and time-consuming.
Small-scale analysis using affinity chromatography coupled with
high-pressure liquid chromatography/tandem mass spectrometry (Zhao
Y et al., 2004) can improve efficiency, yet it found only
4 previously characterized and 18 novel potential SUMO substrates.
The majority of SUMO substrates
share a consensus motif of four amino acids, ψ-K-X-E, where ψ is a
hydrophobic residue, K is the modified lysine, X is any residue and
E is glutamate. Together with this motif, a nuclear localization
signal (NLS) suffices to produce a SUMO conjugate in vivo (Manuel
S et al., 2001), with only a few exceptions.
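As an illustration of how the consensus motif can be searched for in a protein sequence, the following minimal sketch scans a one-letter amino acid string for ψ-K-X-E. It is not the prediction method used in this work. The function name find_sumo_motifs, the residue set chosen for ψ (I, L, V, M, F) and the toy sequence in the usage note are all assumptions made for this example. The NLS requirement is not checked.

```python
import re

# Assumption: psi is approximated by the hydrophobic residues I, L, V, M, F.
# Other definitions of the hydrophobic position exist; adjust the set as needed.
PSI_RESIDUES = "ILVMF"

# A lookahead is used so that overlapping motif occurrences are all reported.
MOTIF_PATTERN = re.compile(r"(?=([%s]K[A-Z]E))" % PSI_RESIDUES)


def find_sumo_motifs(sequence):
    """Return (lysine_position, motif) pairs for each psi-K-X-E match.

    lysine_position is 1-based and refers to the candidate acceptor lysine.
    """
    sequence = sequence.upper()
    hits = []
    for match in MOTIF_PATTERN.finditer(sequence):
        # match.start() is the 0-based index of psi, so the lysine that
        # follows it sits at 1-based position match.start() + 2.
        hits.append((match.start() + 2, match.group(1)))
    return hits


if __name__ == "__main__":
    # Toy sequence invented for this example; it is not a real protein.
    toy_sequence = "MALKSEDGVKPEQQIKAE"
    for position, motif in find_sumo_motifs(toy_sequence):
        print(position, motif)
```

On the toy sequence, the scan reports candidate lysines at positions 4, 10 and 16. A match of this kind only marks a candidate site, since the motif alone does not establish that a protein is a SUMO substrate.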
Given these molecular characteristics,
it is possible to predict SUMO substrates in silico. Here
we combine pattern recognition with a comparative-sequence
approach to predict SUMO substrates.