PFRMAT SS 
TARGET T0047 
AUTHOR 5209-2778-6515 
METHOD This is a short description of the secondary structure prediction
METHOD algorithm 'sspred', developed by Claus A. Andersen in Ole Lund's group.
METHOD  
METHOD The secondary structure prediction program (sspred) is a two-level system with
METHOD a neural network at each level. Evolutionary profiles are incorporated, and the
METHOD system has been tested in a tenfold cross-validation on a redundancy-reduced
METHOD dataset of 650 protein chains with low sequence similarity. It achieves a
METHOD prediction performance of Qtot = 72.6% when predicting the DSSP helix ('G','H'),
METHOD sheet ('E'), and coil categories.
METHOD  
METHOD The first level is a standard feedforward neural network with one hidden layer, 
METHOD which predicts the secondary structure elements (H,E,C) from the amino acid 
METHOD sequence. The amino acids are represented with sparse encoding in sequence
METHOD windows of 13 residues, and the secondary structure of the central residue
METHOD is predicted. The network is trained on the PDB sequences and their sequence
METHOD profiles, and training is stopped when the sum of the Matthews correlation
METHOD coefficients (C_helix+C_sheet+C_coil) is maximal on the test set. The second
METHOD level is also a standard feedforward neural network with one hidden layer. This 
METHOD structure-to-structure network is given the predictions of the first level and
METHOD predicts the same secondary structure elements (H,E,C). The predictions given
METHOD as input are those for the PDB sequence and its profile. The predictions for
METHOD each alignment in the profile are grouped according to the quality of the
METHOD alignment. There are three groups (good, medium, poor), each the average of
METHOD the predictions for alignments of that quality.
METHOD  
METHOD To produce a prediction for unknown sequences, e.g. CASP targets, the
METHOD ten cross-validation predictions are averaged.
REMARK Direct output from sspred. 
MODEL 2 
E C 0.99 
E C 0.74 
A C 0.72 
S C 0.66 
S C 0.66 
T C 0.71 
R C 0.72 
G C 0.71 
N C 0.71 
L C 0.66 
D C 0.61 
V E 0.40 
A E 0.40 
K E 0.40 
L C 0.52 
N C 0.72 
G C 0.71 
D C 0.59 
W E 0.40 
F E 0.62 
S E 0.67 
I E 0.72 
V E 0.61 
V E 0.48 
A C 0.52 
S C 0.59 
N C 0.71 
K C 0.61 
R H 0.54 
E H 0.61 
K H 0.61 
I H 0.57 
E C 0.52 
E C 0.71 
N C 0.83 
G C 0.90 
S C 0.78 
M E 0.40 
R E 0.54 
V E 0.67 
F E 0.62 
M E 0.45 
Q H 0.50 
H H 0.53 
I H 0.50 
D H 0.44 
V C 0.52 
L C 0.61 
E C 0.66 
N C 0.66 
S C 0.61 
L E 0.45 
G E 0.61 
F E 0.72 
K E 0.77 
F E 0.77 
R E 0.77 
I E 0.62 
K C 0.59 
E C 0.72 
N C 0.80 
G C 0.78 
E C 0.71 
C C 0.50 
R C 0.41 
E E 0.48 
L E 0.54 
Y E 0.54 
L E 0.62 
V E 0.54 
A E 0.48 
Y C 0.50 
K C 0.66 
T C 0.78 
P C 0.83 
E C 0.86 
D C 0.83 
G C 0.71 
E E 0.54 
Y E 0.81 
F E 0.84 
V E 0.84 
E E 0.77 
Y E 0.54 
D C 0.72 
G C 0.83 
G C 0.80 
N C 0.71 
T E 0.61 
F E 0.81 
T E 0.84 
I E 0.81 
L E 0.72 
K E 0.48 
T C 0.61 
D C 0.78 
Y C 0.72 
D C 0.59 
R H 0.57 
Y H 0.53 
V E 0.62 
M E 0.72 
F E 0.67 
H E 0.62 
L E 0.48 
I C 0.40 
N C 0.61 
F C 0.74 
K C 0.80 
N C 0.86 
G C 0.83 
E C 0.72 
T C 0.55 
F E 0.40 
Q E 0.54 
L E 0.61 
M E 0.61 
V E 0.54 
L E 0.48 
Y C 0.41 
G C 0.61 
R C 0.72 
T C 0.78 
K C 0.66 
D C 0.61 
L C 0.59 
S C 0.59 
S H 0.77 
D H 0.90 
I H 0.97 
K H 0.98 
E H 0.98 
K H 0.98 
F H 0.98 
A H 0.98 
K H 0.97 
L H 0.97 
C H 0.90 
E H 0.77 
A H 0.64 
H C 0.80 
G C 0.90 
I C 0.88 
T C 0.74 
R C 0.50 
D E 0.48 
N E 0.54 
I E 0.61 
I E 0.62 
D E 0.52 
L C 0.59 
T C 0.72 
K C 0.74 
T C 0.66 
D C 0.61 
R C 0.66 
C C 0.78 
L C 1.00 
Q C 0.00 
A C 0.00 
R C 0.00 
G C 0.00 
END 
