PFRMAT AL 
TARGET T0055 
AUTHOR 5827-4749-3439 
REMARK HMM and threading prediction for T0055
METHOD We used two very different methods on this target: HMMs and 
METHOD protein threading. 
METHOD For the HMMs, as usual, we employed a two-pronged approach: 
METHOD (1) We scored the target against all the HMMs in 
METHOD our HMM library, and (2) we gathered homologs to 
METHOD the target using PSI-BLAST, constructed an
METHOD HMM for the target and homologs (using UCSC's HMM software), and
METHOD scored the PDB sequences with it. This yielded two sets of
METHOD scores, which we then used to find a target-structure match.
METHOD 
METHOD The highest-scoring HMMs in the first stage of the analysis 
METHOD were 1esl_1 and homologs (SCOP domains d2msbb and d1lit),
METHOD all three of which had very strong scores. The second stage,
METHOD scoring an HMM constructed for the target and homologs against
METHOD PDB, produced very strong scores for 1esl_1 (undoubtedly
METHOD partly because the homologs found using PSI-BLAST
METHOD included 1esl_1!).
METHOD 
METHOD The alignment submitted was created by aligning the target and structure 
METHOD to an HMM constructed for the target and homologs. 
METHOD 
METHOD Protein threading resulted in all three proteins from the SCOP C-type 
METHOD lectin domain family appearing in the top 10 structures predicted for the 
METHOD target sequence.  Each of the top 10 predicted structures was then
METHOD threaded against the full Protein Data Bank (inverse threading).
METHOD This step is useful for filtering out false positives. 
METHOD Of the top 10 hits, the score for the target sequence was within
METHOD the range of known SCOP superfamily member sequences for only the
METHOD lectin domain.  Since the sequence similarity to 1esl is evident,
METHOD we could judge that the automatic threading alignment
METHOD was, not surprisingly, inferior to the HMM alignment.  We include
METHOD this threading alignment here only in the interest of completeness
METHOD (1lit scored slightly higher than 1esl).
METHOD >t0055 
METHOD MD---------YEILFSDETMNYADAGTYCQSRGMALVSSAMRDSTMVKAILAFTEVKG-- 
METHOD HDYWVGADNLQDGAYNFLWNDGVSLPTDSDLWSPNEPSNPQSWQLCVQIWSKYNLLDDVGC 
METHOD GGARRVICEKELDD 
METHOD >d1lit__ 
METHOD -CPEGTNAYRSYCYYFNEDRETWVDADLYCQNM-NSGNLVSVLTQAEGAFVASLIKESGTD 
METHOD DFNVWIGLHDPKKNRRWHWSSGS--LVSYKSWGIGAPSSVNPGYCVS-LTSSTGFQKWKDV 
METHOD PCEDKFSFVCKFKN 
METHOD 
MODEL  1 
PARENT 1esl 
Y   3    W   1 
E   4    S   2 
I   5    Y   3 
L   6    N   4 
F   7    T   5 
S   8    S   6 
D   9    T   7 
E  10    E   8 
T  11    A   9 
M  12    M  10 
N  13    T  11 
Y  14    Y  12 
A  15    D  13 
D  16    E  14 
A  17    A  15 
G  18    S  16 
T  19    A  17 
Y  20    Y  18 
C  21    C  19 
Q  22    Q  20 
S  23    Q  21 
R  24    R  22 
G  25    Y  23 
M  26    T  24 
A  27    H  25 
L  28    L  26 
V  29    V  27 
S  30    A  28 
S  31    I  29 
A  32    Q  30 
M  33    N  31 
R  34    K  32 
D  35    E  33 
S  36    E  34 
T  37    I  35 
M  38    E  36 
V  39    Y  37 
I  42    L  38 
L  43    N  39 
A  44    S  40 
F  45    I  41 
T  46    L  42 
E  47    S  43 
V  48    Y  44 
K  49    S  45 
G  50    P  46 
H  51    S  47 
D  52    Y  48 
Y  53    Y  49 
W  54    W  50 
V  55    I  51 
G  56    G  52 
A  57    I  53 
D  58    R  54 
N  59    K  55 
L  60    V  56 
Q  61    N  57 
A  64    N  58 
Y  65    V  59 
N  66    W  60 
F  67    V  61 
L  68    W  62 
W  69    V  63 
N  70    G  64 
D  71    T  65 
G  72    Q  66 
V  73    K  67 
S  74    P  68 
L  75    L  69 
P  76    T  70 
T  77    E  71 
D  78    E  72 
S  79    A  73 
D  80    K  74 
L  81    N  75 
W  82    W  76 
S  83    A  77 
P  84    P  78 
N  85    G  79 
E  86    E  80 
P  87    P  81 
S  88    N  82 
N  89    N  83 
P  90    R  84 
Q  91    Q  85 
S  92    K  86 
W  93    D  87 
Q  94    E  88 
L  95    D  89 
C  96    C  90 
V  97    V  91 
Q  98    E  92 
I  99    I  93 
W 100    Y  94 
S 101    I  95 
K 102    K  96 
Y 103    R  97 
N 104    E  98 
L 105    K  99 
D 108    D 100 
V 109    V 101 
G 110    G 102 
C 111    M 103 
G 112    W 104 
G 113    N 105 
A 114    D 106 
R 115    E 107 
R 116    R 108 
V 117    C 109 
I 118    S 110 
C 119    K 111 
E 120    K 112 
K 121    K 113 
E 122    L 114 
L 123    A 115 
D 124    L 116 
D 125    C 117 
TER 
END 
