Using a backpropagation neural network model, we have found a limit for secondary structure prediction from local sequence. By including only sequences from whole α-helix and non-α-helix structures in our training and test sets (sequences spanning boundaries between these two structures were excluded), it was possible to investigate directly the relationship between sequence and structure for α-helix. A group of non-α-helix sequences that was disrupting overall prediction success was indistinguishable to the network from α-helix sequences. These sequences occurred, with statistical significance, at regions adjacent to the termini of α-helices, suggesting that potentially longer α-helices are disrupted by global constraints. Some of these regions spanned more than 20 residues. On these whole-structure sequences, 10 residues in length, a comparatively high prediction success of 78% with a correlation coefficient of 0.52 was achieved. In addition, the structure of the input space, the distribution of β-sheet in this space, and the effect of segment length were investigated.
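The segment selection and the two reported scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the boolean per-residue helix mask, and the confusion-matrix counts are hypothetical, and the correlation coefficient is assumed here to be the Matthews coefficient commonly used in secondary structure prediction, which the abstract does not state.

```python
from math import sqrt


def whole_structure_windows(sequence, helix_mask, length=10):
    """Return (segment, is_helix) pairs for windows lying wholly inside
    alpha-helix or wholly inside non-alpha-helix regions.

    `helix_mask` is a hypothetical per-residue annotation: True for
    alpha-helix, False otherwise. Windows that span a boundary between
    the two classes are skipped, mirroring the exclusion described above.
    """
    if len(sequence) != len(helix_mask):
        raise ValueError("sequence and helix_mask must have equal length")
    windows = []
    for start in range(len(sequence) - length + 1):
        labels = helix_mask[start:start + length]
        segment = sequence[start:start + length]
        if all(labels):
            windows.append((segment, True))    # whole alpha-helix window
        elif not any(labels):
            windows.append((segment, False))   # whole non-alpha-helix window
        # mixed windows cross a helix boundary and are excluded
    return windows


def accuracy_and_mcc(tp, tn, fp, fn):
    """Fraction correct and Matthews correlation coefficient
    (assumed definition) from two-class confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc
```

With a 14-residue toy sequence whose first 11 residues are marked as helix, only the two windows that lie entirely within the helix are kept; the windows overlapping the helix terminus are dropped.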
|Number of pages||10|
|Journal||Proteins: Structure, Function, and Bioinformatics|
|Publication status||Published - Nov 1992|