164 Pages

Prediction of protein structural features by machine learning methods [Elektronische Ressource] / Andreas Kirschner




Published 01 January 2009
Language English
Document size 5 MB

Lehrstuhl für Genomorientierte Bioinformatik
Prediction of protein structural features by machine
learning methods
Andreas Kirschner
Complete reprint of the dissertation approved by the Fakultät Wissenschaftszentrum
Weihenstephan für Ernährung, Landnutzung und Umwelt of the Technische Universität
München for the award of the academic degree of
Doktor der Naturwissenschaften.
Chair: Univ.-Prof. Dr. A. Kapurniotu
Examiners of the dissertation:
1. Univ.-Prof. Dr. D. Frischmann
2. Univ.-Prof. Dr. I. Antes
The dissertation was submitted to the Technische Universität München on 01.10.2008
and accepted by the Fakultät Wissenschaftszentrum Weihenstephan für Ernährung,
Landnutzung und Umwelt on 18.06.2009.
to my grandmother
Genome sequencing projects continue to reveal the building blocks of life, producing
millions of amino acid sequences whose biological roles can be understood only when
the structure and function of these proteins are elucidated. Although experimental
structure determination methods are becoming faster and cheaper and provide high-quality
insights, only computational structure prediction methods can satisfy the demand
for structural data for the majority of proteins. Protein structures are predicted at
various levels of detail. Prediction in one dimension aims to detect local structural
regularities such as α-helices, β-sheets or backbone turns. The next level of detail
is prediction in two dimensions, where the protein contact map is a prominent
representation.
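
As an illustration of the two-dimensional representation mentioned above, the following minimal sketch builds a contact map from residue coordinates. It is not taken from the thesis: the function name `contact_map`, the 8 Å distance cutoff, the use of one representative point per residue, and the toy coordinates are all assumptions chosen for illustration.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a symmetric boolean matrix of residue contacts.

    coords: one (x, y, z) tuple per residue (assumed here to be a
            representative atom such as C-alpha; this is an assumption).
    threshold: distance cutoff in angstroms (8.0 is an illustrative choice).
    min_separation: minimum sequence distance between residues considered.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                # A contact map is symmetric by definition.
                cmap[i][j] = cmap[j][i] = True
    return cmap


# Hypothetical toy coordinates: three residues in a row, one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

In this toy example the first three residues are mutually in contact, while the fourth is in contact with none of them. A predictor would output such a matrix (or contact probabilities) from sequence alone rather than from known coordinates.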
Throughout this work an array of machine learning techniques is used to investigate
sequence-structure relationships in proteins, with a strong focus on neural
networks. One important advance is the development of a novel bidirectional
Elman-type recurrent neural network with multiple output layers (MOLEBRNN)
capable of predicting multiple mutually dependent structural motifs. This
computational architecture was successfully applied to develop the currently most
accurate predictors of β-turns and solvent accessibility, two important structural
and functional features of proteins. The advantage of the method introduced in this
thesis over other predictors is that it does not require any external input
except for sequence profiles, because interdependencies between different structural
features are taken into account implicitly during the learning process.
Finally, the first method to identify interacting residues and α-helices in membrane
proteins is presented. It is based on the analysis of co-evolving residues in
predicted transmembrane regions and the use of neural networks. The neural network
approach uses both input features commonly used for soluble proteins and
features specific to membrane proteins, such as a residue's position within
the transmembrane segment or its orientation towards the hydrophilic or lipophilic
environment. The predicted residue contacts were employed in a second step to identify
contacting helices with high accuracy.
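
The second step described above, going from predicted residue contacts to contacting helices, can be sketched as a simple aggregation. This is only one plausible reading, not the procedure used in the thesis: the function `contacting_helices`, the `min_contacts` threshold, and the toy segment boundaries and contacts are hypothetical.

```python
def contacting_helices(helices, residue_contacts, min_contacts=1):
    """Return helix index pairs supported by enough residue contacts.

    helices: list of (start, end) inclusive residue ranges, one per
             predicted transmembrane helix.
    residue_contacts: iterable of (i, j) predicted residue contacts.
    min_contacts: hypothetical threshold on the number of supporting
                  residue contacts per helix pair.
    """
    def helix_of(residue):
        # Index of the helix containing this residue, or None.
        for index, (start, end) in enumerate(helices):
            if start <= residue <= end:
                return index
        return None

    counts = {}
    for i, j in residue_contacts:
        a, b = helix_of(i), helix_of(j)
        # Ignore contacts outside helices or within a single helix.
        if a is None or b is None or a == b:
            continue
        pair = (min(a, b), max(a, b))
        counts[pair] = counts.get(pair, 0) + 1
    return sorted(pair for pair, count in counts.items() if count >= min_contacts)


# Hypothetical toy input: three helices and five predicted contacts.
toy_helices = [(10, 30), (45, 65), (80, 100)]
toy_contacts = [(12, 60), (15, 57), (20, 90), (5, 50), (47, 52)]
strict_pairs = contacting_helices(toy_helices, toy_contacts, min_contacts=2)
loose_pairs = contacting_helices(toy_helices, toy_contacts, min_contacts=1)
```

With the toy input, helices 0 and 1 are supported by two residue contacts and helices 0 and 2 by one, so raising `min_contacts` trades sensitivity for fewer spurious helix pairs.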