


Motivation: Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to genomic DNA is still a challenging task.

Results: We present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm, called PALMA, tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as the boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from C. elegans and human, it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin, which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels.

Availability: Datasets for training, evaluation and testing, additional results and a stand-alone alignment tool implemented in C++ and Python are available at
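The large margin idea in the Results, tuning parameters so that the true alignment scores higher than competing alignments, can be illustrated with a toy sketch. This is not the PALMA implementation: the abstract describes solving a convex optimization problem, whereas the sketch below swaps in plain subgradient descent on a regularized hinge loss. The feature layout (matches, mismatches, gaps, splice site score), the function names and all numbers are hypothetical and chosen only for illustration.

```python
# Toy large-margin training: learn weights w so that the true alignment
# outscores every competing alignment by a margin of at least 1.
# Assumption: each alignment is summarized by a hypothetical feature vector
# [matches, mismatches, gaps, splice_site_score]; its score is the dot product w . phi.

def score(w, phi):
    """Linear alignment score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, phi))

def train_large_margin(examples, reg=0.01, lr=0.1, epochs=200):
    """Subgradient descent on (reg/2)*||w||^2 + sum of hinge losses.

    examples: list of (phi_true, [phi_wrong, ...]) pairs.
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [reg * wi for wi in w]  # gradient of the regularizer
        for phi_true, wrongs in examples:
            # Competing alignment that currently scores highest.
            worst = max(wrongs, key=lambda phi: score(w, phi))
            # Hinge loss is active when the margin is below 1.
            if score(w, phi_true) - score(w, worst) < 1.0:
                for j in range(dim):
                    grad[j] += worst[j] - phi_true[j]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Made-up training data: one true alignment and two wrong ones per example.
examples = [
    ([10, 0, 0, 1.8], [[9, 1, 0, 0.2], [10, 0, 2, 0.1]]),
    ([8, 1, 0, 1.5], [[8, 1, 1, 0.3], [7, 2, 0, 1.4]]),
]
w = train_large_margin(examples)

for phi_true, wrongs in examples:
    for phi in wrongs:
        print(score(w, phi_true) > score(w, phi))
```

In this toy setting the learned weights reward matches and splice site evidence and penalize mismatches and gaps, so each true alignment ends up ranked above its alternatives.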

Type: Article

Author: U. Schulze and B. Hepp and C. S. Ong and G. Rätsch
Title: PALMA: mRNA to Genome Alignments using Large Margin Algorithms
Journal: Bioinformatics
Year: 2007
Note: Project URL:
Pmid: 17537755
AUTHOR = {U. Schulze and B. Hepp and C. S. Ong and G. Rätsch},
TITLE = {PALMA: mRNA to Genome Alignments using Large Margin Algorithms},
JOURNAL = {Bioinformatics},
YEAR = {2007},
VOLUME = {23},
NUMBER = {15},
PAGES = {1892-1900},
NOTE = {Project URL:}