| 
 | NeoBio API | ||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||
java.lang.Object
  |
  +--neobio.alignment.PairwiseAlignmentAlgorithm
        |
        +--neobio.alignment.NeedlemanWunsch
This class implements the classic global alignment algorithm (with linear gap penalty function) due to S.B.Needleman and C.D.Wunsch (1970).
It is based on a dynamic programming approach. The idea consists of, given two sequences A and B of sizes n and m, respectively, building an (n+1 x m+1) matrix M that contains the similarity of prefixes of A and B. Every position M[i,j] in the matrix holds the score between the subsequences A[1..i] and B[1..j]. The first row and column represent alignments with spaces.
Starting from row 0, column 0, the algorithm computes each position M[i,j] with the following recurrence:
 M[0,0] = 0
 M[i,j] = max { M[i,j-1]   + scoreInsertion (B[j]),
                M[i-1,j-1] + scoreSubstitution (A[i], B[j]),
                M[i-1,j]   + scoreDeletion(A[i])             }
 
In the end, the value at the last position (last row, last column) will contain
 the similarity between the two sequences. This part of the algorithm is accomplished
 by the computeMatrix method. It has quadratic space complexity
 since it needs to keep an (n+1 x m+1) matrix in memory. And since the work of computing
 each cell is constant, it also has quadratic time complexity.
After the matrix has been computed, the alignment can be retrieved by tracing a path
 back in the matrix from the last position to the first. This step is performed by
 the buildOptimalAlignment method, and since the path can
 be roughly as long as (m + n), this method has O(n) time complexity.
If the similarity value only is needed (and not the alignment itself), it is easy to
 reduce the space requirement to O(n) by keeping just the last row or column in memory.
 This is precisely what is done by the computeScore method. Note
 that it still requires O(n2) time.
For a more efficient approach to the global alignment problem, see the CrochemoreLandauZivUkelson algorithm. For local alignment, see the SmithWaterman algorithm.
SmithWaterman, 
CrochemoreLandauZivUkelson, 
CrochemoreLandauZivUkelsonLocalAlignment, 
CrochemoreLandauZivUkelsonGlobalAlignment| Field Summary | |
| protected  int[][] | matrixThe dynamic programming matrix. | 
| protected  CharSequence | seq1The first sequence of an alignment. | 
| protected  CharSequence | seq2The second sequence of an alignment. | 
| Fields inherited from class neobio.alignment.PairwiseAlignmentAlgorithm | 
| alignment, APPROXIMATE_MATCH_TAG, GAP_CHARACTER, GAP_TAG, MATCH_TAG, MISMATCH_TAG, score, score_computed, scoring, sequences_loaded, use_match_tag | 
| Constructor Summary | |
| NeedlemanWunsch() | |
| Method Summary | |
| protected  PairwiseAlignment | buildOptimalAlignment()Builds an optimal global alignment between the loaded sequences. | 
| protected  void | computeMatrix()Computes the dynamic programming matrix. | 
| protected  PairwiseAlignment | computePairwiseAlignment()Builds an optimal global alignment between the loaded sequences after computing the dynamic programming matrix. | 
| protected  int | computeScore()Computes the score of the best global alignment between the two sequences using the scoring scheme previously set. | 
| protected  void | loadSequencesInternal(java.io.Reader input1,
                                           java.io.Reader input2)Loads sequences into CharSequence instances. | 
| protected  void | unloadSequencesInternal()Frees pointers to loaded sequences and the dynamic programming matrix so that their data can be garbage collected. | 
| Methods inherited from class neobio.alignment.PairwiseAlignmentAlgorithm | 
| getPairwiseAlignment, getScore, loadSequences, max, max, max, scoreDeletion, scoreInsertion, scoreSubstitution, setScoringScheme, unloadSequences, useMatchTag | 
| Methods inherited from class java.lang.Object | 
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Field Detail | 
protected CharSequence seq1
protected CharSequence seq2
protected int[][] matrix
seq1 and j characters of
 seq2.
| Constructor Detail | 
public NeedlemanWunsch()
| Method Detail | 
protected void loadSequencesInternal(java.io.Reader input1,
                                     java.io.Reader input2)
                              throws java.io.IOException,
                                     InvalidSequenceException
CharSequence (please
 check the specification of that class for specific requirements).
loadSequencesInternal in class PairwiseAlignmentAlgorithminput1 - Input for first sequenceinput2 - Input for second sequence
java.io.IOException - If an I/O error occurs when reading the sequences
InvalidSequenceException - If the sequences are not validCharSequenceprotected void unloadSequencesInternal()
unloadSequencesInternal in class PairwiseAlignmentAlgorithmPairwiseAlignmentAlgorithm.unloadSequences()
protected PairwiseAlignment computePairwiseAlignment()
                                              throws IncompatibleScoringSchemeException
buildOptimalAlignment method
 after the computeMatrix method computes the dynamic programming
 matrix.
computePairwiseAlignment in class PairwiseAlignmentAlgorithmIncompatibleScoringSchemeException - If the scoring scheme is not compatible
 with the loaded sequences.computeMatrix(), 
buildOptimalAlignment()
protected void computeMatrix()
                      throws IncompatibleScoringSchemeException
IncompatibleScoringSchemeException - If the scoring scheme is not compatible
 with the loaded sequences.
protected PairwiseAlignment buildOptimalAlignment()
                                           throws IncompatibleScoringSchemeException
computeMatrix method.
IncompatibleScoringSchemeException - If the scoring scheme
 is not compatible with the loaded sequences.computeMatrix()
protected int computeScore()
                    throws IncompatibleScoringSchemeException
computeScore in class PairwiseAlignmentAlgorithmIncompatibleScoringSchemeException - If the scoring scheme is not compatible
 with the loaded sequences.PairwiseAlignmentAlgorithm.getScore()| 
 |  | ||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||