====== MS4: Multi Scale Selector of Sequence Signatures ======

Download the latest version {{:logiciels:ms4_beta-0.2.tar.gz|here}}.

This implementation, t_MS4, is a beta release of the MS4 algorithm. t_MS4 builds on two earlier developments: N-local decoding (NLD), used to find NLD classes (the C part of the code, included), and [[http://pypi.python.org/pypi/altgraph/|altgraph]], used to manage trees. It requires [[http://www.python.org/|Python 2]].

MS4 is a method that selects, among all the segments of similarity detected by the N-local decoding algorithm (Didier //et al.//, 2007), those on which a classification of a set of unaligned sequences is based. The N-local decoding detects local similarities of size 2N-1 containing a variable number of mismatches, and it has proved successful for alignment-free classification with a fixed value of N.

The aim of this method is to adapt N (the size of the detected similarities) automatically to the local context and to the data set under consideration. It then computes a dissimilarity matrix based on the detected similarities in order to classify the sequences.

For low values of N, similarities are spurious and many hits occur inside a single sequence. For large values of N, similarities are exact words shared by no more than two sequences. MS4 therefore sets N so that the average number of occurrences per sequence stays smaller than a given parameter Kappa ([[http://www.biomedcentral.com/1471-2105/11/406|Corel et al., 2010]]).
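The Kappa criterion described above can be illustrated with a small toy sketch. It is not taken from the t_MS4 code: the function names (''word_classes'', ''avg_occurrences_per_sequence'', ''select_classes'') are hypothetical, exact words of length 2N-1 stand in for the classes produced by N-local decoding (which also tolerates mismatches), and requiring a class to appear in at least two sequences is an assumption made for the illustration.

```python
from collections import defaultdict


def word_classes(sequences, n):
    """Group the exact words of length 2n-1 found in the sequences.

    Simplification (assumption): real N-local decoding tolerates
    mismatches; here a class is the set of occurrences of one exact word.
    Returns {word: [(sequence_index, position), ...]}.
    """
    size = 2 * n - 1
    classes = defaultdict(list)
    for idx, seq in enumerate(sequences):
        for pos in range(len(seq) - size + 1):
            classes[seq[pos:pos + size]].append((idx, pos))
    return dict(classes)


def avg_occurrences_per_sequence(occurrences):
    """Average number of occurrences over the sequences that contain the class."""
    n_sequences = len(set(idx for idx, _ in occurrences))
    return len(occurrences) / float(n_sequences)


def select_classes(sequences, n, kappa):
    """Keep classes shared by at least two sequences (assumption) whose
    average number of occurrences per sequence is smaller than kappa."""
    selected = {}
    for word, occurrences in word_classes(sequences, n).items():
        n_sequences = len(set(idx for idx, _ in occurrences))
        if n_sequences >= 2 and avg_occurrences_per_sequence(occurrences) < kappa:
            selected[word] = occurrences
    return selected


if __name__ == "__main__":
    toy = ["AAAA", "AAAT"]  # made-up sequences, for illustration only
    for n in (1, 2):
        print(n, sorted(select_classes(toy, n, kappa=2.0)))
```

With these toy inputs and Kappa = 2, N = 1 selects nothing: the word ''A'' occurs 7 times in 2 sequences (3.5 per sequence), which is the "many hits inside one sequence" situation. N = 2 keeps the word ''AAA'', which occurs 3 times in 2 sequences (1.5 per sequence).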