Recent "algorithm" articles

  • These articles and links have been posted by Connotea users using the tag "algorithm".
Bookmarks matching tag algorithm
 
 
baobabLUNA: the solution space of sorting by reversals
Bioinformatics 25 (14), 1833 (2009)
Summary: Computing the reversal distance and searching for an optimal sequence of reversals to transform a unichromosomal genome into another are useful algorithmic tools to analyse real evolutionary scenarios. Currently, these problems can be solved by at least two available software packages, the most prominent of which are GRAPPA and GRIMM. However, the number of different optimal sequences is usually huge, and taking only the distance and/or one example is often insufficient for a proper analysis. Here, we offer an alternative and present baobabLUNA, a framework that contains an algorithm to give a compact representation of the whole space of solutions for the sorting by reversals problem. Availability and Implementation: Compiled code implemented in Java is freely available for download at http://pbil.univ-lyon1.fr/software/luna/. Documentation with methodological background, technical aspects, download and setup instructions, interface description and tutorial is available at http://pbil.univ-lyon1.fr/software/luna/doc/luna-doc.pdf.
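
To illustrate why the number of optimal sequences grows quickly, here is a minimal brute-force sketch in Python. It is not the baobabLUNA algorithm, nor what GRAPPA or GRIMM do; it simply runs a breadth-first search over signed permutations and counts the shortest reversal sequences that reach the identity. The function names are hypothetical, and the approach is only practical for very small genomes.

```python
from collections import deque


def reversals(perm):
    """Yield every signed permutation reachable from perm by one reversal.

    A reversal of the segment i..j reverses its order and flips every sign.
    """
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            segment = tuple(-x for x in reversed(perm[i:j + 1]))
            yield perm[:i] + segment + perm[j + 1:]


def count_optimal_sortings(perm):
    """Return (reversal distance, number of optimal reversal sequences).

    Hypothetical helper: exhaustive BFS, exponential in the genome size.
    """
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}   # shortest number of reversals to reach each state
    ways = {start: 1}   # number of shortest reversal sequences to each state
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur], ways[cur]
        for nxt in reversals(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                ways[nxt] = ways[cur]
                queue.append(nxt)
            elif dist[nxt] == dist[cur] + 1:
                ways[nxt] += ways[cur]
    raise RuntimeError("identity not reached")


distance, n_solutions = count_optimal_sortings((2, 1))
print(distance, n_solutions)
```

Even the two-gene genome (+2, +1) already has several distinct optimal sortings, which is the motivation for a compact representation of the solution space.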
 
Efficient computation of all perfect repeats in genomic sequences of up to half a gigabyte, with a case study on the human genome
Veronica Becher, Alejandro Deymonnaz, and Pablo Heiber
Bioinformatics 25 (14), 1746 (15 Jul 2009)
Motivation: There is significant ongoing research to identify the number and types of repetitive DNA sequences. As more genomes are sequenced, efficiency and scalability in computational tools become mandatory. Existing tools fail to find distant repeats because they cannot accommodate whole chromosomes, only segments. Also, a quantitative framework for repetitive elements inside a genome or across genomes is still missing. Results: We present a new efficient algorithm and its implementation as a software tool to compute all perfect repeats in inputs of up to 500 million nucleotide bases, possibly containing many genomes. Our algorithm is based on a suffix array construction and a novel procedure to extract all perfect repeats in the entire input, which can be arbitrarily distant, with no bound on the repeat length. We tested the software on the Homo sapiens DNA genome NCBI 36.49. We computed all perfect repeats of at least 40 bases occurring in any two chromosomes with exact matching. We found that each H. sapiens chromosome shares ~10% of its full sequence with every other human chromosome, distributed more or less evenly among the chromosome surfaces. We give statistics including a quantification of repeats by diversity, length and number of occurrences. We compared the computed repeats against all biological repeats currently obtainable from Ensembl enlarged with the output of the dust program and all elements identified by TRF and RepeatMasker (ftp://ftp.ebi.ac.uk/pub/databases/ensembl/jherrero/.repeats/all_repeats.txt.bz2). We report novel repeats as well as new occurrences of repeats matching known biological elements. Availability: The source code, results and visualization of some statistics are accessible from http://kapow.dc.uba.ar/patterns/
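
The general idea of finding exact repeats with a suffix array can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not the authors' procedure: the suffix array is built by naive sorting (far too slow for genome-scale input), and only the longest common prefixes of adjacent suffixes are reported. The function name and the `min_len` parameter are hypothetical.

```python
def repeated_substrings(seq, min_len):
    """Map each repeated substring (found between adjacent suffixes) to its start positions.

    Naive sketch: sorts whole suffixes, so it only suits short inputs.
    """
    # Suffix array: start positions ordered by the suffix they begin.
    suffix_array = sorted(range(len(seq)), key=lambda i: seq[i:])
    repeats = {}
    for a, b in zip(suffix_array, suffix_array[1:]):
        # Length of the longest common prefix of two adjacent suffixes.
        lcp = 0
        while (a + lcp < len(seq) and b + lcp < len(seq)
               and seq[a + lcp] == seq[b + lcp]):
            lcp += 1
        if lcp >= min_len:
            repeats.setdefault(seq[a:a + lcp], set()).update((a, b))
    return {sub: sorted(pos) for sub, pos in repeats.items()}


print(repeated_substrings("ACGTTACGTA", 4))
```

Because suffixes that share a prefix sit next to each other in the sorted order, repeats are found regardless of how far apart they occur in the input.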
 
On the inference of spatial structure from population genetics data
Bioinformatics 25 (14), 1796 (2009)
Motivation: In a series of recent papers, Tess, a computer program based on the concept of hidden Markov random field, has been proposed to infer the number and locations of panmictic population units from the genotypes and spatial locations of these individuals. The method seems to be of broad appeal as it is conceptually much simpler than other competing methods and it has been reported by its authors to be fast and accurate. However, this methodology is not grounded in a formal statistical inference method and seems to rely to a large extent on arbitrary choices regarding the parameters used. The present article is an investigation of the accuracy of this method and an attempt to assess whether recent results reported on the basis of this method are genuine features of the genetic process or artefacts of the method. Method: I analyse simulated data consisting of populations at Hardy–Weinberg and linkage equilibrium (HWLE) and also data simulated under a scenario of isolation-by-distance at mutation–migration–drift equilibrium. Arabidopsis thaliana data previously analysed with this method are also reconsidered. Results: Using the Tess program under the no-admixture model to analyse data consisting of several genuine HWLE populations with individuals of pure ancestries leads to highly inaccurate results; using the Tess program under the admixture model to analyse data consisting of a continuous isolation-by-distance population leads to the inference of spurious HWLE populations whose number and features depend on the parameters used. Results previously reported about A. thaliana using Tess seem to a large extent to be artefacts of the statistical methodology used. The findings go beyond population clustering models and can help to design more efficient algorithms based on graphs. Availability: The data analysed in the present article are available from http://folk.uio.no/gillesg/Bioinformatics-HMRF
 
Data structures and compression algorithms for genomic sequence data
Marty Brandon, Douglas Wallace, and Pierre Baldi
Bioinformatics 25 (14), 1731 (15 Jul 2009)
Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes, such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Availability: Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression.
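
The idea of storing only differences from a reference, with relative coordinates and an Elias code, can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: it assumes substitutions only (equal-length sequences over A, C, G, T), uses Elias gamma codes for the gaps between differences and a fixed 2-bit code for the variant base, and represents the bit stream as a string of "0"/"1" characters for readability. All names are hypothetical.

```python
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def elias_gamma(n):
    """Elias gamma code of a positive integer, as a string of bits."""
    if n < 1:
        raise ValueError("Elias gamma is defined for integers >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode(reference, sequence):
    """Encode sequence as (gap, base) pairs relative to reference."""
    if len(reference) != len(sequence):
        raise ValueError("this sketch handles substitutions only")
    bits = []
    prev = -1  # position of the previous difference
    for pos, (ref_base, base) in enumerate(zip(reference, sequence)):
        if ref_base != base:
            bits.append(elias_gamma(pos - prev))  # relative coordinate, >= 1
            bits.append(BASE_TO_BITS[base])
            prev = pos
    return "".join(bits)


def decode(reference, bits):
    """Rebuild the sequence from the reference and the encoded differences."""
    seq = list(reference)
    i, pos = 0, -1
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # unary length prefix of the gamma code
            zeros += 1
            i += 1
        pos += int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        seq[pos] = BITS_TO_BASE[bits[i:i + 2]]
        i += 2
    return "".join(seq)


encoded = encode("ACGTACGT", "ACGAACGA")
print(encoded, decode("ACGTACGT", encoded))
```

When a sequence differs from the reference at only a few positions, the encoded stream is much shorter than the sequence itself, which is the source of the compression the abstract describes.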
 
A robust framework for detecting structural variations in a genome
Seunghak Lee, Elango Cheran, and Michael Brudno
Bioinformatics 24 (13), i59 (01 Jul 2008)
Posted by jessopher and 1 other to algorithm genomics on Thu Jul 02 2009 at 22:16 UTC
 
protein folding algorithm
science.gmu.edu
 
Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences.
Bioinformatics (Oxford, England) 20 (17), 2911-7 (22 Nov 2004)
Posted by akborder and 1 other to prediction miRNA algorithm on Wed Jun 24 2009 at 08:52 UTC
 
A hybrid genetic algorithm approach to production scheduling at PT. Alas Petala Makmur
Hartono Sulistio
 
Production scheduling using a genetic algorithm at PT. Sinar Angkasa Rungkut
Ervan Irawan
 
Performance comparison of the genetic algorithm and simulated annealing for multiple-objective problems in flowshop scheduling
Andree Pamungkas
