|
1. |
Faster Multipoint Linkage Analysis Using Fourier Transforms |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 1-7
LEONID KRUGLYAK,
ERIC S. LANDER,
Preview
|
PDF (785KB)
|
|
摘要:
ABSTRACTGenetic linkage analysis of human pedigrees using many linked markers simultaneously is a difficult computational problem. We have previously described an approach to this problem that uses hidden Markov models (HMMs) and is quite efficient for pedigrees of moderate size. Here, we describe a new, faster algorithm for the key step in the HMM calculation. The algorithm employs a fast Fourier transform on the group of pedigree inheritance patterns. It substantially improves the overall performance of the software package GENEHUNTER for performing linkage analysis. The Fourier representation opens up new research directions for pedigree analysis.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.1
年代:1998
数据来源: MAL
|
2. |
Approximation Algorithms for a Genetic Diagnostics Problem |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 9-26
S. RAO KOSARAJU,
ALEJANDRO A. SCHÄFFER,
LESLIE G. BIESECKER,
Preview
|
PDF (2157KB)
|
|
摘要:
ABSTRACTWe define and study a combinatorial problem called WEIGHTED DIAGNOSTIC COVER (WDC) that models the use of a laboratory technique calledgenotypingin the diagnosis of an important class of chromosomal aberrations. An optimal solution to WDC would enable us to define a genetic assay that maximizes the diagnostic power for a specified cost of laboratory work. We develop approximation algorithms for WDC by making use of the well-known problem SET COVER for which thegreedyheuristic has been extensively studied. We prove worst-case performance bounds on thegreedyheuristic for WDC and for another heuristic we calldirectional greedy. We implemented both heuristics. We also implemented a local search heuristic that takes the solutions obtained bygreedyanddir-greedyand applies swaps until they are locally optimal. We report their performance on a real data set that is representative of the options that a clinical geneticist faces for the real diagnostic problem. Many open problems related to WDC remain, both of theoretical interest and practical importance.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.9
年代:1998
数据来源: MAL
|
3. |
Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 27-40
BONNIE BERGER,
TOM LEIGHTON,
Preview
|
PDF (1684KB)
|
|
摘要:
ABSTRACTOne of the simplest and most popular biophysical models of protein folding is the hydrophobic-hydrophilic (HP) model. TheHPmodel abstracts the hydrophobic interaction in protein folding by labeling the amino acids as hydrophobic (Hfor nonpolar) or hydrophilic (Pfor polar). Chains of amino acids are configured as self-avoiding walks on the 3D cubic lattice, where an optimal conformation maximizes the number of adjacencies betweenH's. In this paper, the protein folding problem under theHPmodel on the cubic lattice is shown to be NP-complete. This means that the protein folding problem belongs to a large set of problems that are believed to be computationally intractable.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.27
年代:1998
数据来源: MAL
|
4. |
Pairwise and Multiple Identification of Three-Dimensional Common Substructures in Proteins |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 41-56
VINCENT ESCALIER,
JOÉL POTHIER,
HENRI SOLDANO,
ALAIN VIARI,
Preview
|
PDF (3245KB)
|
|
摘要:
ABSTRACTIn this paper, we present an algorithm to find three-dimensional substructures common to two or more molecules. The basic algorithm is devoted to pairwise structural comparison. Given two sets of atomic coordinates, it finds the largest subsets of atoms which are "similar" in the sense that all internal distances are approximately conserved. The basic idea of the algorithm is to recursively build subsets of increasing sizes, combining two sets of size k to build a set of size k + 1. The algorithm can be used "as is" for small molecules or local parts of proteins (about 30 atoms). When a high number of atoms is involved, we use a two step procedure. First we look for common "local" fragments by using the previous algorithm, and then we gather these fragments by using a Branch and Bound technique. We also extend the basic algorithm to perform multiple comparisons, by using one of the structures as a reference point (pivot) to which all other structures are compared. The solution is the largest subsets of atoms common to the pivot and at least q other structures. Although both algorithms are theoretically exponential in the number of atoms, experiments performed on biological data and using realistic parameters show that the solution is obtained within a few minutes. Finally, an application to the determination of the structural core of seven globins is presented.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.41
年代:1998
数据来源: MAL
|
5. |
Statistical Modelling and Phylogenetic Analysis of a Deaminase Domain |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 57-72
I. SAIRA MIAN,
MICHAEL J. MOSER,
WILLIAM R. HOLLEY,
ALOKE CHATTERJEE,
Preview
|
PDF (2191KB)
|
|
摘要:
ABSTRACTDeamination reactions are catalyzed by a variety of enzymes including those involved in nucleoside/nucleotide metabolism and cytosine to uracil (C→U) and adenosine to inosine (A→I) mRNA editing. The active site of the deaminase (DM) domain in these enzymes contains a conserved histidine (or rarely cysteine), two cysteines and a glutamate proposed to act as a proton shuttle during deamination. Here, a statistical model, a hidden Markov model (HMM), of the DM domain has been created which identifies currently known DM domains and suggests new DM domains in viral, bacterial and eucaryotic proteins. However, no DM domains were identified in the currently predicted proteins from the archaeonMethanococcus jannaschiiand possible causes for, and a potential means to ameliorate this situation are discussed. In some of the newly identified DM domains, the glutamate is changed to a residue that could not function as a proton shuttle and in one instance (Mus musculusspermatid protein TENR) the cysteines are also changed to lysine and serine. These may be non-competent DM domains able to bind but not act upon their substrate. Phylogenetic analysis using an HMM-generated alignment of DM domains reveals three branches with clear substructure in each branch. The results suggest DM domains that are candidates for yeast, platyhelminth, plant and mammalian C→U and A→I mRNA editing enzymes. Some bacterial and eucaryotic DM domains form distinct branches in the phylogenetic tree suggesting the existence of common, novel sub
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.57
年代:1998
数据来源: MAL
|
6. |
Modelling and Simulation of Motility in Actomyosin Systems |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 73-86
DANIEL E. BENTIL,
Preview
|
PDF (1570KB)
|
|
摘要:
ABSTRACTWe present a model mechanism for simulating the diffusive motion and fluctuations inherent in myofibrillar sarcomere and its subunits at the molecular level. The model couples Langevin dynamics with Huxley kinetics to reproduce the transient patterns of momentum transfer, force generation and resulting motility due to the interactive activities of actin and myosin crossbridges. When myosin is detached from actin, our model predicts Brownian displacements centered at 0 ± 8 nm (mean ± SD,n= 265,308) and it is broadly distributed due to the Brownian noise. Attachment events produced displacements with step sizes of approximately 8 ± 6 nm (mean ± SD,n= 34,693), which is in agreement with some recent optical-tweezers transducer experimental results. The proposed model could form the basis for a complete qualitative and quantitative description of the evolving complex interactions of the molecular proteins—actin and myosin—in the overall framework of muscular contraction
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.73
年代:1998
数据来源: MAL
|
7. |
On the Distribution ofK-tuple Matches for Sequence Homology: A Constant Time Exact Calculation of the Variance |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 87-100
GARY BENSON,
XIAOPING SU,
Preview
|
PDF (1200KB)
|
|
摘要:
ABSTRACTWe study the distribution of a statistic useful in calculating the significance of the number ofk-tuple matches detected in biological sequence homology algorithms. The statistic isRn,k, the total number of heads in head runs of lengthkor more in a sequence of iid Bernoulli trials of lengthn. Calculation of the mean is straightforward. Poisson approximation formulas have been used for the variance because they are simple and powerful. Unfortunately, whenp=P(Head)is large, the Poisson approximation no longer works well. In our application,pis large, say.75, and we have turned instead to direct calculation of the variance. Surprisingly, we are able to show that the variance, which is based on the interactions ofO(n2) random variables, can be computed inconstant time, independent of the length of the sequence and probabilityp. This result can be used to calculate the mean and variance of a number of other head run statistics in constant time. Additionally, we show how to extend the result to sequences generated by a stationary Markov process where the variance can be calculated inO(n) time.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.87
年代:1998
数据来源: MAL
|
8. |
Expectation and Variance of True and False Fragment Matches in DNA Restriction Mapping |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 101-111
ANDREW F. SIEGEL,
JARED C. ROACH,
GER VAN DEN ENGH,
Preview
|
PDF (1072KB)
|
|
摘要:
ABSTRACTConsider a DNA mapping project in which overlap of clones is inferred from multiple complete restriction enzyme digests. Each enzyme cuts each clone randomly into fragments whose lengths are determined with some error. Clones that share fragments with matching lengths could contain a region of overlap. However, common fragment lengths may be due to random coincidence leading to a false overlap declaration. Although the probability of false fragment matching is small, a mapping project involves a large number of clone comparisons. Consequently, erroneous fragment matches can be a serious problem. We use a geometrical probability approach to develop exact integral formulas and first-order approximations for the expected number and variance of classes of fragment pairs that will be identified falsely as matching. We also find exact formulas for the expected value, and variance of the number of true fragment matches. These formulas are useful in comparing different mapping strategies.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.101
年代:1998
数据来源: MAL
|
9. |
Optimization of Restriction Fragment DNA Mapping |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 113-126
ANDREW F. SIEGEL,
JARED C. ROACH,
CHARLES MAGNESS,
ED THAYER,
GER VAN DEN ENGH,
Preview
|
PDF (1400KB)
|
|
摘要:
ABSTRACTConsider a mapping project in which overlap of clonal segments is inferred from complete multiple restriction digests. The fragment sizes of the clones are measured with some error, potentially leading to a map with erroneous links. The number of errors in the map depends on the number and types of enzymes used to characterize the clones. The most critical parameter is the decision rule k, or the criterion for declaring clone overlap. Small changes in k may cause an order of magnitude change in the amount of work it takes to build a map of given completion. We observe that the cost of an optimal mapping strategy is approximately proportional to the target size. While this finding is encouraging, considerable effort is nonetheless required: for large-scale sequencing projects with up-front mapping, mapping will be a nonnegligible fraction of the total sequencing cost.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.113
年代:1998
数据来源: MAL
|
10. |
Constructing Additive Trees When the Error Is Small |
|
Journal of Computational Biology,
Volume 5,
Issue 1,
1998,
Page 127-133
LUSHENG WANG,
DAN GUSFIELD,
Preview
|
PDF (633KB)
|
|
摘要:
ABSTRACTWe consider the problem of constructing an additive tree from a given matrix of pairwise distances, when observation errors are allowed. We give conditions under which the tree topology is unique and semi-unique. We also design an efficient algorithm to construct a tree when the error is relatively small.
ISSN:1066-5277
DOI:10.1089/cmb.1998.5.127
年代:1998
数据来源: MAL
|
|