IEEE Transactions on Computers: January 2010 (Vol. 59, No. 1)

IEEE Journals - 3 hours 30 min ago
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers, brief contributions, and comments on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.
Categories: IEEE Members Only

Single Sign-on and Social Networks

IEEE Distributed Systems Online - 20 November, 2009 - 09:38
Letting users sign on to the Internet once and securely access network resources anywhere has been one of the industry's enduring quests. While numerous standards efforts have steadily pursued this capability, most have been back-end technologies of which users are mostly unaware. Recent developments surrounding the open source OpenID federated-identity technology are bringing single sign-on efforts to the foreground.
Categories: IEEE Members Only

Distributed Computing Education, Part 5: Coming to Terms with Intellectual Property Rights

IEEE Distributed Systems Online - 20 November, 2009 - 09:38
Distributed computing teaching environments (and e-science education in general) require a supportive policy framework that encourages cooperation and sharing. If teachers can share educational content rather than creating their own, they increase the number of quality resources available to them. However, in sharing these resources, IPR issues such as copyright ownership and licensing must be considered.
Categories: IEEE Members Only

A Floating Backbone for Internet over the Ocean

IEEE Distributed Systems Online - 20 November, 2009 - 09:38
Satellite-based Internet access, the preferred solution for remote locations on the ocean, either offers a low-bandwidth connection or is very expensive to deploy. A backbone structure that provides ocean-wide Internet coverage could provide an alternative to satellite uplinks. With a wide-area network forming a mesh of nodes using floating and moving objects as well as coastal facilities, the network would use next-generation long-range surface radio technology to provide medium- to high-bandwidth Internet access. To achieve high-bandwidth Internet access under these circumstances, the backbone must leverage state-of-the-art sensor network technology, autonomous routing mechanisms, and self-organizing abilities.
Categories: IEEE Members Only

PrePrint: Molecular Function Prediction Using Neighborhood Features

The recent advent of high throughput methods has generated large amounts of gene interaction data. This has allowed the construction of genome-wide networks. A significant number of genes in such networks remain uncharacterized and predicting the molecular function of these genes remains a major challenge. A number of existing techniques assume that genes with similar functions are topologically close in the network. Our hypothesis is that genes with similar functions observe similar annotation patterns in their neighborhood, regardless of the distance between them in the interaction network. We thus predict molecular functions of uncharacterized genes by comparing their functional neighborhoods to genes of known function. We propose a two-phase approach. First we extract functional neighborhood features of a gene using Random Walks with Restarts. We then employ a KNN classifier to predict the function of uncharacterized genes based on the computed neighborhood features. We perform leave-one-out validation experiments on two S. cerevisiae interaction networks revealing significant improvements over previous techniques. Our technique also provides a natural control of the trade-off between accuracy and coverage of prediction.
Categories: IEEE Members Only
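
The first phase described in the abstract above, a random walk with restarts, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the restart probability of 0.3, the unweighted adjacency-list input, and the way visiting probabilities are summed per annotation term are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the stationary visiting probabilities of a walk that starts at
    `seed`, moves to a uniformly chosen neighbour with probability
    1 - restart, and jumps back to `seed` with probability `restart`.

    adj: dict mapping each node to a list of its neighbours (assumed format).
    """
    nodes = sorted(adj)
    p = {n: 0.0 for n in nodes}
    p[seed] = 1.0
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            neighbours = adj[n]
            if not neighbours:
                # A node without neighbours sends its walking mass back to the seed.
                nxt[seed] += (1.0 - restart) * p[n]
                continue
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        # The probabilities sum to 1, so the total restart mass equals `restart`.
        nxt[seed] += restart
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def neighborhood_features(adj, annotations, gene, terms, restart=0.3):
    """Hypothetical feature vector: for each term, the total visiting
    probability of the other genes annotated with that term."""
    p = random_walk_with_restart(adj, gene, restart)
    return [
        sum(prob for n, prob in p.items()
            if n != gene and term in annotations.get(n, ()))
        for term in terms
    ]
```

Feature vectors of this kind could then be compared with any nearest-neighbour classifier; the distance measure and the value of k are left open here because the abstract does not specify them.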

PrePrint: GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications in Cheminformatics

Graph data mining is an active research area. Graphs are general modeling tools to organize information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the fast accumulation of graph data, building highly accurate predictive models for graph data emerges as a new challenge that has not been fully explored in the data mining community. In this paper, we demonstrate a novel technique called graph pattern diffusion kernel (GPD) with applications in cheminformatics. Our idea is to leverage existing frequent pattern discovery methods and to explore the application of kernel classifiers (e.g., support vector machines) in building highly accurate graph classification. In our method, we first identify all frequent patterns from a graph database. We then map subgraphs to graphs in the graph database and use a process we call “pattern diffusion” to label nodes in the graphs. Finally, we design a novel graph alignment algorithm to compute the inner product of two graphs. We have tested our algorithm using a number of chemical structure data sets. The experimental results demonstrate that our method is significantly better than competing methods such as those kernel functions based on paths, cycles, and subgraphs.
Categories: IEEE Members Only

PrePrint: Microarray Time Course Experiments: Finding Profiles

Time-course studies with microarray techniques and experimental replicates are very useful in biomedical research. We present, in replicate experiments, an alternative approach to select and cluster genes according to a new measure for association between genes. First the procedure normalizes and standardizes the expression profile of each gene and then identifies scaling parameters that will further minimize the distance between replicates of the same gene. Then, the procedure filters out genes with a flat profile, detects differences between replicates and separates genes without significant differences from the rest. For this last group of genes, we define a mean profile for each gene and use it to compute the distance between two genes. Next, a hierarchical clustering procedure is proposed, a statistic is computed for each cluster to determine its compactness and the total number of classes is determined. For the rest of the genes, those with significant differences between replicates, the procedure detects where the differences between replicates lie, and assigns each gene to the best fitting previously identified profile or defines a new profile. We illustrate this new procedure using simulated data and a representative data set arising from a microarray experiment with replication, and we report interesting results.
Categories: IEEE Members Only

PrePrint: Estimating Haplotype Frequencies by Combining Data from Large DNA Pools with Database Information

We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of ascertained individuals. Our goal is to estimate the haplotype frequencies among the ascertained individuals by combining the pooled allele frequency data with prior knowledge about the possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci the proposed method performs similarly to an EM algorithm that uses a multinormal approximation for the pooled allele frequencies but does not utilize prior information about the haplotypes. The method has been implemented in Matlab code, which is available upon request from the authors.
Categories: IEEE Members Only

PrePrint: A Markov Blanket-Based Model for Gene Regulatory Network Inference

An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray datasets is presented. The inferred gene regulatory network (GRN) is based on the time series gene expression data capturing the underlying gene interactions. For constructing a highly accurate GRN, the proposed method performs (i) discovery of a gene's Markov Blanket (MB), (ii) formulation of a flexible measure to determine the network's quality, (iii) efficient searching with the aid of a guided genetic algorithm, and (iv) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic and yeast cell-cycle gene expression data sets. The realistic synthetic datasets validate the robustness of the method by varying topology, sample size, time-delay, noise, vertex in-degree and presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell-cycle data is investigated for its biological relevance using well known interactions, sequence analysis, motif patterns and GO data. Further, novel interactions are predicted for the unknown genes of the network and their influence on other genes is also discussed.
Categories: IEEE Members Only

PrePrint: Pairwise Statistical Significance of Local Sequence Alignment Using Sequence-Specific and Position-Specific Substitution Matrices

Pairwise sequence alignment is a central problem in bioinformatics which forms the basis of many other applications. Two related sequences are expected to have a high alignment score, but relatedness is usually judged by statistical significance rather than by alignment score. Recently, it was shown that pairwise statistical significance gives promising results as an alternative to database statistical significance for getting individual significance estimates of pairwise alignment scores. The improvement was mainly attributed to making the statistical significance estimation process more sequence-specific and database-independent. In this paper, we use sequence-specific and position-specific substitution matrices to derive the estimates of pairwise statistical significance, which is expected to use more sequence-specific information in estimating pairwise statistical significance. Experiments on a benchmark database with sequence-specific substitution matrices at different levels of sequence-specific contribution were conducted. The results confirm that using sequence-specific substitution matrices for estimating pairwise statistical significance is significantly better than using a standard matrix like BLOSUM62, and better than the database statistical significance estimates reported by popular database search programs like BLAST, PSI-BLAST (without pre-trained PSSMs), and SSEARCH. With pre-trained PSSMs, however, PSI-BLAST results are significantly better. Further, using position-specific substitution matrices for estimating pairwise statistical significance gives significantly better results even than PSI-BLAST using pre-trained PSSMs.
Categories: IEEE Members Only

PrePrint: The Impact of Multiple Protein Sequence Alignment on Phylogenetic Estimation

Multiple sequence alignment is typically the first step in estimating phylogenetic trees, with the assumption being that as alignments improve, so will phylogenetic reconstructions. Over the last decade or so, new multiple sequence alignment methods have been developed to improve comparative analyses of protein structure, but these new methods have not been typically used in phylogenetic analyses. In this paper, we report on a simulation study that we performed to evaluate the consequences of using these new multiple sequence alignment methods in terms of the resultant phylogenetic reconstruction. We find that while alignment accuracy is positively correlated with phylogenetic accuracy, the amount of improvement in phylogenetic estimation that results from an improved alignment can range from quite small to substantial. We observe that phylogenetic accuracy is most highly correlated with alignment accuracy when sequences are most difficult to align, and that variation in alignment accuracy can have little impact on phylogenetic accuracy when alignment error rates are generally low. We discuss these observations and implications for future work.
Categories: IEEE Members Only

PrePrint: A Weighted Principal Component Analysis and Its Application to Gene Expression Data

In the first part of this work we introduce new developments in Principal Component Analysis (PCA), and in the second part a new method to select variables (genes in our application). Our focus is on problems where the values taken by each variable do not all have the same importance and where the data may be contaminated with noise and contain outliers, as is the case with microarray data. The usual PCA is not appropriate for this kind of problem. In this context, we propose the use of a new correlation coefficient as an alternative to Pearson's. This leads to a so-called weighted PCA (WPCA). In order to illustrate the features of our WPCA and compare it with the usual PCA, we consider the problem of analysing gene expression datasets. In the second part of this work we propose a new PCA-based algorithm to iteratively select the most important genes in a microarray dataset. We show that this algorithm produces better results when our WPCA is used instead of the usual PCA. Furthermore, by using Support Vector Machines, we show that it can compete with the Significance Analysis of Microarrays algorithm.
Categories: IEEE Members Only

PrePrint: Topology Improves Phylogenetic Motif Functional Site Predictions

Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing tree topologies of each alignment fragment to that of the complete phylogeny. Herein, we bypass the phylogenetic reconstruction step and identify PMs directly from distance matrix comparisons. In order to optimize the new algorithm, we consider three different distance matrices and thirteen different matrix similarity scores. We assess the performance of the various approaches on a structurally non-redundant dataset that includes three types of functional site definitions. Without exception, the predictive power of the original approach outperforms the distance matrix variants. While the distance matrix methods fail to improve upon the original approach, our results are important because they clearly demonstrate that the improved predictive power is based on the topological comparisons. That is, phylogenetic trees are a straightforward yet powerful way to improve functional site prediction accuracy. While complementary studies have shown that topology improves predictions of protein-protein interactions, this report represents the first demonstration that trees improve functional site predictions as well.
Categories: IEEE Members Only

PrePrint: On the Characterization and Selection of Diverse Conformational Ensembles, with Applications to Flexible Docking

To address challenging flexible docking problems, a number of docking algorithms pre-generate large collections of candidate conformers. To remove the redundancy from such ensembles, a central problem in this context is to report a selection of conformers maximizing some geometric diversity criterion. We make three contributions to this problem. First, we resort to geometric optimization so as to report selections maximizing the molecular volume or molecular surface area (MSA) of the selection. Greedy strategies are developed, together with approximation bounds. Second, to assess the efficacy of our algorithms, we investigate two conformer ensembles corresponding to a flexible loop of four protein complexes. By focusing on the MSA of the selection, we show that our strategy matches the MSA of standard selection methods while using a number of conformers between one and two orders of magnitude smaller. This observation is qualitatively explained using the Betti numbers of the union of balls of the selection. Finally, we place the conformer selection problem back in the context of multiple-copy flexible docking. On the aforementioned systems, we show that using the loops selected by our strategy can improve the result of the docking process.
Categories: IEEE Members Only

PrePrint: Influence of Prior Knowledge in Constraint-Based Learning of Gene Regulatory Networks

Constraint-based structure learning algorithms generally perform well on sparse graphs. Although sparsity is not uncommon, there are some domains where the underlying graph can have some dense regions; one of these domains is gene regulatory networks, which is the main motivation to undertake the study described in this paper. We propose a new constraint-based algorithm that can both increase the quality of output and decrease the computational requirements for learning the structure of gene regulatory networks. The algorithm is based on and extends the PC algorithm. Two different types of information are derived from the prior knowledge; one is the probability of existence of edges, and the other is the nodes that seem to be dependent on a large number of nodes compared to other nodes in the graph. Also a new method based on Gene Ontology for gene regulatory network validation is proposed. We demonstrate the applicability and effectiveness of the proposed algorithms on both synthetic and real data sets.
Categories: IEEE Members Only

PrePrint: F²Dock: Fast Fourier Protein-Protein Docking

The functions of proteins are often realized through their mutual interactions. Determining a relative transformation for a pair of proteins and their conformations which form a stable complex, reproducible in nature, is known as docking. It is an important step in drug design, structure determination and understanding function and structure relationships. We provide a scoring model for rigid docking and error-bounded approximation algorithms to predict docking sites. Translational search is sped up using the Fourier domain. Shape-based interactions are shown to give good results for a large range of pairs of proteins.
Categories: IEEE Members Only

PrePrint: Peak Tree: A New Tool for Multiscale Hierarchical Representation and Peak Detection of Mass Spectrometry Data

In mass spectrometry (MS) analysis, false peak detection results are unavoidable due to severe spectrum variations. However, most current peak detection methods are neither robust enough to resist spectrum variations nor flexible enough to revise false detection results. To improve flexibility, we introduce the peak tree to represent the peak information in MS spectra. Each tree node is a peak judgment on a range of scales, and each tree decomposition, as a set of nodes, is a candidate peak detection result. To improve robustness, we combine peak detection and common peak alignment into a closed-loop framework, which finds the optimal decomposition considering both peak intensity and common peak information. The common peak information is derived from the density clustering of peaks detected throughout the MS database and iteratively refined to direct peak tree decomposition. Finally, we present an improved ant colony optimization (ACO) biomarker selection method to build an MS analysis system based on the peak tree. Experiments show that our peak detection method can better resist spectrum variations and provide higher sensitivity and lower false detection rates than conventional methods. The benefits of our peak tree based system for MS disease analysis are also demonstrated on real SELDI data.
Categories: IEEE Members Only

PrePrint: Predicting Metabolic Fluxes Using Gene Expression Differences as Constraints

A standard approach to estimate intracellular fluxes on a genome-wide scale is flux balance analysis (FBA), which optimizes an objective function subject to constraints on (relations between) fluxes. The performance of FBA models heavily depends on the relevance of the formulated objective function and the completeness of the defined constraints. Previous studies indicated that FBA predictions can be improved by adding regulatory on/off constraints. These constraints were imposed based on either absolute (Shlomi2007a,Covert2004) or relative (Shlomi2008) gene expression values. We provide a new algorithm that directly uses regulatory up/down constraints based on gene expression data in FBA optimization (tFBA). Our assumption is that if the activity of a gene drastically changes from one condition to the other, the flux through the reaction controlled by that gene will change accordingly. The potential of the proposed method, tFBA, is demonstrated through the analysis of fluxes in yeast under nine different cultivation conditions. We illustrate that changes in gene expression are predictive for changes in fluxes. We compare tFBA and FBA predictions to show that our approach yields more biologically relevant results.
Categories: IEEE Members Only
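
For readers unfamiliar with FBA, the optimization mentioned in the abstract above is commonly written as a linear program. The symbols below are the conventional textbook ones and are assumptions here, not notation taken from the paper: S is the stoichiometric matrix, v the flux vector, c the objective weights, and v^min, v^max the flux bounds. The up/down constraints of tFBA are not shown because the abstract does not give their form.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

The equality constraint expresses the steady-state assumption: every internal metabolite is produced exactly as fast as it is consumed.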

PrePrint: A Partial Set Covering Model for Protein Mixture Identification Using Mass Spectrometry Data

Protein identification is a key and essential step in mass spectrometry (MS) based proteome research. To date, there are many protein identification strategies that employ either MS data or MS/MS data for database searching. While MS-based methods provide wider coverage than MS/MS-based methods, their identification accuracy is lower since MS data have less information than MS/MS data. Thus, it is desired to design more sophisticated algorithms that achieve higher identification accuracy using MS data. Peptide Mass Fingerprinting (PMF) has been widely used to identify single purified proteins from MS data for many years. In this paper, we extend this technology to protein mixture identification. First, we formulate the problem of protein mixture identification as a Partial Set Covering (PSC) problem. Then, we present several algorithms that can solve the PSC problem efficiently. Finally, we extend the partial set covering model to both MS/MS data and the combination of MS data and MS/MS data. The experimental results on simulated data and real data demonstrate the advantages of our method: (1) it outperforms previous MS-based approaches significantly; (2) it is useful in the MS/MS-based protein inference; and (3) it combines MS data and MS/MS data in a unified model such that the identification performance is further improved.
Categories: IEEE Members Only
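
To make the partial set covering formulation above concrete, here is a minimal sketch of the textbook greedy heuristic. It is not one of the algorithms from the paper, which the abstract does not describe. The names `greedy_partial_cover`, `peaks`, and `proteins`, and the idea of representing each candidate protein as the set of observed peaks it could explain, are assumptions for illustration.

```python
def greedy_partial_cover(universe, candidates, min_covered):
    """Pick candidate sets greedily until at least `min_covered` elements
    of `universe` are covered (or no candidate adds anything new).

    universe:   iterable of elements to explain (e.g. observed peak ids).
    candidates: dict mapping a name (e.g. a protein id) to a set of elements.
    Returns (chosen names in pick order, set of elements left uncovered).
    """
    uncovered = set(universe)
    covered_count = 0
    chosen = []
    while covered_count < min_covered:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # the target cannot be reached with these candidates
        chosen.append(best)
        uncovered -= gain
        covered_count += len(gain)
    return chosen, uncovered


# Toy data: six peaks and four hypothetical proteins.
peaks = {1, 2, 3, 4, 5, 6}
proteins = {"P1": {1, 2, 3}, "P2": {3, 4}, "P3": {5}, "P4": {4, 5, 6}}
selection, left_over = greedy_partial_cover(peaks, proteins, min_covered=5)
```

Requiring only partial coverage lets the selection ignore a few peaks that no plausible protein explains, such as noise peaks, rather than adding a protein just to account for them.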

PrePrint: Fast Surface-Based Travel Depth Estimation Algorithm for Macromolecule Surface Shape Description

Travel Depth, introduced by Coleman and Sharp in 2006, is a physical interpretation of molecular depth, a term frequently used to describe the shape of a molecular active site or binding site. Travel Depth can be seen as the physical distance a solvent molecule would have to travel from a point of the surface, i.e., the Solvent Excluded Surface (SES), to its convex hull. Existing algorithms providing an estimation of the Travel Depth are based on a regular sampling of the molecule volume and on the use of Dijkstra's shortest path algorithm. Since Travel Depth is only defined on the molecular surface, this volume-based approach is characterized by a large computational complexity due to the processing of unnecessary samples lying inside or outside the molecule. In this paper, we propose a surface-based approach that restricts the processing to data defined on the SES. This algorithm significantly reduces the complexity of Travel Depth estimation and makes possible the analysis of large macromolecule surface shape description with high resolution. Experimental results show that compared to existing methods, the proposed algorithm achieves accurate estimations with considerably reduced processing times.
Categories: IEEE Members Only
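
The abstract above mentions Dijkstra's shortest path algorithm as the core of existing Travel Depth estimators. The sketch below shows only that building block, as a multi-source variant where every vertex touching the convex hull starts at depth zero. The graph format, the vertex names, and the choice of sources are assumptions; this is neither the volume-based method nor the surface-based algorithm proposed in the paper. Paths restricted to the edges of a given graph can only overestimate the true shortest distance.

```python
import heapq


def multi_source_dijkstra(graph, sources):
    """Shortest distance from the nearest source to every vertex.

    graph:   dict mapping a vertex to a list of (neighbour, edge_length)
             pairs; every neighbour must also be a key of the dict.
    sources: vertices that start at distance 0 (here, vertices assumed to
             lie on the convex hull).
    """
    dist = {v: float("inf") for v in graph}
    heap = []
    for s in sources:
        dist[s] = 0.0
        heap.append((0.0, s))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, length in graph[u]:
            candidate = d + length
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Toy example: two hull vertices (h1, h2) and two pocket vertices (a, b).
toy_graph = {
    "h1": [("a", 1.0)],
    "a": [("h1", 1.0), ("b", 1.5)],
    "b": [("a", 1.5), ("h2", 2.0)],
    "h2": [("b", 2.0)],
}
depths = multi_source_dijkstra(toy_graph, ["h1", "h2"])
```

Starting from all hull vertices at once avoids running one search per surface point, since each vertex simply receives the distance to whichever source is closest.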