Anni



Introduction

High-throughput experiments, such as those using DNA microarrays, typically yield a list of hundreds of genes deemed potentially relevant to the process under study. A growing number of methodologies are being developed to efficiently extract and use information about these large numbers of genes.

Anni is an innovative approach to finding functional relations between genes and other biomedical concepts in free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in the literature.
An advantage of these concept profiles is that they can easily be compared, and patterns of similarity can be found efficiently, for instance with clustering approaches. An important issue is the choice of measure used to weigh the association of a concept in a profile. The challenge is to distinguish a concept that co-occurs with the concept of interest by chance from one that has a semantic relationship with it. With this in mind, we adopted a method based on likelihood ratios, which has been used successfully to identify interesting collocations. This method does not require the data to be normally distributed and is known to yield good results even on small samples. With Anni, genes with similar functions are identified by hierarchical clustering.
For each cluster, Anni provides a coherence measure together with a complete annotation of the underlying overlap of the concept profiles, a p-value indicating how exceptional the cluster is, and a link-out to the literature behind the concept associations.
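
To make the likelihood-ratio weighting more concrete, the sketch below computes the log-likelihood ratio statistic (G²) that is commonly used to detect collocations, applied here to a 2x2 table of document counts for two concepts. It is an illustration only: whether Anni uses exactly this formulation is an assumption, and the function names and count layout are hypothetical rather than taken from the Anni software.

```python
import math


def _observed_term(observed, expected):
    """Return observed * ln(observed / expected), treating 0 * ln(0) as 0."""
    if observed == 0:
        return 0.0
    return observed * math.log(observed / expected)


def log_likelihood_ratio(both, only_a, only_b, neither):
    """G^2 statistic for a 2x2 co-occurrence table (hypothetical helper).

    both    -- documents mentioning concept A and concept B
    only_a  -- documents mentioning A but not B
    only_b  -- documents mentioning B but not A
    neither -- documents mentioning neither concept
    """
    total = both + only_a + only_b + neither
    if total == 0:
        return 0.0

    row_a, row_not_a = both + only_a, only_b + neither
    col_b, col_not_b = both + only_b, only_a + neither

    # Each cell is paired with the row and column totals it belongs to.
    cells = [
        (both, row_a, col_b),
        (only_a, row_a, col_not_b),
        (only_b, row_not_a, col_b),
        (neither, row_not_a, col_not_b),
    ]

    g2 = 0.0
    for observed, row_total, col_total in cells:
        # Count expected if the two concepts occurred independently.
        expected = row_total * col_total / total
        g2 += _observed_term(observed, expected)
    return 2.0 * g2
```

A value near zero means the counts are consistent with chance co-occurrence, while larger values indicate stronger evidence of an association. The statistic does not say whether the concepts co-occur more or less often than expected, so the observed and expected counts still need to be compared to determine the direction.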

For more information:

Visit the Anni website of the Biosemantics Group Rotterdam.

Downloads:

Anni 2.0 (2007)
Program: Java Web Start
Manual: PDF file


Anni © 2005 Medical Informatics (Erasmus MC)

GATC Platform © 2002 - 2012 Erasmus MC.
Page Last Updated : Fri, 31 Aug 2007 15:57:34 +0200