Based on Medline abstracts up to February 2008. Version 2.5alpha, 2009-09-14.

CoPub description

CoPub is a text mining tool that detects co-occurring biomedical concepts in abstracts from the Medline literature database. The biomedical concepts included in CoPub are all human, mouse and rat genes; biological processes, molecular functions and cellular components from Gene Ontology; and liver pathologies, diseases, drugs and pathways. Altogether, CoPub covers more than 250,000 search strings.

Special attention was given to genes and proteins. For all human, mouse and rat genes, CoPub searches not only for the long forms of names but also for their symbols and aliases, which increases recall. Symbols that also have meanings unrelated to genes or proteins are a well-known problem; dedicated scripts detect these homonyms and exclude the abstracts in which they occur, thereby increasing precision.
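
As an illustration of the idea only, the following minimal Python sketch counts how often pairs of concepts are mentioned in the same abstract when each concept is matched through its long form, symbol and aliases. The thesaurus, the example abstracts and all function names are hypothetical and are not taken from CoPub; homonym detection is not modelled here.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical thesaurus: concept identifier -> search strings
# (long form, symbol, aliases). Not CoPub's actual data.
THESAURUS = {
    "gene:TP53": ["tumor protein p53", "TP53", "p53"],
    "gene:MDM2": ["MDM2 proto-oncogene", "MDM2", "HDM2"],
    "process:apoptosis": ["apoptosis", "programmed cell death"],
}

# Made-up abstracts, for illustration only.
ABSTRACTS = [
    "MDM2 targets p53 for degradation.",
    "Loss of TP53 impairs programmed cell death.",
    "HDM2 inhibition restores p53-dependent apoptosis.",
]


def compile_patterns(thesaurus):
    """Build one case-insensitive, whole-word regex per concept."""
    patterns = {}
    for concept, terms in thesaurus.items():
        # Try longer search strings first so long forms are preferred.
        ordered = sorted(terms, key=len, reverse=True)
        alternatives = "|".join(re.escape(term) for term in ordered)
        patterns[concept] = re.compile(
            r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE
        )
    return patterns


def concepts_in(abstract, patterns):
    """Return the set of concepts mentioned at least once in an abstract."""
    return {c for c, pattern in patterns.items() if pattern.search(abstract)}


def count_cooccurrences(abstracts, thesaurus):
    """Count, per concept pair, the abstracts that mention both concepts."""
    patterns = compile_patterns(thesaurus)
    pair_counts = Counter()
    for text in abstracts:
        found = sorted(concepts_in(text, patterns))
        pair_counts.update(combinations(found, 2))
    return pair_counts


if __name__ == "__main__":
    for pair, n in count_cooccurrences(ABSTRACTS, THESAURUS).most_common():
        print(pair, n)
```

Because "p53" and "HDM2" are aliases in this toy thesaurus, the first and third abstracts both count towards the TP53/MDM2 pair even though neither contains the official symbol "TP53". That is the recall gain described above.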

CoPub is especially useful for microarray data analysis. It is often difficult to grasp the meaning of lists of differentially expressed genes. Mining Medline with gene names one by one is laborious and tedious, if not impossible, and many relevant abstracts will be missed. With CoPub you can upload a list of Affymetrix identifiers and find biomedical concepts from Medline that are significantly linked to the gene set. From every retrieved biomedical concept, one mouse click leads to the co-published genes and another to the relevant abstracts.
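
This page does not say which statistic CoPub uses to decide that a concept is "significantly linked" to a gene set. A common choice for this kind of keyword enrichment is a one-sided hypergeometric test, and the sketch below assumes that test purely for illustration. The function name, parameters and numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(total_genes, concept_genes, list_size, overlap):
    """One-sided hypergeometric p-value (an assumed test, not CoPub's documented one).

    Probability of seeing at least `overlap` concept-linked genes when
    `list_size` genes are drawn without replacement from `total_genes`,
    of which `concept_genes` are linked to the concept in the literature.
    """
    max_overlap = min(concept_genes, list_size)
    denominator = comb(total_genes, list_size)
    tail = sum(
        comb(concept_genes, k) * comb(total_genes - concept_genes, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up numbers: 20,000 genes in total, 150 co-published with the
    # concept, 200 genes in the uploaded list, 12 of them co-published.
    p = hypergeom_pvalue(20000, 150, 200, 12)
    print(f"p-value: {p:.3g}")
```

A concept would then be reported only if its p-value falls below a chosen threshold. When many concepts are tested at once, a multiple-testing correction would normally be applied as well.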

With the input list of differentially expressed genes and the output list of over-represented keywords, CoPub calculates and displays a literature-based network in SVG format, in which nodes and edges are hyperlinked to the relevant abstracts.

CoPub features
  • Fast and easy access to relevant abstracts
  • Single gene search in all categories
  • Multiple gene search in all categories
  • Single keyword search in gene category
  • Categories of biomedical concepts: genes (human, mouse, rat), liver pathologies, biological processes, molecular functions, cellular components, diseases, drugs, pathways
  • Use of long forms, symbols and aliases of genes
  • Homonym detection
  • Statistical filter to display only significant biomedical concepts
  • Based on Medline abstracts till February 2008

The microarray data analysis feature of CoPub was successfully applied to gene expression data from toxicogenomics studies to reveal the mode of toxicity of a variety of compounds (Literature-based compound profiling: application to toxicogenomics, Frijters et al., Pharmacogenomics, Nov. 2007, pmid 18034617).

CoPub: a literature-based keyword enrichment tool for microarray data analysis, Frijters et al., Nucleic Acids Research, Web Server Issue, May 2008, pmid 18442992.

CoPub project
CoPub is a continuation of an earlier version built by Erasmus MC and Organon, and is being further developed by:

Computational Drug Discovery (CDD) Group, Centre for Molecular and Biomolecular Informatics (CMBI),
Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.
SARA Computing and Network Services, Amsterdam, The Netherlands.
CoPub is hosted at SARA with the support of the Netherlands Bioinformatics Centre (NBIC).


People involved in the CoPub project:
  • Raoul Frijters
  • Jan Polman
  • Stefan Verhoeven
  • Wilco Fleuren
  • Wynand Alkema
  • Rene van Schaik
  • Jacob de Vlieg
  • Bart Heupers
  • Pieter van Beek
  • Machiel Jansen
  • Maurice Bouwhuis

Questions or comments can be sent to support@copub.org.