Upgrading Postgres (If Needed)

featureset constraints
Example populating featureset:

Add ``maps'' into featureset.

Example populating feature_featureset

Insert new CMap map_types and feature_types
Run cmap_synchronize_chado.pl
Create Correspondences
Triggers

Installing the Triggers
What the Triggers Do
What the Triggers Do Not Do

Removing Linkage
Creating Links For Features and Maps Between CMap and Chado
Future stuff
AUTHOR

Introduction

I'm glad you've decided to synchronize your Chado data with CMap. This document will help you get started

Postgres Requirements

Postgres 8.0+ with Perl support is required to use the Perl triggers. You will need to upgrade your install of Postgres if your version does not meet the requirements.

Upgrading Postgres (If Needed)

When upgrading, first see the upgrade secion of postgres INSTALL doc.

Here is what I had to do:

- Dumped old posgres db

  $ pg_dumpall > outputfile

- kill current posgres

- Install Postgres 8.0+ with Perl

use the --with-perl flag during config

  $ ./configure --with-perl
  $ make
  $ make install
  $ su - postgres
  $ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
  $ /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logfile start

- Add the old data back

  $ /usr/local/pgsql/bin/psql -d postgres -f outputfile

Add ``untrusted'' perl to the db

  $ createlang plperlu template1
  $ createlang plperlu dbname

Add new tables to chado.


These tables give the concept of a feature set (featureset) and links it to
feature (feature_featureset) and dbxref (featureset_dbxref).  It also adds a
linker table from featureloc to dbxref (featureloc_dbxref).

    $ psql test2 < chado_integration/chado_synchronize/chado_sync_tables.sql

Create featureset.

In order for this whole thing to work, the featureset table needs to be populated. Someday, a script might help you out with this but for now you will have to do this by hand.

The concept of a featureset is the same as a map set in CMap. Basically, maps (sequence assemblies, chromosomes or whatever you want to make a map) are grouped into sets. For example, sequence assembies from the same assembly run would all be in the same set to differenciate from other assembly runs.

featureset constraints

All maps in a set must be of the same type.

All maps in a set must be from the same organism.

Maps must only belong to one featureset otherwise there will possibly be inconsistencies.

Example populating featureset:

 insert into featureset (name,uniquename,feature_type_id,organism_id) 
  (
    select distinct 
        o.abbreviation||'_'||cvt.name,
        o.abbreviation||'_'||cvt.name,
        f.type_id,
        f.organism_id
    from feature f, organism o, cvterm cvt 
    where   f.type_id = cvt.cvterm_id
        and f.organism_id = o.organism_id
        and cvt.name = 'chromosome_arm'
  );

Add ``maps'' into featureset.

In chado, the ``maps'' are stored in the feature table. Each ``map'' should be connected to it's featureset by feature_featureset.

Example populating feature_featureset

This example is not very complex. For instance it does not take into account various versions of data but you can use it as a starting point.

  insert into feature_featureset 
    select fs.featureset_id, f.feature_id 
    from featureset fs, feature f 
    where f.type_id = fs.feature_type_id;

Insert new CMap map_types and feature_types

You will need to insert new map types and feature types into the CMap config file. The accessions for these types should be the cvterm ``name'' but with spaces replaced by ``_''.

Run cmap_synchronize_chado.pl

This script will look at the featuresets that you've inserted and insert the data into CMap.

You will have several options. The most import is the feature types to look at. If you select too many, it may take a long time.

  $ chado_integration/chado_synchronize/cmap_synchronize_chado.pl --chado_datasource datasource -u sql_username [-p sql_password] --cmap_datasource cmap_datasource

Create Correspondences

You will now need to use cmap_admin.pm to create correspondences between the newly created features.

Triggers

Postgres triggers have been written to keep the data in the CMap database synchronized with chado.

Installing the Triggers

Included in this directory is a file called trigger.PL. It will help create the triggers file.

Create trigger file.
Run trigger.PL using the cmap data source (the example uses CMAP_DEMO as the cmap data source). This will create a file called ``triggers.DATASOURCE.sql'' (in the example it will create ``triggers.CMAP_DEMO.sql'').
```
  $ perl trigger.PL -d CMAP_DEMO
```

Run the resulting sql in postgres

  $ psql chado-fly < triggers.CMAP_DEMO.sql

What the Triggers Do

The triggers watch the tables; organism, featureset, feature_featureset, featureloc and feature.
Inserts

- Tables (cmap_object): featureset (map_set), feature_featureset (map) and featureloc (feature)
- Inserts data in the CMap database corresponding to its object type
- Inserts the CMap ID into the chado dbxref table and adds a link to the corresponding *_dbxref table.
- In the case of a featureset insertion, it checks to make sure the organism is in the CMap db and if not, adds it.
Updates

- Tables (cmap_object): organism (species), featureset (map_set), feature_featureset (map), featureloc (feature) and feature (map/feature)
- Does trivial updates such as name changes and featureloc position changes.
Deletes

- Tables (cmap_object): organism (species), featureset (map_set), feature_featureset (map) and featureloc (feature)
- Does cascading deletes from the CMap database. For instance, if a map set is deleted, all of the maps and features in that map set are also deleted.
- Relies on the chado deletes to cascade. If a feature is deleted, it's featureloc must also be deleted.

What the Triggers Do Not Do

Inserts

- Does not do anything when inserting into the feature table because there isn't enough information to go on.
Updates

- Does not do any substantial data changes such as changing which organism a featureset is or what the srcfeature is of a featureloc.
Deletes

- Does not remove any of the dbxref entries. Only the linker tables are removed.

Removing Linkage

If you want to remove the links to CMap in the chado database, run remove_sync_from_chado.pl (in this directory). This will remove all of the dbxref that point to the CMap database.

  $ chado_integration/chado_synchronize/remove_sync_from_chado.pl --chado_datasource datasource -u sql_username [-p sql_password] [--db_base_name db_base_name]

The options are similar to cmap_synchronize_chado.pl except db_base_name. This is the base name used to name the entries in the ``db'' table. It will be set to ``cmap'' by default (which is fine if the value in cmap_synchronize_chado.pl wasn't changed). Only if there are multiple CMap databases that need to be connected will this value be changed.

Creating Links For Features and Maps Between CMap and Chado

From the CMap side, you can specify cross-references (xrefs) for each feature or map. This is not automatic however.

Then to create the link in the image, you can modify the area_code option in the config directory for each feature type.

The following is an example of how to get an xref from the database.

  area_code <<EOF
    my $dbxrefs = $self->sql()->get_xrefs(
      cmap_object => $self,
      object_id   => $feature->{'feature_id'},
      object_type => 'feature',
      xref_name   => 'Chado',
    );
    my $new_url = '';
    if ( @{ $dbxrefs || [] } ) {
      my $t = $self->template;
      $t->process( \$dbxrefs->[0]{'xref_url'},
          { object => $feature }, \$new_url );
    }
    $url = $new_url;
    $code=sprintf("onMouseOver=\"window.status='%s';return true\"",$feature->{'feature_type_acc'});
  EOF

For more information about how the area_code works, see the ``Map, Feature and Evidence Type Information'' section of ADMINISTRATION.html.

For more information about how cross-references work, see attributes-and-xrefs.html.

Future stuff

Maybe include the trigger installation in the cmap_synchronize_chado.pl script or at the least replace the text in the trigger.template file to make it easier for the user.

AUTHOR

Ben Faga <faga@cshl.org>