Comparative genomics indicates the mammalian CD33rSiglec locus evolved by an ancient large-scale inverse duplication and suggests all Siglecs share a common ancestral region

  • Original Paper
Immunogenetics

Abstract

The CD33-related sialic acid-binding Ig-like lectins (CD33rSiglecs) are predominantly inhibitory receptors expressed on leukocytes. They are distinguishable from conserved Siglecs, such as Sialoadhesin and MAG, by their rapid evolution. A comparison of the CD33rSiglec gene cluster in different mammalian species showed that it can be divided into two subclusters, A and B. The two subclusters, inverted in relation to each other, each encode a set of CD33rSiglec genes arranged head-to-tail. Two regions of strong correspondence provided evidence for a large-scale inverse duplication, encompassing the framework CEACAM-18 (CE18) and ATPBD3 (ATB3) genes, that seeded the mammalian CD33rSiglec cluster. Phylogenetic analysis was consistent with the predicted inversion. Rodents appear to have undergone wholesale loss of CD33rSiglec genes after the inverse duplication. In contrast, CD33rSiglecs expanded in primates and many are now pseudogenes with features consistent with activating receptors. Unlike the mammalian clusters, the fish CD33rSiglec clusters show no evidence of an inverse duplication, and they display greater variation in cluster size and structure. The close arrangement of other Siglecs and CD33rSiglecs in fish is consistent with a common ancestral region for Siglecs. Expansion of mammalian CD33rSiglecs appears to have followed a large inverse duplication of a smaller primordial cluster over 180 million years ago, prior to eutherian/marsupial divergence. Inverse duplications in general could potentially have a stabilizing effect in maintaining the size and structure of large gene clusters, facilitating the rapid evolution of immune gene families.

Abbreviations

IgV: Immunoglobulin superfamily variable domain

ITAM: Immunoreceptor tyrosine-based activation motif

ITIM: Immunoreceptor tyrosine-based inhibition motif

Acknowledgments

We thank Professor Paul Crocker and Ajit Varki for critical reading of the manuscript and helpful comments.

Author information

Corresponding authors

Correspondence to Huan Cao or Alexander David Barrow.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Fig. S1

Siglec EnsEMBL gene and protein IDs. EnsEMBL gene and protein IDs, as well as chromosome/group/scaffold numbers, are listed for the genes described. Versions of the published assemblies used are given on the last page (DOC 29 kb)

Fig. S2

Human CD33rSiglec gene cluster 2D dot-plot analysis. The sequence of the whole human CD33rSiglec gene cluster is plotted on both the x and y axes. For abbreviations and gene annotations, see Fig. 1. The diagonal line shows one-to-one correspondence between the sequences on the x and y axes. Additional lines indicate regions of similarity between the two sequences. Lines perpendicular to the diagonal indicate regions likely created by an inverse duplication, whereas lines parallel to it show regions likely created by tandem duplication. Perpendicular lines corresponding to CD33rSiglec genes form two distinct sets, as shown by the circles (DOC 778 kb)
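As an illustration of how a self dot-plot separates the two kinds of duplication described in this caption, the following is a minimal sketch, not the method used in the paper (the dot-plot software is not named here). It assumes exact k-mer matching, which is a simplification of real dot-plot tools, and every name in it (`reverse_complement`, `kmer_index`, `dot_plot_matches`, the toy sequences) is hypothetical. Forward-strand matches fall on lines parallel to the main diagonal (x - y constant), while reverse-complement matches fall on lines perpendicular to it (x + y constant).

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes upper-case ACGT only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot_matches(x_seq, y_seq, k=8):
    """Return (forward, inverted) lists of (x, y) k-mer match positions.

    forward:  x_seq[x:x+k] equals y_seq[y:y+k]; runs have constant x - y
              (parallel to the main diagonal, as for tandem duplication).
    inverted: x_seq[x:x+k] equals the reverse complement of y_seq[y:y+k];
              runs have constant x + y (perpendicular to the diagonal,
              as for inverse duplication).
    """
    index = kmer_index(x_seq, k)
    forward, inverted = [], []
    for j in range(len(y_seq) - k + 1):
        word = y_seq[j:j + k]
        forward.extend((i, j) for i in index.get(word, ()))
        inverted.extend((i, j) for i in index.get(reverse_complement(word), ()))
    return forward, inverted


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real Siglec sequence).
    unit = "ACGTTGCAAGGCTCAT"
    spacer = "G" * 10
    inverse_dup = unit + spacer + reverse_complement(unit)
    tandem_dup = unit + spacer + unit

    for name, seq in (("inverse", inverse_dup), ("tandem", tandem_dup)):
        fwd, inv = dot_plot_matches(seq, seq, k=8)
        off_diagonal = [(x, y) for x, y in fwd if x != y]
        print(name, "off-diagonal forward:", len(off_diagonal),
              "inverted:", len(inv))
```

Plotting the `forward` and `inverted` points on the same axes would reproduce the qualitative picture in the caption: the main diagonal, parallel segments for tandem copies, and perpendicular segments for inverted copies.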

Fig. S3

Rhesus versus dog CD33rSiglec gene cluster dot-plot analysis. The sequence corresponding to the whole rhesus macaque CD33rSiglec gene cluster (x-axis) is plotted against that of the dog CD33rSiglec gene cluster (y-axis). For abbreviations and gene annotations, see Fig. 1. Perpendicular lines (PL1 and PL2), like those found in Fig. 2a, b, appear on one side of the fragmented diagonal lines and correspond to the rhesus subcluster B and the dog subcluster A. PL1 corresponds to Siglec-13 and CE18 in rhesus macaque subcluster B and to Siglec-9, -P3 and CE18 in dog subcluster A. PL2 corresponds to the conserved ATB3 of dog subcluster A and an unannotated region between ZNF175 and Siglec-5 in rhesus macaque (DOC 592 kb)

About this article

Cite this article

Cao, H., de Bono, B., Belov, K. et al. Comparative genomics indicates the mammalian CD33rSiglec locus evolved by an ancient large-scale inverse duplication and suggests all Siglecs share a common ancestral region. Immunogenetics 61, 401–417 (2009). https://doi.org/10.1007/s00251-009-0372-0

  • DOI: https://doi.org/10.1007/s00251-009-0372-0
