Rapid Targeted Assembly of the Proteome Reveals Evolutionary Variation of GC Content in Avian Lice

Publication Type:Journal Article
Year of Publication:2024
Authors:Grant, A. R., Johnson, K. P., Stanley, E. L., Baldwin-Brown, J. G., Kolencik, S., Allen, J. M.
Journal:Bioinformatics and Biology Insights
Volume:18
Pagination:11 pp
Date Published:Jan-01-2024
Type of Article:Open Access
ISSN:1177-9322
Keywords:AT (adenine/thymine) rich, base composition, Bioinformatics, computational resource efficiency, Feather lice, phylogenetic signal, protein-coding genes
Abstract:

Nucleotide base composition plays an influential role in the molecular mechanisms involved in gene function, phenotype, and amino acid composition. GC content (proportion of guanine and cytosine in DNA sequences) shows a high level of variation within and among species. Many studies measure GC content in a small number of genes, which may not be representative of genome-wide GC variation. One challenge when assembling extensive genomic data sets for these studies is the significant amount of resources (monetary and computational) associated with data processing, and many bioinformatic tools have not been optimized for resource efficiency. Using a high-performance computing (HPC) cluster, we manipulated resources provided to the targeted gene assembly program, automated target restricted assembly method (aTRAM), to determine an optimum way to run the program to maximize resource use. Using our optimum assembly approach, we assembled and measured GC content of all of the protein-coding genes of a diverse group of parasitic feather lice. Of the 499 426 genes assembled across 57 species, feather lice were GC-poor (mean GC = 42.96%) with a significant amount of variation within and between species (GC range = 19.57%-73.33%). We found a significant correlation between GC content and standard deviation per taxon for overall GC and GC3, which could indicate selection for G and C nucleotides in some species. Phylogenetic signal of GC content was detected in both GC and GC3. This research provides a large-scale investigation of GC content in parasitic lice laying the foundation for understanding the basis of variation in base composition across species.
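The two measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and does not involve aTRAM. The function names `gc_content` and `gc3_content` are hypothetical. The sketch assumes that GC3 means GC content at third codon positions, that the coding sequence starts in frame, and that ambiguous bases such as N are left out of the denominator.

```python
def gc_content(seq):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous characters (e.g. N) and gaps are ignored.
    Returns NaN when the sequence contains no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else float("nan")


def gc3_content(cds):
    """Return GC content at third codon positions.

    Assumption: the coding sequence starts in frame, so the third positions
    are at indices 2, 5, 8, ... A trailing partial codon contributes nothing.
    """
    return gc_content(cds[2::3])


if __name__ == "__main__":
    example = "ATGGCCTAA"  # made-up sequence with codons ATG, GCC, TAA
    print(f"GC  = {gc_content(example):.4f}")
    print(f"GC3 = {gc3_content(example):.4f}")
```

Multiplying either value by 100 gives a percentage comparable in form to the figures reported in the abstract.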

URL:https://journals.sagepub.com/doi/full/10.1177/11779322241257991
DOI:10.1177/11779322241257991
Tue, 2024-07-02 15:57 -- Yokb