ORFLine: a bioinformatic pipeline to prioritize small open reading frames identifies candidate secreted small proteins from lymphocytes.
Hu F., Lu J., Matheson LS., Díaz-Muñoz MD., Saveliev A., Xu J., Turner M.
Motivation: The annotation of small open reading frames (smORFs) of <100 codons (<300 nucleotides) is challenging due to the large number of such sequences in the genome.
Results: In this study, we developed a computational pipeline, which we have named ORFLine, that stringently identifies smORFs and classifies them according to their position within transcripts. We identified a total of 5744 unique smORFs in datasets from mouse B and T lymphocytes and systematically characterized them using ORFLine. We further searched smORFs for the presence of a signal peptide, which predicted known secreted chemokines as well as novel micropeptides. Four novel micropeptides show evidence of secretion and are therefore candidate mediators of immunoregulatory functions.
Availability and implementation: Freely available on the web at https://github.com/boboppie/ORFLine.
Supplementary information: Supplementary data are available at Bioinformatics online.
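To make the smORF definition and the idea of position-based classification concrete, the following minimal Python sketch scans a single transcript for ATG-initiated ORFs of <100 codons and labels each one relative to an annotated coding sequence. It is an illustration only and is not taken from the ORFLine code base: the function names (`find_smorfs`, `classify`), the three labels (`upstream`, `overlapping`, `downstream`), the restriction to ATG start codons, and the choice to count the start codon but not the stop codon towards the 100-codon limit are all assumptions made for this example.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
MAX_CODONS = 100  # smORFs are defined in the abstract as <100 codons


class Orf(NamedTuple):
    start: int   # 0-based index of the A in the ATG start codon
    end: int     # 0-based, exclusive: one past the last base of the stop codon
    codons: int  # sense codons incl. start, excl. stop (assumed convention)
    label: str   # hypothetical position label, see classify()


def classify(start: int, end: int, cds_start: int, cds_end: int) -> str:
    """Label an ORF relative to the annotated CDS (hypothetical categories)."""
    if end <= cds_start:
        return "upstream"
    if start >= cds_end:
        return "downstream"
    # Any ORF sharing bases with the CDS, including the CDS itself.
    return "overlapping"


def find_smorfs(seq: str, cds_start: int, cds_end: int) -> List[Orf]:
    """Return ATG-initiated ORFs shorter than MAX_CODONS on the forward strand.

    cds_start/cds_end are 0-based, end-exclusive transcript coordinates.
    """
    seq = seq.upper()
    orfs: List[Orf] = []
    seen_stops = set()  # only the most upstream ATG is considered per stop
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for j in range(i, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                if j not in seen_stops:
                    seen_stops.add(j)
                    codons = (j - i) // 3
                    if codons < MAX_CODONS:
                        label = classify(i, j + 3, cds_start, cds_end)
                        orfs.append(Orf(i, j + 3, codons, label))
                break  # ORFs without an in-frame stop are never reported
    return orfs


if __name__ == "__main__":
    # Toy transcript with a hypothetical annotated CDS at positions 13..28.
    transcript = "GGATGAAATAGCCATGGCCGCCGCCTAACATGTTTTGAGG"
    for orf in find_smorfs(transcript, cds_start=13, cds_end=28):
        print(orf)
```

A sequence-only scan of this kind returns very large numbers of candidates on real transcriptomes, which is the difficulty the Motivation describes; the sketch does nothing to filter them.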