Highly variable region
For B-cell antibodies, the most lineage-defining element is usually the CDR3 region. Because of the mechanism of B-cell antibody maturation, the CDR3 region varies widely across lineages in both sequence and length. By providing the amino acid sequence of the CDR3 region, users instruct Absibling to focus on antibody sequences from closely related lineages. This analysis does not require the full-length antibody sequence, which is often a highly sensitive R&D secret. Because the CDR3 region is short, it may help to include about 5 flanking amino acids on each side to ensure search sensitivity.
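As a rough illustration of preparing such a query locally, the sketch below cuts a CDR3 plus about 5 flanking residues out of a longer protein sequence, so that only the short fragment needs to be submitted. The function name `cdr3_with_flanks` and the example sequences are hypothetical and not part of Absibling.

```python
def cdr3_with_flanks(protein: str, cdr3: str, flank: int = 5) -> str:
    """Return `cdr3` plus up to `flank` residues on each side, taken from `protein`.

    Hypothetical helper: it assumes the CDR3 occurs as an exact substring of
    the supplied protein sequence.
    """
    start = protein.find(cdr3)
    if start == -1:
        raise ValueError("CDR3 not found in the supplied sequence")
    end = start + len(cdr3)
    # Clamp to the sequence boundaries when fewer than `flank` residues exist.
    return protein[max(0, start - flank):min(len(protein), end + flank)]


# Made-up heavy-chain fragment and CDR3, for illustration only.
fragment = "DTAVYYCARDRGYSSGWYFDVWGQGTLVTVSS"
query = cdr3_with_flanks(fragment, "ARDRGYSSGWYFDV")
```

Only `query` would then be pasted into the form; the rest of the sequence stays local.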
This section defines the extent of conservation within the variable region between the query and the searched target sequences. When flanking peptides are included, the conservation requirement applies to the entire length of the query sequence, flanking peptides included.
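How Absibling scores conservation is not described here. As one possible reading, the sketch below computes the fraction of identical positions between a query and a target of the same length and compares it with a threshold. The function names, the equal-length restriction, and the 0.8 default are all assumptions made for illustration.

```python
def fraction_identical(query: str, target: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    if not query or len(query) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == t for q, t in zip(query, target))
    return matches / len(query)


def passes_conservation(query: str, target: str, min_identity: float = 0.8) -> bool:
    """True if `target` is at least `min_identity` identical to `query`.

    Assumption: targets of a different length are simply rejected. A real
    search may align sequences instead of comparing them position by position.
    """
    if not query or len(query) != len(target):
        return False
    return fraction_identical(query, target) >= min_identity
```

Because the comparison runs over the whole query, any flanking residues included in the query count toward the identity in the same way as the CDR3 itself.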
Primer with MID tag
Provide the sequence of the primer that is tagged with a molecular identifier (MID). It should contain only the sequence complementary to the corresponding antibody coding region (excluding the MID), for example the "IgG-C" primer in the figure below. This information helps Absibling extract the MID in order to track PCR amplicons derived from the same RNA molecule.
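To make the role of the primer sequence concrete, here is a minimal sketch of how a MID might be located once the primer is known. It assumes the MID has a fixed length, sits immediately upstream of the primer sequence in the read, and that the primer matches exactly in the read's orientation. The actual library layout is the one shown in the figure, and the names `extract_mid` and `group_reads_by_mid` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def extract_mid(read: str, primer: str, mid_length: int) -> Optional[str]:
    """Return the `mid_length` bases directly upstream of `primer`, or None.

    Assumptions: exact primer match, fixed-length MID, MID adjacent to the primer.
    """
    pos = read.find(primer)
    if pos == -1 or pos < mid_length:
        return None  # primer missing, or not enough sequence upstream of it
    return read[pos - mid_length:pos]


def group_reads_by_mid(reads: Iterable[str], primer: str, mid_length: int) -> Dict[str, List[str]]:
    """Group reads that share a MID, i.e. that likely derive from one RNA molecule."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        mid = extract_mid(read, primer, mid_length)
        if mid is not None:
            groups[mid].append(read)
    return dict(groups)
```

Reads that end up in the same group are treated as PCR copies of a single RNA molecule, which is what makes it possible to count molecules separately from reads.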
Variable gene identity
Specify the identity of the variable gene germline. If you are not sure, analyze your antibody nucleotide sequence using Abgermline. It is crucial that the germline identity is determined using the same database that we use.
Drag and drop FASTQ data derived from Illumina paired-end sequencing. Accepted file types are FASTQ (*.fastq) and compressed FASTQ (*.gz). The sequenced library should be prepared following the method illustrated below, or a similar one. To be compatible, the constant region primer used for reverse transcription should be tagged with a molecular identifier (MID). The relative positions of the P7 and P5 primers and the sequence of IgG-V do not affect the analysis.
Support level tag
When a sibling sequence is supported by more than one RNA molecule and at least one of those RNA molecules is represented by more than one read, the sibling is labeled 'mMmR' (see the figure below). When a sibling sequence is supported by a single RNA molecule that is represented by more than one read, the sibling is labeled 'sMmR'. The scenarios for 'mMsR' and 'sMsR' are likewise illustrated in the figure below.
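The labeling rule can be sketched as a small function that takes, for one sibling sequence, the number of reads observed for each supporting RNA molecule (each MID). The 'mMmR' and 'sMmR' cases follow the definitions above. The 'mMsR' and 'sMsR' cases are inferred from the naming pattern (multiple or single molecule, single read each) and should be checked against the figure. The function name `support_tag` is hypothetical.

```python
from typing import Sequence


def support_tag(reads_per_molecule: Sequence[int]) -> str:
    """Return 'mMmR', 'sMmR', 'mMsR' or 'sMsR' for one sibling sequence.

    `reads_per_molecule` holds one read count per supporting RNA molecule.
    """
    counts = [n for n in reads_per_molecule if n > 0]
    if not counts:
        raise ValueError("a sibling needs at least one supporting read")
    # 'm' = multiple, 's' = single; first for molecules (M), then for reads (R).
    molecules = "m" if len(counts) > 1 else "s"
    reads = "m" if max(counts) > 1 else "s"
    return f"{molecules}M{reads}R"


tags = [support_tag(c) for c in ([3, 1], [4], [1, 1], [1])]
```

Here `[3, 1]` stands for two molecules, one of them seen in three reads, and `[4]` for a single molecule seen in four reads.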