Real-Time PCRs that Discriminate Mycobacterium tuberculosis and Mycobacterium bovis Based on Single Nucleotide Polymorphism and Genome Analysis of Amplicons

Accurate identification of Mycobacterium strains is necessary from a diagnostic standpoint. Differences in the gyrA 95 and gyrB (1410) gene targets due to specific single nucleotide polymorphisms (SNPs) were used to design Real-Time PCRs that can discriminate Mycobacterium tuberculosis (MT) from Mycobacterium bovis (MB), both belonging to the Mycobacterium tuberculosis complex (MTBC). The Real-Time PCR employing the VIC labeled gyrA 95 probe specifically detected the ATCC reference strain MT H37Rv and eight field strains as MT, but not MB, other members of the MTBC or the other microorganisms tested. The VIC labeled gyrA 95 assay could detect as little as 450 fg of MT. The FAM labeled gyrB (1410) probe specifically detected MB, but not MT, other members of the MTBC or the other microorganisms tested. This assay could detect as little as 600 fg of MB, but it did not detect MB in any field isolate. Both assays were completed in 78 minutes. The results of both Real-Time PCR assays did not differ significantly from culture by Student's t-test. The analytical sensitivity of the gyrA 95 assay determined by Receiver Operating Characteristic (ROC) curve analysis was 100% (95% Confidence Interval [CI] 66.4-100.0), and the analytical specificity was 100% (95% CI 75.3-100.0). The repeatability and reproducibility of both assays, tested by Bland-Altman plot, indicated that the mean Cq values were within the acceptable statistical limits of ±1.96 SD. Comparison of partial sequences of gyrA 95 and gyrB (1410) of field strains with MT H37Rv by Phylogenetic Tree and Disparity Index tests indicated that eight field strains were homologous to MT H37Rv, but one was homologous to Mycobacterium intracellulare. The Real-Time PCR assays were found to be rapid, specific, sensitive, repeatable and reproducible. Further studies are necessary to convert these assays to quantitative formats and to determine diagnostic estimates before their use in clinical settings.

Molecular assays and mini-sequencing techniques that exploit allelic variation in bacterial genomes due to single nucleotide polymorphisms (SNPs) have been described earlier (Kasai et al., 2000; Gopaul et al., 2008; Bouakaze et al., 2008; Bouakaze et al., 2010). Mutations in the alleles of members of the Mycobacterium tuberculosis complex have been used for species- and lineage-specific identification, and for broad classification of strains into three Principal Genetic Groups (PGGs), Group 1 (a and b), 2 and 3, and into SNP Cluster Groups (SCGs) (Sreevatsan et al., 1997; Brosch et al., 2002; Filliol et al., 2006; Bouakaze et al., 2010; Bouakaze et al., 2011). Thus, at position 203 in the katG gene (represented as katG 203) the mutation ACT→ACC in Mycobacterium tuberculosis strain H37Rv differentiates PGG 1a from PGG 1b; similarly, the SNP CTG→CGG in katG 463 is used to assign strains to PGG 2, while the polymorphism ACC→AGC in gyrA 95 indicates that a strain belongs to PGG 3 (Sreevatsan et al., 1997). Earlier cluster analysis of SNP diversity among Mycobacterium tuberculosis strains indicated six phylogenetically distinct SNP cluster groups (SCGs 1 to 6) and five sub-groups (3a, 3b, 3c, 6a and 6b) (Filliol et al., 2006). On this basis, the following mutations can distinguish other species from M. tuberculosis H37Rv: hsp65 631 C→T (M. canettii), 16S rRNA 1429 T→C (M. pinnipedii), gyrB (675) C→T (M. microti), gyrB (756) G→A (M. bovis, M. bovis BCG and M. caprae) and gyrB (1410) (Bouakaze et al., 2010). SNP-based assays designed to discriminate strains within the genera Brucella (Gopaul et al., 2008) and Mycobacterium (Bouakaze et al., 2011) have been reported. These techniques used Real-Time PCR assays employing differentially labeled fluorescent probes encompassing the SNP to discriminate closely related strains rapidly and specifically (Gopaul et al., 2008; Gopaul et al., 2010).
Based on the above information and the DNA sequence data of Mycobacterium tuberculosis H37Rv available in the public domain, this study had the following aims. First, we aimed to design probes containing the species-specific SNPs, and their corresponding flanking primers, for selected gene targets. Second, using the selected primer pairs, we aimed to amplify these genes and sequence them to verify the presence of the specific SNPs in the amplicons. Third, once the presence of these SNPs was confirmed, fluorescently labeled minor groove binding TaqMan probes targeting gyrA 95 and gyrB (1410) were employed to develop Real-Time PCRs that can discriminate between MT and MB. The assays were to be checked initially on reference strains and then validated further on field strains. Lastly, we aimed to compare the partial sequences of gyrA 95 and gyrB (1410) of field and reference strains to study the degree of homology of the field strains with respect to the MT reference strain, for epidemiological reasons.
The in silico analysis of the authenticity of the primer and probe sequences used in this study is described in this paper. We report here the development and validation of in-house discriminatory Real-Time PCRs targeting specific SNPs in the gyrA and gyrB genes, using reference strains of Mycobacterium and field isolates from India. The significance of comparative analysis of partial sequences of gyrA 95 and gyrB (1410), generated from amplicons derived from DNA templates of reference and field strains of Mycobacterium, is also presented.

SAMPLE IDENTIFICATION NUMBER OF FIELD ISOLATES
The identification numbers of Mycobacterium field isolates recovered from animals belonging to the family Bovidae were assigned on the basis of: a) genus and species of the bacterium isolated; b) species of animal from which the isolate was recovered; c) country of origin; d) laboratory isolation number; e) year of isolation; and f) the gene target selected for partial sequencing of amplicons from the isolate. Thus, 'MT' represents Mycobacterium tuberculosis; 'MI', Mycobacterium intracellulare; 'A', Antelope; 'G', Gazelle; 'C', Cattle; IG, AG, HT and MM represent the place and State of India, such as GG (Gandhinagar, Gujarat), HT (Hyderabad, Telangana) and MM (Mumbai, Maharashtra), respectively; IS1 to IS9 are laboratory numbers; 2010, 2011 and 2012 represent the year of isolation; and gyrA and gyrB represent the partial sequencing of these genes from PCR products amplified from DNA extracted from field isolates. For brevity, sample identification numbers are abbreviated as IS1 to IS9 in the text, except as given in Table 2.

GENOMIC DNA EXTRACTION FROM MYCOBACTERIUM AND NON-MYCOBACTERIUM STRAINS
Genomic DNA from field isolates of Mycobacterium (Table 2) and reference strains of Mycobacterium (Table 3) was extracted as per the method described for the GenoType MTBC kit from Hain Life Sciences, Germany (Cat. No. 47013). Briefly, 1 ml of liquid culture from a BACTEC tube was transferred to a 1.5 ml micro-centrifuge tube and pelleted at 10000 x g for 15 minutes. The pellet was re-suspended in 100 µl of nuclease-free water and the bacteria were heat-inactivated by incubating at 95 °C for 20 minutes. The bacterial cells were then lysed by incubating in a water bath sonicator for 20 minutes. The lysate was centrifuged at full speed (10000 x g) for 5 minutes and the supernatant was either used directly for amplification or transferred to a fresh 1.5 ml micro-centrifuge tube and stored at -20 °C until further use. Genomic DNA from non-mycobacterial standard strains (listed in Table 3) was extracted as per the protocol of the QIAamp Blood Mini Kit (Qiagen, Germany). The quality of the DNA was checked by electrophoresis on a 0.8% agarose gel.

SELECTION OF GENE TARGET, DESIGNING OF PRIMERS AND MINOR GROOVE BINDING (MGB) PROBES
The selection of eight gene targets (hsp65  and nuclease free water (36.2 µl). Each SNP PCR was performed for 40 cycles with the following thermal profile: initial denaturation at 94 °C for 5 minutes; denaturation at 94 °C for 30 seconds, annealing at 60 °C for 30 seconds and extension at 72 °C for 30 seconds; and final extension at 72 °C for 10 minutes. PCR products were purified using the QIAquick gel purification kit (QIAGEN, Germany) following the manufacturer's instructions. The purified products were quantified using a Nanodrop spectrophotometer, and their quality and purity were checked by electrophoresis on a 2% agarose gel. The eight amplicons purified as described above were subjected to nucleotide sequencing by the chain termination method using the BigDye Terminator version 3.1 cycle sequencing kit (ABI, Life Technologies, USA), following the manufacturer's instructions. Subsequently, the labeled fragments were purified and analyzed on an ABI 3130XL Genetic Analyzer (ABI, Life Technologies, USA). The primers used for sequencing were the same as those listed for normal PCR for the corresponding gene targets in Table 1.

DEVELOPMENT AND OPTIMIZATION OF REAL TIME PCRs
In the first set of experiments, reactions were performed in 0.

ANALYTICAL SPECIFICITY AND SENSITIVITY
Analytical specificity (ASp) of the assay was estimated using DNA from different reference bacteria and protozoa. Analytical sensitivity (ASe) of the assay was estimated by performing the assay on known DNA concentrations of standard Mycobacterium strains (MT and MB) in ten-fold serial dilutions, in triplicate.
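The ten-fold dilution series used for the sensitivity estimate can be sketched as follows. This is a minimal illustration, not the authors' procedure: the start and end amounts (45 ng to 4.5 ag for MT, 60 ng to 6 ag for MB) are taken from the repeatability section of this paper, while the function names `tenfold_series` and `label` are hypothetical helpers introduced here.

```python
from decimal import Decimal

# Mass units as powers of ten in grams, largest first.
UNITS = (("ng", -9), ("pg", -12), ("fg", -15), ("ag", -18))


def tenfold_series(start_grams: str, n_points: int) -> list:
    """Return n_points template amounts (in grams), each one tenth of the previous."""
    start = Decimal(start_grams)
    return [start.scaleb(-i) for i in range(n_points)]


def label(mass_grams: Decimal) -> str:
    """Format a mass in the largest unit that gives a value of at least 1."""
    for unit, exponent in UNITS:
        value = mass_grams.scaleb(-exponent)
        if value >= 1:
            return f"{value.normalize():f} {unit}"
    # Anything smaller than 1 ag is still reported in attograms.
    return f"{mass_grams.scaleb(18).normalize():f} ag"


# 45 ng down to 4.5 ag spans ten ten-fold steps, i.e. eleven dilution points.
mt_series = [label(m) for m in tenfold_series("45e-9", 11)]
mb_series = [label(m) for m in tenfold_series("60e-9", 11)]
print(mt_series)
print(mb_series)
```

Under these assumptions the sixth point of each series (450 fg for MT, 600 fg for MB) corresponds to the lowest amounts reported as detectable in this study.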

REPEATABILITY AND REPRODUCIBILITY
The intra-assay repeatability was determined by performing the assay in triplicate on the same day using serially diluted DNA from standard strains of Mycobacterium tuberculosis (ATCC 27294) (45 ng to 4.5 ag) and Mycobacterium bovis BCG (ATCC 35734) (60 ng to 6 ag). The inter-assay reproducibility of the Real-Time PCR was determined by performing the assay with the same set of DNA in triplicate on three different days.
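Agreement between two runs of the same dilution series is commonly summarized by Bland-Altman limits of agreement, that is, the mean difference ± 1.96 SD of the paired differences. The sketch below shows that calculation only; the Cq values are invented for illustration and are not data from this study, and `bland_altman` is a hypothetical helper rather than the MedCalc routine used by the authors.

```python
from statistics import mean, stdev


def bland_altman(run_a, run_b):
    """Return (bias, lower limit, upper limit, differences outside the limits)."""
    diffs = [a - b for a, b in zip(run_a, run_b)]
    bias = mean(diffs)        # mean difference between the two runs
    sd = stdev(diffs)         # sample standard deviation of the differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    outside = [d for d in diffs if not lower <= d <= upper]
    return bias, lower, upper, outside


# Hypothetical mean Cq values for six dilution points measured on two days.
day_1 = [15.2, 18.6, 22.1, 25.4, 28.9, 32.3]
day_2 = [15.4, 18.5, 22.4, 25.3, 29.1, 32.2]

bias, lower, upper, outside = bland_altman(day_1, day_2)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f}), outside={outside}")
```

An assay would be read as reproducible in this sense when few or no paired differences fall outside the two limits.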

ANALYSIS OF NUCLEOTIDE SEQUENCES
Nucleotide sequencing was performed on amplicons greater than 300 bp in size generated from DNA templates of field isolates, using sequencing primers (Table 1) specific for gyrA 95 and gyrB (1410). The sequences (electropherograms) obtained for each sample by paired sequencing with forward and reverse primers were analyzed using SeqScape v2.0 software. The authenticity of each base call was verified against the corresponding base call in the paired sequence. Further, BLASTN analysis was performed on each sequence (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to identify the Mycobacterium species and to obtain the percent nucleotide homology. Multiple sequence alignment of the partial nucleotide sequences (gyrA and gyrB) from this study, along with published sequences retrieved from GenBank, was performed using Clustal W v2 in order to compare the nucleotide and amino acid sequences. The sequences obtained were compared with the published sequences of ATCC reference strains of Mycobacterium available in the public domain to determine a) the actual presence of specific SNPs (Bouakaze et al., 2010), and b) the degree of sequence homology of the field strains with respect to reference strains. Based on the partial nucleotide sequences of gyrB (1410) from 9 Mycobacterium field isolates, a Phylogenetic Tree was constructed using the Maximum Likelihood method, and sequence homology among field strains and reference strains with respect to the gyrA and gyrB genes was analyzed using the Disparity Index Test.

STATISTICAL ANALYSIS
The data on ASp and ASe of Real-

PCR:
The amplified gene targets hsp65 631, katG 463, 16S rRNA 1429, gyrA 95, gyrB (675), gyrB (675+756), gyrB (756) and gyrB (1410) from DNA of the reference strain of Mycobacterium tuberculosis (H37Rv) were identified by their amplicon size -72, 72, 120, 198, 123, 150, 271  above PCR products revealed the existence of SNPs at the specific expected positions when compared, by BLASTn analysis, with the sequence information of M. tuberculosis (H37Rv) available in the public domain. Amplicons greater than 300 bp were generated from the above gene targets using sequencing primers (Table 1); specific amplification of some of the representative genes is shown in Fig 1c. ). In addition, partial sequences of the above isolates were obtained from amplification of gyrA and gyrB gene sequences greater than 300 bp using sequencing primers (Table 2). These partial sequences were matched with Mycobacterium sequences available in the public domain by BLASTn analysis. Sequence data of 8/9 isolates matched the M. tuberculosis H37Rv reference strain, while 1/9 isolates showed homology with the sequence of M. intracellulare. A total of 17 partial sequences generated from gyrA and gyrB were submitted to GenBank; the GenBank Accession numbers are provided in Table 2. Sequence homologies among field strains and reference strains were supported by the Disparity Index Test results (Table 4a and 4b), and the phylogenetic tree constructed from partial sequences of gyrB also clustered 8 isolates with the MT reference strain and 1 isolate with MI (Figure 3).
Mycobacterium tuberculosis H37Rv -0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Hence the development and use of molecular diagnostic tools for rapid and specific detection of infection assumes significance. Molecular techniques such as rapid, specific and sensitive diagnostic Real-Time PCR are being increasingly applied in clinical settings to arrive at a quicker and more accurate evaluation of the status of tuberculosis infection. Molecular identification of strains of the genus Mycobacterium classified under the MTBC, for differentiation of Mycobacterium tuberculosis strains at species and lineage level and for the study of their evolution, based on variation in the genome due to SNPs, has been reported earlier (Sreevatsan et al., 1997; Brosch et al., 2002; Filliol et al., 2006; Bouakaze et al., 2010; Bouakaze et al., 2011). This approach has been extended to monitor the presence of virulent MT strains such as those belonging to the Beijing lineage (Alonso et al., 2010) and to rapidly identify drug-resistant strains (Ramirez et al., 2010).
Realizing the potential of this approach, we set about investigating whether gene targets containing the SNPs reported earlier (hsp65 631, katG 463, 16S rRNA 1429, gyrA 95, gyrB (675), gyrB (756) and gyrB (1410)), which have been found suitable and specific for differentiation of strains within the MTBC, were suitable for development of discriminatory Real-Time PCRs. The above reports had described the SNaPshot mini-sequencing approach for strain differentiation. In this work we wanted to study whether differences in these genes due to SNPs could be adopted for development of Real-Time PCRs that exploit minor groove binding probes containing these SNPs, flanked by forward and reverse primers encompassing the selected regions within a specific genome.
The initial assays reported here were designed to test whether all eight regions harboring the SNPs could be specifically amplified in a simple gel PCR format, and to prove the actual existence of these specific SNPs within two sets of amplicons (one set with amplicons less than 300 bp and the other with amplicons greater than 300 bp in size) using sequencing data from these amplicons. The sequencing information was then used to determine the extent of homology of these sequences with published sequence information for M. tuberculosis H37Rv and M. bovis available in the NCBI database. Using the above exercise for quality assurance, we further designed differentially labeled fluorescent minor groove binding probes and corresponding primer pairs for development of discriminatory Real-Time PCRs. We limited the assessment of our designs to two SNP-containing gene targets, gyrA 95 and gyrB (1410).
Although  et al., 2015) it has been emphasized that SNP Real-Time PCR possesses the potential to rapidly and specifically detect 70 MTBC strains to lineage and sub-lineage specific accuracy, employing a throughput format.
Although the present study indicated that the SNP Real-Time PCR was sensitive enough to detect 450 fg of MT and 600 fg of MB, the assay can be upgraded to a quantitative format by including an internal amplification control (IAC) in tests for determination of the copy number. The IAC can also serve as an additional quality assurance indicator (Mukherjee et al., 2015). The efficiency and robustness of these assays also need to be tested in future by evaluating their performance on DNA templates generated from diverse clinical matrices. The development of assays reported here is restricted to discrimination of strains within the MTBC. However, they are unable to distinguish lineages or sub-lineages within the MTBC, since our recent spoligotyping analysis of 8 MT field isolates from India revealed the prevalence of 5 distinct lineages in the sub-continent  (1410) gene targets from DNA templates of eight Mycobacterium field isolates, indicated that the alleles in the SNPs for each selected target were the same as those in the published genome sequence (GenBank Accession No. NC_000962.3). In this analysis, the presence of the allele 'G' in the sequence (CGG) of katG 463 and 'G' in the sequence (AGC) of gyrA 95 in the reference and field strains seems to suggest that the field strains of MT belong to PGG 3; further, the presence of the allele 'C' and the absence of 'T' in gyrB (1410) confirmed that the field strains were not M. bovis or M. bovis BCG. However, the VIC labeled gyrA 95 probe would not be suitable for identification of MT strains belonging to PGG 1b, because the allele in such strains is 'C', which is also present in other members of the MTBC. Earlier reports (Brosch et al., 2002; Bouakaze et al., 2011) based on testing of a total of 81 field MT strains indicated that 20% to 35% belonged to PGG 1b and 65% to 80% belonged to PGG 2 and 3, of which 4% were identified as PGG 3. MT field strains assigned to PGG 3 therefore appear to be rare, and their recovery from animals in India might point towards an important link in the acquisition of MT infection by animals from the primary human host of this pathogen by a 'spill over' mechanism. 'Spill over' as a mechanism of spread of MT infection from humans to animals has been suggested earlier (Malama et al., 2014).
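The codon logic behind the PGG assignment discussed above can be written as a small lookup. This is a sketch of the scheme of Sreevatsan et al. (1997) as summarized in this paper (katG 463 CTG→CGG and gyrA 95 ACC→AGC); the PGG 1 combination is taken to be the unmutated codons at both positions, the function name `assign_pgg` is hypothetical, and the katG 203 split of PGG 1 into 1a and 1b is deliberately left out.

```python
def assign_pgg(katg463_codon: str, gyra95_codon: str) -> str:
    """Assign a Principal Genetic Group from the katG 463 and gyrA 95 codons."""
    katg = katg463_codon.upper()
    gyra = gyra95_codon.upper()
    if katg == "CTG" and gyra == "ACC":
        return "PGG 1"   # neither mutation present (assumed ancestral state)
    if katg == "CGG" and gyra == "ACC":
        return "PGG 2"   # katG 463 CTG->CGG only
    if katg == "CGG" and gyra == "AGC":
        return "PGG 3"   # katG 463 CTG->CGG and gyrA 95 ACC->AGC
    return "unassigned"  # any other combination is not covered by this scheme


# Codons reported in this study for the reference and field strains of MT.
print(assign_pgg("CGG", "AGC"))
```

The middle base of the gyrA 95 codon is the allele interrogated by the probe: 'C' (ACC) in PGG 1 and PGG 2 strains and 'G' (AGC) in PGG 3 strains, which is why a probe specific for 'G' would miss MT strains outside PGG 3.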
Although the analysis of partial sequencing data of MT and MI strains by the Disparity Index Test for gyrA and gyrB, and the profile of the Phylogenetic Tree based on gyrB, indicated complete homology of field strains with reference strains and a synonymous substitution of the SNP in gyrA 95 and gyrB (1410), we were not sure whether these mutations were naturally acquired, since our observations on the presence of SNPs in field strains were limited to parts of two genes, gyrA and gyrB, and did not cover the entire genome (Kasai et al., 2000; Ilina et al., 2013; Oudghiri et al., 2018). Also, to conclude that mutations in SNPs have been naturally acquired, it has been recommended that a larger number of housekeeping genes be investigated (Ilina et al., 2013; Oudghiri et al., 2018). Since we focused our investigation of field strains on two gene targets, it is difficult to draw a final conclusion about the evolution of mutations in these field strains.

Figure 4a: Intra-assay, inter-assay variability from 45 ng to 450 fg for gene target gyrA 95
bold under the column 'Criterion' and shown in the row mentioning the condition '>0' was considered for sensitivity and specificity of the assay at 95% confidence intervals.

Table 1 :
Details of primers and probes used for Normal, Real Time PCR and sequencing assays

Primers used for normal PCR, Real Time PCR and sequencing; Minor Groove Binding (MGB) differentially labelled fluorescent VIC c and FAM d probe sequence 5'→3'; Primers designed for PCR and sequencing of amplicons greater than 300 bp e; Forward Primer (FP) and Reverse Primer (RP) sequence 5'→3'; Amplicon size (bp); Forward Primer (FP) and Reverse Primer (RP) sequence 5'→3'
a Forward primer (FP) and Reverse primer (RP) as suggested by Bouakaze et al., 2010; b FP and RP designed in-house and used in this study for normal gel PCR, Real Time PCR and sequencing of PCR products, to check experimentally the authenticity of the designed primers and the position of SNPs with respect to the initial sequence homology data generated in silico from information in the public domain; c 2ʹ-chloro-7ʹ-phenyl-1,4-dichloro-6-carboxy fluorescein labeled (VIC) and

Advances in Animal and Veterinary Sciences February
2019 | Volume 7 | Issue 2 | Page 53

Table 3 :
Specificity of the qPCR targeting gyrase A (gyrA) and gyrase B (gyrB) genes

Sr. No.; Name and origin of the strain a, b, c; Real time PCR result: gyrA 95 gene target (Cq value), gyrB (1410) gene target (Cq value)
a American Type Culture Collection (ATCC); b United States Department of Agriculture (USDA); c Collection de l'Institut Pasteur (CIP), France. IS: field isolates of Mycobacterium.
50°C for 2 minutes and 95°C for 10 minutes, followed by cycling at 95°C for 15 seconds and 60°C for 60 seconds for 45 cycles. The Real-Time PCR assay was performed in a Rotor-Gene Q Real-Time PCR cycler (Qiagen, Germany). The specific primers used for Real-Time PCR were the same as those used for normal gel PCR (Table 1). In the second set of experiments, using the above PCR protocol, DNA extracts from MT, MB and other Mycobacterium strains, and from organisms not belonging to the genus Mycobacterium (listed in Table 3), were subjected to Real-Time PCR targeting the gyrA 95 and gyrB (1410) gene targets. Similarly, in the third set of experiments, the gyrA 95 and gyrB (1410) targets encompassing the specific SNP for MT/MB were amplified from DNA extracts of field isolates using specific primers (Table 1). For the second and third sets of experiments, purified DNA extracts from the ATCC reference strains of MT and MB were always included in duplicate as positive controls. Similarly, molecular biology grade DNA-free sterile distilled water was included in duplicate throughout these experiments as a negative control.
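As a rough check on the run time, the programmed hold times of the cycling profile above can be added up. The sketch below counts only the programmed holds (50°C for 2 minutes, 95°C for 10 minutes, then 45 cycles of 15 seconds and 60 seconds); treating the remainder of the reported 78 minutes as temperature ramping and fluorescence acquisition is an assumption made here, not something stated in the paper. `programmed_hold_seconds` is a hypothetical helper.

```python
def programmed_hold_seconds(initial_holds, cycle_steps, n_cycles):
    """Sum the programmed hold times of a PCR profile, ignoring ramp times."""
    return sum(initial_holds) + n_cycles * sum(cycle_steps)


# Profile from the text, in seconds: two initial holds, then a two-step cycle.
total_seconds = programmed_hold_seconds(
    initial_holds=[2 * 60, 10 * 60],
    cycle_steps=[15, 60],
    n_cycles=45,
)
total_minutes = total_seconds / 60
print(f"programmed hold time: {total_minutes} min")
```

The programmed holds come to a little over 68 minutes, which is consistent with a total run of 78 minutes once instrument overhead is allowed for.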
2 ml PCR strip-tubes (Qiagen Cat. No. 981005). The total reaction volume was 25 µL, comprising 12.5 µL of master mix (Eurogentec qPCR master mix No ROX), 10 picomoles of each primer (Bioserve India), 5 picomoles of MGB probe (Invitrogen Bioservices India Pvt. Ltd) and 5 µL of the template. Serially diluted DNA, ranging from 45 ng to 4.5 ag extracted from Mycobacterium tuberculosis (MT) ATCC 27294 and from 60 ng to 6 ag of Mycobacterium bovis BCG ATCC 35734 (MB), was used as the template. Two (gyrA 95 and gyrB (1410)) out of the eight gene targets (hsp65 631, katG 463, 16S rRNA 1429, gyrA 95, gyrB (675), gyrB (756), gyrB (675+756) and gyrB (1410)) were tested independently in separate series of reactions for MT and MB templates, respectively. Reaction conditions were set as follows: Hold at
Time PCRs for gyrA 95 were analyzed using Student's t-test and Receiver Operating Characteristic (ROC) curve analysis, whereas for gyrB (1410) Student's t-test alone was used. The data on repeatability and reproducibility of the gyrA 95 and gyrB (1410) Real-Time PCR assays were analyzed by Bland-Altman plot. For the analysis of data by Student's t-test, ROC and Bland-Altman plot, MedCalc® software version 14.12, 1993 -2105 was employed.
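The 95% confidence intervals reported with the 100% sensitivity and specificity estimates can be reproduced with an exact binomial (Clopper-Pearson) interval, which has a closed form when every sample is classified correctly. The sketch below makes two assumptions that are not stated in this section: that exact intervals were used, and that the estimates rest on 9 positive and 13 negative samples; those counts were chosen here only because they return the reported lower bounds of 66.4% and 75.3%. `exact_ci_all_correct` is a hypothetical helper.

```python
def exact_ci_all_correct(n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for a proportion when all n of n are correct.

    With x = n the upper bound is 1 and the lower bound solves
    p ** n = alpha / 2, i.e. p = (alpha / 2) ** (1 / n).
    """
    lower = (alpha / 2) ** (1 / n)
    return lower, 1.0


# Assumed sample counts (not stated in this section of the paper).
sens_lower, sens_upper = exact_ci_all_correct(9)    # 9 of 9 positives detected
spec_lower, spec_upper = exact_ci_all_correct(13)   # 13 of 13 negatives excluded

print(f"sensitivity 100%, 95% CI {sens_lower * 100:.1f}-{sens_upper * 100:.1f}")
print(f"specificity 100%, 95% CI {spec_lower * 100:.1f}-{spec_upper * 100:.1f}")
```

The wide lower bounds are a reminder that a 100% point estimate from a handful of strains still leaves considerable uncertainty.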

Table 4b :
Gyrase B Disparity Index Test

Table 5a :
Analysis of Specificity of gyrA 95 Real time PCR by Student's t-test

Table 5b :
Analysis of Specificity of gyrB (1410) Real time PCR by Student's t test

Table 6 :
Determination of Analytical Specificity (ASp) and Analytical Sensitivity (ASe) of gyrA 95 Real time PCR by ROC Analysis

. tuberculosis strain, while the FAM labeled SNP gyrB (1410) probe was specific for detection of M. bovis strains; both assays could be completed in 78 minutes and were repeatable and reproducible. As expected, the FAM labeled SNP gyrA 95 probe non-specifically detected all the members of the MTBC. Similarly, the VIC labeled SNP gyrB (1410) probe detected all the members of the MTBC. The results of the PCR assays validated the specificity of the primers and differentially labeled fluorescent MGB probes selected for this study, thus confirming the authenticity of our initial in silico sequence analysis. Indian field isolates of animal origin were correctly delineated as eight MT strains and one MI strain, with none identified as MB. Taken together, these results were encouraging, since they showed the potential for further validation of discriminatory SNP-based diagnostic Real-Time PCRs in clinical settings. As an extension of this study, a future program could aim to determine the diagnostic estimates for these assays using DNA templates taken directly from clinical samples; since in a recent study (Wampande M [Mukherjee et al., manuscript accepted for publication in Revue Scientifique et Technique Vol. 37 (3)], this limitation seems to restrict the repertoire of the above assays. Comparison of sequencing data from amplicons of sizes 72 bp, 72 bp, 120 bp, 198 bp, 123 bp, 271 bp, 167 bp and 150 bp generated from amplification of the gene targets hsp65 631, katG 463, 16S rRNA 1429, gyrA 95, gyrB (675), gyrB (756), gyrB (1410) and gyrB (675+756), respectively, from the DNA template of the MT H37Rv reference strain; and amplicons of size 325 bp and 301 bp generated from the gyrA 95 and gyrB