Taking the lead
April 9, 2019
Falling sequencing costs should make multidrug-resistant Mycobacterium tuberculosis (MDR-TB) more manageable. However, that is not yet the case. Developing novel technologies and lowering cost and infrastructure barriers are important for reducing the disease burden and improving its management. It is high time that India embraced this fact and developed research capabilities in two very important aspects of tuberculosis diagnostics. The first is a biospecimen/sputum collection system that enables direct whole-genome sequencing (WGS); the second is a robust bioinformatics capability for improved data analysis.
Historically, tuberculosis detection has relied heavily on microscopy as the first-line diagnostic, which often fails to detect drug-resistant genotypes of the bacterium. Traditional, culture-based diagnosis of TB typically takes several weeks, owing to the slow growth rate of M. tuberculosis. Technical advances have produced liquid culture and an automated detection system called the mycobacteria growth indicator tube (MGIT), which takes less than a fortnight (Pfyffer et al., 1997). India, with its rising number of TB infections, is well placed to take the lead in supplementing these technologies by conducting large-scale, genome-wide studies in collaboration with companies or research groups that focus on biospecimen collection systems. Some of the tools developed in this direction are SureSelectXT target enrichment, the MolYsis Basic5 kit (Molzym, Germany) and the NucleoSpin Tissue kit (Macherey-Nagel, Düren, Germany).
No next-generation sequencing (NGS) workflow is complete without sequence data analysis, which requires expertise in information technology and in clinical data analytics, both of which the Indian scientific community, commercial and academic, already possesses. In the past few years, there has been an explosion of bioinformatics platforms for expert and non-expert analysis and interpretation of M. tuberculosis (MTB) NGS data. The majority of existing tools provide a cloud-based WGS pipeline that starts processing from the raw sequence. Two important data consortia, ReSeqTB and CRyPTIC, have accumulated large datasets and maintain genomic and phenotypic data. New advancements have resulted in several WGS tools, of which Mykrobe Predictor is currently compatible with both Illumina and Oxford Nanopore WGS data.
However, there is still a large gap in data quality assessment (control parameters such as per-base and per-read quality scores), which is presently handled by tools such as FastQC. Additional steps are also needed, such as trimming reads and combining multiple sequencing files. Easy-to-use, web-based tools, such as Galaxy CloudMan and Cloud Virtual Resources, can help users achieve better outcomes. There is also a need to provide more computational space for data storage and to develop a tailored collection of analytical software that can be customized. Another important area with a large gap is NGS data reporting, including sequence variants (single nucleotide polymorphisms, or SNPs), deletions, insertions and structural variants.
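To make the quality-control steps mentioned above concrete, here is a minimal Python sketch of what scoring, trimming and combining reads can look like. It is an illustration only, not how FastQC or any of the named platforms work internally. It assumes well-formed four-line FASTQ records with Phred+33 quality encoding, and the function names (`parse_fastq`, `mean_quality`, `trim_tail`, `combine_and_filter`) and the thresholds are hypothetical choices made for this example.

```python
from typing import Iterable, Iterator, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality string)

PHRED_OFFSET = 33  # assumption: Phred+33 encoding, as in current Illumina output


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from well-formed four-line records."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # skip blank lines between records
        seq = next(it)
        next(it)  # the '+' separator line
        qual = next(it)
        yield header, seq, qual


def mean_quality(qual: str) -> float:
    """Average Phred score of one read (0.0 for an empty read)."""
    scores = [ord(c) - PHRED_OFFSET for c in qual]
    return sum(scores) / len(scores) if scores else 0.0


def trim_tail(read: Read, min_q: int = 20) -> Read:
    """Drop trailing bases whose Phred score is below min_q."""
    header, seq, qual = read
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return header, seq[:end], qual[:end]


def combine_and_filter(
    sources: Iterable[Iterable[str]], min_q: int = 20, min_len: int = 3
) -> Iterator[Read]:
    """Merge several FASTQ sources, trim each read, drop reads that end up too short."""
    for lines in sources:
        for read in parse_fastq(lines):
            trimmed = trim_tail(read, min_q)
            if len(trimmed[1]) >= min_len:
                yield trimmed


if __name__ == "__main__":
    # Two tiny in-memory "files"; real input would be open file handles.
    lane1 = ["@r1\n", "ACGTAC\n", "+\n", "IIII##\n"]
    lane2 = ["@r2\n", "GGCC\n", "+\n", "####\n"]
    for header, seq, qual in combine_and_filter([lane1, lane2]):
        print(header, seq, round(mean_quality(qual), 1))
```

Because each source is just an iterable of lines, the same functions accept open file handles, so several sequencing files from one sample can be passed in together. Production pipelines would normally rely on established trimming and QC tools rather than hand-written code like this.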
Seizing these opportunities will need large-scale collaborations for improved clinical research. India can emerge as a leader in managing TB, not just at home but for the world at large.
The author is a medical scientist and former director of SGRF, Bangalore.