Strains were aligned and analyzed as previously described. The VCF files contain a wealth of information, and I'm not sure what all of it means, so I'll limit my analysis to two basic pieces of information: where the SNP is (record.POS) and the number of "alternate" alleles called (record.INFO['AC']). In VCF terms, an "alternate" allele is any allele that differs from the reference sequence. I could switch to record.INFO['AF'], which gives the allele frequency as a float, but having a count in the range of 1-100 works pretty well for visualization. These Python blurbs use PyVCF.
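As a rough sketch of pulling out those two fields, something like the following should work. The function names (positions_and_counts, read_vcf) and the file path are hypothetical, not from the original analysis. PyVCF usually returns AC as a list with one entry per alternate allele, so this sketch sums the entries to get a single count per site; that is my assumption about how to handle multi-allelic sites, and taking only the first entry would be just as defensible.

```python
from types import SimpleNamespace


def positions_and_counts(records):
    """Return a list of (POS, alternate allele count) pairs.

    `records` is any iterable of objects with `.POS` and `.INFO`,
    such as the records yielded by a PyVCF `vcf.Reader`.
    """
    pairs = []
    for record in records:
        ac = record.INFO.get('AC')
        if ac is None:
            # Skip records that carry no AC annotation.
            continue
        if isinstance(ac, (list, tuple)):
            # Assumption: collapse per-allele counts into one number per site.
            ac = sum(ac)
        pairs.append((record.POS, ac))
    return pairs


def read_vcf(path):
    """Hypothetical helper: load (POS, AC) pairs from a VCF file with PyVCF."""
    import vcf  # PyVCF, imported lazily so the rest runs without it
    with open(path, 'r') as handle:
        return positions_and_counts(vcf.Reader(handle))


# Stand-in records (made-up values) that mimic the two attributes used above.
demo_records = [
    SimpleNamespace(POS=1042, INFO={'AC': [12]}),
    SimpleNamespace(POS=2310, INFO={'AC': [3, 5]}),
    SimpleNamespace(POS=4977, INFO={}),
]
demo_pairs = positions_and_counts(demo_records)
```

With a real file, read_vcf('strain.vcf') (path made up here) would return the same kind of list, ready to hand to a plotting library with position on one axis and count on the other.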