David Bioinfo

Why ‘rm -rf’ is scarier than a pipette tip, and other truths of digital biology.

Introduction: Hello, World (of Omics)

    bwa mem genome.fa sample_R1.fastq sample_R2.fastq > aligned.sam
    samtools sort -@8 aligned.sam -o sorted.bam
    freebayes -f genome.fa sorted.bam > variants.vcf

Then I wait. This is when I practice patience. And refresh my email 47 times.

I found 10,000 variants. The lab expected 5. Did I mis-call indels? Is there a batch effect? Did someone accidentally use the mouse reference genome again? (It happened once. Once.)
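Before panicking, I like to confirm the number and the reference with something dumber than my pipeline. Here is a minimal sketch, assuming an uncompressed VCF. The function names `count_variants` and `vcf_reference` are my own inventions for illustration, and the `##reference=` header line is optional in VCF, so it may simply be absent from your file.

```shell
#!/usr/bin/env bash
# Sketch only: quick sanity checks on an uncompressed VCF file.
# Both function names are hypothetical helpers, not part of any tool.

count_variants() {
    # Count record lines, i.e. every line that does not start with '#'.
    # grep exits non-zero when the count is 0, so ignore its status.
    grep -vc '^#' -- "$1" || true
}

vcf_reference() {
    # Print the value of the first '##reference=' header line, if any.
    # Prints nothing when the variant caller did not write one.
    grep -m1 '^##reference=' -- "$1" | cut -d= -f2-
}
```

Running `count_variants variants.vcf` and `vcf_reference variants.vcf` takes seconds, and finding a mouse genome in the header now beats finding it in the group meeting.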

The first rule: always check your checksums.
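In practice that rule can be as small as two functions. This is a rough sketch, assuming `sha256sum` from GNU coreutils is installed (on macOS you would likely need `shasum -a 256` instead). The names `record_checksums` and `verify_checksums` and the manifest file are hypothetical, made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: record checksums once, verify them before every run.
# Assumes GNU coreutils sha256sum; function names are hypothetical.

record_checksums() {
    # Usage: record_checksums MANIFEST FILE...
    # Writes one "hash  filename" line per file into MANIFEST.
    local manifest=$1
    shift
    sha256sum -- "$@" > "$manifest"
}

verify_checksums() {
    # Usage: verify_checksums MANIFEST
    # Returns non-zero if any listed file is missing or has changed.
    # --quiet suppresses the "OK" lines and reports only failures.
    sha256sum --check --quiet -- "$1"
}
```

For example, `record_checksums raw.sha256 sample_R1.fastq sample_R2.fastq` on the day the data arrives, then `verify_checksums raw.sha256` before each alignment. The manifest stores file names as given, so run both commands from the same directory.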

You can have the cleanest pipeline, the most parallelized code, and a server with 1TB of RAM. But if you don’t understand the biological question, you’re just moving bytes around.

At 1:00 PM, the wet-lab team sends me an email: “Hi David, we ran the PCR. Can you just ‘quickly’ align this to the genome and find every variant associated with that rare disease? Thanks! Need it by 3 PM.” I smile. I type. I invoke the sacred magic:

Hi! I’m David. Ask me what I do, and you’ll get a different answer depending on the day.