Purging-duplicates-one-haplotype-VGP6b

  • Author(s): Galaxy, VGP
  • Release: 0.7.1
  • License: CC-BY-4.0
  • UniqueID: 727b2b8c-3ff1-4736-952d-b2c215c0faa1

Purge Duplicate Contigs

Purge contigs that purge_dups marks as duplicates in a single haplotype (either haplotypic duplication or overlap duplication). This is the sixth workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).

Inputs

  1. GenomeScope model parameters [txt] (Generated by the k-mer profiling workflow)
  2. HiFi long reads - trimmed [fastq] (Generated by Cutadapt in the contigging workflow)
  3. Assembly to purge (e.g. hap1) [fasta] (Generated by the contigging workflow)
  4. K-mer database [meryldb] (Generated by the k-mer profiling workflow)
  5. Assembly to leave alone (e.g. hap2; used only for Merqury statistics) [fasta] (Generated by the contigging workflow)
  6. Estimated Genome Size [txt]
  7. Database for BUSCO lineage (recommended: latest)
  8. BUSCO lineage (recommended: vertebrata)
  9. Name of unaltered assembly
  10. Name of purged assembly

Outputs

  1. Haplotype 1 purged assembly (FASTA and GFA)
  2. Haplotype 2 purged assembly (FASTA and GFA)
  3. QC: BUSCO report for both assemblies
  4. QC: Merqury report for both assemblies
  5. QC: Assembly statistics for both assemblies
  6. QC: Nx plot for both assemblies
  7. QC: Size plot for both assemblies
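
For readers unfamiliar with the Nx plot listed among the QC outputs, the sketch below shows how Nx values are conventionally defined: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the total assembly length. This is only an illustration of the standard definition, not the code the workflow runs. The function name nx_values and the example contig lengths are hypothetical.

```python
def nx_values(lengths, thresholds=(10, 20, 30, 40, 50, 60, 70, 80, 90)):
    """Return a dict mapping each threshold x to the Nx contig length.

    Illustrative helper (hypothetical name), not part of the workflow.
    `lengths` is an iterable of contig lengths in base pairs.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    total = sum(ordered)
    pending = sorted(thresholds)             # thresholds in ascending order
    result = {}
    running = 0                              # cumulative length so far
    idx = 0                                  # next threshold to satisfy
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the boundary.
        while idx < len(pending) and running * 100 >= total * pending[idx]:
            result[pending[idx]] = length
            idx += 1
    return result


# Made-up contig lengths, for illustration only.
contigs = [80, 70, 50, 40, 30, 20, 10]
print(nx_values(contigs, thresholds=(50, 90)))  # {50: 70, 90: 30}
```

Comparing these values before and after purging gives a quick view of how contiguity changed. The workflow's own plot shows the same kind of curve for both assemblies.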