Purge-duplicate-contigs-VGP6

Purge contigs marked as duplicates by purge_dups (either haplotypic duplication or overlap duplication). This workflow is the sixth workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).

  • Author(s): Galaxy, VGP
  • Release: 0.4
  • License: CC-BY-4.0
  • UniqueID: b5d0e4ae-3041-4cbe-9d0e-a1140e5469dd

Purge Duplicate Contigs

Inputs

  1. HiFi long reads - trimmed [fastq] (Generated by Cutadapt in the contigging workflow)
  2. Primary Assembly (hap1) [fasta] (Generated by the contigging workflow)
  3. Alternate Assembly (hap2) [fasta] (Generated by the contigging workflow)
  4. K-mer database [meryldb] (Generated by the k-mer profiling workflow)
  5. GenomeScope model parameters [txt] (Generated by the k-mer profiling workflow)
  6. Estimated Genome Size [txt]
  7. Database for BUSCO lineage (recommended: latest)
  8. Lineage of your species for BUSCO orthologs (recommended: vertebrata)
  9. Name of first haplotype
  10. Name of second haplotype

Outputs

  1. Haplotype 1 purged assembly (FASTA and GFA)
  2. Haplotype 2 purged assembly (FASTA and GFA)
  3. QC: BUSCO report for both assemblies
  4. QC: Merqury report for both assemblies
  5. QC: Assembly statistics for both assemblies
  6. QC: Nx plot for both assemblies
  7. QC: Size plot for both assemblies
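
To help read the Nx plot, here is a minimal Python sketch of the usual Nx definition: with contigs sorted from longest to shortest, Nx is the length of the contig at which the running total first reaches x% of the assembly size (N50 is the case x = 50). This is not the code the workflow runs; the function names `nx` and `nx_curve` and the example lengths are made up for illustration.

```python
def nx(lengths, x=50):
    """Return Nx: the contig length at which the cumulative size of the
    contigs, taken longest first, first reaches x% of the total size."""
    if not lengths:
        raise ValueError("no contig lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


def nx_curve(lengths, step=10):
    """Return (x, Nx) pairs for x = step, 2*step, ..., up to 100."""
    return [(x, nx(lengths, x)) for x in range(step, 101, step)]


if __name__ == "__main__":
    # Hypothetical contig lengths (in bp) for illustration only.
    contigs = [80, 70, 50, 40, 30, 20, 10]
    print("N50:", nx(contigs, 50))
    print("N90:", nx(contigs, 90))
    for x, value in nx_curve(contigs):
        print(x, value)
```

Comparing the curves of the purged and unpurged assemblies shows how removing duplicated contigs shifts contiguity; a curve that stays high for larger x indicates that more of the assembly sits in long contigs.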