Minimac4 Documentation
From Genome Analysis Wiki
A typical Minimac4 command line would have the following parameter options:
Reference Haplotypes : --refHaps [], --passOnly, --rsid, --referenceEstimates [ON], --mapFile [docs/geneticMapFile.b38.map.txt.gz]
Target Haplotypes : --haps []
Output Parameters : --prefix [Minimac4.Output], --estimate, --nobgzip, --vcfBuffer [200], --format [GT,DS], --allTypedSites, --meta, --memUsage
Chunking Parameters : --ChunkLengthMb [20.00], --ChunkOverlapMb [3.00]
Subset Parameters : --chr [], --start, --end, --window
Approximation Parameters : --minimac3, --probThreshold [0.01], --diffThreshold [0.01], --topThreshold [0.01]
Other Parameters : --log, --help, --cpus [1], --params
PhoneHome : --noPhoneHome, --phoneHomeThinning [50]
Of these, only --refHaps and --haps are required.
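As a minimal sketch of what such a command line might look like, the Python snippet below assembles an argument list from the two required options plus two optional ones listed above (--prefix and --cpus). The executable name `minimac4`, the helper name `build_minimac4_command`, and the file names are assumptions made for illustration; nothing is executed, the command is only printed.

```python
import shlex


def build_minimac4_command(ref_haps, haps, prefix=None, cpus=None,
                           executable="minimac4"):
    """Return the argument list for a basic Minimac4 run (nothing is executed).

    The executable name is an assumption; adjust it to your installation.
    """
    # --refHaps and --haps are the only required options.
    if not ref_haps or not haps:
        raise ValueError("--refHaps and --haps are both required")
    cmd = [executable, "--refHaps", ref_haps, "--haps", haps]
    # Optional parameters are appended only when a value is supplied,
    # so Minimac4 falls back to its own defaults otherwise.
    if prefix is not None:
        cmd += ["--prefix", prefix]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]
    return cmd


# Hypothetical file names, for illustration only.
example_cmd = build_minimac4_command(
    "reference.m3vcf.gz", "target.phased.vcf.gz", prefix="imputed", cpus=4
)
print(shlex.join(example_cmd))
```

The resulting list could be handed to `subprocess.run` once the paths point at real files.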
Reference Haplotypes
- --refHaps <input_m3vcf_filename>
- This option defines the reference panel in M3VCF format to impute against.
- If your reference panel is in VCF format, please use Minimac3 to convert the VCF file to M3VCF (along with parameter estimation) and then use that M3VCF for imputation using Minimac4.
- --passOnly
- DEACTIVATED! If ON, only variants with FILTER=PASS will be recorded from the reference VCF file (does NOT work on M3VCF files yet).
- --rsid
- If ON, Minimac4 will import only the RS IDs of variants from the ID column of the reference file (if available).
- --referenceEstimates
- ON by default. If ON, Minimac4 expects the input M3VCF file to come with parameter estimates; otherwise, a genetic map file must be supplied through --mapFile.
- --mapFile <input_genetic_map_file>
- This option is ignored unless --referenceEstimates is OFF.
- It defines the genetic map file used for recombination rate estimation during imputation.
- The input genetic map file should be tab-separated, with 1st column as chromosome id, 3rd column as cumulative recombination rate in cM/Mb, and 4th as genetic map coordinates in cM.
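The genetic map layout described above can be checked with a short reader before starting a run. The following is a rough sketch, not Minimac4's own parser: it assumes the file has no header line, leaves the 2nd column uninterpreted because it is not described here, and uses hypothetical names (`MapRecord`, `parse_genetic_map`) and made-up sample values.

```python
from typing import Iterable, Iterator, NamedTuple


class MapRecord(NamedTuple):
    chrom: str          # 1st column: chromosome id
    rate: float         # 3rd column: recombination rate value
    position_cm: float  # 4th column: genetic map coordinate in cM


def parse_genetic_map(lines: Iterable[str]) -> Iterator[MapRecord]:
    """Yield one MapRecord per non-empty, tab-separated line.

    Assumes no header line; a non-numeric 3rd or 4th column raises ValueError.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(
                f"line {line_no}: expected at least 4 tab-separated "
                f"columns, got {len(fields)}"
            )
        yield MapRecord(fields[0], float(fields[2]), float(fields[3]))


# Made-up sample values, for illustration only.
example_lines = [
    "chr20\t61098\t0.5\t0.0\n",
    "chr20\t61795\t0.25\t0.5\n",
]
records = list(parse_genetic_map(example_lines))
print(records)
```

For a gzip-compressed map such as the default file, the lines could be read with `gzip.open(path, "rt")` from the standard library and passed to the same function.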
Target Haplotypes
- --haps <input_vcf_filename>
- This option defines the pre-phased target genotype data in VCF format to impute.
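Because the target data must be pre-phased, it can be useful to confirm that the genotypes in the VCF use the phased separator `|` rather than the unphased `/` before running imputation. The snippet below is a minimal sketch of such a check for a single VCF data line; the function name `vcf_line_is_phased` and the sample lines are hypothetical, and the check only looks for `/` in the GT field (haploid or missing calls without a separator are not flagged).

```python
def vcf_line_is_phased(line: str) -> bool:
    """Return True if no sample genotype on this VCF data line uses '/'."""
    fields = line.rstrip("\r\n").split("\t")
    # Columns 1-8 are fixed, column 9 is FORMAT, samples start at column 10.
    if len(fields) < 10:
        raise ValueError("expected a VCF data line with at least one sample")
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return False  # no genotypes at all, so nothing to impute from
    gt_index = format_keys.index("GT")
    for sample in fields[9:]:
        parts = sample.split(":")
        gt = parts[gt_index] if gt_index < len(parts) else "."
        if "/" in gt:
            return False  # unphased genotype found
    return True


# Hypothetical data lines, for illustration only.
phased_line = "chr20\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1"
unphased_line = "chr20\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t0/1\t1|1"
print(vcf_line_is_phased(phased_line), vcf_line_is_phased(unphased_line))
```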