vcftools

Version:

0.17.1

Category:

bio

Cluster:

Loki

Author / Distributor

http://vcftools.sourceforge.net

Description

vcftools is a suite of functions for working with genetic variation data in VCF and BCF files. The tools are mainly used to summarize data, run calculations on data, filter out data, and convert data into other useful file formats.

Documentation

BASIC OPTIONS
    These options are used to specify the input and output files.

INPUT FILE OPTIONS
     --vcf <input_filename>
       This option defines the VCF file to be processed. VCFtools expects files in VCF format v4.0, v4.1 or v4.2. The latter two are supported with some small limitations. If
       the user provides a dash character '-' as a file name, the program expects a VCF file to be piped in through standard input.

     --gzvcf <input_filename>
       This option can be used in place of the --vcf option to read compressed (gzipped) VCF files directly.

     --bcf <input_filename>
       This option can be used in place of the --vcf option to read BCF2 files directly. You do not need to specify whether this file is compressed with BGZF encoding. If the user
       provides a dash character '-' as a file name, the program expects a BCF2 file to be piped in through standard input.
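
As a minimal sketch of the dash convention described above, the following decompresses a gzipped VCF and pipes it to vcftools on standard input. The file name tiny.vcf.gz, its two toy sites, and the output prefix piped are placeholders invented for illustration. The vcftools call is skipped when the program is not on the PATH (for example, before the module is loaded).

```shell
# Build a two-site toy VCF (illustrative data, not from a real study).
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\n1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\n' | gzip -c > tiny.vcf.gz

# Decompress and pipe into vcftools; '-' tells --vcf to read standard input.
if command -v vcftools >/dev/null 2>&1; then
    gzip -dc tiny.vcf.gz | vcftools --vcf - --freq --out piped
fi
```

For a file that is already gzipped, --gzvcf reads it directly and the pipe is unnecessary. The piped form is mainly useful when the VCF is produced by another program.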

OUTPUT FILE OPTIONS
     --out <output_prefix>
       This option defines the output filename prefix for all files generated by vcftools. For example, if <output_prefix> is set to output_filename, then all output files will be of
       the form output_filename.*** . If this option is omitted, all output files will have the prefix "out." in the current working directory.

     --stdout
     -c
       These options direct the vcftools output to standard out so it can be piped into another program or written directly to a filename of choice. However, a select few
       output functions cannot be written to standard out.

     --temp <temporary_directory>
       This option can be used to redirect any temporary files that vcftools creates into a specified directory.
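
The following sketch shows one way --stdout might be combined with a pipe to write a compressed, filtered VCF. It relies on --recode, a vcftools option that is not covered in the excerpt above and that writes the retained sites in VCF format. The file names toy.vcf.gz and toy_chr1.vcf.gz and the toy sites are assumptions made for illustration, and the vcftools call is skipped when the program is not on the PATH.

```shell
# Toy input with one site on chromosome 1 and one on chromosome 2 (illustrative data).
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\n2\t200\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\n' | gzip -c > toy.vcf.gz

if command -v vcftools >/dev/null 2>&1; then
    # --recode emits the filtered sites as VCF; --stdout sends that VCF down the pipe
    # instead of writing a file named after the --out prefix.
    vcftools --gzvcf toy.vcf.gz --chr 1 --recode --stdout | gzip -c > toy_chr1.vcf.gz
fi
```

Writing through a pipe avoids leaving an uncompressed intermediate file on disk, which can matter for large inputs on shared storage.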

SITE FILTERING OPTIONS
   These options are used to include or exclude certain sites from any analysis being performed by the program.

POSITION FILTERING
     --chr <chromosome>
     --not-chr <chromosome>
       Includes or excludes sites with identifiers matching <chromosome>. These options may be used multiple times to include or exclude more than one chromosome.

     --from-bp <integer>
     --to-bp <integer>
       These options specify a lower bound and an upper bound for a range of sites to be processed. Sites with positions less than the lower bound or greater than the upper bound will be excluded.
       These options can only be used in conjunction with a single usage of --chr. Using one of these does not require use of the other.

     --positions <filename>
     --exclude-positions <filename>
       Include or exclude a set of sites on the basis of a list of positions in a file. Each line of the input file should contain a (tab-separated) chromosome and position.
       The file can have comment lines that start with a "#"; they will be ignored.

     --positions-overlap <filename>
     --exclude-positions-overlap <filename>
       Include or exclude a set of sites on the basis of the reference allele overlapping with a list of positions in a file. Each line of the input file should contain a
       (tab-separated) chromosome and position. The file can have comment lines that start with a "#"; they will be ignored.
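
The position filters above can be sketched as follows. The positions file keep_sites.txt, its contents, the region bounds, and the output prefixes are all hypothetical values chosen for illustration. input_file.vcf.gz stands for the user's own data, and the vcftools calls are skipped if either the program or that file is missing.

```shell
# Positions file: tab-separated chromosome and position; lines starting with '#' are ignored.
printf '# chrom\tpos\n1\t100\n1\t200\n2\t150\n' > keep_sites.txt

if command -v vcftools >/dev/null 2>&1 && [ -f input_file.vcf.gz ]; then
    # Report allele frequencies only for the sites listed in keep_sites.txt.
    vcftools --gzvcf input_file.vcf.gz --positions keep_sites.txt --freq --out listed_sites
    # Report allele frequencies for chromosome 1 sites between 1000000 and 2000000 bp.
    vcftools --gzvcf input_file.vcf.gz --chr 1 --from-bp 1000000 --to-bp 2000000 --freq --out chr1_region
fi
```

Swapping --positions for --exclude-positions would drop the listed sites instead of keeping them.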

Examples/Usage

  • List available modules:

    $ module avail vcftools
    
  • Load the vcftools module:

    $ module load bio/VCFtools/0.17.1
    
  • Check the loaded modules:

    $ module list
    
  • Unload the vcftools module:

    $ module unload bio/VCFtools/0.17.1
    
  • Output the allele frequency for all sites on chromosome 1 in a gzipped input VCF file:

    $ vcftools --gzvcf input_file.vcf.gz --freq --chr 1 --out chr1_analysis
    

Installation

Source code is obtained from the VCFtools project (http://vcftools.sourceforge.net).