Node.js bindings for VCF/BCF file access via HTSlib.
This crate provides Node-API (N-API) bindings for reading VCF/BCF files, enabling high-performance genomic data processing from JavaScript/TypeScript.
§Installation
npm install htsvcf
§Quick Start
import { openReader } from 'htsvcf'
const reader = await openReader('input.vcf.gz')
// Iterate all variants
while (true) {
  const { done, value: variant } = await reader.next()
  if (done) break
  console.log(`${variant.chrom}:${variant.pos} ${variant.ref} -> ${variant.alt}`)
}
reader.close()
§API Overview
§Opening files
// Async (recommended)
const reader = await openReader('input.vcf.gz')
// Sync constructor
const reader = new Reader('input.vcf.gz')
§Iterating variants
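The `next()` loop used throughout this page can be wrapped in an async generator so that callers can write `for await`. The following is a minimal sketch, assuming only the `{ done, value }` contract shown in the examples here. `NextSource`, `iterate`, `mockSource`, and `collect` are hypothetical names introduced for illustration and are not part of htsvcf; the mock stands in for a real reader so the sketch runs without a VCF file.

```typescript
// Minimal shape of the contract shown on this page: next() resolves to
// { done, value }. `NextSource` is a hypothetical name, not a library type.
interface NextSource<T> {
  next(): Promise<{ done: boolean; value?: T }>
}

// Wrap the manual while-loop in an async generator.
async function* iterate<T>(source: NextSource<T>): AsyncGenerator<T> {
  while (true) {
    const { done, value } = await source.next()
    if (done) break
    yield value as T
  }
}

// Stand-in for a real reader, used only to make the sketch runnable.
function mockSource<T>(items: T[]): NextSource<T> {
  let i = 0
  return {
    async next(): Promise<{ done: boolean; value?: T }> {
      return i < items.length
        ? { done: false, value: items[i++] }
        : { done: true }
    },
  }
}

// Drain a source into an array using for-await.
async function collect<T>(source: NextSource<T>): Promise<T[]> {
  const out: T[] = []
  for await (const item of iterate(source)) out.push(item)
  return out
}
```

If a reader object satisfies the same `next()` contract, it could presumably be passed to `iterate` directly; closing the reader afterwards remains the caller's responsibility.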
// Async iteration (recommended)
while (true) {
  const { done, value } = await reader.next()
  if (done) break
  // process value (Variant)
}
// Sync iteration
while (true) {
  const { done, value } = reader.nextSync()
  if (done) break
  // process value
}
§Querying regions (requires index)
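The two `query` forms in this section use different coordinate conventions: numeric arguments are 0-based, while region strings are 1-based. The sketch below converts between them, assuming the numeric form is a half-open interval `[start, end)`, which is what the paired examples here (`1000, 2000` and `chr1:1001-2000`) suggest but do not state outright. `toRegionString` is a hypothetical helper, not part of htsvcf.

```typescript
// Convert a 0-based, half-open interval [start, end) into a 1-based,
// inclusive region string. Hypothetical helper for illustration only.
function toRegionString(chrom: string, start: number, end: number): string {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 0 || end <= start) {
    throw new RangeError(`invalid interval: ${start}-${end}`)
  }
  // Only the start shifts by one: a half-open 0-based end is numerically
  // equal to the inclusive 1-based end.
  return `${chrom}:${start + 1}-${end}`
}

console.log(toRegionString('chr1', 1000, 2000)) // chr1:1001-2000
```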
if (reader.hasIndex()) {
  // Query a region (0-based coordinates)
  await reader.query('chr1', 1000, 2000)
  // Or use region string (1-based, like samtools)
  await reader.query('chr1:1001-2000')
  // Then iterate as normal
  while (true) {
    const { done, value } = await reader.next()
    if (done) break
    // variants overlapping the region
  }
}
§Variant fields
const v = variant
// Basic fields (read-only)
v.chrom // "chr1"
v.pos // 12345 (1-based)
v.start // 12344 (0-based)
v.stop // 12345 (end position)
v.ref // "A"
v.alt // ["G", "T"]
v.rid // Reference ID (integer) or undefined
// Read/write fields
v.id // "rs12345" or "."
v.id = "rs999"
v.qual // 30.5 or null if missing
v.qual = 42.0
v.qual = null // Set to missing
v.filter // ["PASS"] or ["q10", "dp"]
v.filter = ["PASS"]
§INFO fields
// Read INFO (returns typed values based on header)
v.info('DP') // 42 (Integer)
v.info('AF') // [0.25, 0.75] (Float array)
v.info('SOMATIC') // true (Flag)
v.info('GENE') // "BRCA1" (String)
v.info('MISSING') // undefined (not present)
// Write INFO (type must match header definition)
v.set_info('DP', 100)
v.set_info('AF', [0.1, 0.9])
v.set_info('SOMATIC', true)
v.set_info('GENE', 'TP53')
v.set_info('DP', null) // Clear/remove the field
§FORMAT fields (per-sample)
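The parsed genotype objects described in this section have the shape `{ alleles, phase }`, which is convenient for simple tallies. As a rough sketch, the function below counts called and non-reference alleles across samples. `GenotypeLike` and `alleleCounts` are hypothetical names, not part of htsvcf, and the handling of missing alleles (skipping `null` or negative values) is an assumption, since this page does not say how missing calls are encoded.

```typescript
// Shape taken from the genotype objects shown on this page.
interface GenotypeLike {
  alleles: Array<number | null>
  phase: boolean[]
}

// Count called alleles and non-reference (alt) alleles across samples.
// Assumption: a missing allele appears as null or a negative number.
function alleleCounts(gts: GenotypeLike[]): { called: number; alt: number } {
  let called = 0
  let alt = 0
  for (const gt of gts) {
    for (const a of gt.alleles) {
      if (a === null || a < 0) continue // skip missing calls
      called += 1
      if (a > 0) alt += 1 // index 0 is the reference allele
    }
  }
  return { called, alt }
}

const counts = alleleCounts([
  { alleles: [0, 1], phase: [false] },
  { alleles: [1, 1], phase: [true] },
  { alleles: [null, null], phase: [false] },
])
console.log(counts) // { called: 4, alt: 3 }
```

Dividing `alt` by `called` would give a crude alternate-allele fraction for the site; it does not distinguish between multiple ALT alleles.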
// Get FORMAT values (array with one entry per sample)
v.format('GT') // ["0/1", "0/0", "1/1"]
v.format('DP') // [30, 25, null] (null = missing)
v.format('AD') // [[10, 20], [25, 0], [0, 30]]
// Set FORMAT values (array with one entry per sample)
v.set_format('DP', [40, 35, 50])
v.set_format('AD', [[15, 25], [30, 5], [5, 35]])
v.set_format('DP', null) // Clear the field
// Get all FORMAT fields for one sample by name
const s = v.sample('NA12878')
s.GT // "0/1"
s.DP // 30
s.AD // [10, 20]
s.sample_name // "NA12878"
s.genotype // { alleles: [0, 1], phase: [false] }
// Get all samples at once (more efficient for bulk access)
const all = v.samples() // Array of sample objects
all[0].GT // First sample's genotype
all[0].sample_name // First sample's name
// Get a subset of samples
const subset = v.samples(['NA12878', 'NA12879'])
// Get parsed genotypes (alleles and phase info)
const gts = v.genotypes()
// [{ alleles: [0, 1], phase: [false] }, { alleles: [1, 1], phase: [true] }, ...]
// Genotypes for a subset of samples
const gtSubset = v.genotypes(['NA12878'])
§Output
// Convert to VCF line (without trailing newline)
v.toString() // "chr1\t12345\trs12345\tA\tG\t30\tPASS\tDP=42\t..."
§Header access
const header = reader.header
// List sample names
header.samples() // ["NA12878", "NA12879", ...]
// Get field definitions
header.get('INFO', 'DP')
// { id: 'DP', type: 'Integer', number: '1', description: 'Read depth' }
header.get('FORMAT', 'GT')
// { id: 'GT', type: 'String', number: '1', description: 'Genotype' }
// List all header records
header.records()
// [{ section: 'INFO', id: 'DP', number: '1', type: 'Integer', ... }, ...]
// Add new field definitions
header.addInfo('CUSTOM', '1', 'Integer', 'My custom annotation')
header.addFormat('SCORE', '1', 'Float', 'Per-sample score')
// Get full header text
header.toString()
§TypeScript
Full TypeScript definitions are included. Key types:
import { Reader, Variant, Header, openReader } from 'htsvcf'
const reader: Reader = await openReader('input.vcf.gz')
const header: Header = reader.header
const { value: variant }: { done: boolean; value: Variant } = await reader.next()
§Structs
- Genotype: Represents a parsed genotype for a single sample.
- Header
- HeaderField: A VCF header field definition (INFO or FORMAT).
- HeaderRecord: A VCF header record with its section (INFO, FORMAT, FILTER, etc.).
- NextBatchTask
- NextTask
- OpenReaderTask
- QueryTask
- Reader
- ReaderOptions
- Variant
- Writer
- WriterOptions