hyparquet-writer/README.md
2025-04-08 01:05:19 -07:00

Hyparquet Writer

Hyparquet Writer is a JavaScript library for writing Apache Parquet files. It is designed to be lightweight and fast, and to store data efficiently. It is a companion to hyparquet, a JavaScript library for reading Parquet files.

Usage

Call parquetWrite with an object whose columnData property is an array of columns. Each column in columnData should contain:

  • name: the column name
  • data: an array of same-type values
  • type: the Parquet schema type (optional; guessed from the data if not provided)

Example:

import { parquetWrite } from 'hyparquet-writer'

const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})
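
Data often starts out as an array of row objects rather than as columns. The sketch below shows one way to pivot rows into the columnData shape described above. Note that rowsToColumnData is a hypothetical helper written for illustration, not part of hyparquet-writer, and it assumes every row has the same keys as the first row. It omits type, so the writer would have to guess each column's type from the data.

```javascript
// Hypothetical helper (not part of hyparquet-writer): pivot an array of
// row objects into the column-oriented shape that parquetWrite expects.
function rowsToColumnData(rows) {
  if (rows.length === 0) return []
  // Assumption: the keys of the first row define the columns, in order.
  const names = Object.keys(rows[0])
  return names.map(name => ({
    name,
    // Collect this column's value from every row.
    data: rows.map(row => row[name]),
  }))
}

const rows = [
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30 },
  { name: 'Charlie', age: 35 },
]

// The result has one { name, data } entry per column and can be passed
// to parquetWrite as the columnData argument.
const columnData = rowsToColumnData(rows)
```

Keeping the helper separate from the write call makes it easy to add an explicit type to individual columns before writing.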

Options

Options can be passed to parquetWrite to change parquet file properties:

  • compression: use snappy compression (default true)
  • statistics: write column statistics (default true)
  • rowGroupSize: number of rows in each row group (default 100000)
  • kvMetadata: extra key-value metadata

import { parquetWrite } from 'hyparquet-writer'

const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
  compression: false,
  statistics: false,
  rowGroupSize: 1000,
  kvMetadata: {
    'key1': 'value1',
    'key2': 'value2',
  },
})
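
To get a feel for rowGroupSize, the sketch below computes how a given number of rows would be divided into row groups. rowGroupRanges is a hypothetical helper, not part of hyparquet-writer, and it assumes the writer fills each row group with exactly rowGroupSize rows before starting the next one, with the remainder in the last group. The library's actual splitting may differ.

```javascript
// Hypothetical helper (not part of hyparquet-writer): list the row ranges
// that each row group would cover, assuming groups are filled in order.
function rowGroupRanges(numRows, rowGroupSize = 100000) {
  if (!Number.isInteger(rowGroupSize) || rowGroupSize <= 0) {
    throw new RangeError('rowGroupSize must be a positive integer')
  }
  const ranges = []
  for (let start = 0; start < numRows; start += rowGroupSize) {
    // end is exclusive; the last group may hold fewer rows.
    ranges.push({ start, end: Math.min(start + rowGroupSize, numRows) })
  }
  return ranges
}

// 2500 rows with rowGroupSize 1000: two full groups and one partial group.
const ranges = rowGroupRanges(2500, 1000)
```

Under this assumption, the three-row example above would fit in a single row group at either the default size or rowGroupSize: 1000.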

References