
Hyparquet Writer

Hyparquet Writer is a JavaScript library for writing Apache Parquet files. It is designed to be lightweight and fast, and to store data efficiently. It is a companion to the hyparquet library, a JavaScript library for reading parquet files.

Quick Start

To write a parquet file to an ArrayBuffer, call parquetWriteBuffer with a columnData argument. Each column in columnData should contain:

  • name: the column name
  • data: an array of same-type values
  • type: the parquet schema type (optional)

import { parquetWriteBuffer } from 'hyparquet-writer'

const arrayBuffer = parquetWriteBuffer({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})

Note: if type is not provided, it will be guessed from the data. The supported parquet physical types are:

  • BOOLEAN
  • INT32
  • INT64
  • FLOAT
  • DOUBLE
  • BYTE_ARRAY
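
As a sketch of what type guessing might look like, the hypothetical helper below maps JavaScript values onto the physical types above. This is illustrative only, not hyparquet-writer's actual inference logic; note also that the STRING type seen in the examples is a parquet logical annotation stored as BYTE_ARRAY.

```javascript
// Hypothetical sketch of type inference: map a JS value to a parquet
// physical type. Not the library's real implementation.
function guessParquetType(value) {
  if (typeof value === 'boolean') return 'BOOLEAN'
  if (typeof value === 'bigint') return 'INT64'
  if (typeof value === 'number') {
    // Integers that fit in 32 bits become INT32, everything else DOUBLE
    return Number.isInteger(value) && Math.abs(value) < 2 ** 31 ? 'INT32' : 'DOUBLE'
  }
  // Strings and raw bytes are both stored as BYTE_ARRAY
  if (typeof value === 'string' || value instanceof Uint8Array) return 'BYTE_ARRAY'
  throw new Error('cannot guess parquet type for ' + value)
}

console.log(guessParquetType(25))      // INT32
console.log(guessParquetType(3.14))    // DOUBLE
console.log(guessParquetType('Alice')) // BYTE_ARRAY
```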

Write to a Local Parquet File in Node.js

To write a local parquet file in Node.js, use parquetWriteFile with filename and columnData arguments:

const { parquetWriteFile } = await import('hyparquet-writer')

parquetWriteFile({
  filename: 'example.parquet',
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})

Note: hyparquet-writer is published as an ES module, so dynamic import() may be required on the command line.

Advanced Usage

Options can be passed to parquetWrite to adjust parquet file writing behavior:

  • writer: a generic writer object
  • compression: use snappy compression (default true)
  • statistics: write column statistics (default true)
  • rowGroupSize: number of rows in each row group (default 100000)
  • kvMetadata: extra key-value metadata to be stored in the parquet footer

import { ByteWriter, parquetWrite } from 'hyparquet-writer'

const writer = new ByteWriter()
parquetWrite({
  writer,
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
  compression: false,
  statistics: false,
  rowGroupSize: 1000,
  kvMetadata: {
    'key1': 'value1',
    'key2': 'value2',
  },
})
const arrayBuffer = writer.getBuffer()
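
A note on rowGroupSize: rows are chunked into groups of at most rowGroupSize rows, with the last group possibly smaller. The number of row groups a write produces is plain arithmetic (this is not library code, just a sketch of the relationship):

```javascript
// Rows are split into ceil(numRows / rowGroupSize) row groups,
// the last one possibly partial. 100000 mirrors the documented default.
function rowGroupCount(numRows, rowGroupSize = 100000) {
  return Math.ceil(numRows / rowGroupSize)
}

console.log(rowGroupCount(250000))     // 3 groups at the default size
console.log(rowGroupCount(3000, 1000)) // 3 groups of 1000 rows each
```

Smaller row groups allow readers to skip more data using column statistics, at the cost of a larger footer and more per-group overhead.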

References