Apache Parquet file writer in JavaScript

Hyparquet Writer


Hyparquet Writer is a JavaScript library for writing Apache Parquet files. It is designed to be lightweight and fast, and to store data efficiently. It is a companion to hyparquet, a JavaScript library for reading Parquet files.

Usage

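Call parquetWrite with a list of columns passed as columnData. Each column is an object with a name field and a data field, where data is an array of values that all share the same type. As in the example, a column can also carry a type field naming its Parquet type.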

import { parquetWrite } from 'hyparquet-writer'

// The result holds the bytes of the Parquet file
const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})
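
Data often arrives as an array of row objects rather than as columns. The following is a minimal sketch of pivoting rows into the column shape described above. rowsToColumns is a hypothetical helper, not part of hyparquet-writer, and it assumes every row has the same keys as the first row.

```javascript
// Hypothetical helper (not part of hyparquet-writer): pivot row objects
// into the { name, data, type } column shape used in the example above.
// Assumes every row has the same keys as the first row.
function rowsToColumns(rows, types = {}) {
  if (rows.length === 0) return []
  return Object.keys(rows[0]).map(key => {
    const column = { name: key, data: rows.map(row => row[key]) }
    // Attach a type only when the caller supplies one for this column.
    if (types[key]) column.type = types[key]
    return column
  })
}

const columnData = rowsToColumns(
  [
    { name: 'Alice', age: 25 },
    { name: 'Bob', age: 30 },
    { name: 'Charlie', age: 35 },
  ],
  { name: 'STRING', age: 'INT32' }
)
```

The resulting columnData array has the same shape as the one in the example and could be passed to parquetWrite in its place.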

Options

  • compression: use snappy compression (default true)
  • statistics: write column statistics (default true)
  • rowGroupSize: number of rows in each row group (default 100000)
  • kvMetadata: extra key-value metadata
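
As a rough illustration, the options above could be collected in a plain object alongside columnData. The option names are taken from the list; the shape of each kvMetadata entry ({ key, value }) is an assumption and may differ in the library. expectedRowGroups is a hypothetical helper that estimates how many row groups a file would contain, assuming rows are split into consecutive groups of at most rowGroupSize rows.

```javascript
// Option names come from the list above. The kvMetadata entry shape
// ({ key, value }) is an assumption, not confirmed by this README.
const options = {
  compression: false, // skip snappy compression
  statistics: false, // skip column statistics
  rowGroupSize: 1000, // rows per row group
  kvMetadata: [{ key: 'source', value: 'example' }],
}

// Hypothetical helper: number of row groups expected if rows are split
// into consecutive groups of at most rowGroupSize rows.
function expectedRowGroups(numRows, rowGroupSize = 100000) {
  return Math.ceil(numRows / rowGroupSize)
}

const groupsWithDefault = expectedRowGroups(250000)
const groupsWithOptions = expectedRowGroups(2500, options.rowGroupSize)
```

Under that assumption, 250000 rows with the default rowGroupSize of 100000 would be written as three row groups, the last one holding the remaining 50000 rows.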

References