Mirror of https://github.com/asadbek064/hyparquet-writer.git (synced 2025-12-06 07:31:55 +00:00)
Apache Parquet file writer in JavaScript
Hyparquet Writer
Hyparquet Writer is a JavaScript library for writing Apache Parquet files. It is designed to be lightweight and fast, and to store data efficiently. It is a companion to hyparquet, a JavaScript library for reading Parquet files.
Usage
Call parquetWrite with argument columnData. Each column in columnData should contain:
- name: the column name
- data: an array of same-type values
- type: the parquet schema type (optional, type guessed from data if not provided)
Example:
import { parquetWrite } from 'hyparquet-writer'

const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})
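
Data often arrives as an array of row objects rather than as columns. The sketch below shows one way to pivot rows into the columnData shape described above. rowsToColumnData is a hypothetical helper, not part of hyparquet-writer, and it assumes every row has the same keys as the first row. It leaves out type, so the type would be guessed from the data.

```javascript
// Hypothetical helper (not part of hyparquet-writer): pivot an array of
// row objects into the column-oriented shape that columnData expects.
function rowsToColumnData(rows) {
  if (rows.length === 0) return []
  // Assumption: every row has the same keys as the first row
  const names = Object.keys(rows[0])
  return names.map(name => ({
    name,
    // Collect this column's value from every row, in row order
    data: rows.map(row => row[name]),
  }))
}

const columnData = rowsToColumnData([
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30 },
  { name: 'Charlie', age: 35 },
])
// columnData[0] is { name: 'name', data: ['Alice', 'Bob', 'Charlie'] }
// columnData[1] is { name: 'age', data: [25, 30, 35] }
```

The resulting columnData array can then be passed to parquetWrite as in the example above.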
Options
Options can be passed to parquetWrite to change parquet file properties:
- compression: use snappy compression (default true)
- statistics: write column statistics (default true)
- rowGroupSize: number of rows in each row group (default 100000)
- kvMetadata: extra key-value metadata
import { parquetWrite } from 'hyparquet-writer'

const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
  compression: false,
  statistics: false,
  rowGroupSize: 1000,
  kvMetadata: {
    'key1': 'value1',
    'key2': 'value2',
  },
})
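
To get a feel for how rowGroupSize affects file layout, the sketch below estimates how many rows would land in each row group. rowGroupSizes is a hypothetical helper, not a library function. It assumes the writer fills each row group to rowGroupSize before starting the next, which is the natural reading of the option but is not taken from the library source.

```javascript
// Hypothetical helper (not part of hyparquet-writer): estimate the number
// of rows per row group, assuming each group is filled to rowGroupSize
// before the next one begins.
function rowGroupSizes(numRows, rowGroupSize = 100000) {
  const sizes = []
  for (let start = 0; start < numRows; start += rowGroupSize) {
    // The last group holds whatever rows remain
    sizes.push(Math.min(rowGroupSize, numRows - start))
  }
  return sizes
}

const sizes = rowGroupSizes(250000)
// sizes is [100000, 100000, 50000]
```

Under this assumption, the three-row example above with rowGroupSize: 1000 would produce a single row group.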
