# Hyparquet Writer
[![npm](https://img.shields.io/npm/v/hyparquet-writer)](https://www.npmjs.com/package/hyparquet-writer)
[![minzipped](https://img.shields.io/bundlephobia/minzip/hyparquet-writer)](https://www.npmjs.com/package/hyparquet-writer)
[![workflow status](https://github.com/hyparam/hyparquet-writer/actions/workflows/ci.yml/badge.svg)](https://github.com/hyparam/hyparquet-writer/actions)
[![mit license](https://img.shields.io/badge/License-MIT-orange.svg)](https://opensource.org/licenses/MIT)
![coverage](https://img.shields.io/badge/Coverage-96-darkred)
[![dependencies](https://img.shields.io/badge/Dependencies-1-blueviolet)](https://www.npmjs.com/package/hyparquet-writer?activeTab=dependencies)

Hyparquet Writer is a JavaScript library for writing [Apache Parquet](https://parquet.apache.org) files. It is designed to be lightweight and fast, and to store data efficiently. It is a companion to [hyparquet](https://github.com/hyparam/hyparquet), a JavaScript library for reading Parquet files.

## Usage

Call `parquetWrite` with a `columnData` array of columns. Each column is an object with a `name` and a `data` field, where `data` is an array of values that all have the same type.

```javascript
import { parquetWrite } from 'hyparquet-writer'

const arrayBuffer = parquetWrite({
  columnData: [
    { name: 'name', data: ['Alice', 'Bob', 'Charlie'], type: 'STRING' },
    { name: 'age', data: [25, 30, 35], type: 'INT32' },
  ],
})
```
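
Data often arrives as an array of row objects rather than as columns. The sketch below shows one way to pivot rows into the column-oriented shape described above. `rowsToColumnData` is a hypothetical helper written for illustration, not part of hyparquet-writer, and it assumes every row has the same keys as the first row.

```javascript
// Hypothetical helper (not part of hyparquet-writer): pivot row objects
// into an array of { name, data } columns.
function rowsToColumnData(rows) {
  if (rows.length === 0) return []
  // Assumption: the keys of the first row define the column names
  const names = Object.keys(rows[0])
  return names.map(name => ({
    name,
    // Collect this column's value from every row, in row order
    data: rows.map(row => row[name]),
  }))
}

const rows = [
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30 },
  { name: 'Charlie', age: 35 },
]
const columnData = rowsToColumnData(rows)
```

The resulting `columnData` has the same `name` and `data` fields as the columns in the usage example, so it could then be passed to `parquetWrite`.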

## Options

- `compression`: use snappy compression (default true)
- `statistics`: write column statistics (default true)
- `rowGroupSize`: number of rows in each row group (default 100000)
- `kvMetadata`: extra key-value metadata
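
To get a feel for `rowGroupSize`, the sketch below computes how a given number of rows would be split into consecutive row groups of a fixed size. `rowGroupRanges` is a hypothetical helper for illustration only. It is not part of hyparquet-writer, and the library's actual splitting logic may differ.

```javascript
// Hypothetical helper: list the [start, end) row ranges of each row group,
// assuming rows are split into consecutive fixed-size chunks.
function rowGroupRanges(numRows, rowGroupSize = 100000) {
  if (!Number.isInteger(rowGroupSize) || rowGroupSize <= 0) {
    throw new Error('rowGroupSize must be a positive integer')
  }
  const ranges = []
  for (let start = 0; start < numRows; start += rowGroupSize) {
    // The last group may hold fewer rows than rowGroupSize
    ranges.push([start, Math.min(start + rowGroupSize, numRows)])
  }
  return ranges
}

// 250000 rows with the default size give three groups:
// [0, 100000), [100000, 200000), [200000, 250000)
const ranges = rowGroupRanges(250000)
```

Under this assumption, a smaller `rowGroupSize` produces more row groups, each covering fewer rows.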
## References
- https://github.com/hyparam/hyparquet
- https://github.com/hyparam/hyparquet-compressors
- https://github.com/apache/parquet-format
- https://github.com/apache/parquet-testing