
hyparquet decompressors


This package exports a compressors object intended to be passed into hyparquet.

Apache Parquet is a popular columnar storage format that is widely used in data engineering, data science, and machine learning applications for efficiently storing and processing large datasets. It supports a number of different compression formats, but most Parquet files use Snappy compression.

By default, the hyparquet library supports only uncompressed and Snappy-compressed files. The hyparquet-compressors package extends support to all legal Parquet compression formats.

The hyparquet-compressors package works in both Node.js and the browser. It uses only JavaScript and WebAssembly packages, with no system dependencies.

Usage

import { parquetRead } from 'hyparquet'
import { compressors } from 'hyparquet-compressors'

await parquetRead({ file, compressors, onComplete: console.log })

See the hyparquet repo for further information.
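
Because the compressors value is passed as an ordinary option, it should be possible to wrap it before handing it to parquetRead, for example to see which codecs a file actually exercises. The sketch below is not part of either package: withCallCounts and fakeCompressors are hypothetical names, and it assumes each entry is a function keyed by Parquet codec name that takes (input, outputLength) and returns the decompressed bytes. Check that shape against the hyparquet documentation before relying on it.

```javascript
// Hypothetical helper: wraps every decompressor in a map so each call is counted.
// Assumption: entries look like (input: Uint8Array, outputLength: number) => Uint8Array.
function withCallCounts(compressors) {
  const counts = {}
  const wrapped = {}
  for (const [codec, decompress] of Object.entries(compressors)) {
    counts[codec] = 0
    wrapped[codec] = (input, outputLength) => {
      counts[codec]++
      return decompress(input, outputLength)
    }
  }
  return { compressors: wrapped, counts }
}

// Stand-in decompressor for demonstration only: copies the first outputLength bytes.
// A real map would come from the hyparquet-compressors package instead.
const fakeCompressors = {
  GZIP: (input, outputLength) => input.slice(0, outputLength),
}

const { compressors: counted, counts } = withCallCounts(fakeCompressors)
const output = counted.GZIP(new Uint8Array([1, 2, 3, 4]), 2)
```

Under the same assumption, the wrapped map (counted here) would be passed as the compressors option in place of the original, and counts could be inspected once reading completes.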

Supported compression formats

Parquet compression types supported with hyparquet-compressors:

  • Uncompressed
  • Snappy
  • GZip
  • LZO
  • Brotli
  • LZ4
  • ZSTD
  • LZ4_RAW
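
The list above corresponds to the codec names in the Parquet format's CompressionCodec enum. As a rough illustration, a small check like the following could report which codecs a given compressors map leaves uncovered. The helper unsupportedCodecs is hypothetical, and the sketch assumes the map is keyed by the upper-case enum names and that UNCOMPRESSED and SNAPPY need no entry because hyparquet handles them itself.

```javascript
// Codec names as defined by the Parquet format's CompressionCodec enum.
const PARQUET_CODECS = [
  'UNCOMPRESSED', 'SNAPPY', 'GZIP', 'LZO', 'BROTLI', 'LZ4', 'ZSTD', 'LZ4_RAW',
]

// Codecs that hyparquet reads without extra compressors, per the text above.
const BUILT_IN = new Set(['UNCOMPRESSED', 'SNAPPY'])

// Hypothetical helper: returns the codecs that are neither built in
// nor provided as a function in the given compressors map.
function unsupportedCodecs(codecs, compressors = {}) {
  return codecs.filter(
    codec => !BUILT_IN.has(codec) && typeof compressors[codec] !== 'function'
  )
}

// Example with a partial, stand-in map that only covers GZIP and ZSTD.
const missing = unsupportedCodecs(PARQUET_CODECS, {
  GZIP: input => input,
  ZSTD: input => input,
})
// missing is ['LZO', 'BROTLI', 'LZ4', 'LZ4_RAW']
```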
