hyparquet decompressors
This package exports a compressors object intended to be passed into hyparquet.
Apache Parquet is a columnar storage format widely used in data engineering, data science, and machine learning for efficiently storing and processing large datasets. It supports several compression formats, but most Parquet files use Snappy compression.
By default, the hyparquet library supports only uncompressed and Snappy-compressed files. The hyparquet-compressors package extends that support to all legal Parquet compression formats.
The hyparquet-compressors package works in both Node.js and the browser. It uses only JavaScript and WebAssembly packages and has no system dependencies.
Usage
import { parquetRead } from 'hyparquet'
import { compressors } from 'hyparquet-compressors'
await parquetRead({ file, compressors, onComplete: console.log })
See the hyparquet repo for further information.
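To make the idea of a "compressors object" more concrete, the following is a minimal Node-only sketch of what such an object might look like when written by hand. It assumes that each entry is keyed by the Parquet codec name and is called as `(input, outputLength)`, returning the decompressed bytes as a `Uint8Array`. That calling convention, and the name `customCompressors`, are assumptions for illustration; this is not the implementation shipped in hyparquet-compressors, which bundles its own decoders.

```javascript
// Node-only sketch: a hand-written decompressor map with a single GZIP entry.
// Uses Node's built-in zlib, not the decoders bundled with hyparquet-compressors.
const { gunzipSync, gzipSync } = require('node:zlib')

// Assumption: each entry is called as (input, outputLength) and returns the
// decompressed bytes as a Uint8Array. `customCompressors` is a hypothetical name.
const customCompressors = {
  // A gzip stream records its own length, so the outputLength hint is unused here.
  GZIP: (input, _outputLength) => {
    const output = gunzipSync(input) // Buffer containing the decompressed bytes
    // Return a plain Uint8Array view over the same bytes
    return new Uint8Array(output.buffer, output.byteOffset, output.length)
  },
}

// Round-trip check: compress some bytes, then decompress them through the map.
const original = new TextEncoder().encode('hello parquet')
const compressed = gzipSync(original)
const restored = customCompressors.GZIP(compressed, original.length)
console.log(new TextDecoder().decode(restored)) // prints "hello parquet"
```

Under the same assumption, an object like this could be passed in place of the exported `compressors` in the `parquetRead` call above, which may be useful when only one codec is needed and bundle size matters.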
Supported compression formats
Parquet compression types supported with hyparquet-compressors:
- Uncompressed
- Snappy
- GZip
- LZO
- Brotli
- LZ4
- ZSTD
- LZ4_RAW
References
- https://parquet.apache.org/docs/file-format/data-pages/compression/
- https://en.wikipedia.org/wiki/Brotli
- https://en.wikipedia.org/wiki/Gzip
- https://en.wikipedia.org/wiki/LZ4_(compression_algorithm)
- https://en.wikipedia.org/wiki/Snappy_(compression)
- https://en.wikipedia.org/wiki/Zstd
- https://github.com/101arrowz/fzstd
- https://github.com/foliojs/brotli.js
- https://github.com/hyparam/hysnappy
- https://github.com/nodeca/pako