From 64e1df40bceb52990ff678b0a7f387f87629aa1e Mon Sep 17 00:00:00 2001
From: Sylvain Lesage
Date: Fri, 15 Aug 2025 13:14:24 -0400
Subject: [PATCH] Add a Binary columns section

Note: I didn't mention WKB columns, I'm not sure if there are other binary types in Parquet.
---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index fbd75d7..0bf6ff2 100644
--- a/README.md
+++ b/README.md
@@ -218,6 +218,10 @@ await parquetRead({
 
 The `parquetReadObjects` function defaults to `rowFormat: 'object'`.
 
+### Binary columns
+
+Parquet stores binary data in `BYTE_ARRAY` columns, which can be annotated as `UTF8` to mark them as strings. Because many Parquet files in the wild store strings as plain `BYTE_ARRAY` without the `UTF8` annotation, `BYTE_ARRAY` values are decoded as UTF-8 strings by default. To disable this decoding, set the `utf8` option to `false` in functions such as `parquetRead`. This option does not affect `UTF8` columns, which are always decoded as UTF-8 strings.
+
 ## Supported Parquet Files
 
 The parquet format is known to be a sprawling format which includes options for a wide array of compression schemes, encoding types, and data structures.
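
The new section could be complemented by a short example of what to do once `utf8` is set to `false`. The sketch below is one possible shape, not code from the library. `decodeTextColumns` is a hypothetical helper, the sample rows are hand-built stand-ins for rows read with `utf8: false`, and it assumes undecoded `BYTE_ARRAY` values arrive as `Uint8Array`. It decodes only the columns known to hold text and leaves true binary columns untouched.

```javascript
// Hypothetical helper (not part of the library): decode selected binary
// columns to strings after reading rows with `utf8: false`.
// Assumption: undecoded BYTE_ARRAY values arrive as Uint8Array.
function decodeTextColumns(rows, textColumns) {
  const decoder = new TextDecoder() // defaults to UTF-8
  return rows.map(row => {
    const out = { ...row } // shallow copy so the input rows are not mutated
    for (const name of textColumns) {
      const value = out[name]
      if (value instanceof Uint8Array) {
        out[name] = decoder.decode(value)
      }
    }
    return out
  })
}

// Stand-in for rows as they might look when read with `utf8: false`
const rows = [
  {
    id: 1,
    name: new TextEncoder().encode('café'), // text stored as plain bytes
    thumbnail: new Uint8Array([0xff, 0xd8, 0xff]), // genuinely binary data
  },
]

const decoded = decodeTextColumns(rows, ['name'])
console.log(decoded[0].name) // 'café'
console.log(decoded[0].thumbnail instanceof Uint8Array) // true
```

Decoding per column keeps image or geometry bytes intact while still yielding strings where they are expected, which is the case the default decoding cannot distinguish on its own.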