66 Commits

Author SHA1 Message Date
mike-iqmo
dbf3065f8e
Addresses issues with duckdb use of delta encodings (#77)
* Addresses issues with duckdb use of delta encodings

* Shrunk size of test data
2025-05-14 16:28:58 -07:00
Kenny Daniel
b7db4653e7
Add another column to page_indexed test 2025-04-26 17:18:11 -07:00
Kenny Daniel
9a04cbccd3
Convert unsigned types 2025-04-14 23:20:58 -07:00
Kenny Daniel
972402d083
Fix handling of dictionary pages from parquet.net 2025-04-09 17:26:47 -07:00
Kenny Daniel
655444bcde
Fix continued data pages
Parquet allows consecutive pages to continue a previously assembled
list. Broke in hyparquet 1.9.0. Added continued_page.parquet test.
2025-04-07 17:40:23 -07:00
Kenny Daniel
ba74d58dd3
Test for reading the last row of files 2025-04-06 22:05:58 -07:00
Kenny Daniel
1247f5d606
Split out readPage
Remove dict-page-offset-zero test because it's a malformed parquet file.
2025-04-02 20:27:10 -07:00
Kenny Daniel
85e1af66c1
Fix thrift parsing of crypto_metadata 2025-03-25 15:42:48 -07:00
Kenny Daniel
2a302702d4
Fix handling of boolean rle 2025-02-22 13:29:29 -08:00
Johan Levin
bf268e141c
Use prepended length for bit-packed hybrid bool columns (#62) 2025-02-19 11:07:49 -08:00
Kenny Daniel
36d8ea2e1d
Fix handling of signed decimals (#60) 2025-02-07 18:52:48 -08:00
Kenny Daniel
870187c7de
Update README with Awaitable 2024-12-21 15:31:59 -08:00
Kenny
a2024a781c
Parse column and offset indexes (#29)
* Parse indices

* Add parsed offset indices

* Add parsed column indices

* Test readColumnIndex and readOffsetIndex

* Add more parsed offset indices

* Remove unnecessary toJson when loading expected results

* Add length checks to convertMetadata

* Rename indicies.js to indexes.js

* Rename indices.test.js to indexes.test.js

* Rename *_indices.json to *_indexes.json

* Use asyncBufferFromFile in indexes.test.js

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
2024-08-18 18:23:54 -07:00
Kenny Daniel
c6c79c05ca
Fix for issue #23 nested struct assembly 2024-08-02 14:47:04 -07:00
Kenny Daniel
17f412c2f5
Convert logical date units 2024-05-24 16:55:13 -07:00
Kenny Daniel
efdbf459a5
Convert date and decimal stats 2024-05-24 15:22:59 -07:00
Kenny Daniel
a56420de2f
Parse metadata TimeUnit 2024-05-24 15:17:20 -07:00
Kenny Daniel
2edc14b70e
Convert unsigned ints 2024-05-23 23:35:49 -07:00
Kenny Daniel
c68256575b
Convert logical timestamp 2024-05-23 18:50:57 -07:00
Kenny Daniel
7a08aa3183
Handle repeated with no children 2024-05-23 18:26:16 -07:00
Kenny Daniel
ed3b525a27
Fix nested optional from duckdb#3734 🦆 2024-05-23 18:19:01 -07:00
Kenny Daniel
af7bab33f8
Handle top level repeated from duckdb#2557 🦆 2024-05-23 17:43:36 -07:00
Kenny Daniel
d92cc5fd22
Convert timestamps and json 2024-05-23 16:43:26 -07:00
Kenny Daniel
06578a9419
struct_strings.parquet 2024-05-23 02:10:04 -07:00
Kenny Daniel
7d1d877c9f
Fix metadata parsing of page_type 2024-05-23 00:11:58 -07:00
Kenny Daniel
b8e4496063
Upgrade dataPage to match dictionary type 2024-05-23 00:07:09 -07:00
Kenny Daniel
c4ad05e580
Convert byte arrays to utf8 by default 2024-05-22 22:40:21 -07:00
Kenny Daniel
1f8289b4b2
rle_boolean_encoding.parquet 2024-05-22 19:16:10 -07:00
Kenny Daniel
5eeb05da40
dict-page-offset-zero.parquet 2024-05-21 22:50:50 -07:00
Kenny Daniel
4f7791354c
incorrect_map_schema.parquet 2024-05-21 22:18:39 -07:00
Kenny Daniel
6a75a960da
Convert boolean column 2024-05-21 22:05:29 -07:00
Kenny Daniel
a1ca1ef785
byte_stream_split_extended.gzip.parquet 2024-05-21 17:21:36 -07:00
Kenny Daniel
70387fa345
repeated_no_annotation.parquet 2024-05-20 23:09:31 -07:00
Kenny Daniel
d453313dca
Fix optional structs! 2024-05-20 05:03:33 -07:00
Kenny Daniel
9cd09b8eed
Byte stream split encoding 2024-05-20 04:09:32 -07:00
Kenny Daniel
1689d7473a
Delta length byte array encoding 2024-05-20 02:32:31 -07:00
Kenny Daniel
da72c06ac2
Use hyparquet-compressors for tests (brotli, lz4, zstd) 2024-05-20 02:07:40 -07:00
Kenny Daniel
d4341b803e
Delta byte array encoding 2024-05-18 19:23:11 -07:00
Kenny Daniel
561f06f701
Int_Map test is redundant with nullable.impala.parquet 2024-05-18 18:33:15 -07:00
Kenny Daniel
3583aeb549
nullable.impala.parquet 2024-05-17 22:52:57 -07:00
Kenny
cf4c4ba04d
Assembly of nested column types (#11) 2024-05-17 22:44:03 -07:00
Kenny Daniel
c83aa2ea5b
Float16 2024-05-13 20:36:53 -07:00
Kenny Daniel
7639b8ca7f
Fix fixed length byte array type 2024-05-12 21:52:26 -07:00
Kenny Daniel
fe7e19b2a4
Fix decimal conversion with precision 2024-05-12 19:52:15 -07:00
Kenny Daniel
82db6a8017
Delta binary packed encoding 2024-05-12 15:47:16 -07:00
Kenny Daniel
892c933a05
Parse logical types 2024-05-05 16:13:19 -07:00
Kenny Daniel
57ed66646d
Convert statistics based on column type 2024-05-04 01:11:46 -07:00
Kenny Daniel
3d5d423694
Parse additional metadata 2024-05-04 01:03:42 -07:00
Kenny Daniel
09ea11517c
Fix typescript definitions 2024-04-26 14:01:00 -07:00
Kenny Daniel
d0213bf0f1
Snappy jpg test 2024-04-12 23:58:37 -07:00