Commit Graph

67 Commits

Author SHA1 Message Date
Kenny Daniel
f5274904b7
Add onPage callback to parquetRead 2025-04-10 23:29:58 -07:00
Kenny Daniel
90be536e05
Group selection of a row group into an object 2025-04-10 22:36:10 -07:00
Kenny Daniel
4df7095ab4
Group column decoding params into an object 2025-04-10 19:30:25 -07:00
Kenny Daniel
972402d083
Fix handling of dictionary pages from parquet.net 2025-04-09 17:26:47 -07:00
Kenny Daniel
655444bcde
Fix continued data pages
Parquet allows consecutive pages to continue a previously assembled
list. Broke in hyparquet 1.9.0. Added continued_page.parquet test.
2025-04-07 17:40:23 -07:00
Kenny Daniel
6c225888c4
Skip unnecessary pages
Do this by passing rowGroupStart and rowGroupEnd for the rows to
fetch within a row group. If a page is outside those bounds, we can
skip the page. Replaces rowLimit.
2025-04-07 00:40:17 -07:00
Kenny Daniel
b38b65f7c7
Refactor assembleLists to take a schemaPath 2025-04-02 23:39:55 -07:00
Kenny Daniel
1247f5d606
Split out readPage
Remove dict-page-offset-zero test because it's a malformed parquet file.
2025-04-02 20:27:10 -07:00
Kenny Daniel
6af6f43f44
Export more constants 2025-03-31 23:20:22 -07:00
Kenny Daniel
d7f8d39de3
Return typed arrays in onChunk. Change readColumn to return DecodedArray[]. (#67)
Refactored readColumn to avoid `concat` operations.
This avoids extra copying and allocation.
2025-03-10 23:33:47 -07:00
Kenny Daniel
791a847e42
Revert "Simplify relative import paths"
This reverts commit e590f4ee03263460a389bdd29678015727cdcd5a.
2025-03-06 08:54:32 -08:00
Kenny Daniel
e590f4ee03
Simplify relative import paths 2025-03-05 14:03:17 -08:00
Brian Park
9992316748
Enable readColumn to read all rows (#53)
* Enable readColumn to read all rows

* Refactor readColumn to use hasRowLimit

* Simplify hasRowLimit condition

* Check less common condition first

* add readColumn test files

* implement readColumn tests for undefined rowLimits

* remove unused variable

* return early if no metadata is present

* address tsc warnings

* add comparison

* clarify that undefined is valid for rowLimit

* remove test files

* verify edge case works when rowLimit is undefined

* add test cases for readColumn

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
2024-12-19 18:08:22 -08:00
Sylvain Lesage
09ae9400c5
build types before publishing to npm (#46)
* build types before publishing to npm

* use prepare instead of prepublishOnly + make it clear that we only build types

doc for prepare vs prepublishOnly is here: https://docs.npmjs.com/cli/v8/using-npm/scripts

* no jsx in this lib

* relative imports from the root, so that it works from types/

* remove unused hyparquet.d.ts + report differences to jsdoc in files

* try to understand if this is the cause of the failing CI check

tsc fails: https://github.com/hyparam/hyparquet/actions/runs/12040954822/job/33571851170?pr=46

* Revert "try to understand if this is the cause of the failing CI check"

This reverts commit 5e2fc8ca179064369de71793ab1cda3facefddc7.

* not sure what happens, but we just need to ensure the types are created correctly

* increment version

* Explicitly export types for use in downstream typescript projects

* Use new typescript jsdoc imports for smaller package

* Combine some files and use @import jsdoc

* use the local typescript

---------

Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-12-02 17:47:42 +01:00
Kenny Daniel
9cd2e3b666
Fix row limit for structs 2024-06-25 17:41:50 -07:00
Kenny Daniel
ddb8b16cd0
Fix handling of multiple pages 2024-06-07 23:16:04 -07:00
Kenny Daniel
6d769a4336
Demo: move to folder, typecheck, and render column indices 2024-05-31 19:40:44 -07:00
Kenny Daniel
36f5b4f043
Move decompressPage to avoid circular dependency chain 2024-05-27 12:54:42 -07:00
Kenny Daniel
9aebdb2917
Convert dictionary before dereferencing, and check encoding 2024-05-24 14:33:29 -07:00
Kenny Daniel
ed3b525a27
Fix nested optional from duckdb#3734 🦆 2024-05-23 18:19:01 -07:00
Kenny Daniel
b8e4496063
Upgrade dataPage to match dictionary type 2024-05-23 00:07:09 -07:00
Kenny Daniel
c4ad05e580
Convert byte arrays to utf8 by default 2024-05-22 22:40:21 -07:00
Kenny Daniel
5eeb05da40
dict-page-offset-zero.parquet 2024-05-21 22:50:50 -07:00
Kenny Daniel
9cd09b8eed
Byte stream split encoding 2024-05-20 04:09:32 -07:00
Kenny
cf4c4ba04d
Assembly of nested column types (#11) 2024-05-17 22:44:03 -07:00
Kenny Daniel
3f958ed25d
Handle skipNulls in assembleLists 2024-05-17 19:41:40 -07:00
Kenny Daniel
034e9cda16
Faster row transpose 2024-05-14 17:13:24 -07:00
Kenny Daniel
9f95eff222
Faster decimal conversion 2024-05-14 01:58:28 -07:00
Kenny Daniel
7639b8ca7f
Fix fixed length byte array type 2024-05-12 21:52:26 -07:00
Kenny Daniel
4d960e0533
Convert consistently 2024-05-09 17:07:05 -07:00
Kenny Daniel
4d5c8324aa
TypedArrays 2024-05-01 23:23:55 -07:00
Kenny Daniel
d093b0dcaa
Use DataReader for thrift 2024-05-01 00:55:16 -07:00
Kenny Daniel
93ff9a9f99
Refactor isListLike and isMapLike to use schemaPath 2024-04-29 18:45:29 -07:00
Kenny Daniel
2c6a111113
Refactor to use schemaPath 2024-04-29 17:38:26 -07:00
Kenny Daniel
a42cc558d0
Adjust read coalesce size 2024-04-29 16:43:07 -07:00
Kenny Daniel
bf1b8d79c7
Simplify imports 2024-04-28 16:05:47 -07:00
Kenny Daniel
86273b110c
PageType enum to string 2024-04-18 00:21:13 -07:00
Kenny Daniel
6ffdeca103
Fast array concat 2024-04-07 09:59:37 -07:00
Kenny Daniel
429fd9e813
Fix max call stack error in browser: concat not spread... 2024-04-06 22:01:42 -07:00
Kenny Daniel
46df1ab454
Rewrite dremel assembly 2024-03-21 16:47:21 -07:00
Kenny Daniel
52721a3d30
Split out assemble objects 2024-03-18 17:40:52 -07:00
Kenny Daniel
c6ad30b59a
schemaElement returns trees 2024-03-12 20:39:15 -07:00
Kenny Daniel
0f4708b954
Change compressors to return Uint8Array 2024-02-27 19:45:56 -08:00
Kenny Daniel
11f35c9e43
Encoding as string 2024-02-27 10:51:57 -08:00
Kenny Daniel
e3b5fca883
Custom decompressors 2024-02-27 09:05:02 -08:00
Kenny Daniel
87d78ab06e
Oops fix the other tests 2024-02-26 22:51:57 -08:00
Kenny Daniel
a65132b79c
Data Page V2 2024-02-26 18:44:33 -08:00
Kenny Daniel
ca971ccc01
Split out convert function 2024-02-26 12:20:48 -08:00
Kenny Daniel
c70b3b2227
Prepare for data page v2 2024-02-26 11:44:28 -08:00
Kenny Daniel
a7e5aef31f
decompressPage for dictionary and data page v1 only 2024-02-24 12:12:38 -08:00
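
The "Skip unnecessary pages" commit (6c225888c4) describes passing rowGroupStart and rowGroupEnd so that pages outside those bounds can be skipped. A minimal sketch of that idea follows. It is not hyparquet's actual code: the selectPages helper and the { numRows } page shape are hypothetical names used only for illustration, and the sketch assumes each page's row count is known up front, which may not hold for every page type.

```javascript
// Hypothetical sketch: pick the pages of a column chunk that overlap the
// requested row range [rowGroupStart, rowGroupEnd) within a row group.
// `pages` is an illustrative array of { numRows } objects, not a real hyparquet type.
function selectPages(pages, rowGroupStart, rowGroupEnd) {
  const selected = []
  let rowOffset = 0 // index of the first row held by the current page
  for (let i = 0; i < pages.length; i++) {
    const pageStart = rowOffset
    const pageEnd = rowOffset + pages[i].numRows // exclusive end
    rowOffset = pageEnd
    if (pageEnd <= rowGroupStart) continue // page ends before the requested rows
    if (pageStart >= rowGroupEnd) break // this page and all later ones are past the range
    selected.push(i)
  }
  return selected
}

// Three pages covering rows 0-99, 100-199, and 200-249.
const pages = [{ numRows: 100 }, { numRows: 100 }, { numRows: 50 }]
console.log(selectPages(pages, 120, 180)) // only the page at index 1 overlaps
```

Compared with a single row limit, a start/end pair lets a reader skip leading pages as well as trailing ones, which is presumably why the commit notes that it replaces rowLimit.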