Commit Graph

61 Commits

Author SHA1 Message Date
Kenny Daniel
0a20750193
Pushdown filter (#141) 2025-11-21 03:07:56 -08:00
Kenny Daniel
acdbb22828
Update geospatial and variant metadata from thrift spec 2025-10-23 18:49:05 -07:00
Sylvain Lesage
e8b1c8e570
Minimal support for GeoParquet (#133)
* Initial support for GeoParquet

* pr comments

* convert crs

* add test file + expected JSON files

* add sentence to README

* Apply suggestion from @platypii

Co-authored-by: Kenny Daniel <platypii@gmail.com>

* PR comments

* update README

* review comment

---------

Co-authored-by: Kenny Daniel <platypii@gmail.com>
2025-10-16 04:22:01 -04:00
Kenny Daniel
d701904253
Add well-known-binary decoder for geometry and geography (#131) 2025-09-30 11:45:39 -07:00
Kenny Daniel
2f00330527
Parse geospatial_statistics (#130) 2025-09-27 16:31:16 -07:00
Kenny Daniel
8611663334
Custom string parser option (#129) 2025-09-26 19:07:25 -07:00
Sylvain Lesage
c6429d5abe
try to fix the types again (#120)
* try to fix the types again

* fix test (breaking)

* [breaking] only support object format for parquetReadObjects and parquetQuery

* remove internal types

* remove redundant test

* override __index__ with original data if present

Also: add comments to explain special cases.

* remove the need to slice arrays

* loosen the types to avoid code duplication

* always write the index, because the results should be consistent

* Revert "always write the index, because the results should be consistent"

This reverts commit fd4e3060674fa6e81bd32fc894d7c366103e004a.
2025-09-16 15:29:44 -07:00
Kenny Daniel
92a417c506
Revert "Fix onComplete return type (#104)" (#117)
This reverts commit 49bd895fb51dd13631f7a4f61e46e0baf8f1c0c5.
2025-09-03 22:15:51 -07:00
Sylvain Lesage
49bd895fb5
Fix onComplete return type (#104)
* attempt to fix #28

* remove breaking changes

* loosen the types a bit, but no breaking change

* fix format and doc

* fix format

* fix format

* remove unused import and add space

Co-authored-by: Mario <mario@autarc.energy>
2025-08-22 15:09:28 -04:00
Kenny Daniel
8050e0e38d
Fix filter on unselected column (#95) 2025-06-30 01:47:05 -07:00
Kenny Daniel
f8ecf52bed
Publish v1.16.0 2025-06-10 11:02:42 -07:00
LiraNuna
8609192b23
Introduce 'custom parsers' option for decoding dates (#87) 2025-06-09 18:02:31 -07:00
Kenny Daniel
4e2f76df09
parquetReadAsync (#83) 2025-05-26 17:27:15 -07:00
Kenny Daniel
9a9519f0b7
Add more details to QueryPlan. (#82)
- Add metadata
- Add rowStart and rowEnd
- Add columns
- Add groupStart, selectStart, selectEnd, and groupRows to GroupPlan
- Rename ranges to fetches
- Rename numRows to groupRows in ColumnDecoder
2025-05-25 15:21:58 -07:00
Kenny Daniel
5d8f17903e
Omit onComplete from parquetReadObjects 2025-05-22 23:07:04 -07:00
Kenny Daniel
0e6d7dee6f
Parquet Query Planner: plan byte ranges, pre-fetch in parallel (#75)
* Parquet Query Planner: plan byte ranges, pre-fetch in parallel.

 - parquetPlan() that returns lists of byte ranges to fetch.
 - prefetchAsyncBuffer() pre-fetches all byte ranges in parallel.
   throws exception if non-pre-fetched slice is requested later.
2025-04-30 00:49:40 -07:00
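The pre-fetch behaviour described in the commit above can be illustrated with a minimal sketch. This is not the library's code: `prefetchRanges`, `countingBuffer`, and the `startByte`/`endByte` field names are hypothetical, and an "async buffer" is assumed to be any object with a `byteLength` and an async `slice(start, end)`.

```javascript
// Hypothetical sketch of "pre-fetch all planned byte ranges in parallel".
// Names and shapes here are assumptions, not hyparquet's actual API.

/**
 * Wrap an in-memory ArrayBuffer as an async buffer that counts slice calls.
 * Stands in for a remote file where every slice is a network request.
 */
function countingBuffer(arrayBuffer) {
  const source = {
    byteLength: arrayBuffer.byteLength,
    fetches: 0,
    async slice(start, end) {
      source.fetches++
      return arrayBuffer.slice(start, end)
    },
  }
  return source
}

/**
 * Start fetching every planned range immediately, then serve later
 * slice() calls only from those ranges.
 */
function prefetchRanges(file, ranges) {
  // Kick off all requests up front so they run in parallel
  const pending = ranges.map(({ startByte, endByte }) => file.slice(startByte, endByte))
  return {
    byteLength: file.byteLength,
    async slice(start, end = file.byteLength) {
      // Find a pre-fetched range that fully covers the request
      const index = ranges.findIndex(r => r.startByte <= start && end <= r.endByte)
      // Rejects (async throw) when the requested bytes were never planned
      if (index < 0) throw new Error(`range ${start}-${end} was not pre-fetched`)
      const chunk = await pending[index]
      const offset = ranges[index].startByte
      return chunk.slice(start - offset, end - offset)
    },
  }
}
```

Failing loudly on an unplanned slice, rather than silently issuing another request, makes a gap in the plan visible at once.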
Kenny Daniel
9a04cbccd3
Convert unsigned types 2025-04-14 23:20:58 -07:00
Kenny Daniel
8161983962
Publish v1.12.0 2025-04-11 04:43:11 -07:00
Kenny Daniel
11c7d8174a
LogicalType DECIMAL is not a LogicalTypeSimple 2025-04-11 00:21:55 -07:00
Kenny Daniel
f5274904b7
Add onPage callback to parquetRead 2025-04-10 23:29:58 -07:00
Kenny Daniel
90be536e05
Group selection of a row group into an object 2025-04-10 22:36:10 -07:00
Kenny Daniel
4df7095ab4
Group column decoding params into an object 2025-04-10 19:30:25 -07:00
Kenny Daniel
4645e34f97
Re-order types.d.ts to put important apis up front 2025-04-10 16:33:50 -07:00
Kenny Daniel
f9a10da20b
Type thrift 2025-04-03 19:20:00 -07:00
Kenny Daniel
6af6f43f44
Export more constants 2025-03-31 23:20:22 -07:00
Kenny Daniel
85e1af66c1
Fix thrift parsing of crypto_metadata 2025-03-25 15:42:48 -07:00
Brian Park
c9727a4246
Query filter (#56)
* implement ParquetQueryFilter types

* implement parquetQuery filter tests

* implement parquetQuery filter

* filter before ordering

* apply filters before sorting/slicing

* format types

* add deep equality utility

* document and format equals utility

* use deep equality checks

* update filter tests

* support more types for equality

* make $not unary

* ensure arrays are correctly compared

* support both forms of $not

* add operator tests

* Filter operator tests

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-12-21 15:23:57 -08:00
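The ordering noted in the commit above ("apply filters before sorting/slicing") can be shown with a small sketch. `queryRows` and its option names are hypothetical, and the filter is simplified to a plain predicate function rather than the operator-based filter objects the commit implements.

```javascript
// Hypothetical helper showing filter -> sort -> slice ordering.
// Option names are assumptions for illustration only.
function queryRows(rows, { filter, orderBy, rowStart = 0, rowEnd } = {}) {
  // 1. Filter first, so rowStart/rowEnd count matching rows only
  const matched = filter ? rows.filter(filter) : rows.slice()
  // 2. Sort the matches (matched is already a copy, the input is untouched)
  if (orderBy) {
    matched.sort((a, b) => a[orderBy] < b[orderBy] ? -1 : a[orderBy] > b[orderBy] ? 1 : 0)
  }
  // 3. Slice last
  return matched.slice(rowStart, rowEnd)
}
```

Slicing before filtering would instead return whichever of the first `rowEnd` rows happened to match, which can be fewer rows than requested even when more matches exist.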
Sylvain Lesage
09ae9400c5
build types before publishing to npm (#46)
* build types before publishing to npm

* use prepare instead of prepublishOnly + make it clear that we only build types

doc for prepare vs prepublishOnly is here: https://docs.npmjs.com/cli/v8/using-npm/scripts

* no jsx in this lib

* relative imports from the root, so that it works from types/

* remove unused hyparquet.d.ts + report differences to jsdoc in files

* try to understand if this is the cause of the failing CI check

tsc fails: https://github.com/hyparam/hyparquet/actions/runs/12040954822/job/33571851170?pr=46

* Revert "try to understand if this is the cause of the failing CI check"

This reverts commit 5e2fc8ca179064369de71793ab1cda3facefddc7.

* not sure what happens, but we just need to ensure the types are created correctly

* increment version

* Explicitly export types for use in downstream typescript projects

* Use new typescript jsdoc imports for smaller package

* Combine some files and use @import jsdoc

* use the local typescript

---------

Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-12-02 17:47:42 +01:00
Kenny
a2024a781c
Parse column and offset indexes (#29)
* Parse indicies

* Add parsed offset indices

* Add parsed column indices

* Test readColumnIndex and readOffsetIndex

* Add more parsed offset indices

* Remove unnecessary toJson when loading expected results

* Add length checks to convertMetadata

* Rename indicies.js to indexes.js

* Rename indices.test.js to indexes.test.js

* Rename *_indices.json to *_indexes.json

* Use asyncBufferFromFile in indexes.test.js

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
2024-08-18 18:23:54 -07:00
Kenny Daniel
8e0235413a
Update dependencies 2024-08-02 16:12:57 -07:00
Kenny Daniel
17f412c2f5
Convert logical date units 2024-05-24 16:55:13 -07:00
Kenny Daniel
a56420de2f
Parse metadata TimeUnit 2024-05-24 15:17:20 -07:00
Kenny Daniel
2edc14b70e
Convert unsigned ints 2024-05-23 23:35:49 -07:00
Kenny Daniel
034e9cda16
Faster row transpose 2024-05-14 17:13:24 -07:00
Kenny Daniel
c83aa2ea5b
Float16 2024-05-13 20:36:53 -07:00
Kenny Daniel
12dc5a47f8
Add path to schemaTree 2024-05-06 13:23:18 -07:00
Kenny Daniel
892c933a05
Parse logical types 2024-05-05 16:13:19 -07:00
Kenny Daniel
57ed66646d
Convert statistics based on column type 2024-05-04 01:11:46 -07:00
Kenny Daniel
f86c8c6359
Update metadata types based on parquet.thrift schema, use bigint for i64 type. 2024-05-04 00:18:16 -07:00
Kenny Daniel
f411b83f5e
Fix min/max type definition 2024-05-02 17:29:39 -07:00
Kenny Daniel
4d5c8324aa
TypedArrays 2024-05-01 23:23:55 -07:00
Kenny Daniel
d093b0dcaa
Use DataReader for thrift 2024-05-01 00:55:16 -07:00
Kenny Daniel
09ea11517c
Fix typescript definitions 2024-04-26 14:01:00 -07:00
Kenny Daniel
86273b110c
PageType enum to string 2024-04-18 00:21:13 -07:00
Kenny Daniel
f826bff757
Use DataReader over Decoded. Fewer allocations, slightly faster. 2024-04-17 23:43:04 -07:00
Kenny Daniel
8a98407734
Parse logical types from metadata 2024-03-12 01:15:12 -07:00
Kenny Daniel
0f4708b954
Change compressors to return Uint8Array 2024-02-27 19:45:56 -08:00
Kenny Daniel
8b575ad2d8
ParquetType as string 2024-02-27 11:31:17 -08:00
Kenny Daniel
11f35c9e43
Encoding as string 2024-02-27 10:51:57 -08:00
Kenny Daniel
e3b5fca883
Custom decompressors 2024-02-27 09:05:02 -08:00