203 Commits

Author SHA1 Message Date
Kenny Daniel
f37b2aea9f
for is faster than forEach 2025-03-17 10:18:01 -07:00
Kenny Daniel
d7f8d39de3
Return typed arrays in onChunk. Change readColumn to return DecodedArray[]. (#67)
Refactored readColumn to avoid `concat` operations.
This avoids extra copying and allocation.
2025-03-10 23:33:47 -07:00
Kenny Daniel
a9467f6c3d
Remove selfCopyBytes in favor of copyBytes 2025-03-10 20:56:00 -07:00
Kenny Daniel
4bbc7742e5
Comment out unnecessary length read in readRleBitPackedHybrid 2025-03-09 11:20:58 -07:00
Kenny Daniel
791a847e42
Revert "Simplify relative import paths"
This reverts commit e590f4ee03263460a389bdd29678015727cdcd5a.
2025-03-06 08:54:32 -08:00
Kenny Daniel
e590f4ee03
Simplify relative import paths 2025-03-05 14:03:17 -08:00
Kenny Daniel
2456cdc85f
Better error messages 2025-03-04 11:05:22 -08:00
Kenny Daniel
2a302702d4
Fix handling of boolean rle 2025-02-22 13:29:29 -08:00
Johan Levin
bf268e141c
Use prepended length for bit-packed hybrid bool columns (#62) 2025-02-19 11:07:49 -08:00
Kenny Daniel
36d8ea2e1d
Fix handling of signed decimals (#60) 2025-02-07 18:52:48 -08:00
Sean Lynch
725545731d
Support endpoints that don't support range requests in asyncBufferFromUrl (#57)
* Support endpoints that don't support range requests in asyncBufferFromUrl

Before this commit asyncBufferFromUrl assumes that the body of whatever
successful response it gets is equivalent to the range it requested. If
the origin server does not support HTTP range requests then this
assumption is usually wrong and will lead to parsing failures.

This commit changes asyncBufferFromUrl to change its behaviour slightly
based on the status code in the response:
- if 200 then we got the whole parquet file as the response. Save it and
  use the resulting ArrayBuffer to serve all future slice calls.
- if 206 then we got a range response and we can just return that.

I have also included some test cases to ensure that such responses are
handled correctly and also tweaked other existing mocks to also include
the relevant status code.

* Fix all lint warnings

* replace switch with if-else
2025-01-16 11:55:05 -08:00
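The status-code handling described in the asyncBufferFromUrl commit above (#57) could look roughly like the sketch below. This is not the library's actual code: `makeSlicer` and the injectable `fetchFn` parameter are hypothetical names introduced here so the example can run without a network, and the error thrown for other status codes is an assumption.

```javascript
/**
 * Hypothetical sketch of 200-vs-206 handling for ranged reads.
 * `makeSlicer` and `fetchFn` are illustrative names, not hyparquet APIs.
 */
function makeSlicer(url, fetchFn = fetch) {
  // Promise for the full file body, set once a server answers 200
  let whole
  return async function slice(start, end) {
    // Serve from the cached full body if an earlier request returned 200
    if (whole) return (await whole).slice(start, end)
    // HTTP Range end is inclusive, while slice end is exclusive
    const headers = { Range: `bytes=${start}-${end - 1}` }
    const res = await fetchFn(url, { headers })
    if (res.status === 206) {
      // Partial content: the body is exactly the requested range
      return res.arrayBuffer()
    } else if (res.status === 200) {
      // Range ignored: the body is the whole file, so cache it and cut locally
      whole = res.arrayBuffer()
      return (await whole).slice(start, end)
    }
    throw new Error(`unexpected status ${res.status}`)
  }
}
```

With a server that ignores range requests, only the first `slice` call reaches the network; later calls are served from the cached buffer.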
Brian Park
c9727a4246
Query filter (#56)
* implement ParquetQueryFilter types

* implement parquetQuery filter tests

* implement parquetQuery filter

* filter before ordering

* apply filters before sorting/slicing

* format types

* add deep equality utility

* document and format equals utility

* use deep equality checks

* update filter tests

* support more types for equality

* make $not unary

* ensure arrays are correctly compared

* support both forms of $not

* add operator tests

* Filter operator tests

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-12-21 15:23:57 -08:00
Brian Park
9992316748
Enable readColumn to read all rows (#53)
* Enable readColumn to read all rows

* Refactor readColumn to use hasRowLimit

* Simplify hasRowLimit condition

* Check less common condition first

* add readColumn test files

* implement readColumn tests for undefined rowLimits

* remove unused variable

* return early if no metadata is present

* address tsc warnings

* add comparison

* clarify that undefined is valid for rowLimit

* remove test files

* verify edge case works when rowLimit is undefined

* add test cases for readColumn

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
2024-12-19 18:08:22 -08:00
Kenny Daniel
7ce11ad844
Validate url for asyncBufferFromUrl 2024-12-17 09:25:54 -08:00
Kenny Daniel
f762dba6a8
Use ParquetReadOptions type for parquetRead options (#51) 2024-12-10 16:16:52 -08:00
Sylvain Lesage
09ae9400c5
build types before publishing to npm (#46)
* build types before publishing to npm

* use prepare instead of prepublishOnly + make it clear that we only build types

doc for prepare vs prepublishOnly is here: https://docs.npmjs.com/cli/v8/using-npm/scripts

* no jsx in this lib

* relative imports from the root, so that it works from types/

* remove unused hyparquet.d.ts + report differences to jsdoc in files

* try to understand if this is the cause of the failing CI check

tsc fails: https://github.com/hyparam/hyparquet/actions/runs/12040954822/job/33571851170?pr=46

* Revert "try to understand if this is the cause of the failing CI check"

This reverts commit 5e2fc8ca179064369de71793ab1cda3facefddc7.

* not sure what happens, but we just need to ensure the types are created correctly

* increment version

* Explicitly export types for use in downstream typescript projects

* Use new typescript jsdoc imports for smaller package

* Combine some files and use @import jsdoc

* use the local typescript

---------

Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-12-02 17:47:42 +01:00
Kenny Daniel
82b25df871
Update dependencies 2024-11-29 14:11:04 -08:00
Carlos Bardasano
9aacd15349
Fix metadata timestamp conversion (#45)
* Fix metadata timestamp conversion

* Remove redundant check

Co-authored-by: Kenny Daniel <platypii@gmail.com>

---------

Co-authored-by: Kenny Daniel <platypii@gmail.com>
2024-11-29 13:48:06 -08:00
Sylvain Lesage
e738643745
Publish 1.6.1 - fix type of utils and update the doc (#44)
* Publish 1.6.1 - fix types

* update the doc
2024-11-22 21:19:34 +01:00
Sylvain Lesage
6ec836dac5
pass requestInit to fetch utils (#34)
* pass requestInit to fetch utils

It will allow authentication

* add tests
2024-11-08 22:22:30 +01:00
Kenny Daniel
5d21b09b7a
Export cachedAsyncBuffer 2024-10-16 01:29:31 -07:00
Kenny Daniel
bb202dfdeb
Fix util export types 2024-10-04 21:46:35 -07:00
Matthew Peveler
c8033621f9
Export additional types
Signed-off-by: Matthew Peveler <mpeveler@timescale.com>
2024-10-04 21:30:32 -07:00
Kenny Daniel
e6301a8bc8
demo: use web worker for parquet parsing to avoid blocking main thread 2024-09-25 02:22:30 -07:00
Kenny Daniel
9d49dabc15
Query api 2024-09-24 21:01:04 -07:00
Kenny Daniel
9a2f4fdcba
Update dependencies 2024-09-24 16:54:44 -07:00
Kenny Daniel
df02229407
Promisified parquetReadObjects function 2024-08-20 11:30:39 -07:00
Kenny
a2024a781c
Parse column and offset indexes (#29)
* Parse indices

* Add parsed offset indices

* Add parsed column indices

* Test readColumnIndex and readOffsetIndex

* Add more parsed offset indices

* Remove unnecessary toJson when loading expected results

* Add length checks to convertMetadata

* Rename indicies.js to indexes.js

* Rename indices.test.js to indexes.test.js

* Rename *_indices.json to *_indexes.json

* Use asyncBufferFromFile in indexes.test.js

---------

Co-authored-by: Brian Park <park-brian@users.noreply.github.com>
2024-08-18 18:23:54 -07:00
Kenny Daniel
b1c8a1dd8b
Revert onComplete type signature change from #25
The type change caused a lot of downstream type errors.
If you pass rowFormat: 'object' then it will return Record<string, any>[]
instead of any[][]. This means the types are not aligned with behavior.
Will figure out how to fix it later; for now, don't want to break downstream projects.
2024-08-14 22:00:32 -07:00
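The mismatch the revert message above describes comes from the two row shapes. The sketch below illustrates them under stated assumptions: `formatRows` is a hypothetical helper written for this example, not a hyparquet function, and only the `rowFormat: 'object'` option name and the two result shapes come from the commit messages.

```javascript
/**
 * Hypothetical helper showing the two row shapes.
 * Default: each row is an array of values (any[][]).
 * rowFormat 'object': each row is keyed by column name (Record<string, any>[]).
 */
function formatRows(columnNames, rows, rowFormat = 'array') {
  if (rowFormat !== 'object') return rows
  return rows.map(row =>
    // Pair each value with its column name, keeping the column order
    Object.fromEntries(columnNames.map((name, i) => [name, row[i]]))
  )
}

const rows = [[1, 'a'], [2, 'b']]
const asArrays = formatRows(['id', 'label'], rows)
const asObjects = formatRows(['id', 'label'], rows, 'object')
```

A callback typed as receiving `any[][]` describes `asArrays` correctly but not `asObjects`, which is the gap between types and behavior that the message mentions.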
ctranstrum
8ace1a47d2
return column names in the order requested (#27)
* return column names in the order requested

* retain correct ordering of columns in object rows as well
2024-08-14 00:01:47 -07:00
ctranstrum
d13d52b606
Add an option to return each row as an object keyed by column name (#25)
* Add an option to return each row as an object keyed by column name

* rename option to rowFormat and address feedback
2024-08-13 09:15:59 -07:00
Kenny Daniel
8e0235413a
Update dependencies 2024-08-02 16:12:57 -07:00
Kenny Daniel
0e807587e1
Prevent webpack from trying to include node fs 2024-08-02 16:04:22 -07:00
Kenny Daniel
c6c79c05ca
Fix for issue #23 nested struct assembly 2024-08-02 14:47:04 -07:00
Kenny Daniel
83e06c3465
Export asyncBufferFromFile, asyncBufferFromUrl and add to README 2024-07-26 17:18:36 -07:00
Kenny Daniel
a5122e61d6
utils: asyncBufferFromFile 2024-07-26 15:07:47 -07:00
Kenny Daniel
5188b3c764
utils: asyncBufferFromUrl 2024-07-26 14:12:35 -07:00
Kenny Daniel
58a6b963a1
Fix out of order columns in onComplete 2024-07-22 21:45:18 -07:00
Kenny Daniel
9cd2e3b666
Fix row limit for structs 2024-06-25 17:41:50 -07:00
Kenny Daniel
72cdffef7c
Fix readBitPacked for 17+ bitwidth 2024-06-13 00:10:45 -07:00
Kenny Daniel
ddb8b16cd0
Fix handling of multiple pages 2024-06-07 23:16:04 -07:00
Kenny Daniel
6d769a4336
Demo: move to folder, typecheck, and render column indices 2024-05-31 19:40:44 -07:00
Kenny Daniel
9db378de2f
toJson tests 2024-05-28 14:24:12 -07:00
Kenny Daniel
f28735c0ce
readVarInt tests 2024-05-28 14:18:04 -07:00
Kenny Daniel
490d1ec800
Fix 3-byte RLE 2024-05-28 13:58:02 -07:00
Kenny Daniel
36f5b4f043
Move decompressPage to avoid circular dependency chain 2024-05-27 12:54:42 -07:00
Kenny Daniel
17f412c2f5
Convert logical date units 2024-05-24 16:55:13 -07:00
Kenny Daniel
efdbf459a5
Convert date and decimal stats 2024-05-24 15:22:59 -07:00
Kenny Daniel
a56420de2f
Parse metadata TimeUnit 2024-05-24 15:17:20 -07:00
Kenny Daniel
9aebdb2917
Convert dictionary before dereferencing, and check encoding 2024-05-24 14:33:29 -07:00