Parsing Options
The exported read and readFile functions accept an options argument:
| Option Name | Default | Description |
|---|---|---|
| type | | Input data encoding (see Input Type below) |
| cellFormula | true | Save formulae to the .f field |
| cellHTML | true | Parse rich text and save HTML to the .h field |
| cellNF | false | Save number format string to the .z field |
| cellStyles | false | Save style/theme info to the .s field |
| cellDates | false | Store dates as type d (default is n) |
| sheetStubs | false | Create cell objects of type z for stub cells |
| sheetRows | 0 | If >0, read the first sheetRows rows ** |
| bookDeps | false | If true, parse calculation chains |
| bookFiles | false | If true, add raw files to book object ** |
| bookProps | false | If true, only parse enough to get book metadata ** |
| bookSheets | false | If true, only parse enough to get the sheet names |
| bookVBA | false | If true, expose vbaProject.bin to vbaraw field ** |
| password | "" | If defined and file is encrypted, use password ** |
| WTF | false | If true, throw errors on unexpected file features ** |
- Even if `cellNF` is false, formatted text will be generated and saved to `.w`
- In some cases, sheets may be parsed even if `bookSheets` is false.
- `bookSheets` and `bookProps` combine to give both sets of information
- `Deps` will be an empty object if `bookDeps` is falsy
- `bookFiles` behavior depends on file type:
  - `keys` array (paths in the ZIP) for ZIP-based formats
  - `files` hash (mapping paths to objects representing the files) for ZIP
  - `cfb` object for formats using CFB containers
- `sheetRows-1` rows will be generated when looking at the JSON object output (since the header row is counted as a row when parsing the data)
- `bookVBA` merely exposes the raw vba object. It does not parse the data.
- Currently only XOR encryption is supported. Unsupported error will be thrown for files employing other encryption methods.
- WTF is mainly for development. By default, the parser will suppress read errors on single worksheets, allowing you to read from the worksheets that do parse properly. Setting `WTF:1` forces those errors to be thrown.
The defaults are enumerated in bits/84_defaults.js
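A minimal sketch of passing an options object to `read`, assuming the caller has already loaded the library and hands it in as `XLSX` (for example via `require('xlsx')` in nodejs). The helper names `listSheetNames` and `readStrict` are hypothetical and exist only for illustration; `SheetNames` is the workbook field holding the list of sheet names.

```javascript
// Hypothetical helpers: the caller supplies the loaded library object as XLSX.

// Parse only enough of the file to get the sheet names (bookSheets: true).
function listSheetNames(XLSX, base64Data) {
  const wb = XLSX.read(base64Data, { type: 'base64', bookSheets: true });
  return wb.SheetNames;
}

// Parse with WTF set, so errors on individual worksheets are thrown
// instead of suppressed. Returns null if parsing fails.
function readStrict(XLSX, base64Data) {
  try {
    return XLSX.read(base64Data, { type: 'base64', WTF: true });
  } catch (e) {
    console.error('parse failed: ' + e.message);
    return null;
  }
}
```

Taking the library as a parameter keeps the helpers independent of how it was loaded (script tag or `require`).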
Input Type
Strings can be interpreted in multiple ways. The type parameter for read
tells the library how to parse the data argument:
| type | expected input |
|---|---|
| "base64" | string: base64 encoding of the file |
| "binary" | string: binary string (n-th byte is data.charCodeAt(n)) |
| "buffer" | nodejs Buffer |
| "array" | array: array of 8-bit unsigned int (n-th byte is data[n]) |
| "file" | string: filename that will be read and processed (nodejs only) |
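To make the in-memory input types concrete, the following sketch expresses the same bytes in each form. It assumes a nodejs environment (for `Buffer`); the helper name `toInputForms` is hypothetical and not part of the library.

```javascript
// Hypothetical helper: given an array of byte values, build the value you
// would pass as `data` for each in-memory `type`.
function toInputForms(bytes) {
  const buffer = Buffer.from(bytes);
  return {
    // "array": n-th byte is data[n]
    array: bytes.slice(),
    // "binary": n-th byte is data.charCodeAt(n)
    binary: String.fromCharCode.apply(null, bytes),
    // "buffer": nodejs Buffer
    buffer: buffer,
    // "base64": base64 encoding of the bytes
    base64: buffer.toString('base64')
  };
}

// Example input: the first three bytes of a ZIP archive ("PK" then 0x03).
const forms = toInputForms([0x50, 0x4B, 0x03]);
console.log(forms.binary.charCodeAt(1)); // 75 (0x4B)
console.log(forms.base64);               // "UEsD"
```

Whichever form is used, the matching `type` value must be passed so the library interprets the data correctly.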
Guessing File Type
Excel and other spreadsheet tools read the first few bytes and apply other
heuristics to determine a file type. This enables file type punning: renaming
a file to use the .xls extension tells your computer to open it with Excel,
and Excel will still know how to handle the contents. This library applies similar logic:
| Byte 0 | Raw File Type | Spreadsheet Types |
|---|---|---|
| 0xD0 | CFB Container | BIFF 5/8 or password-protected XLSX/XLSB |
| 0x09 | BIFF Stream | BIFF 2/3/4/5 |
| 0x3C | XML/HTML | SpreadsheetML / Flat ODS / UOS1 / HTML / plaintext |
| 0x50 | ZIP Archive | XLSB or XLSX/M or ODS or UOS2 or plaintext |
| 0xFE | UTF8 Text | SpreadsheetML or Flat ODS or UOS1 or plaintext |
DBF files are detected based on the first byte as well as the third and fourth bytes (corresponding to the month and day of the file date).
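The first-byte lookup and the DBF date check can be sketched as follows. This is an illustration of the idea only, not the library's actual detection code: the names `RAW_TYPE_BY_FIRST_BYTE`, `guessRawType` and `hasPlausibleDbfDate` are hypothetical, the input is assumed to be an array or Buffer of bytes, and the month/day ranges are an assumption about what counts as a plausible date.

```javascript
// Byte 0 -> raw file type, transcribed from the table above.
const RAW_TYPE_BY_FIRST_BYTE = {
  0xD0: 'CFB Container',
  0x09: 'BIFF Stream',
  0x3C: 'XML/HTML',
  0x50: 'ZIP Archive',
  0xFE: 'UTF8 Text'
};

// Returns the raw file type for the first byte, or undefined if the
// byte is not in the table (or the input is empty).
function guessRawType(bytes) {
  if (!bytes || bytes.length === 0) return undefined;
  return RAW_TYPE_BY_FIRST_BYTE[bytes[0]];
}

// Assumed plausibility check: the third and fourth bytes (indices 2 and 3)
// should look like a month and a day.
function hasPlausibleDbfDate(bytes) {
  if (!bytes || bytes.length < 4) return false;
  const month = bytes[2];
  const day = bytes[3];
  return month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

console.log(guessRawType([0x50, 0x4B, 0x03, 0x04])); // "ZIP Archive"
```

The raw type only narrows the candidates; as the table shows, each container can hold several spreadsheet formats, so further inspection is still needed.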