### Usage

Most scenarios involving spreadsheets and data can be broken into 5 parts:

1) **Acquire Data**:  Data may be stored anywhere: local or remote files,
   databases, HTML TABLE, or even generated programmatically in the web browser.

2) **Extract Data**:  For spreadsheet files, this involves parsing raw bytes to
   read the cell data. For general JS data, this involves reshaping the data.

3) **Process Data**:  From generating summary statistics to cleaning data
   records, this step is the heart of the problem.

4) **Package Data**:  This can involve making a new spreadsheet, serializing
   with `JSON.stringify`, writing XML, or simply flattening data for UI tools.

5) **Release Data**:  Spreadsheet files can be uploaded to a server or written
   locally.  Data can be presented to users in an HTML TABLE or data grid.

A common problem involves generating a valid spreadsheet export from data stored
in an HTML table.  In this example, an HTML TABLE on the page will be scraped,
a row will be added to the bottom with the date of the report, and a new file
will be generated and downloaded locally. `XLSX.writeFile` takes care of
packaging the data and attempting a local download:
					
						
```js
// Acquire Data (reference to the HTML table)
var table_elt = document.getElementById("my-table-id");

// Extract Data (create a workbook object from the table)
var workbook = XLSX.utils.table_to_book(table_elt);

// Process Data (add a new row)
var ws = workbook.Sheets["Sheet1"];
XLSX.utils.sheet_add_aoa(ws, [["Created "+new Date().toISOString()]], {origin:-1});

// Package and Release Data (`writeFile` tries to write and save an XLSB file)
XLSX.writeFile(workbook, "Report.xlsb");
```
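The same five steps apply when the data never touches a spreadsheet file. The following is a minimal sketch in plain JS with no library calls; the `orders` records, the `columns` list, and the computed `total` column are hypothetical names chosen for illustration:

```js
// 1) Acquire Data: hypothetical records, e.g. the result of a database query
const orders = [
  { item: "Widget", qty: 2, price: 3.5 },
  { item: "Gadget", qty: 1, price: 10 }
];

// 2) Extract Data: reshape the objects into an array of arrays (header row first)
const columns = ["item", "qty", "price"];
const aoa = [columns].concat(orders.map(row => columns.map(key => row[key])));

// 3) Process Data: append a computed "total" column to every row
const processed = aoa.map((row, i) => row.concat(i === 0 ? "total" : row[1] * row[2]));

// 4) Package Data: serialize the result for transport
const payload = JSON.stringify(processed);

// 5) Release Data: hand the packaged data to the consumer (here, the console)
console.log(payload);
```

The array of arrays produced in step 3 has the same shape as the data passed to `XLSX.utils.sheet_add_aoa` in the example above.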
					
						
This library tries to simplify steps 2 and 4 with functions to extract useful
data from spreadsheet files (`read` / `readFile`) and generate new spreadsheet
files from data (`write` / `writeFile`).  Additional utility functions like
`table_to_book` work with other common data sources like HTML tables.

This documentation and various demo projects cover a number of common scenarios
and approaches for steps 1 and 5.

Utility functions help with step 3.

["Acquiring and Extracting Data"](#acquiring-and-extracting-data) describes
solutions for common data import scenarios.

["Packaging and Releasing Data"](#packaging-and-releasing-data) describes
solutions for common data export scenarios.
					
						
["Processing Data"](#processing-data) describes solutions for
common workbook processing and manipulation scenarios.
					
						
["Utility Functions"](#utility-functions) details utility functions for
translating JSON Arrays and other common JS structures into worksheet objects.

### The Zen of SheetJS
					
						
_Data processing should fit in any workflow_

The library does not impose a separate lifecycle.  It fits nicely in websites
and apps built using any framework.  The plain JS data objects play nice with
Web Workers and future APIs.

_JavaScript is a powerful language for data processing_

The ["Common Spreadsheet Format"](#common-spreadsheet-format) is a simple object
representation of the core concepts of a workbook.  The various functions in the
library provide low-level tools for working with the object.

For friendly JS processing, there are utility functions for converting parts of
a worksheet to/from an Array of Arrays.  The following example combines powerful
JS Array methods with a network request library to download data, select the
information we want and create a workbook file:
					
						
<details>
  <summary><b>Get Data from a JSON Endpoint and Generate a Workbook</b> (click to show)</summary>

The goal is to generate an XLSX workbook of US President names and birthdays.

**Acquire Data**

_Raw Data_

<https://theunitedstates.io/congress-legislators/executive.json> has the desired
data.  For example, John Adams:
					
						
```js
{
  "id": { /* (data omitted) */ },
  "name": {
    "first": "John",          // <-- first name
    "last": "Adams"           // <-- last name
  },
  "bio": {
    "birthday": "1735-10-19", // <-- birthday
    "gender": "M"
  },
  "terms": [
    { "type": "viceprez", /* (other fields omitted) */ },
    { "type": "viceprez", /* (other fields omitted) */ },
    { "type": "prez", /* (other fields omitted) */ } // <-- look for "prez"
  ]
}
```
					
						
_Filtering for Presidents_

The dataset includes Aaron Burr, a Vice President who was never President!

`Array#filter` creates a new array with the desired rows.  A President served
at least one term with `type` set to `"prez"`.  `Array#some`, another native JS
method, tests whether a particular row has at least one `"prez"` term.  The
complete filter would be:

```js
const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));
```
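To see the filter in action without a network request, here is a small sketch. The `sample` array is a hypothetical, heavily abbreviated stand-in for the real dataset that keeps only the fields the filter and the log statement read:

```js
// Hypothetical, abbreviated records shaped like the dataset entries
const sample = [
  { name: { first: "John", last: "Adams" },
    terms: [{ type: "viceprez" }, { type: "viceprez" }, { type: "prez" }] },
  { name: { first: "Aaron", last: "Burr" },
    terms: [{ type: "viceprez" }] }
];

// keep only the rows with at least one "prez" term
const presidents = sample.filter(row => row.terms.some(term => term.type === "prez"));

console.log(presidents.map(row => row.name.last)); // [ 'Adams' ]
```

`Array#some` stops at the first matching term, so rows with many terms are not scanned further than necessary.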
					
						
_Lining up the data_

For this example, the name will be the first name combined with the last name
(`row.name.first + " " + row.name.last`) and the birthday will be the subfield
`row.bio.birthday`.  Using `Array#map`, the dataset can be massaged in one call:

```js
const rows = prez.map(row => ({
  name: row.name.first + " " + row.name.last,
  birthday: row.bio.birthday
}));
```

The result is an array of "simple" objects with no nesting:

```js
[
  { name: "George Washington", birthday: "1732-02-22" },
  { name: "John Adams", birthday: "1735-10-19" },
  // ... one row per President
]
```
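Flat objects are also easy to reorder before a worksheet is generated. As an optional step that is not part of the original example, the sketch below sorts a copy of the rows by birthday. It assumes every `birthday` is a `YYYY-MM-DD` string, which compares chronologically as plain text; the two sample rows and the `by_birthday` name are for illustration only:

```js
// Sample rows in the flattened shape, deliberately out of order
const rows = [
  { name: "John Adams", birthday: "1735-10-19" },
  { name: "George Washington", birthday: "1732-02-22" }
];

// Sort a copy so the original array is left untouched.
// YYYY-MM-DD strings sort chronologically with plain string comparison.
const by_birthday = rows.slice().sort((a, b) =>
  a.birthday < b.birthday ? -1 : a.birthday > b.birthday ? 1 : 0
);

console.log(by_birthday.map(row => row.name)); // [ 'George Washington', 'John Adams' ]
```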
					
						
**Extract Data**

With the cleaned dataset, `XLSX.utils.json_to_sheet` generates a worksheet:

```js
const worksheet = XLSX.utils.json_to_sheet(rows);
```

`XLSX.utils.book_new` creates a new workbook and `XLSX.utils.book_append_sheet`
appends a worksheet to the workbook. The new worksheet will be called "Dates":

```js
const workbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");
```
					
						
**Process Data**

_Fixing headers_

By default, `json_to_sheet` creates a worksheet with a header row. In this case,
the headers come from the JS object keys: "name" and "birthday".

The headers are in cells A1 and B1.  `XLSX.utils.sheet_add_aoa` can write text
values to the existing worksheet starting at cell A1:

```js
XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });
```

_Fixing Column Widths_

Some of the names are longer than the default column width.  Column widths are
set by [setting the `"!cols"` worksheet property](#row-and-column-properties).

The following line sets the width of column A to approximately 10 characters:

```js
worksheet["!cols"] = [ { wch: 10 } ]; // set column A width to 10 characters
```

One `Array#reduce` call over `rows` can calculate the maximum width:

```js
const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
worksheet["!cols"] = [ { wch: max_width } ];
```
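The snippet above only sizes column A. A possible generalization, sketched here in plain JS, computes one width per key and uses the header text as the minimum width. The `col_widths` helper and its `keys` and `headers` parameters are hypothetical names, not library functions:

```js
// Hypothetical helper: one { wch } entry per key, at least as wide as the header
function col_widths(rows, keys, headers) {
  return keys.map((key, i) => ({
    wch: rows.reduce((w, r) => Math.max(w, String(r[key]).length), headers[i].length)
  }));
}

const rows = [
  { name: "George Washington", birthday: "1732-02-22" },
  { name: "John Adams", birthday: "1735-10-19" }
];
const cols = col_widths(rows, ["name", "birthday"], ["Name", "Birthday"]);

console.log(cols); // [ { wch: 17 }, { wch: 10 } ]
```

The resulting array has the same shape as the value assigned to `worksheet["!cols"]` above, so it can be assigned in its place.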
					
						
Note: If the starting point was a file or HTML table, `XLSX.utils.sheet_to_json`
will generate an array of JS objects.

**Package and Release Data**

`XLSX.writeFile` creates a spreadsheet file and tries to write it to the system.
In the browser, it will try to prompt the user to download the file.  In NodeJS,
it will write the file to the current working directory.

```js
XLSX.writeFile(workbook, "Presidents.xlsx");
```
					
						
**Complete Example**

```js
// Uncomment the next line for use in NodeJS:
// const XLSX = require("xlsx"), axios = require("axios");

(async() => {
  /* fetch JSON data and parse */
  const url = "https://theunitedstates.io/congress-legislators/executive.json";
  const raw_data = (await axios(url, {responseType: "json"})).data;

  /* filter for the Presidents */
  const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));

  /* flatten objects */
  const rows = prez.map(row => ({
    name: row.name.first + " " + row.name.last,
    birthday: row.bio.birthday
  }));

  /* generate worksheet and workbook */
  const worksheet = XLSX.utils.json_to_sheet(rows);
  const workbook = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");

  /* fix headers */
  XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });

  /* calculate column width */
  const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
  worksheet["!cols"] = [ { wch: max_width } ];

  /* create an XLSX file and try to save to Presidents.xlsx */
  XLSX.writeFile(workbook, "Presidents.xlsx");
})();
```
					
						
For use in the web browser, assuming the snippet is saved to `snippet.js`,
script tags should be used to include the `axios` and `xlsx` standalone builds:

```html
<script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
<script src="https://unpkg.com/axios/dist/axios.min.js"></script>
<script src="snippet.js"></script>
```

</details>
					
						
_File formats are implementation details_

The parser covers a wide gamut of common spreadsheet file formats to ensure that
"HTML-saved-as-XLS" files work as well as actual XLS or XLSX files.

The writer supports a number of common output formats for broad compatibility
with the data ecosystem.

To the greatest extent possible, data processing code should not have to worry
about the specific file formats involved.