---
title: Synthetic DOM
pagination_prev: demos/net/headless/index
---

import current from '/version.js';
import CodeBlock from '@theme/CodeBlock';
					
						
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.

SheetJS offers three methods to directly process HTML DOM TABLE elements:

- `table_to_sheet`[^1] generates a SheetJS worksheet[^2] from a TABLE element
- `table_to_book`[^3] generates a SheetJS workbook[^4] from a TABLE element
- `sheet_add_dom`[^5] adds data from a TABLE element to an existing worksheet

These methods work in the web browser. NodeJS and other server-side platforms
traditionally lack a DOM implementation, but third-party modules fill the gap.

This demo covers synthetic DOM implementations for non-browser platforms. We'll
explore how to use SheetJS DOM methods in server-side environments to parse
tables and export data to spreadsheets.
					
						
:::tip pass

The most robust approach for server-side processing is to automate a headless
web browser. ["Browser Automation"](/docs/demos/net/headless) includes demos.

:::
					
						
## Integration Details

Synthetic DOM implementations typically provide a function that accepts an HTML
string and returns an object that represents `document`. An API method such as
`getElementsByTagName` or `querySelector` can pull TABLE elements.

```mermaid
flowchart LR
  subgraph Synthetic DOM Operations
    html(HTML\nstring)
    doc{{`document`\nDOM Object}}
  end
  subgraph SheetJS Operations
    table{{DOM\nTable}}
    wb(((SheetJS\nWorkbook)))
    file(workbook\nfile)
  end
  html --> |Library\n\n| doc
  doc --> |DOM\nAPI| table
  table --> |`table_to_book`\n\n| wb
  wb --> |`writeFile`\n\n| file
```
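As a minimal sketch of the "DOM API" step, the helper below looks up the first
TABLE element and falls back to `getElementsByTagName` when `querySelector` is
unavailable. The name `findFirstTable` and the stand-in `fakeDoc` object are
hypothetical and are not part of SheetJS or any DOM library. A real script would
pass the `document` object generated by the library.

```js
/* hypothetical helper: return the first TABLE element of a document-like object */
function findFirstTable(doc) {
  let tbl = null;
  if(typeof doc.querySelector === "function") tbl = doc.querySelector("table");
  else if(typeof doc.getElementsByTagName === "function") tbl = doc.getElementsByTagName("table")[0] || null;
  /* `querySelector` returns null when no element matches */
  if(!tbl) throw new Error("no TABLE element found");
  return tbl;
}

/* stand-in for a library-generated document (assumption for illustration) */
const fakeTable = { nodeName: "TABLE" };
const fakeDoc = { getElementsByTagName: (name) => name === "table" ? [fakeTable] : [] };

const firstTable = findFirstTable(fakeDoc);
```

Failing early with a descriptive error avoids passing `null` to a SheetJS method.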
					
						
SheetJS methods use features that may be missing from some DOM implementations.

### Table rows

The `rows` property of TABLE elements is a live list of the TR rows in the
table: it updates automatically when rows are added or deleted.

SheetJS methods do not mutate `rows`, so a static array is sufficient. Assuming
there are no nested tables, the `rows` property can be created using
`getElementsByTagName`:

```js
tbl.rows = Array.from(tbl.getElementsByTagName("tr"));
```
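If nested tables are possible, one option is to keep only the rows whose nearest
TABLE ancestor is the table being patched. The sketch below assumes elements
expose the standard `parentNode` and `nodeName` properties. `ownRows` and the
`el` mock builder are hypothetical names used only for illustration.

```js
/* hypothetical helper: rows that belong to `tbl` and not to a nested table */
function ownRows(tbl) {
  return Array.from(tbl.getElementsByTagName("tr")).filter(function(row) {
    /* walk up to the nearest TABLE ancestor */
    let node = row.parentNode;
    while(node && String(node.nodeName).toLowerCase() !== "table") node = node.parentNode;
    return node === tbl;
  });
}

/* minimal mock elements (stand-ins for library objects) */
function el(nodeName, children) {
  const node = { nodeName: nodeName, parentNode: null, childNodes: children || [] };
  node.childNodes.forEach(function(child) { child.parentNode = node; });
  /* depth-first descendant search, similar to the DOM method */
  node.getElementsByTagName = function(name) {
    let out = [];
    node.childNodes.forEach(function(child) {
      if(child.nodeName.toLowerCase() === name) out.push(child);
      out = out.concat(child.getElementsByTagName(name));
    });
    return out;
  };
  return node;
}

/* outer table with two rows; the first row holds a nested one-row table */
const inner = el("TABLE", [el("TR", [el("TD")])]);
const outer = el("TABLE", [el("TR", [el("TD", [inner])]), el("TR", [el("TD")])]);
outer.rows = ownRows(outer);
```

In this mock, `getElementsByTagName` finds three TR elements under `outer`, but
only the two rows that belong to the outer table are kept.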
					
						
### Row cells

The `cells` property of TR elements is a live list of the cell children (TD and
TH elements): it updates automatically when cells are added or deleted.

SheetJS methods do not mutate `cells`, so a static array is sufficient. Assuming
there are no nested tables and no TH header cells, the `cells` property can be
created using `getElementsByTagName`:

```js
tbl.rows.forEach(row => row.cells = Array.from(row.getElementsByTagName("td")));
```
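Header cells use TH elements, which the `td` lookup above does not collect. The
following sketch gathers TD and TH children in document order using `childNodes`
and `nodeName`. It assumes the DOM implementation provides both properties.
`rowCells` and the mock row are hypothetical.

```js
/* hypothetical helper: TD and TH children of a TR element, in document order */
function rowCells(row) {
  return Array.from(row.childNodes).filter(function(node) {
    return /^t[dh]$/i.test(node.nodeName);
  });
}

/* mock row with a text node between cells (stand-in for a library object) */
const mockRow = { childNodes: [
  { nodeName: "TH" }, { nodeName: "#text" }, { nodeName: "TD" }
] };
mockRow.cells = rowCells(mockRow);
```

With a patched table, the same helper could be applied to every row with
`tbl.rows.forEach(row => row.cells = rowCells(row))`. Since only direct children
are inspected, cells from nested tables are not collected.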
					
						
## NodeJS

### JSDOM

[JSDOM](https://git.io/jsdom) is a DOM implementation for NodeJS. The synthetic
DOM elements are compatible with SheetJS methods.

The following example scrapes the first table from the file `SheetJSTable.html`
and generates an XLSX workbook:

```js title="SheetJSDOM.js"
const XLSX = require("xlsx");
const { readFileSync } = require("fs");
const { JSDOM } = require("jsdom");

/* obtain HTML string. This example reads from SheetJSTable.html */
const html_str = readFileSync("SheetJSTable.html", "utf8");

// highlight-start
/* get first TABLE element */
const tbl = new JSDOM(html_str).window.document.querySelector("table");

/* generate workbook */
const workbook = XLSX.utils.table_to_book(tbl);
// highlight-end

/* write workbook to SheetJSDOM.xlsx */
XLSX.writeFile(workbook, "SheetJSDOM.xlsx");
```
					
						
<details>
  <summary><b>Complete Demo</b> (click to show)</summary>

:::note Tested Deployments

This demo was last tested on 2024 January 27 against JSDOM `24.0.0`.

:::

1) Install SheetJS and JSDOM libraries:

<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz jsdom@24.0.0`}
</CodeBlock>

2) Save the previous codeblock to `SheetJSDOM.js`.

3) Download [the sample `SheetJSTable.html`](pathname:///dom/SheetJSTable.html):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSTable.html
```

4) Run the script:

```bash
node SheetJSDOM.js
```

The script will create a file `SheetJSDOM.xlsx` that can be opened in a
spreadsheet editor.

</details>
					
						
### HappyDOM

HappyDOM provides a DOM framework for NodeJS. For the tested version (`13.3.1`),
the following patches were needed:

- TABLE `rows` property (explained above)
- TR `cells` property (explained above)

<details>
  <summary><b>Complete Demo</b> (click to show)</summary>

:::note Tested Deployments

This demo was last tested on 2024 January 27 against HappyDOM `13.3.1`.

:::

1) Install SheetJS and HappyDOM libraries:

<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz happy-dom@13.3.1`}
</CodeBlock>

2) Download [the sample script `SheetJSHappyDOM.js`](pathname:///dom/SheetJSHappyDOM.js):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSHappyDOM.js
```

3) Download [the sample `SheetJSTable.html`](pathname:///dom/SheetJSTable.html):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSTable.html
```

4) Run the script:

```bash
node SheetJSHappyDOM.js
```

The script will create a file `SheetJSHappyDOM.xlsx` that can be opened in a
spreadsheet editor.

</details>
					
						
### XMLDOM

[XMLDOM](https://xmldom.org/) provides a DOM framework for NodeJS. For the
tested version (`0.8.10`), the following patches were needed:

- TABLE `rows` property (explained above)
- TR `cells` property (explained above)
- Element `innerHTML` property:

```js
/* `tbl` is a TABLE element; `XMLSerializer` is exported by `@xmldom/xmldom` */
Object.defineProperty(Object.getPrototypeOf(tbl), "innerHTML", { get: function() {
  var outerHTML = new XMLSerializer().serializeToString(this);
  /* a single "<" means the element was serialized as one tag with no content */
  if(outerHTML.match(/</g).length == 1) return "";
  /* drop the closing tag, then strip the opening tag */
  return outerHTML.slice(0, outerHTML.lastIndexOf("</")).replace(/<[^"'>]*(("[^"]*"|'[^']*')[^"'>]*)*>/, "");
}});
```
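The string manipulation in the getter can be checked in isolation. The sketch
below applies the same expressions to plain strings instead of serialized
elements. `innerFromOuter` is a hypothetical name, and the sample strings are
illustrative inputs rather than actual XMLDOM output.

```js
/* hypothetical helper mirroring the getter logic on a plain string */
function innerFromOuter(outerHTML) {
  /* a single "<" means the element was serialized as one tag with no content */
  if(outerHTML.match(/</g).length == 1) return "";
  /* drop the closing tag, then strip the opening tag (quoted attributes may contain ">") */
  return outerHTML.slice(0, outerHTML.lastIndexOf("</")).replace(/<[^"'>]*(("[^"]*"|'[^']*')[^"'>]*)*>/, "");
}

const samples = [
  innerFromOuter('<td/>'),
  innerFromOuter('<td>abc</td>'),
  innerFromOuter('<td title="a>b"><b>bold</b></td>')
];
```

The opening-tag pattern skips over quoted attribute values, so a `>` character
inside an attribute does not end the match early.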
					
						
<details>
  <summary><b>Complete Demo</b> (click to show)</summary>

:::note Tested Deployments

This demo was last tested on 2024 March 12 against XMLDOM `0.8.10`.

:::

1) Install SheetJS and XMLDOM libraries:

<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz @xmldom/xmldom@0.8.10`}
</CodeBlock>

2) Download [the sample script `SheetJSXMLDOM.js`](pathname:///dom/SheetJSXMLDOM.js):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSXMLDOM.js
```

3) Run the script:

```bash
node SheetJSXMLDOM.js
```

The script will create a file `SheetJSXMLDOM.xlsx` that can be opened in a
spreadsheet editor.

</details>
					
						
### CheerioJS

:::caution pass

Cheerio does not support a number of fundamental properties out of the box. They
can be shimmed, but it is strongly recommended to use a more compliant library.

:::

[CheerioJS](https://cheerio.js.org/) provides a DOM-like framework for NodeJS.
[`SheetJSCheerio.js`](pathname:///dom/SheetJSCheerio.js) implements the missing
features to ensure that SheetJS DOM methods can process TABLE elements.

<details>
  <summary><b>Complete Demo</b> (click to show)</summary>

:::note Tested Deployments

This demo was last tested on 2024 March 12 against Cheerio `1.0.0-rc.12`.

:::

1) Install SheetJS and CheerioJS libraries:

<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz cheerio@1.0.0-rc.12`}
</CodeBlock>

2) Download [the sample script `SheetJSCheerio.js`](pathname:///dom/SheetJSCheerio.js):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSCheerio.js
```

3) Download [the sample `SheetJSTable.html`](pathname:///dom/SheetJSTable.html):

```bash
curl -LO https://docs.sheetjs.com/dom/SheetJSTable.html
```

4) Run the script:

```bash
node SheetJSCheerio.js
```

The script will create a file `SheetJSCheerio.xlsx` that can be opened in a
spreadsheet editor.

</details>
					
						
## Other Platforms

### DenoDOM

[DenoDOM](https://deno.land/x/deno_dom) provides a DOM framework for Deno. For
the tested version (`0.1.46`), the following patches were needed:

- TABLE `rows` property (explained above)
- TR `cells` property (explained above)

This example fetches [a sample table](pathname:///dom/SheetJSTable.html):

<CodeBlock language="ts" title="SheetJSDenoDOM.ts">{`\
// @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts"
import * as XLSX from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs';
\n\
import { DOMParser } from 'https://deno.land/x/deno_dom@v0.1.46/deno-dom-wasm.ts';
\n\
const doc = new DOMParser().parseFromString(
  await (await fetch('https://docs.sheetjs.com/dom/SheetJSTable.html')).text(),
  "text/html",
)!;
// highlight-start
const tbl = doc.querySelector("table");
\n\
/* patch DenoDOM element */
tbl.rows = tbl.querySelectorAll("tr");
tbl.rows.forEach(row => row.cells = row.querySelectorAll("td, th"));
\n\
/* generate workbook */
const workbook = XLSX.utils.table_to_book(tbl);
// highlight-end
XLSX.writeFile(workbook, "SheetJSDenoDOM.xlsx");`}
</CodeBlock>
					
						
<details>
  <summary><b>Complete Demo</b> (click to show)</summary>

:::note Tested Deployments

This demo was tested in the following deployments:

| Architecture | DenoDOM | Deno   | Date       |
|:-------------|:--------|:-------|:-----------|
| `darwin-x64` | 0.1.46  | 1.44.4 | 2024-06-19 |
| `darwin-arm` | 0.1.46  | 1.44.4 | 2024-06-19 |

:::

1) Save the previous codeblock to `SheetJSDenoDOM.ts`.

2) Run the script with the `--allow-net` and `--allow-write` permission flags:

```bash
deno run --allow-net --allow-write SheetJSDenoDOM.ts
```

The script will create a file `SheetJSDenoDOM.xlsx` that can be opened in a
spreadsheet editor.

</details>
					
						
[^1]: See [`table_to_sheet` in "HTML" Utilities](/docs/api/utilities/html#create-new-sheet)
[^2]: See ["Worksheet Object" in "SheetJS Data Model"](/docs/csf/sheet) for more details.
[^3]: See [`table_to_book` in "HTML" Utilities](/docs/api/utilities/html#create-new-sheet)
[^4]: See ["Workbook Object" in "SheetJS Data Model"](/docs/csf/book) for more details.
[^5]: See [`sheet_add_dom` in "HTML" Utilities](/docs/api/utilities/html#add-to-sheet)