2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								# Headless Browsers
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-09-24 23:40:09 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The library, eschewing unstable and nascent ECMAScript features, plays nicely
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								with most headless browsers.  This demo shows a few common headless scenarios.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								NodeJS does not ship with its own layout engine.  For advanced HTML exports, a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								headless browser is generally indistinguishable from a browser process.
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								## Chromium Automation with Puppeteer
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-17 21:48:05 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								[Puppeteer](https://pptr.dev/) enables headless Chromium automation.
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								[`html.js`](./html.js) shows a dedicated script for converting an HTML file to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								XLSB using puppeteer.  The first argument is the path to the HTML file.  The
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								script writes to `output.xlsb`:
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								```bash
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								# read from test.html and write to output.xlsb
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								$ node html.js test.html
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								```
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The script pulls up the webpage using headless Chromium and adds a script tag
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								reference to the standalone browser build.  That will make the `XLSX` variable
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								available to future scripts added in the page!  The browser context is not able
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								to save the file using `writeFile`, so the demo generates the XLSB spreadsheet
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								bytes with the `base64` type, sends the string back to the main process, and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								uses `fs.writeFileSync` to write the file.
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								## WebKit Automation with PhantomJS
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								This was tested using [PhantomJS 2.1.1](https://phantomjs.org/download.html)
							 | 
						
					
						
							
								
									
										
										
										
											2017-12-09 07:17:25 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								```bash
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								$ phantomjs phantomjs.js
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								```
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The flow is similar to the Puppeteer flow (scrape table and generate workbook in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								website context, copy string back, write string to file from main process).
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-08 09:50:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The `binary` type generates strings that can be written in PhantomJS using the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								`fs.write` method with mode `"wb"`.
							 | 
						
					
						
							
								
									
										
										
										
											2017-12-09 07:17:25 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								## wkhtmltopdf
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								This was tested in wkhtmltopdf 0.12.4, installed using the official binaries:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								```bash
							 | 
						
					
						
							
								
									
										
										
										
											2021-09-09 06:01:53 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								$ wkhtmltopdf --javascript-delay 20000 http://oss.sheetjs.com/sheetjs/tests/ test.pdf
							 | 
						
					
						
							
								
									
										
										
										
											2017-12-09 07:17:25 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								```
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-08-19 23:06:34 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-09-24 23:40:09 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								[](https://github.com/SheetJS/js-xlsx)
							 |