304 lines
		
	
	
		
			8.6 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
		
		
			
		
	
	
			304 lines
		
	
	
		
			8.6 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| 
								 | 
							
								---
							 | 
						||
| 
								 | 
							
								title: Stream Export
							 | 
						||
| 
								 | 
							
								sidebar_position: 11
							 | 
						||
| 
								 | 
							
								hide_table_of_contents: true
							 | 
						||
| 
								 | 
							
								---
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								import Tabs from '@theme/Tabs';
							 | 
						||
| 
								 | 
							
								import TabItem from '@theme/TabItem';
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Many platforms offer methods to write files. These methods typically expect the
							 | 
						||
| 
								 | 
							
								entire file to be generated before writing. Large workbook files may exceed
							 | 
						||
| 
								 | 
							
								platform-specific size limits.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Some platforms also offer a "streaming" or "incremental" approach. Instead of
							 | 
						||
| 
								 | 
							
								writing the entire file at once, these methods can accept small chunks of data
							 | 
						||
| 
								 | 
							
								and incrementally write to the filesystem.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The [Streaming Write](/docs/demos/bigdata/stream#streaming-write) demo includes
							 | 
						||
| 
								 | 
							
								live browser demos and notes for platforms that do not support SheetJS streams.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								:::tip pass
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								This feature was expanded in version `0.20.3`. It is strongly recommended to
							 | 
						||
| 
								 | 
							
								[upgrade to the latest version](/docs/getting-started/installation/).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								:::
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								## Streaming Basics
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								SheetJS streams use the NodeJS push streams API. It is strongly recommended to
							 | 
						||
| 
								 | 
							
								review the official NodeJS "Stream" documentation[^1].
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<details>
							 | 
						||
| 
								 | 
							
								  <summary><b>Historical Note</b> (click to show)</summary>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								NodeJS push streams were introduced in 2012. The text streaming methods `to_csv`
							 | 
						||
| 
								 | 
							
								and `to_html` are supported in NodeJS v0.10 and later while the object streaming
							 | 
						||
| 
								 | 
							
								method `to_json` is supported in NodeJS v0.12 and later.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The first SheetJS streaming write function, `to_csv`, was introduced in 2017. It
							 | 
						||
| 
								 | 
							
								used and still uses the battle-tested NodeJS streaming API.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Years later, browser vendors opted to standardize a different stream API.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For maximal compatibility, the library uses NodeJS push streams.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								</details>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								#### NodeJS ECMAScript Module Support
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In CommonJS modules, libraries can load the `stream` module using `require`.
							 | 
						||
| 
								 | 
							
								SheetJS libraries will load streaming support where applicable.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Due to ESM limitations, libraries cannot freely import the `stream` module.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								:::danger ECMAScript Module Limitations
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The original specification only supported top-level imports:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								import { Readable } from 'stream';
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								If a module is unavailable, there is no way for scripts to gracefully fail or
							 | 
						||
| 
								 | 
							
								ignore the error.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								---
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Patches to the specification added two different solutions to the problem:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								- "dynamic imports" will throw errors that can be handled by libraries. Dynamic
							 | 
						||
| 
								 | 
							
								imports will taint APIs that do not use Promise-based methods.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								/* Readable will be undefined if stream cannot be imported */
							 | 
						||
| 
								 | 
							
								const Readable = await (async() => {
							 | 
						||
| 
								 | 
							
								  try {
							 | 
						||
| 
								 | 
							
								    return (await import("stream"))?.Readable;
							 | 
						||
| 
								 | 
							
								  } catch(e) { /* silently ignore error */ }
							 | 
						||
| 
								 | 
							
								})();
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								- "import maps" control module resolution, allowing library users to manually
							 | 
						||
| 
								 | 
							
								shunt unsupported modules.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**These patches were released after browsers adopted ESM!** A number of browsers
							 | 
						||
| 
								 | 
							
								and other platforms support top-level imports but do not support the patches.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								---
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**Due to ESM woes, it is strongly recommended to use CommonJS when possible!**
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								:::
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For maximal platform support, SheetJS libraries expose a special `set_readable`
							 | 
						||
| 
								 | 
							
								method to provide a `Readable` implementation:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="SheetJS NodeJS ESM streaming support"
							 | 
						||
| 
								 | 
							
								import { stream as SheetJStream } from 'xlsx';
							 | 
						||
| 
								 | 
							
								import { Readable } from 'stream';
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								SheetJStream.set_readable(Readable);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								## Worksheet Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The worksheet export methods accept a SheetJS worksheet object.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								### CSV Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**Export worksheet data in "Comma-Separated Values" (CSV)**
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								var csvstream = XLSX.stream.to_csv(ws, opts);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								`to_csv` creates a NodeJS text stream. The options mirror the non-streaming
							 | 
						||
| 
								 | 
							
								[`sheet_to_csv`](/docs/api/utilities/csv#delimiter-separated-output) method.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The following NodeJS script fetches https://docs.sheetjs.com/pres.numbers and
							 | 
						||
| 
								 | 
							
								streams CSV rows to the terminal.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<Tabs groupId="mod">
							 | 
						||
| 
								 | 
							
								  <TabItem value="cjs" label="CommonJS">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming CSV Print Example"
							 | 
						||
| 
								 | 
							
								const XLSX = require("xlsx");
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(async() => {
							 | 
						||
| 
								 | 
							
								  var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								  var wb = XLSX.read(ab);
							 | 
						||
| 
								 | 
							
								  var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								  XLSX.stream.to_csv(ws).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								})();
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								  <TabItem value="esm" label="ESM">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming CSV Print Example"
							 | 
						||
| 
								 | 
							
								import { read, stream } from "xlsx";
							 | 
						||
| 
								 | 
							
								import { Readable } from "stream";
							 | 
						||
| 
								 | 
							
								stream.set_readable(Readable);
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								var wb = read(ab);
							 | 
						||
| 
								 | 
							
								var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								stream.to_csv(ws).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								</Tabs>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								### JSON Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**Export worksheet data to "Arrays of Arrays" or "Arrays of Objects"**
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								var jsonstream = XLSX.stream.to_json(ws, opts);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								`to_json` creates a NodeJS object stream. The options mirror the non-streaming
							 | 
						||
| 
								 | 
							
								[`sheet_to_json`](/docs/api/utilities/array#array-output) method.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The following NodeJS script fetches https://docs.sheetjs.com/pres.numbers and
							 | 
						||
| 
								 | 
							
								streams JSON rows to the terminal. A `Transform`[^2] stream generates text from
							 | 
						||
| 
								 | 
							
								the object streams.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<Tabs groupId="mod">
							 | 
						||
| 
								 | 
							
								  <TabItem value="cjs" label="CommonJS">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming Objects Print Example"
							 | 
						||
| 
								 | 
							
								const XLSX = require("xlsx")
							 | 
						||
| 
								 | 
							
								const { Transform } = require("stream");
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								/* this Transform stream converts JS objects to text */
							 | 
						||
| 
								 | 
							
								var conv = new Transform({writableObjectMode:true});
							 | 
						||
| 
								 | 
							
								conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(async() => {
							 | 
						||
| 
								 | 
							
								  var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								  var wb = XLSX.read(ab);
							 | 
						||
| 
								 | 
							
								  var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								  XLSX.stream.to_json(ws, {raw: true}).pipe(conv).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								})();
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								  <TabItem value="esm" label="ESM">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming Objects Print Example"
							 | 
						||
| 
								 | 
							
								import { read, stream } from "xlsx";
							 | 
						||
| 
								 | 
							
								import { Readable, Transform } from "stream";
							 | 
						||
| 
								 | 
							
								stream.set_readable(Readable);
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								/* this Transform stream converts JS objects to text */
							 | 
						||
| 
								 | 
							
								var conv = new Transform({writableObjectMode:true});
							 | 
						||
| 
								 | 
							
								conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								var wb = read(ab);
							 | 
						||
| 
								 | 
							
								var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								stream.to_json(ws, {raw: true}).pipe(conv).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								</Tabs>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								### HTML Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**Export worksheet data to HTML TABLE**
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								var htmlstream = XLSX.stream.to_html(ws, opts);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								`to_html` creates a NodeJS text stream. The options mirror the non-streaming
							 | 
						||
| 
								 | 
							
								[`sheet_to_html`](/docs/api/utilities/html#html-table-output) method.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The following NodeJS script fetches https://docs.sheetjs.com/pres.numbers and
							 | 
						||
| 
								 | 
							
								streams HTML TABLE rows to the terminal.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<Tabs groupId="mod">
							 | 
						||
| 
								 | 
							
								  <TabItem value="cjs" label="CommonJS">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming HTML Print Example"
							 | 
						||
| 
								 | 
							
								const XLSX = require("xlsx");
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(async() => {
							 | 
						||
| 
								 | 
							
								  var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								  var wb = XLSX.read(ab);
							 | 
						||
| 
								 | 
							
								  var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								  XLSX.stream.to_html(ws).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								})();
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								  <TabItem value="esm" label="ESM">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming HTML Print Example"
							 | 
						||
| 
								 | 
							
								import { read, stream } from "xlsx";
							 | 
						||
| 
								 | 
							
								import { Readable } from "stream";
							 | 
						||
| 
								 | 
							
								stream.set_readable(Readable);
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								var wb = read(ab);
							 | 
						||
| 
								 | 
							
								var ws = wb.Sheets[wb.SheetNames[0]];
							 | 
						||
| 
								 | 
							
								stream.to_html(ws).pipe(process.stdout);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								</Tabs>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								## Workbook Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The workbook export methods accept a SheetJS workbook object.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								### XLML Export
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								**Export workbook data to SpreadsheetML2003 XML files**
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js
							 | 
						||
| 
								 | 
							
								var xlmlstream = XLSX.stream.to_xlml(wb, opts);
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								`to_xlml` creates a NodeJS text stream. The options mirror the non-streaming
							 | 
						||
| 
								 | 
							
								[`write`](/docs/api/write-options) method using the `xlml` book type.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The following NodeJS script fetches https://docs.sheetjs.com/pres.numbers and
							 | 
						||
| 
								 | 
							
								writes a SpreadsheetML2003 workbook to `SheetJStream.xml.xls`:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<Tabs groupId="mod">
							 | 
						||
| 
								 | 
							
								  <TabItem value="cjs" label="CommonJS">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming XLML Write Example"
							 | 
						||
| 
								 | 
							
								const XLSX = require("xlsx"), fs = require("fs");
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(async() => {
							 | 
						||
| 
								 | 
							
								  var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								  var wb = XLSX.read(ab);
							 | 
						||
| 
								 | 
							
								  XLSX.stream.to_xlml(wb).pipe(fs.createWriteStream("SheetJStream.xml.xls"));
							 | 
						||
| 
								 | 
							
								})();
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								  <TabItem value="esm" label="ESM">
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								```js title="Streaming XLML Write Example"
							 | 
						||
| 
								 | 
							
								import { read, stream } from "xlsx";
							 | 
						||
| 
								 | 
							
								import { Readable } from "stream";
							 | 
						||
| 
								 | 
							
								stream.set_readable(Readable);
							 | 
						||
| 
								 | 
							
								import { createWriteStream } from "fs";
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								var ab = await (await fetch("https://docs.sheetjs.com/pres.numbers")).arrayBuffer()
							 | 
						||
| 
								 | 
							
								var wb = read(ab);
							 | 
						||
| 
								 | 
							
								stream.to_xlml(wb).pipe(createWriteStream("SheetJStream.xml.xls"));
							 | 
						||
| 
								 | 
							
								```
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								  </TabItem>
							 | 
						||
| 
								 | 
							
								</Tabs>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[^1]: See ["Stream"](https://nodejs.org/api/stream.html) in the NodeJS documentation.
							 | 
						||
| 
								 | 
							
								[^2]: See [`Transform`](https://nodejs.org/api/stream.html#class-streamtransform) in the NodeJS documentation.
							 |