---
title: Data Processing with JE
sidebar_label: Perl + JE
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---

import current from '/version.js';
import CodeBlock from '@theme/CodeBlock';
					
						
:::danger pass

In a production application, it is strongly recommended to use a binding for a
C engine like [`JavaScript::Duktape`](/docs/demos/engines/duktape#perl).

:::
					
						
[`JE`](https://metacpan.org/pod/JE) is a pure-Perl JavaScript engine.

[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.

This demo uses JE and SheetJS to pull data from a spreadsheet and print CSV
rows. We'll explore how to load SheetJS in a JE context and process spreadsheets
from Perl scripts.

The ["Complete Example"](#complete-example) section includes a complete script
for reading data from XLS files, printing CSV rows, and writing FODS workbooks.
					
						
## Integration Details

The [SheetJS ExtendScript build](/docs/getting-started/installation/extendscript)
can be parsed and evaluated in a JE context.

The engine deviates from ES3. Modifying prototypes can fix some behavior:
					
						
<details>
  <summary><b>Required shim to support JE</b> (click to show)</summary>

The following features are implemented:

- simple string `charCodeAt`
- Number `charCodeAt` (to work around string `split` bug)
- String `match` (to work around a bug when there are no matches)

```js title="Required shim to support JE"
/* String#charCodeAt is missing */
var string = "";
for(var i = 0; i < 256; ++i) string += String.fromCharCode(i);
String.prototype.charCodeAt = function(n) {
  var result = string.indexOf(this.charAt(n));
  if(result == -1) throw this.charAt(n);
  return result;
};

/* workaround for String split bug */
Number.prototype.charCodeAt = function(n) { return this + 48; };

/* String#match bug with empty results */
String.prototype.old_match = String.prototype.match;
String.prototype.match = function(p) {
  var result = this.old_match(p);
  return (Array.isArray(result) && result.length == 0) ? null : result;
};
```

</details>
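
The `charCodeAt` shim relies on a lookup string in which the character at index
`i` has character code `i`, so `indexOf` recovers the code. The following
standalone sketch shows the same idea without modifying `String.prototype`. The
names `TABLE` and `lookup_char_code` are illustrative and are not part of the
demo script:

```js
/* lookup string: the character at index i has character code i */
var TABLE = "";
for(var i = 0; i < 256; ++i) TABLE += String.fromCharCode(i);

/* hypothetical helper that mirrors the shim logic */
function lookup_char_code(str, n) {
  var ch = str.charAt(n);
  var result = TABLE.indexOf(ch);
  /* characters with codes above 255 are not in the table */
  if(result == -1) throw new Error("unsupported character: " + ch);
  return result;
}
```

As with the shim, only character codes 0 through 255 are supported. An
out-of-range index yields `0` because `charAt` returns an empty string and
`indexOf("")` is `0`.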
					
						
When loading the ExtendScript build, the UTF-8 byte order mark (BOM) must be
removed before the script is evaluated:

```perl
use JE;
use File::Slurp;

my $je = JE->new; ## JavaScript engine instance used by the snippets below

## Load SheetJS source
my $src = read_file('xlsx.extendscript.js', { binmode => ':raw' });
$src =~ s/^\xEF\xBB\xBF//; ## remove UTF8 BOM
my $XLSX = $je->eval($src);
```
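
After the script is evaluated, it can be useful to confirm that the library
actually loaded. The sketch below assumes that the ExtendScript build defines a
global `XLSX` object with a `version` property. The function name
`sheetjsversion` is hypothetical:

```js
/* hypothetical helper: report the loaded SheetJS version */
function sheetjsversion() {
  /* typeof is safe even when the global was never defined */
  if(typeof XLSX == "undefined") return "XLSX is not defined";
  return String(XLSX.version);
}
```

The function can be defined with `$je->eval` and called from Perl with
`$je->method`. If the result is `XLSX is not defined`, the script most likely
failed to parse, for example because the BOM was not removed.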
					
						
### Reading Files

File data should be passed from Perl to the JE context as Base64 strings. The
helper function returns the error message as a string if parsing fails:

```perl
use File::Slurp;
use MIME::Base64 qw( encode_base64 );

## Set up conversion method
$je->eval(<<'EOF');
function sheetjsparse(data) { try {
  return XLSX.read(String(data), {type: "base64", WTF:1});
} catch(e) { return String(e); } }
EOF

## Read file
my $raw_data = encode_base64(read_file($ARGV[0], { binmode => ':raw' }), "");

## Call method with data
my $return_val = $je->method(sheetjsparse => $raw_data);
```
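
The parsed workbook stays inside the JE context, so further processing is
easiest to do in JavaScript. The sketch below generates CSV text for the first
worksheet using `XLSX.utils.sheet_to_csv`. It assumes that SheetJS is already
loaded as the global `XLSX`. The function name `sheetjscsv` is hypothetical and
may differ from what the demo script uses:

```js
/* hypothetical helper: CSV text for the first worksheet of a workbook */
function sheetjscsv(wb) {
  /* sheetjsparse returns a string instead of a workbook when parsing fails */
  if(typeof wb == "string") return "parse error: " + wb;
  var ws = wb.Sheets[wb.SheetNames[0]];
  return XLSX.utils.sheet_to_csv(ws);
}
```

The value returned by `sheetjsparse` can be passed straight back to this
function with `$je->method`, and the resulting string can be printed from Perl.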
					
						
### Writing Files

Due to bugs in data interchange, it is strongly recommended to use a simple
format like `.fods`:

```perl
use File::Slurp;

## Set up conversion method
$je->eval(<<'EOF');
function sheetjswrite(wb) { try {
  return XLSX.write(wb, { WTF:1, bookType: "fods", type: "string" });
} catch(e) { return String(e); } }
EOF

## Generate file
my $fods = $je->method(sheetjswrite => $workbook);

## Write to filesystem
write_file("SheetJE.fods", $fods);
```
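
Because `sheetjswrite` returns the error message as a string when writing
fails, that message would otherwise be saved as the file contents. A rough
check, assuming the FODS output contains the Flat OpenDocument root element
`office:document`, can catch this before the file is written. The function name
`looks_like_fods` is hypothetical:

```js
/* hypothetical check: does the text look like a Flat OpenDocument file? */
function looks_like_fods(text) {
  /* error messages from sheetjswrite will not contain the root element */
  return typeof text == "string" && text.indexOf("<office:document") != -1;
}
```

This is a heuristic, not a validator. It only distinguishes plausible XML
output from an error message.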
					
						
## Complete Example

:::note Tested Deployments

This demo was tested in the following deployments:

| Architecture | Version | Date       |
|:-------------|:--------|:-----------|
| `darwin-x64` | `0.066` | 2024-12-17 |
| `darwin-arm` | `0.066` | 2024-05-25 |
| `linux-x64`  | `0.066` | 2025-01-10 |
| `linux-arm`  | `0.066` | 2024-05-25 |

:::
					
						
1) Install `JE` and `File::Slurp` through CPAN:

```bash
cpan install JE File::Slurp
```

:::note pass

Some test runs failed with permission errors:

```
mkdir /Library/Perl/5.30/File: Permission denied at /System/Library/Perl/5.30/ExtUtils/Install.pm line 489.
```

In those cases, the install command should be run through `sudo`:

```bash
sudo cpan install JE File::Slurp
```

:::
					
						
2) Download the [SheetJS ExtendScript build](/docs/getting-started/installation/extendscript):

<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.extendscript.js`}
</CodeBlock>
					
						
3) Download the demo [`SheetJE.pl`](pathname:///perl/SheetJE.pl):

```bash
curl -LO https://docs.sheetjs.com/perl/SheetJE.pl
```
					
						
4) Download the [test file](pathname:///cd.xls) and run:

```bash
curl -LO https://docs.sheetjs.com/cd.xls
perl SheetJE.pl cd.xls
```

After a short wait, the contents will be displayed in CSV form. The script will
also generate the spreadsheet `SheetJE.fods`, which can be opened in LibreOffice.