forked from sheetjs/docs.sheetjs.com
		
	
		
			
	
	
		
			357 lines
		
	
	
		
			9.7 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
		
		
			
		
	
	
			357 lines
		
	
	
		
			9.7 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
|  | --- | ||
|  | title: Sheets in Ghidra | ||
|  | sidebar_label: Ghidra | ||
|  | pagination_prev: demos/cloud/index | ||
|  | pagination_next: demos/bigdata/index | ||
|  | sidebar_custom_props: | ||
|  |   summary: Generate spreadsheets from Ghidra-generated bitfield tables | ||
|  | --- | ||
|  | 
 | ||
|  | import current from '/version.js'; | ||
|  | import CodeBlock from '@theme/CodeBlock'; | ||
|  | 
 | ||
|  | [Ghidra](https://ghidra-sre.org/) is a software reverse engineering platform | ||
|  | with a robust Java-based extension system. | ||
|  | 
 | ||
|  | [SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing | ||
|  | data from spreadsheets. | ||
|  | 
 | ||
|  | The [Complete Demo](#complete-demo) uses SheetJS to export data from a Ghidra | ||
|  | script. We'll create an extension that loads the [V8](/docs/demos/engines/v8) | ||
|  | JavaScript engine through the Ghidra.js[^1] integration and uses the SheetJS | ||
|  | library to export a bitfield table from Apple Numbers to a XLSX workbook. | ||
|  | 
 | ||
|  | :::note Tested Deployments | ||
|  | 
 | ||
|  | This demo was tested by SheetJS users in the following deployments: | ||
|  | 
 | ||
|  | | Architecture | Ghidra   | Date       | | ||
|  | |:-------------|:---------|:-----------| | ||
|  | | `darwin-arm` | `11.1.2` | 2024-10-13 | | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | ## Integration Details
 | ||
|  | 
 | ||
|  | Ghidra natively supports scripts that are run in Java. JS extension scripts | ||
|  | require a [JavaScript engine](/docs/demos/engines/) with Java bindings. | ||
|  | 
 | ||
|  | Ghidra.js[^1] is a Ghidra integration for [RhinoJS](/docs/demos/engines/rhino), | ||
|  | [GraalJS](/docs/demos/engines/graaljs) and [V8](/docs/demos/engines/v8#java). | ||
|  | The current version uses [the Javet V8 binding](https://www.caoccao.com/Javet). | ||
|  | 
 | ||
|  | ### Loading SheetJS Scripts
 | ||
|  | 
 | ||
|  | The [SheetJS NodeJS module](/docs/getting-started/installation/nodejs) can be | ||
|  | loaded in Ghidra.js scripts using `require`: | ||
|  | 
 | ||
|  | ```js title="Loading SheetJS scripts in Ghidra.js" | ||
|  | const XLSX = require("xlsx"); | ||
|  | ``` | ||
|  | 
 | ||
|  | :::caution pass | ||
|  | 
 | ||
|  | SheetJS NodeJS modules must be installed in a folder in the Ghidra script path! | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | ### Bitfields and Sheets
 | ||
|  | 
 | ||
|  | Binary file formats commonly use bitfields to compactly store a set of Boolean | ||
|  | (true or false) flags. For example, in the XLSB file format, the `BrtRowHdr` | ||
|  | record[^2] encodes [row properties](/docs/csf/features/rowprops). Bit offsets | ||
|  | 91-96 are interpreted as flags marking if a row is hidden or if it is collapsed. | ||
|  | 
 | ||
|  | #### Assembly Implementation
 | ||
|  | 
 | ||
|  | Functions that parse bitfields typically test each bit sequentially: | ||
|  | 
 | ||
|  | ```nasm title="x86_64 sample assembly with mnemonics" | ||
|  |             CASE_1c | ||
|  | 41 0f ba e5 1c  BT         R13D,0x1c | ||
|  | 73 69           JNC        CASE_1d | ||
|  | 
 | ||
|  | ;; .... Do some work here (bit offset 28) | ||
|  | 
 | ||
|  |             CASE_1d | ||
|  | 41 0f ba e5 1d  BT         R13D,0x1d | ||
|  | 73 69           JNC        CASE_1e | ||
|  | 
 | ||
|  | ;; .... Do some work here (bit offset 29) | ||
|  | ``` | ||
|  | 
 | ||
|  | :::note pass | ||
|  | 
 | ||
|  | The assembly is approximated by the following TypeScript snippet: | ||
|  | 
 | ||
|  | ```typescript title="Approximate TypeScript" | ||
|  | /* R13 is a 64-bit register */ | ||
|  | declare let R13: BigInt; | ||
|  | /* NOTE: asm R13D is technically a live binding */ | ||
|  | let R13D: number = Number(R13 & 0xFFFFFFFFn); | ||
|  | 
 | ||
|  | if((R13D >> 28) & 1) { | ||
|  |   // .... Do some work here (bit offset 28) | ||
|  | } | ||
|  | 
 | ||
|  | if((R13D >> 29) & 1) { | ||
|  |   // .... Do some work here (bit offset 29) | ||
|  | } | ||
|  | ``` | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | #### Array of Objects
 | ||
|  | 
 | ||
|  | A bitmask or bit offset can be paired with a description in a JavaScript object. | ||
|  | 
 | ||
|  | For example, in the `BrtRowHdr` record, bit offset 92 indicates whether the row | ||
|  | is hidden (if the bit is set) or visible (if the bit is not set). The offset and | ||
|  | description can be stored as fields in an object: | ||
|  | 
 | ||
|  | ```js title="Sample metadata for BrtRowHdr offset 92" | ||
|  | const metadata_92 = { Offset: 92, Description: "Hidden flag" }; | ||
|  | ``` | ||
|  | 
 | ||
|  | Each object can be stored in an array: | ||
|  | 
 | ||
|  | ```js title="Array of sample metadata for BrtRowHdr" | ||
|  | const metadata = [ | ||
|  |   { Offset: 91, Description: "Collapsed flag" }, | ||
|  |   { Offset: 92, Description: "Hidden flag" }, | ||
|  |   // ... | ||
|  | ]; | ||
|  | ``` | ||
|  | 
 | ||
|  | This is an ["Array of Objects"](/docs/api/utilities/array#arrays-of-objects). | ||
|  | The SheetJS `json_to_sheet` method[^3] can generate a SheetJS worksheet object | ||
|  | from the array: | ||
|  | 
 | ||
|  | ```js title="Generating a worksheet from the metadata" | ||
|  | const ws = XLSX.utils.json_to_sheet(metadata); | ||
|  | ``` | ||
|  | 
 | ||
|  | The SheetJS `book_new` method[^4] generates a SheetJS workbook object that can | ||
|  | be written to the filesystem using the `writeFile` method[^5]: | ||
|  | 
 | ||
|  | ```js title="Exporting the worksheet to file" | ||
|  | const wb = XLSX.utils.book_new(ws, "Offsets"); | ||
|  | XLSX.utils.writeFile(wb, "SheetJSGhidra.xlsx"); | ||
|  | ``` | ||
|  | 
 | ||
|  | ### Java Binding
 | ||
|  | 
 | ||
|  | Ghidra.js exposes a number of globals for interacting with Ghidra, including: | ||
|  | 
 | ||
|  | - `currentProgram`: information about the loaded program. | ||
|  | - `JavaHelper`: Java helper to load classes. | ||
|  | 
 | ||
|  | Ghidra.js automatically bridges instance methods to Java method calls. It also | ||
|  | handles the plugin and file extension details. | ||
|  | 
 | ||
|  | #### Launching the Decompiler
 | ||
|  | 
 | ||
|  | `ghidra.app.decompiler.DecompInterface` is the primary Java interface to the | ||
|  | decompiler. In Ghidra.js, `JavaHelper.getClass` will load the class. | ||
|  | 
 | ||
|  | _Java_ | ||
|  | 
 | ||
|  | ```java title="Launch decompiler process in Java (snippet)" | ||
|  | import ghidra.app.script.GhidraScript; | ||
|  | import ghidra.app.decompiler.DecompInterface; | ||
|  | import ghidra.program.model.listing.Program; | ||
|  | 
 | ||
|  | public class SheetZilla extends GhidraScript { | ||
|  |   @Override public void run() throws Exception { | ||
|  |     DecompInterface ifc = new DecompInterface(); | ||
|  |     boolean success = ifc.openProgram(currentProgram); | ||
|  |     /* ... do work here ... */ | ||
|  |   } | ||
|  | } | ||
|  | ``` | ||
|  | 
 | ||
|  | _Ghidra.js_ | ||
|  | 
 | ||
|  | ```js title="Launch decompiler process in Ghidra.js" | ||
|  | const DecompInterface = JavaHelper.getClass('ghidra.app.decompiler.DecompInterface'); | ||
|  | const decompiler = new DecompInterface(); | ||
|  | decompiler.openProgram(currentProgram); | ||
|  | ``` | ||
|  | 
 | ||
|  | #### Identifying a Function
 | ||
|  | 
 | ||
|  | The `getGlobalSymbols` method of a symbol table instance will return an array of | ||
|  | symbols matching the given name: | ||
|  | 
 | ||
|  | ```js | ||
|  | /* name of function to find */ | ||
|  | const fname = 'MyMethod'; | ||
|  | 
 | ||
|  | /* find symbols matching the name */ | ||
|  | // highlight-next-line | ||
|  | const fsymbs = currentProgram.getSymbolTable().getGlobalSymbols(fname); | ||
|  | 
 | ||
|  | /* get first result */ | ||
|  | const fsymb = fsymbs[0]; | ||
|  | ``` | ||
|  | 
 | ||
|  | The `getFunctionAt` method of a function manager instance will take an address | ||
|  | and return a reference to a function: | ||
|  | 
 | ||
|  | ```js | ||
|  | /* get address */ | ||
|  | const faddr = fsymb.getAddress(); | ||
|  | 
 | ||
|  | /* find function */ | ||
|  | // highlight-next-line | ||
|  | const fn = currentProgram.getFunctionManager().getFunctionAt(faddr); | ||
|  | ``` | ||
|  | 
 | ||
|  | #### Decompiling a Function
 | ||
|  | 
 | ||
|  | The `decompileFunction` method attempts to decompile the referenced function: | ||
|  | 
 | ||
|  | ```js | ||
|  | /* decompile function */ | ||
|  | // highlight-next-line | ||
|  | const decomp = decompiler.decompileFunction(fn, 10000, null); | ||
|  | ``` | ||
|  | 
 | ||
|  | Once decompiled, it is possible to retrieve the decompiled C code: | ||
|  | 
 | ||
|  | ```js | ||
|  | /* get generated C code */ | ||
|  | const src = decomp.getDecompiledFunction().getC(); | ||
|  | ``` | ||
|  | 
 | ||
|  | ## Complete Demo
 | ||
|  | 
 | ||
|  | In this demo, we will inspect the `_TSTCellToCellStorage` method within the | ||
|  | `TSTables` framework of Apple Numbers 14.2. This particular method handles | ||
|  | serialization of cells to the NUMBERS file format. | ||
|  | 
 | ||
|  | The implementation has a number of blocks which look like the following script: | ||
|  | 
 | ||
|  | ```js | ||
|  | if(flags >> 0x0d & 1) { | ||
|  |   const field = "numberFormatID"; | ||
|  |   const current_value = cell[field]; | ||
|  |   // ... check if current_value is set, do other stuff | ||
|  | } | ||
|  | ``` | ||
|  | 
 | ||
|  | Based on the bit offset and the field name, we will generate the following row: | ||
|  | 
 | ||
|  | ```js | ||
|  | const mask = 1 << 0x0d; // = 8192 = 0x2000 | ||
|  | const name = "number format ID"; | ||
|  | const row = { Mask: "0x" + mask.toString(16), "Internal Name": name }; | ||
|  | ``` | ||
|  | 
 | ||
|  | Rows will be generated for each block and the final dataset will be exported. | ||
|  | 
 | ||
|  | ### System Setup
 | ||
|  | 
 | ||
|  | 0) Install Ghidra, Xcode, and Apple Numbers. | ||
|  | 
 | ||
|  | <details> | ||
|  |   <summary><b>Installation Notes</b> (click to show)</summary> | ||
|  | 
 | ||
|  | On macOS, Ghidra was installed using Homebrew: | ||
|  | 
 | ||
|  | ```bash | ||
|  | brew install --cask ghidra | ||
|  | ``` | ||
|  | 
 | ||
|  | </details> | ||
|  | 
 | ||
|  | 1) Add the base Ghidra folder to the PATH variable. The following shell command | ||
|  | adds to the path for the current `zsh` or `bash` session: | ||
|  | 
 | ||
|  | ```bash | ||
|  | export PATH="$PATH":$(dirname $(realpath `which ghidraRun`)) | ||
|  | ``` | ||
|  | 
 | ||
|  | 2) Install `ghidra.js` globally: | ||
|  | 
 | ||
|  | ```bash | ||
|  | npm install -g ghidra.js | ||
|  | ``` | ||
|  | 
 | ||
|  | :::note pass | ||
|  | 
 | ||
|  | If the install fails with a permissions issue, install with the root user: | ||
|  | 
 | ||
|  | ```bash | ||
|  | sudo npm install -g ghidra.js | ||
|  | ``` | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | ### Program Preparation
 | ||
|  | 
 | ||
|  | 3) Create a temporary folder to hold the Ghidra project: | ||
|  | 
 | ||
|  | ```bash | ||
|  | mkdir -p /tmp/sheetjs-ghidra | ||
|  | ``` | ||
|  | 
 | ||
|  | 4) Copy the `TSTables` framework to the current directory: | ||
|  | 
 | ||
|  | ```bash | ||
|  | cp /Applications/Numbers.app/Contents/Frameworks/TSTables.framework/Versions/Current/TSTables . | ||
|  | ``` | ||
|  | 
 | ||
|  | 5) Create a "thin" binary by extracting the `x86_64` part of the framework: | ||
|  | 
 | ||
|  | ```bash | ||
|  | lipo TSTables -thin x86_64 -output TSTables.macho | ||
|  | ``` | ||
|  | 
 | ||
|  | :::info pass | ||
|  | 
 | ||
|  | When this demo was last tested, the headless analyzer did not support Mach-O fat | ||
|  | binaries. `lipo` creates a new binary with support for one architecture. | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | 6) Analyze the program: | ||
|  | 
 | ||
|  | ```bash | ||
|  | $(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -import TSTables.macho | ||
|  | ``` | ||
|  | 
 | ||
|  | :::note pass | ||
|  | 
 | ||
|  | This process may take a while and print a number of Java stacktraces. The errors | ||
|  | can be ignored. | ||
|  | 
 | ||
|  | ::: | ||
|  | 
 | ||
|  | ### SheetJS Integration
 | ||
|  | 
 | ||
|  | 7) Download [`sheetjs-ghidra.js`](pathname:///ghidra/sheetjs-ghidra.js): | ||
|  | 
 | ||
|  | ```bash | ||
|  | curl -LO https://docs.sheetjs.com/ghidra/sheetjs-ghidra.js | ||
|  | ``` | ||
|  | 
 | ||
|  | 8) Install the [SheetJS NodeJS module](/docs/getting-started/installation/nodejs): | ||
|  | 
 | ||
|  | <CodeBlock language="bash">{`\ | ||
|  | npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz`} | ||
|  | </CodeBlock> | ||
|  | 
 | ||
|  | 9) Run the script: | ||
|  | 
 | ||
|  | ```bash | ||
|  | $(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -process TSTables.macho -noanalysis -scriptPath `pwd` -postScript sheetjs-ghidra.js | ||
|  | ``` | ||
|  | 
 | ||
|  | 10) Open the generated `SheetJSGhidraTSTCell.xlsx` spreadsheet. | ||
|  | 
 | ||
|  | [^1]: The project does not have a website. The [source repository](https://github.com/vaguue/Ghidra.js) is publicly available. | ||
|  | [^2]: `BrtRowHdr` is defined in the [`MS-XLSB` specification](/docs/miscellany/references) | ||
|  | [^3]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input) | ||
|  | [^4]: See [`book_new` in "Utilities"](/docs/api/utilities/wb) | ||
|  | [^5]: See [`writeFile` in "Writing Files"](/docs/api/write-options) |