Defining a custom output language
Ghidra has internal support for different output languages, but currently only lets you select the language through an internal process that runs when it loads the binary. Besides the C output, Ghidra implements Java output for decompiling Java class files. With ReOxide you can register your own output language and force a switch to a different language during decompilation.
When developing your own plugin, you can use the REOXIDE_LANGUAGE macro to register one custom language for the plugin. The limit of one language per plugin follows from the way ReOxide instantiates plugins. Ghidra uses a singleton to register the language capabilities, which causes issues with the initialization order when plugins are loaded dynamically. To avoid this issue, we tie the language capability to the plugin context, whose lifetime ReOxide manages explicitly:
#include <reoxide_interface.hh>
#include <reoxide_plugin.hh>
using namespace ghidra;
using namespace reoxide;
class PrintRustPlugin : public Plugin, public PrintLanguageCapability {
public:
    PrintRustPlugin()
    {
        name = "rust-language";
        isdefault = false;
    }

    virtual ~PrintRustPlugin() { }

    virtual PrintLanguage* buildLanguage(Architecture* glb)
    {
        return new PrintRust(glb, name);
    }
};
REOXIDE_CONTEXT(PrintRustPlugin);
REOXIDE_RULES();
REOXIDE_ACTIONS();
REOXIDE_LANGUAGE("rust-language");
WARNING
The string, in this case "rust-language", that you assign to the name member variable of PrintLanguageCapability needs to match the string provided to REOXIDE_LANGUAGE. Ghidra uses the name variable to find the language and ReOxide uses the macro to register it with the manager. We currently do not check that these strings match automatically.
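Because the match is not checked automatically, one way to lower the risk of a typo is to define the name once and compare it against the registered string yourself. The following is a minimal sketch with stand-in types, not the real Ghidra or ReOxide classes: FakeCapability, FakeRustPlugin, kRustLanguageName and languageNameMatches are all hypothetical names introduced for illustration. Whether REOXIDE_LANGUAGE accepts anything other than a string literal is not stated here, so the sketch performs the comparison at runtime instead of relying on the macro.

```cpp
#include <string>

// Stand-in for the capability base class; only the `name` member matters here.
// This is NOT the Ghidra PrintLanguageCapability type.
struct FakeCapability {
    std::string name;
};

// Hypothetical shared constant: a single definition of the language name.
constexpr const char* kRustLanguageName = "rust-language";

// Stand-in plugin that assigns the shared constant to `name`.
struct FakeRustPlugin : FakeCapability {
    FakeRustPlugin() { name = kRustLanguageName; }
};

// Hypothetical helper: true when the capability's name equals the
// string that was passed to the registration macro.
inline bool languageNameMatches(const FakeCapability& cap,
                                const std::string& registered)
{
    return cap.name == registered;
}
```

A check like this could run in a plugin's own unit tests, so that a mismatch shows up before the plugin is loaded.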
This C++ snippet shows a slightly modified version of the PrintRustPlugin, which we distribute with ReOxide. Its source code gives you a general idea of how to structure a plugin that includes a custom output language. As the main reference for developing a custom output language, examine the printc.cc file that Ghidra includes.
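The reason for tying the language capability to the plugin context can be illustrated in isolation. The sketch below uses only stand-in types (Capability, Context and Manager are hypothetical and do not reflect ReOxide's actual implementation). It shows the general idea: the host owns each context, so a language exists exactly as long as its context does, and nothing depends on the order in which static objects get initialized.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a language capability; not the Ghidra class.
struct Capability {
    std::string name;
    explicit Capability(std::string n) : name(std::move(n)) {}
    virtual ~Capability() = default;
};

// Stand-in for a plugin context that doubles as the language capability.
struct Context : Capability {
    Context() : Capability("rust-language") {}
};

// Hypothetical host-side manager: it owns the contexts and therefore
// controls when each language becomes available and when it goes away.
class Manager {
public:
    void load(std::unique_ptr<Context> ctx)
    {
        contexts_.push_back(std::move(ctx));
    }

    // Returns the capability registered under `name`, or nullptr.
    const Capability* find(const std::string& name) const
    {
        for (const auto& ctx : contexts_)
            if (ctx->name == name)
                return ctx.get();
        return nullptr;
    }

    // Destroys all contexts, and with them their languages.
    void unloadAll() { contexts_.clear(); }

private:
    std::vector<std::unique_ptr<Context>> contexts_;
};
```

With this layout, a lookup only succeeds between load and unloadAll, which is the explicit lifetime the text refers to.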
To switch to a different output language, you can use either the reoxide command or the graphical user interface. First, list the available languages while reoxided runs in the background:
$ reoxide list-languages
This should show output comparable to this:
core: c-language
core: java-language
printmir: mir-language
printrust: rust-language
You can then switch the output language with the force-output-language command:
$ reoxide force-output-language rust-language
After refreshing the Ghidra decompiler interface, you should now see a different output language. To switch back to the default language, run the force-output-language command without a parameter.
ReOxide currently includes two plugins for output languages, rust-language and mir-language. The rust-language plugin tries to produce pseudo-Rust code in a similar manner to other tools like BinaryNinja:
/* verifier::simulator::Simulation::run */
fn verifier::simulator::Simulation::run(param_1: *mut simulation_state) -> u8
{
let lVar1: i64;
let lVar2: i64;
let uVar3: u64;
let bVar4: bool;
let auVar5: [u8; 12];
let player: *mut frame_data;
let py: u32;
let sread: *mut sread;
// ...
local_138 = &local_178 as *mut u8;
sread = param_1.sread;
local_a0 = sread.count;
local_178 = &local_a0;
local_170 = core::fmt::num::imp::_<impl_core::fmt::Display_for_u64>::fmt;
player = &PTR_s_Starting_simulation_for_replay_w_0024d790 as *mut frame_data;
// ...
std::io::stdio::_print(&player);
if sread.count < 0x2711u64 {
step = param_1.step;
if step < sread.count {
goal_x = param_1.goal_x;
goal_y = param_1.goal_y;
// ...
fStack_98 = param_1.field154_0xe4;
uStack_94 = 0.1;
uStack_90 = 0;
uStack_8c = 0;
loop {
uVar14 = -1.0;
lVar11 = sread.data;
frame = lVar11 + step;
quat1.x = frame.x;
quat1.y = frame.y;
quat1.z = frame.z;
quat1.w = frame.w;
quat2.x = frame.x;
quat2.y = frame.y;
// ...
On the other hand, the mir-language plugin tries to mimic Rust MIR. The resemblance is only superficial; in reality the output is a listing of Ghidra P-Code in the form of Rust MIR:
fn run(_1: *mut simulation_state) -> u8 {
debug param_1 => _1;
let mut _0: u8;
let mut _2: *mut u64;
let mut _3: *mut u8;
let mut _4: *mut u64;
let mut _5: *mut u64;
let mut _6: *mut u64;
let mut _7: *mut *mut sread;
let mut _8: *mut sread;
let mut _9: *mut size_t;
// ...
let mut _633: u8;
let mut _634: u8;
let mut _635: u8;
bb0: {
_3 = copy _2 as *mut u8;
_8 = copy (*_7);
_10 = copy (*_9);
// ...
_16 = const 0_f32;
_17 = const 0_f32;
_18 = const 1_f32;
_19 = const 0_f32;
_21 = indirect(copy _11, copy _unknown);
// ...
_30 = indirect(copy _10, copy _unknown);
switchInt(copy _33) -> [0: bb1, otherwise: bb21];
}
bb1: {
_35 = copy (*_34);
switchInt(copy _36) -> [0: bb20, otherwise: bb2];
}
bb2: {
_39 = copy (*_38);
_41 = copy (*_40);
_47 = copy (*_46);
_48 = Add(copy _45, const 1028443341_f32);
_49 = Mul(copy _48, const 0_f32);
_51 = copy (*_50);
_53 = copy (*_52);
_60 = Sqrt(copy _59);
_61 = Mul(copy _58, copy _41);
_63 = zext(copy _62);
_64 = const 0_u64;
_66 = copy (*_65);
_67 = const 1036831949_f32;
_68 = const 0_u32;
_69 = const 0_u32;
goto -> bb3;
}