Decompilation pipeline
Ghidra uses Action and Rule objects at the core of its decompilation transformation. An Action represents a transform or analysis across a whole function, whereas a Rule matches and rewrites small sequences of P-Code opcodes. Chained together, both primitives transform the low-level P-Code output of the lifter into high-level P-Code, which Ghidra then renders as decompiled code. Ghidra defines the actions and rules in ruleaction.cc and coreaction.cc, while their declarations span several header files. The most important definitions for understanding the decompilation flow appear towards the end of coreaction.cc: the definitions of buildDefaultGroups and universalAction.
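To make the difference in scope concrete, here is a minimal sketch in plain C++. It does not use Ghidra's classes: ToyOp, ToyFunction, ruleAddZero and actionRemoveUnused are hypothetical names invented for this illustration. The rule-like function only looks at a single operation, while the action-like function needs to see the whole function to decide what is unused.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for P-Code; these are NOT Ghidra's PcodeOp or Funcdata.
enum class OpCode { Copy, IntAdd, IntMult };

struct ToyOp {
    OpCode code;
    std::string out;   // variable written by the op
    std::string in0;   // variable read by the op
    long constant;     // second operand, always a constant in this toy model
};

using ToyFunction = std::vector<ToyOp>;

// Rule-like transform: inspects ONE op and rewrites it if the pattern matches.
// Returns true if the op was changed.
bool ruleAddZero(ToyOp &op) {
    if (op.code == OpCode::IntAdd && op.constant == 0) {
        op.code = OpCode::Copy;   // out = in0 + 0  becomes  out = in0
        return true;
    }
    return false;
}

// Action-like transform: needs the WHOLE function. Removes every op whose
// output is never read, except the op that writes the return variable.
// Returns the number of removed ops.
int actionRemoveUnused(ToyFunction &fn, const std::string &retName) {
    std::set<std::string> used{retName};
    for (const ToyOp &op : fn) used.insert(op.in0);
    const std::size_t before = fn.size();
    fn.erase(std::remove_if(fn.begin(), fn.end(),
                            [&used](const ToyOp &op) {
                                return used.count(op.out) == 0;
                            }),
             fn.end());
    return static_cast<int>(before - fn.size());
}
```

Applied to a three-op function that computes a = x + 0, tmp = x * 2 and ret = a, the rule turns the first op into a copy and the action then drops the unused tmp computation.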
Instantiating actions and action groups
The universal action contains all available actions and rules and specifies the order of their execution. The following excerpt shows part of the function that defines it:
// ...
act = new ActionRestartGroup(Action::rule_onceperfunc,"universal",1);
registerAction(universalname,act);
act->addAction( new ActionStart("base"));
act->addAction( new ActionConstbase("base"));
act->addAction( new ActionNormalizeSetup("normalanalysis"));
act->addAction( new ActionDefaultParams("base"));
// act->addAction( new ActionParamShiftStart("paramshift") );
act->addAction( new ActionExtraPopSetup("base",stackspace) );
act->addAction( new ActionPrototypeTypes("protorecovery"));
act->addAction( new ActionFuncLink("protorecovery") );
act->addAction( new ActionFuncLinkOutOnly("noproto") );
{
  actfullloop = new ActionGroup(Action::rule_repeatapply,"fullloop");
  {
    actmainloop = new ActionGroup(Action::rule_repeatapply,"mainloop");
    actmainloop->addAction( new ActionUnreachable("base") );
    actmainloop->addAction( new ActionVarnodeProps("base") );
    actmainloop->addAction( new ActionHeritage("base") );
    actmainloop->addAction( new ActionParamDouble("protorecovery") );
    actmainloop->addAction( new ActionSegmentize("base"));
    actmainloop->addAction( new ActionInternalStorage("base") );
    // ...
This excerpt shows the creation of Action objects and of containers that group actions into a sequence. The snippet uses two such containers: ActionRestartGroup and ActionGroup. The ActionRestartGroup reruns its contained actions from the beginning when an Action sets a restart flag. Only the universal action uses this container, making it the root of all actions. ActionGroup and ActionRestartGroup are themselves Action objects, so they can belong to other groups.
The excerpt also shows some of the flags that can be specified for actions, e.g. Action::rule_onceperfunc and Action::rule_repeatapply. Using these flags, Ghidra creates loops of related actions and applies them repeatedly until the actions no longer trigger changes.
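The repeat-until-stable behaviour can be sketched in a few lines of plain C++. This is a toy model under stated assumptions, not Ghidra's implementation: ToyState, ToyStep, runGroup and the two example steps are hypothetical, and the boolean repeat parameter merely plays the role that a flag such as Action::rule_repeatapply plays for a real group.

```cpp
#include <functional>
#include <vector>

// Hypothetical toy state; a real action would operate on a function's P-Code.
struct ToyState { int value; };

// A step returns how many changes it made, so the caller can tell whether
// another pass is worthwhile.
using ToyStep = std::function<int(ToyState &)>;

int halveIfEven(ToyState &st) {
    if (st.value > 1 && st.value % 2 == 0) { st.value /= 2; return 1; }
    return 0;
}

int decrementIfOdd(ToyState &st) {
    if (st.value > 1 && st.value % 2 != 0) { st.value -= 1; return 1; }
    return 0;
}

// Runs all steps in order. With repeat == true, the whole sequence runs
// again until one full pass makes no change (a fixed point).
// Returns the total number of changes made.
int runGroup(const std::vector<ToyStep> &steps, ToyState &st, bool repeat) {
    int total = 0;
    int changed = 0;
    do {
        changed = 0;
        for (const ToyStep &step : steps) changed += step(st);
        total += changed;
    } while (repeat && changed != 0);
    return total;
}
```

Starting from the value 12, a single pass over both steps only halves it to 6, whereas the repeating variant keeps looping until the value reaches 1 and no step fires anymore. Nesting works the same way in this model: a whole group could itself be wrapped as a ToyStep and placed inside another group.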
Instantiating rules
The special container ActionPool holds Rule objects instead of actions. As the following snippet shows, the API of an ActionPool is analogous to that of an ActionGroup, but it takes rules instead of actions.
/// ...
actprop = new ActionPool(Action::rule_repeatapply,"oppool1");
actprop->addRule( new RuleEarlyRemoval("deadcode"));
actprop->addRule( new RuleTermOrder("analysis"));
actprop->addRule( new RuleSelectCse("analysis"));
actprop->addRule( new RuleCollectTerms("analysis"));
actprop->addRule( new RulePullsubMulti("analysis"));
actprop->addRule( new RulePullsubIndirect("analysis"));
actprop->addRule( new RulePushMulti("nodejoin"));
actprop->addRule( new RuleSborrow("analysis") );
actprop->addRule( new RuleIntLessEqual("analysis") );
actprop->addRule( new RuleTrivialArith("analysis") );
actprop->addRule( new RuleTrivialBool("analysis") );
actprop->addRule( new RuleTrivialShift("analysis") );
/// ...
In contrast to actions, rules cannot contain other rules. Rule implementations perform simple and atomic transformations and represent the leaf nodes of the analysis hierarchy.
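As a rough illustration of a pool of leaf rules, the following plain C++ sketch applies small rewrites to a list of operations until none of them fires. All names (ToyOp, ToyRule, ToyPool, ruleTrivialAdd, ruleTrivialMult) are hypothetical, and indexing the rules by opcode is a design choice of this sketch rather than a statement about how ActionPool is implemented.

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

enum class OpCode { Copy, IntAdd, IntMult };

struct ToyOp {
    OpCode code;
    long constant;   // constant operand; other operands omitted for brevity
};

// A rule inspects one op and returns true if it rewrote it.
using ToyRule = std::function<bool(ToyOp &)>;

bool ruleTrivialAdd(ToyOp &op) {    // x + 0  =>  x
    if (op.code == OpCode::IntAdd && op.constant == 0) {
        op.code = OpCode::Copy;
        return true;
    }
    return false;
}

bool ruleTrivialMult(ToyOp &op) {   // x * 1  =>  x
    if (op.code == OpCode::IntMult && op.constant == 1) {
        op.code = OpCode::Copy;
        return true;
    }
    return false;
}

class ToyPool {
public:
    // Registers a rule for the opcode it is interested in.
    void addRule(OpCode code, ToyRule rule) {
        perOp_[code].push_back(std::move(rule));
    }

    // Applies matching rules to every op until a full pass changes nothing.
    // Returns the total number of rewrites.
    int apply(std::vector<ToyOp> &ops) const {
        int total = 0;
        bool changed = true;
        while (changed) {
            changed = false;
            for (ToyOp &op : ops) {
                auto it = perOp_.find(op.code);
                if (it == perOp_.end()) continue;
                for (const ToyRule &rule : it->second) {
                    if (rule(op)) {   // opcode may have changed: re-dispatch next pass
                        ++total;
                        changed = true;
                        break;
                    }
                }
            }
        }
        return total;
    }

private:
    std::map<OpCode, std::vector<ToyRule>> perOp_;
};
```

With both rules registered, a pool run over the ops x + 0, x * 1 and x * 3 rewrites the first two into copies and leaves the third untouched. The rules themselves stay flat: a ToyRule cannot contain other rules, only the pool can hold them.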
Default groups
Ghidra also uses a second notion of groups, unrelated to ActionGroup objects, to structure the actions inside the universal action. A default group lists the members of a default action. For example, this excerpt of buildDefaultGroups shows the members of the decompile group, which defines a top-level analysis action executed by Ghidra:
const char *members[] = {
  "base", "protorecovery", "protorecovery_a", "deindirect",
  "localrecovery", "deadcode", "typerecovery", "stackptrflow",
  "blockrecovery", "stackvars", "deadcontrolflow", "switchnorm",
  "cleanup", "splitcopy", "splitpointer", "merge", "dynamic", "casts",
  "analysis", "fixateglobals", "fixateproto", "constsequence",
  "segment", "returnsplit", "nodejoin", "doubleload", "doubleprecis",
  "unreachable", "subvar", "floatprecision",
  "conditionalexe", ""
};
setGroup("decompile",members);
Some member names also appear in the excerpts of the universal action definition, e.g. base, protorecovery, deadcode and others. This mechanism allows selecting relevant actions from the universal action. To this end, every Action and Rule has a name and a group identifier. Each class's constructor sets the name itself, while the universal action sets the group identifier during its construction. That means all strings in the universalAction function denote group identifiers, with one exception: action and rule containers have an empty group identifier, and the strings passed to them in universalAction represent their names. For example, oppool1 in ActionPool(Action::rule_repeatapply,"oppool1") is the name of the action, but base in ActionStart("base") is the group identifier.
This might seem illogical at first, but it makes sense when considering the construction of default actions such as decompile. To derive this action, Ghidra walks through the hierarchy of the universal action. At every step it tests whether the current action or rule has a group identifier that matches any member name of the default group. If so, it includes the action or rule; otherwise it skips it. This keeps the main structure of the universal action but deactivates certain actions. In comparison, deactivating action groups would change the main structure and might lead to unpredictable results.
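The walk described above can be sketched as a recursive filtered copy of a tree. The following plain C++ is a toy model under stated assumptions: ToyAction, makeLeaf, makeContainer, cloneFiltered, collectLeaves and buildToyUniversal are hypothetical names, the leaf names simply reuse class names from the excerpts, and always keeping containers (even empty ones) is a simplification of this sketch.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct ToyAction {
    std::string name;
    std::string group;          // empty for containers
    bool isContainer = false;
    std::vector<std::unique_ptr<ToyAction>> children;
};

std::unique_ptr<ToyAction> makeLeaf(std::string name, std::string group) {
    auto a = std::make_unique<ToyAction>();
    a->name = std::move(name);
    a->group = std::move(group);
    return a;
}

std::unique_ptr<ToyAction> makeContainer(std::string name) {
    auto a = std::make_unique<ToyAction>();
    a->name = std::move(name);
    a->isContainer = true;
    return a;
}

// Copies the hierarchy, keeping containers and only those leaves whose
// group identifier is one of the requested members.
std::unique_ptr<ToyAction> cloneFiltered(const ToyAction &node,
                                         const std::set<std::string> &members) {
    if (!node.isContainer) {
        if (members.count(node.group) == 0) return nullptr;   // skip this leaf
        return makeLeaf(node.name, node.group);
    }
    auto copy = makeContainer(node.name);
    for (const auto &child : node.children) {
        if (auto kept = cloneFiltered(*child, members))
            copy->children.push_back(std::move(kept));
    }
    return copy;
}

// Collects leaf names in execution order, for inspection.
void collectLeaves(const ToyAction &node, std::vector<std::string> &out) {
    if (!node.isContainer) { out.push_back(node.name); return; }
    for (const auto &child : node.children) collectLeaves(*child, out);
}

// A tiny hypothetical "universal" hierarchy modelled on the excerpts above.
std::unique_ptr<ToyAction> buildToyUniversal() {
    auto root = makeContainer("universal");
    root->children.push_back(makeLeaf("ActionStart", "base"));
    root->children.push_back(makeLeaf("ActionFuncLinkOutOnly", "noproto"));
    auto loop = makeContainer("mainloop");
    loop->children.push_back(makeLeaf("ActionHeritage", "base"));
    loop->children.push_back(makeLeaf("ActionParamDouble", "protorecovery"));
    root->children.push_back(std::move(loop));
    return root;
}
```

Filtering this toy hierarchy with the members base and protorecovery keeps the mainloop container and three leaves in their original order, and drops only the leaf in the noproto group, which is not listed among the decompile members. The nesting and ordering of the surviving actions is unchanged, which is the property the group mechanism is designed to preserve.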