Decompilation Pipeline

At the core of Ghidra's decompilation transformation are Action and Rule objects. An Action represents a transformation or analysis across a whole function, whereas a Rule matches and rewrites small sequences of P-Code opcodes. Both primitives are chained together to transform the low-level P-Code output of the lifter into high-level P-Code that is then "rendered" as decompiled code. At the time of writing, all actions and rules inside Ghidra are defined in ruleaction.cc and coreaction.cc, with their declarations split over several header files. The most important definitions for understanding the decompilation flow can be found towards the end of coreaction.cc: the definitions of buildDefaultGroups and universalAction.

Instantiating Actions and Action Groups

The universal action contains all possible actions and rules and specifies the order of their execution. The following code is an excerpt from universalAction, the function that defines it:

cpp
// ...
act = new ActionRestartGroup(Action::rule_onceperfunc,"universal",1);
registerAction(universalname,act);

act->addAction( new ActionStart("base"));
act->addAction( new ActionConstbase("base"));
act->addAction( new ActionNormalizeSetup("normalanalysis"));
act->addAction( new ActionDefaultParams("base"));
//  act->addAction( new ActionParamShiftStart("paramshift") );
act->addAction( new ActionExtraPopSetup("base",stackspace) );
act->addAction( new ActionPrototypeTypes("protorecovery"));
act->addAction( new ActionFuncLink("protorecovery") );
act->addAction( new ActionFuncLinkOutOnly("noproto") );
{
  actfullloop = new ActionGroup(Action::rule_repeatapply,"fullloop");
  {
    actmainloop = new ActionGroup(Action::rule_repeatapply,"mainloop");
    actmainloop->addAction( new ActionUnreachable("base") );
    actmainloop->addAction( new ActionVarnodeProps("base") );
    actmainloop->addAction( new ActionHeritage("base") );
    actmainloop->addAction( new ActionParamDouble("protorecovery") );
    actmainloop->addAction( new ActionSegmentize("base"));
    actmainloop->addAction( new ActionInternalStorage("base") );
// ...

This excerpt shows the creation of several Action objects, as well as action containers, which allow grouping actions into a sequence. The code snippet contains two kinds of container: ActionRestartGroup and ActionGroup. The ActionRestartGroup is unique here, as it restarts its contained actions from the beginning if a certain flag is set. Because this flag is set per function, there can only be one such group in the decompilation pipeline, and it is therefore used to mark the root of the universal action. Besides that, ActionGroup and ActionRestartGroup are both Action objects themselves, meaning that groups can also be added as actions to other groups. Together with the flags that can be specified for actions (in the excerpt you can see e.g. Action::rule_onceperfunc and Action::rule_repeatapply), this can be used to create loops of related actions that are applied repeatedly until they no longer trigger a change.
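To make the nesting and the repeat-until-stable behaviour more concrete, the following is a minimal toy model and not Ghidra's implementation. All names in it (ToyFunction, ToyAction, ToyWorkAction, ToyGroup, runToyPipeline) are hypothetical, and the boolean repeatApply parameter merely stands in for a flag such as Action::rule_repeatapply. It only assumes what the text above describes: a group is itself an action, and a repeating group re-runs its children until a full pass makes no change.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for per-function data: just a counter of outstanding work.
struct ToyFunction { int pendingWork; };

// Toy base class; NOT Ghidra's Action.
struct ToyAction {
  std::string name;
  explicit ToyAction(std::string n) : name(std::move(n)) {}
  virtual ~ToyAction() = default;
  // Returns the number of changes made during this call.
  virtual int apply(ToyFunction &fn) = 0;
};

// Leaf action: performs one unit of work per call while work remains.
struct ToyWorkAction : ToyAction {
  using ToyAction::ToyAction;
  int apply(ToyFunction &fn) override {
    if (fn.pendingWork == 0) return 0;
    --fn.pendingWork;
    return 1;
  }
};

// A group is itself an action, so groups can be nested inside groups.
struct ToyGroup : ToyAction {
  bool repeat;  // stand-in for a flag like Action::rule_repeatapply
  std::vector<std::unique_ptr<ToyAction>> children;
  ToyGroup(bool repeatApply, std::string n)
      : ToyAction(std::move(n)), repeat(repeatApply) {}
  void addAction(std::unique_ptr<ToyAction> a) {
    assert(a != nullptr);
    children.push_back(std::move(a));
  }
  int apply(ToyFunction &fn) override {
    int total = 0;
    int changed = 0;
    do {
      changed = 0;
      for (auto &child : children) changed += child->apply(fn);
      total += changed;
    } while (repeat && changed != 0);  // loop until a pass changes nothing
    return total;
  }
};

// Builds a root group containing one leaf and one nested repeating group,
// then returns the total number of changes made.
inline int runToyPipeline(int work) {
  ToyFunction fn{work};
  auto mainloop = std::make_unique<ToyGroup>(true, "mainloop");
  mainloop->addAction(std::make_unique<ToyWorkAction>("heritage"));
  ToyGroup universal(false, "universal");
  universal.addAction(std::make_unique<ToyWorkAction>("start"));
  universal.addAction(std::move(mainloop));
  return universal.apply(fn);
}
```

In this sketch the root group runs its children once, while the nested repeating group keeps invoking its leaf until the work counter reaches zero. The restart behaviour of ActionRestartGroup is deliberately left out.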

Instantiating Rules

There is another special container called ActionPool, which contains Rule objects instead of actions. As can be seen in the code snippet below, the API for using an ActionPool is analogous to using an ActionGroup, but with rules instead.

cpp
/// ...
actprop = new ActionPool(Action::rule_repeatapply,"oppool1");
actprop->addRule( new RuleEarlyRemoval("deadcode"));
actprop->addRule( new RuleTermOrder("analysis"));
actprop->addRule( new RuleSelectCse("analysis"));
actprop->addRule( new RuleCollectTerms("analysis"));
actprop->addRule( new RulePullsubMulti("analysis"));
actprop->addRule( new RulePullsubIndirect("analysis"));
actprop->addRule( new RulePushMulti("nodejoin"));
actprop->addRule( new RuleSborrow("analysis") );
actprop->addRule( new RuleIntLessEqual("analysis") );
actprop->addRule( new RuleTrivialArith("analysis") );
actprop->addRule( new RuleTrivialBool("analysis") );
actprop->addRule( new RuleTrivialShift("analysis") );
/// ...

In contrast to actions, rules cannot contain other rules, as they are meant to be simple and atomic. They therefore represent the leaf nodes of the analysis hierarchy.
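The idea of a pool of small, atomic rewrite rules can be sketched with a toy model. Everything here is hypothetical (ToyOp, ToyRule, ToyPool, makeToyPool), and the two rewrites are illustrative simplifications rather than the behaviour of any real Ghidra rule. The sketch assumes only what the text states: a pool holds rules, each rule rewrites a small pattern, and a repeating pool keeps applying its rules until nothing changes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified stand-in for a single P-Code operation
// of the form "out = opcode(in, constant)".
struct ToyOp {
  std::string opcode;  // e.g. "INT_ADD", "INT_MULT", "COPY"
  int constant;        // constant second operand
};

// A toy rule inspects one op, may rewrite it in place, and reports
// whether it changed anything. Rules do not contain other rules.
struct ToyRule {
  std::string group;                     // group identifier, e.g. "analysis"
  std::function<bool(ToyOp &)> applyOp;  // returns true if the op was rewritten
};

struct ToyPool {
  std::vector<ToyRule> rules;
  void addRule(ToyRule r) {
    assert(r.applyOp);  // a rule without a body makes no sense
    rules.push_back(std::move(r));
  }
  // Tries every rule on every op until a full pass changes nothing.
  // Returns the total number of rewrites performed.
  int apply(std::vector<ToyOp> &ops) const {
    int total = 0;
    bool changed = true;
    while (changed) {
      changed = false;
      for (ToyOp &op : ops)
        for (const ToyRule &rule : rules)
          if (rule.applyOp(op)) { changed = true; ++total; }
    }
    return total;
  }
};

// Builds a pool with two illustrative rewrites.
inline ToyPool makeToyPool() {
  ToyPool pool;
  // x + 0  =>  COPY x
  pool.addRule({"analysis", [](ToyOp &op) {
    if (op.opcode != "INT_ADD" || op.constant != 0) return false;
    op.opcode = "COPY";
    return true;
  }});
  // x * 1  =>  COPY x
  pool.addRule({"analysis", [](ToyOp &op) {
    if (op.opcode != "INT_MULT" || op.constant != 1) return false;
    op.opcode = "COPY";
    op.constant = 0;
    return true;
  }});
  return pool;
}
```

Because each rule only fires on a pattern it then removes, the loop in this sketch is guaranteed to terminate. A real rule set has to maintain a comparable property to avoid rewriting back and forth forever.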

Default Groups

In a somewhat confusing naming clash, Ghidra has a concept of groups that is interwoven with the definition of the actions inside the universal action, but not related to the ActionGroup objects. There is the notion of default groups, which specify the members of certain default actions. For example, this excerpt of buildDefaultGroups shows the members of the decompile group, which is a top-level analysis action executed by Ghidra:

cpp
const char *members[] = {
  "base", "protorecovery", "protorecovery_a", "deindirect",
  "localrecovery", "deadcode", "typerecovery", "stackptrflow",
  "blockrecovery", "stackvars", "deadcontrolflow", "switchnorm",
  "cleanup", "splitcopy", "splitpointer", "merge", "dynamic", "casts",
  "analysis", "fixateglobals", "fixateproto", "constsequence",
  "segment", "returnsplit", "nodejoin", "doubleload", "doubleprecis",
  "unreachable", "subvar", "floatprecision",
  "conditionalexe", ""
};
setGroup("decompile",members);

Some member names are familiar from the excerpts of the universal action definition, e.g. base, protorecovery, and deadcode. This is no coincidence, but rather a mechanism for selecting relevant actions from the universal action. To this end, every Action and Rule has a name and a group identifier. The name itself is typically set in the constructor, whereas the group identifier is set during the construction of the universal action. This means that all strings in the universalAction function (i.e. the previous excerpts) are group identifiers, with one exception: action and rule containers have an empty group identifier, and the strings passed to them in universalAction are their names. For example, oppool1 in ActionPool(Action::rule_repeatapply,"oppool1") is the name of the action, whereas base in ActionStart("base") is the group identifier.

This might seem illogical at first, but it makes sense once you consider how the default actions, such as decompile, are constructed. To derive such an action, Ghidra walks through the hierarchy of the universal action. At every step, it tests whether the current action or rule has a group identifier that matches one of the member names of the default group. If it does, the action or rule is included; otherwise it is skipped. This preserves the overall structure of the universal action and simply deactivates certain actions; deactivating action groups, by contrast, would change the overall structure and might lead to unpredictable results.
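The selection walk described above can be sketched as a recursive filter over a toy tree. This is an illustration under stated assumptions, not Ghidra's code: ToyNode, makeLeaf, makeContainer, toMemberSet and filterByGroups are hypothetical names, the leaf names in the usage are illustrative, and the sketch always keeps containers (even if they end up empty), which is a simplification. The only details taken from the excerpts are that the member list is terminated by an empty string and that containers carry a name but no group identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy node; NOT Ghidra's Action class.
struct ToyNode {
  std::string name;   // always set
  std::string group;  // empty for containers
  bool isContainer = false;
  std::vector<std::unique_ptr<ToyNode>> children;
};

inline std::unique_ptr<ToyNode> makeLeaf(std::string name, std::string group) {
  auto n = std::make_unique<ToyNode>();
  n->name = std::move(name);
  n->group = std::move(group);
  return n;
}

inline std::unique_ptr<ToyNode> makeContainer(std::string name) {
  auto n = std::make_unique<ToyNode>();
  n->name = std::move(name);
  n->isContainer = true;
  return n;
}

// Collects a ""-terminated array of group names into a set.
inline std::set<std::string> toMemberSet(const char *const members[]) {
  assert(members != nullptr);
  std::set<std::string> result;
  for (std::size_t i = 0; members[i][0] != '\0'; ++i) result.insert(members[i]);
  return result;
}

// Returns a filtered copy of the tree: leaves survive only if their group
// identifier is in `members`; containers are always kept in this sketch,
// so the overall shape of the hierarchy is preserved.
inline std::unique_ptr<ToyNode> filterByGroups(
    const ToyNode &node, const std::set<std::string> &members) {
  if (!node.isContainer) {
    if (members.count(node.group) == 0) return nullptr;  // leaf is skipped
    return makeLeaf(node.name, node.group);
  }
  auto copy = makeContainer(node.name);
  for (const auto &child : node.children) {
    if (auto kept = filterByGroups(*child, members))
      copy->children.push_back(std::move(kept));
  }
  return copy;
}
```

A short usage example: with the member list {"base", "protorecovery", ""}, a leaf in group paramshift is dropped, while a nested container named mainloop is kept along with those of its leaves whose group is listed.

```cpp
const char *members[] = {"base", "protorecovery", ""};
auto universal = makeContainer("universal");
universal->children.push_back(makeLeaf("start", "base"));
universal->children.push_back(makeLeaf("paramshiftstart", "paramshift"));
auto mainloop = makeContainer("mainloop");
mainloop->children.push_back(makeLeaf("heritage", "base"));
universal->children.push_back(std::move(mainloop));
auto filtered = filterByGroups(*universal, toMemberSet(members));
```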