The `{domir}` package supports the determination of the relative
importance of the inputs (i.e., independent variables, predictors, or
features) in a user’s statistical or machine learning model. The
methodology used by `{domir}` is called *Dominance Analysis*, which is
based on a series of pairwise comparisons between the model fit values
ascribed to elements in the model, including comparisons of *Shapley
values*.

The intention of this package is to provide a flexible user interface
to dominance analysis, a relatively assumption-free methodology for
comparing the value of model inputs to prediction. The user interface is
structured such that `{domir}` automates the decomposition of the
returned value and the comparisons between model inputs, while the user
provides the model inputs, the predictive model into which they are
entered, and the returned value from the model to decompose.

To install the most recent version of `{domir}` from CRAN use:

`install.packages("domir")`

`{domir}` is also used as the computational engine underlying the
`dominance_analysis()` function for the `{parameters}` package from the
`{easystats}` collection.

`{domir}` Does

The primary dominance analysis function `domir` implements the most
computationally intensive and programming-heavy parts of dominance
analysis for the user and places relatively few requirements on the
predictive modeling functions with which it can work.

The flexibility of `domir` comes at the cost of more complexity for the
user, who must set up a function that accepts the type of input `domir`
will provide (currently only a ‘formula’) and returns the type of output
`domir` expects to receive (currently only a numeric scalar).
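As a minimal sketch of that contract, the function below takes a formula and hands back a single number. The name `r2_wrapper` is hypothetical, and the choice of \(R^2\) from a linear model on `mtcars` is only an assumption made for illustration; any function with the same input and output types would serve.

```r
# Hypothetical wrapper: takes a formula (plus any extra arguments it needs)
# and returns one number, here the R-squared of a linear model.
r2_wrapper <- function(formula, data) {
  summary(lm(formula, data = data))[["r.squared"]]
}

# Calling the wrapper by hand on a single formula shows the kind of value
# that would be handed back for each model.
value <- r2_wrapper(mpg ~ am, data = mtcars)

# A usable wrapper returns a numeric vector of length one.
is.numeric(value) && length(value) == 1
```

Checking a wrapper by hand like this, before passing it on, is a quick way to catch a function that returns a fitted model object or a list instead of a scalar.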

Below, these ideas are outlined in greater detail in the context of a
few examples. The next section begins the discussion with a more
extensive comparison of `domir` with packages that implement similar
methods.

The `domir` function implements the same method as the “lmg” type for
the `calc.relimp` function in the `{relaimpo}` package. `domir` can
replicate the results produced by that package but, as will be seen,
requires more user input.

To illustrate these points, consider the following example linear regression, on which the relative importance results to come are based:

`lm(mpg ~ am + vs + cyl, data = mtcars)`

Classic dominance analysis uses the variance explained \(R^2\) as the
fit statistic (i.e., as implemented by `lm`’s `summary` method) and so
will this example.

`{domir}`’s `domir`

A ‘classic’ dominance analysis of this linear regression can be
implemented with `domir` as:

```
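library(domir)

# Wrapper: fit the model for the formula it is given and return R-squared
lm_wrapper <- function(formula, data) {
  lm(formula, data = data) |>
    summary() |>
    _[["r.squared"]]  # `_` placeholder extraction requires R >= 4.3
}

domir(mpg ~ am + vs + cyl, lm_wrapper, data = mtcars)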
```

```
## Overall Value:      0.7619773
##
## General Dominance Values:
##     General Dominance Standardized Ranks
## am          0.1774892    0.2329324     3
## vs          0.2027032    0.2660226     2
## cyl         0.3817849    0.5010450     1
##
## Conditional Dominance Values:
##     Subset Size: 1 Subset Size: 2 Subset Size: 3
## am       0.3597989      0.1389842    0.033684441
## vs       0.4409477      0.1641982    0.002963748
## cyl      0.7261800      0.3432799    0.075894823
##
## Complete Dominance Designations:
##             Dmnated?am Dmnated?vs Dmnated?cyl
## Dmnates?am          NA         NA       FALSE
## Dmnates?vs          NA         NA       FALSE
## Dmnates?cyl       TRUE       TRUE          NA
```

In `domir`, the `lm` model is not submitted directly. Rather, it is
wrapped into a function (i.e., `lm_wrapper`) that, in this case, accepts
two arguments: *formula*, an R formula, and *data*, a data frame in
which the variables in the formula are present. The result of the `lm`
call is submitted to the `summary` function, and that result is then
filtered to just the **r.squared** element and returned.

What `domir` does is automate taking subsets of the *formula* and
submitting them, repeatedly until all possible subsets have been
submitted, to `lm_wrapper` (see this vignette for a conceptual
discussion of dominance analysis). In this way, `domir` is a `Map`- or
`lapply`-like function, as it receives an object on which to operate
(i.e., the *formula*) and a function to apply to it. `domir` expects a
numeric scalar to be returned from the function.
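The subset-and-apply idea can be sketched in base R. This is an illustration of the concept only, not of `domir`’s internals; the names `r2_wrapper`, `preds`, `subsets`, `formulas`, and `fits` are hypothetical, and the sketch assumes only non-empty subsets need a fitted model.

```r
# Hypothetical wrapper: formula in, R-squared (a numeric scalar) out
r2_wrapper <- function(formula, data) {
  summary(lm(formula, data = data))[["r.squared"]]
}

preds <- c("am", "vs", "cyl")

# Every non-empty subset of the three predictors: 2^3 - 1 = 7 in total
subsets <- unlist(
  lapply(seq_along(preds), function(k) combn(preds, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per subset, e.g. mpg ~ am + vs
formulas <- lapply(subsets, function(s) reformulate(s, response = "mpg"))

# lapply/sapply-style application: `data = mtcars` is forwarded to every call
fits <- sapply(formulas, r2_wrapper, data = mtcars)
names(fits) <- sapply(subsets, paste, collapse = " + ")

fits
```

The entry for the full model, `am + vs + cyl`, should correspond to the “Overall Value” in the printed results, and the number of models grows as \(2^p - 1\) with \(p\) inputs, which is why automating this step matters.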

Like `lapply`, `domir` can also pass other arguments (here,
`data = mtcars`) to each call of the function; the wrapper function must
be explicitly written to accept them.

What is important to note about `domir`, and what distinguishes it from
the other dominance analysis-oriented functions discussed below, is that
`domir` expects the user to supply the analysis pipeline linking the
*formula* it passes to the numeric scalar value that it expects. This
‘supply the pipeline’ approach makes `domir` far more flexible than
other implementations but does require the user to think more carefully
about how to structure the pipeline.
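To make the flexibility concrete, here is a sketch of a different pipeline with the same formula-in, scalar-out shape. Everything in it is an assumption chosen for illustration: the name `glm_wrapper`, the logistic regression of `am` on `hp` and `wt`, and the use of McFadden’s pseudo-\(R^2\) as the fit statistic.

```r
# Hypothetical wrapper: logistic regression scored by McFadden's pseudo-R-squared.
# For a 0/1 outcome the deviance equals -2 * log-likelihood, so the ratio of
# residual to null deviance gives the log-likelihood ratio McFadden's index needs.
glm_wrapper <- function(formula, data) {
  fit <- glm(formula, data = data, family = binomial())
  1 - fit$deviance / fit$null.deviance
}

# The wrapper still takes a formula and returns one number
pseudo_r2 <- glm_wrapper(am ~ hp + wt, data = mtcars)
pseudo_r2

# It could then be passed on in the same way as the linear model wrapper:
# domir::domir(am ~ hp + wt, glm_wrapper, data = mtcars)
```

Only the body of the wrapper changes between model types; the way it is handed to `domir` stays the same.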

Note that `domir`’s `print`-ed results focus on the numerical results
from “General Dominance Values” and “Conditional Dominance Values” and
on a logical matrix of “Complete Dominance Designations”.
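The three kinds of results can be reproduced by brute force in base R, which may help in reading the printed output. The sketch below is not `domir`’s implementation; the helper names (`r2`, `subsets_of`, `conditional`, `general`, `completely_dominates`) are hypothetical, and it assumes the empty model is scored as an \(R^2\) of 0.

```r
preds <- c("am", "vs", "cyl")

# R-squared for a set of predictors; the empty set is scored as 0
r2 <- function(vars) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = "mpg"), data = mtcars))[["r.squared"]]
}

# All subsets of `x` with exactly `k` elements (k = 0 gives the empty set)
subsets_of <- function(x, k) {
  if (k == 0) list(character(0)) else combn(x, k, simplify = FALSE)
}

# Conditional dominance: average gain in R-squared from adding a predictor
# to every subset of the other predictors, computed separately by subset size
conditional <- t(sapply(preds, function(p) {
  others <- setdiff(preds, p)
  sapply(0:length(others), function(k) {
    mean(sapply(subsets_of(others, k), function(s) r2(c(s, p)) - r2(s)))
  })
}))
colnames(conditional) <- paste("Subset Size:", seq_along(preds))

# General dominance: average of the conditional values across subset sizes
general <- rowMeans(conditional)

# Complete dominance: p beats q in every subset of the remaining predictors
completely_dominates <- function(p, q) {
  rest <- setdiff(preds, c(p, q))
  all(sapply(0:length(rest), function(k) {
    all(sapply(subsets_of(rest, k), function(s) r2(c(s, p)) > r2(c(s, q))))
  }))
}

conditional
general
completely_dominates("cyl", "am")
```

Averaging each row of the conditional matrix gives the general dominance values, and these sum to the overall \(R^2\), which is what makes them a decomposition of the fit statistic.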

See also the (now superseded) `domir::domin` function for another
approach to structuring the input pipeline for dominance analysis.

`{relaimpo}`’s `calc.relimp` with `type = "lmg"`

`{relaimpo}` is not dominance analysis software, but its `calc.relimp`
function does produce the general dominance value decomposition for
linear regression using the explained variance \(R^2\) when called with
the argument `type = "lmg"`.

`relaimpo::calc.relimp(mpg ~ am + vs + cyl, data = mtcars, type = "lmg")`

```
## Response variable: mpg
## Total response variance: 36.3241
## Analysis based on 32 observations
##
## 3 Regressors:
## am vs cyl
## Proportion of variance explained by model: 76.2%
## Metrics are not normalized (rela=FALSE).
##
## Relative importance metrics:
##
##           lmg
## am  0.1774892
## vs  0.2027032
## cyl 0.3817849
##
## Average coefficients for different model sizes:
##
##            1X       2Xs       3Xs
## am   7.244939  4.316851  3.026480
## vs   7.940476  2.995142  1.294614
## cyl -2.875790 -2.795816 -2.137632
```

`calc.relimp` has a structure similar to that of `domir` but does not
require a pipeline function. This is because `{relaimpo}` is specialized
and works only with `lm` models and the variance explained \(R^2\) as a
fit statistic. `calc.relimp` also allows for multiple ways of submitting
the model inputs (i.e., correlation matrices, a fitted `lm` object, a
`data.frame`) given that it always implements the same model.

`calc.relimp`’s printed results provide relative importance metric
values that match those obtained from `domir` (i.e., the general
dominance values). In addition, `calc.relimp` reports the average `lm`
coefficients across numbers of independent variables/\(X\)s in a way
similar to the conditional dominance values reported by `domir`. This is
an additional and useful result that shows how including different
numbers of independent variables affects the obtained
coefficients/predicted values.

Again, note that `{relaimpo}` is not dominance analysis-oriented and
does not report on dominance designations or dominance values other than
the general dominance values.

Further examples of `domir`’s functionality will be populated on the
`{domir}` wiki.