// Patrick Louis

D-Bus and Polkit, No More Mysticism and Confusion

freedesktop logo

D-Bus and Polkit are two technologies that emanate an aura of confusion. While their names are omnipresent in discussions, and the internet has its share of criticism and rants about them, not many have a grasp of what they actually do. In this article I’ll give an overview of these technologies.

D-Bus, or Desktop Bus, is often described as software that allows processes to communicate with one another, that is, to perform inter-process communication (IPC). However, this term is generic and doesn’t convey what it is used for. Many technologies can perform IPC, from plain sockets to message queues, so what differentiates D-Bus from them?

D-Bus can be considered a middleware, a software glue that sits in the middle to provide services to software through a sort of plugin/microkernel architecture. That’s what the bus metaphor represents: it replicates the functionality of hardware buses, with components attaching themselves to known interfaces that they implement, and provides a means of communication between them. With D-Bus these can be either procedure calls, aka methods, or signals, aka notifications.

While D-Bus does offer 1-to-1 and 1-to-many IPC, it’s more of a byproduct of its original purpose than a means of efficient process-to-process data transfer; it isn’t meant to be fast. D-Bus emerged from the world of desktop environments, where the building blocks are well known, and each implements a functionality that should be accessible from other processes if needed, without having to reinvent the transfer mechanism for each and every piece of software.
This is the problem it tackles: having components in a desktop environment that are distributed across many processes, each fulfilling a specific job. In such a case, if a process already implements the behavior needed, others can harness the feature it provides instead of reimplementing it.

Its design is heavily influenced by Service Oriented Architectures (SOA), Enterprise Service Buses (ESB), and microkernel architectures.
A bus abstracts communication between software, replacing all direct contact and only allowing exchanges to happen on the bus instead.
Additionally, the SOA allows software to expose objects that have methods that can be called remotely, and also allows other software to subscribe/publish events happening in remote objects residing in other software.
Moreover, D-Bus provides an easy plug-and-play, a loose coupling, where any software could detach itself from the bus and allow another process to be plugged, containing objects that implement the same features the previous process implemented.
In sum, it’s an abstraction layer for functionalities that could be implemented by any software, a standardized way to create pluggable desktop components. This is what D-Bus is about, this is the role it plays, and it explains the difficulty in grasping the concepts that gave rise to it.

The big conceptual picture goes as follows.
We have a D-Bus daemon running at an address and services that implement well known behaviors. These services attach to the D-Bus daemon and the attachment edge has a name, a bus name.
Inside these services, there are objects that implement the well known behavior. These objects also have a path leading to them so that you can target which object within that service implements the specific interface needed.
Then, the interface methods and events can be called or registered on this object inside this service, connected to this bus name, from another service that requires the behavior implemented by that interface to be executed.

This is how these particular nested components interact with one another, and it gives rise to the following:

Address of D-Bus daemon ->
Bus Name that the service attached to ->
Path of the object within this service ->
Interface that this object implements ->
Method or Signal concrete implementation

Or in graphical form:

D-Bus ecosystem

Instead of having everyone talk to one another:

p2p interaction

Let’s take a method call example that shows these 3 required pieces of information.

org.gnome.SessionManager \
/org/gnome/SessionManager \
org.gnome.SessionManager.CanShutdown

   boolean true

Here, we have the service bus name org.gnome.SessionManager, the object path /org/gnome/SessionManager, and the interface/method name org.gnome.SessionManager.CanShutdown, all separated by spaces. If /org/gnome/SessionManager implemented only a single interface, we could call the method simply as CanShutdown, but that isn’t the case here.
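
To make the split between interface and method explicit, here is a minimal Python sketch. The MethodTarget type and the parse_target helper are hypothetical names of mine, not part of any D-Bus library; they only show that the last dot-separated element is the member and everything before it is the interface.

```python
from typing import NamedTuple


class MethodTarget(NamedTuple):
    """The pieces needed to address one method on the bus."""
    bus_name: str      # which service (connection) to talk to
    object_path: str   # which object inside that service
    interface: str     # which interface of that object
    member: str        # the method (or signal) name itself


def parse_target(bus_name: str, object_path: str, full_member: str) -> MethodTarget:
    # The member is the last dot-separated element; everything before
    # it is the interface name (empty if only the member was given).
    interface, _, member = full_member.rpartition(".")
    return MethodTarget(bus_name, object_path, interface, member)


target = parse_target(
    "org.gnome.SessionManager",
    "/org/gnome/SessionManager",
    "org.gnome.SessionManager.CanShutdown",
)
print(target.interface)  # org.gnome.SessionManager
print(target.member)     # CanShutdown
```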

Let’s dive deeper into the pieces we’ve mentioned. They are akin to the ones in an SOA ecosystem, but with the addition of the bus name, bus daemon, and the abstraction for the plug-and-play.

  • Objects

An object is an entity that resides in a process/service and performs some work. It is identified by a path name. The path name is usually written, though this is not mandatory, in a namespace format where it is grouped and divided by slashes /, just like a Unix file system path.

For example: /org/gnome/Nautilus/window/1.

Objects have methods and signals: methods take input and return output, while signals are events that processes can subscribe to.

  • Interfaces

These methods and signals are concrete implementations of interfaces, the same definition as in OOP.
As with OOP, interfaces are a group of abstractions that have to be defined in the object that implements them. The members, methods and signals, are also namespaced under this interface name.

Example:

interface=org.gnome.Shell.Introspect
member method=GetRunningApplications
absolute name of method=org.gnome.Shell.Introspect.GetRunningApplications

Some interfaces are commonly implemented by objects, such as the org.freedesktop.DBus.Introspectable interface, which, as the name implies, makes the object introspectable. It allows querying the object about its capabilities, features, and the other interfaces it implements. This is a very useful feature because it allows discovery.
It’s also worth mentioning that dbus can be used in a generic way to set and get properties of services’ objects through the org.freedesktop.DBus.Properties interface.

Interfaces can be described, both as a standard and for documentation, in D-Bus XML files so that other programmers can use the reference to implement them properly. These files can also be used to auto-generate classes from the XML, making implementation quicker and less error-prone.
These files can usually be found under /usr/share/dbus-1/interfaces/. Our earlier org.gnome.Shell.Introspect is there, in the file org.gnome.Shell.Introspect.xml, along with our method GetRunningApplications. Here’s an excerpt of the relevant section.

<!--
	GetRunningApplications:
	@short_description: Retrieves the description of all running applications

	Each application is associated by an application ID. The details of
	each application consists of a varlist of keys and values. Available
	keys are listed below.

	'active-on-seats' - (as)   list of seats the application is active on
								(a seat only has at most one active
								application)
-->
<method name="GetRunningApplications">
	<arg name="apps" direction="out" type="a{sa{sv}}" />
</method>

Notice the type= part, which describes the format of the output. We’ll come back to what this means in the message format section, but in short each letter represents a basic type. The out direction means that it’s the type of an output value of the method; similarly, in is for method parameters. See the following example taken from org.gnome.Shell.Screenshot.xml.

<!--
	ScreenshotArea:
	@x: the X coordinate of the area to capture
	@y: the Y coordinate of the area to capture
	@width: the width of the area to capture
	@height: the height of the area to capture
	@flash: whether to flash the area or not
	@filename: the filename for the screenshot
	@success: whether the screenshot was captured
	@filename_used: the file where the screenshot was saved

	Takes a screenshot of the passed in area and saves it
	in @filename as png image, it returns a boolean
	indicating whether the operation was successful or not.
	@filename can either be an absolute path or a basename, in
	which case the screenshot will be saved in the $XDG_PICTURES_DIR
	or the home directory if it doesn't exist. The filename used
	to save the screenshot will be returned in @filename_used.
-->
<method name="ScreenshotArea">
	<arg type="i" direction="in" name="x"/>
	<arg type="i" direction="in" name="y"/>
	<arg type="i" direction="in" name="width"/>
	<arg type="i" direction="in" name="height"/>
	<arg type="b" direction="in" name="flash"/>
	<arg type="s" direction="in" name="filename"/>
	<arg type="b" direction="out" name="success"/>
	<arg type="s" direction="out" name="filename_used"/>
</method>
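
Since these descriptions are plain XML, they are easy to read programmatically. The following Python sketch uses only the standard library to collect the input and output signatures of each method. The method_signatures helper is a name of mine, and the surrounding interface element is an assumption added so that the excerpt forms a complete document.

```python
import xml.etree.ElementTree as ET

# The ScreenshotArea excerpt from above, wrapped in an <interface>
# element (an assumption) so that it parses as a standalone document.
INTROSPECTION_XML = """
<interface name="org.gnome.Shell.Screenshot">
  <method name="ScreenshotArea">
    <arg type="i" direction="in" name="x"/>
    <arg type="i" direction="in" name="y"/>
    <arg type="i" direction="in" name="width"/>
    <arg type="i" direction="in" name="height"/>
    <arg type="b" direction="in" name="flash"/>
    <arg type="s" direction="in" name="filename"/>
    <arg type="b" direction="out" name="success"/>
    <arg type="s" direction="out" name="filename_used"/>
  </method>
</interface>
"""


def method_signatures(xml_text: str) -> dict:
    """Map each method name to its (input, output) type signatures."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for method in root.iter("method"):
        ins, outs = [], []
        for arg in method.findall("arg"):
            # For methods, a missing direction attribute means "in".
            bucket = outs if arg.get("direction", "in") == "out" else ins
            bucket.append(arg.get("type"))
        result[method.get("name")] = ("".join(ins), "".join(outs))
    return result


signatures = method_signatures(INTROSPECTION_XML)
print(signatures["ScreenshotArea"])  # ('iiiibs', 'bs')
```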

  • Proxies

Proxies are the nuts and bolts of an RPC ecosystem. They represent remote objects, along with their methods, in your native code as if they were local. Basically, these are wrappers that make it simpler to manipulate things on D-Bus programmatically instead of worrying about all the components we’ve mentioned above. Programming with proxies might look like this.

Proxy proxy = new Proxy(getBusConnection(), "/remote/object/path");
Object returnValue = proxy.MethodName(arg1, arg2);
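
To give an idea of what happens behind such a wrapper, here is a toy Python sketch of the proxy pattern. Nothing here talks to a real bus: the Proxy class, the fake_transport function, and the com.example.Service name are all hypothetical, and a real binding would serialize the call and send it over its connection instead.

```python
from typing import Any, Callable

# A transport is anything that can deliver a method call and return the
# reply. A real binding would serialize the call and send it to the bus.
Transport = Callable[[str, str, str, tuple], Any]


class Proxy:
    """Local stand-in for a remote object (hypothetical, for illustration)."""

    def __init__(self, transport: Transport, bus_name: str, object_path: str):
        self._transport = transport
        self._bus_name = bus_name
        self._object_path = object_path

    def __getattr__(self, method_name: str) -> Callable[..., Any]:
        # Only called for attributes that are not found normally, so any
        # unknown attribute is treated as the name of a remote method.
        def remote_call(*args: Any) -> Any:
            return self._transport(
                self._bus_name, self._object_path, method_name, args
            )
        return remote_call


def fake_transport(bus_name: str, path: str, method: str, args: tuple) -> Any:
    # Stand-in for the bus: reports what would have been sent.
    return f"{bus_name} {path} {method}{args}"


proxy = Proxy(fake_transport, "com.example.Service", "/remote/object/path")
reply = proxy.MethodName("arg1", 2)
print(reply)
```

The caller only sees proxy.MethodName(...), while the bus name, object path, and method name travel together to the transport.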

  • Bus names

The bus name, sometimes also called the connection name, is the name that an application’s connection gets assigned when it connects to D-Bus. Because D-Bus is a bus architecture, each assigned name must be unique; you can’t have two applications using the same bus name. Usually, it is the D-Bus daemon that generates this random unique value, one that begins with a colon by convention. However, applications may ask to own well-known names instead. These well-known names, written as reverse domain names, are for cases when people want to agree on a standard unique application that should implement a certain behavior. Let’s say, for instance, a specification for a com.mycompany.TextEditor bus name, where the mandatory object path should be /com/mycompany/TextFileManager, and the supported interface org.freedesktop.FileHandler. This makes the desktop environment more predictable and stable. However, today this is still only a dream and has nothing to do with current desktop environment implementations.
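
The difference between unique and well-known names can be checked mechanically. Here is a rough Python sketch based on my reading of the naming rules in the D-Bus specification (at least two dot-separated elements, a restricted character set, a 255-character limit, and leading digits only allowed in unique names). The classify_bus_name helper is a hypothetical name, and the rules are simplified.

```python
import re

# One element of a bus name: ASCII letters, digits, "_" and "-".
_ELEMENT = re.compile(r"[A-Za-z0-9_-]+")


def classify_bus_name(name: str) -> str:
    """Return "unique", "well-known", or "invalid" for a bus name."""
    if not name or len(name) > 255:
        return "invalid"
    unique = name.startswith(":")
    elements = (name[1:] if unique else name).split(".")
    if len(elements) < 2:
        return "invalid"  # at least two dot-separated elements
    for element in elements:
        if not _ELEMENT.fullmatch(element):
            return "invalid"  # empty element or forbidden character
        # Only unique names may have elements that start with a digit.
        if not unique and element[0].isdigit():
            return "invalid"
    return "unique" if unique else "well-known"


print(classify_bus_name(":1.42"))                     # unique
print(classify_bus_name("com.mycompany.TextEditor"))  # well-known
print(classify_bus_name("TextEditor"))                # invalid
```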

  • Connection and address of D-Bus daemon

The D-Bus daemon is the core of D-Bus; it is what everything else attaches itself to. Thus, the address that the daemon uses and listens on should be well known to clients. The means of communication can vary from UNIX domain sockets to TCP/IP sockets if used remotely.
In normal scenarios, there are two daemons running, a system-wide daemon and a per-session daemon, one for system-level applications and one for session related applications such as desktop environments. The address of the session bus can be discovered by reading the environment variable $DBUS_SESSION_BUS_ADDRESS, while the address of the system D-Bus daemon is discovered by checking a predefined UNIX domain socket path, though it can be overridden by using another environment variable, namely $DBUS_SYSTEM_BUS_ADDRESS.
Keep in mind that it’s always possible to start private buses, private daemons for non-standard use.
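
The lookup described above can be sketched in a few lines of Python. The default system socket path used here is an assumption (the exact location differs between distributions), the bus_address and parse_address helpers are hypothetical names, and the parsing ignores the escaping rules that real addresses may use.

```python
import os
from typing import Mapping, Optional

# Assumed default; the exact socket path can differ between distributions.
DEFAULT_SYSTEM_BUS = "unix:path=/var/run/dbus/system_bus_socket"


def bus_address(kind: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the address string for the "session" or "system" bus."""
    if kind == "session":
        return env.get("DBUS_SESSION_BUS_ADDRESS")
    if kind == "system":
        return env.get("DBUS_SYSTEM_BUS_ADDRESS", DEFAULT_SYSTEM_BUS)
    raise ValueError(f"unknown bus kind: {kind!r}")


def parse_address(address: str) -> list:
    """Split "transport:key=value,..." entries separated by ";"."""
    entries = []
    for entry in address.split(";"):
        if not entry:
            continue
        transport, _, params = entry.partition(":")
        pairs = dict(p.split("=", 1) for p in params.split(",") if p)
        entries.append((transport, pairs))
    return entries


# A made-up environment, so the example does not depend on the machine.
env = {"DBUS_SESSION_BUS_ADDRESS": "unix:path=/run/user/1000/bus"}
print(parse_address(bus_address("session", env)))
# [('unix', {'path': '/run/user/1000/bus'})]
print(parse_address(bus_address("system", env)))
# [('unix', {'path': '/var/run/dbus/system_bus_socket'})]
```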

  • Service

A service is the application daemon connected to a bus that provides some utility to clients via the objects it contains, which implement some interfaces. Normally we talk of services when the bus name is well-known, as in not auto-generated but using a reverse domain name. Due to the nature of D-Bus, services are singletons and owners of their bus name, and thus are the only applications that can fulfill specific requests. If any other application wants to use that particular bus name, it has to wait in a queue of aspiring owners until the first one relinquishes it.

Within the D-Bus ecosystem, you can request that the D-Bus daemon automatically start a program, if not already started, that provides a given service (well-known name) whenever it’s needed. We call this service activation. It’s quite convenient as you don’t have to remember what application does what, nor care if it’s already running, but instead send a generic request to D-Bus and rely on it to launch it.

To do this we have to define a service file in the /usr/share/dbus-1/services/ directory that describes what and how the service will run.
A simple example goes as follows.

[D-BUS Service]
Name=org.gnome.ServiceName
Exec=program-providing-servicename

You can also specify the user with which the command will be executed using a User= line, and even specify if it’s in relation with a systemd service using SystemdService=.
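
Service files follow an INI-like layout, so for simple cases they can be read with Python’s configparser. This is only an approximation of how the daemon parses them, and the read_service_file helper is a hypothetical name.

```python
import configparser

# The example service file from above.
SERVICE_FILE = """\
[D-BUS Service]
Name=org.gnome.ServiceName
Exec=program-providing-servicename
"""


def read_service_file(text: str) -> dict:
    """Extract the activation details from a service file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as SystemdService
    parser.read_string(text)
    section = parser["D-BUS Service"]
    return {
        "name": section["Name"],                    # bus name to activate
        "exec": section["Exec"],                    # command to launch
        "user": section.get("User"),                # optional
        "systemd": section.get("SystemdService"),   # optional
    }


info = read_service_file(SERVICE_FILE)
print(info["name"], "->", info["exec"])
# org.gnome.ServiceName -> program-providing-servicename
```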

Additionally, if you are creating a full service, it’s a good practice to define its interfaces explicitly in the /usr/share/dbus-1/interfaces as we previously mentioned.

Now, when calling org.gnome.ServiceName, D-Bus will check whether the service already exists on the bus. If it doesn’t, D-Bus will block the method call and search for a matching service file in the directory. If one matches, it starts the service as specified so that it takes ownership of the bus name, and finally continues with the method call. If there’s no service file, an error is returned. It’s possible to make such a call asynchronous programmatically to avoid blocking.

This is actually a mechanism that systemd can use for service activation when the application acquires a name on dbus (service Type=dbus), as polkit and wpa_supplicant do, for example. When the dbus daemon is started with --systemd-activation, as shown below, systemd services can be started on the fly whenever they are needed. That’s also related to the SystemdService= line we previously mentioned, as both a systemd unit file and a dbus daemon service file are required in tandem.

dbus         498       1  0 Jun05 ?        00:01:41 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
vnm          810     795  0 Jun05 ?        00:00:19 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only

And the systemd unit file for polkit.

[Unit]
Description=Authorization Manager
Documentation=man:polkit(8)

[Service]
Type=dbus
BusName=org.freedesktop.PolicyKit1
ExecStart=/usr/lib/polkit-1/polkitd --no-debug

Here’s an exploratory example of service activation.
Let’s say we found a service file for Cheese (a webcam app) in the services directory that is called org.gnome.Cheese.service.

We have no clue what interfaces and methods it implements because its interfaces aren’t described in the interfaces directory, so we send it any message.

$ dbus-send --session \
--dest=org.gnome.Cheese \
/ org.gnome.Cheese.nonexistent

If we now take a look at the processes, we can clearly see it has been started by the dbus daemon.

$ ps -ef | grep -i cheese
vnm        56841     716 11 20:53 ?        00:00:00 /usr/bin/cheese --gapplication-service
vnm        56852   56783  0 20:53 pts/4    00:00:00 grep -i cheese

Cheese probably implements introspection, so let’s try to see which methods it has.

$ gdbus introspect --session \
--dest org.gnome.Cheese \
--object-path /org/gnome/Cheese | less

We can see that it implements the org.freedesktop.Application interface, which is described in the freedesktop.org Desktop Entry Specification, but whose interface description I couldn’t find in /usr/share/dbus-1/interfaces/. So let’s try to call one of its methods. org.freedesktop.Application.Activate seems interesting; it should start the application for us.

$ gdbus call --session --dest org.gnome.Cheese \
--object-path /org/gnome/Cheese \
--method org.freedesktop.Application.Activate  '{}'

NB: I’m using gdbus instead of dbus-send because dbus-send has limitations with complex types such as (a{sv}), a dictionary with keys of type “string” and values of type “variant”. We’ll explain the types in the next section.

And cheese will open.
So this call is based on pure service activation.

What kind of messages are sent, and what’s up with the types we mentioned?

Messages, the unit of data transfer in D-Bus, are composed of header and data. The header contains information regarding the sender, receiver, and the message type, while the data is the payload of the message.

The D-Bus message type, not to be confused with the type format of the data payload, can be a signal (DBUS_MESSAGE_TYPE_SIGNAL), a method call (DBUS_MESSAGE_TYPE_METHOD_CALL), a method return (DBUS_MESSAGE_TYPE_METHOD_RETURN), or an error (DBUS_MESSAGE_TYPE_ERROR).

D-Bus is fully typed and type-safe as far as the payload is concerned, meaning that the types are predefined and are checked to see if they fit the signatures.

The following types are available:

<contents>   ::= <item> | <container> [ <item> | <container>...]
<item>       ::= <type>:<value>
<container>  ::= <array> | <dict> | <variant>
<array>      ::= array:<type>:<value>[,<value>...]
<dict>       ::= dict:<type>:<type>:<key>,<value>[,<key>,<value>...]
<variant>    ::= variant:<type>:<value>
<type>       ::= string | int16 | uint16 | int32 | uint32 | int64 | uint64 | double | byte | boolean | objpath

In the interface definitions we saw earlier, the type= attribute expresses these types as short letter codes. Here are some of them.

b           ::= boolean
s           ::= string
i           ::= int32
u           ::= uint32
d           ::= double
o           ::= object path
v           ::= variant (could be different types)
a{keyvalue} ::= dictionary of key-value type
atype       ::= array of values of type (as is an array of strings)
(types)     ::= struct grouping several types
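
Reading nested signatures such as a{sa{sv}} by eye takes some practice, so here is a small recursive parser, written in Python, that spells them out. It is only a sketch under my reading of the D-Bus type system: the describe helper is a hypothetical name, and it doesn’t enforce every rule of the specification (for instance, that dictionary keys must be basic types).

```python
# Single-letter codes of the non-container types.
BASIC_TYPES = {
    "y": "byte", "b": "boolean", "n": "int16", "q": "uint16",
    "i": "int32", "u": "uint32", "x": "int64", "t": "uint64",
    "d": "double", "s": "string", "o": "object path",
    "g": "signature", "h": "unix fd", "v": "variant",
}


def _parse_one(sig: str, pos: int) -> tuple:
    """Parse one complete type at pos; return (description, next position)."""
    if pos >= len(sig):
        raise ValueError("unexpected end of signature")
    code = sig[pos]
    if code in BASIC_TYPES:
        return BASIC_TYPES[code], pos + 1
    if code == "a":
        if sig[pos + 1:pos + 2] == "{":
            # a{KV}: an array of dict entries, i.e. a dictionary.
            key, pos = _parse_one(sig, pos + 2)
            value, pos = _parse_one(sig, pos)
            if sig[pos:pos + 1] != "}":
                raise ValueError("unterminated dict entry")
            return f"dict of {key} to {value}", pos + 1
        element, pos = _parse_one(sig, pos + 1)
        return f"array of {element}", pos
    if code == "(":
        fields, pos = [], pos + 1
        while sig[pos:pos + 1] != ")":
            field, pos = _parse_one(sig, pos)
            fields.append(field)
        return "struct of (" + ", ".join(fields) + ")", pos + 1
    raise ValueError(f"unknown type code {code!r}")


def describe(sig: str) -> list:
    """Describe every complete type in a signature, in order."""
    described, pos = [], 0
    while pos < len(sig):
        description, pos = _parse_one(sig, pos)
        described.append(description)
    return described


print(describe("a{sa{sv}}"))
# ['dict of string to dict of string to variant']
print(describe("iiiibs"))
# ['int32', 'int32', 'int32', 'int32', 'boolean', 'string']
```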

As was said, the actual method of transfer of the information isn’t mandated by the protocol, but it can usually be done locally via UNIX sockets, pipes, or via TCP/IP.

It wouldn’t be very secure to have anyone on the machine be able to send messages to the dbus daemon and do service activation, or call any and every method, as some of them could be dealing with sensitive data and activities. It wouldn’t be very secure either to have this data sent in plain text.
On the transfer side, that is why D-Bus implements a simple protocol based on SASL profiles for authenticating one-to-one connections. For authorization, the dbus daemon controls access to interfaces through a security system of policies.

The policies are read from XML files that can be found in multiple places, including /usr/share/dbus-1/session.conf, /usr/share/dbus-1/system.conf, /usr/share/dbus-1/session.d/*, and /usr/share/dbus-1/system.d/*.
These files mainly control which user can talk to which interface. If you are not able to talk with a D-Bus service, or get an org.freedesktop.DBus.Error.AccessDenied error, then it’s probably due to one of these files.

For example:

<!DOCTYPE busconfig PUBLIC
 "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
 "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>
	<policy user="vnm">
		<allow own="net.nixers"/>
		<allow send_destination="net.nixers"/>
		<allow send_interface="net.nixers.Blog" send_member="GetPosts"/>
	</policy>
</busconfig>

In this example, the user “vnm” can:

  • Own the bus name net.nixers
  • Send messages to the owner of the given service
  • Call GetPosts from interface net.nixers.Blog
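
To audit what such a file grants, the rules can be listed with a few lines of Python. This is a sketch using the standard XML parser. The list_rules helper is a hypothetical name, and the DOCTYPE line is left out of the embedded sample for brevity.

```python
import xml.etree.ElementTree as ET

# The policy from above, without its DOCTYPE declaration.
POLICY_XML = """
<busconfig>
  <policy user="vnm">
    <allow own="net.nixers"/>
    <allow send_destination="net.nixers"/>
    <allow send_interface="net.nixers.Blog" send_member="GetPosts"/>
  </policy>
</busconfig>
"""


def list_rules(xml_text: str) -> list:
    """Return (policy attributes, verdict, rule attributes) for each rule."""
    rules = []
    root = ET.fromstring(xml_text.strip())
    for policy in root.iter("policy"):
        for rule in policy:
            if rule.tag in ("allow", "deny"):
                rules.append((dict(policy.attrib), rule.tag, dict(rule.attrib)))
    return rules


for who, verdict, what in list_rules(POLICY_XML):
    print(who, verdict, what)
# {'user': 'vnm'} allow {'own': 'net.nixers'}
# {'user': 'vnm'} allow {'send_destination': 'net.nixers'}
# {'user': 'vnm'} allow {'send_interface': 'net.nixers.Blog', 'send_member': 'GetPosts'}
```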

If services need more granularity when it comes to permission, then polkit can be used instead.

There’s a lot more that can be configured in the dbus daemon, namely in the configuration files for the session-wide daemon in /usr/share/dbus-1/session.conf, and the system-wide daemon in /usr/share/dbus-1/system.conf. This includes the way it listens for connections, the limits regarding messages, and where it reads other files from.

So how do we integrate and harness dbus in our client or service programs?

libdbus schema

We do this using libraries, of course, of which there are many. The most low-level one is libdbus, the reference implementation of the specification. However, it’s quite hard to use, so people rely on other libraries such as GDBus (part of GLib in GNOME), QtDBus (part of Qt, so KDE too), dbus-java, and sd-bus (which is part of systemd).
Some of these libraries offer the proxy capability we’ve talked about, namely manipulating dbus objects as if they were local. They may also offer ways to generate classes in the programming language of choice from an interface definition file (see gdbus-codegen and qdbusxml2cpp for an idea).

Let’s name a few projects that rely on D-Bus.

  • KDE: A desktop environment based on Qt
  • GNOME: A desktop environment based on GTK
  • Systemd: An init system
  • Bluez: A project adding Bluetooth support under Linux
  • Pidgin: An instant messaging client
  • Network-manager: A daemon to manage network interfaces
  • Modem-manager: A daemon providing an API to communicate with modems - works with Network-Manager
  • Connman: Same as Network-Manager but works with Ofono for modems
  • Ofono: A daemon exposing features provided by telephony devices such as modems

One thing that is nice about D-Bus is that there is a lot of tooling to interact with it; it lends itself well to exploration.

Here’s a bunch of useful ones:

  • dbus-send: send messages to dbus
  • dbus-monitor: monitor all messages
  • gdbus: manipulate dbus with GLib
  • qdbus: manipulate dbus with Qt
  • QDBusViewer: exploratory gui
  • D-Feet: exploratory gui

I’ll list some examples.

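Monitor all the method calls made on an interface, here org.freedesktop.Notifications. Note that the keys of a match rule are comma-separated within a single argument, and that the interface key matches a full interface name rather than a namespace prefix.

$ dbus-monitor --session \
"type='method_call',interface='org.freedesktop.Notifications'"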

For instance, we can debug what happens when we use the command line tool notify-send(1).

This is equivalent to this line of gdbus(1).

$ gdbus call --session --dest org.freedesktop.Notifications \
--object-path /org/freedesktop/Notifications \
--method org.freedesktop.Notifications.Notify \
my_app_name 42 \
gtk-dialog-info "The Summary" \
"Here's the body of the notification" '[]' '{}' 5000

Or, as we’ve seen, we can use dbus-send(1); however, it has some limitations with dictionaries and variant types. Here are some more examples of it.

$ dbus-send --system --print-reply \
--dest=org.freedesktop.systemd1 \
/org/freedesktop/systemd1/unit/apache2_2eservice \
org.freedesktop.DBus.Properties.Get \
string:'org.freedesktop.systemd1.Unit' \
string:'ActiveState'

$ dbus-send --system --print-reply --type=method_call \
--dest=org.freedesktop.systemd1 \
/org/freedesktop/systemd1 \
org.freedesktop.systemd1.Manager.GetUnit \
string:'apache2.service'

D-Feet QDBusViewer

D-Feet and QDBusViewer are GUIs driven by the introspectability of objects. You can also introspect using gdbus and qdbus.

One way is to call org.freedesktop.DBus.Introspectable.Introspect directly.

With gdbus:

$ gdbus call --session --dest org.freedesktop.Notifications \
--object-path /org/freedesktop/Notifications \
--method org.freedesktop.DBus.Introspectable.Introspect

With dbus-send:

$ dbus-send --session --print-reply \
--dest=org.freedesktop.Notifications \
/org/freedesktop/Notifications \
org.freedesktop.DBus.Introspectable.Introspect

Or by using the introspect feature of the tool, here gdbus, which will output it in a fancy colored way:

$ gdbus introspect --session \
--dest org.freedesktop.Notifications \
--object-path /org/freedesktop/Notifications

D-Bus is not without limitations and criticism. As we said in the introduction, it isn’t meant for high performance IPC; it’s meant for control, not data transfer. So it’s fine to use it to activate a chat application, for instance, but not to have a whole media stream pass through it.
D-Bus has also been criticized as being bloated and over-engineered, though those claims are often unsubstantiated and only come from online rants. The fact remains that D-Bus is still heavily popular and that no replacement is a real contender.

Now, let’s turn our attention to Polkit.

Polkit, formerly PolicyKit, is a service running on dbus that offers clients a way to perform granular system-wide privilege authorization, something that neither the default dbus policies nor sudo are able to do.
Unlike sudo, which switches the user and grants permission to the whole process, polkit delimits distinct actions, categorizes users by group or name, and decides whether the action is allowed or not. This is all offered system-wide, so that dbus services can query polkit to know whether clients have privileges or not.
In polkit parlance, we talk of MECHANISMS, privileged services, that offer actions to SUBJECTS, which are unprivileged programs.

The polkit authority is a system daemon named “polkitd”, usually started through dbus service activation, and running as the polkitd user.

$ ps -ef | grep polkitd
polkitd   904  1  0 Jun05 ?  00:00:34 /usr/lib/polkit-1/polkitd --no-debug

The privileged services (MECHANISMS) can define a set of actions for which authentication is required. If another process wants to access a method of such a privileged service, maybe through a dbus method call, the privileged service will query polkit. Polkit will then consult two things: the action policy defined by that service and a set of programmatic rules that generally apply. If needed, polkit will initiate an authentication agent to verify that the user is who they say they are. Finally, polkit sends its result back to the privileged service and lets it know whether the user is allowed to perform the action or not.

In summary, the following definitions apply:

  • Subject - a user
  • Action - a privileged duty that (generally) requires some authentication.
  • Result - the action to take given a subject/action pair and a set of rules. This may be to continue, to deny, or to prompt for a password.
  • Rule - a piece of logic that maps a subject/action pair to a result.

And they materialize in these files:

  • /usr/share/polkit-1/actions - Default policies for each action. These tell polkit whether to allow, deny, or prompt for a password.
  • /etc/polkit-1/rules.d - User-supplied rules. These are JavaScript scripts.
  • /usr/share/polkit-1/rules.d - Distro-supplied rules. Do not change these because they will be overwritten by the next upgrade.

Which can be summarized in this picture:

polkit architecture

Thus, polkit works alongside a per-session authentication agent, usually started by the desktop environment. This is another service, used whenever a user needs to be prompted for a password to prove their identity.
The polkit package contains a textual authentication agent called pkttyagent, which is used as a general fallback but is lacking in features. I advise anyone trying the examples in this post to install a decent authentication agent instead.

Here’s a list of popular ones:

  • lxqt-policykit - which provides /usr/bin/lxqt-policykit-agent
  • lxsession - which provides /usr/bin/lxpolkit
  • mate-polkit - which provides /usr/lib/mate-polkit/polkit-mate-authentication-agent-1
  • polkit-efl - which provides /usr/bin/polkit-efl-authentication-agent-1
  • polkit-gnome - which provides /usr/lib/polkit-gnome/polkit-gnome-authentication-agent-1
  • polkit-kde-agent - which provides /usr/lib/polkit-kde-authentication-agent-1
  • ts-polkitagent - which provides /usr/lib/ts-polkitagent
  • xfce-polkit - which provides /usr/lib/xfce-polkit/xfce-polkit

Authentication agent

Services/mechanisms have to define the set of actions for which clients require authentication. This is done by defining a policy XML file in the /usr/share/polkit-1/actions/ directory. The actions are defined in a namespaced format, and there can be multiple ones per policy file.
A simple grep '<action id' * | less in this directory should give an idea of the type of actions that are available. You can also list all the installed polkit actions using the pkaction(1) command.

For example:

org.xfce.thunar.policy: <action id="org.xfce.thunar">
org.freedesktop.policykit.policy:  <action id="org.freedesktop.policykit.exec">

NB: File names aren’t required to be the same as the action id namespace.

This file defines metadata information for each action, such as the vendor, the vendor URL, the icon name, the message that will be displayed when requiring authentication in multiple languages, and the description. The important sections in the action element are the defaults and annotate elements.

The defaults element is the one that polkit inspects to know if a client is authorized or not. It is composed of 3 mandatory sub-elements: allow_any for the authorization policy that applies to any client, allow_inactive for the policy that applies to clients in inactive sessions on local consoles, and allow_active for clients in the currently active session on local consoles.
These elements take as value one of the following:

  • no - Not authorized
  • yes - Authorized.
  • auth_self - The owner of the current session should authenticate (usually the user that logged in, your user password)
  • auth_admin - Authentication by the admin is required (root)
  • auth_self_keep - Same as auth_self but the authentication is kept for some time that is defined in polkit configurations.
  • auth_admin_keep - Same as auth_admin but also keeps it for some time

The annotate element is used to pass extra key-value pairs to the action, and there can be several of them. Some annotations are well known, such as org.freedesktop.policykit.exec.path which, if passed to the pkexec program that is shipped by default with polkit, will tell it how to execute a certain program.
Another defined annotation is the org.freedesktop.policykit.imply which will tell polkit that if a client was authorized for the action it should also be authorized for the action in the imply annotation.
One last interesting annotation is the org.freedesktop.policykit.owner, which will let polkitd know who has the right to interrogate it about whether other users are currently authorized to do certain actions or not.
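
To make the shape of an action more concrete, here is a Python sketch that reads the defaults and annotations out of a policy file. The policy content is invented for the example (the org.example.hypothetical.run-tool action and its path don’t exist anywhere), and the read_actions helper is a hypothetical name. Only the element names follow the polkit policy format as I understand it.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy file content; the action id and path are invented.
POLICY_XML = """
<policyconfig>
  <action id="org.example.hypothetical.run-tool">
    <description>Run the example tool</description>
    <message>Authentication is required to run the example tool</message>
    <defaults>
      <allow_any>no</allow_any>
      <allow_inactive>no</allow_inactive>
      <allow_active>auth_admin_keep</allow_active>
    </defaults>
    <annotate key="org.freedesktop.policykit.exec.path">/usr/bin/example-tool</annotate>
  </action>
</policyconfig>
"""


def read_actions(xml_text: str) -> dict:
    """Map each action id to its defaults and annotations."""
    actions = {}
    root = ET.fromstring(xml_text.strip())
    for action in root.iter("action"):
        defaults = {}
        defaults_el = action.find("defaults")
        if defaults_el is not None:
            defaults = {c.tag: (c.text or "").strip() for c in defaults_el}
        annotations = {
            a.get("key"): (a.text or "").strip()
            for a in action.findall("annotate")
        }
        actions[action.get("id")] = {
            "defaults": defaults,
            "annotations": annotations,
        }
    return actions


actions = read_actions(POLICY_XML)
print(actions["org.example.hypothetical.run-tool"]["defaults"]["allow_active"])
# auth_admin_keep
```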

Other than policy actions, polkit also offers a rule system that is applied every time it needs to resolve an authorization. The rules are defined in two directories, /etc/polkit-1/rules.d/ and /usr/share/polkit-1/rules.d/. As users, we normally add custom rules to the /etc/ directory and leave /usr/share/ for the rules of distro packages.
Rules within these files are written in JavaScript and come with a preset of helper methods that live under the polkit object.

The polkit JavaScript object comes with the following methods, which are self-explanatory.

  • void addRule( polkit.Result function(action, subject) {...});
  • void addAdminRule( string[] function(action, subject) {...}); called when administrator authentication is required
  • void log( string message);
  • string spawn( string[] argv);

The polkit.Result object is defined as follows:

polkit.Result = {
    NO              : "no",
    YES             : "yes",
    AUTH_SELF       : "auth_self",
    AUTH_SELF_KEEP  : "auth_self_keep",
    AUTH_ADMIN      : "auth_admin",
    AUTH_ADMIN_KEEP : "auth_admin_keep",
    NOT_HANDLED     : null
};

Note that the rule files are processed in alphabetical order, and thus if a rule is processed before another and returns any value other than polkit.Result.NOT_HANDLED, for example polkit.Result.YES, then polkit won’t bother continuing processing the next files. Thus, file name convention does matter.
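
The first-match behavior, with the action defaults as a fallback, can be modeled in a few lines. The following is a deliberately simplified Python model and not how polkitd is implemented: rules are plain functions, None plays the role of polkit.Result.NOT_HANDLED, and the subject is a dictionary whose local and active keys are assumptions of mine.

```python
from typing import Callable, Dict, List, Optional

# A rule inspects the action id and the subject; None stands for
# polkit.Result.NOT_HANDLED in this model.
Rule = Callable[[str, dict], Optional[str]]


def authorize(action_id: str, subject: dict, rules: List[Rule],
              defaults: Dict[str, str]) -> str:
    # Rules run in order and the first one that decides wins.
    for rule in rules:
        result = rule(action_id, subject)
        if result is not None:
            return result
    # No rule handled the request: fall back to the action defaults.
    if not subject.get("local", False):
        return defaults["allow_any"]
    if subject.get("active", False):
        return defaults["allow_active"]
    return defaults["allow_inactive"]


def admins_may_administer(action_id: str, subject: dict) -> Optional[str]:
    if (action_id == "org.freedesktop.accounts.user-administration"
            and "admin" in subject["groups"]):
        return "yes"
    return None


DEFAULTS = {"allow_any": "no", "allow_inactive": "no",
            "allow_active": "auth_admin"}
ACTION = "org.freedesktop.accounts.user-administration"

# Made-up subjects for the example.
alice = {"user": "alice", "groups": ["admin"], "local": True, "active": True}
bob = {"user": "bob", "groups": ["users"], "local": True, "active": True}

print(authorize(ACTION, alice, [admins_may_administer], DEFAULTS))  # yes
print(authorize(ACTION, bob, [admins_may_administer], DEFAULTS))    # auth_admin
```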

The functions polkit.addRule and polkit.addAdminRule take callbacks with the same arguments, namely an action and a subject. The action is the one being requested; it has an id attribute and a lookup() method to fetch annotation values. The subject has attributes such as pid, user, groups, seat, and session, and methods such as isInGroup and isInNetGroup.

Here are some examples taken from the official documentation:

Log the action and subject whenever the action org.freedesktop.policykit.exec is requested.

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.policykit.exec") {
        polkit.log("action=" + action);
        polkit.log("subject=" + subject);
    }
});

Allow all users in the admin group to perform user administration without changing policy for other users.

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.accounts.user-administration" &&
        subject.isInGroup("admin")) {
        return polkit.Result.YES;
    }
});

Define administrative users to be the users in the wheel group:

polkit.addAdminRule(function(action, subject) {
    return ["unix-group:wheel"];
});

Run an external helper to determine if the current user may reboot the system:

polkit.addRule(function(action, subject) {
    if (action.id.indexOf("org.freedesktop.login1.reboot") == 0) {
        try {
            // user-may-reboot exits with success (exit code 0)
            // only if the passed username is authorized
            polkit.spawn(["/opt/company/bin/user-may-reboot",
                          subject.user]);
            return polkit.Result.YES;
        } catch (error) {
            // Nope, but do allow admin authentication
            return polkit.Result.AUTH_ADMIN;
        }
    }
});

The following example shows how the authorization decision can depend on variables passed by the pkexec(1) mechanism:

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.policykit.exec" &&
        action.lookup("program") == "/usr/bin/cat") {
        return polkit.Result.AUTH_ADMIN;
    }
});

Keep in mind that polkit tracks changes in both the policy and rules directories, so there’s no need to restart polkit: changes take effect immediately.

We’ve mentioned a tool called pkexec(1) that ships with polkit. This program lets you execute a command as another user, root by default. It is a sort of sudo replacement, though it may confuse users who have no idea about polkit. However, the integration with authentication agents is quite nice.

So how do we integrate and harness polkit in our subject and mechanism software? We do this using libraries, of course, of which there are many to integrate with different desktop environments.
Mechanisms use the libpolkit-gobject-1 (GLib/GObject) library, while libpolkit-agent-1 is meant for writing authentication agents. This is most of what is needed: the portion of code that requires authorization can be wrapped with a check on polkit.
For instance, polkit_authority_check_authorization() is used to check whether a subject is authorized.

As for writing an authentication agent, it will have to implement the registration methods to be able to receive requests from polkit.

Remember, polkit is a D-Bus service, and thus all its interfaces are well known and can be introspected. That means you can interact with it directly through D-Bus instead of using a helper library.
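
For instance, the authority exposes a CheckAuthorization method on the org.freedesktop.PolicyKit1.Authority interface. The sketch below only assembles a gdbus(1) command line for it rather than talking to the bus itself. The buildCheckAuthorization helper is hypothetical, and the argument layout follows my reading of the polkit D-Bus reference, so verify it by introspecting the service before relying on it.

```javascript
// Hypothetical helper: builds the argv for a gdbus(1) call to polkit's
// CheckAuthorization method. It does not execute anything.
function buildCheckAuthorization(pid, startTime, actionId, allowInteraction) {
    // Subject is a (kind, details) pair; for a process, polkit expects
    // the pid and the process start time (assumed layout, see lead-in).
    const subject = "('unix-process', {'pid': <uint32 " + pid + ">, " +
                    "'start-time': <uint64 " + startTime + ">})";
    return [
        "gdbus", "call", "--system",
        "--dest", "org.freedesktop.PolicyKit1",
        "--object-path", "/org/freedesktop/PolicyKit1/Authority",
        "--method", "org.freedesktop.PolicyKit1.Authority.CheckAuthorization",
        subject,
        actionId,                        // action_id
        "@a{ss} {}",                     // details: empty string dictionary
        allowInteraction ? "1" : "0",    // flags: 1 = allow user interaction
        "",                              // cancellation_id: unused here
    ];
}

// Pid, start time and action borrowed from the pkcheck session in this article.
const argv = buildCheckAuthorization(421622, 195039910,
                                     "org.freedesktop.systemd1.manage-units", true);
console.log(argv.join(" "));
```

The printed line is for reading only, since it is not shell-quoted. The array can instead be handed to something like Node’s child_process.execFile, which passes each argument as-is.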

Polkit also comes with excellent manpages that are extremely useful; be sure to check polkit(8), polkitd(8), pkcheck(1), pkaction(1), and pkexec(1).

The following tools are of help:

  • polkit-explorer or polkitex - a GUI to inspect policy files
  • pkcreate - a WIP tool to easily create policy files, but it seems it is lacking
  • pkcheck - Check whether a subject has privileges or not
  • pkexec - Execute a command as another user

Let’s test through some examples.

First pkaction(1), to query the policy file.

$ pkaction -a org.xfce.thunar -v

org.xfce.thunar:
  description:       Run Thunar as root
  message:           Authentication is required to run Thunar as root.
  vendor:            Thunar
  vendor_url:        https://xfce.org/
  icon:              system-file-manager
  implicit any:      auth_self_keep
  implicit inactive: auth_self_keep
  implicit active:   auth_self_keep
  annotation:        org.freedesktop.policykit.exec.path -> /usr/bin/thunar
  annotation:        org.freedesktop.policykit.exec.allow_gui -> true

Compared to polkitex:

[screenshot: the same action viewed in polkitex]

We can get the current shell PID.

$ ps
    PID TTY          TIME CMD
 421622 pts/21   00:00:00 zsh
 421624 pts/21   00:00:00 ps

And then grant ourselves a temporary authorization for the org.freedesktop.systemd1.manage-units action.

$ pkcheck --action-id 'org.freedesktop.systemd1.manage-units' --process 421622 -u
$ pkcheck --list-temp
authorization id: tmpauthz10
action:           org.freedesktop.systemd1.manage-units
subject:          unix-process:421622:195039910 (zsh)
obtained:         26 sec ago (Sun Jun 28 10:53:39 2020)
expires:          4 min 33 sec from now (Sun Jun 28 10:58:38 2020)

As you can see, if auth_admin_keep or auth_self_keep is set, the authorization is kept for a while and can be listed using pkcheck.

You can try to execute a process as another user, just like with sudo:

$ pkexec /usr/bin/thunar

If you want to override the currently running authentication agent, you can run pkttyagent in another terminal, passing it the -p argument with the PID of the process it should handle authentication for.

# terminal 1
$ pkttyagent -p 423619
# terminal 2
$ pkcheck --action-id 'org.xfce.thunar' --process 423619 -u
# will display in terminal 1
polkit\56temporary_authorization_id=tmpauthz13
polkit\56retains_authorization_after_challenge=true
==== AUTHENTICATING FOR org.xfce.thunar ====
Authentication is required to run Thunar as root.
Authenticating as: vnm
Password: 
==== AUTHENTICATION COMPLETE ====

So that’s it for polkit, but what’s the deal with consolekit and systemd logind, and how do they relate to polkit?

Remember that we talked about sessions when discussing the <defaults> element of polkit policy files; this is where these two come in. Let’s quote again:

  • auth_self - The owner of the current session should authenticate (usually the user that logged in, your user password)
  • allow_active - for client in the currently active session on local consoles

The purpose of consolekit and systemd logind is to act as D-Bus services that can be queried about the status of the current session, its users, its seats, and its login. They can also be used to manage the session, with methods for shutting down, suspending, restarting, and hibernating the machine.

$ loginctl show-session $XDG_SESSION_ID
Id=2
Name=vnm
Timestamp=Fri 2020-06-05 21:06:43 EEST
[...snip...]
Remote=no
Active=yes
State=active

# in another terminal we monitor using
$ dbus-monitor --system
# and the output
method call time=1593360621.762509 sender=:1.59516 \
-> destination=org.freedesktop.login1 serial=2 \
path=/org/freedesktop/login1; \
interface=org.freedesktop.login1.Manager; \
member=GetSession

method call time=1593360621.763069 sender=:1.59516 \
-> destination=org.freedesktop.login1 serial=3 \
path=/org/freedesktop/login1/session/_32; \
interface=org.freedesktop.DBus.Properties; \
member=GetAll

As can be seen, this is done through the org.freedesktop.login1.Manager interface, exposed under the org.freedesktop.login1 bus name.
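
Note how the session with Id=2 shows up under the object path ending in session/_32. systemd escapes characters that are awkward in object paths, including a leading digit, as an underscore followed by the character’s hexadecimal code, and 0x32 is the ASCII code for the digit 2. Here is a minimal sketch of that escaping as I understand it, with escapeLabel being a hypothetical name rather than a systemd API:

```javascript
// Sketch of systemd-style label escaping for D-Bus object paths.
// escapeLabel is a made-up name; the rule implemented here is an assumption
// based on observed paths such as /org/freedesktop/login1/session/_32.
// Assumes ASCII input.
function escapeLabel(label) {
    if (label.length === 0) {
        return "_"; // assumed convention for the empty string
    }
    let out = "";
    for (let i = 0; i < label.length; i++) {
        const c = label[i];
        const isLetter = /[A-Za-z]/.test(c);
        const isDigit = /[0-9]/.test(c);
        // Letters are kept, digits too unless they come first.
        if (isLetter || (isDigit && i > 0)) {
            out += c;
        } else {
            const hex = c.charCodeAt(0).toString(16);
            out += "_" + (hex.length < 2 ? "0" + hex : hex);
        }
    }
    return out;
}

console.log("/org/freedesktop/login1/session/" + escapeLabel("2"));
// prints /org/freedesktop/login1/session/_32
```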

And so, polkit uses data gathered from systemd logind or consolekit to evaluate the three implicit authorizations we’ve seen: allow_any, allow_inactive, and allow_active. This is where these two interact with one another.
The following conditions apply to the values returned by systemd logind:

  • allow_any means any session (even remote sessions)
  • allow_inactive means Remote == false and Active == false
  • allow_active means Remote == false and Active == true
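
Put as code, the mapping might look like the sketch below. The defaultsKeyFor function name is hypothetical; Remote and Active are the properties reported by loginctl show-session, and the logic simply transcribes the three conditions listed above.

```javascript
// Hypothetical helper: picks which implicit authorization entry applies
// to a session, following the three conditions listed in the text.
function defaultsKeyFor(session) {
    if (!session.Remote && session.Active) {
        return "allow_active";
    }
    if (!session.Remote && !session.Active) {
        return "allow_inactive";
    }
    // Everything else, remote sessions included, falls under allow_any.
    return "allow_any";
}

// Session shaped like the loginctl output earlier: Remote=no, Active=yes.
console.log(defaultsKeyFor({ Remote: false, Active: true }));
```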


In conclusion, all these technologies, D-Bus, polkit, and systemd logind, are inherently intertwined, and this is as much a positive aspect as it is a fragile point of failure. They each complement one another, but if one goes down, there could be issues echoing all across the system.
I hope this post has removed the mystification around them and helped you understand what they stand for: yet another glue in desktop environments, similar to this post but solving another problem.





