+++
title = "Introduction to Generic Netlink, or How to Talk with the Linux Kernel"
date = 2023-02-10T22:38:08
+++
![Tux has got some mail!](genl-tux.png)
Are you writing some code in kernel-space and want to communicate with it from
the comfort of user-space? You are not writing kernel code, but want to talk
with the Linux kernel? Serve yourself some good java (the good hot kind, not the
Oracle one), make yourself comfortable and read ahead!
I recently got myself into programming in the Linux Kernel, more specifically
kernel space modules, and one of the APIs that I had to study was Generic
Netlink. As is usual with most Linux Kernel APIs, the documentation, outside of
sometimes fairly well-commented code, is a bit lacking and/or old.

That is why I decided to make a little example for myself on how to use Generic
Netlink for kernel-user space communication, and an introductory guide using
that example for my colleagues and anyone else interested in using Generic
Netlink but not knowing where to start.
This guide covers the following:
* Registering Generic Netlink families in kernel.
* Registering Generic Netlink operations and handling them in kernel.
* Registering Generic Netlink multicast groups in kernel.
* Sending "events" through Generic Netlink from kernel.
* Connecting to Generic Netlink from a user program.
* Resolving Generic Netlink families and multicast groups from a user program.
* Sending a message to a Generic Netlink family from a user program.
* Subscribing to a Generic Netlink multicast group from a user program.
* Listening for Generic Netlink messages from a user program.
## What is Netlink?
Netlink is a socket domain created with the task of providing IPC for the Linux
Kernel, especially kernel\<-\>user IPC. Netlink was created initially with the
intention of replacing the aging ioctl() interface, by providing a more flexible
way of communicating between kernel and user programs.
Netlink communication happens over standard sockets using the `AF_NETLINK`
domain. Nonetheless, on the user land side of things, libraries exist that
provide a more convenient way of using the Netlink interface, such as libnl[^1].
That said, new Netlink families aren't being created anymore and new code
doesn't use Netlink directly. The classic use of Netlink is relegated to already
existing families such as `NETLINK_ROUTE`, `NETLINK_SELINUX`, etc. The main
problem with Netlink is that it uses static allocation of IDs which are limited
to 32 unique families, which greatly limits its users and may cause conflicts
with out-of-tree modules.
## Presenting Generic Netlink
Generic Netlink was created to fix the deficiencies of Netlink as well as to
bring some quality of life improvements. It is not a separate socket domain
though; it's more of an extension of Netlink. In fact, it is a Netlink family:
`NETLINK_GENERIC`.

Generic Netlink has been around since 2005, so it is a well-established interface
for kernel\<-\>userspace IPC. Some notable users of Generic Netlink include
subsystems such as 802.11, ACPI and WireGuard, among others.
The main features that Generic Netlink brings to the table are dynamic family
registration, introspection and a simplified kernel API. This tutorial is
focused specifically on Generic Netlink, since it's the standard way of
communicating with the kernel in ways that are more sophisticated than a simple
sysfs file.
![Generic Netlink bus diagram](01-diagram.png)
Generic Netlink bus diagram
## Some theory
As I've already mentioned, Netlink works over the usual BSD sockets. A Netlink
message always starts with a Netlink header, followed by a protocol header,
which in the case of Generic Netlink is the Generic Netlink header.
### Netlink messages
The headers look like this:
![Netlink header](02-nlmsghdr.png)
Netlink Header
![Generic Netlink header](03-genlmsghdr.png)
Generic Netlink Header
Or as described by the following C structures:
```C
struct nlmsghdr {
    __u32 nlmsg_len;
    __u16 nlmsg_type;
    __u16 nlmsg_flags;
    __u32 nlmsg_seq;
    __u32 nlmsg_pid;
};

struct genlmsghdr {
    __u8 cmd;
    __u8 version;
    __u16 reserved;
};
```
Netlink header fields meaning:
* Length: the length of the whole message, including headers.
* Type: the message type; for Generic Netlink this carries the ID of the
  Generic Netlink family.
* Flags: a do or dump; more on that later.
* Sequence: sequence number; also more on that later.
* Port ID: the sender's port ID; set to 0 when sending from the kernel.

and for Generic Netlink:
* Command: operation identifier as defined by the Generic Netlink family.
* Version: the version of the Generic Netlink family protocol.
* Reserved: as its name implies ¯\\_(ツ)_/¯.
Most of the fields are pretty straightforward, and the header is not usually
filled in manually by the Netlink user. Some of the information contained in
the headers, such as the flags and sequence numbers, is provided by the user
through the API when calling the different functions.
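
To make the layout a bit more tangible, here is a minimal user-space sketch that fills in both headers by hand, using the definitions from the `<linux/netlink.h>` and `<linux/genetlink.h>` UAPI headers. The `put_headers()` helper and its parameters are names made up for this illustration; in real code the library or the kernel API does this for you.

```C
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/*
 * Write a Netlink header followed by a Generic Netlink header at the start
 * of buf, with no attributes yet. Returns the total message length, or 0 if
 * buf is too small. (Hypothetical helper, for illustration only.)
 */
static size_t put_headers(void *buf, size_t buflen, uint16_t family_id,
                          uint8_t cmd, uint8_t version, uint32_t seq)
{
    struct nlmsghdr *nlh = buf;
    struct genlmsghdr *gnlh;

    /* Netlink headers must start on a 4-byte boundary */
    assert((uintptr_t)buf % NLMSG_ALIGNTO == 0);
    if (buflen < NLMSG_LENGTH(GENL_HDRLEN))
        return 0;

    memset(buf, 0, NLMSG_LENGTH(GENL_HDRLEN));
    nlh->nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);   /* both headers, no payload */
    nlh->nlmsg_type = family_id;                  /* Generic Netlink family ID */
    nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; /* a "do" request */
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = 0;                           /* left at 0 in this sketch */

    gnlh = NLMSG_DATA(nlh);                       /* right after the nlmsghdr */
    gnlh->cmd = cmd;
    gnlh->version = version;
    return nlh->nlmsg_len;
}
```

With no attributes, the resulting message is 20 bytes long: 16 for the Netlink header and 4 for the Generic Netlink header.
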
There are three types of message operations that are usually performed over a
Netlink socket:
* A do operation
* A dump operation
* And multicast messages, or asynchronous notifications.
There are many different ways of sending messages over Netlink, but these are
the most used in Generic Netlink.
A do operation is a single action kind of operation in which the user program
sends the message and receives a reply that could be an acknowledgment or error
message, or maybe even a message with some information.
A dump operation is one for (duh) dumping information, usually more than fits in
one message. The user program also sends a message, but receives multiple reply
messages until it receives an `NLMSG_DONE` message that signals the end of the dump.
Whether an operation is a do or a dump is set using the flags field:
* `NLM_F_REQUEST | NLM_F_ACK` for do.
* `NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP` for dump.
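
As a rough sketch of what handling those replies looks like without a library, the function below walks every message in a receive buffer using the `NLMSG_OK()` and `NLMSG_NEXT()` macros from `<linux/netlink.h>`. The name `walk_replies()`, its return convention and the two `*_FLAGS` macros are my own choices for this example.

```C
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Flag combinations for the two request kinds listed above */
#define DO_FLAGS   (NLM_F_REQUEST | NLM_F_ACK)
#define DUMP_FLAGS (NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP)

/*
 * Walk all Netlink messages in a received buffer.
 * Returns 1 once NLMSG_DONE is seen (end of a dump), 0 if the buffer ended
 * without it (keep receiving), or the negative errno carried by an
 * NLMSG_ERROR message. *parts is incremented for every data message.
 */
static int walk_replies(void *buf, int len, int *parts)
{
    struct nlmsghdr *nlh;

    assert(buf && parts);
    for (nlh = buf; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
        if (nlh->nlmsg_type == NLMSG_DONE)
            return 1;
        if (nlh->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *e = NLMSG_DATA(nlh);

            if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*e)))
                return -EBADMSG;    /* too short to hold an error payload */
            if (e->error)
                return e->error;    /* a real error */
            continue;               /* error == 0 is a plain ACK */
        }
        (*parts)++;                 /* a regular data message */
    }
    return 0;
}
```

For a do request the loop typically sees a single reply or an ACK; for a dump, the caller keeps receiving and calling this until it returns something other than 0.
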
Now the third type, the multicast messages, are used for sending notifications
to the users that are subscribed to them via the generic netlink multicast
group.
As we saw, there's also a sequence number field in the Netlink header. However,
unlike in other protocols, Netlink doesn't manipulate or enforce the sequence
number itself. It's provided as a way to help keep track of messages and
replies. In practice, the user program would increase the sequence number with
each message sent, and the kernel module would send the reply(ies) with the same
sequence as the command message. Multicast messages are usually sent with a
sequence number of 0.
### Message payload
Netlink provides a system of attributes to encode data with information such as
type and length. The use of attributes allows for validation of data and for a
supposedly easy way to extend protocols without breaking backward compatibility.
You can also encode your own types, such as a struct in a single attribute.
However, the use of Netlink attributes for each field is encouraged.
The attributes are encoded in LTV format and are padded such that each attribute
begins at an offset that is a multiple of 4 bytes. The length fields, both in
the message header and attribute, always include the header, but not the
padding.
![Netlink attribute diagram](04-nlattr.png)
Netlink attribute diagram
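
The length and padding rules are easier to see in code. The following is a small sketch of encoding one attribute by hand with `struct nlattr` and the `NLA_*` macros from `<linux/netlink.h>`; `put_attr()` is a hypothetical helper name, and a real program would normally use `nla_put()` and friends instead.

```C
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Append one attribute at buf + *off. nla_len covers the 4-byte attribute
 * header plus the payload but not the padding; *off then advances to the
 * next 4-byte boundary. Returns 0 on success, -1 if it does not fit.
 */
static int put_attr(unsigned char *buf, size_t buflen, size_t *off,
                    uint16_t type, const void *data, uint16_t len)
{
    struct nlattr nla;
    size_t total;

    assert(*off % NLA_ALIGNTO == 0);        /* attributes start aligned */
    if (len > UINT16_MAX - NLA_HDRLEN)
        return -1;                          /* nla_len would overflow */
    total = NLA_ALIGN(NLA_HDRLEN + len);    /* header + payload + padding */
    if (*off > buflen || total > buflen - *off)
        return -1;

    nla.nla_len = NLA_HDRLEN + len;         /* padding is NOT counted */
    nla.nla_type = type;
    memset(buf + *off, 0, total);           /* zero the padding bytes too */
    memcpy(buf + *off, &nla, sizeof(nla));
    memcpy(buf + *off + NLA_HDRLEN, data, len);
    *off += total;
    return 0;
}
```

For example, the string `"hi"` with its terminating NUL is a 3-byte payload: `nla_len` becomes 7, while the attribute occupies 8 bytes in the buffer.
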
The attributes in a Netlink, and hence Generic Netlink, message are not
necessarily added always in the same order, which is why they should be walked
and parsed.
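
Walking the attributes by hand could look roughly like the sketch below, which searches an attribute stream for a given type. `find_attr()` is a name I made up for this example; both the kernel and libnl ship proper iterators and parsers for this.

```C
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Find the first attribute of the given type in an attribute stream.
 * Returns a pointer to its payload and stores the payload length in *len,
 * or returns NULL if the type is absent or the stream is malformed.
 */
static const void *find_attr(const unsigned char *buf, size_t buflen,
                             unsigned short type, size_t *len)
{
    size_t off = 0;

    assert(buf && len);
    while (off + NLA_HDRLEN <= buflen) {
        struct nlattr nla;

        memcpy(&nla, buf + off, sizeof(nla));
        if (nla.nla_len < NLA_HDRLEN || nla.nla_len > buflen - off)
            return NULL;                  /* truncated or corrupt */
        if ((nla.nla_type & NLA_TYPE_MASK) == type) {
            *len = nla.nla_len - NLA_HDRLEN;
            return buf + off + NLA_HDRLEN;
        }
        off += NLA_ALIGN(nla.nla_len);    /* skip payload and padding */
    }
    return NULL;
}
```

Because the lookup goes by type rather than by position, it keeps working when the sender reorders attributes or adds new ones.
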
Netlink provides a way to validate that a message is correctly formatted
using so-called "attribute validation policies", represented by
`struct nla_policy`. I do find it a bit strange that the structure is not
exposed to user space, so validation seems to be performed by default only on
the kernel side. It can also be done in user space, and libnl provides its own
`struct nla_policy`, but it differs from the kernel one, which means you
basically have to duplicate the work to validate the attributes on both sides.
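
To give an idea of what such a policy boils down to, here is a deliberately simplified, table-driven sketch. The `demo_*` types and function are hypothetical stand-ins and are neither the kernel's nor libnl's `struct nla_policy`; unlike the real implementations, this sketch simply rejects attribute types it does not know.

```C
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for a validation policy */
enum demo_kind { DEMO_UNSPEC, DEMO_U32, DEMO_NUL_STRING };

struct demo_policy {
    enum demo_kind kind;
    size_t maxlen;  /* strings only: max length without the NUL, 0 = any */
};

/*
 * Check one attribute payload against the policy entry indexed by its type.
 * Returns 1 if the payload is acceptable, 0 otherwise.
 */
static int demo_validate(const struct demo_policy *pol, size_t npol,
                         unsigned int type, const void *data, size_t len)
{
    assert(pol);
    if (type >= npol)
        return 0;               /* this sketch rejects unknown types */

    switch (pol[type].kind) {
    case DEMO_U32:
        return len == 4;        /* exactly four payload bytes */
    case DEMO_NUL_STRING:
        /* the payload must contain its own terminating NUL */
        if (len == 0 || memchr(data, '\0', len) == NULL)
            return 0;
        return pol[type].maxlen == 0 ||
               strlen(data) <= pol[type].maxlen;
    default:
        return 1;               /* DEMO_UNSPEC: no constraint */
    }
}
```

The table is indexed by attribute type, which is the same shape as the policy arrays used later in this article.
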
The types of messages and operations are defined by the so called Netlink
commands. A command correlates to one message type, and also might correlate to
one op(eration).
Each command or message type can have or use one or more attributes.
### Families
I have already mentioned families in this text and in different contexts.
Unfortunately, as is common in the world of computer programming, things
sometimes aren't named in the very best way hence we end up with situations like
this one.
Sockets have families, of which we use the `AF_NETLINK`. Netlink also has
families, of which there are only 32, and no more are planned or should be
introduced to the Linux kernel; the family we use is `NETLINK_GENERIC`. Last but
not least, Generic Netlink also has families, although these are dynamically
registered and a whopping total of 1024 can be registered at a single time.
Generic Netlink families are identified by a string, such as "nl80211", for
example. Since the families are registered dynamically, that means that their ID
can change from one computer to another or even one boot to another, so we need
to resolve them before we can send messages to a family.
Generic Netlink itself provides a single statically allocated family called
`nlctrl`, which provides a command to resolve said families. More recently it
also gained a way to introspect operations and expose policies to user space,
but we are not going to go into detail on that in this tutorial.
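
For the curious, this is roughly what a family resolution request looks like on the wire. The sketch only builds the message, using `GENL_ID_CTRL`, `CTRL_CMD_GETFAMILY` and `CTRL_ATTR_FAMILY_NAME` from `<linux/genetlink.h>`; sending it and parsing the reply are left out. `build_getfamily()` is a hypothetical helper, and the version value of 1 is simply what this sketch uses.

```C
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/*
 * Build a CTRL_CMD_GETFAMILY request for the family called name into buf.
 * Returns the message length, or 0 if the name or the message does not fit.
 * The reply (not handled here) carries the numeric ID in a
 * CTRL_ATTR_FAMILY_ID attribute.
 */
static size_t build_getfamily(void *buf, size_t buflen, const char *name,
                              uint32_t seq)
{
    size_t namelen = strlen(name) + 1;          /* include the NUL */
    size_t attr_off = NLMSG_LENGTH(GENL_HDRLEN);
    size_t total = attr_off + NLA_ALIGN(NLA_HDRLEN + namelen);
    struct nlmsghdr *nlh = buf;
    struct genlmsghdr *gnlh;
    struct nlattr *nla;

    assert((uintptr_t)buf % NLMSG_ALIGNTO == 0);
    if (namelen > GENL_NAMSIZ || total > buflen)
        return 0;

    memset(buf, 0, total);
    nlh->nlmsg_len = total;
    nlh->nlmsg_type = GENL_ID_CTRL;     /* nlctrl has a fixed family ID */
    nlh->nlmsg_flags = NLM_F_REQUEST;
    nlh->nlmsg_seq = seq;

    gnlh = NLMSG_DATA(nlh);
    gnlh->cmd = CTRL_CMD_GETFAMILY;
    gnlh->version = 1;                  /* value chosen for this sketch */

    nla = (struct nlattr *)((char *)buf + attr_off);
    nla->nla_type = CTRL_ATTR_FAMILY_NAME;
    nla->nla_len = NLA_HDRLEN + namelen;
    memcpy((char *)nla + NLA_HDRLEN, name, namelen);
    return total;
}
```

Note how the request is addressed to `nlctrl` through the type field of the Netlink header, exactly like a message to any other family would be.
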
One more thing of note is that a single Generic Netlink socket is not bound to
any one family. A Generic Netlink socket can talk to any family at any time, it
just needs to provide the family ID when sending the message, by using the type
field as we saw earlier.
### Multicast groups
There are some messages that we would like to send asynchronously to user
programs in order to notify them of some events, or just communicate information
as it becomes available. This is where multicast groups come in.

Generic Netlink multicast groups, just like families, are dynamically registered
with a string name and receive a numeric ID upon registration. In other words,
they must also be resolved before being able to subscribe to them. Once
subscribed to a multicast group, the user program will receive all messages
sent to the group.
In order to avoid mixing sequence numbers with unicast messages and to make
handling easier, it is recommended to use a different socket for multicast
messages.
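
At the socket level, subscribing comes down to a single `setsockopt()` call with `NETLINK_ADD_MEMBERSHIP`. Below is a minimal sketch, assuming an already opened `NETLINK_GENERIC` socket and an already resolved group ID; `join_mcgroup()` is a hypothetical helper name.

```C
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270     /* fallback for C libraries that lack it */
#endif

/*
 * Subscribe a Netlink socket to a multicast group.
 * group_id is the numeric ID obtained by resolving the group name first.
 * Returns 0 on success or a negative errno value.
 */
static int join_mcgroup(int fd, uint32_t group_id)
{
    assert(group_id != 0);  /* group IDs start at 1 */

    if (setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                   &group_id, sizeof(group_id)) < 0)
        return -errno;
    return 0;
}
```

Calling it on the dedicated multicast socket only keeps notifications away from the socket used for requests and replies.
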
## Getting our hands dirty
There's much more about Generic Netlink and especially classic Netlink, but
those were the most important concepts to know about when working with Generic
Netlink. That said, it's not very interesting just knowing about something, we
are here for the action after all.
I have made an example of using Generic Netlink that consists of two parts: a
kernel module and a userland program.

The kernel module provides a single Generic Netlink operation and a multicast
group. The message structure is the same for the do op and the multicast
notification. The do op reads a string message, prints it to the kernel log, and
sends its own message back; the multicast notification is sent whenever a
message is written to a sysfs file, echoing that message.
The user space program connects to Generic Netlink, subscribes to the multicast
group, sends a message to our family and prints out the received messages.
I'll be explaining them step by step with code listings in this article. The
full source code for both parts can be found at
.
### The land of the Kernel
Using Generic Netlink from kernel space is pretty straightforward. All we need
to start using it is to include a single header in our file, `net/genetlink.h`.
In total all the headers that we need to start working with our example are as
follows:
```C
#include <linux/module.h>
#include <net/genetlink.h>
```
We'll need some definitions and enumerations that will be shared between kernel
space and user space, so we'll put them in a header file that we'll call
`genltest.h`:
```C
#define GENLTEST_GENL_NAME "genltest"
#define GENLTEST_GENL_VERSION 1
#define GENLTEST_MC_GRP_NAME "mcgrp"

/* Attributes */
enum genltest_attrs {
    GENLTEST_A_UNSPEC,
    GENLTEST_A_MSG,
    __GENLTEST_A_MAX,
};

#define GENLTEST_A_MAX (__GENLTEST_A_MAX - 1)

/* Commands */
enum genltest_cmds {
    GENLTEST_CMD_UNSPEC,
    GENLTEST_CMD_ECHO,
    __GENLTEST_CMD_MAX,
};

#define GENLTEST_CMD_MAX (__GENLTEST_CMD_MAX - 1)
```
There we defined the name of our family, our protocol version, our multicast
group name, the attributes that we will use in our messages and our commands.
Back in our kernel code, we make a validation policy for our "echo" command:
```C
/* Attribute validation policy for our echo command */
static struct nla_policy echo_pol[GENLTEST_A_MAX + 1] = {
    [GENLTEST_A_MSG] = { .type = NLA_NUL_STRING },
};
```
Make an array with our Generic Netlink operations:
```C
/* Operations for our Generic Netlink family */
static struct genl_ops genl_ops[] = {
    {
        .cmd = GENLTEST_CMD_ECHO,
        .policy = echo_pol,
        .doit = echo_doit,
    },
};
```
Similarly an array with our multicast groups:
```C
/* Multicast groups for our family */
static const struct genl_multicast_group genl_mcgrps[] = {
    { .name = GENLTEST_MC_GRP_NAME },
};
```
Finally the struct describing our family, where we include everything so far:
```C
/* Generic Netlink family */
static struct genl_family genl_fam = {
    .name = GENLTEST_GENL_NAME,
    .version = GENLTEST_GENL_VERSION,
    .maxattr = GENLTEST_A_MAX,
    .ops = genl_ops,
    .n_ops = ARRAY_SIZE(genl_ops),
    .mcgrps = genl_mcgrps,
    .n_mcgrps = ARRAY_SIZE(genl_mcgrps),
};
```
On initialization of our module, we need to register our family with Generic
Netlink. For that we just need to pass it our `genl_family` structure:
```C
ret = genl_register_family(&genl_fam);
if (unlikely(ret)) {
    pr_crit("failed to register generic netlink family\n");
    // etc...
}
```
And similarly, on module exit we need to unregister it:
```C
if (unlikely(genl_unregister_family(&genl_fam))) {
    pr_err("failed to unregister generic netlink family\n");
}
```
As you may have noticed, we set our doit callback for our "echo" command to a
`echo_doit` function. Here's what it looks like:
```C
/* Handler for GENLTEST_CMD_ECHO messages received */
static int echo_doit(struct sk_buff *skb, struct genl_info *info)
{
    int ret = 0;
    void *hdr;
    struct sk_buff *msg;

    /* Check if the attribute is present and print it */
    if (info->attrs[GENLTEST_A_MSG]) {
        char *str = nla_data(info->attrs[GENLTEST_A_MSG]);
        pr_info("message received: %s\n", str);
    } else {
        pr_info("empty message received\n");
    }

    /* Allocate a new buffer for the reply */
    msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
    if (!msg) {
        pr_err("failed to allocate message buffer\n");
        return -ENOMEM;
    }

    /* Put the Generic Netlink header */
    hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, &genl_fam, 0,
                      GENLTEST_CMD_ECHO);
    if (!hdr) {
        pr_err("failed to create genetlink header\n");
        nlmsg_free(msg);
        return -EMSGSIZE;
    }

    /* And the message */
    ret = nla_put_string(msg, GENLTEST_A_MSG,
                         "Hello from Kernel Space, Netlink!");
    if (ret) {
        pr_err("failed to create message string\n");
        genlmsg_cancel(msg, hdr);
        nlmsg_free(msg);
        goto out;
    }

    /* Finalize the message and send it */
    genlmsg_end(msg, hdr);

    ret = genlmsg_reply(msg, info);
    if (ret)
        pr_err("failed to send reply\n");
    else
        pr_info("reply sent\n");

out:
    return ret;
}
```
In summary, when handling a do command we follow these steps:
1. Get the data from the incoming message from the `genl_info` structure.
2. Allocate a new message buffer for the reply.
3. Put the Generic Netlink header in the message buffer; notice that we use the
same port id and sequence number as in the incoming message since this is a
reply.
4. Put all our payload attributes.
5. Send the reply.
Now let's take a look at how to send multicast notifications. I've used sysfs to
make this example a little bit more fun; I've created a kobj called `genltest`
which contains a `ping` attribute from which we will echo what is written to it.
For brevity I'll elide the sysfs code from the article and just add the function
that forms and sends the message here:
```C
/* Multicast ping message to our genl multicast group */
static int echo_ping(const char *buf, size_t cnt)
{
    int ret = 0;
    void *hdr;

    /* Allocate message buffer */
    struct sk_buff *skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
    if (unlikely(!skb)) {
        pr_err("failed to allocate memory for genl message\n");
        return -ENOMEM;
    }

    /* Put the Generic Netlink header */
    hdr = genlmsg_put(skb, 0, 0, &genl_fam, 0, GENLTEST_CMD_ECHO);
    if (unlikely(!hdr)) {
        pr_err("failed to create genetlink header\n");
        nlmsg_free(skb);
        return -EMSGSIZE;
    }

    /* And the message */
    ret = nla_put_string(skb, GENLTEST_A_MSG, buf);
    if (ret) {
        pr_err("unable to create message string\n");
        genlmsg_cancel(skb, hdr);
        nlmsg_free(skb);
        return ret;
    }

    /* Finalize the message */
    genlmsg_end(skb, hdr);

    /* Send it over multicast to the 0-th mc group in our array. */
    ret = genlmsg_multicast(&genl_fam, skb, 0, 0, GFP_KERNEL);
    if (ret == -ESRCH) {
        pr_warn("multicast message sent, but nobody was listening...\n");
    } else if (ret) {
        pr_err("failed to send multicast genl message\n");
    } else {
        pr_info("multicast message sent\n");
    }

    return ret;
}
```
The process is very similar to the do operation, except that we are not
responding to a request but sending an asynchronous message. Because of that we
are setting the sequence number to 0, since it is not of consequence here, and
the port id to 0, the kernel port/PID. We also send the message via the
`genlmsg_multicast()` function.
This is all for the kernel side of things for this tutorial. Now let's take a
look at the user side of things.
### User land
Netlink is a socket family, so it's possible to communicate over Netlink by
just opening a socket and sending and receiving messages over it, something like
this:
```C
int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
/* Format request message... */
/* ... */
/* Send it */
send(fd, &req, sizeof(req), 0);
/* Receive response */
recv(fd, &resp, BUF_SIZE, 0);
/* Do something with response... */
/* ... */
```
That said, it's better to make use of the libnl[^1] library or similar since
they provide a better way to interface with Generic Netlink that is less prone
to errors and already contains all the boilerplate that you would need to write
anyway. This library is precisely the one that I'll be using in this example.
We'll need to include some headers from the libnl library to get started:
```C
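#include <netlink/netlink.h>
#include <netlink/socket.h>
#include <netlink/msg.h>
#include <netlink/attr.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>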
```
As well as our shared header with the enumerations and defines from the kernel
module:
```C
#include "../ks/genltest.h"
```
I've also made a little helper macro for printing errors:
```C
#define prerr(...) fprintf(stderr, "error: " __VA_ARGS__)
```
As I mentioned earlier, it's easier to use different sockets for unicast and
multicast messages, so that's what I'm going to be doing here, opening two
different sockets to connect to Generic Netlink:
```C
/* Allocate netlink socket and connect to generic netlink */
static int conn(struct nl_sock **sk)
{
    *sk = nl_socket_alloc();
    if (!*sk) {
        return -NLE_NOMEM;
    }

    return genl_connect(*sk);
}

/*
 * ...
 */

struct nl_sock *ucsk, *mcsk;

/*
 * We use one socket to receive asynchronous "notifications" over
 * multicast group, and another for ops. We do this so that we don't mix
 * up responses from ops with notifications to make handling easier.
 */
if ((ret = conn(&ucsk)) || (ret = conn(&mcsk))) {
    prerr("failed to connect to generic netlink\n");
    goto out;
}
```
Next we need to resolve the ID of the Generic Netlink family that we want to
connect to:
```C
/* Resolve the genl family. One family for both unicast and multicast. */
int fam = genl_ctrl_resolve(ucsk, GENLTEST_GENL_NAME);
if (fam < 0) {
    prerr("failed to resolve generic netlink family: %s\n",
          nl_geterror(fam));
    goto out;
}
```
Since a (Generic) Netlink socket is not associated with a family, we are going
to need the family ID when sending the message a little bit later.
The libnl library can do sequence checking for us, but we don't need it for
multicast messages, so we disable it for our multicast socket:
```C
nl_socket_disable_seq_check(mcsk);
```
We also need to resolve the multicast group name. In this case we are going to
be using the resolved ID right away to subscribe to the group and start
receiving the notifications:
```C
/* Resolve the multicast group. */
int mcgrp = genl_ctrl_resolve_grp(mcsk, GENLTEST_GENL_NAME,
                                  GENLTEST_MC_GRP_NAME);
if (mcgrp < 0) {
    prerr("failed to resolve generic netlink multicast group: %s\n",
          nl_geterror(mcgrp));
    goto out;
}

/* Join the multicast group. */
if ((ret = nl_socket_add_membership(mcsk, mcgrp)) < 0) {
    prerr("failed to join multicast group: %s\n", nl_geterror(ret));
    goto out;
}
```
We need to modify the default callback so that we can handle the incoming
messages:
```C
/* Modify the callback for replies to handle all received messages */
static inline int set_cb(struct nl_sock *sk)
{
    return nl_socket_modify_cb(sk, NL_CB_VALID, NL_CB_CUSTOM,
                               echo_reply_handler, NULL);
}

/*
 * ...
 */

if ((ret = set_cb(ucsk)) || (ret = set_cb(mcsk))) {
    prerr("failed to set callback: %s\n", nl_geterror(ret));
    goto out;
}
```
As you can see, we set the handler function to the same for both sockets, since
we will be basically receiving the same message format for both the do request
and the notifications. Our handler looks like this:
```C
/*
* Handler for all received messages from our Generic Netlink family, both
* unicast and multicast.
*/
static int echo_reply_handler(struct nl_msg *msg, void *arg)
{
    int err = 0;
    struct genlmsghdr *genlhdr = nlmsg_data(nlmsg_hdr(msg));
    struct nlattr *tb[GENLTEST_A_MAX + 1];

    /* Parse the attributes */
    err = nla_parse(tb, GENLTEST_A_MAX, genlmsg_attrdata(genlhdr, 0),
                    genlmsg_attrlen(genlhdr, 0), NULL);
    if (err) {
        prerr("unable to parse message: %s\n", nl_geterror(err));
        return NL_SKIP;
    }

    /* Check that there's actually a payload */
    if (!tb[GENLTEST_A_MSG]) {
        prerr("msg attribute missing from message\n");
        return NL_SKIP;
    }

    /* Print it! */
    printf("message received: %s\n", nla_get_string(tb[GENLTEST_A_MSG]));

    return NL_OK;
}
```
Nothing that fancy going on here; much of it is very similar to what we were
doing on the kernel side of things. The main difference is that there the
message was already parsed for us by the kernel API, while here we have the
option to walk the attributes manually or parse them all into an array with the
help of a library function.
Next we want to send a message to kernel space:
```C
/* Send (unicast) GENLTEST_CMD_ECHO request message */
static int send_echo_msg(struct nl_sock *sk, int fam)
{
    int err = 0;
    struct nl_msg *msg = nlmsg_alloc();
    if (!msg) {
        return -NLE_NOMEM;
    }

    /* Put the genl header inside message buffer */
    void *hdr = genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, fam, 0, 0,
                            GENLTEST_CMD_ECHO, GENLTEST_GENL_VERSION);
    if (!hdr) {
        err = -NLE_MSGSIZE;
        goto out;
    }

    /* Put the string inside the message. */
    err = nla_put_string(msg, GENLTEST_A_MSG,
                         "Hello from User Space, Netlink!");
    if (err < 0) {
        goto out;
    }

    /* Send the message. */
    err = nl_send_auto(sk, msg);
    if (err >= 0) {
        printf("message sent\n");
        err = 0;
    }

out:
    /* Free the message on every path, not only on success */
    nlmsg_free(msg);
    return err;
}

/*
 * ...
 */

/* Send unicast message and listen for response. */
if ((ret = send_echo_msg(ucsk, fam))) {
    prerr("failed to send message: %s\n", nl_geterror(ret));
}
```
Also not that different from the kernel API. Now we listen once for the response
to our command and indefinitely for incoming notifications:
```C
printf("listening for messages\n");
nl_recvmsgs_default(ucsk);
/* Listen for "notifications". */
while (1) {
    nl_recvmsgs_default(mcsk);
}
```
As good hygiene, let's close the connection and socket before exiting our
program:
```C
/* Disconnect and release socket */
static void disconn(struct nl_sock *sk)
{
    nl_close(sk);
    nl_socket_free(sk);
}

/*
 * ...
 */

disconn(ucsk);
disconn(mcsk);
```
That's about it!
## Conclusion
The old ways of interacting with the kernel and its different subsystems through
such interfaces as sysfs and especially ioctl had many downsides and lacked some
very needed features such as asynchronous operations and a properly structured
format.
Netlink, and its extended form, Generic Netlink, provide a very flexible way of
communicating with the kernel, solving many of the downsides and problems of the
interfaces of old. It's certainly not a perfect solution (not very Unixy for
instance), but it's certainly the best way we have to communicate with the
kernel for things more complicated than setting simple parameters.
## Post scriptum
At the moment when I started learning to use Generic Netlink, the 6.0 kernel
wasn't yet stable, hence the excellent kernel docs Netlink intro[^2] wasn't yet
in the "latest" section, but rather in the "next" section. Since I was looking
inside the then "latest" (v5.19) docs, I didn't notice it until I started
writing my own article.
I'm not sure I would have started writing this article had I come across the new
kernel docs page, but in the end I think it was worth it since this one is a
good complement to the official docs if you want to get your hands dirty
straight away, especially considering that it provides a practical example.
Do give a read to the kernel docs page! It covers some things that might not be
covered here.
[^1]: Original site with documentation ,
up-to-date repository:
[^2]: Very good introductory Netlink article, from the kernel docs themselves: