3.4 Abstract network stack
[Libraries module]

The Abstract network stack module is described through the following subsections:

3.4.1 Module API

Overview

The network library is a framework for designing network stacks. It is built around three main objects:

a set of layers,
tasks,
a scheduler.

Layers

A network layer is one part of a networking stack. Depending on actual needs, a layer may be allocated statically, or dynamically on a per-session basis.

For instance, in a TCP/IP stack, the Ethernet, IP and TCP layers are long-lived, whereas a TCP session layer (for a single source/target IP/port tuple) is shorter-lived.

Layers communicate exclusively through tasks, which are executed in a scheduler that is unique for a given stack.

Layers are stacked in a tree fashion. The root is close to the network interfaces; leaves are closer to the application. A layer has a single parent layer in the stack, but may have multiple child layers.

All layers are refcounted. Parents hold a reference on their children (and not the other way around). An interface layer that gets destroyed (because the communication medium disappears) implicitly drops its references on its child layers, which will in turn get destroyed once nothing else references them.

Each layer should check that its parent is still valid before sending a task to it.

Communication at the borders of libnetwork (between libnetwork and other modules) is done through a delegate pattern. Each layer is responsible for declaring its delegate vtable and API.

Layers provide a set of entry points for the scheduler to send tasks and update state. The main entry point for task handling is task_handle.

Tasks

A network task is a workload to be handled by a given layer. It may be one of the following types:

a timeout,
a packet to handle, inbound or outbound,
a request to respond to,
a response to a request,
a one-way notification.

A task always has a source layer and a target layer. A layer may send a task to itself.

Tasks always take references to their source and target layers; therefore, the source and target layers of a given task cannot disappear while the task exists.

Tasks that have not been handled yet may be cancelled.

There are various helpers to initialize and send tasks directly.
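
The task rules above (a kind, a source and a target that are both pinned, cancellation only before handling) can be modelled in a few lines of standalone C. This is only a sketch: `toy_task`, `toy_task_layer` and every function below are hypothetical names for illustration, not the libnetwork types or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task kinds, mirroring the list above. */
enum toy_task_kind {
    TOY_TASK_TIMEOUT,
    TOY_TASK_INBOUND,
    TOY_TASK_OUTBOUND,
    TOY_TASK_REQUEST,
    TOY_TASK_RESPONSE,
    TOY_TASK_NOTIFICATION
};

enum toy_task_state { TOY_TASK_PENDING, TOY_TASK_HANDLED, TOY_TASK_CANCELLED };

/* Hypothetical layer, reduced to its reference count. */
struct toy_task_layer {
    int refcount;
};

struct toy_task {
    enum toy_task_kind kind;
    enum toy_task_state state;
    struct toy_task_layer *source;
    struct toy_task_layer *target;
};

/* Creating a task pins both endpoints; source may equal target. */
void toy_task_init(struct toy_task *t, enum toy_task_kind kind,
                   struct toy_task_layer *source,
                   struct toy_task_layer *target)
{
    t->kind = kind;
    t->state = TOY_TASK_PENDING;
    t->source = source;
    t->target = target;
    source->refcount++;
    target->refcount++;
}

/* Release both references once the task is over. */
void toy_task_release(struct toy_task *t)
{
    t->source->refcount--;
    t->target->refcount--;
    t->source = NULL;
    t->target = NULL;
}

/* Only a task that has not been handled yet can be cancelled.
   Returns 0 on success, -1 when it is too late. */
int toy_task_cancel(struct toy_task *t)
{
    if (t->state != TOY_TASK_PENDING)
        return -1;
    t->state = TOY_TASK_CANCELLED;
    toy_task_release(t);
    return 0;
}
```

In this model, a layer whose refcount was 1 sits at 2 while one task targets it, and returns to 1 once that task is cancelled or released.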

Scheduler

The network scheduler maintains the state of tasks. It has a time source for tracking task timeouts and handles delivery to destination layers.

The scheduler ensures that a given layer's task_handle entry point is never executed concurrently with itself. This removes the need for locking inside each layer.

Whether multiple layers may run in parallel is an implementation detail. Whether layer entry points are called from a thread, a kroutine or another scheduling routine is an implementation detail as well.
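
To make the delivery guarantee concrete, here is a minimal single-threaded model of a scheduler: tasks are queued, then delivered one at a time to the target's handler. All names (`toy_sched`, `toy_sched_layer`, `toy_sched_push`, ...) are assumptions made for this sketch; the real scheduler also tracks timeouts and may use any execution model.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SCHED_QUEUE_SIZE 8

struct toy_sched;
struct toy_sched_layer;

struct toy_sched_task {
    struct toy_sched_layer *target;
    int value;
};

struct toy_sched_layer {
    /* Entry point called by the scheduler, one task at a time. */
    void (*task_handle)(struct toy_sched_layer *self, struct toy_sched *sched,
                        const struct toy_sched_task *task);
    int in_handler;   /* re-entrancy detector, for the demonstration only */
    int handled;      /* number of tasks processed so far */
};

struct toy_sched {
    struct toy_sched_task queue[TOY_SCHED_QUEUE_SIZE];
    size_t head, count;
};

/* Queue a task for later delivery; never calls the handler directly. */
int toy_sched_push(struct toy_sched *s, struct toy_sched_layer *target,
                   int value)
{
    if (s->count == TOY_SCHED_QUEUE_SIZE)
        return -1;
    size_t tail = (s->head + s->count) % TOY_SCHED_QUEUE_SIZE;
    s->queue[tail].target = target;
    s->queue[tail].value = value;
    s->count++;
    return 0;
}

/* Deliver queued tasks in order until the queue is empty. */
void toy_sched_run(struct toy_sched *s)
{
    while (s->count > 0) {
        struct toy_sched_task task = s->queue[s->head];
        s->head = (s->head + 1) % TOY_SCHED_QUEUE_SIZE;
        s->count--;

        struct toy_sched_layer *l = task.target;
        assert(!l->in_handler);   /* the layer is never re-entered */
        l->in_handler = 1;
        l->task_handle(l, s, &task);
        l->in_handler = 0;
        l->handled++;
    }
}

/* Example handler: counts down by sending follow-up tasks to itself. */
void toy_countdown_handle(struct toy_sched_layer *self,
                          struct toy_sched *sched,
                          const struct toy_sched_task *task)
{
    if (task->value > 0)
        toy_sched_push(sched, self, task->value - 1);
}
```

Because pushing only enqueues, a handler that sends a task to its own layer is not re-entered: the follow-up task is delivered after the current call returns.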

Object management

Layers

A layer implementation is either generic or hardware-dependent. When generic, there is a single factory function in library code, as for the example Dumper Layer. When hardware-specific, the layer lives in a device driver implementing the Network device class. The device acts as a factory: its layer_create method can create a layer from its type, delegate and parameters. This way, a single device driver can be a factory for various layer types.

In all cases, layers are created with a refcount of 1, and the caller is responsible for releasing them when they are no longer needed.

Layers are intended to be stacked on one another through calls to net_layer_bind. This implicitly calls the bound handler of the parent layer to check whether the selected child layer is acceptable; returning an error from the bound method aborts the binding. net_layer_bind is passed an address parameter whose interpretation is specific to the parent layer. Another method notifies the parent when a child layer gets unbound.
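
The binding handshake can be sketched as follows. This is a simplified standalone model, not the real net_layer_bind signature: `toy_bind`, `toy_unbind`, `toy_bind_handler` and the example policy are hypothetical, and reference counting is left out to keep the focus on the accept/refuse step.

```c
#include <assert.h>
#include <stddef.h>

struct toy_bind_layer;

/* Hypothetical handler table: only the binding callbacks are modelled. */
struct toy_bind_handler {
    /* Returns 0 to accept the child, non-zero to refuse it. */
    int (*bound)(struct toy_bind_layer *parent, const void *addr,
                 struct toy_bind_layer *child);
    void (*unbound)(struct toy_bind_layer *parent,
                    struct toy_bind_layer *child);
};

struct toy_bind_layer {
    const struct toy_bind_handler *handler;
    struct toy_bind_layer *parent;
    int child_count;
};

/* Ask the parent first; attach only when it accepts. */
int toy_bind(struct toy_bind_layer *parent, const void *addr,
             struct toy_bind_layer *child)
{
    if (parent->handler->bound) {
        int err = parent->handler->bound(parent, addr, child);
        if (err)
            return err;   /* binding aborted, nothing changed */
    }
    child->parent = parent;
    parent->child_count++;
    return 0;
}

/* Detach the child, then notify the former parent. */
void toy_unbind(struct toy_bind_layer *child)
{
    struct toy_bind_layer *parent = child->parent;
    if (!parent)
        return;
    child->parent = NULL;
    parent->child_count--;
    if (parent->handler->unbound)
        parent->handler->unbound(parent, child);
}

/* Example parent policy: the address is a protocol number that must be
   non-zero. Its meaning is private to this parent. */
int toy_proto_bound(struct toy_bind_layer *parent, const void *addr,
                    struct toy_bind_layer *child)
{
    (void)parent;
    (void)child;
    const int *proto = addr;
    return (proto && *proto != 0) ? 0 : -1;
}
```

The address is passed as an opaque pointer because only the parent knows how to interpret it; here the example parent reads it as an integer protocol number.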

As a parent layer takes a reference on its children, initialization code may drop its own references on child layers once they are bound. This way, if the only reference owner is the parent layer, then once the parent layer gets destroyed, its children will implicitly and automatically become dangling and get destroyed.
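
The ownership scheme described here can be illustrated with a small reference-counting model. The names (`toy_ref_layer`, `toy_ref_bind`, `toy_ref_drop`) are invented for this sketch and the fixed-size child array is an arbitrary simplification; the library's own structures and refcount helpers differ.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REF_MAX_CHILDREN 4

/* Hypothetical layer: refcounted, owns references on its children. */
struct toy_ref_layer {
    int refcount;
    int destroyed;                    /* set once refcount reaches 0 */
    struct toy_ref_layer *parent;     /* weak back-pointer, no reference */
    struct toy_ref_layer *children[TOY_REF_MAX_CHILDREN];
    size_t child_count;
};

/* A new layer starts with one reference, owned by its creator. */
void toy_ref_init(struct toy_ref_layer *l)
{
    l->refcount = 1;
    l->destroyed = 0;
    l->parent = NULL;
    l->child_count = 0;
}

/* The parent takes a reference on the child, not the other way around. */
int toy_ref_bind(struct toy_ref_layer *parent, struct toy_ref_layer *child)
{
    if (parent->child_count == TOY_REF_MAX_CHILDREN)
        return -1;
    child->refcount++;
    child->parent = parent;
    parent->children[parent->child_count++] = child;
    return 0;
}

/* Drop one reference; the last drop destroys the layer and cascades. */
void toy_ref_drop(struct toy_ref_layer *l)
{
    assert(l->refcount > 0);
    if (--l->refcount > 0)
        return;
    l->destroyed = 1;
    for (size_t i = 0; i < l->child_count; i++) {
        l->children[i]->parent = NULL;   /* child is now dangling */
        toy_ref_drop(l->children[i]);
    }
    l->child_count = 0;
}
```

With an interface, IP and TCP layer bound in a chain and the creator's references on the two upper layers dropped, releasing the interface layer destroys all three; a child that something else still references would survive with a NULL parent instead.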

A delegate is an opaque object handling layer events. A pointer and a vtable must be provided on layer creation. Delegates are intended for handling out-of-band signalling, such as layer administrative tasks. The basic delegate handles notification of layer deletion. The delegate vtable definition may be inherited and augmented by specific layer definitions. A delegate must always implement its layer's requirements.

Tasks

The memory allocation policy for tasks is flexible. The scheduler contains a slab for allocating basic tasks, but any layer may provide its own allocation policy for its tasks. Every task must have a destroy function that is called on task completion or cancellation. When the scheduler's task allocator is used, the destroy function is already set.

A task takes references on its source and destination layers. As such, a task is guaranteed to be delivered without dangling pointers. But in order to avoid dangling layers keeping themselves alive through references held by their own tasks, some deletion opportunities are taken when a layer becomes dangling (i.e. when its parent disappears). To get most of the deletion done, tasks queued in the scheduler that reference the dangling layer as source or destination get cancelled. In order to cancel tasks that are not in the queue yet, the layer's dandling method is called.

Buffers

Packet-based tasks reference payload buffers. Buffers are allocated from a buffer pool. The scheduler provides allocation from a common buffer pool for the stack context.

In order to optimize handling of payloads while they bubble through the stack, a layer should abide by its context's prefix size and mtu offsets.

If a task contains a reference to a buffer, a layer may update the buffer's begin and end pointers before forwarding the task. This allows a layer to decode and remove, or to insert, layer-specific network headers without copying the enclosed payload.
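
The begin/end technique can be sketched with a small standalone buffer type. `toy_buffer` and its functions are hypothetical and use offsets instead of pointers; the point is only that reserving a prefix up front lets each layer add or strip its header by moving `begin`, leaving the payload bytes where they are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUFFER_SIZE 64

/* Hypothetical packet buffer: valid data lives in data[begin, end). */
struct toy_buffer {
    uint8_t data[TOY_BUFFER_SIZE];
    size_t begin;
    size_t end;
};

/* Place a payload after `prefix` reserved bytes, leaving headroom for
   the headers that lower layers will prepend. */
int toy_buffer_set_payload(struct toy_buffer *b, size_t prefix,
                           const void *payload, size_t len)
{
    if (prefix > TOY_BUFFER_SIZE || len > TOY_BUFFER_SIZE - prefix)
        return -1;
    memcpy(b->data + prefix, payload, len);
    b->begin = prefix;
    b->end = prefix + len;
    return 0;
}

/* Outbound path: prepend a header by moving `begin` backwards. */
int toy_buffer_push_header(struct toy_buffer *b, const void *hdr, size_t len)
{
    if (len > b->begin)
        return -1;               /* not enough headroom reserved */
    b->begin -= len;
    memcpy(b->data + b->begin, hdr, len);
    return 0;
}

/* Inbound path: read and strip a header by moving `begin` forwards. */
int toy_buffer_pull_header(struct toy_buffer *b, void *hdr, size_t len)
{
    if (len > b->end - b->begin)
        return -1;               /* packet shorter than the header */
    memcpy(hdr, b->data + b->begin, len);
    b->begin += len;
    return 0;
}

/* Number of valid bytes currently in the buffer. */
size_t toy_buffer_size(const struct toy_buffer *b)
{
    return b->end - b->begin;
}
```

Only the header bytes are ever copied. If a layer needs more headroom than the reserved prefix, `toy_buffer_push_header` fails, which is why a layer has to honour the prefix size advertised by its context.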

Buffers are refcounted and tasks take references to buffers. Once a buffer is handed to a task, the caller's own reference must be dropped.

Workload management

Tasks get passed from layer to layer through a scheduler. The scheduler implementation guarantees that a layer's code need not be reentrant (i.e. a layer's methods are never called concurrently by the scheduler). This does not imply anything for different layers of the stack. Pushing a task into the scheduler for future handling is always safe, from any environment.

A layer may receive tasks from itself or from other layers through its task_handle method. Each received task should be either forwarded, answered, responded to or destroyed. A layer may also keep the task for later processing, but care must be taken not to create retention loops.

Implementing a layer

A layer consists of a factory function, either called directly when in library code, or called from a device's layer_create method. The factory is responsible for calling net_layer_init with the scheduler context, the layer's handler and vtable, if any. Once it has been called, the layer is considered alive.

The layer's handler acts as a vtable and provides a set of methods for the layer. Some are mandatory, others are optional. Among these methods is destroyed, which is responsible for cleaning up the layer's contents and freeing any dynamically allocated memory.

The layer handler also contains methods for managing the layer's lifetime in the stack. In particular, the bound and unbound methods are called when child layers are attached to and detached from the layer in question.

As long as a child layer is attached to its parent, the stack context may call the child_context_adjust method. This method is responsible for modifying the context of a child layer any time the context of the parent changes. This is mostly useful for adjusting context addresses and the packet prefix and mtu sizes.

net_layer_handler_s::dandling is called when a layer gets unbound from its parent. When it is called, the layer may or may not be about to be destroyed; for instance, the layer may be bound again to another layer. Implementers must take care not to confuse dandling and destruction events.
