diff --git a/index.html b/index.html index 19ab682..aad0e54 100644 --- a/index.html +++ b/index.html @@ -18,7 +18,7 @@

The Linux Kernel Module Programming Guide

Peter Jay Salzman, Michael Burian, Ori Pomerantz, Bob Mottram, Jim Huang

-
June 25, 2022
+
July 2, 2022
@@ -4597,8 +4597,8 @@ to call arbitrary functions within the kernel. Also, the function prototype of t containing a unsigned long argument, will prevent work from any type checking. Furthermore, the function prototype with unsigned long - argument may be an obstacle to the control-flow integrity. Thus, it is better -to use a unique prototype to separate from the cluster that takes an + argument may be an obstacle to the forward-edge protection of control-flow integrity. +Thus, it is better to use a unique prototype to separate from the cluster that takes an unsigned long argument. The timer callback should be passed a pointer to the timer_list @@ -4608,8 +4608,8 @@ to use a unique prototype to separate from the cluster that takes an structure, into a larger structure, and it can use the container_of macro instead of the unsigned long - value. -

Before Linux v4.14, setup_timer + value. For more information see: Improving the kernel timers API. +

Before Linux v4.14, setup_timer was used to initialize the timer and the timer_list structure looked like: @@ -4624,7 +4624,7 @@ to use a unique prototype to separate from the cluster that takes an 8 9void setup_timer(struct timer_list *timer, void (*callback)(unsigned long), 10                 unsigned long data); -

Since Linux v4.14, timer_setup +

Since Linux v4.14, timer_setup has been adopted, and the kernel was converted step by step to timer_setup from setup_timer @@ -4638,7 +4638,7 @@ Moreover, the timer_setup

void timer_setup(struct timer_list *timer,
                 void (*callback)(struct timer_list *), unsigned int flags);
-

The setup_timer +

The setup_timer was then removed in v4.15. As a result, the timer_list structure changed to the following. @@ -4649,7 +4649,7 @@ Moreover, the timer_setup 4    u32 flags; 5    /* ... */ 6}; -
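The container_of pattern described above can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct timer_list` is a mock with a single field, and `blink_state`, `blink_callback` and `fire_timer` are hypothetical names invented for illustration. It only shows how a callback that receives a `struct timer_list *` can recover its enclosing structure without an `unsigned long` cast.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover a pointer to
 * the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of struct timer_list; only the callback field matters here. */
struct timer_list {
    void (*function)(struct timer_list *);
};

/* Hypothetical driver state that embeds the timer. */
struct blink_state {
    int led_on;
    struct timer_list timer;
};

/* Callback with the post-v4.14 prototype: it receives the timer itself
 * and derives its own context from it. */
static void blink_callback(struct timer_list *t)
{
    struct blink_state *s = container_of(t, struct blink_state, timer);

    s->led_on = !s->led_on;
}

/* Simulates one timer expiry by invoking the stored callback. */
static void fire_timer(struct timer_list *t)
{
    assert(t != NULL && t->function != NULL);
    t->function(t);
}
```

Because the callback type is `void (*)(struct timer_list *)`, the compiler can check it, which is the type-safety benefit the text attributes to the newer prototype.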

The following source code illustrates a minimal kernel module which, when +

The following source code illustrates a minimal kernel module which, when loaded, starts blinking the keyboard LEDs until it is unloaded.

@@ -4738,7 +4738,7 @@ loaded, starts blinking the keyboard LEDs until it is unloaded. 83module_exit(kbleds_cleanup); 84 85MODULE_LICENSE("GPL"); -

If none of the examples in this chapter fit your debugging needs, +

If none of the examples in this chapter fit your debugging needs, there might yet be some other tricks to try. Ever wondered what CONFIG_LL_DEBUG in make menuconfig @@ -4749,25 +4749,25 @@ everything your code does over a serial line. If you find yourself porting the kernel to some new and formerly unsupported architecture, this is usually amongst the first things that should be implemented. Logging over a netconsole might also be worth a try. -

While you have seen lots of stuff that can be used to aid debugging here, there are +

While you have seen lots of stuff that can be used to aid debugging here, there are some things to be aware of. Debugging is almost always intrusive. Adding debug code can change the situation enough to make the bug seem to disappear. Thus, you should keep debug code to a minimum and make sure it does not show up in production code. -

+

14 Scheduling Tasks

-

There are two main ways of running tasks: tasklets and work queues. Tasklets are a +

There are two main ways of running tasks: tasklets and work queues. Tasklets are a quick and easy way of scheduling a single function to be run, for example when triggered from an interrupt, whereas work queues are more complicated but also better suited to running multiple things in a sequence. -

+

14.1 Tasklets

-

Here is an example tasklet module. The +

Here is an example tasklet module. The tasklet_fn function runs for a few seconds, and in the meantime execution of the example_tasklet_init @@ -4818,7 +4818,7 @@ better suited to running multiple things in a sequence. 42 43MODULE_DESCRIPTION("Tasklet example"); 44MODULE_LICENSE("GPL"); -

So with this example loaded dmesg +

So with this example loaded dmesg should show: @@ -4830,23 +4830,23 @@ Example tasklet starts Example tasklet init continues... Example tasklet ends -

Although tasklets are easy to use, they come with several drawbacks, and developers are +

Although tasklets are easy to use, they come with several drawbacks, and developers are discussing getting rid of tasklets in the Linux kernel. The tasklet callback runs in atomic context, inside a software interrupt, meaning that it cannot sleep or access user-space data, so not all work can be done in a tasklet handler. Also, the kernel only allows one instance of any given tasklet to be running at any given time; multiple different tasklet callbacks can run in parallel. -

In recent kernels, tasklets can be replaced by workqueues, timers, or threaded +

In recent kernels, tasklets can be replaced by workqueues, timers, or threaded interrupts.1 While the removal of tasklets remains a longer-term goal, the current kernel contains more than a hundred uses of tasklets. Now developers are proceeding with the API changes and the macro DECLARE_TASKLET_OLD exists for compatibility. For further information, see https://lwn.net/Articles/830964/. -

+

14.2 Work queues

-

To add a task to the scheduler we can use a workqueue. The kernel then uses the +

To add a task to the scheduler we can use a workqueue. The kernel then uses the Completely Fair Scheduler (CFS) to execute work within the queue.

@@ -4883,36 +4883,36 @@ Completely Fair Scheduler (CFS) to execute work within the queue. 31 32MODULE_LICENSE("GPL"); 33MODULE_DESCRIPTION("Workqueue example"); -

+

15 Interrupt Handlers

-

+

15.1 Interrupt Handlers

-

Except for the last chapter, everything we did in the kernel so far we have done as a +

Except for the last chapter, everything we did in the kernel so far we have done as a response to a process asking for it, either by dealing with a special file, sending an ioctl() , or issuing a system call. But the job of the kernel is not just to respond to process requests. Another job, which is every bit as important, is to speak to the hardware connected to the machine. -

There are two types of interaction between the CPU and the rest of the +

There are two types of interaction between the CPU and the rest of the computer’s hardware. The first type is when the CPU gives orders to the hardware; the other is when the hardware needs to tell the CPU something. The second, called interrupts, is much harder to implement because it has to be dealt with when convenient for the hardware, not the CPU. Hardware devices typically have a very small amount of RAM, and if you do not read their information when available, it is lost. -

Under Linux, hardware interrupts are called IRQ’s (Interrupt ReQuests). There +

Under Linux, hardware interrupts are called IRQ’s (Interrupt ReQuests). There are two types of IRQ’s, short and long. A short IRQ is one which is expected to take a very short period of time, during which the rest of the machine will be blocked and no other interrupts will be handled. A long IRQ is one which can take longer, and during which other interrupts may occur (but not interrupts from the same device). If at all possible, it is better to declare an interrupt handler to be long. -

When the CPU receives an interrupt, it stops whatever it is doing (unless it is +

When the CPU receives an interrupt, it stops whatever it is doing (unless it is processing a more important interrupt, in which case it will deal with this one only when the more important one is done), saves certain parameters on the stack and calls the interrupt handler. This means that certain things are not allowed in the @@ -4924,10 +4924,10 @@ heavy work deferred from an interrupt handler. Historically, BH (Linux naming for Bottom Halves) statically book-keeps the deferred functions. Softirq and its higher level abstraction, Tasklet, have replaced BH since Linux 2.3. -

The way to implement this is to call +

The way to implement this is to call request_irq() to get your interrupt handler called when the relevant IRQ is received. -

In practice IRQ handling can be a bit more complex. Hardware is often +

In practice IRQ handling can be a bit more complex. Hardware is often designed in a way that chains two interrupt controllers, so that all the IRQs from interrupt controller B are cascaded to a certain IRQ from interrupt controller A. Of course, that requires that the kernel finds out which IRQ it @@ -4944,7 +4944,7 @@ need to solve another truckload of problems. It is not enough to know if a certain IRQ has happened, it’s also important to know what CPU(s) it was for. People still interested in more details might want to refer to "APIC" now. -

This function receives the IRQ number, the name of the function, +

This function receives the IRQ number, the name of the function, flags, a name for /proc/interrupts and a parameter to be passed to the interrupt handler. Usually there is a certain number of IRQs available. How many IRQs there are is hardware-dependent. The flags can include @@ -4954,16 +4954,16 @@ How many IRQs there are is hardware-dependent. The flags can include SA_INTERRUPT to indicate this is a fast interrupt. This function will only succeed if there is not already a handler on this IRQ, or if you are both willing to share. -

+

15.2 Detecting button presses

-

Many popular single board computers, such as Raspberry Pi or Beagleboards, have a +

Many popular single board computers, such as Raspberry Pi or Beagleboards, have a bunch of GPIO pins. Attaching buttons to those and then having a button press do something is a classic case in which you might need to use interrupts: instead of having the CPU waste time and battery power polling for a change in input state, it is better for the input to trigger the CPU to run a particular handling function. -

Here is an example where buttons are connected to GPIO numbers 17 and 18 and +

Here is an example where buttons are connected to GPIO numbers 17 and 18 and an LED is connected to GPIO 4. You can change those numbers to whatever is appropriate for your board.

@@ -5112,14 +5112,14 @@ appropriate for your board. 142 143MODULE_LICENSE("GPL"); 144MODULE_DESCRIPTION("Handle some GPIO interrupts"); -

+

15.3 Bottom Half

-

Suppose you want to do a bunch of stuff inside of an interrupt routine. A common +

Suppose you want to do a bunch of stuff inside of an interrupt routine. A common way to do that without rendering the interrupt unavailable for a significant duration is to combine it with a tasklet. This pushes the bulk of the work off into the scheduler. -

The example below modifies the previous example to also run an additional task +

The example below modifies the previous example to also run an additional task when an interrupt is triggered.

@@ -5293,19 +5293,19 @@ when an interrupt is triggered. 165 166MODULE_LICENSE("GPL"); 167MODULE_DESCRIPTION("Interrupt with top and bottom half"); -

+

16 Crypto

-

At the dawn of the internet, everybody trusted everybody completely…but that did +

At the dawn of the internet, everybody trusted everybody completely…but that did not work out so well. When this guide was originally written, it was a more innocent era in which almost nobody actually gave a damn about crypto - least of all kernel developers. That is certainly no longer the case now. To handle crypto stuff, the kernel has its own API enabling common methods of encryption, decryption and your favourite hash functions. -

+

16.1 Hash functions

-

Calculating and checking the hashes of things is a common operation. Here is a +

Calculating and checking the hashes of things is a common operation. Here is a demonstration of how to calculate a sha256 hash within a kernel module.

@@ -5373,20 +5373,20 @@ demonstration of how to calculate a sha256 hash within a kernel module. 62 63MODULE_DESCRIPTION("sha256 hash test"); 64MODULE_LICENSE("GPL"); -

Install the module: +

Install the module:

sudo insmod cryptosha256.ko
sudo dmesg
-

And you should see that the hash was calculated for the test string. -

Finally, remove the test module: +

And you should see that the hash was calculated for the test string. +

Finally, remove the test module:

sudo rmmod cryptosha256
-

+

16.2 Symmetric key encryption

-

Here is an example of symmetrically encrypting a string using the AES algorithm +

Here is an example of symmetrically encrypting a string using the AES algorithm and a password.

@@ -5591,10 +5591,10 @@ and a password. 196 197MODULE_DESCRIPTION("Symmetric key encryption example"); 198MODULE_LICENSE("GPL"); -

+

17 Virtual Input Device Driver

-

The input device driver is a module that provides a way to communicate +

The input device driver is a module that provides a way to communicate with an interaction device via events. For example, the keyboard can send a press or release event to tell the kernel what we want to do. The input device driver will allocate a new input structure with @@ -5602,7 +5602,7 @@ do. The input device driver will allocate a new input structure with and sets up input bitfields, device id, version, etc. After that, it registers the device by calling input_register_device() . -

Here is an example, vinput. It is an API that allows easy +

Here is an example, vinput. It is an API that allows easy development of virtual input drivers. A driver needs to export a vinput_device() that contains the virtual device name and @@ -5618,7 +5618,7 @@ development of virtual input drivers. A driver needs to export a

  • the readback function: read()
  • -

    Then using vinput_register_device() +

    Then using vinput_register_device() and vinput_unregister_device() will add a new device to the list of supported virtual input devices.

    @@ -5627,7 +5627,7 @@ development of virtual input drivers. The drivers needs to export a

    int init(struct vinput *);
    -

    This function is passed a struct vinput +

    This function is passed a struct vinput already initialized with an allocated struct input_dev . The init() function is responsible for initializing the capabilities of the input device and registering @@ -5635,20 +5635,20 @@ it.

    int send(struct vinput *, char *, int);
    -

    This function will receive a user string to interpret and inject the event using the +

    This function receives a user string, interprets it, and injects the event using the input_report_XXXX or input_event call. The string has already been copied from user space.

    int read(struct vinput *, char *, int);
    -

    This function is used for debugging and should fill the buffer parameter with the +

    This function is used for debugging and should fill the buffer parameter with the last event sent in the virtual input device format. The buffer will then be copied to user space. -

    vinput devices are created and destroyed using sysfs. And, event injection is done +

    vinput devices are created and destroyed using sysfs, and event injection is done through a /dev node. The device name is used by userland to export a new virtual input device. -

    The class_attribute +

    The class_attribute structure is similar to other attribute types we talked about in section 8:

    @@ -5659,7 +5659,7 @@ virtual input device. 5    ssize_t (*store)(struct class *class, struct class_attribute *attr, 6                    const char *buf, size_t count); 7}; -

    In vinput.c, the macro CLASS_ATTR_WO(export/unexport) +

    In vinput.c, the macro CLASS_ATTR_WO(export/unexport) defined in include/linux/device.h (in this case, device.h is included in include/linux/input.h) will generate the class_attribute structures which are named class_attr_export/unexport. Then, put them into @@ -5669,14 +5669,14 @@ will generate the class_attribute that should be assigned in vinput_class . Finally, call class_register(&vinput_class) to create attributes in sysfs. -

    To create a vinputX sysfs entry and /dev node: +

    To create a vinputX sysfs entry and /dev node:

    echo "vkbd" | sudo tee /sys/class/vinput/export
    -

    To unexport the device, just echo its id in unexport: +

    To unexport the device, just echo its id in unexport:

    echo "0" | sudo tee /sys/class/vinput/unexport
    @@ -6137,7 +6137,7 @@ will generate the class_attribute 400 401MODULE_LICENSE("GPL"); 402MODULE_DESCRIPTION("Emulate input events"); -

    Here the virtual keyboard is one example of using vinput. It supports all +

    Here the virtual keyboard is one example of using vinput. It supports all KEY_MAX keycodes. The injection format is the KEY_CODE such as defined in include/linux/input.h. A positive value means @@ -6145,12 +6145,12 @@ will generate the class_attribute while a negative value is a KEY_RELEASE . The keyboard supports repetition when the key stays pressed for too long. The following demonstrates how the simulation works. -

    Simulate a key press on "g" ( KEY_G +

    Simulate a key press on "g" ( KEY_G = 34):

    echo "+34" | sudo tee /dev/vinput0
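The sign convention of the injection format (a positive value for a press, a negative value for a release) could be decoded along the following lines. This is a userspace sketch under stated assumptions, not the parser used by the virtual keyboard module: `struct key_event`, `parse_key_event` and `SKETCH_KEY_MAX` are hypothetical names, and the bound `0x2ff` is assumed to mirror `KEY_MAX` from the kernel's input event codes.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed to mirror KEY_MAX from the kernel headers; hypothetical name. */
#define SKETCH_KEY_MAX 0x2ff

/* Hypothetical decoded form of one injected string such as "+34". */
struct key_event {
    int code;    /* key code, e.g. 34 for KEY_G */
    int pressed; /* 1 for a press, 0 for a release */
};

/* Parses a signed decimal key code. Returns 0 on success and -1 if the
 * string holds no number, is zero, or is outside the key code range. */
static int parse_key_event(const char *buf, struct key_event *ev)
{
    char *end;
    long value;

    assert(buf != NULL && ev != NULL);

    value = strtol(buf, &end, 10);
    if (end == buf || value == 0)
        return -1;
    if (value > SKETCH_KEY_MAX || value < -SKETCH_KEY_MAX)
        return -1;

    ev->pressed = value > 0;
    ev->code = (int)(value > 0 ? value : -value);
    return 0;
}
```

A trailing newline, as produced by echo, is tolerated because strtol stops at the first non-digit character.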
    -

    Simulate a key release on "g" ( KEY_G +

    Simulate a key release on "g" ( KEY_G = 34):

    @@ -6268,13 +6268,13 @@ following demonstrates how the simulation works. 108 109MODULE_LICENSE("GPL"); 110MODULE_DESCRIPTION("Emulate keyboard input events through /dev/vinput"); -

    +

    18 Standardizing the interfaces: The Device Model

    -

    Up to this point we have seen all kinds of modules doing all kinds of things, but there +

    Up to this point we have seen all kinds of modules doing all kinds of things, but there was no consistency in their interfaces with the rest of the kernel. To impose some consistency such that there is at minimum a standardized way to start, suspend and resume a device, a device model was added. An example is shown below, and you can @@ -6381,13 +6381,13 @@ functions. 97 98MODULE_LICENSE("GPL"); 99MODULE_DESCRIPTION("Linux Device Model example"); -

    +

    19 Optimizations

    -

    +

    19.1 Likely and Unlikely conditions

    -

    Sometimes you might want your code to run as quickly as possible, +

    Sometimes you might want your code to run as quickly as possible, especially if it is handling an interrupt or doing something which might cause noticeable latency. If your code contains boolean conditions and if you know that the conditions are almost always likely to evaluate as either @@ -6406,7 +6406,7 @@ to succeed. 4    bio = NULL; 5    goto out; 6} -

    When the unlikely +

    When the unlikely macro is used, the compiler alters its machine instruction output, so that it continues along the false branch and only jumps if the condition is true. That avoids flushing the processor pipeline. The opposite happens if you use the @@ -6415,34 +6415,34 @@ avoids flushing the processor pipeline. The opposite happens if you use the -
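As a rough userspace illustration of the same idea, the macros can be rebuilt on top of the `__builtin_expect` builtin provided by GCC and Clang. This is a sketch, not kernel code: `sum_non_negative` is a hypothetical function, and whether the hint changes the generated code depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalents of the kernel macros, built on the GCC/Clang
 * builtin. The !! normalizes any non-zero value to 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sums an array of non-negative values. A negative entry is treated as a
 * rare error, so that branch is marked unlikely. Returns -1 on error. */
static long sum_non_negative(const int *values, size_t n)
{
    long total = 0;

    assert(values != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (unlikely(values[i] < 0))
            return -1; /* rare error path */
        total += values[i];
    }
    return total;
}
```

The hint never changes the result of the condition, only how the compiler is asked to lay out the two branches.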

    +

    20 Common Pitfalls

    -

    +

    20.1 Using standard libraries

    -

    You can not do that. In a kernel module, you can only use kernel functions which are +

    You cannot do that. In a kernel module, you can only use kernel functions, which are the functions you can see in /proc/kallsyms. -

    +

    20.2 Disabling interrupts

    -

    You might need to do this for a short time and that is OK, but if you do not enable +

    You might need to do this for a short time and that is OK, but if you do not enable them afterwards, your system will be stuck and you will have to power it off. -

    +

    21 Where To Go From Here?

    -

    For people seriously interested in kernel programming, I recommend kernelnewbies.org +

    For people seriously interested in kernel programming, I recommend kernelnewbies.org and the Documentation subdirectory within the kernel source code which is not always easy to understand but can be a starting point for further investigation. Also, as Linus Torvalds said, the best way to learn the kernel is to read the source code yourself. -

    If you would like to contribute to this guide or notice anything glaringly wrong, +

    If you would like to contribute to this guide or notice anything glaringly wrong, please create an issue at https://github.com/sysprog21/lkmpg. Your pull requests will be appreciated. -

    Happy hacking! +

    Happy hacking!

    -

    1The goal of threaded interrupts is to push more of the work to separate threads, so that the +

    1The goal of threaded interrupts is to push more of the work to separate threads, so that the minimum needed for acknowledging an interrupt is reduced, and therefore the time spent handling the interrupt (where it can’t handle any other interrupts at the same time) is reduced. See https://lwn.net/Articles/302043/.

    diff --git a/lkmpg-for-ht.html b/lkmpg-for-ht.html index 19ab682..aad0e54 100644 --- a/lkmpg-for-ht.html +++ b/lkmpg-for-ht.html @@ -18,7 +18,7 @@

    The Linux Kernel Module Programming Guide

    Peter Jay Salzman, Michael Burian, Ori Pomerantz, Bob Mottram, Jim Huang

    -
    June 25, 2022
    +
    July 2, 2022
    @@ -4597,8 +4597,8 @@ to call arbitrary functions within the kernel. Also, the function prototype of t containing a unsigned long argument, will prevent work from any type checking. Furthermore, the function prototype with unsigned long - argument may be an obstacle to the control-flow integrity. Thus, it is better -to use a unique prototype to separate from the cluster that takes an +
    argument may be an obstacle to the forward-edge protection of control-flow integrity. +Thus, it is better to use a unique prototype to separate from the cluster that takes an unsigned long argument. The timer callback should be passed a pointer to the timer_list @@ -4608,8 +4608,8 @@ to use a unique prototype to separate from the cluster that takes an structure, into a larger structure, and it can use the container_of macro instead of the unsigned long - value. -

    Before Linux v4.14, setup_timer + value. For more information see: Improving the kernel timers API. +

    Before Linux v4.14, setup_timer was used to initialize the timer and the timer_list structure looked like: @@ -4624,7 +4624,7 @@ to use a unique prototype to separate from the cluster that takes an 8 9void setup_timer(struct timer_list *timer, void (*callback)(unsigned long), 10                 unsigned long data); -

    Since Linux v4.14, timer_setup +

    Since Linux v4.14, timer_setup has been adopted, and the kernel was converted step by step to timer_setup from setup_timer @@ -4638,7 +4638,7 @@ Moreover, the timer_setup

    void timer_setup(struct timer_list *timer,
                     void (*callback)(struct timer_list *), unsigned int flags);
    -

    The setup_timer +

    The setup_timer was then removed in v4.15. As a result, the timer_list structure changed to the following. @@ -4649,7 +4649,7 @@ Moreover, the timer_setup 4    u32 flags; 5    /* ... */ 6}; -

    The following source code illustrates a minimal kernel module which, when +

    The following source code illustrates a minimal kernel module which, when loaded, starts blinking the keyboard LEDs until it is unloaded.

    @@ -4738,7 +4738,7 @@ loaded, starts blinking the keyboard LEDs until it is unloaded. 83module_exit(kbleds_cleanup); 84 85MODULE_LICENSE("GPL"); -

    If none of the examples in this chapter fit your debugging needs, +

    If none of the examples in this chapter fit your debugging needs, there might yet be some other tricks to try. Ever wondered what CONFIG_LL_DEBUG in make menuconfig @@ -4749,25 +4749,25 @@ everything your code does over a serial line. If you find yourself porting the kernel to some new and formerly unsupported architecture, this is usually amongst the first things that should be implemented. Logging over a netconsole might also be worth a try. -

    While you have seen lots of stuff that can be used to aid debugging here, there are +

    While you have seen lots of stuff that can be used to aid debugging here, there are some things to be aware of. Debugging is almost always intrusive. Adding debug code can change the situation enough to make the bug seem to disappear. Thus, you should keep debug code to a minimum and make sure it does not show up in production code. -

    +

    14 Scheduling Tasks

    -

    There are two main ways of running tasks: tasklets and work queues. Tasklets are a +

    There are two main ways of running tasks: tasklets and work queues. Tasklets are a quick and easy way of scheduling a single function to be run, for example when triggered from an interrupt, whereas work queues are more complicated but also better suited to running multiple things in a sequence. -

    +

    14.1 Tasklets

    -

    Here is an example tasklet module. The +

    Here is an example tasklet module. The tasklet_fn function runs for a few seconds and in the mean time execution of the example_tasklet_init @@ -4818,7 +4818,7 @@ better suited to running multiple things in a sequence. 42 43MODULE_DESCRIPTION("Tasklet example"); 44MODULE_LICENSE("GPL"); -

    So with this example loaded dmesg +

    So with this example loaded dmesg should show: @@ -4830,23 +4830,23 @@ Example tasklet starts Example tasklet init continues... Example tasklet ends -

    Although tasklets are easy to use, they come with several drawbacks, and developers are +

    Although tasklets are easy to use, they come with several drawbacks, and developers are discussing getting rid of tasklets in the Linux kernel. The tasklet callback runs in atomic context, inside a software interrupt, meaning that it cannot sleep or access user-space data, so not all work can be done in a tasklet handler. Also, the kernel only allows one instance of any given tasklet to be running at any given time; multiple different tasklet callbacks can run in parallel. -

    In recent kernels, tasklets can be replaced by workqueues, timers, or threaded +

    In recent kernels, tasklets can be replaced by workqueues, timers, or threaded interrupts.1 While the removal of tasklets remains a longer-term goal, the current kernel contains more than a hundred uses of tasklets. Now developers are proceeding with the API changes and the macro DECLARE_TASKLET_OLD exists for compatibility. For further information, see https://lwn.net/Articles/830964/. -

    +

    14.2 Work queues

    -

    To add a task to the scheduler we can use a workqueue. The kernel then uses the +

    To add a task to the scheduler we can use a workqueue. The kernel then uses the Completely Fair Scheduler (CFS) to execute work within the queue.

    @@ -4883,36 +4883,36 @@ Completely Fair Scheduler (CFS) to execute work within the queue. 31 32MODULE_LICENSE("GPL"); 33MODULE_DESCRIPTION("Workqueue example"); -

    +

    15 Interrupt Handlers

    -

    +

    15.1 Interrupt Handlers

    -

    Except for the last chapter, everything we did in the kernel so far we have done as a +

    Except for the last chapter, everything we did in the kernel so far we have done as a response to a process asking for it, either by dealing with a special file, sending an ioctl() , or issuing a system call. But the job of the kernel is not just to respond to process requests. Another job, which is every bit as important, is to speak to the hardware connected to the machine. -

    There are two types of interaction between the CPU and the rest of the +

    There are two types of interaction between the CPU and the rest of the computer’s hardware. The first type is when the CPU gives orders to the hardware; the other is when the hardware needs to tell the CPU something. The second, called interrupts, is much harder to implement because it has to be dealt with when convenient for the hardware, not the CPU. Hardware devices typically have a very small amount of RAM, and if you do not read their information when available, it is lost. -

    Under Linux, hardware interrupts are called IRQ’s (Interrupt ReQuests). There +

    Under Linux, hardware interrupts are called IRQ’s (Interrupt ReQuests). There are two types of IRQ’s, short and long. A short IRQ is one which is expected to take a very short period of time, during which the rest of the machine will be blocked and no other interrupts will be handled. A long IRQ is one which can take longer, and during which other interrupts may occur (but not interrupts from the same device). If at all possible, it is better to declare an interrupt handler to be long. -

    When the CPU receives an interrupt, it stops whatever it is doing (unless it is +

    When the CPU receives an interrupt, it stops whatever it is doing (unless it is processing a more important interrupt, in which case it will deal with this one only when the more important one is done), saves certain parameters on the stack and calls the interrupt handler. This means that certain things are not allowed in the @@ -4924,10 +4924,10 @@ heavy work deferred from an interrupt handler. Historically, BH (Linux naming for Bottom Halves) statically book-keeps the deferred functions. Softirq and its higher level abstraction, Tasklet, have replaced BH since Linux 2.3. -

    The way to implement this is to call +

    The way to implement this is to call request_irq() to get your interrupt handler called when the relevant IRQ is received. -

    In practice IRQ handling can be a bit more complex. Hardware is often +

    In practice IRQ handling can be a bit more complex. Hardware is often designed in a way that chains two interrupt controllers, so that all the IRQs from interrupt controller B are cascaded to a certain IRQ from interrupt controller A. Of course, that requires that the kernel finds out which IRQ it @@ -4944,7 +4944,7 @@ need to solve another truckload of problems. It is not enough to know if a certain IRQ has happened, it’s also important to know what CPU(s) it was for. People still interested in more details might want to refer to "APIC" now. -

    This function receives the IRQ number, the name of the function, +

    This function receives the IRQ number, the name of the function, flags, a name for /proc/interrupts and a parameter to be passed to the interrupt handler. Usually there is a certain number of IRQs available. How many IRQs there are is hardware-dependent. The flags can include @@ -4954,16 +4954,16 @@ How many IRQs there are is hardware-dependent. The flags can include SA_INTERRUPT to indicate this is a fast interrupt. This function will only succeed if there is not already a handler on this IRQ, or if you are both willing to share. -

    +

    15.2 Detecting button presses

    -

    Many popular single board computers, such as Raspberry Pi or Beagleboards, have a +

    Many popular single board computers, such as Raspberry Pi or Beagleboards, have a bunch of GPIO pins. Attaching buttons to those and then having a button press do something is a classic case in which you might need to use interrupts: instead of having the CPU waste time and battery power polling for a change in input state, it is better for the input to trigger the CPU to run a particular handling function. -

    Here is an example where buttons are connected to GPIO numbers 17 and 18 and +

    Here is an example where buttons are connected to GPIO numbers 17 and 18 and an LED is connected to GPIO 4. You can change those numbers to whatever is appropriate for your board.

    @@ -5112,14 +5112,14 @@ appropriate for your board. 142 143MODULE_LICENSE("GPL"); 144MODULE_DESCRIPTION("Handle some GPIO interrupts"); -

    +

    15.3 Bottom Half

    -

    Suppose you want to do a bunch of stuff inside of an interrupt routine. A common +

    Suppose you want to do a bunch of stuff inside of an interrupt routine. A common way to do that without rendering the interrupt unavailable for a significant duration is to combine it with a tasklet. This pushes the bulk of the work off into the scheduler. -

    The example below modifies the previous example to also run an additional task +

    The example below modifies the previous example to also run an additional task when an interrupt is triggered.

    @@ -5293,19 +5293,19 @@ when an interrupt is triggered. 165 166MODULE_LICENSE("GPL"); 167MODULE_DESCRIPTION("Interrupt with top and bottom half"); -
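The split between the two halves can be modelled in plain user-space C. The sketch below is only an illustration under stated assumptions: `top_half`, `bottom_half`, `schedule_deferred` and `run_pending` are hypothetical names, and the small array stands in for the kernel's tasklet machinery, which in reality runs deferred work in softirq context rather than from a loop like this one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8 /* arbitrary queue size chosen for this sketch */

typedef void (*work_fn)(unsigned long data);

struct deferred_work {
    work_fn fn;
    unsigned long data;
};

static struct deferred_work pending[MAX_PENDING];
static size_t pending_count;
static unsigned long presses_handled;

/* Bottom half: the slow part of the work, executed later. */
static void bottom_half(unsigned long data)
{
    presses_handled += data;
}

/* Queue work for later; returns 0 on success, -1 if the queue is full. */
static int schedule_deferred(work_fn fn, unsigned long data)
{
    assert(fn != NULL);

    if (pending_count == MAX_PENDING)
        return -1;

    pending[pending_count].fn = fn;
    pending[pending_count].data = data;
    pending_count++;
    return 0;
}

/* Top half: do the minimum, hand the rest over, and return quickly. */
static int top_half(void)
{
    return schedule_deferred(bottom_half, 1);
}

/* Stand-in for the kernel running deferred work; returns how many items ran. */
static size_t run_pending(void)
{
    size_t i, ran = pending_count;

    for (i = 0; i < ran; i++)
        pending[i].fn(pending[i].data);

    pending_count = 0;
    return ran;
}
```

Calling `top_half()` twice queues two items and leaves `presses_handled` untouched; the counter only changes once `run_pending()` is called, which mirrors how the handler returns quickly while the heavier work happens afterwards.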

    +

    16 Crypto

    -

    At the dawn of the internet, everybody trusted everybody completely…but that did +

    At the dawn of the internet, everybody trusted everybody completely…but that did not work out so well. When this guide was originally written, it was a more innocent era in which almost nobody actually gave a damn about crypto - least of all kernel developers. That is certainly no longer the case now. To handle crypto stuff, the kernel has its own API enabling common methods of encryption, decryption and your favourite hash functions. -

    +

    16.1 Hash functions

    -

    Calculating and checking the hashes of things is a common operation. Here is a +

    Calculating and checking the hashes of things is a common operation. Here is a demonstration of how to calculate a sha256 hash within a kernel module.

    @@ -5373,20 +5373,20 @@ demonstration of how to calculate a sha256 hash within a kernel module. 62 63MODULE_DESCRIPTION("sha256 hash test"); 64MODULE_LICENSE("GPL"); -

    Install the module: +

    Install the module:

    sudo insmod cryptosha256.ko
    sudo dmesg
    -

    And you should see that the hash was calculated for the test string. -

    Finally, remove the test module: +

    And you should see that the hash was calculated for the test string. +

    Finally, remove the test module:

    sudo rmmod cryptosha256
    -

    +

    16.2 Symmetric key encryption

    -

    Here is an example of symmetrically encrypting a string using the AES algorithm +

    Here is an example of symmetrically encrypting a string using the AES algorithm and a password.

    @@ -5591,10 +5591,10 @@ and a password. 196 197MODULE_DESCRIPTION("Symmetric key encryption example"); 198MODULE_LICENSE("GPL"); -

    +

    17 Virtual Input Device Driver

    -

    The input device driver is a module that provides a way to communicate +

    The input device driver is a module that provides a way to communicate with the interaction device via the event. For example, the keyboard can send the press or release event to tell the kernel what we want to do. The input device driver will allocate a new input structure with @@ -5602,7 +5602,7 @@ do. The input device driver will allocate a new input structure with and sets up input bitfields, device id, version, etc. After that, registers it by calling input_register_device() . -

    Here is an example, vinput, It is an API to allow easy +

    Here is an example, vinput. It is an API to allow easy development of virtual input drivers. The driver needs to export a vinput_device() that contains the virtual device name and @@ -5618,7 +5618,7 @@ development of virtual input drivers. The drivers needs to export a

  • the readback function: read()
  • -

    Then using vinput_register_device() +

    Then use vinput_register_device() and vinput_unregister_device() to add a device to, or remove it from, the list of supported virtual input devices.

    @@ -5627,7 +5627,7 @@ development of virtual input drivers. The drivers needs to export a

    int init(struct vinput *);
    -

    This function is passed a struct vinput +

    This function is passed a struct vinput already initialized with an allocated struct input_dev. The init() function is responsible for initializing the capabilities of the input device and registering @@ -5635,20 +5635,20 @@ it.

    int send(struct vinput *, char *, int);
    -

    This function will receive a user string to interpret and inject the event using the +

    This function will receive a user string to interpret and inject the event using the input_report_XXXX or input_event call. The string is already copied from user.

    int read(struct vinput *, char *, int);
    -

    This function is used for debugging and should fill the buffer parameter with the +

    This function is used for debugging and should fill the buffer parameter with the last event sent in the virtual input device format. The buffer will then be copied to user. -

    vinput devices are created and destroyed using sysfs. And, event injection is done +

    vinput devices are created and destroyed using sysfs, and event injection is done through a /dev node. Userland uses the device name to export a new virtual input device. -

    The class_attribute +

    The class_attribute structure is similar to other attribute types we talked about in section 8:

    @@ -5659,7 +5659,7 @@ virtual input device. 5    ssize_t (*store)(struct class *class, struct class_attribute *attr, 6                    const char *buf, size_t count); 7}; -

    In vinput.c, the macro CLASS_ATTR_WO(export/unexport) +

    In vinput.c, the macro CLASS_ATTR_WO(export/unexport) defined in include/linux/device.h (in this case, device.h is included in include/linux/input.h) will generate the class_attribute structures which are named class_attr_export/unexport. Then, put them into @@ -5669,14 +5669,14 @@ will generate the class_attribute that should be assigned in vinput_class . Finally, call class_register(&vinput_class) to create attributes in sysfs. -
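The naming scheme relies on preprocessor token pasting. The following user-space sketch imitates it with a simplified, hypothetical `struct sketch_attribute` and a `SKETCH_ATTR_WO` macro; neither is the kernel's real definition, and the body of `export_store` is invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Simplified stand-in for struct class_attribute (assumption, not the kernel type). */
struct sketch_attribute {
    const char *name;
    unsigned short mode;
    ssize_t (*store)(const char *buf, size_t count);
};

/*
 * Token pasting derives both the variable name (sketch_attr_<name>) and the
 * callback name (<name>_store) from the single macro argument.
 * Mode 0200 means the attribute is write-only for its owner.
 */
#define SKETCH_ATTR_WO(_name)                       \
    struct sketch_attribute sketch_attr_##_name = { \
        .name = #_name,                             \
        .mode = 0200,                               \
        .store = _name##_store,                     \
    }

/* Remembers the last string written, without its trailing newline. */
static char last_written[32];

/* Hypothetical store callback: records the device name that was written. */
static ssize_t export_store(const char *buf, size_t count)
{
    size_t len = count;

    assert(buf != NULL);

    if (len >= sizeof(last_written))
        len = sizeof(last_written) - 1;
    memcpy(last_written, buf, len);
    last_written[len] = '\0';

    /* echo appends a newline; drop it. */
    if (len > 0 && last_written[len - 1] == '\n')
        last_written[len - 1] = '\0';

    return (ssize_t)count;
}

/* Expands to: static struct sketch_attribute sketch_attr_export = { ... }; */
static SKETCH_ATTR_WO(export);
```

After expansion, `sketch_attr_export.name` is the string "export" and `sketch_attr_export.store` points at `export_store`, so one macro argument is enough to tie the file name and its write handler together.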

    To create a vinputX sysfs entry and /dev node. +

    To create a vinputX sysfs entry and /dev node:

    echo "vkbd" | sudo tee /sys/class/vinput/export
    -

    To unexport the device, just echo its id in unexport: +

    To unexport the device, just echo its id in unexport:

    echo "0" | sudo tee /sys/class/vinput/unexport
    @@ -6137,7 +6137,7 @@ will generate the class_attribute 400 401MODULE_LICENSE("GPL"); 402MODULE_DESCRIPTION("Emulate input events"); -

    Here the virtual keyboard is one of example to use vinput. It supports all +

    The virtual keyboard is one example of using vinput. It supports all KEY_MAX keycodes. The injection format is the KEY_CODE as defined in include/linux/input.h. A positive value means @@ -6145,12 +6145,12 @@ will generate the class_attribute while a negative value is a KEY_RELEASE. The keyboard supports repetition when the key stays pressed for too long. The following demonstrates how the simulation works. -

    Simulate a key press on "g" ( KEY_G +

    Simulate a key press on "g" ( KEY_G = 34):

    echo "+34" | sudo tee /dev/vinput0
    -

    Simulate a key release on "g" ( KEY_G +

    Simulate a key release on "g" ( KEY_G = 34):

    @@ -6268,13 +6268,13 @@ following demonstrates how simulation work. 108 109MODULE_LICENSE("GPL"); 110MODULE_DESCRIPTION("Emulate keyboard input events through /dev/vinput"); -
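The injection format described above (a signed key code, positive for press and negative for release) can be sketched as a small user-space parser. This is not the code from vinput or vkbd: `parse_key_event`, `struct key_event` and `SKETCH_KEY_MAX` are hypothetical, and rejecting 0 is a choice made for this sketch because the text only defines positive and negative values.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Assumption: mirrors the value of KEY_MAX in the kernel's input headers. */
#define SKETCH_KEY_MAX 0x2ff

struct key_event {
    int code;    /* key code, e.g. 34 for KEY_G */
    int pressed; /* 1 = key press, 0 = key release */
};

/* Parse strings such as "+34" or "-34"; returns 0 or -EINVAL. */
static int parse_key_event(const char *buf, struct key_event *ev)
{
    char *end;
    long value;

    assert(buf != NULL && ev != NULL);

    errno = 0;
    value = strtol(buf, &end, 10); /* accepts an optional leading '+' or '-' */
    if (end == buf || errno == ERANGE)
        return -EINVAL;

    /* Tolerate the newline that echo appends, nothing else. */
    if (*end != '\0' && *end != '\n')
        return -EINVAL;

    if (value == 0 || value > SKETCH_KEY_MAX || value < -SKETCH_KEY_MAX)
        return -EINVAL;

    ev->pressed = value > 0;
    ev->code = (int)(value > 0 ? value : -value);
    return 0;
}
```

With this sketch, "+34" yields code 34 with `pressed` set to 1, and "-34" yields the same code with `pressed` set to 0, matching the press and release examples for KEY_G.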

    +

    18 Standardizing the interfaces: The Device Model

    -

    Up to this point we have seen all kinds of modules doing all kinds of things, but there +

    Up to this point we have seen all kinds of modules doing all kinds of things, but there was no consistency in their interfaces with the rest of the kernel. To impose some consistency, such that there is at minimum a standardized way to start, suspend and resume a device, a device model was added. An example is shown below, and you can @@ -6381,13 +6381,13 @@ functions. 97 98MODULE_LICENSE("GPL"); 99MODULE_DESCRIPTION("Linux Device Model example"); -

    +

    19 Optimizations

    -

    +

    19.1 Likely and Unlikely conditions

    -

    Sometimes you might want your code to run as quickly as possible, +

    Sometimes you might want your code to run as quickly as possible, especially if it is handling an interrupt or doing something which might cause noticeable latency. If your code contains boolean conditions and if you know that the conditions are almost always likely to evaluate as either @@ -6406,7 +6406,7 @@ to succeed. 4    bio = NULL; 5    goto out; 6} -

    When the unlikely +

    When the unlikely macro is used, the compiler alters its machine instruction output, so that it continues along the false branch and only jumps if the condition is true. That avoids flushing the processor pipeline. The opposite happens if you use the @@ -6415,34 +6415,34 @@ avoids flushing the processor pipeline. The opposite happens if you use the -
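For experimenting outside the kernel, the two macros can be reproduced with the GCC/Clang builtin `__builtin_expect`, which is what the kernel's own definitions expand to. The helper `checked_div` below is hypothetical and exists only to show an error path marked as unlikely; treat it as a sketch, not as kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Same shape as the kernel macros; requires GCC or Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: divide a by b and store the quotient in *out.
 * Returns 0 on success and -1 on the rare invalid-input path.
 */
static int checked_div(int a, int b, int *out)
{
    assert(out != NULL);

    /* The error branch is hinted as rare, so the compiler may move it
     * out of the straight-line path. */
    if (unlikely(b == 0 || (a == INT_MIN && b == -1)))
        return -1;

    *out = a / b;
    return 0;
}
```

The hint only influences how the compiler lays out the branches; the result of the condition, and therefore the behaviour of the function, is the same with or without it.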

    +

    20 Common Pitfalls

    -

    +

    20.1 Using standard libraries

    -

    You can not do that. In a kernel module, you can only use kernel functions which are +

    You cannot do that. In a kernel module, you can only use kernel functions, which are the functions you can see in /proc/kallsyms. -

    +

    20.2 Disabling interrupts

    -

    You might need to do this for a short time and that is OK, but if you do not enable +

    You might need to do this for a short time and that is OK, but if you do not enable them afterwards, your system will be stuck and you will have to power it off. -

    +

    21 Where To Go From Here?

    -

    For people seriously interested in kernel programming, I recommend kernelnewbies.org +

    For people seriously interested in kernel programming, I recommend kernelnewbies.org and the Documentation subdirectory within the kernel source code, which is not always easy to understand but can be a starting point for further investigation. Also, as Linus Torvalds said, the best way to learn the kernel is to read the source code yourself. -

    If you would like to contribute to this guide or notice anything glaringly wrong, +

    If you would like to contribute to this guide or notice anything glaringly wrong, please create an issue at https://github.com/sysprog21/lkmpg. Your pull requests will be appreciated. -

    Happy hacking! +

    Happy hacking!

    -

    1The goal of threaded interrupts is to push more of the work to separate threads, so that the +

    1The goal of threaded interrupts is to push more of the work to separate threads, so that the minimum needed for acknowledging an interrupt is reduced, and therefore the time spent handling the interrupt (where it can’t handle any other interrupts at the same time) is reduced. See https://lwn.net/Articles/302043/.