kdump: Introduce new kdump_emergency option and /etc/kdump/emergency.d directory interface #143

Open
nishibem wants to merge 9 commits into rhkdump:main from nishibem:pr/kdump_emergency

Conversation

@nishibem
Contributor

This patch set introduces a new kdump_emergency option along with its
corresponding directory interface, emergency.d.
The design follows the existing kdump_pre and kdump_post mechanisms.

The kdump.sh script is extended to execute binaries and scripts provided
via emergency.d and kdump_emergency configuration values during emergency
handling.

With this enhancement, administrators can run custom logic and
emergency-specific actions during emergency handling before system
shutdown starts.

The series also includes documentation updates that describe the
kdump_emergency interface.

…ing error handling

Update kdump.sh to execute binary and script files located under
/etc/kdump/emergency.d and the file specified by kdump_emergency
option when the error handler is invoked in emergency mode.

Running these files during error handling ensures that emergency-specific
actions and configuration are applied even when kdump encounters failures
prior to the dump process.
This allows administrators to define custom recovery or logging behavior
in error scenarios.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

…to kdump initramfs

Include files installed under the /etc/kdump/emergency.d directory
and the file specified by the kdump_emergency option in the kdump initramfs.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Add support for the kdump_emergency option to the kdump.conf parser.
This allows the parser to recognize the option as valid.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Consolidate EXTRA_BINS assignments into a single location and
remove the CHECK_FILES variable.
Also remove the directories from HOOKS.

No functional change.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Add update checks for binary and script files located under the
/etc/kdump/emergency.d directory, and for the file specified by the
kdump_emergency option.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

This patch adds descriptions of the kdump_emergency option to the documentation.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Replace "All" with "all" to correct grammar.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Add a description of the Kdump Emergency Executable to kexec-kdump-howto.txt.
This documentation explains the purpose of the Kdump Emergency Executable.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

Add setup logic to ensure that the kdump/emergency.d directory is properly
created and prepared.

Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>

@gemini-code-assist
Contributor

Summary of Changes

Hello @nishibem, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the kdump functionality by introducing a mechanism to execute custom scripts during emergency handling, allowing administrators to perform specific actions before the system shuts down. It adds a new configuration option and directory interface, similar to the existing pre and post kdump mechanisms, to provide greater flexibility in managing emergency scenarios.

Highlights

  • New kdump_emergency Option: Introduces a new kdump_emergency option in kdump.conf to specify executables or scripts to run during emergency handling before system shutdown.
  • Emergency Script Directory: Adds support for executing scripts located in the /etc/kdump/emergency.d directory, which are run in lexical order before any script specified via the kdump_emergency option.
  • Integration with kdump.sh: Modifies the kdump.sh script to execute the scripts from emergency.d and the kdump_emergency configuration value during emergency handling.
  • Documentation Updates: Includes updates to documentation to describe the new kdump_emergency interface.
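
A note on what "lexical order" means in practice, as a minimal sketch rather than code from this PR. It assumes the hook directory is walked with a plain `for file in "$dir"/*` loop, as in the `do_kdump_emergency` snippet quoted in the review; the helper name `list_in_glob_order` is hypothetical.

```shell
# Hypothetical helper: print the entries of a directory in the order a
# `for file in "$dir"/*` loop would visit them (sorted pathname expansion).
# $1: directory to list (stands in for /etc/kdump/emergency.d)
list_in_glob_order() {
    for file in "$1"/*; do
        # Skip the unexpanded pattern left behind when nothing matches.
        [ -e "$file" ] && basename "$file"
    done
    return 0
}
```

For example, with `9-log.sh` and `10-net.sh` in the directory, `10-net.sh` is listed first, because names are compared character by character rather than numerically. Zero-padded prefixes such as `09-` and `10-` give the numeric order.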

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new kdump_emergency option and /etc/kdump/emergency.d directory interface, enhancing kdump's emergency handling capabilities. The changes include modifications to the Makefile, kdump.sh, module-setup.sh, gen-kdump-conf.sh, kdump-utils.spec, kdump.conf.5, kdumpctl, and kexec-kdump-howto.txt. The review identified a few areas for improvement, primarily focusing on error handling and code clarity.

EXTRA_BINS="$EXTRA_BINS $CHECK_FILES"
CHECK_FILES=${OPT[extra_bins]}
EXTRA_BINS="$EXTRA_BINS $CHECK_FILES"
EXTRA_BINS="${OPT[kdump_post]} ${OPT[kdump_pre]} ${OPT[kdump_emergency]} ${OPT[extra_bins]}"
Contributor

high

The variables OPT[kdump_post], OPT[kdump_pre], OPT[kdump_emergency], and OPT[extra_bins] are concatenated without any separators. This could lead to issues if the values contain spaces or special characters.

Suggested change
EXTRA_BINS="${OPT[kdump_post]} ${OPT[kdump_pre]} ${OPT[kdump_emergency]} ${OPT[extra_bins]}"
EXTRA_BINS="${OPT[kdump_post]} ${OPT[kdump_pre]} ${OPT[kdump_emergency]} ${OPT[extra_bins]}"

Contributor Author

The original code did not take into account cases where the values contain spaces or special characters.
Could you share your thoughts on how we should handle this?
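
One possible direction, offered only as a sketch and not as the project's approach: keep each path as a separate item instead of joining everything into one space-separated string. The example below is POSIX shell and passes the paths around one per line. The function names `collect_hook_paths` and `check_hook_paths` are hypothetical, the arguments stand in for the `${OPT[kdump_post]}`, `${OPT[kdump_pre]}`, and `${OPT[kdump_emergency]}` values, and it assumes each of those options names a single file whose path contains no newline.

```shell
# Hypothetical sketch: print each non-empty hook path on its own line so
# that a path containing spaces stays a single item.
collect_hook_paths() {
    for hook in "$@"; do
        [ -n "$hook" ] && printf '%s\n' "$hook"
    done
    return 0
}

# Consume the list line by line; IFS= and -r keep spaces and backslashes.
check_hook_paths() {
    collect_hook_paths "$@" | while IFS= read -r hook; do
        if [ ! -x "$hook" ]; then
            echo "$hook is not executable" >&2
        fi
    done
}
```

In bash code such as kdumpctl, an array would serve the same purpose; the line-per-path form is used here only to keep the sketch portable.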

Comment on lines +381 to +398
do_kdump_emergency() {
    if [ -d /etc/kdump/emergency.d ]; then
        for file in /etc/kdump/emergency.d/*; do
            "$file"
            _ret=$?
            if [ $_ret -ne 0 ]; then
                derror "$file exited with $_ret status"
            fi
        done
    fi

    if [ -n "$KDUMP_EMERGENCY" ]; then
        "$KDUMP_EMERGENCY"
        _ret=$?
        if [ $_ret -ne 0 ]; then
            derror "$KDUMP_EMERGENCY exited with $_ret status"
        fi
    fi
Contributor

medium

The do_kdump_emergency function does not have any error handling for the directory iteration. If /etc/kdump/emergency.d does not exist or is not a directory, the script will continue without any warning. This could lead to unexpected behavior if the administrator expects the emergency scripts to be executed.

do_kdump_emergency() {
    if [ ! -d /etc/kdump/emergency.d ]; then
        derror "/etc/kdump/emergency.d is not a directory or does not exist"
        return 1 # Indicate failure to the caller
    fi

    for file in /etc/kdump/emergency.d/*; do
        "$file"
        _ret=$?
        if [ $_ret -ne 0 ]; then
            derror "$file exited with $_ret status"
        fi
    done

    if [ -n "$KDUMP_EMERGENCY" ]; then
        "$KDUMP_EMERGENCY"
        _ret=$?
        if [ $_ret -ne 0 ]; then
            derror "$KDUMP_EMERGENCY exited with $_ret status"
        fi
    fi
}

Contributor Author

Since this is implemented exactly the same way as pre.d and post.d, if we fix this, we also need to apply the same fix to those as well.
I think it would be better to bundle the changes for pre.d and post.d into a separate pull request.

Comment on lines +689 to +690
elif [[ $file != "/etc/kdump/emergency.d/*" ]]; then
echo "$file is not executable"
Contributor

medium

The echo statement does not redirect to standard error, so the message will not be visible in logs. This could make it difficult to diagnose issues with the emergency scripts.

Suggested change
elif [[ $file != "/etc/kdump/emergency.d/*" ]]; then
echo "$file is not executable"
echo "$file is not executable" >&2

Contributor Author

This isn’t an emergency-time message; it can be logged during initramfs generation when kdump.service starts.
We’ve confirmed it appears in the journal log under kdump.service, consistent with pre.d/post.d.
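
For readers unfamiliar with the `!= "/etc/kdump/emergency.d/*"` comparison in the snippet under review: when a glob matches nothing, the shell leaves the pattern in place as a literal string, so a loop over an empty directory runs once with the unexpanded pattern. The check presumably suppresses the "not executable" message in that case. A small POSIX shell sketch of the behavior, using a hypothetical function name and taking the directory as an argument:

```shell
# Hypothetical sketch: report non-executable files in a directory, but stay
# silent when the directory is empty and the glob was left unexpanded.
# $1: directory to scan (stands in for /etc/kdump/emergency.d)
report_non_executable() {
    dir=$1
    for file in "$dir"/*; do
        if [ -x "$file" ]; then
            continue
        elif [ "$file" != "$dir/*" ]; then
            # A real entry that lacks the execute bit.
            echo "$file is not executable"
        fi
    done
}
```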

@coiby
Member

coiby commented Jan 30, 2026

Hi @nishibem

Thanks for creating the PR! I don't see anything wrong with the PR. But the logic of kdump_pre, kdump_post and kdump_emergency is pretty much the same. So for the sake of readability and maintainability, I would appreciate it if you could refactor the code so they reuse common functions. If you don't have time, that's fine and I'll merge this first and create a separate PR.

@nishibem
Contributor Author

nishibem commented Feb 2, 2026

Hi @coiby

Thank you for your feedback. You are right that kdump_pre, kdump_post, and kdump_emergency operate with similar logic, so I will refactor them to share common functions.
However, since that falls outside the main purpose of this change, I will submit the refactoring as a separate PR later.
Could you please merge this kdump_emergency feature first?
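
As a rough idea of what such a consolidation could look like (this is not code from this PR or any follow-up): a single helper takes the hook directory and the configured executable as arguments, and thin wrappers call it. The names `run_kdump_hooks` and `log_error` are hypothetical; `log_error` stands in for `derror`, and the `pre.d`/`post.d` paths and `KDUMP_PRE`/`KDUMP_POST` variables are assumed by analogy with the `emergency.d` snippet quoted in the review.

```shell
# Stand-in for the derror logging helper (assumption for this sketch).
log_error() {
    echo "$*" >&2
}

# Hypothetical shared helper.
# $1: hook directory, $2: configured executable (may be empty);
# any remaining arguments are passed through to every hook.
run_kdump_hooks() {
    _dir=$1
    _hook=$2
    shift 2

    if [ -d "$_dir" ]; then
        # Directory hooks run first, in sorted glob order.
        for file in "$_dir"/*; do
            "$file" "$@"
            _ret=$?
            if [ "$_ret" -ne 0 ]; then
                log_error "$file exited with $_ret status"
            fi
        done
    fi

    # The single configured executable runs last.
    if [ -n "$_hook" ]; then
        "$_hook" "$@"
        _ret=$?
        if [ "$_ret" -ne 0 ]; then
            log_error "$_hook exited with $_ret status"
        fi
    fi
}

# Thin wrappers around the shared helper.
do_kdump_pre() { run_kdump_hooks /etc/kdump/pre.d "$KDUMP_PRE" "$@"; }
do_kdump_post() { run_kdump_hooks /etc/kdump/post.d "$KDUMP_POST" "$@"; }
do_kdump_emergency() { run_kdump_hooks /etc/kdump/emergency.d "$KDUMP_EMERGENCY" "$@"; }
```

Any differences between the existing functions, such as return values or arguments passed to the hooks, would need to be preserved in a real refactoring; the sketch treats all three identically.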

Collaborator

@prudo1 prudo1 left a comment

Hi @nishibem,

All in all, the implementation looks good to me. There is only one line in the documentation I find a bit confusing. Please see below.

But I would like to better understand the use case for this feature. Looking at the code in kdump.sh, there aren't that many things that can cause it to fail prior to the dump process. If I'm not mistaken it's basically only get_host_ip. Could you please give a specific example when this feature would have been useful (i.e. what has failed and what additional logging/recovery you would have done).

Furthermore, there is one thing I'm not fully sure about myself but think we should at least discuss. The way I see it, the new kdump_emergency you propose is a generalized version of failure_action, in that it allows running arbitrary scripts rather than performing pre-defined actions. So I'm asking myself whether it makes sense to extend failure_action to allow running scripts/binaries rather than introducing the new kdump_emergency. The thing is that this is a user-facing change, and I'd like to keep changes to kdump.conf to a minimum.

Thanks
Philipp

# kdump_emergency <binary | script>
# - This directive allows you to run a specified executable
# in emergency mode within the kdump capture kernel,
# prior to the dump process.
Collaborator

I find the "prior to the dump process" a bit confusing. When kdump.sh ends up in the error handler, there is no more dump process (unless you specify failure_action dump_to_rootfs). Should that be "prior to executing the failure_action"?
The same applies to the man page.

Contributor Author

As you pointed out, once the kdump capture kernel enters the error handler, the dump process will never be started.
Therefore, I am considering revising the description to state that the executable runs in the emergency mode that the kdump capture kernel transitions into before the dump capture process is executed.
How about wording it as follows?

--- a/kdump.conf.5
+++ b/kdump.conf.5
@@ -133,8 +133,9 @@ than the system's RAM.
 .B kdump_emergency <binary | script>
 .RS
 This directive allows you to run a specified executable
-in emergency mode within the kdump capture kernel,
-prior to the dump process.
+in the emergency mode entered due to an error that occurs
+before the dump collection process is executed in the
+kdump capture kernel.
 .PP
 All files under /etc/kdump/emergency.d are collectively sorted
 and executed in lexical order, before binary or script

@nishibem
Contributor Author

nishibem commented Feb 5, 2026

Hi @prudo1
Thank you for your feedback.

> But I would like to better understand the use case for this feature. Looking at the code in kdump.sh, there aren't that many things that can cause it to fail prior to the dump process. If I'm not mistaken it's basically only get_host_ip. Could you please give a specific example when this feature would have been useful (i.e. what has failed and what additional logging/recovery you would have done).

We are considering scenarios where an issue occurs during the dracut initqueue stage before the kdump collection process starts, resulting in the dump target becoming unavailable. In such cases, kdump-emergency.service is triggered before the dump process is executed, and the kdump error handler runs instead. Since the existing kdump_post script is not executed in this situation, we believe that the new kdump_emergency directive would be effective and useful.

> Furthermore, there is one thing I'm not fully sure about myself but think we should at least discuss. The way I see it, the new kdump_emergency you propose is a generalized version of failure_action, in that it allows running arbitrary scripts rather than performing pre-defined actions. So I'm asking myself whether it makes sense to extend failure_action to allow running scripts/binaries rather than introducing the new kdump_emergency. The thing is that this is a user-facing change, and I'd like to keep changes to kdump.conf to a minimum.

The failure_action option is currently executed both in the error handler that runs before the kdump capture process starts and in the failure handling that occurs after the process has begun. What we would like to do is to add specific handling exclusively for the error handler that runs before the kdump capture process starts.
Furthermore, based on the current design, we consider failure_action to be a configuration item intended to specify the type of shutdown behavior—such as reboot, halt, or poweroff—rather than a mechanism for configuring commands or scripts to be executed at a specific point in the processing flow.
For this reason, adding new functionality that allows users to specify and execute scripts or binaries within failure_action could introduce incompatibilities, and we would prefer to avoid such risks as much as possible.
