Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ All notable user-visible changes should be recorded here.

- Added sanitized golden `report.md` / `report.json` regression fixtures to lock report contracts.
- Added conservative parser coverage for `Accepted publickey` plus selected `pam_faillock` / `pam_sss` variants.
- Added compact host-level summaries to Markdown and JSON reports for multi-host inputs.

### Changed

4 changes: 3 additions & 1 deletion README.md
@@ -106,7 +106,9 @@ The CLI writes:
- `report.md`
- `report.json`

into the output directory you provide. If you omit the output directory, the files are written into the current working directory.
into the output directory you provide. If you omit the output directory, the files are written into the current working directory.

When an input spans multiple hostnames, both reports add compact host-level summaries without changing detector thresholds or introducing cross-host correlation logic.

## Sample Output

3 changes: 2 additions & 1 deletion src/main.cpp
@@ -153,7 +153,8 @@ int main(int argc, char* argv[]) {
parsed.quality,
parsed.events,
findings,
parsed.warnings};
parsed.warnings,
app_config.detector.auth_signal_mappings};

loglens::write_reports(report_data, options.output_directory);

236 changes: 235 additions & 1 deletion src/report.cpp
@@ -4,14 +4,25 @@
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <unordered_map>
#include <unordered_set>
#include <vector>

namespace loglens {
namespace {

struct HostSummary {
std::string hostname;
std::size_t parsed_event_count = 0;
std::size_t finding_count = 0;
std::size_t warning_count = 0;
std::vector<std::pair<EventType, std::size_t>> event_counts;
};

std::string escape_json(std::string_view value) {
std::string escaped;
escaped.reserve(value.size());
@@ -125,13 +136,198 @@ std::string format_parse_success_percent(double rate) {
return output.str();
}

std::string_view trim_left(std::string_view value) {
while (!value.empty() && (value.front() == ' ' || value.front() == '\t')) {
value.remove_prefix(1);
}
return value;
}

std::string_view consume_token(std::string_view& input) {
input = trim_left(input);
if (input.empty()) {
return {};
}

const auto separator = input.find(' ');
if (separator == std::string_view::npos) {
const auto token = input;
input = {};
return token;
}

const auto token = input.substr(0, separator);
input.remove_prefix(separator + 1);
return token;
}

std::optional<std::string> extract_hostname_from_input_line(std::string_view line, InputMode input_mode) {
auto remaining = line;
switch (input_mode) {
case InputMode::SyslogLegacy:
if (consume_token(remaining).empty()
|| consume_token(remaining).empty()
|| consume_token(remaining).empty()) {
return std::nullopt;
}
break;
case InputMode::JournalctlShortFull:
if (consume_token(remaining).empty()
|| consume_token(remaining).empty()
|| consume_token(remaining).empty()
|| consume_token(remaining).empty()) {
return std::nullopt;
}
break;
default:
return std::nullopt;
}

const auto hostname = consume_token(remaining);
if (hostname.empty()) {
Comment on lines +186 to +187
P1: Validate header before deriving warning hostname

`extract_hostname_from_input_line` treats the next token as a hostname without checking that the preceding header tokens are structurally valid, so a malformed line can shift fields and create fake hosts in `host_summaries`. For example, in `journalctl_short_full` mode a line missing the timezone still yields a warning, but this logic records the program token (such as `sshd[2]:`) as the host. That can introduce extra hosts and skew per-host warning and finding counts.

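One way to address the comment above is to check the shape of each header token before trusting the token that follows. The sketch below is only an illustration, not the project's code: `next_token`, `matches_shape`, `is_weekday`, `is_timezone`, and `hostname_from_short_full` are hypothetical names, and the accepted shapes (three-letter English weekday, `YYYY-MM-DD`, `HH:MM:SS`, an uppercase abbreviation or a `+hhmm`/`-hhmm` offset) are assumptions about the journalctl short-full header. It also rejects a candidate hostname ending in `:`, on the assumption that such a token is a program tag.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

namespace sketch {

// Splits off the next space-delimited token, skipping leading spaces and tabs.
std::string_view next_token(std::string_view& input) {
    while (!input.empty() && (input.front() == ' ' || input.front() == '\t')) {
        input.remove_prefix(1);
    }
    const auto end = input.find(' ');
    const auto token = input.substr(0, end);
    input.remove_prefix(end == std::string_view::npos ? input.size() : end + 1);
    return token;
}

// True when `value` fits `pattern`: 'd' in the pattern means "any digit",
// every other pattern character must match literally.
bool matches_shape(std::string_view value, std::string_view pattern) {
    if (value.size() != pattern.size()) {
        return false;
    }
    for (std::size_t i = 0; i < value.size(); ++i) {
        const auto c = static_cast<unsigned char>(value[i]);
        const bool ok = pattern[i] == 'd' ? std::isdigit(c) != 0 : value[i] == pattern[i];
        if (!ok) {
            return false;
        }
    }
    return true;
}

bool is_weekday(std::string_view token) {
    constexpr std::array<std::string_view, 7> days{
        "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
    for (const auto day : days) {
        if (token == day) {
            return true;
        }
    }
    return false;
}

// Assumed timezone shapes: numeric offset (+0100) or 2-5 uppercase letters (UTC, CEST).
bool is_timezone(std::string_view token) {
    if (matches_shape(token, "+dddd") || matches_shape(token, "-dddd")) {
        return true;
    }
    if (token.size() < 2 || token.size() > 5) {
        return false;
    }
    for (const char c : token) {
        if (std::isupper(static_cast<unsigned char>(c)) == 0) {
            return false;
        }
    }
    return true;
}

// Returns the hostname only when all four header tokens look valid.
std::optional<std::string> hostname_from_short_full(std::string_view line) {
    auto rest = line;
    if (!is_weekday(next_token(rest))
        || !matches_shape(next_token(rest), "dddd-dd-dd")
        || !matches_shape(next_token(rest), "dd:dd:dd")
        || !is_timezone(next_token(rest))) {
        return std::nullopt;
    }
    const auto host = next_token(rest);
    if (host.empty() || host.back() == ':') {  // a trailing ':' suggests a program tag
        return std::nullopt;
    }
    return std::string(host);
}

// Usage: a well-formed header yields the host; a missing timezone yields nothing.
inline void check_examples() {
    assert(hostname_from_short_full(
               "Wed 2026-03-11 09:00:00 UTC alpha-host sshd[2301]: Failed password")
           == std::optional<std::string>("alpha-host"));
    assert(!hostname_from_short_full(
        "Wed 2026-03-11 09:00:00 alpha-host sshd[2]: Failed password"));
}

}  // namespace sketch
```

With a check like this, a line whose timezone is missing is skipped for host attribution instead of having its program token counted as a host. The shape checks are heuristic, so a stricter parser may still be preferable.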

return std::nullopt;
}

return std::string(hostname);
}

std::unordered_map<std::size_t, std::string> load_hostnames_by_line(const ReportData& data) {
std::unordered_map<std::size_t, std::string> hostnames_by_line;
if (data.warnings.empty()) {
return hostnames_by_line;
}

std::ifstream input(data.input_path);
if (!input) {
return hostnames_by_line;
}

std::string line;
std::size_t line_number = 0;
while (std::getline(input, line)) {
++line_number;
const auto hostname = extract_hostname_from_input_line(line, data.parse_metadata.input_mode);
if (hostname.has_value()) {
hostnames_by_line.emplace(line_number, *hostname);
}
}

return hostnames_by_line;
}

bool is_matching_finding_signal(const Finding& finding, const AuthSignal& signal) {
if (signal.timestamp < finding.first_seen || signal.timestamp > finding.last_seen) {
return false;
}

switch (finding.type) {
case FindingType::BruteForce:
return signal.counts_as_terminal_auth_failure
&& signal.source_ip == finding.subject;
case FindingType::MultiUserProbing:
if (!signal.counts_as_attempt_evidence || signal.source_ip != finding.subject) {
return false;
}
if (finding.usernames.empty()) {
return true;
}
return std::find(
finding.usernames.begin(),
finding.usernames.end(),
signal.username)
!= finding.usernames.end();
case FindingType::SudoBurst:
return signal.counts_as_sudo_burst_evidence
&& signal.username == finding.subject;
default:
return false;
}
}

std::vector<HostSummary> build_host_summaries(const ReportData& data) {
std::unordered_map<std::string, HostSummary> summaries_by_host;

for (const auto& event : data.events) {
if (event.hostname.empty()) {
continue;
}

auto& summary = summaries_by_host[event.hostname];
summary.hostname = event.hostname;
++summary.parsed_event_count;
}

const auto hostnames_by_line = load_hostnames_by_line(data);
for (const auto& warning : data.warnings) {
const auto hostname_it = hostnames_by_line.find(warning.line_number);
if (hostname_it == hostnames_by_line.end() || hostname_it->second.empty()) {
continue;
}

auto& summary = summaries_by_host[hostname_it->second];
summary.hostname = hostname_it->second;
++summary.warning_count;
}

if (summaries_by_host.size() <= 1) {
return {};
}

std::unordered_map<std::size_t, std::string> hostname_by_event_line;
hostname_by_event_line.reserve(data.events.size());
std::unordered_map<std::string, std::vector<Event>> events_by_host;
events_by_host.reserve(summaries_by_host.size());

for (const auto& event : data.events) {
hostname_by_event_line.emplace(event.line_number, event.hostname);
events_by_host[event.hostname].push_back(event);
}

const auto signals = build_auth_signals(data.events, data.auth_signal_mappings);
for (const auto& finding : data.findings) {
std::unordered_set<std::string> matching_hosts;
for (const auto& signal : signals) {
if (!is_matching_finding_signal(finding, signal)) {
continue;
}

const auto hostname_it = hostname_by_event_line.find(signal.line_number);
if (hostname_it == hostname_by_event_line.end() || hostname_it->second.empty()) {
continue;
}
matching_hosts.insert(hostname_it->second);
}

for (const auto& hostname : matching_hosts) {
++summaries_by_host[hostname].finding_count;
}
}

std::vector<HostSummary> summaries;
summaries.reserve(summaries_by_host.size());
for (auto& [hostname, summary] : summaries_by_host) {
const auto events_it = events_by_host.find(hostname);
if (events_it != events_by_host.end()) {
summary.event_counts = build_event_counts(events_it->second);
}
summaries.push_back(std::move(summary));
}

std::sort(summaries.begin(), summaries.end(), [](const HostSummary& left, const HostSummary& right) {
return left.hostname < right.hostname;
});

return summaries;
}

} // namespace

std::string render_markdown_report(const ReportData& data) {
std::ostringstream output;
const auto findings = sorted_findings(data.findings);
const auto warnings = sorted_warnings(data.warnings);
const auto event_counts = build_event_counts(data.events);
const auto host_summaries = build_host_summaries(data);

output << "# LogLens Report\n\n";
output << "## Summary\n\n";
@@ -149,6 +345,19 @@ std::string render_markdown_report(const ReportData& data) {
output << "- Findings: " << findings.size() << '\n';
output << "- Parser warnings: " << warnings.size() << "\n\n";

if (!host_summaries.empty()) {
output << "## Host Summary\n\n";
output << "| Host | Parsed Events | Findings | Warnings |\n";
output << "| --- | ---: | ---: | ---: |\n";
for (const auto& summary : host_summaries) {
output << "| " << summary.hostname
<< " | " << summary.parsed_event_count
<< " | " << summary.finding_count
<< " | " << summary.warning_count << " |\n";
}
output << '\n';
}

output << "## Findings\n\n";
if (findings.empty()) {
output << "No configured detections matched the analyzed events.\n\n";
@@ -205,6 +414,7 @@ std::string render_json_report(const ReportData& data) {
const auto findings = sorted_findings(data.findings);
const auto warnings = sorted_warnings(data.warnings);
const auto event_counts = build_event_counts(data.events);
const auto host_summaries = build_host_summaries(data);

output << "{\n";
output << " \"tool\": \"LogLens\",\n";
@@ -236,7 +446,31 @@ std::string render_json_report(const ReportData& data) {
output << " {\"event_type\": \"" << to_string(type) << "\", \"count\": " << count << "}";
output << (index + 1 == event_counts.size() ? "\n" : ",\n");
}
output << " ],\n";
output << " ]";
if (!host_summaries.empty()) {
output << ",\n";
output << " \"host_summaries\": [\n";
for (std::size_t host_index = 0; host_index < host_summaries.size(); ++host_index) {
const auto& summary = host_summaries[host_index];
output << " {\n";
output << " \"hostname\": \"" << escape_json(summary.hostname) << "\",\n";
output << " \"parsed_event_count\": " << summary.parsed_event_count << ",\n";
output << " \"finding_count\": " << summary.finding_count << ",\n";
output << " \"warning_count\": " << summary.warning_count << ",\n";
output << " \"event_counts\": [\n";
for (std::size_t event_index = 0; event_index < summary.event_counts.size(); ++event_index) {
const auto& [type, count] = summary.event_counts[event_index];
output << " {\"event_type\": \"" << to_string(type) << "\", \"count\": " << count << "}";
output << (event_index + 1 == summary.event_counts.size() ? "\n" : ",\n");
}
output << " ]\n";
output << " }";
output << (host_index + 1 == host_summaries.size() ? "\n" : ",\n");
}
output << " ],\n";
} else {
output << ",\n";
}
output << " \"findings\": [\n";
for (std::size_t index = 0; index < findings.size(); ++index) {
const auto& finding = findings[index];
2 changes: 2 additions & 0 deletions src/report.hpp
@@ -1,5 +1,6 @@
#pragma once

#include "signal.hpp"
#include "detector.hpp"
#include "parser.hpp"

@@ -16,6 +17,7 @@ struct ReportData {
std::vector<Event> events;
std::vector<Finding> findings;
std::vector<ParseWarning> warnings;
AuthSignalConfig auth_signal_mappings;
};

std::string render_markdown_report(const ReportData& data);
@@ -0,0 +1,11 @@
Wed 2026-03-11 09:00:00 UTC alpha-host sshd[2301]: Failed password for invalid user admin from 203.0.113.10 port 52022 ssh2
Wed 2026-03-11 09:01:05 UTC alpha-host sshd[2302]: Failed password for root from 203.0.113.10 port 52030 ssh2
Wed 2026-03-11 09:02:10 UTC alpha-host sshd[2303]: Failed password for test from 203.0.113.10 port 52040 ssh2
Wed 2026-03-11 09:03:44 UTC alpha-host sshd[2304]: Failed password for guest from 203.0.113.10 port 52050 ssh2
Wed 2026-03-11 09:04:05 UTC alpha-host sshd[2305]: Failed password for invalid user deploy from 203.0.113.10 port 52060 ssh2
Wed 2026-03-11 09:10:10 UTC beta-host sshd[2401]: Accepted publickey for alice from 203.0.113.20 port 52111 ssh2
Wed 2026-03-11 09:11:00 UTC beta-host sudo: alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/systemctl restart ssh
Wed 2026-03-11 09:12:10 UTC beta-host sudo: alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/journalctl -xe
Wed 2026-03-11 09:14:15 UTC beta-host sudo: alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/vi /etc/ssh/sshd_config
Wed 2026-03-11 09:15:12 UTC alpha-host sshd[2306]: Connection closed by authenticating user alice 203.0.113.50 port 52290 [preauth]
Wed 2026-03-11 09:16:18 UTC beta-host sshd[2402]: Timeout, client not responding from 203.0.113.51 port 52291
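For a quick sanity check of a multi-host fixture like the one above, the per-host grouping can be approximated outside the tool. The sketch below is not LogLens code: `count_lines_per_host` is a hypothetical helper that counts raw input lines (not parsed events, findings, or warnings) and assumes the hostname is the fifth whitespace-delimited token. It mirrors only the rule in `build_host_summaries` that summaries are returned when more than one hostname is present, sorted by hostname.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

using HostLineCount = std::pair<std::string, std::size_t>;

// Counts input lines per hostname, assuming the header layout
// "weekday date time timezone hostname ...". Lines with fewer than
// five tokens are ignored. Returns an empty vector unless at least
// two distinct hostnames appear; results are sorted by hostname.
std::vector<HostLineCount> count_lines_per_host(const std::vector<std::string>& lines) {
    std::map<std::string, std::size_t> counts;  // std::map keeps keys ordered
    for (const auto& line : lines) {
        std::istringstream tokens(line);
        std::string token;
        int index = 0;
        while (index < 5 && tokens >> token) {
            ++index;
        }
        if (index == 5) {
            ++counts[token];  // token now holds the fifth field
        }
    }
    if (counts.size() <= 1) {
        return {};  // single-host input: no host-level summary
    }
    return std::vector<HostLineCount>(counts.begin(), counts.end());
}

// Usage: two hosts produce sorted per-host counts; one host produces nothing.
inline void check_examples() {
    const std::vector<std::string> mixed{
        "Wed 2026-03-11 09:00:00 UTC alpha-host sshd[2301]: Failed password",
        "Wed 2026-03-11 09:10:10 UTC beta-host sshd[2401]: Accepted publickey",
        "Wed 2026-03-11 09:15:12 UTC alpha-host sshd[2306]: Connection closed",
    };
    const auto summary = count_lines_per_host(mixed);
    assert(summary.size() == 2);
    assert(summary[0] == HostLineCount("alpha-host", 2));
    assert(summary[1] == HostLineCount("beta-host", 1));

    const std::vector<std::string> single{mixed[0], mixed[2]};
    assert(count_lines_per_host(single).empty());
}

}  // namespace sketch
```

Applied to the eleven fixture lines, this grouping gives six lines for `alpha-host` and five for `beta-host`. How many of those the tool reports as parsed events versus warnings depends on the parser and is not something this sketch can tell.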