Merged
22 commits
9319bd3
Verify that stack telemetry includes stack_id in the OTEL resource
alco Dec 17, 2025
cba11fc
Include stack_id in the OTEL resource for stack telemetry
alco Dec 17, 2025
fe08f65
Merge stack_id into existing OTEL resource instead of using put_new
erik-the-implementer Mar 25, 2026
cda4795
Include binary_mem and average ref count in top process metrics
alco Dec 17, 2025
26a1859
Add application metrics to the otel telemetry integration test
alco Dec 17, 2025
31e1048
Reduce mem usage in the integration test
alco Apr 13, 2026
db1659a
Fix lingering timeout problem in otel_collector's e2e tests
alco Apr 13, 2026
2e05371
Cache lux on CI
alco Dec 15, 2025
01429a3
Include average number of binaries in top memory-heavy process metrics
alco Dec 17, 2025
931beea
Fix electric-telemetry unit tests
alco Dec 17, 2025
0faba58
Add changesets
alco Dec 17, 2025
1436092
Fix incomplete map in memory_from_info fallback case
erik-the-implementer Mar 20, 2026
00de98c
mix format
alco Apr 13, 2026
82f9df5
Include max_bin_count and max_ref_count for each process group
alco Apr 13, 2026
ed3de5c
Split process.memory into process.memory and process.bin_memory metrics
alco Apr 13, 2026
a9dcbae
Add {:at_least_bytes, n} limit variant for top process queries
alco Apr 13, 2026
933d22c
Use sort_key for cutoff comparison and running total in take_until_ta…
alco Apr 13, 2026
a9b3fb8
Add tests for top_bin_memory_by_type and {:at_least_bytes, n} limit
alco Apr 13, 2026
22a670f
Move Processes imports to module level and improve sort order test
alco Apr 13, 2026
9db1cc8
Make stack telemetry init delay configurable via ELECTRIC_STACK_TELEM…
alco Apr 14, 2026
e8c948d
Remove MIX_OS_DEPS_COMPILE_PARTITION_COUNT from the Dockerfile
alco Apr 14, 2026
5b9694e
Revert "Remove MIX_OS_DEPS_COMPILE_PARTITION_COUNT from the Dockerfile"
alco Apr 14, 2026
5 changes: 5 additions & 0 deletions .changeset/eleven-nails-argue.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
'@core/electric-telemetry': patch
---

Add binary memory, the average number of off-heap binaries, and their ref counts to the top-processes-by-memory metrics.
5 changes: 5 additions & 0 deletions .changeset/forty-pillows-laugh.md
@@ -0,0 +1,5 @@
---
'@core/sync-service': patch
---

Include stack_id in otel opts for stack metrics. It had been omitted by mistake before.
10 changes: 10 additions & 0 deletions .github/workflows/integration_tests.yml
@@ -28,6 +28,7 @@ jobs:
      - uses: actions/checkout@v4

      - uses: erlef/setup-beam@v1
        id: setup_beam
        with:
          version-type: strict
          version-file: '.tool-versions'
@@ -62,6 +63,15 @@ jobs:
        run: mix compile
        working-directory: packages/sync-service

      - name: Cache lux
        uses: actions/cache@v4
        with:
          path: integration-tests/lux
          key: '${{ runner.os }}-lux-${{ steps.setup_beam.outputs.otp-version }}'
          restore-keys: |
            ${{ runner.os }}-lux-${{ steps.setup_beam.outputs.otp-version }}
            ${{ runner.os }}-lux

      - name: Setup lux
        run: make
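
The cache step above pairs an exact `key` with two `restore-keys` fallbacks. As a rough model of that lookup (a simplified Python sketch, not the `actions/cache` implementation: it ignores branch scoping, and the OTP versions and timestamps are made up for illustration), an exact key wins; otherwise each restore key is tried in order as a prefix, and the most recently created matching cache is taken:

```python
def pick_cache(key, restore_keys, existing):
    """Return the cache key that would be restored, or None on a miss.

    `existing` maps cache key -> creation time (any sortable value).
    This is a simplified model, not the real actions/cache code.
    """
    # 1. An exact hit on the primary key always wins.
    if key in existing:
        return key
    # 2. Otherwise try each restore key, in order, as a prefix.
    for prefix in restore_keys:
        candidates = [k for k in existing if k.startswith(prefix)]
        if candidates:
            # Among several prefix matches, take the newest cache.
            return max(candidates, key=lambda k: existing[k])
    return None


# Hypothetical caches left behind by earlier runs on older OTP versions.
existing = {
    "Linux-lux-27.1": 100,
    "Linux-lux-27.2": 200,
}

# The workflow's key after a (hypothetical) bump to OTP 28.0.
key = "Linux-lux-28.0"
restore_keys = ["Linux-lux-28.0", "Linux-lux"]

restored = pick_cache(key, restore_keys, existing)
print(restored)  # Linux-lux-27.2
```

Under this model, bumping OTP in `.tool-versions` still restores the most recent lux cache from an older OTP through the `${{ runner.os }}-lux` prefix. Whether `make` then rebuilds lux depends on the Makefile, which this diff does not show.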

5 changes: 5 additions & 0 deletions integration-tests/tests/_macros.luxinc
@@ -231,11 +231,16 @@
-p 4318:4318 \
-v $(realpath ../support_files/otel-collector-config.yaml):/conf/otel-collector-config.yaml \
otel/opentelemetry-collector-contrib --config=/conf/otel-collector-config.yaml

# Allow time for docker to pull the image if not cached
[timeout 120]

??Starting HTTP server
?"endpoint": "(0\.0\.0\.0|\[::\]):4318"
??Everything is ready. Begin running and processing data.

# Reset the timeout for subsequent pattern matching
[timeout 10]
[endmacro]

[macro teardown_container container_name]
232 changes: 232 additions & 0 deletions integration-tests/tests/otel-export.lux
@@ -0,0 +1,232 @@
[doc Verify that application and stack metrics are correctly exported via Otel]

[include _macros.luxinc]

[global pg_container_name=otel-export__pg]

###

[invoke setup_otel_collector]

[invoke setup_pg]

[invoke setup_electric_with_env "ELECTRIC_OTLP_ENDPOINT=http://localhost:4318 ELECTRIC_SYSTEM_METRICS_POLL_INTERVAL=1s ELECTRIC_STACK_TELEMETRY_INIT_DELAY=1s ELECTRIC_OTEL_EXPORT_PERIOD=2s DO_NOT_START_CONN_MAN_PING=1 ELECTRIC_LOG_LEVEL=info OTEL_RESOURCE_ATTRIBUTES=custom.attr=electric.val"]

# Spawn a process containing off-heap binary references and ensure it's in the top 5 by memory footprint.
# Otel Collector sorts metrics in its debug output, so the process label for our process needs
# to be first lexicographically for easier output matching further down.
[shell electric]
"""!
_pid = spawn_link(fn ->
  Process.set_label(:A_memory_hog)

  on_heap_strings =
    Enum.map(1..4000, fn _ -> String.duplicate("on heap", 4000) end)

  off_heap_strings =
    Enum.map(1..10, fn _ -> String.duplicate("1234567890", 7000) end)

  receive do
    pid -> send(pid, {on_heap_strings, off_heap_strings})
  end
end)
"""

??#PID

[shell otel_collector]
# Verify that the Collector receives expected application metrics from Electric
"""?
info ResourceMetrics #0
Resource SchemaURL:
Resource attributes:
-> custom.attr: Str\(electric.val\)
-> instance.id: Str\([-a-f0-9]+\)
-> name: Str\(metrics\)
-> service.name: Str\(electric\)
-> service.version: Str\([0-9.]+\)
ScopeMetrics #0
"""

# Verify the presence of process.bin_memory.* metrics
"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.bin_memory\.avg_bin_count
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [.0-9]+
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.bin_memory\.avg_ref_count
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: 1\.000000
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.bin_memory\.max_bin_count
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [0-9]+
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.bin_memory\.max_ref_count
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [0-9]+
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.bin_memory\.total
-> Description:
-> Unit: By
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [0-9]+
"""

# Verify the presence of process.memory.total metric
"""?
Metric #[0-9]+
Descriptor:
-> Name: process\.memory\.total
-> Description:
-> Unit: By
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> process_type: Str\(A_memory_hog\)
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [0-9]+
"""

# Verify that the Collector receives stack metrics from Electric with expected resource attributes
"""?
info ResourceMetrics #0
Resource SchemaURL:
Resource attributes:
-> custom.attr: Str\(electric.val\)
-> instance.id: Str\([-a-f0-9]+\)
-> name: Str\(metrics\)
-> service.name: Str\(electric\)
-> service.version: Str\([0-9.]+\)
-> stack_id: Str\(single_stack\)
ScopeMetrics #0
"""

# Verify that LSN metrics are exported
"""?
Metric #[0-9]+
Descriptor:
-> Name: electric\.postgres\.replication\.pg_wal_offset
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [-+.0-9]+
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: electric\.postgres\.replication\.slot_confirmed_flush_lsn_lag
-> Description:
-> Unit: By
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [-+.0-9]+
"""

"""?
Metric #[0-9]+
Descriptor:
-> Name: electric\.postgres\.replication\.slot_retained_wal_size
-> Description:
-> Unit: By
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Timestamp: [-0-9]+ [.:0-9]+ [-+0-9]+ UTC
Value: [-+.0-9]+
"""

[invoke start_psql]

[shell psql]
!create table items (val text);
??CREATE

[shell client]
[invoke curl_shape "http://localhost:3000/v1/shape?table=items&offset=-1"]

??HTTP/1.1 200 OK
?electric-handle: ([\d-]+)
[global handle=$1]
?electric-offset: ([\w\d_]+)
[global offset=$1]

[shell psql]
!insert into items values ('3');
??INSERT

[shell client]
[invoke curl_shape "http://localhost:3000/v1/shape?table=items&handle=$handle&offset=$offset&live"]

??HTTP/1.1 200 OK
??"value":{"val":"3"}

[shell otel_collector]
"""?
Metric #[0-9]+
Descriptor:
-> Name: electric\.storage\.transaction_stored\.bytes
"""

###

[cleanup]
[invoke teardown]
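
Two details of the output matching in the test above can be checked in isolation. The following Python sketch is only an approximation: Python's `re` stands in for the regex engine Lux uses, the other process labels are hypothetical, and the sample text is made up to fit the pattern (flattened like the patterns in this diff) rather than captured from a real Collector. It shows that a label starting with an uppercase `A` sorts first under a plain byte-wise string comparison, and that the timestamp pattern accepts a date, a time and a UTC offset:

```python
import re

# Assumption: the Collector's ordering is a plain byte-wise string sort,
# in which uppercase ASCII letters come before lowercase ones.
labels = ["shape_consumer", "A_memory_hog", "replication_client"]  # hypothetical labels
first_label = sorted(labels)[0]

# The timestamp pattern used throughout otel-export.lux.
TIMESTAMP = r"[-0-9]+ [.:0-9]+ [-+0-9]+ UTC"

# One data point block, mirroring the process.memory.total pattern.
pattern = re.compile(
    r"-> process_type: Str\(A_memory_hog\)\n"
    rf"StartTimestamp: {TIMESTAMP}\n"
    rf"Timestamp: {TIMESTAMP}\n"
    r"Value: [0-9]+"
)

# Made-up sample shaped like the lines the patterns expect.
sample = (
    "-> process_type: Str(A_memory_hog)\n"
    "StartTimestamp: 2026-04-13 10:15:00.123456789 +0000 UTC\n"
    "Timestamp: 2026-04-13 10:15:02.123456789 +0000 UTC\n"
    "Value: 112700000\n"
)

matched = pattern.search(sample) is not None
print(first_label, matched)
```

Because the value patterns only constrain the shape of the number (`[0-9]+`, `[.0-9]+`), the test checks that each metric is present for the labelled process, not that it reports a particular amount of memory. The one exception is `avg_ref_count`, which is pinned to `1.000000`.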
98 changes: 0 additions & 98 deletions integration-tests/tests/stack-telemetry.lux

This file was deleted.
