Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,16 +28,17 @@ jobs:
uses: actions/checkout@v3
with:
fetch-depth: 0
submodules: recursive
- run: |
git fetch -f --tags
echo exit code $?
git tag --list

- name: Set up JDK 1.8
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '8'
distribution: 'zulu'
java-version: '17'
distribution: 'temurin'

- name: Cache Maven packages
uses: actions/cache@v4
Expand All @@ -57,7 +58,7 @@ jobs:
- name: Test with musl
run: |
docker run --rm -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java -e FDZ_OPENML_JAVA_LIBC="musl" alpine:3.18.4 \
/bin/sh -c 'apk add clang openjdk8 maven bash git && \
/bin/sh -c 'apk add clang openjdk17 maven bash git && \
git config --global --add safe.directory /feedzai-openml-java && \
git config --global --add safe.directory /feedzai-openml-java/openml-lightgbm/lightgbm-builder/make-lightgbm && \
cd /feedzai-openml-java && \
Expand All @@ -67,7 +68,7 @@ jobs:
# The skipped tests is because h2o-xgboost isn't ready to use on arm64
- name: Test on arm64
run: |
docker run --rm --platform=arm64 -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java maven:3.8-openjdk-8-slim \
docker run --rm --platform=arm64 -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java maven:3.9-eclipse-temurin-17 \
/bin/bash -c 'apt update && apt install -y --no-install-recommends git && \
git config --global --add safe.directory /feedzai-openml-java && \
git config --global --add safe.directory /feedzai-openml-java/openml-lightgbm/lightgbm-builder/make-lightgbm && \
Expand All @@ -78,4 +79,4 @@ jobs:
uses: codecov/codecov-action@v5
with:
use_oidc: true
fail_ci_if_error: true
fail_ci_if_error: false
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -39,3 +39,4 @@ target

openml-h2o/dependency-reduced-pom.xml
openml-lightgbm/openml-lightgbm-builder/lightgbmlib_build/
/*.zip
2 changes: 1 addition & 1 deletion .mvn/extensions.xml
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,6 @@
<extension>
<groupId>fr.brouillard.oss</groupId>
<artifactId>jgitver-maven-plugin</artifactId>
<version>1.5.1</version>
<version>1.9.0</version>
</extension>
</extensions>
69 changes: 69 additions & 0 deletions java-migration/KIRO_CHANGES.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
# Java 25 Migration — Changes Made

## Files Modified

### `pom.xml` (root)
- `project.source`: `1.8` → `11`
- Added: `<maven.compiler.release>11</maven.compiler.release>`
- `jmockit.version`: `1.35` → `1.49`
- `maven-compiler-plugin`: `3.7.0` → `3.13.0`
- Replaced `<source>/<target>` with `<release>${maven.compiler.release}</release>`
- Added `<proc>full</proc>` (required for JDK 23+)
- `jacoco-maven-plugin`: `0.8.4` → `0.8.12`
- `maven-surefire-plugin`: `3.0.0-M5` → `3.5.2`
- Added `<argLine>` with `-javaagent` for JMockit (resolved via `${settings.localRepository}`) and `--add-opens` for module access
- `auto-service`: `1.0-rc2` → `1.1.1`

### `.mvn/extensions.xml`
- `jgitver-maven-plugin`: `1.5.1` → `1.9.0`

### `openml-java-utils/src/main/java/.../JavaFileUtils.java`
- Line 111: `urlClassLoader.loadClass(...).newInstance()` → `urlClassLoader.loadClass(...).getDeclaredConstructor().newInstance()`

### `openml-java-utils/src/test/java/.../ModelParameterUtilsTest.java`
- Removed `import mockit.integration.junit4.JMockit` and `import org.junit.runner.RunWith`
- Removed `@RunWith(JMockit.class)` annotation (JMockit 1.49 removed the runner class; mocking works via agent only)

### `openml-h2o/src/main/java/.../ParametersBuilderUtil.java`
- `getParamsInstance()`: `paramsClass.newInstance()` → `paramsClass.getDeclaredConstructor().newInstance()`
- Exception handling: `InstantiationException | IllegalAccessException` → broad `Exception` (covers `NoSuchMethodException`, `InvocationTargetException`)

### `.github/workflows/build.yml`
- Main build JDK: `8` (Zulu) → `17` (Temurin) — H2O 3.46 hard-rejects JDK 18+ at runtime
- musl Docker test: `openjdk8` → `openjdk17`
- arm64 Docker test: `maven:3.8-openjdk-8-slim` → `maven:3.9-eclipse-temurin-17`

## Version Bump Summary Table

| Component | From | To | Reason |
|---|---|---|---|
| maven-compiler-plugin | 3.7.0 | 3.13.0 | `--release` flag, JDK 23+ annotation processing |
| JaCoCo | 0.8.4 | 0.8.12 | JDK 25 bytecode instrumentation |
| Surefire | 3.0.0-M5 | 3.5.2 | JDK 25 fork support, argLine handling |
| Auto-Service | 1.0-rc2 | 1.1.1 | JDK 23+ annotation processing |
| JMockit | 1.35 | 1.49 | JDK 17+ agent attachment |
| jgitver | 1.5.1 | 1.9.0 | JDK 21+ runtime compatibility |
| Compiler target | 1.8 | 11 (release) | Minimum for JDK 25 toolchain |

## Runtime Requirements (Deploying on JDK 25)

When running on JDK 25, add these JVM flags:

```
--add-opens java.base/java.lang=ALL-UNNAMED
--add-opens java.base/java.lang.reflect=ALL-UNNAMED
--add-opens java.base/java.lang.invoke=ALL-UNNAMED
--add-opens java.base/java.util=ALL-UNNAMED
--add-opens java.base/java.io=ALL-UNNAMED
--add-opens java.base/java.net=ALL-UNNAMED
--add-opens java.base/java.nio=ALL-UNNAMED
--add-opens java.base/java.math=ALL-UNNAMED
--add-opens java.base/sun.nio.ch=ALL-UNNAMED
--add-opens java.base/java.util.concurrent=ALL-UNNAMED
--add-opens java.base/java.util.concurrent.locks=ALL-UNNAMED
```

These are needed by:
- **H2O 3.46.0.11**: Deep reflection into JDK internals for serialization and memory management
- **DataRobot** (via commons-lang3 `FieldUtils`): `setAccessible(true)` to read `classLabels` field
- **JMockit 1.49**: Bytecode instrumentation for mocking
55 changes: 55 additions & 0 deletions java-migration/KIRO_CONSTRAINTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# Java 25 Migration — Constraints & Risks

## Hard Constraints

### 1. H2O Version Locked to 3.46.0.11
H2O version is determined by Pulse model compatibility. Cannot be bumped independently. H2O uses deep reflection (Unsafe, setAccessible) for its internal serialization — requires `--add-opens` at runtime.

**CRITICAL: H2O 3.46.0.11 hard-rejects JDK 18+.** The `water.H2O.main()` method has a version check: "Only Java versions 8-17 are supported". This means:
- Tests that start the embedded H2O server (training, validation) **cannot run on JDK 25**
- Model-loading tests (using h2o-genmodel) are unaffected — they don't start a server
- CI must use **JDK 17** for test execution

**Impact on Pulse JDK 25 migration — NOT a full blocker:**
- **Model scoring (production hot path):** Uses `h2o-genmodel` to load MOJOs/POJOs and score. Pure Java class loading + math. **Works on JDK 25** — no H2O server is started.
- **Model training (embedded):** Starts the H2O server via `water.H2O.main()`. **Broken on JDK 25.** This path is used for "train in Pulse" feature and integration tests.
- **Production reality:** Most deployments import pre-trained models (trained offline in Data Science Service or separate H2O cluster). The scoring path is what matters.
- **Mitigation:** If in-process training is needed, offload to a JDK 17 subprocess, or wait for an H2O version supporting JDK 21+.

This is a **conditional runtime failure** (like `plv Mailer.java` on the SSL path) — not a full stop for Pulse on JDK 25.

### 2. DataRobot Prediction Library Uses Deep Reflection
`DataRobotModelCreator.getTargetModelValues()` uses `FieldUtils.readField(predictor, "classLabels", true)` which calls `setAccessible(true)`. This fails without `--add-opens java.base/java.lang.reflect=ALL-UNNAMED`. The DataRobot prediction JAR (1.1.1) is old and unlikely to get fixes.

### 3. LightGBM Is Pure JNI — No Java-Level Issues
LightGBM provider loads a native `.so` library via JNI. The Java wrapper code is straightforward and has no reflection or internal API usage. No issues expected on JDK 25.

### 4. Pulse Must Provide `--add-opens` at Runtime
The `--add-opens` flags MUST be configured in Pulse's `startup.sh` and `pulse.properties` (`pulse.global.appengine.jvmopts`). Without them, H2O model training and DataRobot model loading will fail with `InaccessibleObjectException`.

### 5. Alpine/musl: No JDK 25 Package Yet
Alpine Linux packages only up to OpenJDK 21 in stable repos. The musl CI test uses JDK 21 as a compromise. Full JDK 25 musl testing requires building from Temurin source or using a custom Docker image.

## Risks

### Medium: Jackson 2.6.7 Is Very Old
This project pins Jackson 2.6.7 (from 2016). While it compiles fine, it's deeply EOL. Jackson 2.6.x has known CVEs. However, it's provided-scope (Pulse supplies its own Jackson at runtime), so this is a classpath concern, not a vulnerability in this project's artifact.

### Medium: JMockit 1.49 Is the Last Version
JMockit is discontinued. It works on JDK 25 today via `--add-opens`, but future JDK versions may break it further. Long-term plan should migrate tests to Mockito.

### Low: Auto-Service Raw Type Warning
Auto-Service 1.1.1 emits a warning when `@AutoService(MachineLearningProvider.class)` is used on a generic implementation. This is cosmetic and can be suppressed with `@SuppressWarnings("rawtypes")` if desired.

### Low: Bytecode Target is 11, Not 25
Using `release=11` means the bytecode won't exercise JDK 25-specific runtime optimizations. This is intentional — the project produces library JARs loaded as plugins, and lower bytecode targets give broader compatibility.

## Compatibility Matrix

| JDK Version | Compile | Run (with --add-opens) | Run (without --add-opens) |
|---|---|---|---|
| 8 | ❌ (needs release 11 toolchain) | ❌ (bytecode target 11) | ❌ |
| 11 | ✅ | ✅ (mostly — H2O may warn) | ⚠️ (warnings only on JDK 11-15) |
| 17 | ✅ | ✅ | ❌ (InaccessibleObjectException) |
| 21 | ✅ | ✅ | ❌ |
| 25 | ✅ | ✅ | ❌ |
68 changes: 68 additions & 0 deletions java-migration/KIRO_DECISIONS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
# Java 25 Migration — Decisions

## Context
feedzai-openml-java provides ML model providers (H2O, LightGBM, DataRobot) for the Pulse platform. Pulse is migrating from Java 8 to Java 25, requiring all plugin JARs to compile and run on the new JDK.

## Key Decisions

### 1. Compiler Target: `release=11` (not 25)

**Decision**: Use `--release 11` instead of `--release 25`.

**Rationale**: The bytecode produced with `release=11` runs on any JDK ≥11, including JDK 25. This preserves backward compatibility for environments that aren't yet on JDK 25 while being fully forward-compatible. There is no API in this codebase that requires JDK 25 source features. The `--release` flag (introduced in JDK 9) prevents accidental use of APIs added after the target version.

**Trade-off**: Cannot use Java 12+ language features (records, sealed classes, pattern matching). Acceptable since this project is a stable library with minimal source changes.

### 2. JMockit 1.35 → 1.49

**Decision**: Bump JMockit to 1.49 (the last version).

**Rationale**: JMockit 1.35 does not attach as a Java agent correctly on JDK 17+. Version 1.49 works on JDK 25 with proper `--add-opens` and `-javaagent` configuration. Full removal to Mockito is deferred (too much test rewrite for this scope).

### 3. Auto-Service 1.0-rc2 → 1.1.1

**Decision**: Bump to 1.1.1 (latest stable).

**Rationale**: JDK 23+ disables implicit annotation processing by default (must use `-proc:full` or `annotationProcessorPaths`). Old auto-service 1.0-rc2 has compatibility issues with newer annotation processing APIs. 1.1.1 handles the new annotation processing model correctly.

### 4. `Class.newInstance()` → `getDeclaredConstructor().newInstance()`

**Decision**: Replace the deprecated pattern in two locations.

**Rationale**: `Class.newInstance()` has been deprecated since Java 9 and is marked for removal. The replacement properly propagates checked exceptions and is the canonical approach.

### 5. `--add-opens` for Test Execution

**Decision**: Add broad `--add-opens` to Surefire for test execution.

**Rationale**: JMockit, H2O internals, and DataRobot (via commons-lang3 FieldUtils) all use deep reflection. Without `--add-opens`, tests fail with `InaccessibleObjectException` on JDK 17+. The opens are: `java.lang`, `java.lang.reflect`, `java.lang.invoke`, `java.util`, `java.io`, `java.net`, `java.nio`, `java.math`, `sun.nio.ch`, `java.util.concurrent`, `java.util.concurrent.locks`.

### 6. JaCoCo 0.8.4 → 0.8.12

**Decision**: Bump JaCoCo.

**Rationale**: JaCoCo 0.8.4 cannot instrument bytecode produced by or running on JDK 17+. Version 0.8.12 supports up to JDK 25 class file format.

### 7. jgitver 1.5.1 → 1.9.0

**Decision**: Bump the Maven extension for version inference.

**Rationale**: jgitver 1.5.1 uses JGit APIs that have compatibility issues with newer JDKs. Version 1.9.0 is actively maintained and tested with JDK 21+.

### 8. maven-compiler-plugin 3.7.0 → 3.13.0

**Decision**: Bump the compiler plugin.

**Rationale**: The old 3.7.0 does not support the `--release` flag properly and has no awareness of JDK 23+ annotation processing changes. 3.13.0 adds `<proc>full</proc>` support and proper handling of newer JDK toolchains.

### 9. Surefire 3.0.0-M5 → 3.5.2

**Decision**: Bump Surefire.

**Rationale**: Surefire 3.0.0-M5 has known issues with `argLine` handling on newer JDKs. 3.5.2 is the latest stable with proper JDK 25 process forking support.

### 10. H2O 3.46.0.11 — NOT Bumped

**Decision**: Keep H2O at 3.46.0.11.

**Rationale**: H2O's version is dictated by Pulse's model compatibility requirements. The existing version compiles fine as a dependency — any reflection issues are runtime-only and handled by `--add-opens`.
Original file line number Diff line number Diff line change
Expand Up @@ -472,8 +472,8 @@ private static Field getFieldFromSchemaIn(final Class<? extends water.api.schema
*/
private static water.bindings.pojos.ModelParametersSchemaV3 getParamsInstance(final Class<? extends water.bindings.pojos.ModelParametersSchemaV3> paramsClass) {
try {
return paramsClass.newInstance();
} catch (final InstantiationException | IllegalAccessException e) {
return paramsClass.getDeclaredConstructor().newInstance();
} catch (final Exception e) {
throw new RuntimeException(e);
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import com.feedzai.openml.h2o.algos.mocks.BindingPrivateFieldsFieldParameters;
import com.feedzai.openml.h2o.algos.mocks.BindingRegularParameters;
import com.feedzai.openml.h2o.algos.mocks.FieldsNotArrayParameters;
import com.feedzai.openml.h2o.algos.mocks.NoDefaultConstructorParameters;
import com.feedzai.openml.h2o.algos.mocks.NoFieldsFieldParameters;
import com.feedzai.openml.h2o.algos.mocks.PrivateFieldsFieldParameters;
import com.feedzai.openml.h2o.algos.mocks.RegularParameters;
Expand Down Expand Up @@ -271,4 +272,28 @@ public static class MockCaseMismatchBinding extends water.bindings.pojos.ModelPa
public static class MockFallbackBinding extends water.bindings.pojos.ModelParametersSchemaV3 {
public DummyEnum dummyField = null;
}

/**
* Tests that a {@link RuntimeException} is thrown when the binding parameters class
* has no default (no-arg) constructor.
*/
@Test(expected = RuntimeException.class)
public void noDefaultConstructorThrows() {
ParametersBuilderUtil.getParametersFor(
RegularParameters.class,
NoDefaultConstructorParameters.class
);
}

/**
* Tests that a {@link RuntimeException} is thrown when the binding parameters class
* constructor throws an exception, exercising the catch block in {@code getParamsInstance}.
*/
@Test(expected = RuntimeException.class)
public void failingConstructorThrows() {
ParametersBuilderUtil.getParametersFor(
RegularParameters.class,
com.feedzai.openml.h2o.algos.mocks.FailingConstructorParameters.class
);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
package com.feedzai.openml.h2o.algos.mocks;

import com.google.gson.annotations.SerializedName;
import water.bindings.pojos.ModelParametersSchemaV3;

/**
* Mocked class that extends {@link ModelParametersSchemaV3} with matching fields
* but a constructor that always throws, used to test the exception path in
* {@code ParametersBuilderUtil.getParamsInstance()}.
*
* @since 2.0.2
*/
public class FailingConstructorParameters extends ModelParametersSchemaV3 {

@SerializedName("field_1")
public int field1;

@SerializedName("field_2")
public int field2;

public FailingConstructorParameters() {
throw new RuntimeException("Simulated constructor failure");
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
package com.feedzai.openml.h2o.algos.mocks;

import water.bindings.pojos.ModelParametersSchemaV3;

/**
* Mocked class that extends {@link ModelParametersSchemaV3} but has no no-arg constructor,
* used to test the error path in {@code ParametersBuilderUtil.getParamsInstance()}.
*
* @since 2.0.2
*/
public class NoDefaultConstructorParameters extends ModelParametersSchemaV3 {

public NoDefaultConstructorParameters(final String required) {
// intentionally no default constructor
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,8 @@ public static Object createNewInstanceFromClassLoader(final String modelPath,
final URLClassLoader urlClassLoader) throws ModelLoadingException {
try {
final String simpleNameJar = FilenameUtils.getBaseName(modelPath);
return urlClassLoader.loadClass(String.format(simpleNameFormat, simpleNameJar)).newInstance();
return urlClassLoader.loadClass(String.format(simpleNameFormat, simpleNameJar))
.getDeclaredConstructor().newInstance();
} catch (final Exception e) {
logger.error("Could not load the model [{}].", modelPath, e);
throw new ModelLoadingException(
Expand Down
Loading
Loading