Issue Draft: HBase Backend - Edges Cannot Be Persisted
Environment
- HugeGraph: 1.7.0 (server), pyhugegraph: 1.7.0 (client)
- HBase: 2.1.2
- ID Strategy: PRIMARY_KEY
- Serializer: binary
Description
When HugeGraph uses HBase as the backend storage, edges cannot be persisted regardless of the creation method used (REST API or Gremlin). Vertex creation works, but all edge creation approaches either fail explicitly or silently discard the data.
Reproduction Steps
Schema Setup
// Property keys
schema.propertyKey('name').asText().ifNotExist().create()
schema.propertyKey('date').asText().ifNotExist().create()
// Vertex label with PRIMARY_KEY strategy
schema.vertexLabel('person').properties('name').usePrimaryKeyId().primaryKeys('name').ifNotExist().create()
// Edge label
schema.edgeLabel('roommate').sourceLabel('person').targetLabel('person').properties('date').nullableKeys('date').ifNotExist().create()
Create Vertices
// REST API - works fine
POST /graphs/{graph}/graph/vertices
{"label": "person", "properties": {"name": "Alice"}}
{"label": "person", "properties": {"name": "Bob"}}
Response IDs: "1:Alice", "1:Bob" (composite format from PRIMARY_KEY strategy).
Attempt Edge Creation
Method 1: REST API with returned composite IDs — fails with error
POST /graphs/{graph}/graph/edges
{"label": "roommate", "outV": "1:Alice", "inV": "1:Bob", "properties": {"date": "2024"}}
→ 400 Bad Request: IllegalArgumentException: Invalid vertex id '1:Alice'
Method 2: REST API with actual stored IDs (from g.V().id()) — also fails
g.V().id() returns integers (e.g., 9988829793) instead of the composite IDs returned by addVertex. Using these:
POST /graphs/{graph}/graph/edges
{"label": "roommate", "outV": 9988829793, "inV": 9837833573, "properties": {"date": "2024"}}
→ 400 Bad Request: IllegalArgumentException: Invalid vertex id '9988829793'
Note: The vertex CAN be retrieved via REST API GET /vertices/9988829793 (returns 200), but edge creation rejects the same ID.
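For anyone scripting the reproduction, the two REST attempts above can be sketched in Python with only the standard library. The host `http://localhost:8080` and the graph name `hugegraph` are assumptions, not values from this report; the helper name `build_edge_request` is hypothetical. The paths and payloads are copied from Methods 1 and 2. The sketch only builds the requests and does not send them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: adjust to your server
GRAPH = "hugegraph"                 # assumption: adjust to your graph name


def build_edge_request(out_v, in_v, label="roommate", properties=None):
    """Build (but do not send) the POST request used in Methods 1 and 2."""
    body = {
        "label": label,
        "outV": out_v,
        "inV": in_v,
        "properties": properties or {},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/graphs/{GRAPH}/graph/edges",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Method 1: composite IDs returned by vertex creation.
method_1 = build_edge_request("1:Alice", "1:Bob", properties={"date": "2024"})
# Method 2: integer IDs reported by g.V().id().
method_2 = build_edge_request(9988829793, 9837833573, properties={"date": "2024"})

for req in (method_1, method_2):
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` against an affected server should reproduce the 400 responses shown above.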
Method 3: Gremlin addE — appears to succeed but data is not persisted
g.V().hasLabel('person').has('name','Alice').addE('roommate').to(__.V().hasLabel('person').has('name','Bob')).property('date','2024')
Response: Returns the edge object with ID and properties (appears successful).
Verification:
g.E().count() → returns stale/incorrect count (e.g., 3)
g.E().toList() → returns [] (empty, correct)
GET /graph/edges → {"edges": []} (empty)
The edge appears to be created in memory only and is never persisted to HBase.
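The verification step can be reduced to a small consistency check over the three observed values. This is a sketch for triage only; the function name `edge_persistence_report` and the shape of its inputs are hypothetical, and the values are entered by hand from the output above, not fetched from a server.

```python
def edge_persistence_report(gremlin_count, gremlin_edges, rest_edges):
    """Compare the three views of the edge set collected during verification."""
    listed = len(gremlin_edges)
    via_rest = len(rest_edges)
    return {
        "gremlin_count": gremlin_count,
        "gremlin_listed": listed,
        "rest_listed": via_rest,
        # All three numbers should agree on a healthy backend.
        "consistent": gremlin_count == listed == via_rest,
        # True when count() reports edges that no listing can return.
        "stale_count": gremlin_count > max(listed, via_rest),
    }


# Values observed in this report: count() said 3, both listings were empty.
report = edge_persistence_report(3, [], [])
print(report)
```

A report with `consistent` set to false and `stale_count` set to true matches the symptom described here: the count disagrees with both listings.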
Additional Observations
1. Vertex ID Mismatch
With PRIMARY_KEY strategy and HBase backend:
- addVertex REST API returns composite IDs: "1:Alice"
- g.V().id() returns auto-generated integers: 9988829793
- g.V().valueMap(true) returns the integer ID
- The vertex is accessible via GET /vertices/{integer_id} but NOT via GET /vertices/1:Alice
- g.V({integer_id}) lookup also returns empty (cannot find vertex by its own ID)
2. Vertex ID Collision Across Labels
Different vertex labels with different primary key values may share the same internal ID:
person:James (pk="James") → stored as 9837833573
webpage:James的个人网站 (pk="James的个人网站") → also stored as 9837833573
This suggests the HBase backend generates IDs based on a hash that ignores the vertex label, causing cross-label collisions.
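To check how widespread the collisions are, vertex records can be grouped by ID and any ID owned by more than one (label, primary key) pair flagged. The sketch below assumes the records have already been exported into plain dictionaries with `id`, `label`, and `name` keys, for example from `g.V().valueMap(true)` output. That record shape and the name `find_id_collisions` are hypothetical. The sample data reuses the IDs quoted in this report for illustration.

```python
from collections import defaultdict


def find_id_collisions(vertices):
    """Return {id: [(label, primary_key), ...]} for IDs shared by several vertices."""
    owners_by_id = defaultdict(set)
    for vertex in vertices:
        owners_by_id[vertex["id"]].add((vertex["label"], vertex["name"]))
    return {
        vertex_id: sorted(owners)
        for vertex_id, owners in owners_by_id.items()
        if len(owners) > 1
    }


# Illustrative sample built from the IDs quoted in this report.
sample = [
    {"id": 9988829793, "label": "person", "name": "Alice"},
    {"id": 9837833573, "label": "person", "name": "James"},
    {"id": 9837833573, "label": "webpage", "name": "James的个人网站"},
]

collisions = find_id_collisions(sample)
print(collisions)
```

An empty result means every ID maps to exactly one vertex. Any entry in the result is a collision worth attaching to the issue.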
3. g.E().count() Returns Incorrect Results
After Gremlin addE calls that don't persist:
- g.E().count() may return a non-zero stale count
- g.E().toList() returns empty (correct)
- REST API /edges returns empty (correct)
Expected Behavior
- Edge creation via REST API should find vertices by the IDs returned by addVertex
- Edge creation via Gremlin addE should persist edges to the backend
- g.V().id() should return IDs that can be used for subsequent lookups and edge creation
- Vertex IDs should be unique across different vertex labels
Actual Behavior
- No method of edge creation works with HBase backend
- Vertex IDs are inconsistent between addVertex response and g.V().id()
- Vertex IDs collide across labels
- g.E().count() returns stale/incorrect counts