Feat/sql redis query #467

Open
rbs333 wants to merge 8 commits into main from feat/sql-redis-query

rbs333 (Collaborator) commented Jan 29, 2026

Spec for SQLQuery class

Make SQL-like commands translatable into Redis queries via redisvl, cutting down on syntax overhead for engineers.

Ex:

from redisvl.query import SQLQuery
from redisvl.index import SearchIndex

redis_index = SearchIndex.from_existing(
    "my_book_index",
    redis_url="my_redis_connection"
)

sql_query = SQLQuery("""
  SELECT title, author, price 
  FROM my_book_index
  WHERE category = "scify"
"""
)

response = redis_index.query(sql_query)

This code would then produce the equivalent Redis query to be executed against the database:

FT.SEARCH my_book_index
  "@category:{scify}"
  LOAD 3 @title @author @price
  DIALECT 2

Expose a method on the object, .redis_query_string(), so that you can easily inspect the Redis query constructed from a SQLQuery instance.

Ex:

from redisvl.query import SQLQuery

sql_str = """
    SELECT user, credit_score, job, age
    FROM user_simple
    WHERE age > 17
    """

sql_query = SQLQuery(sql_str)
sql_query.redis_query_string(redis_url="redis://localhost:6379")

# result:
# 'FT.SEARCH user_simple "@age:[(17 +inf]" RETURN 4 user credit_score job age'

Packaging and dependencies

To use the SQLQuery class, users must install the optional sql-redis dependency. This can be done with the command pip install redisvl[sql-redis].

Tested/supported operators

| Datatype | Operator | Tested SQL Example | Redis Result |
|---|---|---|---|
| tag | = | SELECT title, category FROM {index} WHERE category = 'electronics' | FT.SEARCH test_index "@category:{electronics}" RETURN 2 title category |
| | != | SELECT title, category FROM {index} WHERE category != 'electronics' | FT.SEARCH test_index "-@category:{electronics}" RETURN 2 title category |
| | IN | SELECT title, category FROM {index} WHERE category IN ('books', 'accessories') | FT.SEARCH test_index "@category:{books\|accessories}" RETURN 2 title category |
| numeric | > | SELECT title, price FROM {index} WHERE price > 100 | FT.SEARCH test_index "@price:[(100 +inf]" RETURN 2 title price |
| | >= | SELECT title, price FROM {index} WHERE price >= 25 AND price <= 50 | FT.SEARCH test_index "@price:[25 +inf] @price:[-inf 50]" RETURN 2 title price |
| | = | SELECT title, price FROM {index} WHERE price = 45 | FT.SEARCH test_index "@price:[45 45]" RETURN 2 title price |
| | != | SELECT title, price FROM {index} WHERE price != 45 | FT.SEARCH test_index "-@price:[45 45]" RETURN 2 title price |
| | < | SELECT title, price FROM {index} WHERE price < 50 | FT.SEARCH test_index "@price:[-inf (50]" RETURN 2 title price |
| | <= | SELECT title, price FROM {index} WHERE price >= 25 AND price <= 50 | FT.SEARCH test_index "@price:[25 +inf] @price:[-inf 50]" RETURN 2 title price |
| | BETWEEN | SELECT title, price FROM {index} WHERE price BETWEEN 40 AND 60 | FT.SEARCH test_index "@price:[40 60]" RETURN 2 title price |
| text | = | SELECT title, name FROM {index} WHERE title = 'laptop' | FT.SEARCH test_index "@title:laptop" RETURN 2 title name |
| | != | SELECT title, name FROM {index} WHERE title != 'laptop' | FT.SEARCH test_index "-@title:laptop" RETURN 2 title name |
| | prefix | SELECT title, name FROM {index} WHERE title = 'lap*' | FT.SEARCH test_index "@title:lap*" RETURN 2 title name |
| | suffix | SELECT title, name FROM {index} WHERE name = '*book' | FT.SEARCH test_index "@name:*book" RETURN 2 title name |
| | fuzzy | SELECT title, name FROM {index} WHERE title = '%laptap%' | FT.SEARCH test_index "@title:%laptap%" RETURN 2 title name |
| | phrase | SELECT title, name FROM {index} WHERE title = 'gaming laptop' | FT.SEARCH test_index "@title:gaming laptop" RETURN 2 title name |
| | phrase with stopword | SELECT title, name FROM {index} WHERE title = 'laptop and keyboard' | FT.SEARCH test_index "@title:laptop keyboard" RETURN 2 title name |
| | IN | SELECT title, name FROM {index} WHERE title IN ('Python', 'Redis') | NOT SUPPORTED |
| vector | vector_distance | SELECT title, vector_distance(embedding, :vec) AS score FROM {index} LIMIT 3 | FT.SEARCH test_index "*=>[KNN 3 @embedding $vector AS score]" PARAMS 2 vector $vector DIALECT 2 RETURN 2 title score LIMIT 0 3 |
| | cosine_distance | SELECT title, cosine_distance(embedding, :vec) AS vector_distance FROM {index} LIMIT 3 | FT.SEARCH test_index "*=>[KNN 3 @embedding $vector AS vector_distance]" PARAMS 2 vector $vector DIALECT 2 RETURN 2 title vector_distance LIMIT 0 3 |
| date | >, >=, =, !=, <, <=, IN, BETWEEN | TODO | |
| geo | = | TODO | |
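
The numeric rows above follow a regular pattern: an exclusive bound is written with a leading parenthesis, and open ends use -inf / +inf. Below is a minimal Python sketch of that mapping, inferred only from the table rows. The function name numeric_filter is hypothetical and is not part of redisvl or sql-redis; the real translation is done inside sql-redis and may differ.

```python
def numeric_filter(field, op, value, high=None):
    """Render one numeric comparison as a RediSearch range clause.

    Hypothetical helper for illustration; mirrors the numeric rows of the table.
    """
    if op == "BETWEEN":
        # Inclusive on both ends, as in "@price:[40 60]"
        return f"@{field}:[{value} {high}]"
    if op == "!=":
        # Negation of the exact-match range
        return f"-@{field}:[{value} {value}]"
    ranges = {
        ">": f"[({value} +inf]",   # "(" marks an exclusive bound
        ">=": f"[{value} +inf]",
        "=": f"[{value} {value}]",
        "<": f"[-inf ({value}]",
        "<=": f"[-inf {value}]",
    }
    if op not in ranges:
        raise ValueError(f"unsupported numeric operator: {op}")
    return f"@{field}:{ranges[op]}"


print(numeric_filter("price", ">", 100))
print(numeric_filter("price", "BETWEEN", 40, 60))
```

A combined condition such as price >= 25 AND price <= 50 is then just two clauses joined by a space, which matches the table row for >=.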

Tested/supported aggregation reducer functions

| Reducer | Tested SQL Example | Redis Result |
|---|---|---|
| COUNT | SELECT COUNT(*) as total FROM {index} | FT.AGGREGATE test_index "*" GROUPBY 0 REDUCE COUNT 0 AS total |
| COUNT | SELECT category, COUNT(*) as count FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 1 category GROUPBY 1 @category REDUCE COUNT 0 AS count |
| COUNT_DISTINCT | SELECT category, COUNT_DISTINCT(title) as unique_titles FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE COUNT_DISTINCT 1 @title AS unique_titles |
| SUM | SELECT SUM(price) as total FROM {index} | FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE SUM 1 @price AS total |
| SUM | SELECT category, SUM(price) as total_price FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE SUM 1 @price AS total_price |
| MIN | SELECT MIN(price) as min_price FROM {index} | FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE MIN 1 @price AS min_price |
| MIN | SELECT category, MIN(price) as min_price FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE MIN 1 @price AS min_price |
| MAX | SELECT MAX(price) as max_price FROM {index} | FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE MAX 1 @price AS max_price |
| MAX | SELECT category, MAX(price) as max_price FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE MAX 1 @price AS max_price |
| AVG | SELECT category, AVG(price) as avg_price FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE AVG 1 @price AS avg_price |
| STDDEV | SELECT STDDEV(price) as price_stddev FROM {index} | FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE STDDEV 1 @price AS price_stddev |
| QUANTILE | SELECT category, QUANTILE(price, 0.5) as median_price FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 price category GROUPBY 1 @category REDUCE QUANTILE 2 @price 0.5 AS median_price |
| TOLIST | SELECT category, ARRAY_AGG(title) as titles FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE TOLIST 1 @title AS titles |
| FIRST_VALUE | SELECT category, FIRST_VALUE(title) as first_title FROM {index} GROUP BY category | FT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE FIRST_VALUE 1 @title AS first_title |

TODO: subsequent work that will follow in separate PRs

  • DATE datatype support
  • ISMISSING / EXISTS null check support
  • Advanced text search (BM25, etc.)
  • Advanced hybrid search
  • GEO datatype support

Copilot AI review requested due to automatic review settings January 29, 2026 15:54

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces SQL query support to redisvl by adding a new SQLQuery class that translates SQL-like queries into Redis FT.SEARCH and FT.AGGREGATE commands. The implementation wraps the external sql-redis package to provide a more familiar SQL interface for querying Redis.

Changes:

  • Added SQLQuery class that translates SQL SELECT statements to Redis queries
  • Integrated SQLQuery execution into SearchIndex.query() method
  • Added comprehensive integration tests covering various SQL operators and aggregations
  • Created documentation notebook demonstrating SQL query usage

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| pyproject.toml | Adds sql-redis as optional dependency for [sql] extra |
| uv.lock | Updates lock file with sql-redis and sqlglot dependencies |
| redisvl/query/sql.py | New SQLQuery class implementation with redis_query_string() method |
| redisvl/query/__init__.py | Exports SQLQuery from redisvl.query module |
| redisvl/index/index.py | Adds _sql_query() method and integrates SQLQuery into query() dispatcher |
| tests/integration/test_sql_redis.py | Comprehensive integration tests for SQL query functionality |
| docs/user_guide/12_sql_to_redis_queries.ipynb | New user guide notebook demonstrating SQL query usage |
| docs/user_guide/02_hybrid_queries.ipynb | Unintentional execution artifacts from re-running cells |

Comments suppressed due to low confidence (1)

redisvl/index/index.py:1150

  • The docstring for the query method still references the old type hint "Union[BaseQuery, AggregateQuery, HybridQuery]" in the Args section (line 1150), but the actual type hint now includes SQLQuery. Update the docstring to reflect that SQLQuery is also accepted.
        """Execute a query on the index.

        This method takes a BaseQuery, AggregationQuery, or HybridQuery object directly, and
        handles post-processing of the search.

        Args:
            query (Union[BaseQuery, AggregateQuery, HybridQuery]): The query to run.

Comment on lines 1144 to 1168
"""Execute a query on the index.

Copilot AI Jan 29, 2026

The docstring for the query method states "This method takes a BaseQuery, AggregationQuery, or HybridQuery object directly" but should also mention SQLQuery since it's now a supported query type. Additionally, the Args section references the old type hint without SQLQuery.

Comment on lines +105 to +113
# Substitute non-bytes params in SQL before translation
sql = self.sql
for key, value in self.params.items():
    placeholder = f":{key}"
    if isinstance(value, (int, float)):
        sql = sql.replace(placeholder, str(value))
    elif isinstance(value, str):
        sql = sql.replace(placeholder, f"'{value}'")
    # bytes (vectors) are handled separately

Copilot AI Jan 29, 2026

The parameter substitution logic only handles int, float, and str types, but doesn't handle bytes (vectors) which are mentioned in the comment. However, looking at the tests, bytes parameters like vectors are passed in the params dict and seem to work. This suggests sql-redis handles bytes parameters internally. Consider adding a code comment explaining that bytes parameters are passed through to sql-redis's translate method rather than being substituted in the SQL string, to clarify the intent.
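
One way to make the pass-through of bytes explicit is sketched below. This is not the PR's implementation: substitute_params is a hypothetical standalone helper, and it assumes the downstream SQL parser accepts standard quote doubling ('') inside string literals. It matches whole placeholder names with a regular expression, so a parameter named vec cannot clobber part of a placeholder named :vector, which plain str.replace would do. A known limitation is that the pattern would also match a :name sequence that happens to appear inside a string literal.

```python
import re
from typing import Any, Dict


def substitute_params(sql: str, params: Dict[str, Any]) -> str:
    """Inline int, float, and str params; leave every other placeholder as-is.

    Hypothetical helper for illustration only.
    """

    def render(match: re.Match) -> str:
        key = match.group(1)
        if key not in params:
            return match.group(0)
        value = params[key]
        if isinstance(value, (int, float)):
            return str(value)
        if isinstance(value, str):
            # Assumption: the parser accepts SQL-style doubled single quotes
            escaped = value.replace("'", "''")
            return f"'{escaped}'"
        # bytes (vectors) are intentionally not substituted here; they are
        # expected to be passed through to the translator/executor unchanged
        return match.group(0)

    # Match ":name" as a whole identifier rather than as a raw substring
    return re.sub(r":([A-Za-z_][A-Za-z0-9_]*)", render, sql)


print(substitute_params(
    "SELECT name FROM idx WHERE age > :min_age AND name = :name",
    {"min_age": 17, "name": "O'Brien"},
))
```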

Comment on lines +932 to +958
# Decode key if bytes
str_key = key.decode("utf-8") if isinstance(key, bytes) else key
# Decode value if bytes
str_value = value.decode("utf-8") if isinstance(value, bytes) else value

Copilot AI Jan 29, 2026

The decoding logic assumes all bytes values should be decoded to strings. However, this may incorrectly decode values that should remain as bytes (e.g., binary data, vector embeddings). Consider checking if the value is actually a UTF-8 string before decoding, or preserve the original type for non-string data. You could wrap the decode in a try-except to handle non-UTF-8 bytes gracefully.

Suggested change
- # Decode key if bytes
- str_key = key.decode("utf-8") if isinstance(key, bytes) else key
- # Decode value if bytes
- str_value = value.decode("utf-8") if isinstance(value, bytes) else value
+ # Decode key if bytes, but preserve non-UTF-8 bytes
+ if isinstance(key, bytes):
+     try:
+         str_key = key.decode("utf-8")
+     except UnicodeDecodeError:
+         str_key = key
+ else:
+     str_key = key
+ # Decode value if bytes, but preserve non-UTF-8 bytes
+ if isinstance(value, bytes):
+     try:
+         str_value = value.decode("utf-8")
+     except UnicodeDecodeError:
+         str_value = value
+ else:
+     str_value = value

" MAX(age) as max_age,\n",
" AVG(age) as avg_age,\n",
" STDEV(age) as std_age,\n",
" FIRST_VALUE(age) as fist_value_age,\n",

Copilot AI Jan 29, 2026

Typo in the alias name: "fist_value_age" should be "first_value_age" (missing 'r' in 'first'). This appears in both the SQL query and the output, suggesting it's consistently misspelled throughout.

"metadata": {},
"outputs": [],
"source": [
"# Butm the index is still in place\n",

Copilot AI Jan 29, 2026

Typo in comment: "Butm" should be "But" (extra 'm' at the end).

Suggested change
- "# Butm the index is still in place\n",
+ "# But the index is still in place\n",

Comment on lines 727 to 747
"# await index.clear()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"# Butm the index is still in place\n",
"# await index.exists()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"# Remove / delete the index in its entirety\n",
"# await index.delete()"

Copilot AI Jan 29, 2026

The commented cleanup code uses await keywords (async syntax) but this notebook doesn't appear to be using async/await elsewhere. The SearchIndex methods like clear(), exists(), and delete() are synchronous methods, not async. Remove the await keywords from these commented lines.

pyproject.toml Outdated
"pillow>=11.3.0",
]
sql = [
"sql-redis @ file:///Users/robert.shelton/Documents/sql-redis/dist/sql_redis-0.1.0-py3-none-any.whl",

Copilot AI Jan 29, 2026

The sql-redis dependency is using an absolute local file path specific to a developer's machine (/Users/robert.shelton/Documents/sql-redis/...). This will break on other systems and in CI/CD environments. According to the PR description, this is blocked on a release/update of sql-redis package. Before merging, this should be changed to reference a published version on PyPI (e.g., "sql-redis>=0.1.0").

rbs333 force-pushed the feat/sql-redis-query branch from a7d4e1c to 7f3f5ff February 3, 2026 19:42
rbs333 marked this pull request as ready for review February 3, 2026 19:45
Copilot AI review requested due to automatic review settings February 3, 2026 19:45
:exclude-members: add_filter,get_args,highlight,return_field,summarize


SQLQuery

rbs333 (Collaborator, Author)

add docs

rbs333 (Collaborator, Author)

Check this example; it's the most conclusive.

[project]
name = "redisvl"
- version = "0.13.2"
+ version = "0.14.0"

rbs333 (Collaborator, Author)

I think this would qualify as a minor release.

rbs333 (Collaborator, Author)

The test file is what the large comment is based on.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.

Comment on lines +92 to +116
# Get or create Redis client
if redis_client is None:
    from redis import Redis

    redis_client = Redis.from_url(redis_url)

# Load schemas from Redis
registry = SchemaRegistry(redis_client)
registry.load_all()

# Translate SQL to Redis command
translator = Translator(registry)

# Substitute non-bytes params in SQL before translation
sql = self.sql
for key, value in self.params.items():
    placeholder = f":{key}"
    if isinstance(value, (int, float)):
        sql = sql.replace(placeholder, str(value))
    elif isinstance(value, str):
        sql = sql.replace(placeholder, f"'{value}'")
    # bytes (vectors) are handled separately

translated = translator.translate(sql)
return translated.to_command_string()

Copilot AI Feb 3, 2026

The method creates a temporary Redis client if redis_client is None, but never closes it. This can lead to resource leaks, especially when this method is called multiple times. The Redis connection should be properly closed after use, for example using a context manager or try-finally block.

Suggested change
- # Get or create Redis client
- if redis_client is None:
-     from redis import Redis
-     redis_client = Redis.from_url(redis_url)
- # Load schemas from Redis
- registry = SchemaRegistry(redis_client)
- registry.load_all()
- # Translate SQL to Redis command
- translator = Translator(registry)
- # Substitute non-bytes params in SQL before translation
- sql = self.sql
- for key, value in self.params.items():
-     placeholder = f":{key}"
-     if isinstance(value, (int, float)):
-         sql = sql.replace(placeholder, str(value))
-     elif isinstance(value, str):
-         sql = sql.replace(placeholder, f"'{value}'")
-     # bytes (vectors) are handled separately
- translated = translator.translate(sql)
- return translated.to_command_string()
+ created_client = False
+ # Get or create Redis client
+ if redis_client is None:
+     from redis import Redis
+     redis_client = Redis.from_url(redis_url)
+     created_client = True
+ try:
+     # Load schemas from Redis
+     registry = SchemaRegistry(redis_client)
+     registry.load_all()
+     # Translate SQL to Redis command
+     translator = Translator(registry)
+     # Substitute non-bytes params in SQL before translation
+     sql = self.sql
+     for key, value in self.params.items():
+         placeholder = f":{key}"
+         if isinstance(value, (int, float)):
+             sql = sql.replace(placeholder, str(value))
+         elif isinstance(value, str):
+             sql = sql.replace(placeholder, f"'{value}'")
+         # bytes (vectors) are handled separately
+     translated = translator.translate(sql)
+     return translated.to_command_string()
+ finally:
+     if created_client and redis_client is not None:
+         # Only close clients created within this method
+         redis_client.close()

Comment on lines +950 to +959
# Decode bytes to strings in the results (Redis may return bytes)
decoded_rows = []
for row in result.rows:
    decoded_row = {}
    for key, value in row.items():
        # Decode key if bytes
        str_key = key.decode("utf-8") if isinstance(key, bytes) else key
        # Decode value if bytes
        str_value = value.decode("utf-8") if isinstance(value, bytes) else value
        decoded_row[str_key] = str_value

Copilot AI Feb 3, 2026

The decoding logic for result values only handles bytes-to-string conversion, but doesn't preserve other types. If a value is a list (like in TOLIST/ARRAY_AGG results shown in tests), it will remain as a list potentially containing bytes elements. The code should recursively decode bytes within lists and other nested structures, or document that certain result types may contain raw bytes.
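
A recursive decoder along the lines described could look like the sketch below. The name decode_nested is hypothetical and not taken from the PR. It assumes results contain only bytes, lists, tuples, dicts, and scalars, and it keeps bytes that are not valid UTF-8 (for example raw vector payloads) unchanged rather than raising.

```python
from typing import Any


def decode_nested(value: Any) -> Any:
    """Recursively decode UTF-8 bytes inside lists, tuples, and dicts.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Not text (e.g. a packed vector): preserve the raw bytes
            return value
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with decoded elements
        return type(value)(decode_nested(item) for item in value)
    if isinstance(value, dict):
        return {decode_nested(k): decode_nested(v) for k, v in value.items()}
    return value


row = {b"category": b"books", b"titles": [b"Dune", b"Emma"], b"count": 2}
print(decode_nested(row))
```

Applied per row, this would replace the separate key and value decoding steps, including for list-valued results such as TOLIST / ARRAY_AGG.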

Comment on lines +28 to +29
Requires the optional `sql-redis` package. Install with:
``pip install redisvl[sql]``

Copilot AI Feb 3, 2026

The documentation string states to install with pip install redisvl[sql], but the pyproject.toml defines the extra as sql-redis. This inconsistency will cause user confusion. The install command should be pip install redisvl[sql-redis] to match the extras definition.

Comment on lines +87 to +89
raise ImportError(
"sql-redis is required for SQL query support. "
"Install it with: pip install redisvl[sql]"

Copilot AI Feb 3, 2026

The error message states to install with pip install redisvl[sql], but this is inconsistent with the extras definition sql-redis in pyproject.toml. The error message should say pip install redisvl[sql-redis] to match the defined extra name.
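
One way to keep the docstring, the error message, and the extras definition from drifting apart is to derive the install hint from a single constant. The sketch below is illustrative only: SQL_EXTRA, INSTALL_HINT, and require_module are hypothetical names, and the value "sql-redis" is a placeholder that should be replaced by whatever extra name pyproject.toml actually declares.

```python
import importlib

# Assumption: this must match the extra name declared in pyproject.toml
SQL_EXTRA = "sql-redis"
INSTALL_HINT = f"pip install redisvl[{SQL_EXTRA}]"


def require_module(module_name: str):
    """Import an optional dependency or raise with a consistent install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for SQL query support. "
            f"Install it with: {INSTALL_HINT}"
        ) from exc


try:
    require_module("definitely_missing_module_for_sketch")
except ImportError as err:
    print(err)
```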

Comment on lines +6 to +41
class SQLQuery:
    """A query class that translates SQL-like syntax into Redis queries.

    This class allows users to write SQL SELECT statements that are
    automatically translated into Redis FT.SEARCH or FT.AGGREGATE commands.

    .. code-block:: python

        from redisvl.query import SQLQuery
        from redisvl.index import SearchIndex

        index = SearchIndex.from_existing("products", redis_url="redis://localhost:6379")

        sql_query = SQLQuery('''
            SELECT title, price, category
            FROM products
            WHERE category = 'electronics' AND price < 100
        ''')

        results = index.query(sql_query)

    Note:
        Requires the optional `sql-redis` package. Install with:
        ``pip install redisvl[sql]``
    """

    def __init__(self, sql: str, params: Optional[Dict[str, Any]] = None):
        """Initialize a SQLQuery.

        Args:
            sql: The SQL SELECT statement to execute.
            params: Optional dictionary of parameters for parameterized queries.
                Useful for passing vector data for similarity searches.
        """
        self.sql = sql
        self.params = params or {}

Copilot AI Feb 3, 2026

The SQLQuery class does not inherit from BaseQuery, unlike other query classes in the codebase (VectorQuery, FilterQuery, etc.). This breaks consistency with the query class hierarchy. Additionally, the _sql_query method in SearchIndex.query() doesn't validate or process the query through the standard query pipeline, which could lead to inconsistent behavior. Consider either: 1) making SQLQuery inherit from BaseQuery if possible, or 2) documenting why this design decision was made and ensuring the interface is compatible with the expected query contract.
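
For the second option (keeping SQLQuery outside the BaseQuery hierarchy), the dispatch contract can be stated explicitly in query(). The sketch below uses stand-in classes only; BaseQuery, SQLQuery, and SearchIndexSketch here are simplified placeholders written for this example, not the redisvl implementations, and the return values are dummies.

```python
class BaseQuery:
    """Stand-in for the existing query hierarchy (placeholder)."""


class SQLQuery:
    """Stand-in for the new SQL query type; deliberately not a BaseQuery."""

    def __init__(self, sql, params=None):
        self.sql = sql
        self.params = params or {}


class SearchIndexSketch:
    """Illustrates type-based dispatch inside a query() method."""

    def query(self, query):
        # SQL queries take a separate path because they are translated
        # by an external package rather than built by the query classes
        if isinstance(query, SQLQuery):
            return self._sql_query(query)
        if isinstance(query, BaseQuery):
            return self._query(query)
        raise TypeError(f"Unsupported query type: {type(query).__name__}")

    def _sql_query(self, query):
        return ("sql", query.sql)

    def _query(self, query):
        return ("standard", type(query).__name__)


index = SearchIndexSketch()
print(index.query(SQLQuery("SELECT title FROM products")))
```

Raising TypeError for anything else keeps the accepted query types explicit, which is the part of the contract the docstring would also need to document.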
