How-To: Use SQL databases with db-ally#
db-ally is a Python library that allows you to use natural language to query various data sources, including SQL databases. This guide will show you how to set up a structured view to query a SQL database using SQLAlchemy. The guide will work with any database that SQLAlchemy supports, including SQLite, PostgreSQL, MySQL, Oracle, MS-SQL, Firebird, Sybase, and others.
Views#
The majority of the db-ally's codebase is independent of any particular kind of data source. The part that is specific to a data source is the view. A view is a class that defines how to interact with a data source. It contains methods that define how to retrieve data from the data source and how to filter the data in response to natural language queries.
There are several methods for creating a view that connects to a SQL database, including creating a custom view from scratch. However, in most cases the easiest will be to use the SqlAlchemyBaseView
class provided by db-ally. This tutorial is designed to work with SQLAlchemy, a popular SQL toolkit and Object-Relational Mapping (ORM) library for Python. To define your view, you will need to produce a class that inherits from SqlAlchemyBaseView
and implement the get_select
method, which returns a SQLAlchemy Select
object:
from dbally import SqlAlchemyBaseView
class CandidateView(SqlAlchemyBaseView):
"""
A view for retrieving candidates from the database.
"""
def get_select(self) -> sqlalchemy.Select:
"""
Create the initial SQLAlchemy select object, defining which columns to select.
"""
return sqlalchemy.select(Candidate)
The returned Select
object is used to build the query that will be sent to the database. In this example, the get_select
method returns a Select
object that selects all columns from the candidates
table. You can customize the Select
object to select specific columns, join multiple tables, and so on.
For example, to only select the name
and country
columns, you might return a Select
object defined as sqlalchemy.select(Candidate.name, Candidate.country)
.
Note
Candidate
is a SQLAlchemy model that represents the candidates
table in the database. You need to define this model using SQLAlchemy's ORM. For instance, you might define it like this:
from sqlalchemy import Column, Integer, String
class Candidate(Base):
__tablename__ = 'candidates'
id = Column(Integer, primary_key=True)
name = Column(String)
country = Column(String)
years_of_experience = Column(Integer)
Alternatively, you can use SQLAlchemy's automap feature to automatically generate the model from the database schema. See the Quickstart for an example of how this is done.
Filters#
In addition to the get_select
method, you can define filter methods in your view. A filter method is a method that takes some parameters and returns a SQLAlchemy ColumnElement
object. This object represents a condition that can be used to filter the data in the database. See the SQLAlchemy documentation on where
clauses for more details - any condition that can be used in a where
clause can be returned from a filter method.
For example, you might define a filter method to filter candidates by country:
from dbally import decorators
class CandidateView(SqlAlchemyBaseView):
# ... (get_select method as before)
@decorators.view_filter()
def from_country(self, country: str) -> sqlalchemy.ColumnElement:
"""
Filter candidates from a specific country.
"""
return Candidate.country == country
@decorators.view_filter()
def with_experience(self, years: int) -> sqlalchemy.ColumnElement:
"""
Filter candidates with at least `years` of experience.
"""
return Candidate.years_of_experience >= years
In this example, the from_country
filter takes a country
parameter and returns a condition that filters candidates by country. The with_experience
filter takes a years
parameter and returns a condition that filters candidates by the required years of experience. You can define as many filter methods as you need to support the queries you want to handle. The LLM will decide which filters to use and provide arguments to the filters as needed. They will be used to control which table rows to fetch.
Connecting to the database#
You need to connect to the database using SQLAlchemy before you can use your view. To work, views that inherit from SqlAlchemyBaseView
require a SQLAlchemy engine to be passed to them. See the SQLAlchemy documentation on engines for information on how to create an engine for your database. Here is an example of how you might create an engine for a SQLite database:
from sqlalchemy import create_engine
engine = create_engine('sqlite:///examples/recruiting/data/candidates.db')
Registering the view#
Once you have defined your view and created an engine, you can register the view with db-ally. You do this by creating a collection and adding the view to it:
from dbally.llms.litellm import LiteLLM
my_collection = dbally.create_collection("collection_name", llm=LiteLLM())
my_collection.add(CandidateView, lambda: CandidateView(engine))
Using the view#
To ask a natural language query using your view, you call the ask
method on the collection:
response = await my_collection.ask("Find me candidates from Italy with at least 5 years of experience")
print(response.results)
In cases where you have multiple views in a collection, db-ally will use LLM to determine the most suitable view to address the query, and then that view will be used to pull the relevant data.