Collection#

Tip

To understand the general idea better, visit the Collection concept page.

dbally.collection.Collection #

Collection(name: str, view_selector: ViewSelector, llm: LLM, nl_responder: NLResponder, event_handlers: Optional[List[EventHandler]] = None, n_retries: int = 3, fallback_collection: Optional[Collection] = None)

Collection is a container for a set of views that can be used by db-ally to answer user questions.

Tip

It is recommended to create new collections using the dbally.create_collection function instead of instantiating this class directly.

PARAMETER DESCRIPTION
name

Name of the collection. It is available to Event handlers and is used to distinguish different db-ally runs.

TYPE: str

view_selector

When more than one View is registered within a single collection, the ViewSelector picks the View that best fits the query before the IQL query is generated.

TYPE: ViewSelector

llm

LLM used by the collection to generate views and respond to natural language queries.

TYPE: LLM

nl_responder

Object that translates the raw response from db-ally into natural language.

TYPE: NLResponder

event_handlers

Event handlers used by the collection during query execution. They can be used to log events (for example, CLIEventHandler) or to validate system performance (for example, LangSmithEventHandler).

TYPE: Optional[List[EventHandler]] DEFAULT: None

n_retries

The IQL generator may produce invalid IQL. In that case, this argument specifies how many times db-ally will try to regenerate it. The previous attempt, together with its error message, is appended to the chat history to guide subsequent generations.

TYPE: int DEFAULT: 3

fallback_collection

Collection to be asked when the ask function cannot find an answer in the views registered to this collection.

TYPE: Optional[Collection] DEFAULT: None
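
The retry behaviour described for n_retries can be pictured with a small stand-alone sketch. It does not use db-ally's real IQL generator: `generate_with_retries`, `fake_generator`, and `fake_validator` are hypothetical names, and the exact number of attempts db-ally makes may differ. The point is only that each failed attempt and its error message are appended to the history handed to the next attempt.

```python
from typing import Callable, List, Tuple


def generate_with_retries(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], None],
    n_retries: int = 3,
) -> Tuple[str, List[str]]:
    """Hypothetical retry loop: regenerate until validation passes or retries run out."""
    history: List[str] = []
    last_error: Exception = ValueError("no attempts were made")
    for _ in range(n_retries + 1):  # assumption: one initial attempt plus n_retries retries
        candidate = generate(history)
        try:
            validate(candidate)
            return candidate, history
        except ValueError as error:
            last_error = error
            # Feed the failed attempt and its error back to guide the next generation.
            history.append(f"attempt: {candidate!r}; error: {error}")
    raise last_error


def fake_generator(history: List[str]) -> str:
    # Stand-in for an LLM call: produces a valid query only after seeing one error.
    return "filter_by_city('Paris')" if history else "filter_by_city("


def fake_validator(candidate: str) -> None:
    # Stand-in for IQL parsing: only checks that parentheses are balanced.
    if candidate.count("(") != candidate.count(")"):
        raise ValueError("unbalanced parentheses")


query, history = generate_with_retries(fake_generator, fake_validator)
print(query)
print(history)
```

In this sketch the first attempt fails validation, and the recorded error is what lets the stand-in generator produce a valid query on the second attempt.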

Source code in src/dbally/collection/collection.py
def __init__(
    self,
    name: str,
    view_selector: ViewSelector,
    llm: LLM,
    nl_responder: NLResponder,
    event_handlers: Optional[List[EventHandler]] = None,
    n_retries: int = 3,
    fallback_collection: Optional["Collection"] = None,
) -> None:
    """
    Args:
        name: Name of the collection is available for [Event handlers](event_handlers/index.md) and is\
        used to distinguish different db-ally runs.
        view_selector: As you register more than one [View](views/index.md) within single collection,\
        before generating the IQL query, a View that fits query the most is selected by the\
        [ViewSelector](view_selection/index.md).
        llm: LLM used by the collection to generate views and respond to natural language queries.
        nl_responder: Object that translates RAW response from db-ally into natural language.
        event_handlers: Event handlers used by the collection during query executions. Can be used\
        to log events as [CLIEventHandler](event_handlers/cli_handler.md) or to validate system performance\
        as [LangSmithEventHandler](event_handlers/langsmith_handler.md).
        n_retries: IQL generator may produce invalid IQL. If this is the case this argument specifies\
        how many times db-ally will try to regenerate it. Previous try with the error message is\
        appended to the chat history to guide next generations.
        fallback_collection: collection to be asked when the ask function could not find answer in views registered
        to this collection
    """
    self.name = name
    self.n_retries = n_retries
    self._views: Dict[str, Callable[[], BaseView]] = {}
    self._builders: Dict[str, Callable[[], BaseView]] = {}
    self._view_selector = view_selector
    self._nl_responder = nl_responder
    self._llm = llm
    self._fallback_collection: Optional[Collection] = fallback_collection
    self._event_handlers = event_handlers or dbally.event_handlers

name instance-attribute #

name = name

n_retries instance-attribute #

n_retries = n_retries

T class-attribute instance-attribute #

T = TypeVar('T', bound=BaseView)

add #

add(view: Type[T], builder: Optional[Callable[[], T]] = None, name: Optional[str] = None) -> None

Register new View that will be available to query via the collection.

PARAMETER DESCRIPTION
view

A class inheriting from BaseView. An object of this type is initialized during query execution. A class is expected rather than an instance, because otherwise views would have to be implemented as stateless, which would be cumbersome.

TYPE: Type[T]

builder

Optional factory function used to create the View instance. Use it when the view needs something that can change over time, such as the outcome of an API call or a database connection.

TYPE: Optional[Callable[[], T]] DEFAULT: None

name

Custom name of the view (defaults to the name of the class).

TYPE: Optional[str] DEFAULT: None

RAISES DESCRIPTION
ValueError

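if a view with the given name is already registered, if the view class has arguments without default values and no builder is provided, or if the builder does not return an instance of the view class.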

Example of custom builder usage

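    import requests  # third-party HTTP client

    def build_dogs_df_view():
        # The builder runs whenever a view instance is needed, so the data is fetched anew each time.
        dogs_df = requests.get("https://dog.ceo/api/breeds/list")
        return DogsDFView(dogs_df)

    collection.add(DogsDFView, build_dogs_df_view)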
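
The check that decides whether a builder is required can be illustrated without db-ally. The sketch below assumes two hypothetical view-like classes and reuses the `inspect.signature` test that appears in the source of `add`: a class whose constructor has a parameter without a default value cannot be instantiated with no arguments, so a builder has to supply it.

```python
import inspect


class StatelessView:
    """Hypothetical view whose constructor can be called without arguments."""

    def __init__(self, label: str = "default") -> None:
        self.label = label


class DataFrameView:
    """Hypothetical view that needs data passed to its constructor."""

    def __init__(self, rows: list) -> None:
        self.rows = rows


def needs_builder(view: type) -> bool:
    # True when at least one constructor parameter has no default value.
    return any(
        p.default is inspect.Parameter.empty
        for p in inspect.signature(view).parameters.values()
    )


def build_data_frame_view() -> DataFrameView:
    # The builder supplies whatever the view needs; here, a fixed list of rows.
    return DataFrameView(rows=[{"breed": "beagle"}])


print(needs_builder(StatelessView))  # every parameter has a default
print(needs_builder(DataFrameView))  # `rows` has no default
```
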
Source code in src/dbally/collection/collection.py
def add(self, view: Type[T], builder: Optional[Callable[[], T]] = None, name: Optional[str] = None) -> None:
    """
    Register new [View](views/index.md) that will be available to query via the collection.

    Args:
        view: A class inheriting from BaseView. Object of this type will be initialized during\
        query execution. We expect Class instead of object, as otherwise Views must have been implemented\
        stateless, which would be cumbersome.
        builder: Optional factory function that will be used to create the View instance. Use it when you\
        need to pass outcome of API call or database connection to the view, and it can change over time.
        name: Custom name of the view (defaults to the name of the class).

    Raises:
        ValueError: if view with the given name is already registered or views class possess some non-default\
        arguments.

    **Example** of custom `builder` usage

    ```python
        def build_dogs_df_view():
            dogs_df = requests.get("https://dog.ceo/api/breeds/list")
            return DogsDFView(dogs_df)

        collection.add(DogsDFView, build_dogs_df_view)
    ```
    """
    if name is None:
        name = view.__name__

    if name in self._views or name in self._builders:
        raise ValueError(f"View with name {name} is already registered")

    non_default_args = any(
        p.default == inspect.Parameter.empty for p in inspect.signature(view).parameters.values()
    )
    if non_default_args and builder is None:
        raise ValueError("Builder function is required for views with non-default arguments")

    builder = builder or view

    # instantiate view to check if the builder is correct
    view_instance = builder()
    if not isinstance(view_instance, view):
        raise ValueError(f"The builder function for view {name} must return an instance of {view.__name__}")

    self._views[name] = view
    self._builders[name] = builder

set_fallback #

set_fallback(fallback_collection: Collection) -> Collection

Set the fallback collection, which will be asked if the ask on the base collection does not succeed.

PARAMETER DESCRIPTION
fallback_collection

Collection to be asked in case of base collection failure.

TYPE: Collection

RETURNS DESCRIPTION
Collection

The fallback collection, returned so that set_fallback calls can be chained.
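
Because `set_fallback` returns the fallback collection rather than the base one, successive calls can build a fallback sequence. The sketch below uses a hypothetical `MiniCollection` stand-in that mimics only this return-value convention; it is not the db-ally class.

```python
from typing import Optional


class MiniCollection:
    """Hypothetical stand-in that mimics only the set_fallback return convention."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fallback: Optional["MiniCollection"] = None

    def set_fallback(self, fallback: "MiniCollection") -> "MiniCollection":
        self.fallback = fallback
        return fallback  # returning the fallback lets the next call attach to it


primary = MiniCollection("primary")
secondary = MiniCollection("secondary")
last_resort = MiniCollection("last_resort")

# Reads left to right: primary falls back to secondary, which falls back to last_resort.
primary.set_fallback(secondary).set_fallback(last_resort)
```

Had the method returned the base collection instead, the second call in the chain would overwrite the first fallback rather than extend the sequence.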

Source code in src/dbally/collection/collection.py
def set_fallback(self, fallback_collection: "Collection") -> "Collection":
    """
    Set fallback collection which will be asked if the ask to base collection does not succeed.

    Args:
        fallback_collection: Collection to be asked in case of base collection failure.

    Returns:
        The fallback collection to create chains call
    """
    self._fallback_collection = fallback_collection
    if fallback_collection._event_handlers != self._event_handlers:  # pylint: disable=W0212
        logging.warning(
            "Event handlers of the fallback collection are different from the base collection. "
            "Continuity of the audit trail is not guaranteed.",
        )

    return fallback_collection

get #

get(name: str) -> BaseView

Returns an instance of the view with the given name

PARAMETER DESCRIPTION
name

Name of the view to return

TYPE: str

RETURNS DESCRIPTION
BaseView

View instance

RAISES DESCRIPTION
NoViewFoundError

If there is no view with the given name

Source code in src/dbally/collection/collection.py
def get(self, name: str) -> BaseView:
    """
    Returns an instance of the view with the given name

    Args:
        name: Name of the view to return

    Returns:
        View instance

    Raises:
         NoViewFoundError: If there is no view with the given name
    """

    if name not in self._views:
        raise NoViewFoundError(name)

    return self._builders[name]()

list #

list() -> Dict[str, str]

Lists all registered view names and their descriptions

RETURNS DESCRIPTION
Dict[str, str]

Dictionary of view names and descriptions
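
The descriptions come from the docstrings of the registered view classes. A minimal sketch of that derivation, using two hypothetical classes and the same `textwrap.dedent(...).strip()` expression found in the source of `list`:

```python
import textwrap
from typing import Dict


class CandidatesView:
    """
    View for retrieving information about job candidates.
    """


class UndocumentedView:
    pass


def describe(views: Dict[str, type]) -> Dict[str, str]:
    # Dedent and strip the class docstring; fall back to "" when there is none.
    return {
        name: (textwrap.dedent(view.__doc__).strip() if view.__doc__ else "")
        for name, view in views.items()
    }


descriptions = describe({"CandidatesView": CandidatesView, "UndocumentedView": UndocumentedView})
print(descriptions)
```

A view class without a docstring therefore appears with an empty description.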

Source code in src/dbally/collection/collection.py
def list(self) -> Dict[str, str]:
    """
    Lists all registered view names and their descriptions

    Returns:
        Dictionary of view names and descriptions
    """
    return {
        name: (textwrap.dedent(view.__doc__).strip() if view.__doc__ else "") for name, view in self._views.items()
    }

get_all_event_handlers #

get_all_event_handlers() -> List[EventHandler]

Retrieves all event handlers, including those from a fallback collection if available.

This method returns a list of event handlers. If there is no fallback collection, it simply returns the event handlers stored in the current object. If a fallback collection is available, it combines the event handlers from both the current object and the fallback collection, ensuring no duplicates.

RETURNS DESCRIPTION
List[EventHandler]

A list of event handlers.
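
The merging rule can be sketched with a hypothetical `Node` class, with plain strings standing in for event handler objects. As in the source of this method, duplicates are removed with a set union, which means the handlers must be hashable and the order of the merged list is not guaranteed.

```python
from typing import List, Optional


class Node:
    """Hypothetical stand-in for a collection holding event handlers."""

    def __init__(self, handlers: List[str], fallback: Optional["Node"] = None) -> None:
        self.handlers = handlers
        self.fallback = fallback

    def all_handlers(self) -> List[str]:
        if not self.fallback:
            return self.handlers
        # Set union drops duplicates; the resulting order is arbitrary.
        return list(set(self.handlers).union(self.fallback.all_handlers()))


fallback = Node(["cli", "audit"])
base = Node(["cli", "tracing"], fallback=fallback)
merged = base.all_handlers()
print(sorted(merged))
```
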

Source code in src/dbally/collection/collection.py
def get_all_event_handlers(self) -> List[EventHandler]:
    """
    Retrieves all event handlers, including those from a fallback collection if available.

    This method returns a list of event handlers. If there is no fallback collection,
    it simply returns the event handlers stored in the current object. If a fallback
    collection is available, it combines the event handlers from both the current object
    and the fallback collection, ensuring no duplicates.

    Returns:
        A list of event handlers.
    """
    if not self._fallback_collection:
        return self._event_handlers
    return list(set(self._event_handlers).union(self._fallback_collection.get_all_event_handlers()))

ask async #

ask(question: str, dry_run: bool = False, return_natural_response: bool = False, llm_options: Optional[LLMOptions] = None, event_tracker: Optional[EventTracker] = None) -> ExecutionResult

Ask question in a text form and retrieve the answer based on the available views.

Question answering is composed of the following steps:
  1. View Selection
  2. IQL Generation
  3. IQL Parsing
  4. Query Building
  5. Query Execution

PARAMETER DESCRIPTION
question

Question posed in natural language, e.g. "What job offers for Data Scientists do we have?"

TYPE: str

dry_run

if True, only generate the query without executing it

TYPE: bool DEFAULT: False

return_natural_response

if True (and dry_run is False as natural response requires query results), the natural response will be included in the answer

TYPE: bool DEFAULT: False

llm_options

options to use for the LLM client. If provided, these options will be merged with the default options provided to the LLM client, prioritizing option values other than NOT_GIVEN

TYPE: Optional[LLMOptions] DEFAULT: None

event_tracker

Event tracker object for given ask.

TYPE: Optional[EventTracker] DEFAULT: None

RETURNS DESCRIPTION
ExecutionResult

ExecutionResult object representing the result of the query execution.

RAISES DESCRIPTION
ValueError

if collection is empty

IQLError

if incorrect IQL was generated n_retries times.

ValueError

if incorrect IQL was generated n_retries amount of times.

NoViewFoundError

if the question does not match any registered view

UnsupportedQueryError

if the question could not be answered

IndexUpdateError

if index update failed
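
How a handled error is passed to a fallback collection, and re-raised when no fallback is set, can be sketched with stand-ins. `MiniCollection` and `UnsupportedQuery` below are hypothetical and much simpler than db-ally's classes; the sketch only mirrors the control flow of trying the base collection first, delegating on failure, and re-raising when there is nothing to delegate to.

```python
import asyncio
from typing import Dict, Optional


class UnsupportedQuery(Exception):
    """Hypothetical stand-in for an error a collection can hand over to its fallback."""


class MiniCollection:
    """Hypothetical stand-in; `answers` maps questions to canned results."""

    def __init__(
        self, name: str, answers: Dict[str, str], fallback: Optional["MiniCollection"] = None
    ) -> None:
        self.name = name
        self.answers = answers
        self.fallback = fallback

    async def ask(self, question: str) -> str:
        try:
            if question not in self.answers:
                raise UnsupportedQuery(question)
            return f"{self.name}: {self.answers[question]}"
        except UnsupportedQuery:
            if self.fallback:
                # Delegate to the fallback instead of surfacing the error.
                return await self.fallback.ask(question)
            raise  # no fallback configured, so the caller sees the exception


archive = MiniCollection("archive", {"How many offers in 2019?": "12"})
current = MiniCollection("current", {"How many offers today?": "3"}, fallback=archive)

answer = asyncio.run(current.ask("How many offers in 2019?"))
print(answer)
```
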

Source code in src/dbally/collection/collection.py
async def ask(
    self,
    question: str,
    dry_run: bool = False,
    return_natural_response: bool = False,
    llm_options: Optional[LLMOptions] = None,
    event_tracker: Optional[EventTracker] = None,
) -> ExecutionResult:
    """
    Ask question in a text form and retrieve the answer based on the available views.

    Question answering is composed of following steps:
        1. View Selection
        2. IQL Generation
        3. IQL Parsing
        4. Query Building
        5. Query Execution

    Args:
        question: question posed using natural language representation e.g\
        "What job offers for Data Scientists do we have?"
        dry_run: if True, only generate the query without executing it
        return_natural_response: if True (and dry_run is False as natural response requires query results),
            the natural response will be included in the answer
        llm_options: options to use for the LLM client. If provided, these options will be merged with the default
            options provided to the LLM client, prioritizing option values other than NOT_GIVEN
        event_tracker: Event tracker object for given ask.

    Returns:
        ExecutionResult object representing the result of the query execution.

    Raises:
        ValueError: if collection is empty
        IQLError: if incorrect IQL was generated `n_retries` amount of times.
        ValueError: if incorrect IQL was generated `n_retries` amount of times.
        NoViewFoundError: if question does not match to any registered view,
        UnsupportedQueryError: if the question could not be answered
        IndexUpdateError: if index update failed
    """
    if not event_tracker:
        is_fallback_call = False
        event_handlers = self.get_all_event_handlers()
        event_tracker = EventTracker.initialize_with_handlers(event_handlers)
        await event_tracker.request_start(RequestStart(question=question, collection_name=self.name))
    else:
        is_fallback_call = True

    selected_view_name = ""

    try:
        start_time = time.monotonic()
        selected_view_name = await self._select_view(
            question=question, event_tracker=event_tracker, llm_options=llm_options
        )

        start_time_view = time.monotonic()
        view_result = await self._ask_view(
            selected_view_name=selected_view_name,
            question=question,
            event_tracker=event_tracker,
            llm_options=llm_options,
            dry_run=dry_run,
        )
        end_time_view = time.monotonic()

        natural_response = (
            await self._generate_textual_response(view_result, question, event_tracker, llm_options)
            if not dry_run and return_natural_response
            else ""
        )

        result = ExecutionResult(
            results=view_result.results,
            context=view_result.context,
            execution_time=time.monotonic() - start_time,
            execution_time_view=end_time_view - start_time_view,
            view_name=selected_view_name,
            textual_response=natural_response,
        )

    except HANDLED_EXCEPTION_TYPES as caught_exception:
        if self._fallback_collection:
            result = await self._handle_fallback(
                question=question,
                dry_run=dry_run,
                return_natural_response=return_natural_response,
                llm_options=llm_options,
                selected_view_name=selected_view_name,
                event_tracker=event_tracker,
                caught_exception=caught_exception,
            )
        else:
            raise caught_exception

    if not is_fallback_call:
        await event_tracker.request_end(RequestEnd(result=result))

    return result

get_similarity_indexes #

get_similarity_indexes() -> Dict[AbstractSimilarityIndex, List[IndexLocation]]

List all similarity indexes from all views in the collection.

RETURNS DESCRIPTION
Dict[AbstractSimilarityIndex, List[IndexLocation]]

Mapping of similarity indexes to their locations, following view format.

Location format by view type:
  • freeform views, the format is (view_name, table_name, column_name)
  • structured views, the format is (view_name, filter_name, argument_name)

TYPE: Dict[AbstractSimilarityIndex, List[IndexLocation]]
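
The shape of the returned mapping can be sketched with plain data. Below, strings stand in for index objects, the view and filter names are made up, and `IndexLocation` is assumed to be a three-part tuple as described above. The merging loop follows the source of this method: an index used by several views ends up with all of its locations in one list.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

IndexLocation = Tuple[str, str, str]  # assumption: (view_name, filter_name, argument_name)

# Hypothetical per-view results: index -> locations where that index is used.
per_view: Dict[str, Dict[str, List[IndexLocation]]] = {
    "CandidatesView": {"city_index": [("CandidatesView", "from_city", "city")]},
    "OffersView": {
        "city_index": [("OffersView", "located_in", "city")],
        "title_index": [("OffersView", "with_title", "title")],
    },
}

merged: Dict[str, List[IndexLocation]] = defaultdict(list)
for view_indexes in per_view.values():
    for index, locations in view_indexes.items():
        # An index shared by several views accumulates all of its locations.
        merged[index].extend(locations)

print(dict(merged))
```
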

Source code in src/dbally/collection/collection.py
def get_similarity_indexes(self) -> Dict[AbstractSimilarityIndex, List[IndexLocation]]:
    """
    List all similarity indexes from all views in the collection.

    Returns:
        Mapping of similarity indexes to their locations, following view format.
        For:
            - freeform views, the format is (view_name, table_name, column_name)
            - structured views, the format is (view_name, filter_name, argument_name)
    """
    indexes = defaultdict(list)
    for view_name in self._views:
        view = self.get(view_name)
        view_indexes = view.list_similarity_indexes()
        for index, location in view_indexes.items():
            indexes[index].extend(location)
    return indexes

update_similarity_indexes async #

update_similarity_indexes() -> None

Update all similarity indexes from all structured views in the collection.

RAISES DESCRIPTION
IndexUpdateError

if updating any of the indexes fails. The exception provides failed_indexes attribute, a dictionary mapping failed indexes to their respective exceptions. Indexes not present in the dictionary were updated successfully.
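
The failure-collection pattern behind `failed_indexes` can be sketched with the standard library alone. `FakeIndex` and `update_all` are hypothetical names; the technique is `asyncio.gather` with `return_exceptions=True`, which lets every update run to completion and returns exceptions as results instead of aborting at the first failure.

```python
import asyncio
from typing import Dict, List


class FakeIndex:
    """Hypothetical index whose update either succeeds or raises."""

    def __init__(self, name: str, fails: bool = False) -> None:
        self.name = name
        self.fails = fails

    async def update(self) -> None:
        if self.fails:
            raise RuntimeError(f"could not update {self.name}")


async def update_all(indexes: List[FakeIndex]) -> Dict[FakeIndex, Exception]:
    # return_exceptions=True collects errors as results instead of aborting the gather.
    results = await asyncio.gather(
        *(index.update() for index in indexes), return_exceptions=True
    )
    # Keep only the indexes whose update produced an exception.
    return {
        index: result
        for index, result in zip(indexes, results)
        if isinstance(result, Exception)
    }


indexes = [FakeIndex("cities"), FakeIndex("titles", fails=True)]
failed = asyncio.run(update_all(indexes))
for index, error in failed.items():
    print(f"{index.name}: {error}")
```

Any index missing from the resulting dictionary was updated successfully, which matches the description of the `failed_indexes` attribute above.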

Source code in src/dbally/collection/collection.py
async def update_similarity_indexes(self) -> None:
    """
    Update all similarity indexes from all structured views in the collection.

    Raises:
        IndexUpdateError: if updating any of the indexes fails. The exception provides `failed_indexes` attribute,
            a dictionary mapping failed indexes to their respective exceptions. Indexes not present in
            the dictionary were updated successfully.
    """
    indexes = self.get_similarity_indexes()
    update_coroutines = [index.update() for index in indexes]
    results = await asyncio.gather(*update_coroutines, return_exceptions=True)
    failed_indexes = {
        index: exception for index, exception in zip(indexes, results) if isinstance(exception, Exception)
    }
    if failed_indexes:
        failed_locations = [loc for index in failed_indexes for loc in indexes[index]]
        raise IndexUpdateError(failed_indexes, failed_locations)

dbally.collection.results.ExecutionResult dataclass #

ExecutionResult(results: List[Dict[str, Any]], context: Dict[str, Any], execution_time: float, execution_time_view: float, view_name: str, textual_response: Optional[str] = None)

Represents the collection-level result of the query execution.

PARAMETER DESCRIPTION
results

List of dictionaries containing the results of the query execution. Each dictionary represents a row in the result set, with column names as keys. The exact structure of the result set depends on the view that was used to execute the query, which can be obtained from the view_name attribute.

TYPE: List[Dict[str, Any]]

context

Dictionary containing additional metadata about the query execution.

TYPE: Dict[str, Any]

execution_time

Time taken to execute the entire query, including view selection and all other operations, in seconds.

TYPE: float

execution_time_view

Time the selected view took to execute the query, in seconds.

TYPE: float

view_name

Name of the view that was used to execute the query.

TYPE: str

textual_response

Optional text response that can be used to display the query results in a human-readable format.

TYPE: Optional[str] DEFAULT: None
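
To show how these fields are typically read, here is a sketch using a local stand-in dataclass with the same fields as the signature above. `ResultSketch` is hypothetical, and the rows and timings are made-up values, not output from db-ally.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ResultSketch:
    """Local stand-in with the same fields as the ExecutionResult signature."""

    results: List[Dict[str, Any]]
    context: Dict[str, Any]
    execution_time: float
    execution_time_view: float
    view_name: str
    textual_response: Optional[str] = None


result = ResultSketch(
    results=[{"name": "Ada", "city": "Paris"}, {"name": "Linus", "city": "Oslo"}],  # made-up rows
    context={},
    execution_time=1.25,
    execution_time_view=0.75,
    view_name="CandidatesView",
)

# Each row is a dictionary keyed by column name.
names = [row["name"] for row in result.results]
# Time spent outside the view: view selection and the other collection-level steps.
overhead = result.execution_time - result.execution_time_view
print(names, overhead)
```
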

results instance-attribute #

results: List[Dict[str, Any]]

context instance-attribute #

context: Dict[str, Any]

execution_time instance-attribute #

execution_time: float

execution_time_view instance-attribute #

execution_time_view: float

view_name instance-attribute #

view_name: str

textual_response class-attribute instance-attribute #

textual_response: Optional[str] = None

dbally.collection.exceptions.IndexUpdateError #

IndexUpdateError(failed_indexes: Dict[AbstractSimilarityIndex, Exception], failed_locations: List[IndexLocation])

Bases: DbAllyError

Exception for when updating any of the Collection's similarity indexes fails.

Provides a dictionary mapping failed indexes to their respective exceptions as the failed_indexes attribute.

PARAMETER DESCRIPTION
failed_indexes

Dictionary mapping failed indexes to their respective exceptions.

TYPE: Dict[AbstractSimilarityIndex, Exception]

failed_locations

List of locations of failed indexes.

TYPE: List[IndexLocation]
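
The exception message is built from the failed locations. The sketch below reproduces the joining logic from the constructor source on two made-up locations, so the resulting format can be seen directly: dots within a location, commas between locations.

```python
# Hypothetical locations, each in (view_name, filter_name, argument_name) form.
failed_locations = [
    ("CandidatesView", "from_city", "city"),
    ("OffersView", "with_title", "title"),
]

# Join the parts of each location with dots, then the locations with commas.
description = ", ".join(".".join(location) for location in failed_locations)
message = f"Failed to update similarity indexes for {description}."
print(message)
```
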

Source code in src/dbally/collection/exceptions.py
def __init__(
    self,
    failed_indexes: Dict[AbstractSimilarityIndex, Exception],
    failed_locations: List[IndexLocation],
) -> None:
    """
    Args:
        failed_indexes: Dictionary mapping failed indexes to their respective exceptions.
        failed_locations: List of locations of failed indexes.
    """
    description = ", ".join(".".join(name for name in location) for location in failed_locations)
    super().__init__(f"Failed to update similarity indexes for {description}.")
    self.failed_indexes = failed_indexes

failed_indexes instance-attribute #

failed_indexes = failed_indexes

dbally.collection.exceptions.NoViewFoundError #

NoViewFoundError(view_name: str)

Bases: DbAllyError

Error raised when there is no view with the given name.

PARAMETER DESCRIPTION
view_name

Name of the view that was not found.

TYPE: str

Source code in src/dbally/collection/exceptions.py
def __init__(self, view_name: str) -> None:
    """
    Args:
        view_name: Name of the view that was not found.
    """
    super().__init__(f"No view found with name '{view_name}'.")
    self.view_name = view_name

view_name instance-attribute #

view_name = view_name