Collection#

Tip

To understand the general idea better, visit the Collection concept page.

dbally.collection.Collection #

Collection(name: str, view_selector: ViewSelector, llm: LLM, nl_responder: NLResponder, event_handlers: Optional[List[EventHandler]] = None, n_retries: int = 3, fallback_collection: Optional[Collection] = None)

Collection is a container for a set of views that can be used by db-ally to answer user questions.

Tip

It is recommended to create new collections using the dbally.create_collection function instead of instantiating this class directly.

PARAMETER	DESCRIPTION
`name`	Name of the collection. It is available to Event handlers and is used to distinguish between db-ally runs. TYPE: `str`
`view_selector`	When more than one View is registered in a single collection, the ViewSelector picks the View that best fits the query before the IQL query is generated. TYPE: `ViewSelector`
`llm`	LLM used by the collection to generate views and respond to natural language queries. TYPE: `LLM`
`nl_responder`	Object that translates RAW response from db-ally into natural language. TYPE: `NLResponder`
`event_handlers`	Event handlers used by the collection during query execution. For example, CLIEventHandler can be used to log events and LangSmithEventHandler to validate system performance. TYPE: `Optional[List[EventHandler]]` DEFAULT: `None`
`n_retries`	The IQL generator may produce invalid IQL. This argument specifies how many times db-ally will try to regenerate it in that case. Each failed attempt, together with its error message, is appended to the chat history to guide the next generation. TYPE: `int` DEFAULT: `3`
`fallback_collection`	Collection to be asked when the ask function cannot find an answer in the views registered to this collection. TYPE: `Optional[Collection]` DEFAULT: `None`

Source code in src/dbally/collection/collection.py

def __init__(
    self,
    name: str,
    view_selector: ViewSelector,
    llm: LLM,
    nl_responder: NLResponder,
    event_handlers: Optional[List[EventHandler]] = None,
    n_retries: int = 3,
    fallback_collection: Optional["Collection"] = None,
) -> None:
    """
    Args:
        name: Name of the collection is available for [Event handlers](event_handlers/index.md) and is\
        used to distinguish different db-ally runs.
        view_selector: As you register more than one [View](views/index.md) within single collection,\
        before generating the IQL query, a View that fits query the most is selected by the\
        [ViewSelector](view_selection/index.md).
        llm: LLM used by the collection to generate views and respond to natural language queries.
        nl_responder: Object that translates RAW response from db-ally into natural language.
        event_handlers: Event handlers used by the collection during query executions. Can be used\
        to log events as [CLIEventHandler](event_handlers/cli_handler.md) or to validate system performance\
        as [LangSmithEventHandler](event_handlers/langsmith_handler.md).
        nl_responder: Object that translates RAW response from db-ally into natural language.
        n_retries: IQL generator may produce invalid IQL. If this is the case this argument specifies\
        how many times db-ally will try to regenerate it. Previous try with the error message is\
        appended to the chat history to guide next generations.
        fallback_collection: collection to be asked when the ask function could not find answer in views registered
        to this collection
    """
    self.name = name
    self.n_retries = n_retries
    self._views: Dict[str, Callable[[], BaseView]] = {}
    self._builders: Dict[str, Callable[[], BaseView]] = {}
    self._view_selector = view_selector
    self._nl_responder = nl_responder
    self._llm = llm
    self._fallback_collection: Optional[Collection] = fallback_collection
    self._event_handlers = event_handlers or dbally.event_handlers

name `instance-attribute` #

name = name

n_retries `instance-attribute` #

n_retries = n_retries

T `class-attribute` `instance-attribute` #

T = TypeVar('T', bound=BaseView)

add #

add(view: Type[T], builder: Optional[Callable[[], T]] = None, name: Optional[str] = None) -> None

Register a new View that can be queried through the collection.

PARAMETER	DESCRIPTION
`view`	A class inheriting from BaseView. An object of this type is initialized during query execution. A class is expected rather than an instance, because otherwise Views would have to be implemented statelessly, which would be cumbersome. TYPE: `Type[T]`
`builder`	Optional factory function used to create the View instance. Use it when the View needs the result of an API call or a database connection that can change over time. TYPE: `Optional[Callable[[], T]]` DEFAULT: `None`
`name`	Custom name of the view (defaults to the name of the class). TYPE: `Optional[str]` DEFAULT: `None`

RAISES	DESCRIPTION
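`ValueError`	If a view with the given name is already registered, if the view class has constructor arguments without defaults and no builder is provided, or if the builder does not return an instance of the view class.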

Example of custom builder usage

    import requests

    def build_dogs_df_view():
        # Fetch the data at build time so each View instance sees a fresh result.
        dogs_df = requests.get("https://dog.ceo/api/breeds/list")
        return DogsDFView(dogs_df)

    collection.add(DogsDFView, build_dogs_df_view)

Source code in src/dbally/collection/collection.py

def add(self, view: Type[T], builder: Optional[Callable[[], T]] = None, name: Optional[str] = None) -> None:
    """
    Register new [View](views/index.md) that will be available to query via the collection.

    Args:
        view: A class inheriting from BaseView. Object of this type will be initialized during\
        query execution. We expect Class instead of object, as otherwise Views must have been implemented\
        stateless, which would be cumbersome.
        builder: Optional factory function that will be used to create the View instance. Use it when you\
        need to pass outcome of API call or database connection to the view, and it can change over time.
        name: Custom name of the view (defaults to the name of the class).

    Raises:
        ValueError: if view with the given name is already registered or views class possess some non-default\
        arguments.

    **Example** of custom `builder` usage

    ```python
        def build_dogs_df_view():
            dogs_df = requests.get("https://dog.ceo/api/breeds/list")
            return DogsDFView(dogs_df)

        collection.add(DogsDFView, build_dogs_df_view)
    ```
    """
    if name is None:
        name = view.__name__

    if name in self._views or name in self._builders:
        raise ValueError(f"View with name {name} is already registered")

    non_default_args = any(
        p.default == inspect.Parameter.empty for p in inspect.signature(view).parameters.values()
    )
    if non_default_args and builder is None:
        raise ValueError("Builder function is required for views with non-default arguments")

    builder = builder or view

    # instantiate view to check if the builder is correct
    view_instance = builder()
    if not isinstance(view_instance, view):
        raise ValueError(f"The builder function for view {name} must return an instance of {view.__name__}")

    self._views[name] = view
    self._builders[name] = builder
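The constructor check in `add` can be tried in isolation. The following is a minimal sketch using only the standard library; `NoArgsView`, `NeedsConnectionView` and `requires_builder` are hypothetical names introduced for illustration and are not part of db-ally.

```python
import inspect


class NoArgsView:
    """Hypothetical view that can be constructed without arguments."""


class NeedsConnectionView:
    """Hypothetical view that needs a connection object at construction time."""

    def __init__(self, connection):
        self.connection = connection


def requires_builder(view_cls) -> bool:
    """Return True if any constructor parameter lacks a default value."""
    return any(
        p.default is inspect.Parameter.empty
        for p in inspect.signature(view_cls).parameters.values()
    )


needs_builder_for_no_args = requires_builder(NoArgsView)
needs_builder_for_connection = requires_builder(NeedsConnectionView)
```

Note that `*args` and `**kwargs` parameters also report `inspect.Parameter.empty` as their default, so a check written this way would ask for a builder for such constructors as well.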

set_fallback #

set_fallback(fallback_collection: Collection) -> Collection

Set the fallback collection, which is asked if the ask call on the base collection does not succeed.

PARAMETER	DESCRIPTION
`fallback_collection`	Collection to be asked in case of base collection failure. TYPE: `Collection`

RETURNS	DESCRIPTION
`Collection`	The fallback collection, returned so that calls can be chained.

Source code in src/dbally/collection/collection.py

def set_fallback(self, fallback_collection: "Collection") -> "Collection":
    """
    Set fallback collection which will be asked if the ask to base collection does not succeed.

    Args:
        fallback_collection: Collection to be asked in case of base collection failure.

    Returns:
        The fallback collection to create chains call
    """
    self._fallback_collection = fallback_collection
    if fallback_collection._event_handlers != self._event_handlers:  # pylint: disable=W0212
        logging.warning(
            "Event handlers of the fallback collection are different from the base collection. "
            "Continuity of the audit trail is not guaranteed.",
        )

    return fallback_collection
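Because `set_fallback` returns the fallback rather than the collection it was called on, several fallbacks can be linked in one expression. A minimal sketch of that pattern, with `ChainNode` as a hypothetical stand-in for Collection reduced to the fallback link:

```python
from typing import Optional


class ChainNode:
    """Hypothetical stand-in for Collection, reduced to the fallback link."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fallback: Optional["ChainNode"] = None

    def set_fallback(self, fallback: "ChainNode") -> "ChainNode":
        # Returning the fallback (not self) is what makes chaining work.
        self.fallback = fallback
        return fallback


primary = ChainNode("primary")
secondary = ChainNode("secondary")
tertiary = ChainNode("tertiary")

# primary falls back to secondary, which in turn falls back to tertiary.
primary.set_fallback(secondary).set_fallback(tertiary)
```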

get #

get(name: str) -> BaseView

Returns an instance of the view with the given name

PARAMETER	DESCRIPTION
`name`	Name of the view to return TYPE: `str`

RETURNS	DESCRIPTION
`BaseView`	View instance

RAISES	DESCRIPTION
`NoViewFoundError`	If there is no view with the given name

Source code in src/dbally/collection/collection.py

def get(self, name: str) -> BaseView:
    """
    Returns an instance of the view with the given name

    Args:
        name: Name of the view to return

    Returns:
        View instance

    Raises:
         NoViewFoundError: If there is no view with the given name
    """

    if name not in self._views:
        raise NoViewFoundError(name)

    return self._builders[name]()
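As the listing shows, `get` calls the registered builder on every lookup, so each call yields a new instance. A minimal sketch of that behaviour, where `CounterView`, `builders` and `get_view` are hypothetical names and `KeyError` stands in for NoViewFoundError:

```python
from typing import Callable, Dict


class CounterView:
    """Hypothetical view used only for this sketch."""


# Maps view names to zero-argument factories, mirroring the builder registry.
builders: Dict[str, Callable[[], CounterView]] = {"CounterView": CounterView}


def get_view(name: str) -> CounterView:
    if name not in builders:
        raise KeyError(name)  # stand-in for NoViewFoundError
    # The builder runs on every call, so callers never share an instance.
    return builders[name]()


first = get_view("CounterView")
second = get_view("CounterView")
```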

list #

list() -> Dict[str, str]

Lists all registered view names and their descriptions

RETURNS	DESCRIPTION
`Dict[str, str]`	Dictionary of view names and descriptions

Source code in src/dbally/collection/collection.py

def list(self) -> Dict[str, str]:
    """
    Lists all registered view names and their descriptions

    Returns:
        Dictionary of view names and descriptions
    """
    return {
        name: (textwrap.dedent(view.__doc__).strip() if view.__doc__ else "") for name, view in self._views.items()
    }
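The descriptions returned by `list` come from the view classes' docstrings, dedented and stripped. A minimal sketch of that transformation, assuming two hypothetical view classes and a hypothetical helper `describe`:

```python
import textwrap


class CandidatesView:
    """
    Hypothetical view used only for this sketch.
    It describes job candidates.
    """


class UndocumentedView:
    pass


def describe(view_cls) -> str:
    """Return the cleaned docstring, or an empty string if there is none."""
    return textwrap.dedent(view_cls.__doc__).strip() if view_cls.__doc__ else ""


descriptions = {cls.__name__: describe(cls) for cls in (CandidatesView, UndocumentedView)}
```

A view without a docstring therefore appears in the result with an empty description rather than being left out.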

get_all_event_handlers #

get_all_event_handlers() -> List[EventHandler]

Retrieves all event handlers, including those from a fallback collection if available.

This method returns a list of event handlers. If there is no fallback collection, it simply returns the event handlers stored in the current object. If a fallback collection is available, it combines the event handlers from both the current object and the fallback collection, ensuring no duplicates.

RETURNS	DESCRIPTION
`List[EventHandler]`	A list of event handlers.

Source code in src/dbally/collection/collection.py

def get_all_event_handlers(self) -> List[EventHandler]:
    """
    Retrieves all event handlers, including those from a fallback collection if available.

    This method returns a list of event handlers. If there is no fallback collection,
    it simply returns the event handlers stored in the current object. If a fallback
    collection is available, it combines the event handlers from both the current object
    and the fallback collection, ensuring no duplicates.

    Returns:
        A list of event handlers.
    """
    if not self._fallback_collection:
        return self._event_handlers
    return list(set(self._event_handlers).union(self._fallback_collection.get_all_event_handlers()))
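The de-duplication relies on a set union. A minimal sketch, with `Handler` as a hypothetical stand-in for an event handler that is hashable by identity:

```python
class Handler:
    """Hypothetical stand-in for an event handler; hashable by identity."""

    def __init__(self, name: str) -> None:
        self.name = name


cli = Handler("cli")
tracer = Handler("tracer")

own_handlers = [cli, tracer]
fallback_handlers = [tracer]  # shared with the base collection

# The union keeps each handler object once, even if both lists contain it.
combined = list(set(own_handlers).union(fallback_handlers))
```

Since the result passes through a set, the order of the combined list is not guaranteed. When there is no fallback collection, the method in the listing returns the collection's own handler list directly rather than a copy.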

ask `async` #

ask(question: str, dry_run: bool = False, return_natural_response: bool = False, llm_options: Optional[LLMOptions] = None, event_tracker: Optional[EventTracker] = None) -> ExecutionResult

Ask a question in text form and retrieve the answer based on the available views.

Question answering consists of the following steps:

1. View Selection
2. IQL Generation
3. IQL Parsing
4. Query Building
5. Query Execution

PARAMETER	DESCRIPTION
`question`	question posed using natural language representation e.g "What job offers for Data Scientists do we have?" TYPE: `str`
`dry_run`	if True, only generate the query without executing it TYPE: `bool` DEFAULT: `False`
`return_natural_response`	if True (and dry_run is False as natural response requires query results), the natural response will be included in the answer TYPE: `bool` DEFAULT: `False`
`llm_options`	options to use for the LLM client. If provided, these options will be merged with the default options provided to the LLM client, prioritizing option values other than NOT_GIVEN TYPE: `Optional[LLMOptions]` DEFAULT: `None`
`event_tracker`	Event tracker object for given ask. TYPE: `Optional[EventTracker]` DEFAULT: `None`

RETURNS	DESCRIPTION
`ExecutionResult`	ExecutionResult object representing the result of the query execution.

RAISES	DESCRIPTION
`ValueError`	If the collection is empty.
`IQLError`	If incorrect IQL was generated `n_retries` times.
`ValueError`	If incorrect IQL was generated `n_retries` times.
`NoViewFoundError`	If the question does not match any registered view.
`UnsupportedQueryError`	If the question could not be answered.
`IndexUpdateError`	If the index update failed.

Source code in src/dbally/collection/collection.py

async def ask(
    self,
    question: str,
    dry_run: bool = False,
    return_natural_response: bool = False,
    llm_options: Optional[LLMOptions] = None,
    event_tracker: Optional[EventTracker] = None,
) -> ExecutionResult:
    """
    Ask question in a text form and retrieve the answer based on the available views.

    Question answering is composed of following steps:
        1. View Selection
        2. IQL Generation
        3. IQL Parsing
        4. Query Building
        5. Query Execution

    Args:
        question: question posed using natural language representation e.g\
        "What job offers for Data Scientists do we have?"
        dry_run: if True, only generate the query without executing it
        return_natural_response: if True (and dry_run is False as natural response requires query results),
            the natural response will be included in the answer
        llm_options: options to use for the LLM client. If provided, these options will be merged with the default
            options provided to the LLM client, prioritizing option values other than NOT_GIVEN
        event_tracker: Event tracker object for given ask.

    Returns:
        ExecutionResult object representing the result of the query execution.

    Raises:
        ValueError: if collection is empty
        IQLError: if incorrect IQL was generated `n_retries` amount of times.
        ValueError: if incorrect IQL was generated `n_retries` amount of times.
        NoViewFoundError: if question does not match to any registered view,
        UnsupportedQueryError: if the question could not be answered
        IndexUpdateError: if index update failed
    """
    if not event_tracker:
        is_fallback_call = False
        event_handlers = self.get_all_event_handlers()
        event_tracker = EventTracker.initialize_with_handlers(event_handlers)
        await event_tracker.request_start(RequestStart(question=question, collection_name=self.name))
    else:
        is_fallback_call = True

    selected_view_name = ""

    try:
        start_time = time.monotonic()
        selected_view_name = await self._select_view(
            question=question, event_tracker=event_tracker, llm_options=llm_options
        )

        start_time_view = time.monotonic()
        view_result = await self._ask_view(
            selected_view_name=selected_view_name,
            question=question,
            event_tracker=event_tracker,
            llm_options=llm_options,
            dry_run=dry_run,
        )
        end_time_view = time.monotonic()

        natural_response = (
            await self._generate_textual_response(view_result, question, event_tracker, llm_options)
            if not dry_run and return_natural_response
            else ""
        )

        result = ExecutionResult(
            results=view_result.results,
            context=view_result.context,
            execution_time=time.monotonic() - start_time,
            execution_time_view=end_time_view - start_time_view,
            view_name=selected_view_name,
            textual_response=natural_response,
        )

    except HANDLED_EXCEPTION_TYPES as caught_exception:
        if self._fallback_collection:
            result = await self._handle_fallback(
                question=question,
                dry_run=dry_run,
                return_natural_response=return_natural_response,
                llm_options=llm_options,
                selected_view_name=selected_view_name,
                event_tracker=event_tracker,
                caught_exception=caught_exception,
            )
        else:
            raise caught_exception

    if not is_fallback_call:
        await event_tracker.request_end(RequestEnd(result=result))

    return result
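The try/except structure of `ask` can be illustrated with a reduced model. The `_handle_fallback` helper is not shown on this page, so this sketch assumes it simply forwards the question to the fallback collection; `SketchCollection` and `UnanswerableError` are hypothetical names.

```python
import asyncio
from typing import Dict, Optional


class UnanswerableError(Exception):
    """Hypothetical stand-in for the exception types that trigger a fallback."""


class SketchCollection:
    """Hypothetical, heavily reduced stand-in for Collection."""

    def __init__(self, name: str, answers: Dict[str, str]) -> None:
        self.name = name
        self.answers = answers
        self.fallback: Optional["SketchCollection"] = None

    async def ask(self, question: str) -> str:
        try:
            if question not in self.answers:
                raise UnanswerableError(question)
            return f"{self.name}: {self.answers[question]}"
        except UnanswerableError:
            # Without a fallback the error propagates to the caller.
            if self.fallback is None:
                raise
            return await self.fallback.ask(question)


primary = SketchCollection("primary", {"How many cats?": "3"})
secondary = SketchCollection("secondary", {"How many dogs?": "5"})
primary.fallback = secondary

answer = asyncio.run(primary.ask("How many dogs?"))
```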

get_similarity_indexes #

get_similarity_indexes() -> Dict[AbstractSimilarityIndex, List[IndexLocation]]

List all similarity indexes from all views in the collection.

RETURNS	DESCRIPTION
`Dict[AbstractSimilarityIndex, List[IndexLocation]]`	Mapping of similarity indexes to their locations, following the view format. For freeform views, the format is (view_name, table_name, column_name); for structured views, the format is (view_name, filter_name, argument_name).

Source code in src/dbally/collection/collection.py

def get_similarity_indexes(self) -> Dict[AbstractSimilarityIndex, List[IndexLocation]]:
    """
    List all similarity indexes from all views in the collection.

    Returns:
        Mapping of similarity indexes to their locations, following view format.
        For:
            - freeform views, the format is (view_name, table_name, column_name)
            - structured views, the format is (view_name, filter_name, argument_name)
    """
    indexes = defaultdict(list)
    for view_name in self._views:
        view = self.get(view_name)
        view_indexes = view.list_similarity_indexes()
        for index, location in view_indexes.items():
            indexes[index].extend(location)
    return indexes
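The merge step groups locations from different views under the same index. A minimal sketch with made-up data, in which plain strings stand in for similarity index objects and all view, filter and argument names are hypothetical:

```python
from collections import defaultdict

# Per-view results, shaped like the output of list_similarity_indexes().
view_indexes = {
    "CandidatesView": {
        "country_index": [("CandidatesView", "from_country", "country")],
    },
    "OffersView": {
        "country_index": [("OffersView", "in_country", "country")],
        "city_index": [("OffersView", "in_city", "city")],
    },
}

indexes = defaultdict(list)
for per_view in view_indexes.values():
    for index, locations in per_view.items():
        # An index shared by several views accumulates all of its locations.
        indexes[index].extend(locations)
```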

update_similarity_indexes `async` #

update_similarity_indexes() -> None

Update all similarity indexes from all structured views in the collection.

RAISES	DESCRIPTION
`IndexUpdateError`	if updating any of the indexes fails. The exception provides `failed_indexes` attribute, a dictionary mapping failed indexes to their respective exceptions. Indexes not present in the dictionary were updated successfully.

Source code in src/dbally/collection/collection.py

async def update_similarity_indexes(self) -> None:
    """
    Update all similarity indexes from all structured views in the collection.

    Raises:
        IndexUpdateError: if updating any of the indexes fails. The exception provides `failed_indexes` attribute,
            a dictionary mapping failed indexes to their respective exceptions. Indexes not present in
            the dictionary were updated successfully.
    """
    indexes = self.get_similarity_indexes()
    update_coroutines = [index.update() for index in indexes]
    results = await asyncio.gather(*update_coroutines, return_exceptions=True)
    failed_indexes = {
        index: exception for index, exception in zip(indexes, results) if isinstance(exception, Exception)
    }
    if failed_indexes:
        failed_locations = [loc for index in failed_indexes for loc in indexes[index]]
        raise IndexUpdateError(failed_indexes, failed_locations)
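The update runs all coroutines concurrently and collects failures instead of stopping at the first one. A minimal sketch of that pattern, where `FakeIndex` and `update_all` are hypothetical names:

```python
import asyncio
from typing import Dict, List


class FakeIndex:
    """Hypothetical stand-in for a similarity index."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    async def update(self) -> None:
        if self.fail:
            raise RuntimeError(f"cannot update {self.name}")


async def update_all(indexes: List[FakeIndex]) -> Dict[FakeIndex, Exception]:
    # return_exceptions=True makes gather return exceptions as results,
    # in the same order as the awaitables, instead of raising the first one.
    results = await asyncio.gather(
        *(index.update() for index in indexes), return_exceptions=True
    )
    return {
        index: result
        for index, result in zip(indexes, results)
        if isinstance(result, Exception)
    }


good = FakeIndex("good")
bad = FakeIndex("bad", fail=True)
failed = asyncio.run(update_all([good, bad]))
```

Indexes that are absent from the returned dictionary were updated successfully, which matches the contract described for `failed_indexes`.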

dbally.collection.results.ExecutionResult `dataclass` #

ExecutionResult(results: List[Dict[str, Any]], context: Dict[str, Any], execution_time: float, execution_time_view: float, view_name: str, textual_response: Optional[str] = None)

Represents the collection-level result of the query execution.

PARAMETER	DESCRIPTION
`results`	List of dictionaries containing the results of the query execution, each dictionary represents a row in the result set with column names as keys. The exact structure of the result set depends on the view that was used to execute the query, which can be obtained from the `view_name` attribute. TYPE: `List[Dict[str, Any]]`
`context`	Dictionary containing additional metadata about the query execution. TYPE: `Dict[str, Any]`
`execution_time`	Time taken to execute the entire query, including view selection and all other operations, in seconds. TYPE: `float`
`execution_time_view`	Time that the selected view took to execute the query, in seconds. TYPE: `float`
`view_name`	Name of the view that was used to execute the query. TYPE: `str`
`textual_response`	Optional text response that can be used to display the query results in a human-readable format. TYPE: `Optional[str]` DEFAULT: `None`

results `instance-attribute` #

results: List[Dict[str, Any]]

context `instance-attribute` #

context: Dict[str, Any]

execution_time `instance-attribute` #

execution_time: float

execution_time_view `instance-attribute` #

execution_time_view: float

view_name `instance-attribute` #

view_name: str

textual_response `class-attribute` `instance-attribute` #

textual_response: Optional[str] = None

dbally.collection.exceptions.IndexUpdateError #

IndexUpdateError(failed_indexes: Dict[AbstractSimilarityIndex, Exception], failed_locations: List[IndexLocation])

Bases: DbAllyError

Exception for when updating any of the Collection's similarity indexes fails.

Provides a dictionary mapping failed indexes to their respective exceptions as the failed_indexes attribute.

PARAMETER	DESCRIPTION
`failed_indexes`	Dictionary mapping failed indexes to their respective exceptions. TYPE: `Dict[AbstractSimilarityIndex, Exception]`
`failed_locations`	List of locations of failed indexes. TYPE: `List[IndexLocation]`

Source code in src/dbally/collection/exceptions.py

def __init__(
    self,
    failed_indexes: Dict[AbstractSimilarityIndex, Exception],
    failed_locations: List[IndexLocation],
) -> None:
    """
    Args:
        failed_indexes: Dictionary mapping failed indexes to their respective exceptions.
        failed_locations: List of locations of failed indexes.
    """
    description = ", ".join(".".join(name for name in location) for location in failed_locations)
    super().__init__(f"Failed to update similarity indexes for {description}.")
    self.failed_indexes = failed_indexes
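The exception message is built by joining each location tuple with dots and the locations with commas. A minimal sketch with made-up locations (the view, filter and argument names are hypothetical):

```python
failed_locations = [
    ("CandidatesView", "from_country", "country"),
    ("OffersView", "in_city", "city"),
]

# Each location becomes "view.filter.argument"; locations are comma-separated.
description = ", ".join(".".join(location) for location in failed_locations)
message = f"Failed to update similarity indexes for {description}."
```

In the listing above, only `failed_indexes` is stored on the exception; the locations are used solely to build the message.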

failed_indexes `instance-attribute` #

failed_indexes = failed_indexes

dbally.collection.exceptions.NoViewFoundError #

NoViewFoundError(view_name: str)

Bases: DbAllyError

Error raised when there is no view with the given name.

PARAMETER	DESCRIPTION
`view_name`	Name of the view that was not found. TYPE: `str`

Source code in src/dbally/collection/exceptions.py

def __init__(self, view_name: str) -> None:
    """
    Args:
        view_name: Name of the view that was not found.
    """
    super().__init__(f"No view found with name '{view_name}'.")
    self.view_name = view_name

view_name `instance-attribute` #

view_name = view_name
