How-To: Trace runs with OpenTelemetry#
db-ally provides you a way to track execution of the query processing using OpenTelemetry standard. As db-ally is a library, it only depends on the OpenTelemtry API. For projects that use db-ally, include OpenTelemetry SDK or perform Auto Instrumentation.
Step-by-step guide#
-
Python OpenTelemetry SDK must be installed:
-
To capture the traces, you can use Jeager. An open-source software for telemetry data. The recommended option is to start with Docker. You can run:
docker run --network host --rm --name jeager -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 jaegertracing/all-in-one
For simplicity we are using
--network host
, however, do not use this settings in production deployments and expose only ports that are needed. -
Import required OpenTelemetry SDKs and db-ally OTel Handler:
from dbally.audit.event_handlers.otel_event_handler import OtelEventHandler from opentelemetry.sdk.trace import TracerProvider from opentelemetry.sdk.resources import Resource from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter from opentelemetry.sdk.trace.export import BatchSpanProcessor
-
Setup the OTel exporter in your project:
exporeter = OTLPSpanExporter("http://localhost:4317", insecure=True) provider = TracerProvider(resource=Resource({"service.name": "db-ally"})) processor = BatchSpanProcessor(exporeter) provider.add_span_processor(processor) handler = OtelEventHandler(provider)
Using Resource you can add a name for your service. OTLPSpanExporter is used to export telemetry data using gRPC or HTTP to desired location. We mark it as insecure, as demo does not use TLS. To efficently send data over network, we should use BatchSpanProcessor to batch exports of telemetry data. Finally, we setup the db-ally handler.
-
Use handler with collection:
df = pd.DataFrame({ "name": ["Alice", "Bob", "Charlie", "David", "Eve"], "city": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"], }) llm = LiteLLM(model_name="gpt-4o") collection = dbally.create_collection("clients", llm=llm, event_handlers=[handler], nl_responder=NLResponder(llm)) collection.add(ClientView, lambda: ClientView(df))
-
Ask your questions:
-
Explore your traces in observability platform (Jeager in our case):
Full code example#
import asyncio
import pandas as pd
import dbally
from dbally import DataFrameBaseView
from dbally.audit.event_handlers.otel_event_handler import OtelEventHandler
from dbally.nl_responder.nl_responder import NLResponder
from dbally.views import decorators
from dbally.llms import LiteLLM
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor
class ClientView(DataFrameBaseView):
@decorators.view_filter()
def filter_by_city(self, city: str):
return self.df['city'] == city
async def main():
exporeter = OTLPSpanExporter("http://localhost:4317", insecure=True)
provider = TracerProvider(resource=Resource({"service.name": "db-ally"}))
processor = BatchSpanProcessor(exporeter)
provider.add_span_processor(processor)
handler = OtelEventHandler(provider)
df = pd.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David", "Eve"],
"city": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})
llm = LiteLLM(model_name="gpt-4o")
collection = dbally.create_collection("clients", llm=llm, event_handlers=[handler], nl_responder=NLResponder(llm))
collection.add(ClientView, lambda: ClientView(df))
result = await collection.ask("What clients are from Huston?", return_natural_response=True)
print(result)
if __name__ == '__main__':
asyncio.run(main())