# How-To: Use Local LLMs
db-ally includes a ready-to-use implementation for local LLMs called `LocalLLM`, which leverages the Hugging Face Transformers library to provide access to the many LLMs available on Hugging Face.
## Basic Usage
Install the required dependencies for using local LLMs:
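For example, assuming the distribution ships a `local` extra (check the install instructions for your version):

```bash
pip install dbally[local]
```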
Integrate db-ally with your local LLM:
First, set up your environment to use a Hugging Face model.
```python
import os

from dbally.llms.localllm import LocalLLM

# A Hugging Face token is required to download gated models such as Meta-Llama-3.
os.environ["HUGGINGFACE_API_KEY"] = "your-api-key"

llm = LocalLLM(model_name="meta-llama/Meta-Llama-3-8B-Instruct")
```
Use the LLM in your collection:
```python
import dbally

my_collection = dbally.create_collection("my_collection", llm)
response = await my_collection.ask("Which LLM should I use?")
```
## Advanced Usage
Advanced users can customize the LLM using `LocalLLMOptions`. Here is a list of the available parameters:
- `repetition_penalty`: float or null (optional) - Penalizes repeated tokens to avoid repetition.
- `do_sample`: bool or null (optional) - Enables sampling instead of greedy decoding.
- `best_of`: int or null (optional) - Generates multiple sequences and returns the one with the highest score.
- `max_new_tokens`: int (optional) - The maximum number of new tokens to generate.
- `top_k`: int or null (optional) - Limits the next-token choices to the top-k most probable tokens.
- `top_p`: float or null (optional) - Limits the next-token choices to tokens within the top-p probability mass.
- `seed`: int or null (optional) - Sets the seed for random number generation to ensure reproducibility.
- `stop_sequences`: list of strings or null (optional) - Specifies sequences where generation should stop.
- `temperature`: float or null (optional) - Adjusts the randomness of token selection.
```python
import dbally

from dbally.llms.localllm import LocalLLM
from dbally.llms.clients.localllm import LocalLLMOptions

llm = LocalLLM(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    default_options=LocalLLMOptions(temperature=0.7),
)
my_collection = dbally.create_collection("my_collection", llm)
```
You can also override any default parameter on the `ask` call:
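A minimal sketch of a per-call override, assuming `ask` accepts an `llm_options` keyword as db-ally's other LLM clients do; the option values are illustrative:

```python
# Options passed here take precedence over default_options for this request only.
response = await my_collection.ask(
    "Which LLM should I use?",
    llm_options=LocalLLMOptions(temperature=0.1, max_new_tokens=128),
)
```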