Evaluation
You can evaluate model interactions in LangDB.
Data Collection
To evaluate model interactions, we extract message data from LangDB. This involves:
Fetching all messages from conversation threads using the LangDB API.
Exporting the data into a structured format such as a DataFrame (df) or CSV file.
import os

from pylangdb.client import LangDb

# Initialise the client with credentials from the environment
client = LangDb(
    api_key=os.getenv("LANGDB_API_KEY"),
    project_id=os.getenv("LANGDB_PROJECT_ID"),
)

thread_ids = [..., ..., ...]  # LangDB thread IDs to evaluate
df = client.create_evaluation_df(thread_ids)
Cost Calculation
Once the data is collected, we can compute:
Total cost: the sum of the cost of all interactions.
Average cost: the average cost per message.
print(f"Total cost across all threads: ${df['thread_total_cost'].sum():.4f}")

# Cost broken down by thread
thread_costs = df.groupby('thread_id')['thread_total_cost'].sum()
print(thread_costs)

avg_cost = df['thread_total_cost'].sum() / len(df)
print(f"\nAverage cost per message: ${avg_cost:.4f}")
Custom Evaluations
Beyond cost analysis, the message data lets you derive deeper insights, such as topic distribution and usage trends; a minimal sketch is shown below.
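As a minimal sketch, assuming the evaluation DataFrame exposes the message text in a column named content (the column name and the keyword buckets here are illustrative, not part of the pylangdb API), you could estimate topic distribution with a simple keyword match:

# Hypothetical topic buckets; adjust the keywords to your domain
topics = {
    "billing": ["invoice", "payment", "refund"],
    "support": ["error", "bug", "crash"],
}

def label_topic(text) -> str:
    # Assign the first topic whose keywords appear in the message text
    text = str(text).lower()
    for topic, keywords in topics.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return "other"

# 'content' is an assumed column name for the raw message text
df["topic"] = df["content"].apply(label_topic)
print(df["topic"].value_counts())

A similar groupby over a timestamp column, if your export includes one, would surface cost or volume trends over time.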
For more evaluation examples, check out the full notebook!