You can evaluate model interactions in LangDB.
Data Collection
To evaluate model interactions, we extract message data from LangDB. This involves:
- Fetching all messages from conversation threads using the LangDB API.
- Exporting the data into a structured format such as a DataFrame (df) or CSV file (an export example follows the snippet below).
import os
from pylangdb.client import LangDb

# Initialize the client with credentials from the environment
client = LangDb(
    api_key=os.getenv("LANGDB_API_KEY"),
    project_id=os.getenv("LANGDB_PROJECT_ID"))

thread_ids = [..., ..., ...]  # LangDB thread IDs to evaluate

# Fetch all messages from the threads into a pandas DataFrame
df = client.create_evaluation_df(thread_ids)
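If you want to analyze the data outside Python, the DataFrame can be exported with standard pandas methods (the filename here is just an example):

# Optionally export the evaluation data for use in other tools
df.to_csv("langdb_evaluation.csv", index=False)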
Cost Calculation
Once the data is collected, we can compute:
- Total cost: the sum of the cost of all interactions.
- Average cost: the average cost per message.
# Total spend across every message in the evaluated threads
print(f"Total cost across all threads: ${df['thread_total_cost'].sum():.4f}")

# Cost broken down per thread
thread_costs = df.groupby('thread_id')['thread_total_cost'].sum()

# Average cost per message (each DataFrame row is one message)
avg_cost = df['thread_total_cost'].sum() / len(df)
print(f"\nAverage cost per message: ${avg_cost:.4f}")
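The thread_costs series computed above groups spend by conversation, which makes it easy to spot unusually expensive threads, for example:

# Show the most expensive threads first
print(thread_costs.sort_values(ascending=False).head())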
Custom Evaluations
Beyond cost analysis, the message data lets you derive deeper insights, such as topic distribution and usage trends.
# Analyze topic distribution across the evaluated threads
# (`analyzer` is a topic analyzer set up in the full notebook)
topics = analyzer.get_topic_distribution(thread_ids)
print("\nTopic Distribution Results:")
print(topics)
Example Output:
{
  "topic_distribution": {
    "Programming Languages": 5,
    "Python Concepts": 6,
    "Web Development": 2,
    "Error Handling": 1,
    "Testing": 1,
    "Optimization": 1
  },
  "total_messages": 10
}
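The full notebook wires up the analyzer for you. As a rough sketch of the idea only, a minimal keyword-based analyzer over the evaluation DataFrame might look like the following; the content column name and the keyword map are assumptions for illustration, not part of the LangDB API:

from collections import Counter

# A minimal keyword-based topic analyzer: a sketch, not the LangDB API.
# Assumes each DataFrame row has a `content` column with the message
# text and a `thread_id` column; adjust to your actual schema.
TOPIC_KEYWORDS = {
    "Python Concepts": ["python", "list comprehension", "decorator"],
    "Web Development": ["http", "flask", "django", "endpoint"],
    "Error Handling": ["exception", "try", "traceback"],
}

class SimpleTopicAnalyzer:
    def __init__(self, df):
        self.df = df

    def get_topic_distribution(self, thread_ids):
        # Restrict to the requested threads
        messages = self.df[self.df["thread_id"].isin(thread_ids)]
        counts = Counter()
        for text in messages["content"].fillna(""):
            lowered = text.lower()
            for topic, keywords in TOPIC_KEYWORDS.items():
                if any(kw in lowered for kw in keywords):
                    counts[topic] += 1
        return {
            "topic_distribution": dict(counts),
            "total_messages": len(messages),
        }

analyzer = SimpleTopicAnalyzer(df)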
For more evaluations, check out the full notebook!