Verify Response Feature
The Verify Response feature automatically evaluates the accuracy and quality of AI responses using another LLM agent invocation, providing transparency into response reliability and enabling automatic correction of inaccurate answers.
Overview
When enabled, this feature provides:
- Response Verification: Automatically evaluates the veracity of AI responses
- Quality Grading: Assigns accuracy grades (A, B, C, F) to responses
- Auto-Reprompting: Automatically retries questions when responses fall below quality thresholds
- Transparency: Shows users the verification grade for each response
- Quality Assurance: Helps maintain high standards for AI responses
How It Works
1. Verification Process
For each AI response, the system:
- Generates Initial Response: The LLM provides an answer to the user's question
- Verification Analysis: A separate verification process evaluates the response accuracy using another LLM agent
- Grade Assignment: The response receives a grade (A, B, C, or F)
- Auto-Reprompt Decision: If the grade falls below the configured threshold, the system automatically retries
- Trace Display: The verification grade is shown to users (if traces are enabled)
2. Quality Grades
The system uses a four-tier grading system:
Grade | Classification | Description |
---|---|---|
A | Accurate | The response is factually accurate and complete |
B | Accurate with Stated Assumptions | The response is accurate but contains clearly stated assumptions |
C | Accurate with Unstated Assumptions | The response is accurate but contains unstated assumptions |
F | Inaccurate | The response is inaccurate or contains made-up information |
Configuration
1. Enable at Site Level
In your pika-config.ts
, enable the verify response feature and configure access control:
export const pikaConfig: PikaConfig = {
// ... other configuration
siteFeatures: {
verifyResponse: {
enabled: true,
// Optional: Configure auto-reprompt threshold
autoRepromptThreshold: 'C'
// Optional: Access control
userTypes: ['internal-user', 'external-user'],
userRoles: ['customer-support'],
applyRulesAs: 'or' // User needs userType OR userRole
}
}
};
Setting enabled: true
alone is not sufficient. You must also specify userTypes
or userRoles
to grant access to users. Without access control configuration, the feature will be disabled for all users due to Pika's secure-by-default system.
2. Configuration Options
Property | Type | Description |
---|---|---|
enabled | boolean | Required. Whether to enable the verify response feature (also requires userTypes or userRoles for users to access) |
autoRepromptThreshold | VerifyResponseClassification | Grade threshold for auto-reprompting (B, C, or F) |
userTypes | string[] | User types that can use this feature |
userRoles | PikaUserRole[] | User roles that can use this feature |
applyRulesAs | 'and' | 'or' | How to combine userTypes and userRoles (default: 'and') |
3. Auto-Reprompt Thresholds
Configure when the system should automatically retry:
// Available threshold options
import { AccurateWithStatedAssumptions, AccurateWithUnstatedAssumptions, Inaccurate } from 'pika-shared/types/chatbot/chatbot-types';
// Retry only on inaccurate responses
autoRepromptThreshold: Inaccurate; // 'F'
// Retry on responses with unstated assumptions or worse
autoRepromptThreshold: AccurateWithUnstatedAssumptions; // 'C' (recommended)
// Retry on responses with any assumptions or worse
autoRepromptThreshold: AccurateWithStatedAssumptions; // 'B'
// Note: Grade 'A' (Accurate) cannot be used as a threshold since it's the highest quality
Chat App Level Overrides
Individual chat apps can override the site-level configuration:
// In your chat app definition
const myChatApp: ChatApp = {
chatAppId: 'my-chat-app',
// ... other properties
features: {
verifyResponse: {
featureId: 'verifyResponse',
enabled: true, // Can only disable if site level is enabled
autoRepromptThreshold: Inaccurate, // More lenient than site level
userTypes: ['internal-user'] // More restrictive than site level
// Complete override - must include all properties from site level
}
}
};
Override Rules
- Site level controls availability: If verification is disabled at the site level, chat apps cannot enable it
- Chat apps can only restrict: Chat apps can make access more restrictive but not more permissive
- Complete override required: Chat apps must provide ALL feature settings when overriding
- No merging: Overrides completely replace site-level settings, they do NOT merge
- Independent thresholds: Chat apps can set different auto-reprompt thresholds
User Experience
1. Verification Display
When verification is enabled, users will see:
- Verification badges showing the response grade (A, B, C, F)
- Auto-reprompt notifications when the system retries a question
- Improved response quality due to automatic corrections
2. Trace Integration
When both verify response and traces features are enabled:
- Verification traces appear in the trace display
- Grade explanations provide context for the verification decision
- Auto-reprompt reasoning shows why a response was retried
3. Auto-Reprompt Behavior
When a response falls below the threshold:
- Automatic Retry: The system automatically resends the user's question
- Improved Response: The LLM generates a new corrected response, hopefully better
Implementation Details
1. Verification Process
The verification process works as follows:
// Simplified verification flow
if (features.verifyResponse.enabled) {
// Generate main response
let mainResponse = await invokeAgent(userQuestion);
// Verify the response
let verificationResult = await invokeAgentToVerifyAnswer(userQuestion, mainResponse);
// Check if auto-reprompt is needed
if (shouldAutoReprompt(verificationResult.grade, autoRepromptThreshold)) {
// Generate improved response
mainResponse = await invokeAgent(userQuestion, 'Please provide a more accurate response');
verificationResult = await invokeAgentToVerifyAnswer(userQuestion, mainResponse);
}
// Add verification trace
addVerificationTrace(verificationResult.grade);
return mainResponse;
}
2. Grade Classification
The system uses the VerifyResponseClassification
enum:
export enum VerifyResponseClassification {
Accurate = 'A', // Factually accurate
AccurateWithStatedAssumptions = 'B', // Accurate with stated assumptions
AccurateWithUnstatedAssumptions = 'C', // Accurate with unstated assumptions
Inaccurate = 'F' // Inaccurate or made-up information
}
3. Auto-Reprompt Logic
Auto-reprompting is triggered when:
- The response grade is at or below the configured threshold
- Grades B, C, and F are considered "retryable" (Grade A is not)
- The feature is enabled and the user has appropriate permissions
Performance Considerations
1. Response Time Impact
- Doubled Processing: Each response requires two LLM calls (initial + verification)
- Auto-reprompt Overhead: Poor responses may trigger additional retries
- Selective Enablement: Consider enabling only for critical chat apps
2. Cost Implications
- Increased Token Usage: Verification requires additional tokens
- Retry Costs: Auto-reprompted responses use more compute resources
- Quality vs. Cost: Balance verification benefits against increased costs
3. Optimization Strategies
- Threshold Tuning: Set appropriate thresholds to minimize unnecessary retries
- User-based Enablement: Enable only for users who need high accuracy
- Chat App Targeting: Focus on chat apps where accuracy is most critical
Security Considerations
1. Access Control
- Role-based Access: Only authorized users can use verified responses
- Quality Gating: Ensure verification doesn't expose sensitive information
- Audit Trail: Track verification decisions for quality monitoring
2. Data Privacy
- Verification Data: Ensure verification prompts don't leak sensitive information
- Response Filtering: Verify that verification doesn't expose internal data
- User Data Protection: Maintain privacy during the verification process
Use Cases
1. Customer Support
- Accurate Information: Ensure customers receive factually correct responses
- Compliance: Meet regulatory requirements for information accuracy
- Trust Building: Increase customer confidence in AI responses
2. Healthcare & Finance
- Critical Accuracy: Ensure responses in sensitive domains are verified
- Regulatory Compliance: Meet industry standards for AI-generated content
- Risk Mitigation: Reduce liability from inaccurate information
3. Internal Knowledge Management
- Employee Training: Provide verified information to staff
- Policy Compliance: Ensure AI responses align with company policies
- Quality Assurance: Maintain high standards for internal communications
Troubleshooting
Common Issues
- Verification not working: Check that the feature is enabled at site level
- Auto-reprompt not triggering: Verify threshold configuration and user permissions
- Performance degradation: Consider adjusting thresholds or limiting user access
Debug Steps
- Check site configuration: Verify
siteFeatures.verifyResponse.enabled = true
- Test threshold settings: Try different
autoRepromptThreshold
values - Monitor traces: Enable traces to see verification grades
- Review user permissions: Ensure users have required access
Example: Complete Configuration
export const pikaConfig: PikaConfig = {
// ... other configuration
siteFeatures: {
verifyResponse: {
enabled: true,
autoRepromptThreshold: AccurateWithUnstatedAssumptions, // 'C'
userTypes: ['internal-user'] // Only internal users get verified responses
},
traces: {
enabled: true, // Enable to see verification grades
userTypes: ['internal-user']
}
}
};
Best Practices
1. Threshold Selection
- Start Conservative: Begin with
'F'
(Inaccurate) to catch only the worst responses - Gradually Improve: Move to
'C'
(recommended) for better quality - Monitor Performance: Track the impact on response times and costs
2. User Targeting
- Critical Users First: Enable for users who need the highest accuracy
- Progressive Rollout: Gradually expand to more user types
- Feedback Loop: Collect user feedback on response quality
Related Features
- Traces Feature : Displays verification grades and reasoning
- Content Admin Feature : Allows admins to see verification for other users
- Overriding Features : How to override verification settings per chat app
Need more help? Check the Troubleshooting Guide or review the Customization Guide for advanced configuration options.