Managing chat history in LLM applications means balancing two competing concerns: preserving conversational context and staying within token limits. Every token costs money and consumes context window space, yet losing important context can severely degrade the quality of responses.
The simplest approach is maintaining a fixed window of recent messages. While straightforward, this method risks losing important context from earlier conversations.
```typescript
const manageHistory = (messages: Message[], windowSize: number): Message[] => {
  // Keep only the `windowSize` most recent messages
  return messages.slice(-windowSize);
};
```
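For example, with a window of 10 only the ten most recent messages reach the model (the `conversation` array below is a hypothetical full history):

```typescript
// Everything older than the last 10 messages is silently dropped
const trimmed = manageHistory(conversation, 10);
```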
A more sophisticated approach, which we'll focus on in this course, involves summarizing older messages instead of discarding them:

1. Keep the most recent messages verbatim in an active window.
2. Once the history grows past a threshold, summarize everything older than that window.
3. Inject the summary into the system prompt, so long-range context survives in compressed form.

The process typically works like this:
```typescript
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
  timestamp: number;
}

const manageHistoryWithSummary = async (
  messages: Message[],
  activeWindowSize: number,
  summaryThreshold: number
): Promise<Message[]> => {
  // Below the threshold, the full history still fits comfortably
  if (messages.length < summaryThreshold) {
    return messages;
  }

  // Keep recent messages verbatim
  const recentMessages = messages.slice(-activeWindowSize);

  // Summarize everything older than the active window
  const messagesToSummarize = messages.slice(0, -activeWindowSize);
  const summary = await summarizeMessages(messagesToSummarize);

  // Prepend the summary as system context ahead of the recent messages
  return [
    {
      role: 'system',
      content: `Previous conversation context: ${summary}`,
      timestamp: Date.now(),
    },
    ...recentMessages,
  ];
};
```
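The `summarizeMessages` helper is left undefined above. Here is a minimal sketch assuming the official OpenAI Node SDK; the model name and prompt wording are illustrative choices, not part of the course's API:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Condense a slice of the conversation into a short textual summary
// via a separate LLM call.
const summarizeMessages = async (messages: Message[]): Promise<string> => {
  const transcript = messages
    .map((m) => `${m.role}: ${m.content}`)
    .join('\n');

  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini', // illustrative model choice
    messages: [
      {
        role: 'system',
        content:
          'Summarize the following conversation in a few sentences, preserving key facts, decisions, and open questions.',
      },
      { role: 'user', content: transcript },
    ],
  });

  return response.choices[0].message.content ?? '';
};
```

Note that summarization itself costs tokens and adds latency, so in practice `summaryThreshold` is tuned so the summary call runs only occasionally rather than on every turn.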
An even more advanced variant of the summarization approach maintains multiple levels of context, keeping recent messages verbatim while older conversation lives in progressively more compressed summaries.
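One way to realize this is a layered history: a verbatim active window, a mid-term summary of recently evicted messages, and a long-term summary that the mid-term layer is periodically folded into. The sketch below reuses the `summarizeMessages` helper from above; the tier structure, the `LayeredHistory` shape, and the length budget are illustrative assumptions, not a prescribed design:

```typescript
interface LayeredHistory {
  longTermSummary: string;   // highly compressed, oldest context
  midTermSummary: string;    // summary of messages that recently left the window
  recentMessages: Message[]; // verbatim active window
}

const manageLayeredHistory = async (
  history: LayeredHistory,
  incoming: Message,
  activeWindowSize: number
): Promise<LayeredHistory> => {
  const recentMessages = [...history.recentMessages, incoming];

  // The active window still has room: no summarization needed yet
  if (recentMessages.length <= activeWindowSize) {
    return { ...history, recentMessages };
  }

  // Evict the oldest message from the window and fold it into
  // the mid-term summary
  const [evicted, ...remaining] = recentMessages;
  const midTermSummary = await summarizeMessages([
    { role: 'system', content: history.midTermSummary, timestamp: Date.now() },
    evicted,
  ]);

  // When the mid-term summary outgrows a rough length budget,
  // compress it into the long-term summary and start a fresh mid-term layer
  if (midTermSummary.length > 1000) {
    const longTermSummary = await summarizeMessages([
      { role: 'system', content: history.longTermSummary, timestamp: Date.now() },
      { role: 'system', content: midTermSummary, timestamp: Date.now() },
    ]);
    return { longTermSummary, midTermSummary: '', recentMessages: remaining };
  }

  return { ...history, midTermSummary, recentMessages: remaining };
};
```

When building the actual prompt, the two summaries are prepended as system context ahead of `recentMessages`, mirroring `manageHistoryWithSummary` above.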