OpenAI API Bible Volume 2

Chapter 4: Building a Simple Chatbot with Memory

4.3 Implementing Session-Based Chat Memory

Session-based memory is a crucial feature that transforms a simple chatbot into a truly interactive and context-aware conversational agent. Without memory, each interaction becomes isolated and disconnected, forcing users to repeatedly provide context and limiting the natural flow of conversation. This section explores how to implement robust memory management in your chatbot, enabling it to maintain coherent, contextual discussions across multiple exchanges.

We'll examine two distinct approaches to implementing session memory - using Streamlit's built-in state management and Flask's server-side sessions. Both methods offer their own advantages and can be tailored to meet specific project requirements. By the end of this section, you'll understand how to create a chatbot that can maintain context, remember previous interactions, and provide more meaningful and connected responses.

The implementation we'll cover ensures that your chatbot can:

  • Maintain contextual awareness throughout entire conversations
  • Handle complex multi-turn dialogues effectively
  • Provide more relevant and personalized responses based on conversation history
  • Manage memory efficiently without exceeding token limits
✅ The goal: give your assistant memory, so it doesn't lose the thread of the conversation between one message and the next.

4.3.1 Why Does Session Memory Matter?

Session memory is a fundamental aspect of creating intelligent chatbots that can engage in meaningful, context-aware conversations. Just like humans rely on memory during discussions, chatbots need a way to remember and process previous interactions. When we talk with others, we naturally reference earlier points, build on shared understanding, and maintain a coherent flow of ideas. This natural communication pattern is what makes conversations feel organic and meaningful. Without memory capabilities, chatbots are essentially starting fresh with each response, leading to disconnected and often frustrating interactions that feel more like talking to a machine than having a real conversation.

The concept of session memory fundamentally transforms chatbot interactions in several critical ways:

  • Remember and reference previous parts of the conversation with precision
    • Track specific details mentioned earlier in the chat, such as user preferences, technical specifications, or personal information shared
    • Reference past agreements or decisions accurately, ensuring continuity in complex discussions or negotiations
    • Maintain historical context for better problem-solving and support
  • Build context incrementally throughout an interaction
    • Understand complex topics that unfold over multiple messages, allowing for deeper exploration of subjects
    • Develop more sophisticated responses as the conversation progresses, building upon previously established concepts
    • Create a coherent narrative thread across multiple exchanges
  • Provide more nuanced and relevant responses based on the conversation history
    • Tailor answers to the user's demonstrated knowledge level, adjusting terminology and complexity accordingly
    • Avoid repeating information already discussed, making conversations more efficient
    • Use past interactions to provide more personalized and contextually appropriate responses
  • Create a more natural, human-like conversational experience
    • Maintain consistent personality and tone throughout the chat, enhancing user engagement
    • Adapt responses based on user preferences and past interactions, creating a more personalized experience
    • Learn from previous exchanges to improve the quality of future interactions

In real conversations, context builds over time in multiple sophisticated ways that mirror natural human dialogue patterns:

  • The user refers to past messages
    • Questions naturally build upon previous answers, creating a continuous thread of understanding
    • Topics evolve organically as users reference and expand on earlier points in the conversation
    • Previous context shapes how new information is interpreted and understood
  • The assistant needs to remember previous answers
    • Maintains consistency across responses to build trust and reliability
    • Uses established context to provide more nuanced and relevant information
    • Builds a comprehensive understanding of the user's needs over time
  • Follow-up questions depend on prior knowledge
    • Each question builds upon the foundation of previous exchanges
    • Complex topics can be explored gradually, with increasing depth
    • The conversation naturally progresses from basic concepts to more advanced understanding
    • Creates more meaningful dialogue chains by connecting related ideas
    • Enables natural conversation flow that feels more human-like and engaging

Without memory implementation, chatbots treat each interaction as an isolated event, completely disconnected from previous exchanges. This fundamental limitation affects even sophisticated models like GPT-4o in several critical ways:

  1. Loss of Context: Each response is generated without any awareness of previous conversations, making it impossible to maintain coherent, extended discussions.
  2. Repetitive Interactions: The chatbot may provide the same information multiple times or ask for details that were already shared, creating a frustrating user experience.
  3. Inconsistent Responses: Without access to previous exchanges, the chatbot might give contradictory answers to related questions, undermining user trust.
  4. Limited Understanding: The inability to reference past context means the chatbot cannot build upon previously established knowledge or adapt its responses based on the user's demonstrated understanding.
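
To make the difference concrete, here is a minimal sketch in plain Python with no API call. The helper names (stateless_request, stateful_request) and the sample messages are illustrative assumptions, not part of any library; they only show which messages would be sent to the model in each case.

```python
SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

def stateless_request(user_input):
    """Build the message list a memoryless bot would send: only the current turn."""
    return [SYSTEM_MESSAGE, {"role": "user", "content": user_input}]

def stateful_request(history, user_input):
    """Append the new user turn to the running history and return the full list to send."""
    history.append({"role": "user", "content": user_input})
    return list(history)

# First turn of a conversation that keeps its history
history = [SYSTEM_MESSAGE]
stateful_request(history, "My name is Ada.")
history.append({"role": "assistant", "content": "Nice to meet you, Ada!"})

# Second turn: the memoryless request has lost the name, the stateful one still carries it
without_memory = stateless_request("What is my name?")
with_memory = stateful_request(history, "What is my name?")

print(len(without_memory))  # 2
print(len(with_memory))     # 4
```

The model can only answer the second question correctly if the earlier turn is part of the request, which is what the rest of this section sets up.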

By the end of this section, you'll know how to:

  • Store and retrieve messages for a given session
    • Keep the conversation in st.session_state (Streamlit) or in a server-side session (Flask)
    • Represent each message as a dictionary with a role and its content
  • Maintain and update multi-turn memory
    • Append every user and assistant message to the history in order
    • Send the accumulated history with each request so the model sees the whole conversation
  • Avoid bloated token usage by capping history length
    • Trim the oldest messages while always preserving the system message
    • Balance context retention against the token cost and latency of long histories

4.3.2 Streamlit: Session State Memory

Streamlit makes managing conversation history simple with st.session_state, which keeps data across script reruns for the duration of a user's session.

Here's how you can create a chatbot with session memory using Streamlit:

Step 1: Import Libraries

import streamlit as st
import openai
import os
from dotenv import load_dotenv
  • streamlit: Used to create the user interface for the chatbot.
  • openai: The OpenAI Python library, used to interact with the GPT-4o model.
  • os: Provides a way to interact with the operating system, for example to access environment variables.
  • dotenv: Used to load environment variables from a .env file.

Step 2: Load API Key and Configure Page

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

st.set_page_config(page_title="GPT-4o Chat with Memory", page_icon="🧠")
st.title("🧠 GPT-4o Chatbot with Session Memory")
  • load_dotenv(): Loads the OpenAI API key from a .env file. This file should be in the same directory as your Python script and contain the line OPENAI_API_KEY=YOUR_API_KEY.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Sets the OpenAI API key.
  • st.set_page_config(...): Configures the page title and icon that appear in the browser tab.
  • st.title(...): Sets the title of the Streamlit application, which is displayed at the top of the page.

Step 3: Initialize Session State

if "messages" not in st.session_state:
    st.session_state.messages = [{"role": "system", "content": "You are a helpful assistant that remembers this session."}]
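  • st.session_state: Streamlit's way of storing variables across user interactions. Streamlit reruns the whole script on every interaction, and values in st.session_state survive those reruns for as long as the session lasts. Reloading the page starts a new session with an empty state.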
  • This code checks if the key "messages" exists in st.session_state. If it doesn't (which is the case when the user first loads the app), it initializes "messages" to a list containing a single dictionary.
  • This dictionary represents the "system message," which is used to set the behavior of the assistant. In this case, the system message tells the assistant to be helpful and remember the conversation.
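
The initialization pattern can be tried outside Streamlit. The sketch below uses a plain dictionary as a stand-in for st.session_state (an assumption made so the snippet runs without Streamlit), and ensure_history is a hypothetical helper name. It shows that the check makes initialization safe to repeat on every rerun.

```python
def ensure_history(state, system_prompt):
    """Create the message list on first use; leave it untouched on later reruns."""
    if "messages" not in state:
        state["messages"] = [{"role": "system", "content": system_prompt}]
    return state["messages"]

PROMPT = "You are a helpful assistant that remembers this session."

state = {}  # stand-in for st.session_state
ensure_history(state, PROMPT)
state["messages"].append({"role": "user", "content": "Hello"})

# Simulate a rerun: the existing history must not be reset
ensure_history(state, PROMPT)
print(len(state["messages"]))  # 2
```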

Step 4: Display Chat History

for msg in st.session_state.messages[1:]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])
  • This for loop iterates through the messages in the "messages" list, starting from the second message (index 1) to skip the system message.
  • st.chat_message(msg["role"]): Creates a chat bubble in the Streamlit app to display the message. The role ("user" or "assistant") determines the appearance of the bubble.
  • st.markdown(msg["content"]): Displays the content of the message within the chat bubble. st.markdown is used to render the text.

Step 5: User Input and Response Generation

user_input = st.chat_input("Say something...")
if user_input:
    st.chat_message("user").markdown(user_input)
    st.session_state.messages.append({"role": "user", "content": user_input})

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            # openai>=1.0 interface; the legacy openai.ChatCompletion was removed in 1.0
            response = openai.chat.completions.create(
                model="gpt-4o",
                messages=st.session_state.messages,
                temperature=0.6
            )
            reply = response.choices[0].message.content
            st.markdown(reply)
            st.session_state.messages.append({"role": "assistant", "content": reply})
  • user_input = st.chat_input("Say something..."): Creates a text input field at the bottom of the app where the user can type their message. The text "Say something..." is shown as a placeholder inside the input field.
  • The if user_input: block is executed when the user enters text and presses Enter.
  • st.chat_message("user").markdown(user_input): Displays the user's message in the chat interface.
  • st.session_state.messages.append({"role": "user", "content": user_input}): Appends the user's message to the "messages" list in st.session_state, so it's stored in the conversation history.
  • with st.chat_message("assistant"):: Creates a chat bubble for the assistant's response.
  • with st.spinner("Thinking..."): Displays a spinner animation while the app is waiting for a response from the OpenAI API.
  • response = openai.chat.completions.create(...): Calls the OpenAI API to get a response from the GPT-4o model. This is the interface of version 1.0 and later of the openai library; the older openai.ChatCompletion.create call no longer works there.
    • model: Specifies the language model to use ("gpt-4o").
    • messages: Passes the entire conversation history (stored in st.session_state.messages) to the API. This is how the model "remembers" the conversation.
    • temperature: A value between 0 and 2 that controls the randomness of the model's output. A lower value (e.g., 0.2) makes the output more deterministic, while a higher value (e.g., 0.8) makes it more random.
  • reply = response.choices[0].message.content: Extracts the assistant's reply from the API response.
  • st.markdown(reply): Displays the assistant's reply in the chat interface.
  • st.session_state.messages.append({"role": "assistant", "content": reply}): Appends the assistant's reply to the "messages" list in st.session_state.

This example creates a simple chatbot with memory using Streamlit. The st.session_state.messages list stores the conversation history, allowing the chatbot to maintain context across multiple interactions.  The chat history is displayed in the app, and the user can input messages using the st.chat_input field.  The assistant's responses are generated by the OpenAI GPT-4o model.

4.3.3 Flask: Using Server-Side Sessions

In Flask, the session object behaves like a per-user dictionary that persists between requests, which makes it a natural place to keep conversation history. Flask's built-in session stores its data in a cookie that is cryptographically signed but not encrypted, so the contents live in the user's browser, can be read by the user, and are limited by the cookie size. The Flask-Session extension moves the data to the server (for example the filesystem or Redis) and leaves only a session identifier in the cookie, which is more secure and leaves room for longer conversations.

This server-side approach ensures that the chatbot can maintain context and remember previous interactions throughout the user's active session, even if they navigate between different pages or refresh the browser.

Here's how to implement a chatbot with server-side sessions in Flask:

Step 1: Install Required Libraries

pip install flask openai flask-session python-dotenv
  • flask: A web framework.
  • openai: The OpenAI Python library.
  • flask-session: Flask extension to handle server-side sessions.
  • python-dotenv: To load environment variables from a .env file.

Step 2: Update app.py

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
from flask_session import Session

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

app = Flask(__name__)
app.secret_key = os.urandom(24)
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

@app.route("/", methods=["GET", "POST"])
def chat():
    if "history" not in session:
        session["history"] = [
            {"role": "system", "content": "You are a helpful assistant."}
        ]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session["history"].append({"role": "user", "content": user_input})

        # openai>=1.0 interface; the legacy openai.ChatCompletion was removed in 1.0
        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        session["history"].append({"role": "assistant", "content": assistant_reply})
        # In-place list changes are not detected automatically, so flag the session as changed
        session.modified = True

    # Render the conversation for both GET and POST, skipping the system prompt
    return render_template("chat.html", history=session["history"][1:])
  • from flask import ...: Imports necessary Flask components, including session for managing user sessions.
  • import openai: Imports the OpenAI library.
  • import os: Imports the os module for interacting with the operating system, particularly for accessing environment variables.
  • from dotenv import load_dotenv: Imports the load_dotenv function from the python-dotenv library.
  • from flask_session import Session: Imports the Session class from flask_session.
  • load_dotenv(): Loads environment variables (like the OpenAI API key) from a .env file.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Retrieves the OpenAI API key from the environment and sets it for the OpenAI library.
  • app = Flask(__name__): Creates a Flask application instance.
  • app.secret_key = os.urandom(24): Sets a secret key for the Flask application. This is essential for using Flask sessions. os.urandom(24) generates a cryptographically secure random key, but a new one on every start, so anything signed with it (such as Flask's default cookie-based sessions) is invalidated when the app restarts. In production, load a fixed key from the environment instead.
  • app.config["SESSION_TYPE"] = "filesystem": Configures Flask-Session to store session data on the server's file system. Other options like "redis" or "mongodb" are available for production use.
  • Session(app): Initializes the Flask-Session extension, binding it to the Flask app.
  • @app.route("/", methods=["GET", "POST"]): Defines the route for the application's main page ("/"). The chat() function handles both GET and POST requests.
  • def chat()::
    • if "history" not in session:: Checks if the user's session already has a conversation history. If not, it initializes the session with a system message. The system message helps set the behavior of the assistant.
    • if request.method == "POST":: Handles POST requests, which occur when the user submits a message through the chat form.
      • user_input = request.form["user_input"]: Retrieves the user's input from the form.
      • session["history"].append({"role": "user", "content": user_input}): Appends the user's message to the conversation history stored in the session.
      • response = openai.chat.completions.create(...): Calls the OpenAI API (openai library version 1.0 or later) to get a response from the GPT-4o model, passing the conversation history.
      • assistant_reply = response.choices[0].message.content: Extracts the assistant's reply from the API response.
      • session["history"].append({"role": "assistant", "content": assistant_reply}): Appends the assistant's reply to the conversation history in the session.
      • session.modified = True: Tells Flask that the session changed. Appending to a list inside the session mutates it in place, which Flask does not detect on its own.
    • return render_template("chat.html", history=session["history"][1:]): Runs for both GET requests (when the user first loads the page) and POST requests. It renders the chat.html template, passing the conversation history (excluding the initial system message) to be displayed.
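
Flask's documentation notes that changes to mutable structures inside the session are not picked up automatically, which is why a view that appends to session["history"] should also set session.modified = True. The sketch below illustrates the reason with a simplified stand-in class. TrackingSession is hypothetical and is not Flask's actual implementation; it only mimics the idea that a session notices top-level assignment but not in-place changes to a nested list.

```python
class TrackingSession(dict):
    """Simplified stand-in for a session object: only top-level assignment sets `modified`."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.modified = False

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.modified = True

session = TrackingSession()
session["history"] = [{"role": "system", "content": "You are a helpful assistant."}]
session.modified = False  # pretend the request ended and the session was saved

# Next request: append mutates the list in place and never calls __setitem__ on the session
session["history"].append({"role": "user", "content": "Hi"})
flag_after_append = session.modified
print(flag_after_append)  # False

# Setting the flag explicitly is what signals that the session must be saved
session.modified = True
```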

Step 3: Your HTML Template (templates/chat.html)

<!DOCTYPE html>
<html>
<head>
  <title>GPT-4o Assistant</title>
  <style>
    body { font-family: Arial; background: #f7f7f7; padding: 40px; }
    .container { max-width: 600px; margin: auto; background: white; padding: 20px; border-radius: 10px; }
    .user, .assistant { margin-bottom: 15px; }
    .user p { background: #d4f0ff; padding: 10px; border-radius: 10px; }
    .assistant p { background: #e8ffe8; padding: 10px; border-radius: 10px; }
    textarea { width: 100%; height: 80px; }
    input[type="submit"] { margin-top: 10px; padding: 10px 20px; }
  </style>
</head>
<body>
  <div class="container">
    <h2>GPT-4o Chatbot</h2>
    {% for msg in history %}
      <div class="{{ msg.role }}">
        <p><strong>{{ msg.role.capitalize() }}:</strong> {{ msg.content }}</p>
      </div>
    {% endfor %}
    <form method="post">
      <textarea name="user_input" placeholder="Type your message..."></textarea><br>
      <input type="submit" value="Send">
    </form>
  </div>
</body>
</html>
  • <!DOCTYPE html>: Declares the document type as HTML5.
  • <html>: The root element of the HTML document.
  • <head>: Contains metadata about the HTML document.
    • <title>: Specifies the title of the HTML page.
    • <style>: Includes CSS for basic styling of the chat interface.
  • <body>: Contains the visible content of the HTML page.
    • <div class="container">: A container for the chat application.
    • <h2>: A heading for the chat application.
    • {% for msg in history %}: A Jinja2 template loop that iterates through the history variable (passed from the Flask code) to display the chat messages.
      • <div class="{{ msg.role }}">: Creates a div element for each message. The class is set to the message's role ("user" or "assistant") for styling.
      • <p>: Displays the message content.
      • <strong>: Displays the role.
    • <form method="post">: A form for the user to submit their messages.
      • <textarea>: A multi-line text input field for the user to type their message.
      • <input type="submit" value="Send">: A button to send the message.
  • templates: Flask, by default, looks for HTML templates in a folder named "templates" in the same directory as your app.py file. So, this file should be saved as templates/chat.html.

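This code creates a chatbot using Flask and OpenAI, with the conversation history stored in server-side sessions. The history survives page refreshes and navigation for as long as the session remains valid. Closing the browser tab does not clear it: with Flask-Session's default settings the session is permanent and expires only after Flask's PERMANENT_SESSION_LIFETIME (31 days by default).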

4.3.4 Optional: Cap the Message History

Large language models have a limited context window (e.g., 128k tokens for GPT-4o), which determines how much of the conversation the model can "remember" at once. To prevent exceeding this limit and encountering errors, you should cap the number of messages stored in the conversation history.

Here's how to trim older entries in both Streamlit and Flask:

Streamlit

import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Key used for the history in st.session_state
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

if SESSION_MESSAGES_KEY in st.session_state:
    messages = st.session_state[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        st.session_state[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • SESSION_MESSAGES_KEY: The key under which the history is stored, "messages" in the Streamlit example above.
  • MAX_HISTORY: A constant defining the maximum number of messages to retain. This example keeps at most 20 messages in total, counting the system message.
  • if SESSION_MESSAGES_KEY in st.session_state: Checks that the history exists before trimming it.
  • The code then trims the st.session_state.messages list, preserving the first message (the system message) and the last MAX_HISTORY - 1 messages.

Flask

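from flask import session

SESSION_MESSAGES_KEY = "history"  # Key used for the history in the Flask session
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

# Run this inside a request handler, for example right after appending the assistant's reply
if SESSION_MESSAGES_KEY in session:
    messages = session[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        session[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • SESSION_MESSAGES_KEY: The key under which the history is stored, "history" in the Flask example above.
  • MAX_HISTORY: A constant defining the maximum number of messages to retain, counting the system message.
  • if SESSION_MESSAGES_KEY in session: Checks that the history exists before trimming it.
  • The code trims the session["history"] list, keeping the first message (the system message) and the last MAX_HISTORY - 1 messages.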
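
Since the trimming logic is the same in both frameworks, it can also be written once as a plain function. The following is a minimal sketch under that assumption; trim_history is a hypothetical helper name, and it expects the system message to be the first item in the list.

```python
def trim_history(messages, max_history=20):
    """Return at most max_history messages: the first (system) message plus the most recent ones."""
    if max_history < 1:
        raise ValueError("max_history must be at least 1")
    if len(messages) <= max_history:
        return list(messages)
    recent_count = max_history - 1
    # Guard the zero case: messages[-0:] would return the whole list
    recent = messages[-recent_count:] if recent_count > 0 else []
    return [messages[0]] + recent

# Build a history with one system message and 30 conversation messages
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(30):
    role = "user" if i % 2 == 0 else "assistant"
    history.append({"role": role, "content": f"message {i}"})

trimmed = trim_history(history, max_history=20)
print(len(trimmed))            # 20
print(trimmed[0]["role"])      # system
print(trimmed[-1]["content"])  # message 29
```

In Streamlit the result would be assigned back to st.session_state.messages, and in Flask to session["history"].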

Important Implementation Considerations:

  • System Message Management: The system message plays a crucial role in setting the chatbot's behavior and context. It must always be preserved as the first message in your conversation history. When implementing message trimming, ensure your code specifically maintains this message by:
    • Keeping it separate from the regular conversation flow
    • Including special handling in your trimming logic
    • Verifying its presence before each interaction
  • Comprehensive Testing Protocol: To ensure reliable chatbot performance:
    • Test with varying conversation lengths, from short exchanges to extended dialogues
    • Verify that context is maintained even after trimming
    • Check for potential edge cases where coherence might break
    • Monitor system resource usage during extended conversations
  • Advanced Trimming Strategies: Consider these sophisticated approaches:
    • Token-based trimming: Calculate actual token usage using a tokenizer
    • Importance-based trimming: Keep messages based on relevance
    • Hybrid approach: Combine token counting with message relevance
    • Dynamic adjustment: Modify trim threshold based on conversation complexity
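
As a rough illustration of token-based trimming, the sketch below keeps the system message and then adds messages from newest to oldest until a token budget is used up. The token count here is only a heuristic assumption (about four characters per token plus a small per-message overhead); for real budgets you would pass an actual tokenizer-based counter as count_tokens. Both function names are hypothetical.

```python
def estimate_tokens(message):
    """Rough estimate (assumption): ~4 characters per token plus a small per-message overhead."""
    return len(message["content"]) // 4 + 4

def trim_to_token_budget(messages, max_tokens, count_tokens=estimate_tokens):
    """Keep the system message and as many of the newest messages as fit in max_tokens."""
    system, rest = messages[0], messages[1:]
    budget = max_tokens - count_tokens(system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return [system] + kept[::-1]  # restore chronological order

history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": "x" * 400} for _ in range(10)]

trimmed = trim_to_token_budget(history, max_tokens=350)
print(len(trimmed))
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with gaps in the middle.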

4.3.5 Bonus Tip: Save to File or Database

Want to persist memory even after the session ends? Let's explore several effective methods for long-term memory storage:

1. Export conversation history to a JSON file

Process: The st.session_state.messages (in Streamlit) or session["history"] (in Flask) data can be saved to a JSON file. This involves converting the list of message dictionaries into a JSON string and writing it to a file.

Code Example (Streamlit):

import json
import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Key used for the history in st.session_state

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from st.session_state to a JSON file."""
    if SESSION_MESSAGES_KEY in st.session_state:
        try:
            with open(filename, "w", encoding="utf-8") as f:
                json.dump(st.session_state[SESSION_MESSAGES_KEY], f, indent=2)
            st.success(f"Chat log saved to {filename}")
        except OSError as e:
            st.error(f"Error saving chat log: {e}")

# Example usage: call this function when the user ends the session or when appropriate
if st.button("Save Chat Log"):
    save_chat_log()

Code Example (Flask):

import json
import os
from flask import Flask, session

SESSION_MESSAGES_KEY = "history"  # Key used for the history in the Flask session

# In a real project, reuse the app object from app.py; a secret key is needed to read the session
app = Flask(__name__)
app.secret_key = os.urandom(24)

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from the Flask session to a JSON file. Returns True on success."""
    if SESSION_MESSAGES_KEY not in session:
        return False
    try:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(session[SESSION_MESSAGES_KEY], f, indent=2)
        return True
    except OSError as e:
        app.logger.error(f"Error saving chat log: {e}")
        return False

# Example usage
@app.route('/save_log')
def save_log():
    if save_chat_log():
        return "Chat log saved!"
    return "No chat log was saved."

Explanation:

  • The json.dump() function serializes the list of messages as JSON and writes it directly to the open file. The indent=2 parameter makes the JSON file more human-readable.
  • The code handles potential errors during file writing.
  • The Streamlit example uses a button to trigger the save, and the Flask example creates a route /save_log to save the file.
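
Saving is only half of the round trip. The sketch below shows one possible way to load a saved log back into a message list, so that it could be assigned to st.session_state.messages or session["history"] at the start of a new session. load_chat_log is a hypothetical helper, and falling back to a fresh history with a default system prompt is an assumption about the desired behavior.

```python
import json
import os
import tempfile

def load_chat_log(filename, default_system_prompt="You are a helpful assistant."):
    """Load a saved message list, or start a fresh history if the file is missing or invalid."""
    try:
        with open(filename, "r", encoding="utf-8") as f:
            messages = json.load(f)
        if isinstance(messages, list) and messages:
            return messages
    except (OSError, json.JSONDecodeError):
        pass
    return [{"role": "system", "content": default_system_prompt}]

# Round trip through a temporary file
saved = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]
path = os.path.join(tempfile.mkdtemp(), "chat_log.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(saved, f, indent=2)

restored = load_chat_log(path)               # the saved conversation
fresh = load_chat_log(path + ".missing")     # no file: a new history with only the system message
```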

Benefits:

  • Simple to implement using Python's built-in json module.
  • Good for small-scale applications and quick prototypes.
  • Easy to backup and version control.

Drawbacks:

  • Not ideal for large-scale, multi-user applications.
  • No efficient querying or indexing.

2. Store conversations in a SQLite, PostgreSQL, or NoSQL database

Process: Store the conversation history in a database. Each message can be a row in a table (for SQL databases) or a document in a collection (for NoSQL databases).

Code Example (SQLite - Streamlit):

import streamlit as st
import sqlite3

def get_connection():
    """Gets or creates a SQLite connection stored in the session state."""
    conn = st.session_state.get("sqlite_conn")
    if conn is None:
        # Streamlit can rerun the script on a different thread, so disable the same-thread check
        conn = sqlite3.connect("chat_log.db", check_same_thread=False)
        st.session_state.sqlite_conn = conn
    return conn

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            role TEXT NOT NULL,
            content TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def store_message(role, content):
    """Stores a message in the database."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO messages (role, content) VALUES (?, ?)",
        (role, content),
    )
    conn.commit()

create_table()  # Ensure table exists

# Store messages (replace user_input and reply with the session state keys you are using)
if st.session_state.get("user_input"):
    store_message("user", st.session_state.user_input)
if st.session_state.get("reply"):
    store_message("assistant", st.session_state.reply)

# Example of retrieving messages (optional, for demonstration)
conn = get_connection()
cursor = conn.cursor()
# Order by id: created_at only has one-second resolution, so it can tie
cursor.execute("SELECT role, content, created_at FROM messages ORDER BY id DESC LIMIT 5")
recent_messages = cursor.fetchall()
st.write("Last 5 Messages from DB:")
for row in recent_messages:
    st.write(f"{row[0]}: {row[1]} (at {row[2]})")

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
import psycopg2  # PostgreSQL library
import datetime
from typing import List, Dict

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL") #Make sure to set this in .env

app = Flask(__name__)
app.secret_key = os.urandom(24)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    session_id TEXT NOT NULL,
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(session_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (%s, %s, %s)",
                (session_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()  # Ensure the table exists

@app.route("/", methods=["GET", "POST"])
def chat():
    if "session_id" not in session:
        session["session_id"] = os.urandom(16).hex()  # Unique session ID

    if "history" not in session:
        session["history"] = [{"role": "system", "content": "You are a helpful assistant."}]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session_id = session["session_id"]
        store_message(session_id, "user", user_input)  # Store in DB
        session["history"].append({"role": "user", "content": user_input})

        # Read this session's messages back from the database (for demonstration;
        # the API call and the template below still use the history kept in the session)
        messages_from_db = []
        conn = get_db_connection()
        if conn:
            try:
                cursor = conn.cursor()
                cursor.execute("SELECT role, content, created_at FROM messages WHERE session_id = %s ORDER BY created_at, id", (session_id,))
                messages_from_db = cursor.fetchall()
            except psycopg2.Error as e:
                print(f"Error fetching messages: {e}")
            finally:
                conn.close()

        # openai>=1.0 interface; the legacy openai.ChatCompletion was removed in 1.0
        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        store_message(session_id, "assistant", assistant_reply)  # Store in DB
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True

        return render_template("chat.html", history=session.get("history")[1:])
    return render_template("chat.html", history=session.get("history")[1:])

Explanation:

  • The Streamlit example uses SQLite, a lightweight database that doesn't require a separate server. The Flask example uses PostgreSQL, a more robust database that is suitable for multi-user applications.
  • Both examples create a table named "messages" to store the conversation history. The table includes columns for message ID, role, content, and timestamp; the PostgreSQL table also has a session_id column so that messages from different sessions can be told apart.
  • The store_message() function inserts a new message into the database.
  • The Flask example reads the session's messages back from the database into messages_from_db, although it still sends and renders the history kept in the session. The Streamlit example also shows how to retrieve data.
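
To use stored rows as chat memory, they have to be turned back into the list of role/content dictionaries that the API expects. The following is a minimal sketch of that step. It uses an in-memory SQLite database so that it runs without a server, and a simplified table with a session_id column; load_history and the sample session IDs are illustrative assumptions. With PostgreSQL the same idea applies, with %s placeholders instead of ?.

```python
import sqlite3

def create_table(conn):
    """Create a simplified messages table (no timestamp column, to keep the sketch short)."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL,
            role TEXT NOT NULL,
            content TEXT NOT NULL
        )
    """)

def store_message(conn, session_id, role, content):
    """Insert one message for the given session."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()

def load_history(conn, session_id):
    """Return the session's messages in insertion order, as role/content dictionaries."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]

conn = sqlite3.connect(":memory:")
create_table(conn)
store_message(conn, "session-a", "user", "Hello")
store_message(conn, "session-a", "assistant", "Hi! How can I help?")
store_message(conn, "session-b", "user", "Unrelated chat")

history_a = load_history(conn, "session-a")
print(history_a)
```

Filtering by session_id keeps conversations separate, and ordering by the auto-incrementing id avoids ties between messages stored within the same second.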

Benefits:

  • SQLite: Perfect for single-user applications with structured data. No separate database server is needed.
  • PostgreSQL: Ideal for multi-user systems requiring concurrent access. More robust and scalable than SQLite.
  • NoSQL (Not shown in detail): Best for flexible schema and unstructured conversation data. Databases like MongoDB or CouchDB would be suitable.

Drawbacks:

  • Requires more setup than using a JSON file.
  • Need to manage database connections and schemas.

3. Reuse memory in future sessions by tagging with a user ID

Process: To allow users to have persistent, personalized conversations, you can tag each message with a user ID and store this information in the database. When a user returns, you can retrieve their specific conversation history from the database.

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session, redirect, url_for
import openai
import os
from dotenv import load_dotenv
import psycopg2
import datetime
from typing import List, Dict

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")

app = Flask(__name__)
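# Use a fixed key so session cookies (and the user ID stored in them) survive server restarts.
# FLASK_SECRET_KEY is an assumed variable name; the random fallback is for local testing only.
app.secret_key = os.getenv("FLASK_SECRET_KEY") or os.urandom(24)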

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    user_id TEXT NOT NULL,  -- Added user_id
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(user_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (%s, %s, %s)",
                (user_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()

def get_user_history(user_id: str) -> List[Dict[str, str]]:
    """Retrieves a user's conversation history from the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "SELECT role, content FROM messages WHERE user_id = %s ORDER BY created_at",
                (user_id,),
            )
            history = [{"role": row[0], "content": row[1]} for row in cursor.fetchall()]
            return history
        except psycopg2.Error as e:
            print(f"Error retrieving user history: {e}")
            return []
        finally:
            conn.close()
    return []

# The system message is not stored in the database, so it is defined once here
SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

@app.route("/", methods=["GET", "POST"])
def chat():
    if "user_id" not in session:
        session["user_id"] = os.urandom(16).hex()  # Unique user ID
        session.permanent = True  # Keep the cookie after the browser closes (31 days by default)
    user_id = session["user_id"]

    history = get_user_history(user_id)  # Get user's history

    if request.method == "POST":
        user_input = request.form["user_input"]
        store_message(user_id, "user", user_input)  # Store with user ID
        history.append({"role": "user", "content": user_input})

        # Prepend the system message to the stored turns (openai>=1.0 interface)
        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=[SYSTEM_MESSAGE] + history,
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        store_message(user_id, "assistant", assistant_reply)  # Store with user ID
        history.append({"role": "assistant", "content": assistant_reply})

    # The database is the source of truth, so the history is not copied into the session.
    # It contains no system message, so it is rendered in full.
    return render_template("chat.html", history=history)

@app.route("/clear", methods=["POST"])
def clear_chat():
    # Dropping the user ID starts a fresh conversation; the old rows stay in the database
    session.pop("user_id", None)
    return redirect(url_for("chat"))

Explanation:

  • The database table "messages" now includes a "user_id" column.
  • When a user starts a session, a unique "user_id" is generated and stored in the Flask session.
  • The store_message() function now requires a "user_id" and stores it along with the message.
  • The get_user_history() function retrieves the conversation history for a specific user from the database.
  • The chat route retrieves user history and uses it to construct the messages sent to OpenAI, thus maintaining conversation history across multiple visits from the same user.

Benefits:

  • Enables personalized conversation history for each user.
  • Allows for user-specific context and preferences.
  • Facilitates analysis of conversation patterns over time.

Drawbacks:

  • Requires a database.
  • More complex to implement than simple session-based storage.
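
Loading a returning user's entire history can eventually exceed the model's context window, so one option is to fetch only the most recent messages. The sketch below uses Python's built-in sqlite3 with an in-memory database so that it runs without a database server. This is a stand-in for the PostgreSQL setup above; with psycopg2 the placeholders would be %s instead of ?. The function name get_recent_history and its limit parameter are illustrative assumptions.

```python
import sqlite3
from typing import Dict, List


def get_recent_history(conn: sqlite3.Connection, user_id: str, limit: int = 20) -> List[Dict[str, str]]:
    """Return up to `limit` of the user's most recent messages, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
    # Rows arrive newest first; reverse them so the model sees them chronologically
    return [{"role": role, "content": content} for role, content in reversed(rows)]


# Demo data in an in-memory database (nothing is written to disk)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "user_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
)
for i in range(5):
    conn.execute(
        "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
        ("alice", "user", f"message {i}"),
    )
conn.execute(
    "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
    ("bob", "user", "other"),
)
conn.commit()

recent = get_recent_history(conn, "alice", limit=3)
```

Ordering by the auto-incrementing id rather than by timestamp avoids ties when two messages are stored within the same second.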

In this section, you built the core pieces of a chatbot with memory. You:

  • Added session-based memory to your chatbot using st.session_state (Streamlit) and flask.session (Flask)
    • Implemented temporary storage for ongoing conversations
    • Learned how to manage session variables effectively
  • Preserved chat history across interactions
    • Created database schemas to store conversation data
    • Implemented methods to save and retrieve past messages
  • Improved context and coherence for multi-turn conversations
    • Developed systems to maintain conversation context
    • Enhanced natural language understanding through historical context
  • Learned to cap token usage by trimming message history
    • Implemented efficient message pruning strategies
    • Balanced memory retention with API token limitations

This gives your chatbot the ability to hold natural, flowing conversations — a key milestone toward building an intelligent assistant. With these features, your chatbot can now remember previous interactions, maintain context throughout conversations, and manage memory efficiently while staying within technical constraints.

4.3 Implementing Session-Based Chat Memory

Session-based memory is a crucial feature that transforms a simple chatbot into a truly interactive and context-aware conversational agent. Without memory, each interaction becomes isolated and disconnected, forcing users to repeatedly provide context and limiting the natural flow of conversation. This section explores how to implement robust memory management in your chatbot, enabling it to maintain coherent, contextual discussions across multiple exchanges.

We'll examine two distinct approaches to implementing session memory - using Streamlit's built-in state management and Flask's server-side sessions. Both methods offer their own advantages and can be tailored to meet specific project requirements. By the end of this section, you'll understand how to create a chatbot that can maintain context, remember previous interactions, and provide more meaningful and connected responses.

The implementation we'll cover ensures that your chatbot can:

  • Maintain contextual awareness throughout entire conversations
  • Handle complex multi-turn dialogues effectively
  • Provide more relevant and personalized responses based on conversation history
  • Manage memory efficiently without exceeding token limits
✅ Giving your assistant memory — so it doesn’t forget the conversation flow every time the page refreshes or the session resets.

4.3.1 Why Does Session Memory Matter?

Session memory is a fundamental aspect of creating intelligent chatbots that can engage in meaningful, context-aware conversations. Just like humans rely on memory during discussions, chatbots need a way to remember and process previous interactions. When we talk with others, we naturally reference earlier points, build on shared understanding, and maintain a coherent flow of ideas. This natural communication pattern is what makes conversations feel organic and meaningful. Without memory capabilities, chatbots are essentially starting fresh with each response, leading to disconnected and often frustrating interactions that feel more like talking to a machine than having a real conversation.

Session memory changes chatbot interactions in several critical ways. With it, a chatbot can:

  • Remember and reference previous parts of the conversation with precision
    • Track specific details mentioned earlier in the chat, such as user preferences, technical specifications, or personal information shared
    • Reference past agreements or decisions accurately, ensuring continuity in complex discussions or negotiations
    • Maintain historical context for better problem-solving and support
  • Build context incrementally throughout an interaction
    • Understand complex topics that unfold over multiple messages, allowing for deeper exploration of subjects
    • Develop more sophisticated responses as the conversation progresses, building upon previously established concepts
    • Create a coherent narrative thread across multiple exchanges
  • Provide more nuanced and relevant responses based on the conversation history
    • Tailor answers to the user's demonstrated knowledge level, adjusting terminology and complexity accordingly
    • Avoid repeating information already discussed, making conversations more efficient
    • Use past interactions to provide more personalized and contextually appropriate responses
  • Create a more natural, human-like conversational experience
    • Maintain consistent personality and tone throughout the chat, enhancing user engagement
    • Adapt responses based on user preferences and past interactions, creating a more personalized experience
    • Learn from previous exchanges to improve the quality of future interactions

In real conversations, context builds over time in multiple sophisticated ways that mirror natural human dialogue patterns:

  • The user refers to past messages
    • Questions naturally build upon previous answers, creating a continuous thread of understanding
    • Topics evolve organically as users reference and expand on earlier points in the conversation
    • Previous context shapes how new information is interpreted and understood
  • The assistant needs to remember previous answers
    • Maintains consistency across responses to build trust and reliability
    • Uses established context to provide more nuanced and relevant information
    • Builds a comprehensive understanding of the user's needs over time
  • Follow-up questions depend on prior knowledge
    • Each question builds upon the foundation of previous exchanges
    • Complex topics can be explored gradually, with increasing depth
    • The conversation naturally progresses from basic concepts to more advanced understanding
    • Creates more meaningful dialogue chains by connecting related ideas
    • Enables natural conversation flow that feels more human-like and engaging

Without memory implementation, chatbots treat each interaction as an isolated event, completely disconnected from previous exchanges. This fundamental limitation affects even sophisticated models like GPT-4o in several critical ways:

  1. Loss of Context: Each response is generated without any awareness of previous conversations, making it impossible to maintain coherent, extended discussions.
  2. Repetitive Interactions: The chatbot may provide the same information multiple times or ask for details that were already shared, creating a frustrating user experience.
  3. Inconsistent Responses: Without access to previous exchanges, the chatbot might give contradictory answers to related questions, undermining user trust.
  4. Limited Understanding: The inability to reference past context means the chatbot cannot build upon previously established knowledge or adapt its responses based on the user's demonstrated understanding.
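
These limitations follow from how the chat API is used: the model only sees the messages included in a given request. The sketch below builds request payloads without calling the API to make the difference visible. build_messages is a hypothetical helper introduced for this illustration.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_messages(user_input: str, history: Optional[List[Message]] = None) -> List[Message]:
    """Return the message list for one request: prior history plus the new user turn."""
    return list(history or []) + [{"role": "user", "content": user_input}]


# Without memory: the second request carries no trace of the first
first = build_messages("My name is Ada.")
second_without_memory = build_messages("What is my name?")

# With memory: the earlier turns are sent again along with the new question
history = first + [{"role": "assistant", "content": "Nice to meet you, Ada!"}]
second_with_memory = build_messages("What is my name?", history)
```

Only the second payload gives the model what it needs to answer the question, which is why the rest of this section focuses on storing and resending the history.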

By the end of this section, you'll know how to:

  • Store and retrieve messages for a given session
    • Keep the history in Streamlit's session state or in a Flask server-side session
    • Represent each message as a role/content dictionary that can be sent to the API unchanged
  • Maintain and update multi-turn memory
    • Append each user and assistant turn to the history in order
    • Send the accumulated history with every request so the model keeps its context
  • Avoid bloated token usage by capping history length
    • Trim older messages while always preserving the system message
    • Balance context retention against token cost and the model's context window

4.3.2 Streamlit: Session State Memory

Streamlit makes managing conversation history super simple with st.session_state, which persists data throughout the browser session.

Here's how you can create a chatbot with session memory using Streamlit:

Step 1: Import Libraries

import streamlit as st
import openai
import os
from dotenv import load_dotenv
  • streamlit: Used to create the user interface for the chatbot.
  • openai: The OpenAI Python library, used to interact with the GPT-4o model.
  • os: Provides a way to interact with the operating system, for example to access environment variables.
  • dotenv: Used to load environment variables from a .env file.

Step 2: Load API Key and Configure Page

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

st.set_page_config(page_title="GPT-4o Chat with Memory", page_icon="🧠")
st.title("🧠 GPT-4o Chatbot with Session Memory")
  • load_dotenv(): Loads the OpenAI API key from a .env file. This file should be in the same directory as your Python script and contain the line OPENAI_API_KEY=YOUR_API_KEY.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Sets the OpenAI API key.
  • st.set_page_config(...): Configures the page title and icon that appear in the browser tab.
  • st.title(...): Sets the title of the Streamlit application, which is displayed at the top of the page.

Step 3: Initialize Session State

if "messages" not in st.session_state:
    st.session_state.messages = [{"role": "system", "content": "You are a helpful assistant that remembers this session."}]
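  • st.session_state: Streamlit's way of storing variables across script reruns. Every user interaction reruns the script from top to bottom, and data in st.session_state survives those reruns for as long as the browser tab stays connected. Reloading the page or opening the app in a new tab starts a new session with empty state.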
  • This code checks if the key "messages" exists in st.session_state. If it doesn't (which is the case when the user first loads the app), it initializes "messages" to a list containing a single dictionary.
  • This dictionary represents the "system message," which is used to set the behavior of the assistant. In this case, the system message tells the assistant to be helpful and remember the conversation.

Step 4: Display Chat History

for msg in st.session_state.messages[1:]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])
  • This for loop iterates through the messages in the "messages" list, starting from the second message (index 1) to skip the system message.
  • st.chat_message(msg["role"]): Creates a chat bubble in the Streamlit app to display the message. The role ("user" or "assistant") determines the appearance of the bubble.
  • st.markdown(msg["content"]): Displays the content of the message within the chat bubble. st.markdown is used to render the text.

Step 5: User Input and Response Generation

user_input = st.chat_input("Say something...")
if user_input:
    st.chat_message("user").markdown(user_input)
    st.session_state.messages.append({"role": "user", "content": user_input})

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = openai.chat.completions.create(
                model="gpt-4o",
                messages=st.session_state.messages,
                temperature=0.6
            )
            reply = response.choices[0].message.content
            st.markdown(reply)
            st.session_state.messages.append({"role": "assistant", "content": reply})
  • user_input = st.chat_input("Say something..."): Creates a text input field at the bottom of the app where the user can type their message. The text "Say something..." is shown as a placeholder inside the empty field.
  • The if user_input: block is executed when the user enters text and presses Enter.
  • st.chat_message("user").markdown(user_input): Displays the user's message in the chat interface.
  • st.session_state.messages.append({"role": "user", "content": user_input}): Appends the user's message to the "messages" list in st.session_state, so it's stored in the conversation history.
  • with st.chat_message("assistant"):: Creates a chat bubble for the assistant's response.
  • with st.spinner("Thinking..."): Displays a spinner animation while the app is waiting for a response from the OpenAI API.
  • response = openai.chat.completions.create(...): Calls the OpenAI API to get a response from the GPT-4o model. This is the interface of version 1.0 and later of the openai library; the older openai.ChatCompletion.create call was removed in 1.0.
    • model: Specifies the language model to use ("gpt-4o").
    • messages: Passes the entire conversation history (stored in st.session_state.messages) to the API. This is how the model "remembers" the conversation.
    • temperature: A value between 0 and 2 that controls the randomness of the model's output. A lower value (e.g., 0.2) makes the output more deterministic, while a higher value (e.g., 0.8) makes it more random.
  • reply = response.choices[0].message.content: Extracts the assistant's reply from the API response.
  • st.markdown(reply): Displays the assistant's reply in the chat interface.
  • st.session_state.messages.append({"role": "assistant", "content": reply}): Appends the assistant's reply to the "messages" list in st.session_state.

This example creates a simple chatbot with memory using Streamlit. The st.session_state.messages list stores the conversation history, allowing the chatbot to maintain context across multiple interactions.  The chat history is displayed in the app, and the user can input messages using the st.chat_input field.  The assistant's responses are generated by the OpenAI GPT-4o model.

4.3.3 Flask: Using Server-Side Sessions

In Flask, server-side session management provides a robust way to maintain conversation history. Flask's built-in session object is a dictionary-like store whose contents are kept in a cryptographically signed (but not encrypted) cookie in the user's browser, which limits it to roughly 4 KB and lets the user read what it contains. The Flask-Session extension keeps the same interface but stores the data on the server, for example on the filesystem or in Redis, and leaves only a session ID in the cookie. That makes it a better fit for chat histories, which grow quickly.

This server-side approach ensures that the chatbot can maintain context and remember previous interactions throughout the user's active session, even if they navigate between different pages or refresh the browser.
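
As a framework-free illustration of server-side storage, the sketch below keeps session data in a store on the server while the browser would hold only an opaque session ID. The class and method names are illustrative and are not Flask-Session's API; a real deployment would use a backend such as the filesystem or Redis instead of a dictionary.

```python
import secrets
from typing import Any, Dict


class InMemorySessionStore:
    """Toy server-side session store: maps session IDs to per-user data."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def create(self) -> str:
        """Create an empty session and return its ID (the value a cookie would carry)."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id: str) -> Dict[str, Any]:
        """Return the data for a session ID, creating an empty entry if it is unknown."""
        return self._sessions.setdefault(session_id, {})


store = InMemorySessionStore()
sid = store.create()
store.get(sid).setdefault("history", []).append({"role": "user", "content": "Hello"})

# A later request presenting the same ID sees the same history
later = store.get(sid)["history"]
```

Because the history never leaves the server, its size is not constrained by cookie limits and the user cannot read or alter it directly.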

Here's how to implement a chatbot with server-side sessions in Flask:

Step 1: Install Required Libraries

pip install flask openai flask-session python-dotenv
  • flask: A web framework.
  • openai: The OpenAI Python library.
  • flask-session: Flask extension to handle server-side sessions.
  • python-dotenv: To load environment variables from a .env file.

Step 2: Update app.py

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
from flask_session import Session

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

app = Flask(__name__)
app.secret_key = os.urandom(24)
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

@app.route("/", methods=["GET", "POST"])
def chat():
    if "history" not in session:
        session["history"] = [
            {"role": "system", "content": "You are a helpful assistant."}
        ]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session["history"].append({"role": "user", "content": user_input})

        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True  # In-place list changes are not detected automatically

        return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt

    return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt
  • from flask import ...: Imports necessary Flask components, including session for managing user sessions.
  • import openai: Imports the OpenAI library.
  • import os: Imports the os module for interacting with the operating system, particularly for accessing environment variables.
  • from dotenv import load_dotenv: Imports the load_dotenv function from the python-dotenv library.
  • from flask_session import Session: Imports the Session class from flask_session.
  • load_dotenv(): Loads environment variables (like the OpenAI API key) from a .env file.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Retrieves the OpenAI API key from the environment and sets it for the OpenAI library.
  • app = Flask(__name__): Creates a Flask application instance.
  • app.secret_key = os.urandom(24): Sets a secret key for the Flask application. This is essential for using Flask sessions. os.urandom(24) generates a random, cryptographically secure key. Because a new key is generated every time the app starts, anything signed with the old key (such as Flask's default cookie-based sessions) stops validating after a restart; in production, load a fixed key from configuration instead.
  • app.config["SESSION_TYPE"] = "filesystem": Configures Flask-Session to store session data on the server's file system. Other options like "redis" or "mongodb" are available for production use.
  • Session(app): Initializes the Flask-Session extension, binding it to the Flask app.
  • @app.route("/", methods=["GET", "POST"]): Defines the route for the application's main page ("/"). The chat() function handles both GET and POST requests.
  • def chat()::
    • if "history" not in session:: Checks if the user's session already has a conversation history. If not, it initializes the session with a system message. The system message helps set the behavior of the assistant.
    • if request.method == "POST":: Handles POST requests, which occur when the user submits a message through the chat form.
      • user_input = request.form["user_input"]: Retrieves the user's input from the form.
      • session["history"].append({"role": "user", "content": user_input}): Appends the user's message to the conversation history stored in the session.
      • response = openai.chat.completions.create(...): Calls the OpenAI API to get a response from the GPT-4o model, passing the conversation history. This is the interface of version 1.0 and later of the openai library.
      • assistant_reply = response.choices[0].message.content: Extracts the assistant's reply from the API response.
      • session["history"].append({"role": "assistant", "content": assistant_reply}): Appends the assistant's reply to the conversation history in the session.
      • session.modified = True: Tells Flask that the session changed. Appending to a list mutates it in place, which the session object does not detect on its own.
      • return render_template("chat.html", history=session.get("history")[1:]): Renders the chat.html template, passing the conversation history (excluding the initial system message) to be displayed.
    • return render_template("chat.html", history=session.get("history")[1:]): Handles GET requests (when the user first loads the page). It renders the chat.html template, passing the conversation history (excluding the system message).

Step 3: Your HTML Template (templates/chat.html)

<!DOCTYPE html>
<html>
<head>
  <title>GPT-4o Assistant</title>
  <style>
    body { font-family: Arial; background: #f7f7f7; padding: 40px; }
    .container { max-width: 600px; margin: auto; background: white; padding: 20px; border-radius: 10px; }
    .user, .assistant { margin-bottom: 15px; }
    .user p { background: #d4f0ff; padding: 10px; border-radius: 10px; }
    .assistant p { background: #e8ffe8; padding: 10px; border-radius: 10px; }
    textarea { width: 100%; height: 80px; }
    input[type="submit"] { margin-top: 10px; padding: 10px 20px; }
  </style>
</head>
<body>
  <div class="container">
    <h2>GPT-4o Chatbot</h2>
    {% for msg in history %}
      <div class="{{ msg.role }}">
        <p><strong>{{ msg.role.capitalize() }}:</strong> {{ msg.content }}</p>
      </div>
    {% endfor %}
    <form method="post">
      <textarea name="user_input" placeholder="Type your message..."></textarea><br>
      <input type="submit" value="Send">
    </form>
  </div>
</body>
</html>
  • <!DOCTYPE html>: Declares the document type as HTML5.
  • <html>: The root element of the HTML document.
  • <head>: Contains metadata about the HTML document.
    • <title>: Specifies the title of the HTML page.
    • <style>: Includes CSS for basic styling of the chat interface.
  • <body>: Contains the visible content of the HTML page.
    • <div class="container">: A container for the chat application.
    • <h2>: A heading for the chat application.
    • {% for msg in history %}: A Jinja2 template loop that iterates through the history variable (passed from the Flask code) to display the chat messages.
      • <div class="{{ msg.role }}">: Creates a div element for each message. The class is set to the message's role ("user" or "assistant") for styling.
      • <p>: Displays the message content.
      • <strong>: Displays the role.
    • <form method="post">: A form for the user to submit their messages.
      • <textarea>: A multi-line text input field for the user to type their message.
      • <input type="submit" value="Send">: A button to send the message.
  • templates: Flask, by default, looks for HTML templates in a folder named "templates" in the same directory as your app.py file. So, this file should be saved as templates/chat.html.

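This code creates a chatbot using Flask and OpenAI, with the conversation history stored in server-side sessions. The server retains the chat memory for as long as the session lasts. Closing the browser tab does not clear it: how long the history survives depends on the session lifetime settings (Flask-Session treats sessions as permanent by default, governed by PERMANENT_SESSION_LIFETIME).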

4.3.4 Optional: Cap the Message History

Large language models have a limited context window (e.g., 128k tokens for GPT-4o), which determines how much of the conversation the model can "remember" at once. To prevent exceeding this limit and encountering errors, you should cap the number of messages stored in the conversation history.

Here's how to trim older entries in both Streamlit and Flask:

Streamlit

import streamlit as st

MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message
SESSION_MESSAGES_KEY = "messages"  # Key that holds the history in st.session_state

if SESSION_MESSAGES_KEY in st.session_state:
    messages = st.session_state[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message and the last MAX_HISTORY - 1 messages
        st.session_state[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • MAX_HISTORY: A constant defining the maximum number of messages to retain. This example keeps at most 20 messages, counting the system message.
  • SESSION_MESSAGES_KEY: The key under which the history is stored ("messages" in the Streamlit example above).
  • if SESSION_MESSAGES_KEY in st.session_state: Checks that the history exists before trimming it.
  • The code then trims the st.session_state.messages list, preserving the first message (the system message) and the last MAX_HISTORY - 1 messages.

Flask

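from flask import session

MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message
SESSION_MESSAGES_KEY = "history"  # Key that holds the history in the Flask session

# Run this inside a request handler, for example right before calling the API
if SESSION_MESSAGES_KEY in session:
    messages = session[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message and the last MAX_HISTORY - 1 messages
        session[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • MAX_HISTORY: A constant defining the maximum number of messages to retain, counting the system message.
  • SESSION_MESSAGES_KEY: The key under which the history is stored ("history" in the Flask example above).
  • if SESSION_MESSAGES_KEY in session: Checks that the history exists before trimming it.
  • The code trims the session["history"] list, keeping the first message (the system message) and the last MAX_HISTORY - 1 messages.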
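
The same trimming logic can be factored into a small helper that works on any list of messages, which also makes it easy to test outside Streamlit or Flask. This is a sketch: trim_history is a hypothetical name, and it assumes the first message in the list is the system message.

```python
from typing import Dict, List


def trim_history(messages: List[Dict[str, str]], max_history: int = 20) -> List[Dict[str, str]]:
    """Keep the first (system) message plus the most recent ones, max_history in total."""
    if max_history < 1:
        raise ValueError("max_history must be at least 1")
    if len(messages) <= max_history:
        return list(messages)
    # Use an explicit start index: a negative slice such as [-0:] would return the whole list
    start = len(messages) - (max_history - 1)
    return [messages[0]] + messages[start:]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(30)]
trimmed = trim_history(history, max_history=5)
```

In Streamlit the result would be assigned back to st.session_state.messages, and in Flask to session["history"].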

Important Implementation Considerations:

  • System Message Management: The system message plays a crucial role in setting the chatbot's behavior and context. It must always be preserved as the first message in your conversation history. When implementing message trimming, ensure your code specifically maintains this message by:
    • Keeping it separate from the regular conversation flow
    • Including special handling in your trimming logic
    • Verifying its presence before each interaction
  • Comprehensive Testing Protocol: To ensure reliable chatbot performance:
    • Test with varying conversation lengths, from short exchanges to extended dialogues
    • Verify that context is maintained even after trimming
    • Check for potential edge cases where coherence might break
    • Monitor system resource usage during extended conversations
  • Advanced Trimming Strategies: Consider these sophisticated approaches:
    • Token-based trimming: Calculate actual token usage using a tokenizer
    • Importance-based trimming: Keep messages based on relevance
    • Hybrid approach: Combine token counting with message relevance
    • Dynamic adjustment: Modify trim threshold based on conversation complexity
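
As a sketch of token-based trimming, the helper below drops the oldest non-system messages until an estimated token count fits a budget. The estimate of roughly four characters per token is a crude assumption for English text and is not OpenAI's tokenizer; for real counts, pass in a counter built on a tokenizer library such as tiktoken. Both function names are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def estimate_tokens(message: Message) -> int:
    """Rough estimate: about 4 characters per token, at least 1 per message."""
    return max(1, len(message["content"]) // 4)


def trim_to_token_budget(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[Message], int] = estimate_tokens,
) -> List[Message]:
    """Drop the oldest non-system messages until the estimated total fits max_tokens."""
    system, rest = messages[:1], list(messages[1:])
    total = sum(count_tokens(m) for m in system + rest)
    while rest and total > max_tokens:
        total -= count_tokens(rest.pop(0))  # remove the oldest message first
    return system + rest


history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": "x" * 40} for _ in range(10)]  # about 10 tokens each
trimmed = trim_to_token_budget(history, max_tokens=40)
```

Unlike a fixed message count, a token budget adapts to message length: a few long messages are trimmed sooner than many short ones.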

4.3.5 Bonus Tip: Save to File or Database

Want to persist memory even after the session ends? Let's explore several effective methods for long-term memory storage:

1. Export conversation history to a JSON file

Process: The st.session_state.messages (in Streamlit) or session["history"] (in Flask) data can be saved to a JSON file. This involves converting the list of message dictionaries into a JSON string and writing it to a file.

Code Example (Streamlit):

import json
import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Key that holds the history in st.session_state

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from st.session_state to a JSON file."""
    if SESSION_MESSAGES_KEY in st.session_state:
        try:
            with open(filename, "w", encoding="utf-8") as f:
                json.dump(st.session_state[SESSION_MESSAGES_KEY], f, indent=2)
            st.success(f"Chat log saved to {filename}")  # Show confirmation in the app
        except Exception as e:
            st.error(f"Error saving chat log: {e}")

# Example usage: call this function when the user ends the session or when appropriate
if st.button("Save Chat Log"):
    save_chat_log()

Code Example (Flask):

import json
from flask import Flask, session

# In a real project, reuse the existing app object (with its secret key and session setup)
app = Flask(__name__)
SESSION_MESSAGES_KEY = "history"  # Key that holds the history in the Flask session

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from the Flask session to a JSON file. Returns True on success."""
    if SESSION_MESSAGES_KEY not in session:
        return False
    try:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(session[SESSION_MESSAGES_KEY], f, indent=2)
        return True
    except Exception as e:
        app.logger.error(f"Error saving chat log: {e}")
        return False

# Example usage
@app.route("/save_log")
def save_log():
    if save_chat_log():
        return "Chat log saved!"
    return "No chat log was saved.", 500

Explanation:

  • The json.dump() function serializes the list of messages as JSON and writes it directly to the open file. The indent=2 parameter makes the JSON file more human-readable.
  • The code handles potential errors during file writing.
  • The Streamlit example uses a button to trigger the save, and the Flask example creates a route /save_log to save the file.

Benefits:

  • Simple to implement using Python's built-in json module.
  • Good for small-scale applications and quick prototypes.
  • Easy to backup and version control.

Drawbacks:

  • Not ideal for large-scale, multi-user applications.
  • No efficient querying or indexing.
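
Saving is only half of persistence: to resume a conversation, the log also has to be loaded back. The following framework-independent sketch shows the round trip. save_messages and load_messages are hypothetical helper names, and the file is written to a temporary directory chosen for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def save_messages(messages: List[Dict[str, str]], path: Path) -> None:
    """Write the message list to a JSON file."""
    path.write_text(json.dumps(messages, indent=2, ensure_ascii=False), encoding="utf-8")


def load_messages(path: Path) -> List[Dict[str, str]]:
    """Read a message list back; return an empty list if the file does not exist."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]
log_path = Path(tempfile.mkdtemp()) / "chat_log.json"
save_messages(messages, log_path)
restored = load_messages(log_path)
```

On startup, an app could call load_messages and use the result to initialize st.session_state.messages or session["history"] when it is not empty.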

2. Store conversations in a SQLite, PostgreSQL, or NoSQL database

Process: Store the conversation history in a database. Each message can be a row in a table (for SQL databases) or a document in a collection (for NoSQL databases).

Code Example (SQLite - Streamlit):

import streamlit as st
import sqlite3
import datetime

def get_connection():
    """Gets or creates a SQLite connection."""
    conn = getattr(st.session_state, "sqlite_conn", None)
    if conn is None:
        conn = sqlite3.connect("chat_log.db")
        st.session_state.sqlite_conn = conn
    return conn

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            role TEXT NOT NULL,
            content TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def store_message(role, content):
    """Stores a message in the database."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO messages (role, content) VALUES (?, ?)",
        (role, content),
    )
    conn.commit()

create_table()  # Ensure table exists

# Store messages (assumes your app saves the latest input and reply under these keys)
user_input = st.session_state.get("user_input")
if user_input:
    store_message("user", user_input)
reply = st.session_state.get("reply")  # replace "reply" with the key you are using
if reply:
    store_message("assistant", reply)

# Example of retrieving messages (optional, for demonstration)
conn = get_connection()
cursor = conn.cursor()
# Order by id because created_at only has one-second resolution in SQLite
cursor.execute("SELECT role, content, created_at FROM messages ORDER BY id DESC LIMIT 5")
recent_messages = cursor.fetchall()
st.write("Last 5 Messages from DB:")
for row in recent_messages:
    st.write(f"{row[0]}: {row[1]} (at {row[2]})")

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
import psycopg2  # PostgreSQL library

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")  # Make sure to set this in .env

app = Flask(__name__)
# A fixed key keeps sessions valid across restarts. FLASK_SECRET_KEY is an
# assumed variable name; the random fallback is for local experiments only.
app.secret_key = os.getenv("FLASK_SECRET_KEY") or os.urandom(24)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    session_id TEXT NOT NULL,
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(session_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (%s, %s, %s)",
                (session_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()  # Ensure the table exists

@app.route("/", methods=["GET", "POST"])
def chat():
    if "session_id" not in session:
        session["session_id"] = os.urandom(16).hex()  # Unique session ID

    if "history" not in session:
        session["history"] = [{"role": "system", "content": "You are a helpful assistant."}]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session_id = session["session_id"]
        store_message(session_id, "user", user_input)  # Store in DB
        session["history"].append({"role": "user", "content": user_input})

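        # Optional: read this session's messages back from the database.
        # They are fetched for demonstration only; the prompt below is built
        # from session["history"].
        messages_from_db = []
        conn = get_db_connection()
        if conn:
            try:
                cursor = conn.cursor()
                cursor.execute(
                    "SELECT role, content, created_at FROM messages "
                    "WHERE session_id = %s ORDER BY created_at",
                    (session_id,),
                )
                messages_from_db = cursor.fetchall()
            except psycopg2.Error as e:
                print(f"Error fetching messages: {e}")
            finally:
                conn.close()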

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        store_message(session_id, "assistant", assistant_reply)  # Store in DB
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True

        return render_template("chat.html", history=session.get("history")[1:])
    return render_template("chat.html", history=session.get("history")[1:])

Explanation:

  • The Streamlit example uses SQLite, a lightweight database that doesn't require a separate server. The Flask example uses PostgreSQL, a more robust database that is suitable for multi-user applications.
  • Both examples create a table named "messages" to store the conversation history. The table includes columns for message ID, role, content, and timestamp; the Flask version adds a session_id column so that each message can be tied to a session.
  • The store_message() function inserts a new message into the database.
  • The Flask example also reads the session's messages back from the database, although the template is rendered from the history kept in the Flask session. The Streamlit example shows how to retrieve and display the most recent rows.
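
To try the storage pattern without a web framework or a database server, the same idea can be sketched with Python's built-in sqlite3 module and an in-memory database. The function load_history and the sample session IDs are assumptions made for this illustration; the table layout follows the Flask example, including its session_id column.

```python
import sqlite3

def create_table(conn):
    """Create the messages table if it does not exist."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL,
            role TEXT NOT NULL,
            content TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def store_message(conn, session_id, role, content):
    """Insert one message for the given session."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()

def load_history(conn, session_id):
    """Return a session's messages in insertion order as role/content dicts."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]

conn = sqlite3.connect(":memory:")  # Nothing is written to disk in this sketch
create_table(conn)
store_message(conn, "session-a", "user", "What is SQLite?")
store_message(conn, "session-a", "assistant", "A lightweight embedded database.")
store_message(conn, "session-b", "user", "Unrelated question")

history_a = load_history(conn, "session-a")
```

Ordering by the auto-incrementing id rather than the timestamp keeps messages in the order they were inserted, even when several arrive within the same second.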

Benefits:

  • SQLite: Perfect for single-user applications with structured data. No separate database server is needed.
  • PostgreSQL: Ideal for multi-user systems requiring concurrent access. More robust and scalable than SQLite.
  • NoSQL (Not shown in detail): Best for flexible schema and unstructured conversation data. Databases like MongoDB or CouchDB would be suitable.

Drawbacks:

  • Requires more setup than using a JSON file.
  • Need to manage database connections and schemas.

3. Reuse memory in future sessions by tagging with a user ID

Process: To allow users to have persistent, personalized conversations, you can tag each message with a user ID and store this information in the database. When a user returns, you can retrieve their specific conversation history from the database.

Code Example (PostgreSQL - Flask):

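from flask import Flask, request, render_template, session, redirect, url_for
import openai
import os
from dotenv import load_dotenv
import psycopg2
from typing import List, Dict

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")

app = Flask(__name__)
# The user ID lives in the session cookie, so the key must stay the same across
# restarts. FLASK_SECRET_KEY is an assumed variable name; the random fallback
# is for local experiments only.
app.secret_key = os.getenv("FLASK_SECRET_KEY") or os.urandom(24)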

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    user_id TEXT NOT NULL,  -- Added user_id
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(user_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (%s, %s, %s)",
                (user_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()

def get_user_history(user_id: str) -> List[Dict[str, str]]:
    """Retrieves a user's conversation history from the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "SELECT role, content FROM messages WHERE user_id = %s ORDER BY created_at",
                (user_id,),
            )
            history = [{"role": row[0], "content": row[1]} for row in cursor.fetchall()]
            return history
        except psycopg2.Error as e:
            print(f"Error retrieving user history: {e}")
            return []
        finally:
            conn.close()
    return []

SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

@app.route("/", methods=["GET", "POST"])
def chat():
    if "user_id" not in session:
        session["user_id"] = os.urandom(16).hex()  # Unique user ID
        session.permanent = True  # Keep the cookie after the browser closes
    user_id = session["user_id"]

    # The database holds only user and assistant messages, so prepend the system prompt
    history = [SYSTEM_MESSAGE] + get_user_history(user_id)

    if request.method == "POST":
        user_input = request.form["user_input"]
        store_message(user_id, "user", user_input)  # Store with user ID
        history.append({"role": "user", "content": user_input})

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=history,
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        store_message(user_id, "assistant", assistant_reply)  # Store with user ID
        history.append({"role": "assistant", "content": assistant_reply})

    # Skip the system prompt when rendering
    return render_template("chat.html", history=history[1:])

@app.route("/clear", methods=["POST"])
def clear_chat():
    session.pop("user_id", None)  #remove user id
    session.pop("history", None)
    return redirect(url_for("chat"))

Explanation:

  • The database table "messages" now includes a "user_id" column.
  • When a user starts a session, a unique "user_id" is generated and stored in the Flask session.
  • The store_message() function now requires a "user_id" and stores it along with the message.
  • The get_user_history() function retrieves the conversation history for a specific user from the database.
  • The chat route retrieves the user's history and uses it to construct the messages sent to OpenAI. This maintains the conversation across multiple visits from the same user, as long as the browser still holds the session cookie that contains the user ID.

Benefits:

  • Enables personalized conversation history for each user.
  • Allows for user-specific context and preferences.
  • Facilitates analysis of conversation patterns over time.

Drawbacks:

  • Requires a database.
  • More complex to implement than simple session-based storage.
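
When a returning user has a long history, sending every stored message can be wasteful. The sketch below shows one way to rebuild the prompt from only the most recent messages for a user. It uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere; the function build_messages_for_user, the max_messages parameter, and the sample users are assumptions made for this illustration.

```python
import sqlite3

SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

def build_messages_for_user(conn, user_id, max_messages=10):
    """Return the system prompt plus the user's most recent stored messages."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE user_id = ? "
        "ORDER BY id DESC LIMIT ?",
        (user_id, max_messages),
    ).fetchall()
    rows.reverse()  # The query returns newest first; restore chronological order
    return [SYSTEM_MESSAGE] + [{"role": r, "content": c} for r, c in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "user_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
)
# Hypothetical stored conversations for two users
sample = [
    ("alice", "user", "My name is Alice."),
    ("alice", "assistant", "Nice to meet you, Alice."),
    ("bob", "user", "Hi, I'm Bob."),
    ("alice", "user", "What is my name?"),
]
conn.executemany(
    "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)", sample
)
conn.commit()

alice_messages = build_messages_for_user(conn, "alice", max_messages=2)
```

Because the query filters on user_id, one user's messages never appear in another user's prompt, and the LIMIT clause bounds how much history is loaded per request.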

In this section, you learned several crucial aspects of building a chatbot with memory capabilities:

  • Added session-based memory to your chatbot using st.session_state (Streamlit) and flask.session (Flask)
    • Implemented temporary storage for ongoing conversations
    • Learned how to manage session variables effectively
  • Preserved chat history across interactions
    • Created database schemas to store conversation data
    • Implemented methods to save and retrieve past messages
  • Improved context and coherence for multi-turn conversations
    • Developed systems to maintain conversation context
    • Enhanced natural language understanding through historical context
  • Learned to cap token usage by trimming message history
    • Implemented efficient message pruning strategies
    • Balanced memory retention with API token limitations

This gives your chatbot the ability to hold natural, flowing conversations — a key milestone toward building an intelligent assistant. With these features, your chatbot can now remember previous interactions, maintain context throughout conversations, and manage memory efficiently while staying within technical constraints.

4.3 Implementing Session-Based Chat Memory

Session-based memory is a crucial feature that transforms a simple chatbot into a truly interactive and context-aware conversational agent. Without memory, each interaction becomes isolated and disconnected, forcing users to repeatedly provide context and limiting the natural flow of conversation. This section explores how to implement robust memory management in your chatbot, enabling it to maintain coherent, contextual discussions across multiple exchanges.

We'll examine two distinct approaches to implementing session memory - using Streamlit's built-in state management and Flask's server-side sessions. Both methods offer their own advantages and can be tailored to meet specific project requirements. By the end of this section, you'll understand how to create a chatbot that can maintain context, remember previous interactions, and provide more meaningful and connected responses.

The implementation we'll cover ensures that your chatbot can:

  • Maintain contextual awareness throughout entire conversations
  • Handle complex multi-turn dialogues effectively
  • Provide more relevant and personalized responses based on conversation history
  • Manage memory efficiently without exceeding token limits
✅ Giving your assistant memory — so it doesn’t forget the conversation flow every time the page refreshes or the session resets.

4.3.1 Why Does Session Memory Matter?

Session memory is a fundamental aspect of creating intelligent chatbots that can engage in meaningful, context-aware conversations. Just like humans rely on memory during discussions, chatbots need a way to remember and process previous interactions. When we talk with others, we naturally reference earlier points, build on shared understanding, and maintain a coherent flow of ideas. This natural communication pattern is what makes conversations feel organic and meaningful. Without memory capabilities, chatbots are essentially starting fresh with each response, leading to disconnected and often frustrating interactions that feel more like talking to a machine than having a real conversation.

The concept of session memory fundamentally transforms chatbot interactions in several critical ways:

  • Remember and reference previous parts of the conversation with precision
    • Track specific details mentioned earlier in the chat, such as user preferences, technical specifications, or personal information shared
    • Reference past agreements or decisions accurately, ensuring continuity in complex discussions or negotiations
    • Maintain historical context for better problem-solving and support
  • Build context incrementally throughout an interaction
    • Understand complex topics that unfold over multiple messages, allowing for deeper exploration of subjects
    • Develop more sophisticated responses as the conversation progresses, building upon previously established concepts
    • Create a coherent narrative thread across multiple exchanges
  • Provide more nuanced and relevant responses based on the conversation history
    • Tailor answers to the user's demonstrated knowledge level, adjusting terminology and complexity accordingly
    • Avoid repeating information already discussed, making conversations more efficient
    • Use past interactions to provide more personalized and contextually appropriate responses
  • Create a more natural, human-like conversational experience
    • Maintain consistent personality and tone throughout the chat, enhancing user engagement
    • Adapt responses based on user preferences and past interactions, creating a more personalized experience
    • Learn from previous exchanges to improve the quality of future interactions

In real conversations, context builds over time in multiple sophisticated ways that mirror natural human dialogue patterns:

  • The user refers to past messages
    • Questions naturally build upon previous answers, creating a continuous thread of understanding
    • Topics evolve organically as users reference and expand on earlier points in the conversation
    • Previous context shapes how new information is interpreted and understood
  • The assistant needs to remember previous answers
    • Maintains consistency across responses to build trust and reliability
    • Uses established context to provide more nuanced and relevant information
    • Builds a comprehensive understanding of the user's needs over time
  • Follow-up questions depend on prior knowledge
    • Each question builds upon the foundation of previous exchanges
    • Complex topics can be explored gradually, with increasing depth
    • The conversation naturally progresses from basic concepts to more advanced understanding
    • Creates more meaningful dialogue chains by connecting related ideas
    • Enables natural conversation flow that feels more human-like and engaging

Without memory implementation, chatbots treat each interaction as an isolated event, completely disconnected from previous exchanges. This fundamental limitation affects even sophisticated models like GPT-4o in several critical ways:

  1. Loss of Context: Each response is generated without any awareness of previous conversations, making it impossible to maintain coherent, extended discussions.
  2. Repetitive Interactions: The chatbot may provide the same information multiple times or ask for details that were already shared, creating a frustrating user experience.
  3. Inconsistent Responses: Without access to previous exchanges, the chatbot might give contradictory answers to related questions, undermining user trust.
  4. Limited Understanding: The inability to reference past context means the chatbot cannot build upon previously established knowledge or adapt its responses based on the user's demonstrated understanding.
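
The underlying reason is that the model only sees the messages included in each request. The sketch below contrasts the two request payloads without calling any API; build_payload and the sample conversation are hypothetical names and data used only for illustration.

```python
SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

def build_payload(history, user_input, use_memory=True):
    """Return the message list that would be sent for the next request."""
    new_message = {"role": "user", "content": user_input}
    if use_memory:
        # With memory: everything said so far, plus the new message
        return history + [new_message]
    # Without memory: only the system prompt and the new message
    return [SYSTEM_MESSAGE, new_message]

history = [
    SYSTEM_MESSAGE,
    {"role": "user", "content": "My favorite language is Python."},
    {"role": "assistant", "content": "Great choice!"},
]

question = "Which language do I like?"
with_memory = build_payload(history, question)
without_memory = build_payload(history, question, use_memory=False)
```

Only the payload built with memory contains the earlier statement about Python, so only that request gives the model what it needs to answer the follow-up question.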

By the end of this section, you'll know how to:

  • Store and retrieve messages for a given session
    • Implement secure storage mechanisms using industry-standard encryption and protection
    • Handle different types of message data effectively, including text, structured data, and metadata
  • Maintain and update multi-turn memory
    • Process conversation chains efficiently using optimized data structures
    • Manage context across multiple exchanges while maintaining conversation coherence
  • Avoid bloated token usage by capping history length
    • Implement smart memory management strategies that prioritize relevant information
    • Balance context retention with performance through intelligent pruning algorithms

4.3.2 Streamlit: Session State Memory

Streamlit makes managing conversation history super simple with st.session_state, which persists data throughout the browser session.

Here's how you can create a chatbot with session memory using Streamlit:

Step 1: Import Libraries

import streamlit as st
import openai
import os
from dotenv import load_dotenv
  • streamlit: Used to create the user interface for the chatbot.
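  • openai: The OpenAI Python library, used to interact with the GPT-4o model. The code in this section uses the legacy openai.ChatCompletion interface, which requires a library version below 1.0.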
  • os: Provides a way to interact with the operating system, for example to access environment variables.
  • dotenv: Used to load environment variables from a .env file.

Step 2: Load API Key and Configure Page

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

st.set_page_config(page_title="GPT-4o Chat with Memory", page_icon="🧠")
st.title("🧠 GPT-4o Chatbot with Session Memory")
  • load_dotenv(): Loads the OpenAI API key from a .env file. This file should be in the same directory as your Python script and contain the line OPENAI_API_KEY=YOUR_API_KEY.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Sets the OpenAI API key.
  • st.set_page_config(...): Configures the page title and icon that appear in the browser tab.
  • st.title(...): Sets the title of the Streamlit application, which is displayed at the top of the page.

Step 3: Initialize Session State

if "messages" not in st.session_state:
    st.session_state.messages = [{"role": "system", "content": "You are a helpful assistant that remembers this session."}]
  • st.session_state: Streamlit's way of storing variables across script reruns. Data in st.session_state persists for as long as the user's browser tab stays open and connected; reloading the page starts a new session with empty state.
  • This code checks if the key "messages" exists in st.session_state. If it doesn't (which is the case when the user first loads the app), it initializes "messages" to a list containing a single dictionary.
  • This dictionary represents the "system message," which is used to set the behavior of the assistant. In this case, the system message tells the assistant to be helpful and remember the conversation.

Step 4: Display Chat History

for msg in st.session_state.messages[1:]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])
  • This for loop iterates through the messages in the "messages" list, starting from the second message (index 1) to skip the system message.
  • st.chat_message(msg["role"]): Creates a chat bubble in the Streamlit app to display the message. The role ("user" or "assistant") determines the appearance of the bubble.
  • st.markdown(msg["content"]): Displays the content of the message within the chat bubble. st.markdown is used to render the text.

Step 5: User Input and Response Generation

user_input = st.chat_input("Say something...")
if user_input:
    st.chat_message("user").markdown(user_input)
    st.session_state.messages.append({"role": "user", "content": user_input})

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = openai.ChatCompletion.create(
                model="gpt-4o",
                messages=st.session_state.messages,
                temperature=0.6
            )
            reply = response["choices"][0]["message"]["content"]
            st.markdown(reply)
            st.session_state.messages.append({"role": "assistant", "content": reply})
  • user_input = st.chat_input("Say something..."): Creates a text input field at the bottom of the app where the user can type their message. The text "Say something..." is shown as a placeholder inside the empty input field.
  • The if user_input: block is executed when the user enters text and presses Enter.
  • st.chat_message("user").markdown(user_input): Displays the user's message in the chat interface.
  • st.session_state.messages.append({"role": "user", "content": user_input}): Appends the user's message to the "messages" list in st.session_state, so it's stored in the conversation history.
  • with st.chat_message("assistant"):: Creates a chat bubble for the assistant's response.
  • with st.spinner("Thinking..."): Displays a spinner animation while the app is waiting for a response from the OpenAI API.
  • response = openai.ChatCompletion.create(...): Calls the OpenAI API to get a response from the GPT-4o model.
    • model: Specifies the language model to use ("gpt-4o").
    • messages: Passes the entire conversation history (stored in st.session_state.messages) to the API. This is how the model "remembers" the conversation.
    • temperature: A value between 0 and 2 that controls the randomness of the model's output. A lower value (e.g., 0.2) makes the output more deterministic, while a higher value (e.g., 0.8) makes it more random.
  • reply = response["choices"][0]["message"]["content"]: Extracts the assistant's reply from the API response.
  • st.markdown(reply): Displays the assistant's reply in the chat interface.
  • st.session_state.messages.append({"role": "assistant", "content": reply}): Appends the assistant's reply to the "messages" list in st.session_state.

This example creates a simple chatbot with memory using Streamlit. The st.session_state.messages list stores the conversation history, allowing the chatbot to maintain context across multiple interactions.  The chat history is displayed in the app, and the user can input messages using the st.chat_input field.  The assistant's responses are generated by the OpenAI GPT-4o model.
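
The control flow of a single chat turn can also be sketched without Streamlit or the OpenAI API, which makes it easy to test. In this sketch a plain dictionary plays the role of st.session_state, and fake_reply is a hypothetical stand-in for the model call; both are assumptions made for illustration.

```python
def run_turn(state, user_input, get_reply):
    """Append the user message, ask for a reply, and store the reply."""
    if "messages" not in state:
        state["messages"] = [
            {"role": "system",
             "content": "You are a helpful assistant that remembers this session."}
        ]
    state["messages"].append({"role": "user", "content": user_input})
    reply = get_reply(state["messages"])  # The full history is passed every time
    state["messages"].append({"role": "assistant", "content": reply})
    return reply

def fake_reply(messages):
    """Stand-in for the model: reports how many user messages it can see."""
    user_turns = sum(1 for m in messages if m["role"] == "user")
    return f"I can see {user_turns} user message(s)."

state = {}  # Plays the role of st.session_state in this sketch
first = run_turn(state, "Hello", fake_reply)
second = run_turn(state, "Do you remember me?", fake_reply)
```

The second reply reflects both user messages because the stored history, not just the latest input, is handed to the reply function on every turn.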

4.3.3 Flask: Using Server-Side Sessions

In Flask, the session object acts as a per-user dictionary that persists between requests. By default, Flask stores this data in a cookie that is cryptographically signed but not encrypted, so the contents live in the user's browser and are limited by the cookie size (roughly 4 KB). The Flask-Session extension moves the data to the server and leaves only a session identifier in the cookie, which suits conversation histories better. It supports several backends, from the local filesystem to stores such as Redis for better scalability.

This server-side approach ensures that the chatbot can maintain context and remember previous interactions throughout the user's active session, even if they navigate between different pages or refresh the browser.

Here's how to implement a chatbot with server-side sessions in Flask:

Step 1: Install Required Libraries

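pip install flask "openai<1.0" flask-session python-dotenv
  • flask: A web framework.
  • openai: The OpenAI Python library. The version is pinned below 1.0 because the code in this section uses the legacy openai.ChatCompletion interface, which was removed in version 1.0.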
  • flask-session: Flask extension to handle server-side sessions.
  • python-dotenv: To load environment variables from a .env file.

Step 2: Update app.py

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
from flask_session import Session

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

app = Flask(__name__)
app.secret_key = os.urandom(24)
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

@app.route("/", methods=["GET", "POST"])
def chat():
    if "history" not in session:
        session["history"] = [
            {"role": "system", "content": "You are a helpful assistant."}
        ]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session["history"].append({"role": "user", "content": user_input})

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True  # In-place list changes are not detected automatically

        return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt

    return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt
  • from flask import ...: Imports necessary Flask components, including session for managing user sessions.
  • import openai: Imports the OpenAI library.
  • import os: Imports the os module for interacting with the operating system, particularly for accessing environment variables.
  • from dotenv import load_dotenv: Imports the load_dotenv function from the python-dotenv library.
  • from flask_session import Session: Imports the Session class from flask_session.
  • load_dotenv(): Loads environment variables (like the OpenAI API key) from a .env file.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Retrieves the OpenAI API key from the environment and sets it for the OpenAI library.
  • app = Flask(__name__): Creates a Flask application instance.
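  • app.secret_key = os.urandom(24): Sets a secret key for the Flask application. This is essential for using Flask sessions. os.urandom(24) generates a random, cryptographically secure key. Because a new key is generated every time the app starts, existing sessions become invalid after a restart; in production, load a fixed key from an environment variable instead.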
  • app.config["SESSION_TYPE"] = "filesystem": Configures Flask-Session to store session data on the server's file system. Other options like "redis" or "mongodb" are available for production use.
  • Session(app): Initializes the Flask-Session extension, binding it to the Flask app.
  • @app.route("/", methods=["GET", "POST"]): Defines the route for the application's main page ("/"). The chat() function handles both GET and POST requests.
  • def chat()::
    • if "history" not in session:: Checks if the user's session already has a conversation history. If not, it initializes the session with a system message. The system message helps set the behavior of the assistant.
    • if request.method == "POST":: Handles POST requests, which occur when the user submits a message through the chat form.
      • user_input = request.form["user_input"]: Retrieves the user's input from the form.
      • session["history"].append({"role": "user", "content": user_input}): Appends the user's message to the conversation history stored in the session.
      • response = openai.ChatCompletion.create(...): Calls the OpenAI API to get a response from the GPT-4o model, passing the conversation history.
      • assistant_reply = response["choices"][0]["message"]["content"]: Extracts the assistant's reply from the API response.
      • session["history"].append({"role": "assistant", "content": assistant_reply}): Appends the assistant's reply to the conversation history in the session.
      • return render_template("chat.html", history=session.get("history")[1:]): Renders the chat.html template, passing the conversation history (excluding the initial system message) to be displayed.
    • return render_template("chat.html", history=session.get("history")[1:]): Handles GET requests (when the user first loads the page). It renders the chat.html template, passing the conversation history (excluding the system message).
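
Conceptually, a server-side session store maps a session identifier to that session's data. The sketch below illustrates the idea with a plain in-memory dictionary. It is not how Flask-Session is implemented; the class InMemorySessionStore and its methods are hypothetical names used only to show how each session keeps its own history.

```python
import uuid

class InMemorySessionStore:
    """Keeps one message list per session ID, entirely on the server."""

    def __init__(self, system_prompt="You are a helpful assistant."):
        self._system_prompt = system_prompt
        self._histories = {}

    def new_session_id(self):
        """Generate a random identifier, as would be stored in the cookie."""
        return uuid.uuid4().hex

    def get_history(self, session_id):
        """Return the history for a session, creating it on first use."""
        if session_id not in self._histories:
            self._histories[session_id] = [
                {"role": "system", "content": self._system_prompt}
            ]
        return self._histories[session_id]

    def append(self, session_id, role, content):
        """Add one message to a session's history."""
        self.get_history(session_id).append({"role": role, "content": content})

store = InMemorySessionStore()
alice_id = store.new_session_id()
bob_id = store.new_session_id()
store.append(alice_id, "user", "Hello from Alice")
store.append(alice_id, "assistant", "Hi Alice!")
store.append(bob_id, "user", "Hello from Bob")
```

A dictionary like this is lost when the process stops and is not shared between worker processes, which is why extensions such as Flask-Session offer filesystem and Redis backends.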

Step 3: Your HTML Template (templates/chat.html)

<!DOCTYPE html>
<html>
<head>
  <title>GPT-4o Assistant</title>
  <style>
    body { font-family: Arial; background: #f7f7f7; padding: 40px; }
    .container { max-width: 600px; margin: auto; background: white; padding: 20px; border-radius: 10px; }
    .user, .assistant { margin-bottom: 15px; }
    .user p { background: #d4f0ff; padding: 10px; border-radius: 10px; }
    .assistant p { background: #e8ffe8; padding: 10px; border-radius: 10px; }
    textarea { width: 100%; height: 80px; }
    input[type="submit"] { margin-top: 10px; padding: 10px 20px; }
  </style>
</head>
<body>
  <div class="container">
    <h2>GPT-4o Chatbot</h2>
    {% for msg in history %}
      <div class="{{ msg.role }}">
        <p><strong>{{ msg.role.capitalize() }}:</strong> {{ msg.content }}</p>
      </div>
    {% endfor %}
    <form method="post">
      <textarea name="user_input" placeholder="Type your message..."></textarea><br>
      <input type="submit" value="Send">
    </form>
  </div>
</body>
</html>
  • <!DOCTYPE html>: Declares the document type as HTML5.
  • <html>: The root element of the HTML document.
  • <head>: Contains metadata about the HTML document.
    • <title>: Specifies the title of the HTML page.
    • <style>: Includes CSS for basic styling of the chat interface.
  • <body>: Contains the visible content of the HTML page.
    • <div class="container">: A container for the chat application.
    • <h2>: A heading for the chat application.
    • {% for msg in history %}: A Jinja2 template loop that iterates through the history variable (passed from the Flask code) to display the chat messages.
      • <div class="{{ msg.role }}">: Creates a div element for each message. The class is set to the message's role ("user" or "assistant") for styling.
      • <p>: Displays the message content.
      • <strong>: Displays the role.
    • <form method="post">: A form for the user to submit their messages.
      • <textarea>: A multi-line text input field for the user to type their message.
      • <input type="submit" value="Send">: A button to send the message.
  • templates: Flask, by default, looks for HTML templates in a folder named "templates" in the same directory as your app.py file. So, this file should be saved as templates/chat.html.

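This code creates a chatbot using Flask and OpenAI, with the conversation history stored in server-side sessions. The server retains the chat memory for as long as the session remains valid. With Flask-Session's default settings, sessions are permanent and expire after Flask's PERMANENT_SESSION_LIFETIME (31 days by default), so the history is not cleared when the user closes the browser tab.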

4.3.4 Optional: Cap the Message History

Large language models have a limited context window (e.g., 128k tokens for GPT-4o), which determines how much of the conversation the model can "remember" at once. To prevent exceeding this limit and encountering errors, you should cap the number of messages stored in the conversation history.

Here's how to trim older entries in both Streamlit and Flask:

Streamlit

import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Same key used to initialize the history in 4.3.2
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

if SESSION_MESSAGES_KEY in st.session_state:
    messages = st.session_state[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        st.session_state[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

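  • MAX_HISTORY: A constant defining the maximum number of messages to retain. With a value of 20, the trimmed list holds the system message plus the 19 most recent messages.
  • if SESSION_MESSAGES_KEY in st.session_state: Checks that the history exists before trimming. SESSION_MESSAGES_KEY stands for the key that holds the history ("messages" in the Streamlit example from 4.3.2).
  • The code then trims the st.session_state.messages list, preserving the first message (the system message) and the last MAX_HISTORY - 1 messages.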

Flask

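from flask import session

SESSION_MESSAGES_KEY = "history"  # Same key used for the history in app.py
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

# Run this inside a request handler, for example in chat() before calling the API
if SESSION_MESSAGES_KEY in session:
    messages = session[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        session[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]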

Explanation:

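  • MAX_HISTORY: A constant defining the maximum number of messages to retain, counting the system message.
  • if SESSION_MESSAGES_KEY in session: Checks that the history exists before trimming. SESSION_MESSAGES_KEY stands for the key that holds the history ("history" in the Flask example from 4.3.3).
  • The code trims the session["history"] list, keeping the first message (the system message) and the last MAX_HISTORY - 1 messages.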
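
The Streamlit and Flask snippets differ only in where the list is stored, so the slicing itself can be factored into a plain function that is easy to test on its own. The following is a minimal sketch of that idea. The name trim_history is an illustrative choice rather than something defined earlier in this chapter, and the function assumes the first message is always the system message.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_history: int = 20) -> List[Message]:
    """Return at most max_history messages: the first (system) message
    plus the most recent ones. Hypothetical helper for illustration."""
    if max_history < 1:
        raise ValueError("max_history must be at least 1")
    if len(messages) <= max_history:
        return list(messages)
    keep_recent = max_history - 1
    # Slice from an explicit start index: messages[-0:] would return the whole list.
    recent = messages[len(messages) - keep_recent:]
    return [messages[0]] + recent
```

In Streamlit this could be applied as st.session_state.messages = trim_history(st.session_state.messages), and in Flask as session["history"] = trim_history(session["history"]).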

Important Implementation Considerations:

  • System Message Management: The system message plays a crucial role in setting the chatbot's behavior and context. It must always be preserved as the first message in your conversation history. When implementing message trimming, ensure your code specifically maintains this message by:
    • Keeping it separate from the regular conversation flow
    • Including special handling in your trimming logic
    • Verifying its presence before each interaction
  • Comprehensive Testing Protocol: To ensure reliable chatbot performance:
    • Test with varying conversation lengths, from short exchanges to extended dialogues
    • Verify that context is maintained even after trimming
    • Check for potential edge cases where coherence might break
    • Monitor system resource usage during extended conversations
  • Advanced Trimming Strategies: Consider these sophisticated approaches:
    • Token-based trimming: Calculate actual token usage using a tokenizer
    • Importance-based trimming: Keep messages based on relevance
    • Hybrid approach: Combine token counting with message relevance
    • Dynamic adjustment: Modify trim threshold based on conversation complexity
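
Token-based trimming can be sketched as follows. This is only an outline under stated assumptions: the function names are hypothetical, the default counter is a rough rule of thumb (about four characters per token for English text) rather than the model's real tokenizer, and the per-message overhead that the API adds is ignored. In practice you would pass in a counter backed by a real tokenizer, such as one from the tiktoken library.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Very rough estimate (about 4 characters per token); an assumption,
    not the model's real tokenizer."""
    return max(1, len(text) // 4)


def trim_to_token_budget(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> List[Message]:
    """Keep the system message and as many of the newest messages as fit."""
    if not messages:
        return []
    system, rest = messages[0], messages[1:]
    budget = max_tokens - count_tokens(system["content"])
    kept: List[Message] = []
    # Walk backwards so the most recent messages are considered first.
    for msg in reversed(rest):
        cost = count_tokens(msg["content"])
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    kept.reverse()  # Restore chronological order
    return [system] + kept
```

Unlike a fixed message count, this approach drops more messages when they are long and fewer when they are short, which tracks the context window more closely.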

4.3.5 Bonus Tip: Save to File or Database

Want to persist memory even after the session ends? Let's explore several effective methods for long-term memory storage:

1. Export conversation history to a JSON file

Process: The st.session_state.messages (in Streamlit) or session["history"] (in Flask) data can be saved to a JSON file. This involves converting the list of message dictionaries into a JSON string and writing it to a file.

Code Example (Streamlit):

import json
import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Key that holds the conversation history

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from st.session_state to a JSON file."""
    if SESSION_MESSAGES_KEY not in st.session_state:
        st.warning("There is no chat history to save yet.")
        return
    try:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(st.session_state[SESSION_MESSAGES_KEY], f, indent=2)
        st.success(f"Chat log saved to {filename}")  # Shown in the app, not the server console
    except (OSError, TypeError) as e:
        st.error(f"Error saving chat log: {e}")

# Example usage: call this function when the user ends the session or when appropriate
if st.button("Save Chat Log"):
    save_chat_log()

Code Example (Flask):

import json
from flask import Flask, session

SESSION_MESSAGES_KEY = "history"  # Key that holds the conversation history

# In the full application, reuse the configured app from app.py instead of creating a new one
app = Flask(__name__)

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from the Flask session to a JSON file. Returns True on success."""
    if SESSION_MESSAGES_KEY not in session:
        return False
    try:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(session[SESSION_MESSAGES_KEY], f, indent=2)
        return True
    except (OSError, TypeError) as e:
        app.logger.error(f"Error saving chat log: {e}")
        return False

# Example usage
@app.route("/save_log")
def save_log():
    if save_chat_log():
        return "Chat log saved!"
    return "Chat log could not be saved.", 500

Explanation:

  • The json.dump() function serializes the list of messages as JSON and writes it directly to the open file. The indent=2 parameter makes the JSON file more human-readable.
  • The code handles potential errors during file writing.
  • The Streamlit example uses a button to trigger the save, and the Flask example creates a route /save_log to save the file.

Benefits:

  • Simple to implement using Python's built-in json module.
  • Good for small-scale applications and quick prototypes.
  • Easy to backup and version control.

Drawbacks:

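  • Not ideal for large-scale, multi-user applications: in these examples every user writes to the same chat_log.json, so each save overwrites the previous one.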
  • No efficient querying or indexing.
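
Saving is only half of persistence. To reuse a saved log, it has to be read back when a new session starts. The sketch below shows one way to do that. The function name load_chat_log and the fallback system message are assumptions made for illustration, and the validation only checks for the role/content shape used throughout this chapter.

```python
import json
import os
from typing import Dict, List

DEFAULT_SYSTEM = {"role": "system", "content": "You are a helpful assistant."}


def load_chat_log(filename: str = "chat_log.json") -> List[Dict[str, str]]:
    """Return saved messages, or a fresh history if the file is missing or invalid."""
    if not os.path.exists(filename):
        return [dict(DEFAULT_SYSTEM)]
    try:
        with open(filename, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        return [dict(DEFAULT_SYSTEM)]
    # Accept only a non-empty list of {"role": ..., "content": ...} dictionaries.
    valid = isinstance(data, list) and all(
        isinstance(m, dict) and "role" in m and "content" in m for m in data
    )
    return data if valid and data else [dict(DEFAULT_SYSTEM)]
```

In Streamlit, the result could be used to initialize the history in place of the hard-coded system message, for example st.session_state.messages = load_chat_log() inside the "messages" not in st.session_state check.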

2. Store conversations in a SQLite, PostgreSQL, or NoSQL database

Process: Store the conversation history in a database. Each message can be a row in a table (for SQL databases) or a document in a collection (for NoSQL databases).

Code Example (SQLite - Streamlit):

import sqlite3
import streamlit as st

def get_connection():
    """Gets or creates a SQLite connection for this session."""
    conn = st.session_state.get("sqlite_conn")
    if conn is None:
        # Streamlit may rerun the script on a different thread, so disable
        # sqlite3's same-thread check for the connection kept in session state.
        conn = sqlite3.connect("chat_log.db", check_same_thread=False)
        st.session_state["sqlite_conn"] = conn
    return conn

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            role TEXT NOT NULL,
            content TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def store_message(role, content):
    """Stores a message in the database."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO messages (role, content) VALUES (?, ?)",
        (role, content),
    )
    conn.commit()

create_table()  # Ensure table exists

# Store messages. Call store_message() once per new message (for example, inside
# the `if user_input:` block of the chat app) so that reruns do not insert duplicates.
user_input = st.session_state.get("user_input")  # replace user_input with the key you are using
if user_input:
    store_message("user", user_input)
reply = st.session_state.get("reply")  # replace reply with the key you are using
if reply:
    store_message("assistant", reply)

# Example of retrieving messages (optional, for demonstration)
cursor = get_connection().cursor()
# Order by id: created_at has one-second resolution, so timestamps can tie
cursor.execute("SELECT role, content, created_at FROM messages ORDER BY id DESC LIMIT 5")
recent_messages = cursor.fetchall()
st.write("Last 5 Messages from DB:")
for role, content, created_at in recent_messages:
    st.write(f"{role}: {content} (at {created_at})")

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session
from flask_session import Session
from openai import OpenAI
import os
from dotenv import load_dotenv
import psycopg2  # PostgreSQL library

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))  # openai>=1.0 client interface
DATABASE_URL = os.getenv("DATABASE_URL")  # Make sure to set this in .env

app = Flask(__name__)
app.secret_key = os.urandom(24)
# Keep the session on the server, as in 4.3.3: the full history can exceed
# the roughly 4 KB limit of Flask's default cookie-based session.
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    session_id TEXT NOT NULL,
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(session_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (%s, %s, %s)",
                (session_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

def fetch_messages(session_id: str):
    """Returns the stored (role, content, created_at) rows for a session, oldest first."""
    conn = get_db_connection()
    if conn is None:
        return []
    try:
        cursor = conn.cursor()
        cursor.execute(
            "SELECT role, content, created_at FROM messages WHERE session_id = %s ORDER BY id",
            (session_id,),
        )
        return cursor.fetchall()
    except psycopg2.Error as e:
        print(f"Error fetching messages: {e}")
        return []
    finally:
        conn.close()

create_table()  # Ensure the table exists

@app.route("/", methods=["GET", "POST"])
def chat():
    if "session_id" not in session:
        session["session_id"] = os.urandom(16).hex()  # Unique session ID

    if "history" not in session:
        session["history"] = [{"role": "system", "content": "You are a helpful assistant."}]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session_id = session["session_id"]
        store_message(session_id, "user", user_input)  # Store in DB
        session["history"].append({"role": "user", "content": user_input})

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        store_message(session_id, "assistant", assistant_reply)  # Store in DB
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True  # In-place list changes are not detected automatically

        # Optional: read the rows back, for example to check what was stored
        messages_from_db = fetch_messages(session_id)
        print(f"{len(messages_from_db)} messages stored for this session")

    return render_template("chat.html", history=session["history"][1:])  # Skip system prompt

Explanation:

  • The Streamlit example uses SQLite, a lightweight database that doesn't require a separate server. The Flask example uses PostgreSQL, a more robust database that is suitable for multi-user applications.
  • Both examples create a table named "messages" to store the conversation history. The table includes columns for message ID, role, content, and timestamp. The Flask version adds a session_id column so that messages from different sessions can be told apart.
  • The store_message() function inserts a new message into the database.
  • The Flask example reads the stored rows back from the database, but it still builds the API request and the rendered page from the history kept in the session. The Streamlit example shows how to query the most recent rows.

Benefits:

  • SQLite: Perfect for single-user applications with structured data. No separate database server is needed.
  • PostgreSQL: Ideal for multi-user systems requiring concurrent access. More robust and scalable than SQLite.
  • NoSQL (Not shown in detail): Best for flexible schema and unstructured conversation data. Databases like MongoDB or CouchDB would be suitable.

Drawbacks:

  • Requires more setup than using a JSON file.
  • Need to manage database connections and schemas.

3. Reuse memory in future sessions by tagging with a user ID

Process: To allow users to have persistent, personalized conversations, you can tag each message with a user ID and store this information in the database. When a user returns, you can retrieve their specific conversation history from the database.

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session, redirect, url_for
from openai import OpenAI
import os
from dotenv import load_dotenv
import psycopg2
from typing import List, Dict

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))  # openai>=1.0 client interface
DATABASE_URL = os.getenv("DATABASE_URL")

# The system message is not stored in the database, so it is added to every request
SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

app = Flask(__name__)
# A random key changes on every restart and invalidates existing cookies.
# FLASK_SECRET_KEY is an assumed variable name; set a fixed value in .env
# if the user ID should survive server restarts.
app.secret_key = os.getenv("FLASK_SECRET_KEY") or os.urandom(24)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    user_id TEXT NOT NULL,  -- Added user_id
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(user_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (%s, %s, %s)",
                (user_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()

def get_user_history(user_id: str) -> List[Dict[str, str]]:
    """Retrieves a user's conversation history from the database, oldest first."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "SELECT role, content FROM messages WHERE user_id = %s ORDER BY created_at, id",
                (user_id,),
            )
            history = [{"role": row[0], "content": row[1]} for row in cursor.fetchall()]
            return history
        except psycopg2.Error as e:
            print(f"Error retrieving user history: {e}")
            return []
        finally:
            conn.close()
    return []

@app.route("/", methods=["GET", "POST"])
def chat():
    if "user_id" not in session:
        session["user_id"] = os.urandom(16).hex()  # Unique user ID
        session.permanent = True  # Keep the cookie after the browser closes
    user_id = session["user_id"]

    history = get_user_history(user_id)  # Get user's history

    if request.method == "POST":
        user_input = request.form["user_input"]
        store_message(user_id, "user", user_input)  # Store with user ID
        history.append({"role": "user", "content": user_input})

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[SYSTEM_MESSAGE] + history,
            temperature=0.6,
        )

        assistant_reply = response.choices[0].message.content
        store_message(user_id, "assistant", assistant_reply)  # Store with user ID
        history.append({"role": "assistant", "content": assistant_reply})

    # The database rows contain no system message, so render the full list
    return render_template("chat.html", history=history)

@app.route("/clear", methods=["POST"])
def clear_chat():
    session.pop("user_id", None)  # Remove the user ID; the next visit starts a new history
    return redirect(url_for("chat"))

Explanation:

  • The database table "messages" now includes a "user_id" column.
  • When a user starts a session, a unique "user_id" is generated and stored in the Flask session.
  • The store_message() function now requires a "user_id" and stores it along with the message.
  • The get_user_history() function retrieves the conversation history for a specific user from the database.
  • The chat route retrieves the user's history and uses it to construct the messages sent to OpenAI. The history therefore carries over between visits for as long as the browser keeps the session cookie that holds the "user_id". For history that follows a user across browsers or devices, the ID would have to come from a login system instead.

Benefits:

  • Enables personalized conversation history for each user.
  • Allows for user-specific context and preferences.
  • Facilitates analysis of conversation patterns over time.

Drawbacks:

  • Requires a database.
  • More complex to implement than simple session-based storage.
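
Because get_user_history() loads every stored message, a long-lived user history will eventually run into the limits discussed in 4.3.4. One way to combine persistence with a cap is to let the database return only the newest rows. The sketch below illustrates the query pattern. It substitutes Python's built-in sqlite3 with an in-memory database for PostgreSQL so that it runs without a server (psycopg2 would use %s placeholders instead of ?), and the function name recent_history and the sample data are assumptions made for illustration.

```python
import sqlite3
from typing import Dict, List

SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}


def recent_history(conn: sqlite3.Connection, user_id: str, limit: int = 19) -> List[Dict[str, str]]:
    """Return the system message plus the user's newest `limit` messages, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
    rows.reverse()  # The query returns newest first; the API expects oldest first.
    return [dict(SYSTEM_MESSAGE)] + [{"role": r, "content": c} for r, c in rows]


# Demonstration with an in-memory database and made-up messages
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "user_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
)
for i in range(5):
    conn.execute(
        "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
        ("alice", "user", f"message {i}"),
    )
conn.execute(
    "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
    ("bob", "user", "hello"),
)
history = recent_history(conn, "alice", limit=3)
print([m["content"] for m in history])
```

Sorting by id in descending order with LIMIT picks the newest rows, and reversing the result restores chronological order before the system message is prepended.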

In this section, you learned several crucial aspects of building a chatbot with memory capabilities:

  • Added session-based memory to your chatbot using st.session_state (Streamlit) and flask.session (Flask)
    • Implemented temporary storage for ongoing conversations
    • Learned how to manage session variables effectively
  • Preserved chat history across interactions
    • Created database schemas to store conversation data
    • Implemented methods to save and retrieve past messages
  • Improved context and coherence for multi-turn conversations
    • Developed systems to maintain conversation context
    • Enhanced natural language understanding through historical context
  • Learned to cap token usage by trimming message history
    • Implemented efficient message pruning strategies
    • Balanced memory retention with API token limitations

This gives your chatbot the ability to hold natural, flowing conversations — a key milestone toward building an intelligent assistant. With these features, your chatbot can now remember previous interactions, maintain context throughout conversations, and manage memory efficiently while staying within technical constraints.

4.3 Implementing Session-Based Chat Memory

Session-based memory is a crucial feature that transforms a simple chatbot into a truly interactive and context-aware conversational agent. Without memory, each interaction becomes isolated and disconnected, forcing users to repeatedly provide context and limiting the natural flow of conversation. This section explores how to implement robust memory management in your chatbot, enabling it to maintain coherent, contextual discussions across multiple exchanges.

We'll examine two distinct approaches to implementing session memory - using Streamlit's built-in state management and Flask's server-side sessions. Both methods offer their own advantages and can be tailored to meet specific project requirements. By the end of this section, you'll understand how to create a chatbot that can maintain context, remember previous interactions, and provide more meaningful and connected responses.

The implementation we'll cover ensures that your chatbot can:

  • Maintain contextual awareness throughout entire conversations
  • Handle complex multi-turn dialogues effectively
  • Provide more relevant and personalized responses based on conversation history
  • Manage memory efficiently without exceeding token limits
✅ Giving your assistant memory — so it doesn’t forget the conversation flow every time the page refreshes or the session resets.

4.3.1 Why Does Session Memory Matter?

Session memory is a fundamental aspect of creating intelligent chatbots that can engage in meaningful, context-aware conversations. Just like humans rely on memory during discussions, chatbots need a way to remember and process previous interactions. When we talk with others, we naturally reference earlier points, build on shared understanding, and maintain a coherent flow of ideas. This natural communication pattern is what makes conversations feel organic and meaningful. Without memory capabilities, chatbots are essentially starting fresh with each response, leading to disconnected and often frustrating interactions that feel more like talking to a machine than having a real conversation.

The concept of session memory fundamentally transforms chatbot interactions in several critical ways:

  • Remember and reference previous parts of the conversation with precision
    • Track specific details mentioned earlier in the chat, such as user preferences, technical specifications, or personal information shared
    • Reference past agreements or decisions accurately, ensuring continuity in complex discussions or negotiations
    • Maintain historical context for better problem-solving and support
  • Build context incrementally throughout an interaction
    • Understand complex topics that unfold over multiple messages, allowing for deeper exploration of subjects
    • Develop more sophisticated responses as the conversation progresses, building upon previously established concepts
    • Create a coherent narrative thread across multiple exchanges
  • Provide more nuanced and relevant responses based on the conversation history
    • Tailor answers to the user's demonstrated knowledge level, adjusting terminology and complexity accordingly
    • Avoid repeating information already discussed, making conversations more efficient
    • Use past interactions to provide more personalized and contextually appropriate responses
  • Create a more natural, human-like conversational experience
    • Maintain consistent personality and tone throughout the chat, enhancing user engagement
    • Adapt responses based on user preferences and past interactions, creating a more personalized experience
    • Learn from previous exchanges to improve the quality of future interactions

In real conversations, context builds over time in multiple sophisticated ways that mirror natural human dialogue patterns:

  • The user refers to past messages
    • Questions naturally build upon previous answers, creating a continuous thread of understanding
    • Topics evolve organically as users reference and expand on earlier points in the conversation
    • Previous context shapes how new information is interpreted and understood
  • The assistant needs to remember previous answers
    • Maintains consistency across responses to build trust and reliability
    • Uses established context to provide more nuanced and relevant information
    • Builds a comprehensive understanding of the user's needs over time
  • Follow-up questions depend on prior knowledge
    • Each question builds upon the foundation of previous exchanges
    • Complex topics can be explored gradually, with increasing depth
    • The conversation naturally progresses from basic concepts to more advanced understanding
    • Creates more meaningful dialogue chains by connecting related ideas
    • Enables natural conversation flow that feels more human-like and engaging

Without memory implementation, chatbots treat each interaction as an isolated event, completely disconnected from previous exchanges. This fundamental limitation affects even sophisticated models like GPT-4o in several critical ways:

  1. Loss of Context: Each response is generated without any awareness of previous conversations, making it impossible to maintain coherent, extended discussions.
  2. Repetitive Interactions: The chatbot may provide the same information multiple times or ask for details that were already shared, creating a frustrating user experience.
  3. Inconsistent Responses: Without access to previous exchanges, the chatbot might give contradictory answers to related questions, undermining user trust.
  4. Limited Understanding: The inability to reference past context means the chatbot cannot build upon previously established knowledge or adapt its responses based on the user's demonstrated understanding.

By the end of this section, you'll know how to:

  • Store and retrieve messages for a given session
    • Implement secure storage mechanisms using industry-standard encryption and protection
    • Handle different types of message data effectively, including text, structured data, and metadata
  • Maintain and update multi-turn memory
    • Process conversation chains efficiently using optimized data structures
    • Manage context across multiple exchanges while maintaining conversation coherence
  • Avoid bloated token usage by capping history length
    • Implement smart memory management strategies that prioritize relevant information
    • Balance context retention with performance through intelligent pruning algorithms

4.3.2 Streamlit: Session State Memory

Streamlit makes managing conversation history super simple with st.session_state, which persists data throughout the browser session.

Here's how you can create a chatbot with session memory using Streamlit:

Step 1: Import Libraries

import streamlit as st
import openai
import os
from dotenv import load_dotenv
  • streamlit: Used to create the user interface for the chatbot.
  • openai: The OpenAI Python library, used to interact with the GPT-4o model.
  • os: Provides a way to interact with the operating system, for example to access environment variables.
  • dotenv: Used to load environment variables from a .env file.

Step 2: Load API Key and Configure Page

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

st.set_page_config(page_title="GPT-4o Chat with Memory", page_icon="🧠")
st.title("🧠 GPT-4o Chatbot with Session Memory")
  • load_dotenv(): Loads the OpenAI API key from a .env file. This file should be in the same directory as your Python script and contain the line OPENAI_API_KEY=YOUR_API_KEY.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Sets the OpenAI API key.
  • st.set_page_config(...): Configures the page title and icon that appear in the browser tab.
  • st.title(...): Sets the title of the Streamlit application, which is displayed at the top of the page.

Step 3: Initialize Session State

if "messages" not in st.session_state:
    st.session_state.messages = [{"role": "system", "content": "You are a helpful assistant that remembers this session."}]
  • st.session_state: Streamlit's way of storing variables across user interactions. Data in st.session_state persists as long as the user's browser tab remains open.
  • This code checks if the key "messages" exists in st.session_state. If it doesn't (which is the case when the user first loads the app), it initializes "messages" to a list containing a single dictionary.
  • This dictionary represents the "system message," which is used to set the behavior of the assistant. In this case, the system message tells the assistant to be helpful and remember the conversation.

Step 4: Display Chat History

for msg in st.session_state.messages[1:]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])
  • This for loop iterates through the messages in the "messages" list, starting from the second message (index 1) to skip the system message.
  • st.chat_message(msg["role"]): Creates a chat bubble in the Streamlit app to display the message. The role ("user" or "assistant") determines the appearance of the bubble.
  • st.markdown(msg["content"]): Displays the content of the message within the chat bubble. st.markdown is used to render the text.

Step 5: User Input and Response Generation

user_input = st.chat_input("Say something...")
if user_input:
    st.chat_message("user").markdown(user_input)
    st.session_state.messages.append({"role": "user", "content": user_input})

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = openai.ChatCompletion.create(
                model="gpt-4o",
                messages=st.session_state.messages,
                temperature=0.6
            )
            reply = response["choices"][0]["message"]["content"]
            st.markdown(reply)
            st.session_state.messages.append({"role": "assistant", "content": reply})
  • user_input = st.chat_input("Say something..."): Creates a text input field at the bottom of the app where the user can type their message. The label "Say something..." is displayed next to the input field.
  • The if user_input: block is executed when the user enters text and presses Enter.
  • st.chat_message("user").markdown(user_input): Displays the user's message in the chat interface.
  • st.session_state.messages.append({"role": "user", "content": user_input}): Appends the user's message to the "messages" list in st.session_state, so it's stored in the conversation history.
  • with st.chat_message("assistant"):: Creates a chat bubble for the assistant's response.
  • with st.spinner("Thinking..."): Displays a spinner animation while the app is waiting for a response from the OpenAI API.
  • response = openai.ChatCompletion.create(...): Calls the OpenAI API to get a response from the GPT-4o model.
    • model: Specifies the language model to use ("gpt-4o").
    • messages: Passes the entire conversation history (stored in st.session_state.messages) to the API. This is how the model "remembers" the conversation.
    • temperature: A value between 0 and 1 that controls the randomness of the model's output. A lower value (e.g., 0.2) makes the output more deterministic, while a higher value (e.g., 0.8) makes it more random.
  • reply = response["choices"][0]["message"]["content"]: Extracts the assistant's reply from the API response.
  • st.markdown(reply): Displays the assistant's reply in the chat interface.
  • st.session_state.messages.append({"role": "assistant", "content": reply}): Appends the assistant's reply to the "messages" list in st.session_state.

This example creates a simple chatbot with memory using Streamlit. The st.session_state.messages list stores the conversation history, allowing the chatbot to maintain context across multiple interactions.  The chat history is displayed in the app, and the user can input messages using the st.chat_input field.  The assistant's responses are generated by the OpenAI GPT-4o model.

4.3.3 Flask: Using Server-Side Sessions

In Flask, server-side session management provides a robust way to maintain conversation history. The session object acts as a persistent dictionary that stores data on the server rather than the client side, making it more secure and reliable. You can use either the built-in session object, which stores data in an encrypted cookie, or implement in-memory storage solutions like Redis for better scalability.

This server-side approach ensures that the chatbot can maintain context and remember previous interactions throughout the user's active session, even if they navigate between different pages or refresh the browser.

Here's how to implement a chatbot with server-side sessions in Flask:

Step 1: Install Required Libraries

pip install flask openai flask-session python-dotenv
  • flask: A web framework.
  • openai: The OpenAI Python library.
  • flask-session: Flask extension to handle server-side sessions.
  • python-dotenv: To load environment variables from a .env file.

Step 2: Update app.py

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
from flask_session import Session

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

app = Flask(__name__)
app.secret_key = os.urandom(24)
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

@app.route("/", methods=["GET", "POST"])
def chat():
    if "history" not in session:
        session["history"] = [
            {"role": "system", "content": "You are a helpful assistant."}
        ]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session["history"].append({"role": "user", "content": user_input})

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        session["history"].append({"role": "assistant", "content": assistant_reply})

        return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt

    return render_template("chat.html", history=session.get("history")[1:])  # Skip system prompt
  • from flask import ...: Imports necessary Flask components, including session for managing user sessions.
  • import openai: Imports the OpenAI library.
  • import os: Imports the os module for interacting with the operating system, particularly for accessing environment variables.
  • from dotenv import load_dotenv: Imports the load_dotenv function from the python-dotenv library.
  • from flask_session import Session: Imports the Session class from flask_session.
  • load_dotenv(): Loads environment variables (like the OpenAI API key) from a .env file.
  • openai.api_key = os.getenv("OPENAI_API_KEY"): Retrieves the OpenAI API key from the environment and sets it for the OpenAI library.
  • app = Flask(__name__): Creates a Flask application instance.
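  • app.secret_key = os.urandom(24): Sets a secret key for the Flask application, which Flask needs in order to sign session cookies. os.urandom(24) generates a cryptographically secure random key, but a new one is generated every time the app starts, so existing sessions become invalid after a restart. For anything beyond local experiments, load a fixed secret from an environment variable instead.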
  • app.config["SESSION_TYPE"] = "filesystem": Configures Flask-Session to store session data on the server's file system. Other options like "redis" or "mongodb" are available for production use.
  • Session(app): Initializes the Flask-Session extension, binding it to the Flask app.
  • @app.route("/", methods=["GET", "POST"]): Defines the route for the application's main page ("/"). The chat() function handles both GET and POST requests.
  • def chat()::
    • if "history" not in session:: Checks if the user's session already has a conversation history. If not, it initializes the session with a system message. The system message helps set the behavior of the assistant.
    • if request.method == "POST":: Handles POST requests, which occur when the user submits a message through the chat form.
      • user_input = request.form["user_input"]: Retrieves the user's input from the form.
      • session["history"].append({"role": "user", "content": user_input}): Appends the user's message to the conversation history stored in the session.
      • response = openai.ChatCompletion.create(...): Calls the OpenAI API to get a response from the GPT-4o model, passing the conversation history.
      • assistant_reply = response["choices"][0]["message"]["content"]: Extracts the assistant's reply from the API response.
      • session["history"].append({"role": "assistant", "content": assistant_reply}): Appends the assistant's reply to the conversation history in the session.
      • return render_template("chat.html", history=session.get("history")[1:]): Renders the chat.html template, passing the conversation history (excluding the initial system message) to be displayed.
    • return render_template("chat.html", history=session.get("history")[1:]): Handles GET requests (when the user first loads the page). It renders the chat.html template, passing the conversation history (excluding the system message).

Step 3: Your HTML Template (templates/chat.html)

<!DOCTYPE html>
<html>
<head>
  <title>GPT-4o Assistant</title>
  <style>
    body { font-family: Arial; background: #f7f7f7; padding: 40px; }
    .container { max-width: 600px; margin: auto; background: white; padding: 20px; border-radius: 10px; }
    .user, .assistant { margin-bottom: 15px; }
    .user p { background: #d4f0ff; padding: 10px; border-radius: 10px; }
    .assistant p { background: #e8ffe8; padding: 10px; border-radius: 10px; }
    textarea { width: 100%; height: 80px; }
    input[type="submit"] { margin-top: 10px; padding: 10px 20px; }
  </style>
</head>
<body>
  <div class="container">
    <h2>GPT-4o Chatbot</h2>
    {% for msg in history %}
      <div class="{{ msg.role }}">
        <p><strong>{{ msg.role.capitalize() }}:</strong> {{ msg.content }}</p>
      </div>
    {% endfor %}
    <form method="post">
      <textarea name="user_input" placeholder="Type your message..."></textarea><br>
      <input type="submit" value="Send">
    </form>
  </div>
</body>
</html>
  • <!DOCTYPE html>: Declares the document type as HTML5.
  • <html>: The root element of the HTML document.
  • <head>: Contains metadata about the HTML document.
    • <title>: Specifies the title of the HTML page.
    • <style>: Includes CSS for basic styling of the chat interface.
  • <body>: Contains the visible content of the HTML page.
    • <div class="container">: A container for the chat application.
    • <h2>: A heading for the chat application.
    • {% for msg in history %}: A Jinja2 template loop that iterates through the history variable (passed from the Flask code) to display the chat messages.
      • <div class="{{ msg.role }}">: Creates a div element for each message. The class is set to the message's role ("user" or "assistant") for styling.
      • <p>: Displays the message content.
      • <strong>: Displays the role.
    • <form method="post">: A form for the user to submit their messages.
      • <textarea>: A multi-line text input field for the user to type their message.
      • <input type="submit" value="Send">: A button to send the message.
  • templates: Flask, by default, looks for HTML templates in a folder named "templates" in the same directory as your app.py file. So, this file should be saved as templates/chat.html.

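This code creates a chatbot using Flask and OpenAI, with the conversation history stored in server-side sessions. The server retains the chat memory for as long as the session remains valid. With Flask-Session's default settings, sessions are permanent and expire after Flask's PERMANENT_SESSION_LIFETIME (31 days by default), so the history survives page refreshes and closed tabs until the session expires or is cleared.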

4.3.4 Optional: Cap the Message History

Large language models have a limited context window (e.g., 128k tokens for GPT-4o), which determines how much of the conversation the model can "remember" at once. To prevent exceeding this limit and encountering errors, you should cap the number of messages stored in the conversation history.

Here's how to trim older entries in both Streamlit and Flask:

Streamlit

import streamlit as st

SESSION_MESSAGES_KEY = "messages"  # Assumed to match the key used in the Streamlit app
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

if SESSION_MESSAGES_KEY in st.session_state:
    messages = st.session_state[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        st.session_state[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • MAX_HISTORY: A constant defining the maximum number of messages to retain. This example keeps at most 20 messages in total: the system message plus the 19 most recent ones.
  • if SESSION_MESSAGES_KEY in st.session_state: Checks whether the key exists before trimming.
  • The code then trims the st.session_state.messages list, preserving the first message (the system message) and the last MAX_HISTORY - 1 messages.

Flask

from flask import session

SESSION_MESSAGES_KEY = "history"  # The session key used in app.py above
MAX_HISTORY = 20  # Maximum number of messages to keep, including the system message

# Run this inside a request handler, for example just before calling the API
if SESSION_MESSAGES_KEY in session:
    messages = session[SESSION_MESSAGES_KEY]
    if len(messages) > MAX_HISTORY:
        # Keep the system message plus the most recent MAX_HISTORY - 1 messages
        session[SESSION_MESSAGES_KEY] = [messages[0]] + messages[-(MAX_HISTORY - 1):]

Explanation:

  • MAX_HISTORY: A constant defining the maximum number of messages to retain, including the system message.
  • if SESSION_MESSAGES_KEY in session: Checks whether the key exists before trimming.
  • The code trims the session["history"] list, keeping the first message (the system message) and the last MAX_HISTORY - 1 messages.
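
Because the trimming logic is identical in both frameworks, it can be pulled into a small helper that operates on a plain list. The sketch below is one way to do that; the function name trim_history is hypothetical, and it assumes the first message in the list is the system message.

```python
def trim_history(messages, max_history=20):
    """Return at most max_history messages: the first one plus the most recent ones."""
    if max_history < 1:
        raise ValueError("max_history must be at least 1")
    if len(messages) <= max_history:
        return list(messages)
    keep_recent = max_history - 1
    # Slice by explicit index so that keep_recent == 0 yields an empty tail.
    recent = messages[len(messages) - keep_recent:]
    return [messages[0]] + recent


history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(30)]

trimmed = trim_history(history, max_history=20)
print(len(trimmed))            # 20
print(trimmed[1]["content"])   # message 11
```

In Streamlit you would assign the result back to the entry in st.session_state, and in Flask to session["history"].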

Important Implementation Considerations:

  • System Message Management: The system message plays a crucial role in setting the chatbot's behavior and context. It must always be preserved as the first message in your conversation history. When implementing message trimming, ensure your code specifically maintains this message by:
    • Keeping it separate from the regular conversation flow
    • Including special handling in your trimming logic
    • Verifying its presence before each interaction
  • Comprehensive Testing Protocol: To ensure reliable chatbot performance:
    • Test with varying conversation lengths, from short exchanges to extended dialogues
    • Verify that context is maintained even after trimming
    • Check for potential edge cases where coherence might break
    • Monitor system resource usage during extended conversations
  • Advanced Trimming Strategies: Consider these sophisticated approaches:
    • Token-based trimming: Calculate actual token usage using a tokenizer
    • Importance-based trimming: Keep messages based on relevance
    • Hybrid approach: Combine token counting with message relevance
    • Dynamic adjustment: Modify trim threshold based on conversation complexity
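
As a rough illustration of token-based trimming, the sketch below drops the oldest non-system messages until the history fits a token budget. It deliberately avoids a real tokenizer: estimate_tokens uses an assumed heuristic of about four characters per token, which is only a crude approximation for English text. In a real application you would pass in a counter built on an actual tokenizer library. All function names here are hypothetical.

```python
def estimate_tokens(message):
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(message["content"]) // 4)


def trim_to_token_budget(messages, max_tokens, count_tokens=estimate_tokens):
    """Drop the oldest non-system messages until the total fits within max_tokens."""
    system, rest = messages[0], list(messages[1:])
    total = count_tokens(system) + sum(count_tokens(m) for m in rest)
    while rest and total > max_tokens:
        total -= count_tokens(rest.pop(0))  # Remove the oldest message first
    return [system] + rest


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "a" * 400},       # about 100 estimated tokens
    {"role": "assistant", "content": "b" * 400},  # about 100 estimated tokens
    {"role": "user", "content": "c" * 40},        # about 10 estimated tokens
]

trimmed = trim_to_token_budget(history, max_tokens=120)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user']
```

The system message is always kept, even if it alone exceeds the budget, which matches the requirement above that it remain the first message.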

4.3.5 Bonus Tip: Save to File or Database

Want to persist memory even after the session ends? Let's explore several effective methods for long-term memory storage:

1. Export conversation history to a JSON file

Process: The st.session_state.messages (in Streamlit) or session["history"] (in Flask) data can be saved to a JSON file. This involves converting the list of message dictionaries into a JSON string and writing it to a file.

Code Example (Streamlit):

import json
import streamlit as st

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from st.session_state to a JSON file."""
    if SESSION_MESSAGES_KEY in st.session_state:
        try:
            with open(filename, "w") as f:
                json.dump(st.session_state[SESSION_MESSAGES_KEY], f, indent=2)
            st.success(f"Chat log saved to {filename}")  # Show the confirmation in the app, not the server console
        except Exception as e:
            st.error(f"Error saving chat log: {e}")

# Example usage:  Call this function when the user ends the session or when appropriate
if st.button("Save Chat Log"):
    save_chat_log()

Code Example (Flask):

import json
from flask import Flask, session

# In practice, add this route to the app created in app.py, which already configures sessions
app = Flask(__name__)

SESSION_MESSAGES_KEY = "history"  # The session key used in app.py above

def save_chat_log(filename="chat_log.json"):
    """Saves the chat log from the Flask session to a JSON file.

    Returns True on success, False if there was nothing to save or writing failed.
    """
    if SESSION_MESSAGES_KEY not in session:
        return False
    try:
        with open(filename, "w") as f:
            json.dump(session[SESSION_MESSAGES_KEY], f, indent=2)
        print(f"Chat log saved to {filename}")
        return True
    except Exception as e:
        app.logger.error(f"Error saving chat log: {e}")
        return False

# Example usage
@app.route('/save_log')
def save_log():
    if save_chat_log():
        return "Chat log saved!"
    return "No chat log was saved.", 400

Explanation:

  • The json.dump() function serializes the list of messages as JSON and writes it directly to the open file. The indent=2 parameter makes the JSON file more human-readable.
  • The code handles potential errors during file writing.
  • The Streamlit example uses a button to trigger the save, and the Flask example creates a route /save_log to save the file.

Benefits:

  • Simple to implement using Python's built-in json module.
  • Good for small-scale applications and quick prototypes.
  • Easy to backup and version control.

Drawbacks:

  • Not ideal for large-scale, multi-user applications.
  • No efficient querying or indexing.

2. Store conversations in a SQLite, PostgreSQL, or NoSQL database

Process: Store the conversation history in a database. Each message can be a row in a table (for SQL databases) or a document in a collection (for NoSQL databases).

Code Example (SQLite - Streamlit):

import streamlit as st
import sqlite3
import datetime

def get_connection():
    """Gets or creates a SQLite connection."""
    conn = st.session_state.get("sqlite_conn")
    if conn is None:
        # Streamlit may rerun the script on a different thread, so allow cross-thread use
        conn = sqlite3.connect("chat_log.db", check_same_thread=False)
        st.session_state.sqlite_conn = conn
    return conn

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            role TEXT NOT NULL,
            content TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def store_message(role, content):
    """Stores a message in the database."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO messages (role, content) VALUES (?, ?)",
        (role, content),
    )
    conn.commit()

create_table()  # Ensure table exists

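# Store messages
# Note: Streamlit reruns the script on every interaction, so guard against storing a message twice
if st.session_state.get("user_input"):  # replace user_input with the key you are using
    store_message("user", st.session_state.user_input)
if st.session_state.get("reply"):  # replace reply with the key you are using
    store_message("assistant", st.session_state.reply)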

# Example of retrieving messages (optional, for demonstration)
conn = get_connection()
cursor = conn.cursor()
cursor.execute("SELECT role, content, created_at FROM messages ORDER BY id DESC LIMIT 5")  # id breaks ties within the same second
recent_messages = cursor.fetchall()
st.write("Last 5 Messages from DB:")
for row in recent_messages:
    st.write(f"{row[0]}: {row[1]} (at {row[2]})")

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session
import openai
import os
from dotenv import load_dotenv
import psycopg2  # PostgreSQL library
import datetime
from typing import List, Dict

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL") #Make sure to set this in .env

app = Flask(__name__)
app.secret_key = os.urandom(24)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    session_id TEXT NOT NULL,
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(session_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (%s, %s, %s)",
                (session_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()  # Ensure the table exists

@app.route("/", methods=["GET", "POST"])
def chat():
    if "session_id" not in session:
        session["session_id"] = os.urandom(16).hex()  # Unique session ID

    if "history" not in session:
        session["history"] = [{"role": "system", "content": "You are a helpful assistant."}]

    if request.method == "POST":
        user_input = request.form["user_input"]
        session_id = session["session_id"]
        store_message(session_id, "user", user_input)  # Store in DB
        session["history"].append({"role": "user", "content": user_input})

        # Fetch this session's stored messages (for demonstration; the result is not used below)
        messages_from_db = []
        conn = get_db_connection()
        if conn:
            try:
                cursor = conn.cursor()
                cursor.execute("SELECT role, content, created_at FROM messages WHERE session_id = %s ORDER BY created_at, id", (session_id,))
                messages_from_db = cursor.fetchall()
            except psycopg2.Error as e:
                print(f"Error fetching messages: {e}")
            finally:
                conn.close()

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=session["history"],
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        store_message(session_id, "assistant", assistant_reply)  # Store in DB
        session["history"].append({"role": "assistant", "content": assistant_reply})
        session.modified = True

        return render_template("chat.html", history=session.get("history")[1:])
    return render_template("chat.html", history=session.get("history")[1:])

Explanation:

  • The Streamlit example uses SQLite, a lightweight database that doesn't require a separate server. The Flask example uses PostgreSQL, a more robust database that is suitable for multi-user applications.
  • Both examples create a table named "messages" to store the conversation history. The table includes columns for message ID, role, content, and timestamp; the Flask version adds a session_id column so that each conversation can be queried separately.
  • The store_message() function inserts a new message into the database.
  • The Flask example also retrieves the session's messages from the database, but only to demonstrate the query: the template and the API call still use session["history"]. The Streamlit example shows how to retrieve and display the most recent rows.

Benefits:

  • SQLite: Perfect for single-user applications with structured data. No separate database server is needed.
  • PostgreSQL: Ideal for multi-user systems requiring concurrent access. More robust and scalable than SQLite.
  • NoSQL (Not shown in detail): Best for flexible schema and unstructured conversation data. Databases like MongoDB or CouchDB would be suitable.

Drawbacks:

  • Requires more setup than using a JSON file.
  • Need to manage database connections and schemas.

3. Reuse memory in future sessions by tagging with a user ID

Process: To allow users to have persistent, personalized conversations, you can tag each message with a user ID and store this information in the database. When a user returns, you can retrieve their specific conversation history from the database.

Code Example (PostgreSQL - Flask):

from flask import Flask, request, render_template, session, redirect, url_for
import openai
import os
from dotenv import load_dotenv
import psycopg2
import datetime
from typing import List, Dict

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")

app = Flask(__name__)
app.secret_key = os.urandom(24)

def get_db_connection():
    """Gets a PostgreSQL connection."""
    try:
        conn = psycopg2.connect(DATABASE_URL)
        return conn
    except psycopg2.Error as e:
        print(f"Database connection error: {e}")
        return None

def create_table():
    """Creates the messages table if it doesn't exist."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS messages (
                    id SERIAL PRIMARY KEY,
                    user_id TEXT NOT NULL,  -- Added user_id
                    role TEXT NOT NULL,
                    content TEXT NOT NULL,
                    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
                )
            """)
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error creating table: {e}")
        finally:
            conn.close()

def store_message(user_id: str, role: str, content: str):
    """Stores a message in the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (%s, %s, %s)",
                (user_id, role, content),
            )
            conn.commit()
        except psycopg2.Error as e:
            print(f"Error storing message: {e}")
        finally:
            conn.close()

create_table()

def get_user_history(user_id: str) -> List[Dict[str, str]]:
    """Retrieves a user's conversation history from the database."""
    conn = get_db_connection()
    if conn is not None:
        try:
            cursor = conn.cursor()
            cursor.execute(
                "SELECT role, content FROM messages WHERE user_id = %s ORDER BY created_at",
                (user_id,),
            )
            history = [{"role": row[0], "content": row[1]} for row in cursor.fetchall()]
            return history
        except psycopg2.Error as e:
            print(f"Error retrieving user history: {e}")
            return []
        finally:
            conn.close()
    return []

SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

@app.route("/", methods=["GET", "POST"])
def chat():
    if "user_id" not in session:
        session["user_id"] = os.urandom(16).hex()  # Unique user ID
    user_id = session["user_id"]

    # The system message is not stored in the database, so prepend it to the stored history
    history = [SYSTEM_MESSAGE] + get_user_history(user_id)

    if request.method == "POST":
        user_input = request.form["user_input"]
        store_message(user_id, "user", user_input)  # Store with user ID
        history.append({"role": "user", "content": user_input})

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=history,
            temperature=0.6,
        )

        assistant_reply = response["choices"][0]["message"]["content"]
        store_message(user_id, "assistant", assistant_reply)  # Store with user ID
        history.append({"role": "assistant", "content": assistant_reply})

    # Skip the system prompt when rendering
    return render_template("chat.html", history=history[1:])

@app.route("/clear", methods=["POST"])
def clear_chat():
    session.pop("user_id", None)  #remove user id
    session.pop("history", None)
    return redirect(url_for("chat"))

Explanation:

  • The database table "messages" now includes a "user_id" column.
  • When a user starts a session, a unique "user_id" is generated and stored in the Flask session.
  • The store_message() function now requires a "user_id" and stores it along with the message.
  • The get_user_history() function retrieves the conversation history for a specific user from the database.
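  • The chat route retrieves the user's history from the database and uses it to construct the messages sent to OpenAI, so the conversation carries over between visits for as long as the browser keeps the same session cookie. Because app.secret_key is regenerated on every restart and the ID lives only in the session, true long-term persistence requires a fixed secret key and, ideally, a user ID tied to a login.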

Benefits:

  • Enables personalized conversation history for each user.
  • Allows for user-specific context and preferences.
  • Facilitates analysis of conversation patterns over time.

Drawbacks:

  • Requires a database.
  • More complex to implement than simple session-based storage.
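
The core idea, tagging each row with a user ID and rebuilding the message list on the next visit, can be tried out without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; the helper names and the max_rows cap are assumptions made for illustration.

```python
import sqlite3

SYSTEM_MESSAGE = {"role": "system", "content": "You are a helpful assistant."}

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id TEXT NOT NULL,
        role TEXT NOT NULL,
        content TEXT NOT NULL
    )
    """
)


def store_message(user_id, role, content):
    """Insert one message row tagged with the user's ID."""
    with conn:  # Commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
            (user_id, role, content),
        )


def build_messages(user_id, max_rows=50):
    """Return the system message followed by the user's most recent rows, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, max_rows),
    ).fetchall()
    recent = [{"role": role, "content": content} for role, content in reversed(rows)]
    return [SYSTEM_MESSAGE] + recent


store_message("alice", "user", "My name is Alice.")
store_message("bob", "user", "What is Flask?")
store_message("alice", "assistant", "Nice to meet you, Alice!")

alice_messages = build_messages("alice")
print(len(alice_messages))  # 3
```

Each user sees only their own rows, and capping the query with LIMIT keeps a long-running history from growing past what the model's context window can hold.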

In this section, you learned several crucial aspects of building a chatbot with memory capabilities:

  • Added session-based memory to your chatbot using st.session_state (Streamlit) and flask.session (Flask)
    • Implemented temporary storage for ongoing conversations
    • Learned how to manage session variables effectively
  • Preserved chat history across interactions
    • Created database schemas to store conversation data
    • Implemented methods to save and retrieve past messages
  • Improved context and coherence for multi-turn conversations
    • Developed systems to maintain conversation context
    • Gave the model earlier turns to draw on when answering follow-up questions
  • Learned to cap token usage by trimming message history
    • Implemented efficient message pruning strategies
    • Balanced memory retention with API token limitations

This gives your chatbot the ability to hold natural, flowing conversations — a key milestone toward building an intelligent assistant. With these features, your chatbot can now remember previous interactions, maintain context throughout conversations, and manage memory efficiently while staying within technical constraints.