HeliosDB-Lite SQLite Advanced Features Guide¶

Once you're comfortable with basic drop-in SQLite compatibility, HeliosDB-Lite offers powerful advanced features that go far beyond traditional SQLite capabilities.

Table of Contents¶

Vector Search from Python
Database Branching
Time-Travel Queries
Encryption at Rest
Server Mode Switching
Hybrid Mode Usage
Performance Tuning
Gradual Migration to Full Features

Vector Search from Python¶

HeliosDB-Lite includes native vector search capabilities for semantic similarity, embeddings, and AI applications.

Basic Vector Search Setup¶

import heliosdb_sqlite as sqlite3
import numpy as np

# Connect with vector search enabled
conn = sqlite3.connect('vectors.db', enable_vector_search=True)
cursor = conn.cursor()

# Create table with vector column
cursor.execute('''
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(384)  -- 384-dimensional vector
    )
''')

# Insert document with embedding
doc_embedding = np.random.rand(384).tolist()
cursor.execute('''
    INSERT INTO documents (title, content, embedding)
    VALUES (?, ?, ?)
''', ('Introduction to AI', 'Machine learning basics...', doc_embedding))
conn.commit()

Semantic Search Queries¶

# Generate query embedding
query_text = "artificial intelligence fundamentals"
query_embedding = generate_embedding(query_text)  # Your embedding model

# Find similar documents using cosine similarity
cursor.execute('''
    SELECT id, title, content,
           vector_distance(embedding, ?, 'cosine') as similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 10
''', (query_embedding,))

results = cursor.fetchall()
for doc_id, title, content, similarity in results:
    print(f"{title}: {similarity:.4f}")

Advanced Vector Operations¶

# Different distance metrics
cursor.execute('''
    SELECT title,
           vector_distance(embedding, ?, 'cosine') as cosine_dist,
           vector_distance(embedding, ?, 'euclidean') as euclidean_dist,
           vector_distance(embedding, ?, 'manhattan') as manhattan_dist
    FROM documents
''', (query_embedding, query_embedding, query_embedding))

# Vector arithmetic (e.g., "king - man + woman = queen")
cursor.execute('''
    SELECT title,
           vector_distance(
               embedding,
               vector_add(
                   vector_subtract(?, ?),
                   ?
               ),
               'cosine'
           ) as similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 5
''', (king_embedding, man_embedding, woman_embedding))

# Batch vector search
query_embeddings = [emb1, emb2, emb3, ...]
cursor.executemany('''
    SELECT id, title FROM documents
    WHERE vector_distance(embedding, ?, 'cosine') < 0.3
''', [(emb,) for emb in query_embeddings])

Real-World Example: Semantic Document Search¶

import heliosdb_sqlite as sqlite3
from sentence_transformers import SentenceTransformer

# Initialize embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Connect to database
conn = sqlite3.connect('knowledge_base.db', enable_vector_search=True)
cursor = conn.cursor()

# Create schema
cursor.execute('''
    CREATE TABLE IF NOT EXISTS articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        category TEXT,
        embedding VECTOR(384),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
''')

def add_article(title, content, category):
    """Add article with automatic embedding generation"""
    embedding = model.encode(content).tolist()
    cursor.execute('''
        INSERT INTO articles (title, content, category, embedding)
        VALUES (?, ?, ?, ?)
    ''', (title, content, category, embedding))
    conn.commit()
    return cursor.lastrowid

def semantic_search(query, limit=10, category=None):
    """Semantic search with optional category filter"""
    query_embedding = model.encode(query).tolist()

    if category:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') as similarity
            FROM articles
            WHERE category = ?
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, category, limit))
    else:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') as similarity
            FROM articles
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, limit))

    return cursor.fetchall()

# Usage
add_article(
    'Introduction to Neural Networks',
    'Neural networks are computing systems inspired by biological neural networks...',
    'AI'
)

results = semantic_search('deep learning basics', limit=5)
for article_id, title, content, category, similarity in results:
    print(f"[{category}] {title} (similarity: {similarity:.4f})")

Database Branching¶

Create experimental branches of your database for testing without affecting production data.

Creating and Using Branches¶

import heliosdb_sqlite as sqlite3

conn = sqlite3.connect('production.db')

# Create a branch for experimentation
branch = conn.create_branch('experiment_2024')
print(f"Created branch: {branch.name}")

# Work on the branch
branch_conn = sqlite3.connect('production.db', branch='experiment_2024')
cursor = branch_conn.cursor()

# Make changes on the branch (doesn't affect main database)
cursor.execute('DELETE FROM users WHERE inactive = 1')
cursor.execute('UPDATE products SET price = price * 1.1')
branch_conn.commit()

# Test the changes
cursor.execute('SELECT COUNT(*) FROM users')
print(f"Users after cleanup: {cursor.fetchone()[0]}")

# If satisfied, merge back to main
if results_look_good():
    conn.merge_branch('experiment_2024')
    print("Branch merged successfully!")
else:
    conn.delete_branch('experiment_2024')
    print("Branch discarded")

branch_conn.close()
conn.close()

Use Cases for Branching¶

# Use Case 1: A/B Testing
def ab_test_query_performance():
    conn = sqlite3.connect('analytics.db')

    # Create branch A: with new index
    branch_a = conn.create_branch('with_index')
    conn_a = sqlite3.connect('analytics.db', branch='with_index')
    conn_a.execute('CREATE INDEX idx_user_date ON events(user_id, date)')

    # Create branch B: without index
    branch_b = conn.create_branch('without_index')
    conn_b = sqlite3.connect('analytics.db', branch='without_index')

    # Test query performance on both
    import time

    start = time.time()
    conn_a.execute('SELECT * FROM events WHERE user_id = ? AND date > ?', (123, '2024-01-01'))
    time_a = time.time() - start

    start = time.time()
    conn_b.execute('SELECT * FROM events WHERE user_id = ? AND date > ?', (123, '2024-01-01'))
    time_b = time.time() - start

    print(f"With index: {time_a:.4f}s")
    print(f"Without index: {time_b:.4f}s")

    # Merge the faster version
    if time_a < time_b:
        conn.merge_branch('with_index')
        print("Merged optimized version")

    conn_a.close()
    conn_b.close()
    conn.close()

# Use Case 2: Safe Schema Migrations
def safe_migration():
    conn = sqlite3.connect('app.db')

    # Create migration branch
    migration = conn.create_branch('schema_v2')
    conn_migration = sqlite3.connect('app.db', branch='schema_v2')

    try:
        # Attempt migration
        conn_migration.execute('ALTER TABLE users ADD COLUMN preferences JSON')
        conn_migration.execute('UPDATE users SET preferences = "{}"')
        conn_migration.commit()

        # Run tests on migrated schema
        run_integration_tests(conn_migration)

        # If tests pass, merge
        conn.merge_branch('schema_v2')
        print("Migration successful!")

    except Exception as e:
        print(f"Migration failed: {e}")
        conn.delete_branch('schema_v2')
        print("Rolled back to previous schema")

    conn_migration.close()
    conn.close()

Listing and Managing Branches¶

conn = sqlite3.connect('app.db')

# List all branches
branches = conn.list_branches()
for branch in branches:
    print(f"Branch: {branch.name}, Created: {branch.created_at}, Size: {branch.size_mb}MB")

# Get branch information
branch_info = conn.get_branch_info('experiment_2024')
print(f"Commits: {branch_info.commits}")
print(f"Diverged from main: {branch_info.divergence_point}")

# Compare branches
diff = conn.compare_branches('main', 'experiment_2024')
print(f"Tables changed: {diff.tables_modified}")
print(f"Rows changed: {diff.total_row_changes}")

# Clean up old branches
for branch in branches:
    if branch.age_days > 30 and not branch.is_active:
        conn.delete_branch(branch.name)
        print(f"Deleted old branch: {branch.name}")

Time-Travel Queries¶

Query historical data states without maintaining separate backup databases.

Basic Time-Travel Queries¶

import heliosdb_sqlite as sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect('app.db', enable_time_travel=True)
cursor = conn.cursor()

# Query current state
cursor.execute('SELECT * FROM users WHERE id = 123')
current_user = cursor.fetchone()
print(f"Current: {current_user}")

# Query state from 1 hour ago
one_hour_ago = datetime.now() - timedelta(hours=1)
cursor.execute(f'''
    SELECT * FROM users
    WHERE id = 123
    AS OF TIMESTAMP '{one_hour_ago.isoformat()}'
''')
past_user = cursor.fetchone()
print(f"1 hour ago: {past_user}")

# Query state from specific date
cursor.execute('''
    SELECT * FROM products
    AS OF TIMESTAMP '2024-01-01 00:00:00'
''')
products_jan1 = cursor.fetchall()
print(f"Products on Jan 1: {len(products_jan1)}")

Tracking Changes Over Time¶

# Find when a record changed
cursor.execute('''
    SELECT * FROM users
    WHERE id = 123
    FOR SYSTEM_TIME BETWEEN
        TIMESTAMP '2024-01-01 00:00:00'
        AND TIMESTAMP '2024-01-31 23:59:59'
''')

history = cursor.fetchall()
print(f"User had {len(history)} changes in January")

for timestamp, user_data in history:
    print(f"{timestamp}: {user_data}")

Audit and Compliance¶

def audit_user_changes(user_id, start_date, end_date):
    """Generate audit report of all changes to a user"""
    conn = sqlite3.connect('secure.db', enable_time_travel=True)
    cursor = conn.cursor()

    cursor.execute('''
        SELECT system_time, id, name, email, role, modified_by
        FROM users
        WHERE id = ?
        FOR SYSTEM_TIME BETWEEN ? AND ?
        ORDER BY system_time
    ''', (user_id, start_date, end_date))

    changes = cursor.fetchall()

    print(f"Audit Report for User {user_id}")
    print(f"Period: {start_date} to {end_date}")
    print(f"Total changes: {len(changes)}\n")

    for i, (timestamp, uid, name, email, role, modified_by) in enumerate(changes):
        if i > 0:
            print(f"\n--- Change {i} at {timestamp} by {modified_by} ---")
            prev = changes[i-1]
            if name != prev[2]:
                print(f"Name: {prev[2]} → {name}")
            if email != prev[3]:
                print(f"Email: {prev[3]} → {email}")
            if role != prev[4]:
                print(f"Role: {prev[4]} → {role}")
        else:
            print(f"Initial state at {timestamp}: {name}, {email}, {role}")

    conn.close()

# Usage
audit_user_changes(123, '2024-01-01', '2024-12-31')

Rollback to Previous State¶

def rollback_table_to_timestamp(table_name, target_timestamp):
    """Restore table to a previous state"""
    conn = sqlite3.connect('app.db', enable_time_travel=True)
    cursor = conn.cursor()

    # Get historical data
    cursor.execute(f'''
        SELECT * FROM {table_name}
        AS OF TIMESTAMP '{target_timestamp}'
    ''')
    historical_data = cursor.fetchall()

    # Get column names
    cursor.execute(f'PRAGMA table_info({table_name})')
    columns = [col[1] for col in cursor.fetchall()]

    # Clear current data
    cursor.execute(f'DELETE FROM {table_name}')

    # Restore historical data
    placeholders = ','.join(['?' for _ in columns])
    cursor.executemany(
        f'INSERT INTO {table_name} VALUES ({placeholders})',
        historical_data
    )

    conn.commit()
    print(f"Restored {table_name} to state at {target_timestamp}")
    print(f"Restored {len(historical_data)} rows")

    conn.close()

# Usage: rollback accidental deletion
rollback_table_to_timestamp('users', '2024-06-01 10:00:00')

Encryption at Rest¶

Protect sensitive data with transparent database encryption.

Enabling Encryption¶

import heliosdb_sqlite as sqlite3
import os

# Generate or retrieve encryption key (keep this secure!)
encryption_key = os.getenv('DB_ENCRYPTION_KEY')

# Create encrypted database
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()

# Use normally - encryption is transparent
cursor.execute('CREATE TABLE secrets (id INTEGER, data TEXT)')
cursor.execute('INSERT INTO secrets VALUES (1, "sensitive information")')
conn.commit()

# Data is encrypted on disk
conn.close()

# To read, must provide the same key
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()
cursor.execute('SELECT * FROM secrets')
print(cursor.fetchall())  # Decrypted automatically

Key Rotation¶

def rotate_encryption_key(db_path, old_key, new_key):
    """Safely rotate database encryption key"""
    import tempfile
    import shutil

    # Open with old key
    conn_old = sqlite3.connect(db_path, encryption_key=old_key)

    # Create temporary database with new key
    temp_db = tempfile.NamedTemporaryFile(delete=False, suffix='.db')
    conn_new = sqlite3.connect(temp_db.name, encryption_key=new_key)

    # Copy all data
    for line in conn_old.iterdump():
        conn_new.execute(line)
    conn_new.commit()

    conn_old.close()
    conn_new.close()

    # Replace old database
    shutil.move(temp_db.name, db_path)
    print("Encryption key rotated successfully")

# Usage
old_key = os.getenv('OLD_DB_KEY')
new_key = os.getenv('NEW_DB_KEY')
rotate_encryption_key('secure.db', old_key, new_key)

Encrypted Columns¶

# Encrypt specific columns while keeping others searchable
conn = sqlite3.connect('mixed.db', encryption_key=encryption_key)
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT,  -- Not encrypted, searchable
        email TEXT ENCRYPTED,  -- Encrypted column
        password_hash TEXT ENCRYPTED
    )
''')

# Insert with automatic column encryption
cursor.execute('''
    INSERT INTO users (username, email, password_hash)
    VALUES (?, ?, ?)
''', ('alice', 'alice@example.com', 'hashed_password'))

# Search on non-encrypted columns (fast)
cursor.execute('SELECT * FROM users WHERE username = ?', ('alice',))

# Encrypted columns are decrypted automatically on read
user = cursor.fetchone()
print(f"Email: {user[2]}")  # Decrypted

Server Mode Switching¶

Run HeliosDB-Lite as a PostgreSQL-compatible server for multi-client access.

Starting Server Mode¶

import heliosdb_sqlite as sqlite3

# Start database in server mode
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=100
)

print(f"HeliosDB-Lite server listening on port {server.port}")
print(f"Connect using: postgresql://localhost:5432/app")

# Server runs in background, application continues
# Clients can connect using standard PostgreSQL drivers

# Stop server when done
# server.stop()

Connecting as Client¶

# From another Python script or machine
import psycopg2  # Standard PostgreSQL driver

# Connect to HeliosDB-Lite server
conn = psycopg2.connect(
    host='localhost',
    port=5432,
    database='app',
    user='helios',
    password=''
)

cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
users = cursor.fetchall()
print(users)

conn.close()

Server Configuration¶

# Advanced server configuration
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200,
    connection_timeout=30,
    enable_ssl=True,
    ssl_cert='server.crt',
    ssl_key='server.key',
    auth_method='password',  # or 'certificate'
    allowed_ips=['192.168.1.0/24', '10.0.0.0/8']
)

# Monitor server status
status = server.get_status()
print(f"Active connections: {status.active_connections}")
print(f"Total queries: {status.total_queries}")
print(f"Uptime: {status.uptime_seconds}s")

Hybrid Mode Usage¶

Combine embedded and server modes for flexible architecture.

Hybrid Mode Setup¶

import heliosdb_sqlite as sqlite3

# Start in hybrid mode
conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

# Use as embedded database (fast, local)
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM users')
local_count = cursor.fetchone()[0]
print(f"Local access: {local_count} users")

# Also accessible via network (for other clients)
# Server automatically started in background
server_info = conn.get_server_info()
print(f"Also available at: {server_info.url}")

# Other applications can connect remotely
# while this application uses local access

Use Cases for Hybrid Mode¶

# Use Case 1: Web application with admin tools
# Web app uses local embedded access for speed
import heliosdb_sqlite as sqlite3
from flask import Flask

app = Flask(__name__)
db_conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

@app.route('/api/users')
def get_users():
    # Fast local access
    cursor = db_conn.cursor()
    cursor.execute('SELECT * FROM users')
    return jsonify(cursor.fetchall())

# Meanwhile, admin can connect remotely for management:
# $ psql -h localhost -p 5432 -U helios app

# Use Case 2: Local ML training with remote monitoring
import heliosdb_sqlite as sqlite3

# ML training script uses local access
conn = sqlite3.connect('training.db', heliosdb_mode='hybrid')

def train_model():
    cursor = conn.cursor()
    for epoch in range(100):
        # Fast local writes
        cursor.execute('INSERT INTO training_log VALUES (?, ?, ?)',
                      (epoch, loss, accuracy))
        conn.commit()

# Monitoring dashboard connects remotely
# to visualize training progress in real-time

Performance Tuning for HeliosDB-Lite¶

Optimize HeliosDB-Lite for your specific workload.

Connection Pooling¶

from heliosdb_sqlite import connection_pool

# Create connection pool
pool = connection_pool.create(
    database='app.db',
    pool_size=20,
    max_overflow=10,
    timeout=30,
    recycle=3600  # Recycle connections after 1 hour
)

# Use connections from pool
with pool.connection() as conn:
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM users')
    users = cursor.fetchall()

# Connection automatically returned to pool

Query Optimization¶

conn = sqlite3.connect('app.db')

# Enable query plan analysis
conn.enable_query_analyzer()

# Run your queries
cursor = conn.cursor()
cursor.execute('SELECT * FROM users WHERE created_at > ?', ('2024-01-01',))

# Get optimization suggestions
suggestions = conn.get_query_suggestions()
for suggestion in suggestions:
    print(f"Query: {suggestion.query}")
    print(f"Current plan: {suggestion.current_plan}")
    print(f"Suggested index: {suggestion.suggested_index}")
    print(f"Expected speedup: {suggestion.speedup_factor}x")

Batch Operations¶

# Efficient batch inserts
conn = sqlite3.connect('app.db')
cursor = conn.cursor()

# Use batch_insert for better performance
users = [
    ('user1', 'email1@example.com'),
    ('user2', 'email2@example.com'),
    # ... thousands of users
]

# Traditional way (slower)
# for user in users:
#     cursor.execute('INSERT INTO users VALUES (?, ?)', user)

# Optimized way (faster)
cursor.batch_insert('users', ['name', 'email'], users)
conn.commit()

print(f"Inserted {len(users)} users efficiently")

Caching Strategies¶

# Enable query result caching
conn = sqlite3.connect('app.db', enable_cache=True, cache_size_mb=100)

# First query: hits database
cursor = conn.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products1 = cursor.fetchall()  # Slow (first time)

# Same query: served from cache
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products2 = cursor.fetchall()  # Fast (cached)

# Cache statistics
stats = conn.get_cache_stats()
print(f"Cache hit rate: {stats.hit_rate:.2%}")
print(f"Queries served from cache: {stats.cache_hits}")

Gradual Migration to Full HeliosDB-Lite Features¶

Incrementally adopt advanced features as your application grows.

Phase 1: Drop-In Replacement¶

# Start simple: just replace import
import heliosdb_sqlite as sqlite3

# Everything else unchanged
conn = sqlite3.connect('app.db')
# ... existing code ...

Phase 2: Enable Better Concurrency¶

# No code changes needed!
# HeliosDB-Lite automatically handles concurrent writes better
# Your existing multi-threaded/async code just works better

Phase 3: Add Vector Search¶

# Add vector search to one feature
conn = sqlite3.connect('app.db', enable_vector_search=True)

# Add vector column to existing table
cursor = conn.cursor()
cursor.execute('ALTER TABLE articles ADD COLUMN embedding VECTOR(384)')

# Populate embeddings for existing data
cursor.execute('SELECT id, content FROM articles')
for article_id, content in cursor.fetchall():
    embedding = generate_embedding(content)
    cursor.execute('UPDATE articles SET embedding = ? WHERE id = ?',
                  (embedding, article_id))
conn.commit()

# Now you have semantic search!

Phase 4: Enable Time-Travel for Audit¶

# Enable time-travel on sensitive tables
conn = sqlite3.connect('app.db', enable_time_travel=True)

# Add system versioning to audit-critical tables
cursor.execute('''
    ALTER TABLE financial_transactions
    ADD SYSTEM VERSIONING
''')

# Now you have automatic audit trail

Phase 5: Create Development Branches¶

# Use branching for safe testing
conn = sqlite3.connect('app.db')

# Before risky operation, create branch
test_branch = conn.create_branch('test_migration')
test_conn = sqlite3.connect('app.db', branch='test_migration')

# Test your changes
test_conn.execute('ALTER TABLE users ADD COLUMN new_field TEXT')

# If successful, merge; if not, discard
if tests_pass():
    conn.merge_branch('test_migration')
else:
    conn.delete_branch('test_migration')

Phase 6: Scale with Server Mode¶

# When you outgrow single-machine setup
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200
)

# Application can still use local access for speed
# But now supports remote clients too

Summary¶

HeliosDB-Lite advanced features are opt-in and gradual:

Start with drop-in SQLite compatibility
Add features as needed:
Vector search for AI/ML applications
Time-travel for audit and compliance
Branching for safe experimentation
Encryption for security
Server mode for scale
Hybrid mode for flexibility
No need to adopt everything at once
Each feature enhances, doesn't replace, basic functionality
All features work together seamlessly

Remember: You can stay on basic SQLite compatibility forever if that meets your needs. Advanced features are there when you're ready.

For more information: - Drop-in guide: HELIOSDB_SQLITE_DROP_IN_GUIDE.md - Migration patterns: HELIOSDB_SQLITE_MIGRATION_PATTERNS.md - FAQ: HELIOSDB_SQLITE_FAQ.md