HeliosDB-Lite SQLite Advanced Features Guide¶
Once you're comfortable with basic drop-in SQLite compatibility, HeliosDB-Lite offers powerful advanced features that go far beyond traditional SQLite capabilities.
Table of Contents¶
- Vector Search from Python
- Database Branching
- Time-Travel Queries
- Encryption at Rest
- Server Mode Switching
- Hybrid Mode Usage
- Performance Tuning
- Gradual Migration to Full Features
Vector Search from Python¶
HeliosDB-Lite includes native vector search capabilities for semantic similarity, embeddings, and AI applications.
Basic Vector Search Setup¶
import heliosdb_sqlite as sqlite3
import numpy as np
# Connect with vector search enabled
conn = sqlite3.connect('vectors.db', enable_vector_search=True)
cursor = conn.cursor()
# Create table with vector column
cursor.execute('''
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
title TEXT,
content TEXT,
embedding VECTOR(384) -- 384-dimensional vector
)
''')
# Insert document with embedding
doc_embedding = np.random.rand(384).tolist()
cursor.execute('''
INSERT INTO documents (title, content, embedding)
VALUES (?, ?, ?)
''', ('Introduction to AI', 'Machine learning basics...', doc_embedding))
conn.commit()
Semantic Search Queries¶
# Generate query embedding
query_text = "artificial intelligence fundamentals"
query_embedding = generate_embedding(query_text) # Your embedding model
# Find similar documents using cosine similarity
cursor.execute('''
SELECT id, title, content,
vector_distance(embedding, ?, 'cosine') as similarity
FROM documents
ORDER BY similarity ASC
LIMIT 10
''', (query_embedding,))
results = cursor.fetchall()
for doc_id, title, content, similarity in results:
print(f"{title}: {similarity:.4f}")
Advanced Vector Operations¶
# Different distance metrics
cursor.execute('''
SELECT title,
vector_distance(embedding, ?, 'cosine') as cosine_dist,
vector_distance(embedding, ?, 'euclidean') as euclidean_dist,
vector_distance(embedding, ?, 'manhattan') as manhattan_dist
FROM documents
''', (query_embedding, query_embedding, query_embedding))
# Vector arithmetic (e.g., "king - man + woman = queen")
cursor.execute('''
SELECT title,
vector_distance(
embedding,
vector_add(
vector_subtract(?, ?),
?
),
'cosine'
) as similarity
FROM documents
ORDER BY similarity ASC
LIMIT 5
''', (king_embedding, man_embedding, woman_embedding))
# Batch vector search
query_embeddings = [emb1, emb2, emb3, ...]
cursor.executemany('''
SELECT id, title FROM documents
WHERE vector_distance(embedding, ?, 'cosine') < 0.3
''', [(emb,) for emb in query_embeddings])
Real-World Example: Semantic Document Search¶
import heliosdb_sqlite as sqlite3
from sentence_transformers import SentenceTransformer
# Initialize embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Connect to database
conn = sqlite3.connect('knowledge_base.db', enable_vector_search=True)
cursor = conn.cursor()
# Create schema
cursor.execute('''
CREATE TABLE IF NOT EXISTS articles (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
content TEXT NOT NULL,
category TEXT,
embedding VECTOR(384),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
def add_article(title, content, category):
"""Add article with automatic embedding generation"""
embedding = model.encode(content).tolist()
cursor.execute('''
INSERT INTO articles (title, content, category, embedding)
VALUES (?, ?, ?, ?)
''', (title, content, category, embedding))
conn.commit()
return cursor.lastrowid
def semantic_search(query, limit=10, category=None):
"""Semantic search with optional category filter"""
query_embedding = model.encode(query).tolist()
if category:
cursor.execute('''
SELECT id, title, content, category,
vector_distance(embedding, ?, 'cosine') as similarity
FROM articles
WHERE category = ?
ORDER BY similarity ASC
LIMIT ?
''', (query_embedding, category, limit))
else:
cursor.execute('''
SELECT id, title, content, category,
vector_distance(embedding, ?, 'cosine') as similarity
FROM articles
ORDER BY similarity ASC
LIMIT ?
''', (query_embedding, limit))
return cursor.fetchall()
# Usage
add_article(
'Introduction to Neural Networks',
'Neural networks are computing systems inspired by biological neural networks...',
'AI'
)
results = semantic_search('deep learning basics', limit=5)
for article_id, title, content, category, similarity in results:
print(f"[{category}] {title} (similarity: {similarity:.4f})")
Database Branching¶
Create experimental branches of your database for testing without affecting production data.
Creating and Using Branches¶
import heliosdb_sqlite as sqlite3
conn = sqlite3.connect('production.db')
# Create a branch for experimentation
branch = conn.create_branch('experiment_2024')
print(f"Created branch: {branch.name}")
# Work on the branch
branch_conn = sqlite3.connect('production.db', branch='experiment_2024')
cursor = branch_conn.cursor()
# Make changes on the branch (doesn't affect main database)
cursor.execute('DELETE FROM users WHERE inactive = 1')
cursor.execute('UPDATE products SET price = price * 1.1')
branch_conn.commit()
# Test the changes
cursor.execute('SELECT COUNT(*) FROM users')
print(f"Users after cleanup: {cursor.fetchone()[0]}")
# If satisfied, merge back to main
if results_look_good():
conn.merge_branch('experiment_2024')
print("Branch merged successfully!")
else:
conn.delete_branch('experiment_2024')
print("Branch discarded")
branch_conn.close()
conn.close()
Use Cases for Branching¶
# Use Case 1: A/B Testing
def ab_test_query_performance():
conn = sqlite3.connect('analytics.db')
# Create branch A: with new index
branch_a = conn.create_branch('with_index')
conn_a = sqlite3.connect('analytics.db', branch='with_index')
conn_a.execute('CREATE INDEX idx_user_date ON events(user_id, date)')
# Create branch B: without index
branch_b = conn.create_branch('without_index')
conn_b = sqlite3.connect('analytics.db', branch='without_index')
# Test query performance on both
import time
start = time.time()
conn_a.execute('SELECT * FROM events WHERE user_id = ? AND date > ?', (123, '2024-01-01'))
time_a = time.time() - start
start = time.time()
conn_b.execute('SELECT * FROM events WHERE user_id = ? AND date > ?', (123, '2024-01-01'))
time_b = time.time() - start
print(f"With index: {time_a:.4f}s")
print(f"Without index: {time_b:.4f}s")
# Merge the faster version
if time_a < time_b:
conn.merge_branch('with_index')
print("Merged optimized version")
conn_a.close()
conn_b.close()
conn.close()
# Use Case 2: Safe Schema Migrations
def safe_migration():
conn = sqlite3.connect('app.db')
# Create migration branch
migration = conn.create_branch('schema_v2')
conn_migration = sqlite3.connect('app.db', branch='schema_v2')
try:
# Attempt migration
conn_migration.execute('ALTER TABLE users ADD COLUMN preferences JSON')
conn_migration.execute('UPDATE users SET preferences = "{}"')
conn_migration.commit()
# Run tests on migrated schema
run_integration_tests(conn_migration)
# If tests pass, merge
conn.merge_branch('schema_v2')
print("Migration successful!")
except Exception as e:
print(f"Migration failed: {e}")
conn.delete_branch('schema_v2')
print("Rolled back to previous schema")
conn_migration.close()
conn.close()
Listing and Managing Branches¶
conn = sqlite3.connect('app.db')
# List all branches
branches = conn.list_branches()
for branch in branches:
print(f"Branch: {branch.name}, Created: {branch.created_at}, Size: {branch.size_mb}MB")
# Get branch information
branch_info = conn.get_branch_info('experiment_2024')
print(f"Commits: {branch_info.commits}")
print(f"Diverged from main: {branch_info.divergence_point}")
# Compare branches
diff = conn.compare_branches('main', 'experiment_2024')
print(f"Tables changed: {diff.tables_modified}")
print(f"Rows changed: {diff.total_row_changes}")
# Clean up old branches
for branch in branches:
if branch.age_days > 30 and not branch.is_active:
conn.delete_branch(branch.name)
print(f"Deleted old branch: {branch.name}")
Time-Travel Queries¶
Query historical data states without maintaining separate backup databases.
Basic Time-Travel Queries¶
import heliosdb_sqlite as sqlite3
from datetime import datetime, timedelta
conn = sqlite3.connect('app.db', enable_time_travel=True)
cursor = conn.cursor()
# Query current state
cursor.execute('SELECT * FROM users WHERE id = 123')
current_user = cursor.fetchone()
print(f"Current: {current_user}")
# Query state from 1 hour ago
one_hour_ago = datetime.now() - timedelta(hours=1)
cursor.execute(f'''
SELECT * FROM users
WHERE id = 123
AS OF TIMESTAMP '{one_hour_ago.isoformat()}'
''')
past_user = cursor.fetchone()
print(f"1 hour ago: {past_user}")
# Query state from specific date
cursor.execute('''
SELECT * FROM products
AS OF TIMESTAMP '2024-01-01 00:00:00'
''')
products_jan1 = cursor.fetchall()
print(f"Products on Jan 1: {len(products_jan1)}")
Tracking Changes Over Time¶
# Find when a record changed
cursor.execute('''
SELECT * FROM users
WHERE id = 123
FOR SYSTEM_TIME BETWEEN
TIMESTAMP '2024-01-01 00:00:00'
AND TIMESTAMP '2024-01-31 23:59:59'
''')
history = cursor.fetchall()
print(f"User had {len(history)} changes in January")
for timestamp, user_data in history:
print(f"{timestamp}: {user_data}")
Audit and Compliance¶
def audit_user_changes(user_id, start_date, end_date):
"""Generate audit report of all changes to a user"""
conn = sqlite3.connect('secure.db', enable_time_travel=True)
cursor = conn.cursor()
cursor.execute('''
SELECT system_time, id, name, email, role, modified_by
FROM users
WHERE id = ?
FOR SYSTEM_TIME BETWEEN ? AND ?
ORDER BY system_time
''', (user_id, start_date, end_date))
changes = cursor.fetchall()
print(f"Audit Report for User {user_id}")
print(f"Period: {start_date} to {end_date}")
print(f"Total changes: {len(changes)}\n")
for i, (timestamp, uid, name, email, role, modified_by) in enumerate(changes):
if i > 0:
print(f"\n--- Change {i} at {timestamp} by {modified_by} ---")
prev = changes[i-1]
if name != prev[2]:
print(f"Name: {prev[2]} → {name}")
if email != prev[3]:
print(f"Email: {prev[3]} → {email}")
if role != prev[4]:
print(f"Role: {prev[4]} → {role}")
else:
print(f"Initial state at {timestamp}: {name}, {email}, {role}")
conn.close()
# Usage
audit_user_changes(123, '2024-01-01', '2024-12-31')
Rollback to Previous State¶
def rollback_table_to_timestamp(table_name, target_timestamp):
"""Restore table to a previous state"""
conn = sqlite3.connect('app.db', enable_time_travel=True)
cursor = conn.cursor()
# Get historical data
cursor.execute(f'''
SELECT * FROM {table_name}
AS OF TIMESTAMP '{target_timestamp}'
''')
historical_data = cursor.fetchall()
# Get column names
cursor.execute(f'PRAGMA table_info({table_name})')
columns = [col[1] for col in cursor.fetchall()]
# Clear current data
cursor.execute(f'DELETE FROM {table_name}')
# Restore historical data
placeholders = ','.join(['?' for _ in columns])
cursor.executemany(
f'INSERT INTO {table_name} VALUES ({placeholders})',
historical_data
)
conn.commit()
print(f"Restored {table_name} to state at {target_timestamp}")
print(f"Restored {len(historical_data)} rows")
conn.close()
# Usage: rollback accidental deletion
rollback_table_to_timestamp('users', '2024-06-01 10:00:00')
Encryption at Rest¶
Protect sensitive data with transparent database encryption.
Enabling Encryption¶
import heliosdb_sqlite as sqlite3
import os
# Generate or retrieve encryption key (keep this secure!)
encryption_key = os.getenv('DB_ENCRYPTION_KEY')
# Create encrypted database
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()
# Use normally - encryption is transparent
cursor.execute('CREATE TABLE secrets (id INTEGER, data TEXT)')
cursor.execute('INSERT INTO secrets VALUES (1, "sensitive information")')
conn.commit()
# Data is encrypted on disk
conn.close()
# To read, must provide the same key
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()
cursor.execute('SELECT * FROM secrets')
print(cursor.fetchall()) # Decrypted automatically
Key Rotation¶
def rotate_encryption_key(db_path, old_key, new_key):
"""Safely rotate database encryption key"""
import tempfile
import shutil
# Open with old key
conn_old = sqlite3.connect(db_path, encryption_key=old_key)
# Create temporary database with new key
temp_db = tempfile.NamedTemporaryFile(delete=False, suffix='.db')
conn_new = sqlite3.connect(temp_db.name, encryption_key=new_key)
# Copy all data
for line in conn_old.iterdump():
conn_new.execute(line)
conn_new.commit()
conn_old.close()
conn_new.close()
# Replace old database
shutil.move(temp_db.name, db_path)
print("Encryption key rotated successfully")
# Usage
old_key = os.getenv('OLD_DB_KEY')
new_key = os.getenv('NEW_DB_KEY')
rotate_encryption_key('secure.db', old_key, new_key)
Encrypted Columns¶
# Encrypt specific columns while keeping others searchable
conn = sqlite3.connect('mixed.db', encryption_key=encryption_key)
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE users (
id INTEGER PRIMARY KEY,
username TEXT, -- Not encrypted, searchable
email TEXT ENCRYPTED, -- Encrypted column
password_hash TEXT ENCRYPTED
)
''')
# Insert with automatic column encryption
cursor.execute('''
INSERT INTO users (username, email, password_hash)
VALUES (?, ?, ?)
''', ('alice', 'alice@example.com', 'hashed_password'))
# Search on non-encrypted columns (fast)
cursor.execute('SELECT * FROM users WHERE username = ?', ('alice',))
# Encrypted columns are decrypted automatically on read
user = cursor.fetchone()
print(f"Email: {user[2]}") # Decrypted
Server Mode Switching¶
Run HeliosDB-Lite as a PostgreSQL-compatible server for multi-client access.
Starting Server Mode¶
import heliosdb_sqlite as sqlite3
# Start database in server mode
server = sqlite3.start_server(
database='app.db',
host='0.0.0.0',
port=5432,
max_connections=100
)
print(f"HeliosDB-Lite server listening on port {server.port}")
print(f"Connect using: postgresql://localhost:5432/app")
# Server runs in background, application continues
# Clients can connect using standard PostgreSQL drivers
# Stop server when done
# server.stop()
Connecting as Client¶
# From another Python script or machine
import psycopg2 # Standard PostgreSQL driver
# Connect to HeliosDB-Lite server
conn = psycopg2.connect(
host='localhost',
port=5432,
database='app',
user='helios',
password=''
)
cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
users = cursor.fetchall()
print(users)
conn.close()
Server Configuration¶
# Advanced server configuration
server = sqlite3.start_server(
database='app.db',
host='0.0.0.0',
port=5432,
max_connections=200,
connection_timeout=30,
enable_ssl=True,
ssl_cert='server.crt',
ssl_key='server.key',
auth_method='password', # or 'certificate'
allowed_ips=['192.168.1.0/24', '10.0.0.0/8']
)
# Monitor server status
status = server.get_status()
print(f"Active connections: {status.active_connections}")
print(f"Total queries: {status.total_queries}")
print(f"Uptime: {status.uptime_seconds}s")
Hybrid Mode Usage¶
Combine embedded and server modes for flexible architecture.
Hybrid Mode Setup¶
import heliosdb_sqlite as sqlite3
# Start in hybrid mode
conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')
# Use as embedded database (fast, local)
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM users')
local_count = cursor.fetchone()[0]
print(f"Local access: {local_count} users")
# Also accessible via network (for other clients)
# Server automatically started in background
server_info = conn.get_server_info()
print(f"Also available at: {server_info.url}")
# Other applications can connect remotely
# while this application uses local access
Use Cases for Hybrid Mode¶
# Use Case 1: Web application with admin tools
# Web app uses local embedded access for speed
import heliosdb_sqlite as sqlite3
from flask import Flask
app = Flask(__name__)
db_conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')
@app.route('/api/users')
def get_users():
# Fast local access
cursor = db_conn.cursor()
cursor.execute('SELECT * FROM users')
return jsonify(cursor.fetchall())
# Meanwhile, admin can connect remotely for management:
# $ psql -h localhost -p 5432 -U helios app
# Use Case 2: Local ML training with remote monitoring
import heliosdb_sqlite as sqlite3
# ML training script uses local access
conn = sqlite3.connect('training.db', heliosdb_mode='hybrid')
def train_model():
cursor = conn.cursor()
for epoch in range(100):
# Fast local writes
cursor.execute('INSERT INTO training_log VALUES (?, ?, ?)',
(epoch, loss, accuracy))
conn.commit()
# Monitoring dashboard connects remotely
# to visualize training progress in real-time
Performance Tuning for HeliosDB-Lite¶
Optimize HeliosDB-Lite for your specific workload.
Connection Pooling¶
from heliosdb_sqlite import connection_pool
# Create connection pool
pool = connection_pool.create(
database='app.db',
pool_size=20,
max_overflow=10,
timeout=30,
recycle=3600 # Recycle connections after 1 hour
)
# Use connections from pool
with pool.connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
users = cursor.fetchall()
# Connection automatically returned to pool
Query Optimization¶
conn = sqlite3.connect('app.db')
# Enable query plan analysis
conn.enable_query_analyzer()
# Run your queries
cursor = conn.cursor()
cursor.execute('SELECT * FROM users WHERE created_at > ?', ('2024-01-01',))
# Get optimization suggestions
suggestions = conn.get_query_suggestions()
for suggestion in suggestions:
print(f"Query: {suggestion.query}")
print(f"Current plan: {suggestion.current_plan}")
print(f"Suggested index: {suggestion.suggested_index}")
print(f"Expected speedup: {suggestion.speedup_factor}x")
Batch Operations¶
# Efficient batch inserts
conn = sqlite3.connect('app.db')
cursor = conn.cursor()
# Use batch_insert for better performance
users = [
('user1', 'email1@example.com'),
('user2', 'email2@example.com'),
# ... thousands of users
]
# Traditional way (slower)
# for user in users:
# cursor.execute('INSERT INTO users VALUES (?, ?)', user)
# Optimized way (faster)
cursor.batch_insert('users', ['name', 'email'], users)
conn.commit()
print(f"Inserted {len(users)} users efficiently")
Caching Strategies¶
# Enable query result caching
conn = sqlite3.connect('app.db', enable_cache=True, cache_size_mb=100)
# First query: hits database
cursor = conn.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products1 = cursor.fetchall() # Slow (first time)
# Same query: served from cache
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products2 = cursor.fetchall() # Fast (cached)
# Cache statistics
stats = conn.get_cache_stats()
print(f"Cache hit rate: {stats.hit_rate:.2%}")
print(f"Queries served from cache: {stats.cache_hits}")
Gradual Migration to Full HeliosDB-Lite Features¶
Incrementally adopt advanced features as your application grows.
Phase 1: Drop-In Replacement¶
# Start simple: just replace import
import heliosdb_sqlite as sqlite3
# Everything else unchanged
conn = sqlite3.connect('app.db')
# ... existing code ...
Phase 2: Enable Better Concurrency¶
# No code changes needed!
# HeliosDB-Lite automatically handles concurrent writes better
# Your existing multi-threaded/async code just works better
Phase 3: Add Vector Search¶
# Add vector search to one feature
conn = sqlite3.connect('app.db', enable_vector_search=True)
# Add vector column to existing table
cursor = conn.cursor()
cursor.execute('ALTER TABLE articles ADD COLUMN embedding VECTOR(384)')
# Populate embeddings for existing data
cursor.execute('SELECT id, content FROM articles')
for article_id, content in cursor.fetchall():
embedding = generate_embedding(content)
cursor.execute('UPDATE articles SET embedding = ? WHERE id = ?',
(embedding, article_id))
conn.commit()
# Now you have semantic search!
Phase 4: Enable Time-Travel for Audit¶
# Enable time-travel on sensitive tables
conn = sqlite3.connect('app.db', enable_time_travel=True)
# Add system versioning to audit-critical tables
cursor.execute('''
ALTER TABLE financial_transactions
ADD SYSTEM VERSIONING
''')
# Now you have automatic audit trail
Phase 5: Create Development Branches¶
# Use branching for safe testing
conn = sqlite3.connect('app.db')
# Before risky operation, create branch
test_branch = conn.create_branch('test_migration')
test_conn = sqlite3.connect('app.db', branch='test_migration')
# Test your changes
test_conn.execute('ALTER TABLE users ADD COLUMN new_field TEXT')
# If successful, merge; if not, discard
if tests_pass():
conn.merge_branch('test_migration')
else:
conn.delete_branch('test_migration')
Phase 6: Scale with Server Mode¶
# When you outgrow single-machine setup
server = sqlite3.start_server(
database='app.db',
host='0.0.0.0',
port=5432,
max_connections=200
)
# Application can still use local access for speed
# But now supports remote clients too
Summary¶
HeliosDB-Lite advanced features are opt-in and gradual:
- Start with drop-in SQLite compatibility
- Add features as needed:
- Vector search for AI/ML applications
- Time-travel for audit and compliance
- Branching for safe experimentation
- Encryption for security
- Server mode for scale
-
Hybrid mode for flexibility
-
No need to adopt everything at once
- Each feature enhances, doesn't replace, basic functionality
- All features work together seamlessly
Remember: You can stay on basic SQLite compatibility forever if that meets your needs. Advanced features are there when you're ready.
For more information:
- Drop-in guide: HELIOSDB_SQLITE_DROP_IN_GUIDE.md
- Migration patterns: HELIOSDB_SQLITE_MIGRATION_PATTERNS.md
- FAQ: HELIOSDB_SQLITE_FAQ.md