DEEP DIVE

LLM Processing

How Aegis Photo Voyager transforms your gallery into an intelligent, searchable experience using local AI.

Aegis Photo Voyager leverages state-of-the-art Large Language Models (LLMs) and Vector Embeddings to transform your local photo library into an intelligent, searchable, and immersive experience. This processing happens entirely under your control, primarily using local or self-hosted models to ensure maximum privacy.

1. Vision-Based Metadata Extraction (The "Brain")

This is the most comprehensive layer of processing. When you run "AI Analysis" on your photos, the application uses a Vision LLM (typically granite3.2-vision:2b) to "look" at your photos and videos and extract rich metadata.

How it Works

  • Model: Integrated via Ollama or LM Studio.
  • Processing: For photos, the image is encoded and sent to the LLM. For videos, the application extracts key frames to analyze the content.
  • Structured Output: The LLM is prompted to return a precise JSON object, which is then parsed into the application's database.
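The three steps above can be sketched as follows. The endpoint is Ollama's standard `/api/generate` API; the prompt, JSON keys, and helper names are illustrative, not the application's actual code:

```python
import base64
import json
from urllib.request import Request, urlopen

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

PROMPT = (
    "Describe this photo. Respond ONLY with a JSON object with the keys: "
    "summary, description, tags, quality."
)

def build_request(image_bytes: bytes, model: str = "granite3.2-vision:2b") -> dict:
    """Build an Ollama payload carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": PROMPT,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",   # ask Ollama to constrain the reply to valid JSON
        "stream": False,
    }

def analyze_photo(path: str) -> dict:
    """Send one image to the local vision LLM and parse the structured reply."""
    with open(path, "rb") as f:
        payload = build_request(f.read())
    req = Request(OLLAMA_URL, data=json.dumps(payload).encode("utf-8"),
                  headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return json.loads(body["response"])  # the metadata dict written to the database
```

For video, the same call would be made per extracted key frame rather than per file.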

What is Extracted?

  • Executive Summary: A concise one-sentence description of the photo.
  • Narrative Description: A detailed multi-sentence description of the scene, lighting, and composition.
  • Tags & Objects: Automated labeling of items (e.g., "mountain", "bicycle").
  • Technical Quality: AI-driven evaluation of sharpness, graininess, and focus.
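Assembled together, a parsed record covering the four fields above might look like this (field names and values are purely illustrative, not the application's actual schema):

```json
{
  "summary": "A cyclist pauses on a mountain pass at sunset.",
  "description": "Warm golden-hour light falls across a winding road; the rider is framed against distant peaks with shallow depth of field.",
  "tags": ["mountain", "bicycle", "sunset", "road"],
  "quality": {"sharpness": "high", "grain": "low", "focus": "subject in focus"}
}
```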

How to Access: You can find these results in the Sidebar when viewing a photo in Detail View. You can also use the Filter Pills in the main gallery sidebar to narrow down your library by mood, category, or detected objects.

2. Visual Similarity & Clustering (The "Eye")

Beyond understanding text, Aegis Photo Voyager understands visual relationships. It uses CLIP-style models to create "mathematical signatures" (embeddings) for your photos.

How it Works

  • Technology: Local ONNX models (CLIP-ViT-B-32), run directly on your CPU/GPU.
  • Dimensionality Reduction: Since these signatures are high-dimensional (512 values for CLIP-ViT-B-32), the application uses PCA (Principal Component Analysis) to flatten them into 2D or 3D coordinates.

Why it's Useful

  • Similar Photo Search: Right-click any photo and select "Search Similar" to find photos with a similar visual aesthetic or composition.
  • The 2D/3D Map: View your entire library as a "star map" where visually similar photos are physically clustered together.
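Both features above reduce to simple linear algebra over the embeddings, and can be sketched in a few lines of NumPy. The random vectors here stand in for real CLIP-ViT-B-32 outputs; this is an illustrative sketch, not the application's implementation:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two embedding vectors (1.0 = identical direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pca_project(embeddings: np.ndarray, dims: int = 2) -> np.ndarray:
    """Project high-dimensional embeddings down to `dims` map coordinates via PCA."""
    centered = embeddings - embeddings.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dims].T

# Toy stand-ins for a library of 512-d CLIP embeddings:
rng = np.random.default_rng(0)
library = rng.normal(size=(100, 512))
coords = pca_project(library, dims=2)   # (100, 2) "star map" positions
```

"Search Similar" then amounts to ranking the library by `cosine_similarity` against the selected photo's embedding, while the 2D/3D map plots the `coords`.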

How to Access: Switch to the 2D Cluster View or 3D Cluster View from the Views menu at the top of the application.

3. Immersive Photo Journeys

The "Photo Journey" feature uses the combined power of all the AI layers to create an automated, themed slideshow experience.

Available Journeys

  • Person Through Time: Uses face recognition and AI-extracted dates to follow someone's life over time.
  • Activity Journey: Groups photos by AI-detected actions.
  • Semantic Journey: Dynamically builds a slideshow based on a natural language prompt.
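A Semantic Journey essentially ranks photo embeddings against the prompt's text embedding in the shared CLIP space. A minimal sketch with toy vectors (the function name and scoring are illustrative, not the application's code):

```python
import numpy as np

def semantic_journey(query_vec: np.ndarray, photo_vecs: np.ndarray,
                     top_k: int = 10) -> np.ndarray:
    """Order photos by cosine similarity to a text prompt's embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    p = photo_vecs / np.linalg.norm(photo_vecs, axis=1, keepdims=True)
    scores = p @ q                           # cosine similarity in the shared space
    return np.argsort(scores)[::-1][:top_k]  # best-matching photo indices first

# Toy demo: a 2-d "embedding space" where photo 0 matches the prompt exactly.
order = semantic_journey(np.array([1.0, 0.0]),
                         np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]),
                         top_k=3)
```

The resulting index order becomes the slide order of the generated show.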

How to Access: Click the Slideshow button in the top menu and select "Photo Journey...".

Summary of Benefits

Zero Manual Tagging

The AI handles the "grunt work" of organization, automatically extracting tags, objects, and descriptions.

Privacy First

All processing can run locally; your photos never need to leave your machine.

Discovery

Find hidden gems in your library that you forgot existed using visual and semantic relationships.