// DIGITAL FORENSICS PLATFORM v1.0

INVESTIGATE. ANALYZE. UNCOVER.

A local web app that gathers digital artifacts from your machine — browser history, USB events, execution traces — and lets you query them in plain English.

~ python app.py
# starting forensic assistant...
──────────────────────────────────
OS     Linux (Ubuntu 22.04)
Admin  Yes (root)
DB     forensic_data.db
──────────────────────────────────
browser_history  1,204 rows
usb_devices      17 rows
system_logs      892 rows
recent_files     338 rows
──────────────────────────────────
server → http://127.0.0.1:5000
$
8
artifact types collected
2
platforms — windows & linux
NLP
natural language queries
100%
local — zero data sent out
WHAT IT DOES

Everything you need
to investigate.

Not a feature list written for investors. Just the things that actually make this useful in the field.

🧲
Multi-Source Collection

One click. Collectors read browser SQLite databases, the Windows registry, event logs, prefetch files, LNK recent-file shortcuts, and Linux USB sysfs entries, then write everything to a local indexed database. On Windows the collectors run concurrently.

CONCURRENT EXECUTION
🤖
AI-Powered Querying

"What USB devices were connected yesterday?" just works. The RAG engine parses intent, extracts date and type filters, and runs the SQL — no special syntax needed.

LOCAL LLM SUPPORT
📋
HTML Report Generator

Generates standalone, styled HTML reports — either just the filtered results from a query, or a full dump of everything collected. Self-contained, no dependencies to open.

DOWNLOAD READY
🖥️
Cross-Platform

OS detected at startup. Windows gets registry + event log collectors. Linux gets sysfs + journald collectors. The UI template even changes per OS automatically.

WINDOWS + LINUX
🔒
Fully Local & Private

SQLite on disk. Flask on localhost. Optional local LLM. Nothing phones home. Your artifact data doesn't leave the machine — ever. Zero telemetry by design.

OFFLINE CAPABLE
Threaded Performance

Windows collectors run concurrently in a ThreadPoolExecutor with 5 worker threads. WAL-mode SQLite lets reads proceed while a collector is writing. SQLite still serializes the writes themselves, so writers queue up instead of blocking readers.

THREADPOOLEXECUTOR
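A minimal sketch of how a 5-worker collection run could be structured. The collector functions and the `run_collectors` helper are hypothetical stand-ins, not the platform's actual code. Only `ThreadPoolExecutor` and the worker count come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


# Hypothetical placeholder collectors; each returns a list of row dicts.
def collect_browser_history():
    return [{"url": "https://example.com", "title": "Example"}]


def collect_event_logs():
    return []


def collect_prefetch():
    return []


def run_collectors(collectors, max_workers=5):
    """Run each collector in a worker thread; gather results and errors by name."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in collectors.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing collector should not abort the others.
                errors[name] = str(exc)
    return results, errors


if __name__ == "__main__":
    found, failed = run_collectors({
        "browser_history": collect_browser_history,
        "event_logs": collect_event_logs,
        "prefetch": collect_prefetch,
    })
    print(sorted(found), failed)
```

Recording errors per collector means a locked browser database or a missing privilege costs one artifact type, not the whole run.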
ARCHITECTURE

How it works.

Five stages from raw OS data to a downloadable report. Clean pipeline, nothing magic.

01

Launch & Detect

Flask starts on 127.0.0.1:5000. It reads sys.platform, checks admin privileges, and picks the right template and collector set for the OS.
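A rough sketch of the detection step, assuming the usual approach: `sys.platform` for the OS, `os.geteuid()` for root on Linux, and the Win32 `IsUserAnAdmin` call through `ctypes` on Windows. The function names are illustrative only.

```python
import ctypes
import os
import sys


def detect_platform(platform=sys.platform):
    """Map a sys.platform value to a collector family."""
    if platform.startswith("win"):
        return "windows"
    if platform.startswith("linux"):
        return "linux"
    return "unsupported"


def is_admin():
    """Best-effort privilege check: elevated on Windows, root on Linux."""
    if detect_platform() == "windows":
        try:
            return bool(ctypes.windll.shell32.IsUserAnAdmin())
        except (AttributeError, OSError):
            return False
    # os.geteuid only exists on POSIX systems.
    return hasattr(os, "geteuid") and os.geteuid() == 0


if __name__ == "__main__":
    print(detect_platform(), "admin" if is_admin() else "limited")
```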

02

Collect Artifacts

OS-specific collectors run — browser SQLite DBs, event logs, prefetch .pf files, LNK shortcuts, /sys/bus/usb, journald. Windows runs them concurrently.

03

Store in SQLite

Everything goes into a WAL-mode SQLite database across 8 indexed tables. Thread-safe writes. Fast reads. Stays on disk until you re-collect.
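One way the storage stage might look, as a sketch. The `usb_devices` table name appears in the startup banner, but its columns, the index name, and this `insert_many` signature are assumptions made for illustration.

```python
import sqlite3
import threading

# Serialize writers coming from collector threads; readers are not blocked in WAL mode.
_write_lock = threading.Lock()


def open_db(path):
    """Open (or create) the database in WAL mode with one example table."""
    conn = sqlite3.connect(path, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usb_devices ("
        "id INTEGER PRIMARY KEY, vendor TEXT, product TEXT, connected_at TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_usb_connected ON usb_devices(connected_at)"
    )
    return conn


def insert_many(conn, rows):
    """Insert a batch in a single transaction (hypothetical column layout)."""
    with _write_lock, conn:  # 'with conn' commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO usb_devices (vendor, product, connected_at) VALUES (?, ?, ?)",
            rows,
        )
```

WAL mode needs a database file on disk. An in-memory database reports `memory` as its journal mode instead.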

04

Query with NLP

HybridRAG parses natural language — extracts dates, artifact types, browser names — and translates to filtered SQL against the local DB.
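To make the idea concrete, here is a much-simplified sketch of regex-based parsing that turns a question into parameterized SQL. It is not the HybridRAG implementation. The keyword map and the `browser` and `timestamp` column names are assumptions; the table names are the ones shown in the startup banner.

```python
import re
from datetime import date, timedelta

# Hypothetical keyword -> table mapping (fixed allowlist, so safe to interpolate).
TABLE_KEYWORDS = {
    "usb": "usb_devices",
    "history": "browser_history",
    "log": "system_logs",
    "file": "recent_files",
}
BROWSERS = ("chrome", "firefox", "edge", "opera", "brave")


def parse_query(text, today=None):
    """Extract artifact table, browser name, and a single day from free text."""
    today = today or date.today()
    q = text.lower()
    table = next((t for k, t in TABLE_KEYWORDS.items() if k in q), None)
    browser = next((b for b in BROWSERS if re.search(rf"\b{b}\b", q)), None)
    day = None
    if "yesterday" in q:
        day = today - timedelta(days=1)
    elif "today" in q:
        day = today
    else:
        match = re.search(r"(\d+)\s+days?\s+ago", q)
        if match:
            day = today - timedelta(days=int(match.group(1)))
    return {"table": table, "browser": browser, "day": day}


def to_sql(parsed):
    """Build a parameterized SELECT; user-derived values go through placeholders."""
    if parsed["table"] is None:
        raise ValueError("could not determine artifact type")
    clauses, params = [], []
    if parsed["browser"]:
        clauses.append("browser = ?")
        params.append(parsed["browser"])
    if parsed["day"]:
        clauses.append("date(timestamp) = ?")
        params.append(parsed["day"].isoformat())
    sql = f"SELECT * FROM {parsed['table']}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

Everything derived from the question is bound as a parameter, and the table name only ever comes from the fixed map, so free text never reaches the SQL string directly.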

05

Report & Export

Results render in the UI with an optional LLM summary. One click exports a self-contained HTML file — full report or query-specific.
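A self-contained report can be as simple as one HTML string with inline CSS. The sketch below is an illustration under that assumption; `render_report` and its styling are hypothetical, not the shipped generator.

```python
import html
from datetime import datetime, timezone


def render_report(title, rows):
    """Return a single HTML document; rows is a list of dicts sharing the same keys."""
    columns = list(rows[0].keys()) if rows else []
    head = "".join(f"<th>{html.escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{html.escape(str(r.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for r in rows
    )
    generated = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{html.escape(title)}</title>"
        "<style>body{background:#111;color:#eee;font-family:monospace}"
        "td,th{border:1px solid #444;padding:4px 8px}</style></head><body>"
        f"<h1>{html.escape(title)}</h1>"
        f"<p>Generated {generated}, {len(rows)} rows</p>"
        f"<table><tr>{head}</tr>{body}</table></body></html>"
    )
```

Escaping every cell matters here: page titles and URLs pulled from browser history are untrusted input and should not be able to inject markup into the report.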

// DATA FLOW
🌐GET /
Flask route
↓ OS detected
🪟Windows collectors
🐧Linux collectors
↓ insert_many()
🗄️SQLite — 8 tables (WAL)
on disk
↓ POST /chat
🧠HybridRAG engine
NLP → SQL
↓ generate()
💬LLM summary
📄HTML report
DATA SOURCES

What gets collected.

Platform-aware. Each OS gets the collectors that actually work on it — no wasted runs.

🪟

Windows

SUPPORTED
Browser History
Chrome, Edge, Firefox, Opera & Brave SQLite DBs. URL, title, visit time, count — all profiles.
Windows Event Logs
Application, Security and System channels. Event ID, source, level, and message body.
Prefetch Files (.pf)
Executable names, run counts, and last execution timestamps from Windows prefetch.
Recent Files (LNK)
Shell link files from %APPDATA%\Microsoft\Windows\Recent to reconstruct recently accessed document paths.
USB Devices + History
Registry enumeration + System event log for connect/disconnect events with timestamps.
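As an illustration of the browser-history collector, the sketch below reads a Chromium-style `History` database. It assumes the standard Chromium `urls` table and its timestamp format (microseconds since 1601-01-01 UTC). The function names are hypothetical, and Firefox stores history in a different schema that is not covered here.

```python
import shutil
import sqlite3
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Chromium timestamps count microseconds from this epoch.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds):
    """Convert a Chromium/WebKit timestamp to an aware UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def read_chromium_history(history_path, limit=100):
    """Copy the History file first (a running browser may lock it), then query it."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / "History"
        shutil.copy2(history_path, copy)
        conn = sqlite3.connect(copy)
        try:
            rows = conn.execute(
                "SELECT url, title, visit_count, last_visit_time "
                "FROM urls ORDER BY last_visit_time DESC LIMIT ?",
                (limit,),
            ).fetchall()
        finally:
            conn.close()
    return [
        {
            "url": url,
            "title": title,
            "visit_count": count,
            "visited_at": webkit_to_datetime(ts),
        }
        for url, title, count, ts in rows
    ]
```

Working on a copy also keeps the collector from touching the original evidence file.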
🐧

Linux

SUPPORTED
Browser History
Firefox, Chrome and Brave SQLite history databases discovered across all user profiles.
USB Devices
/sys/bus/usb/devices and udev data. Vendor, product ID, speed, mount point, history.
System Logs
journald via journalctl and /var/log/syslog. Structured entries with timestamps and sources.
Recent Files
GTK recently-used.xbel and XDG recent document registries to reconstruct file access.
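For the USB collector, a sketch of walking `/sys/bus/usb/devices` might look like this. The attribute files (`idVendor`, `idProduct`, `manufacturer`, `product`, `serial`, `speed`) are standard sysfs entries. The function name and output layout are assumptions, and mount points and history would need other sources.

```python
from pathlib import Path


def _read(path):
    """Read a sysfs attribute; return None if it is missing or unreadable."""
    try:
        return path.read_text(errors="replace").strip()
    except OSError:
        return None


def list_usb_devices(root="/sys/bus/usb/devices"):
    """Return one dict per USB device directory found under root."""
    base = Path(root)
    devices = []
    if not base.is_dir():
        return devices
    for entry in sorted(base.iterdir()):
        vendor_id = _read(entry / "idVendor")
        if vendor_id is None:
            # Interface entries such as 1-1:1.0 carry no idVendor; skip them.
            continue
        devices.append({
            "bus_path": entry.name,
            "vendor_id": vendor_id,
            "product_id": _read(entry / "idProduct"),
            "manufacturer": _read(entry / "manufacturer"),
            "product": _read(entry / "product"),
            "serial": _read(entry / "serial"),
            "speed_mbps": _read(entry / "speed"),
        })
    return devices
```

Taking `root` as a parameter lets the same code run against a mounted disk image or a test fixture instead of the live system.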
TECHNOLOGY

Built with precision.

Nothing exotic. Chosen for reliability and zero deployment friction.

🐍
Python + Flask
Lightweight web framework serving the UI and REST API endpoints from localhost.
🗄️
SQLite (WAL)
Write-Ahead Logging for concurrent reads during parallel collection writes.
🧠
LLaMA 3.2 GGUF
Optional local model for natural language forensic summaries. Fully offline.
ThreadPoolExecutor
Concurrent Windows collector execution — 5 workers, no waiting around.
🔍
Hybrid RAG
Custom NLP parser. No vector DB — smart regex + SQL filters do the job.
📄
HTML Reports
Self-contained dark-themed reports. Open anywhere, no server needed.
🪟
Windows APIs
ctypes, win32evtlog, HKLM registry access for deep Windows artifact collection.
🐧
Linux Subsystems
sysfs, udev, journald, XDG spec. Reads the system the way the kernel sees it.
NATURAL LANGUAGE

Just ask it anything.

The query engine doesn't need special syntax. Type the way you'd ask a colleague — it figures out what you mean and returns the right data.

What USB devices were connected yesterday?
Show me Chrome history from 3 days ago
What programs were executed today?
Security event logs from last week
Recently opened files on 15 January
All external USB devices this month
Firefox history from yesterday
RESPONSE — USB QUERY
query usb_devices, usb_history
date 2025-01-14
found 17 artifacts

device SanDisk Ultra 3.0
type External Storage
connected 14 Jan 09:42:17
vendor SanDisk Corp.
serial 4C530001140115...

device Generic Mass Storage
type External Storage
connected 14 Jan 14:08:53
2 external USB storage devices connected on this date. First device active for ~4.5 hours. Second device — unrecognized vendor — connected in the afternoon. Worth a closer look.
GET THE SOFTWARE

Download & install.

Each package contains two things: the /models folder (LLaMA 3.2 GGUF) and the software installer for your platform. Unzip and run.

RECOMMENDED · COMPLETE PACKAGE
Full Bundle — Both Platforms

Everything in one archive. The software installers for both Windows and Linux, plus the /models folder pre-loaded with llama-3.2-3b-instruct-q4_k_m.gguf. Unzip, pick your platform, run.

📁 /models: LLaMA 3.2 GGUF · ~2.0 GB
🪟 forensic_assistant_setup.exe
🐧 forensic_assistant_amd64.deb
Download Full Bundle · Source Code Only
ZIP archive · ~2.1 GB total · platform-agnostic
🪟

Windows

Windows 10 / 11 — 64-bit

📁
models/
llama-3.2-3b-instruct-q4_k_m.gguf · ~2.0 GB
FOLDER
⚙️
forensic_assistant_setup.exe
Windows installer · ~45 MB
.EXE
Download for Windows
Run as Administrator for full collection
🐧

Linux

Ubuntu 20.04+ / Debian / Kali

📁
models/
llama-3.2-3b-instruct-q4_k_m.gguf · ~2.0 GB
FOLDER
📦
forensic_assistant_amd64.deb
Debian package · ~38 MB
.DEB
Download for Linux
Run with sudo for full collection
// QUICK START — AFTER DOWNLOAD
WINDOWS
1. Unzip the downloaded archive
2. Run forensic_assistant_setup.exe as Administrator
3. models/ folder auto-detected
4. Open http://localhost:5000
LINUX
$ unzip forensic_assistant_linux.zip
$ sudo dpkg -i forensic_assistant_amd64.deb
$ sudo forensic-assistant
http://127.0.0.1:5000
GET STARTED

Start your
investigation.

Run locally. No cloud. No telemetry. Your forensic data stays exactly where it should — on your machine.

Download Now · View Documentation