Data Collection Practices

Transparency Report | Last Updated: December 2025

This document provides a comprehensive breakdown of every piece of data qwip collects, why we collect it, and how long we keep it.


Table of Contents

  1. Anonymous Session Analytics
  2. Image Hash Database (Optional)
  3. Aggregated Daily Metrics
  4. What We DON'T Collect
  5. Data Lifecycle & Retention
  6. Your Control
  7. Compliance Summary

1. Anonymous Session Analytics

Purpose: Track platform usage for product improvement and investor metrics (DAU/MAU).

User Control: No separate toggle (required for basic analytics), but fully anonymous. Turning off all server features (Section 6, Option 1) stops it entirely.

What We Collect

| Data Point | Technical Description | Example Value | Why We Collect |
| --- | --- | --- | --- |
| Session ID | Cryptographically random UUID v4 | a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c | Count unique users without identifying them |
| Images Analyzed (Count) | Integer counter | 47 | Measure platform usage and engagement |
| Model Preference | String (model name) | "cifake" or "genimage" | Optimize default model selection |
| Last Seen | Timestamp (UTC) | 2025-12-28T10:30:00Z | Calculate DAU/WAU/MAU metrics |

Example Database Entry

json
{
  "session_id": "a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c",
  "first_seen": "2025-12-20T14:22:00Z",
  "last_seen": "2025-12-28T10:30:00Z",
  "total_images_analyzed": 47,
  "preferred_model": "cifake"
}

What This Data CANNOT Tell Us

  • Who you are - Session ID is random, not linked to any identity
  • What images you analyzed - No image data, URLs, or content
  • Where you browsed - No website URLs or domains
  • Your location - No IP addresses stored
  • Your device - No fingerprinting or hardware info

How Session ID Is Generated

javascript
// Extension code: server-api.js
const sessionId = crypto.randomUUID();
// Example output (hex digits only; the third group always starts with 4): "a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c"

Properties:

  • Random: Uses browser's cryptographically secure random number generator
  • Unique: 122 random bits per ID; even across one billion IDs, the chance of any collision stays below 1 in 10^18
  • Anonymous: Not linked to any personal identifier
  • Local: Generated and stored in browser's local storage
  • Deletable: User can clear it anytime in settings
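The properties above could be realized along the following lines. This is a minimal sketch, not the extension's actual code: the `SESSION_KEY` name, both helper functions, and the Map-like `storage` parameter (standing in for the browser's local storage) are assumptions made for illustration.

```javascript
// Hypothetical sketch; the names below are illustrative, not taken from server-api.js.
// In the extension, the browser's global crypto.randomUUID() would be used;
// Node's built-in module offers the same function so the sketch runs standalone.
const { randomUUID } = require("node:crypto");

const SESSION_KEY = "sessionId"; // assumed storage key

// Return the stored session ID, creating a random one on first use.
function getOrCreateSessionId(storage) {
  let id = storage.get(SESSION_KEY);
  if (!id) {
    id = randomUUID(); // random version-4 UUID
    storage.set(SESSION_KEY, id);
  }
  return id;
}

// Delete the stored ID; the next call to getOrCreateSessionId() mints a new one.
function clearSessionId(storage) {
  storage.delete(SESSION_KEY);
}

// Usage, with a Map standing in for local storage.
const storage = new Map();
const first = getOrCreateSessionId(storage);
const again = getOrCreateSessionId(storage); // same ID while it is stored
clearSessionId(storage);
const fresh = getOrCreateSessionId(storage); // new, unrelated ID
```

Because the replacement ID is drawn independently at random, nothing in it connects it to the ID that was cleared.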

2. Image Hash Database (Optional)

Purpose: Crowdsource detection accuracy by recognizing previously-seen images.

User Control: Can disable in settings ("Contribute to Database" toggle).

What We Collect

When you enable "Contribute to Database":

| Data Point | Technical Description | Example Value | Can Identify You? |
| --- | --- | --- | --- |
| Perceptual Hashes | Binary hash vectors (5 types) | mean: 0xA3F2B1C4... | ❌ No - one-way function |
| BLAKE3 Content Hash | 256-bit cryptographic hash | abc123def456... (64 chars) | ❌ No - one-way function |
| Detection Result | Boolean + confidence | likely_ai: true, confidence: 0.92 | ❌ No - just a label |
| Model Used | String (model name) | "cifake" | ❌ No |
| Timestamp | UTC timestamp | 2025-12-28T10:30:00Z | ❌ No |

Example Database Entry

json
{
  "blake3": "abc123def456789...abcdef123456789abcdef123456789abcdef123456789",
  "perceptual_hashes": {
    "mean": "0xA3F2B1C4D5E6F708",
    "gradient": "0xB4C5D6E7F8091A2B",
    "double_gradient": "0xC6D7E8F9001122A3",
    "block": "0xD8E9F0A1B2C3D4E5",
    "dct": "0xE0F1A2B3C4D5E6F7"
  },
  "likely_ai": true,
  "confidence": 0.92,
  "model": "cifake",
  "votes": 3,
  "first_seen": "2025-12-28T10:30:00Z"
}

Why Hashes Can't Be Reversed

Perceptual Hashing:

  • Input: 1920×1080 image = 2,073,600 pixels × 3 channels = 6.2 million values
  • Output: 64-bit hash = 8 bytes = 18,446,744,073,709,551,616 possible values
  • Information loss: Massive reduction (6.2 million values → one 64-bit value)
  • Reversal: Exact reconstruction is impossible - astronomically many distinct images map to the same hash
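To make the information loss concrete, here is a minimal sketch of a mean (average) hash, one common perceptual-hash construction. It is an illustration under assumptions, not qwip's implementation: the `meanHash` name and the already-shrunk 8×8 grayscale input are hypothetical, and a real pipeline would first downscale and desaturate the full image.

```javascript
// Hypothetical sketch of a 64-bit mean hash over an 8x8 grayscale thumbnail.
// Each output bit records only whether one pixel is brighter than the average.
function meanHash(pixels) {
  if (pixels.length !== 64) {
    throw new RangeError("expected 64 grayscale values (8x8 thumbnail)");
  }
  const mean = pixels.reduce((sum, v) => sum + v, 0) / pixels.length;
  let hash = 0n;
  for (const v of pixels) {
    hash = (hash << 1n) | (v > mean ? 1n : 0n); // append one bit per pixel
  }
  return "0x" + hash.toString(16).toUpperCase().padStart(16, "0");
}

// Two different thumbnails: a gradient, and the same gradient brightened by 20.
const gradient = Array.from({ length: 64 }, (_, i) => i * 3);
const brighter = gradient.map((v) => v + 20);

const hashA = meanHash(gradient); // "0x00000000FFFFFFFF"
const hashB = meanHash(brighter); // identical, although every pixel differs
```

Both inputs yield the same hash even though no pixel value matches, which is why a hash alone cannot pin down the image it came from.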

BLAKE3 Content Hashing:

  • Cryptographic hash function (successor to BLAKE2)
  • One-way: Computing the hash from an image is fast; recovering the image from the hash is computationally infeasible
  • Collision resistant: Different images → different hashes (with overwhelming probability)

Analogy: It's like taking a photo and reducing it to one number: its average brightness. You can't recreate the photo from just knowing it was "medium brightness."

What This Data CANNOT Tell Us

  • The original image - Hashes are one-way, can't be reversed
  • Who submitted it - No session ID or user identifier attached
  • Where it came from - No URL, domain, or website information
  • When you saw it - Only when hash was first added to database

3. Aggregated Daily Metrics

Purpose: Platform health monitoring and investor reporting.

Storage: Aggregated counters, no individual records.

What We Aggregate

| Metric | Description | Example Value | Granularity |
| --- | --- | --- | --- |
| Daily Active Users (DAU) | Unique session IDs seen today | 1,234 | Per day |
| Weekly Active Users (WAU) | Unique session IDs seen this week | 5,678 | Per week |
| Monthly Active Users (MAU) | Unique session IDs seen this month | 12,345 | Per month |
| Images Analyzed | Total count across all users | 98,765 | Per day |
| AI Detection Rate | Percentage flagged as AI | 23.4% | Per day |
| Average Confidence | Mean confidence score | 0.87 | Per day |

Example Aggregated Data

json
{
  "date": "2025-12-28",
  "dau": 1234,
  "images_analyzed": 98765,
  "ai_detection_rate": 0.234,
  "avg_confidence": 0.87,
  "model_distribution": {
    "cifake": 67,
    "genimage": 28,
    "swin": 5
  }
}

Privacy Protection

  • No individual records - Only totals and averages
  • No reverse lookup - Can't go from aggregate → individual users
  • Time-series only - Daily/weekly/monthly snapshots
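The aggregation described in this section can be sketched as follows. This is an illustration under assumptions: the output field names mirror the example JSON above, but the `aggregateDay` function and the shape of its inputs are hypothetical, not the server's actual code.

```javascript
// Hypothetical sketch: reduce one day's activity to totals and averages.
// sessionIdsSeen: session IDs from that day's heartbeats (duplicates allowed).
// detections: per-image results, carrying no session ID.
function aggregateDay(date, sessionIdsSeen, detections) {
  const total = detections.length;
  const aiCount = detections.filter((d) => d.likelyAi).length;
  const confidenceSum = detections.reduce((sum, d) => sum + d.confidence, 0);
  return {
    date,
    dau: new Set(sessionIdsSeen).size, // only the count is kept, not the IDs
    images_analyzed: total,
    ai_detection_rate: total === 0 ? 0 : aiCount / total,
    avg_confidence: total === 0 ? 0 : confidenceSum / total,
  };
}

const summary = aggregateDay(
  "2025-12-28",
  ["id-a", "id-b", "id-a"],
  [
    { likelyAi: true, confidence: 1.0 },
    { likelyAi: false, confidence: 0.5 },
    { likelyAi: false, confidence: 0.75 },
    { likelyAi: false, confidence: 0.75 },
  ]
);
// summary: { date: "2025-12-28", dau: 2, images_analyzed: 4,
//            ai_detection_rate: 0.25, avg_confidence: 0.75 }
```

The returned object contains no session IDs and no per-image rows, so individual activity cannot be recovered from it.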

4. What We DON'T Collect

Never Collected (By Design)

| Category | Examples | Why Not |
| --- | --- | --- |
| Personal Information | Name, email, phone number, address | Not needed for functionality |
| Browsing History | URLs visited, websites browsed | Not needed, privacy violation |
| Image Content | Pixels, thumbnails, screenshots | Detection runs locally |
| Location Data | GPS, IP address geolocation | Not needed |
| Device Fingerprints | Browser version, screen size, fonts | Privacy violation |
| Cookies | Tracking cookies, third-party cookies | Not used |
| Cross-Site Tracking | Following you across websites | Privacy violation |

Not Even in Server Logs

Standard web server logs often include:

  • IP addresses → We strip these from logs
  • User agents → We don't log these
  • Referrer headers → We don't log these

Our server logs only contain:

  • Timestamp
  • HTTP method and endpoint accessed (/api/heartbeat, /api/check, /api/contribute)
  • Response status (200, 404, 500)
  • Response time in milliseconds
  • Anonymous session ID (only if included in request)

Example sanitized log:

2025-12-28T10:30:00Z POST /api/heartbeat 200 45ms
2025-12-28T10:30:15Z POST /api/check 200 120ms
2025-12-28T10:30:22Z POST /api/contribute 200 35ms
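One way to produce lines like these is to build each entry from an explicit allow-list of fields, so that anything else on the request never reaches the log. The sketch below assumes a hypothetical `formatLogLine` helper and request shape; it is not the server's actual logging code.

```javascript
// Hypothetical sketch: only the destructured fields can appear in the output.
function formatLogLine(request) {
  const { timestamp, method, path, status, durationMs } = request;
  return `${timestamp} ${method} ${path} ${status} ${durationMs}ms`;
}

const line = formatLogLine({
  timestamp: "2025-12-28T10:30:00Z",
  method: "POST",
  path: "/api/heartbeat",
  status: 200,
  durationMs: 45,
  ip: "203.0.113.7", // present on the request, never written
  userAgent: "ExampleBrowser/1.0", // never written
  referrer: "https://example.com/", // never written
});
// line === "2025-12-28T10:30:00Z POST /api/heartbeat 200 45ms"
```

An allow-list fails safe: a header added to requests later stays out of the log unless someone deliberately adds it to the formatter.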

5. Data Lifecycle & Retention

Redis Cache (Temporary Storage)

| Data Type | TTL (Time To Live) | Auto-Delete |
| --- | --- | --- |
| Daily active users | 25 hours | Yes |
| Weekly active users | 8 days | Yes |
| Monthly active users | 32 days | Yes |
| Session stats | 30 days | Yes |

Why Redis?

  • In-memory storage (fast)
  • Automatic expiration (privacy by default)
  • No persistent logging
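Redis expires keys on its own once a TTL is set. The sketch below mimics that behavior with a small in-memory store so the effect of the TTLs in the table can be seen without a Redis server. The `TtlStore` class and the key names are assumptions for illustration; only the TTL durations come from the table above.

```javascript
// Hypothetical sketch: an in-memory stand-in for key expiry.
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// TTLs from the table above, in milliseconds (key names are illustrative).
const TTL_MS = {
  dau: 25 * HOUR_MS,
  wau: 8 * DAY_MS,
  mau: 32 * DAY_MS,
  sessionStats: 30 * DAY_MS,
};

class TtlStore {
  constructor() {
    this.entries = new Map();
  }

  // Store a value together with the moment it stops being readable.
  set(key, value, ttlMs, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }

  // Return the value, or undefined (and drop the entry) once the TTL has passed.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Usage: a daily counter is readable for 25 hours, then gone.
const t0 = Date.parse("2025-12-28T00:00:00Z");
const store = new TtlStore();
store.set("dau:2025-12-28", 1234, TTL_MS.dau, t0);
const after24h = store.get("dau:2025-12-28", t0 + 24 * HOUR_MS); // 1234
const after26h = store.get("dau:2025-12-28", t0 + 26 * HOUR_MS); // undefined
```

Each TTL sits slightly above its reporting window (25 hours for a day, 8 days for a week), presumably so a counter outlives the period it measures before it disappears.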

SQLite Database (Persistent Storage)

| Data Type | Retention Policy | Deletion |
| --- | --- | --- |
| Session records | 30 days of inactivity | Automatic cron job |
| Daily aggregates | 90 days | Automatic |
| Image hashes | Indefinite* | On user request |

*Image hashes are kept indefinitely to maintain database accuracy, but can be deleted on user request.
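The 30-day inactivity rule for session records can be sketched as a filter over `last_seen` timestamps. This is an illustration only: the real purge is described as a cron job against SQLite, while the hypothetical `purgeInactiveSessions` function below works on an in-memory array shaped like the example database entry in Section 1.

```javascript
// Hypothetical sketch: keep only sessions seen within the retention window.
const RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30 days of inactivity

function purgeInactiveSessions(records, now) {
  const cutoff = now.getTime() - RETENTION_MS;
  return records.filter((record) => Date.parse(record.last_seen) >= cutoff);
}

const kept = purgeInactiveSessions(
  [
    { session_id: "recent", last_seen: "2025-12-28T10:30:00Z" },
    { session_id: "stale", last_seen: "2025-11-01T00:00:00Z" },
  ],
  new Date("2025-12-29T00:00:00Z")
);
// kept contains only the "recent" record
```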

Data Flow Diagram

Extension (Local)
       │
       │ Heartbeat every 5 min
       ▼
Redis Cache (TTL: 30 days)
       │
       │ Aggregate daily
       ▼
SQLite Database
       │
       │ Purge > 30 days inactive
       ▼
Deleted permanently

6. Your Control

Option 1: Disable All Server Features

How:

  1. Open qwip extension → Settings tab
  2. Toggle OFF "Server-Assisted Detection"
  3. Toggle OFF "Contribute to Database"

Result:

  • ✅ 100% local-only processing
  • ✅ Zero data sent to servers
  • ✅ Still works (slightly lower accuracy)

Option 2: Clear Your Session Data

How:

  1. Open qwip extension → Settings tab
  2. Click "Clear Session Data"

Result:

  • ✅ Session ID deleted immediately
  • ✅ New random ID generated on next use
  • ✅ Previous ID becomes orphaned (no way to link to you)

Option 3: Request Hash Deletion

How:

  • Email privacy@qwip.io with the BLAKE3 hash you want deleted
  • We'll remove it within 7 days

Note: You'd need to know the exact hash (visible in extension developer console if you enabled logging).

Option 4: Uninstall Extension

How:

  1. Right-click extension icon → Remove from Chrome
  2. Confirm deletion

Result:

  • ✅ All local data cleared (session ID, stats, settings)
  • ✅ Server still has orphaned session ID (anonymous, can't link to you)
  • ✅ Image hashes remain (no way to identify which were yours)

7. Compliance Summary

GDPR (EU General Data Protection Regulation)

| Requirement | Our Compliance |
| --- | --- |
| Data minimization | ✅ Only collect anonymous session IDs |
| Purpose limitation | ✅ Only for platform analytics |
| Storage limitation | ✅ Session records auto-deleted after 30 days of inactivity |
| Right to erasure | ✅ Clear session data anytime |
| Right to access | ✅ View local storage, request server data |
| Right to portability | ✅ Export local storage JSON |

Lawful basis: Legitimate interest (Art. 6(1)(f)) - minimal anonymous analytics for service improvement.

CCPA (California Consumer Privacy Act)

| Category | Status |
| --- | --- |
| Personal Information collected | ❌ None (session IDs are not PI under CCPA) |
| Sale of personal information | ❌ N/A (no PI to sell) |
| Right to know | ✅ This document |
| Right to delete | ✅ Clear session data |
| Right to opt-out | ✅ Disable server features |

COPPA (Children's Online Privacy Protection Act)

Status: ✅ Compliant

  • No personal information collected from anyone (including children under 13)
  • No parental consent needed
  • Safe for all ages

Questions?


Transparency Commitment: We update this document quarterly and notify users of any material changes. Last review: December 2025.

Open source and privacy-first