Data Collection Practices

Transparency Report | Last Updated: January 2026

This document provides a comprehensive breakdown of every piece of data qwip collects, why we collect it, and how long we keep it.


Table of Contents

  1. Anonymous Session Analytics
  2. 24-Hour Rotating API Keys (v2.0)
  3. Image Hash Database (Optional)
  4. Aggregated Daily Metrics
  5. What We DON'T Collect
  6. Data Lifecycle & Retention
  7. Your Control
  8. Compliance Summary

1. Anonymous Session Analytics

Purpose: Track platform usage for product improvement and investor metrics (DAU/MAU).

User Control: Cannot be disabled (required for basic analytics), but the data is fully anonymous.

What We Collect

| Data Point | Technical Description | Example Value | Why We Collect |
| --- | --- | --- | --- |
| Session ID | Cryptographically random UUID v4 | a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c | Count unique users without identifying them |
| Images Analyzed (Count) | Integer counter | 47 | Measure platform usage and engagement |
| Model Preference | String (model name) | "cifake" or "genimage" | Optimize default model selection |
| Last Seen | Timestamp (UTC) | 2025-12-28T10:30:00Z | Calculate DAU/WAU/MAU metrics |

Example Database Entry

json
{
  "session_id": "a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c",
  "first_seen": "2025-12-20T14:22:00Z",
  "last_seen": "2025-12-28T10:30:00Z",
  "total_images_analyzed": 47,
  "preferred_model": "cifake"
}

What This Data CANNOT Tell Us

  • Who you are - Session ID is random, not linked to any identity
  • What images you analyzed - No image data, URLs, or content
  • Where you browsed - No website URLs or domains
  • Your location - No IP addresses stored
  • Your device - No fingerprinting or hardware info

How Session ID Is Generated

javascript
// Extension code: server-api.js
const sessionId = crypto.randomUUID();
// Example output: "a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c"

Properties:

  • Random: Uses browser's cryptographically secure random number generator
  • Unique: UUID v4 carries 122 random bits, so the collision probability stays below 1 in 10^18 even across a billion sessions
  • Anonymous: Not linked to any personal identifier
  • Local: Generated and stored in browser's local storage
  • Deletable: User can clear it anytime in settings
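
The properties above can be illustrated with a minimal sketch. It is not the extension's actual code: the storage key name and the `storage` object (a Map standing in for the browser's local storage) are assumptions, and Node's `randomUUID` is used in place of the browser's `crypto.randomUUID()` so the example runs outside a browser.

```javascript
// In a browser, the global crypto.randomUUID() plays the same role.
const { randomUUID } = require('node:crypto');

const SESSION_KEY = 'qwip_session_id'; // hypothetical storage key name

// `storage` is any object with get/set/delete; here a Map stands in for
// the browser's local storage.
function getOrCreateSessionId(storage) {
  let id = storage.get(SESSION_KEY);
  if (!id) {
    id = randomUUID(); // random UUID v4; nothing about the user or device goes in
    storage.set(SESSION_KEY, id);
  }
  return id;
}

// Deleting the stored value is all that "clearing" requires on the client.
function clearSessionId(storage) {
  storage.delete(SESSION_KEY);
}

const storage = new Map();
const first = getOrCreateSessionId(storage);
const again = getOrCreateSessionId(storage); // same ID until it is cleared
clearSessionId(storage);
const fresh = getOrCreateSessionId(storage); // new ID, unrelated to the first
```

Because the new ID is drawn independently of the old one, nothing on the client connects `first` to `fresh`.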

2. 24-Hour Rotating API Keys (v2.0)

Purpose: Rate limiting and abuse prevention while preserving privacy.

User Control: Automatically managed (no manual configuration needed).

What We Collect

When using server-assisted detection:

| Data Point | Technical Description | Example Value | Can Identify You? |
| --- | --- | --- | --- |
| API Key | Cryptographically random 64-char hex | qwip_v2_a1b2c3d4... | ⚠️ Temporary (24h only) |
| Created At | UTC timestamp | 2026-01-05T15:30:00Z | ❌ No |
| Expires At | UTC timestamp | 2026-01-06T15:30:00Z | ❌ No |
| Last Used | UTC timestamp | 2026-01-06T10:15:00Z | ❌ No |
| Request Counts | Integers (per endpoint) | check: 47, contribute: 12 | ❌ No |
| Reputation Score | Float (0.0-1.0) | 0.95 | ❌ No |
| Session ID | Optional linkage | a3f2b1c4... (from section 1) | ❌ No |

Example Database Entry

json
{
  "key_id": "qwip_v2_a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456",
  "created_at": "2026-01-05T15:30:00Z",
  "expires_at": "2026-01-06T15:30:00Z",
  "last_used": "2026-01-06T10:15:00Z",
  "is_active": true,
  "session_id": "a3f2b1c4-5d6e-4f7a-9b0c-1d2e3f4a5b6c",
  "request_count_check": 47,
  "request_count_contribute": 12,
  "total_requests": 59,
  "reputation_score": 0.95,
  "positive_votes": 45,
  "negative_votes": 2,
  "rate_limit_hits": 0,
  "suspicious_patterns": false
}

How It Works

  1. Extension requests key on first install via /api/register (IP rate-limited: 100/day)
  2. Server generates key using cryptographically secure random number generator
  3. Key expires after 24 hours automatically (privacy by design)
  4. Extension auto-rotates 30 minutes before expiry (seamless, no downtime)
  5. Expired keys have a 2-hour grace period (the server issues a new key in the 401 response)
  6. After the grace period, the extension must request a new key (IP rate limiting prevents abuse)
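
The lifecycle in the steps above can be sketched as a small state function. This is an illustration under stated assumptions rather than the server's implementation: the function names and state labels are hypothetical, and only the 24-hour lifetime, 30-minute rotation window, 2-hour grace period, and key format come from this document.

```javascript
const { randomBytes } = require('node:crypto');

const HOUR_MS = 60 * 60 * 1000;
const KEY_LIFETIME_MS = 24 * HOUR_MS;     // step 3: key expires after 24 hours
const ROTATE_BEFORE_MS = 30 * 60 * 1000;  // step 4: rotate 30 minutes early
const GRACE_PERIOD_MS = 2 * HOUR_MS;      // step 5: 2-hour grace period

// Issue a key in the documented format: "qwip_v2_" + 64 hex characters.
function issueKey(nowMs) {
  return {
    key_id: 'qwip_v2_' + randomBytes(32).toString('hex'), // 32 bytes = 64 hex chars
    created_at: new Date(nowMs).toISOString(),
    expires_at: new Date(nowMs + KEY_LIFETIME_MS).toISOString(),
  };
}

// Classify a key at time nowMs (state names are hypothetical labels).
function keyState(key, nowMs) {
  const expiresMs = Date.parse(key.expires_at);
  if (nowMs < expiresMs - ROTATE_BEFORE_MS) return 'active';
  if (nowMs < expiresMs) return 'rotate-now';               // client should rotate
  if (nowMs < expiresMs + GRACE_PERIOD_MS) return 'grace';  // 401 carries a new key
  return 'expired';                                         // re-register via /api/register
}

const issuedAt = Date.parse('2026-01-05T15:30:00Z');
const key = issueKey(issuedAt);
console.log(key.expires_at);                            // 2026-01-06T15:30:00.000Z
console.log(keyState(key, issuedAt + 10 * HOUR_MS));    // active
console.log(keyState(key, issuedAt + 23.75 * HOUR_MS)); // rotate-now
console.log(keyState(key, issuedAt + 25 * HOUR_MS));    // grace
console.log(keyState(key, issuedAt + 27 * HOUR_MS));    // expired
```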

Privacy Properties

What This Data CAN Tell Us (Temporarily):

  • ✅ How many requests you made during the 24-hour window
  • ✅ Whether your contributions tend to be accurate (reputation score)
  • ✅ If you're hitting rate limits (abuse detection)

What This Data CANNOT Tell Us (Even Temporarily):

  • Who you are - Key is random, no identity linkage
  • What images you analyzed - Only request counts, no content
  • Where you browsed - No URLs or domains
  • Your location - IP addresses hashed before storage (for rate limiting only)
  • Long-term tracking - Keys expire after 24h, no persistent identifier

Why 24-Hour Rotation?

Privacy: Prevents long-term tracking. After 24 hours, your old key is orphaned and cannot be linked to your new key.

Abuse Prevention: Rate limiting (1000 req/day per key) prevents spam and abuse.

Balance: Rotating keys strike a balance between privacy (no permanent ID) and functionality (rate limiting, reputation tracking).

Comparison: Traditional vs Rotating Keys

| Property | Traditional API Keys | qwip Rotating Keys (v2.0) |
| --- | --- | --- |
| Lifetime | Permanent | 24 hours |
| Long-term tracking | ✅ Yes | ❌ No |
| Rate limiting | ✅ Yes | ✅ Yes |
| Abuse prevention | ✅ Yes | ✅ Yes |
| User privacy | ❌ Poor | ✅ Good |
| Extractable | ⚠️ Yes | ⚠️ Yes (by design) |

IP Address Handling (Rate Limiting Only)

When registering new keys via /api/register:

  • IP address is hashed (one-way SHA256) before storage
  • Used only to enforce 100 registrations/day limit per IP
  • Not stored in plaintext anywhere
  • Not logged in server logs
  • Cannot be reversed to identify you
  • Automatically deleted after 25 hours

Example:

Your IP: 192.168.1.1
Stored hash: a1b2c3d4e5f6...  (SHA256, cannot be reversed)
Purpose: Count registrations from this hash
Retention: 25 hours (auto-delete)
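
A minimal sketch of how such a limiter could work, assuming a Node.js server. The function names are hypothetical, and an in-memory Map stands in for whatever store actually holds the counters; only the SHA-256 hashing, the 100/day limit, and the 25-hour retention come from this document. The address used is from a reserved documentation range.

```javascript
const { createHash } = require('node:crypto');

const MAX_REGISTRATIONS = 100;          // registrations allowed per IP hash
const WINDOW_MS = 25 * 60 * 60 * 1000;  // counter entries are dropped after 25 hours

// Only this hex digest is ever used as a lookup key; the raw IP is not stored.
function hashIp(ip) {
  return createHash('sha256').update(ip).digest('hex');
}

// counters: Map<ipHash, { count, expiresAt }>, a stand-in for a store with TTLs.
function tryRegister(counters, ip, nowMs) {
  const ipHash = hashIp(ip);
  let entry = counters.get(ipHash);
  if (!entry || nowMs >= entry.expiresAt) {
    // First registration, or the previous entry has aged out.
    entry = { count: 0, expiresAt: nowMs + WINDOW_MS };
    counters.set(ipHash, entry);
  }
  if (entry.count >= MAX_REGISTRATIONS) return false; // over the limit
  entry.count += 1;
  return true;
}

const counters = new Map();
const windowStart = Date.parse('2026-01-05T15:30:00Z');
let allowed = 0;
for (let i = 0; i < 101; i++) {
  if (tryRegister(counters, '203.0.113.7', windowStart)) allowed += 1;
}
console.log(allowed); // 100 (the 101st attempt is refused)
console.log(tryRegister(counters, '203.0.113.7', windowStart + WINDOW_MS)); // true
```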

3. Image Hash Database (Optional)

Purpose: Crowdsource detection accuracy by recognizing previously-seen images.

User Control: Can disable in settings ("Contribute to Database" toggle).

What We Collect

When you enable "Contribute to Database":

| Data Point | Technical Description | Example Value | Can Identify You? |
| --- | --- | --- | --- |
| Perceptual Hashes | Binary hash vectors (5 types) | mean: 0xA3F2B1C4... | ❌ No - one-way function |
| BLAKE3 Content Hash | 256-bit cryptographic hash | abc123def456... (64 chars) | ❌ No - one-way function |
| Detection Result | Boolean + confidence | likely_ai: true, confidence: 0.92 | ❌ No - just a label |
| Model Used | String (model name) | "cifake" | ❌ No |
| Timestamp | UTC timestamp | 2025-12-28T10:30:00Z | ❌ No |

Example Database Entry

json
{
  "blake3": "abc123def456789...abcdef123456789abcdef123456789abcdef123456789",
  "perceptual_hashes": {
    "mean": "0xA3F2B1C4D5E6F7A8",
    "gradient": "0xB4C5D6E7F8A9B0C1",
    "double_gradient": "0xC6D7E8F9A0B1C2D3",
    "block": "0xD8E9F0A1B2C3D4E5",
    "dct": "0xE0F1A2B3C4D5E6F7"
  },
  "likely_ai": true,
  "confidence": 0.92,
  "model": "cifake",
  "votes": 3,
  "first_seen": "2025-12-28T10:30:00Z"
}

Why Hashes Can't Be Reversed

Perceptual Hashing:

  • Input: 1920×1080 image = 2,073,600 pixels × 3 channels ≈ 6.2 million values
  • Output: 64-bit hash = 8 bytes = one of 18,446,744,073,709,551,616 (2^64) possible values
  • Information loss: Massive reduction (6.2 million values → one 64-bit value)
  • Reversal: Not possible - far more distinct images exist than hash values, so an enormous number of different images map to the same hash

BLAKE3 Content Hashing:

  • Cryptographic hash function (successor to BLAKE2)
  • One-way: Computing the hash from an image is fast; recovering the image from the hash is computationally infeasible
  • Collision resistant: Different images → different hashes (with overwhelming probability)

Analogy: It's like taking a photo and reducing it to one number: its average brightness. You can't recreate the photo from just knowing it was "medium brightness."
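
To make the information loss concrete, here is a sketch of one common perceptual hash, the mean (average) hash. It assumes the image has already been reduced to an 8×8 grayscale grid; the exact algorithms qwip uses for its five hash types are not specified in this document, so treat the function below as an illustration only.

```javascript
// pixels: 64 grayscale values (0-255) from an image already downscaled to 8x8.
// Each output bit records only whether a pixel is brighter than the mean.
function meanHash(pixels) {
  if (pixels.length !== 64) throw new Error('expected 64 grayscale values');
  const mean = pixels.reduce((sum, v) => sum + v, 0) / 64;
  let bits = 0n;
  for (const v of pixels) {
    bits = (bits << 1n) | (v > mean ? 1n : 0n);
  }
  return '0x' + bits.toString(16).toUpperCase().padStart(16, '0');
}

// Two different images: dark left half, bright right half, different intensities.
const imageA = Array.from({ length: 64 }, (_, i) => (i % 8 < 4 ? 10 : 200));
const imageB = Array.from({ length: 64 }, (_, i) => (i % 8 < 4 ? 60 : 140));

console.log(meanHash(imageA)); // 0x0F0F0F0F0F0F0F0F
console.log(meanHash(imageB)); // 0x0F0F0F0F0F0F0F0F
```

Both inputs produce the same 64-bit value, so the hash alone cannot say which image, or which of the many other images with this layout, was analyzed.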

What This Data CANNOT Tell Us

  • The original image - Hashes are one-way, can't be reversed
  • Who submitted it - No session ID or user identifier attached
  • Where it came from - No URL, domain, or website information
  • When you saw it - Only when hash was first added to database

4. Aggregated Daily Metrics

Purpose: Platform health monitoring and investor reporting.

Storage: Aggregated counters, no individual records.

What We Aggregate

| Metric | Description | Example Value | Granularity |
| --- | --- | --- | --- |
| Daily Active Users (DAU) | Unique session IDs seen today | 1,234 | Per day |
| Weekly Active Users (WAU) | Unique session IDs seen this week | 5,678 | Per week |
| Monthly Active Users (MAU) | Unique session IDs seen this month | 12,345 | Per month |
| Images Analyzed | Total count across all users | 98,765 | Per day |
| AI Detection Rate | Percentage flagged as AI | 23.4% | Per day |
| Average Confidence | Mean confidence score | 0.87 | Per day |

Example Aggregated Data

json
{
  "date": "2025-12-28",
  "dau": 1234,
  "images_analyzed": 98765,
  "ai_detection_rate": 0.234,
  "avg_confidence": 0.87,
  "model_distribution": {
    "cifake": 67,
    "genimage": 28,
    "swin": 5
  }
}

Privacy Protection

  • No individual records - Only totals and averages
  • No reverse lookup - Can't go from aggregate → individual users
  • Time-series only - Daily/weekly/monthly snapshots
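
A minimal sketch of how a daily record like the one above could be computed. The function name and input shapes are assumptions; the point is that session IDs and detection results arrive as separate lists, so the aggregate never ties a result to a session.

```javascript
// sessionIdsSeen: session IDs from today's heartbeats (duplicates expected).
// detections: per-image results with no session ID attached.
function aggregateDay(date, sessionIdsSeen, detections) {
  const n = detections.length;
  const aiCount = detections.filter((d) => d.likely_ai).length;
  const confidenceSum = detections.reduce((sum, d) => sum + d.confidence, 0);
  return {
    date,
    dau: new Set(sessionIdsSeen).size, // unique sessions only
    images_analyzed: n,
    ai_detection_rate: n ? aiCount / n : 0,
    avg_confidence: n ? confidenceSum / n : 0,
  };
}

const daily = aggregateDay(
  '2025-12-28',
  ['s1', 's2', 's1', 's3'],
  [
    { likely_ai: true, confidence: 1.0 },
    { likely_ai: false, confidence: 0.75 },
    { likely_ai: false, confidence: 0.5 },
    { likely_ai: false, confidence: 0.75 },
  ],
);
console.log(daily);
// dau: 3, images_analyzed: 4, ai_detection_rate: 0.25, avg_confidence: 0.75
```

Only the returned totals would need to be persisted; the input lists can be discarded once the day is summarized.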

5. What We DON'T Collect

Never Collected (By Design)

| Category | Examples | Why Not |
| --- | --- | --- |
| Personal Information | Name, email, phone number, address | Not needed for functionality |
| Browsing History | URLs visited, websites browsed | Not needed, privacy violation |
| Image Content | Pixels, thumbnails, screenshots | Detection runs locally |
| Location Data | GPS, IP address geolocation | Not needed |
| Device Fingerprints | Browser version, screen size, fonts | Privacy violation |
| Cookies | Tracking cookies, third-party cookies | Not used |
| Cross-Site Tracking | Following you across websites | Privacy violation |

Not Even in Server Logs

Standard web server logs often include:

  • IP addresses → We strip these from logs
  • User agents → We don't log these
  • Referrer headers → We don't log these

Our server logs only contain:

  • Timestamp
  • HTTP method and endpoint accessed (e.g., /api/check, /api/contribute)
  • Response status (200, 404, 500)
  • Response time
  • Anonymous session ID (only if included in request)

Example sanitized log:

2025-12-28T10:30:00Z POST /api/heartbeat 200 45ms
2025-12-28T10:30:15Z POST /api/check 200 120ms
2025-12-28T10:30:22Z POST /api/contribute 200 35ms
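
A sketch of a formatter that would produce lines in this shape. The request and response objects are hypothetical stand-ins for whatever the server framework provides; the function reads only the method, path, status, and duration, so the IP address, user agent, and referrer never reach the log.

```javascript
// req and res shapes are hypothetical; only the fields read below matter.
function sanitizedLogLine(req, res, durationMs, now = new Date()) {
  const timestamp = now.toISOString().replace(/\.\d{3}Z$/, 'Z'); // drop milliseconds
  const path = req.url.split('?')[0]; // drop any query string
  return `${timestamp} ${req.method} ${path} ${res.statusCode} ${durationMs}ms`;
}

const line = sanitizedLogLine(
  {
    method: 'POST',
    url: '/api/check',
    ip: '203.0.113.7', // present on the request, never written to the log
    headers: { 'user-agent': 'ExampleBrowser/1.0', referer: 'https://example.com/' },
  },
  { statusCode: 200 },
  120,
  new Date('2025-12-28T10:30:15Z'),
);
console.log(line); // 2025-12-28T10:30:15Z POST /api/check 200 120ms
```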

6. Data Lifecycle & Retention

Redis Cache (Temporary Storage)

| Data Type | TTL (Time To Live) | Auto-Delete |
| --- | --- | --- |
| Daily active users | 25 hours | Yes |
| Weekly active users | 8 days | Yes |
| Monthly active users | 32 days | Yes |
| Session stats | 30 days | Yes |
| Rate limit counters | 25 hours | Yes |
| IP registration hashes | 25 hours | Yes |

Why Redis?

  • In-memory storage (fast)
  • Automatic expiration (privacy by default)
  • No persistent logging

SQLite Database (Persistent Storage)

| Data Type | Retention Policy | Deletion |
| --- | --- | --- |
| API keys | 24 hours + 2h grace period | Automatic (26h total) |
| Session records | 30 days of inactivity | Automatic cron job |
| Daily aggregates | 90 days | Automatic |
| Image hashes | Indefinite* | On user request |

*Image hashes are kept indefinitely to maintain database accuracy, but can be deleted on user request.
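
The retention rules for persistent storage can be sketched as a purge pass. This is a simplified illustration with plain arrays standing in for database tables; the function name and record shapes are assumptions, and only the retention periods come from this document.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;
const MS_PER_DAY = 24 * MS_PER_HOUR;

const RETENTION = {
  apiKeyGraceMs: 2 * MS_PER_HOUR,       // keys are removed 2h after expiry (26h total)
  sessionInactivityMs: 30 * MS_PER_DAY, // sessions: 30 days of inactivity
  dailyAggregateMs: 90 * MS_PER_DAY,    // daily aggregates: 90 days
};

// Returns a copy of `db` with everything past its retention period removed.
function purge(db, nowMs) {
  return {
    api_keys: db.api_keys.filter(
      (k) => nowMs < Date.parse(k.expires_at) + RETENTION.apiKeyGraceMs),
    sessions: db.sessions.filter(
      (s) => nowMs - Date.parse(s.last_seen) < RETENTION.sessionInactivityMs),
    daily_aggregates: db.daily_aggregates.filter(
      (a) => nowMs - Date.parse(a.date) < RETENTION.dailyAggregateMs),
    image_hashes: db.image_hashes, // kept until a deletion request arrives
  };
}

const purged = purge(
  {
    api_keys: [{ expires_at: '2026-01-06T15:30:00Z' }, { expires_at: '2026-01-31T12:00:00Z' }],
    sessions: [{ last_seen: '2025-12-28T10:30:00Z' }, { last_seen: '2026-01-20T08:00:00Z' }],
    daily_aggregates: [{ date: '2025-10-01' }, { date: '2025-12-28' }],
    image_hashes: [{ first_seen: '2025-12-28T10:30:00Z' }],
  },
  Date.parse('2026-01-31T00:00:00Z'),
);
console.log(purged.api_keys.length, purged.sessions.length,
  purged.daily_aggregates.length, purged.image_hashes.length); // 1 1 1 1
```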

Data Flow Diagram

Extension (Local)
       │
       │ Heartbeat every 5 min
       ▼
Redis Cache (TTL: 30 days)
       │
       │ Aggregate daily
       ▼
SQLite Database
       │
       │ Purge > 30 days inactive
       ▼
Deleted permanently

7. Your Control

Option 1: Disable All Server Features

How:

  1. Open qwip extension → Settings tab
  2. Toggle OFF "Server-Assisted Detection"
  3. Toggle OFF "Contribute to Database"

Result:

  • ✅ 100% local-only processing
  • ✅ Zero data sent to servers
  • ✅ No API keys generated
  • ✅ Still works (slightly lower accuracy)

Option 2: Clear Your Session Data

How:

  1. Open qwip extension → Settings tab
  2. Click "Clear Session Data"

Result:

  • ✅ Session ID deleted immediately
  • ✅ Current API key deleted (orphaned on server)
  • ✅ New random session ID generated on next use
  • ✅ New API key generated on next server request
  • ✅ Previous ID and key become orphaned (no way to link to you)

Option 3: Request Hash Deletion

How:

  • Email privacy@qwip.io with the BLAKE3 hash you want deleted
  • We'll remove it within 7 days

Note: You need to know the exact hash, which is visible in the extension's developer console if you have enabled logging.

Option 4: Uninstall Extension

How:

  1. Right-click extension icon → Remove from Chrome
  2. Confirm deletion

Result:

  • ✅ All local data cleared (session ID, stats, settings)
  • ✅ Server still has orphaned session ID (anonymous, can't link to you)
  • ✅ Image hashes remain (no way to identify which were yours)

8. Compliance Summary

GDPR (EU General Data Protection Regulation)

| Requirement | Our Compliance |
| --- | --- |
| Data minimization | ✅ Only collect anonymous session IDs |
| Purpose limitation | ✅ Only for platform analytics |
| Storage limitation | ✅ Auto-delete after 30 days |
| Right to erasure | ✅ Clear session data anytime |
| Right to access | ✅ View local storage, request server data |
| Right to portability | ✅ Export local storage JSON |

Lawful basis: Legitimate interest (Art. 6(1)(f)) - minimal anonymous analytics for service improvement.

CCPA (California Consumer Privacy Act)

| Category | Status |
| --- | --- |
| Personal Information collected | ❌ None (session IDs are not PI under CCPA) |
| Sale of personal information | ❌ N/A (no PI to sell) |
| Right to know | ✅ This document |
| Right to delete | ✅ Clear session data |
| Right to opt-out | ✅ Disable server features |

COPPA (Children's Online Privacy Protection Act)

Status: ✅ Compliant

  • No personal information collected from anyone (including children under 13)
  • No parental consent needed
  • Safe for all ages


Transparency Commitment: We update this document quarterly and notify users of any material changes. Last review: January 2026 (v2.0 rotating API keys).

Open source and privacy-first