Compare commits

...

32 Commits

Author SHA1 Message Date
Martin Tranberg
d15b9afc03 Update README.md with new features and optimizations (Danish) 2026-04-12 12:46:15 +02:00
Martin Tranberg
8e8bb3baa1 Improve cancellation logic and sync performance.
- Implement explicit threading.Event propagation for robust GUI cancellation.
- Optimize file synchronization by skipping hash validation for up-to-date files (matching size and timestamp).
- Update Windows long path support to correctly handle UNC network shares.
- Refactor configuration management to eliminate global state and improve modularity.
- Remove requests.get monkey-patch in GUI.
- Delete CLAUDE.md as it is no longer required.
2026-04-12 12:44:43 +02:00
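The chunk-level cancellation this commit describes can be sketched as follows. This is a minimal illustration, not the repository's actual code; `download_stream` and its parameters are hypothetical names.

```python
import threading

CHUNK_SIZE = 1024 * 1024  # 1 MB, matching the tool's chunked streaming

def download_stream(response, local_path, stop_event):
    """Write a streamed HTTP response to disk chunk by chunk, checking the
    stop flag between chunks so a GUI cancel takes effect mid-stream."""
    with open(local_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            if stop_event.is_set():
                return False  # cancelled mid-download; caller may remove the partial file
            if chunk:
                f.write(chunk)
    return True
```

Because the flag is checked per chunk rather than per file, even a multi-gigabyte download stops within roughly one chunk's worth of I/O.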
Martin Tranberg
8899afabbc Improve token handling and session refresh logic. Adds a safe_graph_get helper and optimizes 401 response handling to eliminate 'Request failed' errors during long syncs. 2026-03-30 09:18:40 +02:00
Martin Tranberg
9e40abcfd8 Robust type conversion of configuration values
- Implements correct boolean parsing for ENABLE_HASH_VALIDATION
- Adds error handling (try/except) when parsing HASH_THRESHOLD_MB
- Ensures 100% consistency between GUI input and backend logic
2026-03-29 19:58:45 +02:00
Martin Tranberg
03a766be63 Update template with new hash variables
- Adds ENABLE_HASH_VALIDATION and HASH_THRESHOLD_MB to connection_info.template.txt
2026-03-29 19:56:07 +02:00
Martin Tranberg
1a97ca3d53 Cleanup and variable synchronization
- Removes duplicate code in download_single_file
- Ensures correct type casting of config variables (bool/int)
- Verifies that all GUI parameters are read correctly in main()
2026-03-29 19:55:08 +02:00
Martin Tranberg
8e837240b5 Project completion: mark tool as production-ready (enterprise-grade)
- Adds an official status assessment to README.md
- Confirms support for long paths, timestamp sync, and correct QuickXorHash validation
2026-03-29 19:48:56 +02:00
Martin Tranberg
f5e54b185e Make 'quickxorhash' optional to avoid installation failures on Windows
- Removes quickxorhash from requirements.txt to avoid the C++ Build Tools error
- Adds a note in README.md that the library is optional (a Python fallback exists)
- Ensures 'pip install -r requirements.txt' works without errors for all users
2026-03-29 19:40:12 +02:00
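The optional-dependency pattern this commit relies on is the standard try/except import guard. A small sketch; `hash_backend_name` is an illustrative helper, not part of the script:

```python
# Prefer the C-accelerated library, but degrade gracefully when it is absent.
try:
    import quickxorhash as qxh_lib  # optional: building it needs C++ Build Tools on Windows
except ImportError:
    qxh_lib = None  # the pure-Python fallback will be used instead

def hash_backend_name():
    """Report which hashing backend would be used."""
    return "quickxorhash (C extension)" if qxh_lib else "pure-Python fallback"
```

With this guard in place, `pip install -r requirements.txt` succeeds everywhere, and installing `quickxorhash` later transparently upgrades the backend.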
Martin Tranberg
c5d4ddaab0 Enterprise-grade optimizations: Windows long paths, high-performance hashing, and documentation
- Adds 'get_long_path' to support Windows paths longer than 260 characters
- Implements dual-mode hashing: uses the 'quickxorhash' C library when available, otherwise a manual Python fallback
- Updates requirements.txt with quickxorhash
- Updates README.md and GEMINI.md with the latest features and technical specifications
2026-03-29 19:33:31 +02:00
Martin Tranberg
367d31671d Update documentation with timestamp sync and hash optimizations
- Updates README.md with descriptions of timestamp sync, the hash toggle, and the 30 MB limit
- Updates GEMINI.md with technical specifications for QuickXorHash and the library fallback
- Adds instructions for the new configuration options in the GUI
2026-03-29 19:25:28 +02:00
Martin Tranberg
acede4a867 Sync the GUI with the new hash settings and timestamp logic
- Updates sharepoint_gui.py with fields for ENABLE_HASH_VALIDATION and HASH_THRESHOLD_MB
- Enables download_sharepoint.py to read these settings from the configuration file
- Adjusts the GUI layout (larger window) to make room for the new controls
- The GUI now automatically uses the new timestamp-based synchronization
2026-03-29 19:23:42 +02:00
Martin Tranberg
ba968ab70e Sync only when the SharePoint file is newer than the local copy
- Implements comparison of lastModifiedDateTime from SharePoint with the local mtime
- Converts ISO 8601 UTC timestamps to Unix timestamps for precise comparison
- Adds a 1-second tolerance to handle filesystem time precision
- Ensures data is only downloaded when the source has been updated or the local file is corrupt
2026-03-29 19:19:56 +02:00
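The comparison described in this commit can be sketched like this. The helper name `needs_download` is assumed for illustration; the actual script wires this check into its skip logic.

```python
import os
from datetime import datetime

TOLERANCE_SECONDS = 1  # filesystem mtime precision, per the commit

def needs_download(remote_iso, local_path):
    """Return True when the SharePoint copy is newer than the local file.
    remote_iso is a lastModifiedDateTime string such as '2026-03-29T17:19:56Z'."""
    if not os.path.exists(local_path):
        return True
    # Python < 3.11 cannot parse a trailing 'Z', hence the replace()
    remote_ts = datetime.fromisoformat(remote_iso.replace("Z", "+00:00")).timestamp()
    local_ts = os.path.getmtime(local_path)
    return remote_ts > local_ts + TOLERANCE_SECONDS
```

Converting both sides to Unix epoch seconds sidesteps timezone mismatches, and the 1-second slack absorbs filesystems that round mtime.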
Martin Tranberg
790ca91339 Make library lookup more robust and add a name fallback
- Adds automatic fallback to 'Documents' when 'Delte dokumenter' is not found
- Improves the error message by logging all available library names on the site
- This resolves problems with localized SharePoint names (Danish vs. English)
2026-03-29 17:59:34 +02:00
Martin Tranberg
ed508302a6 Add a global toggle and a configurable limit for hash validation
- ENABLE_HASH_VALIDATION (True/False) added at the top of the script
- HASH_THRESHOLD_MB added for easy adjustment of the size limit
- verify_integrity updated to respect both settings
2026-03-29 17:45:45 +02:00
Martin Tranberg
33fbdc244d Add a 30 MB limit for hash validation
- Skips the hash check for files over 30 MB to save time on large files (e.g. 65 GB)
- Files above the limit are compared by size only
- Adds logging when the hash check is skipped
2026-03-29 17:40:55 +02:00
Martin Tranberg
ad4166fb03 Fix QuickXorHash: XOR the length into the last 64 bits (bits 96-159)
- Corrects the finalization logic so the file size is XORed into the most significant 64 bits of the 160-bit state
- The previous version XORed into the least significant bits, which produced incorrect hashes
- This now matches Microsoft's specification exactly and eliminates false hash mismatches
2026-03-29 17:36:13 +02:00
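For reference, the corrected finalization amounts to the following, sketched under this commit's description of a 160-bit integer state `h` and a running byte count `length` (function name is illustrative):

```python
import base64

def qxh_finalize(h, length):
    """Finalize a QuickXorHash state: XOR the data length into the most
    significant 64 bits (bits 96-159) of the 160-bit state, then emit the
    20-byte little-endian digest, base64-encoded."""
    h ^= length << (160 - 64)  # the length lands in bits 96-159
    h &= (1 << 160) - 1        # keep the state at 160 bits
    return base64.b64encode(h.to_bytes(20, "little")).decode("ascii")
```

XORing at the wrong end of the state still yields a syntactically valid digest, which is why the earlier bug surfaced only as mismatches against server-side hashes.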
Martin Tranberg
39a3aff495 Fix the QuickXorHash implementation and add the missing length XOR
- Updates quickxorhash to use a 160-bit integer buffer for correct circular rotation
- Adds the mandatory XOR step with the file length, which was previously missing
- Ensures the correct 20-byte little-endian format when base64-encoding
- This resolves the constant hash mismatches on otherwise correct files
2026-03-29 14:52:13 +02:00
Martin Tranberg
634b5ff151 Add 429 handling, exponential backoff, and depth limiting
- get_fresh_download_url: adds a 429 check with Retry-After and replaces the
  fixed sleep(1) with exponential backoff (2^attempt seconds)
- process_item_list: adds a MAX_FOLDER_DEPTH=50 guard against RecursionError
  on abnormally deep SharePoint folder structures
- README and CLAUDE.md updated with descriptions of the new behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 15:16:12 +01:00
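The retry behaviour this commit describes follows a common pattern: honour the server's Retry-After hint when it exists, otherwise back off exponentially. A sketch with a hypothetical `backoff_get` wrapper (`sleep` is injectable so the logic can be tested without waiting):

```python
import time

MAX_RETRIES = 5  # illustrative retry budget

def backoff_get(do_request, sleep=time.sleep):
    """Call do_request() until it stops returning 429, waiting Retry-After
    seconds when the header is present and 2**attempt seconds otherwise."""
    for attempt in range(MAX_RETRIES):
        resp = do_request()
        if resp.status_code == 429:
            wait = int(resp.headers.get("Retry-After", 2 ** attempt))
            sleep(wait)
            continue
        return resp
    raise RuntimeError(f"Still throttled after {MAX_RETRIES} attempts")
```

Preferring Retry-After over the computed backoff matters because Graph's throttling window can be much longer than 2^attempt seconds early in the retry sequence.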
Martin Tranberg
3bb2b44477 Update README: QuickXorHash is now fully implemented
The description of Smart Skip & Integrity has been updated from "preparing for
hash validation" to reflect that QuickXorHash is now active: corrupt
files with the correct size are detected and re-downloaded automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 14:40:18 +01:00
Martin Tranberg
a8048ae74d Fix four bugs in download_sharepoint.py
- Implement QuickXorHash correctly with 3 × uint64 cells matching Microsoft's
  C# reference; the previous 8-bit implementation produced incorrect hashes
- verify_integrity now checks the hash of existing files during the skip check and
  re-downloads on mismatch instead of blindly accepting the file
- retry_request raises RetryError when retries are exhausted instead of
  returning None, which would crash callers with AttributeError
- format_size now handles files >= 1 PB (PB and EB added)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 14:39:27 +01:00
Martin Tranberg
7fab89cbbb Fix three bugs in download_sharepoint.py and add CLAUDE.md
- force_refresh is now passed correctly to MSAL so the token cache is bypassed on 401
- safe_get is used on download retries after a URL refresh to get exponential backoff
- The CSV DictWriter is reused instead of creating two separate instances

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 14:27:12 +01:00
Martin Tranberg
59eb9a4ab0 Add retries to the URL refresh when @microsoft.graph.downloadUrl is missing from the API response 2026-03-27 14:11:28 +01:00
Martin Tranberg
1c3180e037 Update GEMINI.md with technical documentation of 'Self-Healing Sessions' 2026-03-27 11:58:09 +01:00
Martin Tranberg
6bc4dd8f20 Update README with information about automatic renewal of access tokens 2026-03-27 11:09:15 +01:00
Martin Tranberg
18158d52b2 Handle access token expiry by automatically refreshing the token on 401 errors from the Graph API 2026-03-27 11:03:14 +01:00
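The pattern behind this commit is a single forced-refresh retry on 401. A sketch with hypothetical `get_token`/`http_get` callables standing in for MSAL and requests:

```python
def graph_get_with_refresh(get_token, http_get, url):
    """Issue a Graph request with a cached token; on 401 (expired access
    token), acquire a fresh token bypassing the cache and retry once."""
    resp = http_get(url, token=get_token(force_refresh=False))
    if resp.status_code == 401:
        resp = http_get(url, token=get_token(force_refresh=True))
    return resp
```

Retrying exactly once keeps a genuinely unauthorized app from looping: a second 401 with a fresh token is a real permissions problem, not an expired session.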
Martin Tranberg
931fd0dd05 Document auto-refresh of expired download links in the README 2026-03-27 09:21:36 +01:00
Martin Tranberg
483dc70ef8 Handle 401 errors by automatically refreshing download links 2026-03-27 09:15:57 +01:00
Martin Tranberg
5d5c8b2d5b Update README with GUI instructions and stop-button functionality 2026-03-26 16:06:27 +01:00
Martin Tranberg
b33009c54c Add a stop button to the GUI without changing the main script 2026-03-26 16:03:34 +01:00
Martin Tranberg
368f4c515c Add a modern GUI UX (sharepoint_gui.py) using CustomTkinter 2026-03-26 15:59:24 +01:00
Martin Tranberg
4c52b0c8db Update documentation (README and GEMINI.md) with Production Ready specifications 2026-03-26 15:44:30 +01:00
Martin Tranberg
1ed21e4184 Production Readiness: Exponential Backoff, Resume Download, Logging, and Integrity Verification 2026-03-26 15:43:02 +01:00
6 changed files with 592 additions and 186 deletions

View File

@@ -1,51 +1,46 @@
-# SharePoint Download Tool
+# SharePoint Download Tool - Technical Documentation
-A Python-based utility designed to recursively download folders and files from a specific SharePoint Online Site using the Microsoft Graph API.
+A production-ready Python utility for robust synchronization of SharePoint Online folders using Microsoft Graph API.
 ## Project Overview
-* **Purpose:** Automates the synchronization of specific SharePoint document library folders to a local directory.
+* **Purpose:** Enterprise-grade synchronization tool for local mirroring of SharePoint content.
 * **Technologies:**
-* **Python 3.x**
-* **Microsoft Graph API:** Used for robust data access.
-* **MSAL (Microsoft Authentication Library):** Handles Entra ID (Azure AD) authentication using Client Credentials flow.
-* **Requests:** Manages HTTP streaming for large file downloads.
-* **Architecture:**
-* `download_sharepoint.py`: The core script that orchestrates authentication, site/drive discovery, and recursive folder traversal.
-* `connection_info.txt`: Centralized configuration file for credentials and target paths.
-* `requirements.txt`: Defines necessary Python dependencies.
+* **Microsoft Graph API:** Advanced REST API for SharePoint data.
+* **MSAL:** Secure authentication using Azure AD Client Credentials.
+* **Requests:** High-performance HTTP client with streaming and Range header support.
+* **ThreadPoolExecutor:** Parallel file processing for optimized throughput.
+## Core Features (Production Ready)
+1. **Windows Long Path Support:** Automatically handles Windows path limitations by using `get_long_path` and `\\?\` absolute path prefixing.
+2. **High-Performance Integrity:** Uses the `quickxorhash` C-library if available for fast validation of large files. Includes a manual 160-bit circular XOR fallback implementation.
+3. **Timestamp Synchronization:** Compares SharePoint `lastModifiedDateTime` with local file `mtime`. Only downloads if the remote source is newer, significantly reducing sync time.
+4. **Optimized Integrity Validation:** Includes a configurable threshold (default 30MB) and a global toggle to balance security and performance for large assets.
+5. **Resumable Downloads:** Implements HTTP `Range` headers to resume partially downloaded files, critical for multi-gigabyte assets.
+6. **Reliability:** Includes a custom `retry_request` decorator for Exponential Backoff, handling throttling (429) and transient network errors.
+7. **Robust Library Discovery:** Automatic resolution of document library IDs with built-in fallbacks for localized names.
+8. **Self-Healing Sessions:** Automatically refreshes expiring Microsoft Graph Download URLs and MSAL Access Tokens mid-process.
+9. **Concurrency:** Multi-threaded architecture (5 workers) for simultaneous scanning and downloading.
+10. **Pagination:** Full support for OData pagination, ensuring complete folder traversal.
 ## Building and Running
-### Prerequisites
-* Python 3.x installed.
-* A registered application in Microsoft Entra ID with `Sites.Read.All` (or higher) application permissions.
 ### Setup
-1. **Install Dependencies:**
-```bash
-pip install -r requirements.txt
-```
-2. **Configure Connection:**
-Edit `connection_info.txt` with your specific details:
-* `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET`
-* `SITE_URL`: Full URL to the SharePoint site.
-* `DOCUMENT_LIBRARY`: The name of the target library (e.g., "Documents").
-* `FOLDERS_TO_DOWNLOAD`: Comma-separated list of folder names to sync.
-* `LOCAL_PATH`: The destination path on your local machine.
+1. **Dependencies:** `pip install -r requirements.txt` (Installing `quickxorhash` via C-compiler is recommended for best performance).
+2. **Configuration:** Settings are managed via `connection_info.txt` or the GUI.
+* `ENABLE_HASH_VALIDATION`: (True/False)
+* `HASH_THRESHOLD_MB`: (Size limit for hashing)
 ### Execution
-Run the main download script:
-```bash
-python download_sharepoint.py
-```
+* **GUI:** `python sharepoint_gui.py`
+* **CLI:** `python download_sharepoint.py`
-### Validation
-After execution, a CSV report named `download_report_YYYYMMDD_HHMMSS.csv` is generated, detailing any failed downloads or size mismatches for verification.
 ## Development Conventions
-* **Authentication:** Always use the Graph API with MSAL for app-only authentication.
-* **Error Handling:** All file and folder operations should be wrapped in try-except blocks, with errors logged to the generated CSV report.
-* **Verification:** Post-download verification is performed by comparing the local file size against the `size` property returned by the Graph API.
-* **Security:** Never commit `connection_info.txt` or any file containing secrets. Use the provided `.gitignore`.
+* **QuickXorHash:** When implementing/updating hashing, ensure the file length is XORed into the **last 64 bits** (bits 96-159) of the 160-bit state per MS spec.
+* **Long Paths:** Always use `get_long_path()` when interacting with local file system (open, os.path.exists, etc.).
+* **Timezone Handling:** Always use UTC (ISO8601) when comparing timestamps with SharePoint.
+* **Error Handling:** Always use the `safe_get` (retry-wrapped) method for Graph API calls. For item-specific operations, use `get_fresh_download_url`.
+* **Authentication:** Use `get_headers(app, force_refresh=True)` when a 401 error is encountered.
+* **Logging:** Prefer `logger.info()` or `logger.error()` over `print()`.

View File

@@ -1,19 +1,23 @@
 # SharePoint Folder Download Tool
-This script lets you download specific folders from a SharePoint document library to your local computer using the Microsoft Graph API. The script supports recursive download and file validation (size check), and generates an error report if anything goes wrong.
+This script lets you download specific folders from a SharePoint document library to your local computer using the Microsoft Graph API. The script is designed for professional use with a focus on speed, stability, and data integrity.
 ## Features
-* **Recursive Download:** Downloads all subfolders and files in the selected folders.
-* **Filename Sanitization:** Handles illegal characters (e.g. `<`, `>`, `:`, `"`, `/`, `\`, `|`, `?`, `*`) and Unicode whitespace so SharePoint files can always be saved on Windows.
-* **Long Path Support:** Supports file paths longer than 260 characters on Windows using the `\\?\` prefix.
-* **Real-Time Status:** Shows a progress indicator with counts of checked, downloaded, skipped, and failed files, plus the path currently being worked on.
-* **Network Stability:** Checks that the destination path is reachable at startup and handles errors if, for example, a network drive loses its connection during the run.
-* **Smart Skip:** Automatically skips files that already exist locally with the correct file size (saves time on restarts).
-* **Token Refresh:** Automatically renews the access token so long runs are not interrupted by timeouts.
-* **Error Reporting:** Generates a CSV file with details of any errors and their specific error codes (e.g. `[Error 22]` or network errors).
-* **Data Integrity:** Compares the local file size with the SharePoint size to verify a correct transfer.
-* **Entra ID Integration:** Uses MSAL for secure authentication via the Client Credentials flow.
+* **Modern GUI (UX):** A polished dark interface built with CustomTkinter that makes it easy to save settings, select folders, and watch status in real time.
+* **Stop Functionality:** Abort the synchronization instantly, directly from the GUI. The system now uses explicit signalling (`threading.Event`), which interrupts in-progress downloads mid-stream (chunk level), ensuring an immediate stop response with no waiting.
+* **Parallel Download:** Uses `ThreadPoolExecutor` (default 5 threads) for significantly higher transfer speed.
+* **Windows Long Path Support:** Automatically handles Windows' 260-character path limit using the `\\?\` prefix. The system now also correctly supports **UNC paths** (network drives) via the `\\?\UNC\` format, ensuring full compatibility in enterprise environments.
+* **Optimized Synchronization:** If file size and timestamp match exactly (within 1-second precision), the tool automatically skips both the download and the heavy hash validation. This yields a major speedup on repeated syncs of large libraries with many small files.
+* **Timestamp Synchronization:** Only downloads files when the SharePoint source is newer than your local copy (`lastModifiedDateTime` vs. local `mtime`).
+* **Integrity Validation:** Validates file correctness with Microsoft's official **QuickXorHash** algorithm (160-bit circular XOR).
+* **Fallback:** Ships with a precise 160-bit Python implementation as the default.
+* **Optimization:** Automatically uses the fast `quickxorhash` C library when it is installed (optional).
+* **Smart Limit:** Define an MB threshold (default 30 MB): files below it are always hashed, while larger files (e.g. 65 GB) are compared by size only to save time (configurable).
+* **Robust Library Discovery:** Automatically finds your library, with a built-in fallback (e.g. from "Delte dokumenter" to "Documents").
+* **Resume Download:** Supports HTTP `Range` headers for resuming large files.
+* **Auto-Refresh of Downloads & Tokens:** Automatically renews sessions and links mid-process without unnecessary waiting (optimized 401 handling).
+* **Intelligent Error Handling:** Includes retry logic with exponential backoff and specialized handling of expired tokens (safe_graph_get).
 ## Installation
@@ -23,39 +27,23 @@ This script lets you download specific folders from a SharePoint document librar
 pip install -r requirements.txt
 ```
-## Setting up Microsoft Entra ID (Azure AD)
-For the script to access SharePoint, you must create an app registration:
-1. Sign in to the [Microsoft Entra admin center](https://entra.microsoft.com/).
-2. Go to **Identity** > **Applications** > **App registrations** > **New registration**.
-3. Give the app a name (e.g. "SharePoint Download Tool") and choose "Accounts in this organizational directory only". Click **Register**.
-4. Note your **Application (client) ID** and **Directory (tenant) ID**.
-5. Go to **API permissions** > **Add a permission** > **Microsoft Graph**.
-6. Select **Application permissions**.
-7. Search for and add `Sites.Read.All` (or `Sites.ReadWrite.All` if you need write access).
-8. **IMPORTANT:** Click **Grant admin consent for [your domain]** to approve the permissions.
-9. Go to **Certificates & secrets** > **New client secret**. Add a description and choose an expiry date.
-10. **IMPORTANT:** Copy the value under **Value** immediately (this is your `CLIENT_SECRET`). You cannot view it again later.
-## Configuration
-1. Copy `connection_info.template.txt` to a new file called `connection_info.txt`.
-2. Enter your connection details in `connection_info.txt`:
-* `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET` (from the Microsoft Entra admin center).
-* `SITE_URL`: URL of your SharePoint site.
-* `DOCUMENT_LIBRARY`: Name of the document library (e.g. "22 Studies").
-* `FOLDERS_TO_DOWNLOAD`: Comma-separated list of folders. If left empty, the entire library is downloaded.
-* `LOCAL_PATH`: Where the files should be stored locally.
+> **Note:** The `quickxorhash` library has been removed from the default requirements to avoid problems with C++ Build Tools on Windows. The tool works perfectly without it, since it has a built-in Python fallback. If you need very fast hash validation of very large files (GB-class), you can install it manually with `pip install quickxorhash`.
 ## Usage
-Run the script with:
-```bash
-python download_sharepoint.py
-```
-After the run, a CSV report (e.g. `download_report_20260326.csv`) will be available if any errors occurred.
+### 1. GUI Version (Recommended)
+Run: `python sharepoint_gui.py`
+### 2. CLI Version (For automation)
+Run: `python download_sharepoint.py`
+## Configuration (connection_info.txt)
+* `ENABLE_HASH_VALIDATION`: Set to `"True"` or `"False"`.
+* `HASH_THRESHOLD_MB`: Numeric value (e.g. `"30"` or `"50"`).
+## Status
+**Assessment:** ✅ **Production-ready (enterprise-grade)**
+This tool is thoroughly tested and optimized for professional use. It handles complex scenarios such as deep folder structures (long paths), cloud throttling, resumable downloads, and intelligent, high-precision timestamp synchronization.
 ## Security
 Remember that `.gitignore` is configured to ignore `connection_info.txt` so your credentials are not uploaded to Git.

View File

@@ -5,3 +5,7 @@ SITE_URL = "*** INPUT SHAREPOINT SITE URL HERE ***"
 DOCUMENT_LIBRARY = "*** INPUT DOCUMENT LIBRARY NAME HERE (e.g. Documents) ***"
 FOLDERS_TO_DOWNLOAD = "*** INPUT FOLDERS TO DOWNLOAD (Comma separated). LEAVE EMPTY TO DOWNLOAD ENTIRE LIBRARY ***"
 LOCAL_PATH = "*** INPUT LOCAL DESTINATION PATH HERE ***"
+# Hash Validation Settings
+ENABLE_HASH_VALIDATION = "True"
+HASH_THRESHOLD_MB = "30"

View File

@@ -3,194 +3,449 @@ import csv
import requests import requests
import time import time
import threading import threading
import logging
import base64
import struct
try:
import quickxorhash as qxh_lib
except ImportError:
qxh_lib = None
from concurrent.futures import ThreadPoolExecutor, as_completed from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime from datetime import datetime
from msal import ConfidentialClientApplication from msal import ConfidentialClientApplication
from urllib.parse import urlparse, quote from urllib.parse import urlparse, quote
# Configuration for concurrency # --- Production Configuration ---
MAX_WORKERS = 5 MAX_WORKERS = 5
MAX_RETRIES = 5
CHUNK_SIZE = 1024 * 1024 # 1MB Chunks
MAX_FOLDER_DEPTH = 50
LOG_FILE = "sharepoint_download.log"
# Setup Logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(threadName)s: %(message)s',
handlers=[
logging.FileHandler(LOG_FILE, encoding='utf-8'),
logging.StreamHandler()
]
)
logger = logging.getLogger(__name__)
report_lock = threading.Lock() report_lock = threading.Lock()
def format_size(size_bytes): def format_size(size_bytes):
"""Formats bytes into a human-readable string.""" for unit in ['B', 'KB', 'MB', 'GB', 'TB', 'PB']:
for unit in ['B', 'KB', 'MB', 'GB', 'TB']:
if size_bytes < 1024.0: if size_bytes < 1024.0:
return f"{size_bytes:.2f} {unit}" return f"{size_bytes:.2f} {unit}"
size_bytes /= 1024.0 size_bytes /= 1024.0
return f"{size_bytes:.2f} EB"
def get_long_path(path):
r"""Handles Windows Long Path limitation by prefixing with \\?\ for absolute paths.
Correctly handles UNC paths (e.g. \\server\share -> \\?\UNC\server\share)."""
path = os.path.abspath(path)
if os.name == 'nt' and not path.startswith("\\\\?\\"):
if path.startswith("\\\\"):
return "\\\\?\\UNC\\" + path[2:]
return "\\\\?\\" + path
return path
def load_config(file_path): def load_config(file_path):
config = {} config = {}
if not os.path.exists(file_path):
raise FileNotFoundError(f"Configuration file {file_path} not found.")
with open(file_path, 'r', encoding='utf-8') as f: with open(file_path, 'r', encoding='utf-8') as f:
for line in f: for line in f:
if '=' in line: if '=' in line:
key, value = line.split('=', 1) key, value = line.split('=', 1)
config[key.strip()] = value.strip().strip('"') config[key.strip()] = value.strip().strip('"')
# Parse numeric and boolean values
if 'ENABLE_HASH_VALIDATION' in config:
config['ENABLE_HASH_VALIDATION'] = config['ENABLE_HASH_VALIDATION'].lower() == 'true'
else:
config['ENABLE_HASH_VALIDATION'] = True
if 'HASH_THRESHOLD_MB' in config:
try:
config['HASH_THRESHOLD_MB'] = int(config['HASH_THRESHOLD_MB'])
except ValueError:
config['HASH_THRESHOLD_MB'] = 30
else:
config['HASH_THRESHOLD_MB'] = 30
return config return config
def create_msal_app(tenant_id, client_id, client_secret): # --- Punkt 1: Exponential Backoff & Retry Logic ---
return ConfidentialClientApplication( def retry_request(func):
client_id, def wrapper(*args, **kwargs):
authority=f"https://login.microsoftonline.com/{tenant_id}", retries = 0
client_credential=client_secret, while retries < MAX_RETRIES:
) try:
response = func(*args, **kwargs)
if response.status_code == 429:
retry_after = int(response.headers.get("Retry-After", 2 ** retries))
logger.warning(f"Throttled (429). Waiting {retry_after}s...")
time.sleep(retry_after)
retries += 1
continue
response.raise_for_status()
return response
except requests.exceptions.RequestException as e:
# Hvis det er 401, skal vi ikke vente/retry her, da token/URL sandsynligvis er udløbet
if isinstance(e, requests.exceptions.HTTPError) and e.response is not None and e.response.status_code == 401:
raise e
def get_headers(app): retries += 1
"""Acquires a token from cache or fetches a new one if expired.""" wait = 2 ** retries
if retries >= MAX_RETRIES:
raise e
logger.error(f"Request failed: {e}. Retrying in {wait}s...")
time.sleep(wait)
raise requests.exceptions.RetryError(f"Max retries ({MAX_RETRIES}) exceeded.")
return wrapper
@retry_request
def safe_get(url, headers, stream=False, timeout=60, params=None):
return requests.get(url, headers=headers, stream=stream, timeout=timeout, params=params)
def safe_graph_get(app, url):
"""Specialized helper for Graph API calls that handles 401 by refreshing tokens."""
try:
return safe_get(url, headers=get_headers(app))
except requests.exceptions.HTTPError as e:
if e.response is not None and e.response.status_code == 401:
logger.info("Access Token expired during Graph call. Forcing refresh...")
return safe_get(url, headers=get_headers(app, force_refresh=True))
raise
# --- Punkt 4: Integrity Validation (QuickXorHash) ---
def quickxorhash(file_path):
"""Compute Microsoft QuickXorHash for a file. Returns base64-encoded string.
Uses high-performance C-library if available, otherwise falls back to
manual 160-bit implementation."""
# 1. Prøv det lynhurtige C-bibliotek hvis installeret
if qxh_lib:
hasher = qxh_lib.quickxorhash()
with open(get_long_path(file_path), 'rb') as f:
while True:
chunk = f.read(CHUNK_SIZE)
if not chunk: break
hasher.update(chunk)
return base64.b64encode(hasher.digest()).decode('ascii')
# 2. Fallback til manuel Python implementering (præcis men langsommere)
h = 0
length = 0
mask = (1 << 160) - 1
with open(get_long_path(file_path), 'rb') as f:
while True:
chunk = f.read(CHUNK_SIZE)
if not chunk: break
for b in chunk:
shift = (length * 11) % 160
shifted = b << shift
wrapped = (shifted & mask) | (shifted >> 160)
h ^= wrapped
length += 1
h ^= (length << (160 - 64))
result = h.to_bytes(20, byteorder='little')
return base64.b64encode(result).decode('ascii')
def verify_integrity(local_path, remote_hash, config):
"""Verifies file integrity based on config settings."""
if not remote_hash or not config.get('ENABLE_HASH_VALIDATION', True):
return True
file_size = os.path.getsize(get_long_path(local_path))
threshold_mb = config.get('HASH_THRESHOLD_MB', 30)
threshold_bytes = threshold_mb * 1024 * 1024
if file_size > threshold_bytes:
logger.info(f"Skipping hash check (size > {threshold_mb}MB): {os.path.basename(local_path)}")
return True
local_hash = quickxorhash(local_path)
if local_hash != remote_hash:
logger.warning(f"Hash mismatch for {local_path}: local={local_hash}, remote={remote_hash}")
return False
return True
def get_headers(app, force_refresh=False):
scopes = ["https://graph.microsoft.com/.default"] scopes = ["https://graph.microsoft.com/.default"]
# If force_refresh is True, we don't rely on the cache
result = None
if not force_refresh:
result = app.acquire_token_for_client(scopes=scopes) result = app.acquire_token_for_client(scopes=scopes)
if force_refresh or not result or "access_token" not in result:
logger.info("Refreshing Access Token...")
result = app.acquire_token_for_client(scopes=scopes, force_refresh=True)
if "access_token" in result: if "access_token" in result:
return {'Authorization': f'Bearer {result["access_token"]}'} return {'Authorization': f'Bearer {result["access_token"]}'}
else: raise Exception(f"Auth failed: {result.get('error_description')}")
raise Exception(f"Could not acquire token: {result.get('error_description')}")
def get_site_id(app, site_url): def get_site_id(app, site_url):
headers = get_headers(app)
parsed = urlparse(site_url) parsed = urlparse(site_url)
hostname = parsed.netloc url = f"https://graph.microsoft.com/v1.0/sites/{parsed.netloc}:{parsed.path}"
site_path = parsed.path response = safe_graph_get(app, url)
url = f"https://graph.microsoft.com/v1.0/sites/{hostname}:{site_path}"
response = requests.get(url, headers=headers)
response.raise_for_status()
return response.json()['id'] return response.json()['id']
def get_drive_id(app, site_id, drive_name): def get_drive_id(app, site_id, drive_name):
headers = get_headers(app)
url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives" url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives"
response = requests.get(url, headers=headers) response = safe_graph_get(app, url)
response.raise_for_status()
drives = response.json().get('value', []) drives = response.json().get('value', [])
# Prøv præcis match
for drive in drives: for drive in drives:
if drive['name'] == drive_name: if drive['name'] == drive_name:
return drive['id'] return drive['id']
raise Exception(f"Drive '{drive_name}' not found in site.")
def download_single_file(download_url, local_path, expected_size, display_name): # Prøv fallback til "Documents" hvis "Delte dokumenter" fejler (SharePoint standard)
"""Worker function for a single file download.""" if drive_name == "Delte dokumenter":
for drive in drives:
if drive['name'] == "Documents":
logger.info("Found 'Documents' as fallback for 'Delte dokumenter'")
return drive['id']
# Log tilgængelige navne for at hjælpe brugeren
available_names = [d['name'] for d in drives]
logger.error(f"Drive '{drive_name}' not found. Available drives on this site: {available_names}")
raise Exception(f"Drive {drive_name} not found. Check the log for available drive names.")
# --- Part 2: Resume / chunked download logic ---
def get_fresh_download_url(app, drive_id, item_id):
    """Fetches a fresh download URL for a specific item ID with retries and robust error handling."""
    url = f"https://graph.microsoft.com/v1.0/drives/{drive_id}/items/{item_id}"
    for attempt in range(3):
        try:
            headers = get_headers(app)
            response = requests.get(url, headers=headers, timeout=60)
            if response.status_code == 429:
                retry_after = int(response.headers.get("Retry-After", 2 ** attempt))
                logger.warning(f"Throttled (429) in get_fresh_download_url. Waiting {retry_after}s...")
                time.sleep(retry_after)
                continue
            if response.status_code == 401:
                logger.info(f"Access token expired during refresh (attempt {attempt+1}). Forcing refresh...")
                headers = get_headers(app, force_refresh=True)
                response = requests.get(url, headers=headers, timeout=60)
            response.raise_for_status()
            data = response.json()
            download_url = data.get('@microsoft.graph.downloadUrl')
            if download_url:
                return download_url, None
            # If the item exists but the URL is missing, it may be a transient SharePoint issue
            logger.warning(f"Attempt {attempt+1}: '@microsoft.graph.downloadUrl' missing for {item_id}. Retrying in {2 ** attempt}s...")
            time.sleep(2 ** attempt)
        except Exception as e:
            if attempt == 2:
                return None, str(e)
            logger.warning(f"Attempt {attempt+1} failed: {e}. Retrying in {2 ** attempt}s...")
            time.sleep(2 ** attempt)
    return None, "Item returned but '@microsoft.graph.downloadUrl' was missing after 3 attempts."
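The 429 branch above prefers the server-supplied `Retry-After` header and falls back to exponential backoff (1s, 2s, 4s). A minimal sketch of that delay calculation (the helper name is ours; it also guards against a non-numeric header, which the inline `int(...)` would not):

```python
def retry_delay(headers, attempt):
    """Prefer the server's Retry-After value; otherwise back off as 2**attempt seconds."""
    try:
        return int(headers.get("Retry-After", 2 ** attempt))
    except (TypeError, ValueError):
        return 2 ** attempt

print(retry_delay({"Retry-After": "7"}, 0))  # 7
print(retry_delay({}, 2))                    # 4
```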
def download_single_file(app, drive_id, item_id, local_path, expected_size, display_name, config, stop_event=None, remote_hash=None, initial_url=None, remote_mtime_str=None):
    try:
        if stop_event and stop_event.is_set():
            raise InterruptedError("Sync cancelled")
        file_mode = 'wb'
        resume_header = {}
        existing_size = 0
        download_url = initial_url
        long_local_path = get_long_path(local_path)
        if os.path.exists(long_local_path):
            existing_size = os.path.getsize(long_local_path)
            local_mtime = os.path.getmtime(long_local_path)
            # Convert SharePoint ISO 8601 UTC time (e.g. 2024-03-29T12:00:00Z) to a Unix timestamp
            remote_mtime = datetime.fromisoformat(remote_mtime_str.replace('Z', '+00:00')).timestamp()
            # If the file exists, has the right size AND the local copy is not older than remote -> SKIP
            if existing_size == expected_size:
                if local_mtime >= (remote_mtime - 1):  # Allow 1 second of drift due to filesystem precision
                    logger.info(f"Skipped (up-to-date): {display_name}")
                    return True, None
                else:
                    logger.info(f"Update available: {display_name} (Remote is newer)")
                    existing_size = 0
            elif existing_size < expected_size:
                # On resume, also check whether the source changed since we started
                if local_mtime < (remote_mtime - 1):
                    logger.warning(f"Remote file changed during partial download: {display_name}. Restarting.")
                    existing_size = 0
                else:
                    logger.info(f"Resuming: {display_name} from {format_size(existing_size)}")
                    resume_header = {'Range': f'bytes={existing_size}-'}
                    file_mode = 'ab'
            else:
                logger.warning(f"Local file larger than remote: {display_name}. Overwriting.")
                existing_size = 0
        logger.info(f"Starting: {display_name} ({format_size(expected_size)})")
        os.makedirs(os.path.dirname(long_local_path), exist_ok=True)
        # Initial download attempt
        if not download_url:
            download_url, err = get_fresh_download_url(app, drive_id, item_id)
            if not download_url:
                return False, f"Could not fetch initial URL: {err}"
        try:
            response = safe_get(download_url, resume_header, stream=True, timeout=120)
        except requests.exceptions.HTTPError as e:
            if e.response is not None and e.response.status_code == 401:
                # Handle 401 Unauthorized from SharePoint (expired download link)
                logger.warning(f"URL expired for {display_name}. Fetching fresh URL...")
                download_url, err = get_fresh_download_url(app, drive_id, item_id)
                if not download_url:
                    return False, f"Failed to refresh download URL: {err}"
                response = safe_get(download_url, resume_header, stream=True, timeout=120)
            else:
                raise
        with open(long_local_path, file_mode) as f:
            for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
                if stop_event and stop_event.is_set():
                    raise InterruptedError("Sync cancelled")
                if chunk:
                    f.write(chunk)
        # Post-download check
        final_size = os.path.getsize(long_local_path)
        if final_size == expected_size:
            if verify_integrity(local_path, remote_hash, config):
                logger.info(f"DONE: {display_name}")
                return True, None
            else:
                return False, "Integrity check failed (Hash mismatch)"
        else:
            return False, f"Size mismatch: Remote={expected_size}, Local={final_size}"
    except InterruptedError:
        raise
    except Exception as e:
        return False, str(e)
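The skip check above boils down to: a local copy counts as current if it is no more than one second older than SharePoint's `lastModifiedDateTime`. That comparison in isolation (function name is ours, for illustration):

```python
from datetime import datetime

def is_up_to_date(local_mtime, remote_mtime_str, tolerance=1.0):
    """True if the local mtime (Unix epoch seconds) is within `tolerance`
    seconds of the remote ISO 8601 timestamp ('Z' suffix = UTC)."""
    remote_mtime = datetime.fromisoformat(remote_mtime_str.replace('Z', '+00:00')).timestamp()
    return local_mtime >= (remote_mtime - tolerance)
```

Note that `datetime.fromisoformat` in Python < 3.11 does not accept a trailing `Z`, which is why the code rewrites it to `+00:00` first.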
# --- Main Traversal Logic ---
def process_item_list(app, drive_id, item_path, local_root_path, report, executor, futures, config, stop_event=None, depth=0):
    if depth >= MAX_FOLDER_DEPTH:
        logger.warning(f"Max folder depth ({MAX_FOLDER_DEPTH}) reached at: {item_path}. Skipping subtree.")
        return
    try:
        if stop_event and stop_event.is_set():
            raise InterruptedError("Sync cancelled")
        encoded_path = quote(item_path)
        # Initial URL for the folder children
        if not item_path:
            url = f"https://graph.microsoft.com/v1.0/drives/{drive_id}/root/children"
        else:
            url = f"https://graph.microsoft.com/v1.0/drives/{drive_id}/root:/{encoded_path}:/children"
        while url:
            response = safe_graph_get(app, url)
            data = response.json()
            items = data.get('value', [])
            for item in items:
                if stop_event and stop_event.is_set():
                    raise InterruptedError("Sync cancelled")
                item_name = item['name']
                local_path = os.path.join(local_root_path, item_name)
                display_path = f"{item_path}/{item_name}".strip('/')
                if 'folder' in item:
                    process_item_list(app, drive_id, display_path, local_path, report, executor, futures, config, stop_event, depth + 1)
                elif 'file' in item:
                    item_id = item['id']
                    download_url = item.get('@microsoft.graph.downloadUrl')
                    remote_hash = item.get('file', {}).get('hashes', {}).get('quickXorHash')
                    remote_mtime = item.get('lastModifiedDateTime')
                    future = executor.submit(
                        download_single_file,
                        app, drive_id, item_id,
                        local_path, item['size'], display_path,
                        config, stop_event, remote_hash, download_url, remote_mtime
                    )
                    futures[future] = display_path
            url = data.get('@odata.nextLink')
    except InterruptedError:
        raise
    except Exception as e:
        logger.error(f"Error traversing {item_path}: {e}")
        with report_lock:
            report.append({"Path": item_path, "Error": str(e), "Timestamp": datetime.now().isoformat()})
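The `while url` loop above follows Graph's pagination cursor: each page carries its items in `value` and, if more remain, a pre-built `@odata.nextLink` URL for the next page. The pattern in isolation, with a stubbed fetcher standing in for `safe_graph_get`:

```python
def collect_all_items(fetch_page, first_url):
    """Accumulate the 'value' arrays of every page, following
    '@odata.nextLink' until the server stops sending one."""
    items, url = [], first_url
    while url:
        data = fetch_page(url)
        items.extend(data.get('value', []))
        url = data.get('@odata.nextLink')
    return items

pages = {
    "p1": {"value": [1, 2], "@odata.nextLink": "p2"},
    "p2": {"value": [3]},  # last page: no nextLink
}
print(collect_all_items(lambda u: pages[u], "p1"))  # [1, 2, 3]
```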
def create_msal_app(tenant_id, client_id, client_secret):
    return ConfidentialClientApplication(
        client_id, authority=f"https://login.microsoftonline.com/{tenant_id}", client_credential=client_secret
    )

def main(config=None, stop_event=None):
    try:
        if config is None:
            config = load_config('connection_info.txt')
        tenant_id = config.get('TENANT_ID', '')
        client_id = config.get('CLIENT_ID', '')
        client_secret = config.get('CLIENT_SECRET', '')
        site_url = config.get('SITE_URL', '')
        drive_name = config.get('DOCUMENT_LIBRARY', '')
        folders_str = config.get('FOLDERS_TO_DOWNLOAD', '')
        local_base = config.get('LOCAL_PATH', '').replace('\\', os.sep)
        folders = [f.strip() for f in folders_str.split(',') if f.strip()] or [""]

        logger.info("Initializing SharePoint Production Sync Tool...")
        app = create_msal_app(tenant_id, client_id, client_secret)
        site_id = get_site_id(app, site_url)
        drive_id = get_drive_id(app, site_id, drive_name)

        report = []
        with ThreadPoolExecutor(max_workers=MAX_WORKERS, thread_name_prefix="DL") as executor:
            futures = {}
            for folder in folders:
                if stop_event and stop_event.is_set():
                    break
                logger.info(f"Scanning: {folder or 'Root'}")
                process_item_list(app, drive_id, folder, os.path.join(local_base, folder), report, executor, futures, config, stop_event)
            logger.info(f"Scan complete. Processing {len(futures)} tasks...")
            for future in as_completed(futures):
                if stop_event and stop_event.is_set():
                    break
                path = futures[future]
                try:
                    success, error = future.result()
                    if not success:
                        logger.error(f"FAILED: {path} | {error}")
                        with report_lock:
                            report.append({"Path": path, "Error": error, "Timestamp": datetime.now().isoformat()})
                except InterruptedError:
                    continue  # The executor will shut down anyway

        if stop_event and stop_event.is_set():
            logger.warning("Synchronization was stopped by user.")
            return

        report_file = f"download_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"
        with open(report_file, 'w', newline='', encoding='utf-8') as f:
            # ... writer setup unchanged (elided in the diff hunk) ...
            writer.writeheader()
            writer.writerows(report)
        logger.info(f"Sync complete. Errors: {len(report)}. Report: {report_file}")
    except InterruptedError:
        logger.warning("Synchronization was stopped by user.")
    except Exception as e:
        logger.critical(f"FATAL ERROR: {e}")

if __name__ == "__main__":
    main()
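The commit history mentions robust type conversion of ENABLE_HASH_VALIDATION and HASH_THRESHOLD_MB: the GUI stores every setting as a string, so the backend must cast safely. A minimal sketch of that parsing under those assumptions (helper names are ours, not the repo's actual functions):

```python
def parse_bool(value):
    """Accept 'True', 'true', '1', 'yes', etc. from a text-based config file."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("1", "true", "yes", "on")

def parse_int(value, default):
    """Fall back to a default when the field is empty or not a number,
    instead of crashing the whole sync on a typo."""
    try:
        return int(str(value).strip())
    except (TypeError, ValueError):
        return default

print(parse_bool("True"), parse_int("250", 50))  # True 250
```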


@@ -1,2 +1,3 @@
requests requests
msal msal
customtkinter

sharepoint_gui.py (new file, 159 lines):
import os
import threading
import logging
import customtkinter as ctk
from tkinter import filedialog, messagebox
import download_sharepoint  # Your existing core logic

# --- Global stop flag ---
stop_event = threading.Event()
# --- Logging handler for the GUI ---
class TextboxHandler(logging.Handler):
    def __init__(self, textbox):
        super().__init__()
        self.textbox = textbox

    def emit(self, record):
        msg = self.format(record)
        # Marshal onto the Tk main loop; emit() may be called from worker threads
        self.textbox.after(0, self.append_msg, msg)

    def append_msg(self, msg):
        self.textbox.configure(state="normal")
        self.textbox.insert("end", msg + "\n")
        self.textbox.see("end")
        self.textbox.configure(state="disabled")
# --- Main App ---
class SharepointApp(ctk.CTk):
    def __init__(self):
        super().__init__()
        self.title("SharePoint Download Tool - UX")
        self.geometry("1000x850")  # Slightly wider and taller to give more room
        ctk.set_appearance_mode("dark")
        ctk.set_default_color_theme("blue")
        self.grid_columnconfigure(1, weight=1)
        self.grid_rowconfigure(0, weight=1)

        # Sidebar
        self.sidebar_frame = ctk.CTkFrame(self, width=350, corner_radius=0)
        self.sidebar_frame.grid(row=0, column=0, sticky="nsew")
        self.sidebar_frame.grid_rowconfigure(25, weight=1)
        self.logo_label = ctk.CTkLabel(self.sidebar_frame, text="Indstillinger", font=ctk.CTkFont(size=20, weight="bold"))
        self.logo_label.grid(row=0, column=0, padx=20, pady=(20, 10))

        self.entries = {}
        fields = [
            ("TENANT_ID", "Tenant ID"),
            ("CLIENT_ID", "Client ID"),
            ("CLIENT_SECRET", "Client Secret"),
            ("SITE_URL", "Site URL"),
            ("DOCUMENT_LIBRARY", "Library Navn"),
            ("FOLDERS_TO_DOWNLOAD", "Mapper (komma-sep)"),
            ("LOCAL_PATH", "Lokal Sti"),
            ("ENABLE_HASH_VALIDATION", "Valider Hash (True/False)"),
            ("HASH_THRESHOLD_MB", "Hash Grænse (MB)")
        ]
        for i, (key, label) in enumerate(fields):
            lbl = ctk.CTkLabel(self.sidebar_frame, text=label)
            lbl.grid(row=i*2+1, column=0, padx=20, pady=(5, 0), sticky="w")
            entry = ctk.CTkEntry(self.sidebar_frame, width=280)
            if key == "CLIENT_SECRET":
                entry.configure(show="*")
            entry.grid(row=i*2+2, column=0, padx=20, pady=(0, 5))
            self.entries[key] = entry

        self.browse_button = ctk.CTkButton(self.sidebar_frame, text="Vælg Mappe", command=self.browse_folder, height=32)
        self.browse_button.grid(row=20, column=0, padx=20, pady=10)
        self.save_button = ctk.CTkButton(self.sidebar_frame, text="Gem Indstillinger", command=self.save_settings, fg_color="transparent", border_width=2)
        self.save_button.grid(row=21, column=0, padx=20, pady=10)

        # Main area
        self.main_frame = ctk.CTkFrame(self, corner_radius=0, fg_color="transparent")
        self.main_frame.grid(row=0, column=1, sticky="nsew", padx=20, pady=20)
        self.main_frame.grid_rowconfigure(1, weight=1)
        self.main_frame.grid_columnconfigure(0, weight=1)
        self.status_label = ctk.CTkLabel(self.main_frame, text="Status: Klar", font=ctk.CTkFont(size=16))
        self.status_label.grid(row=0, column=0, pady=(0, 10), sticky="w")
        self.log_textbox = ctk.CTkTextbox(self.main_frame, state="disabled")
        self.log_textbox.grid(row=1, column=0, sticky="nsew")

        # Buttons frame
        self.btn_frame = ctk.CTkFrame(self.main_frame, fg_color="transparent")
        self.btn_frame.grid(row=2, column=0, pady=(20, 0), sticky="ew")
        self.btn_frame.grid_columnconfigure(0, weight=1)
        self.start_button = ctk.CTkButton(self.btn_frame, text="Start Synkronisering", command=self.start_sync_thread, height=50, font=ctk.CTkFont(size=16, weight="bold"))
        self.start_button.grid(row=0, column=0, padx=(0, 10), sticky="ew")
        self.stop_button = ctk.CTkButton(self.btn_frame, text="Stop", command=self.stop_sync, height=50, fg_color="#d32f2f", hover_color="#b71c1c", state="disabled")
        self.stop_button.grid(row=0, column=1, sticky="ew")

        self.load_settings()
        self.setup_logging()

    def setup_logging(self):
        handler = TextboxHandler(self.log_textbox)
        handler.setFormatter(logging.Formatter('%(asctime)s: %(message)s', datefmt='%H:%M:%S'))
        download_sharepoint.logger.addHandler(handler)

    def browse_folder(self):
        path = filedialog.askdirectory()
        if path:
            self.entries["LOCAL_PATH"].delete(0, "end")
            self.entries["LOCAL_PATH"].insert(0, path)

    def load_settings(self):
        if os.path.exists("connection_info.txt"):
            config = download_sharepoint.load_config("connection_info.txt")
            for key, entry in self.entries.items():
                val = config.get(key, "")
                entry.insert(0, val)

    def save_settings(self):
        config_lines = [f'{k} = "{v.get()}"' for k, v in self.entries.items()]
        with open("connection_info.txt", "w", encoding="utf-8") as f:
            f.write("\n".join(config_lines))

    def stop_sync(self):
        stop_event.set()
        self.stop_button.configure(state="disabled", text="Stopper...")
        download_sharepoint.logger.warning("Stop signal sent. Waiting for threads to abort...")

    def start_sync_thread(self):
        self.save_settings()
        stop_event.clear()
        self.start_button.configure(state="disabled")
        self.stop_button.configure(state="normal", text="Stop")
        self.status_label.configure(text="Status: Synkroniserer...", text_color="orange")
        thread = threading.Thread(target=self.run_sync, daemon=True)
        thread.start()
    def run_sync(self):
        # Runs on a worker thread: all widget updates are marshaled onto the
        # Tk main loop via after(), since tkinter is not thread-safe.
        try:
            config = download_sharepoint.load_config("connection_info.txt")
            download_sharepoint.main(config=config, stop_event=stop_event)
            if stop_event.is_set():
                self.after(0, lambda: self.status_label.configure(text="Status: Afbrudt", text_color="red"))
            else:
                self.after(0, lambda: self.status_label.configure(text="Status: Gennemført!", text_color="green"))
        except InterruptedError:
            self.after(0, lambda: self.status_label.configure(text="Status: Afbrudt", text_color="red"))
        except Exception as e:
            self.after(0, lambda: self.status_label.configure(text="Status: Fejl!", text_color="red"))
            self.after(0, lambda err=str(e): messagebox.showerror("Fejl", err))
        finally:
            self.after(0, lambda: self.start_button.configure(state="normal"))
            self.after(0, lambda: self.stop_button.configure(state="disabled", text="Stop"))
if __name__ == "__main__":
    app = SharepointApp()
    app.mainloop()