- get_fresh_download_url: adds a 429 check with Retry-After and replaces the fixed sleep(1) with exponential backoff (2^attempt seconds)
- process_item_list: adds a MAX_FOLDER_DEPTH=50 guard against RecursionError in abnormally deep SharePoint folder structures
- README and CLAUDE.md updated with a description of the new behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

A Python utility that synchronizes SharePoint Online folders to local storage using the Microsoft Graph API. Offers both a CLI (`download_sharepoint.py`) and a CustomTkinter GUI (`sharepoint_gui.py`).

## Running the Tool

```bash
# Install dependencies
pip install -r requirements.txt

# GUI mode (recommended for interactive use)
python sharepoint_gui.py

# CLI mode (for automation/scripting)
python download_sharepoint.py
```

Configuration is read from `connection_info.txt` (gitignored; copy it from `connection_info.template.txt` and fill in credentials).

## Architecture

Two-file structure with clear separation of concerns:

**`download_sharepoint.py`** is the core engine, with five logical layers:
1. **Authentication**: MSAL `ConfidentialClientApplication` using the OAuth 2.0 client credentials flow. When a 401 is received, the token is refreshed with `force_refresh=True`.
2. **Graph API navigation**: `get_site_id()` → `get_drive_id()` → `process_item_list()` (recursive, handles `@odata.nextLink` pagination).
3. **Download & resilience**: `download_single_file()` supports resumable downloads via the Range header. `get_fresh_download_url()` handles expired pre-signed URLs and includes its own 429 detection (honoring `Retry-After`) and exponential backoff (`2^attempt` seconds). The `@retry_request` decorator provides the same for all other API calls (up to 5 retries).
4. **Concurrency**: `ThreadPoolExecutor` (max 5 workers). A `report_lock` guards the shared error list. A `stop_event` allows the GUI stop button to cancel in-flight work.
5. **Folder depth guard**: `process_item_list()` accepts a `depth` parameter and stops recursion at `MAX_FOLDER_DEPTH = 50`, logging a warning for any skipped subtrees.

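The 429 handling and backoff described in layer 3 can be sketched as a small pure function. This is only an illustration, not the repository's code: `retry_delay` and its parameters are hypothetical names, and the sketch assumes `Retry-After` carries a number of seconds rather than an HTTP date.

```python
from typing import Mapping, Optional


def retry_delay(
    attempt: int,
    status_code: Optional[int] = None,
    headers: Optional[Mapping[str, str]] = None,
) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper for illustration; the real functions may differ.
    """
    if status_code == 429 and headers:
        value = headers.get("Retry-After")
        if value is not None:
            try:
                # Assumption: the header is delta-seconds, e.g. "7".
                return max(0.0, float(value))
            except ValueError:
                pass  # Not a plain number: fall through to backoff.
    # Exponential backoff: 1, 2, 4, 8, ... seconds.
    return float(2 ** attempt)
```

A caller would typically pass the result to `time.sleep()` before the next attempt; keeping the calculation separate from the sleep makes it easy to unit test.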
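The recursive traversal with `@odata.nextLink` pagination and the depth guard might look roughly like the sketch below. It is not the actual `process_item_list()`: `walk_folder`, the injected `fetch_page` callable, and the `children_url` key are hypothetical stand-ins so the example runs without network access, and whether the real guard triggers at exactly 50 or just past it is an assumption.

```python
import logging
from typing import Any, Callable, Dict, List

MAX_FOLDER_DEPTH = 50  # Value stated in the layer list above.
logger = logging.getLogger(__name__)


def walk_folder(
    url: str,
    fetch_page: Callable[[str], Dict[str, Any]],
    depth: int = 0,
    max_depth: int = MAX_FOLDER_DEPTH,
) -> List[str]:
    """Collect file names under `url`, following pagination and capping depth."""
    if depth > max_depth:
        # Skip the subtree instead of risking a RecursionError.
        logger.warning("Skipping %s: depth %d exceeds %d", url, depth, max_depth)
        return []
    files: List[str] = []
    next_url = url
    while next_url:
        page = fetch_page(next_url)
        for item in page.get("value", []):
            if "folder" in item:
                # `children_url` is a hypothetical key used only in this sketch.
                files.extend(
                    walk_folder(item["children_url"], fetch_page, depth + 1, max_depth)
                )
            else:
                files.append(item["name"])
        # Graph responses include this key only when more pages remain.
        next_url = page.get("@odata.nextLink")
    return files
```

Injecting `fetch_page` keeps the traversal logic testable with plain dictionaries; in the real tool that role is presumably played by an authenticated Graph request.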
**`sharepoint_gui.py`** is a CustomTkinter wrapper that:
- Persists settings to a local JSON file
- Spawns the core engine in a background thread
- Patches `requests.get` to route through the GUI's log display
- Provides a folder browser for `LOCAL_PATH`

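How a background thread, the worker pool, `stop_event`, and `report_lock` could fit together is sketched below. The names `stop_event` and `report_lock` come from this file; `download_one`, `run_engine`, `start_in_background`, and the simulated failure are hypothetical and only stand in for the real download logic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

stop_event = threading.Event()   # Set by the GUI stop button.
report_lock = threading.Lock()   # Guards the shared error list.
errors: List[Tuple[str, str]] = []


def download_one(name: str) -> str:
    """Stand-in for a real download; names starting with 'bad' fail."""
    if stop_event.is_set():
        return "skipped"
    try:
        if name.startswith("bad"):
            raise OSError("simulated failure")
        return "ok"
    except OSError as exc:
        with report_lock:  # Several workers may report errors at once.
            errors.append((name, str(exc)))
        return "failed"


def run_engine(names: List[str], max_workers: int = 5) -> List[str]:
    """Run downloads on a pool of at most `max_workers` threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download_one, names))


def start_in_background(names: List[str], results: List[str]) -> threading.Thread:
    """Run the engine off the GUI thread so the interface stays responsive."""
    worker = threading.Thread(
        target=lambda: results.extend(run_engine(names)), daemon=True
    )
    worker.start()
    return worker
```

In a GUI, the stop button's handler would call `stop_event.set()`, after which remaining work items return early instead of starting new downloads.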
## Key Behaviors to Preserve

- **Self-healing sessions**: On 401, the code refreshes both the MSAL access token *and* the pre-signed Graph download URL before retrying, because these are two separate expiry mechanisms.
- **Resumable downloads**: Files are downloaded in 1 MB chunks using HTTP Range headers. Existing files are skipped if their size matches; partial files are resumed from the last byte.
- **Stop signal**: `stop_event.is_set()` is checked in the download loop and in the recursive traversal; any new code that loops must respect this.

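The skip/resume decision for resumable downloads can be illustrated with a small helper. This is a sketch under assumptions, not the logic of `download_single_file()`: `plan_download` is a hypothetical name, `local_size` is `None` when no local file exists, and restarting from scratch when the local file is larger than the remote one is a guess at sensible behavior.

```python
from typing import Dict, Optional, Tuple


def plan_download(
    local_size: Optional[int], remote_size: int
) -> Tuple[str, Dict[str, str]]:
    """Return (action, extra request headers) for one file.

    Hypothetical helper for illustration only.
    """
    if local_size is not None and local_size == remote_size:
        return "skip", {}  # Sizes match: treat the file as complete.
    if local_size is not None and 0 < local_size < remote_size:
        # Ask the server for everything from the first missing byte onward.
        return "resume", {"Range": f"bytes={local_size}-"}
    # No local file, an empty one, or one larger than the remote copy.
    return "full", {}
```

For a `"resume"` result the local file would be opened in append mode (`"ab"`) so the incoming chunks continue where the previous run stopped.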
## Output

- `sharepoint_download.log`: Full operation log
- `download_report_YYYYMMDD_HHMMSS.csv`: Per-run error report (gitignored)