# SharePoint Download Tool - Technical Documentation

A production-ready Python utility that mirrors SharePoint Online folders to a local directory using the Microsoft Graph API.
## Project Overview

* **Purpose:** Synchronization tool that maintains a local mirror of SharePoint content.
* **Technologies:**
    * **Microsoft Graph API:** REST API used to list and download SharePoint content.
    * **MSAL:** Authentication via the Azure AD client credentials flow.
    * **Requests:** HTTP client used for streamed downloads with `Range` header support.
    * **ThreadPoolExecutor:** Parallel file processing for higher throughput.
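
The Requests entry above mentions streaming with `Range` headers. The following is a minimal sketch of how a resumed download could look, not the tool's actual code. The function name `resume_download` and its parameters are hypothetical, and `session` is assumed to behave like a `requests.Session` (any object with a compatible `get` method works, which keeps the sketch testable offline).

```python
import os


def resume_download(session, url, dest_path, chunk_size=1024 * 1024):
    """Download url to dest_path, resuming from any partial file on disk.

    Hypothetical helper: `session` is assumed to expose a requests-style
    get(url, headers=..., stream=True, timeout=...) method.
    Returns the size of the file on disk after the call.
    """
    # Bytes already on disk from an earlier, interrupted attempt.
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    headers = {"Range": f"bytes={offset}-"} if offset else {}

    response = session.get(url, headers=headers, stream=True, timeout=60)
    if response.status_code == 416:
        # Range not satisfiable: the offset is at or past the end of the
        # resource. This sketch treats that as "already complete".
        return offset
    response.raise_for_status()

    # 206 means the server honored the Range header, so append to the file.
    # Any other success status (typically 200) carries the full body, so
    # the partial file is overwritten from the start.
    mode = "ab" if response.status_code == 206 else "wb"
    with open(dest_path, mode) as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:
                fh.write(chunk)
    return os.path.getsize(dest_path)
```

Checking for status 206 before appending matters: a server that ignores the `Range` header replies with the whole file, and appending that to a partial file would corrupt it.
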
## Core Features (Production Ready)

1. **Resumable Downloads:** Sends HTTP `Range` headers to resume partially downloaded files, which is critical for multi-gigabyte assets.
2. **Reliability:** A custom `retry_request` decorator applies exponential backoff to handle throttling (HTTP 429) and transient network errors.
3. **Concurrency:** Multi-threaded architecture (5 workers) for simultaneous scanning and downloading.
4. **Pagination:** Full support for OData pagination, so folder traversal is complete regardless of item count.
5. **Logging & Audit:** Python `logging` output goes to `sharepoint_download.log`, and structured CSV reports support error auditing.
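
To illustrate the reliability feature, here is a minimal sketch of a retry decorator with exponential backoff. Only the name `retry_request` comes from this document; the signature, the `RetryableError` exception, and the `sleep` parameter are assumptions made for illustration and may differ from the real implementation.

```python
import functools
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. HTTP 429)."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        # Seconds suggested by the server's Retry-After header, if any.
        self.retry_after = retry_after


def retry_request(max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Retry the wrapped call on RetryableError, doubling the wait each time."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableError as exc:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    if exc.retry_after is not None:
                        # Prefer the server's hint when one was provided.
                        delay = exc.retry_after
                    else:
                        # base_delay, 2x, 4x, ... capped at max_delay.
                        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                    sleep(delay)

        return wrapper

    return decorator
```

A caller would wrap its request function and raise `RetryableError` for throttled or transient responses, passing the `Retry-After` value when the server supplies one. Injecting `sleep` is a design choice that lets tests run without real delays.
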
## Building and Running

### Setup

1. **Dependencies:** `pip install -r requirements.txt`
2. **Configuration:** Copy `connection_info.template.txt` to `connection_info.txt` and fill in the values.
### Execution

`python download_sharepoint.py`
## Development Conventions

* **Error Handling:** Always use the `safe_get` (retry-wrapped) method for Graph API calls.
* **Thread Safety:** Hold `report_lock` when updating the shared error list from worker threads.
* **Logging:** Prefer `logger.info()` or `logger.error()` over `print()` so that messages persist in `sharepoint_download.log`.
* **Integrity:** Always verify file integrity using `size` and `quickXorHash` where available.
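
The thread-safety convention can be sketched as follows. This is an illustration under assumptions, not the project's code: only `report_lock` is named in this document, while `error_report`, `record_error`, and `download_all` are hypothetical names, and `download_one` stands in for whatever function downloads a single item.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

report_lock = threading.Lock()
error_report = []  # shared across worker threads (hypothetical name)


def record_error(path, reason):
    """Append one failure to the shared report while holding the lock."""
    with report_lock:
        error_report.append({"path": path, "error": reason})


def _run_one(download_one, item):
    """Executed inside a worker thread: download and record any failure."""
    try:
        download_one(item)
        return True
    except Exception as exc:
        record_error(item, str(exc))
        return False


def download_all(items, download_one, max_workers=5):
    """Process items on a thread pool and return how many succeeded."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda item: _run_one(download_one, item), items))
    return sum(results)
```

Catching exceptions inside the worker keeps one failed file from aborting the batch, and routing every update through `record_error` means no thread touches the shared list without the lock.
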