SharePoint Download Tool - Technical Documentation
A production-ready Python utility for robust synchronization of SharePoint Online folders using the Microsoft Graph API.
Project Overview
- Purpose: Enterprise-grade synchronization tool for local mirroring of SharePoint content.
- Technologies:
  - Microsoft Graph API: Advanced REST API for SharePoint data.
  - MSAL: Secure authentication using Azure AD Client Credentials.
  - Requests: High-performance HTTP client with streaming and Range header support.
  - ThreadPoolExecutor: Parallel file processing for optimized throughput.
Core Features (Production Ready)
- Resumable Downloads: Implements HTTP Range headers to resume partially downloaded files, critical for multi-gigabyte assets.
- Reliability: Includes a custom retry_request decorator for exponential backoff, handling throttling (429) and transient network errors.
- Concurrency: Multi-threaded architecture (5 workers) for simultaneous scanning and downloading.
- Pagination: Full support for OData pagination, ensuring complete folder traversal regardless of item count.
- Logging & Audit: Integrated Python logging to sharepoint_download.log and structured CSV reports for error auditing.
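The reliability, resume, and pagination features above can be sketched as follows. This is a minimal illustration and not the tool's actual code: only the retry_request name comes from this document, while TransientError, resume_headers, iter_items, the fetch_json callable, and all parameter names and defaults are hypothetical. The sketch assumes the caller maps HTTP 429 responses and network failures to TransientError, and it uses only the standard library so it can run without a SharePoint tenant.

```python
import functools
import os
import time


class TransientError(Exception):
    """Hypothetical error standing in for throttling (429) or a dropped connection."""


def retry_request(max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped call with exponential backoff (base_delay, 2x, 4x, ...)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except TransientError:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error to the caller
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator


def resume_headers(local_path):
    """Build a Range header that requests only the bytes not yet on disk."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    return {"Range": f"bytes={offset}-"} if offset else {}


def iter_items(fetch_json, first_url):
    """Yield every item across pages by following @odata.nextLink."""
    url = first_url
    while url:
        page = fetch_json(url)  # assumed to return the decoded JSON body
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")  # absent on the last page
```

In a real client, fetch_json would typically be a retry-wrapped HTTP GET, and the dictionary from resume_headers would be passed as request headers while the local file is opened in append mode. A production decorator would probably also honor the Retry-After header that accompanies 429 responses; that is left out here to keep the sketch short.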
Building and Running
Setup
- Dependencies: pip install -r requirements.txt
- Configuration: Use connection_info.template.txt to create connection_info.txt.
Execution
python download_sharepoint.py
Development Conventions
- Error Handling: Always use the safe_get (retry-wrapped) method for Graph API calls.
- Thread Safety: Use report_lock when updating the shared error list from worker threads.
- Logging: Prefer logger.info() or logger.error() over print() to ensure persistence in sharepoint_download.log.
- Integrity: Always verify file integrity using size and quickXorHash where available.
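The thread-safety and integrity conventions can be illustrated together. The sketch below is an assumption about how such code might look, not an excerpt from download_sharepoint.py: report_lock and the 5-worker pool are named in this document, while error_report, record_error, verify_size, and verify_all are hypothetical names. Only the size comparison is shown; a quickXorHash comparison would slot in next to it where the hash is available.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

report_lock = threading.Lock()
error_report = []  # hypothetical shared error list, written by worker threads


def record_error(name, reason):
    """Append to the shared error list while holding report_lock."""
    with report_lock:
        error_report.append({"file": name, "reason": reason})


def verify_size(name, expected_size, actual_size):
    """Return True when sizes match; otherwise record an audit entry."""
    if expected_size == actual_size:
        return True
    record_error(name, f"size mismatch: expected {expected_size}, got {actual_size}")
    return False


def verify_all(items, max_workers=5):
    """Check (name, expected_size, actual_size) tuples in parallel, in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: verify_size(*item), items))
```

Routing every write through one helper such as record_error keeps the locking in a single place, so worker code never touches the shared list directly.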