SharePoint Download Tool
A Python-based utility designed to recursively download folders and files from a specific SharePoint Online Site using the Microsoft Graph API.
Project Overview
- Purpose: Automates the synchronization of specific SharePoint document library folders to a local directory.
- Technologies:
- Python 3.x
- Microsoft Graph API: Used for site and drive discovery, folder listing, and file access.
- MSAL (Microsoft Authentication Library): Handles Entra ID (Azure AD) authentication using Client Credentials flow.
- Requests: Manages HTTP streaming for large file downloads.
- Architecture:
- download_sharepoint.py: The core script that orchestrates authentication, site/drive discovery, and recursive folder traversal.
- connection_info.txt: Centralized configuration file for credentials and target paths.
- requirements.txt: Defines necessary Python dependencies.
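The recursive folder traversal can be pictured with a small sketch. This is not the code in download_sharepoint.py; it assumes a hypothetical `list_children` callable that returns the child driveItems of a folder (for example, the `value` array of a Graph `children` request), and it relies on Graph marking folders with a `folder` facet and files with a `file` facet.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical shape: given a folder's item id, return its child driveItems,
# e.g. the "value" array from GET /drives/{drive-id}/items/{item-id}/children.
ListChildren = Callable[[str], List[Dict]]


def walk_drive_folder(
    list_children: ListChildren, folder_id: str, prefix: str = ""
) -> Iterator[Tuple[str, Dict]]:
    """Yield (relative_path, item) for every file below folder_id."""
    for item in list_children(folder_id):
        path = f"{prefix}/{item['name']}" if prefix else item["name"]
        if "folder" in item:
            # Folders carry a "folder" facet: descend into them.
            yield from walk_drive_folder(list_children, item["id"], path)
        elif "file" in item:
            # Files carry a "file" facet: report them to the caller.
            yield path, item


# In-memory stand-in for Graph responses, so the sketch runs offline.
fake_tree = {
    "root": [
        {"id": "a", "name": "Reports", "folder": {"childCount": 1}},
        {"id": "b", "name": "readme.txt", "file": {}, "size": 12},
    ],
    "a": [{"id": "c", "name": "q1.xlsx", "file": {}, "size": 2048}],
}
files = list(walk_drive_folder(lambda fid: fake_tree.get(fid, []), "root"))
print([path for path, _ in files])  # ['Reports/q1.xlsx', 'readme.txt']
```

Passing the listing function in as a parameter keeps the traversal logic separate from HTTP and paging concerns, which also makes it easy to test without a tenant.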
Building and Running
Prerequisites
- Python 3.x installed.
- A registered application in Microsoft Entra ID with Sites.Read.All (or higher) application permissions.
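For orientation, here is a minimal sketch of how an app registration like this is typically used with MSAL's Client Credentials flow. The helper names `authority_for` and `acquire_graph_token` are illustrative assumptions, not functions from this project; the MSAL calls themselves (`ConfidentialClientApplication`, `acquire_token_for_client`) are part of the msal package.

```python
# With app-only (client credentials) auth, the scope is always "<resource>/.default";
# the actual permissions (e.g. Sites.Read.All) come from the app registration.
GRAPH_SCOPE = ["https://graph.microsoft.com/.default"]


def authority_for(tenant_id: str) -> str:
    """Build the Entra ID authority URL for a tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}"


def acquire_graph_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Request an app-only access token for Microsoft Graph (hypothetical helper)."""
    import msal  # third-party: pip install msal

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=authority_for(tenant_id),
        client_credential=client_secret,
    )
    result = app.acquire_token_for_client(scopes=GRAPH_SCOPE)
    if "access_token" not in result:
        # MSAL reports failures in the result dict rather than raising.
        raise RuntimeError(
            f"Token request failed: {result.get('error')}: "
            f"{result.get('error_description')}"
        )
    return result["access_token"]
```

The returned token would then be sent as an `Authorization: Bearer ...` header on Graph requests. If the application permission has not been granted admin consent, the token request may succeed while Graph calls still fail with an authorization error.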
Setup
- Install Dependencies:
pip install -r requirements.txt
- Configure Connection:
Edit connection_info.txt with your specific details:
- TENANT_ID, CLIENT_ID, CLIENT_SECRET
- SITE_URL: Full URL to the SharePoint site.
- DOCUMENT_LIBRARY: The name of the target library (e.g., "Documents").
- FOLDERS_TO_DOWNLOAD: Comma-separated list of folder names to sync.
- LOCAL_PATH: The destination path on your local machine.
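The exact syntax of connection_info.txt is not shown here, so the sketch below assumes plain KEY=VALUE lines with optional `#` comments. It illustrates how such settings might be read, how FOLDERS_TO_DOWNLOAD splits into a list, and how SITE_URL maps to Graph's path-based site address (`/sites/{hostname}:/{server-relative-path}`). All function names and sample values are hypothetical.

```python
from urllib.parse import urlparse


def parse_connection_info(text: str) -> dict:
    """Parse KEY=VALUE lines (assumed format); skip blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def folders_to_download(settings: dict) -> list:
    """Split the comma-separated folder list, dropping empty entries."""
    raw = settings.get("FOLDERS_TO_DOWNLOAD", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


def graph_site_endpoint(site_url: str) -> str:
    """Map a SharePoint site URL to the Graph path-based site address."""
    parsed = urlparse(site_url)
    path = parsed.path.rstrip("/")
    base = f"https://graph.microsoft.com/v1.0/sites/{parsed.netloc}"
    # A root site has no server-relative path, so no ":" separator is added.
    return f"{base}:{path}" if path else base


# Placeholder values only; never put real secrets in examples.
sample = """
# connection_info.txt (assumed layout)
TENANT_ID=00000000-0000-0000-0000-000000000000
SITE_URL=https://contoso.sharepoint.com/sites/Finance
DOCUMENT_LIBRARY=Documents
FOLDERS_TO_DOWNLOAD=Invoices, Reports
LOCAL_PATH=./downloads
"""
settings = parse_connection_info(sample)
folders = folders_to_download(settings)
endpoint = graph_site_endpoint(settings["SITE_URL"])
print(folders, endpoint)
```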
Execution
Run the main download script:
python download_sharepoint.py
Validation
After execution, a CSV report named download_report_YYYYMMDD_HHMMSS.csv is generated, detailing any failed downloads or size mismatches for verification.
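A report with this naming pattern could be produced roughly as follows. The column names (`path`, `issue`, `expected_size`, `actual_size`) and both function names are assumptions for illustration; the real report's columns may differ.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

# Hypothetical column layout for the report.
REPORT_FIELDS = ["path", "issue", "expected_size", "actual_size"]


def report_filename(now: datetime) -> str:
    """Build download_report_YYYYMMDD_HHMMSS.csv from a timestamp."""
    return f"download_report_{now:%Y%m%d_%H%M%S}.csv"


def write_report(directory, issues, now=None) -> Path:
    """Write one CSV row per failed download or size mismatch."""
    now = now or datetime.now()
    path = Path(directory) / report_filename(now)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=REPORT_FIELDS)
        writer.writeheader()
        writer.writerows(issues)
    return path


# Usage: write a single mismatch row into a temporary directory and read it back.
issue = {
    "path": "Reports/q1.xlsx",
    "issue": "size mismatch",
    "expected_size": 2048,
    "actual_size": 1024,
}
with tempfile.TemporaryDirectory() as tmp:
    report = write_report(tmp, [issue], now=datetime(2024, 1, 2, 3, 4, 5))
    report_name = report.name
    with report.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
print(report_name, rows)
```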
Development Conventions
- Authentication: Always use the Graph API with MSAL for app-only authentication.
- Error Handling: All file and folder operations should be wrapped in try-except blocks, with errors logged to the generated CSV report.
- Verification: Post-download verification is performed by comparing the local file size against the size property returned by the Graph API.
- Security: Never commit connection_info.txt or any file containing secrets. Use the provided .gitignore.
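The error-handling and verification conventions can be combined in one small check. This is a sketch under assumptions: `check_download` is a hypothetical helper, `item` stands for a driveItem dict as returned by Graph (with its `size` in bytes), and the returned dict uses illustrative keys that a CSV report could consume.

```python
import tempfile
from pathlib import Path


def check_download(local_path, item: dict):
    """Return None when the local size matches item["size"], else an issue dict."""
    path = Path(local_path)
    expected = item.get("size")
    try:
        actual = path.stat().st_size
    except OSError as exc:
        # File operations are wrapped so one failure is recorded, not fatal.
        return {
            "path": str(path),
            "issue": exc.__class__.__name__,
            "expected_size": expected,
            "actual_size": None,
        }
    if actual != expected:
        return {
            "path": str(path),
            "issue": "size mismatch",
            "expected_size": expected,
            "actual_size": actual,
        }
    return None


# Usage: a 5-byte file checked against matching, mismatching, and missing cases.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hello.txt"
    target.write_bytes(b"hello")
    ok_result = check_download(target, {"size": 5})
    mismatch = check_download(target, {"size": 9})
    missing = check_download(Path(tmp) / "absent.txt", {"size": 1})
print(ok_result, mismatch["issue"], missing["issue"])
```

A size comparison catches truncated or interrupted downloads cheaply, but it does not detect corruption that preserves length; a hash comparison would be needed for that.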