Fix `[Errno 22] Invalid argument` by adding filename sanitization and long path support. Improve error reporting and folder path cleaning for Windows compatibility.
51
GEMINI.md
Normal file
@@ -0,0 +1,51 @@
# SharePoint Download Tool

A Python-based utility designed to recursively download folders and files from a specific SharePoint Online Site using the Microsoft Graph API.

## Project Overview

* **Purpose:** Automates the synchronization of specific SharePoint document library folders to a local directory.
* **Technologies:**
    * **Python 3.x**
    * **Microsoft Graph API:** Provides access to the site, its document libraries, and their files.
    * **MSAL (Microsoft Authentication Library):** Handles Entra ID (Azure AD) authentication using the Client Credentials flow.
    * **Requests:** Manages HTTP streaming for large file downloads.
* **Architecture:**
    * `download_sharepoint.py`: The core script that orchestrates authentication, site/drive discovery, and recursive folder traversal.
    * `connection_info.txt`: Centralized configuration file for credentials and target paths.
    * `requirements.txt`: Defines necessary Python dependencies.

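The recursive folder traversal attributed to `download_sharepoint.py` could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `walk_folder` and the injected `list_children` callable are hypothetical names, and `list_children` stands in for whatever Graph API request returns a folder's children. The sketch relies only on the `folder` and `file` facets and the `id` and `name` properties that Graph `driveItem` resources carry.

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator, Tuple


def walk_folder(
    list_children: Callable[[str], Iterable[dict]],
    folder_id: str,
    local_dir: Path,
) -> Iterator[Tuple[dict, Path]]:
    """Yield (drive_item, local_target_path) for every file under a folder.

    `list_children` is a hypothetical callable that returns the driveItem
    dictionaries for the children of the folder with the given id.
    """
    for item in list_children(folder_id):
        target = local_dir / item["name"]
        if "folder" in item:
            # Folders carry a `folder` facet: descend into them.
            yield from walk_folder(list_children, item["id"], target)
        elif "file" in item:
            # Files carry a `file` facet: hand them to the caller to download.
            yield item, target
```

Injecting `list_children` keeps the traversal testable without network access; in a real run it would wrap the authenticated HTTP request and follow any paging links the API returns.
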
## Building and Running

### Prerequisites

* Python 3.x installed.
* A registered application in Microsoft Entra ID with `Sites.Read.All` (or higher) application permissions.

### Setup

1. **Install Dependencies:**
    ```bash
    pip install -r requirements.txt
    ```
2. **Configure Connection:**
    Edit `connection_info.txt` with your specific details:
    * `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET`
    * `SITE_URL`: Full URL to the SharePoint site.
    * `DOCUMENT_LIBRARY`: The name of the target library (e.g., "Documents").
    * `FOLDERS_TO_DOWNLOAD`: Comma-separated list of folder names to sync.
    * `LOCAL_PATH`: The destination path on your local machine.

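The on-disk format of `connection_info.txt` is not shown here, so the following is only a sketch under the assumption that the file holds one `KEY = value` pair per line, with `#` starting a comment. The function names `parse_connection_info` and `folders_to_download` are hypothetical.

```python
from typing import Dict, List


def parse_connection_info(text: str) -> Dict[str, str]:
    """Parse KEY = value lines (an assumed format) into a dictionary."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as secrets may contain "=".
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def folders_to_download(settings: Dict[str, str]) -> List[str]:
    """Split the comma-separated FOLDERS_TO_DOWNLOAD value into clean names."""
    raw = settings.get("FOLDERS_TO_DOWNLOAD", "")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

For example, a line `FOLDERS_TO_DOWNLOAD = Reports, Archive` would yield the list `["Reports", "Archive"]`.
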
### Execution

Run the main download script:

```bash
python download_sharepoint.py
```

### Validation

After each run, the script writes a CSV report named `download_report_YYYYMMDD_HHMMSS.csv` that lists any failed downloads and size mismatches so they can be verified.

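One way the report step might look is sketched below. Only the file name pattern and the size comparison come from this document; the function names, the error wording, and the two report columns (`path`, `error`) are assumptions made for illustration.

```python
import csv
from datetime import datetime
from pathlib import Path
from typing import Iterable, Optional, Tuple


def report_name(now: datetime) -> str:
    """Build the report file name in the download_report_YYYYMMDD_HHMMSS.csv pattern."""
    return f"download_report_{now:%Y%m%d_%H%M%S}.csv"


def check_download(local_path: Path, expected_size: int) -> Optional[str]:
    """Return an error description, or None if the file exists with the expected size."""
    try:
        actual_size = local_path.stat().st_size
    except OSError as exc:
        return f"Download failed: {exc}"
    if actual_size != expected_size:
        return f"Size mismatch: expected {expected_size} bytes, found {actual_size}"
    return None


def write_report(rows: Iterable[Tuple[str, str]], directory: Path, now: datetime) -> Path:
    """Write (path, error) rows to a timestamped CSV file and return its path."""
    report_path = directory / report_name(now)
    with report_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "error"])  # assumed column layout
        writer.writerows(rows)
    return report_path
```

Here `expected_size` would be the `size` value that the Graph API reports for the item being checked.
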
## Development Conventions

* **Authentication:** Always use the Graph API with MSAL for app-only authentication.
* **Error Handling:** Wrap all file and folder operations in `try`/`except` blocks and log any errors to the generated CSV report.
* **Verification:** Post-download verification is performed by comparing the local file size against the `size` property returned by the Graph API.
* **Security:** Never commit `connection_info.txt` or any file containing secrets. Use the provided `.gitignore`.

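The commit message attributes the `[Errno 22] Invalid argument` fix to filename sanitization and long path support. The script's actual helpers are not shown in this file, so the sketch below only illustrates what such helpers might look like: `sanitize_name` and `add_long_path_prefix` are hypothetical names, and the rules applied are the general Windows naming restrictions (reserved characters, trailing dots and spaces, reserved device names) and the `\\?\` extended-length path prefix.

```python
import re

# Characters Windows does not allow in file or folder names, plus control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names Windows reserves, with or without an extension.
_RESERVED_NAMES = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{n}" for n in range(1, 10)),
    *(f"LPT{n}" for n in range(1, 10)),
}


def sanitize_name(name: str, replacement: str = "_") -> str:
    """Return a file or folder name that Windows should accept."""
    # Replace forbidden characters, then drop trailing dots and spaces.
    cleaned = _INVALID_CHARS.sub(replacement, name).rstrip(" .")
    if not cleaned:
        return replacement
    # Prefix reserved device names such as CON or COM1 so they become ordinary names.
    if cleaned.split(".", 1)[0].upper() in _RESERVED_NAMES:
        cleaned = replacement + cleaned
    return cleaned


def add_long_path_prefix(abs_path: str) -> str:
    """Add the extended-length prefix to an absolute, backslash-separated Windows path."""
    if abs_path.startswith("\\\\?\\"):
        return abs_path  # already prefixed
    if abs_path.startswith("\\\\"):
        # UNC paths (\\server\share\...) use the \\?\UNC\ form.
        return "\\\\?\\UNC\\" + abs_path[2:]
    return "\\\\?\\" + abs_path
```

The prefix turns off Windows path normalization, so a helper like this would need to be given a path that is already absolute and free of `.` or `..` segments.
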