SharePoint Download Tool

A Python utility that recursively downloads folders and files from a SharePoint Online site using the Microsoft Graph API.

Project Overview

  • Purpose: Automates the synchronization of specific SharePoint document library folders to a local directory.
  • Technologies:
    • Python 3.x
    • Microsoft Graph API: Used to discover the site and drive, list folder contents, and download files.
    • MSAL (Microsoft Authentication Library): Handles Entra ID (Azure AD) app-only authentication using the client credentials flow.
    • Requests: Manages HTTP streaming for large file downloads.
  • Architecture:
    • download_sharepoint.py: The core script that orchestrates authentication, site/drive discovery, and recursive folder traversal.
    • connection_info.txt: Centralized configuration file for credentials and target paths.
    • requirements.txt: Defines necessary Python dependencies.
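
The discovery and traversal steps attributed to download_sharepoint.py can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: site_lookup_url, walk_folder, and the injected list_children callable are hypothetical names. It assumes the Graph addressing form GET /sites/{hostname}:/{path} for resolving a site and the "folder" facet that Graph sets on folder driveItems.

```python
from urllib.parse import urlparse

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def site_lookup_url(site_url):
    """Build the Graph URL that resolves a SharePoint site from its full URL."""
    parts = urlparse(site_url)
    path = parts.path.rstrip("/")
    if not path:
        # Root site of the tenant: address it by hostname only.
        return f"{GRAPH_ROOT}/sites/{parts.netloc}"
    return f"{GRAPH_ROOT}/sites/{parts.netloc}:{path}"


def walk_folder(list_children, item_id, prefix=""):
    """Yield (relative_path, item) for every file below a folder.

    list_children is a hypothetical callable: given a driveItem id it returns
    the child driveItem dicts (in a real run, the collected pages of
    GET /drives/{drive-id}/items/{item-id}/children).
    """
    for item in list_children(item_id):
        rel_path = f"{prefix}/{item['name']}" if prefix else item["name"]
        if "folder" in item:
            # Folders carry a "folder" facet; recurse into them.
            yield from walk_folder(list_children, item["id"], rel_path)
        else:
            yield rel_path, item
```

Passing the listing function in as an argument keeps the recursion independent of HTTP, so it can be exercised against an in-memory dictionary of fake items before pointing it at a real drive.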

Building and Running

Prerequisites

  • Python 3.x installed.
  • A registered application in Microsoft Entra ID with Sites.Read.All (or higher) application permissions.
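
For orientation, acquiring an app-only token for such a registered application with MSAL might look like the sketch below. The function names authority_url and acquire_graph_token are hypothetical, and the script's real implementation may differ; the MSAL calls shown (ConfidentialClientApplication and acquire_token_for_client) are the library's client credentials entry points.

```python
GRAPH_SCOPE = ["https://graph.microsoft.com/.default"]


def authority_url(tenant_id):
    """Entra ID authority URL for a single tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}"


def acquire_graph_token(tenant_id, client_id, client_secret):
    """Return an app-only access token for Microsoft Graph, or raise on failure."""
    import msal  # imported lazily so authority_url works without MSAL installed

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=authority_url(tenant_id),
        client_credential=client_secret,
    )
    # The .default scope requests the application permissions already
    # granted to the app registration (for example Sites.Read.All).
    result = app.acquire_token_for_client(scopes=GRAPH_SCOPE)
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    return result["access_token"]
```

The returned token would then be sent as an "Authorization: Bearer ..." header on each Graph request.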

Setup

  1. Install Dependencies:
    pip install -r requirements.txt
    
  2. Configure Connection: Edit connection_info.txt with your specific details:
    • TENANT_ID, CLIENT_ID, CLIENT_SECRET
    • SITE_URL: Full URL to the SharePoint site.
    • DOCUMENT_LIBRARY: The name of the target library (e.g., "Documents").
    • FOLDERS_TO_DOWNLOAD: Comma-separated list of folder names to sync.
    • LOCAL_PATH: The destination path on your local machine.
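
The on-disk format of connection_info.txt is not specified here. Assuming one KEY=VALUE pair per line (an assumption, as is treating lines that start with "#" as comments), loading it could look like this sketch; parse_connection_info is a hypothetical helper name.

```python
def parse_connection_info(text):
    """Parse KEY=VALUE lines into a dict (assumed format, not confirmed)."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, assumed comment lines, and malformed lines
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional double quotes.
        config[key.strip()] = value.strip().strip('"')
    # FOLDERS_TO_DOWNLOAD is described as comma-separated; split it into a list.
    folders = config.get("FOLDERS_TO_DOWNLOAD", "")
    config["FOLDERS_TO_DOWNLOAD"] = [f.strip() for f in folders.split(",") if f.strip()]
    return config
```

Splitting on the first "=" only means a client secret that itself contains "=" is preserved intact.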

Execution

Run the main download script:

python download_sharepoint.py

Validation

After each run, the script writes a CSV report named download_report_YYYYMMDD_HHMMSS.csv that lists any failed downloads or size mismatches so they can be verified.
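
As an illustration of the naming pattern, a report like this could be produced as sketched below. The column names (path, issue, expected_size, actual_size) and the helper names are assumptions; the actual report layout is not documented here.

```python
import csv
import os
from datetime import datetime

# Hypothetical column layout; the real report may use different fields.
REPORT_FIELDS = ["path", "issue", "expected_size", "actual_size"]


def report_filename(now=None):
    """Return a name of the form download_report_YYYYMMDD_HHMMSS.csv."""
    now = now or datetime.now()
    return f"download_report_{now:%Y%m%d_%H%M%S}.csv"


def write_report(rows, directory=".", now=None):
    """Write issue rows to a timestamped CSV file and return its path."""
    path = os.path.join(directory, report_filename(now))
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=REPORT_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

A report that contains only the header row would then indicate a run with no recorded failures or mismatches.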

Development Conventions

  • Authentication: Always use the Graph API with MSAL for app-only authentication.
  • Error Handling: Wrap every file and folder operation in a try-except block, and log each error to the generated CSV report.
  • Verification: Post-download verification is performed by comparing the local file size against the size property returned by the Graph API.
  • Security: Never commit connection_info.txt or any file containing secrets. Use the provided .gitignore.
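
The error-handling and verification conventions above can be combined in a small check like the following sketch. verify_download is a hypothetical name and the shape of the returned dict is an assumption; item stands for a driveItem dict as returned by the Graph API, whose size property is the file size in bytes.

```python
import os


def verify_download(local_path, item):
    """Compare a local file's size with the driveItem's reported size.

    Returns None when the sizes match, otherwise a dict describing the problem
    that a caller could append to the CSV report.
    """
    expected = item.get("size")
    try:
        actual = os.path.getsize(local_path)
    except OSError as exc:
        # Missing or unreadable file: record it instead of aborting the run.
        return {"path": local_path, "issue": f"unreadable: {exc}",
                "expected_size": expected, "actual_size": None}
    if actual != expected:
        return {"path": local_path, "issue": "size mismatch",
                "expected_size": expected, "actual_size": actual}
    return None
```

A size comparison catches truncated downloads but not content corruption that preserves length; it is a lightweight check rather than a hash-based one.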