SharePoint Connector

Sync content from a SharePoint Online document library into ScaiDrive. The connector uses the Microsoft Graph API with an Azure AD app registration, mirrors a SharePoint site's drive/library into a ScaiDrive share, and schedules recurring syncs.

Base path: /api/v1/sharepoint-connectors/

Prerequisites

You need an Azure AD app registration with Microsoft Graph permissions on the SharePoint site, using one of two authentication modes:

  • App authentication — the connector uses an app-only token with Sites.Selected or Sites.ReadWrite.All permission. Best for bulk-sync scenarios.
  • User authentication — the connector acts as a specific user via delegated permissions. Best when per-user access tracking matters.

For either mode, collect:

  • Azure tenant ID
  • Application (client) ID
  • Client secret (app auth) or user auth flow details
  • SharePoint site URL
  • Site ID and drive ID (retrieved via Graph API or the connector's helper endpoints)
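If you retrieve the site ID and drive ID directly from Microsoft Graph, the lookup URLs can be derived from the site URL. The following is a minimal sketch, assuming Graph v1.0 path-based site addressing (`/sites/{hostname}:/{server-relative-path}`) and the `/sites/{site-id}/drives` listing. The helper names are illustrative and not part of ScaiDrive.

```python
from urllib.parse import quote, urlsplit

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def graph_site_lookup_url(site_url: str) -> str:
    """Build the Graph URL that resolves a SharePoint site URL to its site resource."""
    parts = urlsplit(site_url)
    if not parts.hostname:
        raise ValueError(f"not an absolute URL: {site_url!r}")
    path = parts.path.rstrip("/")
    if not path:
        # Root site of the tenant: addressed by hostname only.
        return f"{GRAPH_ROOT}/sites/{parts.hostname}"
    # Path-based addressing: hostname, a colon, then the server-relative path.
    return f"{GRAPH_ROOT}/sites/{parts.hostname}:{quote(path)}"


def graph_drives_url(site_id: str) -> str:
    """Build the Graph URL that lists the drives (document libraries) of a site."""
    # Site IDs contain commas, which are kept literal in the path segment.
    return f"{GRAPH_ROOT}/sites/{quote(site_id, safe=',')}/drives"
```

Send a GET to each URL with a Graph access token. The `id` of the returned site resource is the site ID, and each entry in the drives listing carries its own `id`.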

Creating a connector

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Marketing SP Site",
    "description": "Marketing team SharePoint library",
    "site_url": "https://contoso.sharepoint.com/sites/marketing",
    "site_id": "contoso.sharepoint.com,abc...",
    "drive_id": "b!xyz...",
    "library_name": "Documents",
    "base_path": "Campaigns/2026",
    "azure_tenant_id": "11111111-1111-1111-1111-111111111111",
    "azure_client_id": "22222222-2222-2222-2222-222222222222",
    "azure_client_secret": "<secret>",
    "auth_type": "app_auth",
    "target_share_id": "shr_01J3H",
    "target_path": "/From SharePoint",
    "sync_direction": "bidirectional",
    "sync_permissions": true,
    "sync_versions": true,
    "sync_metadata": true,
    "sync_interval_minutes": 30
  }'
python
import os

import httpx

# Environment variable names mirror the curl example; adjust to your setup.
url = os.environ["SCAIDRIVE_URL"]
token = os.environ["SCAIDRIVE_TOKEN"]

resp = httpx.post(
    f"{url}/api/v1/sharepoint-connectors",
    headers={"Authorization": f"Bearer {token}"},
    # httpx serializes this dict and sets Content-Type: application/json.
    json={
        "name": "Marketing SP Site",
        "site_url": "https://contoso.sharepoint.com/sites/marketing",
        "site_id": os.environ["SP_SITE_ID"],
        "drive_id": os.environ["SP_DRIVE_ID"],
        "library_name": "Documents",
        "azure_tenant_id": os.environ["AZ_TENANT"],
        "azure_client_id": os.environ["AZ_CLIENT"],
        "azure_client_secret": os.environ["AZ_SECRET"],
        "auth_type": "app_auth",
        "target_share_id": "shr_01J3H",
        "sync_direction": "bidirectional",
        "sync_interval_minutes": 30,
    },
)
resp.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
connector = resp.json()
typescript
const resp = await fetch(`${url}/api/v1/sharepoint-connectors`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Marketing SP Site",
    site_url: "https://contoso.sharepoint.com/sites/marketing",
    site_id: process.env.SP_SITE_ID,
    drive_id: process.env.SP_DRIVE_ID,
    library_name: "Documents",
    azure_tenant_id: process.env.AZ_TENANT,
    azure_client_id: process.env.AZ_CLIENT,
    azure_client_secret: process.env.AZ_SECRET,
    auth_type: "app_auth",
    target_share_id: "shr_01J3H",
    sync_direction: "bidirectional",
    sync_interval_minutes: 30,
  }),
});
const connector = await resp.json();

Credentials are encrypted at rest; azure_client_secret is write-only.

Configuration fields

| Field | Notes |
|---|---|
| name, description | Display name and description |
| site_url, site_id, drive_id | SharePoint site and drive identifiers |
| library_name | Document library name |
| base_path | Subpath inside the library |
| azure_tenant_id, azure_client_id | Azure app registration |
| azure_client_secret | App auth only |
| auth_type | app_auth or user_auth |
| target_share_id, target_path | ScaiDrive destination |
| sync_direction | bidirectional, sharepoint_to_scaidrive, or scaidrive_to_sharepoint |
| sync_permissions | Mirror SharePoint ACLs to ScaiDrive |
| sync_versions | Replicate SharePoint version history |
| sync_metadata | Sync column values and metadata |
| include_patterns, exclude_patterns | Filters (same shape as SMB connector) |
| conflict_resolution | Conflict-resolution strategy |
| sync_interval_minutes | Interval between scheduled syncs, in minutes |
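A payload can be checked against the enumerated values before it is sent. The sketch below is a client-side convenience only: the allowed values for auth_type and sync_direction come from the field list, while treating auth_type as required, requiring a secret for app_auth, and requiring a positive integer interval are assumptions about the server's rules.

```python
AUTH_TYPES = {"app_auth", "user_auth"}
SYNC_DIRECTIONS = {
    "bidirectional",
    "sharepoint_to_scaidrive",
    "scaidrive_to_sharepoint",
}


def check_connector_config(config: dict) -> list[str]:
    """Return a list of problems found in a connector payload (empty if none)."""
    problems = []

    auth_type = config.get("auth_type")
    if auth_type not in AUTH_TYPES:
        problems.append(f"auth_type must be one of {sorted(AUTH_TYPES)}")

    direction = config.get("sync_direction")
    if direction is not None and direction not in SYNC_DIRECTIONS:
        problems.append(f"sync_direction must be one of {sorted(SYNC_DIRECTIONS)}")

    # Assumption: app-only authentication cannot work without a client secret.
    if auth_type == "app_auth" and not config.get("azure_client_secret"):
        problems.append("azure_client_secret is needed for app_auth")

    # Assumption: the interval is a positive whole number of minutes.
    interval = config.get("sync_interval_minutes")
    if interval is not None:
        is_int = isinstance(interval, int) and not isinstance(interval, bool)
        if not is_int or interval <= 0:
            problems.append("sync_interval_minutes should be a positive integer")

    return problems
```

An empty list means the payload passed these local checks; the API remains the authority on what it accepts.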

User-auth flow

For user_auth, start the OAuth authorization flow:

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/authorize \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN"

The response contains an authorization URL. Direct the user to it; once they complete consent, Microsoft redirects back to a ScaiDrive callback, and the connector stores the refresh token automatically.
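In a script, the authorize call and the URL extraction might look like the sketch below. The response field name `authorization_url` is an assumption, since the response body is not specified here; the HTTP call is injected as a callable so the helper stays independent of any particular client library.

```python
from typing import Callable


def authorization_url(
    post: Callable[[str], dict], base_url: str, connector_id: str
) -> str:
    """Call the authorize endpoint and return the URL to send the user to."""
    endpoint = (
        f"{base_url.rstrip('/')}/api/v1/sharepoint-connectors/"
        f"{connector_id}/authorize"
    )
    body = post(endpoint)
    # Assumed field name; check the actual response of your ScaiDrive version.
    url = body.get("authorization_url")
    if not url or not url.startswith("https://"):
        raise ValueError("no usable authorization URL in response")
    return url
```

With httpx, `post` could be a small function that sends the bearer token and returns `response.json()`.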

Testing

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/test \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN"

Returns the connection status, whether the site is accessible, and the effective permissions the app or user has.

Discovering sites and drives

If you don't know the site_id or drive_id, the connector exposes helper endpoints that work once the basic Azure credentials are set:

bash
curl -G $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/sites \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
  --data-urlencode "search=marketing"

Returns the sites the app can access, each with its site_id. Pass a site_id to the drives endpoint:

bash
curl -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
     "$SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/drives?site_id=$SITE"

Returns drives within the site.
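Site IDs contain commas and drive IDs contain characters such as `!`, so query values are safer when percent-encoded. A minimal sketch of building both helper URLs with the standard library follows; the function names are illustrative.

```python
from urllib.parse import urlencode

API_PREFIX = "/api/v1/sharepoint-connectors"


def sites_url(base_url: str, connector_id: str, search: str) -> str:
    """URL for the site-search helper endpoint, with the query encoded."""
    query = urlencode({"search": search})
    return f"{base_url.rstrip('/')}{API_PREFIX}/{connector_id}/sites?{query}"


def drives_url(base_url: str, connector_id: str, site_id: str) -> str:
    """URL for the drive-listing helper endpoint, with site_id encoded."""
    query = urlencode({"site_id": site_id})
    return f"{base_url.rstrip('/')}{API_PREFIX}/{connector_id}/drives?{query}"
```

This mirrors what `--data-urlencode` does in the curl example for the sites endpoint.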

Azure AD identity lookup

Identity mapping requires ScaiDrive to resolve Azure AD users and groups. User search:

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/azure/users/search \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "alice"}'
json
{
  "users": [
    {"id": "aad-user-id", "email": "alice@contoso.com", "name": "Alice Cooper"}
  ]
}

Group search:

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/azure/groups/search \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "marketing"}'

Create identity mappings the same way as SMB:

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/identity-mappings \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "identity_type": "user",
    "smb_principal": "aad-user-id",
    "scaidrive_principal_id": "usr_01J4M"
  }'

(The field is named smb_principal for consistency with the SMB connector — here it's the Azure AD object ID.)
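When many users need mapping, the search results can be turned into mapping payloads in one pass. The sketch below assumes the user-search response shape shown earlier and a lookup table from email address to ScaiDrive principal ID that you maintain yourself (hypothetical; how you obtain it is up to you).

```python
def build_identity_mappings(
    search_result: dict, principals_by_email: dict[str, str]
) -> list[dict]:
    """Turn an Azure AD user-search response into identity-mapping payloads."""
    mappings = []
    for user in search_result.get("users", []):
        email = (user.get("email") or "").lower()
        principal_id = principals_by_email.get(email)
        if principal_id is None:
            # No known ScaiDrive counterpart: leave this user unmapped.
            continue
        mappings.append(
            {
                "identity_type": "user",
                "smb_principal": user["id"],  # the Azure AD object ID
                "scaidrive_principal_id": principal_id,
            }
        )
    return mappings
```

Each returned dict can be posted to the identity-mappings endpoint as-is. Keys in `principals_by_email` are expected in lowercase.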

Triggering and monitoring syncs

bash
curl -X POST $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/sync \
  -H "Authorization: Bearer $SCAIDRIVE_TOKEN"

curl -H "Authorization: Bearer $SCAIDRIVE_TOKEN" \
     $SCAIDRIVE_URL/api/v1/sharepoint-connectors/cnn_01J5Y/jobs

These are the same endpoints, with the same response shape, as SMB connector jobs.
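To wait for a triggered sync to finish, a script can poll the jobs endpoint. The sketch below is deliberately generic: the job status values are assumptions (the job response shape is not described here), and reading the status out of the jobs response is left to the `fetch_status` callable you supply.

```python
import time
from typing import Callable

# Assumed terminal status values; adjust to the real job response.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job reaches a terminal status, then return that status."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real waiting.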

Version and metadata sync

With sync_versions: true, the connector pulls SharePoint's version history and stores each as a ScaiDrive file version. The most recent SharePoint version becomes the current ScaiDrive version.

With sync_metadata: true, SharePoint column values (document author, custom metadata columns) are stored in the ScaiDrive file's metadata dict. They're readable via the file metadata endpoint but don't affect search unless explicitly indexed.
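The version ordering described above can be pictured with a small sketch. It is an illustration of the rule, not the connector's implementation: it assumes version entries shaped like Graph's driveItemVersion (an `id` and a `lastModifiedDateTime`), and the output keys are hypothetical.

```python
from datetime import datetime


def plan_version_import(sp_versions: list[dict]) -> list[dict]:
    """Order SharePoint versions oldest first and flag the newest as current."""

    def modified(version: dict) -> datetime:
        # Graph timestamps end in "Z"; normalize for datetime.fromisoformat.
        stamp = version["lastModifiedDateTime"].replace("Z", "+00:00")
        return datetime.fromisoformat(stamp)

    ordered = sorted(sp_versions, key=modified)
    last = len(ordered) - 1
    return [
        {"source_version": version["id"], "is_current": index == last}
        for index, version in enumerate(ordered)
    ]
```

Importing oldest first means the final version written is the most recent SharePoint version, which then stands as the current ScaiDrive version.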

Limitations

  • Large libraries — Full-sync of libraries with >100k files takes hours. The connector uses Graph's delta query, so subsequent runs are fast.
  • SharePoint check-outs — Files checked out in SharePoint are read-only until checked in. The connector respects this in bidirectional mode.
  • Per-file permissions — Unique SharePoint permissions (broken inheritance) are replicated if sync_permissions: true and identity mappings exist. Without mappings, broken permissions fall back to the connector's service identity.
  • OneNote notebooks — Indexed as opaque blobs. Semantic search doesn't work inside them.
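For background on why runs after the first full sync are fast: a Graph delta query returns changes in pages, each carrying an `@odata.nextLink` until the final page supplies an `@odata.deltaLink` to resume from next time. The sketch below shows that paging pattern in general terms; it is not the connector's internal code, and `fetch_page` is a callable you would supply to perform the authenticated GET.

```python
from typing import Callable


def collect_delta(
    fetch_page: Callable[[str], dict], start_url: str
) -> tuple[list[dict], str]:
    """Follow nextLink pages and return (changed items, deltaLink for next run)."""
    items: list[dict] = []
    url = start_url
    while True:
        page = fetch_page(url)
        items.extend(page.get("value", []))
        next_link = page.get("@odata.nextLink")
        if next_link:
            # More pages remain in this round of changes.
            url = next_link
            continue
        delta_link = page.get("@odata.deltaLink")
        if not delta_link:
            raise ValueError("page has neither a nextLink nor a deltaLink")
        return items, delta_link
```

Starting the next run from the saved deltaLink returns only items that changed since the previous run, rather than the whole library.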

Troubleshooting

CONNECTOR_AUTH_FAILED — Azure rejected the app. Verify the client ID/secret, verify the app has the required Graph permissions, verify admin consent is granted.

CONNECTOR_UNREACHABLE — Graph API timing out or returning 5xx. Usually transient; the connector retries.

Sync errors on specific items — Often SharePoint item-level permission denials. Shown per-file in the job log.

Drift between versions — If SharePoint and ScaiDrive disagree on version history, force a resync with ?full_sync=true on the sync endpoint.
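A client script can treat the two error codes differently: retry on CONNECTOR_UNREACHABLE, stop immediately on CONNECTOR_AUTH_FAILED. The sketch below assumes your HTTP layer raises a `ConnectorError` carrying the code (a hypothetical wrapper, since the error response shape is not described here), and also shows how the full_sync query parameter attaches to the sync URL.

```python
import time
from typing import Callable

RETRYABLE_CODES = {"CONNECTOR_UNREACHABLE"}


class ConnectorError(Exception):
    """Hypothetical wrapper for an error code returned by the connector API."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def sync_url(base_url: str, connector_id: str, full_sync: bool = False) -> str:
    """URL of the sync endpoint, optionally forcing a full resync."""
    url = f"{base_url.rstrip('/')}/api/v1/sharepoint-connectors/{connector_id}/sync"
    return f"{url}?full_sync=true" if full_sync else url


def call_with_retry(
    action: Callable[[], dict],
    attempts: int = 4,
    base_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run action, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return action()
        except ConnectorError as err:
            is_last = attempt == attempts - 1
            if err.code not in RETRYABLE_CODES or is_last:
                raise  # auth failures and exhausted retries go to the caller
            sleep(base_delay_s * 2**attempt)
```

Authentication failures are not retried because they need a configuration fix (credentials, Graph permissions, or admin consent) rather than more attempts.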

Updated 2026-05-18 15:04:14, rev 2