Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
298 changes: 290 additions & 8 deletions scripts/spo-time-based-file-reports/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,18 +22,22 @@ The three scripts are as follows:
- Scan the specific folders in the specific libraries on the specific sites from the csv file to get files older than 4 years
- Create a report with the specific name in the csv file and upload it to the specified SharePoint location

## Pre-requisites
## Pre-requisites

Several things must be configured or installed ahead of time
An App registration with a certificate and appropriate permissions has to be created in Azure
That certificate must be installed on any machine that will run the scripts
PowerShell 7 must be installed
The following PowerShell modules are needed:
PnP.PowerShell
ImportExcel
- An App registration with a certificate and appropriate permissions has to be created in Azure
- That certificate must be installed on any machine that will run the scripts
- PowerShell 7 must be installed
- The following PowerShell modules are needed:
- PnP.PowerShell
- ImportExcel

## Setup

Each of these scripts runs off of a csv file that you must fill out before running. You also have to configure the scripts themselves. See below for more details.

### All Scripts

This script uses an Azure App registration for authentication. You must create the registration with certificate to get the client ID, tenantID, and certificate thumbprint. You should also make sure that app has sufficient permissions. These are the ones I used:

![APIPermissions.png](assets/APIPermissions.png)
Expand All @@ -58,6 +62,7 @@ Fill out the OneDriveURLs.CSV file with the URL of the OneDrive user you wish to
Fill out the Script2SPURLs.csv file with the URLs for all the SharePoint sites you wish to scan. Make sure you use the complete path!

### Single document library and folder

1. Open customSPURLs.csv
2. Add the URL for every site you wish to scan
3. Add the library name for the title of the library you are scanning
Expand All @@ -77,10 +82,10 @@ Now that the scripts are setup, you just need to run them. All these steps are t
4. Hit enter and the script will run
5. Do the same thing with script 2 and 3 if you wish


## 1. OneDrive Scan

# [PnP PowerShell](#tab/pnpps)

```powershell

# Declare and initialize your app-only authentication details
Expand Down Expand Up @@ -259,6 +264,7 @@ Stop-Transcript
## 3. Individual library and folder Scan

# [PnP PowerShell](#tab/pnpps3)

```powershell
# Declare and initialize your app-only authentication details
$clientId = "xxxxx"
Expand Down Expand Up @@ -334,11 +340,287 @@ Stop-Transcript
[!INCLUDE [More about PnP PowerShell](../../docfx/includes/MORE-PNPPS.md)]
***

## All in one CLI for Microsoft 365 version

# [CLI for Microsoft 365](#tab/cli-m365-ps)

```powershell

[CmdletBinding(SupportsShouldProcess)]
param(
[Parameter(Mandatory = $false, HelpMessage = "SharePoint admin center URL (e.g., https://contoso-admin.sharepoint.com)")]
[ValidatePattern('^https://')]
[string]$TenantAdminUrl,

[Parameter(Mandatory = $false, HelpMessage = "SharePoint site URL to scan (e.g., https://contoso.sharepoint.com/sites/project)")]
[ValidatePattern('^https://')]
[string]$SiteUrl,

[Parameter(Mandatory = $false, HelpMessage = "Title of a specific document library to scan")]
[string]$LibraryName,

[Parameter(Mandatory = $false, HelpMessage = "Server-relative URL of a specific folder to scan")]
[string]$FolderUrl,

[Parameter(Mandatory = $false, HelpMessage = "Number of days to use as age threshold (default: 1460 = 4 years)")]
[int]$DaysOld = 1460,

[Parameter(Mandatory = $false, HelpMessage = "Full path for the CSV report file")]
[string]$OutputPath,

[Parameter(Mandatory = $false, HelpMessage = "Include OneDrive personal sites in scan (requires TenantAdminUrl)")]
[switch]$IncludeOneDrive,

[Parameter(Mandatory = $false, HelpMessage = "Scan subfolders recursively")]
[switch]$Recursive
)

begin {
if (-not $TenantAdminUrl -and -not $SiteUrl) {
throw "You must specify either -TenantAdminUrl or -SiteUrl parameter."
}

if ($TenantAdminUrl -and $SiteUrl) {
throw "Cannot specify both -TenantAdminUrl and -SiteUrl. Choose one scan mode."
}

if ($IncludeOneDrive -and -not $TenantAdminUrl) {
throw "-IncludeOneDrive requires -TenantAdminUrl parameter."
}

if ($LibraryName -and -not $SiteUrl) {
throw "-LibraryName requires -SiteUrl parameter."
}

if ($FolderUrl -and -not $SiteUrl) {
throw "-FolderUrl requires -SiteUrl parameter."
}

if (-not $OutputPath) {
$timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
$OutputPath = Join-Path (Get-Location) "OldFilesReport-$timestamp.csv"
} else {
$parentFolder = Split-Path -Path $OutputPath -Parent
if (-not (Test-Path -Path $parentFolder)) {
throw "Output folder does not exist: $parentFolder"
}
}

$transcriptPath = $OutputPath -replace '\.csv$', '-Transcript.log'
Start-Transcript -Path $transcriptPath

Write-Host "Starting old file report generation..." -ForegroundColor Cyan
Write-Host "Age threshold: $DaysOld days" -ForegroundColor Cyan
Write-Host "Output path: $OutputPath" -ForegroundColor Cyan

Write-Verbose "Ensuring CLI for Microsoft 365 login..."
m365 login --ensure
if ($LASTEXITCODE -ne 0) {
Stop-Transcript
throw "Failed to authenticate with CLI for Microsoft 365."
}

$cutoffDate = (Get-Date).AddDays(-$DaysOld)
$cutoffDateString = $cutoffDate.ToString("yyyy-MM-ddTHH:mm:ssZ")
Write-Verbose "Cutoff date: $cutoffDateString"

$script:ReportCollection = [System.Collections.Generic.List[object]]::new()
$script:Summary = @{
SitesProcessed = 0
LibrariesProcessed = 0
OldFilesFound = 0
Failures = 0
}
}

process {
try {
$sitesToProcess = @()

if ($TenantAdminUrl) {
Write-Host "Retrieving sites from tenant..." -ForegroundColor Cyan
Write-Verbose "Building site list command..."

$siteListArgs = @('spo', 'site', 'list', '--output', 'json')
if ($IncludeOneDrive) {
Write-Verbose "Including OneDrive sites"
$siteListArgs += '--withOneDriveSites'
} else {
Write-Verbose "Filtering for SharePoint sites only"
$siteListArgs += '--filter'
$siteListArgs += "Url -like '/sites/'"
}

$sitesJson = m365 @siteListArgs 2>&1
if ($LASTEXITCODE -ne 0) {
Write-Warning "Failed to retrieve sites. CLI: $sitesJson"
$script:Summary.Failures++
return
}

$sitesToProcess = @($sitesJson | ConvertFrom-Json)
Write-Host "Found $($sitesToProcess.Count) sites to scan" -ForegroundColor Green
} else {
Write-Verbose "Using single site: $SiteUrl"
$sitesToProcess = @(@{ Url = $SiteUrl; Title = "" })
}

$siteCounter = 0
foreach ($site in $sitesToProcess) {
$siteCounter++
$siteUrl = $site.Url
$siteTitle = if ($site.Title) { $site.Title } else { $siteUrl }

Write-Progress -Activity "Processing sites" -Status "Site $siteCounter of $($sitesToProcess.Count): $siteTitle" -PercentComplete (($siteCounter / $sitesToProcess.Count) * 100)
Write-Verbose "Processing site: $siteUrl"

try {
$librariesToProcess = @()

if ($LibraryName) {
Write-Verbose "Using specific library: $LibraryName"
$libraryListJson = m365 spo list list --webUrl $siteUrl --filter "Title eq '$LibraryName' and BaseTemplate eq 101 and Hidden eq false" --properties "Title,RootFolder/ServerRelativeUrl" --output json 2>&1
if ($LASTEXITCODE -ne 0) {
Write-Warning "Failed to retrieve library '$LibraryName' from site '$siteUrl'. CLI: $libraryListJson"
$script:Summary.Failures++
continue
}
$librariesToProcess = @($libraryListJson | ConvertFrom-Json)
} else {
Write-Verbose "Retrieving all document libraries..."
$libraryListJson = m365 spo list list --webUrl $siteUrl --filter "BaseTemplate eq 101 and Hidden eq false" --properties "Title,RootFolder/ServerRelativeUrl" --output json 2>&1
if ($LASTEXITCODE -ne 0) {
Write-Warning "Failed to retrieve libraries from site '$siteUrl'. CLI: $libraryListJson"
$script:Summary.Failures++
continue
}
$librariesToProcess = @($libraryListJson | ConvertFrom-Json)
}

if ($librariesToProcess.Count -eq 0) {
Write-Verbose "No document libraries found in site '$siteUrl'"
continue
}

$script:Summary.SitesProcessed++

foreach ($library in $librariesToProcess) {
$libraryTitle = $library.Title
$folderPath = if ($FolderUrl) { $FolderUrl } else { $library.RootFolder.ServerRelativeUrl }

Write-Verbose "Scanning library: $libraryTitle (Folder: $folderPath)"

try {
$fileListArgs = @(
'spo', 'file', 'list',
'--webUrl', $siteUrl,
'--folderUrl', $folderPath,
'--fields', 'Name,ServerRelativeUrl,TimeLastModified,TimeCreated,Length,ListItemAllFields/Author,ListItemAllFields/Editor',
'--filter', "TimeLastModified lt datetime'$cutoffDateString'",
'--output', 'json'
)
if ($Recursive) {
$fileListArgs += '--recursive'
}

$filesJson = m365 @fileListArgs 2>&1
if ($LASTEXITCODE -ne 0) {
Write-Warning "Failed to retrieve files from library '$libraryTitle' in site '$siteUrl'. CLI: $filesJson"
$script:Summary.Failures++
continue
}

$files = @($filesJson | ConvertFrom-Json)
Write-Verbose "Found $($files.Count) old files in library '$libraryTitle'"

foreach ($file in $files) {
$lastModified = [DateTime]::Parse($file.TimeLastModified)
$daysOldValue = [Math]::Round((Get-Date).Subtract($lastModified).TotalDays)

$script:ReportCollection.Add([PSCustomObject]@{
SiteUrl = $siteUrl
SiteTitle = $siteTitle
LibraryTitle = $libraryTitle
FileName = $file.Name
FilePath = $file.ServerRelativeUrl
LastModified = $file.TimeLastModified
Created = $file.TimeCreated
SizeBytes = $file.Length
Author = if ($file.ListItemAllFields.Author) { $file.ListItemAllFields.Author.LookupValue } else { "N/A" }
Editor = if ($file.ListItemAllFields.Editor) { $file.ListItemAllFields.Editor.LookupValue } else { "N/A" }
DaysOld = $daysOldValue
})
$script:Summary.OldFilesFound++
}

$script:Summary.LibrariesProcessed++
} catch {
Write-Warning "Error scanning library '$libraryTitle' in site '$siteUrl': $_"
$script:Summary.Failures++
continue
}
}
} catch {
Write-Warning "Error processing site '$siteUrl': $_"
$script:Summary.Failures++
continue
}
}
} catch {
Write-Warning "Unexpected error during processing: $_"
$script:Summary.Failures++
}
}

end {
Write-Progress -Activity "Processing sites" -Completed

if ($script:ReportCollection.Count -gt 0) {
Write-Host "Exporting report to CSV..." -ForegroundColor Cyan
$script:ReportCollection | Sort-Object SiteUrl, LibraryTitle, LastModified | Export-Csv -Path $OutputPath -NoTypeInformation -Encoding UTF8
Write-Host "Report exported: $OutputPath" -ForegroundColor Green
} else {
Write-Host "No old files found matching criteria." -ForegroundColor Yellow
}

Write-Host "`n===== Summary =====" -ForegroundColor Cyan
Write-Host "Sites processed: $($script:Summary.SitesProcessed)" -ForegroundColor White
Write-Host "Libraries processed: $($script:Summary.LibrariesProcessed)" -ForegroundColor White
Write-Host "Old files found: $($script:Summary.OldFilesFound)" -ForegroundColor White
if ($script:Summary.Failures -gt 0) {
Write-Host "Failures: $($script:Summary.Failures)" -ForegroundColor Red
} else {
Write-Host "Failures: 0" -ForegroundColor Green
}
Write-Host "==================`n" -ForegroundColor Cyan

Stop-Transcript
}

# Scans all SharePoint sites for files older than 3 years
# .\Report-OldFiles.ps1 -TenantAdminUrl "https://contoso-admin.sharepoint.com" -DaysOld 1095

# Scans all SharePoint AND OneDrive sites for files older than 4 years
# .\Report-OldFiles.ps1 -TenantAdminUrl "https://contoso-admin.sharepoint.com" -IncludeOneDrive

# Scans all libraries in a specific site recursively
# .\Report-OldFiles.ps1 -SiteUrl "https://contoso.sharepoint.com/sites/project" -Recursive

# Scans a specific folder in a specific library with verbose output
# .\Report-OldFiles.ps1 -SiteUrl "https://contoso.sharepoint.com/sites/project" -LibraryName "Documents" -FolderUrl "/sites/project/Documents/Archive" -DaysOld 730 -Recursive -Verbose

```


[!INCLUDE [More about CLI for Microsoft 365](../../docfx/includes/MORE-CLIM365.md)]
***

## Contributors

| Author(s) |
|-----------|
| Nick Brattoli|
| Adam Wójcik|


[!INCLUDE [DISCLAIMER](../../docfx/includes/DISCLAIMER.md)]
Expand Down
Loading