Converting Page or Append Blobs to Block Blobs with Azure Data Factory
In this article, SaikumarMandepudi explains how to use Azure Data Factory to convert page or append blobs into block blobs, enabling access tier changes and storage cost optimization.
Introduction
Converting page or append blobs to block blobs can be necessary when optimizing storage costs in Azure. Certain blob types, like page or append blobs, cannot be directly moved to the archive access tier—only block blobs support access tier functionality. This article outlines how to convert page or append blobs into block blobs using Azure Data Factory (ADF), after which any standard method can be used to transition them to the archive tier.
Problem Context
- Some storage accounts have many infrequently accessed page blobs in the hot tier—often kept solely for backup purposes.
- Only block blobs can have their access tier changed in Azure Blob Storage (see documentation).
Azure Data Factory Solution
- The Azure Blob Storage connector in ADF supports copying data from block, append, or page blobs, and copying to block blobs (see ADF connector docs).
- No special configuration is required—the ADF copy activity will create destination blobs as block blobs by default.
Step-by-Step Guide
Step 1: Create Azure Data Factory (ADF) Instance
- In the Azure Portal, create a new Azure Data Factory resource using the quickstart guide.
- After creation, launch the ADF Studio UI.
Step 2: Create Datasets
- Navigate to Author > Datasets > New dataset.
- Select Azure Blob Storage as the data store, then choose the Binary format so blobs are copied as-is, byte for byte, without parsing.
- Create one dataset for the source (the storage account containing page or append blobs) and another for the destination.
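A Binary dataset of this kind can also be viewed in ADF's JSON editor. The sketch below is illustrative only: the dataset, linked service, container, and folder names are placeholders, and the linked service it references is configured in the next step.

```json
{
  "name": "SourcePageBlobs",
  "properties": {
    "type": "Binary",
    "linkedServiceName": {
      "referenceName": "SourceBlobStorageLS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "backups",
        "folderPath": "vm-disks"
      }
    }
  }
}
```

The destination dataset has the same shape, pointing at the target container through its own linked service.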
Step 3: Create Linked Services
- Create a new linked service in ADF that points to the storage account containing the source blobs. Each dataset from Step 2 must reference a linked service; ADF prompts you to create one during dataset setup if it does not already exist.
- In the source dataset, set the container and file path of the page (or append) blobs to convert.
- Create a corresponding linked service for the destination storage account (this can be the same account or a different one, as required) and point the destination dataset at it.
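In JSON form, an Azure Blob Storage linked service looks roughly like the sketch below. The name and connection string are placeholders; key-based authentication is shown for brevity, but SAS, service principal, or managed identity authentication can be used instead.

```json
{
  "name": "SourceBlobStorageLS",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
    }
  }
}
```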
Step 4: Configure the Copy Data Pipeline
- Create a new pipeline in ADF.
- From Move and Transform, drag and drop the Copy data activity.
- Assign the previously created source and destination datasets.
- On the Source tab, select the “Recursively” option if you want to include blobs in subfolders.
- Adjust file filters and copy behavior as required to suit your scenario, then publish your changes.
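The published pipeline corresponds to a JSON definition along these lines. This is a minimal sketch assuming the placeholder dataset names used earlier; it shows the Binary source and sink store settings, including the recursive read option.

```json
{
  "name": "ConvertToBlockBlobPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyPageBlobsAsBlockBlobs",
        "type": "Copy",
        "inputs": [
          { "referenceName": "SourcePageBlobs", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "DestinationBlockBlobs", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": {
            "type": "BinarySource",
            "storeSettings": {
              "type": "AzureBlobStorageReadSettings",
              "recursive": true
            }
          },
          "sink": {
            "type": "BinarySink",
            "storeSettings": {
              "type": "AzureBlobStorageWriteSettings"
            }
          }
        }
      }
    ]
  }
}
```

Because the sink writes through the Azure Blob Storage connector, the copied blobs are created as block blobs regardless of the source blob type.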
Step 5: Debug and Validate
- Run the pipeline in debug mode.
- If the run succeeds, the Output pane shows a “Succeeded” status.
- In the destination storage account, verify that the copied blobs now show “Block blob” as the blob type; their access tier will be the account’s default access tier (typically Hot).
Next Steps: Changing Access Tier
Once blobs are converted to block blobs, you can change their access tier to Archive using methods such as:
- Azure Blob Lifecycle Management (LCM) policies
- Azure Storage Actions (storage tasks)
- Azure CLI or PowerShell scripts
See lifecycle management docs or bulk archive docs for references.
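For the lifecycle management option, a policy rule along the following lines would archive the converted block blobs. This is a sketch: the rule name and prefix are placeholders, and daysAfterModificationGreaterThan should be set to match your retention needs. Note that lifecycle policies only act on block blobs, which is why the conversion step is required first.

```json
{
  "rules": [
    {
      "name": "archiveConvertedBackups",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "backups/" ]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 0 }
          }
        }
      }
    }
  ]
}
```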
Conclusion
Using Azure Data Factory provides a streamlined approach to convert page or append blobs into block blobs, after which standard tools and policies can be used to transition the access tier and optimize storage costs. This approach is more efficient than developing custom scripts or utilities.
References
- Access tiers for blob data
- ADF Azure Blob Storage connector
- ADF Copy Activity Overview
- Azure Storage task quickstart
- Lifecycle management overview
- Bulk change to archive tier
This post appeared first on “Microsoft Tech Community”.