Query and Ingest JSONL Files in Fabric Data Warehouse and SQL Endpoint
Microsoft Fabric Blog details how OPENROWSET in Fabric Data Warehouse and SQL Endpoint allows querying and ingesting JSONL files, streamlining data analysis and loading workflows.
Query and Ingest JSONL Files in Fabric Data Warehouse and SQL Endpoint
The Microsoft Fabric Blog introduces the general availability of JSONL (JSON Lines) file support in Fabric Data Warehouse and SQL Endpoint for Lakehouse environments. This enhancement utilizes the OPENROWSET
T-SQL function, empowering users to read, query, and ingest JSONL files as table-like sources without manual transformations or parsing.
Key Capabilities
- OPENROWSET for JSONL: Seamlessly query JSONL files—such as logs, social media streams, ML datasets, and configuration files—using familiar T-SQL syntax.
- Row-wise Mapping: Each JSON object in a JSONL file maps to a table row, with all properties exposed as distinct columns.
- Schema Flexibility: The
WITH
clause allows users to define schemas and extract nested properties, flattening even complex or deeply nested data structures. - Tool Integration: Query JSONL data using Fabric Query editor, T-SQL Notebook, or SQL tools like SSMS.
Example Usage
SELECT * FROM OPENROWSET(BULK '/Files/samples/jsonl/farmers-protest-tweets-2021-2-4.jsonl')
- Nested or complex JSON structures can be flattened and selectively extracted for analysis.
Data Ingestion Workflows
- Use
CREATE TABLE AS SELECT (CTAS)
orINSERT SELECT
statements to load JSONL data into Fabric Data Warehouse tables. - Example:
INSERT INTO OpenRowsetDW.dbo.Tweets
SELECT * FROM OPENROWSET(BULK '/Files/jsonl/farmers-protest-2022-12-04.jsonl')
- Enables both initial data loads and ongoing automation for refreshing datasets with new semi-structured data.
Conclusion
JSONL support through the OPENROWSET function extends the capability of Fabric Data Warehouse and SQL Endpoint for Lakehouse, facilitating enterprise-scale analytics on diverse semi-structured data. This update streamlines previously-complex pipelines—making it easier to explore, transform, and ingest data from varied sources using familiar SQL constructs.
This post appeared first on “Microsoft Fabric Blog”. Read the entire article here