Semantic Kernel Python 1.34: Major Vector Store Overhaul Simplifies AI Workflows
By Eduard van Valkenburg
Semantic Kernel Python’s vector store implementation has received a comprehensive overhaul in version 1.34, greatly simplifying and enriching the AI development experience. This update introduces a consolidated API, automatic embedding generation, improved filtering, streamlined data operations, and better connector experience—all aimed at empowering efficient, modern AI workflows.
What Makes This Release Special?
The new vector store architecture consolidates all functionality under `semantic_kernel.data.vector` and introduces three core improvements:
- Simplified API: A single unified field model, `VectorStoreField`, replaces multiple complex field types, making configuration clearer and more maintainable.
- Integrated Embeddings: Embeddings can now be generated automatically wherever needed, eliminating manual steps.
- Enhanced Features: Support for advanced filtering, hybrid search, and consistent, streamlined operations.
Unified Field Model – Simplified Configuration
- Old Approach: Multiple field classes (e.g., `VectorStoreRecordKeyField`, `VectorStoreRecordDataField`, `VectorStoreRecordVectorField`) required verbose, error-prone configuration.
- New Approach: The versatile `VectorStoreField` class covers all field types, resulting in cleaner code and better IDE support.
```python
# Old Way
from semantic_kernel.data import (
    VectorStoreRecordKeyField, VectorStoreRecordDataField, VectorStoreRecordVectorField
)

fields = [
    VectorStoreRecordKeyField(name="id"),
    VectorStoreRecordDataField(name="text", is_filterable=True, is_full_text_searchable=True),
    VectorStoreRecordVectorField(name="vector", dimensions=1536, distance_function="cosine"),
]

# New Way
from semantic_kernel.data.vector import VectorStoreField
from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding

embedding_service = OpenAITextEmbedding(ai_model_id="text-embedding-3-small")

fields = [
    VectorStoreField("key", name="id"),
    VectorStoreField("data", name="text", is_indexed=True, is_full_text_indexed=True),
    VectorStoreField(
        "vector",
        name="vector",
        dimensions=1536,
        distance_function="cosine",
        embedding_generator=embedding_service,
    ),
]
```
Integrated Embeddings – Automatic Generation
With the new architecture, embedding generation is defined directly in your field declarations for data classes, leading to:
- Automatic, consistent embedding creation
- Ability to combine multiple fields for rich vector representations
```python
from dataclasses import dataclass
from typing import Annotated

from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding
from semantic_kernel.data.vector import VectorStoreField, vectorstoremodel

@vectorstoremodel
@dataclass
class MyRecord:
    content: Annotated[str, VectorStoreField("data", is_indexed=True, is_full_text_indexed=True)]
    title: Annotated[str, VectorStoreField("data", is_indexed=True, is_full_text_indexed=True)]
    id: Annotated[str, VectorStoreField("key")]
    vector: Annotated[
        list[float] | str | None,
        VectorStoreField(
            "vector",
            dimensions=1536,
            distance_function="cosine",
            embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
        ),
    ] = None

    def __post_init__(self):
        if self.vector is None:
            # Combine fields for richer embeddings
            self.vector = f"Title: {self.title}, Content: {self.content}"
```
Lambda-Powered Filtering – Type-Safe and Expressive
Filtering functionality has moved away from string-based constructs to type-safe lambda expressions. This greatly improves readability, safety, and IDE support.
```python
# Old (string-based)
from semantic_kernel.data.text_search import SearchFilter

text_filter = SearchFilter()
text_filter.equal_to("category", "AI")
text_filter.equal_to("status", "active")

# New (lambda-based)
results = await collection.search(
    "query text",
    filter=lambda record: record.category == "AI" and record.status == "active",
)

# Complex filtering
results = await collection.search(
    "machine learning concepts",
    filter=lambda record: (
        record.category == "AI"
        and record.score > 0.8
        and "important" in record.tags
        and 0.5 <= record.confidence_score <= 0.9
    ),
)
```
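Because the filter is an ordinary Python callable, its matching semantics are easy to reason about in isolation. The sketch below (plain Python, no Semantic Kernel) applies the same kind of predicate to an in-memory list; connectors typically translate the lambda into their native filter syntax rather than calling it per record, but the records it selects are the same:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    category: str
    score: float
    tags: list[str]

docs = [
    Doc("AI", 0.9, ["important"]),
    Doc("AI", 0.5, []),
    Doc("Web", 0.95, ["important"]),
]

# The same predicate shape passed to collection.search(filter=...)
predicate = lambda d: d.category == "AI" and d.score > 0.8 and "important" in d.tags

matches = [d for d in docs if predicate(d)]
print([(d.category, d.score) for d in matches])  # [('AI', 0.9)]
```

Your IDE can type-check and autocomplete the attribute names inside the lambda against your record class, which is exactly what string-based filters could not offer.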
Streamlined Operations – Consistent Interface
A unified API surface means the same methods are used for both single and batch operations, as well as flexible retrieval and search:
```python
from semantic_kernel.connectors.in_memory import InMemoryCollection

collection = InMemoryCollection(
    record_type=MyRecord,
    embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
)

# Upsert records (same method for single and batch)
await collection.upsert(single_record)
await collection.upsert([record1, record2, record3])

# Retrieval
await collection.get(["id1", "id2"])
await collection.get(top=10, skip=0, order_by="title")

# Search
results = await collection.search("find AI articles", top=10)
results = await collection.hybrid_search("machine learning", top=10)
```
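The single-or-batch `upsert` convenience is a small dispatch pattern. Here is a hypothetical plain-Python sketch of the idea (not the actual Semantic Kernel implementation): one method normalizes its argument into a batch, so callers never choose between two method names:

```python
class TinyStore:
    """Hypothetical sketch: one upsert method accepts a record or a list of records."""

    def __init__(self):
        self.records = {}

    def upsert(self, records):
        # Normalize a single record into a one-element batch
        batch = records if isinstance(records, list) else [records]
        for record in batch:
            self.records[record["id"]] = record
        return [r["id"] for r in batch]

store = TinyStore()
store.upsert({"id": "1", "text": "hello"})                          # single
store.upsert([{"id": "2", "text": "a"}, {"id": "3", "text": "b"}])  # batch
print(sorted(store.records))  # ['1', '2', '3']
```

The same normalize-then-iterate shape is what lets `get`, `upsert`, and `delete` expose one consistent signature across every connector.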
Instant Search Functions – Simplified Creation
Creating and registering search functions in the kernel can now be accomplished directly on the collection object:
```python
# Old
from semantic_kernel.data import VectorStoreTextSearch

collection = InMemoryCollection(collection_name="collection", record_type=MyRecord)
search = VectorStoreTextSearch.from_vectorized_search(
    vectorized_search=collection,
    embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
)
search_function = search.create_search(function_name="search")

# New
search_function = collection.create_search_function(
    function_name="search",
    search_type="vector",  # or "keyword_hybrid"
    top=10,
    vector_property_name="vector",
)
kernel.add_function(plugin_name="memory", function=search_function)
```
Enhanced Data Model Expressiveness
Data models are more capable than before, supporting:
- Rich metadata fields
- Multiple vectors for different embedding strategies
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Annotated

@vectorstoremodel(collection_name="documents")
@dataclass
class DocumentRecord:
    id: Annotated[str, VectorStoreField("key")]
    title: Annotated[str, VectorStoreField("data", is_indexed=True, is_full_text_indexed=True)]
    content: Annotated[str, VectorStoreField("data", is_full_text_indexed=True)]
    category: Annotated[str, VectorStoreField("data", is_indexed=True)]
    tags: Annotated[list[str], VectorStoreField("data", is_indexed=True)]
    created_date: Annotated[datetime, VectorStoreField("data", is_indexed=True)]
    confidence_score: Annotated[float, VectorStoreField("data", is_indexed=True)]

    # Multiple vectors for different purposes
    content_vector: Annotated[
        list[float] | str | None,
        VectorStoreField(
            "vector",
            dimensions=1536,
            storage_name="content_embedding",
            embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
        ),
    ] = None
    title_vector: Annotated[
        list[float] | str | None,
        VectorStoreField(
            "vector",
            dimensions=1536,
            storage_name="title_embedding",
            embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
        ),
    ] = None

    def __post_init__(self):
        if self.content_vector is None:
            self.content_vector = self.content
        if self.title_vector is None:
            self.title_vector = self.title
```
Better Connector Experience
Connectors and imports are logically reorganized for clarity and convenience. Connector stores include Azure AI Search, Chroma, Pinecone, and Qdrant. Lazy loading is supported for easy imports:
```python
from semantic_kernel.connectors.azure_ai_search import AzureAISearchStore
from semantic_kernel.connectors.chroma import ChromaVectorStore
from semantic_kernel.connectors.pinecone import PineconeVectorStore
from semantic_kernel.connectors.qdrant import QdrantVectorStore

# Or lazy-load all from one module
from semantic_kernel.connectors.memory import (
    AzureAISearchStore,
    ChromaVectorStore,
    PineconeVectorStore,
    QdrantVectorStore,
)
```
Real-World Example: Complete Implementation
A clear, end-to-end example using the new design pattern:
```python
from dataclasses import dataclass
from typing import Annotated

from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding
from semantic_kernel.connectors.in_memory import InMemoryCollection
from semantic_kernel.data.vector import VectorStoreField, vectorstoremodel

@vectorstoremodel(collection_name="knowledge_base")
@dataclass
class KnowledgeBase:
    id: Annotated[str, VectorStoreField("key")]
    content: Annotated[str, VectorStoreField("data", is_full_text_indexed=True)]
    category: Annotated[str, VectorStoreField("data", is_indexed=True)]
    vector: Annotated[
        list[float] | str | None,
        VectorStoreField(
            "vector",
            dimensions=1536,
            embedding_generator=OpenAITextEmbedding(ai_model_id="text-embedding-3-small"),
        ),
    ] = None

    def __post_init__(self):
        if self.vector is None:
            self.vector = self.content

# Usage with automatic embedding
docs = [
    KnowledgeBase(id="1", content="Semantic Kernel is awesome", category="general"),
    KnowledgeBase(id="2", content="Python makes AI development easy", category="programming"),
]

async with InMemoryCollection(record_type=KnowledgeBase) as collection:
    await collection.ensure_collection_exists()
    await collection.upsert(docs)

    results = await collection.search(
        "AI development", top=5, filter=lambda doc: doc.category == "programming"
    )

    search_func = collection.create_search_function("knowledge_search", search_type="vector")
    kernel.add_function(plugin_name="kb", function=search_func)
```
What This Means for Your Projects
- Faster Development: Eliminate boilerplate and focus on AI logic
- Better Maintainability: Concise, understandable, modifiable code
- Enhanced Performance: Built-in optimizations and batch processing
- Future-Proof: Aligned with .NET SDK for consistent, cross-platform development
- Richer Functionality: Hybrid search, advanced filtering, integrated embeddings
Migration and Deprecations
- Deprecated: the `MemoryStore` abstractions and implementations, `SemanticTextMemory`, and the `TextMemoryPlugin`
- Deprecated connectors have moved to `semantic_kernel.connectors.memory_stores`, with full removal planned for August
- Migration guide: Learn more
Conclusion
Semantic Kernel Python 1.34 marks a substantial advancement for AI application development. The revamped vector store system unifies the experience, delivers efficient and expressive APIs, enhances maintainability, and prepares developers for future growth. Developers are encouraged to upgrade, consult the migration guide, and experiment with the new, streamlined workflows.
This post appeared first on “Microsoft DevBlog”. Read the entire article here