Introduction
Structured data is one of the most important parts of modern SEO. Search engines rely heavily on schema markup to understand content, identify the relationships between entities, and deliver a seamless search experience. But implementing and maintaining schema manually is often difficult, especially for organizations that manage large content libraries.
This is where auto-generated schema becomes a game changer.
If you are not familiar with the term, auto-generated schema is the process of automatically generating structured data from content that already exists in a CMS. The CMS generates and updates the schema whenever content is published, so editors do not need to maintain JSON-LD by hand.
For teams using Umbraco as a headless CMS, auto-generated schema removes manual SEO maintenance while ensuring every blog post carries accurate structured data at all times.
How Techxot helped
The Techxot team recently implemented auto-generated schema for a client on Umbraco, synchronized every time content is published. Let’s take a deep dive into how the project was implemented.
The Issues with Manual Schema Management
Even though the majority of organizations understand the importance of schema markup, keeping it updated is a big challenge.
Structured data is not visible to website visitors, so editors often forget to maintain it. A schema that is added once at an early stage becomes outdated as the content evolves.
Manual schema management has to deal with issues like:
Authors change
Featured images get replaced
Schema becomes stale
URLs are modified
Manual schema does not scale
Content structures evolve
Schema and content fall out of sync over time when the schema is not updated. If an organization publishes hundreds of articles, maintaining structured data manually will not scale.
The practical solution to this problem is automation.
The Headless Architecture We Worked With
The headless architecture is built with:
Umbraco 17.4 LTS
.NET 10
Next.js frontend
Vercel hosting
Umbraco Content Delivery API
Umbraco Cloud
When we started, we made an important observation: every piece of information required to create a BlogPosting schema already existed in the CMS:
Blog title
Description
Author
Featured image
Publication date
URL
So instead of adding more SEO fields, we decided to generate the schema automatically every time a blog post was published.
How To Leverage Umbraco’s Event-Driven Architecture
One of Umbraco's biggest strengths is its notification system. When content is created, updated, published, or deleted, Umbraco raises a notification that developers can intercept.
We used the following interface for our implementation:
    INotificationHandler<ContentPublishedNotification>
This handler is registered from a composer:
    builder.AddNotificationHandler<
        ContentPublishedNotification,
        BlogSchemaGeneratorHandler>();
This gave us a reliable way to execute custom code immediately after content was published.
What Publishing Workflow Did We Follow?
The schema generation follows a simple, straightforward sequence:
Editors click Save & Publish
Umbraco initiates the database transaction
The ContentPublishedNotification is raised
The handler identifies blog content
The BlogPosting JSON-LD is generated
A background task is started
The publish transaction commits
The schema is written back to the database
The result is a completely automated publishing workflow in which schema generation is part of the content lifecycle.
How To Implement Auto-Generated Schema in Umbraco
How To Build the Notification Handler
The handler listens for Umbraco's ContentPublishedNotification event so that custom logic runs automatically every time content is published. It loops over the published entities and processes only the blog content type.
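    using Microsoft.Extensions.Logging;
    using Umbraco.Cms.Core.Events;
    using Umbraco.Cms.Core.Notifications;
    using Umbraco.Cms.Core.Services;
    using Umbraco.Cms.Infrastructure.Scoping;

    public class BlogSchemaGeneratorHandler
        : INotificationHandler<ContentPublishedNotification>
    {
        private readonly IScopeProvider _scopeProvider;
        private readonly IContentService _contentService;
        private readonly ILogger<BlogSchemaGeneratorHandler> _logger;
        private const string BlogContentTypeAlias = "blog";
        private const string SiteBaseUrl = "https://yoursite.com";
        private const string Culture = "en-US";

        // Dependencies are supplied by Umbraco's dependency injection container.
        public BlogSchemaGeneratorHandler(IScopeProvider scopeProvider,
            IContentService contentService, ILogger<BlogSchemaGeneratorHandler> logger)
        {
            _scopeProvider = scopeProvider;
            _contentService = contentService;
            _logger = logger;
        }

        public void Handle(ContentPublishedNotification notification)
        {
            foreach (var content in notification.PublishedEntities)
            {
                // Only blog content gets a BlogPosting schema.
                if (!content.ContentType.Alias.Equals(BlogContentTypeAlias,
                        StringComparison.OrdinalIgnoreCase))
                    continue;
                var schema = BuildBlogPostingSchema(content);
                if (schema == null) continue;
                var nodeId = content.Id;
                var nodeKey = content.Key;
                // Write in the background after a short delay so that the
                // publish transaction can commit first.
                _ = Task.Run(async () =>
                {
                    try
                    {
                        await Task.Delay(1000);
                        SaveSchemaToDb(nodeId, nodeKey, schema);
                    }
                    catch (Exception ex)
                    {
                        _logger.LogError(ex, "Failed to save BlogPosting schema for node {NodeId}", nodeId);
                    }
                });
            }
        }
    }
This handler ensures that schema is generated only for blog pages and does not affect other content types in the CMS.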
Understanding What Auto-Generated Schema in Umbraco Is
Auto-generated schema for Umbraco takes properties that already exist on the document type, such as title, description, author, image, and publish date, and automatically converts them into valid BlogPosting JSON-LD. Because the schema is created from existing content, duplicate data entry is avoided and the structured data stays synchronized.
How To Resolve Content Picker References While Building Schema
The next important implementation detail concerns content pickers. The author information is stored as a UDI reference:
    umb://document/…
The UDI alone is not enough for schema generation, since search engines expect readable author information.
To resolve this reference, developers use:
    IContentService
This service resolves the UDI to the actual content node, which allows the schema generation to retrieve properties such as the author's name and metadata.
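As a rough illustration of the first half of that lookup, the sketch below pulls the GUID key out of a document UDI string using only the .NET base library. The class and method names (UdiSketch, TryGetDocumentKey) are hypothetical and not part of the original implementation, and the sketch assumes the UDI has the form umb://document/ followed by a GUID.

```csharp
using System;

public static class UdiSketch
{
    private const string DocumentPrefix = "umb://document/";

    // Extracts the GUID key from a document UDI string.
    // Returns false for null, empty, non-document, or malformed input.
    public static bool TryGetDocumentKey(string? udi, out Guid key)
    {
        key = Guid.Empty;
        if (string.IsNullOrWhiteSpace(udi))
            return false;

        var trimmed = udi.Trim();
        if (!trimmed.StartsWith(DocumentPrefix, StringComparison.OrdinalIgnoreCase))
            return false;

        // Guid.TryParse accepts both the 32-digit and the hyphenated form.
        return Guid.TryParse(trimmed.Substring(DocumentPrefix.Length), out key);
    }
}
```

Once the key is known, it can be passed to IContentService.GetById to load the author node and read its properties. Umbraco also ships its own UDI parsing types, which are likely the better choice in production code; the manual parsing here only shows what the reference contains.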
An example of building the schema:
    private string? BuildBlogPostingSchema(IContent content)
    {
        var headline = content.Name ?? "";
        var description = GetSplitHeroValue(content, "details") ?? "";
        var publishedOn = ExtractPublishDate(content);
        var (authorName, authorLinkedIn) = ExtractAuthor(content);
        var imageUrl = ExtractImageUrl(content);
        // Assumed slug rule: lower-case the node name and replace spaces with hyphens.
        var slug = (content.Name ?? "").ToLowerInvariant().Replace(" ", "-");
        var fullUrl = $"{SiteBaseUrl}/blogs/{slug}";
        var schema = new Dictionary<string, object?>
        {
            ["@context"] = "https://schema.org",
            ["@type"] = "BlogPosting",
            ["mainEntityOfPage"] = new Dictionary<string, object?>
            {
                ["@type"] = "WebPage",
                ["@id"] = fullUrl
            },
            ["headline"] = headline,
            ["description"] = description,
            ["image"] = imageUrl,
            ["author"] = new Dictionary<string, object?>
            {
                ["@type"] = "Person",
                ["name"] = authorName,
                ["url"] = authorLinkedIn
            },
            ["publisher"] = new Dictionary<string, object?>
            {
                ["@type"] = "Organization",
                ["name"] = "Your Company Name",
                ["logo"] = new Dictionary<string, object?>
                {
                    ["@type"] = "ImageObject",
                    ["url"] = $"{SiteBaseUrl}/logo.svg"
                },
            },
            ["datePublished"] = publishedOn.ToString("yyyy-MM-dd"),
            ["dateModified"] = content.UpdateDate.ToString("yyyy-MM-dd")
        };
        return JsonSerializer.Serialize(schema);
    }
The objective is to generate and update the schema in the CMS automatically, without editor intervention, every time a blog post is published.
How To Merge with Existing Schema
If a blog page already contains schema such as FAQPage or BreadcrumbList, this implementation replaces only the outdated BlogPosting schema and preserves all other schema types, keeping existing SEO enhancements intact.
    private static string MergeSchema(string? existingValue, string blogPostingJson)
    {
        // The freshly generated BlogPosting always goes first.
        var schemas = new List<string>
        {
            blogPostingJson
        };
        if (!string.IsNullOrWhiteSpace(existingValue))
        {
            // The stored value may be a JSON-encoded string that wraps the array.
            var unwrapped = existingValue.TrimStart().StartsWith("\"")
                ? JsonSerializer.Deserialize<string>(existingValue)
                : existingValue;
            if (!string.IsNullOrEmpty(unwrapped))
            {
                // Assumes the stored schema is a JSON array of schema objects.
                using var doc = JsonDocument.Parse(unwrapped);
                foreach (var el in doc.RootElement.EnumerateArray())
                {
                    if (el.TryGetProperty("@type", out var t) && t.GetString() == "BlogPosting")
                        continue; // replace old BlogPosting
                    schemas.Add(el.GetRawText()); // keep FAQPage etc.
                }
            }
        }
        return "[" + string.Join(",", schemas) + "]";
    }
Working With Raw Block Editor Data When Writing to the Database
The most significant implementation challenge we faced was that, during publishing, IContent stores block editor values as raw JSON rather than the expanded API representation. This means the development team cannot access block properties in the form in which they appear in the Content Delivery API.
The implementation must traverse the raw contentData structures to retrieve the required information manually.
This is crucial when retrieving:
Author information
Nested block values
Featured images
Rich content properties
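To make the traversal concrete, here is a minimal sketch that searches raw block JSON for a property value by alias. It is not the original GetSplitHeroValue helper: the names RawBlockSketch and FindBlockValue are hypothetical, and the JSON shape (a contentData array whose items hold a values array of alias/value entries) is an assumption that should be checked against the raw value your Umbraco version actually stores.

```csharp
using System;
using System.Text.Json;

public static class RawBlockSketch
{
    // Returns the first string value stored under the given property alias
    // in the raw block JSON, or null when nothing matches.
    public static string? FindBlockValue(string? rawBlockJson, string alias)
    {
        if (string.IsNullOrWhiteSpace(rawBlockJson))
            return null;

        using var doc = JsonDocument.Parse(rawBlockJson);
        var root = doc.RootElement;

        // Assumed shape: { "contentData": [ { "values": [ { "alias": ..., "value": ... } ] } ] }
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("contentData", out var contentData) ||
            contentData.ValueKind != JsonValueKind.Array)
            return null;

        foreach (var block in contentData.EnumerateArray())
        {
            if (block.ValueKind != JsonValueKind.Object ||
                !block.TryGetProperty("values", out var values) ||
                values.ValueKind != JsonValueKind.Array)
                continue;

            foreach (var entry in values.EnumerateArray())
            {
                if (entry.ValueKind == JsonValueKind.Object &&
                    entry.TryGetProperty("alias", out var a) &&
                    a.ValueKind == JsonValueKind.String &&
                    string.Equals(a.GetString(), alias, StringComparison.OrdinalIgnoreCase) &&
                    entry.TryGetProperty("value", out var v) &&
                    v.ValueKind == JsonValueKind.String)
                {
                    return v.GetString();
                }
            }
        }
        return null;
    }
}
```

The value-kind checks matter because JsonElement.TryGetProperty throws when called on anything other than a JSON object, and raw editor data is not guaranteed to have a uniform shape.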
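    private void SaveSchemaToDb(int nodeId, Guid nodeKey, string blogPostingJson)
    {
        // [YourSchemaTable] is a placeholder, and the NodeId and Culture column
        // names are assumptions; adjust both to match your own table.
        using var scope = _scopeProvider.CreateScope();
        var db = scope.Database;
        var rowExists = db.ExecuteScalar<int>(
            "SELECT COUNT(1) FROM [YourSchemaTable] WHERE NodeKey = @0 AND Alias = @1 AND Culture = @2",
            nodeKey,
            "schema",
            Culture);
        var existingValue = rowExists > 0
            ? db.ExecuteScalar<string>(
                "SELECT UserValue FROM [YourSchemaTable] WHERE NodeKey = @0 AND Alias = @1 AND Culture = @2",
                nodeKey,
                "schema",
                Culture)
            : null;
        // Replace any old BlogPosting while keeping the other schema types.
        var merged = MergeSchema(existingValue, blogPostingJson);
        // The merged array is stored as a JSON-encoded string, which
        // MergeSchema unwraps again on the next publish.
        var storedValue = JsonSerializer.Serialize(merged);
        if (rowExists > 0)
        {
            db.Execute(
                "UPDATE [YourSchemaTable] SET UserValue = @0 WHERE NodeKey = @1 AND Alias = @2 AND Culture = @3",
                storedValue,
                nodeKey,
                "schema",
                Culture);
        }
        else
        {
            db.Execute(
                "INSERT INTO [YourSchemaTable] (NodeId, NodeKey, Alias, Culture, UserValue) VALUES (@0, @1, @2, @3, @4)",
                nodeId,
                nodeKey,
                "schema",
                Culture,
                storedValue);
        }
        scope.Complete();
    }
Building automated schema this way depends on understanding how Umbraco stores raw block data during publishing.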
Registering the Handler
Registering the notification handler tells Umbraco to invoke the schema generation logic when content is published. This one-time registration integrates automated schema generation into the CMS.
    public class SchemaComposer : IComposer
    {
        public void Compose(IUmbracoBuilder builder)
        {
            builder.AddNotificationHandler<
                ContentPublishedNotification,
                BlogSchemaGeneratorHandler>();
        }
    }
How the Auto-Generated Schema Handles Different Publishing Scenarios
| Existing Schema State | System Action |
|---|---|
| No schema exists | Creates and inserts a new BlogPosting schema. |
| Schema exists but is empty or null | Updates the entry with a new BlogPosting schema. |
| Only FAQPage or other schemas exist | Preserves existing schemas and adds BlogPosting. |
| BlogPosting already exists | Replaces it with a newly generated BlogPosting schema. |
| BlogPosting exists alongside FAQPage, BreadcrumbList, or other schemas | Replaces only BlogPosting and retains all other schemas. |
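The scenarios in the table can be checked with a small standalone sketch of the merge rule. This is a simplified stand-in for the handler's merge step, not the production code: MergeSketch, Merge, and Types are hypothetical names, and the sketch assumes the existing value is already an unwrapped JSON array.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class MergeSketch
{
    // Puts the fresh BlogPosting first, then keeps every existing entry
    // whose @type is not BlogPosting.
    public static string Merge(string? existingArrayJson, string blogPostingJson)
    {
        var schemas = new List<string> { blogPostingJson };

        if (!string.IsNullOrWhiteSpace(existingArrayJson))
        {
            using var doc = JsonDocument.Parse(existingArrayJson);
            if (doc.RootElement.ValueKind == JsonValueKind.Array)
            {
                foreach (var el in doc.RootElement.EnumerateArray())
                {
                    var isBlogPosting =
                        el.ValueKind == JsonValueKind.Object &&
                        el.TryGetProperty("@type", out var t) &&
                        t.ValueKind == JsonValueKind.String &&
                        t.GetString() == "BlogPosting";
                    if (!isBlogPosting)
                        schemas.Add(el.GetRawText());
                }
            }
        }
        return "[" + string.Join(",", schemas) + "]";
    }

    // Lists the @type of each entry in a merged array, in order.
    public static List<string?> Types(string arrayJson)
    {
        var types = new List<string?>();
        using var doc = JsonDocument.Parse(arrayJson);
        foreach (var el in doc.RootElement.EnumerateArray())
        {
            types.Add(el.ValueKind == JsonValueKind.Object &&
                      el.TryGetProperty("@type", out var t) &&
                      t.ValueKind == JsonValueKind.String
                ? t.GetString()
                : null);
        }
        return types;
    }
}
```

For example, calling Merge with an existing array that holds an old BlogPosting and a FAQPage should yield an array whose types, as reported by Types, are BlogPosting followed by FAQPage, while calling it with null or an empty string yields an array containing only the new BlogPosting.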
Why Auto-Generated Schema Works in Umbraco
Here are some of the key benefits that explain why auto-generated schema works in Umbraco:
Zero manual schema maintenance
Consistent structured data across all blog posts
Better eligibility for Google rich results
Reduced editor workload
Lower risk of human error
Easy scalability, even for large content libraries
Automatic schema synchronization after content is updated
[Example: auto-generated schema in Umbraco for a newly published Techxot blog post]
Conclusion
By integrating schema generation into the Umbraco publishing workflow, we eliminated manual schema maintenance and ensured that every published blog post receives accurate, up-to-date schema. The result is a scalable implementation that keeps structured data synchronized with content changes while reducing editor effort and strengthening SEO.
At Techxot, we help enterprises build intelligent, automation-led digital experiences on Umbraco, turning complex problems into seamless capabilities embedded directly in modern headless CMS architectures.




