Adjust dependencies in Readiness probe

It is important to understand the Liveness and Readiness probes when you run Sitecore on Kubernetes. The Kubernetes documentation provides more detail on these two probes:

  • Liveness: Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy.
  • Readiness: Indicates whether the container is ready to respond to requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod.

In Sitecore these probes can be found under /healthz/live and /healthz/ready. The rest of this post will focus on the Readiness probe. There is a great article by Vitalii Tylyk which discusses these probes at a high level. By default, a Sitecore XP install checks a variety of xDB services as well as Solr as part of this probe. If these services are not all healthy, the probe will fail and no requests will be sent to this pod.

It is important to understand this default behavior in the context of your Sitecore solution. If having xDB and Solr up is critical to a solution, this default behavior does not need to be changed. The risk with this setup is that all pods can be pulled from the load balancer when there is an issue with Solr or xDB, and the site will then be completely down. If these services are not critical to the Sitecore site, they can be removed from the readiness probe.

The patch file below shows how to remove all these checks so the readiness probe still returns healthy even if Solr and xDB are completely down. In many real-world scenarios only a subset of these will have to be removed, for example removing the xDB services but leaving Solr because the solution has a critical dependency on it.

<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <services>
      <configurator type="Sitecore.ContentSearch.SolrProvider.DependencyInjection.ContentSearchServicesConfigurator, Sitecore.ContentSearch.SolrProvider">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectCollectionHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectConfigurationHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectSearchHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Reporting.Service.Http.XConnectClient.XdbReportingWebClient, Sitecore.Reporting.Service.Http.XConnectClient],[Sitecore.Reporting.Service.Http.Abstractions.Routes, Sitecore.Reporting.Service.Http.Abstractions]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>      
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReadOnlyReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.ReportingClient.AutomationReportingClient, Sitecore.Xdb.MarketingAutomation.ReportingClient],[Sitecore.Xdb.MarketingAutomation.ReportingClient.ReportingRoutes, Sitecore.Xdb.MarketingAutomation.ReportingClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.OperationsClient.AutomationOperationsClient, Sitecore.Xdb.MarketingAutomation.OperationsClient],[Sitecore.Xdb.MarketingAutomation.OperationsClient.OperationRoutes, Sitecore.Xdb.MarketingAutomation.OperationsClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>      
    </services>
  </sitecore>
</configuration>
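
As a sketch of the partial scenario mentioned above, the patch below removes only the xDB-related checks and leaves the Solr configurator in place, so a Solr outage still takes the pod out of rotation. The type names are copied from the full patch above; which of them are actually registered is an assumption that depends on your Sitecore version and role, so it is worth comparing against the resolved configuration of your own instance first.

```xml
<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <services>
      <!-- The Solr configurator is intentionally not listed, so the Solr check stays active -->
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectCollectionHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectConfigurationHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectSearchHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Reporting.Service.Http.XConnectClient.XdbReportingWebClient, Sitecore.Reporting.Service.Http.XConnectClient],[Sitecore.Reporting.Service.Http.Abstractions.Routes, Sitecore.Reporting.Service.Http.Abstractions]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReadOnlyReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.ReportingClient.AutomationReportingClient, Sitecore.Xdb.MarketingAutomation.ReportingClient],[Sitecore.Xdb.MarketingAutomation.ReportingClient.ReportingRoutes, Sitecore.Xdb.MarketingAutomation.ReportingClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.OperationsClient.AutomationOperationsClient, Sitecore.Xdb.MarketingAutomation.OperationsClient],[Sitecore.Xdb.MarketingAutomation.OperationsClient.OperationRoutes, Sitecore.Xdb.MarketingAutomation.OperationsClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
    </services>
  </sitecore>
</configuration>
```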

Fix NullReferenceException in CompositeDataProvider

Sitecore 10 comes with a new data provider which merges items from disk and the database. Jeremy Davis already wrote a great article about this. Sitecore is using this new approach by deploying all their items through an Item Resource file.

TDS now also supports creating item resource files, which makes it convenient to deploy your own items this way too. There are a few uncommon scenarios where this will result in an uncaught NullReferenceException, for example when there is an item without a version. Many areas of Sitecore and your custom solution will most likely be broken if this happens. The stack trace will look something like this:

FATAL Uncaught application error 
Exception: System.NullReferenceException 
Message: Object reference not set to an instance of an object. 
Source: Sitecore.Kernel 
   at Sitecore.Data.DataProviders.CompositeDataProvider.GetItemVersions(ItemDefinition itemDefinition, CallContext context) 
   at Sitecore.Data.DataProviders.DataProvider.GetItemVersions(ItemDefinition item, CallContext context, DataProviderCollection providers) 
   at Sitecore.Data.DataSource.LoadVersions(ItemDefinition definition, Language language) 
   at Sitecore.Data.DataSource.GetVersions(ItemInformation itemInformation, Language language) 
   at Sitecore.Data.DataSource.GetLatestVersion(ItemInformation itemInformation, Language language) 
   at Sitecore.Data.DataSource.GetItemData(ID itemID, Language language, Version version) 
   at Sitecore.Nexus.Data.DataCommands.GetItemCommand.GetItem(ID itemId, Language language, Version version, Database database) 

One way to solve this is to identify all the items which cause the issue and fix each of them. Another way is to override the CompositeDataProvider and wrap a try/catch block around this logic. The advantage of this approach is that the site keeps working when such an item is deployed again in the future. The following code can be used to catch the exception:

using Sitecore.Collections;
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.DataProviders;
using Sitecore.Data.DataProviders.ReadOnly;
using Sitecore.Diagnostics;
using System;
using System.Collections.Generic;

namespace Foundation.Providers
{
    public class SafeCompositeDataProvider : CompositeDataProvider
    {
        public SafeCompositeDataProvider(IEnumerable<ReadOnlyDataProvider> readOnlyDataProviders, DataProvider headProvider) : base(readOnlyDataProviders, headProvider) { }

        public SafeCompositeDataProvider(ObjectList readOnlyDataProviders, DataProvider headProvider) : base(readOnlyDataProviders, headProvider) { }

        public override VersionUriList GetItemVersions(ItemDefinition itemDefinition, CallContext context)
        {
            try
            {
                return base.GetItemVersions(itemDefinition, context);
            }
            catch (Exception ex)
            {
                // Log which item failed so it can still be fixed at the source.
                // The null-conditional operators avoid a second exception inside the catch block.
                Log.Error($"SafeCompositeDataProvider: Caught exception for item {itemDefinition?.ID} {itemDefinition?.Name}", ex, this);
                return null;
            }
        }
    }
}

This new provider can be patched in through a configuration file like this:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore>
    <databases >
      <!-- target to CM or standalone, no master database in CD-->
      <database id="master" role:require="ContentManagement or StandAlone">
        <dataProviders >
          <dataProvider >
            <patch:attribute name="type">Foundation.Providers.SafeCompositeDataProvider, Foundation.Providers</patch:attribute>
          </dataProvider>
        </dataProviders>
      </database>
    </databases>
  </sitecore>
</configuration>

Fix Custom Cache after Sitecore upgrade

Sitecore allows you to create custom caches which can be managed just like all other Sitecore caches. Brian Caos wrote a great blog post about this which can be found here. Recently Sitecore has made some changes to this, and if you are upgrading Sitecore you will notice that the code no longer compiles. For example, the code below:

 Cache cache = CacheManager.FindCacheByName(cacheName);

will now throw a compile error:

Error CS0411 The type arguments for method 'CacheManager.FindCacheByName<T>(string)' cannot be inferred from the usage. Try specifying the type arguments explicitly. 

Furthermore, this now returns an object of type Sitecore.Caching.Generics.ICache<TKey>. My first attempt at fixing this was to pass the type of my custom cache as the generic argument. This fixes the compile error, but the code then throws an exception at runtime:

Exception: System.InvalidCastException 
Message: Unable to cast object of type 'Sitecore.Caching.Cache' to type 'Sitecore.Caching.Generics.ICache`1[MyCustomCache]'.
Source: Sitecore.Kernel
   at Sitecore.Caching.DefaultCacheManager.FindCacheByName[TKey](String name)

The generic argument is the type of the cache key, not the type of the cache. In most cases this can just be set to string. To fix this, code changes are required in 2 places.

1. Custom cache object

In the custom cache, make sure the base class comes from Sitecore.Caching.Generics and specify string as the key type, e.g.:

 //updated from Sitecore.Caching
using Sitecore.Caching.Generics;

//Specify the generic type of the key
public class MyCustomCache : CustomCache<string> 

2. Custom Cache Clearer

In the method which clears the cache, make sure to pass the same generic argument for the key, see the commented line:

private void DoClear()
{
    foreach (string cacheName in Caches)
    {
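        // Updated to pass the key type. FindCacheByName<TKey> returns ICache<TKey>,
        // so the variable can no longer be declared as Sitecore.Caching.Cache.
        var cache = CacheManager.FindCacheByName<string>(cacheName);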
        if (cache == null)
            continue;
        Log.Info(this + ". Clearing " + cache.Count + " items from " + cacheName, this);
        cache.Clear();
    }
}

Prevent page error when Datasource is missing after Upgrade

If you have upgraded Sitecore and along with it GlassMapper to version 5 there might be an issue when the Datasource item is not available in the target database, which will throw an error and the page will not load. Prior to the upgrade this would not have been an issue and the invalid Datasource was simply ignored. When this happens you will see an error in the log like this:

Exception: System.InvalidOperationException 
Message: The model item passed into the dictionary is of type 'Sitecore.Mvc.Presentation.RenderingModel', but this dictionary requires a model item of type 'DatasourceNamespace.DatasourceType'.
Source: System.Web.Mvc
   at System.Web.Mvc.ViewDataDictionary`1.SetModel(Object value)
   at System.Web.Mvc.ViewDataDictionary..ctor(ViewDataDictionary dictionary)
   at System.Web.Mvc.WebViewPage`1.SetViewData(ViewDataDictionary viewData)
   at System.Web.Mvc.RazorView.RenderView(ViewContext viewContext, TextWriter writer, Object instance)
   at System.Web.Mvc.Html.PartialExtensions.Partial(HtmlHelper htmlHelper, String partialViewName, Object model, ViewDataDictionary viewData)
   at Sitecore.Mvc.Presentation.ViewRenderer.Render(TextWriter writer) 

Solution

This can be fixed by checking if the Datasource exists and clearing it in case it doesn’t. The RenderRendering processor is a good place to insert this logic:

using Sitecore.Mvc.Pipelines.Response.RenderRendering; 

namespace Custom.Pipelines.RenderRendering
{
    public class ClearInvalidDatasource : RenderRenderingProcessor
    {
        public override void Process(RenderRenderingArgs args)
        {
            var rendering = args.Rendering;

            if (!string.IsNullOrWhiteSpace(rendering.DataSource) && Sitecore.Context.Database.Items.GetItem(rendering.DataSource) == null)
            {
                rendering.DataSource = string.Empty;
            }
        }
    }
}

This customization can be enabled with the following patch file:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/"> 
  <sitecore>
    <pipelines>
      <mvc.renderRendering>
        <processor patch:before="*[@type='Sitecore.Mvc.Pipelines.Response.RenderRendering.EnterRenderingContext, Sitecore.Mvc']" type="Custom.Pipelines.RenderRendering.ClearInvalidDatasource, Project.Common" />
      </mvc.renderRendering>
    </pipelines>
  </sitecore>
</configuration>

Now that this change is in place the behavior will be identical to the behavior prior to the upgrade.

Execute Ajax calls concurrently without session locking

Large Sitecore applications can be bogged down by pressure on the session state store. This can also be an issue in regular ASP.NET sites, and this Stack Overflow article helps to explain the issue and provides several solutions. Sitecore provides options in this great KB article. In a previous article I wrote about a solution when session locking is slowing down MVC pages.

Another, perhaps even more common, scenario is when session locking is slowing down multiple Ajax calls. Sitecore’s previously mentioned KB article recommends not sending multiple Ajax calls back to Sitecore. This is good advice, and in general it is a good practice to combine Ajax calls as much as possible. However, there are often scenarios where this can be challenging, for example when there are multiple independent components on a page and they each need to load data through Ajax. Before diving into the solution, let’s look at the issue with multiple Ajax calls.

Ajax calls and session locking

The issue with multiple Ajax calls is that requests which share the same session are executed one at a time. In this article let’s look at the simple controller method below, which will just wait for 10 seconds and then return:

public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

This Sitecore article describes how to register a custom route. The browser's network panel shows the following when a page makes 3 concurrent Ajax requests to this method:

  • Each call starts at the same time
  • The first call completes after 10 seconds
  • The second call completes after 20 seconds, because it had to wait for the first call to finish before it could start
  • The third call completes only after 30 seconds, because it had to wait for the second call to complete before it could start

As can be seen, the final Ajax call only completes after 30 seconds. This locking mechanism will have a significant impact on the page loading performance, and also puts a lot of stress on the session store as each locked request polls the session store frequently to see if it can start executing.

Reducing the locking of Ajax calls

Before unblocking the call it is important to understand what the relationship is between the Ajax call and the session in your solution. There are 3 possibilities which will be discussed separately.

  1. Ajax calls do not use session
  2. Ajax calls read from the session, but do not update it
  3. Ajax calls update/write to the session

The sections below discuss how to address each scenario.

1. Ajax calls do not use session

This is the easiest scenario. If an action does not use session state, then the controller should disable the session state as can be seen below. In this case the Session object will be null. This will make the action not be subject to any session locking.

[SessionState(SessionStateBehavior.Disabled)]
public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

When the same page calls this method, no request is blocked anymore and they all execute at the same time.

2. Ajax calls read from the session, but do not update it

Sometimes an action method needs access to the session, but does not need to update it. In this case the session state behavior should be set to read-only, as can be seen in the code below:

[SessionState(SessionStateBehavior.ReadOnly)]
public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

There are a couple of important things to note about readonly session state:

  • The code will have access to the session, and can update it. However, any change to the session will be lost after the request.
  • Multiple requests in readonly mode will execute at the same time.
  • A request in readonly mode still needs a read lock on the session, and cannot execute at the same time as a request which has read and write access to the session, i.e. SessionStateBehavior.Required. For example, if one Ajax call is set up in a controller with SessionStateBehavior.Required, then any other Ajax calls with SessionStateBehavior.ReadOnly will still have to wait.

3. Ajax calls update/write to the session

This is the most complicated scenario. As far as I know there is no way for Ajax calls to run concurrently, and each update the session, even if you would address the concurrency issues in the application.

A different approach needs to be used since there is no option to execute the requests at the same time while also updating the session. Here are some approaches which might work:

  • Only update session state on initial page load, and use Ajax calls to retrieve additional information which should not be stored in session
  • Have a single Ajax call do all the updates to the session
  • Store any updates somewhere on the client, for example local storage, and update the session at a later point
  • Store data server side in some other store outside of session, and have the application manage concurrency

My Ajax calls are always fast anyway, should I not worry about this?

NO! Each Ajax call will still need to obtain a lock, and a waiting request only checks for it every 500ms. Consider the method below, which returns immediately:

public JsonResult ServiceCall()
{
    return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
}

With the default session locking, the same network panel shows that it still takes more than a second before the third call is complete.

Compare this to the network panel for the same code without locking. In this case all Ajax calls return in under 40 ms!

Conclusion

This article showed the detrimental effect the session locking mechanism can have on multiple Ajax calls. Multiple techniques were described to significantly mitigate or eliminate this issue. It is important to remember to try to avoid using session as much as possible, as it comes with scalability and availability concerns. Consider using more advanced and extensible options instead of session if a significant amount of user data needs to be stored server side. The session locking mechanism is on by default. Being able to disable the session can result in large performance improvements.

Deploying Sitecore with Terraform part 2

This is the second part of my article about deploying Sitecore through Terraform. The first part can be found here and discusses Terraform features and explains why Terraform is a good fit for deploying Sitecore.

This article will walk through my sample GitHub solution which deploys a Sitecore site in Azure through Terraform. The graph generated by Terraform’s graph command shows the components which will be created.

Clone Github Repo

This article uses the code from the GitHub repo which can be found here. This repo contains a variables.tf file. The first section of this file needs to be populated with settings specific to each deployment. In this article these values will be set in a file called secret.tfvars, which is excluded from GitHub. The variables can be set as follows:

StorageKey = "your value"
LicenseFile = "your value"
BacpacCoreDB = "your value"
BacpacMasterDB = "your value"
BacpacWebDB = "your value"
VCppPackage = "your value"
SitecoreZip = "your value"
DomainNameLabel = "your value"
xDbDisableFile = "your value"

The last setting is xDbDisableFile. This patch file disables xDB, as this repo stands up a Sitecore XM topology without xDB. The following is in this file, but it can also be used to patch in other settings.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore>
    <settings>      
      <setting name="Xdb.Enabled">
        <patch:attribute name="value" value="false" />
      </setting>
      <setting name="Xdb.Tracking.Enabled">
        <patch:attribute name="value" value="false" />
      </setting>
    </settings>
  </sitecore>
</configuration>

After the initial setup this article will follow the standard 4-step Terraform workflow: init, plan, apply and destroy.

Terraform Init

After getting the repo, Terraform needs to initialize the working directory and download the Azure Resource Manager provider. Running terraform init will take care of this.

Terraform Plan

The plan command can be run now that the workspace is initialized. This will show all the resources which will be created, in this case 11 resources will be created:

  1. Network Interface
  2. Public IP
  3. Resource Group
  4. Sitecore Core Database
  5. Sitecore Master Database
  6. Sitecore Web Database
  7. Firewall rule for SQL traffic
  8. Azure SQL DB server
  9. Subnet
  10. Virtual Machine
  11. Virtual Network

As mentioned before, the values of some of the variables are specified in secret.tfvars. This file is passed into the plan command: terraform plan -var-file="secret.tfvars". Below is the start of the output from this command:

Terraform Apply

The apply command can be run now after the output from the plan command is reviewed. This will create the 11 resources from the plan command which make up the Sitecore site.

Inside the files folder in the repo is a winrm.ps1 script. This will be put on the virtual machine which is created by Terraform and will set up Sitecore. This file uses some variables which are passed in by the main.tf file here.

The apply command will also need the Terraform variables file, so the full apply command will look like this: terraform apply -var-file="secret.tfvars". This will create the resources and return a success message when it is done:

The Sitecore site can be opened when this is done and when the winrm.ps1 PowerShell script is complete. By default it will be publicly accessible at <domain-name-label>.<azure-region>.cloudapp.azure.com, see below for the example with this setup:

Terraform Destroy

When done all resources can be torn down again by running terraform destroy. This is obviously an optional step, however it is important to continue to make any changes to resources through Terraform to ensure the code stays up to date and the deployment process is repeatable, predictable and automated.

Sitecore URL and domain setup options

In Sitecore solutions there are often requirements around the URL structure or domain on which the site will live. This is often driven by SEO and perhaps the Sitecore site even needs to share the same domain with a non-Sitecore site. This article will look at several common options and will explain when they should be used.

Below flowchart shows which option to choose depending on your needs. None of the options are mutually exclusive and they can be combined to meet multiple requirements:

Option 1: Reverse proxy/application gateway

In some cases requirements are more complex, for example your company’s Sitecore site needs to be hosted at mycompany.com but completely different servers need to host mycompany.com/careers. This might not even be an IIS site but a site hosted on Apache or Nginx.

There are several ways to do this depending on the infrastructure which is in place. If Sitecore is hosted in the cloud here are some options which can route the traffic accordingly to prevent it from even hitting IIS

In an on-premise setup there might be similar infrastructure which can do this routing. IIS can also do this routing through the Application Request Routing (ARR) extension. In general routing this traffic away before it hits IIS is the preferred solution as it reduces the load on IIS.

Below is a sample reverse proxy rule in ARR which will route all traffic under /Careers to http://internalcareersite. This is all achieved through the routing and the site will be exposed over mycompany.com/careers:

<rewrite>
    <rules>
        <rule name="Reverse Proxy to Careers" stopProcessing="true">
            <match url="^Careers/(.*)" />
            <action type="Rewrite" url="http://internalcareersite/{R:1}" />
        </rule>        
    </rules>
</rewrite>

Option 2: IgnoreUrlPrefixes setting

The IgnoreUrlPrefixes setting tells Sitecore which paths it should not handle. This setting comes in handy if there is another site which needs to be hosted on the same domain, and the site can be hosted in the same IIS site as Sitecore. This is often an easy solution which can be helpful in many circumstances.

It can be challenging to update this setting in a clean way as it already contains a number of Sitecore's own entries. This article provides a good solution to only patch in your solution specific values.
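
For reference, a plain patch of this setting could look roughly like the sketch below. It replaces the whole pipe-separated value, which is exactly why it is hard to keep clean: the existing entries have to be copied from your own instance. EXISTING_VALUE_FROM_YOUR_INSTANCE is a placeholder and /careers is a hypothetical path, neither comes from a real configuration.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <setting name="IgnoreUrlPrefixes">
        <!-- Placeholder: paste the current value of the setting here, then append
             your own prefix after a pipe. /careers is only an example path. -->
        <patch:attribute name="value">EXISTING_VALUE_FROM_YOUR_INSTANCE|/careers</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Because the default value differs between Sitecore versions, a patch like this needs to be reviewed with every upgrade.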

URL Rewrite/Redirect

Before diving into the next few options let’s be clear on the difference between a redirect and a rewrite:

  • Redirect: a redirect happens when the server responds to the client/browser and tells it to load a page or resource from a different location. The client can then retrieve the resource from the location provided by the server. For example, you can have a vanity URL at mycompany.com/mycampaign which redirects to mycompany.com/some/long/path/mycampaign
  • Rewrite: a rewrite is when IIS modifies an incoming request before it hands it off to its handlers (inbound rewrite) or modifies an outgoing response before it sends it to the client (outbound rewrite). In this case there are no additional requests/responses like there are in the case of a redirect.
    An example of this could be to rewrite all the incoming requests starting with mycompany.com/english to mycompany.com/en-us. In this case the URL in the browser would be mycompany.com/english, but to Sitecore it would appear as if the request came in at mycompany.com/en-us, as IIS modified it before handing it off to Sitecore. These techniques can be powerful in a variety of situations, for example to have a different URL structure than what Sitecore supports OOTB with languages and sites.

There are a variety of Sitecore modules which can help with redirects which are managed in Sitecore, SXA also has a redirect module. Also IIS has a redirect/rewrite extension which is already a prerequisite for Sitecore installations.

Option 3: Rewrite through IIS Rewrite module

Sometimes it can be challenging to meet certain URL requirements with Sitecore without making any customizations, for example there might be a need to have the language in the URL as /english or /spanish etc. Implementing this in Sitecore can be complex as it requires customizations to processor(s) in the httpRequestBegin pipeline as well as to the logic which generates links. With the inbound and outbound rewrite rules below, the URL will show as /english, but to Sitecore it appears as /en-us.

An important consideration when doing this is that in Sitecore the links will still show as /en-us, for example in the analytics.

Inbound rule: to make sure /english and anything after it gets sent to /en-us on the same path, e.g. /english/products will go to /en-us/products:

<rewrite>
	<rules>
		<rule name="English to en-us">
			<match url="^english/(.*)" />
			<action type="Rewrite" url="en-us/{R:1}" />
		</rule>
	</rules>
</rewrite>

Outbound rule: to make sure any link in an anchor text gets “en-us” replaced with “english”. Depending on your setup more rules might be required to update the links in other places:

<rewrite>            
	<outboundRules>
		<rule name="Update en-us to english in anchor">
			<match filterByTags="A" pattern="^/en-us/(.*)" />
			<action type="Rewrite" value="/english/{R:1}" />
		</rule>
	</outboundRules>
</rewrite>
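
As one possible extension of the rule above, the sketch below also rewrites form actions and only touches HTML responses. The rule and precondition names are made up for this example, and whether forms need rewriting in your solution is an assumption.

```xml
<rewrite>
	<outboundRules>
		<!-- Same replacement as above, applied to anchors and forms, HTML responses only -->
		<rule name="Update en-us to english in links and forms" preCondition="IsHtmlResponse">
			<match filterByTags="A, Form" pattern="^/en-us/(.*)" />
			<action type="Rewrite" value="/english/{R:1}" />
		</rule>
		<preConditions>
			<!-- Skip images, JSON and other non-HTML content -->
			<preCondition name="IsHtmlResponse">
				<add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
			</preCondition>
		</preConditions>
	</outboundRules>
</rewrite>
```

Limiting the rule with a precondition keeps IIS from scanning responses which cannot contain links anyway.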

Option 4: Redirect through IIS Rewrite module

Some redirects rarely change and therefore are not managed by content authors, for example a redirect from http to https, to lowercase URLs or to enforce a trailing slash. In such cases it makes sense to have the redirects in IIS, as it is acceptable to update these only with a deployment. Another advantage of doing this through IIS is that the request never hits Sitecore; it gets redirected straight from IIS, which is more efficient. Below is a sample IIS redirect which redirects from http to https:

<rule name="Redirect to HTTPS" patternSyntax="Wildcard" stopProcessing="true">
	<match url="*" ignoreCase="true" negate="false" />
	<conditions logicalGrouping="MatchAny" trackAllCaptures="false">
	  <add input="{HTTPS}" ignoreCase="true" matchType="Pattern" negate="false" pattern="OFF" />
	</conditions>
	<action type="Redirect" url="https://{HTTP_HOST}{REQUEST_URI}" appendQueryString="false" redirectType="Permanent" />
</rule>
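
The lowercase redirect mentioned above can be sketched in the same way. The rule name is arbitrary, and excluding everything under /sitecore/ is an assumption made here to keep the redirect away from the Sitecore client; adjust the condition to the paths of your own solution.

```xml
<rule name="Redirect to lowercase" stopProcessing="true">
	<!-- Only match URLs which contain at least one uppercase character -->
	<match url="[A-Z]" ignoreCase="false" />
	<conditions>
		<!-- Assumption: leave the Sitecore client paths untouched -->
		<add input="{URL}" pattern="^/sitecore/" negate="true" />
	</conditions>
	<action type="Redirect" url="{ToLower:{URL}}" redirectType="Permanent" />
</rule>
```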

Option 5: Custom Sitecore processors

Sometimes there is a need to do something similar to a rewrite (option 3), but it cannot be done with the rewrite module because it requires information from the Sitecore context. In this case perhaps the OOTB item resolution logic does not resolve the desired item or a different context language or site needs to be set. Before getting into more detail here are some common use cases:

  • Sitecore’s language or site resolution logic does not meet the requirements. For example the language does not need to come from the URL but from the Sitecore IP Geolocation service.
  • A Single Page Application (SPA) is used which uses push-state. In this case it is common to want to deep link into these push-state routes. For example, the SPA could live at mycompany.com/myspa, with myspa being the Sitecore item. There could be a route like /myspa/filter/category1. Sitecore would try to resolve an item under that path, but it is not there so it would return a 404. In this case the Sitecore ItemResolver can be customized to still resolve the correct item. A more detailed approach for this can be found here.

In these cases the correct Sitecore processor needs to be updated. Below are the 3 most common processors which need to be updated. They are all part of the httpRequestBegin pipeline:

  • Sitecore.Pipelines.HttpRequest.SiteResolver: as the name suggests this processor resolves the site from the context. This processor serves as a good starting point to adjust the resolution of the context site.
  • Sitecore.Pipelines.HttpRequest.LanguageResolver: the language resolver determines the correct language based on the context. The default Sitecore language resolution can put some constraints on the URL structure as the language needs to match a culture. Often business wants user friendly language in the URL like English, Spanish, French etc. In these cases customizing the language resolver can be a good option.
  • Sitecore.Pipelines.HttpRequest.ItemResolver: this resolves the context item in Sitecore and might be the most customized processor in Sitecore to solve a variety of different issues. There are even some popular modules that customize this like the wildcard module

Updating Sitecore processors can be tricky as they can also run outside the scope of your solution, for example when a user logs into Sitecore CMS (/sitecore/login). If possible it is a good idea to keep the existing logic as is, but run additional code in case the existing resolution fails because the custom solution requirements are different.
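
One way to keep the existing logic intact is to leave the stock processor in place and patch an additional processor in right after it, so the custom code only has to act when the default resolution came up empty. Below is a minimal sketch of such a patch file. The type MyCompany.Pipelines.SpaItemResolver and the assembly MyCompany are hypothetical names used for illustration only; the class itself would check whether a context item was already resolved before doing anything.

```xml
<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <!-- Hypothetical custom processor; runs after the stock ItemResolver
             so the default resolution logic is left untouched. -->
        <processor type="MyCompany.Pipelines.SpaItemResolver, MyCompany"
                   patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']" />
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>
```

The same pattern applies to the SiteResolver and LanguageResolver: only the type in the patch:after expression changes.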

Updating these processors will ensure Sitecore handles incoming requests correctly. However, the links Sitecore generates to other pages will not take this logic into account. To ensure these links work correctly there are 2 options:

  • Put in a redirect to send the links to the correct location
  • Override Sitecore’s logic to generate links by putting in a custom link provider or a custom renderField pipeline processor

Option 6: Sitecore Redirect Module

In many cases the content authors need to manage redirects along with other content updates they make in Sitecore. In these cases it is recommended to manage the redirects in Sitecore content as well. There are a variety of Sitecore modules which can help with redirects and SXA also has a redirect module.

Check Redis connectivity inside heartbeat.aspx health check

Getting your health check right is important when designing a highly available and elastic Sitecore solution. For years Sitecore has come with a built-in health check page at /sitecore/service/heartbeat.aspx which checks the status of the SQL databases. This can be used for a load balancer or Docker health check. A few quick notes on the heartbeat.aspx:

  • In some versions of Sitecore the heartbeat.aspx will throw an error, and you will have to exclude some connection strings from it as described in a different article on this blog
  • Starting in Sitecore 9.3 a new health check mechanism is used based on the Microsoft.Extensions.Diagnostics.HealthChecks namespace. Here is a great article describing how to customize this. The same code from below can be used in the updated health check mechanism.

There are several different approaches when setting up a health check in Sitecore. In most cases I recommend keeping the health check small to prevent it from going unhealthy during heavy load. This technique can be combined with the Application Initialization feature in IIS to warm up the solution after the site starts.
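
For reference, Application Initialization is configured in the system.webServer section of web.config. Below is a minimal sketch; the page to request ("/") and the host name mycompany.com are placeholders, and a real solution would list whichever pages need to be warm before the server takes traffic.

```xml
<configuration>
  <system.webServer>
    <!-- Requests the listed pages when the application starts,
         including after an app pool recycle. -->
    <applicationInitialization doAppInitAfterRestart="true">
      <!-- Placeholder page and host name; adjust to the solution. -->
      <add initializationPage="/" hostName="mycompany.com" />
    </applicationInitialization>
  </system.webServer>
</configuration>
```

This assumes the Application Initialization module is installed in IIS. It is typically paired with an app pool start mode of AlwaysRunning and preload enabled on the site, which are set at the IIS level rather than in web.config.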

The code for the heartbeat.aspx lives in Sitecore.Web.Services.HeartbeatCode in the Sitecore.Client assembly. The important methods are virtual, so they can be overridden to implement additional checks to ensure all critical components of the solution are healthy.

There are many Sitecore solutions where the private session state is stored in Redis and its availability is critical. In such scenarios it makes sense to ping Redis from the health check to ensure the server can access it. The code sample below shows how to check the Redis database which is set up for private session state:

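using System.Configuration;
using System.Web.Configuration;
using Sitecore.Diagnostics;
using StackExchange.Redis;

public class CustomHeartbeat : Sitecore.Web.Services.HeartbeatCode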
{
    protected BeatResults CheckRedis(BeatResults beatresult)
    {
        //get connection details for private Redis session database
        //same pattern can be used to check shared session database
        var sessionSection = (SessionStateSection)WebConfigurationManager.GetSection("system.web/sessionState");
        var connString = sessionSection.Providers["Redis"].Parameters.Get("connectionString");
        string redisConnection = ConfigurationManager.ConnectionStrings[connString].ConnectionString;

        using (ConnectionMultiplexer connection = ConnectionMultiplexer.Connect(redisConnection))
        {
            var subscriber = connection.GetSubscriber();
            var timespan = subscriber.Ping();

            Log.Info($"Successfully pinged Redis from healthcheck in: {timespan}", this);
        }

        return beatresult;
    }

    protected override BeatResults DoBeat()
    {
        //this checks the SQL databases
        var beatResults = base.DoBeat();

        beatResults = CheckRedis(beatResults);

        return beatResults;
    }
}
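
The CheckRedis method above depends on two pieces of configuration: a session state provider named Redis, and a connection string whose name is stored in that provider's connectionString attribute. Below is a sketch of what these might look like. The connection string name redis.sessions, the host name and the attribute values are assumptions for illustration, so compare them against the actual configuration of the solution.

```xml
<configuration>
  <connectionStrings>
    <!-- Sitecore normally keeps this in App_Config/ConnectionStrings.config.
         Name and host are placeholders. -->
    <add name="redis.sessions" connectionString="redis-host.example.com:6380" />
  </connectionStrings>
  <system.web>
    <sessionState mode="Custom" customProvider="Redis" timeout="20">
      <providers>
        <!-- connectionString holds the NAME of the connection string,
             which is what the health check looks up. -->
        <add name="Redis"
             type="Sitecore.SessionProvider.Redis.RedisSessionStateProvider, Sitecore.SessionProvider.Redis"
             connectionString="redis.sessions"
             pollingInterval="2"
             applicationName="private" />
      </providers>
    </sessionState>
  </system.web>
</configuration>
```

If the provider is registered under a different name, the Providers["Redis"] lookup in the code needs to use that name instead.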

Power Sitecore with CloudFront

There are already a few good blog posts which describe how to use CloudFront in combination with Sitecore. All the posts I saw used CloudFront as a solution to offload images and other static resources on a different domain, and then change Sitecore to load static resources from this different domain. Sitecore itself will not be using CloudFront. This post will describe a different approach where Sitecore sits behind CloudFront as well. It will also discuss the advantages of using this approach.

Before diving into using CloudFront for Sitecore, let's quickly look at the approach of just offloading static resources to CloudFront on a separate domain. The following flowchart shows this in more detail:

This is a good solution: it makes the site faster by delivering large resources quicker from edge locations globally, at scale and in a more cost-effective manner.

Advantages of using CloudFront with Sitecore

In the previous setup Sitecore pages are not using CloudFront, and therefore cannot take advantage of some useful CloudFront features like:

  • Security at the Edge: CloudFront provides protection against most common DDoS attacks. It can also be used to block traffic from certain countries or only allow traffic from a specific set of countries
  • SSL Offloading/compression: SSL can be terminated at CloudFront and CloudFront could gzip compress the responses. Even when not offloading SSL, SSL will be more efficient with CloudFront as it will reuse connections between itself and the origin
  • Restricting access to premium content: if you have premium content which users need to pay for then you can use signed URLs or signed cookies to limit access. This is much easier to stand up than a fully fledged IdP but still provides enterprise grade security.
  • Error Pages: Sitecore can handle most common errors better than CloudFront can. However some catastrophic errors are better handled outside of Sitecore, for example a scenario where the app pool is crashing or hanging or a scenario where Sitecore is unable to handle the incoming load. CloudFront supports customized error pages depending on what gets returned from the origin. More info can be found here.

Host Sitecore behind CloudFront overview

This article describes a different solution where everything is fronted by CloudFront, and there is only one URL. This way all CloudFront features can be leveraged for the Sitecore pages as well as all other resources. The origin in this case will be the Sitecore ELB. The diagram below explains this in more detail:

Now that Sitecore pages are also fronted by CloudFront, they could be set up to be cacheable (see number 3). I recommend never doing this with Sitecore: when CloudFront returns a page from its cache the request never reaches Sitecore, so the page view is not registered in analytics and personalization is not applied.

Setting up Sitecore behind CloudFront

This section walks through some of the important configuration when setting up Sitecore’s ELB as the origin of the CloudFront distribution.

Set Sitecore ELB as origin

When creating the CloudFront distribution the ELB which hosts Sitecore needs to be set as the origin, see below screenshot:

Allowed HTTP Methods

Since all traffic goes through CloudFront it is important to allow all HTTP verbs which your site uses, e.g. POST, PUT, DELETE etc. See below screenshot for this option in the UI.

Configure caching

There are 2 ways to control how long content stays in the CloudFront cache. More details can be found here.

  • Using headers to control cache duration for individual objects: CloudFront will honor the Cache-Control and Expires response headers to control how long objects stay in the cache. This is recommended by CloudFront and personally I also prefer this approach as everything can be controlled from the Sitecore site in IIS.
  • Specifying the amount of time that CloudFront caches objects: In this case the values for time to live are specified in CloudFront. To use this select “Use legacy cache settings” and then “Customize” under object caching, see screenshot below
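
As an illustration of the header-based approach, the web.config fragment below makes IIS send a Cache-Control max-age header on static files, which CloudFront will then honor. The 7 day duration is an arbitrary example value, not a recommendation.

```xml
<configuration>
  <system.webServer>
    <staticContent>
      <!-- Adds "Cache-Control: max-age=604800" (7 days) to static files.
           Example value only; pick a duration that fits the release cycle. -->
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="7.00:00:00" />
    </staticContent>
  </system.webServer>
</configuration>
```

Note that this only covers files served by the IIS static file handler. Responses generated by Sitecore itself, such as media library items, get their cache headers from Sitecore's own settings, so those need to be reviewed separately.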

Understanding other CloudFront settings

There are many other settings in CloudFront and using them correctly can be critical for your site to work properly and to gain maximum benefit from CloudFront. This page provides more detail about all the different settings and how they work.

Sitecore and Redis lessons learned

I noticed that my previous post about Redis is one of the most popular on my blog. Since I’ve been using Redis for a while I decided to write another post with some of the lessons learned.

Sitecore connectivity to Redis

The first step in getting Redis to work with Sitecore is to ensure there is connectivity between them. When Sitecore starts up it will ping Redis. The Sitecore log will contain something like below when connectivity to Redis is established successfully. Notice the Redis response to the Ping and the message that the endpoint returned with success.

11056 11:06:22 INFO  Sending critical tracer: Interactive/jeroen.redis.cache.windows.net:6380
11056 11:06:22 INFO  Writing to Interactive/jeroen.redis.cache.windows.net:6380: ECHO
11056 11:06:22 INFO  Flushing outbound buffer
11056 11:06:22 INFO  Starting read
11056 11:06:22 INFO  Connect complete: jeroen.redis.cache.windows.net:6380
11056 11:06:22 INFO  Response from Interactive/jeroen.redis.cache.windows.net:6380 / ECHO: BulkString: 16 bytes
WIN-RCJOA5J2MOL:Write 11:06:22 INFO  Writing to Interactive/jeroen.redis.cache.windows.net:6380: GET __Booksleeve_TieBreak
WIN-RCJOA5J2MOL:Write 11:06:22 INFO  Writing to Interactive/jeroen.redis.cache.windows.net:6380: PING
8912 11:06:22 INFO  Response from Interactive/jeroen.redis.cache.windows.net:6380 / GET __Booksleeve_TieBreak: (null)
8912 11:06:22 INFO  Response from Interactive/jeroen.redis.cache.windows.net:6380 / PING: SimpleString: PONG
1068 11:06:22 INFO  All tasks completed cleanly, IOCP: (Busy=0,Free=800,Min=800,Max=800), WORKER: (Busy=43,Free=757,Min=789,Max=800)
1068 11:06:22 INFO  jeroen.redis.cache.windows.net:6380 returned with success

There can be a variety of issues which prevent Sitecore from connecting to Redis:

  • Wrong Redis engine version: Sitecore does not work with Redis engine version 4 or 5. This is easy to get wrong especially if using AWS ElastiCache which currently defaults to version 5.0.3. When using AWS ElastiCache make sure to select version 3.2.6. This issue is not obvious from the log. When using the wrong version the log might show something like this:
INFO name.cache.amazonaws.com: 6380 failed to nominate (Faulted)
INFO > UnableToResolvePhysicalConnection on GET 33488
  • AccessKey missing in connection string: The access key might need to be put inside the connectionString value. I have blogged about this issue before, see here.
  • Intermittent timeout issues: There might be intermittent timeout issues when Sitecore is connected to Redis. This KB article provides a good start to resolve these kinds of issues. If this happens the log will show something like this:
Exception: System.TimeoutException
Message: Timeout performing EVAL, inst: 1, mgr: Inactive, err: never, queue: 24, qu: 0, qs: 24, qc: 0, wr: 0, wq: 0, in: 12544, ar: 0, IOCP: (Busy=5,Free=395,Min=200,Max=400), WORKER: (Busy=4,Free=396,Min=88,Max=400), clientName: client
Source: StackExchange.Redis.StrongName
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisDatabase.ScriptEvaluate(String script, RedisKey[] keys, RedisValue[] values, CommandFlags flags)
   at Sitecore.SessionProvider.Redis.StackExchangeClientConnection.<>c__DisplayClass12_0.<Eval>b__0()
   at Sitecore.SessionProvider.Redis.StackExchangeClientConnection.RetryForScriptNotFound(Func`1 redisOperation)
   at Sitecore.SessionProvider.Redis.StackExchangeClientConnection.RetryLogic(Func`1 redisOperation)
   at Sitecore.SessionProvider.Redis.StackExchangeClientConnection.Eval(String script, String[] keyArgs, Object[] valueArgs)
   at Sitecore.SessionProvider.Redis.RedisConnectionWrapper.TryTakeWriteLockAndGetData(String sessionId, DateTime lockTime, Object& lockId, ISessionStateItemCollection& data, Int32& sessionTimeout)
   at Sitecore.SessionProvider.Redis.RedisSessionStateProvider.GetItemFromSessionStore(Boolean isWriteLockRequired, HttpContext context, String id, Boolean& locked, TimeSpan& lockAge, Object& lockId, SessionStateActions& actions)
   at Sitecore.SessionProvider.Redis.RedisSessionStateProvider.GetItemExclusive(HttpContext context, String id, Boolean& locked, TimeSpan& lockAge, Object& lockId, SessionStateActions& actions)
   at System.Web.SessionState.SessionStateModule.GetSessionStateItem()
   at System.Web.SessionState.SessionStateModule.BeginAcquireState(Object source, EventArgs e, AsyncCallback cb, Object extraData)
   at System.Web.HttpApplication.AsyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
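
For the access key issue in the list above, the sketch below shows a connection string in the StackExchange.Redis format with the key passed as the password option. The connection string name, host name and key are placeholders. Whether every option is passed through unchanged may depend on the Sitecore version, so treat this as a starting point rather than a verified configuration.

```xml
<connectionStrings>
  <!-- Placeholder name, host and key.
       password      = the Redis access key
       ssl           = required when connecting to the TLS port (6380 in the log above)
       abortConnect  = False lets the client keep retrying instead of failing at startup -->
  <add name="redis.sessions"
       connectionString="name.redis.cache.windows.net:6380,password=YOUR_ACCESS_KEY,ssl=True,abortConnect=False" />
</connectionStrings>
```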

Designing for performance

There are many factors which impact the performance of Redis. The only way to determine the best configuration for a certain site is to perform a load test with a load that is similar to production traffic. Based on my experience I recommend exploring below options:

  • Enable Clustering: It is often more effective to create a Redis cluster with multiple instances than to increase the size of a single non clustered Redis instance. Each Redis instance can only be scaled vertically by allocating more resources to it. With a cluster Redis will create multiple instances and divide the data over the instances based on its key. This technique is also referred to as sharding and is supported by Redis, which makes it transparent to Sitecore. Therefore there are no changes needed on Sitecore’s side, it just needs to have its Redis connection string pointed to the endpoint of the cluster.
    • Important note: Sitecore is using StackExchange.Redis.StrongName to access Redis. “Move” exceptions can occur below version 1.1.603 of this library when clustering is enabled. A little more information about this issue can be found here. This link only describes the issue in Azure but the same issue can occur anywhere else as well. Per the table below, all Sitecore 9.0 versions use a version of the StackExchange Redis driver below 1.1.603 and might throw “Move” exceptions when configured to use a Redis cluster.
      Sitecore version                StackExchange.Redis version
      9.0 Initial Release (171002)    1.0.488
      9.0 Update-1 (171219)           1.0.488
      9.0 Update-2 (180604)           1.0.488
      9.1 Initial Release (001564)    1.2.6
      9.1 Update-1 (002459)           1.2.6
  • Keep compression enabled: the Redis server is single-threaded. This makes it perform well with small key-value pairs, but performance decreases as the size of the stored data goes up. The advantage of disabling compression is that Sitecore does not need to spend CPU time compressing and decompressing the data. However, the amount of data that needs to be sent to Redis goes up; we have seen it triple without compression. This had a significant adverse impact on Redis’ performance and on the performance of the entire site, while the extra CPU time with compression enabled was negligible compared to overall CPU. Below image taken from Redis.io shows how throughput decreases with increased data size.