Sitecore on Docker: logging intro and how to avoid losing log entries

Sitecore 10 on Containers uses the Windows Container Tools by default for logging. This is configured through a JSON file located at C:\LogMonitor\logmonitorconfig.json. It will look something like this:

{
  "LogConfig": {
    "sources": [
      {
        "type": "EventLog",
        "startAtOldestRecord": false,
        "eventFormatMultiLine": false,
        "channels": [
          {
            "name": "system",
            "level": "Error"
          }
        ]
      },
      {
        "type": "File",
        "directory": "c:\\inetpub\\logs",
        "filter": "*.log",
        "includeSubdirectories": true
      },
      {
        "type": "File",
        "directory": "c:\\inetpub\\wwwroot\\App_data\\logs",
        "filter": "log.*",
        "includeSubdirectories": false
      }
    ]
  }
}

This configuration ensures the logs get sent to STDOUT, where the logging driver can pick them up. Docker supports several logging drivers; a full list can be found on their site.

Logs getting lost intermittently

This works well initially and logs will be collected properly by the logging solution. At scale, however, logs might get lost intermittently, and in some cases a significant percentage of logs will be lost. Sitecore uses a rolling style of logging, which is not properly supported by the Windows Container Tools. There is an issue created on their GitHub repo here, but it has received little attention so far.

One potential workaround is to configure a large maximumFileSize in Sitecore’s logging configuration. In some cases this can prevent the issue completely if the log does not need to rollover. At scale this will at best mitigate the issue but it will not address the root cause.

Solution: mount logs directly on the host

The only fix that I’m aware of is to mount the log folder on the host, which prevents the issue described above. If anyone has found a better solution, please let me know in the comments section below. Docker provides good documentation on how a volume should be mounted, which can be found here. Below is a sample command to mount the volume. The paths need to be specified using Windows filename semantics without a leading slash, and the destination log directory needs to be empty:

docker run -v c:\logs:c:\inetpub\wwwroot\App_data\logs ...

One thing to keep in mind here is that the host can potentially run multiple containers which share the same log folder on the host. One potential workaround would be to have a separate folder for each container but the best solution will be different depending on many factors.

Glass v5 properties intermittently null when security applied

Glass Mapper version 5 caches data more aggressively which is beneficial most of the time as it improves site performance. More information on the new caching features can be found here. In order to speed up performance some checks which Sitecore normally does are not performed when an item is returned from the cache. Usually this will not cause any issues but there could be potentially serious issues when security is involved. For example consider below scenario:

  1. User 1 has access to a certain item, but User 2 does not. Security is applied through the regular Sitecore security mechanisms.
  2. User 2 visits the site first, and Glass loads the model. This model has a link to the item which user 2 does not have access to, therefore this link is empty. It will get cached this way.
  3. Now user 1 comes and the model will be returned from cache. Since it was initially cached for user 2 the link is still empty whereas user 1 should be able to view this.
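The scenario above can be reproduced with a small, self-contained simulation. This is only a sketch of the general problem (a cache keyed by item ID and shared by all users), not Glass Mapper's actual implementation; TeaserModel, SharedModelCacheDemo and the "/secure-page" URL are hypothetical names made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a mapped model; this is not a Glass Mapper type.
public sealed class TeaserModel
{
    public string SecureLinkUrl { get; set; }
}

public static class SharedModelCacheDemo
{
    // The cache is keyed by item ID only, so every user shares one entry.
    private static readonly Dictionary<Guid, TeaserModel> Cache =
        new Dictionary<Guid, TeaserModel>();

    // Builds the model with the current user's security applied:
    // the link stays empty when the user cannot read the target item.
    private static TeaserModel Load(bool userCanReadTarget)
    {
        return new TeaserModel
        {
            SecureLinkUrl = userCanReadTarget ? "/secure-page" : null
        };
    }

    // Returns the cached model if present; otherwise loads and caches it.
    public static TeaserModel GetModel(Guid itemId, bool userCanReadTarget)
    {
        TeaserModel model;
        if (!Cache.TryGetValue(itemId, out model))
        {
            model = Load(userCanReadTarget);
            Cache[itemId] = model;
        }
        return model;
    }

    public static void Clear()
    {
        Cache.Clear();
    }
}
```

Calling GetModel(id, false) followed by GetModel(id, true) returns the empty link both times, because the second call is served from the entry created for the first visitor. Reversing the order hands the populated link to both users.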

Notice that this behavior is not deterministic: if user 1 had hit the site first, the item would have been loaded. It would then also be loaded for user 2 if that user visits later. This can cause intermittent and hard to find issues, especially if the code was working correctly prior to upgrading. The good news is that fixing this is straightforward!

Solution

It is recommended to leave the cache on as much as possible to take advantage of the performance improvements.

If there is an issue with a certain link which can point to an item where security might be applied, then caching can be turned off through an attribute like this:

[SitecoreType(Cache = Glass.Mapper.Configuration.Cache.Disabled)]
public virtual Link SecureLink { get; set; }

Perhaps this issue only occurs when retrieving items through the SitecoreService or MvcContext. In these cases the cache can be turned off explicitly in the call like this:

T val = mvcContext.GetDataSourceItem<T>(x => x.CacheDisabled());

GetItemUrl Error after upgrading to Sitecore 10

After upgrading an older version of Sitecore to version 10 you might see below error with the call to LinkManager.GetItemUrl:

[ArgumentOutOfRangeException: startIndex cannot be larger than length of string.
Parameter name: startIndex]
   System.String.Substring(Int32 startIndex, Int32 length) +14342534
   Sitecore.Links.UrlBuilders.Helpers.ItemPathBuilder.TryRemoveSubPath(String subPath, String& itemPath) +48
   Sitecore.Links.UrlBuilders.Helpers.ItemPathBuilder.GetRelativePath(Item item, SiteInfo site) +205
   Sitecore.Links.UrlBuilders.Helpers.ItemPathBuilder.Build(Item item, SiteInfo site) +105
   Sitecore.Links.UrlBuilders.ItemUrlBuilder.UpdateItemPath(Item item, UrlBuildModel model, ItemUrlBuilderOptions options) +44
   Sitecore.Links.UrlBuilders.ItemUrlBuilder.Build(Item item, ItemUrlBuilderOptions options) +246

The same code most likely ran without any issue in the older Sitecore version. Usually this is caused by the SiteDefinition and specifically the values for the rootPath and startItem attributes. Consider below configuration:

<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:security="http://www.sitecore.net/xmlconfig/security/">
    <sitecore>
        <sites>
			<site name="website" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content/home" startItem="/" language="en" database="web" domain="extranet" />          
        </sites>
    </sitecore>
</configuration>

In the configuration above the startItem is set to “/”, and it turns out this causes the error. Changing the startItem to an actual item, and removing that item from rootPath, will fix the issue:

<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:security="http://www.sitecore.net/xmlconfig/security/">
    <sitecore>
        <sites>
			<site name="website" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/home" language="en" database="web" domain="extranet" />          
        </sites>
    </sitecore>
</configuration>
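To find affected site definitions before they fail at runtime, a small check along these lines could be run against a configuration file. It is a minimal sketch: SiteDefinitionCheck is a hypothetical helper, it only looks for the one pattern described above (startItem set to "/"), and it reads plain XML rather than Sitecore's merged configuration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SiteDefinitionCheck
{
    // Returns the names of all <site> elements whose startItem is exactly "/".
    public static IReadOnlyList<string> FindSuspectSites(string configXml)
    {
        XDocument doc = XDocument.Parse(configXml);

        return doc.Descendants("site")
            .Where(site =>
            {
                // A missing attribute yields null and is not flagged here.
                string startItem = (string)site.Attribute("startItem");
                return startItem != null && startItem.Trim() == "/";
            })
            .Select(site => (string)site.Attribute("name") ?? "(unnamed)")
            .ToList();
    }
}
```

Passing the first configuration from this section returns a list containing "website", while the corrected configuration returns an empty list.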

Remove items from Resource File

Sitecore recently introduced Resource Files which contain Sitecore items. This is a great improvement and is beneficial when upgrading Sitecore especially while running on containers. There are some good resources out there to learn more about them for example this post by Martin Miles.

One limitation of this feature is that it is not possible to delete Sitecore items included this way. This makes sense in most cases, items included through this file are Sitecore system items and should generally not be deleted. However there are a few valid reasons why you would want to delete such an item.

The rest of this post describes how you can get rid of these items without actually modifying the resource file itself. Modifying the resource file is not a good approach, as you would get a new Resource File during an upgrade and any deleted items would be back.

The Resource files get read by the ProtobufDataProvider and it stores the items in a few dictionaries in memory. The solution below will remove the items from these dictionaries. Below code inherits the ProtobufDataProvider and adds the functionality to remove the items:

using Microsoft.Extensions.DependencyInjection;
using Sitecore.Abstractions;
using Sitecore.Configuration;
using Sitecore.Data.DataProviders.ReadOnly.Protobuf;
using Sitecore.DependencyInjection;
using System;
using System.Linq;

namespace Foundation.Providers
{
    public class RemoveItemsProtobufDataProvider : ProtobufDataProvider
    {
        public void RemoveItem(string item)
        {
            Guid itemGuid;

            if (Guid.TryParse(item, out itemGuid))
            {
                bool foundParent = false;

                if (base.DataSet.Definitions.ContainsKey(itemGuid))
                {
                    var parentID = base.DataSet.Definitions[itemGuid].ParentID;
                    foundParent = true;

                    //the item will also live as a child under the parent item, remove it from here as well
                    base.DataSet.Children[parentID] = base.DataSet.Children[parentID].Where(x => x.ID != itemGuid).ToArray();
                    base.DataSet.Definitions.Remove(itemGuid);
                }

                if (base.DataSet.Children.ContainsKey(itemGuid) && foundParent)
                {
                    base.DataSet.Children.Remove(itemGuid);
                }

                if (base.DataSet.ItemsByTemplate.ContainsKey(itemGuid) && foundParent)
                {
                    base.DataSet.ItemsByTemplate.Remove(itemGuid);
                }

                if (base.DataSet.LanguageData.ContainsKey(itemGuid) && foundParent)
                {
                    base.DataSet.LanguageData.Remove(itemGuid);
                }

                if (base.DataSet.SharedData.ContainsKey(itemGuid) && foundParent)
                {
                    base.DataSet.SharedData.Remove(itemGuid);
                }
            }
        }

        public RemoveItemsProtobufDataProvider(ObjectList filePaths) : base(filePaths.List.OfType<string>().Where(s => !string.IsNullOrEmpty(s)), ServiceLocator.ServiceProvider.GetRequiredService<BaseLog>())
        {
        }
    }
}

This data provider needs to be patched into Sitecore; the patch file below does this for the master database. The removeItems list in the XML below contains the items which need to be removed:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
	<sitecore>
		<databases >
			<database id="master" role:require="ContentManagement or StandAlone">
				<dataProviders >
					<dataProvider>
						<param>
							<protobufItems>
								<patch:attribute name="type">Foundation.Providers.RemoveItemsProtobufDataProvider, Foundation</patch:attribute>
								<filePaths>
									<patch:delete/>
								</filePaths>
								<param desc="filePaths" hint="list">
									<filePath>$(dataFolder)/items/$(id)</filePath>
								</param>
								<!-- Add items which need to be removed below -->
								<removeItems hint="list:removeItem">									
									<item>{22222222-2222-2222-2222-222222222222}</item>
									<item>{33333333-3333-3333-3333-333333333333}</item>
								</removeItems>
							</protobufItems>
						</param>
					</dataProvider>
				</dataProviders>
			</database>
		</databases>
	</sitecore>
</configuration>

Adjust dependencies in Readiness probe

It is important to understand the Liveness and Readiness probes when you run Sitecore on Kubernetes. The Kubernetes documentation provides more detail on these two probes:

  • Liveness: Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy.
  • Readiness: Indicates whether the container is ready to respond to requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod.

In Sitecore these probes can be found under /healthz/live and /healthz/ready. The rest of this post will focus on the Readiness probe. There is a great article by Vitalii Tylyk which discusses these probes at a high level. By default, a Sitecore XP install checks a variety of xDB services as well as Solr as part of this probe. If these services are not all healthy then the probe will fail and no requests will be sent to this pod.
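When experimenting with the readiness probe it can help to call the endpoint yourself. The sketch below assumes that a healthy probe answers with a 2xx status code and treats everything else, including connection failures, as not ready. ReadinessCheck is a hypothetical helper and not part of Sitecore.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ReadinessCheck
{
    // Returns true when GET {baseAddress}/healthz/ready answers with a 2xx status.
    public static async Task<bool> IsReadyAsync(HttpClient client, Uri baseAddress)
    {
        var probeUri = new Uri(baseAddress, "/healthz/ready");
        try
        {
            using (HttpResponseMessage response = await client.GetAsync(probeUri))
            {
                return response.IsSuccessStatusCode;
            }
        }
        catch (HttpRequestException)
        {
            // Connection failures count as "not ready" in this sketch.
            return false;
        }
    }
}
```

For example, IsReadyAsync(new HttpClient(), new Uri("http://localhost")) could be called before and after stopping Solr to see how the probe result changes; the host name is a placeholder for your own instance.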

It is important to understand this default behavior in the context of your Sitecore solution. If having xDB and Solr up are critical to a solution then this default behavior does not need to be changed. The risk with this setup is that all pods can be pulled from the load balancer if there is an issue with Solr or xDB and the site will be completely down. If these services are not critical to the Sitecore site then they can be removed from the readiness probe.

Below patch file shows how to remove all these checks so the readiness probe still returns healthy even if Solr and xDB are completely down. In many real world scenarios only a subset of these will have to be removed, for example removing xDB services but leaving Solr as the solution has a critical dependency on it.

<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <services>
      <configurator type="Sitecore.ContentSearch.SolrProvider.DependencyInjection.ContentSearchServicesConfigurator, Sitecore.ContentSearch.SolrProvider">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectCollectionHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectConfigurationHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.XConnect.Client.Configuration.HealthCheckServicesConfigurators.XConnectSearchHealthCheckServicesConfigurator, Sitecore.XConnect.Client.Configuration">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Reporting.Service.Http.XConnectClient.XdbReportingWebClient, Sitecore.Reporting.Service.Http.XConnectClient],[Sitecore.Reporting.Service.Http.Abstractions.Routes, Sitecore.Reporting.Service.Http.Abstractions]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>      
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.ReferenceData.Client.ReadOnlyReferenceDataHttpClient, Sitecore.Xdb.ReferenceData.Client],[Sitecore.Xdb.ReferenceData.Client.Routes, Sitecore.Xdb.ReferenceData.Client]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.ReportingClient.AutomationReportingClient, Sitecore.Xdb.MarketingAutomation.ReportingClient],[Sitecore.Xdb.MarketingAutomation.ReportingClient.ReportingRoutes, Sitecore.Xdb.MarketingAutomation.ReportingClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>
      <configurator type="Sitecore.Xdb.Common.Web.Xmgmt.XdbEnabledHealthCheckServicesConfigurator`2[[Sitecore.Xdb.MarketingAutomation.OperationsClient.AutomationOperationsClient, Sitecore.Xdb.MarketingAutomation.OperationsClient],[Sitecore.Xdb.MarketingAutomation.OperationsClient.OperationRoutes, Sitecore.Xdb.MarketingAutomation.OperationsClient]], Sitecore.Xdb.Common.Web.Xmgmt">
        <patch:delete />
      </configurator>      
    </services>
  </sitecore>
</configuration>

Fix NullReferenceException in CompositeDataProvider

Sitecore 10 comes with a new Dataprovider which merges items from disk and the database. Jeremy Davis already wrote a great article about this. Sitecore is using this new approach by deploying all their items through an Item Resource file.

TDS now also supports creating item resource files, which makes it convenient to deploy your own items this way too. There are a few uncommon scenarios where this will result in an uncaught NullReferenceException, for example when there is an item without a version. Many areas of Sitecore and your custom solution will most likely be broken if this happens. The stack trace will look something like this:

FATAL Uncaught application error 
Exception: System.NullReferenceException 
Message: Object reference not set to an instance of an object. 
Source: Sitecore.Kernel 
   at Sitecore.Data.DataProviders.CompositeDataProvider.GetItemVersions(ItemDefinition itemDefinition, CallContext context) 
   at Sitecore.Data.DataProviders.DataProvider.GetItemVersions(ItemDefinition item, CallContext context, DataProviderCollection providers) 
   at Sitecore.Data.DataSource.LoadVersions(ItemDefinition definition, Language language) 
   at Sitecore.Data.DataSource.GetVersions(ItemInformation itemInformation, Language language) 
   at Sitecore.Data.DataSource.GetLatestVersion(ItemInformation itemInformation, Language language) 
   at Sitecore.Data.DataSource.GetItemData(ID itemID, Language language, Version version) 
   at Sitecore.Nexus.Data.DataCommands.GetItemCommand.GetItem(ID itemId, Language language, Version version, Database database) 

One way to solve this is by identifying all the items which cause the issue and fixing each of them. Another way is to override the CompositeDataProvider and wrap a try/catch block around this logic. The advantage of this approach is that it also covers items which are added later, so the issue will not reoccur in the future. The following code can be used to catch the exception:

using Sitecore.Collections;
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.DataProviders;
using Sitecore.Data.DataProviders.ReadOnly;
using Sitecore.Diagnostics;
using System;
using System.Collections.Generic;

namespace Foundation.Providers
{
    public class SafeCompositeDataProvider : CompositeDataProvider
    {
        public SafeCompositeDataProvider(IEnumerable<ReadOnlyDataProvider> readOnlyDataProviders, DataProvider headProvider) : base(readOnlyDataProviders, headProvider) { }

        public SafeCompositeDataProvider(ObjectList readOnlyDataProviders, DataProvider headProvider) : base(readOnlyDataProviders, headProvider) { }

        public override VersionUriList GetItemVersions(ItemDefinition itemDefinition, CallContext context)
        {
            try
            {
                return base.GetItemVersions(itemDefinition, context);
            }
            catch (Exception ex)
            {
                Log.Error($"SafeCompositeDataProvider: Caught exception for item {itemDefinition.ID} {itemDefinition.Name} {ex}", this);
                return null;
            }
        }
    }
}

This new provider can be patched in through a configuration file like this:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore>
    <databases >
      <!-- target to CM or standalone, no master database in CD-->
      <database id="master" role:require="ContentManagement or StandAlone">
        <dataProviders >
          <dataProvider >
            <patch:attribute name="type">Foundation.Providers.SafeCompositeDataProvider, Foundation.Providers</patch:attribute>
          </dataProvider>
        </dataProviders>
      </database>
    </databases>
  </sitecore>
</configuration>

Fix Custom Cache after Sitecore upgrade

Sitecore allows you to create custom caches which can be managed just like all other Sitecore caches. Brian Caos wrote a great blog post about this which can be found here. Recently Sitecore has made some changes to this and if you are upgrading Sitecore you will notice that the code will not compile anymore. For example below code:

 Cache cache = CacheManager.FindCacheByName(cacheName);

will now throw a compile error:

Error CS0411 The type arguments for method 'CacheManager.FindCacheByName<T>(string)' cannot be inferred from the usage. Try specifying the type arguments explicitly. 

Furthermore this now returns an object of type Sitecore.Caching.Generics.ICache<TKey>. My first attempt at fixing this was by just passing the type of my custom cache in the generic which fixes the compile error but now the code throws an exception at runtime:

Exception: System.InvalidCastException 
Message: Unable to cast object of type 'Sitecore.Caching.Cache' to type 'Sitecore.Caching.Generics.ICache`1[MyCustomCache]'.
Source: Sitecore.Kernel
   at Sitecore.Caching.DefaultCacheManager.FindCacheByName[TKey](String name)

The generic now needs the type of the key passed in. In most cases this can just be set to string. To fix this code changes will be required in 2 places.

1. Custom cache object

In the custom cache make sure the base object uses Sitecore.Caching.Generics and also specify string as the key e.g:

 //updated from Sitecore.Caching
using Sitecore.Caching.Generics;

//Specify the generic type of the key
public class MyCustomCache : CustomCache<string> 

2. Custom Cache Clearer

In the method which clears the cache, make sure to also pass the same generic for the key; see the commented line:

private void DoClear()
{
    foreach (string cacheName in Caches)
    {
        //update to pass the generic key type; var is used because the call returns an ICache<string>
        var cache = CacheManager.FindCacheByName<string>(cacheName);
        if (cache == null)
            continue;
        Log.Info(this + ". Clearing " + cache.Count + " items from " + cacheName, this);
        cache.Clear();
    }
}

Setup Sitecore Databases in AWS RDS

Last year Sitecore announced it supports AWS RDS. I’ve previously blogged about how to set up Sitecore databases in AWS using SIF. Containers are now the preferred deployment model in Sitecore 10. A popular option when deploying Sitecore on AWS is to use RDS instead of containers for the SQL databases. In this case it can be a little challenging to figure out how to get the databases into RDS. This post walks through a sample solution.

Step 1: Take a backup of the databases

Take a backup of the databases, for example from a development environment VM which has the databases installed. Sitecore 10 comes with a graphical setup package which is great to quickly set up a new Sitecore 10 environment including databases.

Step 2: Upload to S3

At this point you should have a backup of all your databases in .bak format. Upload all these backups to AWS S3. Make sure the S3 bucket is in the same region as your RDS database instance.

Step 3: Restore Databases through RDS’ SP

Databases need to be restored through RDS’ Stored Procedures. This page has a detailed overview but restoring a database can be done through the rds_restore_database SP. It is recommended to not change the database name during restore as it can cause some issues, see troubleshooting section for more details. See example below:

exec msdb.dbo.rds_restore_database 
	@restore_db_name='database_name', 
	@s3_arn_to_restore_from='arn:aws:s3:::bucket_name/file_name.extension',
	@with_norecovery=0|1,
	[@kms_master_key_arn='arn:aws:kms:region:account-id:key/key-id'],
	[@type='DIFFERENTIAL|FULL'];
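Since a Sitecore install has many databases, it can be convenient to generate one restore statement per backup file. The sketch below assumes that each .bak file is named after its database, so the restored name stays unchanged as recommended above. RdsRestoreScript, the bucket name and the file names are hypothetical, and only the two required parameters of rds_restore_database are emitted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class RdsRestoreScript
{
    // Builds a single restore statement for one backup file in the bucket.
    public static string BuildRestoreStatement(string bucketName, string backupFileName)
    {
        // Keep the database name equal to the backup file name without ".bak".
        string databaseName = Path.GetFileNameWithoutExtension(backupFileName);

        return "exec msdb.dbo.rds_restore_database "
            + "@restore_db_name='" + databaseName.Replace("'", "''") + "', "
            + "@s3_arn_to_restore_from='arn:aws:s3:::" + bucketName + "/" + backupFileName + "';";
    }

    // Builds one statement per backup file, separated by new lines.
    public static string BuildScript(string bucketName, IEnumerable<string> backupFileNames)
    {
        return string.Join(
            Environment.NewLine,
            backupFileNames.Select(file => BuildRestoreStatement(bucketName, file)));
    }
}
```

For a file called Sitecore.Master.bak in a bucket called my-bucket, this produces a statement that restores to the database name Sitecore.Master from arn:aws:s3:::my-bucket/Sitecore.Master.bak.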

Troubleshooting

A few common issues can occur during this process:

  • Contained database issues: in RDS you cannot enable contained database authentication through the sp_configure stored procedure. Instead it needs to be set on the parameter group, see my other blog post for more details.
  • Issues with xDB containers not getting healthy: if the databases were renamed during the restore there is a good chance some of the xDB roles will not get healthy. The Xdb.Collection.ShardMapManager database has some tables which contain rows with connection info including names of other databases which contain the shards. If the databases have been renamed while they have been restored then this connection info will be incorrect. This can be fixed manually or through the SQL Sharding Deployment Tool.

Prevent page error when Datasource is missing after Upgrade

If you have upgraded Sitecore and along with it GlassMapper to version 5 there might be an issue when the Datasource item is not available in the target database, which will throw an error and the page will not load. Prior to the upgrade this would not have been an issue and the invalid Datasource was simply ignored. When this happens you will see an error in the log like this:

Exception: System.InvalidOperationException 
Message: The model item passed into the dictionary is of type 'Sitecore.Mvc.Presentation.RenderingModel', but this dictionary requires a model item of type 'DatasourceNamespace.DatasourceType'.
Source: System.Web.Mvc
   at System.Web.Mvc.ViewDataDictionary`1.SetModel(Object value)
   at System.Web.Mvc.ViewDataDictionary..ctor(ViewDataDictionary dictionary)
   at System.Web.Mvc.WebViewPage`1.SetViewData(ViewDataDictionary viewData)
   at System.Web.Mvc.RazorView.RenderView(ViewContext viewContext, TextWriter writer, Object instance)
   at System.Web.Mvc.Html.PartialExtensions.Partial(HtmlHelper htmlHelper, String partialViewName, Object model, ViewDataDictionary viewData)
   at Sitecore.Mvc.Presentation.ViewRenderer.Render(TextWriter writer) 

Solution

This can be fixed by checking if the Datasource exists and clearing it in case it doesn’t. The RenderRendering processor is a good place to insert this logic:

using Sitecore.Mvc.Pipelines.Response.RenderRendering; 

namespace Custom.Pipelines.RenderRendering
{
    public class ClearInvalidDatasource : RenderRenderingProcessor
    {
        public override void Process(RenderRenderingArgs args)
        {
            var rendering = args.Rendering;

            if (!string.IsNullOrWhiteSpace(rendering.DataSource) && Sitecore.Context.Database.Items.GetItem(rendering.DataSource) == null)
            {
                rendering.DataSource = string.Empty;
            }
        }
    }
}

This customization can be enabled with the following patch file:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/"> 
  <sitecore>
    <pipelines>
      <mvc.renderRendering>
        <processor patch:before="*[@type='Sitecore.Mvc.Pipelines.Response.RenderRendering.EnterRenderingContext, Sitecore.Mvc']" type="Custom.Pipelines.RenderRendering.ClearInvalidDatasource, Project.Common" />
      </mvc.renderRendering>
    </pipelines>
  </sitecore>
</configuration>

Now that this change is in place the behavior will be identical to the behavior prior to the upgrade.

Execute Ajax calls concurrently without session locking

Large Sitecore applications can be bogged down by pressure on the session state store. This can also be an issue in regular ASP.NET sites, and this SO article helps to explain the issue and provides several solutions. Sitecore provides options in this great KB article. In a previous article I wrote about a solution when session locking is slowing down MVC pages.

Another, perhaps even more common, scenario is when session locking slows down multiple Ajax calls. Sitecore’s previously mentioned KB article recommends not sending multiple Ajax calls back to Sitecore. This is good advice, and in general it is good practice to combine Ajax calls as much as possible. However, there are often scenarios where this can be challenging, for example when there are multiple independent components on a page and they each need to load data through Ajax. Before diving into the solution, let’s look at the issue with multiple Ajax calls.

Ajax calls and session locking

The issue with multiple Ajax calls is that IIS will execute calls which share the same session one at a time. In this article let’s look at the simple controller method below, which just waits for 10 seconds and then returns:

public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

This Sitecore article describes how to register a custom route. When a page makes 3 concurrent Ajax requests to this method, the network panel shows the following:

  • Each call starts at the same time
  • The first call completes after 10 seconds
  • The second call completes after 20 seconds, because it had to wait for the first call to finish before it could start
  • The third call completes only after 30 seconds, because it had to wait for the second call to complete before it could start

As can be seen, the final Ajax call only completes after 30 seconds. This locking mechanism has a significant impact on page loading performance, and it also puts a lot of stress on the session store, as each locked request polls the session store frequently to see if it can start executing.
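The effect of an exclusive session lock can be modeled outside of ASP.NET with a semaphore. This is only a simplified model under stated assumptions: the real session state module polls the store instead of waiting on a semaphore, and SessionLockSimulation is a hypothetical class written for this illustration. It starts a number of simulated requests at the same time and reports how many were ever running simultaneously.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SessionLockSimulation
{
    // Starts all requests at once and returns the highest number of requests
    // that were doing their work at the same moment.
    public static async Task<int> MaxConcurrentRequestsAsync(
        int requestCount, TimeSpan workTime, bool exclusiveSessionLock)
    {
        var sessionLock = new SemaphoreSlim(1, 1); // stands in for the session lock
        var gate = new object();                   // protects the two counters below
        int running = 0;
        int maxRunning = 0;

        async Task HandleRequestAsync()
        {
            if (exclusiveSessionLock)
            {
                await sessionLock.WaitAsync();
            }

            try
            {
                lock (gate)
                {
                    running++;
                    maxRunning = Math.Max(maxRunning, running);
                }

                await Task.Delay(workTime); // the simulated controller action

                lock (gate)
                {
                    running--;
                }
            }
            finally
            {
                if (exclusiveSessionLock)
                {
                    sessionLock.Release();
                }
            }
        }

        await Task.WhenAll(Enumerable.Range(0, requestCount).Select(_ => HandleRequestAsync()));
        return maxRunning;
    }
}
```

With the lock enabled only one request runs at a time, so three requests take roughly three times the work time in total. With the lock disabled all three overlap, which corresponds to the session state being turned off for the action.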

Reducing the locking of Ajax calls

Before unblocking the call it is important to understand what the relationship is between the Ajax call and the session in your solution. There are 3 possibilities which will be discussed separately.

  1. Ajax calls do not use session
  2. Ajax calls read from the session, but do not update it
  3. Ajax calls update/write to the session

The sections below discuss how to address each scenario.

1. Ajax calls do not use session

This is the easiest scenario. If an action does not use session state, then the controller should disable the session state as can be seen below. In this case the Session object will be null. This will make the action not be subject to any session locking.

[SessionState(SessionStateBehavior.Disabled)]
public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

With the same page calling this method, no request is blocked and all three calls execute at the same time.

2. Ajax calls read from the session, but do not update it

Sometimes an action method needs access to the session, but does not need to update it. In this case the sessionstate behavior should be set to readonly as can be seen in below code:

[SessionState(SessionStateBehavior.ReadOnly)]
public class SessionLockTestController : SitecoreController
{
    public async Task<JsonResult> ServiceCall()
    {
        await Task.Delay(10000);

        return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
    }
}

There are a couple of important things to note about readonly session state:

  • The code will have access to the session and can update it. However, any change to the session will be lost after the request.
  • Multiple requests in readonly mode will execute at the same time
  • A request in readonly mode still needs a read lock on the session, and cannot execute at the same time as a request which has read and write access to the session, i.e. SessionStateBehavior.Required. For example, if one Ajax call is set up in a controller with SessionStateBehavior.Required, then any other Ajax calls with SessionStateBehavior.ReadOnly will still have to wait.

3. Ajax calls update/write to the session

This is the most complicated scenario. As far as I know there is no way for Ajax calls to run concurrently, and each update the session, even if you would address the concurrency issues in the application.

A different approach needs to be used since there is no option to execute the requests at the same time while also updating the session. Here are some approaches which might work:

  • Only update session state on initial page load, and use Ajax calls to retrieve additional information which should not be stored in session
  • Have a single Ajax call do all the updates to the session
  • Store any updates somewhere on the client, for example local storage, and update the session at a later point
  • Store data server side in some other store outside of session, and have the application manage concurrency
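As an illustration of the last option, the sketch below keeps a per-user counter in a ConcurrentDictionary so that concurrent requests can update it without any session lock. It is a minimal in-memory example: UserCounterStore and the user key are hypothetical, and with multiple servers or pods the same idea would need a shared store instead of process memory.

```csharp
using System;
using System.Collections.Concurrent;

public static class UserCounterStore
{
    // Keyed by a user or visitor identifier chosen by the application.
    private static readonly ConcurrentDictionary<string, int> Counters =
        new ConcurrentDictionary<string, int>();

    // Adds delta to the user's counter and returns the new total.
    // AddOrUpdate retries internally, so concurrent updates are not lost.
    public static int Add(string userKey, int delta)
    {
        return Counters.AddOrUpdate(userKey, delta, (_, current) => current + delta);
    }

    // Returns the current total, or 0 when the user has no entry yet.
    public static int Get(string userKey)
    {
        int value;
        return Counters.TryGetValue(userKey, out value) ? value : 0;
    }
}
```

Controllers using a store like this can run with SessionStateBehavior.Disabled, because the concurrency is handled by the store rather than by the session lock.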

My Ajax calls are always fast anyway, should I not worry about this?

NO! Each Ajax call still needs to obtain a lock, and a call which finds the session locked only checks again every 500 ms. Consider the method below, which returns immediately:

public JsonResult ServiceCall()
{
    return Json(new { Ok = true }, JsonRequestBehavior.AllowGet);
}

With the default session locking, the network panel shows that it still takes more than a second before the third call is complete, because each waiting call only retries the lock once per polling interval.

Compare this to the same code without locking: in that case all Ajax calls return in under 40 ms!

Conclusion

This article showed the detrimental effect the session locking mechanism can have on multiple Ajax calls. Multiple techniques were described to significantly mitigate or eliminate this issue. It is important to remember to try to avoid using session as much as possible, as it comes with scalability and availability concerns. Consider using more advanced and extensible options instead of session if a significant amount of user data needs to be stored server side. The session locking mechanism is on by default. Being able to disable the session can result in large performance improvements.