Jan 15 2010

Filtering items in the MSF 2.0

Category: .Net

Rory Primrose @ 11:56

I’ve been playing with the Microsoft Sync Framework over the last couple of years. For too long I have been dabbling in a services-based project that synchronises data between a set of clients. The system has been through several designs, starting with hand-crafted change tracking, which was really tricky. The next version used Microsoft Sync Framework on the client to manage this process. This didn’t have the best result as there was no central replica for the data held by the service. The third design used a proxy provider so that central metadata information was held by the service. The fourth and latest design completely pushes Microsoft Sync Framework into the service.

The latest design allows clients to simply work with services without needing any understanding of Microsoft Sync Framework. There are a few hurdles with this design, however. The provider implementation on the server needs to implement a preview sync so it can tell the client what changes need to happen without making them at that time. When a change does happen, the client will only action a single change at a time, each of which must operate within a sync session in the service. This means that the sync provider also needs to work with a filtered sync session.

I created a POC project to prove that I could actually achieve these features with Microsoft Sync Framework before I invested any more time in the latest design. The POC aims to sync a data item that looks like the following.

using System;

namespace CachedSyncPOC
{
    public class ItemData
    {
        public ItemData()
        {
            Id = Guid.NewGuid().ToString();
            Data = Guid.NewGuid().ToString();
        }

        public String Id
        {
            get;
            set;
        }

        public String Data
        {
            get;
            set;
        }
    }
}

The code aims to store both the Id and the Data of each item in the metadata store of each replica.

Preview mode

This was actually easy to implement and is done in two parts.

The first part is that the provider notifies any interested parties of changes found using a custom event raised in GetChangeBatch.

public override ChangeBatch GetChangeBatch(
    UInt32 batchSize, SyncKnowledge destinationKnowledge, out Object changeDataRetriever)
{
    ChangeBatch batch = Metadata.GetChangeBatch(batchSize, destinationKnowledge); 

    IList<ItemData> changes = new List<ItemData>(batch.Count());
    ItemDataRetriever retriever = new ItemDataRetriever(Metadata);

    foreach (ItemChange change in batch)
    {
        changes.Add(retriever.LoadFromSyncId(change.ItemId));
    }

    OnChangesFound(
        new ChangesFoundEventArgs
        {
            Changes = changes,
            ReplicaId = ReplicaId.GetGuidId()
        });

    changeDataRetriever = retriever;

    return batch;
}

The second part is that ProcessChangeBatch simply ignores any changes when in preview mode.

public override void ProcessChangeBatch(
    ConflictResolutionPolicy resolutionPolicy, 
    ChangeBatch sourceChanges, 
    Object changeDataRetriever, 
    SyncCallbacks syncCallbacks, 
    SyncSessionStatistics sessionStatistics)
{
    if (IsPreview)
    {
        return;
    }

    // Use a NotifyingChangeApplier object to process the changes. 
    // This object is passed as the INotifyingChangeApplierTarget
    // object that will be called to apply changes to the item store.
    NotifyingChangeApplier changeApplier = new NotifyingChangeApplier(IdFormats);
    INotifyingChangeApplierTarget2 applier = new ItemDataChangeApplier(this, Metadata, Filter);

    changeApplier.ApplyChanges(
        resolutionPolicy, 
        Configuration.CollisionConflictResolutionPolicy, 
        sourceChanges, 
        (IChangeDataRetriever)changeDataRetriever, 
        Metadata.GetKnowledge(), 
        Metadata.GetForgottenKnowledge(), 
        applier, 
        null, 
        SessionContext, 
        syncCallbacks);
}

This example is a little simplistic in that the event does not indicate what action is going to be taken for each item, but that should be easy to implement down the track.
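As a rough illustration of the preview pattern, including one way the event could carry the pending action, here is a minimal sketch that does not use any Sync Framework types. PendingAction, PendingChange, ChangesPreviewedEventArgs and PreviewableStore are hypothetical names that are not part of the POC, and the sketch collapses the source and destination providers into a single class for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: the kind of action a sync would take for an item.
public enum PendingAction { Create, Update, Delete }

// Hypothetical pairing of an item Id with the action that would be applied.
public sealed class PendingChange
{
    public PendingChange(string id, PendingAction action)
    {
        Id = id;
        Action = action;
    }

    public string Id { get; private set; }
    public PendingAction Action { get; private set; }
}

// Hypothetical event payload listing the changes that were found.
public sealed class ChangesPreviewedEventArgs : EventArgs
{
    public ChangesPreviewedEventArgs(IList<PendingChange> changes)
    {
        Changes = changes;
    }

    public IList<PendingChange> Changes { get; private set; }
}

// Stand-in for the provider: always reports changes, applies none in preview mode.
public sealed class PreviewableStore
{
    private readonly Dictionary<string, string> _items = new Dictionary<string, string>();

    public event EventHandler<ChangesPreviewedEventArgs> ChangesFound;

    public bool IsPreview { get; set; }

    public int Count { get { return _items.Count; } }

    public void Process(IList<PendingChange> changes)
    {
        // Notify listeners first so a preview still sees every change.
        EventHandler<ChangesPreviewedEventArgs> handler = ChangesFound;
        if (handler != null)
        {
            handler(this, new ChangesPreviewedEventArgs(changes));
        }

        // In preview mode nothing is written to the store.
        if (IsPreview)
        {
            return;
        }

        foreach (PendingChange change in changes)
        {
            if (change.Action == PendingAction.Delete)
            {
                _items.Remove(change.Id);
            }
            else
            {
                _items[change.Id] = change.Action.ToString();
            }
        }
    }
}
```

A caller would subscribe to ChangesFound, run once with IsPreview set to true to see what would happen, and then run again with IsPreview set to false to apply the changes.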

Filtering

The Microsoft Sync Framework team has provided many examples on how to use the framework. My initial reaction to the custom filtering sample was one of complete dread. My filtering requirements are simple and the sample code provided is very complex. I posted a question on the Microsoft Sync Framework forum to seek some advice. While the advice was good, it unfortunately pushed me back into the provided filter sample code.

As I started to look through the code, I realised that there were a couple of different types of filtering being demonstrated. I came across an MSDN document about Microsoft Sync Framework filtering and found that what I needed wasn’t actually that complex. I need to filter an item in a session rather than filter a change unit or implement full custom filtering. It is the latter two that are demonstrated in the sync filtering sample code.

You need to update the provider implementation and provide a filter type in order to filter an item in a sync session. The filter type in my example works with the Id property of the filter item and provides logic for comparing filters between providers.

using System;
using Microsoft.Synchronization;

namespace CachedSyncPOC
{
    public class ItemDataFilter : ISyncFilter
    {
        public ItemDataFilter(String id)
        {
            if (String.IsNullOrEmpty(id))
            {
                const String IdParameterName = "id";

                throw new ArgumentNullException(IdParameterName);
            }

            Id = id;
        }

        public Boolean IsIdentical(ISyncFilter otherFilter)
        {
            ItemDataFilter itemFilter = otherFilter as ItemDataFilter;

            return itemFilter != null && Id.Equals(itemFilter.Id);
        }

        public Byte[] Serialize()
        {
            throw new NotImplementedException();
        }

        public String Id
        {
            get;
            set;
        }
    }
}
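Serialize is left unimplemented in the POC. If the filter ever had to be passed around as bytes, one option would be to encode the Id as UTF-8. This is a minimal sketch under that assumption; ItemDataFilterCodec is a hypothetical helper that is not part of the POC or the Sync Framework, and it is kept separate from ItemDataFilter so it compiles without a reference to Microsoft.Synchronization.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the POC: round-trips a filter Id as UTF-8 bytes.
public static class ItemDataFilterCodec
{
    // Encodes the filter Id, applying the same null/empty guard as the filter constructor.
    public static byte[] ToBytes(string id)
    {
        if (string.IsNullOrEmpty(id))
        {
            throw new ArgumentNullException("id");
        }

        return Encoding.UTF8.GetBytes(id);
    }

    // Decodes bytes produced by ToBytes back into the filter Id.
    public static string FromBytes(byte[] data)
    {
        if (data == null || data.Length == 0)
        {
            throw new ArgumentNullException("data");
        }

        return Encoding.UTF8.GetString(data);
    }
}
```

With something like this in place, Serialize could return ItemDataFilterCodec.ToBytes(Id), and the receiving side could rebuild the filter from ItemDataFilterCodec.FromBytes.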

The provider needs to support a couple of filter interfaces. I need to implement both interfaces as I intend to use the same provider as both the source and the destination provider.

internal class CustomProvider : KnowledgeSyncProvider, ISupportFilteredSync, IRequestFilteredSync, IDisposable
{
    public void SpecifyFilter(FilterRequestCallback filterRequest)
    {
        if (Filter != null)
        {
            if (!filterRequest(Filter, FilteringType.CurrentItemsOnly))
            {
                throw new Exception("Filter not accepted at source");
            }
        }
    }

    public Boolean TryAddFilter(Object filter, FilteringType filteringType)
    {
        ISyncFilter syncFilter = filter as ISyncFilter;

        if (syncFilter == null)
        {
            return false;
        }

        return true;
    }

    // Rest of class removed for brevity

}

The next change is that the GetChangeBatch method needs to deal with the filter. The ChangeBatch returned to the other provider should only contain changes related to the filter. This was the bit I was dreading in the filter process, but the Microsoft Sync Framework makes this really easy for item filtering. The GetFilteredChangeBatch method takes a delegate that determines whether items should be in the filtered change batch or not.

public override ChangeBatch GetChangeBatch(
    UInt32 batchSize, SyncKnowledge destinationKnowledge, out Object changeDataRetriever)
{
    ChangeBatch batch;

    if (Filter != null)
    {
        FilterInfo filterInfo = new ItemListFilterInfo(IdFormats);

        batch = Metadata.GetFilteredChangeBatch(batchSize, destinationKnowledge, filterInfo, ItemFilterCallback);
    }
    else
    {
        batch = Metadata.GetChangeBatch(batchSize, destinationKnowledge);
    }

    IList<ItemData> changes = new List<ItemData>(batch.Count());
    ItemDataRetriever retriever = new ItemDataRetriever(Metadata);

    foreach (ItemChange change in batch)
    {
        changes.Add(retriever.LoadFromSyncId(change.ItemId));
    }

    OnChangesFound(
        new ChangesFoundEventArgs
            {
                Changes = changes, 
                ReplicaId = ReplicaId.GetGuidId()
            });

    changeDataRetriever = retriever;

    return batch;
}

private Boolean ItemFilterCallback(ItemMetadata itemmetadata)
{
    // TODO: Cache this lookup in GetChangeBatch as we don't want to unnecessarily call this for each item checked
    ItemMetadata metadata = Metadata.FindItemMetadataByUniqueIndexedField("Id", Filter.Id);

    if (metadata == null)
    {
        return false;
    }

    return itemmetadata.GlobalId == metadata.GlobalId;
}
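The TODO in ItemFilterCallback could be addressed by resolving the target once and reusing it for every item checked. The following is a minimal sketch of that idea only; CachedTargetFilter is a hypothetical helper that is not part of the POC or the Sync Framework, and it is written generically so it does not depend on the framework's types.

```csharp
using System;

// Hypothetical helper, not part of the POC or the Sync Framework.
// Resolves the filter target on first use, then answers every
// per-item check from the cached value.
public sealed class CachedTargetFilter<TId> where TId : class
{
    private readonly Lazy<TId> _target;

    public CachedTargetFilter(Func<TId> resolveTarget)
    {
        if (resolveTarget == null)
        {
            throw new ArgumentNullException("resolveTarget");
        }

        _target = new Lazy<TId>(resolveTarget);
    }

    // Returns true only when a target was found and it equals the candidate.
    public bool Matches(TId candidate)
    {
        TId target = _target.Value;

        return target != null && target.Equals(candidate);
    }
}
```

In the provider, resolveTarget would wrap the FindItemMetadataByUniqueIndexedField call and return the GlobalId of the result, or null when no metadata is found. Creating a fresh instance in each GetChangeBatch call would keep the cached value from outliving the batch.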

That's all there is to it. Not too hard after all.

My POC project is attached to this post for reference.

CachedSyncPOC.zip (21.08 kb)

Comments

1.
Steve (United States) says:

Rory,

Thanks for this post.

I have a project where we have numerous client sites and a single central database. All tables contain a client key unique to the client. This means that really each table at the client is a subset/section of the full table at the central store.

Can the strategy that you describe here be used to sync only those records for a single client from the central store during a sync session that goes from central to client? In simple English, "sync only those records from the central store where table.client-id = current-client-id".

We are/were using SqlSyncProvider, but since that derives from the KnowledgeSyncProvider that you use, does that matter? If it does, then can we still use SqlSyncProvider at the client since data going from the client to the central store is never filtered?

BTW, I do not want to do (nor realistically can do) static filtering like some of the Microsoft Sync Framework examples (get data where state = "CA", for example). We plan on having hundreds of client sites at some point.

Steve

2.
Rory (Australia) says:

Hi Steve,

So to clarify, you have a hub and spoke design with the hub being a central database store and the spokes being multiple clients. Each client syncs only their data to the central store. Is that correct?

The solutions proposed here are not ideal for database synchronisation. The SqlSyncProvider already has support in its design for filtering. They are called scopes, and data can move in and out of scopes or even exist in multiple scopes.

The local provider defines the scope name for the sync session and the remote provider will only sync data to the local provider for that scope name. The only work you will really need to do here is manage the scope table when clients are added, removed or updated.

MSDN has some information about DB sync scopes at http://msdn.microsoft.com/en-us/library/dd918848%28SQL.105%29.aspx and http://msdn.microsoft.com/en-us/library/dd918682%28SQL.105%29.aspx.

Hope this helps.

3.
Steve (United States) says:

Rory,

Yes, hub and spoke describes the setup. Data is synchronized in both directions. There are about 40 tables in the database, these are arranged into 6 scopes, basically by function. Some of the scopes are client to central, some are both directions and one is central to client.

I have operational synchronization with the 6 scope setup for a single client to/from the central database. I now want to get the dynamic filtering up and running so that I can have more than 1 client's data in the central database. And then 3 and then 4....

I am familiar with the static filtering (e.g. "[side].[CustomerType] = 'Retail'") that is described at the first link you provided as well as the provisioning code.

I was going to write...
"Although the use of individual scopes may be practical for a few clients, it becomes (or at least it seems) impractical if there are, say 1000 clients meaning 6,000 scopes. This number of clients is conceivable for this application."

But upon reflection, if we append the client id to the scope name and we externally manage the 6 scope names into 6,000 internally, then the fact that there are 6,000 doesn't matter one iota. And, it does simplify the coding significantly.

Now there are a couple of tables that are not segmented by client id; these are shared tables (eg a product list) and can be updated at client and central and synced both ways (let's ignore collision schemes for now). So for these tables, we can put them in one or more common scopes, that is scopes that do not have a clientid prefix. That's good. I already have done a POC for that successfully.

Ok, I think I got that.

One more question, which is not clear from the filtering example. Do I need to provision both client and hub databases for the filtering? It seems to me that I only need to do it on the hub since that database contains records for all clients whereas a client only contains data for a single client. What are your thoughts?

Finally, back to your original post, which is item filtering on a KnowledgeSyncProvider: Is this applicable in any shape or form to my scenario?

TIA

Steve

4.
Rory (Australia) says:

Steve, I think you need to clarify what you want to filter. It seems that you are using scopes to define subsets of data, against which you have multiple clients that have their own data within those subsets. Is the subset the filter in question or are you wanting a filter on each subset itself?

With regard to this implementation, my understanding (meaning I haven't tested this) is that the item filtering in this post depends on the direction of the sync as to which provider must implement the filter logic. It makes sense that if you are doing a two way sync then you would want the filter to apply to both directions. So I am assuming you would need to implement the same filter logic on both providers.

You may need to get some feedback from the MSF team themselves for more specific advice. I have been posting several questions to the MSF forum (social.microsoft.com/Forums/en-US/category/sync) over the last couple of months. The Microsoft Sync Framework team have been great at responding in the forum and have been very helpful.
