Ruminations of idle rants and ramblings of a code monkey

RealTime Web Analytics Presentation And Demo From HDNUG

Code Sample | StreamInsight
Here are all of the materials from the presentation that I did at the Houston .NET User’s Group on March 13. Some of the things that I updated from the version for Baton Rouge SQL Saturday include:

Lock around active requests HashSet: Added a lock block around adding and removing items from the _activeRequest HashSet. Since the HashSet isn’t thread-safe and our source is very highly multi-threaded, we need to make sure that operations that modify the internal array are thread-safe. This eliminated some random “IndexOutOfRangeException” errors in the source that would halt the StreamInsight process.

Checks in StandardDeviation UDA: Added checks for the number of values as well as the result. If the number of values in the set for the standard deviation is less than one, the standard deviation is always 0. Also, after the calculation, there’s an additional check on the result to make sure it’s not NaN. This eliminated some random exceptions in the queries that were calculating the standard deviation that would halt the StreamInsight process.

Both cases highlight the need to make sure that your custom code running in StreamInsight is tight and solid. They were pretty difficult to track down as well … both would happen randomly. IntelliTrace was absolutely essential to identifying and resolving the issues. After fixing them, I was able to run for hours without a problem.

Notes on reproducing the demo:

I’m not including the site that I used when running the demo. You can get this from the nopCommerce site on CodePlex. I used the out-of-the-box site with sample data. Keep in mind that the module used for the demo forces some of the requests to take 3 seconds – these are our “Bad Actors” – so it’s in no way representative of the performance of nopCommerce. You’ll need to set it up on a local site on port 81 if you want to use the Visual Studio load tests.
From there, you need to copy the WebSiteMonitor.Contracts and WebSiteMonitor.Module assemblies into the \bin folder of the site and add the following into the web.config:

Under system.webServer, add the module:

    <modules runAllManagedModulesForAllRequests="true">
      <add name="MonitorModule" type="WebSiteMonitor.Module.WebSiteMonitorHttpModule"/>
    </modules>

Under system.serviceModel, add the WCF configuration:

    <system.serviceModel>
      <serviceHostingEnvironment aspNetCompatibilityEnabled="true" multipleSiteBindingsEnabled="true" />
      <bindings>
        <netTcpBinding>
          <binding name="streamedNetTcpBinding" transferMode="Streamed" />
        </netTcpBinding>
      </bindings>
      <client>
        <endpoint address="net.tcp://localhost/eventService" binding="netTcpBinding"
          bindingConfiguration="streamedNetTcpBinding" contract="WebSiteMonitor.Contracts.IRequestService"
          name="Client" />
      </client>
    </system.serviceModel>

You may (probably will) need to specify the StreamInsight 2.1 instance name that you are using. This is in the app.config for the WebSiteMonitor.WinForms project under “applicationSettings”. The setting name is “StreamInsightInstance” (very creative, I know).

You’ll want to run the client app “As Administrator” or reserve the URLs for non-administrator users and accounts. If you are running from Visual Studio, run Visual Studio as Administrator. I tend to run as Administrator when testing and running the demo. In the real world, you’d reserve the URLs.

The TestWebSite project in the solution is a “New Web Site” template from Visual Studio and helps make sure that everything is set up properly. It also has the configuration settings.
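
Going back to the two fixes described at the top of this post, here is a minimal sketch of what each guard can look like. This is not the code from the download. The class names, the Guid element type, the method names and the choice of a population standard deviation are all assumptions made for illustration; only the _activeRequest HashSet, the lock, the count check and the NaN check come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the source's request tracking. Every mutation
// and every read of the HashSet goes through the same lock object.
public sealed class ActiveRequestTracker
{
    private readonly HashSet<Guid> _activeRequests = new HashSet<Guid>();
    private readonly object _syncRoot = new object();

    public void Add(Guid requestId)
    {
        lock (_syncRoot) { _activeRequests.Add(requestId); }
    }

    public void Remove(Guid requestId)
    {
        lock (_syncRoot) { _activeRequests.Remove(requestId); }
    }

    public int Count
    {
        get { lock (_syncRoot) { return _activeRequests.Count; } }
    }
}

// Hypothetical helper showing the two checks: one on the number of values
// before the calculation and one on the result after it.
public static class SafeStatistics
{
    // Assumption: population standard deviation (divide by N).
    public static double StandardDeviation(IEnumerable<double> source)
    {
        var values = source.ToList();
        if (values.Count < 1)
        {
            return 0d; // nothing to measure, so report 0 instead of dividing by zero
        }

        double mean = values.Average();
        double variance = values.Sum(v => (v - mean) * (v - mean)) / values.Count;
        double result = Math.Sqrt(variance);

        // A NaN result (for example, from infinite inputs) is reported as 0
        // rather than being handed back to the caller.
        return double.IsNaN(result) ? 0d : result;
    }
}
```

The point of both guards is the same: custom code hosted by the engine should never let an avoidable exception or an invalid value escape.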

Decoupling Queries from Input and Output

Code Sample | StreamInsight
There’s something that really disturbs me about how all of the StreamInsight samples are put together – everything is very, very tightly coupled. Streams are tightly bound to the source and the target of the data and there’s no good guidance on how to break all that stuff apart. I understand why – it certainly simplifies things – but that’s Bad Mojo™ and not how we want to build enterprise applications.

Let’s take a simple scenario. Let’s say that you are developing queries to detect key events for, say, an offshore well. You have a good idea what these events look like – what follows what – and you even have some recordings that you can play back at high speed to test your algorithms. All of your query code is built much like the demos are, so you’re pretty tightly bound to the source (the database) and your output (the console, just for giggles).

Now … how do you hook this up to your production sources? And you certainly won’t be using a console output for the events that you detect in production. So do you go in, real quick-like, and change all of your code before compiling it for production? If so … let me know how that goes for you. And don’t call me when things start falling apart while you are doing last-minute compiles and emergency deployments and the like.

This, however, is how all of the samples are written. Granted … they are samples and aren’t meant to be “ready for the real-world”. But, sadly, there’s precious little guidance out there on a better way to do this in StreamInsight.

With that said, however, this isn’t a new problem in software development. Oh, sure, the technology may be new and/or different, but the pattern of the problem certainly isn’t. What we need to do is abstract the specification of the consumer or producer from its usage when creating the streams.
Fortunately, we’ve already defined a pretty well abstracted factory pattern for constructing our producers and consumers, even in the context of Reactive-based streams, so this helps (and was part of the method to my madness!).

In addition to abstracting the producers and consumers, we also need to have a pattern for reuse of query logic. Take this scenario as an example: we’re monitoring offshore platforms in real-time. We have a set of queries that we use to process the sensor readings from the platforms and this logic should be applied to each of the platforms – it detects the key events that we’re concerned with. The only difference between each set of queries is the source of the data (different connections), the metadata associated with the queries and, possibly, the outputs for detected events (though, using subjects, we can aggregate the output events for a single output stream).

Enter the Stream Builder

This becomes our unit of query re-use. Using a defined interface and implementing classes based on this interface, we can encapsulate the query logic into a reusable “chunk” that gets built together. We can use this to start and stop the queries/process (depending on which model we are using) as well as to provide configuration for different parameters and the producers and consumers. Let’s start with the base class for configuration.
StreamBuilderConfiguration

    public class StreamBuilderConfiguration : IConfiguration
    {
        private readonly Dictionary<string, EventComponentDefinition> _eventProducers =
            new Dictionary<string, EventComponentDefinition>();

        public IDictionary<string, EventComponentDefinition> EventProducers
        {
            get { return _eventProducers; }
        }

        private readonly Dictionary<string, EventComponentDefinition> _eventConsumers =
            new Dictionary<string, EventComponentDefinition>();

        public IDictionary<string, EventComponentDefinition> EventConsumers
        {
            get { return _eventConsumers; }
        }

        public string Name
        {
            get; set;
        }

        public virtual EventComponentDefinition DefaultConsumer
        {
            get
            {
                return new EventComponentDefinition()
                {
                    ComponentType = typeof(NullDataConsumerConfig),
                    Configuration = new NullDataConsumerConfig()
                };
            }
        }
    }

First, you’ll notice that we define dictionaries of EventComponentDefinitions. What is this? Well, keep in mind that we need a factory and a configuration to create our producers and consumers. So … this is what the EventComponentDefinition class encapsulates.

EventComponentDefinition

    public class EventComponentDefinition
    {
        public object Configuration { get; set; }
        public Type ComponentType { get; set; }
    }

In this case, the type for “ComponentType” is the producer/consumer factory class. So … now we have a way to abstract the definition of the consumers and producers (in the configuration) as well as a way to find them. In case you haven’t guessed yet, this provides an inversion of control that uses the dictionary lookup to locate the appropriate service. Now, the producer and/or consumer must still handle the type of the payload for the stream, and we don’t have anything (yet) that checks and/or ensures that this is actually correct and compatible, but we now have a contract for specifying these items and a contract for creation.
Finally, so that we have an interface that we can bind to without having to worry about generics, we’ll extract the Start and the Stop methods into an interface. It’s all nicely abstracted now and ready for us to create our stream builder.

StreamBuilder

    public interface IStreamBuilder
    {
        void Start(Microsoft.ComplexEventProcessing.Application cepApplication);

        void Stop();
    }

    public abstract class StreamBuilder<TConfiguration> : IStreamBuilder
        where TConfiguration : StreamBuilderConfiguration
    {
        protected StreamBuilder(TConfiguration configuration)
        {
            this.Configuration = configuration;
        }

        public TConfiguration Configuration
        {
            get;
            private set;
        }

        protected Application CurrentApplication
        {
            get;
            set;
        }

        public abstract void Start(Microsoft.ComplexEventProcessing.Application cepApplication);

        public abstract void Stop();

        protected EventComponentDefinition GetConsumer(string name)
        {
            if (Configuration.EventConsumers.ContainsKey(name))
            {
                return Configuration.EventConsumers[name];
            }
            if (Configuration.DefaultConsumer != null)
            {
                return Configuration.DefaultConsumer;
            }
            throw new InvalidOperationException(string.Format(ExceptionMessages.CONSUMER_NOT_FOUND, name));
        }

        protected EventComponentDefinition GetProducer(string name)
        {
            if (Configuration.EventProducers.ContainsKey(name))
            {
                return Configuration.EventProducers[name];
            }

            throw new InvalidOperationException(string.Format(ExceptionMessages.PRODUCER_NOT_FOUND, name));
        }
    }

We also have nice “helper” methods to get the producers and the consumers. Since a consumer isn’t really required, we also have a default consumer – the null consumer – in case it’s not specified. However, the producer is absolutely required so if it’s not found (not provided), we throw an exception.
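
To see the lookup rules from the helper methods in isolation, here is a small, self-contained sketch that does not depend on StreamInsight. The ComponentDefinition and ComponentRegistry types are hypothetical stand-ins for EventComponentDefinition and the builder/configuration pair, and the exception messages are placeholders. It uses TryGetValue rather than ContainsKey followed by the indexer, which is only a stylistic alternative and not a change in behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for EventComponentDefinition.
public sealed class ComponentDefinition
{
    public Type ComponentType { get; set; }
    public object Configuration { get; set; }
}

// Hypothetical registry that mirrors the rules described above:
// consumers are optional (fall back to a default), producers are required.
public sealed class ComponentRegistry
{
    public IDictionary<string, ComponentDefinition> Producers { get; private set; }
    public IDictionary<string, ComponentDefinition> Consumers { get; private set; }
    public ComponentDefinition DefaultConsumer { get; set; }

    public ComponentRegistry()
    {
        Producers = new Dictionary<string, ComponentDefinition>();
        Consumers = new Dictionary<string, ComponentDefinition>();
    }

    public ComponentDefinition GetConsumer(string name)
    {
        ComponentDefinition found;
        if (Consumers.TryGetValue(name, out found))
        {
            return found;
        }
        if (DefaultConsumer != null)
        {
            return DefaultConsumer; // optional component: use the fallback
        }
        throw new InvalidOperationException("Consumer not found: " + name);
    }

    public ComponentDefinition GetProducer(string name)
    {
        ComponentDefinition found;
        if (Producers.TryGetValue(name, out found))
        {
            return found;
        }
        // required component: there is no sensible fallback
        throw new InvalidOperationException("Producer not found: " + name);
    }
}
```

Asking this registry for an unregistered consumer returns DefaultConsumer when one is set, while asking for an unregistered producer always throws.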
And since the interface has both a Start and a Stop, we can use this nice layer of abstraction to manage our running queries in a consistent way, regardless of which StreamInsight API we are using for the queries. From here, we’ll create more specific implementations for the Reactive and the Query models.

ProcessBuilder Implementation

With the Reactive model introduced in StreamInsight 2.1, the process is the unit of execution. Related queries are started, executed and stopped as a single unit. You can have multiple bindings in each process for both input and output and they all start and stop together. So … it makes sense that our ProcessBuilder will build a single process with all of the related streams bound together in that process. We’ll also abstract the code that we need to write for every source stream and for binding every output stream.

The trick for handling the bindings is simple … have them added to a list that gets built up and then, when it’s time, bind them all together using With and run them in a single process. Of course, abstracting the bindings also allows us to do this pretty easily. Overriding the StreamBuilder’s Start method allows us to wrap up all of the necessary housekeeping to get started as well as to bind all of the streams together and run them in a single process. We’ll also define a “CreateStreams” method (as abstract) … our concrete implementations will override this method to do the work of creating the streams.
ProcessBuilder

    public abstract class ProcessBuilder<TConfiguration> : StreamBuilder<TConfiguration>
        where TConfiguration : StreamBuilderConfiguration
    {
        protected ProcessBuilder(TConfiguration configuration) : base(configuration)
        {
        }

        private List<IRemoteBinding> _bindings;
        private IDisposable _runningBindings;

        public override void Start(Application cepApplication)
        {
            _bindings = new List<IRemoteBinding>();
            CurrentApplication = cepApplication;
            CreateStreams();

            // Chain every binding onto the first one so they all run in a single process.
            var bindingList = _bindings.Skip(1).Aggregate(_bindings.First(),
                (current, source) => current.With(source));
            _runningBindings = bindingList.Run(Configuration.Name);
        }

        public override void Stop()
        {
            if (_runningBindings != null)
            {
                _runningBindings.Dispose();
            }
        }

        protected IQStreamable<TPayload> CreateStream<TPayload>(string producerName, EventShape eventShape)
        {
            var producer = Configuration.EventProducers[producerName];
            return RxStream<TPayload>.Create(CurrentApplication, producer.ComponentType, producer.Configuration, eventShape);
        }

        protected IQStreamable<TPayload> CreateStream<TPayload>(string producerName, EventShape eventShape,
            AdvanceTimeSettings advanceTimeSettings)
        {
            var producer = Configuration.EventProducers[producerName];
            return RxStream<TPayload>.Create(CurrentApplication, producer.ComponentType, producer.Configuration, eventShape, advanceTimeSettings);
        }

        protected void Bind<TPayload>(IQStreamable<TPayload> stream, string consumerName, EventShape eventShape)
        {
            var consumer = GetConsumer(consumerName);
            _bindings.Add(stream.ToBinding(CurrentApplication, consumer, eventShape));
        }

        protected abstract void CreateStreams();
    }

Creating a new process builder is super-simple and really cuts down on the amount of code that you need to create and bind your streams.
Both your event producers and consumers are now named resources – the process builder has no idea what the source or the targets are and it doesn’t need to know.

SampleProcessBuilder

    public class SampleProcessBuilder : ProcessBuilder<SampleStreamBuilderConfig>
    {
        public SampleProcessBuilder(SampleStreamBuilderConfig configuration)
            : base(configuration)
        {
        }

        protected override void CreateStreams()
        {
            var data = CreateStream<TestDataEvent>("SourceData",
                EventShape.Point,
                Configuration.GetAdvanceTimeSettings());

            Bind(data, "Output", EventShape.Point);

            var aggregate = from d in data
                            group d by d.ItemId
                            into itemGroups
                            from i in itemGroups.HoppingWindow(TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(2))
                            select new
                            {
                                ItemId = itemGroups.Key,
                                Average = i.Avg(e => e.Value)
                            };
            Bind(aggregate, "Aggregate", EventShape.Point);
        }
    }

CepStreamBuilder Implementation

In keeping with our principle to keep the APIs as close as possible, we’ll also have a CepStreamBuilder that looks and acts the same way as our ProcessBuilder. However, since the pre-2.1 model has no concept of a Process with multiple inputs and outputs that run as a single unit of execution, we have a little extra work to do. Prior to the reactive model, the query was the unit of execution – it could have multiple input streams but only one output. Each query was independent of all other queries as well and there was no way to conveniently run related queries together. Reuse of query outputs was done through a technique called Dynamic Query Composition (DQC) that re-enqueued the output of one query back into a new stream, which then fed another query.
If you looked at the details of how this was done, it was a special input/output adapter pair. In essence, it was a subject without the flexibility or extensibility that we have with subjects in the Reactive model. Finally, we’ll also need to take into consideration the potential for name clashes when we name our queries – not something that we needed to worry about when working with the process container – and a way to “group” our related queries together – something the process container does inherently but not something that the query model did for you.

QueryBuilder

    public abstract class QueryBuilder<TConfiguration> : StreamBuilder<TConfiguration>
        where TConfiguration : StreamBuilderConfiguration
    {
        private List<Query> _queries;

        protected QueryBuilder(TConfiguration configuration) : base(configuration)
        {
        }

        public override void Start(Application cepApplication)
        {
            _queries = new List<Query>();
            CurrentApplication = cepApplication;
            CreateStreams();

            //Start all of the queries.
            foreach (var query in _queries)
            {
                query.Start();
            }
        }

        public override void Stop()
        {
            //Nothing to stop if Start was never called (or Stop already ran).
            if (_queries == null)
            {
                return;
            }
            foreach (var query in _queries)
            {
                query.Stop();
            }
            foreach (var query in _queries)
            {
                query.Delete();
            }
            _queries = null;
        }

        protected abstract void CreateStreams();

        protected CepStream<TPayload> CreateStream<TPayload>(string producerName, EventShape eventShape)
        {
            var producer = Configuration.EventProducers[producerName];
            return CepStream<TPayload>.Create(CurrentApplication,
                GetSourceStreamName(producerName), producer.ComponentType,
                producer.Configuration, eventShape);
        }

        protected CepStream<TPayload> CreateStream<TPayload>(string producerName, EventShape eventShape,
            AdvanceTimeSettings advanceTimeSettings)
        {
            var producer = Configuration.EventProducers[producerName];
            return CepStream<TPayload>.Create(CurrentApplication,
                GetSourceStreamName(producerName), producer.ComponentType,
                producer.Configuration, eventShape, advanceTimeSettings);
        }

        protected void Bind<TPayload>(CepStream<TPayload> stream, string consumerName, EventShape eventShape)
        {
            var consumer = GetConsumer(consumerName);
            _queries.Add(stream.ToQuery(CurrentApplication, GetQueryName(consumerName),
                GetQueryDescription(consumerName), consumer.ComponentType,
                consumer.Configuration, eventShape, StreamEventOrder.FullyOrdered));
        }

        private string GetQueryDescription(string consumerName)
        {
            return "Query from QueryBuilder [" + Configuration.Name + "] for consumer [" + consumerName + "]";
        }

        //Prefixing with the configuration name avoids clashes between builders.
        protected string GetQueryName(string consumerName)
        {
            return Configuration.Name + "." + consumerName;
        }

        protected string GetSourceStreamName(string producerName)
        {
            return Configuration.Name + "." + producerName + ".Stream";
        }
    }

And now, creating a query builder is, like the process builder, super-simple. In fact, we can copy and paste the code from the process builder into the query builder. The only difference is the name of the base class.

SampleQueryBuilder

    public class SampleQueryBuilder : QueryBuilder<SampleStreamBuilderConfig>
    {
        public SampleQueryBuilder(SampleStreamBuilderConfig configuration) : base(configuration)
        {
        }

        protected override void CreateStreams()
        {
            var data = CreateStream<TestDataEvent>("SourceData",
                EventShape.Point,
                Configuration.GetAdvanceTimeSettings());

            Bind(data, "Output", EventShape.Point);

            var aggregate = from d in data
                            group d by d.ItemId
                            into itemGroups
                            from i in itemGroups.HoppingWindow(TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(2))
                            select new
                            {
                                ItemId = itemGroups.Key,
                                Average = i.Avg(e => e.Value)
                            };
            Bind(aggregate, "Aggregate", EventShape.Point);
        }
    }

Using the Stream Builders

Again, this is pretty simple. You create the configuration. You create the builder. You start the builder. You stop the builder. Regardless of whether it’s a QueryBuilder or a ProcessBuilder, the code to do this is the same and it is bound to the IStreamBuilder interface, rather than the specific implementations. Most of the code is now, actually, in creating the configuration to pass to the builder. Right now this is hard-coded but, in time, it won’t be; we’re pretty close to having all of the core, foundational pieces in place to further abstract even the configuration.
RunBuilder

    private static void RunBuilder(Application cepApplication, Type builderType)
    {
        var builderConfig = GetSampleStreamBuilderConfig();
        var builder = (IStreamBuilder) Activator.CreateInstance(builderType, builderConfig);
        builder.Start(cepApplication);

        Console.WriteLine("Builder is running. Press ENTER to stop.");
        Console.ReadLine();
        builder.Stop();
    }

    private static SampleStreamBuilderConfig GetSampleStreamBuilderConfig()
    {
        //Create the configuration.
        var streamProviderConfig = new SampleStreamBuilderConfig()
        {
            Name = "Provider1",
            CtiDelay = TimeSpan.FromMilliseconds(750),
            CtiInterval = TimeSpan.FromMilliseconds(250)
        };

        //Add the producer.
        streamProviderConfig.EventProducers.Add("SourceData", new EventComponentDefinition()
        {
            Configuration = new TestDataInputConfig()
            {
                NumberOfItems = 20,
                RefreshInterval = TimeSpan.FromMilliseconds(500),
                TimestampIncrement = TimeSpan.FromMilliseconds(500),
                AlwaysUseNow = true,
                EnqueueCtis = false
            },
            ComponentType = typeof (TestDataInputFactory)
        });

        //Add the consumers.
        streamProviderConfig.EventConsumers.Add("Output", new EventComponentDefinition()
        {
            Configuration = new ConsoleOutputConfig()
            {
                ShowCti = true,
                CtiEventColor = ConsoleColor.Blue,
                InsertEventColor = ConsoleColor.Green
            },
            ComponentType = typeof (ConsoleOutputFactory)
        });
        streamProviderConfig.EventConsumers.Add("Aggregate", new EventComponentDefinition()
        {
            Configuration = new ConsoleOutputConfig()
            {
                ShowCti = true,
                CtiEventColor = ConsoleColor.White,
                InsertEventColor = ConsoleColor.Yellow
            },
            ComponentType = typeof (ConsoleOutputFactory)
        });
        return streamProviderConfig;
    }

Wrapping Up

We’ve made a good deal of progress with this revision.
We focused first on creating abstractions for the producers and consumers – they’re the beginning and the end, after all, of your StreamInsight application – and now added a framework for the chewy middle. I had a couple of questions from folks as to why I made some of the architectural decisions, particularly with having factories for the Reactive-model sinks and producers. Hopefully some of that is a little clearer now but, as we move forward, there will be additional things that we may do with the factories.

There are also some changes and tweaks included in this download that aren’t described in the blog, primarily around the serialization/deserialization for the event consumers and producers, that were done while I was building the demo for Baton Rouge SQL Saturday. There’s still some work to do around these to optimize the performance, particularly for the serialization side of the equation, that I’ll get to sometime later. I was, however, quite proud that I was able to shave about 3 ticks off the deserialization performance. Now, this doesn’t sound like much (and it’s not; a tick is 100 nanoseconds) but when you realize that you may be doing this process tens of thousands of times per second, it adds up really quickly.

So, what’s next? There are two key pieces that we’ll need to put into place … abstracting the application hosting mode and then a configuration system. And then lots to build out … common query patterns in LINQ macros, some aggregates and operators and building out more of the stream builder classes/model, to name a couple. Oh, and let’s not forget that we’ll need to have some of the basic, common event producers and consumers.

Last, but not least, you can download the code from here:

How long did that edge event take?–The Sequel

Code Sample | StreamInsight
This is beginning to become a recurring theme around here. It turns out that there is yet another way to get a composite event that will tell you how long a particular event “took”. This goes back to the initial post, where I was using a subject to create a new event that included the original payload but with a Duration property, allowing you to get the total milliseconds for a specific event.

You see, subjects have become one of my mostest favoritest features with the whole Reactive model in StreamInsight 2.1; there’s just so much that they let you do that you couldn’t do before. But this particular scenario was possible without subjects and it really should have smacked me in the face. Rather than using a subject, we can use a UDSO (specifically, an edge stream operator) to achieve the exact same thing but with fewer moving parts. Oh … and using a UDSO doesn’t create an independent timeline, whereas using the Subject does create a new timeline in the process. Overall, the UDSO is far simpler, easier to understand and you don’t have to worry about any CTI violations or funkiness thereabouts. With that, here’s the operator:

EventDurationOperator

    public class DurationEvent<TPayload>
    {
        public TPayload Payload;
        public double TotalMilliseconds;
    }

    [DataContract]
    public sealed class EventDurationOperator<TPayload> : CepEdgeStreamOperator<TPayload, DurationEvent<TPayload>>
    {
        public override bool IsEmpty
        {
            get { return true; }
        }

        public override DateTimeOffset? NextCti
        {
            get { return null; }
        }

        public override IEnumerable<DurationEvent<TPayload>> ProcessEvent(EdgeEvent<TPayload> inputEvent)
        {
            if (inputEvent.EdgeType == EdgeType.End)
            {
                //Create the new duration event.
                yield return new DurationEvent<TPayload>
                {
                    Payload = inputEvent.Payload,
                    TotalMilliseconds = (inputEvent.EndTime - inputEvent.StartTime).TotalMilliseconds
                };
            }
        }
    }

As you can see, the output here is identical to the output from the previous solution but the code is far, far simpler. Using the operator is simpler than the subject as well.

Using the Operator

    var source = items.ToEdgeStreamable(e => e.GetEvent(startTime), AdvanceTimeSettings.IncreasingStartTime);
    var results = source.Scan(() => new EventDurationOperator<DataItem>());

That makes the operator, rather than the subject, the winner here … it keeps true to that Golden Rule of Programming – the KISS principle. (That’s Keep It Simple, Stupid, in case you were wondering.)

Dual-Mode Data Sinks - Part I

Code Sample | StreamInsight
Now that we have the input completed, we need to start working on the output adapters. As with the input adapters/sources, we’ll create an architecture that allows you to use the same core code whether you are using the pre-2.1 adapter model or the 2.1 and later sink model.

As with our StreamInputEvent, we’ll create an abstraction that allows us to handle any event shape with the same code. This StreamOutputEvent should have properties that express all of the possible variations in event shapes as well as a generic property to hold the payload. Now, if you look at several of the StreamInsight samples, you’ll notice that they only send the payload to the sink. Certainly, that makes a couple of things easier but I don’t think that it’s really the best way to do things. A key aspect of everything StreamInsight is the temporal properties and you lose that if you don’t send the full event shape to your sink.

And … the shape is really only a way of expressing the event. Internally, all events, regardless of shape, have start and end times that control their lifetime in the query engine. The shape really becomes a way of how you want to “see” the events and handle them in your output. There’s nothing that stops you from expressing the same query as an edge, a point or an interval. It will impact when the event gets “released” to the output adapters/sink by the engine but really, it’s just a matter of how you want to see the event and when you want to get it to your output. But, before I go any further, here’s our StreamOutputEvent:

StreamOutputEvent

    public class StreamOutputEvent<TPayload>
    {
        ///<summary>
        /// Creates an output event from a source point event.
        ///</summary>
        ///<param name="sourceEvent">The source event.</param>
        ///<returns>The output event.</returns>
        public static StreamOutputEvent<TPayload> Create(PointEvent<TPayload> sourceEvent)
        {
            var outputEvent = new StreamOutputEvent<TPayload>()
            {
                StartTime = sourceEvent.StartTime,
                EventKind = sourceEvent.EventKind,
                EventShape = EventShape.Point
            };
            //Only insert events carry a payload.
            if (sourceEvent.EventKind == EventKind.Insert)
            {
                outputEvent.Payload = sourceEvent.Payload;
            }
            return outputEvent;
        }

        ///<summary>
        /// Creates an output event from a source interval event.
        ///</summary>
        ///<param name="sourceEvent">The source event.</param>
        ///<returns>The output event.</returns>
        public static StreamOutputEvent<TPayload> Create(IntervalEvent<TPayload> sourceEvent)
        {
            var outputEvent = new StreamOutputEvent<TPayload>()
            {
                StartTime = sourceEvent.StartTime,
                EventKind = sourceEvent.EventKind,
                EventShape = EventShape.Interval
            };
            if (sourceEvent.EventKind == EventKind.Insert)
            {
                outputEvent.EndTime = sourceEvent.EndTime;
                outputEvent.Payload = sourceEvent.Payload;
            }
            return outputEvent;
        }

        ///<summary>
        /// Creates an output event from a source edge event.
        ///</summary>
        ///<param name="sourceEvent">The source event.</param>
        ///<returns>The output event.</returns>
        public static StreamOutputEvent<TPayload> Create(EdgeEvent<TPayload> sourceEvent)
        {
            var outputEvent = new StreamOutputEvent<TPayload>()
            {
                StartTime = sourceEvent.StartTime,
                EventKind = sourceEvent.EventKind,
                EventShape = EventShape.Edge
            };
            if (sourceEvent.EventKind == EventKind.Insert)
            {
                outputEvent.Payload = sourceEvent.Payload;
                outputEvent.EdgeType = sourceEvent.EdgeType;
                //Only the end edge has an end time.
                if (sourceEvent.EdgeType == Microsoft.ComplexEventProcessing.EdgeType.End)
                {
                    outputEvent.EndTime = sourceEvent.EndTime;
                }
            }
            return outputEvent;
        }

        public DateTimeOffset StartTime { get; private set; }
        public EventKind EventKind { get; private set; }
        public DateTimeOffset? EndTime { get; private set; }
        public EventShape EventShape { get; private set; }
        public EdgeType? EdgeType { get; private set; }
        public TPayload Payload { get; private set; }
    }

With a static Create method for each event shape, this is something that we can easily create from an event stream and then send to a single set of code that handles the outbound event. When creating the sink and hooking it to the stream, it’s a matter of how you create your observer and the type specified for the TElement. Very simply, the key thing that dictates what StreamInsight “sends” to your sink is the type for the observer’s generic class parameter. If you specify IObserver<TPayload>, you’ll only get the payload. However, if you specify IObserver<PointEvent<TPayload>>, you’ll get a point event (and so on for intervals and edges). Since our event consumer should be able to consume events of any shape, we will actually need to implement the observer interface for each of the shapes.
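
Implementing the same generic interface several times on one class can look odd if you haven’t done it before, so here is a minimal sketch of just that mechanism, with no StreamInsight types involved. PointLike<T> and IntervalLike<T> are hypothetical stand-ins for PointEvent<T> and IntervalEvent<T>, and MultiShapeObserver<T> is an illustrative name, not a class from the download.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for two event shapes.
public sealed class PointLike<T> { public T Payload; }
public sealed class IntervalLike<T> { public T Payload; }

// One class, two closed IObserver<> interfaces. The OnNext overloads are
// distinguished by parameter type; OnCompleted and OnError are shared.
public sealed class MultiShapeObserver<T> :
    IObserver<PointLike<T>>,
    IObserver<IntervalLike<T>>
{
    public readonly List<string> Log = new List<string>();

    public void OnNext(PointLike<T> value)
    {
        Log.Add("point:" + value.Payload);
    }

    public void OnNext(IntervalLike<T> value)
    {
        Log.Add("interval:" + value.Payload);
    }

    public void OnCompleted()
    {
        Log.Add("completed");
    }

    public void OnError(Exception error)
    {
        Log.Add("error:" + error.Message);
    }
}

public static class MultiShapeObserverDemo
{
    public static List<string> Run()
    {
        var observer = new MultiShapeObserver<int>();

        // The static type of the reference decides which "view" the caller
        // sees; this mirrors how the observer's type parameter decides what
        // gets delivered to the sink.
        IObserver<PointLike<int>> asPoint = observer;
        IObserver<IntervalLike<int>> asInterval = observer;

        asPoint.OnNext(new PointLike<int> { Payload = 1 });
        asInterval.OnNext(new IntervalLike<int> { Payload = 2 });
        asPoint.OnCompleted();

        return observer.Log;
    }
}
```

Both interface references point at the same object, so everything funnels into one place. That is the property the base consumer class relies on when it translates every shape into a StreamOutputEvent.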
While it may be tempting to try to implement one interface based on TypedEvent<T>, it won’t work. Yes, I tried. But StreamInsight requires that your sinks specify the payload as the type for the observer or one of the TypedEvent<T> child classes. If you don’t specify an event shape, StreamInsight will handle the events and release them to the output adapter as though they were point events. For some very basic scenarios, this works. But when you start getting into some of the more interesting scenarios for StreamInsight, you’ll want to do a lot more with your output than to view it as a point. But … this is for the next post. Let’s get back to our implementation. As we did with our input producer, we’ll create a common, abstract base class for all of our event consumers. Our concrete consumers will inherit from this class and handle whatever is necessary to write the data to our target, whatever it may be. Again, as with the producer, we’ll specify a configuration class; while the reactive StreamInsight API no longer requires it, the reality is that you will want to have a configuration class for your sinks; you don’t want to hard-code things like database connection strings, web service target URIs or anything like that in your event consumers. But the key thing that our base class will do is to implement the interface for each of the event shapes, translate them to our StreamOutputEvent and then send to our actual event consumer’s code. 
StreamEventConsumer

    public abstract class StreamEventConsumer<TPayloadType, TConfigType> :
        IObserver<PointEvent<TPayloadType>>,
        IObserver<EdgeEvent<TPayloadType>>,
        IObserver<IntervalEvent<TPayloadType>>
    {
        protected StreamEventConsumer(TConfigType configuration)
        {
            this.Configuration = configuration;
        }

        public TConfigType Configuration { get; private set; }

        public abstract void Completed();
        public abstract void Error(Exception error);
        public abstract void EventReceived(StreamOutputEvent<TPayloadType> outputEvent);

        public void OnNext(PointEvent<TPayloadType> value)
        {
            EventReceived(StreamOutputEvent<TPayloadType>.Create((PointEvent<TPayloadType>)value));
        }

        public void OnNext(IntervalEvent<TPayloadType> value)
        {
            EventReceived(StreamOutputEvent<TPayloadType>.Create((IntervalEvent<TPayloadType>)value));
        }

        public void OnNext(EdgeEvent<TPayloadType> value)
        {
            EventReceived(StreamOutputEvent<TPayloadType>.Create((EdgeEvent<TPayloadType>)value));
        }

        public void OnCompleted()
        {
            Completed();
        }

        public void OnError(Exception error)
        {
            Error(error);
        }
    }

From here, we’ll continue to build the architecture much the same way that we built the event producers. For each consumer, we’ll have a factory that handles the details of creating and starting the consumer based on a common interface.

ISinkFactory

    interface ISinkFactory
    {
        IObserver<PointEvent<TPayload>> CreatePointObserverSink<TPayload>(object config);
        IObserver<EdgeEvent<TPayload>> CreateEdgeObserverSink<TPayload>(object config);
        IObserver<IntervalEvent<TPayload>> CreateIntervalObserverSink<TPayload>(object config);
    }

Between StreamEventConsumer and ISinkFactory, it’s now a small step to create a concrete factory and consumer – as well as the adapters. For simplicity’s sake, we’ll use a console consumer.
Console Data Consumer

    public class ConsoleDataConsumer<TPayloadType> : StreamEventConsumer<TPayloadType, ConsoleOutputConfig>
    {
        public ConsoleDataConsumer(ConsoleOutputConfig configuration) : base(configuration)
        {
        }

        public override void Completed()
        {
            //Nothing necessary.
        }

        public override void Error(Exception error)
        {
            Console.WriteLine("Error occurred:" + error.ToString());
        }

        public override void EventReceived(StreamOutputEvent<TPayloadType> outputEvent)
        {
            if (outputEvent.EventKind == EventKind.Insert)
            {
                Console.ForegroundColor = Configuration.InsertEventColor;
                Console.WriteLine("Insert Event Received at " + outputEvent.StartTime);
            }
            else if (Configuration.ShowCti)
            {
                Console.ForegroundColor = Configuration.CtiEventColor;
                Console.WriteLine("CTI event received at " + outputEvent.StartTime);
            }
            Console.ResetColor();
        }
    }

Our factory handles the dirty details of hooking up to our source streams as well as our output adapters. By forcing the factory to implement methods for each event shape, we ensure that we get the interface that we need when creating the sink and tying it to our streams. We also get the opportunity to say, in code, that a particular output sink doesn’t support specific shapes, should that be appropriate.
ConsoleOutputFactory

    public class ConsoleOutputFactory : ISinkFactory, ITypedOutputAdapterFactory<ConsoleOutputConfig>
    {
        public IObserver<PointEvent<TPayload>> CreatePointObserverSink<TPayload>(object config)
        {
            return new ConsoleDataConsumer<TPayload>((ConsoleOutputConfig)config);
        }

        public IObserver<EdgeEvent<TPayload>> CreateEdgeObserverSink<TPayload>(object config)
        {
            return new ConsoleDataConsumer<TPayload>((ConsoleOutputConfig)config);
        }

        public IObserver<IntervalEvent<TPayload>> CreateIntervalObserverSink<TPayload>(object config)
        {
            return new ConsoleDataConsumer<TPayload>((ConsoleOutputConfig)config);
        }

        public OutputAdapterBase Create<TPayload>(ConsoleOutputConfig configInfo, EventShape eventShape)
        {
            //Each case returns directly, so no break statements are needed.
            switch (eventShape)
            {
                case EventShape.Interval:
                    return new ObserverTypedIntervalOutputAdapter<TPayload>(CreateIntervalObserverSink<TPayload>(configInfo));
                case EventShape.Edge:
                    return new ObserverTypedEdgeOutputAdapter<TPayload>(CreateEdgeObserverSink<TPayload>(configInfo));
                case EventShape.Point:
                    return new ObserverTypedPointOutputAdapter<TPayload>(CreatePointObserverSink<TPayload>(configInfo));
                default:
                    throw new ArgumentOutOfRangeException("eventShape");
            }
        }

        public void Dispose()
        {
            //throw new NotImplementedException();
        }
    }

We’ve not touched on the output adapters yet so now’s the time to introduce them; they are, after all, already referenced in our factory. As with our producers, a single factory serves both the 2.1+ Reactive model and the legacy adapter model.
As with our input adapters, the output adapters are relatively thin wrappers around our event consumers that handle the details of lifetime. Unlike the input adapters, with the output adapters, we may well get some data after our “stop” event and we want to make sure that we dequeue all events before shutting down. To control this a little better, we use Monitor.Enter and Monitor.Exit directly rather than the basic lock{} block provided by C#. The lock block, by the way, creates, behind the scenes, a Monitor.Enter/Monitor.Exit pair. However, using this directly allows us to minimize the possibility of deadlocks if we get into a scenario where we are actively dequeuing events when we get a Resume call. By using Monitor.TryEnter(), we can attempt to enter our dequeuing thread from other threads without blocking. If the lock has already been acquired, we don’t need to spin up another thread to dequeue and we certainly don’t need to block waiting for a lock that we won’t actually need once we get it. Our dequeue thread will continue to dequeue 1 event at a time until nothing is left in the queue. And we need to make sure that the dequeue operation is synchronized – only 1 thread can dequeue at a time anyway. Adding multiple threads to the dequeue operation typically won’t help us and we want to make sure that we have all available threads available to process actual query results. Now … once we’ve dequeued, we may want to use techniques to multi-thread sending the results to the final target. But … our actual dequeue from each query/process should be single threaded. Keep in mind, however, that you’ll have multiple, single-threaded output sinks in most real-world applications. You will be multi-threaded, have no worries there. And calls into our event consumers can come from any thread, which is why we need to use locks to make sure that we’re properly synchronized. This is particularly important when our output adapter is stopped. 
After stop is called, we’ll get one more chance to empty our queue. We use the monitor to make sure that we do empty all available events from the queue before calling Stopped(). This ensures that we’ll have a nice, clean shutdown with no hangs and no ObjectDisposedExceptions.

Point Output Adapter

    public class ObserverTypedPointOutputAdapter<TPayloadType>
        : TypedPointOutputAdapter<TPayloadType>
    {
        private readonly IObserver<PointEvent<TPayloadType>> _sinkObserver;

        public ObserverTypedPointOutputAdapter(IObserver<PointEvent<TPayloadType>> sinkObserver)
        {
            _sinkObserver = sinkObserver;
        }

        public override void Stop()
        {
            //Acquire the lock before the try block so that the finally
            //block only releases a lock that we actually hold.
            Monitor.Enter(_monitorObject);
            try
            {
                //One last round to dequeue
                EmptyQueue();
                //Completed
                _sinkObserver.OnCompleted();
            }
            finally
            {
                Monitor.Exit(_monitorObject);
            }
            base.Stop();
            Stopped();
        }

        public override void Resume()
        {
            System.Threading.Thread thd = new Thread(DequeueEvents);
            thd.Start();
        }

        public override void Start()
        {
            System.Threading.Thread thd = new Thread(DequeueEvents);
            thd.Start();
        }

        private object _monitorObject = new object();

        private void DequeueEvents()
        {
            if (this.AdapterState != AdapterState.Running)
            {
                return;
            }
            //Ensures only 1 thread is dequeuing and no other threads are blocked.
            if (Monitor.TryEnter(_monitorObject))
            {
                try
                {
                    EmptyQueue();
                }
                catch (Exception ex)
                {
                    _sinkObserver.OnError(ex);
                }
                finally
                {
                    Monitor.Exit(_monitorObject);
                    this.Ready();
                }
            }
        }

        private void EmptyQueue()
        {
            PointEvent<TPayloadType> dequeuedEvent;

            while (this.Dequeue(out dequeuedEvent) == DequeueOperationResult.Success)
            {
                _sinkObserver.OnNext(dequeuedEvent);
            }
        }
    }

Now that we have all of the core pieces in place, let’s take a look at what we need to do to hook our sink up to the console output. It’s actually very simple.

Hooking up to a stream

    private static void RunProcess(Application cepApplication)
    {
        var config = new TestDataInputConfig()
        {
            NumberOfItems = 20,
            RefreshInterval = TimeSpan.FromMilliseconds(500)
        };
        var data = RxStream<TestDataEvent>.Create(cepApplication, typeof(TestDataInputFactory), config, EventShape.Point);
        var factory = new ConsoleOutputAdapter.ConsoleOutputFactory();
        var sink = cepApplication.DefineObserver(() => factory.CreatePointObserverSink<TestDataEvent>
            (new ConsoleOutputAdapter.ConsoleOutputConfig()
            {
                ShowCti = true,
                CtiEventColor = ConsoleColor.Blue,
                InsertEventColor = ConsoleColor.Green
            }));
        data.Bind(sink).Run();
    }

There’s one thing that you may notice … the sink needs to know all of the details about the data class. This is far from ideal … and one of the things that I found so powerful about the untyped adapter model was that you weren’t tied to the schema of your data classes. There are various ways that we can handle this but that’s a topic for the next entry. Until then, you can download the code from my SkyDrive.
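To see the dequeue locking from earlier in this post in isolation, here is a minimal, self-contained sketch that swaps the output adapter for a plain ConcurrentQueue. SingleDrainQueue, TryDrain and Delivered are made-up names for illustration and are not part of StreamInsight. The sketch only shows the shape of the pattern: Monitor.TryEnter for the opportunistic drain, and a blocking lock for the final pass on stop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for an output adapter: a queue plus one drain loop.
public class SingleDrainQueue
{
    private readonly ConcurrentQueue<int> _queue = new ConcurrentQueue<int>();
    private readonly object _monitorObject = new object();
    private readonly List<int> _delivered = new List<int>();

    // Items handed to the "sink", in delivery order.
    public IReadOnlyList<int> Delivered { get { return _delivered; } }

    public void Enqueue(int value)
    {
        _queue.Enqueue(value);
    }

    // Returns true if this caller did the draining, false if another thread
    // already held the lock (in which case we walk away without blocking).
    public bool TryDrain()
    {
        if (!Monitor.TryEnter(_monitorObject))
        {
            return false;
        }
        try
        {
            EmptyQueue();
            return true;
        }
        finally
        {
            Monitor.Exit(_monitorObject);
        }
    }

    // Blocking variant for shutdown: wait for any active drain to finish,
    // then make one last pass so that nothing is left behind.
    public void Stop()
    {
        lock (_monitorObject)
        {
            EmptyQueue();
        }
    }

    // Only ever called while holding _monitorObject.
    private void EmptyQueue()
    {
        int item;
        while (_queue.TryDequeue(out item))
        {
            _delivered.Add(item);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new SingleDrainQueue();
        for (int i = 0; i < 5; i++)
        {
            queue.Enqueue(i);
        }
        Console.WriteLine(queue.TryDrain());      // uncontended, so this caller drains
        queue.Enqueue(5);                         // arrives after the drain
        queue.Stop();                             // final pass picks it up
        Console.WriteLine(queue.Delivered.Count);
    }
}
```

The non-blocking TryDrain is what keeps extra threads from piling up behind a lock they would have no use for; the blocking Stop is the one place where waiting is the right thing to do.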

Bug in StreamInsight 1.2 …

This came up recently on the forums and I’ve been meaning to blog about it and, finally, I’m doing it. Before I get going let me just say the following things:

- There are a couple of ways to work around the bug.
- It is fixed in StreamInsight 2.0.
- It does not apply to StreamInsight 1.1. Only 1.2.
- The scenario that triggers the bug is, I feel, a pretty narrow and uncommon one.

Symptoms

Since this happens in the Simple StreamInsight App that I posted on MSDN Samples, that’s what we’ll use. In certain cases, a StreamInsight query will crash on the initial event. You will be able to stop and restart the (now aborted) query from the Query Debugger or any custom tools that use the StreamInsight API to restart queries and it will run fine. This only happens on the initial events. The query debugger shows that the exception for the aborted query is:

    Microsoft.ComplexEventProcessing.Engine.OperatorExecutionException: An exception happened when operator 'Grouping.1.1.Aggregate.1.1' was processing event, check inner exception for more details. ---> System.ArgumentOutOfRangeException: The added or subtracted value results in an un-representable DateTime.
    Parameter name: value
       at System.DateTime.AddTicks(Int64 value)
       at Microsoft.ComplexEventProcessing.Engine.DateTimeExtensions.AddNoOverflow(DateTime dateTime, TimeSpan timeSpan)
       at Microsoft.ComplexEventProcessing.Engine.DateTimeExtensions.SubtractThrowIfOverflow(DateTime dateTime, TimeSpan timeSpan)
       at Microsoft.ComplexEventProcessing.Engine.SynopsisManager.SingleHoppingWindowManager.PreviousWindowVs(DateTime timestamp)
       at Microsoft.ComplexEventProcessing.Engine.SynopsisManager.SingleHoppingWindowManager.OverrideCtiPushBackward(DateTime ctiTimestamp)
       at Microsoft.ComplexEventProcessing.Engine.ExecutionOperatorWindowBasedOrderPreservingMinus.ProcessSmartCti(EventReference& eventReference, Int64 stimulusTicks)
       at Microsoft.ComplexEventProcessing.Engine.ExecutionOperatorStateful.DoProcessEvent(EventReference& eventReference, Int64 stimulusTicks)
       at Microsoft.ComplexEventProcessing.Engine.QueryExecutionOperator.ProcessEvent(Int32 streamNo, EventReference& eventReference, Int64 stimulusTicks, Int64 enqueueSequenceNumber)
       --- End of inner exception stack trace ---
       at Microsoft.ComplexEventProcessing.Diagnostics.Exceptions.Throw(Exception exception)
       at Microsoft.ComplexEventProcessing.Engine.QueryExecutionOperator.ProcessEvent(Int32 streamNo, EventReference& eventReference, Int64 stimulusTicks, Int64 enqueueSequenceNumber)
       at Microsoft.ComplexEventProcessing.Engine.ExecutionOperatorStateful.ProcessEvent(Int32 streamNo, EventReference& eventReference, Int64 stimulusTicks, Int64 enqueueSequenceNumber)
       at Microsoft.ComplexEventProcessing.Engine.QuerySegmentInputStrategy.DispatchEvents(SchedulingPolicy policy)
       at Microsoft.ComplexEventProcessing.Engine.SchedulingPolicy.DispatchEvents()
       at Microsoft.ComplexEventProcessing.Engine.DataflowTask.OnRun()
       at Microsoft.ComplexEventProcessing.StreamOS.Task.Run()
       at Microsoft.ComplexEventProcessing.StreamOS.Scheduler.Main()

The most interesting (and, for me, perplexing) part is the “The added or subtracted value results in an un-representable DateTime” message. Huh? How is that happening? Since this happens right at start up, we don’t have the time to connect to the query and record the events as it starts so we need to use trace.cmd (see here, about halfway down) to set up the query trace when the application starts. Once we do that and open the trace for the query, we see that the only event in the query is a CTI with a time of negative infinity … or DateTime.MinValue.

Conditions

This will only happen when you are using IDeclareAdvanceTimeProperties in your adapter factory or AdvanceTimeSettings when you create the stream, and specify that CTIs are created by event count, not by timespan. The delay and the AdvanceTimePolicy don’t seem to make any difference. Next, the query needs to have a HoppingWindow and no other query operators that alter the event lifetime or duration. Tumbling and count by start time windows don’t have any problems.

Fixes/Workarounds

There are a few ways that you can work around this. Any one of the following methods will work.

- Specify that CTIs are created by timespan rather than event count.
- Add a temporal operator into the stream before the window. For example, adding AlterEventDuration(e => TimeSpan.FromSeconds(0)) will work.

    var hoppingWindowStream = from s in sourceStream
                                  .AlterEventDuration(e => TimeSpan.FromSeconds(0))
                              group s by s.DeviceId into aggregateGroup
                              from item in aggregateGroup.HoppingWindow(
                                  TimeSpan.FromSeconds(10),
                                  TimeSpan.FromSeconds(2),
                                  HoppingWindowOutputPolicy.ClipToWindowEnd)
                              select new AggregateItem
                              {
                                  DeviceId = aggregateGroup.Key,
                                  Average = item.Avg(e => e.Value),
                                  Count = item.Count(),
                                  Sum = item.Sum(e => e.Value)
                              };

- Enqueue a CTI from your adapter. This can have any date/time for the value, as long as it is greater than DateTimeOffset.MinValue + [WindowHopSize]. You can, for example, use new DateTimeOffset(1776, 7, 4, 0, 0, 0, TimeSpan.FromTicks(0)).
- Upgrade to StreamInsight 2.0 now that it’s released.
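The inner exception from the stack trace can be reproduced with plain .NET, no StreamInsight required. The sketch below assumes that the engine does something along the lines of subtracting the hop size from the CTI timestamp; the SubtractThrowIfOverflow and PreviousWindowVs frames in the trace hint at this, but it is a guess about the internals. CtiOverflowDemo and SubtractionOverflows are illustrative names only.

```csharp
using System;

public static class CtiOverflowDemo
{
    // Returns true if subtracting the hop size from the timestamp would fall
    // outside the range that DateTime can represent.
    public static bool SubtractionOverflows(DateTime timestamp, TimeSpan hopSize)
    {
        try
        {
            timestamp.Subtract(hopSize);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            // Same exception type as the inner exception in the query trace.
            return true;
        }
    }

    public static void Main()
    {
        // Hop size taken from the HoppingWindow example (2 seconds).
        TimeSpan hopSize = TimeSpan.FromSeconds(2);

        // A CTI stamped DateTime.MinValue leaves no room to step back one hop.
        Console.WriteLine(SubtractionOverflows(DateTime.MinValue, hopSize));

        // Any timestamp at least one hop past MinValue is fine.
        var safeCti = new DateTimeOffset(1776, 7, 4, 0, 0, 0, TimeSpan.Zero);
        Console.WriteLine(SubtractionOverflows(safeCti.UtcDateTime, hopSize));
    }
}
```

If that guess is right, it would explain why the enqueued CTI only needs to be later than the minimum value plus the hop size: stepping back one hop then stays inside the representable range.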

Moving from dasBlog to BlogEngine.NET

BlogEngine.NET | Web (and ASP.NET) Stuff
As I mentioned previously, I’ve moved from dasBlog to BlogEngine.NET for this blog. This, of course, involved reformatting and redesigning the look and feel of the site; that’s nothing unique to the migration and I’m not going to go into that at all. What I will do, however, is discuss the process of moving existing content over from dasBlog to BlogEngine, something that isn’t really hard but does have a few gotchas.

Moving the Content

That’s the first thing that needs to be done. In fact, I did this before I even started formatting the new site – I wanted to be sure that the existing content rendered relatively well in the new design. It was not quite as simple as described on Merill’s blog. All of his steps are valid, but there are actually a couple of other things that need to be done. You will definitely want to use the dasBlog to BlogML Converter that Merill posted on MSDN Code Gallery – dasBlog doesn’t do BlogML and, while BlogEngine will import RSS, RSS usually will not get all of your content. BlogML works much better.

There were two things with moving the content … how big a deal they are depends on how picky you are about the move. I was.

First, the timestamp on the entry. dasBlog uses UTC (GMT) to store the time and that’s how it is imported into BlogML. BlogEngine uses the server time. Both have an offset to convert the saved time into blog local time, but dasBlog’s offset is from UTC (using standard time zones) and BlogEngine uses an offset from the server time. My server is on US Eastern Time and my local blog time is US Central Time, which means that, on import, I had to convert the time to US Eastern and then set my offset in BlogEngine to –1 (US Central is 1 hour “behind” US Eastern). To do this, I had to modify the code that imported the blog entries, which can be found at BlogEngine.Web\api\BlogImporter.asmx.
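For reference, a UTC-to-server-time conversion like the one described above can also be done with TimeZoneInfo, which picks the standard or daylight offset for each date rather than applying one fixed number of hours. This is a minimal sketch and not the BlogEngine import code: ImportTimeConverter and its methods are hypothetical names, and the time zone ids are assumptions about what the host has installed ("Eastern Standard Time" on Windows, "America/New_York" elsewhere).

```csharp
using System;

public static class ImportTimeConverter
{
    // Converts a UTC post date to wall-clock time in the given zone, letting
    // TimeZoneInfo apply standard or daylight time as appropriate.
    public static DateTime ToServerTime(DateTime postDateUtc, TimeZoneInfo serverZone)
    {
        // Imported dates may arrive with Kind = Unspecified; mark them as UTC.
        DateTime utc = DateTime.SpecifyKind(postDateUtc, DateTimeKind.Utc);
        return TimeZoneInfo.ConvertTimeFromUtc(utc, serverZone);
    }

    // Looks up US Eastern time under either of its common ids.
    public static TimeZoneInfo FindEasternZone()
    {
        try
        {
            return TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
        }
        catch (TimeZoneNotFoundException)
        {
            return TimeZoneInfo.FindSystemTimeZoneById("America/New_York");
        }
    }

    public static void Main()
    {
        TimeZoneInfo eastern = FindEasternZone();
        // One winter post and one summer post, both stored at noon UTC.
        Console.WriteLine(ToServerTime(new DateTime(2009, 1, 15, 12, 0, 0), eastern));
        Console.WriteLine(ToServerTime(new DateTime(2009, 7, 15, 12, 0, 0), eastern));
    }
}
```

With this helper, a post stored at noon UTC in January lands at 7:00 AM Eastern, while one stored at noon UTC in July lands at 8:00 AM. Whether that extra hour matters depends on how picky you are about the imported timestamps.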
Since the incoming BlogML only had a post date, not a DateCreated and a DateModified (as BlogEngine does), I also set both the create date and the modify date to the same value. Here’s the code snippet from AddPost:

    Post post = new Post();
    post.Title = import.Title;
    post.Author = import.Author;
    post.DateCreated = import.PostDate.AddHours(-5);
    post.DateModified = import.PostDate.AddHours(-5);
    post.Content = import.Content;
    post.Description = import.Description;
    post.IsPublished = import.Publish;

Once I set BlogEngine’s server time offset (in the admin section under “Settings”), all of the times were now correctly displayed as US Central.

The second thing relates to the tags … BE uses tags (and categories) while dasBlog only uses categories. In dasBlog, the “tag cloud” is generated from the categories and BE generates this from the actual post tags. I can’t say which method I like better yet or if I prefer some mish-mash of the two (generate the cloud from tags and categories … that may be an idea) but I did know that I didn’t want to lose my tag cloud. So, on import, I added tags for each post category to the imported post. Again, simple and again, in AddPost:

    if (import.Tags.Count == 0)
    {
        post.Tags.AddRange(import.Categories);
    }
    else
    {
        post.Tags.AddRange(import.Tags);
    }

From a performance standpoint, I couldn’t tell you if AddRange is faster than a loop that adds each value individually (and it really doesn’t matter here), but it is simpler, cleaner and much easier to read … so I tend to prefer AddRange(). With these two “issues” resolved – and they aren’t issues with BE, to be sure, just a difference between the two – I was ready to move on.

Preserving links

Some of my entries have a pretty good page rank on various search engines and there is a non-trivial amount of traffic that is generated from these search engines.
While I can go in and change things like the RSS source for my feed from FeedBurner to make the move transparent, that doesn’t help with the search engines. Therefore, I needed a way to ensure that existing links would continue to work without returning 404’s. Yes, I moved the old domain over to the new domain and added it as a host header on the new site, but that does not help prevent link breakage and BE and dasBlog have different formats for their links. I also did not, at this point in time, want to force a redirect as soon as a new person hit my site from a search engine; it’s just rude (IMHO) and doesn’t create a great user experience. Sure, maybe it wouldn’t be a big deal, but I didn’t like it. And besides, it gave me an excuse to write code. :-) To keep the links intact, I decided that I would leave BlogEngine’s UrlRewriting intact; I didn’t want to make too many changes to the base source code as it would make it harder for me to move between versions/revisions. Rather, I wanted to sit on top of it and make sure that the links worked. So I used ASP.NET Url Routing to intercept the requests and send them to the right place (post.aspx). Before I go into the code, let’s first examine the (default) url structures for individual posts. In dasBlog, the post link is in the format yyyy/mm/dd/{CompressedTitle}. In BlogEngine, this would be post/{CompressedTitle} –or- (the permalink format) post.aspx?id={PostGUID}. While BE can have the date as a part of the post link, it still wouldn’t work; they compress their titles differently and, as mentioned before, dasBlog uses UTC internally and it’s used in the link as well. For the routing, I created the route using "{Y}/{M}/{D}/{Title}" as the route url. From there, I needed to implement GetHttpHandler to do the work. Initially, I did the matching to title using a Linq query and it worked just fine. 
The problem with this is that every title in the posts would need to be converted to the dasBlog format (I copied over the dasBlog CompressTitle method as dasBlogCompressTitle), a process that seemed far from ideal. Once I understood how the dates worked, I was able to do the primary matching on the date and then, if necessary, match the titles for posts on the same date, minimizing the string manipulation that was required. Once I determined what the matching post was, all I needed to do was append the query string “?id={postGuid}” to the URL and then pass back the HttpHandler from post.aspx for the actual processing. If there was no match, then there would be no query string appended and post.aspx would show a 404. The code for this is below:

    public System.Web.IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        //Get the date from the route.
        string dateString = string.Format(@"{0}/{1}/{2}",
            requestContext.RouteData.Values["M"],
            requestContext.RouteData.Values["D"],
            requestContext.RouteData.Values["Y"]);
        string titleString = ((string)requestContext.RouteData.Values["Title"]).Replace(".aspx", "");
        DateTime postDate = DateTime.MaxValue;
        Post selectedPost = null;
        if (DateTime.TryParse(dateString, out postDate))
        {
            //Date is valid at least.
            //Find posts with the same date.
            //Date in URL is in UTC.
            var postsForTitle = from p in Post.Posts
                                where p.DateCreated.ToUniversalTime().Date == postDate
                                select p;
            if (postsForTitle.Count() == 1)
            {
                //There is only one post for the date, so this must be it.
                selectedPost = postsForTitle.First();
            }
            else
            {
                //differentiate on title.
                foreach (var p in postsForTitle)
                {
                    if (dasBlogCompressTitle(p.Title).Equals(titleString, StringComparison.InvariantCultureIgnoreCase))
                    {
                        selectedPost = p;
                        break;
                    }
                }
            }
            if (selectedPost != null)
            {
                //Use UrlRewriting to put the id of the post in the query string.
                requestContext.HttpContext.RewritePath(
                    requestContext.HttpContext.Request.Path + "?id=" + selectedPost.Id.ToString(), false);
            }
        }
        return BuildManager.CreateInstanceFromVirtualPath(
            "~/post.aspx", typeof(System.Web.UI.Page)) as System.Web.IHttpHandler;
    }

Once I set it up in the web.config file and added the routes to the RouteTable, all was good and it worked fine.

Preserving the RSS Feed Url

The final step - and the thing that occurred to me last – was to make sure that the RSS feed url continued to work. While FeedBurner had no problem with changing the RSS url for my blog, there was the possibility (however remote it may have seemed) that someone was using dasBlog’s RSS feed rather than FeedBurner. I’m not sure how remote a possibility this is but I didn’t use FeedBurner in the early days of the blog, so I figured that it might be an issue. And I certainly wouldn’t want to alienate the longest-time subscribers to my feed. This was incredibly simple and didn’t require any code at all, just two lines in the web.config file to have SyndicationService.asmx (dasBlog’s RSS feed) handled by BlogEngine’s RSS feed, which is implemented as an HttpHandler and, by default, at syndication.axd. The first line is for IIS 6.0/IIS 7.0 Classic mode and is under the httpHandlers node of system.web:

    <add verb="*" path="SyndicationService.asmx" type="BlogEngine.Core.Web.HttpHandlers.SyndicationHandler, BlogEngine.Core" validate="false"/>

The second goes in the corresponding location for IIS 7 Integrated Pipeline mode, in the handlers node of system.webServer:

    <add name="dasBlogSyndication" verb="*" path="SyndicationService.asmx" type="BlogEngine.Core.Web.HttpHandlers.SyndicationHandler, BlogEngine.Core" resourceType="Unspecified" requireAccess="Script" preCondition="integratedMode"/>

These were copied from the default nodes used by BlogEngine for its syndication.axd and then the relevant attributes were changed. Simple enough.
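The date-first, title-second matching used for the old dasBlog links can be tried out without ASP.NET at all. The sketch below is a simplified, self-contained take on that idea under a few assumptions: ImportedPost is a hypothetical stand-in for BlogEngine’s Post, CompressTitle is a placeholder (it is not dasBlog’s real title compression), and the date is parsed with an explicit format so that the result doesn’t depend on the server’s culture settings.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical stand-in for BlogEngine's Post; only the members used here.
public class ImportedPost
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreatedUtc { get; set; }
}

public static class LegacyLinkMatcher
{
    // Placeholder for dasBlog's title compression: keeps letters and digits.
    // The real algorithm is not reproduced here.
    public static string CompressTitle(string title)
    {
        var sb = new StringBuilder();
        foreach (char c in title)
        {
            if (char.IsLetterOrDigit(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Returns the post for a yyyy/mm/dd/{title} link, or null if none matches.
    public static ImportedPost Find(IEnumerable<ImportedPost> posts,
        string year, string month, string day, string title)
    {
        DateTime postDate;
        string dateString = year + "/" + month + "/" + day;
        if (!DateTime.TryParseExact(dateString, "yyyy/M/d",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out postDate))
        {
            return null;
        }

        // Cheap comparison first: narrow down to posts created on that UTC date.
        List<ImportedPost> sameDay = posts
            .Where(p => p.DateCreatedUtc.Date == postDate)
            .ToList();
        if (sameDay.Count == 1)
        {
            return sameDay[0];
        }

        // Only compress titles when several posts share the date.
        string wanted = title.Replace(".aspx", "");
        return sameDay.FirstOrDefault(p =>
            CompressTitle(p.Title).Equals(wanted, StringComparison.OrdinalIgnoreCase));
    }
}
```

A call such as LegacyLinkMatcher.Find(posts, "2008", "09", "05", "UrlRoutingInASPNET.aspx") would then return the matching post (or null), and its Id is what ends up in the ?id= query string.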

Url Routing in ASP.NET

Web (and ASP.NET) Stuff
One of the new features in ASP.NET 3.5 SP1 is Url Routing … it’s the magic behind the pretty, clean url’s that you see with Dynamic Data and MVC. If you dig around a bit, almost all (if not all) of the material that’s out there focuses on using Routing with either MVC or Dynamic Data … I’ve found nada that actually talks about how it can be added to an existing ASP.NET WebForms application. In talks of .NET 3.5 SP1, Url Routing is even ignored some of the time … and if it’s not ignored, it’s barely mentioned in passing. And then there’s the documentation which is, IMHO, pretty lame and the “How To” articles on MSDN are only mildly better. In spite of (or maybe because of) that, I found myself intrigued by the whole idea of Url Routing. Yes, I had seen it and tweaked it in MVC and Dynamic Data, but I knew that there had to be additional uses for it. So … I set about to build a demo for Houston Tech Fest that showed how to get started with Url Routing, adding it to an existing website that showed some data from Northwind. It’s not a pretty or even really functional app … that’s not the point … and has Url’s that are both plain vanilla and that require query strings. In addition, there was absolutely, positively no consistency between the pages or the query string parameters. I know that doesn’t happen in the real world! ;-) There was one thing that I did do in the application that I don’t see done frequently; I added a static utility class that built and returned Url’s for the different pages in the app. Again, not something that I typically see, but it is definitely something that can make the maintenance of links and url’s easier and more elegant. Well, maybe not elegant but it sure beats “Find in project”. But then, url’s in the application never change, do they? If you’re interested, you can find the PPT and the demos (as well as something resembling a script) on my SkyDrive. 
It’s the same place as the link that I posted the other day for my Houston TechFest presentations. However, I wanted to spend a little more time and explain what’s going on and how it all works.

StaticRoutingHandler

This is what does the work. All you have to do to create a routing handler is to implement the IRouteHandler interface. It’s a pretty simple interface – there’s only 1 method. This takes a url route and then transfers this to a static .aspx page that is configured for the route (1 route, 1 page). Since I needed to pass query strings to the pages, I also do some url rewriting in here. Technically, this is not necessary to pass information to the target page, but remember – I didn’t want to have to spend a lot of time making changes all over the app and changing the pages to no longer require query strings would be more work and get away from the point of the demo (i.e. making it easy). While you do create an instance of the page with Url Routing, the pages in this demo didn’t have any nice properties that encapsulated the query string parameters. There’s no way to do that without url rewriting when the app is expecting to get query strings. The handler takes the data arguments specified in the route and turns them into query strings, using the name as the query string parameter. Here it is:

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        string finalPath = VirtualPath;
        if (requestContext.RouteData.Values.Count > 0)
        {
            List<string> values = new List<string>();
            //Add these to the virtual path as QS arguments
            foreach (var item in requestContext.RouteData.Values)
            {
                values.Add(item.Key + "=" + item.Value);
            }
            finalPath += "?" + String.Join("&", values.ToArray());
        }
        //Rewrite the path to pass the query string values.
        HttpContext.Current.RewritePath(finalPath);
        var page = BuildManager.CreateInstanceFromVirtualPath(
            VirtualPath, typeof(Page)) as IHttpHandler;
        return page;
    }

Configuration

This is where it could potentially get hairy.
With MVC and Dynamic Data, it’s pretty easy to do it in the global.asax file since their paths and page names follow a clear and simple convention. Not so with the sample app. So each route/page combination needs to be registered separately because the page names have absolutely no consistency, not to mention the query string arguments. Ewwww … that’ll bloat your global.asax real quick. Since I didn’t like how that was going, I decided that I’d make it configuration-driven. This had the added benefit of allowing you to change the routes, arguments, etc. without redeploying code. I wrote a custom configuration section to handle this; this also makes the config read/write with the API which I thought might be a nice-to-have. So, the section looks like the following:

    <StaticRoutes>
      <Routes>
        <add name="AXD" routeUrl="{resource}.axd/{*pathInfo}"></add>
        <add name="CategoryList" routeUrl="CategoryList" virtualPath="~/CategoryList.aspx"/>
        <add name="ProductList" routeUrl="Category/Products/{C}" virtualPath="~/ProductList.aspx">
          <Restrictions>
            <add name="C" value="\d"/>
          </Restrictions>
        </add>
        <add name="ViewProduct" routeUrl="Product/{Id}" virtualPath="~/ViewProduct.aspx">
          <Restrictions>
            <add name="Id" value="\d"></add>
          </Restrictions>
          <Defaults>
            <add name="Id" value="1"/>
          </Defaults>
        </add>
        <add name="CustomerOrders" routeUrl="Customers/Orders/{Cu}" virtualPath="~/ListOrders.aspx">
        </add>
        <add name="CustomerList" routeUrl="Customers" virtualPath="~/Customers.aspx">
        </add>
        <add name="OrderDetails" routeUrl="Customers/Orders/{Id}" virtualPath="~/OrderDetails.aspx"/>
      </Routes>
    </StaticRoutes>

It’s got all of the information that we need to create our routes. Restrictions and Defaults are optional – not every route needs them. You’ll also notice that the “AXD” route doesn’t have any virtual path listed … when there is no virtualPath specified, the StaticRoutingHandler.Configure method will add a StopRoutingHandler rather than the StaticRoutingHandler.
The StopRoutingHandler is the only handler (that I could find) that is in the API itself (MVC and Dynamic Data each have their own RoutingHandlers). It tells Routing to simply ignore the request and send it along its merry way as if there was no routing configured. The order of the routes in the config file does matter, but that has nothing to do with my code; when the ASP.NET routing framework looks for the handler that a particular url matches, it grabs the first that it finds on the list. So … that’s how you prioritize your routes. The query string parameters are surrounded with curly braces … so “{Cu}” in the CustomerOrders route above would get put in as a query string named “Cu” with the value that appears in that place in the url.

With the configuration, RegisterRoutes, rather than being a mess that looks more like my office desk than code, is clean, simple and sweet. We just need to call a static method on the StaticRoutingHandler class to read the configuration and add the routes.

    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.Clear();
        StaticRouting.StaticRoutingHandler.Configure(routes);
    }

The names (which are required) also allow us to build the Url using the Routing API, rather than having Urls hardcoded in the application. It’s pretty simple and straightforward; below is one of the more “complex” examples as it builds the RouteValueDictionary with the value for the query string.

    public static string GetProductsForCategory(int categoryId)
    {
        var values = new RouteValueDictionary();
        values.Add("C", categoryId);
        var path = RouteTable.Routes.GetVirtualPath(
            null, "ProductList", values);
        return path.VirtualPath;
    }

I got a question when I did this presentation about wildcard parameters. I knew about them and how they worked, but somehow didn’t think to play with them in this sample app. First, you can do this (and I mentioned it in the presentation) by adding a wildcard parameter to the route.
In our configuration, it would look like the following:

    <add name="Wildcard" routeUrl="Wildcard/{id}/{*params}" virtualPath="~/Customers.aspx"></add>

It doesn’t have to have a set parameter (id in this case) in the routeUrl; I just put that there as a comparison. Everything else after that goes into a single item called (in this case) “params”. The “id” is exactly as we expect it to be. However, the wildcard doesn’t translate well into query string parameters. Yes, it is named “params”, but the value is everything in the path in the wildcard’s place, including the slashes in the path. So, with the url http://localhost/Wildcard/3/sasdf/asdf/as, the value of “params” is sasdf/asdf/as. Yes, the routing handler will very happily pass this along when it rewrites the url, but it doesn’t really seem to make sense in this context. In this case, I’d say that you put each possible query string/route combination in as a different route to make sure that the page gets the parameters it expects the way that it expects them. I might, some time in the future, put some more thought into this and come up with something else, but for right now, I’m happy punting on it and just adding a route for each combination.
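If you did want to experiment with the wildcard value rather than punting, one possible approach is to split it on the slashes and hand each segment to the page as its own numbered query string parameter. This is only a rough sketch and not part of the demo code: the WildcardQuery class, the Build method and the name0/name1 naming scheme are all made up for illustration, and the target page would have to be written to expect parameters in that shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not in the sample app): turns a catch-all route value
// such as "sasdf/asdf/as" into numbered query string pairs.
public static class WildcardQuery
{
    public static string Build(string name, string catchAllValue)
    {
        if (string.IsNullOrEmpty(catchAllValue))
        {
            return string.Empty;
        }

        // Each path segment becomes its own parameter: name0, name1, ...
        string[] segments = catchAllValue.Split(
            new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);

        List<string> pairs = new List<string>();
        for (int i = 0; i < segments.Length; i++)
        {
            // Escape the value so characters like & or = can't break the query string.
            pairs.Add(name + i + "=" + Uri.EscapeDataString(segments[i]));
        }
        return string.Join("&", pairs.ToArray());
    }
}
```

With the url from above, WildcardQuery.Build("params", "sasdf/asdf/as") gives params0=sasdf&params1=asdf&params2=as. Whether that is any nicer than one route per combination is debatable; it just shows that the slashes don't have to be passed through as-is.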

Linq Performance - Part I

.NET Stuff | Linq | Performance
Well, it’s been a while since I did my initial review of some simple Linq performance tests. Since then, I’ve done a bit more testing of Linq performance and I’d like to share that. The results are enlightening, to say the least. I did this because I’ve gotten a lot of questions regarding the performance of Linq and, in particular, Linq to Sql – something that is common whenever there is a new data-oriented API. Now, let me also say that performance isn’t the only consideration … there are also considerations of ease of use, as well as the overall functionality of the API and its applicability to a wide variety of scenarios.

I used the same methodology that I detailed in this previous post. Now, all of the tests were against the AdventureWorks sample database’s Person.Contact table with some 20,000 rows. Not the largest table in the world, but it’s also a good deal larger than the much-beloved Northwind database. I also decided to re-run all of the tests a second time on my home PC (rather than my laptop) as the client and one of my test servers as the database server. The specs are as follows:

    Client                     DB Server
    AMD Athlon 64 X2 4400+     AMD Athlon 64 X2 4200+
    4 GB RAM                   2 GB RAM
    Vista SP1 x64              Windows Server 2008 Standard x64
    Visual Studio 2008 SP1     Sql Server 2008 x64

So, with that out of the way, let’s discuss the first test.

Simple Query

This is a simple “SELECT * FROM Person.Contact” query … nothing special or funky. From there, as with all of the tests, I loop through the results and assign them to temporary, local variables. An overview of the tests is below:

DataReaderIndex

Uses a data reader and accesses the values using the strongly-typed GetXXX methods (i.e. GetString(int ordinal)). With this test, the ordinal is looked up using GetOrdinal before entering the loop to go over the resultset. This is my preferred method of using a DataReader.
    int firstName = rdr.GetOrdinal("FirstName");
    int lastName = rdr.GetOrdinal("LastName");
    while (rdr.Read())
    {
        string fullName = rdr.GetString(firstName) + rdr.GetString(lastName);
    }
    rdr.Close();

DataReaderHardCodedIndex

This is the same as DataReaderIndex with the exception that the ordinal is not looked up before entering the loop to go over the resultset but is hard-coded into the application.

    while (rdr.Read())
    {
        string fullName = rdr.GetString(0) + rdr.GetString(1);
    }
    rdr.Close();

DataReaderNoIndex

Again, using a reader, but not using the strongly-typed GetXXX methods. Instead, this is using the indexer property, getting the data using the column name as an object. This is how I see a lot of folks using Data Readers.

    while (rdr.Read())
    {
        string fullName = (string)rdr["FirstName"] + (string)rdr["LastName"];
    }
    rdr.Close();

LinqAnonType

Uses Linq with an anonymous type.

    var contactNames = from c in dc.Contacts
                       select new { c.FirstName, c.LastName };
    foreach (var contactName in contactNames)
    {
        string fullName = contactName.FirstName + contactName.LastName;
    }

LinqClass_Field

Again, uses Linq but this time it’s using a custom type. In this class the values are stored in public fields, rather than properties.

    IQueryable<AdvWorksName> contactNames =
        from c in dc.Contacts
        select new AdvWorksName() { FirstName = c.FirstName, LastName = c.LastName };
    foreach (var contactName in contactNames)
    {
        string fullName = contactName.FirstName + contactName.LastName;
    }

DataSet

This final test uses an untyped dataset. We won’t be doing a variation with a strongly-typed dataset for the select because they are significantly slower than untyped datasets. Also, the remoting format for the dataset is set to binary, which will help improve the performance for the dataset, especially as we get more records.
    DataSet ds = new DataSet();
    ds.RemotingFormat = SerializationFormat.Binary;
    SqlDataAdapter adp = new SqlDataAdapter(cmd);
    adp.Fill(ds);
    foreach (DataRow dr in ds.Tables[0].Rows)
    {
        string fullName = dr.Field<String>("FirstName") + dr.Field<String>("LastName");
    }
    cnct.Close();

LinqClass_Prop

This uses a custom Linq class with properties for the values.

    IQueryable<AdvWorksNameProps> contactNames =
        from c in dc.Persons
        select new AdvWorksNameProps() { FirstName = c.FirstName, LastName = c.LastName };
    foreach (var contactName in contactNames)
    {
        string fullName = contactName.FirstName + contactName.LastName;
    }

LinqClass_Ctor

This uses the same Linq class as above but initializes the class by calling the constructor rather than binding to the properties.

    IQueryable<AdvWorksNameProps> contactNames =
        from c in dc.Persons
        select new AdvWorksNameProps(c.FirstName, c.LastName);
    foreach (var contactName in contactNames)
    {
        string fullName = contactName.FirstName + contactName.LastName;
    }

If you are wondering why the different “flavors” of Linq … it’s because, when I first started re-running these tests for the blog, I got some strange differences that I hadn’t seen before between (what is now) LinqAnonType and LinqClass_Field. On examination, I found that these things made a difference and wanted to get a more rounded picture of what we were looking at here … so I added a couple of tests. And the results …

    Test                        Average
    LinqClass_Field             277.61
    DataReaderIndex             283.43
    DataReaderHardCodedIndex    291.17
    LinqClass_Prop              310.76
    DataSet                     323.71
    LinqAnonType                329.26
    LinqClass_Ctor              370.20
    DataReaderNoIndex           401.63

These results are actually quite different from what I saw when I ran the tests on a single machine … which is quite interesting and somewhat surprising to me. Linq still does very well when compared to DataReaders … depending on exactly how you implement the class.
I didn’t expect that the version using the constructor would turn out to be the one that had the worst performance … and I’m not really sure what to make of that. I was surprised to see the DataSet do so well … it didn’t on previous tests, but in those cases, I also didn’t change the remoting format to binary; this does have a huge impact on the load performance, especially as the datasets get larger (XML gets pretty expensive when it starts getting big).

I’ve got more tests, but due to the sheer length of this post, I’m going to post them separately.
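For anyone who wants to run comparisons like these themselves, here is a rough sketch of how a test can be timed and averaged over several runs. To be clear, this is not the harness behind the numbers above (that methodology is in the earlier post); the TimingHarness class, its AverageMilliseconds method and the un-timed warm-up run are assumptions made just for this illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper: runs a test several times and averages the elapsed time.
public static class TimingHarness
{
    public static double AverageMilliseconds(Action test, int runs)
    {
        if (test == null) throw new ArgumentNullException("test");
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");

        // One un-timed warm-up run so JIT compilation and connection setup
        // don't get charged to the first measurement.
        test();

        double totalMilliseconds = 0;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            test();
            watch.Stop();
            totalMilliseconds += watch.Elapsed.TotalMilliseconds;
        }
        return totalMilliseconds / runs;
    }
}
```

Each of the tests above would be wrapped in a method and passed in as the delegate, for example TimingHarness.AverageMilliseconds(RunDataReaderIndex, 10), where RunDataReaderIndex is whatever method holds the loop being measured.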

ASP.NET Async Page Model

.NET Stuff | Performance | Web (and ASP.NET) Stuff
I just did a Code Clinic for the Second Life .NET User’s Group on using the ASP.NET async page model and it occurred to me that it’d be a good idea to do a little blog post about it as well. I’ve noticed that a lot of developers don’t know about this little feature and therefore don’t use it. It doesn’t help that the situations where this technique helps aren’t readily apparent with functional testing on the developer’s workstation or even on a separate test server. It only rears its head if you do load testing … something that few actually do (I won’t go there right now).

So, let me get one thing straight from the get-go here: I’m not going to be talking about ASP.NET AJAX. No way, no how. I’m going to be talking about a technique that was in the original release of ASP.NET 2.0 and, of course, it’s still there. There are some big-time differences between the async model and AJAX. First, the async model has nothing at all to do with improving the client experience (at least not directly, though it will tend to). Second, the async model doesn’t have any client-side goo; it’s all server-side code. And finally, there is no magic control that you just drop on your page to make it work … it’s all code that you write in the code-behind page. I do want to make sure that this is clear ‘cuz these days when folks see “async” in relation to web pages, they automatically think AJAX. AJAX is really a client-side technique, not server side. It does little to nothing to help your server actually scale … it can, in some cases, actually have a negative impact. This would happen when you make additional round trips with AJAX that you might not normally do without AJAX, placing additional load on the server. Now, I’m not saying that you shouldn’t use AJAX … it’s all goodness … but I just want to clarify that this isn’t AJAX. Now, you can potentially use this for AJAX requests that are being processed asynchronously from the client.
Now that we have that out of the way, let me, for a moment, talk about what it is. First, it’s a really excellent way to help your site scale, especially when you have long-running, blocking requests somewhere in the site (and many sites do have at least a couple of these). Pages that take a few seconds or more to load may be good candidates. Processes like making web services calls (for example, to do credit card processing and order placement on an eCommerce site) are excellent candidates as well.

Why is this such goodness? It has to do with the way ASP.NET and IIS do page processing. ASP.NET creates a pool of threads to actually do the processing of the pages and there is a finite number of threads that will be added to the pool. These processing threads are created as they are needed … so creating additional threads will incur some overhead and there is, of course, overhead involved with the threads themselves even after creation. Now, when a page is requested, a thread is assigned to the page from the pool and that thread is then tied to processing that page and that page alone … until the page is done executing. Requests that cannot be serviced at the time of the request are then queued for processing as a thread becomes available. So … it then (logically) follows that pages that take a long time and consume a processing thread for extended periods will affect the scalability of the site. More pages will wind up in the queue and will therefore take longer since they are waiting for a free thread to execute the page. Of course, once the execution starts, it makes no difference to the performance … it’s all in the waiting for a thread to actually process the page. The end result is that you cannot service as many simultaneous requests and users.

The async page model fixes this. What happens is that the long-running task is executed in the background. Once the task is kicked off, the thread processing the page is then free to process additional requests.
This results in a smaller queue and less time that a request waits to be serviced. This means more pages can actually be handled at the same time more efficiently … better scalability. You can see some test results of this on Fritz Onion’s blog. It’s pretty impressive. I’ve not done my own scalability testing on one of my test servers here, but I think, shortly, I will. Once I do, I’ll post the results here.

How do you do this? To get started is actually quite easy, simple in fact. You need to add a page directive to your page. This is required regardless of which method you use (there are two). ASP.NET will then implement IHttpAsyncHandler for you behind the scenes. It looks like this:

    <%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" Async="True" %>

Simple enough, right? Let me just add a couple of things that you need to make sure you have in place. You will need to follow the .NET asynchronous pattern for this to work … a Begin method that returns IAsyncResult and an End method that takes this result. It’s typically easiest to do this with APIs that already have this implemented for you (you just return their IAsyncResult object). There’s a ton of them and they cover most of the situations where this technique helps.

Now, to actually do this. Like I said, there are two different ways to use this. The first is pretty easy to wire up and you can add multiple requests (I misstated this during the Code Clinic), but all of the async requests run one at a time, not in parallel. You simply call Page.AddOnPreRenderCompleteAsync and away you go.
There are two overloads for this method, as follows:

    void AddOnPreRenderCompleteAsync(BeginEventHandler b, EndEventHandler e)
    void AddOnPreRenderCompleteAsync(BeginEventHandler b, EndEventHandler e, object state)

The handlers look like the following:

    IAsyncResult BeginAsyncRequest(object sender, EventArgs e, AsyncCallback cb, object state)
    void EndAsyncRequest(IAsyncResult ar)

The state parameter can be used to pass any additional information/object/etc. that you would like to the begin and the end methods (it’s exposed through the AsyncState member of the IAsyncResult interface), so that can be pretty handy. The code behind for such a page would look like the following:

    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            LoadThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            AddOnPreRenderCompleteAsync(new BeginEventHandler(BeginGetMSDN),
                new EndEventHandler(EndAsyncOperation));
        }

        public IAsyncResult BeginGetMSDN(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            BeginThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            HttpWebRequest _request = (HttpWebRequest)WebRequest.Create(@"");
            //Hand the request along as the state so the end method can get at it.
            return _request.BeginGetResponse(cb, _request);
        }

        void EndAsyncOperation(IAsyncResult ar)
        {
            EndThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            string text;
            HttpWebRequest _request = (HttpWebRequest)ar.AsyncState;
            using (WebResponse response = _request.EndGetResponse(ar))
            {
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    text = reader.ReadToEnd();
                }
            }
            //Pull out every href value and list them on the page.
            Regex regex = new Regex("href\\s*=\\s*\"([^\"]*)\"", RegexOptions.IgnoreCase);
            MatchCollection matches = regex.Matches(text);
            StringBuilder builder = new StringBuilder(1024);
            foreach (Match match in matches)
            {
                builder.Append(match.Groups[1]);
                builder.Append("<br/>");
            }
            Output.Text = builder.ToString();
        }
    }

If you run this (on a page with the proper controls, of course), you will notice that Page_Load and BeginGetMSDN both run on the same thread while
EndAsyncOperation runs on a different thread.

The other method uses a class called PageAsyncTask to register an async task with the page. Now, with this one, you can actually execute multiple tasks in parallel so, in some cases, this may actually improve the performance of an individual page. You have two constructors for this class:

    public PageAsyncTask(
        BeginEventHandler beginHandler,
        EndEventHandler endHandler,
        EndEventHandler timeoutHandler,
        Object state)

and

    public PageAsyncTask(
        BeginEventHandler beginHandler,
        EndEventHandler endHandler,
        EndEventHandler timeoutHandler,
        Object state,
        bool executeInParallel)

The only difference between the two is that one little argument … executeInParallel. The default for this is false, so if you want your tasks to execute in parallel, you need to use the second constructor. The delegates have identical signatures to the delegates for AddOnPreRenderCompleteAsync. The new handler, timeoutHandler, is called when the operation times out and has the same signature as the end handler. So … it’s actually trivial to switch between the two (I did it to the sample listing above in about a minute.) I, personally, like this method better for two reasons. One, the cleaner handling of the timeout. That’s all goodness to me. Second, the option to have them execute in parallel.
The same page as above, now using PageAsyncTask, looks like the following:

    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            LoadThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            //The fourth argument is the state object; there's nothing to pass here.
            PageAsyncTask t = new PageAsyncTask(
                BeginGetMSDN, EndAsyncOperation, AsyncOperationTimeout, null);
            //The task has to be registered with the page or it will never run.
            RegisterAsyncTask(t);
        }

        public IAsyncResult BeginGetMSDN(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            BeginThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            HttpWebRequest _request = (HttpWebRequest)WebRequest.Create(@"");
            return _request.BeginGetResponse(cb, _request);
        }

        void EndAsyncOperation(IAsyncResult ar)
        {
            EndThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            string text;
            HttpWebRequest _request = (HttpWebRequest)ar.AsyncState;
            using (WebResponse response = _request.EndGetResponse(ar))
            {
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    text = reader.ReadToEnd();
                }
            }
            Regex regex = new Regex("href\\s*=\\s*\"([^\"]*)\"", RegexOptions.IgnoreCase);
            MatchCollection matches = regex.Matches(text);
            StringBuilder builder = new StringBuilder(1024);
            foreach (Match match in matches)
            {
                builder.Append(match.Groups[1]);
                builder.Append("<br/>");
            }
            Output.Text = builder.ToString();
        }

        void AsyncOperationTimeout(IAsyncResult ar)
        {
            EndThread.Text = Thread.CurrentThread.ManagedThreadId.ToString();
            Output.Text = "The data is not currently available. Please try again later.";
        }
    }

Not much difference there. We have 1 additional method for the timeout and the registration is a little different. By the way, you can pass null in for the timeout handler if you don’t care about it. I don’t recommend doing that, personally, but that’s up to you.

There you have it … a quick tour through the ASP.NET asynchronous page model.
It’s clean, it’s easy, it’s MUCH better than spinning up your own threads and messing with synchronization primitives (this is mucho-bad-mojo, just say NO) and it’s got some pretty significant benefits for scalability.

With that, I’m outta here. Happy coding!
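One postscript on the Begin/End pattern that all of this relies on. If the thing you need to call doesn’t already give you a Begin method and an End method, you can wrap your own work in that shape. The following is only a sketch and wasn’t part of the Code Clinic: the SlowOperation class and its BeginCompute/EndCompute methods are invented for illustration, and it leans on Task and TaskCompletionSource, which arrived in later versions of the framework than the one this post was written against. It doesn’t touch System.Web at all, so it runs in a plain console app.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example of exposing background work through a Begin/End pair.
public static class SlowOperation
{
    public static IAsyncResult BeginCompute(int input, AsyncCallback cb, object state)
    {
        // The state object ends up in IAsyncResult.AsyncState, just like the
        // HttpWebRequest did in the page samples.
        var completion = new TaskCompletionSource<int>(state);

        Task.Run(() =>
        {
            Thread.Sleep(50);      // stand-in for a slow, blocking call
            return input * 2;
        })
        .ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                completion.SetException(t.Exception.InnerExceptions);
            }
            else
            {
                completion.SetResult(t.Result);
            }
            // Tell the caller the work is done; it then calls EndCompute.
            if (cb != null)
            {
                cb(completion.Task);
            }
        });

        // Task implements IAsyncResult, so it can be handed straight back.
        return completion.Task;
    }

    public static int EndCompute(IAsyncResult ar)
    {
        // Returns the result, or rethrows whatever the background work threw.
        return ((Task<int>)ar).GetAwaiter().GetResult();
    }
}
```

A page’s begin handler would return whatever BeginCompute returns (passing along the callback it was given) and its end handler would call EndCompute, the same way the samples above do with BeginGetResponse and EndGetResponse.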

Cool way to do ASP.NET Caching with Linq

.NET Stuff | Linq | Web (and ASP.NET) Stuff
OK, well, I think it's cool (and since the mind is its own place ...). I've been a big fan of ASP.NET's cache API since I found out about it way back in the 1.0 beta. It certainly solves something that was problematic in ASP "Classic" in a clean, elegant and darn easy to use way. Unfortunately, not a lot of folks seem to know about it. So I'll start with a little overview of caching.

As the name implies, it's a cache that sits server side. All of the relevant, .Net-supplied classes are in the System.Web.Caching namespace and the class representing the cache itself is System.Web.Caching.Cache. You can access it from the current HttpContext (which you'll see). The management of the cache is handled completely by ASP.NET ... you just have to add objects to it and then read from it. When you add to the cache, you can set options like dependencies, expiration, priority and a delegate to call when the item is removed from the cache.

Dependencies are interesting ... they will automatically invalidate (and remove) the cache item based on notification from the dependency. ASP.NET 1.x had only 1 cache dependency class (System.Web.Caching.CacheDependency) that allowed you to have a dependency on a file, another cache item, an array of them or another CacheDependency. Framework 2.0 introduced System.Web.Caching.SqlCacheDependency for database dependencies and System.Web.Caching.AggregateCacheDependency for multiple, related dependencies. With the AggregateCacheDependency, if one of the dependencies changes, the item is invalidated and tossed from the cache. Framework 2.0 also (finally) "unsealed" the CacheDependency class, so you could create your own cache dependencies.

With expiration, you can have an absolute expiration (specific time) or a sliding expiration (TimeSpan after last access). Priority plays into the clean-up algorithm; the Cache will remove items that haven't expired if the cache is taking up too much memory/resources. Items with a lower priority are evicted first.
Do yourself a favor and make sure that you keep your cache items reasonable. Your AppDomain will thank you for it. ASP.NET also provides page and partial-page caching mechanisms. That, however, is out of our scope here. For the adventurous among you that don't know what that is ...

So ... the cache ... mmmmm ... yummy ... gooooood. It's golly-gee-gosh-darn useful for items that you need on the site, but don't change often. Those pesky drop-down lookup lists that come from the database are begging to be cached. It takes a load off the database and is a good way to help scalability - at the cost of server memory, of course. (There ain't no free lunch.) Still, I'm a big fan of appropriate caching.

So ... what's the technique I mentioned that this post is titled after? Well, it's actually quite simple. It allows you to have 1 single common method to add and retrieve items from the cache ... any Linq item, in fact. You don't need to know anything about the cache ... just the type that you want and the DataContext that it comes from. And yes, it's one method to rule them all, using generics (generics are kewl!) and the Black Voodoo Majick goo. From there, you can either call it directly from a page or (my preferred method) write a one-line method that acts as a wrapper. The returned objects are detached from the DataContext before they are handed back (so the DataContext doesn't need to be kept open all the time) and returned as a generic list object. The cache items are keyed by the type name of the DataContext and the object/table so that it's actually possible to have the same LinqToSql object come from two different DataContexts and cache both of them. While you can load up the cache on application start up, I don't like doing that ... it really is a killer for the app start time. I like to lazy load on demand. (And I don't wanna hear any comments about the lazy.)
Here's the C# code:

    /// <summary>
    /// Handles retrieving and populating Linq objects in the ASP.NET cache
    /// </summary>
    /// <typeparam name="LinqContext">The DataContext that the object will be retrieved from.</typeparam>
    /// <typeparam name="LinqObject">The object that will be returned to be cached as a collection.</typeparam>
    /// <returns>Generic list with the objects</returns>
    public static List<LinqObject> GetCacheItem<LinqContext, LinqObject>()
        where LinqObject : class
        where LinqContext : System.Data.Linq.DataContext, new()
    {
        //Build the cache item name. Tied to context and the object.
        string cacheItemName = typeof(LinqObject).ToString() + "_" + typeof(LinqContext).ToString();
        //Check to see if they are in the cache.
        List<LinqObject> cacheItems = HttpContext.Current.Cache[cacheItemName] as List<LinqObject>;
        if (cacheItems == null)
        {
            //It's not in the cache -or- is the wrong type.
            //Create a new list.
            cacheItems = new List<LinqObject>();
            //Create the context in a using{} block to ensure cleanup.
            using (LinqContext dc = new LinqContext())
            {
                try
                {
                    //Get the table with the object from the data context.
                    System.Data.Linq.Table<LinqObject> table = dc.GetTable<LinqObject>();
                    //Add to the generic list. Detaches from the data context.
                    cacheItems.AddRange(table);
                    //Add to the cache. No absolute expiration and a 60 minute sliding expiration
                    HttpContext.Current.Cache.Add(cacheItemName, cacheItems, null,
                        System.Web.Caching.Cache.NoAbsoluteExpiration,
                        TimeSpan.FromMinutes(60),
                        System.Web.Caching.CacheItemPriority.Normal, null);
                }
                catch (Exception ex)
                {
                    //Something bad happened.
                    throw new ApplicationException("Could not retrieve the requested cache object", ex);
                }
            }
        }
        //return ...
        return cacheItems;
    }

And in VB (see, I am multi-lingual!) ...
    ''' <summary>
    ''' Handles retrieving and populating Linq objects in the ASP.NET cache
    ''' </summary>
    ''' <typeparam name="LinqContext">The DataContext that the object will be retrieved from.</typeparam>
    ''' <typeparam name="LinqObject">The object that will be returned to be cached as a collection.</typeparam>
    ''' <returns>Generic list with the objects</returns>
    Public Shared Function GetCacheItem(Of LinqContext As {DataContext, New}, LinqObject As Class)() As List(Of LinqObject)
        'Build the cache item name. Tied to context and the object.
        Dim cacheItemName As String = GetType(LinqObject).ToString() + "_" + GetType(LinqContext).ToString()
        'Check to see if they are in the cache.
        'TryCast gives back Nothing on a cache miss or when the cached object is the
        'wrong type, so there's no call to GetType() on a null reference.
        Dim cacheItems As List(Of LinqObject) = _
            TryCast(HttpContext.Current.Cache(cacheItemName), List(Of LinqObject))
        If cacheItems Is Nothing Then
            'It's not in the cache -or- is the wrong type.
            'Create a new list.
            cacheItems = New List(Of LinqObject)()
            'Create the context in a using block to ensure cleanup.
            Using dc As LinqContext = New LinqContext()
                Try
                    'Get the table with the object from the data context.
                    Dim table As Linq.Table(Of LinqObject) = dc.GetTable(Of LinqObject)()
                    'Add to the generic list. Detaches from the data context.
                    cacheItems.AddRange(table)
                    'Add to the cache. No absolute expiration and a 60 minute sliding expiration
                    HttpContext.Current.Cache.Add(cacheItemName, cacheItems, Nothing, _
                        Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(60), _
                        CacheItemPriority.Normal, Nothing)
                Catch ex As Exception
                    'Something bad happened.
                    Throw New ApplicationException("Could not retrieve the requested cache object", ex)
                End Try
            End Using
        End If
        'return ...
        Return cacheItems
    End Function

The comments, I think, pretty much say it all.
It is a static method (and the class is a static class) because it's not using any private fields (variables). This does help performance a little bit and, really, there is no reason to instantiate a class if it's not using any state. Also, note the generic constraints - these are actually necessary and make sure that we aren't handed something funky that won't work. These constraints are checked and enforced by the compiler.

Using this to retrieve cache items is now quite trivial. The next example shows a wrapper function for an item from the AdventureWorks database. I made it a property but it could just as easily be a method. We won't get into choosing one over the other; that gets religious.

    public static List<StateProvince> StateProvinceList
    {
        get { return GetCacheItem<AdvWorksDataContext, StateProvince>(); }
    }

And VB ...

    Public ReadOnly Property StateProvinceList() As List(Of StateProvince)
        Get
            Return GetCacheItem(Of AdvWorksDataContext, StateProvince)()
        End Get
    End Property

Isn't that simple? Now, if you only have one DataContext type, you can safely code that type into the code instead of taking it as a generic. However, looking at this, you have to admit ... you can use this in any project where you are using Linq to handle the cache. I think it's gonna go into my personal shared library of tricks.

As I think you can tell, I'm feeling a little snarky. It's Friday afternoon so I have an excuse.

BTW ... bonus points to whoever can send me an email naming the lit reference (and finish it!) in this entry. Umm, no it isn't Lord of the Rings.
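If you want to poke at the lazy-load-on-miss idea without standing up a web site and a database, here is a stripped-down sketch of the same pattern. It is not the method from this post: a plain dictionary stands in for the ASP.NET cache (so there is no expiration or eviction), a delegate stands in for the DataContext query, and the TypeKeyedCache and GetOrLoad names are made up for the illustration. The one thing it adds is a lock, on the assumption that several requests could miss the cache at the same moment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the cache method: same key scheme, same lazy load,
// but backed by an in-memory dictionary instead of HttpContext.Current.Cache.
public static class TypeKeyedCache
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<string, object> Store =
        new Dictionary<string, object>();

    public static List<TItem> GetOrLoad<TContext, TItem>(Func<List<TItem>> loader)
        where TItem : class
    {
        // Key is tied to both the item type and the context type.
        string key = typeof(TItem) + "_" + typeof(TContext);

        lock (Sync)
        {
            object cached;
            if (Store.TryGetValue(key, out cached))
            {
                List<TItem> hit = cached as List<TItem>;
                if (hit != null)
                {
                    return hit;     // already loaded, no trip to the data source
                }
            }

            // Miss (or wrong type): load once and remember the list.
            List<TItem> loaded = loader();
            Store[key] = loaded;
            return loaded;
        }
    }
}
```

Calling GetOrLoad twice with the same pair of type arguments runs the loader only the first time; changing either type argument produces a different key and therefore a separate cached list, which is the same behavior the key scheme gives the real method.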