The Event Library

2009 February 26
by architect

One of the most important documents relating to any Event Driven Architecture is the Event Library. The Event Library holds a clear definition and categorization of all events flowing within the architecture, and must be maintained as the system grows. This library is normally initially created by the output of a detailed (and in my experience, lengthy) analysis of all the business events handled by the architecture.

Typical fields held within the library may include:

  • Event Name
  • Event Category
  • Event Schema and version compatibility information
  • Event Endpoint (i.e. how can I allow a new piece of software to access the event)
  • Event SLA details – such as what failover abilities are used on the event endpoint, what time-to-live is value is set, etc
  • Audit requirements
  • Security  & Access requirements
  • Which part of the business actually owns a particular event

Sometimes, additional more detailed definitions of some events is required – for example, the occurrence of an event may mean that a user from a predefined user group has to perform a workflow of activities to process the event further (for example, a New Order Received event may arrive at an online seller, and as a result a packer has to go to the warehouse, select products, pack them, label the box, and take it to the shipping department).

Without a clearly defined event language, developers of an Event Driven Architecture are likely to create duplicate, ambiguous, or simply invalid events within the architecture. Poisoning an Event Driven Architecture with duplicate, false or irrelevant events can be highly detrimental, as Event Driven Architectures tend to be very heavy users of network resources.

What does an Event actually look like?

2009 February 26
by architect

Events messages are ultimately machine friendly descriptions of an event transmitted over an event bus. They will always contain the actual event details, and should always (within reason) contain the all the data required to maintain the state of the event. This is because Event Driven Architectures do not have a centralized workflow server that maintains state and is able to rehydrate events from some persistent store. Working with experienced Data Architects, it is my experience that this mind shift sometimes takes a little while to kick in!

In most event schemas, there is a common event header filled with the event metadata, and at least one payload which provides the actual event type specific data.

A diagram showing a typical event schema structure:

PostSpecific-EventSchema1As an be seen from a typical event structure, there are a standard set of details normally transmitted with an event. This includes all the contextual information needed for any processing system to understand the state of the event, and the data held within the event.

The separation of the event metadata from the event payload allows any event to be routed (at least at a high level) and understood by the event sinks in the architecture without necessarily needing to parse or understand the contents of the payload.

Event Driven Architecture and Cloud Computing

2009 February 23
by architect

Although the primary benefits of cloud computing at this stage appear to be largely accounting related (i.e. moving capital expenses to operational expenses), there may well be some additional benefits coming down the line as vendors and IT teams get a handle on exactly what it is and how best to leverage it. EDAs which make use of fully self contained messages, and appropriate SOA standards, are well placed to make the move from data center to cloud – if the kind of processing you’re looking like moving to the cloud is inline with what clouds are best suitable for today. 

The complexity with mixing EDA and Cloud Computing today is that the kind of algorithms and processing styles best suited to cloud computing are in effect, largely the opposite of the problem EDA tries to solve. EDA is all about near instant access to business events, where ever its needed and by everyone who needs it. Cloud computing appears more suitable for large scale batch oriented processing, using techniques such as MapReduce to manage the processing of huge data sets in parallel. For example, it is highly unlikely a large scale realtime P&L operation for a trading firm could be run on a cloud today.

This doesn’t mean EDA has no possible use near a cloud though: batch CEP type processing can be run in these environments – for example, the Apache Mahout  project aims to implement a good baseline of machine learning algorithms on top of a MapReduce platform, and with a large number of collected events, this may prove useful. 

Some additional things to consider about cloud computing that not all vendors like to be clear about: 

  • How do you manage the data geographically (for example, the data protection laws in Europe make things complicated)?
  • How do you move the data around externally to the firm with sufficient value to be properly processed, yet still secure enough to appease the auditors? You need to weigh up the rewards & risks of using a cheaper large scale multi-tenant operation such as Amazon over running a private cloud using suppliers such as IBM.
  • How do you deal with failure? For example, a remote cloud center may be taken out from the corporate point of view if some Trawler broke an undersea cable. What happens if your large batch operations are unable to be executed for a few days?
  • How do you deal with a cloud operator disappearing or closing down? There are little to no standards shared between service providers.
  • The complexity of network infrastructure management and software monitoring jumps dramatically – how will this be dealt with?

Incidentally, I have bumped into a few people using cloud computing in a financial services environment (who also have various forms of EDA), and they tend to be only within two contexts: backup and compliance.

A look into Content Based Routing and Filtering Techniques

2009 February 22
by architect

At the heart of an Event Driven Architecture is the core design of the event dissemination mechanisms. How do we pass events around? How can an Event Sink listen to a large amount of events and extract a subset which is of interest?

A non exhaustive list of techniques for filtered event dissemination over a message oriented middleware:

Channel

A channel is a point to point interface which often involves Event Emitters and Event Sinks which are explicitly aware of each other. This use of this mechanism should always be strongly challenged as it significantly reduces the flexibility of the architecture with the Event Emitter making decisions on what events to send down a particular channel.

Subject Based Filtering

This is a filtering mechanism often found with TIBCO implementations. It typically involves creating a tree like hierarchy of subjects, for example: /Exchange/NASDAQ/MSFT or /Exchange/NASDAQ/IBM. This then allows some flexibility and decoupling, in that if an Event Sink has interest in Exchange data, it is free to listen at an appropriate hierarchy level. If for example, a particular sink is only interested in data for the MSFT ticker, it listens to /Exchange/NASDAQ/MSFT; another may be interested in all NASDAQ data – it can then listen to /Exchange/NASDAQ/*.

There are several simplistic mechanisms available (all MOM vendor dependent) to perform some more advanced filtering, for example, if ticker ABC is available on /Exchange/NASDAQ/ABC and /Exchange/LSE/ABC, then a Event Sink may listen on /Exchange/*/ABC, allowing it to still limit the data received to the ABC ticker, but allowing it to listen to all available exchanges.

Typically however, subject based filtering normally ends one way: subject explosion, in which a huge number of subjects are created, and data is emitted multiple times (this can be MOM vendor dependent), in order to provide a high level of flexibility on the conditions of event consumption. At some stage a condition typically arises where the available subject tree simply cannot support the particular filtering condition.

In some situations where the subject tree is very static and the filtering conditions are likely to remain simple, this can be an effective solution. The moment stream based processing or complex event processing is considered, this often becomes too restrictive quickly, and a content based filtering mechanism is added on top. 

Content Based Filtering

Content Based Filtering typically involves a high level simple subject tree, and the deep inspection of events as they are received by an Event Sink. Using a selected algorithm, the Event Sink evaluates attributes of an event and either ignores it or accepts it for further processing. There are a number of algorithms used here, some of which are explained below. First, I will define a sample filter which will then be worked through in all cases:

Lets imagine that we need to be able to route events to a risk management related Event Sink that has an interest of only listening to high volume or high value trades made on the New York Stock Exchange, for the tickers ABC or MSFT. This can be displayed using a mathematical notation as:

PostSpecific-RuleSampleMath

There are also two sample events used to illustrate the algorithms:

 

  1. Event A: TICKER=”XYZ”, TRADEVOLUME=200000, TRADEVALUE=10000
  2. Event B: TICKER=”MSFT’, TRADEVOLUME=30000, TRADEVALUE=1200000

 

Brute force evaluation

This is a simplest mechanism to evaluate message content based on filter conditions. To solve the example above, brute force evaluation needs to break the “OR” rules into “AND” permutations and evaluate all of them sequentially, for example:

  1. TICKER=”MSFT” AND TRADEVALUE > 1000000 
  2. TICKER=”MSFT” AND TRADEVOLUME > 100000
  3. TICKER=”ABC” AND TRADEVALUE > 1000000 
  4. TICKER=”ABC” AND TRADEVOLUME > 100000

This is clearly inefficient and when there are a high number of complex evaluations to be made, is unsuitable.

Counting Algorithm

At the simplest level, the counting algorithm takes the brute force approach of dividing complex filters into permutations and improves it by then breaking apart the filters into unique individual filter attributes (e.g. TICKER=”MSFT”). The algorithm normally involves two counting “tables” – first, a table with all unique filter attributes and a count – with 0 meaning no match, and 1 meaning match, and then a second table which simply associates the unique filter attributes and their counts into a filter condition. As soon as the total count for the unique filter attributes that make up a rule matches the total number of filter attributes, the filter is passed. If not, it is ignored. Each time an event is evaluated, the counters are reinitialized.

A worked example, using events A and B defined above:

PostSpecific-CountingAlg3

The counting algorithm can be improved in many ways – commonly, a index structure is added whereby all 3 core elements of a filter (the attribute, condition and value) are divided into a tree structure with lookup tables for each level of the tree. The further advantage here is that all matching filters can be found without testing all the filters. Example (it’s a hard one to visualize with a simple filter and limited samples. At the final level of the tree, a dashed line means “no match”, while a solid line means “match”):

 

PostSpecific-CountingAlgIndex2

The multileveled indexed counting algorithm is a good option, but ultimately suffers from the need to break complex “OR” filters into “AND” permutations. 

Binary Decision Diagrams and related

Unlike the earlier algorithms, Binary Decision Diagrams are able to handle any complex, multi level, boolean condition as a single filter. Therefor, it allows processing of our original filter defined earlier exactly as is. Additionally, a variant of the Binary Decision Diagram known as the Ordered Binary Decision Diagram (OBDD), is highly efficient at evaluating a large number of filter conditions. A further alternate to OBDD, Reduce Ordered Binary Decision Diagrams (ROBDD) is able to automatically reduce redundant nodes and any isomorphic subgraphs. OBDD and ROBDD are not discussed further in this post as their complexity distracts from the overview of techniques.

Diagrammatically, the evaluation of a Binary Decision Diagram for our earlier condition may be as follows (the right side of any box in a solid line is the ‘true’ condition, the dashed left side is the ‘false’ condition):

PostSpecific-BDD1Note the inherent ability to optimize the evaluation of the filter – the algorithm is able to stop processing as soon as it has confirmed that the TICKER is not MSFT or ABC. This optimization ability added to the OBDD and ROBDD algorithms provide a very powerful suite of filtering algorithms. I have had my development teams implement BDD algorithms where ever we have needed to control the filtering technique directly, and have had tremendous success with them.

If you have interest in understanding this algorithm further, I’d suggest looking at a paper by Campailla et al, published by the IEEE Computer Society in the Proceedings of the 19th Conference on Software Engineering titled Efficient filtering in publish-subscribe systems using binary decision diagrams.

XML Matching

The previous algorithms all implied that there was an easily accessible interface to the event – but what if the events are in fact complex XML structures such as FpML?

Two alternate options are available:

  1. Transport events as XML, but deal with them using structured object interfaces. This immediately causes schema version handling to be a potential point of pain – ideally, the filter conditions must be unaware of the event schemas.
  2. Filter on the XML directly – reduces the complexity of dealing with structured object interfaces, but does not necessarily insulate the filters from version of the event schemas.

So what XML tools are available? There are two key options: XFilter and YFilter, both of which are finite state machines running on top of XPath. YFilter is the newer of the two, and has limited implementations available, but does appear to be the better option. XFilter suffers from problems seen in earlier algorithms in that commonalities between filter conditions are not leveraged, while YFilter allows common sections of filter conditions to be shared.

My Recommendations

As with anything in architecture, the answer is “it depends”. For high performance, general use scenarios, I have found a great deal of success with BDD running atop simplified object interfaces that extract the filtered attributes of the underlying XML structures using XPath. This gives the performance of BDD, while allowing some level of independence from underlying XML Schema changes.

Event Driven Architecture Patterns: Event Normalizer

2009 February 22
by architect

As a (hopefully) regular series on this blog, I will seek to provide a pattern library of patterns found and used extensively within an Event Driven Architecture. I’d love to get feedback from other practitioners on these patterns.

The first pattern I will document is the Event Normalizer.

Title

Event Normalizer

Description

Events external to the event driven architecture – be they sourced from external Message Emitters (such as a trading exchange or market data feed), or enterprise internal Message Emitters (such as a legacy system) – need to be normalized before they can be consumed within the architecture.

This normalization process may make structural changes to the event’s data format (e.g. format conversion) and add particular enrichments to a event (e.g. add geographic attributes). Where additional attributes are added, they would often be sourced from a master data engine.

Diagram

Patterns-EventNormalizer-12

Benefits

  • Reduces coupling between Event Sinks or Event Processors and the Event Emitter. This has a whole host of benefits, particularly allowing an implicit styled architecture (in other words, emitters have no explicit knowledge of sinks or processors) which fosters dynamic determinism and extreme loose coupling
  • Disparate event attributes become mastered into the organizational golden definition. For example, using the diagram elements above, lets imagine that within Format A, a counterparty – Goldman Sachs – is referred to as “GS”, whilst in Format C, it is referred to as “231238″. Internal Format B may already make use of the internal master definition of the counterparty. The Event Normalizer then passes the original event source’s counterparty information on to a Master Data engine, and the mastered counterparty returned is the correct internal representation. Any additional processing taking place down stream just needs to be able to interpret the internal representation – it has no need to understand the original representation.

Limitations

  • The Event Normalizer is a potential performance bottleneck
  • The Normalized Event needs typing and a defined machine definition (such as a well defined XML Schema)
  • The Event Normalizer is also a bottleneck to the dynamic addition of new event sources (although this depends on the design of the master/reference data engine)
  • The organization should typically have sophisticated processes already in place for managing master/reference data

Recommended Usage

The Event Normalizer is suitable for use in the following scenarios:

  • The variety of event classifications (e.g. a trade event, a market data event) is less than the the number of Event Emitter formats. For example, a trading organization that has interest in processing trade events from a large number of different exchanges and brokers
  • A high speed, well maintained, master/reference data engine is available within the organization

Related Patterns

  • The Normalizer pattern within the Enterprise Application Integration library, although this is more focused on format detection and standardization

Pattern Documentation History

  • Version 1.0 – Created
  • Version 1.1 – Clarifications around the difference between the EAI Normalizer and this pattern. Also, an improved diagram.

Event Driven Architecture and Complex Event Processing

2009 February 22
by architect

There are several historical posts from other blogs around discussing the difference between CEP and EDA. They are largely correct in that they can optionally be tightly related, but they don’t completely clarify the typical processing styles found within an EDA when relating to events.

Before talking about the processing styles, and to bring further clarity, a few of the definitions as to the actors found in a typical EDA (if you have alternates or improvements, I’d be interested to hear about them):

  • Event Emitter/Producer – a piece of software that fires events into the event stream.
  • Event Sink – a piece of software that consumes events without injecting further messages into the event stream
  • Event Processor – a piece of software that acts as a combined sink/producer.
  • Event Reactor – a synchronous interface to the EDA, often a web service

Now that the definitions are clear, a look into event processing styles:

Simple Event Processing

This refers to processing of events by a simple event sink that does not take special interest in the content or context of a message.

For a sample, think of a log file event sink – it may accept all events and blindly store them within a database or some other store where another process (potentially using a mechanism such as MapReduce) may perform additional data processing.

Stream Event Processing

A stream event processor takes a look at a large volume of messages and makes a decision to process or ignore the event on each independent event. Normally it does this by listening to a general, or high level, event stream and then uses one of the content filtering techniques to decide on which events to process.

A classic example of this is a trading system that must decide whether or not a trade event requires further middle office action on it (such as invoking a risk management process), or whether it can be STP’d through the system.

Complex Event Processing

A complex event processor acts as a context aware stream processor. As events flow through the sink, decisions are made both on the content and the context of the event. The context may involve one to many attributes – the geographic location of the event, the running total exposure on a trading book, the last 20 minutes of related events.

A common example found for this is credit card fraud management – e.g. phone the credit card owner if two or more transactions appear within a set time window that are geographically very distant.

New Blog!

2009 February 14
by architect

Time for me to get blogging. I am a enterprise architect working on a large scale event driven service oriented architecture, and thought it could be of assistance to others.

Topics I would like to discuss will likely include:

  • Architectural Features & Lessons
  • Event Processing styles (stream processing, complex event processing, etc)
  • Dealing with numerous event producers and sinks
  • Role of non relational databases
  • much more…

I’d love to also hear the experiences of others in the same architectural environment.