Monday, January 21, 2008

Enterprise Information Integration - A First Look

A trend I have noticed among IT professionals these days is that any technology whose name starts with "E" will be in demand, and that is not entirely wrong: over the last decade, "E" has run the enterprise. We have live examples in EAI (Enterprise Application Integration), EDI (Electronic Data Interchange), ETL (Extract, Transform, Load), ERP (Enterprise Resource Planning) and ESB (Enterprise Service Bus), and I would bet that EII (Enterprise Information Integration) will be no exception among this group of "E" technologies. In this competitive age, every enterprise wants the best data infrastructure at an efficient cost, and my bet is that innovation will never end for this requirement.

With globalization, most companies operate from different countries with many cultures, people and languages, and all of these bring different applications, vendors, standards, platforms, technologies and system cultures. These things do not make life easy for CIOs, and they make our lives tough [:-)], like the universal food chain (a big fish eats a small one).

Generally, to handle such scenarios we implement EAI solutions for low-latency/live data integration and ETL solutions for historical data integration. One scenario still goes largely unnoticed and needs dedicated effort from the technology community: information integration. We mostly fulfil the purpose of information integration through data integration and EAI, but such implementations are not cost competitive compared with a dedicated EII implementation, because EII enables data integration without the need to physically instantiate the integrated data. In other words, EII adds integrated reporting capabilities while minimizing the impact on existing systems. Before deciding on the integration pattern, however, we must have a clear idea of what we should integrate with EII patterns and what we should not.

Let's look at some situations where EII can be an ideal and cost/effort-effective solution for an enterprise.

  1. Connecting structured data with unstructured data. This can take advantage of EII's capability of leaving data in place, so that overall storage requirements do not increase dramatically. (You can also count this as a disadvantage of EAI/ETL solutions in such information integration scenarios.)
  2. When an immediate data change is required in response to the data view. (This requirement is neither easy nor cost competitive for EAI/ETL.)
  3. Some operational and regulatory reporting where the data needed is not completely integrated in one place.
  4. When data transformation is relatively light or nonexistent and just getting the data together for an integrated query is the biggest challenge.

The bottom line: EII suits scenarios such as integrated queries, connecting data without increasing storage requirements, and low-latency responses.


So, what is the definition of EII? I have referred to a definition constructed by the Integration Consortium, which I would like to share with you all: EII is the integration of data from multiple systems into a unified, consistent and accurate representation geared toward the viewing and manipulation of the data. Data is aggregated, restructured and relabeled (if necessary) and presented to the user.
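To make the definition a little more concrete, here is a minimal sketch of the idea in Python. Everything in it is hypothetical (the two source lists, the field names and the `customer_view` function); it only illustrates data being aggregated, restructured and relabeled at query time while the sources stay where they are, and it is not how any particular EII product works.

```python
from typing import Dict, Iterator, List

# Two hypothetical source systems. The data is left in place:
# nothing is copied into a separate integrated store.
CRM_ROWS: List[Dict[str, str]] = [
    {"cust_id": "C1", "cust_nm": "Acme"},
    {"cust_id": "C2", "cust_nm": "Globex"},
]
BILLING_ROWS: List[Dict[str, str]] = [
    {"customer": "C1", "open_balance": "120.50"},
    {"customer": "C2", "open_balance": "0.00"},
]


def customer_view() -> Iterator[Dict[str, object]]:
    """Build a unified customer record from both sources on demand."""
    # Aggregate: index the billing rows by customer key.
    balances = {row["customer"]: float(row["open_balance"]) for row in BILLING_ROWS}
    for row in CRM_ROWS:
        # Restructure and relabel: one consistent set of field names.
        yield {
            "customer_id": row["cust_id"],
            "customer_name": row["cust_nm"],
            "open_balance": balances.get(row["cust_id"], 0.0),
        }


if __name__ == "__main__":
    for record in customer_view():
        print(record)
```

The view is computed each time it is queried, so a change in either source shows up in the next read without any reload step.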


Integration Type          Data        Purpose          Audience
Data Integration (ETL)    Historical  Trend Analysis   Decision-Makers
Application Integration   Live Data   Synchronization  IT Organization
Information Integration   Live        Productivity     End Users


Now, after understanding EII, the obvious question that comes to mind is about implementation and the instrument (tool). Many technology vendors have already introduced tools focused on EII, or added functionality to existing tools to support such implementations. Instruments like SOA, Web Services and EAI/ETL tools could be the best picks for implementing such requirements. I believe the following are the keys to any EII implementation.

  • Service Oriented Architecture
  • Metadata Management
  • Semantic Information Model
  • Dynamic Aggregation

I believe the key topics above will be known to any integration architect, so any program or tool that can handle or facilitate them in an aggregated manner could be used to implement EII solutions.


For BizTalk Server, I see very bright scope for this kind of implementation when coupled with other Microsoft technologies such as Windows Communication Foundation, Windows Workflow Foundation, MS SSO, Host Integration Server, SQL Server Analysis Services and SQL Server Integration Services. Additionally, innovative steps like the ESB work from Microsoft Patterns and Practices give a very high level of confidence in such implementations on the Microsoft platform.


As for my own experience with EII systems: yes, by the grace of God I got the precious opportunity to put the first EII implementation into GO-LIVE state at my organization.


I will try to come up with more articles on emerging Enterprise Information Integration, including BizTalk and Microsoft technology specific articles. Believe me, there are lots of things to share with you about my exciting experience with Enterprise Information Integration.

Thanks for bearing with me during the article. Your valuable feedback and suggestions are welcome at nilayparikh@gmail.com

Wednesday, January 16, 2008

BizTalk 2006 / R2 Publishing Throttling State #2 (imbalanced message publishing rate, input rate exceeds output rate)

Our next destination in the BizTalk throttling exploration drive is the inbound throttling states, starting with publishing throttling state #2: throttling due to an imbalanced message publishing rate (input rate exceeds output rate).

Throttling state #2, due to an imbalanced message publishing rate (input rate exceeds output rate):

This is the simplest scenario to understand. Here, an imbalanced message publishing rate refers to the ratio of the incoming message publishing rate to the outgoing message publishing rate.

The following is the condition for this throttling state:
- Message publishing incoming rate > Message publishing outgoing rate * (Rate overdrive factor (percent) / 100)

This condition belongs to rate-based throttling. For inbound (published) messages, BizTalk Server throttles the publishing of messages if the Message publishing incoming rate for the host instance exceeds the Message publishing outgoing rate * the specified Rate overdrive factor (percent) value. The Rate overdrive factor (percent) parameter is configurable on the Message Publishing Throttling Settings dialog box. Rate-based throttling for inbound messages is accomplished primarily by inducing a delay before publishing the batch of messages into the MessageBox database. No other action is taken to accomplish rate-based throttling for inbound messages.
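The condition described above can be sketched in a few lines of Python. This is only an illustration of the comparison as I understand it from the description; the function name is my own and the real evaluation happens inside BizTalk Server.

```python
def is_publishing_rate_imbalanced(
    incoming_rate: float,
    outgoing_rate: float,
    rate_overdrive_percent: float = 125.0,  # default value of the setting
) -> bool:
    """Return True when the incoming publishing rate exceeds the
    outgoing publishing rate scaled by the overdrive factor."""
    allowed_incoming_rate = outgoing_rate * rate_overdrive_percent / 100.0
    return incoming_rate > allowed_incoming_rate


if __name__ == "__main__":
    # 260 msg/s requested against 200 msg/s completed, factor 125%.
    print(is_publishing_rate_imbalanced(260, 200))
    # 240 msg/s requested stays inside the 250 msg/s allowance.
    print(is_publishing_rate_imbalanced(240, 200))
```

Both rates are in messages per second, as reported by the Message publishing incoming rate and Message publishing outgoing rate counters.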

In my experience, this throttling state can occur in an environment in the following general scenarios:

- Very high demand to process messages and few resources available to process them.
- Heavy use of correlated messages and self-correlated ports, and multiple subscribers for a single received message.
- Slow outbound adapters.
- Imbalance between the host configured for the inbound adapter and the host configuration for XLANG or for the outbound adapter.
- High processing complexity.



Understanding the publishing throttling threshold parameters:

Minimum number of samples: Minimum number of messages BizTalk Server will sample for the Sampling window duration before considering rate-based throttling. If the actual number of samples in a sampling window falls below this value then the samples are discarded and throttling is not applied. This value should be consistent with a rate at which messages can be published under a medium load. For example, if your system is expected to handle 1,000 documents per second under a medium load, then this parameter should be set to 1,000 * Sampling window duration in seconds (or more precisely, 1 * Sampling window duration (milliseconds)). If the value is set too low, then the system may experience a throttling condition under low load. If the value is set too high, then there may not be enough samples for this technique to be effective.
Enter a value of zero to disable rate based inbound throttling.
The default value is 100.
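
The sizing rule in the example above is simple arithmetic, and a small Python helper makes it easy to try with your own numbers. The function name is hypothetical; the formula is just the one stated in the text (expected medium-load rate multiplied by the window length).

```python
def minimum_samples_for(medium_load_docs_per_second: int, sampling_window_ms: int) -> int:
    """Documents expected in one sampling window under a medium load."""
    # docs/second * window in seconds, kept in integer arithmetic.
    return medium_load_docs_per_second * sampling_window_ms // 1000


if __name__ == "__main__":
    # 1,000 documents per second with the default 15,000 ms window.
    print(minimum_samples_for(1000, 15000))
```

With 1,000 documents per second and the default 15,000 ms window this gives 15,000, which matches "1 * Sampling window duration (milliseconds)".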

Sampling window duration (milliseconds): The time-window measured in milliseconds, which is used to calculate the publishing rate based on the samples collected. The duration should be increased if the latency required for publishing a single message is high.
Enter a value of zero to disable rate based inbound throttling.
The default value is 15,000.

Rate overdrive factor (percent): This controls how much higher you allow the request rate to be than the completion rate before a throttling condition occurs. For example, if messages are being published at a rate of 200 per second and this parameter is set to 125, then the system will allow the publication of up to 250 messages per second (125% * 200 = 250) before applying throttling. Specifying too small a value for this parameter will cause the system to throttle more aggressively and could lead to over-throttling. Specifying too large a value for this parameter will cause under throttling and prevent the throttling mechanism from recognizing a legitimate throttling condition.
The default value is 125.

Maximum throttling delay (milliseconds): This is the maximum delay BizTalk Server will impose on a message instance due to throttling. The actual delay depends on the severity of the throttling condition.
Enter a value of zero to disable inbound throttling.
The default value is 300,000.

Triggering mechanism of state #2 publishing throttling: Within the sampling window duration, if the ratio of the incoming rate to the outgoing rate (expressed as a percentage) exceeds the configured "Rate overdrive factor (percent)" value, BizTalk enters the throttling state.
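
Putting the parameters together, the trigger can be sketched roughly as below. This is my own simplified model in Python, not BizTalk's actual algorithm: the class and function names are hypothetical, and comparing the number of requested messages against "Minimum number of samples" is an assumption about which count is used.

```python
from dataclasses import dataclass


@dataclass
class PublishingThrottleSettings:
    # Defaults taken from the parameter descriptions above.
    minimum_samples: int = 100
    sampling_window_ms: int = 15_000
    rate_overdrive_percent: int = 125


def should_throttle(requested: int, completed: int,
                    settings: PublishingThrottleSettings) -> bool:
    """Evaluate one sampling window.

    requested: messages sent for publishing during the window
    completed: messages actually published during the window
    """
    # A value of zero disables rate-based inbound throttling.
    if settings.minimum_samples == 0 or settings.sampling_window_ms == 0:
        return False
    # Too few samples: discard the window, no throttling (assumed count).
    if requested < settings.minimum_samples:
        return False
    window_seconds = settings.sampling_window_ms / 1000.0
    incoming_rate = requested / window_seconds
    outgoing_rate = completed / window_seconds
    return incoming_rate > outgoing_rate * settings.rate_overdrive_percent / 100.0


if __name__ == "__main__":
    defaults = PublishingThrottleSettings()
    print(should_throttle(4000, 3000, defaults))  # about 267/s against 200/s
    print(should_throttle(50, 10, defaults))      # below minimum samples
```

The sketch only decides whether the state would be entered; how long the publishing thread is then delayed is computed dynamically by BizTalk and is not modelled here.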

Throttling action: Block the publishing thread for a dynamically computed time period until the Message publishing incoming rate is on par with the Message publishing outgoing rate * the specified Rate overdrive factor (percent) value. By blocking the publishing thread, BizTalk reduces the number of queued-up messages. I have noticed many times that queued-up inbound messages have a very bad impact on memory as well as on the processing of BizTalk artifacts. Memory and resource utilization grow steeply with the number of queued-up messages, which can indirectly affect all processing in BizTalk. So, by blocking the publishing thread, BizTalk reduces the incoming message flow and keeps the flow healthy.

Sample collection of BizTalk perfmon data and graph: In the graph below you can see that the system enters throttling state #2 when the publishing rate throttling condition is satisfied; you can also analyze it against the sampling window.



If you would like the perfmon log data, the BizTalk throttling configuration snapshot and the graph from my simulation in a compressed file for further study or analysis, please write to me at nilayparikh@gmail.com

I hope this article helped you understand throttling state #2.

Related articles posted by me on the same blog:
BizTalk 2006 R2 - Throttling - Perfmon Parameters - My Experience (Tuesday, January 15, 2008)
BizTalk Server 2006 / R2 Throttling Mechanisms (Friday, January 4, 2008)

MSDN resource links:
http://msdn2.microsoft.com/en-us/library/aa559591.aspx
http://msdn2.microsoft.com/en-us/library/aa559893.aspx
http://msdn2.microsoft.com/en-us/library/aa559628.aspx
http://msdn2.microsoft.com/en-us/library/aa578302.aspx
http://msdn2.microsoft.com/en-us/library/aa547859.aspx


Thanks for bearing with me during the article. Your valuable feedback and suggestions are welcome at nilayparikh@gmail.com

Tuesday, January 15, 2008

BizTalk 2006 R2 - Throttling - Perfmon Parameters - My Experience

BREAKING NEWS: These days Dr. Time Pass is spending most of his time understanding and implementing the best throttling scenarios :-).

Let me come back to official language. A few days back, I saw lots of throttling in our BizTalk environment. My enthusiasm dragged me into further exploration of the throttling mechanism. Let me grab the opportunity to document some of what I have cooked up.

Over the last few weeks, I saw my BizTalk environment suddenly facing unusual throttling behavior, so I tried to find the root cause behind the throttling state.

The environment was facing lots of message delivery throttling during peak processing hours, and due to the delivery throttling state and the resulting delays our whole environment was stunned. After this incident, I started taking this functionality very seriously and am now digging into lots of material related to it. In this article I am going to share my experience as well as the understanding I gained after going through many books, articles and MS KBs.

As BizTalk professionals we are all aware of inbound message processing and of the internal BizTalk instruments that take care of inbound message management. In short, if I have to explain the term "inbound message", I can state that inbound messages are "messages towards the MessageBox"; in BizTalk, the MessageBox is the heart of any environment or implementation. We were facing problems due to an asymmetry between the incoming message rate and the outgoing message rate. I noticed that in our environment the incoming message rate was very high compared to the outgoing message rate (here outgoing means subscriptions to the MessageBox; we are talking about deliveries), and as a result the number of messages in memory kept increasing. BizTalk recognized this and our environment moved into a throttling state, but even throttling was not able to bring the system back to a stable state. Due to the very heavy message rush and the low outgoing rate, a very high number of messages were waiting in the queue for processing. We saw delays of almost 300-700 seconds during processing. And finally the nightmare came true :-0, and the counter called "Days without Severity 1" needed to be reset. What I would like to convey is the need for optimized throttling configuration parameters when setting up any host. I would state it as a 'must' recommendation to set up a well-configured and well-planned set of hosts for the peak-time processing of critical interfaces. I don't want to drive this article into implementation design, so let me come back to the throttling scenarios.



In this article I would like to focus on perfmon counters, a few important ratios, and throttling states.

Let's first dig into some state counters. State counters show the current state of the environment being monitored. BizTalk 2006/R2 comes with very useful perfmon state counters.

The "High" counters, namely High database session, High database size, High in-process message count, High message delivery rate, High process memory, High system memory and High thread count, each represent only a state, showing a value of 0 or 1, where 1 means the respective area has crossed its configured or warning level. They show very useful information that should be monitored in your BizTalk environment.
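
Because these counters are simple 0/1 flags, checking them in a monitoring script is straightforward. The Python sketch below assumes you have already collected one sample of the counters into a dictionary (how you collect them, for example from a perfmon log export, is up to you); the function name and the sample values are hypothetical.

```python
from typing import Dict, List


def raised_flags(sample: Dict[str, int]) -> List[str]:
    """Return the names of the state counters currently reporting 1."""
    return sorted(name for name, value in sample.items() if value == 1)


if __name__ == "__main__":
    # Hypothetical sample taken during peak processing.
    sample = {
        "High database session": 0,
        "High database size": 0,
        "High in-process message count": 1,
        "High message delivery rate": 1,
        "High process memory": 0,
        "High system memory": 0,
        "High thread count": 0,
    }
    for name in raised_flags(sample):
        print("Crossed configured level:", name)
```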

In-process message count and Active instance count are two message queue counters. In-process message count shows the number of in-memory messages delivered to the XLANG engine or the outbound messaging engine (EPM) that are not yet processed, and Active instance count shows the number of instances in an active state in memory.

There are also specialized message delivery counters: Message delivery delay (ms), Message delivery incoming rate, Message delivery outgoing rate, Message delivery throttling state, Message delivery throttling state duration and Message delivery throttling user override. All of these perfmon counters should be monitored regularly during peak processing to understand the behavior of your environment and the emerging patterns, so that failures can be understood and prevented.

Message publishing delay (ms): The current delay in ms imposed on each message publishing batch (applicable if the message publishing is being throttled and if the batch is not exempted from throttling).

Message publishing incoming rate: Number of messages per second that are being sent to the database for publishing in the given sample interval.

Message publishing outgoing rate: Number of messages per second that are actually published in the database in the given sample interval.

Message publishing throttling state:
  • 0: Not throttling
  • 2: Throttling due to imbalanced message publishing rate (input rate exceeds output rate)
  • 4: Throttling due to process memory pressure
  • 5: Throttling due to system memory pressure
  • 6: Throttling due to database growth
  • 8: Throttling due to high session count
  • 9: Throttling due to high thread count
  • 11: Throttling due to user override on publishing
Message publishing throttling state duration: Seconds since the system entered this state. If the host is throttling, how long it has been throttling; if it is not throttling, how long since throttling was applied.
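
When reading perfmon logs it helps to translate the numeric state into text. A small Python lookup built from the list above could look like this; the dictionary and function names are my own, and only the codes listed in this article are included.

```python
# Message publishing throttling state values, as listed above.
PUBLISHING_THROTTLING_STATES = {
    0: "Not throttling",
    2: "Throttling due to imbalanced message publishing rate (input rate exceeds output rate)",
    4: "Throttling due to process memory pressure",
    5: "Throttling due to system memory pressure",
    6: "Throttling due to database growth",
    8: "Throttling due to high session count",
    9: "Throttling due to high thread count",
    11: "Throttling due to user override on publishing",
}


def describe_publishing_state(code: int) -> str:
    """Translate a counter value into its description."""
    return PUBLISHING_THROTTLING_STATES.get(code, f"Unknown state {code}")


if __name__ == "__main__":
    for value in (0, 2, 7):
        print(value, "->", describe_publishing_state(value))
```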

I would like to present a basic example of some perfmon data and counters as a graph.



I will try to explore the message publishing throttling states in my coming articles.

Thanks for bearing with me during the article. Your valuable feedback and suggestions are welcome at nilayparikh@gmail.com

Related articles posted by me on the same blog:
1. BizTalk Server 2006 / R2 Throttling Mechanisms

Friday, January 4, 2008

BizTalk Server 2006 / R2 Throttling Mechanisms

BizTalk 2006 / R2 brings good relief in terms of performance and throttling mechanisms. If you work closely with administration and performance tuning for a BizTalk environment, or for any messaging system, you can visualize the complexity of peak processing time scenarios. In this article I am focusing on particular BizTalk throttling scenarios and their solutions. I have derived the content below from a few books as well as from my personal experience working on performance criteria for BizTalk environments.

The cases below might not cover every possible throttling scenario in a live BizTalk environment, but I have tried to collect as many as I could from memory.


Case: High message ratio for delivery rate / completion rate

A high value for this ratio indicates that the service is unable to handle the rate of inbound/incoming messages to the environment. As a solution, we can throttle message delivery and make as many resources as possible available to BizTalk Server artifacts such as XLANG and the outbound transports. As a result of the improved message completion rate, message latency and the I/O operation rate on the message box will drop sharply; we can also gain further benefit by clearing the queue at the outbound endpoint.

Ideally, the environment should maintain a ratio of around 1.05 – 1.07.

We can monitor the message delivery rate and message completion rate through various instruments such as WMI, the object model (OM), perfmon counters, etc.
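
Once the two rates are collected, the ratio check itself is trivial. Here is a minimal Python sketch, assuming the rates are already available as numbers; the function names are hypothetical, and the 1.07 upper bound is simply the top of the range suggested above rather than an official limit.

```python
import math


def delivery_ratio(delivery_rate: float, completion_rate: float) -> float:
    """Message delivery rate divided by message completion rate."""
    if completion_rate == 0:
        # Nothing is completing: the ratio is unbounded if anything is delivered.
        return math.inf if delivery_rate > 0 else 0.0
    return delivery_rate / completion_rate


def is_ratio_high(ratio: float, upper_bound: float = 1.07) -> bool:
    """True when the ratio is above the suggested healthy range."""
    return ratio > upper_bound


if __name__ == "__main__":
    ratio = delivery_ratio(300, 200)
    print(ratio, is_ratio_high(ratio))
```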


Case: High publishing ratio for request rate / completion rate

A high ratio indicates that the message box is unable to cope with the load. To improve such a scenario, we can block publishing threads to slow down the rate, and we can also signal the service class to slow down its message publishing rate. Beyond that, there is no solution other than scaling out the SQL Server environment where the message box is deployed; if the environment faces such problems repeatedly, scaling out the SQL Server environment is the appropriate decision.

I don't have an accurate observation for the value of this ratio, but I am sure every organization needs it to be 1 to call its messaging service reliable :-).

To get the ratio, we need to monitor the entry and exit of the commit batch call.


Case: Process memory exceeds a threshold

If process memory exceeds a threshold, your environment and batch processes probably need more memory; if you get such alerts or messages regularly, you should consider adding memory to the environment. This condition affects components such as XLANG and all kinds of transports (adapters/pipelines). It also signals services to dehydrate and shrink caches.

By monitoring the private bytes of a process we can get a clear idea of the memory threshold for individual processes. In BizTalk, hosts can be treated as processes.


Case: System memory exceeds a threshold

We can assume the same possibilities as discussed for the case "Process memory exceeds a threshold" above.

Here you can monitor physical memory and its trend through perfmon.


Case: Database sessions being used by the process exceed a threshold count

BizTalk throttles publishing inside the environment. To improve this condition, we need to tune the SQL Server database properly and scale up/out according to our threshold requirement. There is no quick resolution for this scenario, but we can reduce or block the external/idle sessions open on SQL Server. If BAM Notification / BAS are configured in your environment and their databases are running on the same server, you can disable or suspend those lower-priority services to bring message processing back to normal.

This scenario can affect XLANG objects as well as all types of inbound transports.

You can monitor sessions on the message box through perfmon counters, SQL Server counters, Management Studio, etc. In a clustered environment you can monitor sessions per message box.


Case: Process thread count exceeds a particular threshold

BizTalk throttles publishing and delivery, and signals services to reduce the thread pool size.

This scenario affects XLANG objects as well as all kinds of transports (adapters/pipelines). There is no direct solution for it; a long-term solution could be to scale out the servers, and my suggestion would be to migrate to x64 or Itanium (or equivalent) processor servers. To handle peak processing times during the day, we can give more threads and higher priority to the host processes hitting the threshold.

You can monitor counters such as threads per CPU to handle such scenarios.


BizTalk Server 2006 / R2 handles such throttling automatically, but we can get much better performance by helping the mechanism along. To perform automatic throttling, BizTalk Server uses configuration parameters; I will try to come up with a detailed review of each parameter in my next article.

Thanks for bearing with me during the article. Your valuable feedback and suggestions are welcome at nilayparikh@gmail.com.