| Does Lightstreamer offer guaranteed delivery? |
The Lightstreamer network protocol is based on TCP. Being Lightstreamer Server a stand-alone application (i.e. not intermediated by a web/application server) it owns a direct control over TCP. The TCP protocol offers guaranteed delivery, since it uses backward ACK packets to notify the Server that each piece of information has reached the Client. Lightstreamer’s guaranteed delivery works as follows. First of all, it depends on the subscription mode. Lightstreamer offers four different modes (RAW, MERGE, DISTINCT, COMMAND). Basically, in MERGE mode subsequent events can be filtered and mixed together, in order to reduce bandwidth and frequency. In DISTINCT mode subsequent events are never mixed and they are delivered one by one to the Client. MERGE mode is typically used for stock prices; DISTINCT mode is typically used for news headlines or order execution notifications in a portfolio. Let’s suppose that a web client subscribed to a portfolio in DISTINCT mode. The maxFrequency parameter of that subscription has been set to “unfiltered”, which implies an “unlimited” ItemEvenBuffer size. An global maximum bandwidth has also been allocated for that streaming session.
The Data Adapter begins to pump updates pertaining to the items of that portfolio into the Lightstreamer Kernel, that in turn delivers them in real-time to the right Client(s). The Kernel guarantees the delivery of the updates in the original order. All the possible scenarios for the delivery of each packet are the following:
- The packet reaches the Client. A TCP acknowledge is sent back to the Server, which can keep on delivering the updates. Of course, the custom client-side web application has to display the updates upon receipt. But if it’s not defective, this will predictably happen.
- The packet does not reach the Client because the allocated maximum bandwidth (if it was allocated) is currently saturated OR there is a network congestion preventing the packets from being delivered. In these cases the Kernel buffers the events until the network allows it to release them. Should the internal buffer (whose maximum physical size depends on the memory of the server box) get full (quite an impossible event for a portfolio), then one or more updates are LOST: in this case a notification is sent to the Client with high priority (on the same stream connection but in front of the buffered updates) so that the Client application can choose to take some recovery action. If even the delivery of the notifications is impossible, it means there is a serious network problem and we fall into case 3.
- The packet does not reach the Client because the stream connection was closed due to a network timeout or the Server crashed. In this case, after a configurable timeout for missing heartbeat packets, the Web Client automatically and transparently aborts the current session and opens a fresh connection to the Lightstreamer Server cluster. Then it resubscribes to all the current items (without the user having to do anything), while requesting for a “snapshot”. This means that the Server will send back the current image of the items, to guarantee that no lost events occurs during the transition between the old session and the new one.
DISTINCT mode was used in this example for the sake of simplicity. In reality a portfolio is easier to handle through the COMMAND mode. In COMMAND mode all the considerations above apply when delivering new rows. But it keeps the ability to merge events pertaining to the same row. This means that if you deliver an order submission to the Client and after half a second you deliver to the Client the execution of that order, if the Client establishes a new session (in case 3), as a snapshot it will receive the updated item (order execution), not the redundant history (order submission covered by a order execution). This improves efficiency and bandwidth usage a lot. Obviously, if the Client needs the full history of the Portfolio instead of “current-moment coherency”, DISTINCT mode can be used. |
| Could you elaborate on what is specifically handled within the Metadata Adapter? Would an example convey this? |
The Metadata Adapter is mainly used for authentication/authorization. It provides some interfaces that give you full fine-grained control over what each user can see and do.
We provide a default Metadata Adapter (under ?Lightstreamer\sdk_for_adapters\examples\ReusableMetadataAdapters?) that is very simple and is good for a POC. From there you can add your custom code to get a production version, in which you authenticate the user, grant a maximum bandwidth, etc.
The interface of the Metadata Adapter is quite straightforward. See MetadataProvider interface: many optional methods are available to customize the behavior of the push sessions. For example, notifyUser() is called to check the user?s credentials; getAllowedMaxBandwidth() allows you to configure the maximum streaming bandwidth for each specific user. Most of these methods are optional and you have to implement them only if you need.
Basically, to identify the users that open Lightstreamer sessions, you will need to implement the notifyUser() callback of the Metadata Adapter. You will receive the credentials passed by the JavaScript code in the page (you can choose to provide a username+password, a sessionid, or whatever you want). Then it?s up to your implementation of the method to validate the credentials and associate the session to a certain specific user of your system. |
| I'm looking for a good explanation of the purpose of snapshot [...] |
I'm looking for a good explanation of the purpose of snapshot. For example, if we are displaying stock prices should I be setting isSnapShot to true each time last price is updated or should I only set isSnapShot to true when the data is originally fetched and ignore resetting isSnapShot each time an update occurs? How is the isSnapshotAvailable method to be implemented?
The second meaning is the correct one:
"only set isSnapShot to true when the data is orginally
fetched and ignore resetting isSnapShot each time an
update occurs".
The initial snapshot is the state of anitem at the time such item is subscribed to, so it contains data that has accumulated before subscription time.
The snapshot takes different forms on different subscription modes. However, it is always implemented through a list of updates. This means that the Data Adapter, upon subscription, should initially send some updates carrying initial snapshot and then should start sending the normal updates. In MERGE mode (the simplest case), the snapshot just consists of one update, which does not carry changed field values but current field values.
As some Data Providers may not be able to supply the current state of an item, the Data Adapter can omit sending the snapshot upon subscription; however, in this case, the Adapter must declare this limitation by returning false from isSnapshotAvailable.
Once it has received the snapshot, the Server keeps it and updates it by integrating all the subsequent updates into it, in order to be able to send the current snapshot upon each client subscription request.
Let's recap the workflow: the Data adapter will provide the initial snapshot for an item as the first client subscribes to that item (the subscribe method is called on the DataAdapter). After that, until there is at least one client subscribed to that item, the snapshot is completely handled by Lightstreamer Kernel. When no more clients are subscribed to that item, the unsubscribe method is called on the DataAdapter. On the next subscription to that item the DataAdapter should send the snapshot events again (because the subscribe method is called again).
Snapshot is fundamental for fail-over. When a Client connects to a different Server, for fail-over, it will need a fresh snapshot for all subscribed items. If some of these items are subscribed to for the first time on the new Server, the snapshot must be obtained from the Data Adapter. |
| Should the Adapters be aware of the subscription mode for an item? |
The subscription mode should be specified with each request.
The Data Adapter just sends data and it is not aware of the mode
used by the clients to subscribe to the items.
However, data for items to be subscribed to in MERGE, DISTINCT
or COMMAND mode is quite different and requires different
processing by the Server. Hence, for each item, only one
of these modes should be allowed to clients. It is the Metadata
Adapter that should ensure this, through the "modeMayBeAllowed"
method.
RAW mode, on the other hand, poses no restrictions on the
data and requires no specific processing by the Server;
therefore it can be requested for all items.
If any items are to be subscribed to in RAW mode
and are not suitable for any of the MERGE, DISTINCT
and COMMAND modes,
the "modeMayBeAllowed" method can again be used
to enforce this and to prevent other types of subscriptions.
The "adapters.xml" file is not involved in the above,
unless you stick to the LiteralBasedProvider.
In that case, in the section devoted to the Metadata Adapter,
you can configure the behaviour of the "modeMayBeAllowed"
method of the LiteralBasedProvider, by leveraging the
<item_family_...> and <modes_for_item_family_...> elements,
as shown in the sample configuration file.
On the other hand, if your Metadata Adapter is based
on the ARI protocol, then you should provide mode information
by answering to the "Get Item Data" (GIT) command properly.
Through the "allowed modes" array, you can specifiy
that only RAW mode is allowed for your items,
or that both RAW and one out of MERGE, DISTINCT
and COMMAND modes are allowed. |
| The Data Adapter API specifications say I can't reuse HashMap objects. Is it just a matter of correctness? |
When an update event is supplied to Lightstreamer by the Data Adapter through "update" or "smartUpdate", the event object is forwarded by Lightstreamer to worker threads and kept for some unspecified time.
As a consequence, the Data Adapter cannot reuse that object instance anymore, otherwise race conditions between reads by Lightstreamer and further modifications of the object by the Data Adapter could happen and Lightstreamer might find the update event object in an inconsistent state.
The above holds for all types of event objects supported (namely, ItemEvent, OldItemEvent and all Map subtypes).
However, if the object class is not thread safe, this also means that, in case the Data Adapter tried to reuse an event object, the object might be accessed in a improper way.
In fact, in case of reuse, even if the Data Adapter explicitly synchronized its own accesses to the object, it would be possible that an access by Lightstreamer and an access by the Data Adapter happen concurrently.
The consequences of concurrent accesses to a non-thread-safe object can be unpredictable and the final effect can be of a totally different nature and difficult to investigate.
This is the case of HashMap, which is the most common event object type.
The curruption introduced into a HashMap object by lack of synchronization can be as bad as causing calls to get() and contains() to get stuck in endless loops.
The race condition could be exploited at any moment and once a loop is entered, it might also cause important locks to be kept, leading to internal buffers growth, thread pools exhaustion and the whole system to become unresponsive.
Although using ConcurrentHashMap in place of HashMap prevents all synchronization problems, you might prefer to stick to HashMap because of performance reasons.
Hence, it is important to stress that the Data Adapter must take care that concurrent accesses to the same HashMap instance are not possible and, in particular, that HashMap instances are no longer accessed after they are supplied to "update" or "smartUpdate". |
| How should I implement the subscribe method of my Data Adapter for high performance? |
The invocation of the subscribe() method on the Data Adapter can be the cause of delays on the processing of aggregated subscription requests (i.e. when multiple tables are subscribed at once). Aggregated subscription requests are performed sequentially; so, if the subscribe() method needs to perform slow operations, it is preferred that they are dispatched asynchronously. Note that the subscribe() method:
- doesn't need to return an error code;
- doesn't need to send a snapshot before returning: the snapshot can always be sent asynchonously;
- can throw a SubscriptionException, but its only effect is to prevent "unsubscribe" to be called for the item;
- can throw a FailureException, but this is the same as invoking "failure" on the listener.
So, performing subscription operations asynchronously should always be possible, though this may require some complication of the Data Adapter code.
Pauses in the subscribe() call cause the client to re-issue some subscription requests, with some overload (though the redundant requests are just discarded). Moreover, they cause some threads from a limited pool to remain blocked; this requires that you extend the pool by acting on the "server_pool_max_size" configuration flag.
Starting with Server version 3.5.2 (still to be released, at the moment) the subscribe() and unsubscribe() calls will be issued by the Server asynchronously and in parallel on the "server" thread pool; the only constraint will be that subsequent calls related with the same item will be performed in sequence.
Hence, a slow implementation will no longer delay the answer to the client subscription request (note that a failure in a subscription is not notified to the client); nor it will delay the update flow on the session, but for the involved item.
Moreover, it will be possible to configure thread pools to be used to run subscription requests specifically for each Data Adapter; in this case, slow implementations will in no way propagate the delay to operations on different Data Adapters. |
| How should I implement the notifyUser method of my Metadata Adapter for high performance? |
The implementation of the notifyUser() should be nonblocking. However, using a blocking implementation has no severe consequences. There are two kinds of negative impacts, both of which can be minimized.
- While the "notifyUser" method is executing, a thread from a limited pool is consumed. When multiple stream connection requests come in short time, this may cause the thread pool to be filled and some requests to be queued. You can size this thread pool according to your estimated request rate through the "server_pool_max_size" configuration flag. This allows you to process more than the default 10 requests concurrently; setting 0 allows an unlimited number of requests to be processed concurrently. Note that this also includes control requests; but blocking requests by notifyUser() only involves connection requests (occurring once for each session).
- A delay in notifyUser() causes a delay in the response from a connection request issued by the client. If the delay is as high as several seconds, then the recovery mechanisms implemented in some client libraries may cause a second connection attempt in POLLING mode (with its own call to notifyUser()) to be issued before the first one returns. In this case, a double-start effect may be observed on the client. To prevent this, the first call to notifyUser() should return clearly before the second one.
Starting with Server version 3.5.2 (still to be released, at the moment) it will be possible to configure thread pools to be used to run "notifyUser" specifically for each Adapter Set.
In this case, a slow implementation of "notifyUser" will in no way propagate the delay to other kinds of requests or to requests related with different Adapter Sets.
However, the client session requests could still be delayed and interfere with the recovery mechanisms implemented in some client libraries.
Slow implementations of other Metadata Adapter methods are still not recommended; however, for the most critical of them, it will also be possible to configure proper thread pools to reduce delay propagation.
See each method documentation for details. |
| What is Pre-filtering? |
| A Pre-Processor is created and used internally by Lightstreamer Server for each item and is independent on the user sessions. It manages the updates received from the Data Adapter before dispatching them to all the involved client-specific ItemEventBuffers. A Pre-Processor implements snapshot management and update-frequency Pre-Filtering for the related item. The behavior depends on the item subscription mode. Note that MERGE, DISTINCT and COMMAND subscription modes are exclusive; this means that all the clients have to subscribe the same item in the same way (or in RAW mode, which, on the other hand, can always be combined with the other modes). Other Pre-Processor parameters can be set by Lightstreamer Server configuration. In particular, a Pre-Filter frequency can be set for items subscribed in MERGE or DISTINCT mode, in order to allow the Server to protect against very high update rates. For example, if the data feed produces 50 updates per second for a given item, but no user session will be allowed to receive more than 5 updates per second, then the data flow can be pre-filtered to 5 updates/second (this is done one time only, instead of doing it one time for each user session). Then each user session will be free to apply a second filtering frequency (e.g. 3 updates/second). Pre-filtering is accomplished by implementing the "getMinSourceFrequency" method in the Metadata Adapter, that is consulted by the Server for each item. |
| What's the meaning of the max_buffer_size configuration parameter? |
This limit refers to the number of events waiting for dispatching,
not to the event size. The limit is applied for each single item,
or more precisely, to each single subscription.
In normal cases, the Server should not need to use these buffers,
as it should be able to send all events in real time.
If you subscribe to an item in a mode that allows filtering,
then your subscription settings should also request a small buffer size
(note that, for MERGE mode, a buffer size of 1 is requested by default).
On the other hand, if you subscribe to an item in a mode that
doesn't allow filtering, then you should ensure that the event rate
for the item is not higher than the expected dispatch rate.
The buffer is there in order to allow bursts of events to be dequeued,
but, if the event rate for an item is steadily higher than the dispatch rate,
then the buffer usage grows and this impacts both on dispatching delays
and on Server heap memory usage.
Note that some factors may reduce the dispatch rate even for items
subscribed to in an unfiltered mode:
these are bandwidth limits, network or client slowness and
Server-edition-specific frequency restrictions.
See also "itemEventBuffer" in "General Concepts.pdf". |
Powered by vBulletin Version 3.5.4 Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.
|  |