The Secret Diary of Han, Aged 0x29

Archive for March 2003

Storing peak values in RRD

RRD keeps a fixed amount of storage for a datastream by storing the data in multiple round-robin arrays (RRAs), each with a coarser granularity than the one before. The first RRA might store each value that comes in, say every 10 minutes. The second may average 6 values into one value stored per hour. A third one might consolidate into one value per day. I mentioned average, but in fact the function used to consolidate these data values is configurable, and AVERAGE is just one of the options. The other choices are MIN, MAX, or LAST.

When consolidating data into averages, extreme values get lost. However, peak values are often as important to companies as averages, or more so. In any case, both numbers are potentially valuable for resource allocation purposes. Often, a combination of peak and average makes most sense, showing, for instance, the average number of page views for the last couple of hours, but also the peak.
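To make the consolidation step concrete, here is a minimal sketch in Python. It is not RRD's actual code: the `consolidate` helper and the sample numbers are made up for illustration, and only the four function names come from the options listed above.

```python
from statistics import mean

# Consolidation functions, named after the options mentioned above.
CONSOLIDATORS = {
    "AVERAGE": mean,
    "MIN": min,
    "MAX": max,
    "LAST": lambda values: values[-1],
}


def consolidate(values, steps, cf="AVERAGE"):
    """Collapse every `steps` consecutive values into one value using `cf`."""
    func = CONSOLIDATORS[cf]
    # Only complete groups are consolidated; a trailing partial group is dropped.
    usable = len(values) - len(values) % steps
    return [func(values[i:i + steps]) for i in range(0, usable, steps)]


# Twelve hypothetical 10-minute samples (two hours), with a spike in hour one.
samples = [10, 12, 11, 90, 13, 14, 20, 22, 21, 19, 18, 20]
hourly_avg = consolidate(samples, 6, "AVERAGE")
hourly_max = consolidate(samples, 6, "MAX")
```

With these numbers the spike of 90 is flattened to an hourly average of 25, while the MAX consolidation keeps it.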

It therefore makes sense to update the reporting system to optionally store two (or more) streams. The current implementation only stores incoming data in a single stream. It works as follows: the stream is determined from component and measurement type, and resolved into an rrd file and an rrd datasource. While it is possible to store avg and peak data in two different rrd datastores, that would require the data to be sent twice over the network.
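The lookup from component and measurement type to file and datasource could look something like the sketch below. The file naming scheme and the datasource name `value` are assumptions for illustration only, not what the reporting system actually uses.

```python
from typing import NamedTuple


class StreamTarget(NamedTuple):
    rrd_file: str
    datasource: str


def resolve_stream(component: str, measurement_type: str) -> StreamTarget:
    """Resolve (component, measurement type) into an rrd file and datasource.

    Hypothetical scheme: one file per component and measurement type,
    holding a single datasource called "value".
    """
    safe_component = component.replace("/", "_")
    return StreamTarget(
        rrd_file=f"{safe_component}-{measurement_type}.rrd",
        datasource="value",
    )


target = resolve_stream("web/01", "pageviews")
```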

Since it is possible to store multiple datasources in a single file, this could be used to store avg and peak values. Thus, a single rrd database file would host multiple datastreams. They would be from the same component and measurement type and differ in their consolidation function only. The data collector would update both datasources from a single incoming samplecollection.

In order to keep this simple, it is probably best to consider a single rrd database with multiple datasources as a single logical datastream. One datasource inside, the one labeled the primary datasource, would usually be used as the source of values for graphing or otherwise. However, the secondary could be used if so specified. Finally, there would be one graph type that combines the values from both, in, say, an area graph with a superimposed line for peaks.
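A rough sketch of that idea, using an in-memory stand-in rather than a real rrd file. The class and method names are hypothetical; the point is only that one incoming samplecollection updates both datasources, and that readers get the primary unless they ask for the secondary.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class LogicalDatastream:
    """Stand-in for one rrd database treated as a single logical datastream."""
    primary: list = field(default_factory=list)    # averages
    secondary: list = field(default_factory=list)  # peaks

    def update(self, sample_collection):
        """Update both datasources from one incoming sample collection."""
        self.primary.append(mean(sample_collection))
        self.secondary.append(max(sample_collection))

    def values(self, use_secondary=False):
        """Return the primary values unless the secondary is asked for."""
        return self.secondary if use_secondary else self.primary

    def combined(self):
        """Pair (average, peak) per step, e.g. for an area graph plus peak line."""
        return list(zip(self.primary, self.secondary))


stream = LogicalDatastream()
stream.update([10, 12, 11, 90, 13, 14])   # data arrives over the network once
stream.update([20, 22, 21, 19, 18, 20])
```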


Written by Han

March 31, 2003 at 22:03

Posted in Uncategorized

Implementation of graph consolidation completed

Just completed the implementation of the enhanced graph consolidation features. Access to raw data is not yet implemented.

Component discovery, graph discovery and graph consolidation are now three independent services. Some graph consolidation functionality is built in. However, consolidation is completely controllable from the graph URL. This means that any desired future functionality can be implemented by the portal programmer or configured by the user.

Component discovery is pretty basic for now: all that is implemented is discovering the children of a given component. The rest will be covered in a separate component selection service.

Built-in consolidations are:

For elements to appear on the same graph, the following consolidation criteria are provided:

  • measurement type must be same
  • y quantity and y unit must be same
  • y quantity, y unit and component ip address must be same
  • none (all data streams go to their own graph)

For elements to appear in the same plot, they must be in the same graph. From there, the following consolidation criteria are provided:

  • measurement type must be same
  • measurement type and component class and instance must be same
  • measurement type and component class and ip address must be same
  • none (all datastreams go to their own plot)
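The two levels of consolidation listed above amount to grouping datastreams by a key, first into graphs and then into plots within each graph. Here is a minimal sketch of that grouping; the `Stream` fields, the rule names and `group_streams` are invented for illustration and are not the service's real interface.

```python
from collections import defaultdict
from typing import NamedTuple


class Stream(NamedTuple):
    measurement_type: str
    y_quantity: str
    y_unit: str
    component_class: str
    instance: str
    ip_address: str


# Graph-level criteria: streams with equal keys share a graph.
GRAPH_KEYS = {
    "measurement_type": lambda s: (s.measurement_type,),
    "y": lambda s: (s.y_quantity, s.y_unit),
    "y_and_ip": lambda s: (s.y_quantity, s.y_unit, s.ip_address),
    "none": lambda s: s,  # every (distinct) stream gets its own graph
}

# Plot-level criteria, applied within a graph.
PLOT_KEYS = {
    "measurement_type": lambda s: (s.measurement_type,),
    "type_class_instance": lambda s: (s.measurement_type, s.component_class, s.instance),
    "type_class_ip": lambda s: (s.measurement_type, s.component_class, s.ip_address),
    "none": lambda s: s,  # every (distinct) stream gets its own plot
}


def group_streams(streams, graph_rule, plot_rule):
    """Return {graph key: {plot key: [streams]}} for the chosen criteria."""
    graphs = defaultdict(lambda: defaultdict(list))
    for s in streams:
        graphs[GRAPH_KEYS[graph_rule](s)][PLOT_KEYS[plot_rule](s)].append(s)
    return {g: dict(plots) for g, plots in graphs.items()}


a = Stream("cpu", "load", "percent", "server", "web1", "10.0.0.1")
b = Stream("cpu", "load", "percent", "server", "web2", "10.0.0.2")
c = Stream("pageviews", "count", "hits", "site", "main", "10.0.0.1")
by_type = group_streams([a, b, c], "measurement_type", "none")
```

In this example `by_type` holds two graphs, one for `cpu` and one for `pageviews`, and the `cpu` graph holds a separate plot for each of the two servers.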

All graphs have URLs, as written up before.

Written by Han

March 14, 2003 at 01:02

Posted in Uncategorized