The Secret Diary of Han, Aged 0x29

twitter.com/h4n

Archive for August 2005

Windows 2003 service madness.

Try running the service as “NT AUTHORITY\LocalService”.

If it does not run as “LOCAL SERVICE” on win2k3 you may need to alter the ‘User Rights Assignment’ in ‘Local Security Policy’ under Admin Tools.

On XP “SYSTEM” is part of the “Log on as a service” policy under ‘User Rights Assignment’. That’s probably why you were able to run the service as “localsystem” on your XP box.

I believe that if you add “LOCAL SERVICE” to the “Log on as a service” policy under ‘User Rights Assignment’, the alert filter service should be able to run as “NT AUTHORITY\LocalService” on win2k3.

I’m not sure if win2k3 has a “SYSTEM” account. If it does, adding it to the “Log on as a service” policy may allow you to run the service as “localsystem”.

I know it seems ridiculous to have to add the “Log on as a service” user right for a user called “LOCAL SERVICE” (or SYSTEM for that matter), but I have had success doing this in the past on nt4 and win2k.

Hope this helps.

Written by Han

August 25, 2005 at 11:40

Posted in Uncategorized

Cisco QoS (bandwidth mgmt) instance mapping

Until now all instance mappings we had were fairly straightforward. None involved more than a single table lookup, although some details varied. That has changed. Today I implemented the instance map for bandwidth monitoring on the Cisco 4500. I had to read in no fewer than 4 different tables, one of which really describes a tree structure and requires some extra navigation. Before I forget how this is done, here are some details.

The tables to be read are:

  • the ifTable, for ifIndex and ifName
  • the policyIndex table (cbQosIfIndex). Gives the policies attached to each interface
  • the class map table. Gives a configuration index for each class map name
  • the objects table. Describes QoS instances and their hierarchical relationships

The algorithm goes like this:

- iterate through ifTable

- iterate through policyIndex table

- iterate through classMap table

- find an objectId using interface, policy and classmap

- create map from interface and classmap to policy and object

Finding the object works as follows:


- find classmap object using config index and policy index. This is the parent object of the one we are looking for

- find classmap object that is child of the one found in the previous step, and has a type value of 7 (police)
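The two lookup steps, and the loop that builds the map, can be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the `QosObject` fields, the function names, the sample rows and the class-map type value of 2 are assumptions, and the tables are assumed to have been read into memory already.

```python
from typing import NamedTuple

POLICE = 7    # type value for police objects, as described above
CLASSMAP = 2  # assumed type value for class map objects


class QosObject(NamedTuple):
    policy_index: int  # policy this object belongs to
    object_index: int  # index of the object within the policy
    config_index: int  # configuration index (from the class map table)
    object_type: int
    parent_index: int  # object_index of the parent, 0 for the root


def find_police_object(objects, policy_index, config_index):
    """Return the police object below the given class map, or None."""
    # Step 1: find the class map object (the parent) by policy and config index.
    parent = next(
        (o for o in objects
         if o.policy_index == policy_index
         and o.config_index == config_index
         and o.object_type == CLASSMAP),
        None,
    )
    if parent is None:
        return None
    # Step 2: find the child of that object with type 7 (police).
    return next(
        (o for o in objects
         if o.policy_index == policy_index
         and o.parent_index == parent.object_index
         and o.object_type == POLICE),
        None,
    )


def build_instance_map(if_names, policies, class_maps, objects):
    """Map (interface name, class map name) to (policy index, object index)."""
    instance_map = {}
    for if_index, if_name in if_names.items():
        for policy_index in policies.get(if_index, []):
            for class_name, config_index in class_maps.items():
                obj = find_police_object(objects, policy_index, config_index)
                if obj is not None:
                    key = (if_name, class_name)
                    instance_map[key] = (policy_index, obj.object_index)
    return instance_map


# Hypothetical sample data standing in for the four tables.
if_names = {1: "Gi1/1"}                       # ifIndex -> ifName
policies = {1: [100]}                         # ifIndex -> policy indexes
class_maps = {"gold": 600, "silver": 601}     # class map name -> config index
objects = [
    QosObject(100, 1, 500, 1, 0),             # root of the policy (type assumed)
    QosObject(100, 2, 600, CLASSMAP, 1),      # class map "gold"
    QosObject(100, 3, 700, POLICE, 2),        # police object below "gold"
]

instance_map = build_instance_map(if_names, policies, class_maps, objects)
print(instance_map)
```

Class maps without a police child (like "silver" in the sample data) simply do not end up in the map.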

The mapping is set up for all configured interface + class name combinations on the router at once, and refreshed at configurable intervals, like all instance mappings.

The instance map for bandwidth monitoring on the Cisco 4500 is a different story. First, it seems to start with not one, but two instance values. One is an interface name, the other is a bandwidth policy class map name. The reason, of course, is that two different component instances are involved.

Written by Han

August 25, 2005 at 07:12

Posted in Uncategorized

Alert filter service deployed to production. Windows 2003 service madness.

Last Monday I deployed the cycle 12 backport of the alert filter service to production. Since it is cycle 12, it cannot use the new alert view attributes. Instead it uses ViewFlag, and it is therefore limited to excludes only. In addition, it does not look at any alerts that have ViewFlag set to 3 (Forced Primary), so that in effect ViewFlag = 3 overrides all filters. A read-only mode was also added that logs any changes the service wants to make but doesn’t actually make them, which is great for testing the impact of running the service.

The service was installed to run under the “localsystem” account, which is the default for services. Although this worked great on my XP machine, on the production Win2003 server (correlator1) it did not:

System.Net.WebException: The underlying connection was closed: Unable to connect to the remote server.

Running the service under the mgt\administrator account fixed this. So evidently localsystem does not have enough rights on Windows 2003. Since running under an admin account is clearly not the safest thing to do, I tried various other accounts, including the new “network service”. None of these has the necessary rights though, so for now it remains under mgt\administrator. One problem is that if a user logs into the server under this account, the account loses its right to run as a service until the service is manually restarted. Automatic restarts of the service don’t work. Therefore it is best to create a new account with admin privileges and run the service under this account.

Written by Han

August 23, 2005 at 15:32

Posted in Uncategorized

Netcool DataServer now 10x faster

Last week W. and I did a load test for the AlertFilter service. Although the purpose of the test was to see if a single update could handle thousands of alerts at once (it can), we noticed that reading the full alerts.status table with 4000 alerts using the Netcool Dataserver took about 25 seconds on the fast staging hardware.

I suspected this was due to serialization of the alerts into XML, since the netcool desktop event viewer only takes a few seconds to load so many alerts. I tested this by removing all XML serialization and just dumping the raw database output. This took 2 seconds for the full database. So clearly the delay was caused by the serialization step.

The library I used for XML serialization was XML::Simple. I replaced this with the more highly regarded XML::Writer. This increased the serialization time to about 70 seconds. I then decided to do my own serialization, just concatenating the strings and then replacing all <, >, and & with &lt;, &gt;, and &amp;. This brought the time to 2.5 seconds, very close to the raw data dump.
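For illustration, the hand-rolled approach looks roughly like this. The sketch is in Python rather than the Perl of the DataServer, and the element names (`alerts`, `alert`) and the sample row are made up. It assumes values only end up in element content, so quotes do not need escaping.

```python
def escape_xml(text):
    # Escape & first, so that the ampersands introduced by the
    # other two replacements are not escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def serialize_alerts(rows):
    """Serialize a list of dicts to XML by plain string concatenation."""
    parts = ["<alerts>"]
    for row in rows:
        parts.append("<alert>")
        for name, value in row.items():
            parts.append(f"<{name}>{escape_xml(str(value))}</{name}>")
        parts.append("</alert>")
    parts.append("</alerts>")
    # Joining once at the end avoids building many intermediate strings.
    return "".join(parts)


# Hypothetical alert row; real rows would come from alerts.status.
rows = [{"Node": "router1", "Summary": "load > 90% & rising"}]
xml = serialize_alerts(rows)
print(xml)
```

The order of the replacements matters: doing & last would turn an already escaped &lt; into &amp;lt;.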

I added the new serialization code to the cycle 13 code base and deployed it in dev and staging.

Written by Han

August 23, 2005 at 15:15

Posted in Uncategorized

Category mapping attributes

For completeness, here are the attributes that can be used for category mapping:

  • event.iprange. This is an IP range filter, that matches if the alert’s IP address is within the given range list
  • event.message. This is a regular expression that tries to match the alert text (also known as alert summary in netcool).
  • event.monitor. This is the attribute that, together with the manager attribute, determines what agent or probe created the alert. In Netcool known as AlertGroup
  • event.manager. The netcool Manager attribute
  • event.componentclass. The component class of the component that the event was for
  • component.customer. The customer that owns this component
  • device.os. The OS indication for the device. This is a number that broadly indicates the kind of OS. 1 = windows, 2 = solaris, 4 = linux, 8 = network device.

All are regular expressions, except for event.iprange, which is an IP range list, and device.os, which is a number that must match exactly. If a component cannot be found for the event, device.os is not available, and any rule that contains device.os will never match.

Finally, the value for customer is the customer that owns the component. If a component is not found or the customer value is absent (NULL), but a customer is found using the customer mapping rules, this value is taken instead.

Here is an example:


	<categoryMappingRules>
		<default value="network" />
		<rule event.message="Ping fail.*" event.manager="Precision" value="network" />
		<rule device.os="1" value="windows" />
		<rule device.os="2" value="unix" />
		<rule device.os="4" value="unix" />
		<rule device.os="8" value="network" />
	</categoryMappingRules>
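To make the rule semantics a bit more concrete, here is a rough sketch of how such a rule set could be evaluated. It is written in Python and is not the ServiceCorrelator code. It assumes that rules are tried in order, that the first rule whose conditions all match wins, and that regular expressions are matched unanchored; none of this is confirmed above. The function names are made up, and event.iprange is left out for brevity.

```python
import re
import xml.etree.ElementTree as ET

CONFIG = """
<categoryMappingRules>
    <default value="network" />
    <rule event.message="Ping fail.*" event.manager="Precision" value="network" />
    <rule device.os="1" value="windows" />
    <rule device.os="2" value="unix" />
</categoryMappingRules>
"""


def condition_matches(name, expected, attrs):
    actual = attrs.get(name)
    if actual is None:
        # Attribute not available, e.g. device.os when no component was found:
        # a rule that contains it never matches.
        return False
    if name == "device.os":
        return int(actual) == int(expected)  # a number that must match exactly
    return re.search(expected, str(actual)) is not None


def map_category(config_xml, attrs):
    """Return the value of the first matching rule, else the default value."""
    root = ET.fromstring(config_xml.strip())
    for rule in root.findall("rule"):
        conditions = {k: v for k, v in rule.attrib.items() if k != "value"}
        if all(condition_matches(k, v, attrs) for k, v in conditions.items()):
            return rule.get("value")
    default = root.find("default")
    return default.get("value") if default is not None else None


print(map_category(CONFIG, {"device.os": 1}))
print(map_category(CONFIG, {"event.message": "disk full"}))
```

In the second call no component attributes are available, so none of the device.os rules can match and the default value is used.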

Written by Han

August 19, 2005 at 22:20

Posted in Uncategorized

Customer mapping added

In cycle 13, a new “category mapping” feature was added to the ServiceCorrelator (addCustomer) service. Due to popular demand, customer mapping was added to this feature as well. As was the case for category mapping, the mapping itself is completely configurable in the web.config file. The customer can be mapped based on the following attributes:

  • event.iprange. This is an IP range filter, that matches if the alert’s IP address is within the given range list
  • event.message. This is a regular expression that tries to match the alert text (also known as alert summary in netcool).
  • event.hostname. Another regular expression that tries to match the alert hostname (better known as NodeAlias in netcool)

The mapped value of each rule is a customer name. This name is checked in Siebel. If the customer is unknown there, the rule is considered invalid and skipped.

Here is an example customer mapping section:


	<customerMappingRules>
		<default value="Internal" />
		<rule event.iprange="1.2.3.0/24, 2.3.4.0/24" value="Customer Name" />
		<rule event.message="I'm feeling lucky" value="Google" />
	</customerMappingRules>

If the component already knows its customer from Opsware, then that value takes precedence over anything that comes from the mapping rules. Note that a blank value in the database also counts as a value. Only if its value is absent (NULL) do the mapping rules take over.
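The IP range check and the precedence rule can be sketched as follows. This is an illustration in Python only. The function names are made up, `None` stands in for a database NULL, and the range list is assumed to be a comma-separated list of CIDR blocks as in the example.

```python
import ipaddress


def ip_in_ranges(ip, range_list):
    """True if ip falls within any CIDR block of a comma-separated list."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in ipaddress.ip_network(item.strip())
        for item in range_list.split(",")
    )


def resolve_customer(component_customer, mapped_customer):
    """Pick the customer: the component's own value wins unless it is NULL."""
    # None stands for NULL. An empty string is a real (blank) value and
    # therefore still takes precedence over the mapping rules.
    if component_customer is not None:
        return component_customer
    return mapped_customer


print(ip_in_ranges("1.2.3.77", "1.2.3.0/24, 2.3.4.0/24"))
print(repr(resolve_customer("", "Google")))
print(repr(resolve_customer(None, "Google")))
```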

Written by Han

August 19, 2005 at 22:13

Posted in Uncategorized

Alert filter service (2)

The service can now filter on all alert attributes. Node uses the IP address range filter. The rest use regular expressions. Regular expressions usually don’t support negative matches (match if the string is not equal to…). However, the same effect can be achieved with a negative lookahead or lookbehind assertion. For example, Node=“^(?!compaq)” will match anything that does not start with “compaq”.
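A quick way to try out the lookahead technique is shown below, using Python's re module (the service itself may use a different regex engine, so treat this as a sketch). The host names are made up. The pattern puts ^ outside the lookahead, so that an unanchored search cannot succeed by retrying at a later position in the string.

```python
import re

# Negative lookahead anchored at the start of the string:
# succeeds only if the string does not begin with "compaq".
not_compaq = re.compile(r"^(?!compaq)")


def matches(pattern, value):
    # Unanchored search, as many filter implementations would do.
    return pattern.search(value) is not None


hosts = ["compaq-dl380", "sunfire-v240", "mycompaq"]
results = {host: matches(not_compaq, host) for host in hosts}
print(results)
```

Only the first host is rejected; "mycompaq" contains the word but does not start with it, so it still matches.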

In addition, the service was backported to cycle 12, where it modifies the ViewFlag attribute. Due to limitations of the cycle 12 model, only a single level of exclusions and no inclusions are supported. Also, alerts with ViewFlag == 3 (“forced primary”) are not touched at all.

Written by Han

August 19, 2005 at 22:01

Posted in Uncategorized