Designing Rules


Top     Installation     Introduction     Samples     Tutorial     Reference     Release Notes


So you've started Excel and looked at some of the ARulesXL samples or read through the basic tutorial, but you are wondering how to add rules to one of your own spreadsheets. This introduction will describe the process a step at a time.

There are three basic building blocks for a rule set:

The process of building a rule set is an iterative one. In each iteration you determine and refine the outputs, inputs and rules, then you write more rules and test them. So the steps are:

  1. Identify the Outputs
  2. Determine the Inputs
  3. Write the Rules Naturally
  4. Determine How You will Use the Outputs
  5. Determine How You will Get the Inputs
  6. Write the First Rules
  7. Test the First Rules
  8. Write More Rules
  9. Test More Rules

1. Identify the Outputs

The first step is to determine what you want the result of your rules to be. Here are some possibilities:

As you can see the range of results can be a simple, single value; a set of interdependent values; or a full-text explanation. ARulesXL can produce all of these.

As an example, suppose we want to advise on what type of shaft to install in a particular golf club (see the 'Product Advice.xls' sample in the beta test release). The output for this system is:

For a loan application the output is 'approved' or 'denied' and an explanation if the loan was denied. For a vaccination analysis, the output is an analysis of whether or not each vaccination given thus far was correct, and a plan for future vaccinations. For an expense report review, the output highlights which expenses did not meet the guidelines along with an explanation for each.

2. Determine the Inputs

The next step is to make a list of the things that you need to know in order to generate the output identified in step 1. Don't worry about identifying every last one. Start with the primary inputs that come to mind. This list will expand (and contract) as you write your rules (later). Also, while you write your rules, you might find inputs that you can calculate (infer) and don't need to obtain from the user or other spreadsheet cells.

Like outputs, inputs can be simple values like answers to questions; or lists; or n-dimensional arrays of values.

For our golf shaft advisor, the inputs are:

The inputs for a vaccination advisor are the age of the child and a list of the vaccinations received each with a date administered and vaccine name. The inputs for loan application are the price of the property, the applicant's credit rating and monthly income, the down payment and the like. The inputs for product pricing are a list of items and the quantity for each one. The inputs for a workflow system are a list of services to be provided or tasks to be accomplished.

3. Write the Rules Naturally

The easiest way to start writing rules is to think of them as a set of goals. The top-level goal is to produce the output(s) identified in step 1. For example, in our shaft advisor, the goal is to provide 'advice' containing:

shaft flex and tip stiffness

If this is your first rule set, you might find it easier to write your rules in plain English (French, Spanish, Chinese, etc.). Try to write them with two phrases:

The general form is:

a=1 and c=2 when x=m and/or y=n

This means that 'a' and 'c' are goals and they are assigned the values 1 and 2 when the 'x' and 'y' goals already have the values m and/or n. Here are some examples:

unit price = 5 when quantity < 10
price = unit price * quantity
status = denied when credit rating < 100
next vaccine = birthdate + 6 months when vaccine count = 0

Goals are the heart of a rule set. When a rule set is queried, it tries to find the value of a goal. To do that, it has to find the values of all the sub-goals for that goal. So for the shaft advisor, our primary goal is to provide advice, and the advice consists of shaft flex and tip stiffness. In order to get a value for advice, the rule engine must get a value for shaft flex and tip stiffness.

               advice
              /            \
      shaft flex    tip stiffness

For the shaft advisor, our first rule in plain English is:

the advice is the recommended shaft flex and recommended tip stiffness

Note this rule is comprised of two sub-goals, shaft flex and tip stiffness. Now we need some rules for them. These rules all use inputs from the user after the when.

shaft flex is L when swing speed < 75
shaft flex is A when swing speed > 75 and < 85
shaft flex is R when swing speed > 85 and < 95
shaft flex is S when swing speed > 95 and < 105

shaft tip is soft when using high degree driver and the player wants a normal to high ball flight
shaft tip is stiff when using a driver < 11 degrees and the players wants a low to normal ball flight
shaft tip is stiff when the player wants accuracy over distance
shaft tip is soft when the player wants distance over accuracy

Write as many rules as come to mind easily. Don't worry about getting the words exactly right, just try to follow the general form.

4. Determine the How You will Use the Outputs

Before you can actually create a rule set, you need to figure out how you are going to use the output from the rule queries. You could query for one goal per spreadsheet cell. For the golf shaft advisor, we could return the shaft flex in cell B2 and shaft stiffness in cell B3. Those values could then be used by other spreadsheet formulas and functions. This form of output is useful for any rule set that returns numeric values, like prices, for totalling.

Another option is to return a sentence or paragraph for the spreadsheet user to read. That's what the shaft advisor does. The top-level goal is called advice, which is the name of a fact in ARulesXL. A fact name is a word and has a value associated with it. The value can be a number, word, sentence or entire document. The latter forms are not useful in a spreadsheet formulas or functions, but they are very good for regulatory or legal applications.

You can also return an array of values (this is called an array query in Excel). In this case your goals would need to be an array. In ARulesXL an array fact uses square brackets around the indices. For the shaft advisor sub-goals, we use an array fact consisting of two elements:

shaft['flex']
shaft['tip']

ARulesXL lets you use words or number as indices, which makes the rules easier to read. We could have used a separate property for each goal, for example, shaft_flex and shaft_tip.

In sophisticated applications you can return a multi-dimensional array of values (in essence a table). A vaccination advisor returns a table that analyzes each vaccination given to date, and another table that is a schedule of vaccinations to give in future.

Since rule sets can have multiple goals, you might return different types of values for each goal. For example, a loan application returns a simple yes/no value for whether or not the application is approved, and then a sentence or two explaining if it was denied.

5. Determine How You will Get the Inputs

Inputs come from the spreadsheet user entering values into cells, or from the results of spreadsheet formulas. If the user is responding to a list of questions, the answers can be easily brought into the rule set using the RArray() function. For the shaft advisor, this is done as follows:

=RArray("in", A2:B5, TRUE, FALSE, TRUE)

where cells A2 to B5 contain the following values:

 
A
B
2 Swing Speed 77
3 Club Type Fairway Wood
4 Favor Accuracy
5 Ball Flight Normal

This creates an array fact named in. The first TRUE says the table has row headers (e.g. Swing Speed, Favor). The FALSE says the table does not have column headers in the first row (in other words, the first row contains data values). The last TRUE says we want this as a 1-dimensional vector. So the result of this RArray() is:

in['Swing Speed'] =      77
in['Club Type'] = "Fairway Wood"
in['Favor'] = "Accuracy"
in['Ball Flight'] = "Normal"

In addition to RArray(), you can use the RCell() function to get a value from a spreadsheet cell. For example

=RCell("discount", D8)

sets the discount property to the value in spreadsheet cell D8.

RArray() and RCell() functions must be inside a rule sets. The name of the rule set in the shaft advisor is ShaftRules. The ARulesXL menu contains commands to create, rename and manage rule sets.

If you want to check if the inputs in the shaft advisor were set correctly, you can type this rule set query function in any cell:

=RQuery(ShaftRules, "FIND in['Club Type']")

this should return the value "Fairway Wood".

You can also use RArray() to incorporate tables of values in your rules. The Product Advice.xls spreadsheet that this introduction is based on does exactly that to determine the shaft flex from the swing speed.

6. Write the First Rules

The first rule returns the value for one of your goals. If you query your rule set for multiple values, then you start by writing a rule for each of those goals. For the shaft advisor, the top-level goal is:

the advice is the recommended shaft flex and recommended tip stiffness

In ARulesXL, we write that as follows:

advice = "Flex is " & shaft['flex'] & " Tip is " & shaft['tip']

This rule reads as:

"the advice is a text string that includes the recommended shaft flex and the recommended shaft tip stiffness "

Notice the advice goal requires finding values for two sub-goals: the flex and the tip. The next step is to write a couple of rules for each of the sub-goals. Referring back to our plain English rules:

shaft flex is L when swing speed < 75
shaft flex is A when swing speed > 75 and < 85

shaft tip is stiff when the player wants accuracy over distance
shaft tip is soft when the player wants distance over accuracy

The rules are written as follows:

shaft['flex'] = "L" WHEN in['Swing Speed'] <= 75
shaft['flex'] = "A" WHEN in['Swing Speed'] > 75 and in['Swing Speed'] <= 85

shaft['tip'] = "Stiff" WHEN in['Favor'] = "Accuracy"
shaft['tip'] = "Soft" WHEN in['Favor'] = "Distance"

Note on the left side we are setting the value of one of our sub-goals (flex or tip stiffness), and the conditions (after the WHEN) refer to the user inputs loaded by RArray().

Some rule sets have multiple top-level goals. The value for a top-level goal is retrieved by using the RQuery() function.

7. Test the First Rules

In the shaft advisor, we test the rules with this query:

=RQuery(ShaftRules, "FIND advice")

This says query the ShaftRules rule set and set the value of the current cell to the value of the advice property. You should get:

Flex is A Tip is Stiff

if your inputs are set as shown above. When testing rule sets, change the inputs to get different results to test all the rules. Notice that the rules are reapplied when the input data changes due to Excel's spreadsheet recalculation.

Let's examine how the query actually works with a very simple example that contrasts traditional procedural execution (an ordered set of steps like Excel spreadsheet recalculation) with how a rule engine operates. Take the following steps:

a = 2
b = 3
c = a + b

In a procedural system, the steps run in the order given. First a is set to 2. Then b is set to 3. Then c is set to 5. However, if these are rules you could put them in this order (or any order for that matter):

c = a + b
a = 2
b = 3

In a rule set you query for the value of c. The rule engine finds a rule that can provide a value for c (c = a + b) and sees that it needs to get a value for a. So it looks for a rule that provides that value (a = 2). Next, it needs a rule that provides a value for b (b = 3). Finally, since now it know the values of a and b, it can calculate the value of c. This is the essence of the process of querying rules (inferencing).

Let's look at how the first rules in the shaft advisor work. The query is to find a value for advice. The rule engine does the following:

  1. Find a rule that returns a value for advice. There is only one:
    advice = "Flex is " & shaft['flex'] & " Tip is " & shaft['tip']
  2. Find the value for shaft['flex']. To do this:
    1. Check if the value is known for shaft[flex]. The known list is:
      in['Swing Speed'] =      77
      in['Club Type'] = "Fairway Wood"
      in['Favo'r] = "Accuracy"
      in['Ball Flight'] = "Normal"
      All these values came from the RArray() function. The value for shaft['flex'] is not known.
    2. Find a rule that returns a value for shaft['flex']. The first one is:
      shaft['flex'] = "L" WHEN in['Swing Speed'] <= 75
    3. Check if the value is known for in['Swing Speed']. It is 77 (from RArray()).
    4. Test the conditions to see if 77 <= 75. It is not.
    5. Find another rule that returns a value for shaft['flex']. The next one is:
      shaft['flex'] = "A" WHEN in['Swing Speed'] > 75 and in['Swing Speed'] <= 85
    6. Check if the value is known for in['Swing Speed']. It is 77 (from RArray()).
    7. Test the conditions to see if 77 > 75 and <= 85. It is.
    8. Set the value of shaft['flex'] to "A".
  3. Find the value for shaft['tip']. To do this:
    1. Check if the value is known for shaft[tip]. The known list is:
      shaft['flex'] = "A"
      in['Swing Speed'] =      77
      in['Club Type'] = "Fairway Wood"
      in['Favor'] = "Accuracy"
      in['Ball Flight'] = "Normal"
      The value for shaft['tip'] is not known.
    2. Find a rule that returns a value for shaft['tip']. The first one is:
      shaft['tip'] = Stiff WHEN in['Favor'] = "Accuracy"
    3. Check if the value is known for in['Favor']. It is "Accuracy" (from RArray()).
    4. Test the conditions to see if "Accuracy" = "Accuracy". It does.
    5. Set the value of shaft['tip'] to "Stiff".
  4. Set the value for advice to the values of shaft['flex'] and shaft['tip'] along with the text strings.

If you are unfamiliar with how rules work, take a close look at the Trace Query command on the ARulesXL menu. It will help you understand how the rule engine finds and matches rules. Once you understand how this work (the process is called inferencing), it will become easier to write more rules.

8. Write More Rules

Now we can write additional rules for the sub-goals shaft['flex'] and shaft['tip']. These additional rules will use more of the inputs identified in steps 2 and 5. When writing new rules, you will probably find new inputs you need to add to your rule set.

Rules for a fact are tried in the order they are entered. This means if there are multiple rules that might fit the input facts, the first one found will be used. This also means you can write a default rule as the last one which sets a value if none of the other rules apply.

shaft['tip'] = "Soft" WHEN in['Club Type'] = "Driver < 11 Degrees" AND in['Ball Flight'] = "High" OR in['Ball Flight'] = "Normal"
shaft['tip'] = "Stiff" WHEN in['Club Type'] = "Driver < 11 Degrees" AND in['Ball Flight'] <> "High"
shaft['tip'] = "Soft" WHEN in['Club Type'] = "Driver < 11 Degrees" AND in['Favor'] = "Distance"
shaft['tip'] = "Stiff" WHEN in['Club Type'] = "Driver >= 11 Degrees"
shaft['tip'] = "Stiff" WHEN in['Club Type'] = "Hybrid/Utility" OR in['Favor'] = "Accuracy"
shaft['tip'] = "Soft" WHEN in['Club Type'] <> "Hybrid/Utility" OR in['Favor'] = "Distance"
shaft['tip'] = "Stiff" WHEN in['Favor'] = "Accuracy"
shaft['tip'] = "Soft" WHEN in['Favor'] = "Distance
"

Hence the latter rules are more general and provide default values when more specific cases do not apply.

9. Test More Rules

Everytime you add a new set of rules, test them by changing your inputs. If you do not understand the results, use the ARulesXL debugger to learn how your rules are selected and matched.

Also your system may grow in complexity with additional sub-goals. For example, our shaft advisor could recommend shaft bend point, weight and material. This would add three additional sub-goals onto the main advice goal.

For more complex systems you might have many sub-goals. And each sub-goal may itself have additional sub-goals. These complex systems are where rule technology really shines because the ARulesXL engine determines which sub-goals to process and the order to process them based on the inputs. Each time the inputs change, a different set of rules are used, often in a different order. But, you don't have to worry about that. This makes it easier to write and maintain rules as compared to technologies that do their processing step-by-step.

You can see a more complete version of this example in the 'Product Advice.xls' sample.

Open in New Window to Print


Copyright ©2005-7 Amzi! inc. All Rights Reserved.
ARulesXL and Amzi! are trademarks or registered trademarks of Amzi!
Microsoft and Excel are trademarks or registered trademarks of Microsoft Corporation.