Avoid row-by-row scanning


Link to this posting

Don't process DataWindows and DataStores row by row if the task can be accomplished in another, more efficient way.

Find and process rows which satisfy a criteria

Use Find() in a loop rather than compare values in each row:

*** BAD code: ***

Code: Select all
ll_row_count = dw_emp.RowCount()
for ll_row = 1 to ll_row_count
       ll_curr_dept_id = dw_emp.object.dept_id[ll_row]
       if ll_curr_dept_id <> al_dept_id then continue
       // process the found row
next

*** GOOD code: ***

Code: Select all
ll_row = 0
ll_row_count = dw_emp.RowCount()
ls_search_expr = "dept_id=" + String(al_dept_id)
do while true
       ll_row = dw_emp.Find(ls_search_expr, ll_row + 1, ll_row_count)
       if ll_row = 0 then exit
       // process the found row
       if ll_row = ll_row_count then exit // prevent eternal loop when the last row satisfies the search condition
loop

In many situations, the function Describe("Evaluate('<EXPRESSION>, 1, 0) for all', 1)") is very useful if you want to avoid a scanning a DW in a loop with comparison in each row.

Count rows which satisfy a criteria

Code: Select all
ll_active_emp_count = Long(dw_emp.Describe("Evaluate('Sum(if(status = ~"A~", 1, 0) for all)', 1)"))
MessageBox("HR", "There are " + String(ll_active_emp_count) + " active employees.")

Maximum value of the field in all rows

Code: Select all
ld_latest_order_date = Date(ids_order.Describe("Evaluate('Max(order_date)', 0)"))
MessageBox("Orders", "The last order was made in " + String(ld_latest_order_date, '"MMM DD, YYYY") + ".")

Count highlighted rows

Code: Select all
ll_selected_count = Long(dw_emp.Describe("Evaluate('Sum(if(IsSelected(), 1, 0) for all)', 1)"))
if ll_selected_count < 2 then MessageBox("New Team", "Please select at least 2 employees.")

Define if a field in DS / DW has duplicated value (i.e. is not unique)

Code: Select all
dw_order_status.SetSort('order_status A')
dw_order_status.Sort()
dw_order_status.GroupCalc()
lb_dup_found = ("1" = dw_order_status.Describe("Evaluate('Max(if(order_status[-1] = order_status, 1, 0))', 0)"))
if lb_dup_found then MessageBox("Error", "Order Statuses must be unique.")

Put a same value in a field in all the rows

If you need to assign a same value to a field in ALL the rows (like a coefficient another field should be multiplied or divided by), use a computed field instead of assigning the value to the field in each row in a loop. For that, make that field computed, with a very simple expression "1", and change that expression programmatically by simply making the value the field's expression.

*** BAD code: ***

Suppose, a variable ll_coef_to_divide contains the result of a calculation in PB code. Here is the inefficient solution (assuming that the field is not computed but exists in the DW's data source):

Code: Select all
for ll_row = 1 to ll_row_count
       dw_XXX.object.coef_to_divide[ll_row] = ll_coef_to_divide
next


*** GOOD code: ***

To make the assignment at one stroke, the field should be computed. The value, which the field will return, is assigned this way:

Code: Select all
dw_XXX.object.c_coef_to_divide.Expression = String(ll_coef_to_divide)
dw_XXX.GroupCalc() // recalc other computed fields which mention c_coef_to_divide in their expressions