KDB+ Learning Notes (3):Calculation between rows

How to do calculation between roles within groups in KDB+/Q elegantly

It is easy to do calculation between columns for KDB+ tables. Similar to SQL or like ‘mutate’ in the dplyr package of R, we can define the new columns easily. However, calculations between rows can be tough for me as a beginner. Below, I listed some of the methods I tried.

Loop over each group

q) update NewColumn: func each OldColumn from Table
or
{[x] func} each TimeCommodityGroup

to do the calculation would be very slow and can be comlipated.

Adding new columns to faciliate calculation

The example we use to illustrate this problem is as follows:

The supply and demand of commodies vary on different days. The table includes information on total supply and specified demand of investors in the market. The market runs in the first-come-first-serve way. We would want to know which of the buyers’ orders will be fulfilled and which are not.

In brief, this is the fill model calculation within each commodity name and date. And we need to

  1. fulfill the first several in full amount, when the sum of the first few rows are less than the supply
  2. partially fill the demand, when the cumulative sum is less than the demand but not large enough
  3. set the fill number as 0 when there is no availability in the market

The corresponding calculation will be

  1. calculate cumulative sum of demand in the market
  2. compare cumulative sums and supply in the market
  3. deal with the partial fill
/ Remember to sort the df before the follow calculation.
q) df: update cum_sum: sums Demand by Commodity, Date from df;
q) df: update FillQty: Demand from df where cum_sum<Supply;
/ Calculate the time that cumulative demand exceeds supply
q) df: update IntersectTime: first Time by Commodity, Date from df where cum_sum>=Supply;
q) df: update FillQty: (cum_sum-Supply) from df where Time = IntersectTime;
q) df: update FillQty: 0^FillQty from df

Converge within the group

The example we use to illustrate this problem is as follows:

If the increment between the rows is small, keep the previous value and adopt the new value only when the increment is significant enough.

q) df: update new: ({$[y>x+epsilon;y;x]\}) value by group from df;
/ epsilon is the threshold for increment
q) ({$[y>x+2;y;x]}\) 1 2 3 4
1 1 1 4

The use of the converge wrapper is in this link https://code.kx.com/q/ref/accumulators/

Cello dropout

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store