Is your feature request related to a problem? Please describe.
At the moment, Pregel offers only two approaches:
- check adaptive convergence on every Pregel superstep
- do not check it at all and rely fully on maxIter
Describe the solution you would like
Two new options.
- minIter -- if minIter > 0, skip the convergence / early-stopping checks for the first minIter iterations of Pregel
- checkConvergnceEachK -- if specified, check convergence / early stopping only on every K-th iteration instead of on every iteration
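To make the intended semantics concrete, here is a minimal Python sketch of when the check would run. It is not the Pregel implementation: `should_check`, `min_iter` and `check_every_k` are hypothetical names, supersteps are assumed to be 1-based, and counting K from the end of the minIter window is an assumption (counting from superstep 1 would be an equally valid reading).

```python
def should_check(iteration: int, min_iter: int = 0, check_every_k: int = 1) -> bool:
    """Return True if the convergence check would run at this 1-based superstep."""
    if check_every_k < 1:
        raise ValueError("check_every_k must be >= 1")
    if iteration <= min_iter:
        # Warm-up window: the check (and its extra action) is skipped.
        return False
    # Assumption: K is counted from the end of the warm-up window.
    return (iteration - min_iter) % check_every_k == 0


# With the defaults every superstep is checked, which is today's behaviour.
default_checked = [i for i in range(1, 6) if should_check(i)]

# With min_iter=3 and check_every_k=2 only supersteps 5, 7 and 9 are checked.
tuned_checked = [i for i in range(1, 11) if should_check(i, min_iter=3, check_every_k=2)]
print(default_checked, tuned_checked)
```

With both options left at their defaults the predicate is always true, so existing behaviour would not change.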
Component
Additional context
In most scenarios users know an approximate minIter: someone computing shortest paths typically has a good idea of the approximate number of hops. Setting minIter higher than necessary is not a big problem in practice: at the price of a couple of additional iterations that change nothing in the result, users get significantly faster iterations overall. For example, for a graph whose shortest paths are 5 hops long, setting minIter to 6 adds one additional iteration while eliminating an action in 5 iterations!
The same is true for checkConvergnceEachK. For example, PageRank (which we have not yet implemented in DF Pregel) may be significantly faster with checkConvergnceEachK=2: at the price of at most one additional superstep, we remove the additional isEmpty action in half of the iterations. If convergence takes 10 rounds, that is 1 additional iteration while removing the action in 5!
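The two examples can be checked with a small simulation. This is only a sketch under stated assumptions: `simulate` and `detect_at` are hypothetical names, and `detect_at` is assumed to be the first superstep at which a check would report convergence, taken here as one past the last productive round (6 for the 5-hop shortest-paths case, 11 for convergence in 10 rounds). A different off-by-one convention shifts the counts by one.

```python
def simulate(detect_at: int, max_iter: int, min_iter: int = 0, check_every_k: int = 1):
    """Return (supersteps_run, checks_performed) for a toy Pregel loop.

    detect_at is the first 1-based superstep at which a convergence check
    would succeed (an assumption of this sketch, not a Pregel parameter).
    """
    checks = 0
    for superstep in range(1, max_iter + 1):
        due = superstep > min_iter and (superstep - min_iter) % check_every_k == 0
        if due:
            checks += 1  # each check stands for one extra action such as isEmpty
            if superstep >= detect_at:
                return superstep, checks
    return max_iter, checks


# Shortest paths, 5 hops: checking every superstep vs. minIter=6.
sp_baseline = simulate(detect_at=6, max_iter=20)
sp_tuned = simulate(detect_at=6, max_iter=20, min_iter=6)

# Convergence in 10 rounds: checking every superstep vs. every 2nd superstep.
pr_baseline = simulate(detect_at=11, max_iter=20)
pr_tuned = simulate(detect_at=11, max_iter=20, check_every_k=2)

print(sp_baseline, sp_tuned)  # (6, 6) (7, 1)
print(pr_baseline, pr_tuned)  # (11, 11) (12, 6)
```

Under these assumptions each tuned run costs one extra superstep and saves five checks, which matches the numbers quoted above.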
While we cannot simply change the defaults until the 1.0 release (it would be a breaking change), we can add loud language to the documentation encouraging users to change the values, and we can tune these two options for our own benchmarks.
Are you planning on creating a PR?