WRITE-UP:
Interpolation Grids in R-Packages
Peter Ruckdeschel, Jan 27 2013, revised Mar 12 2013, revised Jul 18 2018

1. Starting Point:

Computation of optimally robust influence curves [=opt-rob ICs] in our R pkgs 
works well, but may be slow, in particular in case of lack of invariance---like 
in scale-shape models, when we cannot move the IC from one parameter value theta 
to the next by invariance. 
Then we have to recompute the IC for each theta anew.
As the opt-rob ICs are given through Lagrange multipliers [=LMs] which are 
continuous in theta, as shown by Matthias Kohl (see his PhD thesis), we may 
compute opt-rob ICs for a grid of theta values offline and then, for a new 
theta value, use interpolation.
The same strategy applies to speed up evaluation of scale functional Sn 
(Croux, Rousseeuw) for Shape-Scale-models.
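
A minimal sketch of this offline/online split, using only 'stats'. The
function slow_lm is a hypothetical stand-in for the expensive computation of
one LM as a function of theta; it is not code from our pkgs.

```r
# Hypothetical stand-in for an expensive LM computation in theta
slow_lm <- function(theta) log1p(theta)^2 + sin(theta)

# Offline step: evaluate once on a grid of theta values
theta_grid <- seq(0.1, 2, by = 0.05)
lm_grid    <- vapply(theta_grid, slow_lm, numeric(1))

# Build the interpolating function once; it can be stored and reused
lm_interp <- splinefun(theta_grid, lm_grid)

# Online step: for a new theta, interpolate instead of recomputing
theta_new  <- 0.33
approx_val <- lm_interp(theta_new)
exact_val  <- slow_lm(theta_new)
abs(approx_val - exact_val)   # interpolation error at theta_new
```

The finer the grid, the smaller the interpolation error, at the price of a
larger stored object.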

2. Problem when passing from R-2.15 to R-3.0 

From R-2.15 to R-3.0 Brian Ripley (R-Core) has changed some non-exported 
interfaces to C-code. This affects approxfun and splinefun, which we are 
using for interpolation because saving the results of approxfun and 
splinefun turns out to be much faster (experiments by Matthias) than using 
approx resp. spline. Now approxfun, splinefun from R-3.0 on return functions
whose body contains R code which is not yet interpretable with R-2.15,
and the code from R-2.15 no longer runs from R-3.0 on.
We solved this (after long discussions with R-core ...) by
saving two interpolating functions, one with suffix ".O" for < R-2.16, 
one with suffix ".N" for > R-2.16, and at run time we determine the current
R version and take the suitable one.
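
The run-time choice can be sketched as follows. The helper version_suffix
and the list 'entry' are hypothetical illustrations; the pkg's own helper
for this is .versionSuff, whose exact signature is not reproduced here.

```r
# Hypothetical helper: map an R version to the suffix of the stored function
version_suffix <- function(v = getRversion()) {
  if (v < "2.16") ".O" else ".N"
}

# Hypothetical entry holding both variants of one interpolating function
entry <- list(
  fun.O = function(x) x,   # placeholder for the variant built under R-2.15
  fun.N = function(x) x    # placeholder for the variant built under R >= 3.0
)

# Pick the variant matching the running R
fun <- entry[[paste0("fun", version_suffix())]]
fun(0.3)
```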

[added Jul 2018]: In principle, this could be solved now by requiring
R >= 3.3 (which is no longer "too new"), but for compatibility, we 
leave it as is (the respective sysdata.rda file containing functions
for both R >= 3.0 and R <= 2.15 only grows by less than 50 KB) and
indicate how to use this for R <= 2.15 in a file HowTo in the
package main folder.

3. Size of packages

With CRAN getting larger (>4000 pkgs), the CRAN maintainers have imposed
limits on the size of pkgs. Taking up our interpolation idea naively,
however, quite easily ends up with huge pkgs. So care has to be taken.

4. Implementation 

R allows one to provide pre-computed R objects in a pkg in a particular
file sysdata.rda (to be generated by save()) to be put into the R 
folder of the pkg. This works easily for simple objects like numerics and
characters, but gets more complicated with functions. In addition to
the code (in the body of the function) and the argument list,
a function also comprises an environment, which is used to bind objects
which are not generated in the body and are not passed through
the argument list (lexical scoping). This environment easily gets
large, in particular if whole S4 class hierarchies have to be
provided ... As a consequence, when loading such a "small" sysdata.rda
file, R may have to load several packages under the hood to reconstruct
the environment. More on this issue later. 
At any rate, we will store our interpolation grids and functions in
such a sysdata.rda file.
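
To illustrate that a saved function carries its environment along, here is a
small sketch with base R only. The names make_scaler and demo-sysdata.rda
are made up for this illustration.

```r
# A closure: k is neither an argument of the returned function nor
# created in its body, so it is bound in the function's environment
make_scaler <- function(k) {
  force(k)
  function(x) k * x
}
f <- make_scaler(3)
ls(environment(f))            # the environment holds "k"

# save() stores the function together with that environment ...
rda <- file.path(tempdir(), "demo-sysdata.rda")
save(f, file = rda)

# ... so after load() the function still finds k
e <- new.env()
load(rda, envir = e)
e$f(2)
```

Whatever else lives in that environment is written to the file as well,
which is why the environment of a stored function deserves attention.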

5. Datastructure

As we deal with several robust optimality criteria (MSE, RMX, MBR) 
and several models---currently: GPD, GEVD (with and without knowing 
mu), Gamma, Weibull---, our grids are stored in nested lists. 
Debatable, but done so for the moment. Similarly, we store---for
the same models---fast interpolators for the Sn estimator for scale.

After a discussion with Gerald Kroisandt, we now have a sparser
data structure.

Let's use the following notation for describing the list structure:
Each layer in the hierarchy adds one ">", and an item is inserted
below the nearest item to its left which has one ">" fewer than
itself. I-fct denotes the interpolating function for the grid to its
left (named "fun", with suffix ".O" resp. ".N"). {} denote optional
entries and capture that one may want to smooth out the original
interpolation grids in entries 'grid', giving smoothed grids written
into entries 'gridS'. 
OptCrit for the time being is one of ".OMSE", ".MBRE", ".RMXE" or ".Sn".
Models for the time being are GPD, GEVD, Gamma, Weibull.
Then our structure goes as follows:
[model1], >[OptCrit1], >>[grid], {>>[gridS],} >>[I-fct.O], >>[I-fct.N],
[model1], >[OptCrit2], >>[grid], {>>[gridS],} >>[I-fct.O], >>[I-fct.N],
...
[model1], >Sn, >>[grid], {>>[gridS],} >>[I-fct.O], >>[I-fct.N],
[model2], >[OptCrit1], >>[grid], {>>[gridS],} >>[I-fct.O], >>[I-fct.N], ...

For instance, to get the clipping height "b" in OMSE for "GEV" with 
known parameter mu at theta = (xi=0.3) for >R-2.16, we may write 
      .GEV[["OMSE"]][["fun.N"]][[1]](0.3)
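
A toy version of such a nested list may make the access pattern more
concrete. The object toyModel and all numbers in it are made up for
illustration; only the layout (model > OptCrit > grid / fun.N) follows the
description above.

```r
# Made-up grid: first column theta (= xi), further columns one LM each
grid <- cbind(xi = c(0.1, 0.2, 0.3, 0.4, 0.5),
              b  = c(2.0, 2.1, 2.3, 2.6, 3.0),
              a  = c(0.5, 0.6, 0.8, 1.1, 1.5))

# One interpolating function per LM column, collected in a list
funs <- lapply(2:ncol(grid), function(j) splinefun(grid[, 1], grid[, j]))

# Hypothetical model entry with one optimality criterion
toyModel <- list(OMSE = list(grid = grid, fun.N = funs))

# Same access pattern as in the example above: first LM ("b") at xi = 0.3
b_at_03 <- toyModel[["OMSE"]][["fun.N"]][[1]](0.3)
b_at_03
```

As xi = 0.3 is a grid point here, the spline reproduces the stored value;
between grid points it interpolates.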

6. Namespace issue

It is absolutely necessary that functions I-fct (or I-fct.O, I-fct.N)
be generated _in_ the namespace of the pkg; otherwise conflicts arise,
as namespaces have to be loaded twice (and hence pkg installation already
fails).

Finding this out took me quite some time!

My initial idea w.r.t. point 3 was to save the grids to some rda file
and then, to get rid of all RobASt S4 infrastructure, to load them in 
a virgin R session, maybe separately for <2.16 and >2.16, to call splinefun 
therein and then to save this as a (small) sysdata.rda file.

THIS DOES NOT WORK.

After a discussion with Gerald Kroisandt, we now use a somewhat similar
approach: _Within_ packages like pkg 'RobExtremes', we only produce the grids
and store them in intermediate .csv files which, for archiving, we keep in 
r-forge folder "RobExtremesBuffer". 
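
The idea of the intermediate .csv files can be sketched with base R as
follows. This is not the code of .saveGridToCSV / .readGridFromCSV (their
signatures are not reproduced here); write.table and read.csv serve as
stand-ins, and the grid values are made up.

```r
# Made-up grid: theta (= xi) in the first column, LMs in the others
grid <- cbind(xi = c(0.1, 0.2, 0.3),
              b  = c(2.0, 2.1, 2.3),
              a  = c(0.5, 0.6, 0.8))

# Write the plain numbers to a CSV file; no S4 objects, no environments
csv <- tempfile(fileext = ".csv")
write.table(grid, file = csv, sep = ",", row.names = FALSE, col.names = TRUE)

# Read them back, e.g. in another pkg that only needs 'base' and 'stats'
grid_back <- as.matrix(read.csv(csv))
all.equal(grid, grid_back, check.attributes = FALSE)
```

Since a CSV file contains nothing but numbers and column names, it can be
read in any session without loading the pkg which produced it.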

Corresponding general infrastructure is maintained in pkg 'ROptEst', i.e. 
non-exported functions .RMXE.th, .MBRE.th, .OMSE.th and .getLMGrid to compute 
Lagrange multipliers for "RMX"-, "MBR"-, "OMSE"-ICs, .generateInterpGrid to 
produce the grid, and .saveGridToCSV and .readGridFromCSV to write grids to 
files and read grids from files (all in file interpolLM.R).

Infrastructure particular to scale-shape models is maintained in pkg 'RobExtremes',
i.e. non-exported functions .RMXE.xi, .OMSE.xi, .MBRE.xi, .modify.xi.PFam.call,
.getLMGrid to compute the respective Lagrange multipliers, and .svInt and 
.generateInterpGridSn to generate the grids for LMs and Sn (all in file interpolLM.R).
The respective infrastructure for the grids for Sn is provided through functions
getShapeGrid, getSnGrid, .generateInterpGridSn (all in file interpolSn.R).
Finally, .getPsi (in file internal-getpsi.R) generates an optimally robust IC from
a given set of interpolators, and .Sn.intp accesses the interpolator for Sn (in
file SnQn.R).

Actual code to produce the interpolators and to manipulate the grids
(including smoothing grids out) is maintained in pkg 'RobAStRDA' which
by this technique does not need to import any RobASt-package infrastructure
and (as originally intended) only uses architecture from packages
'base' and 'stats'. Functionality in pkg 'RobAStRDA' comprises non-exported
functions .versionSuff to distinguish R<2.16 and R>2.16 behavior, 
.readGridFromCSV to read out CSV files with grids, .MakeSmoothGridList and
.generateInterpolators to smooth out grids and to generate interpolators,
.saveGridToRda and .computeInterpolators to write the grids and the interpolators
to sysdata.rda files, and .mergeGrid, .mergeF, .copy_smoothGrid, .renameGridName
for manipulations.

For details, see the respective Rd-Files.

7. Size of sysdata.rda 

To reduce the size of sysdata.rda, it was a very important trick to explicitly 
assign a very small environment generated by new.env() [which still has the 
namespace of RobAStRDA in its parents] to the results of splinefun.
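
The effect of this trick can be sketched outside a pkg as follows. The
functions make_fat and make_slim are made up for this illustration, and,
as an assumption differing from the pkg, the small environment gets
baseenv() as parent instead of a pkg namespace.

```r
# Without the trick: the returned closure keeps the whole calling frame
make_fat <- function(theta, lm) {
  bulky <- numeric(1e5)            # stand-in for heavy objects in scope
  fct <- splinefun(theta, lm)
  function(x) fct(x)
}

# With the trick: a fresh, small environment holding only what is needed
make_slim <- function(theta, lm) {
  bulky <- numeric(1e5)
  fct <- splinefun(theta, lm)
  f <- function(x) fct(x)
  env <- new.env(parent = baseenv())   # assumption: baseenv() as parent
  assign("fct", fct, envir = env)
  environment(f) <- env
  f
}

theta <- seq(0.1, 2, by = 0.1)
lm    <- sqrt(theta)
fat   <- make_fat(theta, lm)
slim  <- make_slim(theta, lm)

# Size in bytes of what save() would have to write for each function
size <- function(f) length(serialize(f, NULL))
c(fat = size(fat), slim = size(slim))
```

Both functions return the same values; only the amount of data dragged
along into the rda file differs.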

8. Namespace-conformal Inspection / Manipulation 

As seen, to manipulate an existing sysdata.rda file, it is a bad idea to
load it into a new R-session and then use the save()-d result as new 
sysdata.rda in the pkg. Still, for checking, inspection is possible 
with load(). To avoid getting confused with objects in the workspace
generated elsewhere, a good idea is to load sysdata.rda with 
load(file, env) into an environment env generated by new.env() particularly
for this purpose. Access is provided by get(<symbol>, env) and modification
by assign(<symbol>, value, env).

To do manipulations within the namespace of RobAStRDA, we provide helper
functions  .mergeGrid, .mergeF, .copy_smoothGrid, .renameGridName.

9. Exports

As this is functionality which should not bother the standard user of
'RobExtremes', basically all infrastructure mentioned in sections 6 
and 8 of this write-up is not exported. 
OTOH, as these functions could be of interest to a user wanting to generate
new interpolators for new scale-shape families or, say, simply for the
95% VaR, each function remains in the R folder of the pkgs and is
documented in Rd.
As a consequence, to use these functions one has to access them by
RobExtremes:::fct (or, as in interpolationscripts.R, define shortcuts like 
fct <- RobExtremes:::fct).

10. Reproducibility of grid generation

To be able to reproduce all operations which we used to generate the
contents of sysdata.rda, we provide scripts interpolationscripts.R
(this has been split up now, as of Mar 2013, into grid generation
in pkg 'RobExtremes' and into interpolator generation in pkg 
'RobAStRDA'). These are not quite executable scripts, rather to be used
line-wise for copy&paste. Each resides in pkg subfolder
\inst\AddMaterial\interpolation; hence they are available after pkg
installation in the library as RobAStRDA\AddMaterial\interpolation resp.
RobExtremes\AddMaterial\interpolation.
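
To locate the installed folder from within R, system.file() can be used; a
short sketch (it returns "" if the pkg is not installed, so the branch below
is only a convenience for this illustration):

```r
# Path of the installed script folder of RobAStRDA, or "" if not found
path <- system.file("AddMaterial", "interpolation", package = "RobAStRDA")

if (nzchar(path)) {
  print(list.files(path))      # scripts shipped with the pkg
} else {
  message("pkg 'RobAStRDA' (or the folder) not found in the library")
}
```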
 
Comments & Suggestions are welcome.
