The Big R-Book. Philippe J. S. De Brouwer.

Author: Philippe J. S. De Brouwer
Publisher: John Wiley & Sons Limited
Genre: Mathematics
ISBN: 9781119632771
means that they don't follow R's general “copy-on-modify” semantics, but are modified in place. This allows for code that is difficult to read, but it is invaluable for solving problems that cannot be solved with S3 or S4.
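The difference between copy-on-modify and modification in place can be illustrated with a minimal sketch. It assumes that the sentence above refers to R's reference classes (RC); the class name Account and its field balance are hypothetical and only serve this illustration.

```r
library(methods)  # provides setRefClass(); usually attached by default

# Ordinary R objects follow copy-on-modify semantics:
a <- c(balance = 100)
b <- a                  # b behaves as an independent copy of a
b["balance"] <- 50      # modifying b ...
a[["balance"]]          # ... leaves a untouched
## [1] 100

# Reference class objects are modified in place.
# 'Account' is a hypothetical class used only for this example.
Account <- setRefClass("Account", fields = list(balance = "numeric"))
x <- Account$new(balance = 100)
y <- x                  # y refers to the same object as x
y$balance <- 50         # modifying y ...
x$balance               # ... also changes x
## [1] 50

# An explicit copy is needed to obtain an independent object:
z <- x$copy()
z$balance <- 75
x$balance               # x is not affected by the change to z
## [1] 50
```

The fact that y and x silently share state is what can make such code harder to follow than its S3 or S4 equivalent.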

       C++

       C#

       RC – reference class

       struct

      # Define a string:
      acc <- "Philippe"

      # Force an attribute, balance, on it:
      acc$balance <- 100
      ## Warning in acc$balance <- 100: Coercing LHS to a list

      # Inspect the result:
      acc
      ## [[1]]
      ## [1] "Philippe"
      ##
      ## $balance
      ## [1] 100

      This means that the base type holds information on how the object is stored in memory (and hence how many bytes it occupies), what variables it has, etc. The base types are part of R's code and are compiled, so new ones can only be created by modifying R's source code and recompiling. The types that we studied in the previous sections, such as integers, vectors, and matrices, are all base types. However, there are also more exotic ones, such as environments, functions, and calls.
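As a quick illustration of the paragraph above, the function typeof() reports the base type of any object. The objects chosen below are arbitrary examples and not taken from the book.

```r
# Familiar base types:
typeof(1:3)                     # an integer vector
## [1] "integer"
typeof(matrix(1:4, nrow = 2))   # a matrix is a vector with a dim attribute
## [1] "integer"
typeof(3.14)
## [1] "double"

# More exotic base types:
typeof(globalenv())             # an environment
## [1] "environment"
typeof(quote(x + 1))            # an unevaluated call
## [1] "language"
typeof(sum)                     # a function implemented in C
## [1] "builtin"
```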

      Some conventions are not straightforward, but they are deeply embedded in R and in many people's code, so some things might be somewhat surprising. Consider the following code:

      # a function built into core R
      typeof(mean)
      ## [1] "closure"
      is.primitive(mean)
      ## [1] FALSE

      # user-defined functions are "closures" too:
      add1 <- function(x) {x + 1}
      typeof(add1)
      ## [1] "closure"
      is.function(add1)
      ## [1] TRUE
      is.object(add1)
      ## [1] FALSE

       is.primitive()

       typeof()

       is.function()

       is.object()

      The importance of these struct-based base types is that all other object types are built upon them: S3 objects are built directly on top of the base types, S4 objects use a special-purpose base type, and RC objects are a combination of S4 and environments (which are also a base type).
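A minimal sketch of how S3 and S4 objects relate to the base types is shown below. The factor and the S4 class Person are hypothetical examples chosen for illustration.

```r
library(methods)  # provides setClass() and new(); usually attached by default

# S3: a factor is an integer vector with a class and a levels attribute
f <- factor(c("low", "high", "low"))
class(f)
## [1] "factor"
typeof(f)         # the underlying base type
## [1] "integer"
as.integer(f)     # the integer codes stored in the object
## [1] 2 1 2
levels(f)         # the labels are kept in an attribute
## [1] "high" "low"

# S4: objects use a dedicated base type
# 'Person' is a hypothetical class used only for this example.
setClass("Person", slots = c(name = "character"))
p <- new("Person", name = "Philippe")
typeof(p)
## [1] "S4"
isS4(p)
## [1] TRUE
```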

      The function is.object() returns TRUE for both S3 and S4 objects. There is no base function that directly tests whether an object is S3, but there is one, isS4(), that tests whether an object is S4. So we can test if something is S3 as follows.

      # is.S3
      # Determines if an object is S3
      # Arguments:
      #    x -- an object
      # Returns:
      #    boolean -- TRUE if x is S3, FALSE otherwise
      is.S3 <- function(x) {is.object(x) && !isS4(x)}

      # Create two test objects:
      M  <- matrix(1:16, nrow = 4)
      df <- data.frame(M)

      # Test our new function:
      is.S3(M)
      ## [1] FALSE
      is.S3(df)
      ## [1] TRUE

      However, it is not really necessary to write such a function ourselves. We can leverage the package pryr, which provides a function otype() that returns the type of an object.

       pryr

       otype()

      library(pryr)
      otype(M)
      ## [1] "base"
      otype(df)
      ## [1] "S3"
      otype(df$X1)    # a vector is not S3
      ## [1] "base"
      df$fac <- factor(df$X4)
      otype(df$fac)   # a factor is S3
      ## [1] "S3"

      If you would like to determine whether a function is an S3 generic, you can check its source code for a call to the function UseMethod(). This function takes care of the dispatching and hence decides which method to call for the given object.

       UseMethod()

      However, this method is not foolproof, because some primitive and internal functions have this switch statement embedded in their C-code. For example, [, sum(), rbind(), and cbind() are generic functions, but this is not visible in their code in R.
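The check described above, together with its limitation, can be sketched in base R. The helper calls_UseMethod() is a hypothetical name introduced for this illustration; it only inspects the R-level body of a function and is a heuristic, not a definitive test.

```r
# Hypothetical helper: TRUE if the R-level body of f mentions UseMethod.
calls_UseMethod <- function(f) {
  # Primitive functions have no R-level body to inspect.
  if (!is.function(f) || is.primitive(f)) return(FALSE)
  any(grepl("UseMethod", deparse(body(f)), fixed = TRUE))
}

calls_UseMethod(mean)    # the dispatch is visible in the R code
## [1] TRUE
calls_UseMethod(print)
## [1] TRUE

# Functions that dispatch in their C-code are missed by this check:
calls_UseMethod(sum)
## [1] FALSE
calls_UseMethod(cbind)
## [1] FALSE
```

A FALSE result therefore does not prove that a function is not generic.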

      mean
      ## function (x, ...)
      ## UseMethod("mean")
      ## <bytecode: 0x563423e48908>
      ## <environment: namespace:base>
      ftype(mean)
      ## [1] "s3"      "generic"
      sum
      ## function (..., na.rm = FALSE)  .Primitive("sum")
      ftype(sum)
      ## [1] "primitive" "generic"

      R calls the functions that have this switch in their C-code “internal generics”.

      An S3 generic function basically decides which other function to dispatch its task to. For example, the function print() can be called with any base or S3 object, and print() will decide what to do based on the object's class. Try the function apropos() to find out what different methods exist (or type print. in RStudio).

      apropos("print.")
      ##  [1] "print.AsIs"
      ##  [2] "print.by"
      ##  [3] "print.condition"
      ##  [4] "print.connection"
      ##  [5] "print.data.frame"
      ##  [6] "print.Date"
      ##  [7] "print.default"
      ##  [8] "print.difftime"
      ##  [9] "print.Dlist"
      ## [10] "print.DLLInfo"
      ## [11] "print.DLLInfoList"
      ## [12] "print.DLLRegisteredRoutines"
      ## [13] "print.eigen"
      ## [14] "print.factor"
      ## [15] "print.function"
      ## [16] "print.hexmode"
      ## [17] "print.libraryIQR"
      ## [18] "print.listof"
      ## [19] "print.NativeRoutineList"
      ## [20] "print.noquote"
      ## [21] "print.numeric_version"
      ## [22] "print.octmode"
      ## [23] "print.packageInfo"
      ## [24] "print.POSIXct"
      ## [25] "print.POSIXlt"
      ## [26] "print.proc_time"
      ## [27] "print.restart"
      ## [28] "print.rle"
      ## [29] "print.simple.list"
      ## [30] "print.srcfile"
      ## [31] "print.srcref"
      ## [32] "print.summary.table"
      ## [33] "print.summaryDefault"
      ## [34] "print.table"
      ## [35] "print.warnings"
      ## [36] "printCoefmat"
      ## [37] "sprintf"

      apropos("mean.")
      ## [1] ".colMeans" ".rowMeans" "colMeans"
      ## [4] "kmeans"    "mean.Date"
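The dispatch mechanism described above can be sketched with a small generic of our own. The generic describe() and its methods are hypothetical names introduced for this illustration; they are not part of R or of the book.

```r
# A hypothetical S3 generic: UseMethod() selects a method based on class(x).
describe <- function(x, ...) UseMethod("describe")

# Methods follow the naming pattern generic.class:
describe.factor <- function(x, ...) {
  paste("a factor with", nlevels(x), "levels")
}
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows")
}
# The default method is the fallback when no specific method matches:
describe.default <- function(x, ...) "some other object"

describe(factor(c("a", "b", "a")))
## [1] "a factor with 2 levels"
describe(data.frame(v = 1:3))
## [1] "a data frame with 3 rows"
describe(1:3)
## [1] "some other object"

# methods() lists the methods found for a generic. Unlike apropos(), it does
# not return unrelated names that merely match the search pattern.
methods("describe")
```

Note that apropos() treats its argument as a regular expression, which is why names such as printCoefmat and sprintf appear in the listing above; methods() is usually the more precise tool for this task.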