Releases: SebKrantz/collapse
collapse version 1.8.6
collapse 1.8.6
- Fixed further minor issues:
- some inline functions in TRA.c needed to be declared 'static' to be local in scope (#275)
- timeid.Rd now uses zoo package conditionally and limits size of printout
collapse 1.8.5
- Fixed some issues flagged by CRAN:
- Installation on some Linux distributions failed because omp.h was included after Rinternals.h
- Some signed integer overflows while running tests caused UBSAN warnings. (This happened inside a hash function where overflows are not a problem. I changed to unsigned int to avoid the UBSAN warning.)
- Ensured that the package passes R CMD check without suggested packages
collapse version 1.8.4
A few improvements and fixes to make collapse 1.8 acceptable to CRAN. The changes may be summarised as follows:
collapse 1.8.4
- Makevars text substitution hack to have CRAN accept a package that combines C, C++ and OpenMP. Thanks also to @MichaelChirico for pointing me in the right direction.
collapse 1.8.3
- Significant speed improvement in qF/qG (factor-generation) for character vectors with more than 100,000 obs and many levels if sort = TRUE (the default). For details see the method argument of ?qF.
- Optimizations in fmode and fndistinct for singleton groups.
collapse 1.8.2
- Fixed some rchk issues found by Thomas Kalibera from CRAN.
- Faster funique.default method.
- group now also internally optimizes on 'qG' objects.
collapse 1.8.1
- Added function fnunique (yet another alternative to data.table::uniqueN, kit::uniqLen or dplyr::n_distinct, and principally a simple wrapper for attr(group(x), "N.groups")). At present fnunique generally outperforms the others on data frames.
- finteraction has an additional argument factor = TRUE. Setting factor = FALSE returns a 'qG' object, which is more efficient if just an integer id but no factor object itself is required.
- Operators (see .OPERATOR_FUN) have been improved a bit such that id-variables selected in the .data.frame (by, w or t arguments) or .pdata.frame methods (variables in the index) are not computed upon even if they are numeric (since the default is cols = is.numeric). In general, if cols is a function used to select columns of a certain data type, id variables are excluded from computation even if they are of that data type. It is still possible to compute on id variables by explicitly selecting them using names or indices passed to cols, or including them in the lhs of a formula passed to by.
- Further efforts to facilitate adding the group-count in fsummarise and fmutate:
  - if options(collapse_mask = "all") is set before loading the package, an additional function n() is exported that works just like dplyr:::n(). (Note that internal optimization flags for n are always on, so if you really want the function to be called n() without setting options(collapse_mask = "all"), you could also do n <- GRPN or n <- collapse:::n)
  - otherwise the same can now always be done using GRPN(). The previous uses of GRPN are unaltered i.e. GRPN can also:
    - fetch group sizes directly from a grouping object or grouped data frame i.e. data |> gby(id) |> GRPN() or data %>% gby(id) %>% ftransform(N = GRPN(.)) (note the dot).
    - compute group sizes on the fly, for example fsubset(data, GRPN(id) > 10L) or fsubset(data, GRPN(list(id1, id2)) > 10L) or GRPN(data, by = ~ id1 + id2).
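The fnunique and GRPN uses listed above can be tried on a built-in dataset. This is a minimal sketch, assuming collapse 1.8.1 or later is installed; mtcars and the object names are chosen only for illustration.

```r
library(collapse)

# Number of distinct values; per the notes above this is principally
# a wrapper around attr(group(x), "N.groups").
n_cyl  <- fnunique(mtcars$cyl)
n_cyl2 <- attr(group(mtcars$cyl), "N.groups")

# Group sizes computed on the fly, expanded to one value per observation.
sizes <- GRPN(mtcars$cyl)

# Keep only rows that belong to groups with more than 10 observations.
big <- fsubset(mtcars, GRPN(cyl) > 10L)
```

In mtcars the cylinder groups have 11, 7 and 14 rows, so the subset keeps the 4- and 8-cylinder cars.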
collapse version 1.8.0
collapse 1.8.0, released mid-May 2022, brings enhanced support for indexed computations on time series and panel data by introducing flexible 'indexed_frame' and 'indexed_series' classes and surrounding infrastructure, sets a modest start to OpenMP multithreading as well as data transformation by reference in statistical functions, and enhances the package's descriptive statistics toolset.
Changes to functionality
- Functions Recode, replace_non_finite, deprecated since collapse v1.1.0, and is.regular, deprecated since collapse v1.5.1 and clashing with a more important function in the zoo package, are now removed.
- Fast Statistical Functions operating on numeric data (such as fmean, fmedian, fsum, fmin, fmax, ...) now preserve attributes in more cases. Previously these functions did not preserve attributes for simple computations using the default method, and only preserved attributes in grouped computations if !is.object(x) (see NEWS section for collapse 1.4.0). This meant that fmin and fmax did not preserve the attributes of Date or POSIXct objects, and none of these functions preserved 'units' objects (used a lot by the sf package). Now, attributes are preserved if !inherits(x, "ts"), that is the new default of these functions is to generally keep attributes, except for 'ts' objects where doing so obviously causes an unwanted error (note that 'xts' and others are handled by the matrix or data.frame method where other principles apply, see NEWS for 1.4.0). Exceptions are the functions fnobs and fndistinct, where the previous default is kept.
- Time Series Functions flag, fdiff, fgrowth and psacf/pspacf/psccf (and the operators L/F/D/Dlog/G) now internally process time objects passed to the t argument (where is.object(t) && is.numeric(unclass(t))) via a new function called timeid which turns them into integer vectors based on the greatest common divisor (GCD) (see below). Previously such objects were converted to factor. This can change behavior of code e.g. a 'Date' variable representing monthly data may be regular when converted to factor, but is now irregular and regarded as daily data (with a GCD of 1) because of the different day counts of the months. Users should fix such code either by calling qG on the time variable (for grouping / factor-conversion) or by using appropriate classes e.g. zoo::yearmon. Note that plain numeric vectors where !is.object(t) are still used directly for indexation without passing them through timeid (which can still be applied manually if desired).
- BY now has an argument reorder = TRUE, which casts elements in the original order if NROW(result) == NROW(x) (like fmutate). Previously the result was just in order of the groups, regardless of the length of the output. To obtain the former outcome users need to set reorder = FALSE.
- options("collapse_DT_alloccol") was removed, the default is now fixed at 100. The reason is that data.table automatically expands the range of overallocated columns if required (so the option is not really necessary), and calling R options from C slows down C code and can cause problems in parallel code.
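The change in how 'Date' time variables are handled can be seen on a short monthly sequence. This is a sketch under the assumption that collapse 1.8.0 or later is installed; the dates are made up for the example.

```r
library(collapse)

# Monthly data stored as 'Date': the gaps are 31, 29 and 31 days.
d <- seq(as.Date("2020-01-01"), by = "month", length.out = 4)
gaps <- diff(as.numeric(d))

# Greatest common divisor of the distinct gaps (1 day here).
step <- vgcd(sort(unique(gaps)))

# timeid() therefore regards the series as daily data with holes ...
id_date <- as.integer(timeid(d))

# ... whereas qG() yields a plain consecutive id, one value per distinct date.
id_qG <- as.integer(qG(d))
```

Passing id_qG (or a zoo::yearmon version of the dates) to the t argument keeps the monthly series regular, as recommended above.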
Bug Fixes
- Fixed a bug in fcumsum that caused a segfault during grouped operations on larger data, due to flawed internal memory allocation. Thanks @Gulde91 for reporting #237.
- Fixed a bug in across caused by two function(x) statements being passed in a list e.g. mtcars |> fsummarise(acr(mpg, list(ssdd = function(x) sd(x), mu = function(x) mean(x)))). Thanks @trang1618 for reporting #233.
- Fixed an issue in across() when logical vectors were used to select columns on grouped data e.g. mtcars %>% gby(vs, am) %>% smr(acr(startsWith(names(.), "c"), fmean)) now works without error.
- qsu gives proper output for length 1 vectors e.g. qsu(1).
- collapse depends on R > 3.3.0, due to the use of newer C-level macros introduced then. The earlier indication of R > 2.1.0 was only based on R-level code and misleading. Thanks @ben-schwen for reporting #236. I will try to maintain this dependency for as long as possible, without being too restrained by development in R's C API and the ALTREP system in particular, which collapse might utilize in the future.
Additions
- Introduction of 'indexed_frame', 'indexed_series' and 'index_df' classes: fast and flexible indexed time series and panel data classes that inherit from plm's 'pdata.frame', 'pseries' and 'pindex' classes. These classes take full advantage of collapse's computational infrastructure, are class-agnostic i.e. they can be superimposed upon any data frame or vector/matrix like object while maintaining most of the functionality of that object, support both time series and panel data, natively handle irregularity, and support ad-hoc computations inside arbitrary data masking functions and model formulas. This infrastructure comprises additional functions and methods, and modification of some existing functions and 'pdata.frame' / 'pseries' methods.
  - New functions: findex_by/iby, findex/ix, unindex, reindex, is_irregular, to_plm.
  - New methods: [.indexed_series, [.indexed_frame, [<-.indexed_frame, $.indexed_frame, $<-.indexed_frame, [[.indexed_frame, [[<-.indexed_frame, [.index_df, fsubset.pseries, fsubset.pdata.frame, funique.pseries, funique.pdata.frame, roworder(v) (internal), na_omit (internal), print.indexed_series, print.indexed_frame, print.index_df, Math.indexed_series, Ops.indexed_series.
  - Modification of 'pseries' and 'pdata.frame' methods for functions flag/L/F, fdiff/D/Dlog, fgrowth/G, fcumsum, psmat, psacf/pspacf/psccf, fscale/STD, fbetween/B, fwithin/W, fhdbetween/HDB, fhdwithin/HDW, qsu and varying to take advantage of 'indexed_frame' and 'indexed_series' while continuing to work as before with 'pdata.frame' and 'pseries'.

  For more information and details see help("indexing").
- Added function timeid: Generation of an integer-id/time-factor from time or date sequences represented by integer or double vectors (such as 'Date', 'POSIXct', 'ts', 'yearmon', 'yearquarter' or plain integers / doubles) by a numerically quite robust greatest common divisor method (see below). This function is used internally in findex_by, reindex and also in evaluation of the t argument to functions like flag/fdiff/fgrowth whenever is.object(t) && is.numeric(unclass(t)) (see also note above).
- Programming helper function vgcd to efficiently compute the greatest common divisor from a vector of positive integer or double values (which should ideally be unique and sorted as well, timeid uses vgcd(sort(unique(diff(sort(unique(na_rm(x)))))))). Precision for doubles is up to 6 digits.
- Programming helper function frange: A significantly faster alternative to base::range, which calls both min and max. Note that frange inherits collapse's global na.rm = TRUE default.
- Added function qtab/qtable: A versatile and computationally more efficient alternative to base::table. Notably, it also supports tabulations with frequency weights, and computation of a statistic over combinations of variables. Objects are of class 'qtab' that inherits from 'table'. Thus all 'table' methods apply to it.
- TRA was rewritten in C, and now has an additional argument set = TRUE which toggles data transformation by reference. The function setTRA was added as a shortcut which additionally returns the result invisibly. Since TRA is usually accessed internally through the like-named argument to Fast Statistical Functions, passing set = TRUE to those functions yields an internal call to setTRA. For example fmedian(num_vars(iris), g = iris$Species, TRA = "-", set = TRUE) subtracts the species-wise median from the numeric variables in the iris dataset, modifying the data in place and returning the result invisibly. Similarly the argument can be added in other workflows such as iris |> fgroup_by(Species) |> fmutate(across(1:2, fmedian, set = TRUE)) or mtcars |> ftransform(mpg = mpg %+=% hp, wt = fsd(wt, cyl, TRA = "replace_fill", set = TRUE)). Note that such chains must be ended by invisible() if no printout is wanted.
- Exported helper function greorder, the companion to gsplit to reorder output in fmutate (and now also in BY): let g be a 'GRP' object (or something coercible such as a vector) and x a vector, then greorder orders data in y = unlist(gsplit(x, g)) such that identical(greorder(y, g), x).
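A few of the helpers added above can be exercised together. The following is a minimal sketch, assuming collapse 1.8.0 or later is installed; the vectors and object names are invented for the example.

```r
library(collapse)

# frange(): range with missing values skipped by default.
x <- c(3, NA, 1, 7)
r <- frange(x)

# qtab(): plain cross-tabulation, and a tabulation that sums a weight vector.
tab   <- qtab(mtcars$cyl, mtcars$am)
tab_w <- qtab(mtcars$cyl, w = mtcars$wt)

# gsplit() / greorder(): split by group, then restore the original order.
v <- c(5, 1, 4, 2, 3, 6)
g <- c("b", "a", "b", "a", "c", "a")
y <- unlist(gsplit(v, g))   # values arranged group by group
v_back <- greorder(y, g)    # back in the order of v
```

Because 'qtab' objects inherit from 'table', the usual table methods should also apply to tab and tab_w.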
Improvements
- fmean, fprod, fmode and fndistinct were rewritten in C, providing performance improvements, particularly in fmode and fndistinct, and improvements for integers in fmean and fprod.
- OpenMP multithreading in fsum, fmean, fmedian, fnth, fmode and fndistinct, implemented via an additional nthreads argument. The default is to use 1 thread, which internally calls a serial version of the code in fsum and fmean (thus no change in the default behavior). The plan is to slowly roll this out over all statistical functions and then introduce options to set alternative global defaults. Multi-threading internally works differently for different functions, see the nthreads argument documentation of a particular function. Unfortunately I currently cannot guarantee thread safety, as parallelization of complex loops entails some tricky bugs and I have limited time to sort these out. So please report bugs, and if you happen to have experience with OpenMP please consider examining the code and making some suggestions.
- TRA has an additional option "replace_NA", e.g. wlddev |> fgroup_by(iso3c) |> fmutate(across(PCGDP:POP, fmedian, TRA = "replace_NA")) performs median value imputation of missing values. Similarly fo...
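The nthreads argument mentioned above can be checked against the serial default. This is only a sketch, assuming collapse 1.8.0 or later built with OpenMP support; the simulated data and the choice of 2 threads are arbitrary.

```r
library(collapse)

set.seed(1)
x <- rnorm(1e5)
g <- sample.int(100L, 1e5, replace = TRUE)

m1 <- fmean(x, g)                 # default: 1 thread, serial code
m2 <- fmean(x, g, nthreads = 2L)  # request 2 OpenMP threads

# The results should agree up to floating point tolerance.
same <- isTRUE(all.equal(m1, m2))
```

Whether more threads actually help depends on the function and the data, so timing both variants on realistic inputs is advisable.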
collapse version 1.7.6
- Corrected a C-level bug in gsplit that could lead R to crash in some instances (gsplit is used internally in fsummarise, fmutate, BY and collap to perform computations with base R (non-optimized) functions).
- Ensured that BY.grouped_df always (by default) returns grouping columns in aggregations i.e. iris |> gby(Species) |> nv() |> BY(sum) now gives the same as iris |> gby(Species) |> nv() |> fsum().
- A dot (.) was added to the name of the first argument of functions fselect, fsubset, colorder and fgroup_by, i.e. fselect(x, ...) -> fselect(.x, ...). The reason for this is that over time I added the option to select-rename columns e.g. fselect(mtcars, cylinders = cyl), which was not offered when these functions were created. This presents problems if columns should be renamed into x, e.g. fselect(mtcars, x = cyl) failed, see #221. Renaming the first argument to .x somewhat guards against such situations. I think this change is worthwhile to implement, because it makes the package more robust going forward, and usually the first argument of these functions is never invoked explicitly. I really hope this breaks nobody's code.
- Added a function GRPN to make it easy to add a column of group sizes e.g. mtcars %>% fgroup_by(cyl,vs,am) %>% ftransform(Sizes = GRPN(.)) or mtcars %>% ftransform(Sizes = GRPN(list(cyl, vs, am))) or GRPN(mtcars, by = ~cyl+vs+am).
- Added [.pwcor and [.pwcov, to be able to subset correlation/covariance matrices without losing the print formatting.
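The effect of renaming the first argument to .x can be seen with a select-rename call that targets the name x. A minimal sketch, assuming collapse 1.7.6 or later is installed; the new column names are arbitrary.

```r
library(collapse)

# Select two columns and rename them on the fly. Because the data argument
# is now called .x, a new column may itself be named x.
sel <- fselect(mtcars, x = cyl, y = mpg)
nms <- names(sel)
```

Code that passed the data frame positionally, as here, is unaffected by the change; only calls that spelled out x = for the first argument need updating.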
collapse version 1.7.5
collapse 1.7.5
- In the development version on GitHub, a dot (.) was added to the name of the first argument of functions fselect, fsubset, colorder and fgroup_by, i.e. fselect(x, ...) -> fselect(.x, ...). The reason for this is that over time I added the option to select-rename columns e.g. fselect(mtcars, cylinders = cyl), which was not offered when these functions were created. This presents problems if columns should be renamed into x, e.g. fselect(mtcars, x = cyl) fails, see e.g. #221. Renaming the first argument to .x somewhat guards against such situations. I think this API change is worthwhile to implement, because it makes the package more robust going forward, and usually the first argument of these functions is never invoked explicitly. For now it remains in the development version which you can install using remotes::install_github("SebKrantz/collapse"). If you have strong objections to this change (because it will break your code or you know of people that have a programming style where they explicitly set the first argument of data manipulation functions), please let me know!
- Also ensuring tidyverse examples are in \donttest{} and building without the dplyr testing file to avoid issues with static code analysis on CRAN.
- 20-50% speed improvement in gsplit (and therefore in fsummarise, fmutate, collap and BY when invoked with base R functions) when grouping with GRP(..., sort = TRUE, return.order = TRUE). To enable this by default, the default for argument return.order in GRP was set to sort, which retains the ordering vector (needed for the optimization). Retaining the ordering vector uses up some memory which can possibly adversely affect computations with big data, but with big data sort = FALSE usually gives faster results anyway, and you can also always set return.order = FALSE (also in fgroup_by, collap), so this default gives the best of both worlds.
- An ancient deprecated argument sort.row (replaced by sort in 2020) is now removed from collap. Also arguments return.order and method were added to collap, providing full control of the grouping that happens internally.
collapse 1.7.4
- Tests needed to be adjusted for the upcoming release of dplyr 1.0.8 which involves an API change in mutate. fmutate will not take over these changes i.e. fmutate(..., .keep = "none") will continue to work like dplyr::transmute. Furthermore, no more tests involving dplyr are run on CRAN, and I will also not follow along with any future dplyr API changes.
- The C-API macro installTrChar (used in the new massign function) was replaced with installChar to maintain backwards compatibility with R versions prior to 3.6.0. Thanks @tedmoorman #213.
- Minor improvements to group(), providing increased performance for doubles and also increased performance when the second grouping variable is integer, which turned out to be very slow in some instances.
collapse version 1.7.3
- Removed tests involving the weights package (which is not available on R-devel CRAN checks).
- fgroup_by is more flexible, supporting computing columns e.g. fgroup_by(GGDC10S, Variable, Decade = floor(Year / 10) * 10) and various programming options e.g. fgroup_by(GGDC10S, 1:3), fgroup_by(GGDC10S, c("Variable", "Country")), or fgroup_by(GGDC10S, is.character). You can also use column sequences e.g. fgroup_by(GGDC10S, Country:Variable, Year), but this should not be mixed with computing columns. Compute expressions may also not include the : function.
- More memory efficient attribute handling in C/C++ (using C-API macro SHALLOW_DUPLICATE_ATTRIB instead of DUPLICATE_ATTRIB) in most places.
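The more flexible fgroup_by interface described above might be used as follows. This is a sketch assuming collapse 1.7.3 or later is installed; it uses mtcars instead of GGDC10S, and the column name heavy is made up for the example.

```r
library(collapse)

# Grouping on an existing column plus a column computed on the fly.
g1 <- fgroup_by(mtcars, cyl, heavy = wt > 3)
s1 <- fsummarise(g1, mpg = fmean(mpg))

# Programmatic selection of grouping columns by name or by position.
g2 <- fgroup_by(mtcars, c("cyl", "am"))
g3 <- fgroup_by(mtcars, 8:9)   # columns 8 and 9 of mtcars are vs and am

vars2 <- fgroup_vars(g2, "names")
vars3 <- fgroup_vars(g3, "names")
```

As noted above, column sequences such as vs:am also work, but should not be mixed with computed columns in the same call.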
collapse version 1.7.2
- Ensured that the base pipe |> is not used in tests or examples, to avoid errors on CRAN checks with older versions of R.
collapse version 1.7.1
collapse 1.7.1
- Fixed minor C/C++ issues flagged in CRAN checks.
- Added option ties = "last" to fmode.
- Added argument stable.algo to qsu. Setting stable.algo = FALSE toggles a faster calculation of the standard deviation, yielding 2x speedup on large datasets.
- Fast Statistical Functions now internally use group for grouping data if both g and TRA arguments are used, yielding efficiency gains on unsorted data.
- Ensured that fmutate and fsummarise can be called if collapse is not attached.
collapse version 1.7.0
collapse 1.7.0
collapse 1.7.0, released mid-January 2022, brings major improvements in the computational backend of the package, its data manipulation capabilities, and a whole set of new functions that enable more flexible and memory efficient R programming - significantly enhancing the language itself. For the vast majority of code, updating to 1.7 should not cause any problems.
Changes to functionality
- num_vars is now implemented in C, yielding a massive performance increase over checking columns using vapply(x, is.numeric, logical(1)). It selects columns where (is.double(x) || is.integer(x)) && !is.object(x). This provides the same results for most common classes found in data frames (e.g. factors and date columns are not numeric), however it is possible for users to define methods for is.numeric for other objects, which will not be respected by num_vars anymore. A prominent example are base R's 'ts' objects i.e. is.numeric(AirPassengers) returns TRUE, but is.object(AirPassengers) is also TRUE so the above yields FALSE, implying - if you happened to work with data frames of 'ts' columns - that num_vars will now not select those anymore. Please make me aware if there are other important classes that are found in data frames and where is.numeric returns TRUE. num_vars is also used internally in collap so this might affect your aggregations.
- In flag, fdiff and fgrowth, if a plain numeric vector is passed to the t argument such that is.double(t) && !is.object(t), it is coerced to integer using as.integer(t) and directly used as time variable, rather than applying ordered grouping first. This is to avoid the inefficiency of grouping, and owes to the fact that in most data imported into R with various packages, the time (year) variables are coded as double although they should be integer (I also don't know of any cases where time needs to be indexed by a non-date variable with decimal places). Note that the algorithm internally handles irregularity in the time variable so this is not a problem. Should this break any code, kindly raise an issue on GitHub.
- The function setrename now truly renames objects by reference (without creating a shallow copy). The same is true for vlabels<- (which was rewritten in C) and a new function setrelabel. Thus additional care needs to be taken (with use inside functions etc.) as the renaming will take global effects unless a shallow copy of the data was created by some prior operation inside the function. If in doubt, better use frename or relabel which do create a shallow copy.
- Some improvements to the BY function, both in terms of performance and security. Performance is enhanced through a new C function gsplit, providing split-apply-combine computing speeds competitive with dplyr on a much broader range of R objects. Regarding security: if the result of the computation has the same length as the original data, names / rownames and grouping columns (for grouped data) are only added to the result object if known to be valid, i.e. if the data was originally sorted by the grouping columns (information recorded by GRP.default(..., sort = TRUE), which is called internally on non-factor/GRP/qG objects). This is because BY does not reorder data after the split-apply-combine step (unlike dplyr::mutate); data are simply recombined in the order of the groups. Because of this, in general, BY should be used to compute summary statistics (unless data are sorted before grouping). The added security makes this explicit.
- Added a method length.GRP giving the length of a grouping object. This could break code calling length on a grouping object before (which just returned the length of the list).
- Functions renamed in collapse 1.6.0 will now print a message telling you to use the updated names. The functions under the old names will stay around for 1-3 more years.
- The passing of argument order instead of sort in function GRP (from a very early version of collapse), is now disabled.
Bug Fixes
- Fixed a bug in some functions using Welford's Online Algorithm (fvar, fsd, fscale and qsu) to calculate variances, occurring when initial or final zero weights caused the running sum of weights in the algorithm to be zero, yielding a division by zero and NA as output although a value was expected. These functions now skip zero weights alongside missing weights, which also implies that you can pass a logical vector to the weights argument to very efficiently calculate statistics on a subset of data (e.g. using qsu).
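To illustrate the kind of problem fixed here, the following base R sketch implements a weighted Welford update that skips zero and missing weights. It is not the package's C code: the function name welford_wvar and the frequency-weight denominator sum(w) - 1 are assumptions made for the example.

```r
# Hypothetical helper: weighted variance via Welford's online algorithm.
welford_wvar <- function(x, w = rep(1, length(x))) {
  sumw <- 0  # running sum of weights
  mu   <- 0  # running weighted mean
  m2   <- 0  # running weighted sum of squared deviations
  for (i in seq_along(x)) {
    wi <- as.numeric(w[i])
    # Skipping zero weights avoids dividing by a zero running sum of weights,
    # which would otherwise turn the running mean into NaN.
    if (is.na(x[i]) || is.na(wi) || wi == 0) next
    sumw <- sumw + wi
    d    <- x[i] - mu
    mu   <- mu + d * wi / sumw
    m2   <- m2 + wi * d * (x[i] - mu)
  }
  m2 / (sumw - 1)
}

x   <- c(2, 4, 4, 5, 9, 1)
sel <- c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)  # logical weights select a subset

v_sub  <- welford_wvar(x, sel)            # variance of x[sel]
v_freq <- welford_wvar(c(1, 2), c(2, 1))  # same as the variance of c(1, 1, 2)
```

With logical weights the leading FALSE is exactly the "initial zero weight" case from the bug description, and skipping it makes the result match the variance of the selected subset.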
Additions
Basic Computational Infrastructure
- Function group was added, providing a low-level interface to a new unordered grouping algorithm based on hashing in C and optimized for R's data structures. The algorithm was heavily inspired by the great kit package of Morgan Jacob, and now feeds into the package through multiple central functions (including GRP / fgroup_by, funique and qF) when invoked with argument sort = FALSE. It is also used in internal groupings performed in data transformation functions such as fwithin (when no factor or 'GRP' object is provided to the g argument). The speed of the algorithm is very promising (often superior to radixorder), and it could be used in more places still. I welcome any feedback on its performance on different datasets.
- Function gsplit provides an efficient alternative to split based on grouping objects. It is used as a new backend to rsplit (which also supports data frames) as well as BY, collap, fsummarise and fmutate - for more efficient grouped operations with functions external to the package.
- Added multiple functions to facilitate memory efficient programming (written in C). These include elementary mathematical operations by reference (setop, %+=%, %-=%, %*=%, %/=%), supporting computations involving integers and doubles on vectors, matrices and data frames (including row-wise operations via setop) with no copies at all. Furthermore a set of functions which check a single value against a vector without generating logical vectors: whichv, whichNA (operators %==% and %!=% which return indices and are significantly faster than ==, especially inside functions like fsubset), anyv and allv (allNA was already added before). Finally, functions setv and copyv speed up operations involving the replacement of a value (x[x == 5] <- 6) or of a sequence of values from an equally sized object (x[x == 5] <- y[x == 5], or x[ind] <- y[ind] where ind could be pre-computed vectors or indices) in vectors and data frames without generating any logical vectors or materializing vector subsets.
- Function vlengths was added as a more efficient alternative to lengths (without method dispatch, simply coded in C).
- Function massign provides a multivariate version of assign (written in C, and supporting all basic vector types). In addition the operator %=% was added as an efficient multiple assignment operator. (It is called %=% and not %<-% to facilitate the translation of Matlab or Python codes into R, and because the zeallot package already provides multiple-assignment operators (%<-% and %->%), which are significantly more versatile, but orders of magnitude slower than %=%)
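The memory efficient programming helpers listed above can be combined in a few lines. This is a minimal sketch, assuming collapse 1.7.0 or later is installed; the vector and the assigned names a and b are invented for the example. Note that setv and %+=% modify x in place, so any other name bound to the same vector would change too.

```r
library(collapse)

x <- c(1, 5, 3, 5)

idx  <- whichv(x, 5)  # positions equal to 5, without building a logical vector
idx2 <- x %==% 5      # operator form of the same lookup
has5 <- anyv(x, 5)    # is any element equal to 5?

setv(x, 5, 6)         # by reference: like x[x == 5] <- 6
x %+=% 1              # by reference: like x <- x + 1

# Multiple assignment: creates a and b in the current environment.
c("a", "b") %=% list(10, "z")
```

copyv(x, 5, 6) is the copying counterpart of setv for cases where the input must stay untouched.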
High-Level Features
- Fully fledged fmutate function that provides functionality analogous to dplyr::mutate (sequential evaluation of arguments, including arbitrary tagged expressions and across statements). fmutate is optimized to work together with the package's Fast Statistical and Data Transformation Functions, yielding fast, vectorized execution, but also benefits from gsplit for other operations.
- across() function implemented for use inside fsummarise and fmutate. It is also optimized for Fast Statistical and Data Transformation Functions, but performs well with other functions too. It has an additional argument .apply = FALSE which will apply functions to the entire subset of the data instead of individual columns, and thus allows for nesting tibbles and estimating models or correlation matrices by groups etc. across() also supports an arbitrary number of additional arguments which are split and evaluated by groups if necessary. Multiple across() statements can be combined with tagged vector expressions in a single call to fsummarise or fmutate. Thus the computational framework is pretty general and similar to data.table, although less efficient with big datasets.
- Added functions relabel and setrelabel to make interactive dealing with variable labels a bit easier. Note that both functions operate by reference. (Through vlabels<- which is implemented in C. Taking a shallow copy of the data frame is useless in this case because variable labels are attributes of the columns, not of the frame). The only difference between the two is that setrelabel returns the result invisibly.
- Function shortcuts rnm and mtt added for frename and fmutate. across can also be abbreviated using acr.
- Added two options that can be invoked before loading of the package to change the namespace: options(collapse_mask = c(...)) can be set to export copies of selected (or all) functions in the package that start with f removing the leading f e.g. fsubset -> subset (both fsubset and subset will be exported). This allows masking base R and dplyr functions (even basic functions such as sum, mean, unique etc. if desired) with collapse's fast functions, facilitating the optimization of existing codes and allowing you to work with collapse using a more natural namespace. The package has been internally insulated against such changes, but of course they might have major effects on...
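The fmutate and across() features described in this section can be sketched on a grouped data frame. The example assumes collapse 1.7.0 or later is installed; the dataset and the new column name mpg_dm are chosen only for illustration.

```r
library(collapse)

g <- fgroup_by(mtcars, cyl)

# Group means of two columns via across() inside fsummarise.
res <- fsummarise(g, across(c(mpg, wt), fmean))

# Group-wise demeaning inside fmutate; rows stay in their original order.
out <- fmutate(g, mpg_dm = fwithin(mpg))
```

Both calls use the package's own fast functions, which is the case the notes above describe as optimized; arbitrary base R functions also work, but are evaluated group by group.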
collapse version 1.6.5
collapse 1.6.5
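- Use of VECTOR_PTR in C API now gives an error on R-devel even if USE_RINTERNALS is defined. Thus this patch gets rid of all remaining usage of this macro to avoid errors on CRAN checks using the development version of R.
- The print method for qsu now uses an apostrophe (') to designate million digits, instead of a comma (,). This is to avoid confusion with the decimal point, and the typical use of (,) for thousands (which I don't like).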
qsu
now uses an apostrophe (') to designate million digits, instead of a comma (,). This is to avoid confusion with the decimal point, and the typical use of (,) for thousands (which I don't like).