-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making R internals more ALTREP friendly #23
Comments
I'd be happy to contribute to efforts in this vein if there is endorsement
from R-core.
…On Thu, Jul 4, 2024 at 12:22 PM Davis Vaughan ***@***.***> wrote:
I've been working on an ALTREP "view" class, and have discovered a number
of small places where I think base R itself could be a little more ALTREP
friendly. I'd be interested in working on a few of these next week if there
is support for it, I think it could be a really tractable win for r-dev-day
with small and well scoped issues to tackle!
------------------------------
In particular, there are a few places I've found so far where R uses, say,
LOGICAL() where it could instead use LOGICAL_RO(). These are cases where
it seems like R only needs a read only pointer into the data. This would be
more friendly to ALTREP classes that can provide a readonly pointer easily,
but would have to materialize to provide a writable pointer (like an ALTREP
"view").
- duplicate1(), used by Rf_duplicate()
- Link:
https://github.com/wch/r-source/blob/9dd0f20d40d6696ae55580f8525a4d6a85d17f7e/src/main/duplicate.c#L343-L347
- copyVector()
- Link:
https://github.com/wch/r-source/blob/b9c83de5f7c9bb0b6dbe96b6d61a04d195b0cf2e/src/main/duplicate.c#L385
- R_compute_identical(), used by the R level identical()
- Link:
https://github.com/wch/r-source/blob/b9c83de5f7c9bb0b6dbe96b6d61a04d195b0cf2e/src/main/identical.c#L233
All 3 of these currently force a materialization of my ALTREP class due to
requesting a writable pointer, when I think they only need a readonly one.
------------------------------
In addition to these, I think that ExtractSubset() (used by the R level [,
and many other places) could be modified to have really nice efficiency
gains for ALTREP types. It currently calls the corresponding *_ELT()
method on *each index*. I was thinking that alternatively it could try to
see if a call to Dataptr_or_null() returns a cheap read only dataptr that
it could use to extract from instead. I think that would make subset
extraction way more efficient for ALTREP classes that can provide a
readonly pointer easily (again, like a view).
I do realize there is an ALTREP Extract_subset method that I could
implement for this last case, but subsetting is tricky to get *exactly*
right for all cases, and I'd much rather just hand R a read only dataptr
and let it handle the finer details of the subsetting!
Link:
https://github.com/wch/r-source/blob/b9c83de5f7c9bb0b6dbe96b6d61a04d195b0cf2e/src/main/subset.c#L137
—
Reply to this email directly, view it on GitHub
<#23>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAG53MJYNPCX6VEIGIABATLZKWOHBAVCNFSM6AAAAABKMAU536VHI2DSMVQWIX3LMV43ASLTON2WKOZSGM4TCMZXGMYTANA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Seems worth a shot. |
It does seem like we will need to defer to the extractsubset method if it's
there and I dont recall if that's one of thr methods that can return null
if it wants to do nothing and defer to core code (I'm on my phone at a
train station atm)
…On Sat, Jul 6, 2024, 4:59 PM ltierney ***@***.***> wrote:
Seems worth a shot.
—
Reply to this email directly, view it on GitHub
<#23 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAG53MNPDFUAML5QINNPYU3ZLAA3NAVCNFSM6AAAAABKMAU536VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMJRG44TAMJVGU>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
rlang side of this, which will be used as a test to make sure it is working r-lib/rlang#1725 |
This work has been reported back in 3 reports on Bugzilla, addressing the 3 cases identified in #23 (comment):
The work on |
4th PR sent in at: |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I've been working on an ALTREP "view" class, and have discovered a number of small places where I think base R itself could be a little more ALTREP friendly. I'd be interested in working on a few of these next week if there is support for it, I think it could be a really tractable win for r-dev-day with small and well scoped issues to tackle!
In particular, there are a few places I've found so far where R uses, say,
LOGICAL()
where it could instead useLOGICAL_RO()
. These are cases where it seems like R only needs a read only pointer into the data. This would be more friendly to ALTREP classes that can provide a readonly pointer easily, but would have to materialize to provide a writable pointer (like an ALTREP "view").duplicate1()
, used byRf_duplicate()
copyVector()
R_compute_identical()
, used by the R levelidentical()
All 3 of these currently force a materialization of my ALTREP class due to requesting a writable pointer, when I think they only need a readonly one.
In addition to these, I think that
ExtractSubset()
(used by the R level[
, and many other places) could be modified to have really nice efficiency gains for ALTREP types. It currently calls the corresponding*_ELT()
method on each index. I was thinking that alternatively it could try to see if a call toDataptr_or_null()
returns a cheap read only dataptr that it could use to extract from instead. I think that would make subset extraction way more efficient for ALTREP classes that can provide a readonly pointer easily (again, like a view).I do realize there is an ALTREP
Extract_subset
method that I could implement for this last case, but subsetting is tricky to get exactly right for all cases, and I'd much rather just hand R a read only dataptr and let it handle the finer details of the subsetting!Link: https://github.com/wch/r-source/blob/b9c83de5f7c9bb0b6dbe96b6d61a04d195b0cf2e/src/main/subset.c#L137
The text was updated successfully, but these errors were encountered: