RDS writer #16

Merged
merged 34 commits into from
Jul 18, 2024
34 commits
bdd280d
RDSWriter
programLyrique Feb 22, 2024
50f12a6
Add information in the README:
programLyrique Feb 22, 2024
271b2a2
RDSOutputStream
programLyrique Feb 22, 2024
5d4135c
Minimal compiling RDSWriter
programLyrique Mar 5, 2024
0abd2f0
Names of symbol should be non null.
programLyrique Mar 7, 2024
a99082a
Support most common sexp types for RDS writing
programLyrique Mar 11, 2024
a5e6af9
Write read test for integer vector
programLyrique Mar 12, 2024
eb4174c
Test for Logical + associated builder
programLyrique Mar 12, 2024
c634080
Tests for reals and null
programLyrique Mar 12, 2024
a133a1c
Tests writing with GNU R as oracle
programLyrique Mar 13, 2024
f6e6171
ListSXP works.
programLyrique Mar 15, 2024
6a946a6
Supports user environments
programLyrique Mar 19, 2024
f4c144e
Bytecode serialization: WIP
programLyrique Mar 21, 2024
da47ea4
Some stubs I was starting to fill for the bytecode writer
programLyrique May 27, 2024
d95424b
Merge branch 'main' of github.com:PRL-PRG/r-compile-server into rds-w…
breitnw Jun 17, 2024
71bce82
Support for complex number serialization
breitnw Jun 18, 2024
c03929c
Writer and RDS flag refactor, basic GP flag support
breitnw Jun 24, 2024
11f2226
Bytecode serialization should probably work (untested)
breitnw Jun 24, 2024
f80b079
Bytecode serialization actually works (tested)
breitnw Jun 28, 2024
b0aed80
Bytecode serialization actually works (tested)
breitnw Jun 28, 2024
13f1f57
Merge branch 'rds-writer' of github.com:PRL-PRG/r-compile-server into…
breitnw Jun 28, 2024
6a12ca8
Merge branch 'main' of github.com:PRL-PRG/r-compile-server into rds-w…
breitnw Jun 28, 2024
2ec7ca2
Compiler tests passing
breitnw Jul 1, 2024
a54dace
Reference serialization
breitnw Jul 3, 2024
ab9f412
Removed obsolete TODOs
breitnw Jul 3, 2024
dff1661
fixed verification (hopefully)
breitnw Jul 3, 2024
698de95
RDS logging
breitnw Jul 11, 2024
4a7a285
RDS roundtrip test with R closures
breitnw Jul 15, 2024
8ca5da7
Cleaned up file organization
breitnw Jul 16, 2024
9408226
UTF-8 encoding support
breitnw Jul 16, 2024
ce32e0c
Fixed RDSRoundtripTest to compare toString output
breitnw Jul 17, 2024
a441ada
Removed log messages & fixed NA_STRING serialization
breitnw Jul 17, 2024
37e4737
removed logging
breitnw Jul 17, 2024
f2c2e08
Small fixes (formatting, etc.)
breitnw Jul 18, 2024
17 changes: 17 additions & 0 deletions .idea/highlightedFiles.xml

8 changes: 4 additions & 4 deletions .settings/org.eclipse.jdt.core.prefs
@@ -4,8 +4,8 @@ org.eclipse.jdt.core.compiler.annotation.nonnull=org.eclipse.jdt.annotation.NonN
org.eclipse.jdt.core.compiler.annotation.nonnullbydefault=org.eclipse.jdt.annotation.NonNullByDefault
org.eclipse.jdt.core.compiler.annotation.nullable=org.eclipse.jdt.annotation.Nullable
org.eclipse.jdt.core.compiler.annotation.nullanalysis=disabled
org.eclipse.jdt.core.compiler.codegen.targetPlatform=21
org.eclipse.jdt.core.compiler.compliance=21
org.eclipse.jdt.core.compiler.codegen.targetPlatform=22
org.eclipse.jdt.core.compiler.compliance=22
org.eclipse.jdt.core.compiler.problem.enablePreviewFeatures=disabled
org.eclipse.jdt.core.compiler.problem.forbiddenReference=warning
org.eclipse.jdt.core.compiler.problem.nullAnnotationInferenceConflict=error
@@ -14,6 +14,6 @@ org.eclipse.jdt.core.compiler.problem.nullSpecViolation=error
org.eclipse.jdt.core.compiler.problem.potentialNullReference=ignore
org.eclipse.jdt.core.compiler.problem.reportPreviewFeatures=ignore
org.eclipse.jdt.core.compiler.problem.syntacticNullAnalysisForFields=disabled
org.eclipse.jdt.core.compiler.processAnnotations=disabled
org.eclipse.jdt.core.compiler.processAnnotations=enabled
org.eclipse.jdt.core.compiler.release=disabled
org.eclipse.jdt.core.compiler.source=21
org.eclipse.jdt.core.compiler.source=22
2 changes: 1 addition & 1 deletion LICENSE
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
SOFTWARE.
4 changes: 3 additions & 1 deletion README.md
@@ -19,10 +19,12 @@ import javax.annotation.ParametersAreNonnullByDefault;

In IntelliJ you can simply copy `package-info.json` from any package into the new one and it will create this.

The project requires at least Java 22.

## Commands

- Run `make setup` to install Git Hooks. The commit hook formats, the pre-push hook runs tests and static analyses.
- Build with `make` or `mvn package`
- Test (no static analyses) with `make test` or `mvn test`
- Test and static analyses with `make check` or `mvn verify`
- Format with `make format` or `mvn spotless:apply`
- Format with `make format` or `mvn spotless:apply`. This requires `npm` to be installed.
9 changes: 9 additions & 0 deletions src/main/java/org/prlprg/RVersion.java
@@ -50,6 +50,15 @@ public static RVersion parse(String textual) {
this(major, minor, patch, null);
}

/**
* Encode the version as an integer. It is used for the RDS serialization for instance.
*
* @return the version packed as {@code major * 65536 + minor * 256 + patch}
*/
public int encode() {
return patch + 256 * minor + 65536 * major;
}

@Override
public String toString() {
return major + "." + minor + "." + patch + (suffix == null ? "" : "-" + suffix);
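
The arithmetic in `encode()` gives the minor and patch components one byte each and places the major component above them. A minimal sketch of that packing follows, with a matching decode step. The class `VersionEncodingSketch` and its `decode` helper are hypothetical names used only for illustration and are not part of this PR; the decode step assumes minor and patch each fit in 0..255.

```java
/** Hypothetical standalone helper mirroring the arithmetic of RVersion.encode(). */
final class VersionEncodingSketch {
  /** Packs major.minor.patch into one int, using the same formula as the diff above. */
  static int encode(int major, int minor, int patch) {
    return patch + 256 * minor + 65536 * major;
  }

  /** Inverse of encode; assumes minor and patch were each in the range 0..255. */
  static int[] decode(int packed) {
    return new int[] {packed >> 16, (packed >> 8) & 0xFF, packed & 0xFF};
  }

  public static void main(String[] args) {
    int packed = encode(4, 3, 2);
    System.out.println(packed); // 4 * 65536 + 3 * 256 + 2 = 262914
    int[] parts = decode(packed);
    System.out.println(parts[0] + "." + parts[1] + "." + parts[2]); // 4.3.2
  }
}
```

Read as hexadecimal, the packed value for 4.3.2 is `0x040302`, which makes the one-byte-per-component layout easy to spot in a hex dump of an RDS header.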
31 changes: 18 additions & 13 deletions src/main/java/org/prlprg/bc/ConstPool.java
@@ -3,12 +3,7 @@
import com.google.common.collect.ForwardingList;
import com.google.common.collect.ImmutableList;
import edu.umd.cs.findbugs.annotations.Nullable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.*;
import java.util.function.Function;
import javax.annotation.concurrent.Immutable;
import org.prlprg.parseprint.ParseMethod;
Expand Down Expand Up @@ -125,13 +120,13 @@ public static class Builder {
private final List<SEXP> values;

public Builder() {
this(Collections.emptyList());
this.index = new HashMap<>();
this.values = new ArrayList<>();
}

public Builder(List<SEXP> consts) {
index = new HashMap<>(consts.size());
values = new ArrayList<>(consts.size());

public Builder(Collection<? extends SEXP> consts) {
this.index = new HashMap<>(consts.size());
this.values = new ArrayList<>(consts.size());
for (var e : consts) {
add(e);
}
Expand All @@ -142,7 +137,7 @@ public <S extends SEXP> Idx<S> add(S c) {
index.computeIfAbsent(
c,
(ignored) -> {
var x = index.size();
var x = values.size();
values.add(c);
return x;
});
Expand Down Expand Up @@ -180,11 +175,21 @@ public Idx<RegSymSXP> indexSym(int i) {
return index(i, RegSymSXP.class);
}

// FIXME: do we need this?
// FIXME: do we need these? ---
public @Nullable Idx<LangSXP> indexLangOrNilIfNegative(int i) {
return i >= 0 ? orNil(i, LangSXP.class) : null;
}

public @Nullable Idx<IntSXP> indexIntOrNilIfNegative(int i) {
return i >= 0 ? orNil(i, IntSXP.class) : null;
}

public @Nullable Idx<StrSXP> indexStrOrNilIfNegative(int i) {
return i >= 0 ? orNil(i, StrSXP.class) : null;
}

// -- FIXME

public @Nullable Idx<StrOrRegSymSXP> indexStrOrSymOrNil(int i) {
return orNil(i, StrOrRegSymSXP.class);
}
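
The `Builder.add` hunk above pairs a map from constant to position with a list of constants in insertion order, so that adding the same constant twice returns the same position. The sketch below shows that pattern in isolation. `InterningPoolSketch` is a hypothetical, simplified stand-in: it is generic over any value type and returns a plain `int`, whereas the real builder works on `SEXP` and returns an `Idx`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for ConstPool.Builder (not the PR's class). */
final class InterningPoolSketch<T> {
  private final Map<T, Integer> index = new HashMap<>();
  private final List<T> values = new ArrayList<>();

  /** Returns the position of c, appending it only if it is not already present. */
  int add(T c) {
    return index.computeIfAbsent(
        c,
        ignored -> {
          int x = values.size(); // next free slot in the value list
          values.add(c);
          return x;
        });
  }

  int size() {
    return values.size();
  }

  public static void main(String[] args) {
    InterningPoolSketch<String> pool = new InterningPoolSketch<>();
    System.out.println(pool.add("a")); // 0
    System.out.println(pool.add("b")); // 1
    System.out.println(pool.add("a")); // 0 again: "a" is already in the pool
    System.out.println(pool.size()); // 2
  }
}
```

Taking the new position from `values.size()` ties it directly to the list the constant is appended to, so it does not depend on the state of the map while `computeIfAbsent` is running.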
5 changes: 5 additions & 0 deletions src/main/java/org/prlprg/primitive/Logical.java
@@ -26,6 +26,11 @@ public static Logical valueOf(int i) {
};
}

/** Convert to GNU-R representation. */
public int toInt() {
return i;
}

Logical(int i) {
this.i = i;
}
155 changes: 137 additions & 18 deletions src/main/java/org/prlprg/rds/Flags.java
@@ -1,7 +1,26 @@
package org.prlprg.rds;

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import javax.annotation.Nullable;

/**
* The bitflags describing a {@code SEXP} in RDS. They are described as follows:
*
* <ul>
* <li><b>0-7:</b> describe the SEXP's RDSItemType
* <li><b>8:</b> enabled if the SEXP is an object
* <li><b>9:</b> enabled if the SEXP has attributes
* <li><b>10:</b> enabled if the SEXP has a tag (for the pairlist types)
* <li><b>11:</b> unused bit
* <li><b>12-27:</b> general purpose (gp) bits, as defined in the {@code SXPINFO} struct
* </ul>
*
* There are 16 GP bits, which are traditionally present on each SEXP.
*/
final class Flags {
private static final int UTF8_MASK = 1 << 3;
private static final int OBJECT_MASK = 1 << 8;
private static final int ATTR_MASK = 1 << 9;
private static final int TAG_MASK = 1 << 10;

@@ -16,32 +35,39 @@ public Flags(int flags) {
}
}

// Pack the flags of a regular item
public Flags(
RDSItemType type,
int levels,
boolean isUTF8,
boolean hasAttributes,
boolean hasTag,
int refIndex) {
RDSItemType type, GPFlags levels, boolean isObject, boolean hasAttributes, boolean hasTag) {
if (type.i() == RDSItemType.Special.REFSXP.i())
throw new IllegalArgumentException(
"Cannot write REFSXP with this constructor: ref index " + "needed");
this.flags =
type.i()
| (levels << 12)
| (isUTF8 ? UTF8_MASK : 0)
| (levels.encode() << 12)
| (isObject ? OBJECT_MASK : 0)
| (hasAttributes ? ATTR_MASK : 0)
| (hasTag ? TAG_MASK : 0)
| (refIndex << 8);
| (hasTag ? TAG_MASK : 0);
}

// Pack the flags of a reference
public Flags(RDSItemType type, int refIndex) {
if (type.i() != RDSItemType.Special.REFSXP.i())
throw new IllegalArgumentException(
"Can only write REFSXP with this constructor: got a non-reference type");
this.flags = type.i() | (refIndex << 8);
}

public RDSItemType getType() {
return RDSItemType.valueOf(flags & 255);
}

public int decodeLevels() {
return flags >> 12;
// The levels contained in Flags are general-purpose bits.
public GPFlags getLevels() {
return new GPFlags(flags >> 12);
}

public boolean isUTF8() {
return (decodeLevels() & UTF8_MASK) != 0;
public boolean isObject() {
return (flags & OBJECT_MASK) != 0;
}

public boolean hasAttributes() {
@@ -56,15 +82,28 @@ public int unpackRefIndex() {
return flags >> 8;
}

/**
* Returns a new Flags identical to this one, but with the hasAttr bit set according to
* hasAttributes.
*/
public Flags withAttributes(boolean hasAttributes) {
return new Flags(this.flags & ~ATTR_MASK | (hasAttributes ? ATTR_MASK : 0));
}

/** Returns a new Flags identical to this one, but with the hasTag bit set according to hasTag. */
public Flags withTag(boolean hasTag) {
return new Flags(this.flags & ~TAG_MASK | (hasTag ? TAG_MASK : 0));
}

@Override
public String toString() {
return "Flags{"
+ "type="
+ getType()
+ ", levels="
+ decodeLevels()
+ ", isUTF8="
+ isUTF8()
+ getLevels().encode()
+ ", isObject="
+ isObject()
+ ", hasAttributes="
+ hasAttributes()
+ ", hasTag="
@@ -73,4 +112,84 @@ public String toString() {
+ unpackRefIndex()
+ '}';
}

public int encode() {
return flags;
}
}

/**
* Flags corresponding with the general-purpose (gp) bits found on a SEXP. See <a
* href="https://cran.r-project.org/doc/manuals/r-release/R-ints.html">R internals</a>
*
* <ul>
* <li><b>Bits 14 and 15</b> are used for 'fancy bindings'. Bit 14 is used to lock a binding or
* environment, and bit 15 is used to indicate an active binding. Bit 15 is used for an
* environment to indicate if it participates in the global cache.
* <li><b>Bits 1, 2, 3, 5, and 6</b> are used for a {@code CHARSXP} (we use strings) to indicate
* its encoding. Relevant to us are bits 2, 3, and 6, which indicate Latin-1, UTF-8, and ASCII
* respectively.
* <li><b>Bit 4</b> is turned on to mark S4 objects
* </ul>
*
* Currently, only the character encoding flag is used.
*/
final class GPFlags {
// HASHASH_MASK: 1;
// private static final int BYTES_MASK = 1 << 1;
private static final int LATIN1_MASK = 1 << 2;
private static final int UTF8_MASK = 1 << 3;
// S4_OBJECT_MASK: 1 << 4
// CACHED_MASK = 1 << 5;
private static final int ASCII_MASK = 1 << 6;
private static final int LOCKED_MASK = 1 << 14;

private final int flags;

GPFlags(@Nullable Charset charset, boolean locked) {
// NOTE: CACHED_MASK and HASHASH_MASK should be off when packing RDS flags for SEXPType.CHAR,
// but since we don't have external input for levels I think they should be off anyways
this.flags =
(locked ? LOCKED_MASK : 0)
| (charset == StandardCharsets.UTF_8 ? UTF8_MASK : 0)
| (charset == StandardCharsets.US_ASCII ? ASCII_MASK : 0)
| (charset == StandardCharsets.ISO_8859_1 ? LATIN1_MASK : 0);
}

GPFlags(int levels) {
this.flags = levels;
}

GPFlags() {
this.flags = 0;
}

public int encode() {
return flags;
}

public @Nullable Charset encoding() {
if ((flags & LATIN1_MASK) != 0) {
return StandardCharsets.ISO_8859_1; // ISO_8859_1 is LATIN1
} else if ((flags & UTF8_MASK) != 0) {
return StandardCharsets.UTF_8;
} else if ((flags & ASCII_MASK) != 0) {
return StandardCharsets.US_ASCII;
} else {
return null;
}
}

public boolean isLocked() {
return (this.flags & LOCKED_MASK) != 0;
}

public String toString() {
return "GPFlags{"
+ "encoding="
+ (this.encoding() == null ? "null" : Objects.requireNonNull(this.encoding()).name())
+ ", locked="
+ this.isLocked()
+ '}';
}
}
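
The bit layout documented on `Flags` can be exercised with plain integers. The sketch below packs a type code, the object/attribute/tag bits, and the gp bits into one word and reads them back, using the same shifts and masks as the diff. `RdsFlagsSketch` and its static helpers are hypothetical names, not part of this PR, and the type code 9 is used on the assumption that it denotes `CHARSXP` as in GNU R's `SEXPTYPE` numbering.

```java
/** Hypothetical illustration of the flag word layout described in the Flags Javadoc. */
final class RdsFlagsSketch {
  static final int OBJECT_MASK = 1 << 8; // bit 8: SEXP is an object
  static final int ATTR_MASK = 1 << 9; // bit 9: SEXP has attributes
  static final int TAG_MASK = 1 << 10; // bit 10: SEXP has a tag
  static final int GP_UTF8_MASK = 1 << 3; // bit 3 within the gp bits: UTF-8 string

  /** Packs the item type (bits 0-7), the three boolean bits, and the gp bits (from bit 12). */
  static int pack(int type, int gp, boolean isObject, boolean hasAttr, boolean hasTag) {
    return type
        | (gp << 12)
        | (isObject ? OBJECT_MASK : 0)
        | (hasAttr ? ATTR_MASK : 0)
        | (hasTag ? TAG_MASK : 0);
  }

  static int type(int flags) {
    return flags & 255;
  }

  static int gp(int flags) {
    return flags >> 12;
  }

  /** Returns flags with the attribute bit cleared or set, leaving all other bits alone. */
  static int withAttributes(int flags, boolean hasAttr) {
    return (flags & ~ATTR_MASK) | (hasAttr ? ATTR_MASK : 0);
  }

  public static void main(String[] args) {
    int flags = pack(9, GP_UTF8_MASK, false, false, false);
    System.out.println(Integer.toHexString(flags)); // 8009
    System.out.println(type(flags)); // 9
    System.out.println((gp(flags) & GP_UTF8_MASK) != 0); // true
    System.out.println(Integer.toHexString(withAttributes(flags, true))); // 8209
  }
}
```

Because the gp bits start at bit 12, the UTF-8 bit (gp bit 3) lands on bit 15 of the packed word. The masks for the object, attribute, and tag bits apply to the whole word, while the encoding masks in `GPFlags` apply only after the shift.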