Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store tag markup for serialization #136

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Store tag markup for serialization #136

wants to merge 12 commits into from

Conversation

peterzhu2118
Copy link
Member

Based on #135 and Shopify/liquid#1380.

This PR stores tags encountered during parsing so we can use it during serialization. A block body has many (zero or more) tags and each tags also recursively has zero or one block body.

For example, consider this liquid code:

{% assign test = true %}
{% if test %}
Hello
{% else %}
Goodbye
{% endif %}

It would generate a structure that looks like the following:

Block body (body: "")
  |-> Tag markup (tag_name: "assign", markup: "test = true")
  |-> Tag markup (tag_name: "if", markup: "test")
        |-> Block body (body: "Hello")
              |-> Tag markup (tag_name: "else", markup: "")
                    |-> Block body (body: "Goodbye")

This PR also changes how tag objects (not to be confused with the tag markup) are stored. They are no longer pushed into the array of constants (since we don't want to serialize it) and instead written to a separate buffer that doesn't get serialized. Then during serialization, we serialize the tag markup, and upon deserialization we re-parse the tag markup into a tag object.

Base automatically changed from pz-raw-tag-tokenizer to master January 4, 2021 20:42
Copy link
Contributor

@macournoyer macournoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something the missing tags_ptr = in vm.c are necessary.

ext/liquid_c/tag_markup.h Show resolved Hide resolved
ext/liquid_c/vm.c Show resolved Hide resolved
ext/liquid_c/vm.c Show resolved Hide resolved
ext/liquid_c/tag_markup.c Show resolved Hide resolved
Copy link
Contributor

@dylanahsmith dylanahsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed some suggested changes to a branch (pz-deserialize...deserialize-suggestions) based on top of #138 and linked to relevant commits in review comments

ext/liquid_c/document_body.c Outdated Show resolved Hide resolved
ext/liquid_c/document_body.c Outdated Show resolved Hide resolved
@@ -361,7 +363,7 @@ static VALUE vm_render_until_error(VALUE uncast_args)
}

case OP_WRITE_NODE:
rb_funcall(cLiquidBlockBody, id_render_node, 3, vm->context.self, output, (VALUE)*const_ptr++);
rb_funcall(cLiquidBlockBody, id_render_node, 3, vm->context.self, output, *tags_ptr++);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding another moving constant pointer seems like a step in the wrong direction, since we are trying to get rid of these in order to not have to add extra complexity for control flow.

An alternative would be to combine these during deserialization into a single constant table. We could do that by marshal dumping the constant array with nil placeholders for the tag objects and serializing an index with the tag markup to use for assigning the tag object into the constant table on deserialization. However, this would be a lot of wasted effort if we wanted to have a separate tag table from a constant table when addressing #84.

Tags are used with a separate instruction, so I think a separate constant table might be appropriate, like how the liquid VM prototype uses a separate table for filters. E.g. this way we could have 255 tag write instructions with a 1-byte operation in a block. If so, this approach is fine for this PR, but I wanted to make sure we are aligned on the expected longer term approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't think the current way is the best way of implementing. I didn't store tags inside of the constant table because it added complexity (the need to iterate over the constants array and remove all the tags, and then during deserialization update the correct indexes in the constants array). I think in the future, we be storing indexes for constants/tags which would eliminate the need for both the const_ptr and the tags_ptr.

Comment on lines -251 to 258
void vm_assembler_add_write_node(vm_assembler_t *code, VALUE node)
void vm_assembler_add_write_node(vm_assembler_t *code)
{
vm_assembler_write_opcode(code, OP_WRITE_NODE);
vm_assembler_write_ruby_constant(code, node);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is no longer adding a complete instruction, unlike the other functions with a similar name. Should the tag objects be a part of the vm_assembler_t so that we can write it along with the instruction? Should it be given both the node and the node buffer so it can write to it here? Otherwise, we should just use vm_assembler_write_opcode directly in the caller to make it clear that it is writing parts of the instruction rather making it seems like it is using a function that is writing a whole instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vm_assembler_t is recycled after freezing the block and the tag objects live longer than that, so I don't think it can be stored there. The only place where this function is called is in block_body_add_node which calls this function (vm_assembler_add_write_node) and then writes the tag onto the buffer.

ext/liquid_c/vm_assembler.h Outdated Show resolved Hide resolved
ext/liquid_c/document_body.c Outdated Show resolved Hide resolved
ext/liquid_c/document_body.c Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants