Improve write-behind flushSize vs batchSize counting #80

Open
ceefour opened this issue Jul 6, 2014 · 0 comments

ceefour commented Jul 6, 2014

With the following write-behind config:

<bean parent="cache-template">
    <property name="name" value="yagoLabel" />
    <property name="cacheMode" value="PARTITIONED" />
    <property name="atomicityMode" value="ATOMIC" />
    <property name="distributionMode" value="PARTITIONED_ONLY" />
    <property name="backups" value="1" />
    <property name="store">
        <bean class="id.ac.itb.ee.lskk.lumen.yago.YagoLabelCacheStore" />
    </property>
    <property name="writeBehindEnabled" value="true" />
    <property name="writeBehindFlushSize" value="10240" />
    <property name="writeBehindFlushFrequency" value="30000" />
    <property name="writeBehindBatchSize" value="10240" />
    <property name="swapEnabled" value="false" />
    <property name="evictionPolicy">
        <bean class="org.gridgain.grid.cache.eviction.lru.GridCacheLruEvictionPolicy">
            <property name="maxSize" value="100000" />
        </bean>
    </property>
</bean>

I get the following behavior:

09:10:01.395 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 680000 labels
09:10:03.626 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=1652, upserted=8588
09:10:03.631 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 1 documents, inserted=0, modified=0, upserted=1
09:10:04.362 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 690000 labels
09:10:06.565 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=1666, upserted=8573
09:10:06.573 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 1 documents, inserted=0, modified=0, upserted=1
09:10:07.062 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 700000 labels
09:10:13.044 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=1748, upserted=8491
09:10:13.050 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 1 documents, inserted=0, modified=0, upserted=1
09:10:13.450 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 710000 labels

So the pattern is: putAll with 10240 entries, then putAll with 1 entry, then putAll with 10240 entries again, then putAll with 1 entry, and so on.

This isn't optimal: with the config above, each flush should consistently put 10240 entries. Note that the default write-behind values (flush size 10240, batch size 512) exhibit similar behavior: during a flush, the store receives several batches of 512 entries, then a batch of 1.
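
One explanation that would fit the logs above is an off-by-one in the flush trigger: the flush starts only once the pending entry count exceeds writeBehindFlushSize, so 10241 entries get drained in writeBehindBatchSize chunks. This is an assumption made from the observed output, not something checked against the GridGain source. The class and method names below are hypothetical and only model the counting:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of write-behind batch counting; NOT GridGain's actual implementation. */
class WriteBehindFlushModel {

    /**
     * Returns the sizes of the batches handed to the store during one flush.
     * Assumption: the flush starts when pending entries EXCEED flushSize
     * (i.e. at flushSize + 1) and drains everything in batchSize chunks.
     */
    static List<Integer> flushBatches(int flushSize, int batchSize) {
        int pending = flushSize + 1; // assumed trigger point: size > flushSize
        List<Integer> batches = new ArrayList<>();
        while (pending > 0) {
            int n = Math.min(batchSize, pending);
            batches.add(n);
            pending -= n;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Config from this issue: one full batch followed by a 1-entry batch.
        System.out.println(flushBatches(10240, 10240)); // [10240, 1]

        // Defaults (flush 10240, batch 512): 20 full batches, then a 1-entry batch.
        List<Integer> defaults = flushBatches(10240, 512);
        System.out.println(defaults.size() + " batches, last = "
                + defaults.get(defaults.size() - 1)); // 21 batches, last = 1
    }
}
```

Under that assumption the model reproduces both observations: 10240 then 1 with the config above, and a run of 512-entry batches followed by a single entry with the defaults.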

My workaround is to set:

<property name="writeBehindFlushSize" value="10239" />

i.e. 1 less than the batch size, which gives the expected behavior:

09:13:50.480 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=552, upserted=9688
09:13:51.644 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 210000 labels
09:13:53.471 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=606, upserted=9633
09:13:54.501 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 220000 labels
09:13:56.271 [flusher-0-#41%null%] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - Upserted 10240 documents, inserted=0, modified=612, upserted=9627
09:13:57.121 [main] INFO  i.a.i.e.l.l.yago.YagoLabelsToMongo - Inserted 230000 labels

This works, but it isn't intuitive. When the flush size is a multiple of the batch size, the flush and batch boundaries should align.
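
To make the requested alignment concrete, here is a small sketch comparing two trigger conditions. Both are assumptions: the "strict" variant (flush when size > flushSize) is only inferred from the logs above, and the "inclusive" variant (flush when size >= flushSize) is a hypothetical fix, not an existing GridGain option. All names are made up for illustration:

```java
/** Counting sketch only; the trigger conditions are assumptions, not GridGain code. */
class FlushAlignment {

    /** Size of the trailing partial batch when draining `pending` entries (0 means aligned). */
    static int trailing(int pending, int batchSize) {
        return pending % batchSize;
    }

    /** Assumed current behavior: flush starts when size > flushSize. */
    static int trailingWithStrictTrigger(int flushSize, int batchSize) {
        return trailing(flushSize + 1, batchSize);
    }

    /** Hypothetical aligned behavior: flush starts when size >= flushSize. */
    static int trailingWithInclusiveTrigger(int flushSize, int batchSize) {
        return trailing(flushSize, batchSize);
    }

    public static void main(String[] args) {
        System.out.println(trailingWithStrictTrigger(10240, 10240));    // 1 (the stray putAll)
        System.out.println(trailingWithStrictTrigger(10239, 10240));    // 0 (the workaround)
        System.out.println(trailingWithInclusiveTrigger(10240, 10240)); // 0
        System.out.println(trailingWithInclusiveTrigger(10240, 512));   // 0 (defaults)
    }
}
```

With an inclusive trigger, any flush size that is a multiple of the batch size would leave no trailing 1-entry batch, which is the behavior asked for here.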

ceefour changed the title from "Improve write-behind counting" to "Improve write-behind flushSize vs batchSize counting" on Jul 6, 2014