Add read/write of tags to fileset to restore tags, add fs index bootstrapping #590

robskillington · 2018-05-06T22:44:47Z

No description provided.

…trapping

codecov · 2018-05-07T14:17:07Z

Codecov Report

Merging #590 into master will increase coverage by 0.39%.
The diff coverage is 83.43%.

@@            Coverage Diff            @@
##           master    #590      +/-   ##
=========================================
+ Coverage    81.7%   82.1%   +0.39%     
=========================================
  Files         230     231       +1     
  Lines       22089   22331     +242     
=========================================
+ Hits        18048   18334     +286     
+ Misses       3012    2954      -58     
- Partials     1029    1043      +14

Flag	Coverage Δ
#integration	`64.16% <69.78%> (+0.94%)`	⬆️
#unittests	`78.53% <70.56%> (+0.2%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fd4c714...da4bcb6.

prateek · 2018-05-07T17:28:37Z

storage/index/allocator.go

+// NewDefaultMutableSegmentAllocator returns a default mutable segment
+// allocator.
+func NewDefaultMutableSegmentAllocator(
+	iopts instrument.Options,


nit: change this to Options (instead of instrument.Options)

(we override some other defaults in the segment options in the default ctor too)

Sure thing.

prateek · 2018-05-07T17:30:56Z

persist/fs/clone/cloner.go

 		if err != nil {
 			if err == io.EOF {
 				break
 			}
 			return fmt.Errorf("unexpected error while reading data: %v", err)
 		}

+		var tags ident.Tags


this snippet seems to be repeated multiple times in the diff. mind pulling into a common place? m3x/ident/testutil maybe?

Sure thing.

richardartoul · 2018-05-07T17:34:17Z

persist/fs/msgpack/roundtrip_test.go

+	// Set the default values on the fields that did not exist in V1
+	// and then restore them at the end of the test - This is required
+	// because the old decoder won't read the new fields
+	oldEncodedTags := testIndexEntry.EncodedTags


I think I did the same thing in my test, but I was dealing with integer types that would get encoded anyways. Do you want to encode data with tag data included, and then strip them from the test object for the sake of comparison? That way we know the decoder can actually skip over tags data if it is present

Hm, do you mean in the test above or this one? Because in this one the encoded tags should actually be there if we specify encodeLegacyV1IndexEntry: false and encode with tags with values.

richardartoul · 2018-05-07T17:38:20Z

persist/fs/options.go

@@ -112,6 +128,12 @@ func (o *options) Validate() error {
 			"invalid index bloom filter false positive percent, must be >= 0 and <= 1: instead %f",
 			o.indexBloomFilterFalsePositivePercent)
 	}
+	if o.tagEncoderPool == nil {


Is this guarding against someone doing SetTagEncoderPool(nil)? I don't think we do this check everywhere but seems reasonable
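
For readers following along, the kind of guard being discussed can be sketched as below. This is a minimal illustration only: `encoderPool`, `simplePool`, `newOptions` and the error value are hypothetical stand-ins, not the actual types in `persist/fs`.

```go
package main

import (
	"errors"
	"fmt"
)

// encoderPool is a hypothetical stand-in for the real tag encoder pool type.
type encoderPool interface {
	Get() []byte
}

// simplePool is a trivial pool used as the default in this sketch.
type simplePool struct{}

func (simplePool) Get() []byte { return make([]byte, 0, 64) }

// errTagEncoderPoolNotSet is an assumed error name for this illustration.
var errTagEncoderPoolNotSet = errors.New("tag encoder pool is not set")

type options struct {
	tagEncoderPool encoderPool
}

// newOptions returns options with a non-nil default pool.
func newOptions() options {
	return options{tagEncoderPool: simplePool{}}
}

// SetTagEncoderPool returns a copy of the options with the pool replaced.
func (o options) SetTagEncoderPool(p encoderPool) options {
	o.tagEncoderPool = p
	return o
}

// Validate rejects options whose pool was explicitly set to nil.
func (o options) Validate() error {
	if o.tagEncoderPool == nil {
		return errTagEncoderPoolNotSet
	}
	return nil
}

func main() {
	fmt.Println(newOptions().Validate())
	fmt.Println(newOptions().SetTagEncoderPool(nil).Validate())
}
```

The default options validate cleanly; only an explicit `SetTagEncoderPool(nil)` trips the check.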

prateek · 2018-05-07T17:40:38Z

persist/fs/clone/cloner_test.go

 		b1.IncRef()
 		b2.IncRef()
 		require.Equal(t, t1.String(), t2.String())
+		require.Equal(t, a1.Remaining(), a2.Remaining())
+		numTags, numTagsMatched := a1.Remaining(), 0


there's a matcher for tag iters, so can make this a little shorter if you'd like:

ident.NewTagIterMatcher(ident.NewTagSliceIterator(a1)).Matches(ident.NewTagSliceIterator(a2))

Ah nice, will update - gracias.

richardartoul · 2018-05-07T17:41:36Z

persist/fs/read.go

@@ -380,21 +388,31 @@ func (r *reader) ReadBloomFilter() (*ManagedConcurrentBloomFilter, error) {
 	)
 }



This method name is a little confusing, it feels like it implies it returns the bytes for an entry when really what it does is just convert arbitrary []byte to (unowned) checked.Bytes. maybe just call it toCheckedBytes or cloneBytes or something

I'll call it entryClonedBytes(...) ta.

prateek · 2018-05-07T17:42:01Z

persist/fs/clone/options.go

@@ -1,3 +1,23 @@
+// Copyright (c) 2018 Uber Technologies, Inc.


note to future self: m3db/ci-scripts#12 ++

Ha, aye, all good.

prateek · 2018-05-07T17:42:35Z

persist/fs/index_lookup_prop_test.go

@@ -65,7 +65,9 @@ func TestIndexLookupWriteRead(t *testing.T) {
 		filePathPrefix := filepath.Join(dir, "")
 		defer os.RemoveAll(dir)

-		options := NewOptions().
+		// NB(r): Use testDefaultOpts to avoid allocing pools each


+1 this test takes forever usually, hopefully this will help

richardartoul · 2018-05-07T17:48:02Z

persist/fs/write.go

+			}
+			data, ok := tagsEncoder.Data()
+			if !ok {
+				return errWriterEncodeTagsDataNotAccessible


When does this happen

Only if you haven't actually encoded anything yet, it should never occur in practice.
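
A rough sketch of the behavior described in the reply, assuming an encoder whose `Data()` reports `false` until something has been encoded. The `tagEncoder` type, its wire format and the error name here are invented for illustration and do not mirror the real serializer.

```go
package main

import (
	"errors"
	"fmt"
)

// errTagsDataNotAccessible is an assumed error name for this illustration.
var errTagsDataNotAccessible = errors.New("encoded tags data is not accessible")

type tag struct{ Name, Value string }

// tagEncoder is a hypothetical encoder; the "name=value;" format is made up.
type tagEncoder struct {
	buf     []byte
	encoded bool
}

// Encode serializes the tags and marks the encoder as holding data.
func (e *tagEncoder) Encode(tags []tag) {
	e.buf = e.buf[:0]
	for _, t := range tags {
		e.buf = append(e.buf, t.Name...)
		e.buf = append(e.buf, '=')
		e.buf = append(e.buf, t.Value...)
		e.buf = append(e.buf, ';')
	}
	e.encoded = true
}

// Data returns the encoded bytes, or false if Encode has not been called yet.
func (e *tagEncoder) Data() ([]byte, bool) {
	if !e.encoded {
		return nil, false
	}
	return e.buf, true
}

// encodedTags turns the "not ok" case into an error, as the writer does.
func encodedTags(e *tagEncoder) ([]byte, error) {
	data, ok := e.Data()
	if !ok {
		return nil, errTagsDataNotAccessible
	}
	return data, nil
}

func main() {
	enc := &tagEncoder{}
	_, err := encodedTags(enc)
	fmt.Println(err)

	enc.Encode([]tag{{Name: "city", Value: "sf"}})
	data, err := encodedTags(enc)
	fmt.Printf("%s %v\n", data, err)
}
```

Because the writer always encodes before asking for the data, the error path is defensive rather than expected.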

richardartoul · 2018-05-07T17:48:56Z

services/m3dbnode/config/bootstrap.go

+			SetInstrumentOptions(opts.InstrumentOptions()).
+			SetDatabaseBlockOptions(opts.DatabaseBlockOptions()).
+			SetSeriesCachePolicy(opts.SeriesCachePolicy()).
+			SetIndexMutableSegmentAllocator(index.NewDefaultMutableSegmentAllocator(iopts))


This wasn't here before right? Why is it being added now

need the ability to create segments for index bootstrap results (they include segments)

As prateek mentions.

richardartoul · 2018-05-07T17:50:47Z

storage/bootstrap/bootstrapper/fs/source.go

@@ -45,10 +47,30 @@ type newDataFileSetReaderFn func(
 	opts fs.Options,
 ) (fs.DataFileSetReader, error)

+type runType int
+
+const (


Any reason to move these out into a more shareable location? I'm thinking no because it's really only the FS bootstrapper that uses almost the exact same code path for bootstrapping data vs index, right?

Although true, this is internal to this bootstrapper - for instance for now the peer bootstrapper will have a fundamentally different path when bootstrapping the index as it needs to call FetchMetadata on the client session, hence it won't need any concept of run type.

I don't see the need at this time to cause an abstraction that's not necessarily required and may raise the question "what is this used for", etc when others read the code.
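
To make the discussion concrete, a package-private run type along these lines might look as follows. Only `runType` and `bootstrapDataRunType` appear in the diff; `bootstrapIndexRunType` and the `describe` helper are assumptions made for this sketch.

```go
package main

import "fmt"

// runType selects which kind of bootstrap a shared code path performs.
type runType int

const (
	bootstrapDataRunType runType = iota
	// bootstrapIndexRunType is an assumed name for the index variant.
	bootstrapIndexRunType
)

// describe is a hypothetical helper showing how one code path can branch
// on the run type while rejecting unknown values.
func describe(run runType) (string, error) {
	switch run {
	case bootstrapDataRunType:
		return "cache shard indices; blocks are read lazily from disk", nil
	case bootstrapIndexRunType:
		return "read every ID and its tags to rebuild the index", nil
	default:
		return "", fmt.Errorf("unknown run type: %d", int(run))
	}
}

func main() {
	for _, run := range []runType{bootstrapDataRunType, bootstrapIndexRunType, runType(42)} {
		desc, err := describe(run)
		fmt.Println(desc, err)
	}
}
```

Keeping the type unexported means other bootstrappers, such as the peer bootstrapper mentioned above, never see it.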

richardartoul · 2018-05-07T17:51:43Z

storage/bootstrap/bootstrapper/fs/source.go

+func (s *fileSystemSource) tagsFromTagsIter(
+	iter ident.TagIterator,
+) (ident.Tags, error) {
+	tags := make(ident.Tags, 0, iter.Remaining())


this code repeats all the time, do you think it's worth adding a new method to the iter to just do it?

Also we're not pooling these arrays yeah? I think I need to do something similar for the commitlog bootstrapping stuff and wasn't sure. I thought @prateek mentioned he created a pool for it

Yeah, so this instance we actually use the ident.Pool to clone the tags however we aren't pooling the arrays. If we're at the point where we need to actually create tags from the tag iterator however we are putting this into a segment which we won't rotate out until it expires, so it's probably fine not to pool for the meantime (similar to when creating the series ID and tags, at that point we just allocate because it'll be around for a while and callers need to take refs to these things without fear of it being deallocated while they use it).

Yeah I don't think pooling is necessary cause these ids/tags would be retained in memory by the series (once we transfer ownership correctly). Longer term I think it'd be worthwhile to see if bulk allocating the slices helps perf but we can come back to that.
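
The repeated snippet under discussion drains an iterator into a freshly allocated slice sized by `Remaining()`. A minimal sketch, assuming a simplified iterator interface (the `Tag`, `TagIterator` and `sliceIter` definitions below are illustrative, not the `m3x/ident` API):

```go
package main

import "fmt"

// Tag and TagIterator are simplified stand-ins for the real ident types.
type Tag struct{ Name, Value string }

type TagIterator interface {
	Next() bool
	Current() Tag
	Err() error
	Remaining() int
}

// sliceIter iterates over an in-memory slice of tags.
type sliceIter struct {
	tags []Tag
	idx  int
}

func newSliceIter(tags []Tag) *sliceIter { return &sliceIter{tags: tags, idx: -1} }

func (it *sliceIter) Next() bool {
	if it.idx+1 >= len(it.tags) {
		return false
	}
	it.idx++
	return true
}

func (it *sliceIter) Current() Tag   { return it.tags[it.idx] }
func (it *sliceIter) Err() error     { return nil }
func (it *sliceIter) Remaining() int { return len(it.tags) - (it.idx + 1) }

// tagsFromIter allocates exactly once, sized by Remaining(), with no pooling:
// the result is expected to live as long as the segment that holds it.
func tagsFromIter(iter TagIterator) ([]Tag, error) {
	tags := make([]Tag, 0, iter.Remaining())
	for iter.Next() {
		tags = append(tags, iter.Current())
	}
	if err := iter.Err(); err != nil {
		return nil, err
	}
	return tags, nil
}

func main() {
	iter := newSliceIter([]Tag{{Name: "city", Value: "sf"}, {Name: "env", Value: "prod"}})
	tags, err := tagsFromIter(iter)
	fmt.Println(tags, err)
}
```

Sizing the slice up front avoids regrowth during the loop, which is most of what a pool would buy for long-lived values.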

richardartoul · 2018-05-07T17:52:32Z

storage/bootstrap/bootstrapper/fs/source.go

+				s.log.Errorf("unable to create index segment: %v", err)
+				hasError = true
+			}
+		default:


richardartoul · 2018-05-07T17:57:00Z

storage/bootstrap/bootstrapper/fs/source.go

-			xlog.NewField("namespace", nsID.String()),
-		).Infof("filesystem bootstrapper resolving block retriever")
+	if run == bootstrapDataRunType {
+		// NB(r): We can only need to cache shard indices and possibly shortcut


This looks like one of the few big code blocks that would be easy to factor out into a method since it doesn't modify any other function state except the blockRetriever (which it could return).

Also, it's ok that the blockRetriever is always nil in the index case yeah?

This comment is a little confusing as well.

Maybe something like: "We only need to cache shard indices and mark blocks as fulfilled when bootstrapping data, because the data can be retrieved lazily from disk during reads. On the other hand, if we're bootstrapping the index then we need to rebuild it from scratch by reading all the IDs/tags"

Fair, I can refactor into a method and update the comment.

richardartoul · 2018-05-07T18:12:27Z

storage/series/series.go

 	s.RLock()
+	defer s.RUnlock()


Delete cache all metadata :D

It must die, indeed.

prateek · 2018-05-07T19:02:06Z

persist/fs/seek.go

+	Size        uint32
+	Checksum    uint32
+	Offset      int64
+	EncodedTags []byte


mind updating the docs/diagrams with this change too

Sure thing.

prateek · 2018-05-07T19:04:26Z

persist/fs/retriever.go

@@ -523,6 +537,10 @@ func (req *retrieveRequest) onCallerOrRetrieverDone() {
 	}
 	req.id.Finalize()
 	req.id = nil
+	if req.tags != nil {
+		req.tags.Close()
+		req.tags = nil


should this be req.tags = ident.EmptyTagIterator

Sure thing (we initialize this with ident.EmptyTagIterator when we reset for reuse but we can do that here too).
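
The suggestion amounts to resetting to a shared no-op iterator instead of `nil`, so later callers can use the field without nil checks. A small sketch of that pattern, using invented types (`TagIterator`, `emptyIter`, `request`) rather than the real `ident` and retriever code:

```go
package main

import "fmt"

// TagIterator is a simplified stand-in for ident.TagIterator.
type TagIterator interface {
	Next() bool
	Close()
}

// emptyIter never yields anything and is safe to close repeatedly.
type emptyIter struct{}

func (emptyIter) Next() bool { return false }
func (emptyIter) Close()     {}

// EmptyTagIterator plays the role of the shared empty iterator.
var EmptyTagIterator TagIterator = emptyIter{}

// countingIter yields n tags and records whether it was closed.
type countingIter struct {
	n      int
	closed bool
}

func (c *countingIter) Next() bool {
	if c.n == 0 {
		return false
	}
	c.n--
	return true
}

func (c *countingIter) Close() { c.closed = true }

type request struct{ tags TagIterator }

// done releases the iterator and leaves the request in a usable state.
func (r *request) done() {
	r.tags.Close()
	r.tags = EmptyTagIterator
}

// countTags can be called at any point without a nil check.
func countTags(r *request) int {
	n := 0
	for r.tags.Next() {
		n++
	}
	return n
}

func main() {
	it := &countingIter{n: 3}
	req := &request{tags: it}
	req.done()
	fmt.Println(it.closed, countTags(req))
}
```

After `done()`, iterating the request's tags simply yields nothing instead of panicking on a nil interface.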

prateek · 2018-05-07T19:17:46Z

storage/series/series.go

 	s.RLock()
+	defer s.RUnlock()


is the disk fetch (earlier comment) no longer a concern?

I think he removed it because it only applied to cache all metadata and we're planning on deleting that

Yeah we're going to delete the cache all metadata strategy, also it's probably not all that bad to just wait if that's the case (it never should though, as this is a flush: the data should never have come from disk if we're flushing this).

fair enough

richardartoul · 2018-05-07T19:35:33Z

Also do you think it's worth adding a simple integration test to make sure this all works end to end? Seems like you could just copy-paste one of our existing FS integration tests and add an additional assertion on the tags

robskillington · 2018-05-07T20:28:39Z

Sure thing will add an integration test.

richardartoul · 2018-05-08T15:37:00Z

docs/architecture/storage.md

-│- Major Version      │  │- Index Entry Offset ├──┘  │- Checksum           │   
+                                                     ┌─────────────────────┐
+┌─────────────────────┐  ┌─────────────────────┐     │     Index File      │
+│      Info File      │  │   Summaries File    │     │   (sorted by ID)    │


Since you're updating this, can you add the new fields I added recently:

type IndexInfo struct {
	MajorVersion int64
	BlockStart   int64
	BlockSize    int64
	Entries      int64
	Summaries    IndexSummariesInfo
	BloomFilter  IndexBloomFilterInfo
	SnapshotTime int64
	FileType     persist.FileSetType
}

Sure thing.

richardartoul · 2018-05-08T15:38:54Z

integration/fs_bootstrap_index_test.go

@@ -0,0 +1,249 @@
+// +build integration
+
+// Copyright (c) 2016 Uber Technologies, Inc.


richardartoul · 2018-05-08T15:41:01Z

integration/fs_bootstrap_tags_test.go

@@ -0,0 +1,126 @@
+// +build integration
+
+// Copyright (c) 2016 Uber Technologies, Inc.


richardartoul · 2018-05-08T15:49:58Z

persist/fs/msgpack/encoder.go

+// backwards-compatbility
+func (enc *Encoder) encodeIndexEntryV1(entry schema.IndexEntry) {
+	// Manually encode num fields for testing purposes
+	enc.encodeArrayLenFn(minNumIndexEntryFields)


I wonder if you should just hard-code the number or use a v1-specific constant here instead of relying on the minimum (which I think is designed for decoding purposes). Seems like a safe bet since this function is designed to simulate the behavior of old binaries

Sure thing.
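
The distinction between a per-version field count (used when encoding) and a minimum field count (used when decoding) can be illustrated with a toy length-prefixed format. Everything below is a simplified assumption: the real code uses msgpack and a richer `schema.IndexEntry`, and the constant names here are invented.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	numFieldsV1  = 2           // ID, Checksum (hypothetical V1 layout)
	numFieldsV2  = 3           // V1 fields plus EncodedTags
	minNumFields = numFieldsV1 // decoders accept anything with at least this many
)

var (
	errShortBuffer  = errors.New("buffer too short")
	errTooFewFields = errors.New("too few fields")
)

type entry struct{ ID, Checksum, EncodedTags string }

// appendField writes a one-byte length then the bytes (fields must be < 256 bytes).
func appendField(dst []byte, field string) []byte {
	dst = append(dst, byte(len(field)))
	return append(dst, field...)
}

// encodeV1 simulates an old binary: it writes its own fixed field count.
func encodeV1(e entry) []byte {
	out := []byte{numFieldsV1}
	out = appendField(out, e.ID)
	return appendField(out, e.Checksum)
}

// encodeV2 writes the current field count, including tags.
func encodeV2(e entry) []byte {
	out := []byte{numFieldsV2}
	out = appendField(out, e.ID)
	out = appendField(out, e.Checksum)
	return appendField(out, e.EncodedTags)
}

// decode reads all fields present, keeps the ones this decoder knows about
// (knownFields), and skips the rest.
func decode(src []byte, knownFields int) (entry, error) {
	if len(src) == 0 {
		return entry{}, errShortBuffer
	}
	numFields := int(src[0])
	if numFields < minNumFields {
		return entry{}, errTooFewFields
	}
	src = src[1:]
	fields := make([]string, 0, numFields)
	for i := 0; i < numFields; i++ {
		if len(src) == 0 || len(src) < 1+int(src[0]) {
			return entry{}, errShortBuffer
		}
		n := int(src[0])
		fields = append(fields, string(src[1:1+n]))
		src = src[1+n:]
	}
	e := entry{ID: fields[0], Checksum: fields[1]}
	if knownFields >= numFieldsV2 && numFields >= numFieldsV2 {
		e.EncodedTags = fields[2]
	}
	return e, nil
}

func main() {
	e := entry{ID: "foo", Checksum: "abcd", EncodedTags: "city=sf"}
	oldReadsNew, _ := decode(encodeV2(e), numFieldsV1)
	newReadsOld, _ := decode(encodeV1(e), numFieldsV2)
	fmt.Printf("%+v\n%+v\n", oldReadsNew, newReadsOld)
}
```

Pinning the V1 encoder to its own constant keeps the simulation of old binaries stable even if the decoding minimum changes later.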

richardartoul · 2018-05-08T15:50:16Z

persist/fs/msgpack/encoder.go

+}
+
+func (enc *Encoder) encodeIndexEntryV2(entry schema.IndexEntry) {
+	// Manually encode num fields for testing purposes


This comment seems less relevant for the "current" version

richardartoul · 2018-05-08T15:56:04Z

persist/fs/read.go


-	return ident.BinaryID(idClone)
+func (r *reader) entryClonedEncodedTagsTagIter(encodedTags []byte) ident.TagIterator {


super nit: you can probably make this entryClonedEncodedTagsIter so it doesn't stutter as much

Sure thing.

prateek · 2018-05-08T16:33:02Z

integration/fs_bootstrap_index_test.go

+	"testing"
+	"time"
+
+	"github.com/m3db/m3ninx/idx"


nit: import order

prateek · 2018-05-08T16:35:09Z

integration/fs_bootstrap_index_test.go

+	require.NoError(t, err)
+	defer iter.Finalize()
+
+	verifyQueryMetadataResults(t, iter, exhausitive, verifyQueryMetadataResultsOptions{


+1 nice and clean

prateek · 2018-05-08T16:38:17Z

storage/bootstrap/bootstrapper/fs/source.go

@@ -265,28 +317,51 @@ func (s *fileSystemSource) handleErrorsAndUnfulfilled(
 				}
 			}
 		}
-		resultLock.Unlock()
+		// NB(r): We explicitly do not remove entries from the index results


richardartoul · 2018-05-08T16:50:02Z

storage/bootstrap/bootstrapper/fs/source.go

+		// as they are additive and get merged together with results from other
+		// bootstrappers by just appending the result (unlike data bootstrap
+		// results that when merged replace the block with the current block).
+		// It would also be difficult to remove only series that was added to the


prateek · 2018-05-08T16:53:24Z

storage/bootstrap/result/result_index.go

+	idxopts namespace.IndexOptions,
+	opts Options,
+) (segment.MutableSegment, error) {
+	blockStart := t.Truncate(idxopts.BlockSize())


maybe add a quick note about why this truncation is required and the % ==0 guarantee the code relies upon.

Sure thing.
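
As background for the requested note, `time.Time.Truncate` rounds down to a multiple of the duration measured from Go's zero time, so for block sizes that evenly divide 24 hours the result lines up with wall-clock block boundaries in UTC. A small sketch, with `blockStartFor` and `mustParse` as hypothetical helpers and a 2h block size chosen purely as an example:

```go
package main

import (
	"fmt"
	"time"
)

// blockStartFor returns the start of the block containing t.
// Truncate rounds down to a multiple of blockSize since the zero time.
func blockStartFor(t time.Time, blockSize time.Duration) time.Time {
	return t.Truncate(blockSize)
}

// isBlockAligned reports whether t sits exactly on a block boundary.
func isBlockAligned(t time.Time, blockSize time.Duration) bool {
	return t.Equal(t.Truncate(blockSize))
}

// mustParse is a helper for this example only.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	blockSize := 2 * time.Hour // example value, not taken from the PR
	t := mustParse("2018-05-08T17:42:00Z")

	start := blockStartFor(t, blockSize)
	fmt.Println(start.Format(time.RFC3339)) // 2018-05-08T16:00:00Z
	fmt.Println(isBlockAligned(t, blockSize), isBlockAligned(start, blockSize))
}
```

Every timestamp inside the same block truncates to the same start, which is what lets results be grouped per block.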

prateek

LGTM

richardartoul · 2018-05-08T16:58:58Z

storage/bootstrap/bootstrapper/fs/source.go

+	runResult.Lock()
+	exists, err = segment.ContainsID(idBytes)
+	// ID and tags no longer required below
+	release()


it looks like FromMetricIter clones the ID so you could probably do:

d, err := convert.FromMetricIter(id, tagsIter)
release()
if err != nil {
	return err
}
runResult.Lock()
exists, err = segment.ContainsID(d.ID)

Sure thing.

richardartoul

LGTM with nits

Rob Skillington added 3 commits May 6, 2018 18:43

Add read/write of tags to fileset to restore tags, add fs index bootstrapping

886793d

Fix integration test build failures

cf54469

Fix unit test build errors

9014e99

Rob Skillington added 4 commits May 7, 2018 10:46

Fix fs identifier pool

48afa23

Fix lint issues

9a70a80

Fix remaining metalint issues

2aa6ded

Fix encode legacy index entry flag

fe8c764

prateek reviewed May 7, 2018

richardartoul reviewed May 7, 2018

prateek reviewed May 7, 2018

richardartoul reviewed May 7, 2018

prateek reviewed May 7, 2018

richardartoul reviewed May 7, 2018

prateek reviewed May 7, 2018

Address feedback

8095ab5

Rob Skillington added 4 commits May 8, 2018 02:17

Address further feedback and split inner fs bootstrapper methods

c580dbe

Add sorted order when returning test tags

78658f5

Add fs tags integration test

8aadf85

Add integration test for testing index queries after bootstrap

64bf392

richardartoul reviewed May 8, 2018

Test test build errors

29e5247

richardartoul reviewed May 8, 2018

Rob Skillington added 3 commits May 8, 2018 11:57

Address feedback

b219f44

Address nit naming comment

37d4f77

Fix TestPeersBootstrapNodeDown integration test

6170c63

prateek reviewed May 8, 2018

prateek reviewed May 8, 2018

Fix remaining integration tests

be3a1fd

prateek reviewed May 8, 2018

richardartoul reviewed May 8, 2018

prateek reviewed May 8, 2018

prateek approved these changes May 8, 2018

richardartoul reviewed May 8, 2018

richardartoul approved these changes May 8, 2018

Rob Skillington added 3 commits May 8, 2018 13:19

Address feedback

1a8f1f9

Fix forward compatible encoder/decoder tests

22b5680

Fix lint

da4bcb6

robskillington merged commit 5a17284 into master May 8, 2018

robskillington deleted the r/add-bootstrap-index-fs-support branch May 8, 2018 19:00
