Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve how to handle issues with supporting Windows #35

Open
cmnbroad opened this issue Oct 30, 2018 · 2 comments
Open

Resolve how to handle issues with supporting Windows #35

cmnbroad opened this issue Oct 30, 2018 · 2 comments

Comments

@cmnbroad
Copy link
Collaborator

cmnbroad commented Oct 30, 2018

If we're going to support windows, there are some issues, some subtle and some not, that we should probably address sooner rather than later. (#34 is is failing when the tests run on Windows).

  • Line endings. For source code, test files, and more subtly, text files that have indices that contain offsets, i.e. fasta. We could configure the repo to handle them, but I don't think that would prevent a Windows user from submitting a compressed file with Windows line endings (though I'm not clear on whether all binary types need to be exempted as described in the article). We recently had a GATK developer who submitted a branch developed on Windows - the tests succeeded locally but were failing somewhat mysteriously on travis due to index offset issues caused by line ending differences.

  • Path separator interconversion. We're going to need a way to reference test files from within tests that work on all platforms (once we figure out where test files live, that is). Many of the tests in URI and Path specifier classes for input/output resources. #34 are intended to validate that raw input paths from a user are normalized and handled correctly. Preprocessing the inputs using Paths or URI defeats the purpose, as these perform other transformations.

@magicDGS
Copy link
Member

  • For line-endings, I whould suggest to configure the repo and write detailed guidelines for the Windows user/developer and maintainers should check binary files. Also, we can add a pre-check in the Travis configuration to check for this in compressed files that are changed. But the main point of maintaining a multi-platform development means that, in cases where line-endings are supported in both formats (e.g., FASTA/FASTQ), the tests should not fail independently on the system. That's something that we should take care in development, because we should not use the system-preferences: for input, we should be aware of the kind of line-ending that we are reading and for the output we should set either a default one that can be roundtrip in both systems or the one described in the specs for that format.
  • For path-separator interconversion, that's the main reason why I am against URI and Path specifier classes for input/output resources. #34 - there is already a nice way of getting paths in a consistent way for any system (Paths.get). I guess that, as Disq is doing, we can support both java.nio.Path as the preferred way and IO streams in case a file-system implementation is not available. Correct me if I am wrong, but I guess that the URIs use-case is more to provide a way to flag a resource as being, for example, a GenomicsDB source of records: in that case, I repeat that this should be done in a factory method that handles differently that resource by providing the Path/*Stream necessary. I don't think that we need to have the logic in htsjdk3 for converting the String with a non-path URI to the resource itself, but maybe provide an interface that could be implemented by the user for defining a resource and how to access it. For example, in the case of GenomicDB, the resource can still be difined as a Path once the schema (gendb://) is removed. The factory method supporting GenommicDB could extend the default one, parse the URI and if the gendb: schema is present handle it as the Path that is defined; otherwise, delegate to the super method and handle as a Path-*Stream

I think that we are mixing here two different questions: 1) how line-endings in tests will be handled to be both Windows/Linux compatible and avoid check-in (this topic); 2) if we really need #34, and how we can provide a proper way to pass non-Path URIs in a proper way. For the later, I recommend other issue for opening discussion about the better design or just discusss on the PR.

@cmnbroad
Copy link
Collaborator Author

cmnbroad commented Oct 31, 2018

Yes, please, lets keep the discussion around #34 in that PR, rather than here. The issues here are independent of it; I only mentioned it here because its what brought these issues to my attention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants