H1-hESC cooler based on combination of 2 replicates R1-R2 is actually just R2, only hg38 #4

Open
sergpolly opened this issue Nov 30, 2018 · 2 comments


sergpolly commented Nov 30, 2018

Just want to bring this up here for those who are using these coolers and/or the dot-calls based on them, before it causes more confusion ...
@nkrietenstein @betulakgol @itsameercat @demerson368 please let anyone relevant know about this as well.

I'll reference @burakalver as well - and yes, we need to redo the dot-calls and such using 4DN-mapped coolers, and probably stick to 4DN-processed data going forward - at least for 4DN-related purposes.

The title says it all: the cooler files that we generated for H1-hESC using the hg38 assembly were not combined properly, such that the R1-R2 cooler is actually identical to R2 ...
Just check the screenshot with the cooler info output for the different libraries/combinations mapped to hg19/hg38:
screenshot from 2018-11-30 10-13-56
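
The comparison in the screenshot could also be scripted. Below is a minimal sketch, assuming the total contact count of each cooler is available as an integer (for example from the `sum` field of cooler info, if that field is present in these files). The function name `classify_merge` and the file names in the usage comment are hypothetical, not something from our pipeline.

```shell
# Sketch: decide whether a merged cooler's total looks like R1 + R2,
# or suspiciously like a single replicate.
# Assumption: all three arguments are plain integers (total contact counts).

# classify_merge R1_SUM R2_SUM MERGED_SUM
# Prints one of: ok, same-as-R2, same-as-R1, unexpected
classify_merge() {
    local r1=$1 r2=$2 merged=$3
    if [ "$merged" -eq $((r1 + r2)) ]; then
        echo "ok"
    elif [ "$merged" -eq "$r2" ]; then
        echo "same-as-R2"
    elif [ "$merged" -eq "$r1" ]; then
        echo "same-as-R1"
    else
        echo "unexpected"
    fi
}

# Example usage (hypothetical file names; requires the cooler CLI):
#   r1=$(cooler info -f sum R1_hg38.cool)
#   r2=$(cooler info -f sum R2_hg38.cool)
#   m=$(cooler info -f sum R1-R2_hg38.cool)
#   classify_merge "$r1" "$r2" "$m"
```

For the broken hg38 combination described here, a check like this would be expected to report same-as-R2 rather than ok.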

Here is the screenshot from the 4DN portal - everything is just fine there - the proper 2.5 billion reads for the R1-R2 combination:
screenshot from 2018-11-30 10-29-58

I'll of course address that issue, but only after the 4DN meeting ...

@sergpolly

confirmation about the # of dot-calls:

sv49w@ghpcc06 ➜  2018-09-09-Snake-dotcalling-test for f in U54-ESC4DN-FA-DpnII-2017524-R*/combineddots/*.bedpe; do dirname $(dirname $f); echo "# of dots called:  " $(wc -l $f|cut -f1 -d' '); done
U54-ESC4DN-FA-DpnII-2017524-R1_hg19
# of dots called:   3599
U54-ESC4DN-FA-DpnII-2017524-R1_hg38
# of dots called:   3638
U54-ESC4DN-FA-DpnII-2017524-R1-R2_hg19
# of dots called:   5804
U54-ESC4DN-FA-DpnII-2017524-R1-R2__hg38
# of dots called:   3019
U54-ESC4DN-FA-DpnII-2017524-R2_hg19
# of dots called:   2940
U54-ESC4DN-FA-DpnII-2017524-R2__hg38
# of dots called:   3019
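
The identical count of 3019 for R1-R2__hg38 and R2__hg38 already hints that both runs saw the same input. A stricter check is to compare the two bedpe files directly. A minimal sketch, assuming the dot-caller output is deterministic so that identical input gives byte-identical files; the helper names `count_dots` and `same_calls` and the file name in the usage comment are hypothetical.

```shell
# count_dots FILE
# Number of lines in FILE, same quantity as in the loop above.
count_dots() {
    wc -l < "$1" | tr -d ' '
}

# same_calls A.bedpe B.bedpe
# Exit status 0 when the two files are byte-identical, non-zero otherwise.
same_calls() {
    cmp -s "$1" "$2"
}

# Example usage (hypothetical file name dots.bedpe inside combineddots/):
#   a=U54-ESC4DN-FA-DpnII-2017524-R1-R2__hg38/combineddots/dots.bedpe
#   b=U54-ESC4DN-FA-DpnII-2017524-R2__hg38/combineddots/dots.bedpe
#   if same_calls "$a" "$b"; then
#       echo "R1-R2 and R2 dot-calls are identical"
#   fi
```

Equal line counts alone could be a coincidence, which is why the sketch compares file contents as well.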

@sergpolly

At the same time, the bedpe-s provided in this repo seem fine for H1-hESC - i.e. R1-R2 is indeed based on a combination of the 2 replicates. And they are hg38 as well ...

So the stuff that we sent to DCIC should be fine ...

That means that H1-hESC was remapped at some point on our side, and things went wrong when combining R1-R2.

@hakanozadam, were you mapping these? Could you please at least confirm that the H1-hESC-s were remapped a couple of times, to clear up the confusion a little bit?

  • Right now in the U54-Deep folder, R1-R2__hg38 is the wrong one (just R2 really).
  • So I'm confused about how we ended up with properly combined H1-hESC data on hg38 in this 4dn_jawg repo ...
