UHS4 dbGaP submission #11

Open
4 tasks
jaamarks opened this issue Jan 16, 2020 · 3 comments

Comments

@jaamarks

jaamarks commented Jan 16, 2020

Dear Eric,
Thank you for your resubmission of subject consent and subject phenotype files. I checked your phenotype files and our QC test has identified:
dbGaP_phenotypeDS_20191211:
ERROR: 5 subjects have SEX conflicts submitted in version 1 and version 2 of the phs454.
        Subject 'HHG1973' (dbGaP ID 575872): Sex = 1 in version 1, Sex = 2 in version 2;
        Subject 'HHG4751' (dbGaP ID 577134): Sex = 1 in version 1, Sex = 2 in version 2;
        Subject 'HHG6687' (dbGaP ID 578027): Sex = 1 in version 1, Sex = 2 in version 2;
        Subject 'HHG7140' (dbGaP ID 578252): Sex = 2 in version 1, Sex = 1 in version 2;
        Subject 'HHG7939' (dbGaP ID 578591): Sex = 1 in version 1, Sex = 2 in version 2.
Please check the sex of these subjects in the subject phenotype data set [dbGaP_phenotypeDS_20191211] and let me know if you intentionally changed the sex of these subjects between version 1 and version 2 of the phs452.
Please reply as soon as possible.
Please keep in mind that, because according to your pedigree file your subjects have no relationships, we removed your pedigree file from your study processing.
Please find your study status at https://www.ncbi.nlm.nih.gov/gap/study/status/6601.
Regards,
Natasha

Version 002 of phenotype files

dbGaP_phenotypeDD_20191211.xlsx
dbGaP_phenotypeDS_20191211.txt





Issues

  • Some entries are blank. They need to be filled in with the missing-data code -99999.
  • The missing-data codes need to be made consistent. Some entries use values other than -99999, such as -8, -2, and -99.
  • 5 sex-discrepant samples (see the quoted message at the top). Note that 1=male and 2=female.
  • Some fields are not filled in even though we do have data for them, either in the master phenotype file (s3://rti-heroin/hiv_all_merged_with_uhs_all_phenotype_data_08282017.csv.gz), in the GWAS phenotype, or in the observed genotype data.
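
The first two items amount to one normalization pass over the data set. Below is a minimal sketch, assuming the file is tab-delimited with a header row and that -8, -2, and -99 are the only non-standard missing codes (this set should be confirmed against the data dictionary, and any column where one of these is a legitimate value would have to be excluded). The names `normalize_missing` and `ALT_MISSING` and the sample row are made up for illustration.

```python
import csv
import io

MISSING = "-99999"
# Assumption: the non-standard missing codes seen so far. Confirm the full set
# against the data dictionary before running this on the real file.
ALT_MISSING = {"-8", "-2", "-99"}


def normalize_missing(rows, id_column="subjid"):
    """Return copies of the rows with blanks and alternate codes set to MISSING."""
    cleaned_rows = []
    for row in rows:
        if None in row:
            raise ValueError("row has more fields than the header")
        cleaned = {}
        for column, value in row.items():
            value = (value or "").strip()
            if column != id_column and (value == "" or value in ALT_MISSING):
                value = MISSING
            cleaned[column] = value
        cleaned_rows.append(cleaned)
    return cleaned_rows


# Made-up two-line file: one subject with a blank, a -8, and a -2.
sample = "subjid\tage\tstis\tmj_30d\tgwassex\nS1\t43\t\t-8\t-2\n"
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
normalized = normalize_missing(reader)
print(normalized)
```

This only standardizes the missing codes. Fields that are blank but have real values in the master phenotype file, the GWAS phenotype, or the genotype data still need to be filled in from those sources first.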
@jaamarks jaamarks reopened this Jan 16, 2020
@jaamarks

jaamarks commented Jan 16, 2020

5 Sex Discrepant Samples

dbGaP_phenotypeDS_03292013.txt

This sheet was found at:
//RTPNFIL02/eojohnson/HIV/hiv_data/UHS/UHS dbGaP submission/FinalPhenotypeFiles_dbGaP/Updated_030413/Updated_032913/dbGaP_phenotypeDS_03292013.txt

Previous Submission:

subjid sampid consent affection_status hivstat ViralLoad_cperml ViralLoad_Log10 sex gwassex race age age_cat year year_cat lca_group mpartners stis anal needleshare sexwork site cluster final_analysis
HHG4751 328952 1 1 1 35000 4.54 1 1 1 44 1 1998 2 3 0 1 0 0 0 2 271 1
HHG6687 407014 1 2 0 -99999 -99999 1 -2 2 37 0 1994 1 1 0 -99999 0 -99999 1 2 -99999 1
HHG7140 513323 1 1 1 1060 3.03 2 2 2 42 1 1992 1 1 0 0 0 -99999 1 2 220 1
HHG7939 448818 1 2 0 -99999 -99999 1 -2 2 38 1 1997 2 2 0 0 0 0 0 2 -99999 1
  • HHG1973 is not in this original submission.
  • 1=male, 2=female
  • gwassex is -2 for HHG6687 and HHG7939. This code is not defined in the dictionary file //RTPNFIL02/eojohnson/HIV/hiv_data/UHS/UHS dbGaP submission/FinalPhenotypeFiles_dbGaP/Updated_030413/dbGaP_phenotypeDD_02072013.txt





2020 Submission

subjid hivstat age_hiv viralload_cperml viralload_log10 sex_selfreport gwassex ancestry_selfreport age surveyyear lca_group mpartners stis anal needleshare sexwork site heroin_ever heroin_ever_inj opioid_ever opioid_ever_inj totopioid_ever heroin_age_ons opioid_age_ons totopioid_age_ons heroin_inj_30d heroin_non_30d opioid_inj_30d opioid_non_30d totopioid_inj_30d totopioid_non_30d totopioid_tot_30d heroin_case opioid_case totopioid_case inj_ever inj_freq inj_case inj_age_ons coc_ever coc_ever_inj amphet_ever amphet_ever_inj sed_ever sed_ever_inj mj_ever coc_age_ons amphet_age_ons sed_age_ons mj_age_ons cocaine_inj_30d cocaine_non_30d totcoc_30d amphet_inj_30d amphet_non_30d totamphet_30d mj_30d totcoc_case totamphet_case mj_case
HHG4751 1 43 35000 5 2 1 1 43 1998 3 0 1 0 0 0 1 1 1 1 1 1 19 20 19 49 0 0   49 0 49 1 0 1 1 57 1 19 1 1 1 1 1 -8 1 19 -99999 -99999 -99999 0 0 0 8 3 11   0 1
HHG6687 0 -99999 -99999 -99999 2 -99999 2 36 1994 1 0 -99999 0 -99999 1 1 1 1 -8 -8 -8 -8 -8 -8 30 15 0 0 30 15 45 1 0 1 1 30 1   1 -8 -8 -8 -8 -8 1 -99999 -99999 -99999 14 0 45 45 0 0 0 1 1 0 0
HHG7140 1 45 1060 3 1 2 2 45 1992 1 0 0 0 -99999 1 1 1 1 1 1 1 21   21 75 0 0   75 0 75 1 0 1 1 125 1 21 1 1 1 1 -8 -8 -99999 30 -99999 -99999 -99999 0 30 30 50 0 50   1 1
HHG7939 0 -99999 -99999 -99999 2 -99999 2 36 1997 2 0 0 0 0 0 1 1 1 -8 -8 -8 -8 -8 -8 120 30 0 0 120 30 150 1 0 1 1 120 1   1 -8 -8 -8 -8 -8 1 -99999 -99999 -99999 12 0 20 20 0 0 0 99 1 0 1
HHG1973 0 -99999 -99999 -99999 2 -99999 2 41 1993 1 1 1 0 0 1 2 1 1 0 0 1 31   31 24 0 0 0 24 0 24 1 0 1 1 24 1 32 1 0 0 0 0 0 1 33     28 0 27 27 0 0 0   1 0





In Latest GWAS

HHG4751 328952@1054755694
HHG6687 407014@1054753316
HHG7140 513323@1054752755
HHG7939 448818@1054697882
HHG1973 942966@1054697675

iid hiv age gender PC1 PC9 PC10
328952@1054755694_328952@1054755694 1 43 1 -0.002 0.0132 5.00E-04
407014@1054753316_407014@1054753316 0 36 1 0.0111 0.0031 0.001
513323@1054752755_513323@1054752755 1 45 0 -0.011 0.0189 0.006
448818@1054697882_448818@1054697882 0 36 1 0.0079 -0.0096 -0.0077
942966@1054697675_942966@1054697675 0 41 1 0.0048 0.0019 -5.00E-04
  • 0=female, 1=male
  • All five are in the AA GWAS.

@jaamarks

Summary of UHS sex issues

20200122_uhs_problematic_sex_n34.xlsx
