forked from MikheyevLab/varroa-denovo-genomes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathvarroa-denovo-genomes.Rmd
1367 lines (1086 loc) · 67.1 KB
/
varroa-denovo-genomes.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "Divergent selection following speciation in two ectoparasitic
honey bee mites"
author: "Online supplementary data and scripts"
date: "2019-06-07"
output:
html_document:
toc: true
toc_float: true # make
depth: 3
number_sections: false
theme: flatly
code_folding: hide #
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
This document was written in R Markdown, and translated into .html using the R package `knitr`. Press the buttons labeled **Code** to show the R code.
```{r message=FALSE, warning=FALSE, results="hide"}
library("tidyverse")
library("knitr") # for the markdown
library("kableExtra") # for creating a scrolling table
library("ggplot2") # for plotting
library("ape") # for reading the phylogenetic tree
library("Biostrings")
library("ggtree") # for plotting the tree
library("ggrepel") # for spreading text labels on the plot
library("scales") # for axis labels notation
library("tidyverse")
library("GO.db")
library("WGCNA")
library("reshape2")
library("RSQLite")
library("AnnotationDbi")
library("GSEABase")
library("GOstats")
library("maps") # for the map background
library("leaflet") #for the interactive maps
library("htmltools")
library("rgdal")
library("grid")
library("gridExtra")
library("GeneOverlap")
```
**Background:** Varroa mites are specialist ectoparasites of _Apis_ honey bees, distributed originally throughout Asia. Within the cryptic genus, two sister species <span style="color:red"> _Varroa destructor_ </span> and <span style="color:blue"> _Varroa jacobsoni_ </span> have drawn attention as they both independently and repeatedly switched host from the Eastern honey bee _Apis cerana_ to the Western honey bee _A. mellifera_. While _A. cerana_ tolerates mite's presence, the arrival of _V. destructor_ and its associated viruses has been driving _A. mellifera_ colony collapse in many parts of the world.
These two mites are morphologically similar, occupy the same ecological niche as brood parasite and present identical parasite life cycle, which is synchronized to their common hosts. Yet, it is unclear whether _V. destructor_ and _V. jacobsoni_ evolved toward similar or divergent adaptations to their hosts.
We expected that the coevolutive arm-race between the mite-parasite and its bee-host would leave genomic footprints through positively selected genes (dN/dS) in Varroa genomes.
# 1| Reported distribution of Varroa mites on the Eastern honey bee
Aside from genetic barcoding, differentiating <span style="color:red"> _V. destructor_ </span> and <span style="color:blue"> _V. jacobsoni_ </span> on the field requires a taxonomical expert eye or meticulous morphometrical measurements. To better understand their distribution on the same original host, we reviewed the reported presence of each of this mite on _A. cerana_ by collecting data from both literature and NCBI confirming its presence by mtDNA barcoding.
The pattern appearing from the points distribution confirm that these sister species are parapatric and that it becomes possible to assume one species presence also regarding their geographical distribution.
``` {r load worldlayer, warning=FALSE, results="hide", cache = TRUE}
# Download the country borders layer
#download.file("http://thematicmapping.org/downloads/TM_WORLD_BORDERS_SIMPL-0.3.zip" , destfile="world_shape_file.zip")
#system("unzip world_shape_file.zip")
world_spdf=readOGR(dsn= getwd() , layer="TM_WORLD_BORDERS_SIMPL-0.3")
```
The following interactive map focuses on Asia where Varroa mites occur naturally on their original host _A. cerana_ : it is possible to zoom in and out, and passing the cursor on one point give basic information on the year and reporter.
On the top right corner of the map, a control button allow to add or remove layers containing mtDNA haplogroup information for each Varroa species.
Color dots represent species distribution as:
<span style="color:red"> _Varroa destructor_ </span>
<span style="color:blue"> _Varroa jacobsoni_ </span>
<span style="color:green"> _Varroa underwoodi_ </span>
<span style="color:black"> _Varroa sp._ </span>
``` {r leaflet map, warning=FALSE, cache = TRUE}
#Data with exact or approximate GPS coordinates obtained from references
coordvarroa <- read.csv("data/Map/mappingpoints_Varroacerana2019.csv", header = TRUE)
coord.vdac <- coordvarroa %>% filter(Species == "Vdestructor") %>% filter(Host == "Acerana")
coord.vjac <- coordvarroa %>% filter(Species == "Vjacobsoni") %>% filter(Host == "Acerana")
### Subset the data for only Varroa sp. on the Eastern honey bee
coord.vspac <- coordvarroa %>% filter(Species == "Vsp.") %>% filter(Host == "Acerana")
coord.vunac <- coordvarroa %>% filter(Species == "Vunderwoodi") %>% filter(Host == "Acerana")
## Distribution of Varroa destructor mtDNA lineages on Apis cerana
vdac.K <- coord.vdac[coord.vdac$Haplogroup =="Korea",]
vdac.J <- coord.vdac[coord.vdac$Haplogroup =="Japan",]
vdac.KJ <- coord.vdac[coord.vdac$Haplogroup =="Korea & Japan",]
vdac.V <- coord.vdac[coord.vdac$Haplogroup =="Vietnam",]
vdac.KV <- coord.vdac[coord.vdac$Haplogroup =="Korea & Vietnam",]
vdac.C1 <- coord.vdac[coord.vdac$Haplogroup =="China C1",]
vdac.C2 <- coord.vdac[coord.vdac$Haplogroup =="China C2",]
vdac.C2C3 <- coord.vdac[coord.vdac$Haplogroup =="China C2 & C3",]
vdac.N <- coord.vdac[coord.vdac$Haplogroup =="Nepal",]
vjac.Mal <- coord.vjac[coord.vjac$Haplogroup =="Malaysia",]
vjac.Jav <- coord.vjac[coord.vjac$Haplogroup =="Java",]
vjac.Amb <- coord.vjac[coord.vjac$Haplogroup =="Ambon",]
vjac.Lom <- coord.vjac[coord.vjac$Haplogroup =="Lombok",]
vjac.Bal <- coord.vjac[coord.vjac$Haplogroup =="Bali",]
vjac.Sbw <- coord.vjac[coord.vjac$Haplogroup =="Sumbawa",]
vjac.Sum <- coord.vjac[coord.vjac$Haplogroup =="Sumatra",]
vjac.Flo <- coord.vjac[coord.vjac$Haplogroup =="Flores",]
vjac.Sam <- coord.vjac[coord.vjac$Haplogroup =="Samui",]
vjac.Bor <- coord.vjac[coord.vjac$Haplogroup =="Borneo",]
vjac.NT <- coord.vjac[coord.vjac$Haplogroup =="NorthThai",]
vjac.NTM <- coord.vjac[coord.vjac$Haplogroup =="Malaysia & NorthThai",]
# Prepare the map title
tag.map.title2 <- tags$style(HTML("
.leaflet-control.map-title2 {
transform: translate(-50%,20%);
text-align: center;
padding-left: 10px;
padding-right: 10px;
background: rgba(255,255,255,0.75);
font-weight: bold;
font-size: 20px;
}
"))
title <- tags$div(
tag.map.title2, HTML("Varroa mites on Apis cerana")
)
leaflet(coord.vdac) %>%
addTiles(group = "OSM (default)") %>%
## add the layer other than default we would like to use for background
addProviderTiles(providers$CartoDB.PositronNoLabels, group = "Positron NoLabels") %>%
## VARROA DESTRUCTOR LAYERS
addCircleMarkers(data = coord.vdac, coord.vdac$coord.Y, coord.vdac$coord.X,
weight = 0.5,
col = "red",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor') %>%
## adding each lineage group as a layer for V. destructor on cerana
addCircleMarkers(data = vdac.K, vdac.K$coord.Y, vdac.K$coord.X,
weight = 0.5,
col = "#FB0000",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor K1 lineage') %>%
addCircleMarkers(data = vdac.J, vdac.J$coord.Y, vdac.J$coord.X,
weight = 0.5,
col = "#9707E7",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor J1 lineage') %>%
addCircleMarkers(data = vdac.KJ, vdac.KJ$coord.Y, vdac.KJ$coord.X,
weight = 0.5,
col = "#9005DC",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'K1 & J1 co-infection') %>%
addCircleMarkers(data = vdac.V, vdac.V$coord.Y, vdac.V$coord.X,
weight = 0.5,
col = "#FFAA00",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor V1 lineage') %>%
addCircleMarkers(data = vdac.C1, vdac.C1$coord.Y, vdac.C1$coord.X,
weight = 0.5,
col = "#920053",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor C1 lineage') %>%
addCircleMarkers(data = vdac.C2, vdac.C2$coord.Y, vdac.C2$coord.X,
weight = 0.5,
col = "#B53A80",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor C2 lineage') %>%
addCircleMarkers(data = vdac.C2C3, vdac.C2C3$coord.Y, vdac.C2C3$coord.X,
weight = 0.5,
col = "#CE97B6",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'C2 & C3 co-infection') %>%
addCircleMarkers(data = vdac.N, vdac.N$coord.Y, vdac.N$coord.X,
weight = 0.5,
col = "#4A4A4A",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. destructor? Nepal lineage') %>%
addCircleMarkers(data = vdac.KV, vdac.KV$coord.Y, vdac.KV$coord.X,
weight = 0.5,
col = "#FFDD00",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'K1 & V1 co-infection') %>%
## VARROA JACOBSONI
addCircleMarkers(data = coord.vjac, coord.vjac$coord.Y, coord.vjac$coord.X,
weight = 0.5,
col = "blue",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni') %>%
## adding each lineage group as a layer for V. jacobsoni on cerana
addCircleMarkers(data = vjac.NT, vjac.NT$coord.Y, vjac.NT$coord.X,
weight = 0.5,
col = "#5A0DC1",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni NorthThai') %>%
addCircleMarkers(data = vjac.Mal, vjac.Mal$coord.Y, vjac.Mal$coord.X,
weight = 0.5,
col = "#0F4ABF",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Malaysia') %>%
addCircleMarkers(data = vjac.Sam, vjac.Sam$coord.Y, vjac.Sam$coord.X,
weight = 0.5,
col = "#5A7BBC",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Samui') %>%
addCircleMarkers(data = vjac.NTM, vjac.NTM$coord.Y, vjac.NTM$coord.X,
weight = 0.5,
col = "#137575",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Malaysia & NorthThai') %>%
addCircleMarkers(data = vjac.Bor, vjac.Bor$coord.Y, vjac.Bor$coord.X,
weight = 0.5,
col = "#6AB28D",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Borneo') %>%
addCircleMarkers(data = vjac.Jav, vjac.Jav$coord.Y, vjac.Jav$coord.X,
weight = 0.5,
col = "#0F933C",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Java') %>%
addCircleMarkers(data = vjac.Sum, vjac.Sum$coord.Y, vjac.Sum$coord.X,
weight = 0.5,
col = "#A4CE01",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Sumatra') %>%
addCircleMarkers(data = vjac.Bal, vjac.Bal$coord.Y, vjac.Bal$coord.X,
weight = 0.5,
col = "#D6D601",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Bali') %>%
addCircleMarkers(data = vjac.Lom, vjac.Lom$coord.Y, vjac.Lom$coord.X,
weight = 0.5,
col = "#FFFC56",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Lombok') %>%
addCircleMarkers(data = vjac.Sbw, vjac.Sbw$coord.Y, vjac.Sbw$coord.X,
weight = 0.5,
col = "#FFDD00",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Sumbawa') %>%
addCircleMarkers(data = vjac.Flo, vjac.Flo$coord.Y, vjac.Flo$coord.X,
weight = 0.5,
col = "#FFE9A7",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Flores') %>%
addCircleMarkers(data = vjac.Amb, vjac.Amb$coord.Y, vjac.Amb$coord.X,
weight = 0.5,
col = "#3A3831",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'V. jacobsoni Ambon') %>%
## adding Varroa sp. layer
addCircleMarkers(data = coord.vspac, coord.vspac$coord.Y, coord.vspac$coord.X,
weight = 0.5,
col = "black",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'Varroa sp. Luzon lineage') %>%
addCircleMarkers(data = coord.vunac, coord.vunac$coord.Y, coord.vunac$coord.X,
weight = 0.5,
col = "green",
radius = 4,
fillOpacity = 0.9,
stroke = T,
label = ~as.character(Description),
group = 'Varroa underwoodi') %>%
addLayersControl(position = "topright",
baseGroups = c("OSM (default)", "Positron NoLabels"),
overlayGroups = c("V. destructor",
"V. jacobsoni",
"Varroa sp. Luzon lineage",
"Varroa underwoodi",
"V. destructor K1 lineage",
"V. destructor J1 lineage",
"K1 & J1 co-infection",
"V. destructor V1 lineage",
"K1 & V1 co-infection",
"V. destructor C1 lineage",
"V. destructor C2 lineage",
"C2 & C3 co-infection",
"V. destructor? Nepal lineage",
"V. jacobsoni NorthThai",
"V. jacobsoni Malaysia",
"V. jacobsoni Malaysia & NorthThai",
"V. jacobsoni Samui", "V. jacobsoni Sumatra",
"V. jacobsoni Borneo",
"V. jacobsoni Java",
"V. jacobsoni Bali",
"V. jacobsoni Lombok",
"V. jacobsoni Sumbawa",
"V. jacobsoni Flores",
"V. jacobsoni Ambon"),
options = layersControlOptions(collapsed = TRUE)) %>%
## adding a title for the map
addControl(title, position = "bottomright", className="map-title2") %>%
## show the positron background prerably to the OSM layer
showGroup("Positron NoLabels") %>%
hideGroup("V. destructor K1 lineage") %>%
hideGroup("V. destructor J1 lineage") %>%
hideGroup("K1 & J1 co-infection") %>%
hideGroup("V. destructor V1 lineage") %>%
hideGroup("K1 & V1 co-infection") %>%
hideGroup("V. destructor C1 lineage") %>%
hideGroup("V. destructor C2 lineage") %>%
hideGroup("C2 & C3 co-infection") %>%
hideGroup("V. destructor? Nepal lineage") %>%
hideGroup("V. jacobsoni NorthThai") %>%
hideGroup("V. jacobsoni Malaysia") %>%
hideGroup("V. jacobsoni Malaysia & NorthThai") %>%
hideGroup("V. jacobsoni Samui") %>%
hideGroup("V. jacobsoni Sumatra") %>%
hideGroup("V. jacobsoni Borneo") %>%
hideGroup("V. jacobsoni Java") %>%
hideGroup("V. jacobsoni Bali") %>%
hideGroup("V. jacobsoni Lombok") %>%
hideGroup("V. jacobsoni Sumbawa") %>%
hideGroup("V. jacobsoni Flores") %>%
hideGroup("V. jacobsoni Ambon")
```
# 2| Annotated orthogroups for Varroa mites and Acari
## Details of the orthogroups defined by ORTHONOME
<span style="color:red"> _Varroa destructor_ </span> [(GCA_002443255.1)](https://www.ncbi.nlm.nih.gov/assembly/GCF_002443255.1/) and <span style="color:blue"> _Varroa jacobsoni_ </span> [(GCA_002532875.1)](https://www.ncbi.nlm.nih.gov/assembly/GCF_002532875.1/) genomes were compared with those of four other Acari species:
- the non-Varroid honey bee ectoparasite _Tropilaelaps mercedesae_ [(GCA_002081605.1)](https://www.ncbi.nlm.nih.gov/assembly/GCA_002081605.1/);
**Dong et al,. GigaScience (2017), 6(3), 1-17**
- the ectoparasitic tick _Ixodes scapularis_ [(GCA_000208615.1)](https://www.ncbi.nlm.nih.gov/assembly/GCF_000208615.1/);
- the free-living mite _Metaseiulus occidentalis_ [(GCA_000255335.1)](https://www.ncbi.nlm.nih.gov/assembly/GCF_000255335.1/);
**Hoy et al., Genome Biology and Evolution (2016), 8(6), 1762-1775**
- the free-living mite _Tetranychus urticae_ [(GCA_000239435.1)](https://www.ncbi.nlm.nih.gov/assembly/GCF_000239435.1/)
**Grbicet al,. Nature (2011), 479(7374), 487-492**
``` {r orthogroup plot, cache = TRUE}
tableS1 <- read.csv("data/Orthogroups/tableS01-orthogroups.csv")
g <- ggplot(tableS1, aes(x=shared))
g <- g + geom_bar(color="black", fill="gray")
g <- g + labs(title="Shared orthogroups by the six species",x="Number of species", y = "Orthrogroups")
g <- g + geom_text(stat='count', aes(label=..count..), vjust=-1)
g <- g + theme_classic()
g <- g + ylim(0,5500)
g
```
Below is the summary output of orthologs genes (with NCBI gene ID) among the six Acari species as obtained using the [ORTHONOME](www.orthonome.com) pipeline.
SOG = super orthologue group and OG = orthologue group.
``` {r orthogroup table, cache = TRUE}
kable(cbind(tableS1)) %>%
kable_styling(bootstrap_options = "striped", font_size = 10) %>%
scroll_box(width = "100%", height = "400px")
```
## Gene ontology terms associated with orthologs clusters
Regarding the Acari phylogenetic tree (Figure 3), ortholog genes specific to each lineage were enriched using the [goatools pipeline](https://github.com/tanghaibao/goatools).
``` {r phylotree, cache = TRUE}
acariphylo <- read.tree("data/Orthogroups/tree_rooted.nwk")
ggtree(acariphylo) +
geom_tiplab() +
geom_nodepoint()+
geom_hilight(node=11, fill="purple", alpha = 0.1) +
geom_cladelabel(node=11, label="Varroidae", color="purple2", offset=.20, align=TRUE)+
geom_hilight(node=10, fill="purple", alpha = 0.1) +
geom_hilight(node=9, fill="purple", alpha = 0.1) +
geom_hilight(node=8, fill="purple", alpha = 0.1) +
theme_tree2() +
xlim(0,1.1)+
theme_tree()
```
Below, are the details for these orthology clusters shared by Varroa mites and _T. mercedesae_ (ASIAN-MITES); + _M. occidentalis_ (MESOSTIGMATA); + _I. scapularis_ (PARASITIFORMES); and + _T. urticae_ (ROOT).
Ontology BP = Biological Processes, CC = Cellular Components and MF = Molecular Function.
``` {r orthogroup GOterms, cache = TRUE}
tableS2 <- read.csv("data/Orthogroups/tableS02-goshared.csv")
kable(cbind(tableS2)) %>%
kable_styling(bootstrap_options = "striped", font_size = 10) %>%
scroll_box(width = "100%", height = "400px")
```
# 3| Genes detected under positive selection
`HyPhy` and `aBSREL` method were used to test for each branch of the Acari phylogeny, whether a proportion of sites have evolved under positive selection. Raw results for genes detected under positive selection after FDR correction (p_Holm) are presented on the following table with orthogroups ID.
Vjac = _V. jacobsoni_, Vdes = _V. destructor_, Tmer = _T. mercedesae_, Mocc = _M. occidentalis_, Isca = _I. scapularis_, Turt = _T. urticae_
``` {r hyPhy GOterms, cache = TRUE}
tableHyPhy <- read.csv("data/Positive selection/VarroaAbsrelResults.csv")
kable(cbind(tableHyPhy)) %>%
kable_styling(bootstrap_options = "striped", font_size = 10) %>%
scroll_box(width = "100%", height = "400px")
```
## Chromosomal localisation and Varroa karyogram
Here we created a karyogram of the seven major chromosomes assembled in the <span style="color:red"> _Varroa destructor_ </span> reference genome and plot the location of the genes detected under positive selection. The localization and description of each gene were summarized in separate species files after searching for their ID in the NCBI database and the .gff annotation file.
The following code generates a .pdf file with the full karyogram, but it is necessary to look at it with high resolution to correctly see the bands. For illustration purpose, only a snapshot in a .tiff file is presented as a preview.
The karyogram plot was inspired by the script detailed [here](https://www.biostars.org/p/269857/).
``` {r karyogram positive sel, message=FALSE, warning=FALSE, results="hide", cache = TRUE}
# chr1 is BEIS01000001.1 or NW_019211454.1
# chr2 is BEIS01000002.1 or NW_019211455.1
# chr3 is BEIS01000003.1 or NW_019211456.1
# chr4 is BEIS01000004.1 or NW_019211457.1
# chr5 is BEIS01000005.1 or NW_019211458.1
# chr6 is BEIS01000006.1 or NW_019211459.1
# chr7 is BEIS01000007.1 or NW_019211460.1
# Varroa destructor chromosome sizes
chrom_sizes <- structure(list(V1 = c("chr1", "chr2", "chr3", "chr4",
"chr5", "chr6", "chr7"), V2 = c(76898487L, 60562263L, 58536683L, 52889743L, 41990949L, 32556156L, 39399627L)), .Names = c("V1",
"V2"), class = "data.frame", row.names = c(NA, -7L))
colnames(chrom_sizes) <- c("chromosome", "size")
## Read files with information on chromosome location (for illustration purpose genes span was 5000bp from the starting point to be visible on the karyogram)
regionsvd <- read.csv("data/Positive selection/Vd-location-17072018-short.csv")
regionsvj <- read.csv("data/Positive selection/Vj-location-17072018-short.csv")
common <- read.csv("data/Positive selection/Varroa-ancestor-20072018-short.csv")
chrom_order <- c("chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7")
chrom_key <- setNames(object = as.character(c(1, 2, 3, 4, 5, 6, 7)), nm = chrom_order)
chrom_order <- factor(x = chrom_order, levels = rev(chrom_order))
# convert the chromosome column in each dataset to the ordered factor
chrom_sizes[["chromosome"]] <- factor(x = chrom_sizes[["chromosome"]], levels = chrom_order)
#pdf(file = "data/Positive selection/VDVJ-POSSEL.pdf", width = 20, height = 12)
POSEL <- ggplot(data = chrom_sizes)
# base rectangles for the chroms, with numeric value for each chrom on the x-axis
POSEL <- POSEL + geom_rect(aes(xmin = as.numeric(chromosome) - 0.3,
xmax = as.numeric(chromosome) + 0.3,
ymax = size, ymin = 0),
colour="black", fill = "white")
# rotate the plot 90 degrees
POSEL <- POSEL + coord_flip()
# give the appearance of a discrete axis with chrom labels
POSEL <- POSEL + scale_x_discrete(name = "chromosome", limits = names(chrom_key))
# add bands for regions for VD
POSEL <- POSEL + geom_rect(data = regionsvd, aes(xmin = as.numeric(chromosome) - 0.0,
xmax = as.numeric(chromosome) + 0.3,
ymax = end5000bp, ymin = start, fill = "V. destructor"))
# add bands for regions for both
POSEL <- POSEL + geom_rect(data = common, aes(xmin = as.numeric(chromosome) - 0.3,
xmax = as.numeric(chromosome) + 0.3,
ymax = end5000bp, ymin = start, fill = c("Common to both Varroa")))
# add bands for regions for VJ
POSEL <- POSEL + geom_rect(data = regionsvj, aes(xmin = as.numeric(chromosome) - 0.3,
xmax = as.numeric(chromosome) + 0.0,
ymax = end5000bp, ymin = start, fill = "V. jacobsoni"))
POSEL <- POSEL + scale_fill_manual(values = c( "black","red2", "blue"))
POSEL <- POSEL + ggtitle("Genes detected under positive selection in Varroa mite lineages")
# supress scientific notation on the y-axis
POSEL <- POSEL + scale_y_continuous(breaks = c(0e+00, 1e+07, 2e+07,3e+07,4e+07,5e+07,6e+07,7e+07), labels = comma) +
xlab("Chromosomes")+
ylab("region (bp)")
# black & white color theme
POSEL <- POSEL + theme(panel.grid.major = element_line(colour = 'gray70', size=0.2, linetype = "dashed"),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(colour = "black", size = 16),
axis.title.x = element_text(colour = "black", size = 20, face = "bold"),
axis.text.y = element_text(colour = "black", size = 20),
axis.title.y = element_text(colour = "black", size = 20, face = "bold"),
plot.title = element_text(color = "black", size = 24, face = "bold"),
legend.text=element_text(size=24, face = "bold.italic"),
legend.position=c(0.85, 0.80),
legend.title = element_blank())
#POSEL
#dev.off()
```
The converted .tiff file obtained from the .pdf is presented here as an online preview. The figure 4 has been modified and improved from this base using Adobe Illustrator.
```{r snapshot plot1, out.width = "100%", cache = TRUE}
knitr::include_graphics("data/Positive selection/VDVJ-POSSEL.png")
```
## Genes under positive selection between the split of Varroa species
The following table list all *40 genes under positive selection* shared only by both Varroa mite genomes before their split. Possible genes implicated in tolerance to external stressors or stimuli (Table 2) is indicated in black.
``` {r varroa possel, cache = TRUE}
tableS3 <- read.csv("data/Positive selection/tableS03-posselvarroa.csv")
kable(cbind(tableS3)) %>%
kable_styling(bootstrap_options = "striped", font_size = 9) %>%
row_spec(15, bold = T, color = "white", background = "black")
```
## Genes under positive selection specific to _V. destructor_
The following table list all *234 genes under positive selection* found only in <span style="color:red"> _Varroa destructor_ </span>. Possible genes implicated in tolerance to external stressors or stimuli (Table 2) is indicated in red (see keywords).
``` {r vdestructor possel, cache = TRUE}
tableS4 <- read.csv("data/Positive selection/tableS04-posvd.csv")
kable(cbind(tableS4)) %>%
kable_styling(bootstrap_options = "striped", font_size = 9) %>%
row_spec(5, bold = T, color = "white", background = "#D7261E") %>%
row_spec(33, bold = T, color = "white", background = "#D7261E") %>%
row_spec(46, bold = T, color = "white", background = "#D7261E") %>%
row_spec(53, bold = T, color = "white", background = "#D7261E") %>%
row_spec(81, bold = T, color = "white", background = "#D7261E") %>%
row_spec(93, bold = T, color = "white", background = "#D7261E") %>%
row_spec(110, bold = T, color = "white", background = "#D7261E") %>%
row_spec(143, bold = T, color = "white", background = "#D7261E") %>%
row_spec(146:148, bold = T, color = "white", background = "#D7261E") %>%
row_spec(170, bold = T, color = "white", background = "#D7261E") %>%
row_spec(172, bold = T, color = "white", background = "#D7261E") %>%
row_spec(186, bold = T, color = "white", background = "#D7261E") %>%
row_spec(208, bold = T, color = "white", background = "#D7261E") %>%
row_spec(229, bold = T, color = "white", background = "#D7261E")
```
## Genes under positive selection specific to _V. jacobsoni_
The following table list all *224 genes under positive selection* found only in <span style="color:blue"> _Varroa jacobsoni_ </span>. Possible genes implicated in tolerance to external stressors or stimuli (Table 2) is indicated in blue (see keywords).
``` {r vjacobsoni possel, cache = TRUE}
tableS5 <- read.csv("data/Positive selection/tableS05-posvj.csv")
kable(cbind(tableS5)) %>%
kable_styling(bootstrap_options = "striped", font_size = 9) %>%
row_spec(17, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(29, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(32, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(44, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(63, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(80, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(107, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(127, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(132, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(188, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(194, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(198, bold = T, color = "white", background = "#1A8CFF") %>%
row_spec(212, bold = T, color = "white", background = "#1A8CFF")
```
## Fisher's exact test on overlapping genes
We tested whether the set of genes under positive selection were significantly overlapping between _V. destructor_ and _V. jacobsoni_ using Fisher's exact test from the `GeneOverlap` package.
```{r overlap genepositive, cache = TRUE}
vd.list <- as.character(tableS4$pos.vd.gene.id)
vj.list <-as.character(tableS5$ortholog.vd) ## to check the intersect we need to look at the gene ID from Vdes_3.0 genome
gs.orthologs = 9628 #genes
##Testing overlap between two gene lists
go.obj <- newGeneOverlap(vd.list, vj.list, gs.orthologs)
go.obj <- testGeneOverlap(go.obj)
print(go.obj)
## if pvalue is under 0.05 then means overlap is highly significant
## if Odd ratio is equal or less than 1, there is no association between the two lists
## Jaccard index measures the similarity of the list between 0 and 1, no similarity
```
The test results show that the overlap is highly significant, considering the 9,628 orthologs genes by both mites and that both _V. destructor_ and _V. jacobsoni_ positively selected genes will overlap (n = 13). However, we checked the significativity of this test by additionally performing a Fisher's exact two-tailed test, which also revealed significant.
```{r fishertest, cache = TRUE}
fisher.test(matrix(c(9128, 221, 212, 13), nrow=2))
```
## GO Term enrichment of positively selected genes
We used the package `GOstats` to enrich genes under positive selection for each Varroa mite species and their set of shared genes. We used a hypergeometric test for GO Term association with the `hyperGTest` function to detect whether or not specific terms are significantly overrepresented.
Tests were done with different parameters for the ontology; with BP = Biological Processes, MF = Molecular Function, and CC = Cellular component.
``` {r GOterms possel, warning=FALSE, cache = TRUE}
#Preparing the data
Vd.data <-read.csv("data/Positive selection/vdesgoassoc.csv")
#dim(Vd.data)
#summary(Vd.data)
#Vd.data$Gene.id
Vd.melt <- Vd.data %>%
gather(gene.id, GO.ids, X:X.136) %>%
arrange(Gene.id)
Vd.melt2 <-Vd.melt[!(is.na(Vd.melt$GO.ids) | Vd.melt$GO.ids==""),]
#head(Vd.melt2)
#write.csv(Vd.melt2, file = "data/Positive selection/VdesGOready.csv")
## In the csv I removed the LOC before sequence number to avoid problems due to character and integerFALSE compatibility
##Preparing the GO frame
annot.vd <- read.csv("data/Positive selection/VdesGOready2.csv")
annot.vd2 <- annot.vd %>%
mutate(evidence = "IEA") %>%
dplyr::select(go_id = GO.ids, evidence, gene = Gene.id)
#head(annot.vd2)
goFrame.vd <-GOFrame(annot.vd2, organism = "Vd")
goAllFrame.vd <-GOAllFrame(goFrame.vd)
gsc.vd <-GeneSetCollection(goAllFrame.vd, setType = GOCollection())
universe.vd <- unique(annot.vd2$gene)
#head(universe.vd)
genesposel.vd <- read.csv("data/Positive selection/Vdesselected1511.csv")
genes.vd <- unique(genesposel.vd$geneno)
#head(genes.vd)
#set <- intersect(genes,universe)
#lapply(genes, function(x) x %in% universe)
params.vd <- GSEAGOHyperGParams(name = "Vd_GO_enrichment",
geneSetCollection = gsc.vd,
geneIds = genes.vd,
universeGeneIds = universe.vd,
ontology = "BP", # change with MF, CC to test all
pvalueCutoff = 0.05,
conditional = F,
testDirection = "over")
over.vd <- hyperGTest(params.vd)
#summary(over.vd)
GO_enrich.vd <- as.data.frame(summary(over.vd))
#GO_enrich.vd %>%
# arrange(Pvalue) %>%
# write.csv(file = "data/Positive selection/GO_term_enrichment_VdBP-15112018.csv")
#### V. jacobsoni
Vj.data <-read.csv("data/Positive selection/vjacgoassoc.csv")
#dim(Vj.data)
#summary(Vj.data)
#Vj.data$Gene.id
Vj.melt <- Vj.data %>%
gather(gene.id, GO.ids, X:X.136) %>%
arrange(Gene.id)
Vj.melt2 <-Vj.melt[!(is.na(Vj.melt$GO.ids) | Vj.melt$GO.ids==""),]
#head(Vj.melt2)
#write.csv(Vj.melt2, file = "data/Positive selection/VjacGOready.csv")
## In the csv I removed the LOC before sequence number to avoid problems due to character and integerFALSE compatibility
#Enrichment with GOstats
annot.vj <- read.csv("data/Positive selection/VjacGOready2.csv")
annot.vj2 <- annot.vj %>%
mutate(evidence = "IEA") %>%
dplyr::select(go_id = GO.ids, evidence, gene = Gene.id)
#head(annot.vj2)
goFrame.vj <-GOFrame(annot.vj2, organism = "Vj")
goAllFrame.vj <-GOAllFrame(goFrame.vj)
gsc.vj <-GeneSetCollection(goAllFrame.vj, setType = GOCollection())
universe.vj <- unique(annot.vj2$gene)
#head(universe.vj)
genesposel.vj <- read.csv("data/Positive selection/Vjacselected1511.csv")
genes.vj <- unique(genesposel.vj$geneno)
#head(genes.vj)
#set <- intersect(genes,universe)
#lapply(genes, function(x) x %in% universe)
params.vj <- GSEAGOHyperGParams(name = "Vj_GO_enrichment",
geneSetCollection = gsc.vj,
geneIds = genes.vj,
universeGeneIds = universe.vj,
ontology = "BP",
pvalueCutoff = 0.05,
conditional = F,
testDirection = "over")
over.vj <- hyperGTest(params.vj)
#summary(over.vj)
GO_enrich.vj <-as.data.frame(summary(over.vj))
#GO_enrich.vj %>%
# arrange(Pvalue) %>%
# write.csv(file = "data/Positive selection/GO_term_enrichment_VjBP-15112018.csv")
```
Once all tests have been run for <span style="color:red"> _V. destructor_ </span> and <span style="color:blue"> _V. jacobsoni_ </span> genomes, the significantly overrepresented GO Terms and their description (p-value < 0.05) were concatenated in a summary table presented below:
``` {r GOterms posseltab, cache = TRUE}
tableS6 <- read.csv("data/Positive selection/tableS06-posGOterm.csv")
kable(cbind(tableS6)) %>%
kable_styling(bootstrap_options = "striped", font_size = 8) %>%
scroll_box(width = "100%", height = "800px")
```
## Fisher's exact test on overlapping GO terms
Since our overlap regarding the exact gene ID under positive selection was significant for the small number detected, we tested if their associated GO terms were also significantly overlapping or rather indicating different metabolic pathways involved.
On the original BLAST2GO association gene ID and GO terms for _V. destructor_ genome, a total of 6,608 GO terms were found and considered as the full set.
```{r overlap gotermpositive, cache = TRUE}
vd.golist <-read.csv("data/Positive selection/vdgolist.csv", header = FALSE)
as.character(vd.golist$V1)
vj.golist <-read.csv("data/Positive selection/vjgolist.csv", header = FALSE)
as.character(vj.golist$V1)
## if pvalue is under 0.05 then means overlap is highly significant
## if Odd ratio is equal or less than 1, there is no association between the two lists
## Jaccard index measures the similarity of the list between 0 and 1, no similarity
gs.goterm = 6608 #GO terms from the BLAST2GO association in Vdes_3.0 genome
##Testing overlap between two GO term list
go.obj3 <- newGeneOverlap(vd.golist$V1, vj.golist$V1, gs.goterm)
go.obj3 <- testGeneOverlap(go.obj3)
print(go.obj3)
fisher.test(matrix(c(6220, 198, 187, 3), nrow=2))
```
Here, the small overlap is considered not significant by both one-tailed and two-tailed Fisher's exact test, indicating that with the effective tested here, the 201 and 190 GO terms enriched for the positively selected genes of _V. destructor_ and _V. jacobsoni_ are involved in different biological processes, molecular function, and cellular components.
Now we have all the significant GO terms associated with the genes under positive selection, and we would like to reduce and visualize the most redundant ones. To observe whether or not the same GO terms or related terms are overlapping or being very different for the two Varroa sister species, we use the online tool of [REVIGO](http://revigo.irb.hr/).
**Supek et al., PloS One (2011), 6(7), e21800**
We allowed a medium (70%) to small semantic similarity of the GO terms (based on Resnik's and Lin's measures which use the semantic similarity of two GO terms related to their common ancestor terms.)
**Schlicker et al., BMC Bioinformatics (2006), 7(1):302**.
REVIGO produces an R script which generates the bobble chart or the treemap as in Figure 3 and 4 in the manuscript. Below is an example of scatter-bubble chart script generated by REVIGO for the positively selected genes with "Small Similarity" options using the Resnik's measure and choosing to run the data as follow:
Example of input in REVIGO
(Vdes GO term are coded with 0.1 and Vjac 0.001 so we can have bobble color by species)
GO:0003254 0.1
GO:0051899 0.1
GO:0097306 0.1
GO:0006627 0.001
GO:0034982 0.001
...
``` {r GOterms revigo, message = FALSE}
# --------------------------------------------------------------------------
# Here is your data from REVIGO. Scroll down for plot configuration options.
revigo.names <- c("term_ID","description","frequency_%","plot_X","plot_Y","plot_size","log10_p_value","uniqueness","dispensability");
revigo.data <- rbind(c("GO:0009611","response to wounding", 0.127, 2.161,-7.619, 4.212,-3.0000,0.922,0.000),
c("GO:0017187","peptidyl-glutamic acid carboxylation", 0.006, 5.531, 4.570, 2.865,-3.0000,0.943,0.000),
c("GO:0032502","developmental process", 2.812,-1.295,-1.603, 5.557,-1.0000,0.991,0.000),
c("GO:0045986","negative regulation of smooth muscle contraction", 0.003,-5.913,-2.855, 2.509,-3.0000,0.652,0.000),
c("GO:1990255","subsynaptic reticulum organization", 0.000, 5.317,-0.291, 1.462,-3.0000,0.937,0.014),
c("GO:0051187","cofactor catabolic process", 0.062, 1.750, 2.634, 3.904,-3.0000,0.952,0.035),
c("GO:0006775","fat-soluble vitamin metabolic process", 0.012,-1.785, 6.861, 3.203,-3.0000,0.909,0.042),
c("GO:0048489","synaptic vesicle transport", 0.035,-4.057,-0.214, 3.658,-3.0000,0.815,0.048),
c("GO:0030048","actin filament-based movement", 0.021,-3.368, 2.033, 3.422,-3.0000,0.803,0.050),
c("GO:0010586","miRNA metabolic process", 0.004, 0.399, 6.193, 2.761,-1.0000,0.943,0.071),
c("GO:1901568","fatty acid derivative metabolic process", 0.017,-1.854,-0.528, 3.342,-3.0000,0.921,0.089),
c("GO:0022615","protein to membrane docking", 0.016,-2.326, 2.777, 3.314,-1.0000,0.881,0.109),
c("GO:0006629","lipid metabolic process", 3.522,-0.330, 2.356, 5.655,-1.0000,0.890,0.131),
c("GO:0033003","regulation of mast cell activation", 0.006,-0.001,-3.478, 2.914,-1.0000,0.760,0.131),
c("GO:0030029","actin filament-based process", 0.398,-3.542, 4.620, 4.708,-1.0000,0.858,0.133),
c("GO:0010506","regulation of autophagy", 0.072, 1.560,-3.948, 3.964,-1.0000,0.815,0.160),
c("GO:0031991","regulation of actomyosin contractile ring contraction", 0.007, 1.025,-2.732, 2.955,-1.0000,0.761,0.161),
c("GO:0045292","mRNA cis splicing, via spliceosome", 0.031, 1.926, 5.735, 3.595,-1.0000,0.933,0.170),
c("GO:0032409","regulation of transporter activity", 0.039,-1.768,-3.913, 3.697,-1.0000,0.770,0.181),
c("GO:0050801","ion homeostasis", 0.434, 3.603,-3.761, 4.746,-1.0000,0.800,0.199),
c("GO:0045234","protein palmitoleylation", 0.000, 5.782, 3.708, 1.519,-3.0000,0.945,0.208),
c("GO:0006414","translational elongation", 0.777, 3.475, 5.762, 4.999,-3.0000,0.902,0.228),
c("GO:0033198","response to ATP", 0.009, 2.303,-7.064, 3.067,-3.0000,0.907,0.231),
c("GO:0071875","adrenergic receptor signaling pathway", 0.009, 0.660,-6.372, 3.072,-3.0000,0.794,0.231),
c("GO:0009268","response to pH", 0.011, 2.761,-6.387, 3.133,-3.0000,0.936,0.233),
c("GO:0018214","protein carboxylation", 0.006, 5.038, 5.347, 2.866,-3.0000,0.943,0.242),
c("GO:0051604","protein maturation", 0.293, 4.470, 5.278, 4.575,-3.0000,0.958,0.257),
c("GO:0016260","selenocysteine biosynthetic process", 0.034,-1.541, 7.243, 3.645,-3.0000,0.847,0.263),
c("GO:0016575","histone deacetylation", 0.048, 5.782, 2.508, 3.787,-3.0000,0.899,0.271),
c("GO:0031952","regulation of protein autophosphorylation", 0.008, 5.839, 1.023, 3.006,-1.0000,0.829,0.276),
c("GO:0044090","positive regulation of vacuole organization", 0.005, 5.248,-1.718, 2.793,-1.0000,0.757,0.279),
c("GO:0046112","nucleobase biosynthetic process", 0.365,-2.247, 6.020, 4.671,-3.0000,0.812,0.322),
c("GO:0007386","compartment pattern specification", 0.001,-7.586,-0.089, 2.111,-3.0000,0.757,0.355),
c("GO:0010587","miRNA catabolic process", 0.002, 1.062, 5.991, 2.330,-1.0000,0.939,0.360),
c("GO:0033504","floor plate development", 0.002,-7.400,-0.208, 2.373,-3.0000,0.725,0.365),
c("GO:0006995","cellular response to nitrogen starvation", 0.009, 1.154,-7.327, 3.047,-1.0000,0.896,0.367),
c("GO:2000647","negative regulation of stem cell proliferation", 0.004,-3.552,-6.038, 2.740,-1.0000,0.816,0.372),
c("GO:0007303","cytoplasmic transport, nurse cell to oocyte", 0.001,-6.435, 1.013, 1.833,-3.0000,0.679,0.407),
c("GO:0050873","brown fat cell differentiation", 0.008,-6.845, 2.754, 3.027,-3.0000,0.785,0.411),
c("GO:0036446","myofibroblast differentiation", 0.001,-7.233, 2.443, 1.839,-1.0000,0.810,0.411),
c("GO:0032940","secretion by cell", 0.763,-4.284, 3.762, 4.991,-3.0000,0.773,0.421),
c("GO:0042373","vitamin K metabolic process", 0.001,-0.419, 4.613, 2.193,-3.0000,0.876,0.428),
c("GO:0042060","wound healing", 0.094, 0.757,-7.905, 4.079,-3.0000,0.919,0.431),
c("GO:0050817","coagulation", 0.051,-7.287,-1.492, 3.814,-3.0000,0.736,0.432),
c("GO:0007010","cytoskeleton organization", 0.786, 5.780,-0.432, 5.004,-1.0000,0.906,0.434),
c("GO:0006428","isoleucyl-tRNA aminoacylation", 0.050, 1.623, 5.887, 3.811,-1.0000,0.828,0.434),
c("GO:0040024","dauer larval development", 0.004,-7.239,-0.396, 2.683,-1.0000,0.747,0.442),
c("GO:0031146","SCF-dependent proteasomal ubiquitin-dependent protein catabolic process", 0.022, 3.980, 4.537, 3.447,-1.0000,0.932,0.444),
c("GO:0046666","retinal cell programmed cell death", 0.003,-6.818, 0.239, 2.547,-3.0000,0.660,0.453),
c("GO:0071877","regulation of adrenergic receptor signaling pathway", 0.001, 0.553,-6.891, 2.164,-3.0000,0.799,0.453),
c("GO:2000574","regulation of microtubule motor activity", 0.007, 2.931,-3.084, 2.968,-1.0000,0.872,0.459),
c("GO:0051899","membrane depolarization", 0.015,-0.077,-4.506, 3.299,-1.0000,0.830,0.460),
c("GO:0070593","dendrite self-avoidance", 0.001,-7.205, 0.513, 2.009,-1.0000,0.711,0.461),
c("GO:0055014","atrial cardiac muscle cell development", 0.001,-7.382, 0.457, 1.908,-1.0000,0.706,0.462),
c("GO:0070782","phosphatidylserine exposure on apoptotic cell surface", 0.000, 3.952,-1.742, 1.613,-1.0000,0.758,0.468),
c("GO:0072702","response to methyl methanesulfonate", 0.000, 2.590,-6.917, 1.531,-1.0000,0.921,0.475),
c("GO:0072703","cellular response to methyl methanesulfonate", 0.000, 2.861,-7.166, 1.531,-1.0000,0.901,0.475),
c("GO:0034199","activation of protein kinase A activity", 0.000, 5.593, 0.740, 1.491,-1.0000,0.788,0.484),
c("GO:0019915","lipid storage", 0.032,-2.214,-4.345, 3.611,-1.0000,0.773,0.484),
c("GO:0021535","cell migration in hindbrain", 0.003,-6.391, 0.389, 2.577,-1.0000,0.626,0.485),
c("GO:0006627","protein processing involved in protein targeting to mitochondrion", 0.016, 3.861, 2.326, 3.312,-3.0000,0.801,0.491),
c("GO:0050820","positive regulation of coagulation", 0.005,-5.484,-2.666, 2.834,-3.0000,0.632,0.493));
one.data <- data.frame(revigo.data);
names(one.data) <- revigo.names;
one.data <- one.data [(one.data$plot_X != "null" & one.data$plot_Y != "null"), ];
one.data$plot_X <- as.numeric( as.character(one.data$plot_X) );
one.data$plot_Y <- as.numeric( as.character(one.data$plot_Y) );
one.data$plot_size <- as.numeric( as.character(one.data$plot_size) );
one.data$log10_p_value <- as.numeric( as.character(one.data$log10_p_value) );
one.data$frequency <- as.numeric( as.character(one.data$frequency) );
one.data$uniqueness <- as.numeric( as.character(one.data$uniqueness) );
one.data$dispensability <- as.numeric( as.character(one.data$dispensability) );
#head(one.data);
# --------------------------------------------------------------------------
# Names of the axes, sizes of the numbers and letters, names of the columns,
# etc. can be changed below
p1 <- ggplot( data = one.data );
p1 <- p1 + geom_point( aes( plot_X, plot_Y, colour = log10_p_value, size = plot_size), alpha = I(0.6) ) + scale_size_area();
p1 <- p1 + scale_colour_gradientn( colours = c("blue", "yellow", "red"), limits = c( min(one.data$log10_p_value), 0) );
p1 <- p1 + geom_point( aes(plot_X, plot_Y, size = plot_size), shape = 21, fill = "transparent", colour = I (alpha ("black", 0.6) )) + scale_size_area();
p1 <- p1 + scale_size( range=c(2, 10)) + theme_bw(); # + scale_fill_gradientn(colours = heat_hcl(7), limits = c(-300, 0) );
ex <- one.data [ one.data$dispensability < 0.15, ];
p1 <- p1 + geom_text( data = ex, aes(plot_X, plot_Y, label = description), colour = I(alpha("black", 0.85)), size = 3 );
p1 <- p1 + labs (y = "semantic space x", x = "semantic space y");
p1 <- p1 + theme(legend.key = element_blank()) ;
one.x_range = max(one.data$plot_X) - min(one.data$plot_X);
one.y_range = max(one.data$plot_Y) - min(one.data$plot_Y);
p1 <- p1 + xlim(min(one.data$plot_X)-one.x_range/10,max(one.data$plot_X)+one.x_range/10);
p1 <- p1 + ylim(min(one.data$plot_Y)-one.y_range/10,max(one.data$plot_Y)+one.y_range/10);
# --------------------------------------------------------------------------
# Output the plot to screen
p1;
```
# 4| Genes duplicated in tandem in Varroa genomes
A reduced number of genes were detected to duplicate in tandem and to be species-specific for each of the honey bee parasite. A total of *51 tandem duplicated genes* were detected in the <span style="color:red"> _V. destructor_ </span> genome and 35 in the <span style="color:blue"> _V. jacobsoni_ </span> genome. The Varroa sister species shared 19 genes duplicated in tandem.
## Duplicated genes species specific to each Varroa
``` {r duplicate details, cache = TRUE}