\documentclass[twoside]{article}
\def\docversion{0.7}
% timezone: +01 == CET
\def\timezone{+01}
% Uncomment for draft print.
\def\isdraft{1}
\newif\ifpdf
\ifx\pdfoutput\undefined
\pdffalse % we are not running PDFLaTeX
\else
\pdfoutput=1 % we are running PDFLaTeX
\pdftrue
\fi
\usepackage{linuxtag}
\usepackage{times}
\usepackage{makeidx}
\usepackage{nomencl}
\usepackage[square]{natbib}
\usepackage{marvosym}
\usepackage{longtable}
\renewcommand\bibsection{\chapter{\bibname}}
\ifpdf
\usepackage[pdftex]{graphics}
\usepackage{type1cm}
\usepackage{thumbpdf}
\pdfcompresslevel9
\pdfinfo{/CreationDate (D:20030924012900\timezone'00')}
% The following code to set /ModDate comes from Heiko Oberdiek's paper
% on PDF & hyperref. I only added the \timezone stuff.
\begingroup
\def\twodigits#1{\ifnum#1<10 0\fi\the#1}%
\count0=\time \divide\count0 by 60
\edef\x{\twodigits{\count0}}%
\multiply\count0 by 60
\count1=\time \advance\count1 by -\count0
\edef\x{\x\twodigits{\count1}}%
\edef\x{/ModDate (D:\the\year \twodigits\month \twodigits\day \x 00\timezone'00')}%
\expandafter\endgroup
\expandafter\pdfinfo\expandafter{\x}%
\input pdfcolor
%% For a "Draft" mark on the pages uncomment the following:
\ifx\isdraft\undefined
\relax
\else
\usepackage{eso-pic}
\usepackage{color}
\makeatletter
\AddToShipoutPicture{\rm%
\setlength{\@tempdimb}{.5\paperwidth}%
\setlength{\@tempdimc}{.5\paperheight}%
\setlength{\unitlength}{1pt}%
\put(\strip@pt\@tempdimb,\strip@pt\@tempdimc){%
\makebox(0,0){\rotatebox{45}{\textcolor[gray]{0.9}{\fontsize{5cm}{5cm}\selectfont{Draft}}}}
}
}
\makeatother
\fi
\else
\usepackage[dvips]{graphics}
\fi
\def\titlecolor{NavyBlue}
\makeatletter
\def\@sect#1#2#3#4#5#6[#7]#8{%
\ifnum #2>\c@secnumdepth
\let\@svsec\@empty
\else
\refstepcounter{#1}%
\protected@edef\@svsec{\@seccntformat{#1}\relax}%
\fi
\@tempskipa #5\relax
\ifdim \@tempskipa>\z@
\begingroup
\hbox{\expandafter\csname\titlecolor\endcsname#6{%
\@hangfrom{\hskip #3\relax\@svsec}%
\interlinepenalty \@M #8\@@par}\Black}%
\endgroup
\csname #1mark\endcsname{#7}%
\addcontentsline{toc}{#1}{%
\ifnum #2>\c@secnumdepth \else
\protect\numberline{\csname the#1\endcsname}%
\fi
#7}%
\else
\def\@svsechd{%
\hbox{\expandafter\csname\titlecolor\endcsname#6{\hskip #3\relax
\@svsec #8}%
\csname #1mark\endcsname{#7}\Black}%
\addcontentsline{toc}{#1}{%
\ifnum #2>\c@secnumdepth \else
\protect\numberline{\csname the#1\endcsname}%
\fi
#7}}%
\fi
\@xsect{#5}}
\def\@ssect#1#2#3#4#5{%
\@tempskipa #3\relax
\ifdim \@tempskipa>\z@
\begingroup
\expandafter\csname\titlecolor\endcsname#4{%
\@hangfrom{\hskip #1}%
\interlinepenalty \@M #5\@@par}\Black%
\endgroup
\else
\def\@svsechd{\expandafter\csname\titlecolor\endcsname#4{\hskip #1\relax #5}\Black}%
\fi
\@xsect{#3}}
\makeatother
\usepackage{fancyheadings}
\pagestyle{fancy}
\rhead{}
\chead{}
\lhead{}
\rfoot[\sl Prelink]{\thepage}
\lfoot[\thepage]{\sl Jakub Jel\'\i nek}
\ifx\isdraft\undefined
\cfoot{Version \docversion}
\else
\cfoot{Draft \docversion}
\fi
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}
\ifx\isdraft\undefined
\relax
\else
\usepackage[mathlines]{lineno}
\fi
\usepackage{graphicx}
\usepackage{hyperref}
\hypersetup{
bookmarksnumbered,
bookmarksopen=true,
pdfpagemode=UseOutlines,
pdfkeywords={Prelink, ELF, DSO, Shared Library, Dynamic Linking, Linux}
}
\usepackage{prelinklisting}
\def\tts#1{\texttt{\small #1}}
\setcounter{dbltopnumber}{3}
\makeatletter
\newcommand{\annotate}[2][]{%
\marginpar{%
\pdfstringdef\x@title{#1}%
\edef\r{\string\r}%
\pdfstringdef\x@contents{#2}%
\pdfannot
width 50em%\linewidth
height .5\baselineskip
depth 2.5\baselineskip
{
/Subtype /Text
/T (\x@title)
/Contents (\x@contents)
}%
}
}
\makeatother
\makeglossary
\makeindex
\begin{document}
\makeatletter
\newcommand\orgmaketitle{}
\let\orgmaketitle\maketitle
\def\maketitle{%
\hypersetup{
pdftitle={\@title},
pdfsubject={Description of prelink tool},
pdfauthor={\@author}
}%
\orgmaketitle
}
\makeatother
\title{Prelink}
\author{Jakub Jel\'\i nek\\
Red Hat, Inc.\\
{\small\tt\href{mailto:[email protected]}{[email protected]}}}
%\maketitle
%\tableofcontents
%\vfil\break
%\listoftables
%\vfil\break
%\listofprelinklistings
%\vfil\break
%\listoffigures
%\vfil\break
\maketitle
\begin{center}
\begin{abstract}
\vspace*{.5\baselineskip}
\parbox{0.8\textwidth}{%
Prelink is a tool designed to speed up dynamic linking of ELF
programs on various Linux architectures.
It cuts 1.8s off the 5.5s startup time of OpenOffice.org 1.1 on a 651MHz Pentium III.}
\end{abstract}
\end{center}
\ifx\isdraft\undefined
\relax
\else
\linenumbers
\linenumbersep4pt
\fi
\section{Preface}
In 1995, Linux changed its binary format from \tts{a.out} to \tts{ELF}.
The \tts{a.out} binary format was very inflexible and shared libraries
were pretty hard to build. Linux's shared libraries in \tts{a.out} were position
dependent and each had to be given a unique virtual address space slot
at link time. Maintaining these assignments was pretty hard even when
there were just a few shared libraries; there used to be a central address
registry maintained by humans in the form of a text file. It would certainly be
impossible these days, when there are thousands of different shared libraries
and their size, version and exported symbols are constantly changing.
On the other hand, there was only a minimal amount of work the dynamic
linker had to do in order to load these shared libraries, as relocation
handling and symbol lookup were only done at link time. The dynamic linker
used the \tts{uselib} system call which just mapped the named library
into the address space (with no segment or section protection differences,
the whole mapping was writable and executable).
The \href{http://www.caldera.com/developers/devspecs/gabi41.pdf}%
{\tts{ELF}}
\footnote{As described in generic ABI document [1] and various processor
specific ABI supplements [2], [3], [4], [5], [6], [7], [8].}
binary format is one of the most flexible binary formats,
its shared libraries are easy to build and there is no need for a central
assignment of virtual address space slots. Shared libraries are position
independent and relocation handling and symbol lookup are done partly
at the time the executable is created and partly at runtime. Symbols in shared
libraries can be overridden at runtime by preloading a new shared
library defining those symbols or, without relinking the executable, by adding
symbols to a shared library which is searched earlier during symbol
lookup or by adding new dependent shared libraries to a library used by the
program. All these improvements have their price, which is slower
program startup, more non-shareable memory per process and the runtime cost
associated with position independent code in shared libraries.
Program startup of \tts{ELF} programs is slower than startup of \tts{a.out}
programs with shared libraries, because the dynamic linker has much more work
to do before calling the program's entry point. The cost of loading libraries
is just slightly bigger, as \tts{ELF} shared libraries typically have
separate read-only and writable segments, so the dynamic linker
has to use different memory protection for each segment.
The main difference is in relocation handling and associated symbol lookup.
In the \tts{a.out} format there was no relocation handling or symbol lookup at runtime.
In \tts{ELF}, this cost is much more important today than it used to be
during the \tts{a.out} to \tts{ELF} transition in Linux, especially as GUI
programs keep growing and use more and more shared
libraries. Five years ago programs using more than 10 shared libraries
were very rare; these days most GUI programs link against around
40 or more shared libraries and in extreme cases programs use more than 90
shared libraries. Every shared library adds its set of dynamic relocations
to the cost and enlarges the symbol search scope,
\nomenclature{Symbol Search Scope}{The sequence of \tts{ELF} objects in
which a symbol is being looked up. When a symbol definition is found,
the searching stops and the found symbol is returned. Each program
has a global search scope, which starts with the executable, is typically
followed by the immediate dependencies of the executable and then their
dependencies in breadth-first search order (where only the first occurrence
of each shared library is kept). If \tts{DT\_FILTER}
or \tts{DT\_AUXILIARY} dynamic tags are used the order is slightly
different. Each shared library loaded with \tts{dlopen} has its
own symbol search scope which contains that shared library and
its dependencies. \tts{Prelink} also operates with the natural
symbol search scope of each shared library, which is the global
symbol search scope the shared library would have if it were started
as the main program}
so in addition to doing more symbol lookups, each symbol
lookup the application has to perform is on average more expensive.
Another factor increasing the cost is the length of symbol names
which have to be compared when finding a symbol in the symbol hash table of
a shared library. C++ libraries tend to have extremely long symbol
names and unfortunately the new \href{http://www.codesourcery.com/cxx-abi/}%
{C++ ABI} puts namespaces and class names first and method names last
in the mangled names, so symbol names often differ only in the last
few bytes of very long names.
Every time a relocation is applied the entire memory page
\nomenclature{Page}{Memory block of fixed size which virtual memory
subsystem deals with as a unit. The size of the page depends on
the addressing hardware of the processor, typically pages are 4K or 8K,
in some cases bigger}
containing the address which is written to must be loaded into memory.
The operating system does a copy-on-write operation, with the
consequence that the physical memory of that page can no longer
be shared with other processes.
With \tts{ELF}, typically all of the program's Global Offset Table,
\nomenclature{Global Offset Table (\tts{GOT})}{When position independent
code needs to build an address which requires dynamic relocation, instead
of building it as a constant in registers and applying a dynamic relocation
against the read-only segment (which would mean that any pages of the
read-only segment where relocations are applied cannot be shared between
processes anymore), it loads the address from an offset table
private to each shared library, which is created by the linker.
The table is in a writable segment and relocations are applied against it.
On most architectures position independent code uses a special \tts{PIC}
register which points to the start of the Global Offset Table}
constants and variables containing pointers to objects in shared libraries, etc.
are written into before the dynamic linker passes control over to the program.
On most architectures (with some exceptions like the \tts{AMD64} architecture)
position independent code requires one register to be dedicated as the
\tts{PIC} register, which thus cannot be used in functions for other purposes.
This especially degrades performance on register-starved
architectures like \tts{IA-32}. Also, there needs to be some code to
set up the \tts{PIC} register, either invoked as part of function prologues,
or when using function descriptors in the calling sequence.
\tts{Prelink} is a tool which (together with corresponding dynamic linker
and linker changes) attempts to bring back some of the \tts{a.out}
advantages (such as the speed and fewer COW'd pages) to the \tts{ELF}
binary format while retaining all of its flexibility. In a limited way
it also attempts to decrease the number of non-shareable pages created by
relocations.
\tts{Prelink} works closely with the dynamic linker in the GNU C library,
but it probably wouldn't be too hard to port it to other \tts{ELF}-using
platforms where the dynamic linker can be modified in similar
ways.
\section{Caching of symbol lookup results}
Program startup can be sped up by caching symbol lookup
results\footnote{Initially, this was implemented in the \tts{prelink}
tool and the \tts{glibc} dynamic linker, where \tts{prelink} sorted
relocation sections of existing executables and shared libraries.
Once this was implemented in the linker as well and most executables
and shared libraries were built with \tts{-z combreloc},
the code was removed from \tts{prelink}, as it was no longer
needed for most objects and just increased the tool's complexity.}.
Many shared libraries need more than one lookup of a particular symbol.
This is especially true for C++ shared libraries, where e.g. the same method
is present in multiple virtual tables or {\sl RTTI} data structures.
\nomenclature{RTTI}{C++ runtime type identification}
Traditionally, each \tts{ELF} section which needs dynamic relocations has an
associated \tts{.rela*} or \tts{.rel*} section (depending on whether
the architecture is defined to use \tts{RELA} or \tts{REL} relocations).
\nomenclature{RELA}{Type of relocation structure which includes offset,
relocation type, the symbol the relocation is against and an integer
addend which is added to the symbol. Memory at offset is not supposed
to be used by the relocation. Some architectures implemented this
incorrectly and memory at offset is for some relocation types used
by the relocation, either in addition to the addend or with the addend not used
at all. \tts{RELA} relocations are generally better for \tts{prelink},
since when \tts{prelink} stores a pre-computed value into the memory location
at offset, the addend value is not lost}
\nomenclature{REL}{Type of relocation structure which includes just offset,
relocation type and symbol. Addend is taken from memory location at
offset}
The relocations in those sections are typically sorted by ascending
\tts{r\_offset} values.
Symbol lookups are usually the most expensive operation during program
startup, so caching the symbol lookups has potential to decrease time
spent in the dynamic linker.
One way to decrease the cost of symbol lookups is to create a table with
size equal to the number of entries
in the dynamic symbol table (\tts{.dynsym}) in the dynamic linker when resolving
a particular shared library, but that would in some cases need a lot of
memory and some time spent in initializing the table. Another option
would be to use a hash table with chained lists, but that needs
extra memory and would also take extra time for computing the hash value
and walking the chains when doing new lookups.
Fortunately, neither of these is really necessary if we modify the linker
to sort relocations so that relocations against the same symbol
are adjacent. This was done first in the \tts{Sun} linker and dynamic
linker, so the GNU linker and dynamic linker use the same \tts{ELF} extensions
and linker flags. In particular, the following new \tts{ELF} dynamic tags have been introduced:
\tts{\#define DT\_RELACOUNT 0x6ffffff9}\\
\tts{\#define DT\_RELCOUNT 0x6ffffffa}
New options \tts{-z combreloc} and \tts{-z nocombreloc} have been
added to the linker. The latter selects the previous linker behavior,
i.e. each section requiring relocations has a corresponding relocation section,
which is sorted by ascending \tts{r\_offset}. \tts{-z combreloc}
\footnote{\tts{-z combreloc} is the default in GNU linker versions
2.13 and later.} instructs the linker to create just one relocation
section for dynamic relocations other than symbol jump table (\tts{PLT})
relocations.
\nomenclature{PLT}{Procedure Linkage Table. Stubs in \tts{ELF} shared
libraries and executables which allow lazy relocations of function calls.
They initially point to code which will do the symbol lookup. The
result of this symbol lookup is then stored in the Procedure Linkage Table
and control is transferred to the address the symbol lookup returned. All
following calls to the \tts{PLT} slot just branch to the already looked
up address directly, no further symbol lookup is needed}
This single relocation section (either \tts{.rela.dyn} or \tts{.rel.dyn})
is sorted, so that relative relocations come first (sorted by ascending
\tts{r\_offset}), followed by other relocations, sorted again by ascending
\tts{r\_offset}. If several relocations are against the same
symbol, they immediately follow the first relocation against that symbol
with lowest \tts{r\_offset}.%
\footnote{In fact the sorting also needs to take into account the type of
lookup. Most of the relocations will resolve to a \tts{PLT} slot in the executable
if there is one for the lookup symbol, because the executable might have a
pointer against that symbol without any dynamic relocations. But e.g.
relocations used for the \tts{PLT} slots must avoid these.}
\nomenclature{relative relocation}{Relocation, which doesn't need a symbol
lookup, just adds a shared library load offset to certain memory location
(or locations)}
The number of relative relocations at the beginning of the section
is stored in the \tts{DT\_RELACOUNT} or \tts{DT\_RELCOUNT} dynamic tag, respectively.
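The ordering described above can be sketched in C. This is only an illustration under simplifying assumptions, not the linker's actual code: \tts{struct reloc}, its \tts{group\_key} scratch field and \tts{sort\_combreloc} are hypothetical names, the lookup type mentioned in the footnote is ignored, and the per-symbol minimum is found with a quadratic scan for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an ELF dynamic relocation (hypothetical layout). */
struct reloc {
    uint64_t r_offset;    /* address the relocation writes to */
    uint32_t sym;         /* dynamic symbol index; unused for relative ones */
    int      is_relative; /* non-zero for relative relocations */
    uint64_t group_key;   /* scratch field filled in by sort_combreloc() */
};

static int cmp_u64(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}

static int cmp_reloc(const void *pa, const void *pb)
{
    const struct reloc *a = pa, *b = pb;
    int c;

    /* Relative relocations come first. */
    if (!a->is_relative != !b->is_relative)
        return a->is_relative ? -1 : 1;
    /* Groups are ordered by the lowest r_offset in the group. */
    if ((c = cmp_u64(a->group_key, b->group_key)) != 0)
        return c;
    /* Keep relocations against the same symbol adjacent. */
    if ((c = cmp_u64(a->sym, b->sym)) != 0)
        return c;
    return cmp_u64(a->r_offset, b->r_offset);
}

/* Sorts the relocations in place and returns the number of leading
   relative relocations, i.e. the value a DT_RELCOUNT-like tag would hold. */
size_t sort_combreloc(struct reloc *r, size_t n)
{
    size_t relcount = 0;

    assert(n == 0 || r != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t lowest = r[i].r_offset;

        if (r[i].is_relative) {
            relcount++;
        } else {
            /* Lowest offset among all relocations against the same symbol. */
            for (size_t j = 0; j < n; j++)
                if (!r[j].is_relative && r[j].sym == r[i].sym
                    && r[j].r_offset < lowest)
                    lowest = r[j].r_offset;
        }
        r[i].group_key = lowest;
    }
    qsort(r, n, sizeof *r, cmp_reloc);
    return relcount;
}
```

Note that a relocation with a higher \tts{r\_offset} can end up before one with a lower \tts{r\_offset} when it belongs to a symbol group that starts earlier.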
The dynamic linker can use the new dynamic tag for two purposes.
If the shared library is successfully mapped at the same address
as the first \tts{PT\_LOAD} segment's virtual address, the load offset
is zero and the dynamic linker can avoid all the relative relocations which
would just add zero to various memory locations. Normally shared libraries are
linked with the first \tts{PT\_LOAD} segment's virtual address set to zero, so
the load offset is non-zero. This can be changed through a linker script or by
using a special \tts{prelink} option \tts{--reloc-only} to change
the base address of a shared library. All prelinked shared libraries
have a non-zero base address as well. If the load offset is non-zero, the
dynamic linker can still make use of this dynamic tag, as relative
relocation handling is typically much simpler than handling other
relocations (since symbol lookup is not necessary) and thus it can
handle all relative relocations in a tight loop in one place and
then handle the remaining relocations with the fully featured
relocation handling routine. The second and more important point is
that if relocations against the same symbol are adjacent, the dynamic
linker can use a cache with single entry.
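Both uses of the relative relocation count can be illustrated with a small C model. It is a sketch under stated assumptions rather than the \tts{glibc} implementation: relocations are \tts{REL}-style (the addend lives in the target word), \tts{r\_offset} is an index into an array of 64-bit words, the cache is keyed on the symbol index only, and \tts{process\_relocs}, \tts{reloc\_stats} and \tts{demo\_lookup} are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* REL-style relocation in this model: the addend lives in the target word. */
struct reloc {
    uint64_t r_offset; /* index of the 64-bit word to patch (model only) */
    uint32_t sym;      /* dynamic symbol index; unused for relative ones */
};

struct reloc_stats {
    size_t lookups;    /* real (expensive) symbol lookups */
    size_t cache_hits; /* relocations served from the one-entry cache */
};

/* Hypothetical lookup callback standing in for the real symbol lookup. */
typedef uint64_t (*lookup_fn)(uint32_t sym, void *ctx);

/* Hypothetical resolver used for demonstration: symbol N lives at N * 0x1000. */
uint64_t demo_lookup(uint32_t sym, void *ctx)
{
    (void)ctx;
    return (uint64_t)sym * 0x1000u;
}

void process_relocs(uint64_t *words, uint64_t load_offset,
                    const struct reloc *r, size_t n, size_t relcount,
                    lookup_fn lookup, void *ctx, struct reloc_stats *st)
{
    int have_cached = 0;
    uint32_t cached_sym = 0;
    uint64_t cached_value = 0;

    assert(relcount <= n);

    /* Relative relocations: a tight loop, skipped when the load offset is 0
       because adding zero would change nothing. */
    if (load_offset != 0)
        for (size_t i = 0; i < relcount; i++)
            words[r[i].r_offset] += load_offset;

    /* Remaining relocations: since relocations against the same symbol are
       adjacent, remembering only the last lookup is enough. */
    for (size_t i = relcount; i < n; i++) {
        if (have_cached && r[i].sym == cached_sym) {
            st->cache_hits++;
        } else {
            cached_value = lookup(r[i].sym, ctx);
            cached_sym = r[i].sym;
            have_cached = 1;
            st->lookups++;
        }
        words[r[i].r_offset] += cached_value;
    }
}
```

With two relocations against each of two symbols, this model performs two real lookups and two cache hits; without the sorting, the same single-entry cache could miss on every relocation.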
The dynamic linker in \tts{glibc}, if it sees \tts{statistics}
as part of the \tts{LD\_DEBUG} environment variable, displays statistics
which can show how useful this optimization is.
Let's look at a big C++ application, e.g. konqueror.
Without the cache, the statistics look like this:
\noindent{\small\begin{verbatim}
18000: runtime linker statistics:
18000: total startup time in dynamic loader: 270886059 clock cycles
18000: time needed for relocation: 266364927 clock cycles (98.3%)
18000: number of relocations: 79067
18000: number of relocations from cache: 0
18000: number of relative relocations: 31169
18000: time needed to load objects: 4203631 clock cycles (1.5%)
\end{verbatim}}
This run is with hot caches, on a non-prelinked system, with lazy
binding.
\nomenclature{Lazy Binding}{A way to postpone symbol lookups for calls until
a function is called for the first time in a particular shared library.
This decreases the number of symbol lookups done during startup, and symbols
which are never called don't need to be looked up at all. Calls requiring
relocations jump into the \tts{PLT}, which is initially set up so that a
function in the dynamic linker is called to do the symbol lookup. The looked
up address is then stored either into the \tts{PLT} slot directly
(if the \tts{PLT} is writable) or into the \tts{GOT} entry corresponding
to the \tts{PLT} slot and any subsequent calls go directly to that
address. Lazy binding can be turned off by setting \tts{LD\_BIND\_NOW=1}
in the environment. Prelinked programs never use lazy binding for the
executable or any shared libraries not loaded using \tts{dlopen}}
The numbers show that the dynamic linker spent most of its time
in relocation handling and especially symbol lookups. With the symbol
lookup cache, the numbers look different:
\noindent{\small\begin{verbatim}
18013: total startup time in dynamic loader: 132922001 clock cycles
18013: time needed for relocation: 128399659 clock cycles (96.5%)
18013: number of relocations: 25473
18013: number of relocations from cache: 53594
18013: number of relative relocations: 31169
18013: time needed to load objects: 4202394 clock cycles (3.1%)
\end{verbatim}}
On average, for one real symbol lookup there were about two cache hits
(53594 relocations served from the cache versus 25473 looked up) and the total
time spent in the dynamic linker decreased by roughly 50\%.
\section{Prelink design}
\tts{Prelink} was designed so that it requires as few \tts{ELF} extensions
as possible. It should not be tied to a particular architecture, but
should work on all \tts{ELF} architectures. During program startup it
should avoid all symbol lookups which, as has been shown above, are
very expensive. It needs to work in an environment where shared
libraries and executables are changing from time to time, whether it is
because of security updates or feature enhancements. It should avoid big code
duplication between the dynamic linker and the tool. And prelinked
shared libraries need to be usable even in non-prelinked executables,
or when one of the shared libraries is upgraded and the prelinking of the
executable has not been updated.
To minimize the number of performed relocations during startup,
the shared libraries (and executables) need to be relocated
already as much as possible. For relative relocations this means the library
needs to be loaded always at the same base address, for other relocations
this means all shared libraries with definitions those relocations resolve
to (often this includes all shared libraries the library or executable depends on)
must always be loaded at the same addresses. \tts{ELF} executables
(with the exception of {\sl Position Independent Executables})
\nomenclature{Position Independent Executable}{A hybrid between
classical \tts{ELF} executables and \tts{ELF} shared libraries.
It has a form of a \tts{ET\_DYN} object like shared libraries and should
contain position independent code, so that the kernel can load
the executable starting at random address to make certain security attacks
harder. Unlike shared libraries it contains \tts{DT\_DEBUG} dynamic
tag, must have \tts{PT\_INTERP} segment with dynamic linker's path,
must have meaningful code at its \tts{e\_entry} and can use symbol
lookup assumptions normal executables can make, particularly that
no symbol defined in the executable can be overridden by a shared
library symbol} have their load address fixed already during linking.
For shared libraries, \tts{prelink} needs something similar to the \tts{a.out}
registry of virtual address space slots. Maintaining such a registry
across all installations wouldn't scale well, so \tts{prelink} instead
assigns these virtual address space slots on the fly after looking at
all executables it is supposed to speed up and all their dependent shared
libraries. The next step is to actually relocate shared libraries
to the assigned base address.
When this is done, the actual prelinking of shared libraries can be done.
First, all dependent shared libraries need to be prelinked (\tts{prelink}
doesn't support circular dependencies between shared libraries and will just
warn about them instead of prelinking the libraries in the cycle), then for each
relocation in the shared library \tts{prelink} needs to look up the symbol
in the natural symbol search scope of the shared library (the shared library
itself first, then breadth first search of all dependent shared libraries) and
apply the relocation to the symbol's target section. The symbol lookup code
in the dynamic linker is quite complex and big, so to avoid duplicating all
this, \tts{prelink} uses the dynamic linker to do the symbol lookups.
The dynamic linker is told via a special environment variable that it should print
all performed symbol lookups and their type and \tts{prelink} reads this
output through a pipe. As one of the requirements was that
prelinked shared libraries must be usable even for non-prelinked executables
(duplicating all shared libraries so that there are pristine and prelinked
copies would be very unfriendly to RAM usage), \tts{prelink} has to ensure
that by applying the relocation no information is lost and thus relocation
processing can be cheaply done at startup time of non-prelinked executables.
For \tts{RELA} architectures this is easier, because the content
of the relocation's target memory is not needed when processing the relocation.
\footnote{Relative relocations on certain \tts{RELA} architectures use
relocation target's memory, either alone or together with \tts{r\_addend}
field.} For \tts{REL} architectures this is not the case.
\tts{Prelink} attempts some tricks described
later and, if they fail, needs to convert the \tts{REL} relocation section
to \tts{RELA} format, where the addend is stored in the relocation section
instead of the relocation target's memory.
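Why the conversion preserves information can be seen in a small C model. This is a sketch under simplifying assumptions, not \tts{prelink}'s code: \tts{r\_offset} is an index into an array of 64-bit words, symbol values come from a plain array, and \tts{rel\_to\_rela} and \tts{apply\_rela} are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified relocation records (hypothetical layouts for this model). */
struct rel  { uint64_t r_offset; uint32_t sym; };
struct rela { uint64_t r_offset; uint32_t sym; int64_t r_addend; };

/* Copy the implicit addend out of each target word into an explicit
   r_addend, so that overwriting the word later loses nothing. */
void rel_to_rela(const uint64_t *words, const struct rel *in,
                 struct rela *out, size_t n)
{
    assert(n == 0 || (in != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        out[i].r_offset = in[i].r_offset;
        out[i].sym = in[i].sym;
        out[i].r_addend = (int64_t)words[in[i].r_offset];
    }
}

/* Apply RELA relocations: the result depends only on the symbol value and
   the stored addend, never on the previous content of the target word. */
void apply_rela(uint64_t *words, const struct rela *r, size_t n,
                const uint64_t *sym_value)
{
    for (size_t i = 0; i < n; i++)
        words[r[i].r_offset] = sym_value[r[i].sym] + (uint64_t)r[i].r_addend;
}
```

In this model \tts{apply\_rela} can run once at prelink time with the predicted symbol values and again at startup of a non-prelinked executable with different values, and the second run still produces the right result because the addend was never stored only in the overwritten word.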
When all shared libraries an executable (directly or indirectly) depends on
are prelinked, relocations in the executable are handled similarly to
relocations in shared libraries. Unfortunately, not all symbols resolve the
same when looked up in a shared library's natural symbol search scope
(i.e. as it is done at the time the shared library is prelinked) and when
looked up in application's global symbol search scope. Such symbols are
herein called {\sl conflicts} and the relocations against those symbols
{\sl conflicting relocations}. Conflicts depend on the executable, all its
shared libraries and their respective order. They are only computable
for the shared libraries linked to the executable (libraries mentioned in
\tts{DT\_NEEDED} dynamic tags and shared libraries they transitively need).
The set of shared libraries loaded via \tts{dlopen(3)} cannot be predicted
by \tts{prelink}, nor can the order in which they are loaded or the time
when they are unloaded. When the dynamic linker prints symbol lookups
done in the executable, it also prints conflicts. \tts{Prelink} then
takes all relocations against those symbols and builds a special
\tts{RELA} section with conflict fixups and stores it into the
prelinked executable. Also a list of all dependent shared libraries
in the order they appear in the symbol search scope, together
with their checksums and times of prelinking is stored in another special
section.
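The definition of a conflict can be modeled in a few lines of C. This sketch only illustrates the definition; as described above, \tts{prelink} itself obtains conflicts from the dynamic linker's output. The \tts{struct object} layout, the fixed array limits and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJS 16
#define MAX_DEPS 4
#define MAX_SYMS 4

/* Toy model of an ELF object (hypothetical layout). */
struct object {
    const char *name;
    int deps[MAX_DEPS];         /* indices of direct dependencies, in order */
    size_t ndeps;
    const char *defs[MAX_SYMS]; /* names of symbols this object defines */
    size_t ndefs;
};

/* Breadth-first search scope starting at root; only the first occurrence
   of each object is kept. Returns the number of objects in the scope. */
size_t build_scope(const struct object *objs, int root, int *scope)
{
    int seen[MAX_OBJS] = {0};
    size_t head = 0, tail = 0;

    assert(root >= 0 && root < MAX_OBJS);
    scope[tail++] = root;
    seen[root] = 1;
    while (head < tail) {
        const struct object *o = &objs[scope[head++]];
        for (size_t i = 0; i < o->ndeps; i++) {
            int d = o->deps[i];
            if (!seen[d]) {
                seen[d] = 1;
                scope[tail++] = d;
            }
        }
    }
    return tail;
}

/* Index of the first object in the scope defining sym, or -1 if none. */
int lookup_symbol(const struct object *objs, const int *scope, size_t n,
                  const char *sym)
{
    for (size_t i = 0; i < n; i++) {
        const struct object *o = &objs[scope[i]];
        for (size_t j = 0; j < o->ndefs; j++)
            if (strcmp(o->defs[j], sym) == 0)
                return scope[i];
    }
    return -1;
}

/* A symbol referenced from library lib is a conflict if it resolves
   differently in lib's natural scope and in the executable's global scope. */
int is_conflict(const struct object *objs, int exe, int lib, const char *sym)
{
    int natural[MAX_OBJS], global[MAX_OBJS];
    size_t nn = build_scope(objs, lib, natural);
    size_t ng = build_scope(objs, exe, global);

    return lookup_symbol(objs, natural, nn, sym)
        != lookup_symbol(objs, global, ng, sym);
}
```

For instance, if an executable defines its own \tts{malloc} and depends on a library that in turn depends on the C library, the library's reference to \tts{malloc} is a conflict in this model, while its reference to \tts{free} is not.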
The dynamic linker first checks if it is itself prelinked. If yes,
it can avoid its preliminary relocation processing (this one is done
with just the dynamic linker itself in the search scope, so that
all routines in the dynamic linker can be used easily without too many
limitations). When it is about to start a program, it first looks
at the library list section created by \tts{prelink} (if any) and
checks whether the listed libraries are present in the symbol search scope in the same
order, that none have been modified since prelinking and that no
new shared libraries have been loaded either. If all these conditions are
satisfied, prelinking can be used. In that case the dynamic linker
processes the fixup section and skips all normal relocation handling.
If one or more of the conditions are not met, the dynamic linker continues
with normal relocation processing in the executable and all shared libraries.
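The startup check described above can be sketched as a comparison of two lists. This is a simplified model, not the dynamic linker's code: \tts{struct lib\_entry}, its fields and \tts{prelink\_usable} are hypothetical, and the real on-disk format of the library list section is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one library, as stored at prelink time or as
   observed for a library actually loaded at startup. */
struct lib_entry {
    const char *name;
    uint32_t checksum;
    uint32_t prelink_time;
};

/* Returns 1 if the prelink information can be used, 0 if the dynamic
   linker has to fall back to normal relocation processing. */
int prelink_usable(const struct lib_entry *recorded, size_t nrecorded,
                   const struct lib_entry *loaded, size_t nloaded)
{
    assert(nrecorded == 0 || recorded != NULL);
    assert(nloaded == 0 || loaded != NULL);

    /* A library was added to or removed from the search scope. */
    if (nrecorded != nloaded)
        return 0;
    for (size_t i = 0; i < nrecorded; i++) {
        /* Same libraries in the same order. */
        if (strcmp(recorded[i].name, loaded[i].name) != 0)
            return 0;
        /* Unmodified since prelinking. */
        if (recorded[i].checksum != loaded[i].checksum
            || recorded[i].prelink_time != loaded[i].prelink_time)
            return 0;
    }
    return 1;
}
```

Any mismatch, whether a reordered scope, an upgraded library or an extra preloaded library, makes the function return 0, which corresponds to the fallback to normal relocation processing.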
\section{Collecting executables and libraries which should be prelinked}
Before the actual work can start the \tts{prelink} tool needs to collect the
filenames of executables and libraries it is supposed to prelink.
It doesn't make any sense to prelink a shared library if no executable is
linked against it because the prelinking information will not be used anyway.
Furthermore, when \tts{prelink} needs to do a \tts{REL} to \tts{RELA}
conversion of relocation sections in the shared library (see later)
or when it needs to convert \tts{SHT\_NOBITS} \tts{PLT} section to
\tts{SHT\_PROGBITS}, a prelinked shared library might grow in size and so
prelinking is only desirable if it will speed up startup of some
program. The only change which might be useful even for shared libraries
which are never linked against, only loaded using \tts{dlopen}, is
relocating to a unique address. This is useful if there are many relative
relocations and there are pages in the shared library's writable segment
which are never written into with the exception of those relative
relocations. Such shared libraries are rare, so \tts{prelink} doesn't
handle them automatically; instead the administrator or developer can
use \tts{prelink --reloc-only={\sl ADDRESS}} to relocate such a library manually.
Prelinking an executable requires all shared libraries it is linked against
to be prelinked already.
\tts{Prelink} has two main modes in which it collects filenames.
One is {\sl incremental prelinking}, where \tts{prelink} is invoked without
the \tts{-a} option. In this mode, \tts{prelink} queues for prelinking
all executables and shared libraries given on the command line, all executables
in directory trees specified on the command line, and all shared libraries
those executables and shared libraries are linked against.
For the reasons mentioned earlier a shared library is queued only if a
program is linked with it or the user tells the tool to do it anyway
by explicitly mentioning it on the command line.
The second mode is {\sl full prelinking}, where the \tts{-a} option is
given on the command line. In addition to everything queued in incremental mode,
this mode queues all executables found in directory trees specified in \tts{prelink.conf}
(which typically includes all or most directories where system executables
are found). For each directory subtree in the config file the user
can specify whether symbolic links to places outside of the tree are to be followed
or not and whether searching should continue even across filesystem
boundaries.
There is also an option to blacklist some executables or directory trees
so that the executables or anything in the directory trees will not
be prelinked. This can be specified either on the command line or in
the config file.
\tts{Prelink} will not attempt to change executables which use a non-standard
dynamic linker
\footnote{The standard dynamic linker path is hardcoded in the \tts{prelink} executable for each
architecture. It can be overridden from the command line, but only with
one dynamic linker name (normally, multiple standard dynamic linkers are
used when prelinking mixed architecture systems).}
for security reasons, because it actually needs to execute the dynamic
linker for symbol lookup and it needs to avoid executing some random
unknown executable with the permissions with which \tts{prelink} is run
(typically \tts{root}, with the permissions at least for changing all
executables and shared libraries in the system). The administrator should
ensure that \tts{prelink.conf} doesn't contain world-writable directories
and such directories are not given to the tool on the command line either,
but the tool should be distrustful of the objects nevertheless.
Also, \tts{prelink} will not change shared libraries which are not specified
directly on the command line or located in the directory trees specified on the
command line or in the config file. This is so that
e.g. \tts{prelink} doesn't try to change shared libraries on shared
networked filesystems, or at least it is possible to configure the tool
so that it doesn't do it.
For each executable and shared library it collects, \tts{prelink} executes
the dynamic linker to list all shared libraries it depends on, checks if
it is already prelinked and whether any of its dependencies changed.
Objects which are already prelinked and have no dependencies which changed
don't have to be prelinked again (with the exception when e.g. virtual
address space layout code finds out it needs to assign new virtual address space slots
for the shared library or one of its dependencies). Running the dynamic
linker to get the symbol lookup information is quite a costly
operation especially on systems with many executables and shared libraries
installed, so \tts{prelink} offers a faster \tts{-q} mode. In all modes,
\tts{prelink} stores modification and change times of each shared library
and executable together with all object dependencies and other information
into \tts{prelink.cache} file. When prelinking in \tts{-q} mode, it
just compares modification and change times of the executables and shared
libraries (and all their dependencies). Change time is needed because
\tts{prelink} preserves modification time when prelinking (as well as
permissions, owner and group). If the times match, it assumes the
file has not changed since last prelinking. Therefore the file can be
skipped if it is already prelinked and none of the dependencies changed.
If any time changed or one of the dependencies changed, it invokes the
dynamic linker the same way as in normal mode to find out real dependencies,
whether it has been prelinked or not etc. The collecting phase in normal
mode can take a few minutes, while in quick mode it usually takes just a few
seconds, as the only thing it does is issue a large number of \tts{stat}
system calls.
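The quick mode test can be sketched in C roughly as follows. The
\tts{struct cached\_times} record and both function names are assumptions
made for this illustration; the actual layout of \tts{prelink.cache} is not
shown here.
\noindent{\small\begin{verbatim}
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache record: times remembered for one file.  */
struct cached_times {
  time_t mod_time;      /* st_mtime at the time of prelinking */
  time_t change_time;   /* st_ctime at the time of prelinking */
};

/* Both times must match; the modification time alone is not enough,
   because prelink preserves it when it rewrites a file.  */
int times_match(const struct stat *st, const struct cached_times *c)
{
  return st->st_mtime == c->mod_time && st->st_ctime == c->change_time;
}

/* Return 1 if the file looks unchanged since the cache was written,
   0 if it changed or cannot be examined (slow path needed).  */
int looks_unchanged(const char *path, const struct cached_times *c)
{
  struct stat st;

  if (stat(path, &st) != 0)
    return 0;
  return times_match(&st, c);
}
\end{verbatim}}
A file would only be skipped if this test succeeds for the file itself and
for all of its dependencies.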
\section{Assigning virtual address space slots}
\tts{Prelink} has to ensure at least that for all successfully prelinked
executables all shared libraries they are (transitively) linked against
have non-overlapping virtual address space slots (furthermore they
cannot overlap with the virtual address space range used by the executable
itself, its \tts{brk} area, typical stack location and \tts{ld.so.cache}
and other files mmaped by the dynamic linker in early stages of dynamic
linking (before all dependencies are mmaped)). If there were any overlaps,
the dynamic linker (which mmaps the shared libraries at the desired location
without \tts{MAP\_FIXED} mmap flag so that it is only a soft requirement) would
not manage to mmap them at the assigned locations and the prelinking
information would be invalidated (the dynamic linker would have to do all
normal relocation handling and symbol lookups). Executables are linked against
a very wide variety of shared library combinations and that has to be taken
into account.
The simplest approach is to sort shared libraries by descending
usage count (so that most often used shared libraries like the dynamic
linker, \tts{libc.so} etc. are close to each other) and assign them
consecutive slots starting at some architecture specific base address
(with a page or two in between the shared libraries to allow for a limited
growth of shared libraries without having to reposition them).
\tts{Prelink} has to find out which shared libraries will need
a \tts{REL} to \tts{RELA} conversion of relocation sections
and for those which will need the conversion account for the increased size
of the library's loadable segments. This is \tts{prelink} behavior without
\tts{-m} and \tts{-R} options.
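A minimal C sketch of this simplest layout follows. The structure, the
function names, the 4KB page size and the single spare page between
libraries are assumptions chosen for the illustration, not values taken
from the \tts{prelink} sources.
\noindent{\small\begin{verbatim}
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 0x1000u   /* assumed page size */

struct lib {
  const char *name;
  unsigned usage;   /* number of executables linked against it */
  uint32_t size;    /* loadable size, including REL to RELA growth */
  uint32_t base;    /* filled in by assign_slots */
};

/* qsort comparison: higher usage count sorts first.  */
static int by_usage_desc(const void *a, const void *b)
{
  const struct lib *x = a, *y = b;
  return (x->usage < y->usage) - (x->usage > y->usage);
}

static uint32_t page_align(uint32_t v)
{
  return (v + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Sort by descending usage and hand out consecutive slots starting at
   base, leaving one spare page after each library.  Returns the first
   address above the last slot.  */
uint32_t assign_slots(struct lib *libs, size_t n, uint32_t base)
{
  size_t i;

  qsort(libs, n, sizeof libs[0], by_usage_desc);
  for (i = 0; i < n; i++) {
    libs[i].base = base;
    base = page_align(base + libs[i].size) + SKETCH_PAGE_SIZE;
  }
  return base;
}
\end{verbatim}}
The \tts{base} argument stands for the architecture specific base address
discussed below.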
The architecture specific base address is best located a few megabytes above
the location where \tts{mmap} with \tts{NULL} first argument and without
\tts{MAP\_FIXED} starts allocating memory areas (in Linux this is the value
of \tts{TASK\_UNMAPPED\_BASE} macro).
\footnote{\tts{TASK\_UNMAPPED\_BASE} has been chosen
on each platform so that there is enough virtual memory for both the
\tts{brk} area (between executable's end and this memory address) and \tts{mmap}
area (between this address and bottom of stack).} The reason for not
starting to assign addresses in \tts{prelink} immediately at
\tts{TASK\_UNMAPPED\_BASE} is that \tts{ld.so.cache} and other mappings by
the dynamic linker will end up in the same range and could overlap with
the shared libraries. Also, if some application uses \tts{dlopen} to load
a shared library which has been prelinked,
\footnote{Typically this is because some other executable is linked against that
shared library directly.}
those few megabytes above \tts{TASK\_UNMAPPED\_BASE} increase the probability
that the library's assigned slot will be still unused (it can clash with e.g.
non-prelinked shared libraries loaded by \tts{dlopen} earlier
\footnote{If shared libraries have first \tts{PT\_LOAD} segment's virtual
address zero, the kernel typically picks first empty slot above
\tts{TASK\_UNMAPPED\_BASE} big enough for the mapping.} or other kinds
of mmap calls with \tts{NULL} first argument like \tts{malloc} allocating
big chunks of memory, mmaping of locale database, etc.).
This simplest approach is unfortunately problematic on 32-bit (or 31-bit)
architectures where the total virtual address space for a process is
somewhere between 2GB (S/390) and almost 4GB (Linux IA-32 4GB/4GB kernel
split, AMD64 running 32-bit processes, etc.). Typical installations these
days contain thousands of shared libraries and if each of them is given a
unique address space slot, on average executables will have a fairly sparse
mapping of their shared libraries and there will be less contiguous virtual
memory for application's own use
\footnote{Especially databases look these days for every byte of virtual
address space on 32-bit architectures.}.
\tts{Prelink} has a special mode, turned on with \tts{-m} option, in which
it computes what shared libraries are ever loaded together in some executable
(not considering \tts{dlopen}). If two shared libraries are ever loaded
together, \tts{prelink} assigns them different virtual address space slots,
but if they never appear together, it can give them overlapping addresses.
For example applications using \tts{KDE} toolkit link typically against many
\tts{KDE} shared libraries, programs written using the \tts{Gtk+} toolkit
link typically against many \tts{Gtk+} shared libraries, but there are just
very few programs which link against both \tts{KDE} and \tts{Gtk+} shared
libraries, and even if they do, they link against a very small subset of those
shared libraries. So all \tts{KDE} shared libraries not in that subset can
use overlapping addresses with all \tts{Gtk+} shared libraries except for the
few exceptions. This leads to a considerably smaller virtual address space
range used by all prelinked shared libraries, but it has its own
disadvantages too. It doesn't work too well with incremental prelinking,
because then not all executables are investigated, just those which are given
on \tts{prelink}'s command line. \tts{Prelink} also considers executables
in \tts{prelink.cache}, but it has no information about executables which have
not been prelinked yet. If a new executable, which links against some shared
libraries which never appeared together before, is prelinked later,
\tts{prelink} has to assign them new, non-overlapping addresses.
This means that any executables, which linked against the library
that has been moved and re-prelinked, need to be prelinked again.
If this happened during incremental prelinking, \tts{prelink} will
fix up only the executables given on the command line, leaving other
executables untouched. The untouched executables would not be able to
benefit from prelinking anymore.
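The idea behind the \tts{-m} mode can be illustrated with the following C
sketch. It uses a simple first-fit strategy, which is an assumption of this
sketch rather than a description of the actual layout code; the names
\tts{assign\_overlapping}, \tts{together} and \tts{MAXLIBS} are hypothetical.
\noindent{\small\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

#define MAXLIBS 8   /* arbitrary bound for the sketch */

struct lib {
  uint32_t size;   /* page-aligned size of the slot it needs */
  uint32_t base;   /* filled in by assign_overlapping */
};

/* together[i][j] != 0 means libraries i and j are loaded together by
   at least one executable and therefore must not overlap.  Libraries
   which never appear together may share addresses.  Address space
   exhaustion (32-bit overflow) is not handled here.  */
void assign_overlapping(struct lib *libs, size_t n,
                        unsigned char together[][MAXLIBS], uint32_t base)
{
  size_t i, j;

  for (i = 0; i < n; i++) {
    uint32_t cand = base;
    int moved;

    do {
      moved = 0;
      for (j = 0; j < i; j++)
        if (together[i][j]
            && cand < libs[j].base + libs[j].size
            && libs[j].base < cand + libs[i].size) {
          cand = libs[j].base + libs[j].size;  /* skip past conflict */
          moved = 1;
        }
    } while (moved);
    libs[i].base = cand;
  }
}
\end{verbatim}}
With one library used by everything and two libraries which never appear
together, the latter two end up at the same address. Once some executable
links against both, one of them has to move, which is exactly the situation
that forces dependent executables to be prelinked again.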
Although with the above two layout schemes shared library addresses can
vary slightly between different hosts running the same distribution
(depending on the exact set of installed executables and libraries), especially
the most often used shared libraries will have identical base addresses
on different computers. This is often not desirable for security reasons,
because it makes it slightly easier for various exploits to jump to routines
they want. Standard Linux kernels always assign the same addresses to
shared libraries loaded by the application at each run, so with these
kernels \tts{prelink} doesn't make things worse. But there are kernel
patches, such as Red Hat's \tts{Exec-Shield}, which randomize memory
mappings on each run. If shared libraries are prelinked, they cannot
be assigned different addresses on each run (prelinking information can
be only used to speed up startup if they are mapped at the base addresses
which were used during prelinking), which
means prelinking might not be desirable on some edge servers.
\tts{Prelink} can assign different addresses on different hosts though,
which is almost the same as assigning random addresses on each run
for long running processes such as daemons. Furthermore, the administrator
can force full prelinking and assignment of new random addresses every few
days (if he is also willing to restart the services, so that the old
shared libraries and executables don't have to be kept in memory).
To assign random addresses \tts{prelink} has the \tts{-R} option.
This picks a random starting address somewhere in the architecture specific
range in which shared libraries are assigned addresses, and performs minor random reshuffling
in the queue of shared libraries which need address assignment (normally
it is sorted by descending usage count; with randomization, shared libraries
which are not very far away from each other in the sorted list can be
swapped). The \tts{-R} option should work orthogonally to the \tts{-m}
option.
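The two effects of randomization can be sketched in C as follows. This is
an illustration only: the function names are hypothetical, a 4KB page size
is assumed, and \tts{rand} is used for brevity although it is not a suitable
source of randomness where security matters.
\noindent{\small\begin{verbatim}
#include <stdint.h>
#include <stdlib.h>

/* Pick a page-aligned start address in [lo, hi).  Both bounds are
   assumed to be page-aligned and at least one page apart.  */
uint32_t random_base(uint32_t lo, uint32_t hi)
{
  uint32_t pages = (hi - lo) >> 12;
  return lo + (((uint32_t) rand() % pages) << 12);
}

/* Minor reshuffling of a queue sorted by descending usage count:
   neighbours are swapped at random, and a swapped pair is skipped so
   that no entry moves by more than one position.  */
void reshuffle_neighbours(int *queue, size_t n)
{
  size_t i;

  for (i = 0; i + 1 < n; i++)
    if (rand() & 1) {
      int tmp = queue[i];
      queue[i] = queue[i + 1];
      queue[i + 1] = tmp;
      i++;
    }
}
\end{verbatim}}
Limiting the reshuffling to neighbours keeps the most often used shared
libraries close to each other, as in the non-random layout.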
Some architectures have special further requirements on shared library
address assignment. On 32-bit PowerPC, if shared libraries are located
close to the executable, so that everything fits into 32MB area, \tts{PLT}
slots resolving to those shared libraries can use the branch relative
instruction instead of more expensive sequences involving memory load
and indirect branch. If shared libraries are located in the
first 32MB of address space, \tts{PLT} slots resolving to those shared
libraries can use the branch absolute instruction (but already \tts{PLT}
slots in those shared libraries resolving to addresses in the executable
cannot be done cheaply). This means for optimization \tts{prelink}
should assign addresses from a 24MB region below the executable first, assuming
most of the executables are smaller than those remaining 8MB.
\tts{Prelink} assigns these from higher to lower addresses. When this
region is full, \tts{prelink} starts from address 0x40000
\footnote{To leave some pages unmapped to catch \tts{NULL} pointer
dereferences.} up till the bottom of the first area. Only when
all these areas are full, \tts{prelink} starts picking addresses high above
the executable, so that sufficient space is left in between to leave room
for \tts{brk}.
When \tts{-R} option is specified, \tts{prelink} needs to honor it, but
in a way which doesn't totally kill this optimization. So it picks
a random start base within each of the 3 regions separately, splitting
them into 6 regions.
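The 32MB limits mentioned above follow from the PowerPC branch instruction
encoding, which carries a 24-bit word displacement, i.e. a signed 26-bit
byte displacement. A small C sketch of the two reachability tests is given
below; the function names are hypothetical, and the possibility of reaching
the topmost 32MB of the address space with a branch absolute is ignored.
\noindent{\small\begin{verbatim}
#include <stdint.h>

/* Branch relative: word-aligned displacement of -32MB .. +32MB-4.  */
int rel_branch_reaches(uint32_t from, uint32_t to)
{
  int64_t disp = (int64_t) to - (int64_t) from;
  return disp >= -0x2000000 && disp <= 0x1fffffc && (disp & 3) == 0;
}

/* Branch absolute: word-aligned target in the first 32MB.  */
int abs_branch_reaches(uint32_t to)
{
  return to < 0x2000000u && (to & 3) == 0;
}
\end{verbatim}}
A \tts{PLT} slot can use the cheap sequence only if one of the two tests
succeeds for the address it resolves to.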
Another architecture which needs to be handled specially is IA-32
when using \tts{Exec-Shield}. The IA-32 architecture doesn't have a
bit to disable execution for each page, only for each segment. All readable
pages are normally executable. This means the stack is usually executable,
as is memory allocated by \tts{malloc}. This is undesirable for security reasons,
because exploits can then overflow a buffer on the stack to transfer control
to code they create on the stack.
Only very few programs actually need an executable stack. For example
programs using GCC trampolines for nested functions need it or when
an application itself creates executable code on the stack and calls it.
\tts{Exec-Shield} works around this IA-32 architecture deficiency
by using a separate code segment, which starts at address 0 and spans
the address space up to its limit, the highest page which needs to
be executable. This is dynamically changed when some page with higher
address than the limit needs to be executable (either because of \tts{mmap}
with \tts{PROT\_EXEC} bit set, or \tts{mprotect} with \tts{PROT\_EXEC}
of an existing mapping). This kind of protection is of course only
effective if the limit is as low as possible. The kernel tries to
put all new mappings with \tts{PROT\_EXEC} set and \tts{NULL} address low:
if possible into the {\sl ASCII Shield area} (first 16MB of address space)
\nomenclature{ASCII Shield area}{First 16MB of address space on 32-bit
architectures. These addresses have zeros in upper 8 bits,
which on little endian architectures are stored as last byte of the address
and on big endian architectures as first byte of the address.
A zero byte terminates a string, so it is hard to control the exact
arguments of a function if they are placed on the stack above the
address. On big endian machines, it is even hard to control the
low 24 bits of the address}, if not, at least below the executable.
If \tts{prelink} detects \tts{Exec-Shield}, it tries to do the same as
the kernel when assigning addresses, i.e. it prefers to assign addresses in
the {\sl ASCII Shield area} and continues with other addresses below
the program. It needs to leave the first 1MB plus 4KB of address space
unallocated though, because that range is often used by programs
using the \tts{vm86} system call.
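The address constraints for the \tts{Exec-Shield} case can be written down
as two small C predicates. The macro and function names are hypothetical
and only restate the limits given in the text.
\noindent{\small\begin{verbatim}
#include <stdint.h>

#define ASCII_SHIELD_END  0x01000000u             /* 16MB */
#define VM86_RESERVED_END (0x00100000u + 0x1000u) /* 1MB plus 4KB */

/* An address is in the ASCII Shield area if its upper 8 bits are 0.  */
int in_ascii_shield(uint32_t addr)
{
  return (addr >> 24) == 0;
}

/* Return 1 if the slot [start, start + size) lies entirely in the
   ASCII Shield area and above the range left for vm86 users.  */
int preferred_exec_shield_slot(uint32_t start, uint32_t size)
{
  return start >= VM86_RESERVED_END
         && start < ASCII_SHIELD_END
         && size <= ASCII_SHIELD_END - start;
}
\end{verbatim}}
Slots which fail the second test would be placed elsewhere below the
program instead.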
\section{Relocation of libraries}
When a shared library has a base address assigned, it needs to be relocated
so that the base address is equal to the first \tts{PT\_LOAD} segment's
\tts{p\_vaddr}. The result of this operation should be bitwise identical
to what linking the library with that base address originally would have produced.
That is, the following scripts should produce identical output:
\noindent{{\small\begin{verbatim}
$ gcc -g -shared -o libfoo.so.1.0.0 -Wl,-h,libfoo.so.1 \
input1.o input2.o somelib.a
$ prelink --reloc-only=0x54321000 libfoo.so.1.0.0
\end{verbatim}
\prelinklistingcaption{Script to relocate a shared library after linking using \tts{prelink}}}
and:
\noindent{\small\begin{verbatim}
$ gcc -shared -Wl,--verbose 2>&1 > /dev/null \
| sed -e '/^======/,/^======/!d' \
-e '/^======/d;s/0\( + SIZEOF_HEADERS\)/0x54321000\1/' \
> libfoo.so.lds
$ gcc -Wl,-T,libfoo.so.lds -g -shared -o libfoo.so.1.0.0 \
-Wl,-h,libfoo.so.1 input1.o input2.o somelib.a
\end{verbatim}}
\prelinklistingcaption{Script to link a shared library at non-standard base}}
The first script creates a normal shared library with the default
base address 0 and then uses \tts{prelink}'s special mode when it just
relocates a library to a given address. The second script first modifies
a built-in GNU linker script for linking of shared libraries, so that
the base address is the one given instead of zero and stores it into a
temporary file. Then it creates a shared library using that linker script.
The relocation operation involves mostly adding the difference between
old and new base address to all \tts{ELF} fields which contain values
representing virtual addresses of the shared library
(or in the program header table also representing physical addresses).
File offsets must remain unmodified. Most places where the adjustments
need to be done are clear; \tts{prelink} just has to consult the \tts{ELF} specification
to see which fields contain virtual addresses.
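As an illustration, adjusting two of the header tables could look like the
following C sketch, which uses the structures from \tts{<elf.h>}. The
function names are hypothetical, only 32-bit \tts{ELF} is shown, and
handling only \tts{PT\_LOAD} among the program headers is a simplification
made for this sketch.
\noindent{\small\begin{verbatim}
#include <elf.h>
#include <stddef.h>

/* Section headers: only allocated sections have a virtual address.
   sh_offset is a file offset and stays untouched.  */
void adjust_shdrs(Elf32_Shdr *shdr, size_t n, Elf32_Addr delta)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (shdr[i].sh_flags & SHF_ALLOC)
      shdr[i].sh_addr += delta;
}

/* Program headers: both the virtual and the physical address move,
   p_offset does not.  Other segment types are left out here.  */
void adjust_load_phdrs(Elf32_Phdr *phdr, size_t n, Elf32_Addr delta)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (phdr[i].p_type == PT_LOAD) {
      phdr[i].p_vaddr += delta;
      phdr[i].p_paddr += delta;
    }
}
\end{verbatim}}
Here \tts{delta} is the difference between the new and the old base address.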
One problem is with absolute symbols. \tts{Prelink} has no way to find
out if an absolute symbol in a shared library is really meant as
absolute and thus not changing during relocation, or if it is an address
of some place in the shared library outside of any section or on their
edge. For instance symbols created in the GNU linker's script outside
of section directives all have the \tts{SHN\_ABS} section index, yet they can be
a location in the library (e.g. \tts{symbolfoo~=~.}) or they can be absolute
(e.g. \tts{symbolbar~=~0x12345000}). This distinction is lost at link
time. But the dynamic linker when looking up symbols doesn't make any
distinction between them; all addresses during dynamic lookup have the
load offset added to them. \tts{Prelink} chooses to relocate any absolute
symbols with a value bigger than zero; that way \tts{prelink --reloc-only}
gets bitwise identical output with linking directly at the different base
in almost all real-world cases. Thread Local Storage symbols (those with
\tts{STT\_TLS} type) are never relocated, as their values are relative
to start of shared library's thread local area.
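The rule for absolute symbols can be stated as a small C predicate. The
function name is hypothetical and the sketch covers only the cases
discussed above (absolute and \tts{STT\_TLS} symbols).
\noindent{\small\begin{verbatim}
#include <elf.h>

/* Return 1 if the st_value of an absolute symbol is adjusted when
   the library is relocated, 0 otherwise.  */
int abs_symbol_adjusted(const Elf32_Sym *sym)
{
  /* TLS symbol values are offsets into the thread local area.  */
  if (ELF32_ST_TYPE(sym->st_info) == STT_TLS)
    return 0;
  /* Absolute symbols are relocated unless their value is zero.  */
  return sym->st_shndx == SHN_ABS && sym->st_value > 0;
}
\end{verbatim}}
With this rule both \tts{symbolfoo} and \tts{symbolbar} from the example
above would be adjusted, since the two cases cannot be told apart.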
When relocating the dynamic section there are no bits which tell if
a particular dynamic tag uses \tts{d\_un.d\_ptr} (which needs to
be adjusted) or \tts{d\_un.d\_val} (which needs to be left as is).
So \tts{prelink} has to hardcode a list of well known architecture
independent dynamic tags which need adjusting and have a hook for
architecture specific dynamic tag adjustment. Sun came up with
\tts{DT\_ADDRRNGLO} to \tts{DT\_ADDRRNGHI} and \tts{DT\_VALRNGLO}
to \tts{DT\_VALRNGHI} dynamic tag number ranges, so at least as
long as these ranges are used for new dynamic tags \tts{prelink}
can relocate correctly even without listing them all explicitly.
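A sketch of such a classification in C is shown below. The function name is
hypothetical and the list of tags is illustrative rather than exhaustive;
it also omits the architecture specific hook.
\noindent{\small\begin{verbatim}
#include <elf.h>

/* Return 1 if the dynamic tag holds an address (d_un.d_ptr) which
   needs adjusting, 0 if it holds a plain value (d_un.d_val).  */
int dyn_tag_is_address(Elf32_Sword tag)
{
  switch (tag) {
  case DT_PLTGOT: case DT_HASH: case DT_STRTAB: case DT_SYMTAB:
  case DT_RELA: case DT_INIT: case DT_FINI: case DT_REL:
  case DT_JMPREL: case DT_INIT_ARRAY: case DT_FINI_ARRAY:
  case DT_PREINIT_ARRAY: case DT_VERSYM: case DT_VERDEF:
  case DT_VERNEED:
    return 1;
  default:
    break;
  }
  /* Tags in the address range need adjusting even if unknown;
     tags in DT_VALRNGLO .. DT_VALRNGHI fall through to 0.  */
  if (tag >= DT_ADDRRNGLO && tag <= DT_ADDRRNGHI)
    return 1;
  return 0;
}
\end{verbatim}}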
When relocating \tts{.rela.*} or \tts{.rel.*} sections, which is
done in architecture specific code, relative relocations and, on architectures
using \tts{.got.plt}, also \tts{PLT} relocations typically need an
adjustment. The adjustment needs to be done in either \tts{r\_addend} field
of the \tts{ElfNN\_Rela} structure, in the memory pointed by \tts{r\_offset},
or in both locations.
On some architectures what needs adjusting is not even the same for all relative relocations.
Relative relocations against some sections need to have \tts{r\_addend}
adjusted while others need to have memory adjusted.
On many architectures, first few words in \tts{GOT} are special and some
of them need adjustment.
The hardest part of the adjustment is handling the debugging sections.
These are non-allocated sections which typically have no corresponding
relocation section associated with them. \tts{Prelink} has to match the various
debuggers in which fields it adjusts and which it skips.
As of this writing \tts{prelink} should handle
\href{http://www.eagercon.com/dwarf/dwarf-2.0.0.pdf}%
{\tts{DWARF 2} [15]} standard as corrected (and extended) by
\href{http://reality.sgiweb.org/davea/dwarf3-draft8-011125.pdf}%
{\tts{DWARF 3 draft} [16]},
\href{http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/doc/stabs.texinfo?cvsroot=src}%
{\tts{Stabs} [17]} with GCC extensions and Alpha or MIPS \tts{Mdebug}.
\tts{DWARF 2} debugging information involves many separate sections,
each of them with a unique format which needs to be relocated differently.
For relocation of the \tts{.debug\_info} section compilation units \tts{prelink} has to
parse the corresponding part of the \tts{.debug\_abbrev} section, adjust all
values of attributes that are using the \tts{DW\_FORM\_addr} form and adjust embedded
location lists. \tts{.debug\_ranges} and \tts{.debug\_loc} section
portions depend on the exact place in \tts{.debug\_info} section from
which they are referenced, so \tts{prelink} has to keep track of their
base address. \tts{DWARF} debugging format is very extensible, so
\tts{prelink} needs to be very conservative when it sees unknown extensions.
It needs to fail prelinking instead of silently breaking debugging information
if it sees an unknown \tts{.debug\_*} section, unknown attribute form
or unknown attribute with one of the \tts{DW\_FORM\_block*} forms, as
they can potentially embed addresses which would need adjustment.
For \tts{stabs} \tts{prelink} tries to match GDB behavior. For
\tts{N\_FUN}, it needs to differentiate between function start and
function address which are both encoded with this type; the remaining types
either always need relocating or never do. And similarly to \tts{DWARF 2}
handling, it needs to reject unknown types.
The relocation code in \tts{prelink} is a little bit more generic
than what is described above, as it is used also by other parts of
\tts{prelink}, when growing sections in the middle of the shared library
during \tts{REL} to \tts{RELA} conversion. All adjustment functions
get passed both the offset they should add to virtual addresses and
a start address. Adjustment is only done if the old virtual address
was greater than or equal to the start address.
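The core of this generic adjustment fits into a one-line C helper. The
function name \tts{adjust\_addr} is hypothetical and 32-bit addresses are
assumed.
\noindent{\small\begin{verbatim}
#include <stdint.h>

/* Add delta to addr, but only if addr is at or above start.  */
uint32_t adjust_addr(uint32_t addr, uint32_t start, uint32_t delta)
{
  return addr >= start ? addr + delta : addr;
}
\end{verbatim}}
When a section grows in the middle of the library, \tts{start} would be the
address at which the growth happens and \tts{delta} the amount of growth,
so that everything below that address stays where it was.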
\section{REL to RELA conversion}
On architectures which normally use the \tts{REL} format for relocations instead
of \tts{RELA} (IA-32, ARM and MIPS), if certain relocation types use the
memory \tts{r\_offset} points to during relocation, \tts{prelink} has to
either convert them to a different relocation type which doesn't use
the memory value, or the whole \tts{.rel.dyn} section needs to be converted
to \tts{RELA} format. Let's illustrate this with an example on the IA-32 architecture:
\noindent{{\small\begin{verbatim}
$ cat > test1.c <<EOF