web-vs-c++.fqa

﻿C++ criticism by other people
{}
This page is a collection of the best C++ criticism by FQA readers, copied
from e-mail messages and online discussions. If you know an interesting consequence
of C++ problems not mentioned in the FQA, please [http://yosefk.com send me e-mail]. Similarly to
[http://yosefk.com/c++fqa/web-vs-fqa.html the FQA errors] page, this one lists things
that can be proved \/ tested rather than qualitative statements. The stuff is published with
credits or anonymously, according to the choice of each author.

The issues listed here (or the FQA itself) are not supposed to be "new" in the sense that
they were never discussed in a published work
(if C++ problems were so hard to discover as to take decades, discussing them wouldn't necessarily be worth the trouble).

There's lots of (well-reasoned or entertaining or both)
C++ criticism on the web, including several pieces by celebrity programmers. However, I made
the decision not to cite famous quotes by celebrities on this site. The main reason
is that I don't want to make the FQA more convincing to the people who mostly value the credentials
of an author and ignore things like facts and reasoning. I want to work less both with C++
/and/ with these people. So I'd rather have them use C++ than convince them to switch to something else.
`<!-- h2toc -->`

`<h2>`Implicit type conversions`</h2>`

/Anonymous:/ I'd add something about the broken type system - how the following code is legal
and compiles without warnings with most compilers, for example:

@
void foo(const std::string &) {}
int main() 
{
    foo(false);
}
@

Another example:

@
class A { public: int a; };
class B : public A { public: int b; };
int main() 
{
    A * p = new B[10];
    p[5].a = 1;
}
@

Why should a static type system allow this without an explicit cast?

/Yossi:/ The FQA doesn't talk much about implicit type conversions, since the FAQ doesn't. The problem
is quite important though. It wouldn't be so bad if C++ detected run time errors
(as opposed to compiling the second example to code modifying the wrong place), and\/or
if so many C++ programmers didn't think that "with C++, when it compiles and links, it will run correctly"
(I actually heard this one, and then there are many large C++ monolithic applications without unit tests
 speaking for themselves).

Note that this has nothing to do with safety and C++ being a "power tool allowing you to do dangerous things"
because it's so "high-performance". This argument only makes sense for /explicit/ casts. What the code
demonstrates is unexpected interactions between pairs of different implicit conversions
(in the first example, |bool -> char* -> std::string|, in the second - |B[] -> B* -> A*|).                                                         

By the way, the second example explains
[21.5 the FAQ's remark] about arrays being [6.15 evil] in the context of inheritance and substitutability.
The thing is that with arrays of objects, there's an implicit type conversion that allows you to violate
 the substitutability principle without a compile time error. With |std::vector<B>|, there's no implicit
 cast. I didn't understand the FAQ was talking about that, because most of the time, when inheritance and
 polymorphism are involved, you allocate arrays of /pointers/ to objects. And in that case, there's no
 difference between a |vector<B*>| and a |B**| - you'd need an explicit cast in both cases. I automatically thought
 the question was the continuation of the discussion in [21.2 preceding questions]
 about /why/ the compiler wouldn't do the cast implicitly, and in that context
 the "arrays are evil" remark didn't quite fit. I completely forgot about the arrays-of-objects case,
 which is something a newbie coming from another language
 (and with C++, some people stay "newbies" for years) could very well try to use.

`<h2>`C++ grammar: the type name vs object name issue`</h2>`
 
/drorz:/ In C\/C++ you can not separate parsing into separate syntax and semantic passes.
 No existing compiler does it in two separate passes.

In the example:

@
AA BB(CC);
@
 
The parse tree is different in the following cases:

`<ul>
<li>When AA and CC are types, BB is a function prototype.</li>
<li>When AA is a type and CC is a variable, then BB is a variable/object initialization.</li>
<li>When AA is a variable, `|AA BB(CC)|` is illegal and its parse tree is entirely different from the first two cases.</li>
 </ul>`

You can not (more precisely, no one did it in a real C\/C++ compiler) fix a wrong parse tree in semantic analysis pass.

Consider this example:

@
x * y(z);
@
 
in two different contexts:

@
int main() {
    int x, y(int), z;
    x * y(z);
}
@

and

@
int main() {
    struct x { x(int) {} } *z;
    x * y(z);
}
@
 
In the first case |x * y(z)| is expression, and in the second case it is a declaration of pointer y.
 Parse trees for those cases are completely different.

/Yossi:/ This is the first part of the problem making the C++ grammar undecidable. The second
 part of the problem is that |AA| may really be |Template<Params>::InnerDef|, and figuring
 out whether |InnerDef| is a type name or an object name is equivalent to solving the halting
 problem, since templates may instantiate themselves recursively and in fact represent arbitrary
 recursive functions. Maybe I'll expand on this one later. In particular, it has to do with
 template specializations, which are discussed in the next item.

Purists who don't like the "nearly context-free" expression in Defective C++: when you write
parsers, it /does/ make sense to discuss the "extent" of your dependence on the context.
 For example, C++ inherits the type name\/object name riddle from C. But in C, you can
 solve it using a single dictionary of |typedef| names. Of course, theoretically the
 important part is that the C grammar is /decidable/ (though not context-free). In practice,
 what matters is that it's /easy to parse/. In particular, you can use a parser generator for
 context-free grammars (|yacc|\/|bison| is one mature program in this family) with the
 simple "symbol table hack" described above, and get a working parser. This is what
 "nearly context-free" means.

I think the example is excellent since it took me quite some time to figure out what the code
 means myself
 (I think it's the asterisk). The |AA BB(CC);| example used in Defective C++ is simpler, but I think it doesn't convey
  the point as clearly, since apparently it makes it intuitively easier to counter with something like "you can
  solve the ambiguity at the semantical analysis stage". Note that you can /always/ counter with that -
  for example, you can say that the "parse tree" of your language is simply the list of characters
  in the file, and the rest is semantical analysis.

`<h2>`C++ grammar: type vs object and template specializations`</h2>`

/drorz:/ Consider the following example:

@  
#include <cstdio>
template<int n> struct confusing
{
    static int q;
};
template<> struct confusing<1>
{
    template<int n>
    struct q
    {
        q(int x)
        {
            printf("Separated syntax and semantics.\\n");
        }
        operator int () { return 0; }
    };
};
char x;
int main()
{
    int x = confusing<sizeof(x)>::q < 3 > (2);
    return 0;
}
@

If you "didn't care" about semantics during parsing, then |confusing<1>::q| is a typename,
  so |confusing<1>::q<3>(2)| creates an object of type |confusing<1>::q<3>| with the argument 2.

If you "do" semantics during the syntax pass, then |confusing<4>| will be looked up,
  |confusing<4>::q| is a variable. The declaration would "expand" to |int x = (confusing<4>::q < 3) > 2|.

You can see that parse trees in those cases are completely different, based on the output of the |sizeof| operator!

/Yossi:/ ...and |sizeof| depends on the platform and the implementation details of inheritance
  (including multiple and virtual), virtual functions, etc. The "parser" gets closer and closer
  to a full-blown compiler.

The problem is the freedom that template specializations have when defining members. Now if anybody
  showed me a /useful/ application of the ability to define something as an inner type in one specialization
  and a static variable in some other specialization, I'd be surprised.

The [http://programming.reddit.com/info/5z7jr/comments/c02bmog reddit thread]
  which this and the previous example are taken from has a detailed discussion about parsing C++.

`<h2>`printf, iostream and internationalization`</h2>`

/Alexander E. Patrakov (patrakov at ums dot usu dot ru):/  The FQA
lists [15.1 valid information] for and against the use of |<iostream>| instead of |<cstdio>|.
  There is,
however, one more thing for |<cstdio>| and against |<iostream>|: the
possibility to translate program messages to a different natural
language (using, e.g., |gettext|). And here I don't mean that there is
currently no gettext equivalent for C++ iostreams, but that there is no
way to design such thing correctly.

Translation works on phrases, not on their parts. Consider, e.g., such C
statements:

@
printf("Read %d files\\n", total);
printf("New data were found in %d files\\n", found);
@
  
With the standard C++ iostreams, this becomes:

@
cout << "Read " << total << " files\\n";
cout << "New data were found in " << found << " files\\n";
@
  
A well-designed program fetches translations from a message catalog,
Windows resource or anywhere else except its own source code. With C
and gettext message catalogs, the translator sees the whole phrases such as /"Read %d files",
"New data were found
in %d files"/, etc. If the same approach were applied to C++, the
translator would see just /"Read ", "New data were found in "/, and /"files"/ (used twice). Lack of context is the least of all worries. The
real problem is that, e.g., when translating to Russian, the two
instances of " files" have to be translated slightly differently, because
Russian has six grammatical cases and different cases are required
in the two sentences:

@
Read %d files => Прочитано %d файлов
New data were found in %d files => Новые данные были найдены в %d файлах
@

(approximately - I don't want to
overwhelm the example with the singular\/plural treatment)

Even worse, examples exist with two format substitutions where they have
to be reordered when translating. C (or, more precisely, the Single UNIX
Specification) allows such reordering with something like |printf("%2$d x
%1$d inches", width, num);| but in C++ the output order of fields is
hard-coded.

The downside is, of course, that nobody except the translator checks
the translated format string, and wrongly-copied conversion specifiers
can crash a program in the corresponding locale (and this did happen
with sed and vim in the past).

See how Trolltech
[http://doc.trolltech.com/4.3/i18n.html#use-qstring-arg-for-dynamic-text handles] the abovementioned problems in their Qt
toolkit.

/Yossi:/ I really like this example because it can be a real eye-opener for a practical programmer,
 and I wish I heard and thought about it several years ago. Clearly the |printf| interface gets
 something right that |iostream| doesn't, since it seems to save us lots of trouble. What is it that |printf|
 gets right? Could it be that representing the program structure using compile time constructs incomprehensible
 to any tool except for the compiler is not the way to go? Effectively the advantage of the |printf| program
 is that it's easier for /other programs/ to manipulate. The idea that backfires is that program structure may be encoded
 in abitrarily complex ways and the only one who ever has to worry about it is the compiler writer.

But maybe this translation business is a singularity in the computing universe, and we
 shouldn't infer general conclusions from it? Well, here's another example. Suppose you want to do real time logging. You don't have
 enough time and\/or bandwidth to do the formatting at the target machine. And yet you want to log free
 text, not some strict binary format with versioning schemes and fixed size limits and other headaches.
 With |printf|-style interface, you can log packets of (for example) 32 bit words - size, constant format string
 pointer, and the list of arguments. You can then extract the format strings from the executable file
 (reading ELF or COFF files is easy - there are examples on the net of about 200 lines of C code),
 and do the formatting at the host machine. Now, with |iostream|-like interface, the format string is split to many
 little parts, and all kinds of types come in the middle - types of data items have to be encoded in the logged packets, too.
 And you'd have to log calls to I\/O manipulators such as |hex|, |setfill|, etc.
 Clearly the overhead per logged data word is going to increase significantly.

Think about it: how can it be that a simplistic "1 format string plus N arguments of dynamic types" interface beats
 an advanced "statically dispatched polymorphic operators" interface,
 and what makes it surprising to you?

`<h2>`Static binding rules`</h2>`

/Miguel Catalina:/ The following test program does not compile under gcc 4.3.{1,2}:

@
#include <cmath>
struct my_class {
  my_class(int) {}
};
inline my_class operator&&(my_class,int){return my_class(0);}
int main(void)
{
  double x = std::pow(1.0, 1.0);
  (void) x; // to avoid unused variable warning
}
@

The error message is:

@
$ g++ -Wall -pedantic-errors simple_test.cpp
simple_test.cpp: In function ‘int main()’:
simple_test.cpp:11: error: ambiguous overload for `operator&&' in `std::__traitor<std::__is_integer<double>, std::__is_floating<double> >::__value && std::__traitor<std::__is_integer<double>, std::__is_floating<double> >::__value'
simple_test.cpp:11: note: candidates are: operator&&(bool, bool) <built-in>
simple_test.cpp:7: note:                 my_class operator&&(my_class, int)
@

The reason for this error has to do with this declaration in |cmath|:

@
  template<typename _Tp, typename _Up>
    inline
    typename __gnu_cxx::__promote_2<
    typename __gnu_cxx::__enable_if<__is_arithmetic<_Tp>::__value
				    && __is_arithmetic<_Up>::__value,
				    _Tp>::__type, _Up>::__type
    pow(_Tp __x, _Up __y)
    {
      typedef typename __gnu_cxx::__promote_2<_Tp, _Up>::__type __type;
      return pow(__type(__x), __type(__y));
    }
@

Tracing a few header files up brings us to |bits\/cpp_type_traits.h|:
@
  template<typename _Tp>
    struct __is_arithmetic
    : public __traitor<__is_integer<_Tp>, __is_floating<_Tp> >
    { };
@

...and:

@
  template<class _Sp, class _Tp>
    struct __traitor
    {
      enum { __value = bool(_Sp::__value) || bool(_Tp::__value) };
      typedef typename __truth_type<__value>::__type __type;
    };
@

So it turns out that the operation that is giving us trouble is the &&
inside the |__enable_if| in the template declaration of |pow()|.  We are
invoking && with two |enum| operands (|__is_arithmetic<T>::__value| is an
unnamed |enum|).  I guess the compiler is treating unnamed |enum|s as
|int|s.  So the compiler is
trying to call |operator&&(int,int)|.  But there isn't,
there are only |operator&&(my_class,int)| and |operator&&(bool,bool)|.  So
the compiler is trying to do an implicit conversion of the operands so
that they can match the available prototypes.  There are implicit ways
of converting an |int| to a |my_class|, as well as convering an |int| to a
|bool|.  The compiler does not know which one to use, hence the
ambiguity.

The question is: why on Earth when you are trying to invoke a
function that only deals with |double|s, do you have to deal with the
ambiguity between two available implicit conversions for types that
have nothing to do with |double|?

/Yossi:/ Takes time to wrap one's mind around this, um, treason
(don't you just love the |public __traitor| bit? I guess a "traitor"
is something used to generate so-called "type traits", a key idiom
in the world of C++ templates arcana). Now that we (presumably) understand
the error message, how would you work around the problem? If the compiler
would barf trying to dispatch an operator with user-defined types,
we could specifically define the operator with the prototype it would
pick as the best match (as the GNU STL implementors themselves [35.11 do]
in similar situations). But we can't define |operator&&(int,int)|. Now what?