diff --git a/LICENSE.txt b/LICENSE.txt new file mode 100644 index 0000000..7e89def --- /dev/null +++ b/LICENSE.txt @@ -0,0 +1,70 @@ +Creative Commons +Attribution-NonCommercial-NoDerivs 3.0 Unported + + CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS LICENSE DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE INFORMATION PROVIDED, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM ITS USE. + +License + +THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED. + +BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS. + +1. Definitions + + "Adaptation" means a work based upon the Work, or upon the Work and other pre-existing works, such as a translation, adaptation, derivative work, arrangement of music or other alterations of a literary or artistic work, or phonogram or performance and includes cinematographic adaptations or any other form in which the Work may be recast, transformed, or adapted including in any form recognizably derived from the original, except that a work that constitutes a Collection will not be considered an Adaptation for the purpose of this License. For the avoidance of doubt, where the Work is a musical work, performance or phonogram, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered an Adaptation for the purpose of this License. 
+ "Collection" means a collection of literary or artistic works, such as encyclopedias and anthologies, or performances, phonograms or broadcasts, or other works or subject matter other than works listed in Section 1(f) below, which, by reason of the selection and arrangement of their contents, constitute intellectual creations, in which the Work is included in its entirety in unmodified form along with one or more other contributions, each constituting separate and independent works in themselves, which together are assembled into a collective whole. A work that constitutes a Collection will not be considered an Adaptation (as defined above) for the purposes of this License. + "Distribute" means to make available to the public the original and copies of the Work through sale or other transfer of ownership. + "Licensor" means the individual, individuals, entity or entities that offer(s) the Work under the terms of this License. + "Original Author" means, in the case of a literary or artistic work, the individual, individuals, entity or entities who created the Work or if no individual or entity can be identified, the publisher; and in addition (i) in the case of a performance the actors, singers, musicians, dancers, and other persons who act, sing, deliver, declaim, play in, interpret or otherwise perform literary or artistic works or expressions of folklore; (ii) in the case of a phonogram the producer being the person or legal entity who first fixes the sounds of a performance or other sounds; and, (iii) in the case of broadcasts, the organization that transmits the broadcast. 
+ "Work" means the literary and/or artistic work offered under the terms of this License including without limitation any production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression including digital form, such as a book, pamphlet and other writing; a lecture, address, sermon or other work of the same nature; a dramatic or dramatico-musical work; a choreographic work or entertainment in dumb show; a musical composition with or without words; a cinematographic work to which are assimilated works expressed by a process analogous to cinematography; a work of drawing, painting, architecture, sculpture, engraving or lithography; a photographic work to which are assimilated works expressed by a process analogous to photography; a work of applied art; an illustration, map, plan, sketch or three-dimensional work relative to geography, topography, architecture or science; a performance; a broadcast; a phonogram; a compilation of data to the extent it is protected as a copyrightable work; or a work performed by a variety or circus performer to the extent it is not otherwise considered a literary or artistic work. + "You" means an individual or entity exercising rights under this License who has not previously violated the terms of this License with respect to the Work, or who has received express permission from the Licensor to exercise rights under this License despite a previous violation. 
+ "Publicly Perform" means to perform public recitations of the Work and to communicate to the public those public recitations, by any means or process, including by wire or wireless means or public digital performances; to make available to the public Works in such a way that members of the public may access these Works from a place and at a place individually chosen by them; to perform the Work to the public by any means or process and the communication to the public of the performances of the Work, including by public digital performance; to broadcast and rebroadcast the Work by any means including signs, sounds or images. + "Reproduce" means to make copies of the Work by any means including without limitation by sound or visual recordings and the right of fixation and reproducing fixations of the Work, including storage of a protected performance or phonogram in digital form or other electronic medium. + +2. Fair Dealing Rights. Nothing in this License is intended to reduce, limit, or restrict any uses free from copyright or rights arising from limitations or exceptions that are provided for in connection with the copyright protection under copyright law or other applicable laws. + +3. License Grant. Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below: + + to Reproduce the Work, to incorporate the Work into one or more Collections, and to Reproduce the Work as incorporated in the Collections; and, + to Distribute and Publicly Perform the Work including as incorporated in Collections. + +The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats, but otherwise you have no rights to make Adaptations. 
Subject to 8(f), all rights not expressly granted by Licensor are hereby reserved, including but not limited to the rights set forth in Section 4(d). + +4. Restrictions. The license granted in Section 3 above is expressly made subject to and limited by the following restrictions: + + You may Distribute or Publicly Perform the Work only under the terms of this License. You must include a copy of, or the Uniform Resource Identifier (URI) for, this License with every copy of the Work You Distribute or Publicly Perform. You may not offer or impose any terms on the Work that restrict the terms of this License or the ability of the recipient of the Work to exercise the rights granted to that recipient under the terms of the License. You may not sublicense the Work. You must keep intact all notices that refer to this License and to the disclaimer of warranties with every copy of the Work You Distribute or Publicly Perform. When You Distribute or Publicly Perform the Work, You may not impose any effective technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License. This Section 4(a) applies to the Work as incorporated in a Collection, but this does not require the Collection apart from the Work itself to be made subject to the terms of this License. If You create a Collection, upon notice from any Licensor You must, to the extent practicable, remove from the Collection any credit as required by Section 4(c), as requested. + You may not exercise any of the rights granted to You in Section 3 above in any manner that is primarily intended for or directed toward commercial advantage or private monetary compensation. 
The exchange of the Work for other copyrighted works by means of digital file-sharing or otherwise shall not be considered to be intended for or directed toward commercial advantage or private monetary compensation, provided there is no payment of any monetary compensation in connection with the exchange of copyrighted works. + If You Distribute, or Publicly Perform the Work or Collections, You must, unless a request has been made pursuant to Section 4(a), keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing: (i) the name of the Original Author (or pseudonym, if applicable) if supplied, and/or if the Original Author and/or Licensor designate another party or parties (e.g., a sponsor institute, publishing entity, journal) for attribution ("Attribution Parties") in Licensor's copyright notice, terms of service or by other reasonable means, the name of such party or parties; (ii) the title of the Work if supplied; (iii) to the extent reasonably practicable, the URI, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work. The credit required by this Section 4(c) may be implemented in any reasonable manner; provided, however, that in the case of a Collection, at a minimum such credit will appear, if a credit for all contributing authors of Collection appears, then as part of these credits and in a manner at least as prominent as the credits for the other contributing authors. 
For the avoidance of doubt, You may only use the credit required by this Section for the purpose of attribution in the manner set out above and, by exercising Your rights under this License, You may not implicitly or explicitly assert or imply any connection with, sponsorship or endorsement by the Original Author, Licensor and/or Attribution Parties, as appropriate, of You or Your use of the Work, without the separate, express prior written permission of the Original Author, Licensor and/or Attribution Parties. + + For the avoidance of doubt: + Non-waivable Compulsory License Schemes. In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme cannot be waived, the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License; + Waivable Compulsory License Schemes. In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme can be waived, the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License if Your exercise of such rights is for a purpose or use which is otherwise than noncommercial as permitted under Section 4(b) and otherwise waives the right to collect royalties through any statutory or compulsory licensing scheme; and, + Voluntary License Schemes. The Licensor reserves the right to collect royalties, whether individually or, in the event that the Licensor is a member of a collecting society that administers voluntary licensing schemes, via that society, from any exercise by You of the rights granted under this License that is for a purpose or use which is otherwise than noncommercial as permitted under Section 4(b). 
+ Except as otherwise agreed in writing by the Licensor or as may be otherwise permitted by applicable law, if You Reproduce, Distribute or Publicly Perform the Work either by itself or as part of any Collections, You must not distort, mutilate, modify or take other derogatory action in relation to the Work which would be prejudicial to the Original Author's honor or reputation. + +5. Representations, Warranties and Disclaimer + +UNLESS OTHERWISE MUTUALLY AGREED BY THE PARTIES IN WRITING, LICENSOR OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTIBILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY TO YOU. + +6. Limitation on Liability. EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +7. Termination + + This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License. Individuals or entities who have received Collections from You under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License. + Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). 
Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above. + +8. Miscellaneous + + Each time You Distribute or Publicly Perform the Work or a Collection, the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to You under this License. + If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable. + No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent. + This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You. 
+ The rights granted under, and the subject matter referenced, in this License were drafted utilizing the terminology of the Berne Convention for the Protection of Literary and Artistic Works (as amended on September 28, 1979), the Rome Convention of 1961, the WIPO Copyright Treaty of 1996, the WIPO Performances and Phonograms Treaty of 1996 and the Universal Copyright Convention (as revised on July 24, 1971). These rights and subject matter take effect in the relevant jurisdiction in which the License terms are sought to be enforced according to the corresponding provisions of the implementation of those treaty provisions in the applicable national law. If the standard suite of rights granted under applicable copyright law includes additional rights not granted under this License, such additional rights are deemed to be included in the License; this License is not intended to restrict the license of any rights under applicable law. + + Creative Commons Notice + + Creative Commons is not a party to this License, and makes no warranty whatsoever in connection with the Work. Creative Commons will not be liable to You or any party on any legal theory for any damages whatsoever, including without limitation any general, special, incidental or consequential damages arising in connection to this license. Notwithstanding the foregoing two (2) sentences, if Creative Commons has expressly identified itself as the Licensor hereunder, it shall have all rights and obligations of Licensor. + + Except for the limited purpose of indicating to the public that the Work is licensed under the CCPL, Creative Commons does not authorize the use by either party of the trademark "Creative Commons" or any related trademark or logo of Creative Commons without the prior written consent of Creative Commons. 
Any permitted use will be in compliance with Creative Commons' then-current trademark usage guidelines, as may be published on its website or otherwise made available upon request from time to time. For the avoidance of doubt, this trademark restriction does not form part of this License. + + Creative Commons may be contacted at http://creativecommons.org/. \ No newline at end of file diff --git a/README.md b/README.md new file mode 100644 index 0000000..272d6c3 --- /dev/null +++ b/README.md @@ -0,0 +1,61 @@ +# You Don't Know JS (book series) + +This is a series of books diving deep into the core mechanisms of the JavaScript language. The first edition of the series is now complete. + +  +  +  +  +  + + +Please feel free to contribute to the quality of this content by submitting PR's for improvements to code snippets, explanations, etc. While typo fixes are welcomed, they will likely be caught through normal editing processes, and are thus not necessarily as important for this repository. + +**To read more about the motivations and perspective behind this book series, check out the [Preface](preface.md).** + +## Titles + +* Read online (free!): ["Up & Going"](up & going/README.md#you-dont-know-js-up--going), Published: [Buy Now](http://shop.oreilly.com/product/0636920039303.do) in print, but the ebook format is free! 
+* Read online (free!): ["Scope & Closures"](scope & closures/README.md#you-dont-know-js-scope--closures), Published: [Buy Now](http://shop.oreilly.com/product/0636920026327.do) +* Read online (free!): ["this & Object Prototypes"](this & object prototypes/README.md#you-dont-know-js-this--object-prototypes), Published: [Buy Now](http://shop.oreilly.com/product/0636920033738.do) +* Read online (free!): ["Types & Grammar"](types & grammar/README.md#you-dont-know-js-types--grammar), Published: [Buy Now](http://shop.oreilly.com/product/0636920033745.do) +* Read online (free!): ["Async & Performance"](async & performance/README.md#you-dont-know-js-async--performance), Published: [Buy Now](http://shop.oreilly.com/product/0636920033752.do) +* Read online (free!): ["ES6 & Beyond"](es6 & beyond/README.md#you-dont-know-js-es6--beyond), Published: [Buy Now](http://shop.oreilly.com/product/0636920033769.do) + +## Publishing + +These books are being released here as drafts, free to read, but are also being edited, produced, and published through O'Reilly. + +If you like the content you find here, and want to support more content like it, please purchase the books once they are available for sale, through your normal book sources. :) + +If you'd like to contribute financially towards the effort (or any of my other OSS work) aside from purchasing the books, I do have a [patreon](https://www.patreon.com/getify) that I would always appreciate your generosity towards. + + + +## In-person Training + +The content for these books derives heavily from a series of training materials I teach professionally (in both public and private-corporate workshop format), called "Advanced JS: The 'What You Need To Know' Parts". 
+ +If you like this content and would like to contact me regarding conducting training on these, or other various JS/HTML5/node.js topics, please reach out to me through any of these channels listed here: + +[http://getify.me](http://getify.me) + +## Online Video Training + +I also have some JS training material available in on-demand video format. I teach courses through [Frontend Masters](https://FrontendMasters.com), like my [Advanced JS](https://frontendmasters.com/courses/advanced-javascript/) workshop (more courses coming soon!). + +That same course is also [available through Pluralsight](http://www.pluralsight.com/courses/advanced-javascript). + +## Content Contributions + +Any contributions you make to this effort **are of course greatly appreciated**. + +However, if you choose to contribute content (not just typo corrections) to this repo, you agree that you're giving me a non-exclusive license to use that content for the book series, as I (and O'Reilly) deem appropriate. You probably guessed that already, but we just have to make the lawyers happy by explicitly stating it. + +So: blah, blah, blah... :) + +## License & Copyright + +The materials herein are all (c) 2013-2015 Kyle Simpson. + +Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. diff --git a/async & performance/README.md b/async & performance/README.md new file mode 100644 index 0000000..1f5e45e --- /dev/null +++ b/async & performance/README.md @@ -0,0 +1,23 @@ +# You Don't Know JS: Async & Performance + + + +----- + +**[Purchase digital/print copy from O'Reilly](http://shop.oreilly.com/product/0636920033752.do)** + +----- + +[Table of Contents](toc.md) + +* [Foreword](foreword.md) (by [Jake Archibald](http://jakearchibald.com)) +* [Preface](../preface.md) +* [Chapter 1: Asynchrony: Now & Later](ch1.md) +* [Chapter 2: Callbacks](ch2.md) +* [Chapter 3: Promises](ch3.md) +* [Chapter 4: Generators](ch4.md) +* [Chapter 5: Program Performance](ch5.md) +* [Chapter 6: Benchmarking & Tuning](ch6.md) +* [Appendix A: Library: asynquence](apA.md) +* [Appendix B: Advanced Async Patterns](apB.md) +* [Appendix C: Thank You's!](apC.md) diff --git a/async & performance/apA.md b/async & performance/apA.md new file mode 100644 index 0000000..dcb2376 --- /dev/null +++ b/async & performance/apA.md @@ -0,0 +1,828 @@ +# You Don't Know JS: Async & Performance +# Appendix A: *asynquence* Library + +Chapters 1 and 2 went into quite a bit of detail about typical asynchronous programming patterns and how they're commonly solved with callbacks. But we also saw why callbacks are fatally limited in capability, which led us to Chapters 3 and 4, with Promises and generators offering a much more solid, trustable, and reason-able base to build your asynchrony on. + +I referenced my own asynchronous library *asynquence* (http://github.com/getify/asynquence) -- "async" + "sequence" = "asynquence" -- several times in this book, and I want to now briefly explain how it works and why its unique design is important and helpful. + +In the next appendix, we'll explore some advanced async patterns, but you'll probably want a library to make those palatable enough to be useful. 
We'll use *asynquence* to express those patterns, so you'll want to spend a little time here getting to know the library first. + +*asynquence* is obviously not the only option for good async coding; certainly there are many great libraries in this space. But *asynquence* provides a unique perspective by combining the best of all these patterns into a single library, and moreover is built on a single basic abstraction: the (async) sequence. + +My premise is that sophisticated JS programs often need bits and pieces of various different asynchronous patterns woven together, and this is usually left entirely up to each developer to figure out. Instead of having to bring in two or more different async libraries that focus on different aspects of asynchrony, *asynquence* unifies them into variated sequence steps, with just one core library to learn and deploy. + +I believe the value is strong enough with *asynquence* to make async flow control programming with Promise-style semantics super easy to accomplish, so that's why we'll exclusively focus on that library here. + +To begin, I'll explain the design principles behind *asynquence*, and then we'll illustrate how its API works with code examples. + +## Sequences, Abstraction Design + +Understanding *asynquence* begins with understanding a fundamental abstraction: any series of steps for a task, whether they separately are synchronous or asynchronous, can be collectively thought of as a "sequence". In other words, a sequence is a container that represents a task, and is comprised of individual (potentially async) steps to complete that task. + +Each step in the sequence is controlled under the covers by a Promise (see Chapter 3). That is, every step you add to a sequence implicitly creates a Promise that is wired to the previous end of the sequence. Because of the semantics of Promises, every single step advancement in a sequence is asynchronous, even if you synchronously complete the step. 
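To make the "each step is a Promise wired to the previous end of the sequence" idea concrete, here is a minimal sketch. It is an illustration only, not how *asynquence* is actually implemented: `miniSequence()` and its `tail` variable are hypothetical names invented for this example, and error handling is left out entirely.

```js
// Hypothetical mini-sequence; NOT asynquence's real implementation.
function miniSequence() {
	// `tail` always points at the Promise for the current end of the sequence
	var tail = Promise.resolve();

	var api = {
		then: function(step) {
			// adding a step creates a new Promise wired to the previous end
			tail = tail.then( function(msg){
				return new Promise( function(resolve){
					// `resolve` plays the role of the continuation callback
					step( resolve, msg );
				} );
			} );
			return api;
		}
	};

	return api;
}

var log = [];

miniSequence()
.then( function(done){
	log.push( "step 1" );
	done( "Hello" );	// completed synchronously...
} )
.then( function(done,msg){
	log.push( "step 2: " + msg );
	done();
} );

// ...but no step has actually run yet at this point
log.push( "after setup" );
```

Because every step hangs off a Promise, `"after setup"` is recorded before either step runs, even though step 1 calls its continuation right away. Note also that the `api` object acts as a single handle for the whole sequence, no matter how many steps have been added.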
+ +Moreover, a sequence will always proceed linearly from step to step, meaning that step 2 always comes after step 1 finishes, and so on. + +Of course, a new sequence can be forked off an existing sequence, meaning the fork only occurs once the main sequence reaches that point in the flow. Sequences can also be combined in various ways, including having one sequence subsumed by another sequence at a particular point in the flow. + +A sequence is kind of like a Promise chain. However, with Promise chains, there is no "handle" to grab that references the entire chain. Whichever Promise you have a reference to only represents the current step in the chain plus any other steps hanging off it. Essentially, you cannot hold a reference to a Promise chain unless you hold a reference to the first Promise in the chain. + +There are many cases where it turns out to be quite useful to have a handle that references the entire sequence collectively. The most important of those cases is with sequence abort/cancel. As we covered extensively in Chapter 3, Promises themselves should never be able to be canceled, as this violates a fundamental design imperative: external immutability. + +But sequences have no such immutability design principle, mostly because sequences are not passed around as future-value containers that need immutable value semantics. So sequences are the proper level of abstraction to handle abort/cancel behavior. *asynquence* sequences can be `abort()`ed at any time, and the sequence will stop at that point and not continue for any reason. + +There are plenty more reasons to prefer a sequence abstraction on top of Promise chains, for flow control purposes. + +First, Promise chaining is a rather manual process -- one that can get pretty tedious once you start creating and chaining Promises across a wide swath of your programs -- and this tedium can act counterproductively to dissuade the developer from using Promises in places where they are quite appropriate. 
+ +Abstractions are meant to reduce boilerplate and tedium, so the sequence abstraction is a good solution to this problem. With Promises, your focus is on the individual step, and there's little assumption that you will keep the chain going. With sequences, the opposite approach is taken, assuming the sequence will keep having more steps added indefinitely. + +This abstraction complexity reduction is especially powerful when you start thinking about higher-order Promise patterns (beyond `race([..])` and `all([..])`). + +For example, in the middle of a sequence, you may want to express a step that is conceptually like a `try..catch` in that the step will always result in success, either the intended main success resolution or a positive nonerror signal for the caught error. Or, you might want to express a step that is like a retry/until loop, where it keeps trying the same step over and over until success occurs. + +These sorts of abstractions are quite nontrivial to express using only Promise primitives, and doing so in the middle of an existing Promise chain is not pretty. But if you abstract your thinking to a sequence, and consider a step as a wrapper around a Promise, that step wrapper can hide such details, freeing you to think about the flow control in the most sensible way without being bothered by the details. + +Second, and perhaps more importantly, thinking of async flow control in terms of steps in a sequence allows you to abstract out the details of what types of asynchronicity are involved with each individual step. Under the covers, a Promise will always control the step, but above the covers, that step can look either like a continuation callback (the simple default), or like a real Promise, or as a run-to-completion generator, or ... Hopefully, you get the picture. + +Third, sequences can more easily be twisted to adapt to different modes of thinking, such as event-, stream-, or reactive-based coding. 
*asynquence* provides a pattern I call "reactive sequences" (which we'll cover later) as a variation on the "reactive observable" ideas in RxJS ("Reactive Extensions"), that lets a repeatable event fire off a new sequence instance each time. Promises are one-shot-only, so it's quite awkward to express repetitious asynchrony with Promises alone. + +Another alternate mode of thinking inverts the resolution/control capability in a pattern I call "iterable sequences". Instead of each individual step internally controlling its own completion (and thus advancement of the sequence), the sequence is inverted so the advancement control is through an external iterator, and each step in the *iterable sequence* just responds to the `next(..)` *iterator* control. + +We'll explore all of these different variations as we go throughout the rest of this appendix, so don't worry if we ran over those bits far too quickly just now. + +The takeaway is that sequences are a more powerful and sensible abstraction for complex asynchrony than just Promises (Promise chains) or just generators, and *asynquence* is designed to express that abstraction with just the right level of sugar to make async programming more understandable and more enjoyable. + +## *asynquence* API + +To start off, the way you create a sequence (an *asynquence* instance) is with the `ASQ(..)` function. An `ASQ()` call with no parameters creates an empty initial sequence, whereas passing one or more values or functions to `ASQ(..)` sets up the sequence with each argument representing the initial steps of the sequence. + +**Note:** For the purposes of all code examples here, I will use the *asynquence* top-level identifier in global browser usage: `ASQ`. If you include and use *asynquence* through a module system (browser or server), you of course can define whichever symbol you prefer, and *asynquence* won't care! 
+ +Many of the API methods discussed here are built into the core of *asynquence*, but others are provided through including the optional "contrib" plug-ins package. See the documentation for *asynquence* for whether a method is built in or defined via plug-in: http://github.com/getify/asynquence + +### Steps + +If a function represents a normal step in the sequence, that function is invoked with the first parameter being the continuation callback, and any subsequent parameters being any messages passed on from the previous step. The step will not complete until the continuation callback is called. Once it's called, any arguments you pass to it will be sent along as messages to the next step in the sequence. + +To add an additional normal step to the sequence, call `then(..)` (which has essentially the exact same semantics as the `ASQ(..)` call): + +```js +ASQ( + // step 1 + function(done){ + setTimeout( function(){ + done( "Hello" ); + }, 100 ); + }, + // step 2 + function(done,greeting) { + setTimeout( function(){ + done( greeting + " World" ); + }, 100 ); + } +) +// step 3 +.then( function(done,msg){ + setTimeout( function(){ + done( msg.toUpperCase() ); + }, 100 ); +} ) +// step 4 +.then( function(done,msg){ + console.log( msg ); // HELLO WORLD +} ); +``` + +**Note:** Though the name `then(..)` is identical to the native Promises API, this `then(..)` is different. You can pass as few or as many functions or values to `then(..)` as you'd like, and each is taken as a separate step. There's no two-callback fulfilled/rejected semantics involved. + +Unlike with Promises, where to chain one Promise to the next you have to create and `return` that Promise from a `then(..)` fulfillment handler, with *asynquence*, all you need to do is call the continuation callback -- I always call it `done()` but you can name it whatever suits you -- and optionally pass it completion messages as arguments. + +Each step defined by `then(..)` is assumed to be asynchronous. 
If you have a step that's synchronous, you can either just call `done(..)` right away, or you can use the simpler `val(..)` step helper: + +```js +// step 1 (sync) +ASQ( function(done){ + done( "Hello" ); // manually synchronous +} ) +// step 2 (sync) +.val( function(greeting){ + return greeting + " World"; +} ) +// step 3 (async) +.then( function(done,msg){ + setTimeout( function(){ + done( msg.toUpperCase() ); + }, 100 ); +} ) +// step 4 (sync) +.val( function(msg){ + console.log( msg ); +} ); +``` + +As you can see, `val(..)`-invoked steps don't receive a continuation callback, as that part is assumed for you -- and the parameter list is less cluttered as a result! To send a message along to the next step, you simply use `return`. + +Think of `val(..)` as representing a synchronous "value-only" step, which is useful for synchronous value operations, logging, and the like. + +### Errors + +One important difference with *asynquence* compared to Promises is with error handling. + +With Promises, each individual Promise (step) in a chain can have its own independent error, and each subsequent step has the ability to handle the error or not. The main reason for this semantic comes (again) from the focus on individual Promises rather than on the chain (sequence) as a whole. + +I believe that most of the time, an error in one part of a sequence is generally not recoverable, so the subsequent steps in the sequence are moot and should be skipped. So, by default, an error at any step of a sequence throws the entire sequence into error mode, and the rest of the normal steps are ignored. + +If you *do* need to have a step where its error is recoverable, there are several different API methods that can accommodate, such as `try(..)` -- previously mentioned as a kind of `try..catch` step -- or `until(..)` -- a retry loop that keeps attempting the step until it succeeds or you manually `break()` the loop. 
*asynquence* even has `pThen(..)` and `pCatch(..)` methods, which work identically to how normal Promise `then(..)` and `catch(..)` work (see Chapter 3), so you can do localized mid-sequence error handling if you so choose. + +The point is, you have both options, but the more common one in my experience is the default. With Promises, to get a chain of steps to ignore all steps once an error occurs, you have to take care not to register a rejection handler at any step; otherwise, that error gets swallowed as handled, and the sequence may continue (perhaps unexpectedly). This kind of desired behavior is a bit awkward to properly and reliably handle. + +To register a sequence error notification handler, *asynquence* provides an `or(..)` sequence method, which also has an alias of `onerror(..)`. You can call this method anywhere in the sequence, and you can register as many handlers as you'd like. That makes it easy for multiple different consumers to listen in on a sequence to know if it failed or not; it's kind of like an error event handler in that respect. + +Just like with Promises, all JS exceptions become sequence errors, or you can programmatically signal a sequence error: + +```js +var sq = ASQ( function(done){ + setTimeout( function(){ + // signal an error for the sequence + done.fail( "Oops" ); + }, 100 ); +} ) +.then( function(done){ + // will never get here +} ) +.or( function(err){ + console.log( err ); // Oops +} ) +.then( function(done){ + // won't get here either +} ); + +// later + +sq.or( function(err){ + console.log( err ); // Oops +} ); +``` + +Another really important difference with error handling in *asynquence* compared to native Promises is the default behavior of "unhandled exceptions". As we discussed at length in Chapter 3, a rejected Promise without a registered rejection handler will just silently hold (aka swallow) the error; you have to remember to always end a chain with a final `catch(..)`. 
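
To make that last point concrete, here's a minimal sketch using only native Promises (no *asynquence* involved). The step bodies and the `seen` array are made up for illustration; the point is only that the final `catch(..)` is where an error from an earlier step lands, and that the steps in between are skipped:

```js
var seen = [];

var p = Promise.resolve( 21 )
.then( function step1(v){
	seen.push( "step 1" );
	// any JS exception becomes a rejection
	throw new Error( "Oops" );
} )
.then( function step2(){
	// skipped, because the chain is now rejected
	seen.push( "step 2" );
} )
.catch( function(err){
	// the final `catch(..)` observes the error
	seen.push( "caught: " + err.message );
	return seen;
} );

p.then( function(result){
	console.log( result ); // [ "step 1", "caught: Oops" ]
} );
```

Leave off that final `catch(..)` and nothing in this chain observes the error; how (or whether) it gets surfaced then depends on the environment you're running in.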
+ +In *asynquence*, the assumption is reversed. + +If an error occurs on a sequence, and it **at that moment** has no error handlers registered, the error is reported to the `console`. In other words, unhandled rejections are by default always reported so as not to be swallowed and missed. + +As soon as you register an error handler against a sequence, it opts that sequence out of such reporting, to prevent duplicate noise. + +There may, in fact, be cases where you want to create a sequence that may go into the error state before you have a chance to register the handler. This isn't common, but it can happen from time to time. + +In those cases, you can also **opt a sequence instance out** of error reporting by calling `defer()` on the sequence. You should only opt out of error reporting if you are sure that you're going to eventually handle such errors: + +```js +var sq1 = ASQ( function(done){ + doesnt.Exist(); // will throw exception to console +} ); + +var sq2 = ASQ( function(done){ + doesnt.Exist(); // will throw only a sequence error +} ) +// opt-out of error reporting +.defer(); + +setTimeout( function(){ + sq1.or( function(err){ + console.log( err ); // ReferenceError + } ); + + sq2.or( function(err){ + console.log( err ); // ReferenceError + } ); +}, 100 ); + +// ReferenceError (from sq1) +``` + +This is better error handling behavior than Promises themselves have, because it's the Pit of Success, not the Pit of Failure (see Chapter 3). + +**Note:** If a sequence is piped into (aka subsumed by) another sequence -- see "Combining Sequences" for a complete description -- then the source sequence is opted out of error reporting, but now the target sequence's error reporting or lack thereof must be considered. + +### Parallel Steps + +Not all steps in your sequences will have just a single (async) task to perform; some will need to perform multiple steps "in parallel" (concurrently). 
A step in a sequence in which multiple substeps are processing concurrently is called a `gate(..)` -- there's an `all(..)` alias if you prefer -- and is directly symmetric to native `Promise.all([..])`. + +If all the steps in the `gate(..)` complete successfully, all success messages will be passed to the next sequence step. If any of them generate errors, the whole sequence immediately goes into an error state. + +Consider: + +```js +ASQ( function(done){ + setTimeout( done, 100 ); +} ) +.gate( + function(done){ + setTimeout( function(){ + done( "Hello" ); + }, 100 ); + }, + function(done){ + setTimeout( function(){ + done( "World", "!" ); + }, 100 ); + } +) +.val( function(msg1,msg2){ + console.log( msg1 ); // Hello + console.log( msg2 ); // [ "World", "!" ] +} ); +``` + +For illustration, let's compare that example to native Promises: + +```js +new Promise( function(resolve,reject){ + setTimeout( resolve, 100 ); +} ) +.then( function(){ + return Promise.all( [ + new Promise( function(resolve,reject){ + setTimeout( function(){ + resolve( "Hello" ); + }, 100 ); + } ), + new Promise( function(resolve,reject){ + setTimeout( function(){ + // note: we need a [ ] array here + resolve( [ "World", "!" ] ); + }, 100 ); + } ) + ] ); +} ) +.then( function(msgs){ + console.log( msgs[0] ); // Hello + console.log( msgs[1] ); // [ "World", "!" ] +} ); +``` + +Yuck. Promises require a lot more boilerplate overhead to express the same asynchronous flow control. That's a great illustration of why the *asynquence* API and abstraction make dealing with Promise steps a lot nicer. The improvement only goes higher the more complex your asynchrony is. + +#### Step Variations + +There are several variations in the contrib plug-ins on *asynquence*'s `gate(..)` step type that can be quite helpful: + +* `any(..)` is like `gate(..)`, except just one segment has to eventually succeed to proceed on the main sequence. 
+* `first(..)` is like `any(..)`, except as soon as any segment succeeds, the main sequence proceeds (ignoring subsequent results from other segments). +* `race(..)` (symmetric with `Promise.race([..])`) is like `first(..)`, except the main sequence proceeds as soon as any segment completes (either success or failure). +* `last(..)` is like `any(..)`, except only the latest segment to complete successfully sends its message(s) along to the main sequence. +* `none(..)` is the inverse of `gate(..)`: the main sequence proceeds only if all the segments fail (with all segment error message(s) transposed as success message(s) and vice versa). + +Let's first define some helpers to make illustration cleaner: + +```js +function success1(done) { + setTimeout( function(){ + done( 1 ); + }, 100 ); +} + +function success2(done) { + setTimeout( function(){ + done( 2 ); + }, 100 ); +} + +function failure3(done) { + setTimeout( function(){ + done.fail( 3 ); + }, 100 ); +} + +function output(msg) { + console.log( msg ); +} +``` + +Now, let's demonstrate these `gate(..)` step variations: + +```js +ASQ().race( + failure3, + success1 +) +.or( output ); // 3 + + +ASQ().any( + success1, + failure3, + success2 +) +.val( function(){ + var args = [].slice.call( arguments ); + console.log( + args // [ 1, undefined, 2 ] + ); +} ); + + +ASQ().first( + failure3, + success1, + success2 +) +.val( output ); // 1 + + +ASQ().last( + failure3, + success1, + success2 +) +.val( output ); // 2 + +ASQ().none( + failure3 +) +.val( output ) // 3 +.none( + failure3, + success1 +) +.or( output ); // 1 +``` + +Another step variation is `map(..)`, which lets you asynchronously map elements of an array to different values, and the step doesn't proceed until all the mappings are complete. 
`map(..)` is very similar to `gate(..)`, except it gets the initial values from an array instead of from separately specified functions, and you define a single function callback to operate on each value: + +```js +function double(x,done) { + setTimeout( function(){ + done( x * 2 ); + }, 100 ); +} + +ASQ().map( [1,2,3], double ) +.val( output ); // [2,4,6] +``` + +Also, `map(..)` can receive either of its parameters (the array or the callback) from messages passed from the previous step: + +```js +function plusOne(x,done) { + setTimeout( function(){ + done( x + 1 ); + }, 100 ); +} + +ASQ( [1,2,3] ) +.map( double ) // message `[1,2,3]` comes in +.map( plusOne ) // message `[2,4,6]` comes in +.val( output ); // [3,5,7] +``` + +Another variation is `waterfall(..)`, which is kind of like a mixture of `gate(..)`'s message collection behavior and `then(..)`'s sequential processing. + +Step 1 is first executed, then the success message from step 1 is given to step 2, and then both success messages go to step 3, and then all three success messages go to step 4, and so on, such that the messages sort of collect and cascade down the "waterfall". + +Consider: + +```js +function double(done) { + var args = [].slice.call( arguments, 1 ); + console.log( args ); + + setTimeout( function(){ + done( args[args.length - 1] * 2 ); + }, 100 ); +} + +ASQ( 3 ) +.waterfall( + double, // [ 3 ] + double, // [ 6 ] + double, // [ 6, 12 ] + double // [ 6, 12, 24 ] +) +.val( function(){ + var args = [].slice.call( arguments ); + console.log( args ); // [ 6, 12, 24, 48 ] +} ); +``` + +If at any point in the "waterfall" an error occurs, the whole sequence immediately goes into an error state. + +#### Error Tolerance + +Sometimes you want to manage errors at the step level and not let them necessarily send the whole sequence into the error state. *asynquence* offers two step variations for that purpose. 
+ +`try(..)` attempts a step, and if it succeeds, the sequence proceeds as normal, but if the step fails, the failure is turned into a success message formatted as `{ catch: .. }` with the error message(s) filled in: + +```js +ASQ() +.try( success1 ) +.val( output ) // 1 +.try( failure3 ) +.val( output ) // { catch: 3 } +.or( function(err){ + // never gets here +} ); +``` + +You could instead set up a retry loop using `until(..)`, which tries the step and if it fails, retries the step again on the next event loop tick, and so on. + +This retry loop can continue indefinitely, but if you want to break out of the loop, you can call the `break()` flag on the completion trigger, which sends the main sequence into an error state: + +```js +var count = 0; + +ASQ( 3 ) +.until( double ) +.val( output ) // 6 +.until( function(done){ + count++; + + setTimeout( function(){ + if (count < 5) { + done.fail(); + } + else { + // break out of the `until(..)` retry loop + done.break( "Oops" ); + } + }, 100 ); +} ) +.or( output ); // Oops +``` + +#### Promise-Style Steps + +If you would prefer to have, inline in your sequence, Promise-style semantics like Promises' `then(..)` and `catch(..)` (see Chapter 3), you can use the `pThen` and `pCatch` plug-ins: + +```js +ASQ( 21 ) +.pThen( function(msg){ + return msg * 2; +} ) +.pThen( output ) // 42 +.pThen( function(){ + // throw an exception + doesnt.Exist(); +} ) +.pCatch( function(err){ + // caught the exception (rejection) + console.log( err ); // ReferenceError +} ) +.val( function(){ + // main sequence is back in a + // success state because previous + // exception was caught by + // `pCatch(..)` +} ); +``` + +`pThen(..)` and `pCatch(..)` are designed to run in the sequence, but behave as if it were a normal Promise chain. As such, you can either resolve genuine Promises or *asynquence* sequences from the "fulfillment" handler passed to `pThen(..)` (see Chapter 3). 
+ +### Forking Sequences + +One feature that can be quite useful about Promises is that you can attach multiple `then(..)` handler registrations to the same promise, effectively "forking" the flow-control at that promise: + +```js +var p = Promise.resolve( 21 ); + +// fork 1 (from `p`) +p.then( function(msg){ + return msg * 2; +} ) +.then( function(msg){ + console.log( msg ); // 42 +} ) + +// fork 2 (from `p`) +p.then( function(msg){ + console.log( msg ); // 21 +} ); +``` + +The same "forking" is easy in *asynquence* with `fork()`: + +```js +var sq = ASQ(..).then(..).then(..); + +var sq2 = sq.fork(); + +// fork 1 +sq.then(..)..; + +// fork 2 +sq2.then(..)..; +``` + +### Combining Sequences + +The reverse of `fork()`ing, you can combine two sequences by subsuming one into another, using the `seq(..)` instance method: + +```js +var sq = ASQ( function(done){ + setTimeout( function(){ + done( "Hello World" ); + }, 200 ); +} ); + +ASQ( function(done){ + setTimeout( done, 100 ); +} ) +// subsume `sq` sequence into this sequence +.seq( sq ) +.val( function(msg){ + console.log( msg ); // Hello World +} ) +``` + +`seq(..)` can either accept a sequence itself, as shown here, or a function. If a function, it's expected that the function when called will return a sequence, so the preceding code could have been done with: + +```js +// .. +.seq( function(){ + return sq; +} ) +// .. +``` + +Also, that step could instead have been accomplished with a `pipe(..)`: + +```js +// .. +.then( function(done){ + // pipe `sq` into the `done` continuation callback + sq.pipe( done ); +} ) +// .. +``` + +When a sequence is subsumed, both its success message stream and its error stream are piped in. + +**Note:** As mentioned in an earlier note, piping (manually with `pipe(..)` or automatically with `seq(..)`) opts the source sequence out of error-reporting, but doesn't affect the error reporting status of the target sequence. 
+ +## Value and Error Sequences + +If any step of a sequence is just a normal value, that value is just mapped to that step's completion message: + +```js +var sq = ASQ( 42 ); + +sq.val( function(msg){ + console.log( msg ); // 42 +} ); +``` + +If you want to make a sequence that's automatically errored: + +```js +var sq = ASQ.failed( "Oops" ); + +ASQ() +.seq( sq ) +.val( function(msg){ + // won't get here +} ) +.or( function(err){ + console.log( err ); // Oops +} ); +``` + +You also may want to automatically create a delayed-value or a delayed-error sequence. Using the `after` and `failAfter` contrib plug-ins, this is easy: + +```js +var sq1 = ASQ.after( 100, "Hello", "World" ); +var sq2 = ASQ.failAfter( 100, "Oops" ); + +sq1.val( function(msg1,msg2){ + console.log( msg1, msg2 ); // Hello World +} ); + +sq2.or( function(err){ + console.log( err ); // Oops +} ); +``` + +You can also insert a delay in the middle of a sequence using `after(..)`: + +```js +ASQ( 42 ) +// insert a delay into the sequence +.after( 100 ) +.val( function(msg){ + console.log( msg ); // 42 +} ); +``` + +## Promises and Callbacks + +I think *asynquence* sequences provide a lot of value on top of native Promises, and for the most part you'll find it more pleasant and more powerful to work at that level of abstraction. However, integrating *asynquence* with other non-*asynquence* code will be a reality. 
+ +You can easily subsume a promise (e.g., thenable -- see Chapter 3) into a sequence using the `promise(..)` instance method: + +```js +var p = Promise.resolve( 42 ); + +ASQ() +.promise( p ) // could also: `function(){ return p; }` +.val( function(msg){ + console.log( msg ); // 42 +} ); +``` + +And to go the opposite direction and fork/vend a promise from a sequence at a certain step, use the `toPromise` contrib plug-in: + +```js +var sq = ASQ.after( 100, "Hello World" ); + +sq.toPromise() +// this is a standard promise chain now +.then( function(msg){ + return msg.toUpperCase(); +} ) +.then( function(msg){ + console.log( msg ); // HELLO WORLD +} ); +``` + +To adapt *asynquence* to systems using callbacks, there are several helper facilities. To automatically generate an "error-first style" callback from your sequence to wire into a callback-oriented utility, use `errfcb`: + +```js +var sq = ASQ( function(done){ + // note: expecting "error-first style" callback + someAsyncFuncWithCB( 1, 2, done.errfcb ) +} ) +.val( function(msg){ + // .. +} ) +.or( function(err){ + // .. +} ); + +// note: expecting "error-first style" callback +anotherAsyncFuncWithCB( 1, 2, sq.errfcb() ); +``` + +You also may want to create a sequence-wrapped version of a utility -- compare to "promisory" in Chapter 3 and "thunkory" in Chapter 4 -- and *asynquence* provides `ASQ.wrap(..)` for that purpose: + +```js +var coolUtility = ASQ.wrap( someAsyncFuncWithCB ); + +coolUtility( 1, 2 ) +.val( function(msg){ + // .. +} ) +.or( function(err){ + // .. +} ); +``` + +**Note:** For the sake of clarity (and for fun!), let's coin yet another term, for a sequence-producing function that comes from `ASQ.wrap(..)`, like `coolUtility` here. I propose "sequory" ("sequence" + "factory"). + +## Iterable Sequences + +The normal paradigm for a sequence is that each step is responsible for completing itself, which is what advances the sequence. Promises work the same way. 
+ +The unfortunate part is that sometimes you need external control over a Promise/step, which leads to awkward "capability extraction". + +Consider this Promises example: + +```js +var domready = new Promise( function(resolve,reject){ + // don't want to put this here, because + // it belongs logically in another part + // of the code + document.addEventListener( "DOMContentLoaded", resolve ); +} ); + +// .. + +domready.then( function(){ + // DOM is ready! +} ); +``` + +The "capability extraction" anti-pattern with Promises looks like this: + +```js +var ready; + +var domready = new Promise( function(resolve,reject){ + // extract the `resolve()` capability + ready = resolve; +} ); + +// .. + +domready.then( function(){ + // DOM is ready! +} ); + +// .. + +document.addEventListener( "DOMContentLoaded", ready ); +``` + +**Note:** This anti-pattern is an awkward code smell, in my opinion, but some developers like it, for reasons I can't grasp. + +*asynquence* offers an inverted sequence type I call "iterable sequences", which externalizes the control capability (it's quite useful in use cases like the `domready`): + +```js +// note: `domready` here is an *iterator* that +// controls the sequence +var domready = ASQ.iterable(); + +// .. + +domready.val( function(){ + // DOM is ready +} ); + +// .. + +document.addEventListener( "DOMContentLoaded", domready.next ); +``` + +There's more to iterable sequences than what we see in this scenario. We'll come back to them in Appendix B. + +## Running Generators + +In Chapter 4, we derived a utility called `run(..)` which can run generators to completion, listening for `yield`ed Promises and using them to async resume the generator. *asynquence* has just such a utility built in, called `runner(..)`. 
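
As a refresher on what such a utility does under the covers, here's a stripped-down sketch using only native Promises. To be clear, this is neither the Chapter 4 `run(..)` verbatim nor how *asynquence* implements `runner(..)`; the names `runGen` and `delayedDouble` are hypothetical, invented just for this illustration:

```js
// hypothetical helper: runs a generator to completion,
// resuming it each time a `yield`ed promise settles
function runGen(gen) {
	var args = [].slice.call( arguments, 1 );
	var it = gen.apply( this, args );

	return new Promise( function(resolve,reject){
		function step(method,value) {
			var res;
			try {
				// advance the generator (or throw into it)
				res = it[method]( value );
			}
			catch (err) {
				return reject( err );
			}

			if (res.done) {
				return resolve( res.value );
			}

			// normalize whatever was yielded into a promise,
			// then resume the generator with its outcome
			Promise.resolve( res.value ).then(
				function(v){ step( "next", v ); },
				function(err){ step( "throw", err ); }
			);
		}

		step( "next" );
	} );
}

// hypothetical async helper for the demo
function delayedDouble(x) {
	return new Promise( function(resolve){
		setTimeout( function(){
			resolve( x * 2 );
		}, 10 );
	} );
}

var result = runGen( function*(x){
	x = yield delayedDouble( x );
	x = yield delayedDouble( x );
	return x;
}, 21 );

result.then( function(v){
	console.log( v ); // 84
} );
```

The sketch only understands promises (and plain values); a sequence-aware runner would additionally need to recognize `yield`ed sequences.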
+ +Let's first set up some helpers for illustration: + +```js +function doublePr(x) { + return new Promise( function(resolve,reject){ + setTimeout( function(){ + resolve( x * 2 ); + }, 100 ); + } ); +} + +function doubleSeq(x) { + return ASQ( function(done){ + setTimeout( function(){ + done( x * 2) + }, 100 ); + } ); +} +``` + +Now, we can use `runner(..)` as a step in the middle of a sequence: + +```js +ASQ( 10, 11 ) +.runner( function*(token){ + var x = token.messages[0] + token.messages[1]; + + // yield a real promise + x = yield doublePr( x ); + + // yield a sequence + x = yield doubleSeq( x ); + + return x; +} ) +.val( function(msg){ + console.log( msg ); // 84 +} ); +``` + +### Wrapped Generators + +You can also create a self-packaged generator -- that is, a normal function that runs your specified generator and returns a sequence for its completion -- by `ASQ.wrap(..)`ing it: + +```js +var foo = ASQ.wrap( function*(token){ + var x = token.messages[0] + token.messages[1]; + + // yield a real promise + x = yield doublePr( x ); + + // yield a sequence + x = yield doubleSeq( x ); + + return x; +}, { gen: true } ); + +// .. + +foo( 8, 9 ) +.val( function(msg){ + console.log( msg ); // 68 +} ); +``` + +There's a lot more awesome that `runner(..)` is capable of, but we'll come back to that in Appendix B. + +## Review + +*asynquence* is a simple abstraction -- a sequence is a series of (async) steps -- on top of Promises, aimed at making working with various asynchronous patterns much easier, without any compromise in capability. + +There are other goodies in the *asynquence* core API and its contrib plug-ins beyond what we saw in this appendix, but we'll leave that as an exercise for the reader to go check the rest of the capabilities out. + +You've now seen the essence and spirit of *asynquence*. 
The key takeaway is that a sequence is composed of steps, and those steps can be any of dozens of different variations on Promises, or they can be a generator-run, or... The choice is up to you; you have all the freedom to weave together whatever async flow control logic is appropriate for your tasks. No more library switching to catch different async patterns. + +If these *asynquence* snippets have made sense to you, you're now pretty well up to speed on the library; it doesn't take that much to learn, actually! + +If you're still a little fuzzy on how it works (or why!), you'll want to spend a little more time examining the previous examples and playing around with *asynquence* yourself, before going on to the next appendix. Appendix B will push *asynquence* into several more advanced and powerful async patterns. diff --git a/async & performance/apB.md b/async & performance/apB.md new file mode 100644 index 0000000..9707f79 --- /dev/null +++ b/async & performance/apB.md @@ -0,0 +1,833 @@ +# You Don't Know JS: Async & Performance +# Appendix B: Advanced Async Patterns + +Appendix A introduced the *asynquence* library for sequence-oriented async flow control, primarily based on Promises and generators. + +Now we'll explore other advanced asynchronous patterns built on top of that existing understanding and functionality, and see how *asynquence* makes those sophisticated async techniques easy to mix and match in our programs without needing lots of separate libraries. + +## Iterable Sequences + +We introduced *asynquence*'s iterable sequences in the previous appendix, but we want to revisit them in more detail. + +To refresh, recall: + +```js +var domready = ASQ.iterable(); + +// .. + +domready.val( function(){ + // DOM is ready +} ); + +// .. 
+ +document.addEventListener( "DOMContentLoaded", domready.next ); +``` + +Now, let's define a sequence of multiple steps as an iterable sequence: + +```js +var steps = ASQ.iterable(); + +steps +.then( function STEP1(x){ + return x * 2; +} ) +.then( function STEP2(x){ + return x + 3; +} ) +.then( function STEP3(x){ + return x * 4; +} ); + +steps.next( 8 ).value; // 16 +steps.next( 16 ).value; // 19 +steps.next( 19 ).value; // 76 +steps.next().done; // true +``` + +As you can see, an iterable sequence is a standard-compliant *iterator* (see Chapter 4). So, it can be iterated with an ES6 `for..of` loop, just like a generator (or any other *iterable*) can: + +```js +var steps = ASQ.iterable(); + +steps +.then( function STEP1(){ return 2; } ) +.then( function STEP2(){ return 4; } ) +.then( function STEP3(){ return 6; } ) +.then( function STEP4(){ return 8; } ) +.then( function STEP5(){ return 10; } ); + +for (var v of steps) { + console.log( v ); +} +// 2 4 6 8 10 +``` + +Beyond the event triggering example shown in the previous appendix, iterable sequences are interesting because in essence they can be seen as a stand-in for generators or Promise chains, but with even more flexibility. 
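
If the *iterator* part of that feels abstract, here's a tiny stand-in that exposes a list of step functions through the same `next(..)` protocol. Treat it as a sketch for building intuition, not as how *asynquence* implements `ASQ.iterable()`; `stepIterator` is a hypothetical name, and the stand-in handles only synchronous steps:

```js
// hypothetical stand-in: step functions behind the
// standard iterator protocol
function stepIterator() {
	var steps = [], idx = 0;

	var api = {
		// append a step; returns the iterator for chaining
		then: function(fn){
			steps.push( fn );
			return api;
		},
		// run the next step (if any) with the given value
		next: function(v){
			if (idx < steps.length) {
				return { value: steps[idx++]( v ), done: false };
			}
			return { value: undefined, done: true };
		}
	};

	// an iterator that is also an iterable works with `for..of`
	api[Symbol.iterator] = function(){ return api; };

	return api;
}

var it = stepIterator()
.then( function STEP1(x){ return x * 2; } )
.then( function STEP2(x){ return x + 3; } );

console.log( it.next( 8 ).value );	// 16
console.log( it.next( 16 ).value );	// 19
console.log( it.next().done );		// true
```

Note that `next(..)` in this stand-in consults the step list each time it's called, rather than fixing the list up front.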
+ +Consider a multiple Ajax request example -- we've seen the same scenario in Chapters 3 and 4, both as a Promise chain and as a generator, respectively -- expressed as an iterable sequence: + +```js +// sequence-aware ajax +var request = ASQ.wrap( ajax ); + +ASQ( "http://some.url.1" ) +.runner( + ASQ.iterable() + + .then( function STEP1(token){ + var url = token.messages[0]; + return request( url ); + } ) + + .then( function STEP2(resp){ + return ASQ().gate( + request( "http://some.url.2/?v=" + resp ), + request( "http://some.url.3/?v=" + resp ) + ); + } ) + + .then( function STEP3(r1,r2){ return r1 + r2; } ) +) +.val( function(msg){ + console.log( msg ); +} ); +``` + +The iterable sequence expresses a sequential series of (sync or async) steps that looks awfully similar to a Promise chain -- in other words, it's much cleaner looking than just plain nested callbacks, but not quite as nice as the `yield`-based sequential syntax of generators. + +But we pass the iterable sequence into `ASQ#runner(..)`, which runs it to completion the same as if it was a generator. The fact that an iterable sequence behaves essentially the same as a generator is notable for a couple of reasons. + +First, iterable sequences are kind of a pre-ES6 equivalent to a certain subset of ES6 generators, which means you can either author them directly (to run anywhere), or you can author ES6 generators and transpile/convert them to iterable sequences (or Promise chains for that matter!). + +Thinking of an async-run-to-completion generator as just syntactic sugar for a Promise chain is an important recognition of their isomorphic relationship. 
+ +Before we move on, we should note that the previous snippet could have been expressed in *asynquence* as: + +```js +ASQ( "http://some.url.1" ) +.seq( /*STEP 1*/ request ) +.seq( function STEP2(resp){ + return ASQ().gate( + request( "http://some.url.2/?v=" + resp ), + request( "http://some.url.3/?v=" + resp ) + ); +} ) +.val( function STEP3(r1,r2){ return r1 + r2; } ) +.val( function(msg){ + console.log( msg ); +} ); +``` + +Moreover, step 2 could have even been expressed as: + +```js +.gate( + function STEP2a(done,resp) { + request( "http://some.url.2/?v=" + resp ) + .pipe( done ); + }, + function STEP2b(done,resp) { + request( "http://some.url.3/?v=" + resp ) + .pipe( done ); + } +) +``` + +So, why would we go to the trouble of expressing our flow control as an iterable sequence in an `ASQ#runner(..)` step, when it seems like a simpler/flatter *asynquence* chain does the job well? + +Because the iterable sequence form has an important trick up its sleeve that gives us more capability. Read on. + +### Extending Iterable Sequences + +Generators, normal *asynquence* sequences, and Promise chains are all **eagerly evaluated** -- whatever flow control is expressed initially *is* the fixed flow that will be followed. + +However, iterable sequences are **lazily evaluated**, which means that during execution of the iterable sequence, you can extend the sequence with more steps if desired. + +**Note:** You can only append to the end of an iterable sequence, not inject into the middle of the sequence. + +Let's first look at a simpler (synchronous) example of that capability to get familiar with it: + +```js +function double(x) { + x *= 2; + + // should we keep extending? 
+ if (x < 500) { + isq.then( double ); + } + + return x; +} + +// setup single-step iterable sequence +var isq = ASQ.iterable().then( double ); + +for (var v = 10, ret; + (ret = isq.next( v )) && !ret.done; +) { + v = ret.value; + console.log( v ); +} +``` + +The iterable sequence starts out with only one defined step (`isq.then(double)`), but the sequence keeps extending itself under certain conditions (`x < 500`). Both *asynquence* sequences and Promise chains technically *can* do something similar, but we'll see in a little bit why their capability is insufficient. + +Though this example is rather trivial and could otherwise be expressed with a `while` loop in a generator, we'll consider more sophisticated cases. + +For instance, you could examine the response from an Ajax request and if it indicates that more data is needed, you conditionally insert more steps into the iterable sequence to make the additional request(s). Or you could conditionally add a value-formatting step to the end of your Ajax handling. + +Consider: + +```js +var steps = ASQ.iterable() + +.then( function STEP1(token){ + var url = token.messages[0].url; + + // was an additional formatting step provided? + if (token.messages[0].format) { + steps.then( token.messages[0].format ); + } + + return request( url ); +} ) + +.then( function STEP2(resp){ + // add another Ajax request to the sequence? + if (/x1/.test( resp )) { + steps.then( function STEP5(text){ + return request( + "http://some.url.4/?v=" + text + ); + } ); + } + + return ASQ().gate( + request( "http://some.url.2/?v=" + resp ), + request( "http://some.url.3/?v=" + resp ) + ); +} ) + +.then( function STEP3(r1,r2){ return r1 + r2; } ); +``` + +You can see in two different places where we conditionally extend `steps` with `steps.then(..)`. 
And to run this `steps` iterable sequence, we just wire it into our main program flow with an *asynquence* sequence (called `main` here) using `ASQ#runner(..)`: + +```js +var main = ASQ( { + url: "http://some.url.1", + format: function STEP4(text){ + return text.toUpperCase(); + } +} ) +.runner( steps ) +.val( function(msg){ + console.log( msg ); +} ); +``` + +Can the flexibility (conditional behavior) of the `steps` iterable sequence be expressed with a generator? Kind of, but we have to rearrange the logic in a slightly awkward way: + +```js +function *steps(token) { + // **STEP 1** + var resp = yield request( token.messages[0].url ); + + // **STEP 2** + var rvals = yield ASQ().gate( + request( "http://some.url.2/?v=" + resp ), + request( "http://some.url.3/?v=" + resp ) + ); + + // **STEP 3** + var text = rvals[0] + rvals[1]; + + // **STEP 4** + // was an additional formatting step provided? + if (token.messages[0].format) { + text = yield token.messages[0].format( text ); + } + + // **STEP 5** + // need another Ajax request added to the sequence? + if (/x1/.test( resp )) { + text = yield request( + "http://some.url.4/?v=" + text + ); + } + + return text; +} + +// note: `*steps()` can be run by the same `ASQ` sequence +// as `steps` was previously +``` + +Setting aside the already identified benefits of the sequential, synchronous-looking syntax of generators (see Chapter 4), the `steps` logic had to be reordered in the `*steps()` generator form, to fake the dynamism of the extendable iterable sequence `steps`. + +What about expressing the functionality with Promises or sequences, though? You *can* do something like this: + +```js +var steps = something( .. ) +.then( .. ) +.then( function(..){ + // .. + + // extending the chain, right? + steps = steps.then( .. ); + + // .. +}) +.then( .. ); +``` + +The problem is subtle but important to grasp. 
So, consider trying to wire up our `steps` Promise chain into our main program flow -- this time expressed with Promises instead of *asynquence*: + +```js +var main = Promise.resolve( { + url: "http://some.url.1", + format: function STEP4(text){ + return text.toUpperCase(); + } +} ) +.then( function(..){ + return steps; // hint! +} ) +.then( function(msg){ + console.log( msg ); +} ); +``` + +Can you spot the problem now? Look closely! + +There's a race condition in the ordering of the sequence steps. When you `return steps`, at that moment `steps` *might* be the originally defined promise chain, or it might now point to the extended promise chain via the `steps = steps.then(..)` call, depending on what order things happen. + +Here are the two possible outcomes: + +* If `steps` is still the original promise chain, once it's later "extended" by `steps = steps.then(..)`, that extended promise on the end of the chain is **not** considered by the `main` flow, as it's already tapped the `steps` chain. This is the unfortunately limiting **eager evaluation**. +* If `steps` is already the extended promise chain, it works as we expect in that the extended promise is what `main` taps. + +Other than the obvious fact that a race condition is intolerable, the first case is the concern; it illustrates **eager evaluation** of the promise chain. By contrast, we easily extended the iterable sequence without such issues, because iterable sequences are **lazily evaluated**. + +The more dynamic you need your flow control, the more iterable sequences will shine. + +**Tip:** Check out more information and examples of iterable sequences on the *asynquence* site (https://github.com/getify/asynquence/blob/master/README.md#iterable-sequences). + +## Event Reactive + +It should be obvious from (at least!) Chapter 3 that Promises are a very powerful tool in your async toolbox. But one thing that's clearly lacking is their capability to handle streams of events, as a Promise can only be resolved once. 
And frankly, this exact same weakness is true of plain *asynquence* sequences, as well. + +Consider a scenario where you want to fire off a series of steps every time a certain event is fired. A single Promise or sequence cannot represent all occurrences of that event. So, you have to create a whole new Promise chain (or sequence) for *each* event occurrence, such as: + +```js +listener.on( "foobar", function(data){ + + // create a new event handling promise chain + new Promise( function(resolve,reject){ + // .. + } ) + .then( .. ) + .then( .. ); + +} ); +``` + +The base functionality we need is present in this approach, but it's far from a desirable way to express our intended logic. There are two separate capabilities conflated in this paradigm: the event listening, and responding to the event; separation of concerns would implore us to separate out these capabilities. + +The carefully observant reader will see this problem as somewhat symmetrical to the problems we detailed with callbacks in Chapter 2; it's kind of an inversion of control problem. + +Imagine uninverting this paradigm, like so: + +```js +var observable = listener.on( "foobar" ); + +// later +observable +.then( .. ) +.then( .. ); + +// elsewhere +observable +.then( .. ) +.then( .. ); +``` + +The `observable` value is not exactly a Promise, but you can *observe* it much like you can observe a Promise, so it's closely related. In fact, it can be observed many times, and it will send out notifications every time its event (`"foobar"`) occurs. + +**Tip:** This pattern I've just illustrated is a **massive simplification** of the concepts and motivations behind reactive programming (aka RP), which has been implemented/expounded upon by several great projects and languages. A variation on RP is functional reactive programming (FRP), which refers to applying functional programming techniques (immutability, referential transparency, etc.) to streams of data. 
"Reactive" refers to spreading this functionality out over time in response to events. The interested reader should consider studying "Reactive Observables" in the fantastic "Reactive Extensions" library ("RxJS" for JavaScript) by Microsoft (http://rxjs.codeplex.com/); it's much more sophisticated and powerful than I've just shown. Also, Andre Staltz has an excellent write-up (https://gist.github.com/staltz/868e7e9bc2a7b8c1f754) that pragmatically lays out RP in concrete examples. + +### ES7 Observables + +At the time of this writing, there's an early ES7 proposal for a new data type called "Observable" (https://github.com/jhusain/asyncgenerator#introducing-observable), which in spirit is similar to what we've laid out here, but is definitely more sophisticated. + +The notion of this kind of Observable is that the way you "subscribe" to the events from a stream is to pass in a generator -- actually the *iterator* is the interested party -- whose `next(..)` method will be called for each event. + +You could imagine it sort of like this: + +```js +// `someEventStream` is a stream of events, like from +// mouse clicks, and the like. + +var observer = new Observer( someEventStream, function*(){ + var evt; + while (evt = yield) { + console.log( evt ); + } +} ); +``` + +The generator you pass in will pause at the `yield` inside the `while` loop, waiting for the next event. The *iterator* attached to the generator instance will have its `next(..)` called each time `someEventStream` has a new event published, and so that event data will resume your generator/*iterator* with the `evt` data. + +In the subscription to events functionality here, it's the *iterator* part that matters, not the generator. So conceptually you could pass in practically any iterable, including `ASQ.iterable()` iterable sequences. + +Interestingly, there are also proposed adapters to make it easy to construct Observables from certain types of streams, such as `fromEvent(..)` for DOM events. 
If you look at a suggested implementation of `fromEvent(..)` in the earlier linked ES7 proposal, it looks an awful lot like the `ASQ.react(..)` we'll see in the next section. + +Of course, these are all early proposals, so what shakes out may very well look/behave differently than shown here. But it's exciting to see the early alignments of concepts across different libraries and language proposals! + +### Reactive Sequences + +With that crazy brief summary of Observables (and F/RP) as our inspiration and motivation, I will now illustrate an adaptation of a small subset of "Reactive Observables," which I call "Reactive Sequences." + +First, let's start with how to create an Observable, using an *asynquence* plug-in utility called `react(..)`: + +```js +var observable = ASQ.react( function setup(next){ + listener.on( "foobar", next ); +} ); +``` + +Now, let's see how to define a sequence that "reacts" -- in F/RP, this is typically called "subscribing" -- to that `observable`: + +```js +observable +.seq( .. ) +.then( .. ) +.val( .. ); +``` + +So, you just define the sequence by chaining off the Observable. That's easy, huh? + +In F/RP, the stream of events typically channels through a set of functional transforms, like `scan(..)`, `map(..)`, `reduce(..)`, and so on. With reactive sequences, each event channels through a new instance of the sequence. Let's look at a more concrete example: + +```js +ASQ.react( function setup(next){ + document.getElementById( "mybtn" ) + .addEventListener( "click", next, false ); +} ) +.seq( function(evt){ + var btnID = evt.target.id; + return request( + "http://some.url.1/?id=" + btnID + ); +} ) +.val( function(text){ + console.log( text ); +} ); +``` + +The "reactive" portion of the reactive sequence comes from assigning one or more event handlers to invoke the event trigger (calling `next(..)`). 
+ +The "sequence" portion of the reactive sequence is exactly like the sequences we've already explored: each step can be whatever asynchronous technique makes sense, from continuation callback to Promise to generator. + +Once you set up a reactive sequence, it will continue to initiate instances of the sequence as long as the events keep firing. If you want to stop a reactive sequence, you can call `stop()`. + +If a reactive sequence is `stop()`'d, you likely want the event handler(s) to be unregistered as well; you can register a teardown handler for this purpose: + +```js +var sq = ASQ.react( function setup(next,registerTeardown){ + var btn = document.getElementById( "mybtn" ); + + btn.addEventListener( "click", next, false ); + + // will be called once `sq.stop()` is called + registerTeardown( function(){ + btn.removeEventListener( "click", next, false ); + } ); +} ) +.seq( .. ) +.then( .. ) +.val( .. ); + +// later +sq.stop(); +``` + +**Note:** The `this` binding reference inside the `setup(..)` handler is the same `sq` reactive sequence, so you can use the `this` reference to add to the reactive sequence definition, call methods like `stop()`, and so on. 
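To make the moving parts a little more tangible, here's a rough sketch of the idea behind a reactive sequence. It is *not* how `ASQ.react(..)` is implemented: `miniReact(..)`, `handlers`, and `fire(..)` are hypothetical names made up for this illustration, the steps are synchronous only, and there's no error handling. It only shows that each call to the `next(..)` trigger runs a fresh pass through the defined steps, and that `stop()` runs the registered teardown handlers.

```js
// Hypothetical stand-in for `ASQ.react(..)`: synchronous steps only.
function miniReact(setup) {
	var steps = [];
	var teardowns = [];
	var stopped = false;

	var sq = {
		// add a step to the definition of the sequence
		then: function(fn){
			steps.push( fn );
			return sq;
		},
		// stop reacting and run any registered teardown handlers
		stop: function(){
			stopped = true;
			teardowns.splice( 0 ).forEach( function(fn){ fn(); } );
		}
	};

	// event trigger: every call runs a fresh pass through the steps
	function next(evt) {
		if (stopped) return;
		steps.reduce( function(msg,fn){ return fn( msg ); }, evt );
	}

	function registerTeardown(fn) {
		teardowns.push( fn );
	}

	// `this` inside `setup(..)` is the sequence itself
	setup.call( sq, next, registerTeardown );

	return sq;
}

// a tiny fake event source, just for this demo
var handlers = [];
function fire(evt) {
	handlers.slice().forEach( function(fn){ fn( evt ); } );
}

var seen = [];

var sq = miniReact( function setup(next,registerTeardown){
	handlers.push( next );

	registerTeardown( function(){
		handlers.splice( handlers.indexOf( next ), 1 );
	} );
} )
.then( function(evt){ return evt.toUpperCase(); } )
.then( function(text){ seen.push( text ); } );

fire( "a" );
fire( "b" );
sq.stop();
fire( "c" );	// ignored: the teardown removed the handler

console.log( seen );	// `seen` holds "A" and "B" only
```

The real `ASQ.react(..)` lets each step be asynchronous; this sketch leaves that out to keep the event-to-sequence wiring easy to see.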
+ +Here's an example from the Node.js world, using reactive sequences to handle incoming HTTP requests: + +```js +var server = http.createServer(); +server.listen(8000); + +// reactive observer +var request = ASQ.react( function setup(next,registerTeardown){ + server.addListener( "request", next ); + server.addListener( "close", this.stop ); + + registerTeardown( function(){ + server.removeListener( "request", next ); + server.removeListener( "close", request.stop ); + } ); +}); + +// respond to requests +request +.seq( pullFromDatabase ) +.val( function(data,res){ + res.end( data ); +} ); + +// node teardown +process.on( "SIGINT", request.stop ); +``` + +The `next(..)` trigger can also adapt to node streams easily, using `onStream(..)` and `unStream(..)`: + +```js +ASQ.react( function setup(next){ + var fstream = fs.createReadStream( "/some/file" ); + + // pipe the stream's "data" event to `next(..)` + next.onStream( fstream ); + + // listen for the end of the stream + fstream.on( "end", function(){ + next.unStream( fstream ); + } ); +} ) +.seq( .. ) +.then( .. ) +.val( .. ); +``` + +You can also use sequence combinations to compose multiple reactive sequence streams: + +```js +var sq1 = ASQ.react( .. ).seq( .. ).then( .. ); +var sq2 = ASQ.react( .. ).seq( .. ).then( .. ); + +var sq3 = ASQ.react(..) +.gate( + sq1, + sq2 +) +.then( .. ); +``` + +The main takeaway is that `ASQ.react(..)` is a lightweight adaptation of F/RP concepts, enabling the wiring of an event stream to a sequence, hence the term "reactive sequence." Reactive sequences are generally capable enough for basic reactive uses. + +**Note:** Here's an example of using `ASQ.react(..)` in managing UI state (http://jsbin.com/rozipaki/6/edit?js,output), and another example of handling HTTP request/response streams with `ASQ.react(..)` (https://gist.github.com/getify/bba5ec0de9d6047b720e). + +## Generator Coroutine + +Hopefully Chapter 4 helped you get pretty familiar with ES6 generators. 
In particular, we want to revisit the "Generator Concurrency" discussion, and push it even further. + +We imagined a `runAll(..)` utility that could take two or more generators and run them concurrently, letting them cooperatively `yield` control from one to the next, with optional message passing. + +In addition to being able to run a single generator to completion, the `ASQ#runner(..)` we discussed in Appendix A is a similar implementation of the concepts of `runAll(..)`, which can run multiple generators concurrently to completion. + +So let's see how we can implement the concurrent Ajax scenario from Chapter 4: + +```js +ASQ( + "http://some.url.2" +) +.runner( + function*(token){ + // transfer control + yield token; + + var url1 = token.messages[0]; // "http://some.url.1" + + // clear out messages to start fresh + token.messages = []; + + var p1 = request( url1 ); + + // transfer control + yield token; + + token.messages.push( yield p1 ); + }, + function*(token){ + var url2 = token.messages[0]; // "http://some.url.2" + + // message pass and transfer control + token.messages[0] = "http://some.url.1"; + yield token; + + var p2 = request( url2 ); + + // transfer control + yield token; + + token.messages.push( yield p2 ); + + // pass along results to next sequence step + return token.messages; + } +) +.val( function(res){ + // `res[0]` comes from "http://some.url.1" + // `res[1]` comes from "http://some.url.2" +} ); +``` + +The main differences between `ASQ#runner(..)` and `runAll(..)` are as follows: + +* Each generator (coroutine) is provided an argument we call `token`, which is the special value to `yield` when you want to explicitly transfer control to the next coroutine. +* `token.messages` is an array that holds any messages passed in from the previous sequence step. It's also a data structure that you can use to share messages between coroutines. 
+* `yield`ing a Promise (or sequence) value does not transfer control, but instead pauses the coroutine processing until that value is ready. +* The last `return`ed or `yield`ed value from the coroutine processing run will be forward passed to the next step in the sequence. + +It's also easy to layer helpers on top of the base `ASQ#runner(..)` functionality to suit different uses. + +### State Machines + +One example that may be familiar to many programmers is state machines. You can, with the help of a simple cosmetic utility, create an easy-to-express state machine processor. + +Let's imagine such a utility. We'll call it `state(..)`, and will pass it two arguments: a state value and a generator that handles that state. `state(..)` will do the dirty work of creating and returning an adapter generator to pass to `ASQ#runner(..)`. + +Consider: + +```js +function state(val,handler) { + // make a coroutine handler for this state + return function*(token) { + // state transition handler + function transition(to) { + token.messages[0] = to; + } + + // set initial state (if none set yet) + if (token.messages.length < 1) { + token.messages[0] = val; + } + + // keep going until final state (false) is reached + while (token.messages[0] !== false) { + // current state matches this handler? + if (token.messages[0] === val) { + // delegate to state handler + yield *handler( transition ); + } + + // transfer control to another state handler? + if (token.messages[0] !== false) { + yield token; + } + } + }; +} +``` + +If you look closely, you'll see that `state(..)` returns back a generator that accepts a `token`, and then it sets up a `while` loop that will run until the state machine reaches its final state (which we arbitrarily pick as the `false` value); that's exactly the kind of generator we want to pass to `ASQ#runner(..)`! 
+ +We also arbitrarily reserve the `token.messages[0]` slot as the place where the current state of our state machine will be tracked, which means we can even seed the initial state as the value passed in from the previous step in the sequence. + +How do we use the `state(..)` helper along with `ASQ#runner(..)`? + +```js +var prevState; + +ASQ( + /* optional: initial state value */ + 2 +) +// run our state machine +// transitions: 2 -> 3 -> 1 -> 3 -> false +.runner( + // state `1` handler + state( 1, function *stateOne(transition){ + console.log( "in state 1" ); + + prevState = 1; + yield transition( 3 ); // goto state `3` + } ), + + // state `2` handler + state( 2, function *stateTwo(transition){ + console.log( "in state 2" ); + + prevState = 2; + yield transition( 3 ); // goto state `3` + } ), + + // state `3` handler + state( 3, function *stateThree(transition){ + console.log( "in state 3" ); + + if (prevState === 2) { + prevState = 3; + yield transition( 1 ); // goto state `1` + } + // all done! + else { + yield "That's all folks!"; + + prevState = 3; + yield transition( false ); // terminal state + } + } ) +) +// state machine complete, so move on +.val( function(msg){ + console.log( msg ); // That's all folks! +} ); +``` + +It's important to note that the `*stateOne(..)`, `*stateTwo(..)`, and `*stateThree(..)` generators themselves are reinvoked each time that state is entered, and they finish when you `transition(..)` to another value. While not shown here, of course these state generator handlers can be asynchronously paused by `yield`ing Promises/sequences/thunks. + +The underneath hidden generators produced by the `state(..)` helper and actually passed to `ASQ#runner(..)` are the ones that continue to run concurrently for the length of the state machine, and each of them handles cooperatively `yield`ing control to the next, and so on. 
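If you'd like to watch that cooperative hand-off happen without pulling in *asynquence*, here's a minimal sketch of a round-robin driver for the generators that `state(..)` produces. To be clear about the assumptions: `miniRunner(..)` is a hypothetical name, it is not the implementation of `ASQ#runner(..)`, it only handles synchronous coroutines (no Promises/sequences/thunks), and its rule of "forward the last non-`undefined` value that was `yield`ed or `return`ed" is a simplification chosen for this sketch.

```js
// `state(..)` as defined earlier in this section
function state(val,handler) {
	return function*(token) {
		function transition(to) {
			token.messages[0] = to;
		}
		if (token.messages.length < 1) {
			token.messages[0] = val;
		}
		while (token.messages[0] !== false) {
			if (token.messages[0] === val) {
				yield *handler( transition );
			}
			if (token.messages[0] !== false) {
				yield token;
			}
		}
	};
}

// hypothetical, synchronous-only stand-in for `ASQ#runner(..)`
function miniRunner(initialMessages,coroutines) {
	var token = { messages: initialMessages.slice() };
	var its = coroutines.map( function(co){
		return co( token );
	} );
	var idx = 0, last;

	while (its.length > 0) {
		var res = its[idx].next();

		if (res.done) {
			// coroutine finished: drop it from the rotation
			if (res.value !== undefined) last = res.value;
			its.splice( idx, 1 );
			if (idx >= its.length) idx = 0;
		}
		else if (res.value === token) {
			// explicit control transfer: move to the next coroutine
			idx = (idx + 1) % its.length;
		}
		else if (res.value !== undefined) {
			// plain value: remember it, then resume the same coroutine
			last = res.value;
		}
	}

	return last;
}

var visited = [];
var prevState;

var result = miniRunner( [ 2 ], [
	state( 1, function *stateOne(transition){
		visited.push( 1 );
		prevState = 1;
		yield transition( 3 );
	} ),
	state( 2, function *stateTwo(transition){
		visited.push( 2 );
		prevState = 2;
		yield transition( 3 );
	} ),
	state( 3, function *stateThree(transition){
		visited.push( 3 );
		if (prevState === 2) {
			prevState = 3;
			yield transition( 1 );
		}
		else {
			yield "That's all folks!";
			prevState = 3;
			yield transition( false );
		}
	} )
] );

console.log( visited.join( " -> " ) );	// 2 -> 3 -> 1 -> 3
console.log( result );					// That's all folks!
```

Notice that the driver never inspects the state value itself; it only reacts to `token` being `yield`ed. All of the state logic lives in the coroutines, which is the same division of labor the `state(..)` helper relies on.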
+ +**Note:** See this "ping pong" example (http://jsbin.com/qutabu/1/edit?js,output) for more illustration of using cooperative concurrency with generators driven by `ASQ#runner(..)`. + +## Communicating Sequential Processes (CSP) + +"Communicating Sequential Processes" (CSP) was first described by C. A. R. Hoare in a 1978 academic paper (http://dl.acm.org/citation.cfm?doid=359576.359585), and later in a 1985 book (http://www.usingcsp.com/) of the same name. CSP describes a formal method for concurrent "processes" to interact (aka "communicate") during processing. + +You may recall that we examined concurrent "processes" back in Chapter 1, so our exploration of CSP here will build upon that understanding. + +Like most great concepts in computer science, CSP is heavily steeped in academic formalism, expressed as a process algebra. However, I suspect symbolic algebra theorems won't make much practical difference to the reader, so we will want to find some other way of wrapping our brains around CSP. + +I will leave much of the formal description and proof of CSP to Hoare's writing, and to many other fantastic writings since. Instead, we will try to just briefly explain the idea of CSP in as un-academic and hopefully intuitively understandable a way as possible. + +### Message Passing + +The core principle in CSP is that all communication/interaction between otherwise independent processes must be through formal message passing. Perhaps counter to your expectations, CSP message passing is described as a synchronous action, where the sender process and the receiver process have to mutually be ready for the message to be passed. + +How could such synchronous messaging possibly be related to asynchronous programming in JavaScript? + +The concreteness of relationship comes from the nature of how ES6 generators are used to produce synchronous-looking actions that under the covers can indeed either be synchronous or (more likely) asynchronous. 
+ +In other words, two or more concurrently running generators can appear to synchronously message each other while preserving the fundamental asynchrony of the system because each generator's code is paused (aka "blocked") waiting on resumption of an asynchronous action. + +How does this work? + +Imagine a generator (aka "process") called "A" that wants to send a message to generator "B." First, "A" `yield`s the message (thus pausing "A") to be sent to "B." When "B" is ready and takes the message, "A" is then resumed (unblocked). + +Symmetrically, imagine a generator "A" that wants a message **from** "B." "A" `yield`s its request (thus pausing "A") for the message from "B," and once "B" sends a message, "A" takes the message and is resumed. + +One of the more popular expressions of this CSP message passing theory comes from ClojureScript's core.async library, and also from the *go* language. These takes on CSP embody the described communication semantics in a conduit that is opened between processes called a "channel." + +**Note:** The term *channel* is used in part because there are modes in which more than one value can be sent at once into the "buffer" of the channel; this is similar to what you may think of as a stream. We won't go into depth about it here, but it can be a very powerful technique for managing streams of data. + +In the simplest notion of CSP, a channel that we create between "A" and "B" would have a method called `take(..)` for blocking to receive a value, and a method called `put(..)` for blocking to send a value. 
+ +This might look like: + +```js +var ch = channel(); + +function *foo() { + var msg = yield take( ch ); + + console.log( msg ); +} + +function *bar() { + yield put( ch, "Hello World" ); + + console.log( "message sent" ); +} + +run( foo ); +run( bar ); +// Hello World +// "message sent" +``` + +Compare this structured, synchronous(-looking) message passing interaction to the informal and unstructured message sharing that `ASQ#runner(..)` provides through the `token.messages` array and cooperative `yield`ing. In essence, `yield put(..)` is a single operation that both sends the value and pauses execution to transfer control, whereas in earlier examples we did those as separate steps. + +Moreover, CSP stresses that you don't really explicitly "transfer control," but rather you design your concurrent routines to block expecting either a value received from the channel, or to block expecting to try to send a message on the channel. The blocking around receiving or sending messages is how you coordinate sequencing of behavior between the coroutines. + +**Note:** Fair warning: this pattern is very powerful but it's also a little mind twisting to get used to at first. You will want to practice this a bit to get used to this new way of thinking about coordinating your concurrency. + +There are several great libraries that have implemented this flavor of CSP in JavaScript, most notably "js-csp" (https://github.com/ubolonton/js-csp), which James Long (http://twitter.com/jlongster) forked (https://github.com/jlongster/js-csp) and has written extensively about (http://jlongster.com/Taming-the-Asynchronous-Beast-with-CSP-in-JavaScript). Also, it cannot be stressed enough how amazing the many writings of David Nolen (http://twitter.com/swannodette) are on the topic of adapting ClojureScript's go-style core.async CSP into JS generators (http://swannodette.github.io/2013/08/24/es6-generators-and-csp/). 
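Before moving on, it may help to see that the `channel()` / `take(..)` / `put(..)` / `run(..)` snippet from earlier in this section needs surprisingly little machinery. The following is only a sketch under stated assumptions: these helpers are hypothetical and are not the API of js-csp, core.async, or *asynquence*; the channel is unbuffered; everything runs synchronously; and there's no error handling. I've also swapped `console.log(..)` inside the generators for pushing onto a `log` array so the ordering is easy to inspect.

```js
// Hypothetical helpers; not the js-csp or asynquence API.
function channel() {
	return { takers: [], putters: [] };
}

// `take(..)` / `put(..)` only describe an operation; `run(..)`
// carries it out when that description is `yield`ed
function take(ch) {
	return { op: "take", ch: ch };
}

function put(ch,value) {
	return { op: "put", ch: ch, value: value };
}

function run(gen) {
	var it = gen();

	function step(msg) {
		var res = it.next( msg );
		if (res.done) return;

		var chan = res.value.ch;

		if (res.value.op === "take") {
			if (chan.putters.length > 0) {
				// a sender is already blocked: complete the exchange
				var sender = chan.putters.shift();
				step( sender.value );
				sender.resume();
			}
			else {
				// nobody is sending yet: block this receiver
				chan.takers.push( step );
			}
		}
		else {
			if (chan.takers.length > 0) {
				// a receiver is already blocked: hand over the value
				var receiver = chan.takers.shift();
				receiver( res.value.value );
				step();
			}
			else {
				// nobody is receiving yet: block this sender
				chan.putters.push( {
					value: res.value.value,
					resume: step
				} );
			}
		}
	}

	step();
}

var ch = channel();
var log = [];

function *foo() {
	var msg = yield take( ch );
	log.push( msg );
}

function *bar() {
	yield put( ch, "Hello World" );
	log.push( "message sent" );
}

run( foo );
run( bar );

console.log( log.join( "\n" ) );
// Hello World
// message sent
```

The key point is that neither generator advances past its `yield` until its counterpart has shown up on the channel. That mutual readiness is the "synchronous" message passing described above, even though a real implementation would let other asynchronous work happen while a process is blocked.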
+ +### asynquence CSP emulation + +Because we've been discussing async patterns here in the context of my *asynquence* library, you might be interested to see that we can fairly easily add an emulation layer on top of `ASQ#runner(..)` generator handling as a nearly perfect port of the CSP API and behavior. This emulation layer ships as an optional part of the "asynquence-contrib" package alongside *asynquence*. + +Very similar to the `state(..)` helper from earlier, `ASQ.csp.go(..)` takes a generator -- in go/core.async terms, it's known as a goroutine -- and adapts it to use with `ASQ#runner(..)` by returning a new generator. + +Instead of being passed a `token`, your goroutine receives an initially created channel (`ch` below) that all goroutines in this run will share. You can create more channels (which is often quite helpful!) with `ASQ.csp.chan(..)`. + +In CSP, we model all asynchrony in terms of blocking on channel messages, rather than blocking waiting for a Promise/sequence/thunk to complete. + +So, instead of `yield`ing the Promise returned from `request(..)`, `request(..)` should return a channel that you `take(..)` a value from. In other words, a single-value channel is roughly equivalent in this context/usage to a Promise/sequence. + +Let's first make a channel-aware version of `request(..)`: + +```js +function request(url) { + var ch = ASQ.csp.chan(); + ajax( url ).then( function(content){ + // `putAsync(..)` is a version of `put(..)` that + // can be used outside of a generator. It returns + // a promise for the operation's completion. We + // don't use that promise here, but we could if + // we needed to be notified when the value had + // been `take(..)`n. + ASQ.csp.putAsync( ch, content ); + } ); + return ch; +} +``` + +From Chapter 3, "promisory" is a Promise-producing utility, "thunkory" from Chapter 4 is a thunk-producing utility, and finally, in Appendix A we invented "sequory" for a sequence-producing utility. 
+ +Naturally, we need to coin a symmetric term here for a channel-producing utility. So let's unsurprisingly call it a "chanory" ("channel" + "factory"). As an exercise for the reader, try your hand at defining a `channelify(..)` utility similar to `Promise.wrap(..)`/`promisify(..)` (Chapter 3), `thunkify(..)` (Chapter 4), and `ASQ.wrap(..)` (Appendix A). + +Now consider the concurrent Ajax example using *asynquence*-flavored CSP: + +```js +ASQ() +.runner( + ASQ.csp.go( function*(ch){ + yield ASQ.csp.put( ch, "http://some.url.2" ); + + var url1 = yield ASQ.csp.take( ch ); + // "http://some.url.1" + + var res1 = yield ASQ.csp.take( request( url1 ) ); + + yield ASQ.csp.put( ch, res1 ); + } ), + ASQ.csp.go( function*(ch){ + var url2 = yield ASQ.csp.take( ch ); + // "http://some.url.2" + + yield ASQ.csp.put( ch, "http://some.url.1" ); + + var res2 = yield ASQ.csp.take( request( url2 ) ); + var res1 = yield ASQ.csp.take( ch ); + + // pass along results to next sequence step + ch.buffer_size = 2; + ASQ.csp.put( ch, res1 ); + ASQ.csp.put( ch, res2 ); + } ) +) +.val( function(res1,res2){ + // `res1` comes from "http://some.url.1" + // `res2` comes from "http://some.url.2" +} ); +``` + +The message passing that trades the URL strings between the two goroutines is pretty straightforward. The first goroutine makes an Ajax request to the first URL, and that response is put onto the `ch` channel. The second goroutine makes an Ajax request to the second URL, then gets the first response `res1` off the `ch` channel. At that point, both responses `res1` and `res2` are completed and ready. + +If there are any remaining values in the `ch` channel at the end of the goroutine run, they will be passed along to the next step in the sequence. So, to pass out message(s) from the final goroutine, `put(..)` them into `ch`. As shown, to avoid the blocking of those final `put(..)`s, we switch `ch` into buffering mode by setting its `buffer_size` to `2` (default: `0`). 
+ +**Note:** See many more examples of using *asynquence*-flavored CSP here (https://gist.github.com/getify/e0d04f1f5aa24b1947ae). + +## Review + +Promises and generators provide the foundational building blocks upon which we can build much more sophisticated and capable asynchrony. + +*asynquence* has utilities for implementing *iterable sequences*, *reactive sequences* (aka "Observables"), *concurrent coroutines*, and even *CSP goroutines*. + +Those patterns, combined with the continuation-callback and Promise capabilities, give *asynquence* a powerful mix of different asynchronous functionalities, all integrated in one clean async flow control abstraction: the sequence. diff --git a/async & performance/apC.md b/async & performance/apC.md new file mode 100644 index 0000000..7bf26da --- /dev/null +++ b/async & performance/apC.md @@ -0,0 +1,20 @@ +# You Don't Know JS: Async & Performance +# Appendix C: Acknowledgments + +I have many people to thank for making this book title and the overall series happen. + +First, I must thank my wife Christen Simpson, and my two kids Ethan and Emily, for putting up with Dad always pecking away at the computer. Even when not writing books, my obsession with JavaScript glues my eyes to the screen far more than it should. That time I borrow from my family is the reason these books can so deeply and completely explain JavaScript to you, the reader. I owe my family everything. + +I'd like to thank my editors at O'Reilly, namely Simon St.Laurent and Brian MacDonald, as well as the rest of the editorial and marketing staff. They are fantastic to work with, and have been especially accommodating during this experiment into "open source" book writing, editing, and production. + +Thank you to the many folks who have participated in making this book series better by providing editorial suggestions and corrections, including Shelley Powers, Tim Ferro, Evan Borden, Forrest L. 
Norvell, Jennifer Davis, Jesse Harlin, Kris Kowal, Rick Waldron, Jordan Harband, Benjamin Gruenbaum, Vyacheslav Egorov, David Nolen, and many others. A big thank you to Jake Archibald for writing the Foreword for this title. + +Thank you to the countless folks in the community, including members of the TC39 committee, who have shared so much knowledge with the rest of us, and especially tolerated my incessant questions and explorations with patience and detail. John-David Dalton, Juriy "kangax" Zaytsev, Mathias Bynens, Axel Rauschmayer, Nicholas Zakas, Angus Croll, Reginald Braithwaite, Dave Herman, Brendan Eich, Allen Wirfs-Brock, Bradley Meck, Domenic Denicola, David Walsh, Tim Disney, Peter van der Zee, Andrea Giammarchi, Kit Cambridge, Eric Elliott, and so many others, I can't even scratch the surface. + +The *You Don't Know JS* book series was born on Kickstarter, so I also wish to thank all my (nearly) 500 generous backers, without whom this book series could not have happened: + +> Jan Szpila, nokiko, Murali Krishnamoorthy, Ryan Joy, Craig Patchett, pdqtrader, Dale Fukami, ray hatfield, R0drigo Perez [Mx], Dan Petitt, Jack Franklin, Andrew Berry, Brian Grinstead, Rob Sutherland, Sergi Meseguer, Phillip Gourley, Mark Watson, Jeff Carouth, Alfredo Sumaran, Martin Sachse, Marcio Barrios, Dan, AimelyneM, Matt Sullivan, Delnatte Pierre-Antoine, Jake Smith, Eugen Tudorancea, Iris, David Trinh, simonstl, Ray Daly, Uros Gruber, Justin Myers, Shai Zonis, Mom & Dad, Devin Clark, Dennis Palmer, Brian Panahi Johnson, Josh Marshall, Marshall, Dennis Kerr, Matt Steele, Erik Slagter, Sacah, Justin Rainbow, Christian Nilsson, Delapouite, D.Pereira, Nicolas Hoizey, George V. 
Reilly, Dan Reeves, Bruno Laturner, Chad Jennings, Shane King, Jeremiah Lee Cohick, od3n, Stan Yamane, Marko Vucinic, Jim B, Stephen Collins, Ægir Þorsteinsson, Eric Pederson, Owain, Nathan Smith, Jeanetteurphy, Alexandre ELISÉ, Chris Peterson, Rik Watson, Luke Matthews, Justin Lowery, Morten Nielsen, Vernon Kesner, Chetan Shenoy, Paul Tregoing, Marc Grabanski, Dion Almaer, Andrew Sullivan, Keith Elsass, Tom Burke, Brian Ashenfelter, David Stuart, Karl Swedberg, Graeme, Brandon Hays, John Christopher, Gior, manoj reddy, Chad Smith, Jared Harbour, Minoru TODA, Chris Wigley, Daniel Mee, Mike, Handyface, Alex Jahraus, Carl Furrow, Rob Foulkrod, Max Shishkin, Leigh Penny Jr., Robert Ferguson, Mike van Hoenselaar, Hasse Schougaard, rajan venkataguru, Jeff Adams, Trae Robbins, Rolf Langenhuijzen, Jorge Antunes, Alex Koloskov, Hugh Greenish, Tim Jones, Jose Ochoa, Michael Brennan-White, Naga Harish Muvva, Barkóczi Dávid, Kitt Hodsden, Paul McGraw, Sascha Goldhofer, Andrew Metcalf, Markus Krogh, Michael Mathews, Matt Jared, Juanfran, Georgie Kirschner, Kenny Lee, Ted Zhang, Amit Pahwa, Inbal Sinai, Dan Raine, Schabse Laks, Michael Tervoort, Alexandre Abreu, Alan Joseph Williams, NicolasD, Cindy Wong, Reg Braithwaite, LocalPCGuy, Jon Friskics, Chris Merriman, John Pena, Jacob Katz, Sue Lockwood, Magnus Johansson, Jeremy Crapsey, Grzegorz Pawłowski, nico nuzzaci, Christine Wilks, Hans Bergren, charles montgomery, Ariel בר-לבב Fogel, Ivan Kolev, Daniel Campos, Hugh Wood, Christian Bradford, Frédéric Harper, Ionuţ Dan Popa, Jeff Trimble, Rupert Wood, Trey Carrico, Pancho Lopez, Joël kuijten, Tom A Marra, Jeff Jewiss, Jacob Rios, Paolo Di Stefano, Soledad Penades, Chris Gerber, Andrey Dolganov, Wil Moore III, Thomas Martineau, Kareem, Ben Thouret, Udi Nir, Morgan Laupies, jory carson-burson, Nathan L Smith, Eric Damon Walters, Derry Lozano-Hoyland, Geoffrey Wiseman, mkeehner, KatieK, Scott MacFarlane, Brian LaShomb, Adrien Mas, christopher ross, Ian Littman, Dan Atkinson, 
Elliot Jobe, Nick Dozier, Peter Wooley, John Hoover, dan, Martin A. Jackson, Héctor Fernando Hurtado, andy ennamorato, Paul Seltmann, Melissa Gore, Dave Pollard, Jack Smith, Philip Da Silva, Guy Israeli, @megalithic, Damian Crawford, Felix Gliesche, April Carter Grant, Heidi, jim tierney, Andrea Giammarchi, Nico Vignola, Don Jones, Chris Hartjes, Alex Howes, john gibbon, David J. Groom, BBox, Yu 'Dilys' Sun, Nate Steiner, Brandon Satrom, Brian Wyant, Wesley Hales, Ian Pouncey, Timothy Kevin Oxley, George Terezakis, sanjay raj, Jordan Harband, Marko McLion, Wolfgang Kaufmann, Pascal Peuckert, Dave Nugent, Markus Liebelt, Welling Guzman, Nick Cooley, Daniel Mesquita, Robert Syvarth, Chris Coyier, Rémy Bach, Adam Dougal, Alistair Duggin, David Loidolt, Ed Richer, Brian Chenault, GoldFire Studios, Carles Andrés, Carlos Cabo, Yuya Saito, roberto ricardo, Barnett Klane, Mike Moore, Kevin Marx, Justin Love, Joe Taylor, Paul Dijou, Michael Kohler, Rob Cassie, Mike Tierney, Cody Leroy Lindley, tofuji, Shimon Schwartz, Raymond, Luc De Brouwer, David Hayes, Rhys Brett-Bowen, Dmitry, Aziz Khoury, Dean, Scott Tolinski - Level Up, Clement Boirie, Djordje Lukic, Anton Kotenko, Rafael Corral, Philip Hurwitz, Jonathan Pidgeon, Jason Campbell, Joseph C., SwiftOne, Jan Hohner, Derick Bailey, getify, Daniel Cousineau, Chris Charlton, Eric Turner, David Turner, Joël Galeran, Dharma Vagabond, adam, Dirk van Bergen, dave ♥♫★ furf, Vedran Zakanj, Ryan McAllen, Natalie Patrice Tucker, Eric J. Bivona, Adam Spooner, Aaron Cavano, Kelly Packer, Eric J, Martin Drenovac, Emilis, Michael Pelikan, Scott F. 
Walter, Josh Freeman, Brandon Hudgeons, vijay chennupati, Bill Glennon, Robin R., Troy Forster, otaku_coder, Brad, Scott, Frederick Ostrander, Adam Brill, Seb Flippence, Michael Anderson, Jacob, Adam Randlett, Standard, Joshua Clanton, Sebastian Kouba, Chris Deck, SwordFire, Hannes Papenberg, Richard Woeber, hnzz, Rob Crowther, Jedidiah Broadbent, Sergey Chernyshev, Jay-Ar Jamon, Ben Combee, luciano bonachela, Mark Tomlinson, Kit Cambridge, Michael Melgares, Jacob Adams, Adrian Bruinhout, Bev Wieber, Scott Puleo, Thomas Herzog, April Leone, Daniel Mizieliński, Kees van Ginkel, Jon Abrams, Erwin Heiser, Avi Laviad, David newell, Jean-Francois Turcot, Niko Roberts, Erik Dana, Charles Neill, Aaron Holmes, Grzegorz Ziółkowski, Nathan Youngman, Timothy, Jacob Mather, Michael Allan, Mohit Seth, Ryan Ewing, Benjamin Van Treese, Marcelo Santos, Denis Wolf, Phil Keys, Chris Yung, Timo Tijhof, Martin Lekvall, Agendine, Greg Whitworth, Helen Humphrey, Dougal Campbell, Johannes Harth, Bruno Girin, Brian Hough, Darren Newton, Craig McPheat, Olivier Tille, Dennis Roethig, Mathias Bynens, Brendan Stromberger, sundeep, John Meyer, Ron Male, John F Croston III, gigante, Carl Bergenhem, B.J. 
May, Rebekah Tyler, Ted Foxberry, Jordan Reese, Terry Suitor, afeliz, Tom Kiefer, Darragh Duffy, Kevin Vanderbeken, Andy Pearson, Simon Mac Donald, Abid Din, Chris Joel, Tomas Theunissen, David Dick, Paul Grock, Brandon Wood, John Weis, dgrebb, Nick Jenkins, Chuck Lane, Johnny Megahan, marzsman, Tatu Tamminen, Geoffrey Knauth, Alexander Tarmolov, Jeremy Tymes, Chad Auld, Sean Parmelee, Rob Staenke, Dan Bender, Yannick derwa, Joshua Jones, Geert Plaisier, Tom LeZotte, Christen Simpson, Stefan Bruvik, Justin Falcone, Carlos Santana, Michael Weiss, Pablo Villoslada, Peter deHaan, Dimitris Iliopoulos, seyDoggy, Adam Jordens, Noah Kantrowitz, Amol M, Matthew Winnard, Dirk Ginader, Phinam Bui, David Rapson, Andrew Baxter, Florian Bougel, Michael George, Alban Escalier, Daniel Sellers, Sasha Rudan, John Green, Robert Kowalski, David I. Teixeira (@ditma, Charles Carpenter, Justin Yost, Sam S, Denis Ciccale, Kevin Sheurs, Yannick Croissant, Pau Fracés, Stephen McGowan, Shawn Searcy, Chris Ruppel, Kevin Lamping, Jessica Campbell, Christopher Schmitt, Sablons, Jonathan Reisdorf, Bunni Gek, Teddy Huff, Michael Mullany, Michael Fürstenberg, Carl Henderson, Rick Yoesting, Scott Nichols, Hernán Ciudad, Andrew Maier, Mike Stapp, Jesse Shawl, Sérgio Lopes, jsulak, Shawn Price, Joel Clermont, Chris Ridmann, Sean Timm, Jason Finch, Aiden Montgomery, Elijah Manor, Derek Gathright, Jesse Harlin, Dillon Curry, Courtney Myers, Diego Cadenas, Arne de Bree, João Paulo Dubas, James Taylor, Philipp Kraeutli, Mihai Păun, Sam Gharegozlou, joshjs, Matt Murchison, Eric Windham, Timo Behrmann, Andrew Hall, joshua price, Théophile Villard + +This book series is being produced in an open source fashion, including editing and production. We owe GitHub a debt of gratitude for making that sort of thing possible for the community! + +Thank you again to all the countless folks I didn't name but who I nonetheless owe thanks. 
May this book series be "owned" by all of us and serve to contribute to increasing awareness and understanding of the JavaScript language, to the benefit of all current and future community contributors. diff --git a/async & performance/ch1.md b/async & performance/ch1.md new file mode 100644 index 0000000..56c9d03 --- /dev/null +++ b/async & performance/ch1.md @@ -0,0 +1,893 @@ +# You Don't Know JS: Async & Performance +# Chapter 1: Asynchrony: Now & Later + +One of the most important and yet often misunderstood parts of programming in a language like JavaScript is how to express and manipulate program behavior spread out over a period of time. + +This is not just about what happens from the beginning of a `for` loop to the end of a `for` loop, which of course takes *some time* (microseconds to milliseconds) to complete. It's about what happens when part of your program runs *now*, and another part of your program runs *later* -- there's a gap between *now* and *later* where your program isn't actively executing. + +Practically all nontrivial programs ever written (especially in JS) have in some way or another had to manage this gap, whether that be in waiting for user input, requesting data from a database or file system, sending data across the network and waiting for a response, or performing a repeated task at a fixed interval of time (like animation). In all these various ways, your program has to manage state across the gap in time. As they famously say in London (of the chasm between the subway door and the platform): "mind the gap." + +In fact, the relationship between the *now* and *later* parts of your program is at the heart of asynchronous programming. + +Asynchronous programming has been around since the beginning of JS, for sure. But most JS developers have never really carefully considered exactly how and why it crops up in their programs, or explored various *other* ways to handle it. 
The *good enough* approach has always been the humble callback function. Many to this day will insist that callbacks are more than sufficient. + +But as JS continues to grow in both scope and complexity, to meet the ever-widening demands of a first-class programming language that runs in browsers and servers and every conceivable device in between, the pains by which we manage asynchrony are becoming increasingly crippling, and they cry out for approaches that are both more capable and more reason-able. + +While this all may seem rather abstract right now, I assure you we'll tackle it more completely and concretely as we go on through this book. We'll explore a variety of emerging techniques for async JavaScript programming over the next several chapters. + +But before we can get there, we're going to have to understand much more deeply what asynchrony is and how it operates in JS. + +## A Program in Chunks + +You may write your JS program in one *.js* file, but your program is almost certainly comprised of several chunks, only one of which is going to execute *now*, and the rest of which will execute *later*. The most common unit of *chunk* is the `function`. + +The problem most developers new to JS seem to have is that *later* doesn't happen strictly and immediately after *now*. In other words, tasks that cannot complete *now* are, by definition, going to complete asynchronously, and thus we will not have blocking behavior as you might intuitively expect or want. + +Consider: + +```js +// ajax(..) is some arbitrary Ajax function given by a library +var data = ajax( "http://some.url.1" ); + +console.log( data ); +// Oops! `data` generally won't have the Ajax results +``` + +You're probably aware that standard Ajax requests don't complete synchronously, which means the `ajax(..)` function does not yet have any value to return back to be assigned to `data` variable. 
If `ajax(..)` *could* block until the response came back, then the `data = ..` assignment would work fine. + +But that's not how we do Ajax. We make an asynchronous Ajax request *now*, and we won't get the results back until *later*. + +The simplest (but definitely not only, or necessarily even best!) way of "waiting" from *now* until *later* is to use a function, commonly called a callback function: + +```js +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", function myCallbackFunction(data){ + + console.log( data ); // Yay, I gots me some `data`! + +} ); +``` + +**Warning:** You may have heard that it's possible to make synchronous Ajax requests. While that's technically true, you should never, ever do it, under any circumstances, because it locks the browser UI (buttons, menus, scrolling, etc.) and prevents any user interaction whatsoever. This is a terrible idea, and should always be avoided. + +Before you protest in disagreement, no, your desire to avoid the mess of callbacks is *not* justification for blocking, synchronous Ajax. + +For example, consider this code: + +```js +function now() { + return 21; +} + +function later() { + answer = answer * 2; + console.log( "Meaning of life:", answer ); +} + +var answer = now(); + +setTimeout( later, 1000 ); // Meaning of life: 42 +``` + +There are two chunks to this program: the stuff that will run *now*, and the stuff that will run *later*. It should be fairly obvious what those two chunks are, but let's be super explicit: + +Now: +```js +function now() { + return 21; +} + +function later() { .. } + +var answer = now(); + +setTimeout( later, 1000 ); +``` + +Later: +```js +answer = answer * 2; +console.log( "Meaning of life:", answer ); +``` + +The *now* chunk runs right away, as soon as you execute your program. 
But `setTimeout(..)` also sets up an event (a timeout) to happen *later*, so the contents of the `later()` function will be executed at a later time (1,000 milliseconds from now). + +Any time you wrap a portion of code into a `function` and specify that it should be executed in response to some event (timer, mouse click, Ajax response, etc.), you are creating a *later* chunk of your code, and thus introducing asynchrony to your program. + +### Async Console + +There is no specification or set of requirements around how the `console.*` methods work -- they are not officially part of JavaScript, but are instead added to JS by the *hosting environment* (see the *Types & Grammar* title of this book series). + +So, different browsers and JS environments do as they please, which can sometimes lead to confusing behavior. + +In particular, there are some browsers and some conditions that `console.log(..)` does not actually immediately output what it's given. The main reason this may happen is because I/O is a very slow and blocking part of many programs (not just JS). So, it may perform better (from the page/UI perspective) for a browser to handle `console` I/O asynchronously in the background, without you perhaps even knowing that occurred. + +A not terribly common, but possible, scenario where this could be *observable* (not from code itself but from the outside): + +```js +var a = { + index: 1 +}; + +// later +console.log( a ); // ?? + +// even later +a.index++; +``` + +We'd normally expect to see the `a` object be snapshotted at the exact moment of the `console.log(..)` statement, printing something like `{ index: 1 }`, such that in the next statement when `a.index++` happens, it's modifying something different than, or just strictly after, the output of `a`. + +Most of the time, the preceding code will probably produce an object representation in your developer tools' console that's what you'd expect. 
But it's possible this same code could run in a situation where the browser felt it needed to defer the console I/O to the background, in which case it's *possible* that by the time the object is represented in the browser console, the `a.index++` has already happened, and it shows `{ index: 2 }`. + +It's a moving target under what conditions exactly `console` I/O will be deferred, or even whether it will be observable. Just be aware of this possible asynchronicity in I/O in case you ever run into issues in debugging where objects have been modified *after* a `console.log(..)` statement and yet you see the unexpected modifications show up. + +**Note:** If you run into this rare scenario, the best option is to use breakpoints in your JS debugger instead of relying on `console` output. The next best option would be to force a "snapshot" of the object in question by serializing it to a `string`, like with `JSON.stringify(..)`. + +## Event Loop + +Let's make a (perhaps shocking) claim: despite clearly allowing asynchronous JS code (like the timeout we just looked at), up until recently (ES6), JavaScript itself has actually never had any direct notion of asynchrony built into it. + +**What!?** That seems like a crazy claim, right? In fact, it's quite true. The JS engine itself has never done anything more than execute a single chunk of your program at any given moment, when asked to. + +"Asked to." By whom? That's the important part! + +The JS engine doesn't run in isolation. It runs inside a *hosting environment*, which is for most developers the typical web browser. Over the last several years (but by no means exclusively), JS has expanded beyond the browser into other environments, such as servers, via things like Node.js. In fact, JavaScript gets embedded into all kinds of devices these days, from robots to lightbulbs. 
+ +But the one common "thread" (that's a not-so-subtle asynchronous joke, for what it's worth) of all these environments is that they have a mechanism in them that handles executing multiple chunks of your program *over time*, at each moment invoking the JS engine, called the "event loop." + +In other words, the JS engine has had no innate sense of *time*, but has instead been an on-demand execution environment for any arbitrary snippet of JS. It's the surrounding environment that has always *scheduled* "events" (JS code executions). + +So, for example, when your JS program makes an Ajax request to fetch some data from a server, you set up the "response" code in a function (commonly called a "callback"), and the JS engine tells the hosting environment, "Hey, I'm going to suspend execution for now, but whenever you finish with that network request, and you have some data, please *call* this function *back*." + +The browser is then set up to listen for the response from the network, and when it has something to give you, it schedules the callback function to be executed by inserting it into the *event loop*. + +So what is the *event loop*? + +Let's conceptualize it first through some fake-ish code: + +```js +// `eventLoop` is an array that acts as a queue (first-in, first-out) +var eventLoop = [ ]; +var event; + +// keep going "forever" +while (true) { + // perform a "tick" + if (eventLoop.length > 0) { + // get the next event in the queue + event = eventLoop.shift(); + + // now, execute the next event + try { + event(); + } + catch (err) { + reportError(err); + } + } +} +``` + +This is, of course, vastly simplified pseudocode to illustrate the concepts. But it should be enough to help get a better understanding. + +As you can see, there's a continuously running loop represented by the `while` loop, and each iteration of this loop is called a "tick." For each tick, if an event is waiting on the queue, it's taken off and executed. 
These events are your function callbacks. + +It's important to note that `setTimeout(..)` doesn't put your callback on the event loop queue. What it does is set up a timer; when the timer expires, the environment places your callback into the event loop, such that some future tick will pick it up and execute it. + +What if there are already 20 items in the event loop at that moment? Your callback waits. It gets in line behind the others -- there's not normally a path for preempting the queue and skipping ahead in line. This explains why `setTimeout(..)` timers may not fire with perfect temporal accuracy. You're guaranteed (roughly speaking) that your callback won't fire *before* the time interval you specify, but it can happen at or after that time, depending on the state of the event queue. + +So, in other words, your program is generally broken up into lots of small chunks, which happen one after the other in the event loop queue. And technically, other events not related directly to your program can be interleaved within the queue as well. + +**Note:** We mentioned "up until recently" in relation to ES6 changing the nature of where the event loop queue is managed. It's mostly a formal technicality, but ES6 now specifies how the event loop works, which means technically it's within the purview of the JS engine, rather than just the *hosting environment*. One main reason for this change is the introduction of ES6 Promises, which we'll discuss in Chapter 3, because they require the ability to have direct, fine-grained control over scheduling operations on the event loop queue (see the discussion of `setTimeout(..0)` in the "Cooperation" section). + +## Parallel Threading + +It's very common to conflate the terms "async" and "parallel," but they are actually quite different. Remember, async is about the gap between *now* and *later*. But parallel is about things being able to occur simultaneously. 
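To make that distinction a bit more concrete, here's a minimal sketch. The names `log` and `later`, and the 50ms figure, are assumptions chosen purely for illustration. A timer is set for `0` milliseconds, and then the *now* chunk keeps the engine busy for a while. If timer callbacks ran in parallel with our code, `later()` could fire in the middle of the loop. It doesn't; it has to wait for *later*.

```js
var log = [];
var start = Date.now();

// ask the environment to schedule this chunk for *later*
setTimeout( function later(){
	log.push( "timer callback" );
}, 0 );

// keep the *now* chunk busy for roughly 50ms
while (Date.now() - start < 50) {
	// spin
}

log.push( "busy loop done" );

// the timer expired long ago, but its callback still hasn't run:
// `log` holds only "busy loop done" at this point
console.log( log );
```

The callback gets its turn only once the current chunk has finished and the event loop moves on to a future tick.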
+ +The most common tools for parallel computing are processes and threads. Processes and threads execute independently and may execute simultaneously: on separate processors, or even separate computers, but multiple threads can share the memory of a single process. + +An event loop, by contrast, breaks its work into tasks and executes them in serial, disallowing parallel access and changes to shared memory. Parallelism and "serialism" can coexist in the form of cooperating event loops in separate threads. + +The interleaving of parallel threads of execution and the interleaving of asynchronous events occur at very different levels of granularity. + +For example: + +```js +function later() { + answer = answer * 2; + console.log( "Meaning of life:", answer ); +} +``` + +While the entire contents of `later()` would be regarded as a single event loop queue entry, when thinking about a thread this code would run on, there's actually perhaps a dozen different low-level operations. For example, `answer = answer * 2` requires first loading the current value of `answer`, then putting `2` somewhere, then performing the multiplication, then taking the result and storing it back into `answer`. + +In a single-threaded environment, it really doesn't matter that the items in the thread queue are low-level operations, because nothing can interrupt the thread. But if you have a parallel system, where two different threads are operating in the same program, you could very likely have unpredictable behavior. + +Consider: + +```js +var a = 20; + +function foo() { + a = a + 1; +} + +function bar() { + a = a * 2; +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +In JavaScript's single-threaded behavior, if `foo()` runs before `bar()`, the result is that `a` has `42`, but if `bar()` runs before `foo()` the result in `a` will be `41`. 
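We can check both of those numbers without any real Ajax at all. In this sketch, `simulate(..)` is a hypothetical helper (not part of the example above) that simply runs the two functions in whichever order we ask for:

```js
// `simulate(..)` is a made-up helper, only for illustration
function simulate(first) {
	var a = 20;

	function foo() {
		a = a + 1;
	}

	function bar() {
		a = a * 2;
	}

	// run the two "events" in the requested order
	if (first == "foo") {
		foo();
		bar();
	}
	else {
		bar();
		foo();
	}

	return a;
}

console.log( simulate( "foo" ) );	// 42
console.log( simulate( "bar" ) );	// 41
```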
+ +If JS events sharing the same data executed in parallel, though, the problems would be much more subtle. Consider these two lists of pseudocode tasks as the threads that could respectively run the code in `foo()` and `bar()`, and consider what happens if they are running at exactly the same time: + +Thread 1 (`X` and `Y` are temporary memory locations): +``` +foo(): + a. load value of `a` in `X` + b. store `1` in `Y` + c. add `X` and `Y`, store result in `X` + d. store value of `X` in `a` +``` + +Thread 2 (`X` and `Y` are temporary memory locations): +``` +bar(): + a. load value of `a` in `X` + b. store `2` in `Y` + c. multiply `X` and `Y`, store result in `X` + d. store value of `X` in `a` +``` + +Now, let's say that the two threads are running truly in parallel. You can probably spot the problem, right? They use shared memory locations `X` and `Y` for their temporary steps. + +What's the end result in `a` if the steps happen like this? + +``` +1a (load value of `a` in `X` ==> `20`) +2a (load value of `a` in `X` ==> `20`) +1b (store `1` in `Y` ==> `1`) +2b (store `2` in `Y` ==> `2`) +1c (add `X` and `Y`, store result in `X` ==> `22`) +1d (store value of `X` in `a` ==> `22`) +2c (multiply `X` and `Y`, store result in `X` ==> `44`) +2d (store value of `X` in `a` ==> `44`) +``` + +The result in `a` will be `44`. But what about this ordering? + +``` +1a (load value of `a` in `X` ==> `20`) +2a (load value of `a` in `X` ==> `20`) +2b (store `2` in `Y` ==> `2`) +1b (store `1` in `Y` ==> `1`) +2c (multiply `X` and `Y`, store result in `X` ==> `20`) +1c (add `X` and `Y`, store result in `X` ==> `21`) +1d (store value of `X` in `a` ==> `21`) +2d (store value of `X` in `a` ==> `21`) +``` + +The result in `a` will be `21`. + +So, threaded programming is very tricky, because if you don't take special steps to prevent this kind of interruption/interleaving from happening, you can get very surprising, nondeterministic behavior that frequently leads to headaches. 
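JS can't actually run these steps on two threads, but we can fake the interleaving to double-check the arithmetic. In this sketch, the `runSteps(..)` helper and its step labels are made up for illustration: each low-level step is a tiny function operating on the shared `X` and `Y`, and we replay the steps in whatever order we like.

```js
// `runSteps(..)` is a hypothetical helper that replays the
// low-level steps of `foo()` (1a-1d) and `bar()` (2a-2d)
function runSteps(order) {
	var a = 20, X, Y;

	var steps = {
		"1a": function(){ X = a; },
		"1b": function(){ Y = 1; },
		"1c": function(){ X = X + Y; },
		"1d": function(){ a = X; },
		"2a": function(){ X = a; },
		"2b": function(){ Y = 2; },
		"2c": function(){ X = X * Y; },
		"2d": function(){ a = X; }
	};

	order.forEach( function(step){
		steps[step]();
	} );

	return a;
}

// the two interleavings shown above
console.log( runSteps( ["1a","2a","1b","2b","1c","1d","2c","2d"] ) );	// 44
console.log( runSteps( ["1a","2a","2b","1b","2c","1c","1d","2d"] ) );	// 21

// no interleaving at all: `foo()` completes, then `bar()`
console.log( runSteps( ["1a","1b","1c","1d","2a","2b","2c","2d"] ) );	// 42
```

Only the last ordering, where one function's steps all finish before the other's begin, matches what single-threaded JS would give you.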
+ +JavaScript never shares data across threads, which means *that* level of nondeterminism isn't a concern. But that doesn't mean JS is always deterministic. Remember earlier, where the relative ordering of `foo()` and `bar()` produces two different results (`41` or `42`)? + +**Note:** It may not be obvious yet, but not all nondeterminism is bad. Sometimes it's irrelevant, and sometimes it's intentional. We'll see more examples of that throughout this and the next few chapters. + +### Run-to-Completion + +Because of JavaScript's single-threading, the code inside of `foo()` (and `bar()`) is atomic, which means that once `foo()` starts running, the entirety of its code will finish before any of the code in `bar()` can run, or vice versa. This is called "run-to-completion" behavior. + +In fact, the run-to-completion semantics are more obvious when `foo()` and `bar()` have more code in them, such as: + +```js +var a = 1; +var b = 2; + +function foo() { + a++; + b = b * a; + a = b + 3; +} + +function bar() { + b--; + a = 8 + b; + b = a * 2; +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +Because `foo()` can't be interrupted by `bar()`, and `bar()` can't be interrupted by `foo()`, this program only has two possible outcomes depending on which starts running first -- if threading were present, and the individual statements in `foo()` and `bar()` could be interleaved, the number of possible outcomes would be greatly increased! + +Chunk 1 is synchronous (happens *now*), but chunks 2 and 3 are asynchronous (happen *later*), which means their execution will be separated by a gap of time. 
+ +Chunk 1: +```js +var a = 1; +var b = 2; +``` + +Chunk 2 (`foo()`): +```js +a++; +b = b * a; +a = b + 3; +``` + +Chunk 3 (`bar()`): +```js +b--; +a = 8 + b; +b = a * 2; +``` + +Chunks 2 and 3 may happen in either-first order, so there are two possible outcomes for this program, as illustrated here: + +Outcome 1: +```js +var a = 1; +var b = 2; + +// foo() +a++; +b = b * a; +a = b + 3; + +// bar() +b--; +a = 8 + b; +b = a * 2; + +a; // 11 +b; // 22 +``` + +Outcome 2: +```js +var a = 1; +var b = 2; + +// bar() +b--; +a = 8 + b; +b = a * 2; + +// foo() +a++; +b = b * a; +a = b + 3; + +a; // 183 +b; // 180 +``` + +Two outcomes from the same code means we still have nondeterminism! But it's at the function (event) ordering level, rather than at the statement ordering level (or, in fact, the expression operation ordering level) as it is with threads. In other words, it's *more deterministic* than threads would have been. + +As applied to JavaScript's behavior, this function-ordering nondeterminism is the common term "race condition," as `foo()` and `bar()` are racing against each other to see which runs first. Specifically, it's a "race condition" because you cannot predict reliably how `a` and `b` will turn out. + +**Note:** If there was a function in JS that somehow did not have run-to-completion behavior, we could have many more possible outcomes, right? It turns out ES6 introduces just such a thing (see Chapter 4 "Generators"), but don't worry right now, we'll come back to that! + +## Concurrency + +Let's imagine a site that displays a list of status updates (like a social network news feed) that progressively loads as the user scrolls down the list. To make such a feature work correctly, (at least) two separate "processes" will need to be executing *simultaneously* (i.e., during the same window of time, but not necessarily at the same instant). 
+ +**Note:** We're using "process" in quotes here because they aren't true operating system–level processes in the computer science sense. They're virtual processes, or tasks, that represent a logically connected, sequential series of operations. We'll simply prefer "process" over "task" because terminology-wise, it will match the definitions of the concepts we're exploring. + +The first "process" will respond to `onscroll` events (making Ajax requests for new content) as they fire when the user has scrolled the page further down. The second "process" will receive Ajax responses back (to render content onto the page). + +Obviously, if a user scrolls fast enough, you may see two or more `onscroll` events fired during the time it takes to get the first response back and process, and thus you're going to have `onscroll` events and Ajax response events firing rapidly, interleaved with each other. + +Concurrency is when two or more "processes" are executing simultaneously over the same period, regardless of whether their individual constituent operations happen *in parallel* (at the same instant on separate processors or cores) or not. You can think of concurrency then as "process"-level (or task-level) parallelism, as opposed to operation-level parallelism (separate-processor threads). + +**Note:** Concurrency also introduces an optional notion of these "processes" interacting with each other. We'll come back to that later. 
+ +For a given window of time (a few seconds worth of a user scrolling), let's visualize each independent "process" as a series of events/operations: + +"Process" 1 (`onscroll` events): +``` +onscroll, request 1 +onscroll, request 2 +onscroll, request 3 +onscroll, request 4 +onscroll, request 5 +onscroll, request 6 +onscroll, request 7 +``` + +"Process" 2 (Ajax response events): +``` +response 1 +response 2 +response 3 +response 4 +response 5 +response 6 +response 7 +``` + +It's quite possible that an `onscroll` event and an Ajax response event could be ready to be processed at exactly the same *moment*. For example let's visualize these events in a timeline: + +``` +onscroll, request 1 +onscroll, request 2 response 1 +onscroll, request 3 response 2 +response 3 +onscroll, request 4 +onscroll, request 5 +onscroll, request 6 response 4 +onscroll, request 7 +response 6 +response 5 +response 7 +``` + +But, going back to our notion of the event loop from earlier in the chapter, JS is only going to be able to handle one event at a time, so either `onscroll, request 2` is going to happen first or `response 1` is going to happen first, but they cannot happen at literally the same moment. Just like kids at a school cafeteria, no matter what crowd they form outside the doors, they'll have to merge into a single line to get their lunch! + +Let's visualize the interleaving of all these events onto the event loop queue. + +Event Loop Queue: +``` +onscroll, request 1 <--- Process 1 starts +onscroll, request 2 +response 1 <--- Process 2 starts +onscroll, request 3 +response 2 +response 3 +onscroll, request 4 +onscroll, request 5 +onscroll, request 6 +response 4 +onscroll, request 7 <--- Process 1 finishes +response 6 +response 5 +response 7 <--- Process 2 finishes +``` + +"Process 1" and "Process 2" run concurrently (task-level parallel), but their individual events run sequentially on the event loop queue. 
+ +By the way, notice how `response 6` and `response 5` came back out of expected order? + +The single-threaded event loop is one expression of concurrency (there are certainly others, which we'll come back to later). + +### Noninteracting + +As two or more "processes" are interleaving their steps/events concurrently within the same program, they don't necessarily need to interact with each other if the tasks are unrelated. **If they don't interact, nondeterminism is perfectly acceptable.** + +For example: + +```js +var res = {}; + +function foo(results) { + res.foo = results; +} + +function bar(results) { + res.bar = results; +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +`foo()` and `bar()` are two concurrent "processes," and it's nondeterminate which order they will be fired in. But we've constructed the program so it doesn't matter what order they fire in, because they act independently and as such don't need to interact. + +This is not a "race condition" bug, as the code will always work correctly, regardless of the ordering. + +### Interaction + +More commonly, concurrent "processes" will by necessity interact, indirectly through scope and/or the DOM. When such interaction will occur, you need to coordinate these interactions to prevent "race conditions," as described earlier. + +Here's a simple example of two concurrent "processes" that interact because of implied ordering, which is only *sometimes broken*: + +```js +var res = []; + +function response(data) { + res.push( data ); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", response ); +ajax( "http://some.url.2", response ); +``` + +The concurrent "processes" are the two `response()` calls that will be made to handle the Ajax responses. They can happen in either-first order. 
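To watch the either-first ordering actually happen, here's a sketch that swaps in a hypothetical `fakeAjax(..)` for the real utility. It's an assumption for illustration only: it fakes network latency with `setTimeout(..)`, using made-up delays chosen so the second request answers first.

```js
// `fakeAjax(..)` is a made-up stand-in for a real Ajax utility;
// the delays are arbitrary, picked so the second URL responds first
function fakeAjax(url,cb) {
	var delay = (url == "http://some.url.1") ? 50 : 10;

	setTimeout( function(){
		cb( "results from " + url );
	}, delay );
}

var res = [];

function response(data) {
	res.push( data );
}

fakeAjax( "http://some.url.1", response );
fakeAjax( "http://some.url.2", response );

// check back once both responses have had time to arrive
setTimeout( function(){
	console.log( res[0] );	// results from http://some.url.2
	console.log( res[1] );	// results from http://some.url.1
}, 100 );
```

The requests went out in one order, but `res` was filled in the order the responses came back.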
+ +Let's assume the expected behavior is that `res[0]` has the results of the `"http://some.url.1"` call, and `res[1]` has the results of the `"http://some.url.2"` call. Sometimes that will be the case, but sometimes they'll be flipped, depending on which call finishes first. There's a pretty good likelihood that this nondeterminism is a "race condition" bug. + +**Note:** Be extremely wary of assumptions you might tend to make in these situations. For example, it's not uncommon for a developer to observe that `"http://some.url.2"` is "always" much slower to respond than `"http://some.url.1"`, perhaps by virtue of what tasks they're doing (e.g., one performing a database task and the other just fetching a static file), so the observed ordering seems to always be as expected. Even if both requests go to the same server, and *it* intentionally responds in a certain order, there's no *real* guarantee of what order the responses will arrive back in the browser. + +So, to address such a race condition, you can coordinate ordering interaction: + +```js +var res = []; + +function response(data) { + if (data.url == "http://some.url.1") { + res[0] = data; + } + else if (data.url == "http://some.url.2") { + res[1] = data; + } +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", response ); +ajax( "http://some.url.2", response ); +``` + +Regardless of which Ajax response comes back first, we inspect the `data.url` (assuming one is returned from the server, of course!) to figure out which position the response data should occupy in the `res` array. `res[0]` will always hold the `"http://some.url.1"` results and `res[1]` will always hold the `"http://some.url.2"` results. Through simple coordination, we eliminated the "race condition" nondeterminism. + +The same reasoning from this scenario would apply if multiple concurrent function calls were interacting with each other through the shared DOM, like one updating the contents of a `
<div>` and the other updating the style or attributes of the `<div>
` (e.g., to make the DOM element visible once it has content). You probably wouldn't want to show the DOM element before it had content, so the coordination must ensure proper ordering interaction. + +Some concurrency scenarios are *always broken* (not just *sometimes*) without coordinated interaction. Consider: + +```js +var a, b; + +function foo(x) { + a = x * 2; + baz(); +} + +function bar(y) { + b = y * 2; + baz(); +} + +function baz() { + console.log(a + b); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +In this example, whether `foo()` or `bar()` fires first, it will always cause `baz()` to run too early (either `a` or `b` will still be `undefined`), but the second invocation of `baz()` will work, as both `a` and `b` will be available. + +There are different ways to address such a condition. Here's one simple way: + +```js +var a, b; + +function foo(x) { + a = x * 2; + if (a && b) { + baz(); + } +} + +function bar(y) { + b = y * 2; + if (a && b) { + baz(); + } +} + +function baz() { + console.log( a + b ); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +The `if (a && b)` conditional around the `baz()` call is traditionally called a "gate," because we're not sure what order `a` and `b` will arrive, but we wait for both of them to get there before we proceed to open the gate (call `baz()`). + +Another concurrency interaction condition you may run into is sometimes called a "race," but more correctly called a "latch." It's characterized by "only the first one wins" behavior. Here, nondeterminism is acceptable, in that you are explicitly saying it's OK for the "race" to the finish line to have only one winner. 
+ +Consider this broken code: + +```js +var a; + +function foo(x) { + a = x * 2; + baz(); +} + +function bar(x) { + a = x / 2; + baz(); +} + +function baz() { + console.log( a ); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +Whichever one (`foo()` or `bar()`) fires last will not only overwrite the assigned `a` value from the other, but it will also duplicate the call to `baz()` (likely undesired). + +So, we can coordinate the interaction with a simple latch, to let only the first one through: + +```js +var a; + +function foo(x) { + if (a == undefined) { + a = x * 2; + baz(); + } +} + +function bar(x) { + if (a == undefined) { + a = x / 2; + baz(); + } +} + +function baz() { + console.log( a ); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", foo ); +ajax( "http://some.url.2", bar ); +``` + +The `if (a == undefined)` conditional allows only the first of `foo()` or `bar()` through, and the second (and indeed any subsequent) calls would just be ignored. There's just no virtue in coming in second place! + +**Note:** In all these scenarios, we've been using global variables for simplistic illustration purposes, but there's nothing about our reasoning here that requires it. As long as the functions in question can access the variables (via scope), they'll work as intended. Relying on lexically scoped variables (see the *Scope & Closures* title of this book series), and in fact global variables as in these examples, is one obvious downside to these forms of concurrency coordination. As we go through the next few chapters, we'll see other ways of coordination that are much cleaner in that respect. + +### Cooperation + +Another expression of concurrency coordination is called "cooperative concurrency." Here, the focus isn't so much on interacting via value sharing in scopes (though that's obviously still allowed!). 
The goal is to take a long-running "process" and break it up into steps or batches so that other concurrent "processes" have a chance to interleave their operations into the event loop queue. + +For example, consider an Ajax response handler that needs to run through a long list of results to transform the values. We'll use `Array#map(..)` to keep the code shorter: + +```js +var res = []; + +// `response(..)` receives array of results from the Ajax call +function response(data) { + // add onto existing `res` array + res = res.concat( + // make a new transformed array with all `data` values doubled + data.map( function(val){ + return val * 2; + } ) + ); +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", response ); +ajax( "http://some.url.2", response ); +``` + +If `"http://some.url.1"` gets its results back first, the entire list will be mapped into `res` all at once. If it's a few thousand or less records, this is not generally a big deal. But if it's say 10 million records, that can take a while to run (several seconds on a powerful laptop, much longer on a mobile device, etc.). + +While such a "process" is running, nothing else in the page can happen, including no other `response(..)` calls, no UI updates, not even user events like scrolling, typing, button clicking, and the like. That's pretty painful. + +So, to make a more cooperatively concurrent system, one that's friendlier and doesn't hog the event loop queue, you can process these results in asynchronous batches, after each one "yielding" back to the event loop to let other waiting events happen. 
+ +Here's a very simple approach: + +```js +var res = []; + +// `response(..)` receives array of results from the Ajax call +function response(data) { + // let's just do 1000 at a time + var chunk = data.splice( 0, 1000 ); + + // add onto existing `res` array + res = res.concat( + // make a new transformed array with all `chunk` values doubled + chunk.map( function(val){ + return val * 2; + } ) + ); + + // anything left to process? + if (data.length > 0) { + // async schedule next batch + setTimeout( function(){ + response( data ); + }, 0 ); + } +} + +// ajax(..) is some arbitrary Ajax function given by a library +ajax( "http://some.url.1", response ); +ajax( "http://some.url.2", response ); +``` + +We process the data set in maximum-sized chunks of 1,000 items. By doing so, we ensure a short-running "process," even if that means many more subsequent "processes," as the interleaving onto the event loop queue will give us a much more responsive (performant) site/app. + +Of course, we're not interaction-coordinating the ordering of any of these "processes," so the order of results in `res` won't be predictable. If ordering was required, you'd need to use interaction techniques like those we discussed earlier, or ones we will cover in later chapters of this book. + +We use the `setTimeout(..0)` (hack) for async scheduling, which basically just means "stick this function at the end of the current event loop queue." + +**Note:** `setTimeout(..0)` is not technically inserting an item directly onto the event loop queue. The timer will insert the event at its next opportunity. For example, two subsequent `setTimeout(..0)` calls would not be strictly guaranteed to be processed in call order, so it *is* possible to see various conditions like timer drift where the ordering of such events isn't predictable. In Node.js, a similar approach is `process.nextTick(..)`. 
Despite how convenient (and usually more performant) it would be, there's not a single direct way (at least yet) across all environments to ensure async event ordering. We cover this topic in more detail in the next section. + +## Jobs + +As of ES6, there's a new concept layered on top of the event loop queue, called the "Job queue." The most likely exposure you'll have to it is with the asynchronous behavior of Promises (see Chapter 3). + +Unfortunately, at the moment it's a mechanism without an exposed API, and thus demonstrating it is a bit more convoluted. So we're going to have to just describe it conceptually, such that when we discuss async behavior with Promises in Chapter 3, you'll understand how those actions are being scheduled and processed. + +So, the best way to think about this that I've found is that the "Job queue" is a queue hanging off the end of every tick in the event loop queue. Certain async-implied actions that may occur during a tick of the event loop will not cause a whole new event to be added to the event loop queue, but will instead add an item (aka Job) to the end of the current tick's Job queue. + +It's kinda like saying, "oh, here's this other thing I need to do *later*, but make sure it happens right away before anything else can happen." + +Or, to use a metaphor: the event loop queue is like an amusement park ride, where once you finish the ride, you have to go to the back of the line to ride again. But the Job queue is like finishing the ride, but then cutting in line and getting right back on. + +A Job can also cause more Jobs to be added to the end of the same queue. So, it's theoretically possible that a Job "loop" (a Job that keeps adding another Job, etc.) could spin indefinitely, thus starving the program of the ability to move on to the next event loop tick. This would conceptually be almost the same as just expressing a long-running or infinite loop (like `while (true) ..`) in your code. 
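To see that starvation risk in (harmless) miniature, here's a rough sketch. It assumes a runtime with ES6 Promises, whose `then(..)` callbacks are scheduled as Jobs (more on that in Chapter 3). `chainJobs(..)` and its parameters are names made up for this illustration:

```js
// Sketch only: assumes ES6 Promises, whose `then(..)` callbacks
// are scheduled as Jobs. `chainJobs(..)` is a made-up name.
function chainJobs(count, done) {
	var jobsRun = 0;

	// set up first, but a timer is a whole new event on the
	// event loop queue, so it has to wait for a later tick
	setTimeout( function(){
		done( jobsRun );
	}, 0 );

	function job() {
		jobsRun++;

		if (jobsRun < count) {
			// a Job adding another Job to the end of the same queue
			Promise.resolve().then( job );
		}
	}

	Promise.resolve().then( job );
}

chainJobs( 1000, function(jobsRun){
	console.log( jobsRun );		// 1000
} );
```

Even though the timer was set up first, every one of the chained Jobs runs before it gets its turn. Remove the `count` limit and the timer would never fire at all.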
+ +Jobs are kind of like the spirit of the `setTimeout(..0)` hack, but implemented in such a way as to have a much more well-defined and guaranteed ordering: **later, but as soon as possible**. + +Let's imagine an API for scheduling Jobs (directly, without hacks), and call it `schedule(..)`. Consider: + +```js +console.log( "A" ); + +setTimeout( function(){ + console.log( "B" ); +}, 0 ); + +// theoretical "Job API" +schedule( function(){ + console.log( "C" ); + + schedule( function(){ + console.log( "D" ); + } ); +} ); +``` + +You might expect this to print out `A B C D`, but instead it would print out `A C D B`, because the Jobs happen at the end of the current event loop tick, and the timer fires to schedule for the *next* event loop tick (if available!). + +In Chapter 3, we'll see that the asynchronous behavior of Promises is based on Jobs, so it's important to keep clear how that relates to event loop behavior. + +## Statement Ordering + +The order in which we express statements in our code is not necessarily the same order as the JS engine will execute them. That may seem like quite a strange assertion to make, so we'll just briefly explore it. + +But before we do, we should be crystal clear on something: the rules/grammar of the language (see the *Types & Grammar* title of this book series) dictate a very predictable and reliable behavior for statement ordering from the program point of view. So what we're about to discuss are **not things you should ever be able to observe** in your JS program. + +**Warning:** If you are ever able to *observe* compiler statement reordering like we're about to illustrate, that'd be a clear violation of the specification, and it would unquestionably be due to a bug in the JS engine in question -- one which should promptly be reported and fixed! But it's vastly more common that you *suspect* something crazy is happening in the JS engine, when in fact it's just a bug (probably a "race condition"!) 
in your own code -- so look there first, and again and again. The JS debugger, using breakpoints and stepping through code line by line, will be your most powerful tool for sniffing out such bugs in *your code*. + +Consider: + +```js +var a, b; + +a = 10; +b = 30; + +a = a + 1; +b = b + 1; + +console.log( a + b ); // 42 +``` + +This code has no expressed asynchrony to it (other than the rare `console` async I/O discussed earlier!), so the most likely assumption is that it would process line by line in top-down fashion. + +But it's *possible* that the JS engine, after compiling this code (yes, JS is compiled -- see the *Scope & Closures* title of this book series!) might find opportunities to run your code faster by rearranging (safely) the order of these statements. Essentially, as long as you can't observe the reordering, anything's fair game. + +For example, the engine might find it's faster to actually execute the code like this: + +```js +var a, b; + +a = 10; +a++; + +b = 30; +b++; + +console.log( a + b ); // 42 +``` + +Or this: + +```js +var a, b; + +a = 11; +b = 31; + +console.log( a + b ); // 42 +``` + +Or even: + +```js +// because `a` and `b` aren't used anymore, we can +// inline and don't even need them! +console.log( 42 ); // 42 +``` + +In all these cases, the JS engine is performing safe optimizations during its compilation, as the end *observable* result will be the same. + +But here's a scenario where these specific optimizations would be unsafe and thus couldn't be allowed (of course, not to say that it's not optimized at all): + +```js +var a, b; + +a = 10; +b = 30; + +// we need `a` and `b` in their preincremented state! 
+console.log( a * b ); // 300 + +a = a + 1; +b = b + 1; + +console.log( a + b ); // 42 +``` + +Other examples where the compiler reordering could create observable side effects (and thus must be disallowed) would include things like any function call with side effects (even and especially getter functions), or ES6 Proxy objects (see the *ES6 & Beyond* title of this book series). + +Consider: + +```js +function foo() { + console.log( b ); + return 1; +} + +var a, b, c; + +// ES5.1 getter literal syntax +c = { + get bar() { + console.log( a ); + return 1; + } +}; + +a = 10; +b = 30; + +a += foo(); // 30 +b += c.bar; // 11 + +console.log( a + b ); // 42 +``` + +If it weren't for the `console.log(..)` statements in this snippet (just used as a convenient form of observable side effect for the illustration), the JS engine would likely have been free, if it wanted to (who knows if it would!?), to reorder the code to: + +```js +// ... + +a = 10 + foo(); +b = 30 + c.bar; + +// ... +``` + +While JS semantics thankfully protect us from the *observable* nightmares that compiler statement reordering would seem to be in danger of, it's still important to understand just how tenuous a link there is between the way source code is authored (in top-down fashion) and the way it runs after compilation. + +Compiler statement reordering is almost a micro-metaphor for concurrency and interaction. As a general concept, such awareness can help you understand async JS code flow issues better. + +## Review + +A JavaScript program is (practically) always broken up into two or more chunks, where the first chunk runs *now* and the next chunk runs *later*, in response to an event. Even though the program is executed chunk-by-chunk, all of them share the same access to the program scope and state, so each modification to state is made on top of the previous state. + +Whenever there are events to run, the *event loop* runs until the queue is empty. Each iteration of the event loop is a "tick." 
User interaction, I/O, and timers enqueue events on the event queue. + +At any given moment, only one event can be processed from the queue. While an event is executing, it can directly or indirectly cause one or more subsequent events. + +Concurrency is when two or more chains of events interleave over time, such that from a high-level perspective, they appear to be running *simultaneously* (even though at any given moment only one event is being processed). + +It's often necessary to do some form of interaction coordination between these concurrent "processes" (as distinct from operating system processes), for instance to ensure ordering or to prevent "race conditions." These "processes" can also *cooperate* by breaking themselves into smaller chunks and allowing other "processes" to interleave. diff --git a/async & performance/ch2.md b/async & performance/ch2.md new file mode 100644 index 0000000..bf6e0eb --- /dev/null +++ b/async & performance/ch2.md @@ -0,0 +1,607 @@ +# You Don't Know JS: Async & Performance +# Chapter 2: Callbacks + +In Chapter 1, we explored the terminology and concepts around asynchronous programming in JavaScript. Our focus was on understanding the single-threaded (one-at-a-time) event loop queue that drives all "events" (async function invocations). We also explored various ways that concurrency patterns explain the relationships (if any!) between *simultaneously* running chains of events, or "processes" (tasks, function calls, etc.). + +All our examples in Chapter 1 used the function as the individual, indivisible unit of operations, whereby inside the function, statements run in predictable order (above the compiler level!), but at the function-ordering level, events (aka async function invocations) can happen in a variety of orders. + +In all these cases, the function is acting as a "callback," because it serves as the target for the event loop to "call back into" the program, whenever that item in the queue is processed. 
+ +As you no doubt have observed, callbacks are by far the most common way that asynchrony in JS programs is expressed and managed. Indeed, the callback is the most fundamental async pattern in the language. + +Countless JS programs, even very sophisticated and complex ones, have been written upon no other async foundation than the callback (with of course the concurrency interaction patterns we explored in Chapter 1). The callback function is the async work horse for JavaScript, and it does its job respectably. + +Except... callbacks are not without their shortcomings. Many developers are excited by the *promise* (pun intended!) of better async patterns. But it's impossible to effectively use any abstraction if you don't understand what it's abstracting, and why. + +In this chapter, we will explore a couple of those in depth, as motivation for why more sophisticated async patterns (explored in subsequent chapters of this book) are necessary and desired. + +## Continuations + +Let's go back to the async callback example we started with in Chapter 1, but let me slightly modify it to illustrate a point: + +```js +// A +ajax( "..", function(..){ + // C +} ); +// B +``` + +`// A` and `// B` represent the first half of the program (aka the *now*), and `// C` marks the second half of the program (aka the *later*). The first half executes right away, and then there's a "pause" of indeterminate length. At some future moment, if the Ajax call completes, then the program will pick up where it left off, and *continue* with the second half. + +In other words, the callback function wraps or encapsulates the *continuation* of the program. + +Let's make the code even simpler: + +```js +// A +setTimeout( function(){ + // C +}, 1000 ); +// B +``` + +Stop for a moment and ask yourself how you'd describe (to someone else less informed about how JS works) the way that program behaves. Go ahead, try it out loud. It's a good exercise that will help my next points make more sense. 
+ +Most readers just now probably thought or said something to the effect of: "Do A, then set up a timeout to wait 1,000 milliseconds, then once that fires, do C." How close was your rendition? + +You might have caught yourself and self-edited to: "Do A, set up the timeout for 1,000 milliseconds, then do B, then after the timeout fires, do C." That's more accurate than the first version. Can you spot the difference? + +Even though the second version is more accurate, both versions are deficient in explaining this code in a way that matches our brains to the code, and the code to the JS engine. The disconnect is both subtle and monumental, and is at the very heart of understanding the shortcomings of callbacks as async expression and management. + +As soon as we introduce a single continuation (or several dozen, as many programs do!) in the form of a callback function, we have allowed a divergence to form between how our brains work and the way the code will operate. Any time these two diverge (and this is far from the only place that happens, as I'm sure you know!), we run into the inevitable fact that our code becomes harder to understand, reason about, debug, and maintain. + +## Sequential Brain + +I'm pretty sure most of you readers have heard someone say (or even made the claim yourself), "I'm a multitasker." The effects of trying to act as a multitasker range from humorous (e.g., the silly patting-head-rubbing-stomach kids' game) to mundane (chewing gum while walking) to downright dangerous (texting while driving). + +But are we multitaskers? Can we really do two conscious, intentional actions at once and think/reason about both of them at exactly the same moment? Does our highest level of brain functionality have parallel multithreading going on? + +The answer may surprise you: **probably not.** + +That's just not really how our brains appear to be set up. We're much more single taskers than many of us (especially Type A personalities!) would like to admit. 
We can really only think about one thing at any given instant. + +I'm not talking about all our involuntary, subconscious, automatic brain functions, such as heart beating, breathing, and eyelid blinking. Those are all vital tasks to our sustained life, but we don't intentionally allocate any brain power to them. Thankfully, while we obsess about checking social network feeds for the 15th time in three minutes, our brain carries on in the background (threads!) with all those important tasks. + +We're instead talking about whatever task is at the forefront of our minds at the moment. For me, it's writing the text in this book right now. Am I doing any other higher level brain function at exactly this same moment? Nope, not really. I get distracted quickly and easily -- a few dozen times in these last couple of paragraphs! + +When we *fake* multitasking, such as trying to type something at the same time we're talking to a friend or family member on the phone, what we're actually most likely doing is acting as fast context switchers. In other words, we switch back and forth between two or more tasks in rapid succession, *simultaneously* progressing on each task in tiny, fast little chunks. We do it so fast that to the outside world it appears as if we're doing these things *in parallel*. + +Does that sound suspiciously like async evented concurrency (like the sort that happens in JS) to you?! If not, go back and read Chapter 1 again! + +In fact, one way of simplifying (i.e., abusing) the massively complex world of neurology into something I can remotely hope to discuss here is that our brains work kinda like the event loop queue. + +If you think about every single letter (or word) I type as a single async event, in just this sentence alone there are several dozen opportunities for my brain to be interrupted by some other event, such as from my senses, or even just my random thoughts. 
+ +I don't get interrupted and pulled to another "process" at every opportunity that I could be (thankfully -- or this book would never be written!). But it happens often enough that I feel my own brain is nearly constantly switching to various different contexts (aka "processes"). And that's an awful lot like how the JS engine would probably feel. + +### Doing Versus Planning + +OK, so our brains can be thought of as operating in single-threaded event loop queue like ways, as can the JS engine. That sounds like a good match. + +But we need to be more nuanced than that in our analysis. There's a big, observable difference between how we plan various tasks, and how our brains actually operate those tasks. + +Again, back to the writing of this text as my metaphor. My rough mental outline plan here is to keep writing and writing, going sequentially through a set of points I have ordered in my thoughts. I don't plan to have any interruptions or nonlinear activity in this writing. But yet, my brain is nevertheless switching around all the time. + +Even though at an operational level our brains are async evented, we seem to plan out tasks in a sequential, synchronous way. "I need to go to the store, then buy some milk, then drop off my dry cleaning." + +You'll notice that this higher level thinking (planning) doesn't seem very async evented in its formulation. In fact, it's kind of rare for us to deliberately think solely in terms of events. Instead, we plan things out carefully, sequentially (A then B then C), and we assume to an extent a sort of temporal blocking that forces B to wait on A, and C to wait on B. + +When a developer writes code, they are planning out a set of actions to occur. If they're any good at being a developer, they're **carefully planning** it out. "I need to set `z` to the value of `x`, and then `x` to the value of `y`," and so forth. 
+ +When we write out synchronous code, statement by statement, it works a lot like our errands to-do list: + +```js +// swap `x` and `y` (via temp variable `z`) +z = x; +x = y; +y = z; +``` + +These three assignment statements are synchronous, so `x = y` waits for `z = x` to finish, and `y = z` in turn waits for `x = y` to finish. Another way of saying it is that these three statements are temporally bound to execute in a certain order, one right after the other. Thankfully, we don't need to be bothered with any async evented details here. If we did, the code gets a lot more complex, quickly! + +So if synchronous brain planning maps well to synchronous code statements, how well do our brains do at planning out asynchronous code? + +It turns out that how we express asynchrony (with callbacks) in our code doesn't map very well at all to that synchronous brain planning behavior. + +Can you actually imagine having a line of thinking that plans out your to-do errands like this? + +> "I need to go to the store, but on the way I'm sure I'll get a phone call, so 'Hi, Mom', and while she starts talking, I'll be looking up the store address on GPS, but that'll take a second to load, so I'll turn down the radio so I can hear Mom better, then I'll realize I forgot to put on a jacket and it's cold outside, but no matter, keep driving and talking to Mom, and then the seatbelt ding reminds me to buckle up, so 'Yes, Mom, I am wearing my seatbelt, I always do!'. Ah, finally the GPS got the directions, now..." + +As ridiculous as that sounds as a formulation for how we plan our day out and think about what to do and in what order, nonetheless it's exactly how our brains operate at a functional level. Remember, that's not multitasking, it's just fast context switching. + +The reason it's difficult for us as developers to write async evented code, especially when all we have is the callback to do it, is that stream of consciousness thinking/planning is unnatural for most of us. 
+ +We think in step-by-step terms, but the tools (callbacks) available to us in code are not expressed in a step-by-step fashion once we move from synchronous to asynchronous. + +And **that** is why it's so hard to accurately author and reason about async JS code with callbacks: because it's not how our brain planning works. + +**Note:** The only thing worse than not knowing why some code breaks is not knowing why it worked in the first place! It's the classic "house of cards" mentality: "it works, but not sure why, so nobody touch it!" You may have heard, "Hell is other people" (Sartre), and the programmer meme twist, "Hell is other people's code." I believe truly: "Hell is not understanding my own code." And callbacks are one main culprit. + +### Nested/Chained Callbacks + +Consider: + +```js +listen( "click", function handler(evt){ + setTimeout( function request(){ + ajax( "http://some.url.1", function response(text){ + if (text == "hello") { + handler(); + } + else if (text == "world") { + request(); + } + } ); + }, 500) ; +} ); +``` + +There's a good chance code like that is recognizable to you. We've got a chain of three functions nested together, each one representing a step in an asynchronous series (task, "process"). + +This kind of code is often called "callback hell," and sometimes also referred to as the "pyramid of doom" (for its sideways-facing triangular shape due to the nested indentation). + +But "callback hell" actually has almost nothing to do with the nesting/indentation. It's a far deeper problem than that. We'll see how and why as we continue through the rest of this chapter. + +First, we're waiting for the "click" event, then we're waiting for the timer to fire, then we're waiting for the Ajax response to come back, at which point it might do it all again. + +At first glance, this code may seem to map its asynchrony naturally to sequential brain planning. + +First (*now*), we: + +```js +listen( "..", function handler(..){ + // .. 
+} ); +``` + +Then *later*, we: + +```js +setTimeout( function request(..){ + // .. +}, 500 ); +``` + +Then still *later*, we: + +```js +ajax( "..", function response(..){ + // .. +} ); +``` + +And finally (most *later*), we: + +```js +if ( .. ) { + // .. +} +else .. +``` + +But there are several problems with reasoning about this code linearly in such a fashion. + +First, it's an accident of the example that our steps are on subsequent lines (1, 2, 3, and 4...). In real async JS programs, there's often a lot more noise cluttering things up, noise that we have to deftly maneuver past in our brains as we jump from one function to the next. Understanding the async flow in such callback-laden code is not impossible, but it's certainly not natural or easy, even with lots of practice. + +But also, there's something deeper wrong, which isn't evident just in that code example. Let me make up another scenario (pseudocode-ish) to illustrate it: + +```js +doA( function(){ + doB(); + + doC( function(){ + doD(); + } ); + + doE(); +} ); + +doF(); +``` + +While the experienced among you will correctly identify the true order of operations here, I'm betting it is more than a little confusing at first glance, and takes some concerted mental cycles to arrive at. The operations will happen in this order: + +* `doA()` +* `doF()` +* `doB()` +* `doC()` +* `doE()` +* `doD()` + +Did you get that right the very first time you glanced at the code? + +OK, some of you are thinking I was unfair in my function naming, to intentionally lead you astray. I swear I was just naming in top-down appearance order. But let me try again: + +```js +doA( function(){ + doC(); + + doD( function(){ + doF(); + } ); + + doE(); +} ); + +doB(); +``` + +Now, I've named them alphabetically in order of actual execution. But I still bet, even with experience now in this scenario, tracing through the `A -> B -> C -> D -> E -> F` order doesn't come naturally to many if any of you readers. 
Certainly your eyes do an awful lot of jumping up and down the code snippet, right? + +But even if that all comes naturally to you, there's still one more hazard that could wreak havoc. Can you spot what it is? + +What if `doA(..)` or `doD(..)` aren't actually async, the way we obviously assumed them to be? Uh oh, now the order is different. If they're both sync (and maybe only sometimes, depending on the conditions of the program at the time), the order is now `A -> C -> D -> F -> E -> B`. + +That sound you just heard faintly in the background is the sighs of thousands of JS developers who just had a face-in-hands moment. + +Is nesting the problem? Is that what makes it so hard to trace the async flow? That's part of it, certainly. + +But let me rewrite the previous nested event/timeout/Ajax example without using nesting: + +```js +listen( "click", handler ); + +function handler() { + setTimeout( request, 500 ); +} + +function request(){ + ajax( "http://some.url.1", response ); +} + +function response(text){ + if (text == "hello") { + handler(); + } + else if (text == "world") { + request(); + } +} +``` + +This formulation of the code is not nearly as recognizable as having the nesting/indentation woes of its previous form, and yet it's every bit as susceptible to "callback hell." Why? + +As we go to linearly (sequentially) reason about this code, we have to skip from one function, to the next, to the next, and bounce all around the code base to "see" the sequence flow. And remember, this is simplified code in sort of best-case fashion. We all know that real async JS program code bases are often fantastically more jumbled, which makes such reasoning orders of magnitude more difficult. + +Another thing to notice: to get steps 2, 3, and 4 linked together so they happen in succession, the only affordance callbacks alone give us is to hardcode step 2 into step 1, step 3 into step 2, step 4 into step 3, and so on. 
The hardcoding isn't necessarily a bad thing, if it really is a fixed condition that step 2 should always lead to step 3. + +But the hardcoding definitely makes the code a bit more brittle, as it doesn't account for anything going wrong that might cause a deviation in the progression of steps. For example, if step 2 fails, step 3 never gets reached, nor does step 2 retry or move to an alternate error handling flow, and so on. + +All of these issues are things you *can* manually hardcode into each step, but that code is often very repetitive and not reusable in other steps or in other async flows in your program. + +Even though our brains might plan out a series of tasks in a sequential type of way (this, then this, then this), the evented nature of our brain operation makes recovery/retry/forking of flow control almost effortless. If you're out running errands, and you realize you left a shopping list at home, it doesn't end the day because you didn't plan that ahead of time. Your brain routes around this hiccup easily: you go home, get the list, then head right back out to the store. + +But the brittle nature of manually hardcoded callbacks (even with hardcoded error handling) is often far less graceful. Once you end up specifying (aka pre-planning) all the various eventualities/paths, the code becomes so convoluted that it's hard to ever maintain or update it. + +**That** is what "callback hell" is all about! The nesting/indentation is basically a sideshow, a red herring. + +And as if all that's not enough, we haven't even touched on what happens when two or more chains of these callback continuations are happening *simultaneously*, or when the third step branches out into "parallel" callbacks with gates or latches, or... OMG, my brain hurts, how about yours!? + +Are you catching the notion here that our sequential, blocking brain planning behaviors just don't map well onto callback-oriented async code? 
That's the first major deficiency to articulate about callbacks: they express asynchrony in code in ways our brains have to fight just to keep in sync with (pun intended!). + +## Trust Issues + +The mismatch between sequential brain planning and callback-driven async JS code is only part of the problem with callbacks. There's something much deeper to be concerned about. + +Let's once again revisit the notion of a callback function as the continuation (aka the second half) of our program: + +```js +// A +ajax( "..", function(..){ + // C +} ); +// B +``` + +`// A` and `// B` happen *now*, under the direct control of the main JS program. But `// C` gets deferred to happen *later*, and under the control of another party -- in this case, the `ajax(..)` function. In a basic sense, that sort of hand-off of control doesn't regularly cause lots of problems for programs. + +But don't be fooled by its infrequency that this control switch isn't a big deal. In fact, it's one of the worst (and yet most subtle) problems about callback-driven design. It revolves around the idea that sometimes `ajax(..)` (i.e., the "party" you hand your callback continuation to) is not a function that you wrote, or that you directly control. Many times it's a utility provided by some third party. + +We call this "inversion of control," when you take part of your program and give over control of its execution to another third party. There's an unspoken "contract" that exists between your code and the third-party utility -- a set of things you expect to be maintained. + +### Tale of Five Callbacks + +It might not be terribly obvious why this is such a big deal. Let me construct an exaggerated scenario to illustrate the hazards of trust at play. + +Imagine you're a developer tasked with building out an ecommerce checkout system for a site that sells expensive TVs. You already have all the various pages of the checkout system built out just fine. 
On the last page, when the user clicks "confirm" to buy the TV, you need to call a third-party function (provided say by some analytics tracking company) so that the sale can be tracked. + +You notice that they've provided what looks like an async tracking utility, probably for the sake of performance best practices, which means you need to pass in a callback function. In this continuation that you pass in, you will have the final code that charges the customer's credit card and displays the thank you page. + +This code might look like: + +```js +analytics.trackPurchase( purchaseData, function(){ + chargeCreditCard(); + displayThankyouPage(); +} ); +``` + +Easy enough, right? You write the code, test it, everything works, and you deploy to production. Everyone's happy! + +Six months go by and no issues. You've almost forgotten you even wrote that code. One morning, you're at a coffee shop before work, casually enjoying your latte, when you get a panicked call from your boss insisting you drop the coffee and rush into work right away. + +When you arrive, you find out that a high-profile customer has had his credit card charged five times for the same TV, and he's understandably upset. Customer service has already issued an apology and processed a refund. But your boss demands to know how this could possibly have happened. "Don't we have tests for stuff like this!?" + +You don't even remember the code you wrote. But you dig back in and start trying to find out what could have gone awry. + +After digging through some logs, you come to the conclusion that the only explanation is that the analytics utility somehow, for some reason, called your callback five times instead of once. Nothing in their documentation mentions anything about this. + +Frustrated, you contact customer support, who of course is as astonished as you are. They agree to escalate it to their developers, and promise to get back to you. 
The next day, you receive a lengthy email explaining what they found, which you promptly forward to your boss. + +Apparently, the developers at the analytics company had been working on some experimental code that, under certain conditions, would retry the provided callback once per second, for five seconds, before failing with a timeout. They had never intended to push that into production, but somehow they did, and they're totally embarrassed and apologetic. They go into plenty of detail about how they've identified the breakdown and what they'll do to ensure it never happens again. Yadda, yadda. + +What's next? + +You talk it over with your boss, but he's not feeling particularly comfortable with the state of things. He insists, and you reluctantly agree, that you can't trust *them* anymore (that's what bit you), and that you'll need to figure out how to protect the checkout code from such a vulnerability again. + +After some tinkering, you implement some simple ad hoc code like the following, which the team seems happy with: + +```js +var tracked = false; + +analytics.trackPurchase( purchaseData, function(){ + if (!tracked) { + tracked = true; + chargeCreditCard(); + displayThankyouPage(); + } +} ); +``` + +**Note:** This should look familiar to you from Chapter 1, because we're essentially creating a latch to handle if there happen to be multiple concurrent invocations of our callback. + +But then one of your QA engineers asks, "what happens if they never call the callback?" Oops. Neither of you had thought about that. + +You begin to chase down the rabbit hole, and think of all the possible things that could go wrong with them calling your callback. Here's roughly the list you come up with of ways the analytics utility could misbehave: + +* Call the callback too early (before it's been tracked) +* Call the callback too late (or never) +* Call the callback too few or too many times (like the problem you encountered!) 
+* Fail to pass along any necessary environment/parameters to your callback +* Swallow any errors/exceptions that may happen +* ... + +That should feel like a troubling list, because it is. You're probably slowly starting to realize that you're going to have to invent an awful lot of ad hoc logic **in each and every single callback** that's passed to a utility you're not positive you can trust. + +Now you realize a bit more completely just how hellish "callback hell" is. + +### Not Just Others' Code + +Some of you may be skeptical at this point whether this is as big a deal as I'm making it out to be. Perhaps you don't interact with truly third-party utilities much if at all. Perhaps you use versioned APIs or self-host such libraries, so that its behavior can't be changed out from underneath you. + +So, contemplate this: can you even *really* trust utilities that you do theoretically control (in your own code base)? + +Think of it this way: most of us agree that at least to some extent we should build our own internal functions with some defensive checks on the input parameters, to reduce/prevent unexpected issues. + +Overly trusting of input: +```js +function addNumbers(x,y) { + // + is overloaded with coercion to also be + // string concatenation, so this operation + // isn't strictly safe depending on what's + // passed in. 
+ return x + y; +} + +addNumbers( 21, 21 ); // 42 +addNumbers( 21, "21" ); // "2121" +``` + +Defensive against untrusted input: +```js +function addNumbers(x,y) { + // ensure numerical input + if (typeof x != "number" || typeof y != "number") { + throw Error( "Bad parameters" ); + } + + // if we get here, + will safely do numeric addition + return x + y; +} + +addNumbers( 21, 21 ); // 42 +addNumbers( 21, "21" ); // Error: "Bad parameters" +``` + +Or perhaps still safe but friendlier: +```js +function addNumbers(x,y) { + // ensure numerical input + x = Number( x ); + y = Number( y ); + + // + will safely do numeric addition + return x + y; +} + +addNumbers( 21, 21 ); // 42 +addNumbers( 21, "21" ); // 42 +``` + +However you go about it, these sorts of checks/normalizations are fairly common on function inputs, even with code we theoretically entirely trust. In a crude sort of way, it's like the programming equivalent of the geopolitical principle of "Trust But Verify." + +So, doesn't it stand to reason that we should do the same thing about composition of async function callbacks, not just with truly external code but even with code we know is generally "under our own control"? **Of course we should.** + +But callbacks don't really offer anything to assist us. We have to construct all that machinery ourselves, and it often ends up being a lot of boilerplate/overhead that we repeat for every single async callback. + +The most troublesome problem with callbacks is *inversion of control* leading to a complete breakdown along all those trust lines. + +If you have code that uses callbacks, especially but not exclusively with third-party utilities, and you're not already applying some sort of mitigation logic for all these *inversion of control* trust issues, your code *has* bugs in it right now even though they may not have bitten you yet. Latent bugs are still bugs. + +Hell indeed. 
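As a rough sketch of the kind of mitigation logic this implies, the ad hoc `tracked` latch from the checkout example could be pulled out into a small reusable wrapper. The `once(..)` helper and the fake `trackPurchase(..)` utility below are made up for illustration (they are not part of any real analytics API), and the wrapper guards only against the "too many times" issue, not the rest of the list:

```js
// hypothetical helper: wraps `fn` so only the first call goes through
function once(fn) {
	var called = false;

	return function() {
		if (!called) {
			called = true;
			return fn.apply( this, arguments );
		}
	};
}

// stand-in for a misbehaving third-party utility (an assumption,
// not a real API): it invokes the callback three times
function trackPurchase(data,cb) {
	cb();
	cb();
	cb();
}

var charges = 0;

trackPurchase( { item: "TV" }, once( function(){
	charges++;		// stands in for `chargeCreditCard()`
} ) );

console.log( charges );		// 1
```

Even this tiny helper has to be remembered and applied at every call site, which is exactly the boilerplate burden described above.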
+ +## Trying to Save Callbacks + +There are several variations of callback design that have attempted to address some (not all!) of the trust issues we've just looked at. It's a valiant, but doomed, effort to save the callback pattern from imploding on itself. + +For example, regarding more graceful error handling, some API designs provide for split callbacks (one for the success notification, one for the error notification): + +```js +function success(data) { + console.log( data ); +} + +function failure(err) { + console.error( err ); +} + +ajax( "http://some.url.1", success, failure ); +``` + +In APIs of this design, often the `failure()` error handler is optional, and if not provided it will be assumed you want the errors swallowed. Ugh. + +**Note:** This split-callback design is what the ES6 Promise API uses. We'll cover ES6 Promises in much more detail in the next chapter. + +Another common callback pattern is called "error-first style" (sometimes called "Node style," as it's also the convention used across nearly all Node.js APIs), where the first argument of a single callback is reserved for an error object (if any). If success, this argument will be empty/falsy (and any subsequent arguments will be the success data), but if an error result is being signaled, the first argument is set/truthy (and usually nothing else is passed): + +```js +function response(err,data) { + // error? + if (err) { + console.error( err ); + } + // otherwise, assume success + else { + console.log( data ); + } +} + +ajax( "http://some.url.1", response ); +``` + +In both of these cases, several things should be observed. + +First, it has not really resolved the majority of trust issues like it may appear. There's nothing about either callback that prevents or filters unwanted repeated invocations. Moreover, things are worse now, because you may get both success and error signals, or neither, and you still have to code around either of those conditions. 
+ +Also, don't miss the fact that while it's a standard pattern you can employ, it's definitely more verbose and boilerplate-ish without much reuse, so you're going to get weary of typing all that out for every single callback in your application. + +What about the trust issue of never being called? If this is a concern (and it probably should be!), you likely will need to set up a timeout that cancels the event. You could make a utility (proof-of-concept only shown) to help you with that: + +```js +function timeoutify(fn,delay) { + var intv = setTimeout( function(){ + intv = null; + fn( new Error( "Timeout!" ) ); + }, delay ) + ; + + return function() { + // timeout hasn't happened yet? + if (intv) { + clearTimeout( intv ); + fn.apply( this, [ null ].concat( [].slice.call( arguments ) ) ); + } + }; +} +``` + +Here's how you use it: + +```js +// using "error-first style" callback design +function foo(err,data) { + if (err) { + console.error( err ); + } + else { + console.log( data ); + } +} + +ajax( "http://some.url.1", timeoutify( foo, 500 ) ); +``` + +Another trust issue is being called "too early." In application-specific terms, this may actually involve being called before some critical task is complete. But more generally, the problem is evident in utilities that can either invoke the callback you provide *now* (synchronously), or *later* (asynchronously). + +This nondeterminism around the sync-or-async behavior is almost always going to lead to very difficult to track down bugs. In some circles, the fictional insanity-inducing monster named Zalgo is used to describe the sync/async nightmares. "Don't release Zalgo!" is a common cry, and it leads to very sound advice: always invoke callbacks asynchronously, even if that's "right away" on the next turn of the event loop, so that all callbacks are predictably async. + +**Note:** For more information on Zalgo, see Oren Golan's "Don't Release Zalgo!" 
(https://github.com/oren/oren.github.io/blob/master/posts/zalgo.md) and Isaac Z. Schlueter's "Designing APIs for Asynchrony" (http://blog.izs.me/post/59142742143/designing-apis-for-asynchrony). + +Consider: + +```js +function result(data) { + console.log( a ); +} + +var a = 0; + +ajax( "..pre-cached-url..", result ); +a++; +``` + +Will this code print `0` (sync callback invocation) or `1` (async callback invocation)? Depends... on the conditions. + +You can see just how quickly the unpredictability of Zalgo can threaten any JS program. So the silly-sounding "never release Zalgo" is actually incredibly common and solid advice. Always be asyncing. + +What if you don't know whether the API in question will always execute async? You could invent a utility like this `asyncify(..)` proof-of-concept: + +```js +function asyncify(fn) { + var orig_fn = fn, + intv = setTimeout( function(){ + intv = null; + if (fn) fn(); + }, 0 ) + ; + + fn = null; + + return function() { + // firing too quickly, before `intv` timer has fired to + // indicate async turn has passed? + if (intv) { + fn = orig_fn.bind.apply( + orig_fn, + // add the wrapper's `this` to the `bind(..)` + // call parameters, as well as currying any + // passed in parameters + [this].concat( [].slice.call( arguments ) ) + ); + } + // already async + else { + // invoke original function + orig_fn.apply( this, arguments ); + } + }; +} +``` + +You use `asyncify(..)` like this: + +```js +function result(data) { + console.log( a ); +} + +var a = 0; + +ajax( "..pre-cached-url..", asyncify( result ) ); +a++; +``` + +Whether the Ajax request is in the cache and resolves to try to call the callback right away, or must be fetched over the wire and thus complete later asynchronously, this code will always output `1` instead of `0` -- `result(..)` cannot help but be invoked asynchronously, which means the `a++` has a chance to run before `result(..)` does. + +Yay, another trust issue "solved"! 
But it's inefficient, and yet again more bloated boilerplate to weigh your project down. + +That's just the story, over and over again, with callbacks. They can do pretty much anything you want, but you have to be willing to work hard to get it, and oftentimes this effort is much more than you can or should spend on such code reasoning. + +You might find yourself wishing for built-in APIs or other language mechanics to address these issues. Finally ES6 has arrived on the scene with some great answers, so keep reading! + +## Review + +Callbacks are the fundamental unit of asynchrony in JS. But they're not enough for the evolving landscape of async programming as JS matures. + +First, our brains plan things out in sequential, blocking, single-threaded semantic ways, but callbacks express asynchronous flow in a rather nonlinear, nonsequential way, which makes reasoning properly about such code much harder. Bad to reason about code is bad code that leads to bad bugs. + +We need a way to express asynchrony in a more synchronous, sequential, blocking manner, just like our brains do. + +Second, and more importantly, callbacks suffer from *inversion of control* in that they implicitly give control over to another party (often a third-party utility not in your control!) to invoke the *continuation* of your program. This control transfer leads us to a troubling list of trust issues, such as whether the callback is called more times than we expect. + +Inventing ad hoc logic to solve these trust issues is possible, but it's more difficult than it should be, and it produces clunkier and harder to maintain code, as well as code that is likely insufficiently protected from these hazards until you get visibly bitten by the bugs. + +We need a generalized solution to **all of the trust issues**, one that can be reused for as many callbacks as we create without all the extra boilerplate overhead. + +We need something better than callbacks. 
They've served us well to this point, but the *future* of JavaScript demands more sophisticated and capable async patterns. The subsequent chapters in this book will dive into those emerging evolutions. diff --git a/async & performance/ch3.md b/async & performance/ch3.md new file mode 100644 index 0000000..5f85734 --- /dev/null +++ b/async & performance/ch3.md @@ -0,0 +1,2130 @@ +# You Don't Know JS: Async & Performance +# Chapter 3: Promises + +In Chapter 2, we identified two major categories of deficiencies with using callbacks to express program asynchrony and manage concurrency: lack of sequentiality and lack of trustability. Now that we understand the problems more intimately, it's time we turn our attention to patterns that can address them. + +The issue we want to address first is the *inversion of control*, the trust that is so fragilely held and so easily lost. + +Recall that we wrap up the *continuation* of our program in a callback function, and hand that callback over to another party (potentially even external code) and just cross our fingers that it will do the right thing with the invocation of the callback. + +We do this because we want to say, "here's what happens *later*, after the current step finishes." + +But what if we could uninvert that *inversion of control*? What if instead of handing the continuation of our program to another party, we could expect it to return us a capability to know when its task finishes, and then our code could decide what to do next? + +This paradigm is called **Promises**. + +Promises are starting to take the JS world by storm, as developers and specification writers alike desperately seek to untangle the insanity of callback hell in their code/design. In fact, most new async APIs being added to JS/DOM platform are being built on Promises. So it's probably a good idea to dig in and learn them, don't you think!? 
+ +**Note:** The word "immediately" will be used frequently in this chapter, generally to refer to some Promise resolution action. However, in essentially all cases, "immediately" means in terms of the Job queue behavior (see Chapter 1), not in the strictly synchronous *now* sense. + +## What Is a Promise? + +When developers decide to learn a new technology or pattern, usually their first step is "Show me the code!" It's quite natural for us to just jump in feet first and learn as we go. + +But it turns out that some abstractions get lost on the APIs alone. Promises are one of those tools where it can be painfully obvious from how someone uses it whether they understand what it's for and about versus just learning and using the API. + +So before I show the Promise code, I want to fully explain what a Promise really is conceptually. I hope this will then guide you better as you explore integrating Promise theory into your own async flow. + +With that in mind, let's look at two different analogies for what a Promise *is*. + +### Future Value + +Imagine this scenario: I walk up to the counter at a fast-food restaurant, and place an order for a cheeseburger. I hand the cashier $1.47. By placing my order and paying for it, I've made a request for a *value* back (the cheeseburger). I've started a transaction. + +But often, the cheeseburger is not immediately available for me. The cashier hands me something in place of my cheeseburger: a receipt with an order number on it. This order number is an IOU ("I owe you") *promise* that ensures that eventually, I should receive my cheeseburger. + +So I hold onto my receipt and order number. I know it represents my *future cheeseburger*, so I don't need to worry about it anymore -- aside from being hungry! + +While I wait, I can do other things, like send a text message to a friend that says, "Hey, can you come join me for lunch? I'm going to eat a cheeseburger." 
+ +I am reasoning about my *future cheeseburger* already, even though I don't have it in my hands yet. My brain is able to do this because it's treating the order number as a placeholder for the cheeseburger. The placeholder essentially makes the value *time independent*. It's a **future value**. + +Eventually, I hear, "Order 113!" and I gleefully walk back up to the counter with receipt in hand. I hand my receipt to the cashier, and I take my cheeseburger in return. + +In other words, once my *future value* was ready, I exchanged my value-promise for the value itself. + +But there's another possible outcome. They call my order number, but when I go to retrieve my cheeseburger, the cashier regretfully informs me, "I'm sorry, but we appear to be all out of cheeseburgers." Setting aside the customer frustration of this scenario for a moment, we can see an important characteristic of *future values*: they can either indicate a success or failure. + +Every time I order a cheeseburger, I know that I'll either get a cheeseburger eventually, or I'll get the sad news of the cheeseburger shortage, and I'll have to figure out something else to eat for lunch. + +**Note:** In code, things are not quite as simple, because metaphorically the order number may never be called, in which case we're left indefinitely in an unresolved state. We'll come back to dealing with that case later. + +#### Values Now and Later + +This all might sound too mentally abstract to apply to your code. So let's be more concrete. + +However, before we can introduce how Promises work in this fashion, we're going to derive in code that we already understand -- callbacks! -- how to handle these *future values*. 
+ +When you write code to reason about a value, such as performing math on a `number`, whether you realize it or not, you've been assuming something very fundamental about that value, which is that it's a concrete *now* value already: + +```js +var x, y = 2; + +console.log( x + y ); // NaN <-- because `x` isn't set yet +``` + +The `x + y` operation assumes both `x` and `y` are already set. In terms we'll expound on shortly, we assume the `x` and `y` values are already *resolved*. + +It would be nonsense to expect that the `+` operator by itself would somehow be magically capable of detecting and waiting around until both `x` and `y` are resolved (aka ready), only then to do the operation. That would cause chaos in the program if different statements finished *now* and others finished *later*, right? + +How could you possibly reason about the relationships between two statements if either one (or both) of them might not be finished yet? If statement 2 relies on statement 1 being finished, there are just two outcomes: either statement 1 finished right *now* and everything proceeds fine, or statement 1 didn't finish yet, and thus statement 2 is going to fail. + +If this sort of thing sounds familiar from Chapter 1, good! + +Let's go back to our `x + y` math operation. Imagine if there was a way to say, "Add `x` and `y`, but if either of them isn't ready yet, just wait until they are. Add them as soon as you can." + +Your brain might have just jumped to callbacks. OK, so... + +```js +function add(getX,getY,cb) { + var x, y; + getX( function(xVal){ + x = xVal; + // both are ready? + if (y != undefined) { + cb( x + y ); // send along sum + } + } ); + getY( function(yVal){ + y = yVal; + // both are ready? + if (x != undefined) { + cb( x + y ); // send along sum + } + } ); +} + +// `fetchX()` and `fetchY()` are sync or async +// functions +add( fetchX, fetchY, function(sum){ + console.log( sum ); // that was easy, huh? 
+} ); +``` + +Take just a moment to let the beauty (or lack thereof) of that snippet sink in (whistles patiently). + +While the ugliness is undeniable, there's something very important to notice about this async pattern. + +In that snippet, we treated `x` and `y` as future values, and we express an operation `add(..)` that (from the outside) does not care whether `x` or `y` or both are available right away or not. In other words, it normalizes the *now* and *later*, such that we can rely on a predictable outcome of the `add(..)` operation. + +By using an `add(..)` that is temporally consistent -- it behaves the same across *now* and *later* times -- the async code is much easier to reason about. + +To put it more plainly: to consistently handle both *now* and *later*, we make both of them *later*: all operations become async. + +Of course, this rough callbacks-based approach leaves much to be desired. It's just a first tiny step toward realizing the benefits of reasoning about *future values* without worrying about the time aspect of when it's available or not. + +#### Promise Value + +We'll definitely go into a lot more detail about Promises later in the chapter -- so don't worry if some of this is confusing -- but let's just briefly glimpse at how we can express the `x + y` example via `Promise`s: + +```js +function add(xPromise,yPromise) { + // `Promise.all([ .. ])` takes an array of promises, + // and returns a new promise that waits on them + // all to finish + return Promise.all( [xPromise, yPromise] ) + + // when that promise is resolved, let's take the + // received `X` and `Y` values and add them together. + .then( function(values){ + // `values` is an array of the messages from the + // previously resolved promises + return values[0] + values[1]; + } ); +} + +// `fetchX()` and `fetchY()` return promises for +// their respective values, which may be ready +// *now* or *later*. 
+add( fetchX(), fetchY() ) + +// we get a promise back for the sum of those +// two numbers. +// now we chain-call `then(..)` to wait for the +// resolution of that returned promise. +.then( function(sum){ + console.log( sum ); // that was easier! +} ); +``` + +There are two layers of Promises in this snippet. + +`fetchX()` and `fetchY()` are called directly, and the values they return (promises!) are passed into `add(..)`. The underlying values those promises represent may be ready *now* or *later*, but each promise normalizes the behavior to be the same regardless. We reason about `X` and `Y` values in a time-independent way. They are *future values*. + +The second layer is the promise that `add(..)` creates (via `Promise.all([ .. ])`) and returns, which we wait on by calling `then(..)`. When the `add(..)` operation completes, our `sum` *future value* is ready and we can print it out. We hide inside of `add(..)` the logic for waiting on the `X` and `Y` *future values*. + +**Note:** Inside `add(..)`, the `Promise.all([ .. ])` call creates a promise (which is waiting on `xPromise` and `yPromise` to resolve). The chained call to `.then(..)` creates another promise, which the `return values[0] + values[1]` line immediately resolves (with the result of the addition). Thus, the `then(..)` call we chain off the end of the `add(..)` call -- at the end of the snippet -- is actually operating on that second promise returned, rather than the first one created by `Promise.all([ .. ])`. Also, though we are not chaining off the end of that second `then(..)`, it too has created another promise, had we chosen to observe/use it. This Promise chaining stuff will be explained in much greater detail later in this chapter. + +Just like with cheeseburger orders, it's possible that the resolution of a Promise is rejection instead of fulfillment. 
Unlike a fulfilled Promise, where the value is always programmatic, a rejection value -- commonly called a "rejection reason" -- can either be set directly by the program logic, or it can result implicitly from a runtime exception. + +With Promises, the `then(..)` call can actually take two functions, the first for fulfillment (as shown earlier), and the second for rejection: + +```js +add( fetchX(), fetchY() ) +.then( + // fulfillment handler + function(sum) { + console.log( sum ); + }, + // rejection handler + function(err) { + console.error( err ); // bummer! + } +); +``` + +If something went wrong getting `X` or `Y`, or something somehow failed during the addition, the promise that `add(..)` returns is rejected, and the second callback error handler passed to `then(..)` will receive the rejection value from the promise. + +Because Promises encapsulate the time-dependent state -- waiting on the fulfillment or rejection of the underlying value -- from the outside, the Promise itself is time-independent, and thus Promises can be composed (combined) in predictable ways regardless of the timing or outcome underneath. + +Moreover, once a Promise is resolved, it stays that way forever -- it becomes an *immutable value* at that point -- and can then be *observed* as many times as necessary. + +**Note:** Because a Promise is externally immutable once resolved, it's now safe to pass that value around to any party and know that it cannot be modified accidentally or maliciously. This is especially true in relation to multiple parties observing the resolution of a Promise. It is not possible for one party to affect another party's ability to observe Promise resolution. Immutability may sound like an academic topic, but it's actually one of the most fundamental and important aspects of Promise design, and shouldn't be casually passed over. + +That's one of the most powerful and important concepts to understand about Promises. 
With a fair amount of work, you could ad hoc create the same effects with nothing but ugly callback composition, but that's not really an effective strategy, especially because you have to do it over and over again. + +Promises are an easily repeatable mechanism for encapsulating and composing *future values*. + +### Completion Event + +As we just saw, an individual Promise behaves as a *future value*. But there's another way to think of the resolution of a Promise: as a flow-control mechanism -- a temporal this-then-that -- for two or more steps in an asynchronous task. + +Let's imagine calling a function `foo(..)` to perform some task. We don't know about any of its details, nor do we care. It may complete the task right away, or it may take a while. + +We just simply need to know when `foo(..)` finishes so that we can move on to our next task. In other words, we'd like a way to be notified of `foo(..)`'s completion so that we can *continue*. + +In typical JavaScript fashion, if you need to listen for a notification, you'd likely think of that in terms of events. So we could reframe our need for notification as a need to listen for a *completion* (or *continuation*) event emitted by `foo(..)`. + +**Note:** Whether you call it a "completion event" or a "continuation event" depends on your perspective. Is the focus more on what happens with `foo(..)`, or what happens *after* `foo(..)` finishes? Both perspectives are accurate and useful. The event notification tells us that `foo(..)` has *completed*, but also that it's OK to *continue* with the next step. Indeed, the callback you pass to be called for the event notification is itself what we've previously called a *continuation*. Because *completion event* is a bit more focused on the `foo(..)`, which more has our attention at present, we slightly favor *completion event* for the rest of this text. + +With callbacks, the "notification" would be our callback invoked by the task (`foo(..)`). 
But with Promises, we turn the relationship around, and expect that we can listen for an event from `foo(..)`, and when notified, proceed accordingly. + +First, consider some pseudocode: + +```js +foo(x) { + // start doing something that could take a while +} + +foo( 42 ) + +on (foo "completion") { + // now we can do the next step! +} + +on (foo "error") { + // oops, something went wrong in `foo(..)` +} +``` + +We call `foo(..)` and then we set up two event listeners, one for `"completion"` and one for `"error"` -- the two possible *final* outcomes of the `foo(..)` call. In essence, `foo(..)` doesn't even appear to be aware that the calling code has subscribed to these events, which makes for a very nice *separation of concerns*. + +Unfortunately, such code would require some "magic" of the JS environment that doesn't exist (and would likely be a bit impractical). Here's the more natural way we could express that in JS: + +```js +function foo(x) { + // start doing something that could take a while + + // make a `listener` event notification + // capability to return + + return listener; +} + +var evt = foo( 42 ); + +evt.on( "completion", function(){ + // now we can do the next step! +} ); + +evt.on( "error", function(err){ + // oops, something went wrong in `foo(..)` +} ); +``` + +`foo(..)` expressly creates an event subscription capability to return, and the calling code receives it and registers the two event handlers against it. + +The inversion from normal callback-oriented code should be obvious, and it's intentional. Instead of passing the callbacks to `foo(..)`, it returns an event capability we call `evt`, which receives the callbacks. + +But if you recall from Chapter 2, callbacks themselves represent an *inversion of control*. So inverting the callback pattern is actually an *inversion of inversion*, or an *uninversion of control* -- restoring control to the calling code where we wanted it to be in the first place. 
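
To make the idea of an event capability a little more concrete, here is a minimal sketch of how `foo(..)` might build and return one. The `makeListener()` helper, its `subscriber` and `emit` members, and the use of `setTimeout(..)` to stand in for a slow task are all assumptions made for illustration; none of them is a built-in API, and the sketch only wires up the `"completion"` event.

```js
// hypothetical helper: builds a tiny event capability
function makeListener() {
    var handlers = {};

    // registration side, handed out to calling code
    function on(name, fn) {
        ( handlers[name] = handlers[name] || [] ).push( fn );
    }

    // notification side, kept private to the task
    function emit(name, data) {
        ( handlers[name] || [] ).forEach( function(fn){
            fn( data );
        } );
    }

    return { subscriber: { on: on }, emit: emit };
}

function foo(x) {
    var listener = makeListener();

    // simulate a task that takes a while
    setTimeout( function(){
        listener.emit( "completion", x * 2 );
    }, 10 );

    // only the registration side is exposed
    return listener.subscriber;
}

var evt = foo( 21 );

evt.on( "completion", function(result){
    console.log( "foo(..) finished with", result );
} );
```

One limitation of this sketch is that a handler registered after the event has fired is never called. Promises do not share that limitation, because a resolved Promise keeps its resolution and can be observed at any later time.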
+ +One important benefit is that multiple separate parts of the code can be given the event listening capability, and they can all independently be notified of when `foo(..)` completes to perform subsequent steps after its completion: + +```js +var evt = foo( 42 ); + +// let `bar(..)` listen to `foo(..)`'s completion +bar( evt ); + +// also, let `baz(..)` listen to `foo(..)`'s completion +baz( evt ); +``` + +*Uninversion of control* enables a nicer *separation of concerns*, where `bar(..)` and `baz(..)` don't need to be involved in how `foo(..)` is called. Similarly, `foo(..)` doesn't need to know or care that `bar(..)` and `baz(..)` exist or are waiting to be notified when `foo(..)` completes. + +Essentially, this `evt` object is a neutral third-party negotiation between the separate concerns. + +#### Promise "Events" + +As you may have guessed by now, the `evt` event listening capability is an analogy for a Promise. + +In a Promise-based approach, the previous snippet would have `foo(..)` creating and returning a `Promise` instance, and that promise would then be passed to `bar(..)` and `baz(..)`. + +**Note:** The Promise resolution "events" we listen for aren't strictly events (though they certainly behave like events for these purposes), and they're not typically called `"completion"` or `"error"`. Instead, we use `then(..)` to register a `"then"` event. Or perhaps more precisely, `then(..)` registers `"fulfillment"` and/or `"rejection"` event(s), though we don't see those terms used explicitly in the code. + +Consider: + +```js +function foo(x) { + // start doing something that could take a while + + // construct and return a promise + return new Promise( function(resolve,reject){ + // eventually, call `resolve(..)` or `reject(..)`, + // which are the resolution callbacks for + // the promise. + } ); +} + +var p = foo( 42 ); + +bar( p ); + +baz( p ); +``` + +**Note:** The pattern shown with `new Promise( function(..){ .. 
} )` is generally called the ["revealing constructor"](http://domenic.me/2014/02/13/the-revealing-constructor-pattern/). The function passed in is executed immediately (not async deferred, as callbacks to `then(..)` are), and it's provided two parameters, which in this case we've named `resolve` and `reject`. These are the resolution functions for the promise. `resolve(..)` generally signals fulfillment, and `reject(..)` signals rejection. + +You can probably guess what the internals of `bar(..)` and `baz(..)` might look like: + +```js +function bar(fooPromise) { + // listen for `foo(..)` to complete + fooPromise.then( + function(){ + // `foo(..)` has now finished, so + // do `bar(..)`'s task + }, + function(){ + // oops, something went wrong in `foo(..)` + } + ); +} + +// ditto for `baz(..)` +``` + +Promise resolution doesn't necessarily need to involve sending along a message, as it did when we were examining Promises as *future values*. It can just be a flow-control signal, as used in the previous snippet. + +Another way to approach this is: + +```js +function bar() { + // `foo(..)` has definitely finished, so + // do `bar(..)`'s task +} + +function oopsBar() { + // oops, something went wrong in `foo(..)`, + // so `bar(..)` didn't run +} + +// ditto for `baz()` and `oopsBaz()` + +var p = foo( 42 ); + +p.then( bar, oopsBar ); + +p.then( baz, oopsBaz ); +``` + +**Note:** If you've seen Promise-based coding before, you might be tempted to believe that the last two lines of that code could be written as `p.then( .. ).then( .. )`, using chaining, rather than `p.then(..); p.then(..)`. That would have an entirely different behavior, so be careful! The difference might not be clear right now, but it's actually a different async pattern than we've seen thus far: splitting/forking. Don't worry! We'll come back to this point later in this chapter. 
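
As a small preview of the difference between `p.then(..); p.then(..)` and `p.then(..).then(..)`, consider the following sketch. The variable names (`forkA`, `forkB`, `chained`) and the specific arithmetic are illustrative assumptions only; `Promise.all(..)` is used here simply to collect the three results for printing.

```js
var p = Promise.resolve( 21 );

// forking: both handlers observe `p` itself,
// so each one receives `21`
var forkA = p.then( function(v){
    return v * 2;
} );
var forkB = p.then( function(v){
    return v + 1;
} );

// chaining: the second handler observes the promise
// returned by the first `then(..)`, so it receives `42`
var chained = p
.then( function(v){
    return v * 2;
} )
.then( function(v){
    return v + 1;
} );

Promise.all( [ forkA, forkB, chained ] )
.then( function(results){
    console.log( results ); // [ 42, 22, 43 ]
} );
```

In the forked form, neither handler sees the other's return value. In the chained form, the second handler's input is whatever the first handler returned.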
+ +Instead of passing the `p` promise to `bar(..)` and `baz(..)`, we use the promise to control when `bar(..)` and `baz(..)` will get executed, if ever. The primary difference is in the error handling. + +In the first snippet's approach, `bar(..)` is called regardless of whether `foo(..)` succeeds or fails, and it handles its own fallback logic if it's notified that `foo(..)` failed. The same is true for `baz(..)`, obviously. + +In the second snippet, `bar(..)` only gets called if `foo(..)` succeeds, and otherwise `oopsBar(..)` gets called. Ditto for `baz(..)`. + +Neither approach is *correct* per se. There will be cases where one is preferred over the other. + +In either case, the promise `p` that comes back from `foo(..)` is used to control what happens next. + +Moreover, the fact that both snippets end up calling `then(..)` twice against the same promise `p` illustrates the point made earlier, which is that Promises (once resolved) retain their same resolution (fulfillment or rejection) forever, and can subsequently be observed as many times as necessary. + +Whenever `p` is resolved, the next step will always be the same, both *now* and *later*. + +## Thenable Duck Typing + +In Promises-land, an important detail is how to know for sure if some value is a genuine Promise or not. Or more directly, is it a value that will behave like a Promise? + +Given that Promises are constructed by the `new Promise(..)` syntax, you might think that `p instanceof Promise` would be an acceptable check. But unfortunately, there are a number of reasons that's not totally sufficient. + +Mainly, you can receive a Promise value from another browser window (iframe, etc.), which would have its own Promise different from the one in the current window/frame, and that check would fail to identify the Promise instance. + +Moreover, a library or framework may choose to vend its own Promises and not use the native ES6 `Promise` implementation to do so. 
In fact, you may very well be using Promises with libraries in older browsers that have no Promise at all. + +When we discuss Promise resolution processes later in this chapter, it will become more obvious why a non-genuine-but-Promise-like value would still be very important to be able to recognize and assimilate. But for now, just take my word for it that it's a critical piece of the puzzle. + +As such, it was decided that the way to recognize a Promise (or something that behaves like a Promise) would be to define something called a "thenable" as any object or function which has a `then(..)` method on it. It is assumed that any such value is a Promise-conforming thenable. + +The general term for "type checks" that make assumptions about a value's "type" based on its shape (what properties are present) is called "duck typing" -- "If it looks like a duck, and quacks like a duck, it must be a duck" (see the *Types & Grammar* title of this book series). So the duck typing check for a thenable would roughly be: + +```js +if ( + p !== null && + ( + typeof p === "object" || + typeof p === "function" + ) && + typeof p.then === "function" +) { + // assume it's a thenable! +} +else { + // not a thenable +} +``` + +Yuck! Setting aside the fact that this logic is a bit ugly to implement in various places, there's something deeper and more troubling going on. + +If you try to fulfill a Promise with any object/function value that happens to have a `then(..)` function on it, but you weren't intending it to be treated as a Promise/thenable, you're out of luck, because it will automatically be recognized as thenable and treated with special rules (see later in the chapter). + +This is even true if you didn't realize the value has a `then(..)` on it. 
For example: + +```js +var o = { then: function(){} }; + +// make `v` be `[[Prototype]]`-linked to `o` +var v = Object.create( o ); + +v.someStuff = "cool"; +v.otherStuff = "not so cool"; + +v.hasOwnProperty( "then" ); // false +``` + +`v` doesn't look like a Promise or thenable at all. It's just a plain object with some properties on it. You're probably just intending to send that value around like any other object. + +But unknown to you, `v` is also `[[Prototype]]`-linked (see the *this & Object Prototypes* title of this book series) to another object `o`, which happens to have a `then(..)` on it. So the thenable duck typing checks will think and assume `v` is a thenable. Uh oh. + +It doesn't even need to be something as directly intentional as that: + +```js +Object.prototype.then = function(){}; +Array.prototype.then = function(){}; + +var v1 = { hello: "world" }; +var v2 = [ "Hello", "World" ]; +``` + +Both `v1` and `v2` will be assumed to be thenables. You can't control or predict if any other code accidentally or maliciously adds `then(..)` to `Object.prototype`, `Array.prototype`, or any of the other native prototypes. And if what's specified is a function that doesn't call either of its parameters as callbacks, then any Promise resolved with such a value will just silently hang forever! Crazy. + +Sound implausible or unlikely? Perhaps. + +But keep in mind that there were several well-known non-Promise libraries preexisting in the community prior to ES6 that happened to already have a method on them called `then(..)`. Some of those libraries chose to rename their own methods to avoid collision (that sucks!). Others have simply been relegated to the unfortunate status of "incompatible with Promise-based coding" in reward for their inability to change to get out of the way. 
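
To see the hazard in action, the duck typing check can be wrapped in a helper and applied to the `[[Prototype]]`-linked object from the earlier snippet. This is a sketch under stated assumptions: `isThenable(..)` is a hypothetical name rather than a built-in, and the 50ms timer is an arbitrary cutoff used only to show that the promise has not settled by then.

```js
// hypothetical helper wrapping the duck typing check
function isThenable(p) {
    return (
        p !== null &&
        (
            typeof p === "object" ||
            typeof p === "function"
        ) &&
        typeof p.then === "function"
    );
}

var o = { then: function(){} };

// `v` inherits `then(..)` through its `[[Prototype]]` link
var v = Object.create( o );
v.someStuff = "cool";

console.log( isThenable( v ) );                  // true
console.log( isThenable( { hello: "world" } ) ); // false
console.log( isThenable( 42 ) );                 // false

// `o.then(..)` never calls either of its parameters, so a
// promise resolved with `v` stays pending; the timer wins
var outcome = Promise.race( [
    Promise.resolve( v ).then( function(){
        return "settled";
    } ),
    new Promise( function(resolve){
        setTimeout( function(){
            resolve( "still pending" );
        }, 50 );
    } )
] );

outcome.then( function(msg){
    console.log( msg ); // still pending
} );
```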
+ +The standards decision to hijack the previously nonreserved -- and completely general-purpose sounding -- `then` property name means that no value (or any of its delegates), either past, present, or future, can have a `then(..)` function present, either on purpose or by accident, or that value will be confused for a thenable in Promises systems, which will probably create bugs that are really hard to track down. + +**Warning:** I do not like how we ended up with duck typing of thenables for Promise recognition. There were other options, such as "branding" or even "anti-branding"; what we got seems like a worst-case compromise. But it's not all doom and gloom. Thenable duck typing can be helpful, as we'll see later. Just beware that thenable duck typing can be hazardous if it incorrectly identifies something as a Promise that isn't. + +## Promise Trust + +We've now seen two strong analogies that explain different aspects of what Promises can do for our async code. But if we stop there, we've missed perhaps the single most important characteristic that the Promise pattern establishes: trust. + +Whereas the *future values* and *completion events* analogies play out explicitly in the code patterns we've explored, it won't be entirely obvious why or how Promises are designed to solve all of the *inversion of control* trust issues we laid out in the "Trust Issues" section of Chapter 2. But with a little digging, we can uncover some important guarantees that restore the confidence in async coding that Chapter 2 tore down! + +Let's start by reviewing the trust issues with callbacks-only coding. 
When you pass a callback to a utility `foo(..)`, it might: + +* Call the callback too early +* Call the callback too late (or never) +* Call the callback too few or too many times +* Fail to pass along any necessary environment/parameters +* Swallow any errors/exceptions that may happen + +The characteristics of Promises are intentionally designed to provide useful, repeatable answers to all these concerns. + +### Calling Too Early + +Primarily, this is a concern of whether code can introduce Zalgo-like effects (see Chapter 2), where sometimes a task finishes synchronously and sometimes asynchronously, which can lead to race conditions. + +Promises by definition cannot be susceptible to this concern, because even an immediately fulfilled Promise (like `new Promise(function(resolve){ resolve(42); })`) cannot be *observed* synchronously. + +That is, when you call `then(..)` on a Promise, even if that Promise was already resolved, the callback you provide to `then(..)` will **always** be called asynchronously (for more on this, refer back to "Jobs" in Chapter 1). + +No more need to insert your own `setTimeout(..,0)` hacks. Promises prevent Zalgo automatically. + +### Calling Too Late + +Similar to the previous point, a Promise's `then(..)` registered observation callbacks are automatically scheduled when either `resolve(..)` or `reject(..)` is called by the Promise creation capability. Those scheduled callbacks will predictably be fired at the next asynchronous moment (see "Jobs" in Chapter 1). + +Synchronous observation is not possible, so a synchronous chain of tasks cannot run in such a way as to, in effect, "delay" another callback from happening as expected. 
That is, when a Promise is resolved, all `then(..)` registered callbacks on it will be called, in order, immediately at the next asynchronous opportunity (again, see "Jobs" in Chapter 1), and nothing that happens inside of one of those callbacks can affect/delay the calling of the other callbacks. + +For example: + +```js +p.then( function(){ + p.then( function(){ + console.log( "C" ); + } ); + console.log( "A" ); +} ); +p.then( function(){ + console.log( "B" ); +} ); +// A B C +``` + +Here, `"C"` cannot interrupt and precede `"B"`, by virtue of how Promises are defined to operate. + +#### Promise Scheduling Quirks + +It's important to note, though, that there are lots of nuances of scheduling where the relative ordering between callbacks chained off two separate Promises is not reliably predictable. + +If two promises `p1` and `p2` are both already resolved, it should be true that `p1.then(..); p2.then(..)` would end up calling the callback(s) for `p1` before the ones for `p2`. But there are subtle cases where that might not be true, such as the following: + +```js +var p3 = new Promise( function(resolve,reject){ + resolve( "B" ); +} ); + +var p1 = new Promise( function(resolve,reject){ + resolve( p3 ); +} ); + +var p2 = new Promise( function(resolve,reject){ + resolve( "A" ); +} ); + +p1.then( function(v){ + console.log( v ); +} ); + +p2.then( function(v){ + console.log( v ); +} ); + +// A B <-- not B A as you might expect +``` + +We'll cover this more later, but as you can see, `p1` is resolved not with an immediate value, but with another promise `p3` which is itself resolved with the value `"B"`. The specified behavior is to *unwrap* `p3` into `p1`, but asynchronously, so `p1`'s callback(s) are *behind* `p2`'s callback(s) in the asynchronous Job queue (see Chapter 1). + +To avoid such nuanced nightmares, you should never rely on anything about the ordering/scheduling of callbacks across Promises. 
In fact, a good practice is not to code in such a way where the ordering of multiple callbacks matters at all. Avoid that if you can. + +### Never Calling the Callback + +This is a very common concern. It's addressable in several ways with Promises. + +First, nothing (not even a JS error) can prevent a Promise from notifying you of its resolution (if it's resolved). If you register both fulfillment and rejection callbacks for a Promise, and the Promise gets resolved, one of the two callbacks will always be called. + +Of course, if your callbacks themselves have JS errors, you may not see the outcome you expect, but the callback will in fact have been called. We'll cover later how to be notified of an error in your callback, because even those don't get swallowed. + +But what if the Promise itself never gets resolved either way? Even that is a condition that Promises provide an answer for, using a higher level abstraction called a "race": + +```js +// a utility for timing out a Promise +function timeoutPromise(delay) { + return new Promise( function(resolve,reject){ + setTimeout( function(){ + reject( "Timeout!" ); + }, delay ); + } ); +} + +// setup a timeout for `foo()` +Promise.race( [ + foo(), // attempt `foo()` + timeoutPromise( 3000 ) // give it 3 seconds +] ) +.then( + function(){ + // `foo(..)` fulfilled in time! + }, + function(err){ + // either `foo()` rejected, or it just + // didn't finish in time, so inspect + // `err` to know which + } +); +``` + +There are more details to consider with this Promise timeout pattern, but we'll come back to it later. + +Importantly, we can ensure a signal as to the outcome of `foo()`, to prevent it from hanging our program indefinitely. + +### Calling Too Few or Too Many Times + +By definition, *one* is the appropriate number of times for the callback to be called. The "too few" case would be zero calls, which is the same as the "never" case we just examined. + +The "too many" case is easy to explain. 
Promises are defined so that they can only be resolved once. If for some reason the Promise creation code tries to call `resolve(..)` or `reject(..)` multiple times, or tries to call both, the Promise will accept only the first resolution, and will silently ignore any subsequent attempts. + +Because a Promise can only be resolved once, any `then(..)` registered callbacks will only ever be called once (each). + +Of course, if you register the same callback more than once, (e.g., `p.then(f); p.then(f);`), it'll be called as many times as it was registered. The guarantee that a response function is called only once does not prevent you from shooting yourself in the foot. + +### Failing to Pass Along Any Parameters/Environment + +Promises can have, at most, one resolution value (fulfillment or rejection). + +If you don't explicitly resolve with a value either way, the value is `undefined`, as is typical in JS. But whatever the value, it will always be passed to all registered (and appropriate: fulfillment or rejection) callbacks, either *now* or in the future. + +Something to be aware of: If you call `resolve(..)` or `reject(..)` with multiple parameters, all subsequent parameters beyond the first will be silently ignored. Although that might seem a violation of the guarantee we just described, it's not exactly, because it constitutes an invalid usage of the Promise mechanism. Other invalid usages of the API (such as calling `resolve(..)` multiple times) are similarly *protected*, so the Promise behavior here is consistent (if not a tiny bit frustrating). + +If you want to pass along multiple values, you must wrap them in another single value that you pass, such as an `array` or an `object`. + +As for environment, functions in JS always retain their closure of the scope in which they're defined (see the *Scope & Closures* title of this series), so they of course would continue to have access to whatever surrounding state you provide. 
Of course, the same is true of callbacks-only design, so this isn't a specific augmentation of benefit from Promises -- but it's a guarantee we can rely on nonetheless. + +### Swallowing Any Errors/Exceptions + +In the base sense, this is a restatement of the previous point. If you reject a Promise with a *reason* (aka error message), that value is passed to the rejection callback(s). + +But there's something much bigger at play here. If at any point in the creation of a Promise, or in the observation of its resolution, a JS exception error occurs, such as a `TypeError` or `ReferenceError`, that exception will be caught, and it will force the Promise in question to become rejected. + +For example: + +```js +var p = new Promise( function(resolve,reject){ + foo.bar(); // `foo` is not defined, so error! + resolve( 42 ); // never gets here :( +} ); + +p.then( + function fulfilled(){ + // never gets here :( + }, + function rejected(err){ + // `err` will be a `TypeError` exception object + // from the `foo.bar()` line. + } +); +``` + +The JS exception that occurs from `foo.bar()` becomes a Promise rejection that you can catch and respond to. + +This is an important detail, because it effectively solves another potential Zalgo moment, which is that errors could create a synchronous reaction whereas nonerrors would be asynchronous. Promises turn even JS exceptions into asynchronous behavior, thereby reducing the race condition chances greatly. + +But what happens if a Promise is fulfilled, but there's a JS exception error during the observation (in a `then(..)` registered callback)? 
Even those aren't lost, but you may find how they're handled a bit surprising, until you dig in a little deeper: + +```js +var p = new Promise( function(resolve,reject){ + resolve( 42 ); +} ); + +p.then( + function fulfilled(msg){ + foo.bar(); + console.log( msg ); // never gets here :( + }, + function rejected(err){ + // never gets here either :( + } +); +``` + +Wait, that makes it seem like the exception from `foo.bar()` really did get swallowed. Never fear, it didn't. But something deeper is wrong, which is that we've failed to listen for it. The `p.then(..)` call itself returns another promise, and it's *that* promise that will be rejected with the `TypeError` exception. + +Why couldn't it just call the error handler we have defined there? Seems like a logical behavior on the surface. But it would violate the fundamental principle that Promises are **immutable** once resolved. `p` was already fulfilled to the value `42`, so it can't later be changed to a rejection just because there's an error in observing `p`'s resolution. + +Besides the principle violation, such behavior could wreak havoc, if say there were multiple `then(..)` registered callbacks on the promise `p`, because some would get called and others wouldn't, and it would be very opaque as to why. + +### Trustable Promise? + +There's one last detail to examine to establish trust based on the Promise pattern. + +You've no doubt noticed that Promises don't get rid of callbacks at all. They just change where the callback is passed to. Instead of passing a callback to `foo(..)`, we get *something* (ostensibly a genuine Promise) back from `foo(..)`, and we pass the callback to that *something* instead. + +But why would this be any more trustable than just callbacks alone? How can we be sure the *something* we get back is in fact a trustable Promise? Isn't it basically all just a house of cards where we can trust only because we already trusted? 
+ +One of the most important, but often overlooked, details of Promises is that they have a solution to this issue as well. Included with the native ES6 `Promise` implementation is `Promise.resolve(..)`. + +If you pass an immediate, non-Promise, non-thenable value to `Promise.resolve(..)`, you get a promise that's fulfilled with that value. In other words, these two promises `p1` and `p2` will behave basically identically: + +```js +var p1 = new Promise( function(resolve,reject){ + resolve( 42 ); +} ); + +var p2 = Promise.resolve( 42 ); +``` + +But if you pass a genuine Promise to `Promise.resolve(..)`, you just get the same promise back: + +```js +var p1 = Promise.resolve( 42 ); + +var p2 = Promise.resolve( p1 ); + +p1 === p2; // true +``` + +Even more importantly, if you pass a non-Promise thenable value to `Promise.resolve(..)`, it will attempt to unwrap that value, and the unwrapping will keep going until a concrete final non-Promise-like value is extracted. + +Recall our previous discussion of thenables? + +Consider: + +```js +var p = { + then: function(cb) { + cb( 42 ); + } +}; + +// this works OK, but only by good fortune +p +.then( + function fulfilled(val){ + console.log( val ); // 42 + }, + function rejected(err){ + // never gets here + } +); +``` + +This `p` is a thenable, but it's not a genuine Promise. Luckily, it's reasonable, as most will be. But what if you got back instead something that looked like: + +```js +var p = { + then: function(cb,errcb) { + cb( 42 ); + errcb( "evil laugh" ); + } +}; + +p +.then( + function fulfilled(val){ + console.log( val ); // 42 + }, + function rejected(err){ + // oops, shouldn't have run + console.log( err ); // evil laugh + } +); +``` + +This `p` is a thenable but it's not so well behaved of a promise. Is it malicious? Or is it just ignorant of how Promises should work? It doesn't really matter, to be honest. In either case, it's not trustable as is. 
+ +Nonetheless, we can pass either of these versions of `p` to `Promise.resolve(..)`, and we'll get the normalized, safe result we'd expect: + +```js +Promise.resolve( p ) +.then( + function fulfilled(val){ + console.log( val ); // 42 + }, + function rejected(err){ + // never gets here + } +); +``` + +`Promise.resolve(..)` will accept any thenable, and will unwrap it to its non-thenable value. But you get back from `Promise.resolve(..)` a real, genuine Promise in its place, **one that you can trust**. If what you passed in is already a genuine Promise, you just get it right back, so there's no downside at all to filtering through `Promise.resolve(..)` to gain trust. + +So let's say we're calling a `foo(..)` utility and we're not sure we can trust its return value to be a well-behaving Promise, but we know it's at least a thenable. `Promise.resolve(..)` will give us a trustable Promise wrapper to chain off of: + +```js +// don't just do this: +foo( 42 ) +.then( function(v){ + console.log( v ); +} ); + +// instead, do this: +Promise.resolve( foo( 42 ) ) +.then( function(v){ + console.log( v ); +} ); +``` + +**Note:** Another beneficial side effect of wrapping `Promise.resolve(..)` around any function's return value (thenable or not) is that it's an easy way to normalize that function call into a well-behaving async task. If `foo(42)` returns an immediate value sometimes, or a Promise other times, `Promise.resolve( foo(42) )` makes sure it's always a Promise result. And avoiding Zalgo makes for much better code. + +### Trust Built + +Hopefully the previous discussion now fully "resolves" (pun intended) in your mind why the Promise is trustable, and more importantly, why that trust is so critical in building robust, maintainable software. + +Can you write async code in JS without trust? Of course you can. We JS developers have been coding async with nothing but callbacks for nearly two decades. 
+ +But once you start questioning just how much you can trust the mechanisms you build upon to actually be predictable and reliable, you start to realize callbacks have a pretty shaky trust foundation. + +Promises are a pattern that augments callbacks with trustable semantics, so that the behavior is more reason-able and more reliable. By uninverting the *inversion of control* of callbacks, we place the control with a trustable system (Promises) that was designed specifically to bring sanity to our async. + +## Chain Flow + +We've hinted at this a couple of times already, but Promises are not just a mechanism for a single-step *this-then-that* sort of operation. That's the building block, of course, but it turns out we can string multiple Promises together to represent a sequence of async steps. + +The key to making this work is built on two behaviors intrinsic to Promises: + +* Every time you call `then(..)` on a Promise, it creates and returns a new Promise, which we can *chain* with. +* Whatever value you return from the `then(..)` call's fulfillment callback (the first parameter) is automatically set as the fulfillment of the *chained* Promise (from the first point). + +Let's first illustrate what that means, and *then* we'll derive how that helps us create async sequences of flow control. Consider the following: + +```js +var p = Promise.resolve( 21 ); + +var p2 = p.then( function(v){ + console.log( v ); // 21 + + // fulfill `p2` with value `42` + return v * 2; +} ); + +// chain off `p2` +p2.then( function(v){ + console.log( v ); // 42 +} ); +``` + +By returning `v * 2` (i.e., `42`), we fulfill the `p2` promise that the first `then(..)` call created and returned. When `p2`'s `then(..)` call runs, it's receiving the fulfillment from the `return v * 2` statement. Of course, `p2.then(..)` creates yet another promise, which we could have stored in a `p3` variable. + +But it's a little annoying to have to create an intermediate variable `p2` (or `p3`, etc.). 
Thankfully, we can easily just chain these together: + +```js +var p = Promise.resolve( 21 ); + +p +.then( function(v){ + console.log( v ); // 21 + + // fulfill the chained promise with value `42` + return v * 2; +} ) +// here's the chained promise +.then( function(v){ + console.log( v ); // 42 +} ); +``` + +So now the first `then(..)` is the first step in an async sequence, and the second `then(..)` is the second step. This could keep going for as long as you needed it to extend. Just keep chaining off a previous `then(..)` with each automatically created Promise. + +But there's something missing here. What if we want step 2 to wait for step 1 to do something asynchronous? We're using an immediate `return` statement, which immediately fulfills the chained promise. + +The key to making a Promise sequence truly async capable at every step is to recall how `Promise.resolve(..)` operates when what you pass to it is a Promise or thenable instead of a final value. `Promise.resolve(..)` directly returns a received genuine Promise, or it unwraps the value of a received thenable -- and keeps going recursively while it keeps unwrapping thenables. + +The same sort of unwrapping happens if you `return` a thenable or Promise from the fulfillment (or rejection) handler. Consider: + +```js +var p = Promise.resolve( 21 ); + +p.then( function(v){ + console.log( v ); // 21 + + // create a promise and return it + return new Promise( function(resolve,reject){ + // fulfill with value `42` + resolve( v * 2 ); + } ); +} ) +.then( function(v){ + console.log( v ); // 42 +} ); +``` + +Even though we wrapped `42` up in a promise that we returned, it still got unwrapped and ended up as the resolution of the chained promise, such that the second `then(..)` still received `42`. 
If we introduce asynchrony to that wrapping promise, everything still nicely works the same: + +```js +var p = Promise.resolve( 21 ); + +p.then( function(v){ + console.log( v ); // 21 + + // create a promise to return + return new Promise( function(resolve,reject){ + // introduce asynchrony! + setTimeout( function(){ + // fulfill with value `42` + resolve( v * 2 ); + }, 100 ); + } ); +} ) +.then( function(v){ + // runs after the 100ms delay in the previous step + console.log( v ); // 42 +} ); +``` + +That's incredibly powerful! Now we can construct a sequence of however many async steps we want, and each step can delay the next step (or not!), as necessary. + +Of course, the value passing from step to step in these examples is optional. If you don't return an explicit value, an implicit `undefined` is assumed, and the promises still chain together the same way. Each Promise resolution is thus just a signal to proceed to the next step. + +To further the chain illustration, let's generalize a delay-Promise creation (without resolution messages) into a utility we can reuse for multiple steps: + +```js +function delay(time) { + return new Promise( function(resolve,reject){ + setTimeout( resolve, time ); + } ); +} + +delay( 100 ) // step 1 +.then( function STEP2(){ + console.log( "step 2 (after 100ms)" ); + return delay( 200 ); +} ) +.then( function STEP3(){ + console.log( "step 3 (after another 200ms)" ); +} ) +.then( function STEP4(){ + console.log( "step 4 (next Job)" ); + return delay( 50 ); +} ) +.then( function STEP5(){ + console.log( "step 5 (after another 50ms)" ); +} ) +... +``` + +Calling `delay(200)` creates a promise that will fulfill in 200ms, and then we return that from the first `then(..)` fulfillment callback, which causes the second `then(..)`'s promise to wait on that 200ms promise. + +**Note:** As described, technically there are two promises in that interchange: the 200ms-delay promise and the chained promise that the second `then(..)` chains from. 
But you may find it easier to mentally combine these two promises together, because the Promise mechanism automatically merges their states for you. In that respect, you could think of `return delay(200)` as creating a promise that replaces the earlier-returned chained promise. + +To be honest, though, sequences of delays with no message passing isn't a terribly useful example of Promise flow control. Let's look at a scenario that's a little more practical. + +Instead of timers, let's consider making Ajax requests: + +```js +// assume an `ajax( {url}, {callback} )` utility + +// Promise-aware ajax +function request(url) { + return new Promise( function(resolve,reject){ + // the `ajax(..)` callback should be our + // promise's `resolve(..)` function + ajax( url, resolve ); + } ); +} +``` + +We first define a `request(..)` utility that constructs a promise to represent the completion of the `ajax(..)` call: + +```js +request( "http://some.url.1/" ) +.then( function(response1){ + return request( "http://some.url.2/?v=" + response1 ); +} ) +.then( function(response2){ + console.log( response2 ); +} ); +``` + +**Note:** Developers commonly encounter situations in which they want to do Promise-aware async flow control with utilities that are not themselves Promise-enabled (like `ajax(..)` here, which expects a callback). Although the native ES6 `Promise` mechanism doesn't automatically solve this pattern for us, practically all Promise libraries *do*. They usually call this process "lifting" or "promisifying" or some variation thereof. We'll come back to this technique later. + +Using the Promise-returning `request(..)`, we create the first step in our chain implicitly by calling it with the first URL, and chain off that returned promise with the first `then(..)`. + +Once `response1` comes back, we use that value to construct a second URL, and make a second `request(..)` call. 
That second `request(..)` promise is `return`ed so that the third step in our async flow control waits for that Ajax call to complete. Finally, we print `response2` once it returns. + +The Promise chain we construct is not only a flow control that expresses a multistep async sequence, but it also acts as a message channel to propagate messages from step to step. + +What if something went wrong in one of the steps of the Promise chain? An error/exception is on a per-Promise basis, which means it's possible to catch such an error at any point in the chain, and that catching acts to sort of "reset" the chain back to normal operation at that point: + +```js +// step 1: +request( "http://some.url.1/" ) + +// step 2: +.then( function(response1){ + foo.bar(); // undefined, error! + + // never gets here + return request( "http://some.url.2/?v=" + response1 ); +} ) + +// step 3: +.then( + function fulfilled(response2){ + // never gets here + }, + // rejection handler to catch the error + function rejected(err){ + console.log( err ); // `TypeError` from `foo.bar()` error + return 42; + } +) + +// step 4: +.then( function(msg){ + console.log( msg ); // 42 +} ); +``` + +When the error occurs in step 2, the rejection handler in step 3 catches it. The return value (`42` in this snippet), if any, from that rejection handler fulfills the promise for the next step (4), such that the chain is now back in a fulfillment state. + +**Note:** As we discussed earlier, when returning a promise from a fulfillment handler, it's unwrapped and can delay the next step. That's also true for returning promises from rejection handlers, such that if the `return 42` in step 3 instead returned a promise, that promise could delay step 4. A thrown exception inside either the fulfillment or rejection handler of a `then(..)` call causes the next (chained) promise to be immediately rejected with that exception. 
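As a minimal sketch of that note, the chain below returns a promise from a rejection handler (which delays the next step) and then throws from a fulfillment handler (which rejects the next promise). The `later(..)` helper and the `log` array are hypothetical names used only for this illustration:

```js
// hypothetical helper: fulfill with `val` after `ms` milliseconds
function later(val,ms) {
    return new Promise( function(resolve){
        setTimeout( function(){
            resolve( val );
        }, ms );
    } );
}

var log = [];

var chain = Promise.reject( "Oops" )
.then(
    null,
    function rejected(err){
        log.push( "handled: " + err );
        // a promise returned from a rejection handler is
        // unwrapped, so the next step waits on it
        return later( 42, 50 );
    }
)
.then( function(v){
    log.push( "step: " + v );
    // throwing here rejects the promise that this
    // `then(..)` call returned
    throw new Error( "Boom" );
} )
.then(
    function fulfilled(){
        log.push( "skipped" );
    },
    function rejected(err){
        log.push( "caught: " + err.message );
        return log;
    }
);
```

Once `chain` fulfills, `log` should hold the three entries in order: the handled `"Oops"`, the delayed `42`, and the caught `"Boom"`.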
+ +If you call `then(..)` on a promise, and you only pass a fulfillment handler to it, an assumed rejection handler is substituted: + +```js +var p = new Promise( function(resolve,reject){ + reject( "Oops" ); +} ); + +var p2 = p.then( + function fulfilled(){ + // never gets here + } + // assumed rejection handler, if omitted or + // any other non-function value passed + // function(err) { + // throw err; + // } +); +``` + +As you can see, the assumed rejection handler simply rethrows the error, which ends up forcing `p2` (the chained promise) to reject with the same error reason. In essence, this allows the error to continue propagating along a Promise chain until an explicitly defined rejection handler is encountered. + +**Note:** We'll cover more details of error handling with Promises a little later, because there are other nuanced details to be concerned about. + +If a proper valid function is not passed as the fulfillment handler parameter to `then(..)`, there's also a default handler substituted: + +```js +var p = Promise.resolve( 42 ); + +p.then( + // assumed fulfillment handler, if omitted or + // any other non-function value passed + // function(v) { + // return v; + // } + null, + function rejected(err){ + // never gets here + } +); +``` + +As you can see, the default fulfillment handler simply passes whatever value it receives along to the next step (Promise). + +**Note:** The `then(null,function(err){ .. })` pattern -- only handling rejections (if any) but letting fulfillments pass through -- has a shortcut in the API: `catch(function(err){ .. })`. We'll cover `catch(..)` more fully in the next section. + +Let's review briefly the intrinsic behaviors of Promises that enable chaining flow control: + +* A `then(..)` call against one Promise automatically produces a new Promise to return from the call. +* Inside the fulfillment/rejection handlers, if you return a value or an exception is thrown, the new returned (chainable) Promise is resolved accordingly. 
+* If the fulfillment or rejection handler returns a Promise, it is unwrapped, so that whatever its resolution is will become the resolution of the chained Promise returned from the current `then(..)`. + +While chaining flow control is helpful, it's probably most accurate to think of it as a side benefit of how Promises compose (combine) together, rather than the main intent. As we've discussed in detail several times already, Promises normalize asynchrony and encapsulate time-dependent value state, and *that* is what lets us chain them together in this useful way. + +Certainly, the sequential expressiveness of the chain (this-then-this-then-this...) is a big improvement over the tangled mess of callbacks as we identified in Chapter 2. But there's still a fair amount of boilerplate (`then(..)` and `function(){ .. }`) to wade through. In the next chapter, we'll see a significantly nicer pattern for sequential flow control expressivity, with generators. + +### Terminology: Resolve, Fulfill, and Reject + +There's some slight confusion around the terms "resolve," "fulfill," and "reject" that we need to clear up, before you get too much deeper into learning about Promises. Let's first consider the `Promise(..)` constructor: + +```js +var p = new Promise( function(X,Y){ + // X() for fulfillment + // Y() for rejection +} ); +``` + +As you can see, two callbacks (here labeled `X` and `Y`) are provided. The first is *usually* used to mark the Promise as fulfilled, and the second *always* marks the Promise as rejected. But what's the "usually" about, and what does that imply about accurately naming those parameters? + +Ultimately, it's just your user code and the identifier names aren't interpreted by the engine to mean anything, so it doesn't *technically* matter; `foo(..)` and `bar(..)` are equally functional. But the words you use can affect not only how you are thinking about the code, but how other developers on your team will think about it. 
Thinking wrongly about carefully orchestrated async code is almost surely going to be worse than the spaghetti-callback alternatives. + +So it actually does kind of matter what you call them. + +The second parameter is easy to decide. Almost all literature uses `reject(..)` as its name, and because that's exactly (and only!) what it does, that's a very good choice for the name. I'd strongly recommend you always use `reject(..)`. + +But there's a little more ambiguity around the first parameter, which in Promise literature is often labeled `resolve(..)`. That word is obviously related to "resolution," which is what's used across the literature (including this book) to describe setting a final value/state to a Promise. We've already used "resolve the Promise" several times to mean either fulfilling or rejecting the Promise. + +But if this parameter seems to be used to specifically fulfill the Promise, why shouldn't we call it `fulfill(..)` instead of `resolve(..)` to be more accurate? To answer that question, let's also take a look at two of the `Promise` API methods: + +```js +var fulfilledPr = Promise.resolve( 42 ); + +var rejectedPr = Promise.reject( "Oops" ); +``` + +`Promise.resolve(..)` creates a Promise that's resolved to the value given to it. In this example, `42` is a normal, non-Promise, non-thenable value, so the fulfilled promise `fulfilledPr` is created for the value `42`. `Promise.reject("Oops")` creates the rejected promise `rejectedPr` for the reason `"Oops"`. 
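To see those two outcomes side by side, here is a small sketch. The `same` and `outcomes` names are only illustrative, and the identity check assumes the native `Promise` implementation:

```js
var fulfilledPr = Promise.resolve( 42 );
var rejectedPr = Promise.reject( "Oops" );

// a genuine native Promise passed to `Promise.resolve(..)`
// comes back as the very same object
var same = Promise.resolve( fulfilledPr ) === fulfilledPr;

// observe each promise through the matching handler
var outcomes = Promise.all( [
    fulfilledPr.then( function(v){
        return "fulfilled: " + v;
    } ),
    rejectedPr.then( null, function(err){
        return "rejected: " + err;
    } )
] );
```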
+ +Let's now illustrate why the word "resolve" (such as in `Promise.resolve(..)`) is unambiguous and indeed more accurate, if used explicitly in a context that could result in either fulfillment or rejection: + +```js +var rejectedTh = { + then: function(resolved,rejected) { + rejected( "Oops" ); + } +}; + +var rejectedPr = Promise.resolve( rejectedTh ); +``` + +As we discussed earlier in this chapter, `Promise.resolve(..)` will return a received genuine Promise directly, or unwrap a received thenable. If that thenable unwrapping reveals a rejected state, the Promise returned from `Promise.resolve(..)` is in fact in that same rejected state. + +So `Promise.resolve(..)` is a good, accurate name for the API method, because it can actually result in either fulfillment or rejection. + +The first callback parameter of the `Promise(..)` constructor will unwrap either a thenable (identically to `Promise.resolve(..)`) or a genuine Promise: + +```js +var rejectedPr = new Promise( function(resolve,reject){ + // resolve this promise with a rejected promise + resolve( Promise.reject( "Oops" ) ); +} ); + +rejectedPr.then( + function fulfilled(){ + // never gets here + }, + function rejected(err){ + console.log( err ); // "Oops" + } +); +``` + +It should be clear now that `resolve(..)` is the appropriate name for the first callback parameter of the `Promise(..)` constructor. + +**Warning:** The previously mentioned `reject(..)` does **not** do the unwrapping that `resolve(..)` does. If you pass a Promise/thenable value to `reject(..)`, that untouched value will be set as the rejection reason. A subsequent rejection handler would receive the actual Promise/thenable you passed to `reject(..)`, not its underlying immediate value. + +But now let's turn our attention to the callbacks provided to `then(..)`. What should they be called (both in literature and in code)? 
I would suggest `fulfilled(..)` and `rejected(..)`: + +```js +function fulfilled(msg) { + console.log( msg ); +} + +function rejected(err) { + console.error( err ); +} + +p.then( + fulfilled, + rejected +); +``` + +In the case of the first parameter to `then(..)`, it's unambiguously always the fulfillment case, so there's no need for the duality of "resolve" terminology. As a side note, the ES6 specification uses `onFulfilled(..)` and `onRejected(..)` to label these two callbacks, so they are accurate terms. + +## Error Handling + +We've already seen several examples of how Promise rejection -- either intentional through calling `reject(..)` or accidental through JS exceptions -- allows saner error handling in asynchronous programming. Let's circle back though and be explicit about some of the details that we glossed over. + +The most natural form of error handling for most developers is the synchronous `try..catch` construct. Unfortunately, it's synchronous-only, so it fails to help in async code patterns: + +```js +function foo() { + setTimeout( function(){ + baz.bar(); + }, 100 ); +} + +try { + foo(); + // later throws global error from `baz.bar()` +} +catch (err) { + // never gets here +} +``` + +`try..catch` would certainly be nice to have, but it doesn't work across async operations. That is, unless there's some additional environmental support, which we'll come back to with generators in Chapter 4. + +In callbacks, some standards have emerged for patterned error handling, most notably the "error-first callback" style: + +```js +function foo(cb) { + setTimeout( function(){ + try { + var x = baz.bar(); + cb( null, x ); // success! + } + catch (err) { + cb( err ); + } + }, 100 ); +} + +foo( function(err,val){ + if (err) { + console.error( err ); // bummer :( + } + else { + console.log( val ); + } +} ); +``` + +**Note:** The `try..catch` here works only from the perspective that the `baz.bar()` call will either succeed or fail immediately, synchronously. 
If `baz.bar()` was itself its own async completing function, any async errors inside it would not be catchable. + +The callback we pass to `foo(..)` expects to receive a signal of an error by the reserved first parameter `err`. If present, error is assumed. If not, success is assumed. + +This sort of error handling is technically *async capable*, but it doesn't compose well at all. Multiple levels of error-first callbacks woven together with these ubiquitous `if` statement checks inevitably will lead you to the perils of callback hell (see Chapter 2). + +So we come back to error handling in Promises, with the rejection handler passed to `then(..)`. Promises don't use the popular "error-first callback" design style, but instead use "split callbacks" style; there's one callback for fulfillment and one for rejection: + +```js +var p = Promise.reject( "Oops" ); + +p.then( + function fulfilled(){ + // never gets here + }, + function rejected(err){ + console.log( err ); // "Oops" + } +); +``` + +While this pattern of error handling makes fine sense on the surface, the nuances of Promise error handling are often a fair bit more difficult to fully grasp. + +Consider: + +```js +var p = Promise.resolve( 42 ); + +p.then( + function fulfilled(msg){ + // numbers don't have string functions, + // so will throw an error + console.log( msg.toLowerCase() ); + }, + function rejected(err){ + // never gets here + } +); +``` + +If the `msg.toLowerCase()` legitimately throws an error (it does!), why doesn't our error handler get notified? As we explained earlier, it's because *that* error handler is for the `p` promise, which has already been fulfilled with value `42`. The `p` promise is immutable, so the only promise that can be notified of the error is the one returned from `p.then(..)`, which in this case we don't capture. + +That should paint a clear picture of why error handling with Promises is error-prone (pun intended). 
It's far too easy to have errors swallowed, as this is very rarely what you'd intend. + +**Warning:** If you use the Promise API in an invalid way and an error occurs that prevents proper Promise construction, the result will be an immediately thrown exception, **not a rejected Promise**. Some examples of incorrect usage that fail Promise construction: `new Promise(null)`, `Promise.all()`, `Promise.race(42)`, and so on. You can't get a rejected Promise if you don't use the Promise API validly enough to actually construct a Promise in the first place! + +### Pit of Despair + +Jeff Atwood noted years ago: programming languages are often set up in such a way that by default, developers fall into the "pit of despair" (http://blog.codinghorror.com/falling-into-the-pit-of-success/) -- where accidents are punished -- and that you have to try harder to get it right. He implored us to instead create a "pit of success," where by default you fall into expected (successful) action, and thus would have to try hard to fail. + +Promise error handling is unquestionably "pit of despair" design. By default, it assumes that you want any error to be swallowed by the Promise state, and if you forget to observe that state, the error silently languishes/dies in obscurity -- usually despair. + +To avoid losing an error to the silence of a forgotten/discarded Promise, some developers have claimed that a "best practice" for Promise chains is to always end your chain with a final `catch(..)`, like: + +```js +var p = Promise.resolve( 42 ); + +p.then( + function fulfilled(msg){ + // numbers don't have string functions, + // so will throw an error + console.log( msg.toLowerCase() ); + } +) +.catch( handleErrors ); +``` + +Because we didn't pass a rejection handler to the `then(..)`, the default handler was substituted, which simply propagates the error to the next promise in the chain. 
As such, both errors that come into `p`, and errors that come *after* `p` in its resolution (like the `msg.toLowerCase()` one) will filter down to the final `handleErrors(..)`. + +Problem solved, right? Not so fast! + +What happens if `handleErrors(..)` itself also has an error in it? Who catches that? There's still yet another unattended promise: the one `catch(..)` returns, which we don't capture and don't register a rejection handler for. + +You can't just stick another `catch(..)` on the end of that chain, because it too could fail. The last step in any Promise chain, whatever it is, always has the possibility, even decreasingly so, of dangling with an uncaught error stuck inside an unobserved Promise. + +Sound like an impossible conundrum yet? + +### Uncaught Handling + +It's not exactly an easy problem to solve completely. There are other ways to approach it which many would say are *better*. + +Some Promise libraries have added methods for registering something like a "global unhandled rejection" handler, which would be called instead of a globally thrown error. But their solution for how to identify an error as "uncaught" is to have an arbitrary-length timer, say 3 seconds, running from time of rejection. If a Promise is rejected but no error handler is registered before the timer fires, then it's assumed that you won't ever be registering a handler, so it's "uncaught." + +In practice, this has worked well for many libraries, as most usage patterns don't typically call for significant delay between Promise rejection and observation of that rejection. But this pattern is troublesome because 3 seconds is so arbitrary (even if empirical), and also because there are indeed some cases where you want a Promise to hold on to its rejectedness for some indefinite period of time, and you don't really want to have your "uncaught" handler called for all those false positives (not-yet-handled "uncaught errors"). 
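To make the timer idea concrete, here is one rough way such a check might work. This is only a sketch under stated assumptions, not how any particular library does it: `watchRejection(..)`, `report(..)`, and the 20ms window are all hypothetical, and only handlers registered through the returned wrapper count as "observing" the rejection:

```js
// hypothetical helper: call `report(..)` with `pr`'s rejection
// reason if no rejection handler has been registered through
// the wrapper within `ms` of the rejection
function watchRejection(pr,ms,report) {
    var observed = false;

    // side-observe the rejection and start the timer
    pr.then( null, function(err){
        setTimeout( function(){
            if (!observed) report( err );
        }, ms );
    } );

    return {
        then: function(fulfilled,rejected){
            if (typeof rejected == "function") observed = true;
            return pr.then( fulfilled, rejected );
        }
    };
}

var reports = [];
function report(err) {
    reports.push( err );
}

// never observed: reported once the timer fires
watchRejection( Promise.reject( "forgotten" ), 20, report );

// observed right away: never reported
watchRejection( Promise.reject( "handled" ), 20, report )
.then( null, function(err){
    // deal with the error here
} );
```

The weakness described above shows up directly in this sketch: a handler registered through the wrapper after the window has passed still receives the error, but by then `report(..)` has already fired for it.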
+ +Another more common suggestion is that Promises should have a `done(..)` added to them, which essentially marks the Promise chain as "done." `done(..)` doesn't create and return a Promise, so the callbacks passed to `done(..)` are obviously not wired up to report problems to a chained Promise that doesn't exist. + +So what happens instead? It's treated as you might usually expect in uncaught error conditions: any exception inside a `done(..)` rejection handler would be thrown as a global uncaught error (in the developer console, basically): + +```js +var p = Promise.resolve( 42 ); + +p.then( + function fulfilled(msg){ + // numbers don't have string functions, + // so will throw an error + console.log( msg.toLowerCase() ); + } +) +.done( null, handleErrors ); + +// if `handleErrors(..)` caused its own exception, it would +// be thrown globally here +``` + +This might sound more attractive than the never-ending chain or the arbitrary timeouts. But the biggest problem is that it's not part of the ES6 standard, so no matter how good it sounds, at best it's a lot longer way off from being a reliable and ubiquitous solution. + +Are we just stuck, then? Not entirely. + +Browsers have a unique capability that our code does not have: they can track and know for sure when any object gets thrown away and garbage collected. So, browsers can track Promise objects, and whenever they get garbage collected, if they have a rejection in them, the browser knows for sure this was a legitimate "uncaught error," and can thus confidently know it should report it to the developer console. + +**Note:** At the time of this writing, both Chrome and Firefox have early attempts at that sort of "uncaught rejection" capability, though support is incomplete at best. 
+ +However, if a Promise doesn't get garbage collected -- it's exceedingly easy for that to accidentally happen through lots of different coding patterns -- the browser's garbage collection sniffing won't help you know and diagnose that you have a silently rejected Promise laying around. + +Is there any other alternative? Yes. + +### Pit of Success + +The following is just theoretical, how Promises *could* be someday changed to behave. I believe it would be far superior to what we currently have. And I think this change would be possible even post-ES6 because I don't think it would break web compatibility with ES6 Promises. Moreover, it can be polyfilled/prollyfilled in, if you're careful. Let's take a look: + +* Promises could default to reporting (to the developer console) any rejection, on the next Job or event loop tick, if at that exact moment no error handler has been registered for the Promise. +* For the cases where you want a rejected Promise to hold onto its rejected state for an indefinite amount of time before observing, you could call `defer()`, which suppresses automatic error reporting on that Promise. + +If a Promise is rejected, it defaults to noisily reporting that fact to the developer console (instead of defaulting to silence). You can opt out of that reporting either implicitly (by registering an error handler before rejection), or explicitly (with `defer()`). In either case, *you* control the false positives. + +Consider: + +```js +var p = Promise.reject( "Oops" ).defer(); + +// `foo(..)` is Promise-aware +foo( 42 ) +.then( + function fulfilled(){ + return p; + }, + function rejected(err){ + // handle `foo(..)` error + } +); +... +``` + +When we create `p`, we know we're going to wait a while to use/observe its rejection, so we call `defer()` -- thus no global reporting. `defer()` simply returns the same promise, for chaining purposes. 
+ +The promise returned from `foo(..)` gets an error handler attached *right away*, so it's implicitly opted out and no global reporting for it occurs either. + +But the promise returned from the `then(..)` call has no `defer()` or error handler attached, so if it rejects (from inside either resolution handler), then *it* will be reported to the developer console as an uncaught error. + +**This design is a pit of success.** By default, all errors are either handled or reported -- what almost all developers in almost all cases would expect. You either have to register a handler or you have to intentionally opt out, and indicate you intend to defer error handling until *later*; you're opting for the extra responsibility in just that specific case. + +The only real danger in this approach is if you `defer()` a Promise but then fail to actually ever observe/handle its rejection. + +But you had to intentionally call `defer()` to opt into that pit of despair -- the default was the pit of success -- so there's not much else we could do to save you from your own mistakes. + +I think there's still hope for Promise error handling (post-ES6). I hope the powers that be will rethink the situation and consider this alternative. In the meantime, you can implement this yourself (a challenging exercise for the reader!), or use a *smarter* Promise library that does so for you! + +**Note:** This exact model for error handling/reporting is implemented in my *asynquence* Promise abstraction library, which will be discussed in Appendix A of this book. + +## Promise Patterns + +We've already implicitly seen the sequence pattern with Promise chains (this-then-this-then-that flow control) but there are lots of variations on asynchronous patterns that we can build as abstractions on top of Promises. These patterns serve to simplify the expression of async flow control -- which helps make our code more reason-able and more maintainable -- even in the most complex parts of our programs. 
+ +Two such patterns are codified directly into the native ES6 `Promise` implementation, so we get them for free, to use as building blocks for other patterns. + +### Promise.all([ .. ]) + +In an async sequence (Promise chain), only one async task is being coordinated at any given moment -- step 2 strictly follows step 1, and step 3 strictly follows step 2. But what about doing two or more steps concurrently (aka "in parallel")? + +In classic programming terminology, a "gate" is a mechanism that waits on two or more parallel/concurrent tasks to complete before continuing. It doesn't matter what order they finish in, just that all of them have to complete for the gate to open and let the flow control through. + +In the Promise API, we call this pattern `all([ .. ])`. + +Say you wanted to make two Ajax requests at the same time, and wait for both to finish, regardless of their order, before making a third Ajax request. Consider: + +```js +// `request(..)` is a Promise-aware Ajax utility, +// like we defined earlier in the chapter + +var p1 = request( "http://some.url.1/" ); +var p2 = request( "http://some.url.2/" ); + +Promise.all( [p1,p2] ) +.then( function(msgs){ + // both `p1` and `p2` fulfill and pass in + // their messages here + return request( + "http://some.url.3/?v=" + msgs.join(",") + ); +} ) +.then( function(msg){ + console.log( msg ); +} ); +``` + +`Promise.all([ .. ])` expects a single argument, an `array`, consisting generally of Promise instances. The promise returned from the `Promise.all([ .. ])` call will receive a fulfillment message (`msgs` in this snippet) that is an `array` of all the fulfillment messages from the passed in promises, in the same order as specified (regardless of fulfillment order). + +**Note:** Technically, the `array` of values passed into `Promise.all([ .. ])` can include Promises, thenables, or even immediate values. 
Each value in the list is essentially passed through `Promise.resolve(..)` to make sure it's a genuine Promise to be waited on, so an immediate value will just be normalized into a Promise for that value. If the `array` is empty, the main Promise is immediately fulfilled. + +The main promise returned from `Promise.all([ .. ])` will only be fulfilled if and when all its constituent promises are fulfilled. If any one of those promises instead is rejected, the main `Promise.all([ .. ])` promise is immediately rejected, discarding all results from any other promises. + +Remember to always attach a rejection/error handler to every promise, even and especially the one that comes back from `Promise.all([ .. ])`. + +### Promise.race([ .. ]) + +While `Promise.all([ .. ])` coordinates multiple Promises concurrently and assumes all are needed for fulfillment, sometimes you only want to respond to the "first Promise to cross the finish line," letting the other Promises fall away. + +This pattern is classically called a "latch," but in Promises it's called a "race." + +**Warning:** While the metaphor of "only the first across the finish line wins" fits the behavior well, unfortunately "race" is kind of a loaded term, because "race conditions" are generally taken as bugs in programs (see Chapter 1). Don't confuse `Promise.race([ .. ])` with "race condition." + +`Promise.race([ .. ])` also expects a single `array` argument, containing one or more Promises, thenables, or immediate values. It doesn't make much practical sense to have a race with immediate values, because the first one listed will obviously win -- like a foot race where one runner starts at the finish line! + +Similar to `Promise.all([ .. ])`, `Promise.race([ .. ])` will fulfill if and when any Promise resolution is a fulfillment, and it will reject if and when any Promise resolution is a rejection. 
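Put another way, the main promise takes on whichever resolution arrives first, whether that is a fulfillment or a rejection. A short sketch of both cases follows; `settleLater(..)` and the variable names are hypothetical helpers for this illustration only:

```js
// hypothetical helper: settle after `ms`, fulfilling with `val`
// or (if `fail` is truthy) rejecting with it
function settleLater(ms,val,fail) {
    return new Promise( function(resolve,reject){
        setTimeout( function(){
            (fail ? reject : resolve)( val );
        }, ms );
    } );
}

// the fastest promise fulfills, so the race fulfills
var raceA = Promise.race( [
    settleLater( 10, "fast" ),
    settleLater( 40, "slow", true )
] );

// the fastest promise rejects, so the race rejects, even
// though the slower promise would have fulfilled
var raceB = Promise.race( [
    settleLater( 10, "fast oops", true ),
    settleLater( 40, "slow" )
] );

var raceResults = Promise.all( [
    raceA.then( function(v){
        return "fulfilled: " + v;
    } ),
    raceB.then( null, function(err){
        return "rejected: " + err;
    } )
] );
```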
+ +**Warning:** A "race" requires at least one "runner," so if you pass an empty `array`, instead of immediately resolving, the main `race([..])` Promise will never resolve. This is a footgun! ES6 should have specified that it either fulfills, rejects, or just throws some sort of synchronous error. Unfortunately, because of precedent in Promise libraries predating ES6 `Promise`, they had to leave this gotcha in there, so be careful never to send in an empty `array`. + +Let's revisit our previous concurrent Ajax example, but in the context of a race between `p1` and `p2`: + +```js +// `request(..)` is a Promise-aware Ajax utility, +// like we defined earlier in the chapter + +var p1 = request( "http://some.url.1/" ); +var p2 = request( "http://some.url.2/" ); + +Promise.race( [p1,p2] ) +.then( function(msg){ + // either `p1` or `p2` will win the race + return request( + "http://some.url.3/?v=" + msg + ); +} ) +.then( function(msg){ + console.log( msg ); +} ); +``` + +Because only one promise wins, the fulfillment value is a single message, not an `array` as it was for `Promise.all([ .. ])`. + +#### Timeout Race + +We saw this example earlier, illustrating how `Promise.race([ .. ])` can be used to express the "promise timeout" pattern: + +```js +// `foo()` is a Promise-aware function + +// `timeoutPromise(..)`, defined earlier, returns +// a Promise that rejects after a specified delay + +// set up a timeout for `foo()` +Promise.race( [ + foo(), // attempt `foo()` + timeoutPromise( 3000 ) // give it 3 seconds +] ) +.then( + function(){ + // `foo(..)` fulfilled in time! + }, + function(err){ + // either `foo()` rejected, or it just + // didn't finish in time, so inspect + // `err` to know which + } +); +``` + +This timeout pattern works well in most cases. But there are some nuances to consider, and frankly they apply to both `Promise.race([ .. ])` and `Promise.all([ .. ])` equally. 
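For a version of the timeout race that can actually be run, here is a self-contained sketch. It assumes a `timeoutPromise(..)` that rejects with the string `"Timeout!"`; the `foo(..)` stand-in, the `attempt(..)` wrapper, and the short delays are hypothetical and exist only to make the two outcomes observable:

```js
// assumption: `timeoutPromise(..)` rejects with "Timeout!"
// after `delay` milliseconds
function timeoutPromise(delay) {
    return new Promise( function(resolve,reject){
        setTimeout( function(){
            reject( "Timeout!" );
        }, delay );
    } );
}

// hypothetical stand-in for `foo()`: fulfills after `ms`
function foo(ms) {
    return new Promise( function(resolve){
        setTimeout( function(){
            resolve( "foo done" );
        }, ms );
    } );
}

// race `foo(..)` against a 30ms timeout, and inspect `err`
// to tell a timeout apart from a `foo(..)` rejection
function attempt(ms) {
    return Promise.race( [ foo( ms ), timeoutPromise( 30 ) ] )
    .then(
        function(msg){
            return msg;
        },
        function(err){
            if (err === "Timeout!") return "timed out";
            return "foo failed: " + err;
        }
    );
}

var attempts = Promise.all( [ attempt( 5 ), attempt( 60 ) ] );
```

Note that in the slow case, the `foo(60)` promise still fulfills after the race has already been decided. Nothing cancels it; its result is just ignored.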
+ +#### "Finally" + +The key question to ask is, "What happens to the promises that get discarded/ignored?" We're not asking that question from the performance perspective -- they would typically end up garbage collection eligible -- but from the behavioral perspective (side effects, etc.). Promises cannot be canceled -- and shouldn't be as that would destroy the external immutability trust discussed in the "Promise Uncancelable" section later in this chapter -- so they can only be silently ignored. + +But what if `foo()` in the previous example is reserving some sort of resource for usage, but the timeout fires first and causes that promise to be ignored? Is there anything in this pattern that proactively frees the reserved resource after the timeout, or otherwise cancels any side effects it may have had? What if all you wanted was to log the fact that `foo()` timed out? + +Some developers have proposed that Promises need a `finally(..)` callback registration, which is always called when a Promise resolves, and allows you to specify any cleanup that may be necessary. This doesn't exist in the specification at the moment, but it may come in ES7+. We'll have to wait and see. + +It might look like: + +```js +var p = Promise.resolve( 42 ); + +p.then( something ) +.finally( cleanup ) +.then( another ) +.finally( cleanup ); +``` + +**Note:** In various Promise libraries, `finally(..)` still creates and returns a new Promise (to keep the chain going). If the `cleanup(..)` function were to return a Promise, it would be linked into the chain, which means you could still have the unhandled rejection issues we discussed earlier. 
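A method along these lines was later standardized: `Promise.prototype.finally(..)` became part of the language in ES2018. As a brief sketch of its behavior in an environment that supports it (the variable names are illustrative only), the cleanup callback runs on either outcome, receives no arguments, and lets the original resolution pass through:

```js
var cleanups = [];

function cleanup() {
    // record how many arguments the callback was given
    cleanups.push( arguments.length );
}

var viaFulfill = Promise.resolve( 42 )
.finally( cleanup )
.then( function(v){
    return "fulfilled: " + v;
} );

var viaReject = Promise.reject( "Oops" )
.finally( cleanup )
.then( null, function(err){
    return "rejected: " + err;
} );

var finallyResults = Promise.all( [ viaFulfill, viaReject ] );
```

Because `finally(..)` returns a new promise, the caveat from the note above still applies: if `cleanup(..)` throws or returns a rejected promise, that rejection continues down the chain.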
+ +In the meantime, we could make a static helper utility that lets us observe (without interfering) the resolution of a Promise: + +```js +// polyfill-safe guard check +if (!Promise.observe) { + Promise.observe = function(pr,cb) { + // side-observe `pr`'s resolution + pr.then( + function fulfilled(msg){ + // schedule callback async (as Job) + Promise.resolve( msg ).then( cb ); + }, + function rejected(err){ + // schedule callback async (as Job) + Promise.resolve( err ).then( cb ); + } + ); + + // return original promise + return pr; + }; +} +``` + +Here's how we'd use it in the timeout example from before: + +```js +Promise.race( [ + Promise.observe( + foo(), // attempt `foo()` + function cleanup(msg){ + // clean up after `foo()`, even if it + // didn't finish before the timeout + } + ), + timeoutPromise( 3000 ) // give it 3 seconds +] ) +``` + +This `Promise.observe(..)` helper is just an illustration of how you could observe the completions of Promises without interfering with them. Other Promise libraries have their own solutions. Regardless of how you do it, you'll likely have places where you want to make sure your Promises aren't *just* silently ignored by accident. + +### Variations on all([ .. ]) and race([ .. ]) + +While native ES6 Promises come with built-in `Promise.all([ .. ])` and `Promise.race([ .. ])`, there are several other commonly used patterns with variations on those semantics: + +* `none([ .. ])` is like `all([ .. ])`, but fulfillments and rejections are transposed. All Promises need to be rejected -- rejections become the fulfillment values and vice versa. +* `any([ .. ])` is like `all([ .. ])`, but it ignores any rejections, so only one needs to fulfill instead of *all* of them. +* `first([ .. ])` is like a race with `any([ .. ])`: it ignores any rejections and fulfills as soon as the first Promise fulfills. +* `last([ .. ])` is like `first([ .. ])`, but only the latest fulfillment wins. 
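As one possible sketch of the first item in that list, `none([ .. ])` can be built by inverting each promise and handing the results to `Promise.all([ .. ])`. The standalone `none(..)` function and the `noneResults` variable are hypothetical names for this illustration, not part of any standard API:

```js
// hypothetical `none(..)`: fulfills with the list of rejection
// reasons if every promise rejects; rejects with the first
// fulfillment value otherwise
function none(prs) {
    return Promise.all(
        prs.map( function(pr){
            // normalize the value, then transpose its outcome
            return Promise.resolve( pr ).then(
                function fulfilled(v){
                    throw v;
                },
                function rejected(err){
                    return err;
                }
            );
        } )
    );
}

var noneResults = Promise.all( [
    // every promise rejects, so `none(..)` fulfills
    none( [ Promise.reject( "a" ), Promise.reject( "b" ) ] ),

    // one value fulfills, so `none(..)` rejects with it
    none( [ Promise.reject( "a" ), 1 ] )
    .then( null, function(v){
        return "rejected: " + v;
    } )
] );
```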
+ +Some Promise abstraction libraries provide these, but you could also define them yourself using the mechanics of Promises, `race([ .. ])` and `all([ .. ])`. + +For example, here's how we could define `first([ .. ])`: + +```js +// polyfill-safe guard check +if (!Promise.first) { + Promise.first = function(prs) { + return new Promise( function(resolve,reject){ + // loop through all promises + prs.forEach( function(pr){ + // normalize the value + Promise.resolve( pr ) + // whichever one fulfills first wins, and + // gets to resolve the main promise + .then( resolve ); + } ); + } ); + }; +} +``` + +**Note:** This implementation of `first(..)` does not reject if all its promises reject; it simply hangs, much like a `Promise.race([])` does. If desired, you could add additional logic to track each promise rejection and if all reject, call `reject()` on the main promise. We'll leave that as an exercise for the reader. + +### Concurrent Iterations + +Sometimes you want to iterate over a list of Promises and perform some task against all of them, much like you can do with synchronous `array`s (e.g., `forEach(..)`, `map(..)`, `some(..)`, and `every(..)`). If the task to perform against each Promise is fundamentally synchronous, these work fine, just as we used `forEach(..)` in the previous snippet. + +But if the tasks are fundamentally asynchronous, or can/should otherwise be performed concurrently, you can use async versions of these utilities as provided by many libraries. + +For example, let's consider an asynchronous `map(..)` utility that takes an `array` of values (could be Promises or anything else), plus a function (task) to perform against each. 
`map(..)` itself returns a promise whose fulfillment value is an `array` that holds (in the same mapping order) the async fulfillment value from each task: + +```js +if (!Promise.map) { + Promise.map = function(vals,cb) { + // new promise that waits for all mapped promises + return Promise.all( + // note: regular array `map(..)`, turns + // the array of values into an array of + // promises + vals.map( function(val){ + // replace `val` with a new promise that + // resolves after `val` is async mapped + return new Promise( function(resolve){ + cb( val, resolve ); + } ); + } ) + ); + }; +} +``` + +**Note:** In this implementation of `map(..)`, you can't signal async rejection, but if a synchronous exception/error occurs inside of the mapping callback (`cb(..)`), the main `Promise.map(..)` returned promise would reject. + +Let's illustrate using `map(..)` with a list of Promises (instead of simple values): + +```js +var p1 = Promise.resolve( 21 ); +var p2 = Promise.resolve( 42 ); +var p3 = Promise.reject( "Oops" ); + +// double values in list even if they're in +// Promises +Promise.map( [p1,p2,p3], function(pr,done){ + // make sure the item itself is a Promise + Promise.resolve( pr ) + .then( + // extract value as `v` + function(v){ + // map fulfillment `v` to new value + done( v * 2 ); + }, + // or, map to promise rejection message + done + ); +} ) +.then( function(vals){ + console.log( vals ); // [42,84,"Oops"] +} ); +``` + +## Promise API Recap + +Let's review the ES6 `Promise` API that we've already seen unfold in bits and pieces throughout this chapter. + +**Note:** The following API is native only as of ES6, but there are specification-compliant polyfills (not just extended Promise libraries) which can define `Promise` and all its associated behavior so that you can use native Promises even in pre-ES6 browsers. One such polyfill is "Native Promise Only" (http://github.com/getify/native-promise-only), which I wrote! + +### new Promise(..) 
Constructor + +The *revealing constructor* `Promise(..)` must be used with `new`, and must be provided a function callback that is synchronously/immediately called. This function is passed two function callbacks that act as resolution capabilities for the promise. We commonly label these `resolve(..)` and `reject(..)`: + +```js +var p = new Promise( function(resolve,reject){ + // `resolve(..)` to resolve/fulfill the promise + // `reject(..)` to reject the promise +} ); +``` + +`reject(..)` simply rejects the promise, but `resolve(..)` can either fulfill the promise or reject it, depending on what it's passed. If `resolve(..)` is passed an immediate, non-Promise, non-thenable value, then the promise is fulfilled with that value. + +But if `resolve(..)` is passed a genuine Promise or thenable value, that value is unwrapped recursively, and whatever its final resolution/state is will be adopted by the promise. + +### Promise.resolve(..) and Promise.reject(..) + +A shortcut for creating an already-rejected Promise is `Promise.reject(..)`, so these two promises are equivalent: + +```js +var p1 = new Promise( function(resolve,reject){ + reject( "Oops" ); +} ); + +var p2 = Promise.reject( "Oops" ); +``` + +`Promise.resolve(..)` is usually used to create an already-fulfilled Promise in a similar way to `Promise.reject(..)`. However, `Promise.resolve(..)` also unwraps thenable values (as discussed several times already). 
In that case, the Promise returned adopts the final resolution of the thenable you passed in, which could either be fulfillment or rejection: + +```js +var fulfilledTh = { + then: function(cb) { cb( 42 ); } +}; +var rejectedTh = { + then: function(cb,errCb) { + errCb( "Oops" ); + } +}; + +var p1 = Promise.resolve( fulfilledTh ); +var p2 = Promise.resolve( rejectedTh ); + +// `p1` will be a fulfilled promise +// `p2` will be a rejected promise +``` + +And remember, `Promise.resolve(..)` doesn't do anything if what you pass is already a genuine Promise; it just returns the value directly. So there's no overhead to calling `Promise.resolve(..)` on values that you don't know the nature of, if one happens to already be a genuine Promise. + +### then(..) and catch(..) + +Each Promise instance (**not** the `Promise` API namespace) has `then(..)` and `catch(..)` methods, which allow registering of fulfillment and rejection handlers for the Promise. Once the Promise is resolved, one or the other of these handlers will be called, but not both, and it will always be called asynchronously (see "Jobs" in Chapter 1). + +`then(..)` takes one or two parameters, the first for the fulfillment callback, and the second for the rejection callback. If either is omitted or is otherwise passed as a non-function value, a default callback is substituted respectively. The default fulfillment callback simply passes the message along, while the default rejection callback simply rethrows (propagates) the error reason it receives. + +`catch(..)` takes only the rejection callback as a parameter, and automatically substitutes the default fulfillment callback, as just discussed. In other words, it's equivalent to `then(null,..)`: + +```js +p.then( fulfilled ); + +p.then( fulfilled, rejected ); + +p.catch( rejected ); // or `p.then( null, rejected )` +``` + +`then(..)` and `catch(..)` also create and return a new promise, which can be used to express Promise chain flow control. 
If the fulfillment or rejection callbacks have an exception thrown, the returned promise is rejected. If either callback returns an immediate, non-Promise, non-thenable value, that value is set as the fulfillment for the returned promise. If the fulfillment handler specifically returns a promise or thenable value, that value is unwrapped and becomes the resolution of the returned promise. + +### Promise.all([ .. ]) and Promise.race([ .. ]) + +The static helpers `Promise.all([ .. ])` and `Promise.race([ .. ])` on the ES6 `Promise` API both create a Promise as their return value. The resolution of that promise is controlled entirely by the array of promises that you pass in. + +For `Promise.all([ .. ])`, all the promises you pass in must fulfill for the returned promise to fulfill. If any promise is rejected, the main returned promise is immediately rejected, too (discarding the results of any of the other promises). For fulfillment, you receive an `array` of all the passed in promises' fulfillment values. For rejection, you receive just the first promise rejection reason value. This pattern is classically called a "gate": all must arrive before the gate opens. + +For `Promise.race([ .. ])`, only the first promise to resolve (fulfillment or rejection) "wins," and whatever that resolution is becomes the resolution of the returned promise. This pattern is classically called a "latch": first one to open the latch gets through. Consider: + +```js +var p1 = Promise.resolve( 42 ); +var p2 = Promise.resolve( "Hello World" ); +var p3 = Promise.reject( "Oops" ); + +Promise.race( [p1,p2,p3] ) +.then( function(msg){ + console.log( msg ); // 42 +} ); + +Promise.all( [p1,p2,p3] ) +.catch( function(err){ + console.error( err ); // "Oops" +} ); + +Promise.all( [p1,p2] ) +.then( function(msgs){ + console.log( msgs ); // [42,"Hello World"] +} ); +``` + +**Warning:** Be careful! If an empty `array` is passed to `Promise.all([ .. ])`, it will fulfill immediately, but `Promise.race([ .. 
])` will hang forever and never resolve. + +The ES6 `Promise` API is pretty simple and straightforward. It's at least good enough to serve the most basic of async cases, and is a good place to start when rearranging your code from callback hell to something better. + +But there's a whole lot of async sophistication that apps often demand which Promises themselves will be limited in addressing. In the next section, we'll dive into those limitations as motivations for the benefit of Promise libraries. + +## Promise Limitations + +Many of the details we'll discuss in this section have already been alluded to in this chapter, but we'll just make sure to review these limitations specifically. + +### Sequence Error Handling + +We covered Promise-flavored error handling in detail earlier in this chapter. The limitations of how Promises are designed -- how they chain, specifically -- create a very easy pitfall where an error in a Promise chain can be silently ignored accidentally. + +But there's something else to consider with Promise errors. Because a Promise chain is nothing more than its constituent Promises wired together, there's no entity to refer to the entire chain as a single *thing*, which means there's no external way to observe any errors that may occur. + +If you construct a Promise chain that has no error handling in it, any error anywhere in the chain will propagate indefinitely down the chain, until observed (by registering a rejection handler at some step). 
So, in that specific case, having a reference to the *last* promise in the chain is enough (`p` in the following snippet), because you can register a rejection handler there, and it will be notified of any propagated errors: + +```js +// `foo(..)`, `STEP2(..)` and `STEP3(..)` are +// all promise-aware utilities + +var p = foo( 42 ) +.then( STEP2 ) +.then( STEP3 ); +``` + +Although it may seem sneakily confusing, `p` here doesn't point to the first promise in the chain (the one from the `foo(42)` call), but instead to the last promise, the one that comes from the `then(STEP3)` call. + +Also, no step in the promise chain is observably doing its own error handling. That means that you could then register a rejection error handler on `p`, and it would be notified if any errors occur anywhere in the chain: + +```js +p.catch( handleErrors ); +``` + +But if any step of the chain in fact does its own error handling (perhaps hidden/abstracted away from what you can see), your `handleErrors(..)` won't be notified. This may be what you want -- it was, after all, a "handled rejection" -- but it also may *not* be what you want. The complete lack of ability to be notified (of "already handled" rejection errors) is a limitation that restricts capabilities in some use cases. + +It's basically the same limitation that exists with a `try..catch` that can catch an exception and simply swallow it. So this isn't a limitation **unique to Promises**, but it *is* something we might wish to have a workaround for. + +Unfortunately, many times there is no reference kept for the intermediate steps in a Promise-chain sequence, so without such references, you cannot attach error handlers to reliably observe the errors. + +### Single Value + +Promises by definition only have a single fulfillment value or a single rejection reason. In simple examples, this isn't that big of a deal, but in more sophisticated scenarios, you may find this limiting. 
+ +The typical advice is to construct a values wrapper (such as an `object` or `array`) to contain these multiple messages. This solution works, but it can be quite awkward and tedious to wrap and unwrap your messages with every single step of your Promise chain. + +#### Splitting Values + +Sometimes you can take this as a signal that you could/should decompose the problem into two or more Promises. + +Imagine you have a utility `foo(..)` that produces two values (`x` and `y`) asynchronously: + +```js +function getY(x) { + return new Promise( function(resolve,reject){ + setTimeout( function(){ + resolve( (3 * x) - 1 ); + }, 100 ); + } ); +} + +function foo(bar,baz) { + var x = bar * baz; + + return getY( x ) + .then( function(y){ + // wrap both values into container + return [x,y]; + } ); +} + +foo( 10, 20 ) +.then( function(msgs){ + var x = msgs[0]; + var y = msgs[1]; + + console.log( x, y ); // 200 599 +} ); +``` + +First, let's rearrange what `foo(..)` returns so that we don't have to wrap `x` and `y` into a single `array` value to transport through one Promise. Instead, we can wrap each value into its own promise: + +```js +function foo(bar,baz) { + var x = bar * baz; + + // return both promises + return [ + Promise.resolve( x ), + getY( x ) + ]; +} + +Promise.all( + foo( 10, 20 ) +) +.then( function(msgs){ + var x = msgs[0]; + var y = msgs[1]; + + console.log( x, y ); +} ); +``` + +Is an `array` of promises really better than an `array` of values passed through a single promise? Syntactically, it's not much of an improvement. + +But this approach more closely embraces the Promise design theory. It's now easier in the future to refactor to split the calculation of `x` and `y` into separate functions. It's cleaner and more flexible to let the calling code decide how to orchestrate the two promises -- using `Promise.all([ .. ])` here, but certainly not the only option -- rather than to abstract such details away inside of `foo(..)`. 
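To make that flexibility concrete, here's a sketch of calling code that observes the two promises independently instead of gating on both, so `x` can be used as soon as it's ready. `getY(..)` and `foo(..)` are repeated from above so the snippet stands alone; the `parts` and `seen` names are just illustrative stand-ins:

```js
function getY(x) {
	return new Promise( function(resolve,reject){
		setTimeout( function(){
			resolve( (3 * x) - 1 );
		}, 100 );
	} );
}

function foo(bar,baz) {
	var x = bar * baz;

	// return both promises
	return [
		Promise.resolve( x ),
		getY( x )
	];
}

var seen = [];
var parts = foo( 10, 20 );

// react to `x` right away, without waiting on `y`
parts[0].then( function(x){
	seen.push( "x: " + x );
} );

// react to `y` whenever it arrives
parts[1].then( function(y){
	seen.push( "y: " + y );
} );
```

The same `foo(..)` still works unchanged with `Promise.all([ .. ])` when the caller does want to wait for both values.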
+ +#### Unwrap/Spread Arguments + +The `var x = ..` and `var y = ..` assignments are still awkward overhead. We can employ some functional trickery (hat tip to Reginald Braithwaite, @raganwald on Twitter) in a helper utility: + +```js +function spread(fn) { + return Function.apply.bind( fn, null ); +} + +Promise.all( + foo( 10, 20 ) +) +.then( + spread( function(x,y){ + console.log( x, y ); // 200 599 + } ) +) +``` + +That's a bit nicer! Of course, you could inline the functional magic to avoid the extra helper: + +```js +Promise.all( + foo( 10, 20 ) +) +.then( Function.apply.bind( + function(x,y){ + console.log( x, y ); // 200 599 + }, + null +) ); +``` + +These tricks may be neat, but ES6 has an even better answer for us: destructuring. The array destructuring assignment form looks like this: + +```js +Promise.all( + foo( 10, 20 ) +) +.then( function(msgs){ + var [x,y] = msgs; + + console.log( x, y ); // 200 599 +} ); +``` + +But best of all, ES6 offers the array parameter destructuring form: + +```js +Promise.all( + foo( 10, 20 ) +) +.then( function([x,y]){ + console.log( x, y ); // 200 599 +} ); +``` + +We've now embraced the one-value-per-Promise mantra, but kept our supporting boilerplate to a minimum! + +**Note:** For more information on ES6 destructuring forms, see the *ES6 & Beyond* title of this series. + +### Single Resolution + +One of the most intrinsic behaviors of Promises is that a Promise can only be resolved once (fulfillment or rejection). For many async use cases, you're only retrieving a value once, so this works fine. + +But there's also a lot of async cases that fit into a different model -- one that's more akin to events and/or streams of data. It's not clear on the surface how well Promises can fit into such use cases, if at all. Without a significant abstraction on top of Promises, they will completely fall short for handling multiple value resolution. 
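Here's a tiny sketch of what "resolved once" means in practice: any later call to `resolve(..)` or `reject(..)` on an already-resolved promise is silently ignored.

```js
var p = new Promise( function(resolve,reject){
	resolve( "first" );
	resolve( "second" );	// ignored
	reject( "Oops" );		// also ignored
} );

p.then( function(msg){
	console.log( msg );		// first
} );
```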
+ +Imagine a scenario where you might want to fire off a sequence of async steps in response to a stimulus (like an event) that can in fact happen multiple times, like a button click. + +This probably won't work the way you want: + +```js +// `click(..)` binds the `"click"` event to a DOM element +// `request(..)` is the previously defined Promise-aware Ajax + +var p = new Promise( function(resolve,reject){ + click( "#mybtn", resolve ); +} ); + +p.then( function(evt){ + var btnID = evt.currentTarget.id; + return request( "http://some.url.1/?id=" + btnID ); +} ) +.then( function(text){ + console.log( text ); +} ); +``` + +The behavior here only works if your application calls for the button to be clicked just once. If the button is clicked a second time, the `p` promise has already been resolved, so the second `resolve(..)` call would be ignored. + +Instead, you'd probably need to invert the paradigm, creating a whole new Promise chain for each event firing: + +```js +click( "#mybtn", function(evt){ + var btnID = evt.currentTarget.id; + + request( "http://some.url.1/?id=" + btnID ) + .then( function(text){ + console.log( text ); + } ); +} ); +``` + +This approach will *work* in that a whole new Promise sequence will be fired off for each `"click"` event on the button. + +But beyond just the ugliness of having to define the entire Promise chain inside the event handler, this design in some respects violates the idea of separation of concerns/capabilities (SoC). You might very well want to define your event handler in a different place in your code from where you define the *response* to the event (the Promise chain). That's pretty awkward to do in this pattern, without helper mechanisms. + +**Note:** Another way of articulating this limitation is that it'd be nice if we could construct some sort of "observable" that we can subscribe a Promise chain to. 
There are libraries that have created these abstractions (such as RxJS -- http://rxjs.codeplex.com/), but the abstractions can seem so heavy that you can't even see the nature of Promises anymore. Such heavy abstraction brings important questions to mind such as whether (sans Promises) these mechanisms are as *trustable* as Promises themselves have been designed to be. We'll revisit the "Observable" pattern in Appendix B. + +### Inertia + +One concrete barrier to starting to use Promises in your own code is all the code that currently exists which is not already Promise-aware. If you have lots of callback-based code, it's far easier to just keep coding in that same style. + +"A code base in motion (with callbacks) will remain in motion (with callbacks) unless acted upon by a smart, Promises-aware developer." + +Promises offer a different paradigm, and as such, the approach to the code can be anywhere from just a little different to, in some cases, radically different. You have to be intentional about it, because Promises will not just naturally shake out from the same ol' ways of doing code that have served you well thus far. + +Consider a callback-based scenario like the following: + +```js +function foo(x,y,cb) { + ajax( + "http://some.url.1/?x=" + x + "&y=" + y, + cb + ); +} + +foo( 11, 31, function(err,text) { + if (err) { + console.error( err ); + } + else { + console.log( text ); + } +} ); +``` + +Is it immediately obvious what the first steps are to convert this callback-based code to Promise-aware code? Depends on your experience. The more practice you have with it, the more natural it will feel. But certainly, Promises don't just advertise on the label exactly how to do it -- there's no one-size-fits-all answer -- so the responsibility is up to you. + +As we've covered before, we definitely need an Ajax utility that is Promise-aware instead of callback-based, which we could call `request(..)`. You can make your own, as we have already. 
But the overhead of having to manually define Promise-aware wrappers for every callback-based utility makes it less likely you'll choose to refactor to Promise-aware coding at all. + +Promises offer no direct answer to that limitation. Most Promise libraries do offer a helper, however. But even without a library, imagine a helper like this: + +```js +// polyfill-safe guard check +if (!Promise.wrap) { + Promise.wrap = function(fn) { + return function() { + var args = [].slice.call( arguments ); + + return new Promise( function(resolve,reject){ + fn.apply( + null, + args.concat( function(err,v){ + if (err) { + reject( err ); + } + else { + resolve( v ); + } + } ) + ); + } ); + }; + }; +} +``` + +OK, that's more than just a tiny trivial utility. However, although it may look a bit intimidating, it's not as bad as you'd think. It takes a function that expects an error-first style callback as its last parameter, and returns a new one that automatically creates a Promise to return, and substitutes the callback for you, wired up to the Promise fulfillment/rejection. + +Rather than waste too much time talking about *how* this `Promise.wrap(..)` helper works, let's just look at how we use it: + +```js +var request = Promise.wrap( ajax ); + +request( "http://some.url.1/" ) +.then( .. ) +.. +``` + +Wow, that was pretty easy! + +`Promise.wrap(..)` does **not** produce a Promise. It produces a function that will produce Promises. In a sense, a Promise-producing function could be seen as a "Promise factory." I propose "promisory" as the name for such a thing ("Promise" + "factory"). + +The act of wrapping a callback-expecting function to be a Promise-aware function is sometimes referred to as "lifting" or "promisifying". But there doesn't seem to be a standard term for what to call the resultant function other than a "lifted function", so I like "promisory" better as I think it's more descriptive. + +**Note:** Promisory isn't a made-up term. 
It's a real word, and its definition means to contain or convey a promise. That's exactly what these functions are doing, so it turns out to be a pretty perfect terminology match! + +So, `Promise.wrap(ajax)` produces an `ajax(..)` promisory we call `request(..)`, and that promisory produces Promises for Ajax responses. + +If all functions were already promisories, we wouldn't need to make them ourselves, so the extra step is a tad bit of a shame. But at least the wrapping pattern is (usually) repeatable so we can put it into a `Promise.wrap(..)` helper as shown to aid our promise coding. + +So back to our earlier example, we need a promisory for both `ajax(..)` and `foo(..)`: + +```js +// make a promisory for `ajax(..)` +var request = Promise.wrap( ajax ); + +// refactor `foo(..)`, but keep it externally +// callback-based for compatibility with other +// parts of the code for now -- only use +// `request(..)`'s promise internally. +function foo(x,y,cb) { + request( + "http://some.url.1/?x=" + x + "&y=" + y + ) + .then( + function fulfilled(text){ + cb( null, text ); + }, + cb + ); +} + +// now, for this code's purposes, make a +// promisory for `foo(..)` +var betterFoo = Promise.wrap( foo ); + +// and use the promisory +betterFoo( 11, 31 ) +.then( + function fulfilled(text){ + console.log( text ); + }, + function rejected(err){ + console.error( err ); + } +); +``` + +Of course, while we're refactoring `foo(..)` to use our new `request(..)` promisory, we could just make `foo(..)` a promisory itself, instead of remaining callback-based and needing to make and use the subsequent `betterFoo(..)` promisory. This decision just depends on whether `foo(..)` needs to stay callback-based compatible with other parts of the code base or not. + +Consider: + +```js +// `foo(..)` is now also a promisory because it +// delegates to the `request(..)` promisory +function foo(x,y) { + return request( + "http://some.url.1/?x=" + x + "&y=" + y + ); +} + +foo( 11, 31 ) +.then( .. ) +.. 
+``` + +While ES6 Promises don't natively ship with helpers for such promisory wrapping, most libraries provide them, or you can make your own. Either way, this particular limitation of Promises is addressable without too much pain (certainly compared to the pain of callback hell!). + +### Promise Uncancelable + +Once you create a Promise and register a fulfillment and/or rejection handler for it, there's nothing external you can do to stop that progression if something else happens to make that task moot. + +**Note:** Many Promise abstraction libraries provide facilities to cancel Promises, but this is a terrible idea! Many developers wish Promises had natively been designed with external cancelation capability, but the problem is that it would let one consumer/observer of a Promise affect some other consumer's ability to observe that same Promise. This violates the future-value's trustability (external immutability), but moreover is the embodiment of the "action at a distance" anti-pattern (http://en.wikipedia.org/wiki/Action_at_a_distance_%28computer_programming%29). Regardless of how useful it seems, it will actually lead you straight back into the same nightmares as callbacks. + +Consider our Promise timeout scenario from earlier: + +```js +var p = foo( 42 ); + +Promise.race( [ + p, + timeoutPromise( 3000 ) +] ) +.then( + doSomething, + handleError +); + +p.then( function(){ + // still happens even in the timeout case :( +} ); +``` + +The "timeout" was external to the promise `p`, so `p` itself keeps going, which we probably don't want. + +One option is to invasively define your resolution callbacks: + +```js +var OK = true; + +var p = foo( 42 ); + +Promise.race( [ + p, + timeoutPromise( 3000 ) + .catch( function(err){ + OK = false; + throw err; + } ) +] ) +.then( + doSomething, + handleError +); + +p.then( function(){ + if (OK) { + // only happens if no timeout! :) + } +} ); +``` + +This is ugly. It works, but it's far from ideal. 
Generally, you should try to avoid such scenarios. + +But if you can't, the ugliness of this solution should be a clue that *cancelation* is a functionality that belongs at a higher level of abstraction on top of Promises. I'd recommend you look to Promise abstraction libraries for assistance rather than hacking it yourself. + +**Note:** My *asynquence* Promise abstraction library provides just such an abstraction and an `abort()` capability for the sequence, all of which will be discussed in Appendix A. + +A single Promise is not really a flow-control mechanism (at least not in a very meaningful sense), which is exactly what *cancelation* refers to; that's why Promise cancelation would feel awkward. + +By contrast, a chain of Promises taken collectively together -- what I like to call a "sequence" -- *is* a flow control expression, and thus it's appropriate for cancelation to be defined at that level of abstraction. + +No individual Promise should be cancelable, but it's sensible for a *sequence* to be cancelable, because you don't pass around a sequence as a single immutable value like you do with a Promise. + +### Promise Performance + +This particular limitation is both simple and complex. + +Comparing how many pieces are moving with a basic callback-based async task chain versus a Promise chain, it's clear Promises have a fair bit more going on, which means they are naturally at least a tiny bit slower. Think back to just the simple list of trust guarantees that Promises offer, as compared to the ad hoc solution code you'd have to layer on top of callbacks to achieve the same protections. + +More work to do, more guards to protect, means that Promises *are* slower as compared to naked, untrustable callbacks. That much is obvious, and probably simple to wrap your brain around. + +But how much slower? Well... that's actually proving to be an incredibly difficult question to answer absolutely, across the board. 
+ +Frankly, it's kind of an apples-to-oranges comparison, so it's probably the wrong question to ask. You should actually compare whether an ad-hoc callback system with all the same protections manually layered in is faster than a Promise implementation. + +If Promises have a legitimate performance limitation, it's more that they don't really offer a line-item choice as to which trustability protections you want/need or not -- you get them all, always. + +Nevertheless, if we grant that a Promise is generally a *little bit slower* than its non-Promise, non-trustable callback equivalent -- assuming there are places where you feel you can justify the lack of trustability -- does that mean that Promises should be avoided across the board, as if your entire application is driven by nothing but must-be-utterly-the-fastest code possible? + +Sanity check: if your code is legitimately like that, **is JavaScript even the right language for such tasks?** JavaScript can be optimized to run applications very performantly (see Chapter 5 and Chapter 6). But is obsessing over tiny performance tradeoffs with Promises, in light of all the benefits they offer, *really* appropriate? + +Another subtle issue is that Promises make *everything* async, which means that some immediately (synchronously) complete steps still defer advancement of the next step to a Job (see Chapter 1). That means that it's possible that a sequence of Promise tasks could complete ever-so-slightly slower than the same sequence wired up with callbacks. + +Of course, the question here is this: are these potential slips in tiny fractions of performance *worth* all the other articulated benefits of Promises we've laid out across this chapter? + +My take is that in virtually all cases where you might think Promise performance is slow enough to be concerned, it's actually an anti-pattern to optimize away the benefits of Promise trustability and composability by avoiding them altogether. 
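The Job deferral mentioned above is easy to observe. In this sketch (the `order` array is only for illustration), the `then(..)` handler on an already-fulfilled promise still runs after the rest of the current synchronous code:

```js
var order = [];

// already fulfilled, but the handler is still deferred to a Job
Promise.resolve( 42 )
.then( function(){
	order.push( "then handler" );
} );

order.push( "sync code" );

// at this point, `order` holds only "sync code"; the
// handler's entry is added once the current code finishes
```

That one-Job delay per step is the small, fixed cost being weighed in this section.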
+ +Instead, you should default to using them across the code base, and then profile and analyze your application's hot (critical) paths. Are Promises *really* a bottleneck, or are they just a theoretical slowdown? Only *then*, armed with actual valid benchmarks (see Chapter 6) is it responsible and prudent to factor out the Promises in just those identified critical areas. + +Promises are a little slower, but in exchange you're getting a lot of trustability, non-Zalgo predictability, and composability built in. Maybe the limitation is not actually their performance, but your lack of perception of their benefits? + +## Review + +Promises are awesome. Use them. They solve the *inversion of control* issues that plague us with callbacks-only code. + +They don't get rid of callbacks, they just redirect the orchestration of those callbacks to a trustable intermediary mechanism that sits between us and another utility. + +Promise chains also begin to address (though certainly not perfectly) a better way of expressing async flow in sequential fashion, which helps our brains plan and maintain async JS code better. We'll see an even better solution to *that* problem in the next chapter! diff --git a/async & performance/ch4.md b/async & performance/ch4.md new file mode 100644 index 0000000..fcbc159 --- /dev/null +++ b/async & performance/ch4.md @@ -0,0 +1,2247 @@ +# You Don't Know JS: Async & Performance +# Chapter 4: Generators + +In Chapter 2, we identified two key drawbacks to expressing async flow control with callbacks: + +* Callback-based async doesn't fit how our brain plans out steps of a task. +* Callbacks aren't trustable or composable because of *inversion of control*. + +In Chapter 3, we detailed how Promises uninvert the *inversion of control* of callbacks, restoring trustability/composability. + +Now we turn our attention to expressing async flow control in a sequential, synchronous-looking fashion. The "magic" that makes it possible is ES6 **generators**. 
+ +## Breaking Run-to-Completion + +In Chapter 1, we explained an expectation that JS developers almost universally rely on in their code: once a function starts executing, it runs until it completes, and no other code can interrupt and run in between. + +As bizarre as it may seem, ES6 introduces a new type of function that does not behave with the run-to-completion behavior. This new type of function is called a "generator." + +To understand the implications, let's consider this example: + +```js +var x = 1; + +function foo() { + x++; + bar(); // <-- what about this line? + console.log( "x:", x ); +} + +function bar() { + x++; +} + +foo(); // x: 3 +``` + +In this example, we know for sure that `bar()` runs in between `x++` and `console.log(x)`. But what if `bar()` wasn't there? Obviously the result would be `2` instead of `3`. + +Now let's twist your brain. What if `bar()` wasn't present, but it could still somehow run between the `x++` and `console.log(x)` statements? How would that be possible? + +In **preemptive** multithreaded languages, it would essentially be possible for `bar()` to "interrupt" and run at exactly the right moment between those two statements. But JS is not preemptive, nor is it (currently) multithreaded. And yet, a **cooperative** form of this "interruption" (concurrency) is possible, if `foo()` itself could somehow indicate a "pause" at that part in the code. + +**Note:** I use the word "cooperative" not only because of the connection to classical concurrency terminology (see Chapter 1), but because as you'll see in the next snippet, the ES6 syntax for indicating a pause point in code is `yield` -- suggesting a politely *cooperative* yielding of control. + +Here's the ES6 code to accomplish such cooperative concurrency: + +```js +var x = 1; + +function *foo() { + x++; + yield; // pause! 
+ console.log( "x:", x ); +} + +function bar() { + x++; +} +``` + +**Note:** You will likely see most other JS documentation/code that will format a generator declaration as `function* foo() { .. }` instead of as I've done here with `function *foo() { .. }` -- the only difference being the stylistic positioning of the `*`. The two forms are functionally/syntactically identical, as is a third `function*foo() { .. }` (no space) form. There are arguments for both styles, but I basically prefer `function *foo..` because it then matches when I reference a generator in writing with `*foo()`. If I said only `foo()`, you wouldn't know as clearly if I was talking about a generator or a regular function. It's purely a stylistic preference. + +Now, how can we run the code in that previous snippet such that `bar()` executes at the point of the `yield` inside of `*foo()`? + +```js +// construct an iterator `it` to control the generator +var it = foo(); + +// start `foo()` here! +it.next(); +x; // 2 +bar(); +x; // 3 +it.next(); // x: 3 +``` + +OK, there's quite a bit of new and potentially confusing stuff in those two code snippets, so we've got plenty to wade through. But before we explain the different mechanics/syntax with ES6 generators, let's walk through the behavior flow: + +1. The `it = foo()` operation does *not* execute the `*foo()` generator yet, but it merely constructs an *iterator* that will control its execution. More on *iterators* in a bit. +2. The first `it.next()` starts the `*foo()` generator, and runs the `x++` on the first line of `*foo()`. +3. `*foo()` pauses at the `yield` statement, at which point that first `it.next()` call finishes. At the moment, `*foo()` is still running and active, but it's in a paused state. +4. We inspect the value of `x`, and it's now `2`. +5. We call `bar()`, which increments `x` again with `x++`. +6. We inspect the value of `x` again, and it's now `3`. +7. 
The final `it.next()` call resumes the `*foo()` generator from where it was paused, and runs the `console.log(..)` statement, which uses the current value of `x` of `3`. + +Clearly, `*foo()` started, but did *not* run-to-completion -- it paused at the `yield`. We resumed `*foo()` later, and let it finish, but that wasn't even required. + +So, a generator is a special kind of function that can start and stop one or more times, and doesn't necessarily ever have to finish. While it won't be terribly obvious yet why that's so powerful, as we go throughout the rest of this chapter, that will be one of the fundamental building blocks we use to construct generators-as-async-flow-control as a pattern for our code. + +### Input and Output + +A generator function is a special function with the new processing model we just alluded to. But it's still a function, which means it still has some basic tenets that haven't changed -- namely, that it still accepts arguments (aka "input"), and that it can still return a value (aka "output"): + +```js +function *foo(x,y) { + return x * y; +} + +var it = foo( 6, 7 ); + +var res = it.next(); + +res.value; // 42 +``` + +We pass in the arguments `6` and `7` to `*foo(..)` as the parameters `x` and `y`, respectively. And `*foo(..)` returns the value `42` back to the calling code. + +We now see a difference with how the generator is invoked compared to a normal function. `foo(6,7)` obviously looks familiar. But subtly, the `*foo(..)` generator hasn't actually run yet as it would have with a function. + +Instead, we're just creating an *iterator* object, which we assign to the variable `it`, to control the `*foo(..)` generator. Then we call `it.next()`, which instructs the `*foo(..)` generator to advance from its current location, stopping either at the next `yield` or end of the generator. + +The result of that `next(..)` call is an object with a `value` property on it holding whatever value (if anything) was returned from `*foo(..)`. 
In other words, `yield` caused a value to be sent out from the generator during the middle of its execution, kind of like an intermediate `return`. + +Again, it won't be obvious yet why we need this whole indirect *iterator* object to control the generator. We'll get there, I *promise*. + +#### Iteration Messaging + +In addition to generators accepting arguments and having return values, there's even more powerful and compelling input/output messaging capability built into them, via `yield` and `next(..)`. + +Consider: + +```js +function *foo(x) { + var y = x * (yield); + return y; +} + +var it = foo( 6 ); + +// start `foo(..)` +it.next(); + +var res = it.next( 7 ); + +res.value; // 42 +``` + +First, we pass in `6` as the parameter `x`. Then we call `it.next()`, and it starts up `*foo(..)`. + +Inside `*foo(..)`, the `var y = x ..` statement starts to be processed, but then it runs across a `yield` expression. At that point, it pauses `*foo(..)` (in the middle of the assignment statement!), and essentially requests the calling code to provide a result value for the `yield` expression. Next, we call `it.next( 7 )`, which is passing the `7` value back in to *be* that result of the paused `yield` expression. + +So, at this point, the assignment statement is essentially `var y = 6 * 7`. Now, `return y` returns that `42` value back as the result of the `it.next( 7 )` call. + +Notice something very important but also easily confusing, even to seasoned JS developers: depending on your perspective, there's a mismatch between the `yield` and the `next(..)` call. In general, you're going to have one more `next(..)` call than you have `yield` statements -- the preceding snippet has one `yield` and two `next(..)` calls. + +Why the mismatch? + +Because the first `next(..)` always starts a generator, and runs to the first `yield`. 
But it's the second `next(..)` call that fulfills the first paused `yield` expression, and the third `next(..)` would fulfill the second `yield`, and so on. + +##### Tale of Two Questions + +Actually, which code you're thinking about primarily will affect whether there's a perceived mismatch or not. + +Consider only the generator code: + +```js +var y = x * (yield); +return y; +``` + +This **first** `yield` is basically *asking a question*: "What value should I insert here?" + +Who's going to answer that question? Well, the **first** `next()` has already run to get the generator up to this point, so obviously *it* can't answer the question. So, the **second** `next(..)` call must answer the question *posed* by the **first** `yield`. + +See the mismatch -- second-to-first? + +But let's flip our perspective. Let's look at it not from the generator's point of view, but from the iterator's point of view. + +To properly illustrate this perspective, we also need to explain that messages can go in both directions -- `yield ..` as an expression can send out messages in response to `next(..)` calls, and `next(..)` can send values to a paused `yield` expression. Consider this slightly adjusted code: + +```js +function *foo(x) { + var y = x * (yield "Hello"); // <-- yield a value! + return y; +} + +var it = foo( 6 ); + +var res = it.next(); // first `next()`, don't pass anything +res.value; // "Hello" + +res = it.next( 7 ); // pass `7` to waiting `yield` +res.value; // 42 +``` + +`yield ..` and `next(..)` pair together as a two-way message passing system **during the execution of the generator**. + +So, looking only at the *iterator* code: + +```js +var res = it.next(); // first `next()`, don't pass anything +res.value; // "Hello" + +res = it.next( 7 ); // pass `7` to waiting `yield` +res.value; // 42 +``` + +**Note:** We don't pass a value to the first `next()` call, and that's on purpose. 
Only a paused `yield` could accept such a value passed by a `next(..)`, and at the beginning of the generator when we call the first `next()`, there **is no paused `yield`** to accept such a value. The specification and all compliant browsers just silently **discard** anything passed to the first `next()`. It's still a bad idea to pass a value, as you're just creating silently "failing" code that's confusing. So, always start a generator with an argument-free `next()`. + +The first `next()` call (with nothing passed to it) is basically *asking a question*: "What *next* value does the `*foo(..)` generator have to give me?" And who answers this question? The first `yield "Hello"` expression. + +See? No mismatch there. + +Depending on *who* you think of as asking the question, there is either a mismatch between the `yield` and `next(..)` calls, or not. + +But wait! There's still an extra `next()` compared to the number of `yield` statements. So, that final `it.next(7)` call is again asking the question about what *next* value the generator will produce. But there are no more `yield` statements left to answer, are there? So who answers? + +The `return` statement answers the question! + +And if there **is no `return`** in your generator -- `return` is certainly not any more required in generators than in regular functions -- there's always an assumed/implicit `return;` (aka `return undefined;`), which serves the purpose of default answering the question *posed* by the final `it.next(7)` call. + +These questions and answers -- the two-way message passing with `yield` and `next(..)` -- are quite powerful, but it's not obvious at all how these mechanisms are connected to async flow control. We're getting there! + +### Multiple Iterators + +It may appear from the syntactic usage that when you use an *iterator* to control a generator, you're controlling the declared generator function itself. 
But there's a subtlety that's easy to miss: each time you construct an *iterator*, you are implicitly constructing an instance of the generator which that *iterator* will control. + +You can have multiple instances of the same generator running at the same time, and they can even interact: + +```js +function *foo() { + var x = yield 2; + z++; + var y = yield (x * z); + console.log( x, y, z ); +} + +var z = 1; + +var it1 = foo(); +var it2 = foo(); + +var val1 = it1.next().value; // 2 <-- yield 2 +var val2 = it2.next().value; // 2 <-- yield 2 + +val1 = it1.next( val2 * 10 ).value; // 40 <-- x:20, z:2 +val2 = it2.next( val1 * 5 ).value; // 600 <-- x:200, z:3 + +it1.next( val2 / 2 ); // y:300 + // 20 300 3 +it2.next( val1 / 4 ); // y:10 + // 200 10 3 +``` + +**Warning:** The most common usage of multiple instances of the same generator running concurrently is not such interactions, but when the generator is producing its own values without input, perhaps from some independently connected resource. We'll talk more about value production in the next section. + +Let's briefly walk through the processing: + +1. Both instances of `*foo()` are started at the same time, and both `next()` calls reveal a `value` of `2` from the `yield 2` statements, respectively. +2. `val2 * 10` is `2 * 10`, which is sent into the first generator instance `it1`, so that `x` gets value `20`. `z` is incremented from `1` to `2`, and then `20 * 2` is `yield`ed out, setting `val1` to `40`. +3. `val1 * 5` is `40 * 5`, which is sent into the second generator instance `it2`, so that `x` gets value `200`. `z` is incremented again, from `2` to `3`, and then `200 * 3` is `yield`ed out, setting `val2` to `600`. +4. `val2 / 2` is `600 / 2`, which is sent into the first generator instance `it1`, so that `y` gets value `300`, then printing out `20 300 3` for its `x y z` values, respectively. +5. 
`val1 / 4` is `40 / 4`, which is sent into the second generator instance `it2`, so that `y` gets value `10`, then printing out `200 10 3` for its `x y z` values, respectively. + +That's a "fun" example to run through in your mind. Did you keep it straight? + +#### Interleaving + +Recall this scenario from the "Run-to-completion" section of Chapter 1: + +```js +var a = 1; +var b = 2; + +function foo() { + a++; + b = b * a; + a = b + 3; +} + +function bar() { + b--; + a = 8 + b; + b = a * 2; +} +``` + +With normal JS functions, of course either `foo()` can run completely first, or `bar()` can run completely first, but `foo()` cannot interleave its individual statements with `bar()`. So, there are only two possible outcomes to the preceding program. + +However, with generators, clearly interleaving (even in the middle of statements!) is possible: + +```js +var a = 1; +var b = 2; + +function *foo() { + a++; + yield; + b = b * a; + a = (yield b) + 3; +} + +function *bar() { + b--; + yield; + a = (yield 8) + b; + b = a * (yield 2); +} +``` + +Depending on the order in which the *iterators* controlling `*foo()` and `*bar()` are called, the preceding program could produce several different results. In other words, we can actually illustrate (in a sort of fake-ish way) the theoretical "threaded race conditions" circumstances discussed in Chapter 1, by interleaving the two generator iterations over the same shared variables. + +First, let's make a helper called `step(..)` that controls an *iterator*: + +```js +function step(gen) { + var it = gen(); + var last; + + return function() { + // whatever is `yield`ed out, just + // send it right back in the next time! + last = it.next( last ).value; + }; +} +``` + +`step(..)` initializes a generator to create its `it` *iterator*, then returns a function which, when called, advances the *iterator* by one step. Additionally, the previously `yield`ed out value is sent right back in at the *next* step. 
So, `yield 8` will just become `8` and `yield b` will just be `b` (whatever it was at the time of `yield`). + +Now, just for fun, let's experiment to see the effects of interleaving these different chunks of `*foo()` and `*bar()`. We'll start with the boring base case, making sure `*foo()` totally finishes before `*bar()` (just like we did in Chapter 1): + +```js +// make sure to reset `a` and `b` +a = 1; +b = 2; + +var s1 = step( foo ); +var s2 = step( bar ); + +// run `*foo()` completely first +s1(); +s1(); +s1(); + +// now run `*bar()` +s2(); +s2(); +s2(); +s2(); + +console.log( a, b ); // 11 22 +``` + +The end result is `11` and `22`, just as it was in the Chapter 1 version. Now let's mix up the interleaving ordering and see how it changes the final values of `a` and `b`: + +```js +// make sure to reset `a` and `b` +a = 1; +b = 2; + +var s1 = step( foo ); +var s2 = step( bar ); + +s2(); // b--; +s2(); // yield 8 +s1(); // a++; +s2(); // a = 8 + b; + // yield 2 +s1(); // b = b * a; + // yield b +s1(); // a = b + 3; +s2(); // b = a * 2; +``` + +Before I tell you the results, can you figure out what `a` and `b` are after the preceding program? No cheating! + +```js +console.log( a, b ); // 12 18 +``` + +**Note:** As an exercise for the reader, try to see how many other combinations of results you can get back rearranging the order of the `s1()` and `s2()` calls. Don't forget you'll always need three `s1()` calls and four `s2()` calls. Recall the discussion earlier about matching `next()` with `yield` for the reasons why. + +You almost certainly won't want to intentionally create *this* level of interleaving confusion, as it creates incredibly difficult to understand code. But the exercise is interesting and instructive to understand more about how multiple generators can run concurrently in the same shared scope, because there will be places where this capability is quite useful. + +We'll discuss generator concurrency in more detail at the end of this chapter. 
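If you want to try other orderings without retyping the `s1()` / `s2()` calls, here's a minimal sketch of one way to do it. The `runOrder(..)` helper is hypothetical (it's not part of the chapter's code); it assumes the same `*foo()`, `*bar()`, and `step(..)` definitions shown above, repeated here so the snippet runs on its own:

```js
var a, b;

function *foo() {
	a++;
	yield;
	b = b * a;
	a = (yield b) + 3;
}

function *bar() {
	b--;
	yield;
	a = (yield 8) + b;
	b = a * (yield 2);
}

function step(gen) {
	var it = gen();
	var last;

	return function() {
		// send the previously `yield`ed value right back in
		last = it.next( last ).value;
	};
}

// hypothetical helper: `order` is a string of "1"s and "2"s,
// where "1" advances `*foo()` and "2" advances `*bar()`
function runOrder(order) {
	// reset the shared variables for each experiment
	a = 1;
	b = 2;

	var s1 = step( foo );
	var s2 = step( bar );

	for (var ch of order) {
		if (ch === "1") {
			s1();
		}
		else {
			s2();
		}
	}

	return [ a, b ];
}

runOrder( "1112222" );	// [ 11, 22 ] <-- `*foo()` completely first
runOrder( "2212112" );	// [ 12, 18 ] <-- the interleaving shown above
```

Each string still needs three `1`s and four `2`s so that both generators run to their end.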
+ +## Generator'ing Values + +In the previous section, we mentioned an interesting use for generators, as a way to produce values. This is **not** the main focus in this chapter, but we'd be remiss if we didn't cover the basics, especially because this use case is essentially the origin of the name: generators. + +We're going to take a slight diversion into the topic of *iterators* for a bit, but we'll circle back to how they relate to generators and using a generator to *generate* values. + +### Producers and Iterators + +Imagine you're producing a series of values where each value has a definable relationship to the previous value. To do this, you're going to need a stateful producer that remembers the last value it gave out. + +You can implement something like that straightforwardly using a function closure (see the *Scope & Closures* title of this series): + +```js +var gimmeSomething = (function(){ + var nextVal; + + return function(){ + if (nextVal === undefined) { + nextVal = 1; + } + else { + nextVal = (3 * nextVal) + 6; + } + + return nextVal; + }; +})(); + +gimmeSomething(); // 1 +gimmeSomething(); // 9 +gimmeSomething(); // 33 +gimmeSomething(); // 105 +``` + +**Note:** The `nextVal` computation logic here could have been simplified, but conceptually, we don't want to calculate the *next value* (aka `nextVal`) until the *next* `gimmeSomething()` call happens, because in general that could be a resource-leaky design for producers of more persistent or resource-limited values than simple `number`s. + +Generating an arbitrary number series isn't a terribly realistic example. But what if you were generating records from a data source? You could imagine much the same code. + +In fact, this task is a very common design pattern, usually solved by iterators. An *iterator* is a well-defined interface for stepping through a series of values from a producer. 
The JS interface for iterators, as it is in most languages, is to call `next()` each time you want the next value from the producer. + +We could implement the standard *iterator* interface for our number series producer: + +```js +var something = (function(){ + var nextVal; + + return { + // needed for `for..of` loops + [Symbol.iterator]: function(){ return this; }, + + // standard iterator interface method + next: function(){ + if (nextVal === undefined) { + nextVal = 1; + } + else { + nextVal = (3 * nextVal) + 6; + } + + return { done:false, value:nextVal }; + } + }; +})(); + +something.next().value; // 1 +something.next().value; // 9 +something.next().value; // 33 +something.next().value; // 105 +``` + +**Note:** We'll explain why we need the `[Symbol.iterator]: ..` part of this code snippet in the "Iterables" section. Syntactically though, two ES6 features are at play. First, the `[ .. ]` syntax is called a *computed property name* (see the *this & Object Prototypes* title of this series). It's a way in an object literal definition to specify an expression and use the result of that expression as the name for the property. Next, `Symbol.iterator` is one of ES6's predefined special `Symbol` values (see the *ES6 & Beyond* title of this book series). + +The `next()` call returns an object with two properties: `done` is a `boolean` value signaling the *iterator's* complete status; `value` holds the iteration value. + +ES6 also adds the `for..of` loop, which means that a standard *iterator* can automatically be consumed with native loop syntax: + +```js +for (var v of something) { + console.log( v ); + + // don't let the loop run forever! + if (v > 500) { + break; + } +} +// 1 9 33 105 321 969 +``` + +**Note:** Because our `something` *iterator* always returns `done:false`, this `for..of` loop would run forever, which is why we put the `break` conditional in. 
It's totally OK for iterators to be never-ending, but there are also cases where the *iterator* will run over a finite set of values and eventually return a `done:true`. + +The `for..of` loop automatically calls `next()` for each iteration -- it doesn't pass any values in to the `next()` -- and it will automatically terminate on receiving a `done:true`. It's quite handy for looping over a set of data. + +Of course, you could manually loop over iterators, calling `next()` and checking for the `done:true` condition to know when to stop: + +```js +for ( + var ret; + (ret = something.next()) && !ret.done; +) { + console.log( ret.value ); + + // don't let the loop run forever! + if (ret.value > 500) { + break; + } +} +// 1 9 33 105 321 969 +``` + +**Note:** This manual `for` approach is certainly uglier than the ES6 `for..of` loop syntax, but its advantage is that it affords you the opportunity to pass in values to the `next(..)` calls if necessary. + +In addition to making your own *iterators*, many built-in data structures in JS (as of ES6), like `array`s, also have default *iterators*: + +```js +var a = [1,3,5,7,9]; + +for (var v of a) { + console.log( v ); +} +// 1 3 5 7 9 +``` + +The `for..of` loop asks `a` for its *iterator*, and automatically uses it to iterate over `a`'s values. + +**Note:** It may seem a strange omission by ES6, but regular `object`s intentionally do not come with a default *iterator* the way `array`s do. The reasons go deeper than we will cover here. If all you want is to iterate over the properties of an object (with no particular guarantee of ordering), `Object.keys(..)` returns an `array`, which can then be used like `for (var k of Object.keys(obj)) { ..`. Such a `for..of` loop over an object's keys would be similar to a `for..in` loop, except that `Object.keys(..)` does not include properties from the `[[Prototype]]` chain while `for..in` does (see the *this & Object Prototypes* title of this series). 
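To make that last difference concrete, here's a small sketch. The `proto` and `obj` objects are made up purely for illustration:

```js
// `obj` has one own property and inherits one through its `[[Prototype]]`
var proto = { inherited: 1 };
var obj = Object.create( proto );
obj.own = 2;

// `for..of` over `Object.keys(..)`: own enumerable properties only
var viaKeys = [];
for (var k of Object.keys( obj )) {
	viaKeys.push( k );
}

// `for..in`: also walks the `[[Prototype]]` chain
var viaForIn = [];
for (var p in obj) {
	viaForIn.push( p );
}

viaKeys;	// [ "own" ]
viaForIn;	// [ "own", "inherited" ]
```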
+ +### Iterables + +The `something` object in our running example is called an *iterator*, as it has the `next()` method on its interface. But a closely related term is *iterable*, which is an `object` that **contains** an *iterator* that can iterate over its values. + +As of ES6, the way to retrieve an *iterator* from an *iterable* is that the *iterable* must have a function on it, with the name being the special ES6 symbol value `Symbol.iterator`. When this function is called, it returns an *iterator*. Though not required, generally each call should return a fresh new *iterator*. + +`a` in the previous snippet is an *iterable*. The `for..of` loop automatically calls its `Symbol.iterator` function to construct an *iterator*. But we could of course call the function manually, and use the *iterator* it returns: + +```js +var a = [1,3,5,7,9]; + +var it = a[Symbol.iterator](); + +it.next().value; // 1 +it.next().value; // 3 +it.next().value; // 5 +.. +``` + +In the previous code listing that defined `something`, you may have noticed this line: + +```js +[Symbol.iterator]: function(){ return this; } +``` + +That little bit of confusing code is making the `something` value -- the interface of the `something` *iterator* -- also an *iterable*; it's now both an *iterable* and an *iterator*. Then, we pass `something` to the `for..of` loop: + +```js +for (var v of something) { + .. +} +``` + +The `for..of` loop expects `something` to be an *iterable*, so it looks for and calls its `Symbol.iterator` function. We defined that function to simply `return this`, so it just gives itself back, and the `for..of` loop is none the wiser. + +### Generator Iterator + +Let's turn our attention back to generators, in the context of *iterators*. A generator can be treated as a producer of values that we extract one at a time through an *iterator* interface's `next()` calls. 
+ +So, a generator itself is not technically an *iterable*, though it's very similar -- when you execute the generator, you get an *iterator* back: + +```js +function *foo(){ .. } + +var it = foo(); +``` + +We can implement the `something` infinite number series producer from earlier with a generator, like this: + +```js +function *something() { + var nextVal; + + while (true) { + if (nextVal === undefined) { + nextVal = 1; + } + else { + nextVal = (3 * nextVal) + 6; + } + + yield nextVal; + } +} +``` + +**Note:** A `while..true` loop would normally be a very bad thing to include in a real JS program, at least if it doesn't have a `break` or `return` in it, as it would likely run forever, synchronously, and block/lock-up the browser UI. However, in a generator, such a loop is generally totally OK if it has a `yield` in it, as the generator will pause at each iteration, `yield`ing back to the main program and/or to the event loop queue. To put it glibly, "generators put the `while..true` back in JS programming!" + +That's a fair bit cleaner and simpler, right? Because the generator pauses at each `yield`, the state (scope) of the function `*something()` is kept around, meaning there's no need for the closure boilerplate to preserve variable state across calls. + +Not only is it simpler code -- we don't have to make our own *iterator* interface -- it actually is more reason-able code, because it more clearly expresses the intent. For example, the `while..true` loop tells us the generator is intended to run forever -- to keep *generating* values as long as we keep asking for them. + +And now we can use our shiny new `*something()` generator with a `for..of` loop, and you'll see it works basically identically: + +```js +for (var v of something()) { + console.log( v ); + + // don't let the loop run forever! + if (v > 500) { + break; + } +} +// 1 9 33 105 321 969 +``` + +But don't skip over `for (var v of something()) ..`! 
We didn't just reference `something` as a value like in earlier examples, but instead called the `*something()` generator to get its *iterator* for the `for..of` loop to use. + +If you're paying close attention, two questions may arise from this interaction between the generator and the loop: + +* Why couldn't we say `for (var v of something) ..`? Because `something` here is a generator, which is not an *iterable*. We have to call `something()` to construct a producer for the `for..of` loop to iterate over. +* The `something()` call produces an *iterator*, but the `for..of` loop wants an *iterable*, right? Yep. The generator's *iterator* also has a `Symbol.iterator` function on it, which basically does a `return this`, just like the `something` *iterable* we defined earlier. In other words, a generator's *iterator* is also an *iterable*! + +#### Stopping the Generator + +In the previous example, it would appear the *iterator* instance for the `*something()` generator was basically left in a suspended state forever after the `break` in the loop ran. + +But there's a hidden behavior that takes care of that for you. "Abnormal completion" (i.e., "early termination") of the `for..of` loop -- generally caused by a `break`, `return`, or an uncaught exception -- sends a signal to the generator's *iterator* for it to terminate. + +**Note:** The `for..of` loop sends this signal only on abnormal completion. At the normal completion of the loop, the *iterator* has already reported `done:true`, so there is nothing left to terminate and the loop does not call `return(..)`. Custom *iterators* that need cleanup at normal completion should do it when they produce `done:true`. + +While a `for..of` loop will automatically send this signal, you may wish to send the signal manually to an *iterator*; you do this by calling `return(..)`. 
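As a quick sketch of what that manual signal looks like on its own, consider a never-ending counter (the `*count()` generator here is just an illustrative stand-in):

```js
function *count() {
	var i = 0;

	while (true) {
		yield ++i;
	}
}

var it = count();

var first = it.next();			// { value: 1, done: false }

// terminate the generator from the outside
var stopped = it.return( "stop" );	// { value: "stop", done: true }

// once terminated, the iterator stays finished
var after = it.next();			// { value: undefined, done: true }
```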
+ +If you specify a `try..finally` clause inside the generator, it will always be run even when the generator is externally completed. This is useful if you need to clean up resources (database connections, etc.): + +```js +function *something() { + try { + var nextVal; + + while (true) { + if (nextVal === undefined) { + nextVal = 1; + } + else { + nextVal = (3 * nextVal) + 6; + } + + yield nextVal; + } + } + // cleanup clause + finally { + console.log( "cleaning up!" ); + } +} +``` + +The earlier example with `break` in the `for..of` loop will trigger the `finally` clause. But you could instead manually terminate the generator's *iterator* instance from the outside with `return(..)`: + +```js +var it = something(); +for (var v of it) { + console.log( v ); + + // don't let the loop run forever! + if (v > 500) { + console.log( + // complete the generator's iterator + it.return( "Hello World" ).value + ); + // no `break` needed here + } +} +// 1 9 33 105 321 969 +// cleaning up! +// Hello World +``` + +When we call `it.return(..)`, it immediately terminates the generator, which of course runs the `finally` clause. Also, it sets the returned `value` to whatever you passed in to `return(..)`, which is how `"Hello World"` comes right back out. We also don't need to include a `break` now because the generator's *iterator* is set to `done:true`, so the `for..of` loop will terminate on its next iteration. + +Generators owe their namesake mostly to this *consuming produced values* use. But again, that's just one of the uses for generators, and frankly not even the main one we're concerned with in the context of this book. + +But now that we more fully understand some of the mechanics of how they work, we can *next* turn our attention to how generators apply to async concurrency. + +## Iterating Generators Asynchronously + +What do generators have to do with async coding patterns, fixing problems with callbacks, and the like? Let's get to answering that important question. 
+ +We should revisit one of our scenarios from Chapter 3. Let's recall the callback approach: + +```js +function foo(x,y,cb) { + ajax( + "http://some.url.1/?x=" + x + "&y=" + y, + cb + ); +} + +foo( 11, 31, function(err,text) { + if (err) { + console.error( err ); + } + else { + console.log( text ); + } +} ); +``` + +If we wanted to express this same task flow control with a generator, we could do: + +```js +function foo(x,y) { + ajax( + "http://some.url.1/?x=" + x + "&y=" + y, + function(err,data){ + if (err) { + // throw an error into `*main()` + it.throw( err ); + } + else { + // resume `*main()` with received `data` + it.next( data ); + } + } + ); +} + +function *main() { + try { + var text = yield foo( 11, 31 ); + console.log( text ); + } + catch (err) { + console.error( err ); + } +} + +var it = main(); + +// start it all up! +it.next(); +``` + +At first glance, this snippet is longer, and perhaps a little more complex looking, than the callback snippet before it. But don't let that impression get you off track. The generator snippet is actually **much** better! But there's a lot going on for us to explain. + +First, let's look at this part of the code, which is the most important: + +```js +var text = yield foo( 11, 31 ); +console.log( text ); +``` + +Think about how that code works for a moment. We're calling a normal function `foo(..)` and we're apparently able to get back the `text` from the Ajax call, even though it's asynchronous. + +How is that possible? If you recall the beginning of Chapter 1, we had almost identical code: + +```js +var data = ajax( "..url 1.." ); +console.log( data ); +``` + +And that code didn't work! Can you spot the difference? It's the `yield` used in a generator. + +That's the magic! That's what allows us to have what appears to be blocking, synchronous code, but it doesn't actually block the whole program; it only pauses/blocks the code in the generator itself. 
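To convince yourself that only the generator is paused, here's a minimal sketch you can run without a network. It assumes a made-up `fakeAjax(..)` that just parks its callback in a `pending` array, standing in for the real `ajax(..)` utility and the event loop; neither name comes from the chapter's code:

```js
var log = [];
var pending = [];	// stand-in for the event loop queue

// hypothetical stand-in for `ajax(..)`: completes "later"
function fakeAjax(url,cb) {
	pending.push( function(){
		cb( null, "response for " + url );
	} );
}

function foo(x,y) {
	fakeAjax(
		"http://some.url.1/?x=" + x + "&y=" + y,
		function(err,data){
			if (err) {
				it.throw( err );
			}
			else {
				it.next( data );
			}
		}
	);
}

function *main() {
	var text = yield foo( 11, 31 );
	log.push( text );
}

var it = main();

// start it all up! `*main()` pauses at the `yield`
it.next();

// the rest of the program is not blocked
log.push( "main program keeps running" );

// "later": the response arrives and resumes `*main()`
pending.shift()();

log;
// [ "main program keeps running",
//   "response for http://some.url.1/?x=11&y=31" ]
```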
+ +In `yield foo(11,31)`, first the `foo(11,31)` call is made, which returns nothing (aka `undefined`), so we're making a call to request data, but we're actually then doing `yield undefined`. That's OK, because the code is not currently relying on a `yield`ed value to do anything interesting. We'll revisit this point later in the chapter. + +We're not using `yield` in a message passing sense here, only in a flow control sense to pause/block. Actually, it will have message passing, but only in one direction, after the generator is resumed. + +So, the generator pauses at the `yield`, essentially asking the question, "what value should I return to assign to the variable `text`?" Who's going to answer that question? + +Look at `foo(..)`. If the Ajax request is successful, we call: + +```js +it.next( data ); +``` + +That's resuming the generator with the response data, which means that our paused `yield` expression receives that value directly, and then as it restarts the generator code, that value gets assigned to the local variable `text`. + +Pretty cool, huh? + +Take a step back and consider the implications. We have totally synchronous-looking code inside the generator (other than the `yield` keyword itself), but hidden behind the scenes, inside of `foo(..)`, the operations can complete asynchronously. + +**That's huge!** That's a nearly perfect solution to our previously stated problem with callbacks not being able to express asynchrony in a sequential, synchronous fashion that our brains can relate to. + +In essence, we are abstracting the asynchrony away as an implementation detail, so that we can reason synchronously/sequentially about our flow control: "Make an Ajax request, and when it finishes print out the response." And of course, we just expressed two steps in the flow control, but this same capability extends without bounds, to let us express however many steps we need to. 
+ +**Tip:** This is such an important realization, just go back and read the last three paragraphs again to let it sink in! + +### Synchronous Error Handling + +But the preceding generator code has even more goodness to *yield* to us. Let's turn our attention to the `try..catch` inside the generator: + +```js +try { + var text = yield foo( 11, 31 ); + console.log( text ); +} +catch (err) { + console.error( err ); +} +``` + +How does this work? The `foo(..)` call is asynchronously completing, and doesn't `try..catch` fail to catch asynchronous errors, as we looked at in Chapter 3? + +We already saw how the `yield` lets the assignment statement pause to wait for `foo(..)` to finish, so that the completed response can be assigned to `text`. The awesome part is that this `yield` pausing *also* allows the generator to `catch` an error. We throw that error into the generator with this part of the earlier code listing: + +```js +if (err) { + // throw an error into `*main()` + it.throw( err ); +} +``` + +The `yield`-pause nature of generators means that not only do we get synchronous-looking `return` values from async function calls, but we can also synchronously `catch` errors from those async function calls! + +So we've seen we can throw errors *into* a generator, but what about throwing errors *out of* a generator? Exactly as you'd expect: + +```js +function *main() { + var x = yield "Hello World"; + + yield x.toLowerCase(); // cause an exception! +} + +var it = main(); + +it.next().value; // Hello World + +try { + it.next( 42 ); +} +catch (err) { + console.error( err ); // TypeError +} +``` + +Of course, we could have manually thrown an error with `throw ..` instead of causing an exception. 
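As a rough sketch of that manual `throw ..` variation, the generator below checks its input and throws explicitly. The type check and the error message are assumptions made for this illustration, not part of the earlier listing.

```js
function *main() {
	var x = yield "Hello World";

	// assumption for this sketch: reject anything that isn't a string
	if (typeof x != "string") {
		throw new TypeError( "expected a string" );
	}

	yield x.toLowerCase();
}

var it = main();
var caught;

it.next();                  // run to the first `yield`

try {
	it.next( 42 );          // `42` is not a string, so `*main()` throws
}
catch (err) {
	caught = err;           // the error propagates out of `it.next(..)`
}

var after = it.next();      // { value: undefined, done: true }
```

Either way, an exception that escapes the generator leaves it in a completed state, which is why the last `next()` call reports `done: true`.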
+ +We can even `catch` the same error that we `throw(..)` into the generator, essentially giving the generator a chance to handle it but if it doesn't, the *iterator* code must handle it: + +```js +function *main() { + var x = yield "Hello World"; + + // never gets here + console.log( x ); +} + +var it = main(); + +it.next(); + +try { + // will `*main()` handle this error? we'll see! + it.throw( "Oops" ); +} +catch (err) { + // nope, didn't handle it! + console.error( err ); // Oops +} +``` + +Synchronous-looking error handling (via `try..catch`) with async code is a huge win for readability and reason-ability. + +## Generators + Promises + +In our previous discussion, we showed how generators can be iterated asynchronously, which is a huge step forward in sequential reason-ability over the spaghetti mess of callbacks. But we lost something very important: the trustability and composability of Promises (see Chapter 3)! + +Don't worry -- we can get that back. The best of all worlds in ES6 is to combine generators (synchronous-looking async code) with Promises (trustable and composable). + +But how? + +Recall from Chapter 3 the Promise-based approach to our running Ajax example: + +```js +function foo(x,y) { + return request( + "http://some.url.1/?x=" + x + "&y=" + y + ); +} + +foo( 11, 31 ) +.then( + function(text){ + console.log( text ); + }, + function(err){ + console.error( err ); + } +); +``` + +In our earlier generator code for the running Ajax example, `foo(..)` returned nothing (`undefined`), and our *iterator* control code didn't care about that `yield`ed value. + +But here the Promise-aware `foo(..)` returns a promise after making the Ajax call. That suggests that we could construct a promise with `foo(..)` and then `yield` it from the generator, and then the *iterator* control code would receive that promise. + +But what should the *iterator* do with the promise? 
+ +It should listen for the promise to resolve (fulfillment or rejection), and then either resume the generator with the fulfillment message or throw an error into the generator with the rejection reason. + +Let me repeat that, because it's so important. The natural way to get the most out of Promises and generators is **to `yield` a Promise**, and wire that Promise to control the generator's *iterator*. + +Let's give it a try! First, we'll put the Promise-aware `foo(..)` together with the generator `*main()`: + +```js +function foo(x,y) { + return request( + "http://some.url.1/?x=" + x + "&y=" + y + ); +} + +function *main() { + try { + var text = yield foo( 11, 31 ); + console.log( text ); + } + catch (err) { + console.error( err ); + } +} +``` + +The most powerful revelation in this refactor is that the code inside `*main()` **did not have to change at all!** Inside the generator, whatever values are `yield`ed out is just an opaque implementation detail, so we're not even aware it's happening, nor do we need to worry about it. + +But how are we going to run `*main()` now? We still have some of the implementation plumbing work to do, to receive and wire up the `yield`ed promise so that it resumes the generator upon resolution. We'll start by trying that manually: + +```js +var it = main(); + +var p = it.next().value; + +// wait for the `p` promise to resolve +p.then( + function(text){ + it.next( text ); + }, + function(err){ + it.throw( err ); + } +); +``` + +Actually, that wasn't so painful at all, was it? + +This snippet should look very similar to what we did earlier with the manually wired generator controlled by the error-first callback. Instead of an `if (err) { it.throw..`, the promise already splits fulfillment (success) and rejection (failure) for us, but otherwise the *iterator* control is identical. + +Now, we've glossed over some important details. 
+ +Most importantly, we took advantage of the fact that we knew that `*main()` only had one Promise-aware step in it. What if we wanted to be able to Promise-drive a generator no matter how many steps it has? We certainly don't want to manually write out the Promise chain differently for each generator! What would be much nicer is if there was a way to repeat (aka "loop" over) the iteration control, and each time a Promise comes out, wait on its resolution before continuing. + +Also, what if the generator throws out an error (intentionally or accidentally) during the `it.next(..)` call? Should we quit, or should we `catch` it and send it right back in? Similarly, what if we `it.throw(..)` a Promise rejection into the generator, but it's not handled, and comes right back out? + +### Promise-Aware Generator Runner + +The more you start to explore this path, the more you realize, "wow, it'd be great if there was just some utility to do it for me." And you're absolutely correct. This is such an important pattern, and you don't want to get it wrong (or exhaust yourself repeating it over and over), so your best bet is to use a utility that is specifically designed to *run* Promise-`yield`ing generators in the manner we've illustrated. + +Several Promise abstraction libraries provide just such a utility, including my *asynquence* library and its `runner(..)`, which will be discussed in Appendix A of this book. + +But for the sake of learning and illustration, let's just define our own standalone utility that we'll call `run(..)`: + +```js +// thanks to Benjamin Gruenbaum (@benjamingr on GitHub) for +// big improvements here! 
+function run(gen) { + var args = [].slice.call( arguments, 1), it; + + // initialize the generator in the current context + it = gen.apply( this, args ); + + // return a promise for the generator completing + return Promise.resolve() + .then( function handleNext(value){ + // run to the next yielded value + var next = it.next( value ); + + return (function handleResult(next){ + // generator has completed running? + if (next.done) { + return next.value; + } + // otherwise keep going + else { + return Promise.resolve( next.value ) + .then( + // resume the async loop on + // success, sending the resolved + // value back into the generator + handleNext, + + // if `value` is a rejected + // promise, propagate error back + // into the generator for its own + // error handling + function handleErr(err) { + return Promise.resolve( + it.throw( err ) + ) + .then( handleResult ); + } + ); + } + })(next); + } ); +} +``` + +As you can see, it's quite a bit more complex than you'd probably want to author yourself, and you especially wouldn't want to repeat this code for each generator you use. So, a utility/library helper is definitely the way to go. Nevertheless, I encourage you to spend a few minutes studying that code listing to get a better sense of how to manage the generator+Promise negotiation. + +How would you use `run(..)` with `*main()` in our *running* Ajax example? + +```js +function *main() { + // .. +} + +run( main ); +``` + +That's it! The way we wired `run(..)`, it will automatically advance the generator you pass to it, asynchronously until completion. + +**Note:** The `run(..)` we defined returns a promise which is wired to resolve once the generator is complete, or receive an uncaught exception if the generator doesn't handle it. We don't show that capability here, but we'll come back to it later in the chapter. + +#### ES7: `async` and `await`? 
+ +The preceding pattern -- generators yielding Promises that then control the generator's *iterator* to advance it to completion -- is such a powerful and useful approach, it would be nicer if we could do it without the clutter of the library utility helper (aka `run(..)`). + +There's probably good news on that front. At the time of this writing, there's early but strong support for a proposal for more syntactic addition in this realm for the post-ES6, ES7-ish timeframe. Obviously, it's too early to guarantee the details, but there's a pretty decent chance it will shake out similar to the following: + +```js +function foo(x,y) { + return request( + "http://some.url.1/?x=" + x + "&y=" + y + ); +} + +async function main() { + try { + var text = await foo( 11, 31 ); + console.log( text ); + } + catch (err) { + console.error( err ); + } +} + +main(); +``` + +As you can see, there's no `run(..)` call (meaning no need for a library utility!) to invoke and drive `main()` -- it's just called as a normal function. Also, `main()` isn't declared as a generator function anymore; it's a new kind of function: `async function`. And finally, instead of `yield`ing a Promise, we `await` for it to resolve. + +The `async function` automatically knows what to do if you `await` a Promise -- it will pause the function (just like with generators) until the Promise resolves. We didn't illustrate it in this snippet, but calling an async function like `main()` automatically returns a promise that's resolved whenever the function finishes completely. + +**Tip:** The `async` / `await` syntax should look very familiar to readers with experience in C#, because it's basically identical. + +The proposal essentially codifies support for the pattern we've already derived, into a syntactic mechanism: combining Promises with sync-looking flow control code. That's the best of both worlds combined, to effectively address practically all of the major concerns we outlined with callbacks. 
+ +The mere fact that such an ES7-ish proposal already exists and has early support and enthusiasm is a major vote of confidence in the future importance of this async pattern. + +### Promise Concurrency in Generators + +So far, all we've demonstrated is a single-step async flow with Promises+generators. But real-world code will often have many async steps. + +If you're not careful, the sync-looking style of generators may lull you into complacency with how you structure your async concurrency, leading to suboptimal performance patterns. So we want to spend a little time exploring the options. + +Imagine a scenario where you need to fetch data from two different sources, then combine those responses to make a third request, and finally print out the last response. We explored a similar scenario with Promises in Chapter 3, but let's reconsider it in the context of generators. + +Your first instinct might be something like: + +```js +function *foo() { + var r1 = yield request( "http://some.url.1" ); + var r2 = yield request( "http://some.url.2" ); + + var r3 = yield request( + "http://some.url.3/?v=" + r1 + "," + r2 + ); + + console.log( r3 ); +} + +// use previously defined `run(..)` utility +run( foo ); +``` + +This code will work, but in the specifics of our scenario, it's not optimal. Can you spot why? + +Because the `r1` and `r2` requests can -- and for performance reasons, *should* -- run concurrently, but in this code they will run sequentially; the `"http://some.url.2"` URL isn't Ajax fetched until after the `"http://some.url.1"` request is finished. These two requests are independent, so the better performance approach would likely be to have them run at the same time. + +But how exactly would you do that with a generator and `yield`? We know that `yield` is only a single pause point in the code, so you can't really do two pauses at the same time. 
+ +The most natural and effective answer is to base the async flow on Promises, specifically on their capability to manage state in a time-independent fashion (see "Future Value" in Chapter 3). + +The simplest approach: + +```js +function *foo() { + // make both requests "in parallel" + var p1 = request( "http://some.url.1" ); + var p2 = request( "http://some.url.2" ); + + // wait until both promises resolve + var r1 = yield p1; + var r2 = yield p2; + + var r3 = yield request( + "http://some.url.3/?v=" + r1 + "," + r2 + ); + + console.log( r3 ); +} + +// use previously defined `run(..)` utility +run( foo ); +``` + +Why is this different from the previous snippet? Look at where the `yield` is and is not. `p1` and `p2` are promises for Ajax requests made concurrently (aka "in parallel"). It doesn't matter which one finishes first, because promises will hold onto their resolved state for as long as necessary. + +Then we use two subsequent `yield` statements to wait for and retrieve the resolutions from the promises (into `r1` and `r2`, respectively). If `p1` resolves first, the `yield p1` resumes first then waits on the `yield p2` to resume. If `p2` resolves first, it will just patiently hold onto that resolution value until asked, but the `yield p1` will hold on first, until `p1` resolves. + +Either way, both `p1` and `p2` will run concurrently, and both have to finish, in either order, before the `r3 = yield request..` Ajax request will be made. + +If that flow control processing model sounds familiar, it's basically the same as what we identified in Chapter 3 as the "gate" pattern, enabled by the `Promise.all([ .. ])` utility. 
So, we could also express the flow control like this: + +```js +function *foo() { + // make both requests "in parallel," and + // wait until both promises resolve + var results = yield Promise.all( [ + request( "http://some.url.1" ), + request( "http://some.url.2" ) + ] ); + + var r1 = results[0]; + var r2 = results[1]; + + var r3 = yield request( + "http://some.url.3/?v=" + r1 + "," + r2 + ); + + console.log( r3 ); +} + +// use previously defined `run(..)` utility +run( foo ); +``` + +**Note:** As we discussed in Chapter 3, we can even use ES6 destructuring assignment to simplify the `var r1 = .. var r2 = ..` assignments, with `var [r1,r2] = results`. + +In other words, all of the concurrency capabilities of Promises are available to us in the generator+Promise approach. So in any place where you need more than sequential this-then-that async flow control steps, Promises are likely your best bet. + +#### Promises, Hidden + +As a word of stylistic caution, be careful about how much Promise logic you include **inside your generators**. The whole point of using generators for asynchrony in the way we've described is to create simple, sequential, sync-looking code, and to hide as much of the details of asynchrony away from that code as possible. 
+ +For example, this might be a cleaner approach: + +```js +// note: normal function, not generator +function bar(url1,url2) { + return Promise.all( [ + request( url1 ), + request( url2 ) + ] ); +} + +function *foo() { + // hide the Promise-based concurrency details + // inside `bar(..)` + var results = yield bar( + "http://some.url.1", + "http://some.url.2" + ); + + var r1 = results[0]; + var r2 = results[1]; + + var r3 = yield request( + "http://some.url.3/?v=" + r1 + "," + r2 + ); + + console.log( r3 ); +} + +// use previously defined `run(..)` utility +run( foo ); +``` + +Inside `*foo()`, it's cleaner and clearer that all we're doing is just asking `bar(..)` to get us some `results`, and we'll `yield`-wait on that to happen. We don't have to care that under the covers a `Promise.all([ .. ])` Promise composition will be used to make that happen. + +**We treat asynchrony, and indeed Promises, as an implementation detail.** + +Hiding your Promise logic inside a function that you merely call from your generator is especially useful if you're going to do a sophisticated series flow-control. For example: + +```js +function bar() { + Promise.all( [ + baz( .. ) + .then( .. ), + Promise.race( [ .. ] ) + ] ) + .then( .. ) +} +``` + +That kind of logic is sometimes required, and if you dump it directly inside your generator(s), you've defeated most of the reason why you would want to use generators in the first place. We *should* intentionally abstract such details away from our generator code so that they don't clutter up the higher level task expression. + +Beyond creating code that is both functional and performant, you should also strive to make code that is as reason-able and maintainable as possible. + +**Note:** Abstraction is not *always* a healthy thing for programming -- many times it can increase complexity in exchange for terseness. But in this case, I believe it's much healthier for your generator+Promise async code than the alternatives. 
As with all such advice, though, pay attention to your specific situations and make proper decisions for you and your team. + +## Generator Delegation + +In the previous section, we showed calling regular functions from inside a generator, and how that remains a useful technique for abstracting away implementation details (like async Promise flow). But the main drawback of using a normal function for this task is that it has to behave by the normal function rules, which means it cannot pause itself with `yield` like a generator can. + +It may then occur to you that you might try to call one generator from another generator, using our `run(..)` helper, such as: + +```js +function *foo() { + var r2 = yield request( "http://some.url.2" ); + var r3 = yield request( "http://some.url.3/?v=" + r2 ); + + return r3; +} + +function *bar() { + var r1 = yield request( "http://some.url.1" ); + + // "delegating" to `*foo()` via `run(..)` + var r3 = yield run( foo ); + + console.log( r3 ); +} + +run( bar ); +``` + +We run `*foo()` inside of `*bar()` by using our `run(..)` utility again. We take advantage here of the fact that the `run(..)` we defined earlier returns a promise which is resolved when its generator is run to completion (or errors out), so if we `yield` out to a `run(..)` instance the promise from another `run(..)` call, it automatically pauses `*bar()` until `*foo()` finishes. + +But there's an even better way to integrate calling `*foo()` into `*bar()`, and it's called `yield`-delegation. The special syntax for `yield`-delegation is: `yield * __` (notice the extra `*`). Before we see it work in our previous example, let's look at a simpler scenario: + +```js +function *foo() { + console.log( "`*foo()` starting" ); + yield 3; + yield 4; + console.log( "`*foo()` finished" ); +} + +function *bar() { + yield 1; + yield 2; + yield *foo(); // `yield`-delegation! 
+ yield 5; +} + +var it = bar(); + +it.next().value; // 1 +it.next().value; // 2 +it.next().value; // `*foo()` starting + // 3 +it.next().value; // 4 +it.next().value; // `*foo()` finished + // 5 +``` + +**Note:** Similar to a note earlier in the chapter where I explained why I prefer `function *foo() ..` instead of `function* foo() ..`, I also prefer -- differing from most other documentation on the topic -- to say `yield *foo()` instead of `yield* foo()`. The placement of the `*` is purely stylistic and up to your best judgment. But I find the consistency of styling attractive. + +How does the `yield *foo()` delegation work? + +First, calling `foo()` creates an *iterator* exactly as we've already seen. Then, `yield *` delegates/transfers the *iterator* instance control (of the present `*bar()` generator) over to this other `*foo()` *iterator*. + +So, the first two `it.next()` calls are controlling `*bar()`, but when we make the third `it.next()` call, now `*foo()` starts up, and now we're controlling `*foo()` instead of `*bar()`. That's why it's called delegation -- `*bar()` delegated its iteration control to `*foo()`. + +As soon as the `it` *iterator* control exhausts the entire `*foo()` *iterator*, it automatically returns to controlling `*bar()`. + +So now back to the previous example with the three sequential Ajax requests: + +```js +function *foo() { + var r2 = yield request( "http://some.url.2" ); + var r3 = yield request( "http://some.url.3/?v=" + r2 ); + + return r3; +} + +function *bar() { + var r1 = yield request( "http://some.url.1" ); + + // "delegating" to `*foo()` via `yield*` + var r3 = yield *foo(); + + console.log( r3 ); +} + +run( bar ); +``` + +The only difference between this snippet and the version used earlier is the use of `yield *foo()` instead of the previous `yield run(foo)`. + +**Note:** `yield *` yields iteration control, not generator control; when you invoke the `*foo()` generator, you're now `yield`-delegating to its *iterator*. 
But you can actually `yield`-delegate to any *iterable*; `yield *[1,2,3]` would consume the default *iterator* for the `[1,2,3]` array value. + +### Why Delegation? + +The purpose of `yield`-delegation is mostly code organization, and in that way is symmetrical with normal function calling. + +Imagine two modules that respectively provide methods `foo()` and `bar()`, where `bar()` calls `foo()`. The reason the two are separate is generally because the proper organization of code for the program calls for them to be in separate functions. For example, there may be cases where `foo()` is called standalone, and other places where `bar()` calls `foo()`. + +For all these exact same reasons, keeping generators separate aids in program readability, maintenance, and debuggability. In that respect, `yield *` is a syntactic shortcut for manually iterating over the steps of `*foo()` while inside of `*bar()`. + +Such a manual approach would be especially complex if the steps in `*foo()` were asynchronous, which is why you'd probably need to use that `run(..)` utility to do it. And as we've shown, `yield *foo()` eliminates the need for a sub-instance of the `run(..)` utility (like `run(foo)`). + +### Delegating Messages + +You may wonder how this `yield`-delegation works not just with *iterator* control but with the two-way message passing. Carefully follow the flow of messages in and out, through the `yield`-delegation: + +```js +function *foo() { + console.log( "inside `*foo()`:", yield "B" ); + + console.log( "inside `*foo()`:", yield "C" ); + + return "D"; +} + +function *bar() { + console.log( "inside `*bar()`:", yield "A" ); + + // `yield`-delegation! 
+ console.log( "inside `*bar()`:", yield *foo() ); + + console.log( "inside `*bar()`:", yield "E" ); + + return "F"; +} + +var it = bar(); + +console.log( "outside:", it.next().value ); +// outside: A + +console.log( "outside:", it.next( 1 ).value ); +// inside `*bar()`: 1 +// outside: B + +console.log( "outside:", it.next( 2 ).value ); +// inside `*foo()`: 2 +// outside: C + +console.log( "outside:", it.next( 3 ).value ); +// inside `*foo()`: 3 +// inside `*bar()`: D +// outside: E + +console.log( "outside:", it.next( 4 ).value ); +// inside `*bar()`: 4 +// outside: F +``` + +Pay particular attention to the processing steps after the `it.next(3)` call: + +1. The `3` value is passed (through the `yield`-delegation in `*bar()`) into the waiting `yield "C"` expression inside of `*foo()`. +2. `*foo()` then calls `return "D"`, but this value doesn't get returned all the way back to the outside `it.next(3)` call. +3. Instead, the `"D"` value is sent as the result of the waiting `yield *foo()` expression inside of `*bar()` -- this `yield`-delegation expression has essentially been paused while all of `*foo()` was exhausted. So `"D"` ends up inside of `*bar()` for it to print out. +4. `yield "E"` is called inside of `*bar()`, and the `"E"` value is yielded to the outside as the result of the `it.next(3)` call. + +From the perspective of the external *iterator* (`it`), it doesn't appear any differently between controlling the initial generator or a delegated one. + +In fact, `yield`-delegation doesn't even have to be directed to another generator; it can just be directed to a non-generator, general *iterable*. For example: + +```js +function *bar() { + console.log( "inside `*bar()`:", yield "A" ); + + // `yield`-delegation to a non-generator! 
+ console.log( "inside `*bar()`:", yield *[ "B", "C", "D" ] ); + + console.log( "inside `*bar()`:", yield "E" ); + + return "F"; +} + +var it = bar(); + +console.log( "outside:", it.next().value ); +// outside: A + +console.log( "outside:", it.next( 1 ).value ); +// inside `*bar()`: 1 +// outside: B + +console.log( "outside:", it.next( 2 ).value ); +// outside: C + +console.log( "outside:", it.next( 3 ).value ); +// outside: D + +console.log( "outside:", it.next( 4 ).value ); +// inside `*bar()`: undefined +// outside: E + +console.log( "outside:", it.next( 5 ).value ); +// inside `*bar()`: 5 +// outside: F +``` + +Notice the differences in where the messages were received/reported between this example and the one previous. + +Most strikingly, the default `array` *iterator* doesn't care about any messages sent in via `next(..)` calls, so the values `2`, `3`, and `4` are essentially ignored. Also, because that *iterator* has no explicit `return` value (unlike the previously used `*foo()`), the `yield *` expression gets an `undefined` when it finishes. + +#### Exceptions Delegated, Too! + +In the same way that `yield`-delegation transparently passes messages through in both directions, errors/exceptions also pass in both directions: + +```js +function *foo() { + try { + yield "B"; + } + catch (err) { + console.log( "error caught inside `*foo()`:", err ); + } + + yield "C"; + + throw "D"; +} + +function *bar() { + yield "A"; + + try { + yield *foo(); + } + catch (err) { + console.log( "error caught inside `*bar()`:", err ); + } + + yield "E"; + + yield *baz(); + + // note: can't get here! 
+ yield "G"; +} + +function *baz() { + throw "F"; +} + +var it = bar(); + +console.log( "outside:", it.next().value ); +// outside: A + +console.log( "outside:", it.next( 1 ).value ); +// outside: B + +console.log( "outside:", it.throw( 2 ).value ); +// error caught inside `*foo()`: 2 +// outside: C + +console.log( "outside:", it.next( 3 ).value ); +// error caught inside `*bar()`: D +// outside: E + +try { + console.log( "outside:", it.next( 4 ).value ); +} +catch (err) { + console.log( "error caught outside:", err ); +} +// error caught outside: F +``` + +Some things to note from this snippet: + +1. When we call `it.throw(2)`, it sends the error message `2` into `*bar()`, which delegates that to `*foo()`, which then `catch`es it and handles it gracefully. Then, the `yield "C"` sends `"C"` back out as the return `value` from the `it.throw(2)` call. +2. The `"D"` value that's next `throw`n from inside `*foo()` propagates out to `*bar()`, which `catch`es it and handles it gracefully. Then the `yield "E"` sends `"E"` back out as the return `value` from the `it.next(3)` call. +3. Next, the exception `throw`n from `*baz()` isn't caught in `*bar()` -- though we did `catch` it outside -- so both `*baz()` and `*bar()` are set to a completed state. After this snippet, you would not be able to get the `"G"` value out with any subsequent `next(..)` call(s) -- they will just return `undefined` for `value`. + +### Delegating Asynchrony + +Let's finally get back to our earlier `yield`-delegation example with the multiple sequential Ajax requests: + +```js +function *foo() { + var r2 = yield request( "http://some.url.2" ); + var r3 = yield request( "http://some.url.3/?v=" + r2 ); + + return r3; +} + +function *bar() { + var r1 = yield request( "http://some.url.1" ); + + var r3 = yield *foo(); + + console.log( r3 ); +} + +run( bar ); +``` + +Instead of calling `yield run(foo)` inside of `*bar()`, we just call `yield *foo()`. 
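The way a `yield *` expression hands back the delegated generator's `return` value can be seen without any Ajax or `run(..)` at all. This is only a synchronous sketch; `*inner()` and `*outer()` are hypothetical names standing in for `*foo()` and `*bar()`.

```js
// hypothetical stand-in for `*foo()`
function *inner() {
	var x = yield "waiting";
	return x * 2;                   // becomes the value of `yield *inner()`
}

// hypothetical stand-in for `*bar()`
function *outer() {
	var result = yield *inner();    // receives `inner`'s return value
	return result + 1;
}

var it = outer();

var step1 = it.next();      // { value: "waiting", done: false }
var step2 = it.next( 20 );  // { value: 41, done: true }
```

Note that `40` never shows up outside: the `return` from `*inner()` goes only to the `yield *` expression, and the outside sees just what `*outer()` itself yields or returns.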
+ +In the previous version of this example, the Promise mechanism (controlled by `run(..)`) was used to transport the value from `return r3` in `*foo()` to the local variable `r3` inside `*bar()`. Now, that value is just returned back directly via the `yield *` mechanics. + +Otherwise, the behavior is pretty much identical. + +### Delegating "Recursion" + +Of course, `yield`-delegation can keep following as many delegation steps as you wire up. You could even use `yield`-delegation for async-capable generator "recursion" -- a generator `yield`-delegating to itself: + +```js +function *foo(val) { + if (val > 1) { + // generator recursion + val = yield *foo( val - 1 ); + } + + return yield request( "http://some.url/?v=" + val ); +} + +function *bar() { + var r1 = yield *foo( 3 ); + console.log( r1 ); +} + +run( bar ); +``` + +**Note:** Our `run(..)` utility could have been called with `run( foo, 3 )`, because it supports additional parameters being passed along to the initialization of the generator. However, we used a parameter-free `*bar()` here to highlight the flexibility of `yield *`. + +What processing steps follow from that code? Hang on, this is going to be quite intricate to describe in detail: + +1. `run(bar)` starts up the `*bar()` generator. +2. `foo(3)` creates an *iterator* for `*foo(..)` and passes `3` as its `val` parameter. +3. Because `3 > 1`, `foo(2)` creates another *iterator* and passes in `2` as its `val` parameter. +4. Because `2 > 1`, `foo(1)` creates yet another *iterator* and passes in `1` as its `val` parameter. +5. `1 > 1` is `false`, so we next call `request(..)` with the `1` value, and get a promise back for that first Ajax call. +6. That promise is `yield`ed out, which comes back to the `*foo(2)` generator instance. +7. The `yield *` passes that promise back out to the `*foo(3)` generator instance. Another `yield *` passes the promise out to the `*bar()` generator instance. 
And yet again another `yield *` passes the promise out to the `run(..)` utility, which will wait on that promise (for the first Ajax request) to proceed. +8. When the promise resolves, its fulfillment message is sent to resume `*bar()`, which passes through the `yield *` into the `*foo(3)` instance, which then passes through the `yield *` to the `*foo(2)` generator instance, which then passes through the `yield *` to the normal `yield` that's waiting in the `*foo(1)` generator instance. +9. That first call's Ajax response is now immediately `return`ed from the `*foo(1)` generator instance, which sends that value back as the result of the `yield *` expression in the `*foo(2)` instance, where it is assigned to that instance's local `val` variable. +10. Inside `*foo(2)`, a second Ajax request is made with `request(..)`, whose promise is `yield`ed back to the `*foo(3)` instance, and then `yield *` propagates all the way out to `run(..)` (step 7 again). When the promise resolves, the second Ajax response propagates all the way back into the `*foo(2)` generator instance, which `return`s it as the result of the `yield *` expression in the `*foo(3)` instance, where it is assigned to that instance's local `val` variable. +11. Finally, inside `*foo(3)`, the third Ajax request is made with `request(..)`, its promise goes out to `run(..)`, and then its resolution value comes all the way back, which is then `return`ed so that it comes back to the waiting `yield *` expression in `*bar()`. + +Phew! A lot of crazy mental juggling, huh? You might want to read through that a few more times, and then go grab a snack to clear your head! + +## Generator Concurrency + +As we discussed in both Chapter 1 and earlier in this chapter, two simultaneously running "processes" can cooperatively interleave their operations, and many times this can *yield* (pun intended) very powerful asynchrony expressions. + +Frankly, our earlier examples of concurrency interleaving of multiple generators showed how to make it really confusing. But we hinted that there are places where this capability is quite useful. 
+ +Recall a scenario we looked at in Chapter 1, where two different simultaneous Ajax response handlers needed to coordinate with each other to make sure that the data communication was not a race condition. We slotted the responses into the `res` array like this: + +```js +function response(data) { + if (data.url == "http://some.url.1") { + res[0] = data; + } + else if (data.url == "http://some.url.2") { + res[1] = data; + } +} +``` + +But how can we use multiple generators concurrently for this scenario? + +```js +// `request(..)` is a Promise-aware Ajax utility + +var res = []; + +function *reqData(url) { + res.push( + yield request( url ) + ); +} +``` + +**Note:** We're going to use two instances of the `*reqData(..)` generator here, but there's no difference to running a single instance of two different generators; both approaches are reasoned about identically. We'll see two different generators coordinating in just a bit. + +Instead of having to manually sort out `res[0]` and `res[1]` assignments, we'll use coordinated ordering so that `res.push(..)` properly slots the values in the expected and predictable order. The expressed logic thus should feel a bit cleaner. + +But how will we actually orchestrate this interaction? First, let's just do it manually, with Promises: + +```js +var it1 = reqData( "http://some.url.1" ); +var it2 = reqData( "http://some.url.2" ); + +var p1 = it1.next().value; +var p2 = it2.next().value; + +p1 +.then( function(data){ + it1.next( data ); + return p2; +} ) +.then( function(data){ + it2.next( data ); +} ); +``` + +`*reqData(..)`'s two instances are both started to make their Ajax requests, then paused with `yield`. Then we choose to resume the first instance when `p1` resolves, and then `p2`'s resolution will restart the second instance. In this way, we use Promise orchestration to ensure that `res[0]` will have the first response and `res[1]` will have the second response. 
+ +But frankly, this is awfully manual, and it doesn't really let the generators orchestrate themselves, which is where the true power can lie. Let's try it a different way: + +```js +// `request(..)` is a Promise-aware Ajax utility + +var res = []; + +function *reqData(url) { + var data = yield request( url ); + + // transfer control + yield; + + res.push( data ); +} + +var it1 = reqData( "http://some.url.1" ); +var it2 = reqData( "http://some.url.2" ); + +var p1 = it1.next().value; +var p2 = it2.next().value; + +p1.then( function(data){ + it1.next( data ); +} ); + +p2.then( function(data){ + it2.next( data ); +} ); + +Promise.all( [p1,p2] ) +.then( function(){ + it1.next(); + it2.next(); +} ); +``` + +OK, this is a bit better (though still manual!), because now the two instances of `*reqData(..)` run truly concurrently, and (at least for the first part) independently. + +In the previous snippet, the second instance was not given its data until after the first instance was totally finished. But here, both instances receive their data as soon as their respective responses come back, and then each instance does another `yield` for control transfer purposes. We then choose what order to resume them in the `Promise.all([ .. ])` handler. + +What may not be as obvious is that this approach hints at an easier form for a reusable utility, because of the symmetry. We can do even better. Let's imagine using a utility called `runAll(..)`: + +```js +// `request(..)` is a Promise-aware Ajax utility + +var res = []; + +runAll( + function*(){ + var p1 = request( "http://some.url.1" ); + + // transfer control + yield; + + res.push( yield p1 ); + }, + function*(){ + var p2 = request( "http://some.url.2" ); + + // transfer control + yield; + + res.push( yield p2 ); + } +); +``` + +**Note:** We're not including a code listing for `runAll(..)` as it is not only long enough to bog down the text, but is an extension of the logic we've already implemented in `run(..)` earlier. 
So, as a good supplementary exercise for the reader, try your hand at evolving the code from `run(..)` to work like the imagined `runAll(..)`. Also, my *asynquence* library provides a previously mentioned `runner(..)` utility with this kind of capability already built in, and will be discussed in Appendix A of this book. + +Here's how the processing inside `runAll(..)` would operate: + +1. The first generator gets a promise for the first Ajax response from `"http://some.url.1"`, then `yield`s control back to the `runAll(..)` utility. +2. The second generator runs and does the same for `"http://some.url.2"`, `yield`ing control back to the `runAll(..)` utility. +3. The first generator resumes, and then `yield`s out its promise `p1`. The `runAll(..)` utility does the same in this case as our previous `run(..)`, in that it waits on that promise to resolve, then resumes the same generator (no control transfer!). When `p1` resolves, `runAll(..)` resumes the first generator again with that resolution value, and then `res[0]` is given its value. When the first generator then finishes, that's an implicit transfer of control. +4. The second generator resumes, `yield`s out its promise `p2`, and waits for it to resolve. Once it does, `runAll(..)` resumes the second generator with that value, and `res[1]` is set. + +In this running example, we use an outer variable called `res` to store the results of the two different Ajax responses -- that's our concurrency coordination making that possible. + +But it might be quite helpful to further extend `runAll(..)` to provide an inner variable space for the multiple generator instances to *share*, such as an empty object we'll call `data` below. Also, it could take non-Promise values that are `yield`ed and hand them off to the next generator. 
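
As a rough illustration of the behavior described above, here's one minimal sketch of how a `runAll(..)` might be put together. To be clear, this is not the *asynquence* implementation, and several details are our own assumptions: the one-slot `inbox` kept per generator for message passing, the choice to ignore `undefined` messages, resolving the returned promise with the shared `data` object, and the stubbed `request(..)` that fakes Ajax with timers.

```js
// NOTE: a hypothetical sketch, not the asynquence `runner(..)` implementation

// stand-in for the Promise-aware Ajax utility; timers fake the network,
// and the first URL deliberately responds slower than the second
function request(url) {
	return new Promise( function(resolve){
		var delay = /1$/.test( url ) ? 30 : 10;
		setTimeout( function(){
			resolve( "response from " + url );
		}, delay );
	} );
}

function runAll() {
	// shared variable space, passed to every generator
	var data = {};

	// one record per generator: its iterator, whether it has been
	// started, and a one-slot `inbox` (our own invention)
	var procs = [].slice.call( arguments ).map( function(gen){
		return { it: gen( data ), started: false, inbox: undefined };
	} );

	return new Promise( function(resolve,reject){
		var idx = 0;

		// resume the current generator, sending `val` in
		function resume(val) {
			var res;
			try {
				res = procs[idx].it.next( val );
				procs[idx].started = true;
			}
			catch (err) {
				return reject( err );
			}
			handle( res );
		}

		function handle(res) {
			if (res.done) {
				// finished: implicit transfer of control
				procs.splice( idx, 1 );
				if (procs.length == 0) return resolve( data );
				idx = idx % procs.length;
				return transfer();
			}

			var v = res.value;
			if (v != null && typeof v.then == "function") {
				// promise: wait, then resume the *same* generator
				return Promise.resolve( v ).then( resume, function(err){
					try { handle( procs[idx].it.throw( err ) ); }
					catch (e) { reject( e ); }
				} );
			}

			// anything else: transfer control to the next generator,
			// leaving any (non-`undefined`) message in its inbox
			idx = (idx + 1) % procs.length;
			if (v !== undefined) procs[idx].inbox = v;
			transfer();
		}

		// hand control to the current generator; a generator that hasn't
		// started yet can't receive a message on its first `next(..)`,
		// so its inbox is kept for the following resume
		function transfer() {
			var proc = procs[idx], msg;
			if (proc.started) {
				msg = proc.inbox;
				proc.inbox = undefined;
			}
			resume( msg );
		}

		transfer();
	} );
}
```

With this sketch, a promise that is `yield`ed keeps control in the same generator, while any other `yield`ed value moves control along in round-robin order. That ordering, not the timing of the responses, is what decides where each `push(..)` lands.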
+ +Consider: + +```js +// `request(..)` is a Promise-aware Ajax utility + +runAll( + function*(data){ + data.res = []; + + // transfer control (and message pass) + var url1 = yield "http://some.url.2"; + + var p1 = request( url1 ); // "http://some.url.1" + + // transfer control + yield; + + data.res.push( yield p1 ); + }, + function*(data){ + // transfer control (and message pass) + var url2 = yield "http://some.url.1"; + + var p2 = request( url2 ); // "http://some.url.2" + + // transfer control + yield; + + data.res.push( yield p2 ); + } +); +``` + +In this formulation, the two generators are not just coordinating control transfer, but actually communicating with each other, both through `data.res` and the `yield`ed messages that trade `url1` and `url2` values. That's incredibly powerful! + +Such a realization also serves as a conceptual base for a more sophisticated asynchrony technique called CSP (Communicating Sequential Processes), which we will cover in Appendix B of this book. + +## Thunks + +So far, we've made the assumption that `yield`ing a Promise from a generator -- and having that Promise resume the generator via a helper utility like `run(..)` -- was the best possible way to manage asynchrony with generators. To be clear, it is. + +But we skipped over another pattern that has some mildly widespread adoption, so in the interest of completeness we'll take a brief look at it. + +In general computer science, there's an old pre-JS concept called a "thunk." Without getting bogged down in the historical nature, a narrow expression of a thunk in JS is a function that -- without any parameters -- is wired to call another function. + +In other words, you wrap a function definition around a function call -- with any parameters it needs -- to *defer* the execution of that call, and that wrapping function is a thunk. When you later execute the thunk, you end up calling the original function. 
+ +For example: + +```js +function foo(x,y) { + return x + y; +} + +function fooThunk() { + return foo( 3, 4 ); +} + +// later + +console.log( fooThunk() ); // 7 +``` + +So, a synchronous thunk is pretty straightforward. But what about an async thunk? We can essentially extend the narrow thunk definition to include it receiving a callback. + +Consider: + +```js +function foo(x,y,cb) { + setTimeout( function(){ + cb( x + y ); + }, 1000 ); +} + +function fooThunk(cb) { + foo( 3, 4, cb ); +} + +// later + +fooThunk( function(sum){ + console.log( sum ); // 7 +} ); +``` + +As you can see, `fooThunk(..)` only expects a `cb(..)` parameter, as it already has values `3` and `4` (for `x` and `y`, respectively) pre-specified and ready to pass to `foo(..)`. A thunk is just waiting around patiently for the last piece it needs to do its job: the callback. + +You don't want to make thunks manually, though. So, let's invent a utility that does this wrapping for us. + +Consider: + +```js +function thunkify(fn) { + var args = [].slice.call( arguments, 1 ); + return function(cb) { + args.push( cb ); + return fn.apply( null, args ); + }; +} + +var fooThunk = thunkify( foo, 3, 4 ); + +// later + +fooThunk( function(sum) { + console.log( sum ); // 7 +} ); +``` + +**Tip:** Here we assume that the original (`foo(..)`) function signature expects its callback in the last position, with any other parameters coming before it. This is a pretty ubiquitous "standard" for async JS function signatures. You might call it "callback-last style." If for some reason you had a need to handle "callback-first style" signatures, you would just make a utility that used `args.unshift(..)` instead of `args.push(..)`. + +The preceding formulation of `thunkify(..)` takes both the `foo(..)` function reference, and any parameters it needs, and returns the thunk itself (`fooThunk(..)`). However, that's not the typical approach you'll find to thunks in JS. 
+ +Instead of `thunkify(..)` making the thunk itself, typically -- if not perplexingly -- the `thunkify(..)` utility would produce a function that produces thunks. + +Uhhhh... yeah. + +Consider: + +```js +function thunkify(fn) { + return function() { + var args = [].slice.call( arguments ); + return function(cb) { + args.push( cb ); + return fn.apply( null, args ); + }; + }; +} +``` + +The main difference here is the extra `return function() { .. }` layer. Here's how its usage differs: + +```js +var whatIsThis = thunkify( foo ); + +var fooThunk = whatIsThis( 3, 4 ); + +// later + +fooThunk( function(sum) { + console.log( sum ); // 7 +} ); +``` + +Obviously, the big question this snippet implies is what is `whatIsThis` properly called? It's not the thunk, it's the thing that will produce thunks from `foo(..)` calls. It's kind of like a "factory" for "thunks." There doesn't seem to be any kind of standard agreement for naming such a thing. + +So, my proposal is "thunkory" ("thunk" + "factory"). So, `thunkify(..)` produces a thunkory, and a thunkory produces thunks. That reasoning is symmetric to my proposal for "promisory" in Chapter 3: + +```js +var fooThunkory = thunkify( foo ); + +var fooThunk1 = fooThunkory( 3, 4 ); +var fooThunk2 = fooThunkory( 5, 6 ); + +// later + +fooThunk1( function(sum) { + console.log( sum ); // 7 +} ); + +fooThunk2( function(sum) { + console.log( sum ); // 11 +} ); +``` + +**Note:** The running `foo(..)` example expects a style of callback that's not "error-first style." Of course, "error-first style" is much more common. If `foo(..)` had some sort of legitimate error-producing expectation, we could change it to expect and use an error-first callback. None of the subsequent `thunkify(..)` machinery cares what style of callback is assumed. The only difference in usage would be `fooThunk1(function(err,sum){..`. 
+ +Exposing the thunkory method -- instead of how the earlier `thunkify(..)` hides this intermediary step -- may seem like an unnecessary complication. But in general, it's quite useful to make thunkories at the beginning of your program to wrap existing API methods, and then be able to pass around and call those thunkories when you need thunks. The two distinct steps preserve a cleaner separation of capability. + +To illustrate: + +```js +// cleaner: +var fooThunkory = thunkify( foo ); + +var fooThunk1 = fooThunkory( 3, 4 ); +var fooThunk2 = fooThunkory( 5, 6 ); + +// instead of: +var fooThunk1 = thunkify( foo, 3, 4 ); +var fooThunk2 = thunkify( foo, 5, 6 ); +``` + +Regardless of whether you like to deal with the thunkories explicitly or not, the usage of thunks `fooThunk1(..)` and `fooThunk2(..)` remains the same. + +### s/promise/thunk/ + +So what does all this thunk stuff have to do with generators? + +Comparing thunks to promises generally: they're not directly interchangeable as they're not equivalent in behavior. Promises are vastly more capable and trustable than bare thunks. + +But in another sense, they both can be seen as a request for a value, which may be async in its answering. + +Recall from Chapter 3 we defined a utility for promisifying a function, which we called `Promise.wrap(..)` -- we could have called it `promisify(..)`, too! This Promise-wrapping utility doesn't produce Promises; it produces promisories that in turn produce Promises. This is completely symmetric to the thunkories and thunks presently being discussed. 
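
As a refresher, here's a minimal sketch of what such a promisifying utility might look like. It assumes "error-first style", "callback-last style" functions, and the `addLater(..)` function is a hypothetical stand-in invented just for this illustration. The Chapter 3 `Promise.wrap(..)` may differ in its details.

```js
// a hypothetical `promisify(..)`: takes a callback-last, error-first
// function and produces a promisory for it
function promisify(fn) {
	return function() {
		var args = [].slice.call( arguments );

		return new Promise( function(resolve,reject){
			// append our own error-first callback to the arguments
			fn.apply( null, args.concat( function(err,v){
				if (err) {
					reject( err );
				}
				else {
					resolve( v );
				}
			} ) );
		} );
	};
}

// illustration only: an async, error-first function
function addLater(x,y,cb) {
	setTimeout( function(){
		cb( null, x + y );
	}, 100 );
}

var addPromisory = promisify( addLater );

addPromisory( 3, 4 )
.then( function(sum){
	console.log( sum );		// 7
} );
```

Notice that the shape is the same as the thunkory form of `thunkify(..)`: one function call to wrap the API, and another to ask for a specific value.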
+ +To illustrate the symmetry, let's first alter the running `foo(..)` example from earlier to assume an "error-first style" callback: + +```js +function foo(x,y,cb) { + setTimeout( function(){ + // assume `cb(..)` as "error-first style" + cb( null, x + y ); + }, 1000 ); +} +``` + +Now, we'll compare using `thunkify(..)` and `promisify(..)` (aka `Promise.wrap(..)` from Chapter 3): + +```js +// symmetrical: constructing the question asker +var fooThunkory = thunkify( foo ); +var fooPromisory = promisify( foo ); + +// symmetrical: asking the question +var fooThunk = fooThunkory( 3, 4 ); +var fooPromise = fooPromisory( 3, 4 ); + +// get the thunk answer +fooThunk( function(err,sum){ + if (err) { + console.error( err ); + } + else { + console.log( sum ); // 7 + } +} ); + +// get the promise answer +fooPromise +.then( + function(sum){ + console.log( sum ); // 7 + }, + function(err){ + console.error( err ); + } +); +``` + +Both the thunkory and the promisory are essentially asking a question (for a value), and respectively the thunk `fooThunk` and promise `fooPromise` represent the future answers to that question. Presented in that light, the symmetry is clear. + +With that perspective in mind, we can see that generators which `yield` Promises for asynchrony could instead `yield` thunks for asynchrony. All we'd need is a smarter `run(..)` utility (like from before) that can not only look for and wire up to a `yield`ed Promise but also to provide a callback to a `yield`ed thunk. + +Consider: + +```js +function *foo() { + var val = yield request( "http://some.url.1" ); + console.log( val ); +} + +run( foo ); +``` + +In this example, `request(..)` could either be a promisory that returns a promise, or a thunkory that returns a thunk. From the perspective of what's going on inside the generator code logic, we don't care about that implementation detail, which is quite powerful! 
+ +So, `request(..)` could be either: + +```js +// promisory `request(..)` (see Chapter 3) +var request = Promise.wrap( ajax ); + +// vs. + +// thunkory `request(..)` +var request = thunkify( ajax ); +``` + +Finally, as a thunk-aware patch to our earlier `run(..)` utility, we would need logic like this: + +```js +// .. +// did we receive a thunk back? +else if (typeof next.value == "function") { + return new Promise( function(resolve,reject){ + // call the thunk with an error-first callback + next.value( function(err,msg) { + if (err) { + reject( err ); + } + else { + resolve( msg ); + } + } ); + } ) + .then( + handleNext, + function handleErr(err) { + return Promise.resolve( + it.throw( err ) + ) + .then( handleResult ); + } + ); +} +``` + +Now, our generators can either call promisories to `yield` Promises, or call thunkories to `yield` thunks, and in either case, `run(..)` would handle that value and use it to wait for the completion to resume the generator. + +Symmetry-wise, these two approaches look identical. However, we should point out that's true only from the perspective of Promises or thunks representing the future value continuation of a generator. + +From the larger perspective, thunks in and of themselves have hardly any of the trustability or composability guarantees that Promises are designed with. Using a thunk as a stand-in for a Promise in this particular generator asynchrony pattern is workable but should be seen as less than ideal when compared to all the benefits that Promises offer (see Chapter 3). + +If you have the option, prefer `yield pr` rather than `yield th`. But there's nothing wrong with having a `run(..)` utility which can handle both value types. + +**Note:** The `runner(..)` utility in my *asynquence* library, which will be discussed in Appendix A, handles `yield`s of Promises, thunks and *asynquence* sequences. 
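
To see the pieces working together, here's a self-contained sketch of a `run(..)` that accepts both kinds of `yield`ed values. It's a simplified restructuring for illustration rather than the exact utility from earlier in the chapter: the `toPromise(..)` helper and the `addLater(..)` function are hypothetical names of our own, and any `yield`ed function is assumed to be a thunk expecting an error-first callback.

```js
// thunkory-producing `thunkify(..)`; it uses `concat(..)` rather than
// `push(..)` so that calling a thunk twice doesn't accumulate callbacks
function thunkify(fn) {
	return function() {
		var args = [].slice.call( arguments );
		return function(cb) {
			return fn.apply( null, args.concat( cb ) );
		};
	};
}

// hypothetical helper: normalize a yielded value into a promise
function toPromise(v) {
	// assumption: any function is a thunk taking an error-first callback
	if (typeof v == "function") {
		return new Promise( function(resolve,reject){
			v( function(err,msg){
				if (err) {
					reject( err );
				}
				else {
					resolve( msg );
				}
			} );
		} );
	}
	// promises (and immediate values) pass through
	return Promise.resolve( v );
}

function run(gen) {
	var args = [].slice.call( arguments, 1 );
	var it = gen.apply( this, args );

	function handleResult(next) {
		// generator completed?
		if (next.done) {
			return next.value;
		}
		// otherwise wait, then resume with the value or the error
		return toPromise( next.value ).then(
			function(v){ return handleResult( it.next( v ) ); },
			function(err){ return handleResult( it.throw( err ) ); }
		);
	}

	return Promise.resolve().then( function(){
		return handleResult( it.next() );
	} );
}

// illustration only: an async, error-first function
function addLater(x,y,cb) {
	setTimeout( function(){
		cb( null, x + y );
	}, 10 );
}

var addThunkory = thunkify( addLater );

function *main() {
	var a = yield addThunkory( 3, 4 );			// a thunk
	var b = yield Promise.resolve( a * 2 );		// a Promise
	return a + b;
}

run( main )
.then( function(v){
	console.log( v );		// 21
} );
```

Inside `*main()`, the two `yield`s read identically, even though one waits on a thunk and the other on a Promise. The difference is absorbed entirely by `run(..)`.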
+ +## Pre-ES6 Generators + +You're hopefully convinced now that generators are a very important addition to the async programming toolbox. But it's a new syntax in ES6, which means you can't just polyfill generators like you can Promises (which are just a new API). So what can we do to bring generators to our browser JS if we don't have the luxury of ignoring pre-ES6 browsers? + +For all new syntax extensions in ES6, there are tools -- the most common term for them is transpilers, for trans-compilers -- which can take your ES6 syntax and transform it into equivalent (but obviously uglier!) pre-ES6 code. So, generators can be transpiled into code that will have the same behavior but work in ES5 and below. + +But how? The "magic" of `yield` doesn't obviously sound like code that's easy to transpile. We actually hinted at a solution in our earlier discussion of closure-based *iterators*. + +### Manual Transformation + +Before we discuss the transpilers, let's derive how manual transpilation would work in the case of generators. This isn't just an academic exercise, because doing so will actually help further reinforce how they work. + +Consider: + +```js +// `request(..)` is a Promise-aware Ajax utility + +function *foo(url) { + try { + console.log( "requesting:", url ); + var val = yield request( url ); + console.log( val ); + } + catch (err) { + console.log( "Oops:", err ); + return false; + } +} + +var it = foo( "http://some.url.1" ); +``` + +The first thing to observe is that we'll still need a normal `foo()` function that can be called, and it will still need to return an *iterator*. So, let's sketch out the non-generator transformation: + +```js +function foo(url) { + + // .. + + // make and return an iterator + return { + next: function(v) { + // .. + }, + throw: function(e) { + // .. 
+ } + }; +} + +var it = foo( "http://some.url.1" ); +``` + +The next thing to observe is that a generator does its "magic" by suspending its scope/state, but we can emulate that with function closure (see the *Scope & Closures* title of this series). To understand how to write such code, we'll first annotate different parts of our generator with state values: + +```js +// `request(..)` is a Promise-aware Ajax utility + +function *foo(url) { + // STATE *1* + + try { + console.log( "requesting:", url ); + var TMP1 = request( url ); + + // STATE *2* + var val = yield TMP1; + console.log( val ); + } + catch (err) { + // STATE *3* + console.log( "Oops:", err ); + return false; + } +} +``` + +**Note:** For more accurate illustration, we split up the `val = yield request..` statement into two parts, using the temporary `TMP1` variable. `request(..)` happens in state `*1*`, and the assignment of its completion value to `val` happens in state `*2*`. We'll get rid of that intermediate `TMP1` when we convert the code to its non-generator equivalent. + +In other words, `*1*` is the beginning state, `*2*` is the state if the `request(..)` succeeds, and `*3*` is the state if the `request(..)` fails. You can probably imagine how any extra `yield` steps would just be encoded as extra states. + +Back to our transpiled generator, let's define a variable `state` in the closure we can use to keep track of the state: + +```js +function foo(url) { + // manage generator state + var state; + + // .. 
+} +``` + +Now, let's define an inner function called `process(..)` inside the closure which handles each state, using a `switch` statement: + +```js +// `request(..)` is a Promise-aware Ajax utility + +function foo(url) { + // manage generator state + var state; + + // generator-wide variable declarations + var val; + + function process(v) { + switch (state) { + case 1: + console.log( "requesting:", url ); + return request( url ); + case 2: + val = v; + console.log( val ); + return; + case 3: + var err = v; + console.log( "Oops:", err ); + return false; + } + } + + // .. +} +``` + +Each state in our generator is represented by its own `case` in the `switch` statement. `process(..)` will be called each time we need to process a new state. We'll come back to how that works in just a moment. + +For any generator-wide variable declarations (`val`), we move those to a `var` declaration outside of `process(..)` so they can survive multiple calls to `process(..)`. But the "block scoped" `err` variable is only needed for the `*3*` state, so we leave it in place. + +In state `*1*`, instead of `yield request(..)`, we did `return request(..)`. In terminal state `*2*`, there was no explicit `return`, so we just do a `return;` which is the same as `return undefined`. In terminal state `*3*`, there was a `return false`, so we preserve that. 
+ +Now we need to define the code in the *iterator* functions so they call `process(..)` appropriately: + +```js +function foo(url) { + // manage generator state + var state; + + // generator-wide variable declarations + var val; + + function process(v) { + switch (state) { + case 1: + console.log( "requesting:", url ); + return request( url ); + case 2: + val = v; + console.log( val ); + return; + case 3: + var err = v; + console.log( "Oops:", err ); + return false; + } + } + + // make and return an iterator + return { + next: function(v) { + // initial state + if (!state) { + state = 1; + return { + done: false, + value: process() + }; + } + // yield resumed successfully + else if (state == 1) { + state = 2; + return { + done: true, + value: process( v ) + }; + } + // generator already completed + else { + return { + done: true, + value: undefined + }; + } + }, + "throw": function(e) { + // the only explicit error handling is in + // state *1* + if (state == 1) { + state = 3; + return { + done: true, + value: process( e ) + }; + } + // otherwise, an error won't be handled, + // so just throw it right back out + else { + throw e; + } + } + }; +} +``` + +How does this code work? + +1. The first call to the *iterator*'s `next()` call would move the generator from the uninitialized state to state `1`, and then call `process()` to handle that state. The return value from `request(..)`, which is the promise for the Ajax response, is returned back as the `value` property from the `next()` call. +2. If the Ajax request succeeds, the second call to `next(..)` should send in the Ajax response value, which moves our state to `2`. `process(..)` is again called (this time with the passed in Ajax response value), and the `value` property returned from `next(..)` will be `undefined`. +3. However, if the Ajax request fails, `throw(..)` should be called with the error, which would move the state from `1` to `3` (instead of `2`). 
Again `process(..)` is called, this time with the error value. That `case` returns `false`, which is set as the `value` property returned from the `throw(..)` call. + +From the outside -- that is, interacting only with the *iterator* -- this `foo(..)` normal function works pretty much the same as the `*foo(..)` generator would have worked. So we've effectively "transpiled" our ES6 generator to pre-ES6 compatibility! + +We could then manually instantiate our generator and control its iterator -- calling `var it = foo("..")` and `it.next(..)` and such -- or better, we could pass it to our previously defined `run(..)` utility as `run(foo,"..")`. + +### Automatic Transpilation + +The preceding exercise of manually deriving a transformation of our ES6 generator to pre-ES6 equivalent teaches us how generators work conceptually. But that transformation was really intricate and very non-portable to other generators in our code. It would be quite impractical to do this work by hand, and would completely obviate all the benefit of generators. + +But luckily, several tools already exist that can automatically convert ES6 generators to things like what we derived in the previous section. Not only do they do the heavy lifting work for us, but they also handle several complications that we glossed over. + +One such tool is regenerator (https://facebook.github.io/regenerator/), from the smart folks at Facebook. 
+ +If we use regenerator to transpile our previous generator, here's the code produced (at the time of this writing): + +```js +// `request(..)` is a Promise-aware Ajax utility + +var foo = regeneratorRuntime.mark(function foo(url) { + var val; + + return regeneratorRuntime.wrap(function foo$(context$1$0) { + while (1) switch (context$1$0.prev = context$1$0.next) { + case 0: + context$1$0.prev = 0; + console.log( "requesting:", url ); + context$1$0.next = 4; + return request( url ); + case 4: + val = context$1$0.sent; + console.log( val ); + context$1$0.next = 12; + break; + case 8: + context$1$0.prev = 8; + context$1$0.t0 = context$1$0.catch(0); + console.log("Oops:", context$1$0.t0); + return context$1$0.abrupt("return", false); + case 12: + case "end": + return context$1$0.stop(); + } + }, foo, this, [[0, 8]]); +}); +``` + +There are some obvious similarities here to our manual derivation, such as the `switch` / `case` statements, and we even see `val` pulled out of the closure just as we did. + +Of course, one trade-off is that regenerator's transpilation requires a helper library `regeneratorRuntime` that holds all the reusable logic for managing a general generator / *iterator*. A lot of that boilerplate looks different from our version, but even then, the concepts can be seen, like with `context$1$0.next = 4` keeping track of the next state for the generator. + +The main takeaway is that generators are not restricted to only being useful in ES6+ environments. Once you understand the concepts, you can employ them throughout your code, and use tools to transform the code to be compatible with older environments. + +This is more work than just using a `Promise` API polyfill for pre-ES6 Promises, but the effort is totally worth it, because generators are so much better at expressing async flow control in a reason-able, sensible, synchronous-looking, sequential fashion. 
+ +Once you get hooked on generators, you'll never want to go back to the hell of async spaghetti callbacks! + +## Review + +Generators are a new ES6 function type that does not run-to-completion like normal functions. Instead, the generator can be paused in mid-completion (entirely preserving its state), and it can later be resumed from where it left off. + +This pause/resume interchange is cooperative rather than preemptive, which means that the generator has the sole capability to pause itself, using the `yield` keyword, and yet the *iterator* that controls the generator has the sole capability (via `next(..)`) to resume the generator. + +The `yield` / `next(..)` duality is not just a control mechanism, it's actually a two-way message passing mechanism. A `yield ..` expression essentially pauses waiting for a value, and the next `next(..)` call passes a value (or implicit `undefined`) back to that paused `yield` expression. + +The key benefit of generators related to async flow control is that the code inside a generator expresses a sequence of steps for the task in a naturally sync/sequential fashion. The trick is that we essentially hide potential asynchrony behind the `yield` keyword -- moving the asynchrony to the code where the generator's *iterator* is controlled. + +In other words, generators preserve a sequential, synchronous, blocking code pattern for async code, which lets our brains reason about the code much more naturally, addressing one of the two key drawbacks of callback-based async. diff --git a/async & performance/ch5.md b/async & performance/ch5.md new file mode 100644 index 0000000..fb9820a --- /dev/null +++ b/async & performance/ch5.md @@ -0,0 +1,368 @@ +# You Don't Know JS: Async & Performance +# Chapter 5: Program Performance + +This book so far has been all about how to leverage asynchrony patterns more effectively. But we haven't directly addressed why asynchrony really matters to JS. The most obvious explicit reason is **performance**. 
+ +For example, if you have two Ajax requests to make, and they're independent, but you need to wait on them both to finish before doing the next task, you have two options for modeling that interaction: serial and concurrent. + +You could make the first request and wait to start the second request until the first finishes. Or, as we've seen both with promises and generators, you could make both requests "in parallel," and express the "gate" to wait on both of them before moving on. + +Clearly, the latter is usually going to be more performant than the former. And better performance generally leads to better user experience. + +It's even possible that asynchrony (interleaved concurrency) can improve just the perception of performance, even if the overall program still takes the same amount of time to complete. User perception of performance is every bit -- if not more! -- as important as actual measurable performance. + +We want to now move beyond localized asynchrony patterns to talk about some bigger picture performance details at the program level. + +**Note:** You may be wondering about micro-performance issues like if `a++` or `++a` is faster. We'll look at those sorts of performance details in the next chapter on "Benchmarking & Tuning." + +## Web Workers + +If you have processing-intensive tasks but you don't want them to run on the main thread (which may slow down the browser/UI), you might have wished that JavaScript could operate in a multithreaded manner. + +In Chapter 1, we talked in detail about how JavaScript is single threaded. And that's still true. But a single thread isn't the only way to organize the execution of your program. + +Imagine splitting your program into two pieces, and running one of those pieces on the main UI thread, and running the other piece on an entirely separate thread. + +What kinds of concerns would such an architecture bring up? 
+ +For one, you'd want to know if running on a separate thread meant that it ran in parallel (on systems with multiple CPUs/cores) such that a long-running process on that second thread would **not** block the main program thread. Otherwise, "virtual threading" wouldn't be of much benefit over what we already have in JS with async concurrency. + +And you'd want to know if these two pieces of the program have access to the same shared scope/resources. If they do, then you have all the questions that multithreaded languages (Java, C++, etc.) deal with, such as needing cooperative or preemptive locking (mutexes, etc.). That's a lot of extra work, and shouldn't be undertaken lightly. + +Alternatively, you'd want to know how these two pieces could "communicate" if they couldn't share scope/resources. + +All these are great questions to consider as we explore a feature added to the web platform circa HTML5 called "Web Workers." This is a feature of the browser (aka host environment) and actually has almost nothing to do with the JS language itself. That is, JavaScript does not *currently* have any features that support threaded execution. + +But an environment like your browser can easily provide multiple instances of the JavaScript engine, each on its own thread, and let you run a different program in each thread. Each of those separate threaded pieces of your program is called a "(Web) Worker." This type of parallelism is called "task parallelism," as the emphasis is on splitting up chunks of your program to run in parallel. + +From your main JS program (or another Worker), you instantiate a Worker like so: + +```js +var w1 = new Worker( "http://some.url.1/mycoolworker.js" ); +``` + +The URL should point to the location of a JS file (not an HTML page!) which is intended to be loaded into a Worker. The browser will then spin up a separate thread and let that file run as an independent program in that thread. 
+ +**Note:** The kind of Worker created with such a URL is called a "Dedicated Worker." But instead of providing a URL to an external file, you can also create an "Inline Worker" by providing a Blob URL (another HTML5 feature); essentially it's an inline file stored in a single (binary) value. However, Blobs are beyond the scope of what we'll discuss here. + +Workers do not share any scope or resources with each other or the main program -- that would bring all the nightmares of threaded programming to the forefront -- but instead have a basic event messaging mechanism connecting them. + +The `w1` Worker object is an event listener and trigger, which lets you subscribe to events sent by the Worker as well as send events to the Worker. + +Here's how to listen for events (actually, the fixed `"message"` event): + +```js +w1.addEventListener( "message", function(evt){ + // evt.data +} ); +``` + +And you can send the `"message"` event to the Worker: + +```js +w1.postMessage( "something cool to say" ); +``` + +Inside the Worker, the messaging is totally symmetrical: + +```js +// "mycoolworker.js" + +addEventListener( "message", function(evt){ + // evt.data +} ); + +postMessage( "a really cool reply" ); +``` + +Notice that a dedicated Worker is in a one-to-one relationship with the program that created it. That is, the `"message"` event doesn't need any disambiguation here, because we're sure that it could only have come from this one-to-one relationship -- either it came from the Worker or the main page. + +Usually the main page application creates the Workers, but a Worker can instantiate its own child Worker(s) -- known as subworkers -- as necessary. Sometimes this is useful to delegate such details to a sort of "master" Worker that spawns other Workers to process parts of a task. Unfortunately, at the time of this writing, Chrome still does not support subworkers, while Firefox does. 
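The listen/send pair shown above is often wrapped so that one request and its reply read as a single step. What follows is only a minimal sketch, not part of the Worker API: `askWorker(..)` is a hypothetical helper name, it relies on promises (see Chapter 3), and it assumes the Worker answers each message with exactly one `"message"` event and that only one request is outstanding at a time.

```js
// hypothetical helper: send one message, resolve with the next reply
function askWorker(worker,msg) {
	return new Promise( function(resolve,reject){
		function onMessage(evt) {
			cleanup();
			resolve( evt.data );
		}

		function onError(evt) {
			cleanup();
			reject( evt );
		}

		// stop listening once this request has been settled
		function cleanup() {
			worker.removeEventListener( "message", onMessage );
			worker.removeEventListener( "error", onError );
		}

		worker.addEventListener( "message", onMessage );
		worker.addEventListener( "error", onError );
		worker.postMessage( msg );
	} );
}

// usage (in a browser), with `w1` from the previous snippets:
//
// askWorker( w1, "something cool to say" )
// .then( function(reply){
//     console.log( reply );
// } );
```

Because the helper only touches `addEventListener(..)`, `removeEventListener(..)`, and `postMessage(..)`, it should work with any object exposing those three methods, which also makes it easy to exercise without a real Worker.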
+ +To kill a Worker immediately from the program that created it, call `terminate()` on the Worker object (like `w1` in the previous snippets). Abruptly terminating a Worker thread does not give it any chance to finish up its work or clean up any resources. It's akin to you closing a browser tab to kill a page. + +If you have two or more pages (or multiple tabs with the same page!) in the browser that try to create a Worker from the same file URL, those will actually end up as completely separate Workers. Shortly, we'll discuss a way to "share" a Worker. + +**Note:** It may seem like a malicious or ignorant JS program could easily perform a denial-of-service attack on a system by spawning hundreds of Workers, seemingly each with their own thread. While it's true that it's somewhat of a guarantee that a Worker will end up on a separate thread, this guarantee is not unlimited. The system is free to decide how many actual threads/CPUs/cores it really wants to create. There's no way to predict or guarantee how many you'll have access to, though many people assume it's at least as many as the number of CPUs/cores available. I think the safest assumption is that there's at least one other thread besides the main UI thread, but that's about it. + +### Worker Environment + +Inside the Worker, you do not have access to any of the main program's resources. That means you cannot access any of its global variables, nor can you access the page's DOM or other resources. Remember: it's a totally separate thread. + +You can, however, perform network operations (Ajax, WebSockets) and set timers. Also, the Worker has access to its own copy of several important global variables/features, including `navigator`, `location`, `JSON`, and `applicationCache`. 
+ +You can also load extra JS scripts into your Worker, using `importScripts(..)`: + +```js +// inside the Worker +importScripts( "foo.js", "bar.js" ); +``` + +These scripts are loaded synchronously, which means the `importScripts(..)` call will block the rest of the Worker's execution until the file(s) are finished loading and executing. + +**Note:** There have also been some discussions about exposing the `<canvas>` API to Workers, which combined with having canvases be Transferables (see the "Data Transfer" section), would allow Workers to perform more sophisticated off-thread graphics processing, which can be useful for high-performance gaming (WebGL) and other similar applications. Although this doesn't exist yet in any browsers, it's likely to happen in the near future. + +What are some common uses for Web Workers? + +* Processing intensive math calculations +* Sorting large data sets +* Data operations (compression, audio analysis, image pixel manipulations, etc.) +* High-traffic network communications + +### Data Transfer + +You may notice a common characteristic of most of those uses, which is that they require a large amount of information to be transferred across the barrier between threads using the event mechanism, perhaps in both directions. + +In the early days of Workers, serializing all data to a string value was the only option. In addition to the speed penalty of the two-way serializations, the other major negative was that the data was being copied, which meant a doubling of memory usage (and the subsequent churn of garbage collection). + +Thankfully, we now have a few better options. + +If you pass an object, a so-called "Structured Cloning Algorithm" (https://developer.mozilla.org/en-US/docs/Web/Guide/API/DOM/The_structured_clone_algorithm) is used to copy/duplicate the object on the other side. This algorithm is fairly sophisticated and can even handle duplicating objects with circular references. 
The to-string/from-string performance penalty is not paid, but we still have duplication of memory using this approach. There is support for this in IE10 and above, as well as all the other major browsers. + +An even better option, especially for larger data sets, is "Transferable Objects" (http://updates.html5rocks.com/2011/12/Transferable-Objects-Lightning-Fast). What happens is that the object's "ownership" is transferred, but the data itself is not moved. Once you transfer away an object to a Worker, it's empty or inaccessible in the originating location -- that eliminates the hazards of threaded programming over a shared scope. Of course, transfer of ownership can go in both directions. + +There really isn't much you need to do to opt into a Transferable Object; any data structure that implements the Transferable interface (https://developer.mozilla.org/en-US/docs/Web/API/Transferable) will automatically be transferred this way (supported in Firefox and Chrome). + +For example, the buffers behind typed arrays like `Uint8Array` (see the *ES6 & Beyond* title of this series) are "Transferables." This is how you'd send a Transferable Object using `postMessage(..)`: + +```js +// `foo` is a `Uint8Array` for instance + +postMessage( foo.buffer, [ foo.buffer ] ); +``` + +The first parameter is the raw buffer and the second parameter is a list of what to transfer. + +Browsers that don't support Transferable Objects simply degrade to structured cloning, which means performance reduction rather than outright feature breakage. + +### Shared Workers + +If your site or app allows for loading multiple tabs of the same page (a common feature), you may very well want to reduce the resource usage of the user's system by preventing duplicate dedicated Workers; the most common limited resource in this respect is a socket network connection, as browsers limit the number of simultaneous connections to a single host. 
Of course, limiting multiple connections from a client also eases your server resource requirements. + +In this case, creating a single centralized Worker that all the page instances of your site or app can *share* is quite useful. + +That's called a `SharedWorker`, which you create like so (support for this is limited to Firefox and Chrome): + +```js +var w1 = new SharedWorker( "http://some.url.1/mycoolworker.js" ); +``` + +Because a shared Worker can be connected to or from more than one program instance or page on your site, the Worker needs a way to know which program a message comes from. This unique identification is called a "port" -- think network socket ports. So the calling program must use the `port` object of the Worker for communication: + +```js +w1.port.addEventListener( "message", handleMessages ); + +// .. + +w1.port.postMessage( "something cool" ); +``` + +Also, the port connection must be initialized, as: + +```js +w1.port.start(); +``` + +Inside the shared Worker, an extra event must be handled: `"connect"`. This event provides the `port` object for that particular connection. The most convenient way to keep multiple connections separate is to use closure (see the *Scope & Closures* title of this series) over the `port`, as shown next, with the event listening and transmitting for that connection defined inside the handler for the `"connect"` event: + +```js +// inside the shared Worker +addEventListener( "connect", function(evt){ + // the assigned port for this connection + var port = evt.ports[0]; + + port.addEventListener( "message", function(evt){ + // .. + + port.postMessage( .. ); + + // .. + } ); + + // initialize the port connection + port.start(); +} ); +``` + +Other than that difference, shared and dedicated Workers have the same capabilities and semantics. 
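To make the closure-over-`port` idea a bit more concrete, here is a small sketch of a `"connect"` handler that tags every reply with the connection it belongs to. `makeConnectHandler(..)` and the `{ connection, echo }` reply shape are hypothetical, chosen only for illustration; the only platform pieces assumed are the ones already shown above (`evt.ports[0]`, `port.addEventListener(..)`, `port.postMessage(..)`, and `port.start()`).

```js
// hypothetical factory for a "connect" handler
function makeConnectHandler() {
	var connections = 0;

	return function onConnect(evt) {
		// each call closes over its own `port` and `id`
		var port = evt.ports[0];
		var id = ++connections;

		port.addEventListener( "message", function(evt){
			// reply only on the port this message arrived on
			port.postMessage( { connection: id, echo: evt.data } );
		} );

		// initialize the port connection
		port.start();
	};
}

// inside the shared Worker, this would be registered as:
//
// addEventListener( "connect", makeConnectHandler() );
```

Since every connection gets its own closure, a reply can never leak to a different page, even though all pages talk to the same shared Worker.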
+ +**Note:** Shared Workers survive the termination of a port connection if other port connections are still alive, whereas dedicated Workers are terminated whenever the connection to their initiating program is terminated. + +### Polyfilling Web Workers + +Web Workers are very attractive performance-wise for running JS programs in parallel. However, you may be in a position where your code needs to run in older browsers that lack support. Because Workers are an API and not a syntax, they can be polyfilled, to an extent. + +If a browser doesn't support Workers, there's simply no way to fake multithreading from the performance perspective. Iframes are commonly thought of to provide a parallel environment, but in all modern browsers they actually run on the same thread as the main page, so they're not sufficient for faking parallelism. + +As we detailed in Chapter 1, JS's asynchronicity (not parallelism) comes from the event loop queue, so you can force faked Workers to be asynchronous using timers (`setTimeout(..)`, etc.). Then you just need to provide a polyfill for the Worker API. There are some listed here (https://github.com/Modernizr/Modernizr/wiki/HTML5-Cross-Browser-Polyfills#web-workers), but frankly none of them look great. + +I've written a sketch of a polyfill for `Worker` here (https://gist.github.com/getify/1b26accb1a09aa53ad25). It's basic, but it should get the job done for simple `Worker` support, given that the two-way messaging works correctly as well as `"onerror"` handling. You could probably also extend it with more features, such as `terminate()` or faked Shared Workers, as you see fit. + +**Note:** You can't fake synchronous blocking, so this polyfill just disallows use of `importScripts(..)`. Another option might have been to parse and transform the Worker's code (once Ajax loaded) to handle rewriting to some asynchronous form of an `importScripts(..)` polyfill, perhaps with a promise-aware interface. 
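As a rough illustration of the timer-based approach (and explicitly not the polyfill from the gist linked above), the following sketch fakes only the asynchronous `"message"` plumbing. `FakeWorker` is a hypothetical name; it takes a function instead of a file URL, runs it on the same thread, and passes data by reference rather than copying it, so it offers none of a real Worker's parallelism or isolation.

```js
// hypothetical stand-in for `Worker`: same thread, async messaging only
function FakeWorker(workerFn) {
	var mainListeners = [];
	var workerListeners = [];

	// deliver `data` to `listeners` on a later event loop tick
	function dispatch(listeners,data) {
		setTimeout( function(){
			listeners.slice().forEach( function(fn){
				fn( { data: data } );
			} );
		}, 0 );
	}

	// the "global scope" handed to the worker code
	var workerScope = {
		addEventListener: function(type,fn){
			if (type === "message") workerListeners.push( fn );
		},
		postMessage: function(data){
			dispatch( mainListeners, data );
		}
	};

	// the API seen by the program that created the worker
	this.addEventListener = function(type,fn){
		if (type === "message") mainListeners.push( fn );
	};
	this.postMessage = function(data){
		dispatch( workerListeners, data );
	};

	// the worker code runs right away, but all messaging is deferred
	workerFn( workerScope );
}

var w = new FakeWorker( function(self){
	self.addEventListener( "message", function(evt){
		self.postMessage( evt.data * 2 );
	} );
} );

w.addEventListener( "message", function(evt){
	console.log( evt.data );	// 42
} );

w.postMessage( 21 );
```

The important property is that nothing is delivered synchronously: `postMessage(..)` returns before any listener runs, which mirrors how code written against a real Worker has to behave.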
+ +## SIMD + +Single instruction, multiple data (SIMD) is a form of "data parallelism," as contrasted to "task parallelism" with Web Workers, because the emphasis is not really on program logic chunks being parallelized, but rather multiple bits of data being processed in parallel. + +With SIMD, threads don't provide the parallelism. Instead, modern CPUs provide SIMD capability with "vectors" of numbers -- think: type specialized arrays -- as well as instructions that can operate in parallel across all the numbers; these are low-level operations leveraging instruction-level parallelism. + +The effort to expose SIMD capability to JavaScript is primarily spearheaded by Intel (https://01.org/node/1495), namely by Mohammad Haghighat (at the time of this writing), in cooperation with Firefox and Chrome teams. SIMD is on an early standards track with a good chance of making it into a future revision of JavaScript, likely in the ES7 timeframe. + +SIMD JavaScript proposes to expose short vector types and APIs to JS code, which on those SIMD-enabled systems would map the operations directly through to the CPU equivalents, with fallback to non-parallelized operation "shims" on non-SIMD systems. + +The performance benefits for data-intensive applications (signal analysis, matrix operations on graphics, etc.) with such parallel math processing are quite obvious! + +Early proposal forms of the SIMD API at the time of this writing look like this: + +```js +var v1 = SIMD.float32x4( 3.14159, 21.0, 32.3, 55.55 ); +var v2 = SIMD.float32x4( 2.1, 3.2, 4.3, 5.4 ); + +var v3 = SIMD.int32x4( 10, 101, 1001, 10001 ); +var v4 = SIMD.int32x4( 10, 20, 30, 40 ); + +SIMD.float32x4.mul( v1, v2 ); // [ 6.597339, 67.2, 138.89, 299.97 ] +SIMD.int32x4.add( v3, v4 ); // [ 20, 121, 1031, 10041 ] +``` + +Shown here are two different vector data types, 32-bit floating-point numbers and 32-bit integer numbers. 
You can see that these vectors are sized exactly to four 32-bit elements, as this matches the SIMD vector sizes (128-bit) available in most modern CPUs. It's also possible we may see an `x8` (or larger!) version of these APIs in the future. + +Besides `mul()` and `add()`, many other operations are likely to be included, such as `sub()`, `div()`, `abs()`, `neg()`, `sqrt()`, `reciprocal()`, `reciprocalSqrt()` (arithmetic), `shuffle()` (rearrange vector elements), `and()`, `or()`, `xor()`, `not()` (logical), `equal()`, `greaterThan()`, `lessThan()` (comparison), `shiftLeft()`, `shiftRightLogical()`, `shiftRightArithmetic()` (shifts), `fromFloat32x4()`, and `fromInt32x4()` (conversions). + +**Note:** There's an official "prollyfill" (hopeful, expectant, future-leaning polyfill) for the SIMD functionality available (https://github.com/johnmccutchan/ecmascript_simd), which illustrates a lot more of the planned SIMD capability than we've illustrated in this section. + +## asm.js + +"asm.js" (http://asmjs.org/) is a label for a highly optimizable subset of the JavaScript language. By carefully avoiding certain mechanisms and patterns that are *hard* to optimize (garbage collection, coercion, etc.), asm.js-styled code can be recognized by the JS engine and given special attention with aggressive low-level optimizations. + +Distinct from other program performance mechanisms discussed in this chapter, asm.js isn't necessarily something that needs to be adopted into the JS language specification. There *is* an asm.js specification (http://asmjs.org/spec/latest/), but it's mostly for tracking an agreed upon set of candidate inferences for optimization rather than a set of requirements of JS engines. + +There's not currently any new syntax being proposed. Instead, asm.js suggests ways to recognize existing standard JS syntax that conforms to the rules of asm.js and let engines implement their own optimizations accordingly. 
+ +There's been some disagreement between browser vendors over exactly how asm.js should be activated in a program. Early versions of the asm.js experiment required a `"use asm";` pragma (similar to strict mode's `"use strict";`) to help clue the JS engine to be looking for asm.js optimization opportunities and hints. Others have asserted that asm.js should just be a set of heuristics that engines automatically recognize without the author having to do anything extra, meaning that existing programs could theoretically benefit from asm.js-style optimizations without doing anything special. + +### How to Optimize with asm.js + +The first thing to understand about asm.js optimizations is around types and coercion (see the *Types & Grammar* title of this series). If the JS engine has to track multiple different types of values in a variable through various operations, so that it can handle coercions between types as necessary, that's a lot of extra work that keeps the program optimization suboptimal. + +**Note:** We're going to use asm.js-style code here for illustration purposes, but be aware that it's not commonly expected that you'll author such code by hand. asm.js is more intended to be a compilation target for other tools, such as Emscripten (https://github.com/kripken/emscripten/wiki). It's of course possible to write your own asm.js code, but that's usually a bad idea because the code is very low level and managing it can be very time consuming and error prone. Nevertheless, there may be cases where you'd want to hand tweak your code for asm.js optimization purposes. + +There are some "tricks" you can use to hint to an asm.js-aware JS engine what the intended type is for variables/operations, so that it can skip these coercion tracking steps. + +For example: + +```js +var a = 42; + +// .. + +var b = a; +``` + +In that program, the `b = a` assignment leaves the door open for type divergence in variables. 
However, it could instead be written as: + +```js +var a = 42; + +// .. + +var b = a | 0; +``` + +Here, we've used the `|` ("bitwise OR") with value `0`, which has no effect on the value other than to make sure it's a 32-bit integer. That code run in a normal JS engine works just fine, but when run in an asm.js-aware JS engine it *can* signal that `b` should always be treated as a 32-bit integer, so the coercion tracking can be skipped. + +Similarly, the addition operation between two variables can be restricted to a more performant integer addition (instead of floating point): + +```js +(a + b) | 0 +``` + +Again, the asm.js-aware JS engine can see that hint and infer that the `+` operation should be 32-bit integer addition because the end result of the whole expression would automatically be 32-bit integer conformed anyway. + +### asm.js Modules + +One of the biggest detractors to performance in JS is around memory allocation, garbage collection, and scope access. asm.js suggests one of the ways around these issues is to declare a more formalized asm.js "module" -- do not confuse these with ES6 modules; see the *ES6 & Beyond* title of this series. + +For an asm.js module, you need to explicitly pass in a tightly conformed namespace -- this is referred to in the spec as `stdlib`, as it should represent standard libraries needed -- to import necessary symbols, rather than just using globals via lexical scope. In the base case, the `window` object is an acceptable `stdlib` object for asm.js module purposes, but you could and perhaps should construct an even more restricted one. + +You also must declare a "heap" -- which is just a fancy term for a reserved spot in memory where variables can already be used without asking for more memory or releasing previously used memory -- and pass that in, so that the asm.js module won't need to do anything that would cause memory churn; it can just use the pre-reserved space. 
+ +A "heap" is likely a typed `ArrayBuffer`, such as: + +```js +var heap = new ArrayBuffer( 0x10000 ); // 64k heap +``` + +Using that pre-reserved 64k of binary space, an asm.js module can store and retrieve values in that buffer without any memory allocation or garbage collection penalties. For example, the `heap` buffer could be used inside the module to back an array of 64-bit float values like this: + +```js +var arr = new Float64Array( heap ); +``` + +OK, so let's make a quick, silly example of an asm.js-styled module to illustrate how these pieces fit together. We'll define a `foo(..)` that takes a start (`x`) and end (`y`) integer for a range, and calculates all the inner adjacent multiplications of the values in the range, and then finally averages those values together: + +```js +function fooASM(stdlib,foreign,heap) { + "use asm"; + + var arr = new stdlib.Int32Array( heap ); + + function foo(x,y) { + x = x | 0; + y = y | 0; + + var i = 0; + var p = 0; + var sum = 0; + var count = ((y|0) - (x|0)) | 0; + + // calculate all the inner adjacent multiplications + for (i = x | 0; + (i | 0) < (y | 0); + p = (p + 8) | 0, i = (i + 1) | 0 + ) { + // store result + arr[ p >> 3 ] = (i * (i + 1)) | 0; + } + + // calculate average of all intermediate values + for (i = 0, p = 0; + (i | 0) < (count | 0); + p = (p + 8) | 0, i = (i + 1) | 0 + ) { + sum = (sum + arr[ p >> 3 ]) | 0; + } + + return +(sum / count); + } + + return { + foo: foo + }; +} + +var heap = new ArrayBuffer( 0x1000 ); +var foo = fooASM( window, null, heap ).foo; + +foo( 10, 20 ); // 233 +``` + +**Note:** This asm.js example is hand authored for illustration purposes, so it doesn't represent the same code that would be produced from a compilation tool targeting asm.js. But it does show the typical nature of asm.js code, especially the type hinting and use of the `heap` buffer for temporary variable storage. + +The first call to `fooASM(..)` is what sets up our asm.js module with its `heap` allocation. 
The result is a `foo(..)` function we can call as many times as necessary. Those `foo(..)` calls should be specially optimized by an asm.js-aware JS engine. Importantly, the preceding code is completely standard JS and would run just fine (without special optimization) in a non-asm.js engine. + +Obviously, the nature of restrictions that make asm.js code so optimizable reduces the possible uses for such code significantly. asm.js won't necessarily be a general optimization set for any given JS program. Instead, it's intended to provide an optimized way of handling specialized tasks such as intensive math operations (e.g., those used in graphics processing for games). + +## Review + +The first four chapters of this book are based on the premise that async coding patterns give you the ability to write more performant code, which is generally a very important improvement. But async behavior only gets you so far, because it's still fundamentally bound to a single event loop thread. + +So in this chapter we've covered several program-level mechanisms for improving performance even further. + +Web Workers let you run a JS file (aka program) in a separate thread using async events to message between the threads. They're wonderful for offloading long-running or resource-intensive tasks to a different thread, leaving the main UI thread more responsive. + +SIMD proposes to map CPU-level parallel math operations to JavaScript APIs for high-performance data-parallel operations, like number processing on large data sets. + +Finally, asm.js describes a small subset of JavaScript that avoids the hard-to-optimize parts of JS (like garbage collection and coercion) and lets the JS engine recognize and run such code through aggressive optimizations. asm.js could be hand authored, but that's extremely tedious and error prone, akin to hand authoring assembly language (hence the name). 
Instead, the main intent is that asm.js would be a good target for cross-compilation from other highly optimized program languages -- for example, Emscripten (https://github.com/kripken/emscripten/wiki) transpiling C/C++ to JavaScript. + +While not covered explicitly in this chapter, there are even more radical ideas under very early discussion for JavaScript, including approximations of direct threaded functionality (not just hidden behind data structure APIs). Whether that happens explicitly, or we just see more parallelism creep into JS behind the scenes, the future of more optimized program-level performance in JS looks really *promising*. diff --git a/async & performance/ch6.md b/async & performance/ch6.md new file mode 100644 index 0000000..6f34c28 --- /dev/null +++ b/async & performance/ch6.md @@ -0,0 +1,619 @@ +# You Don't Know JS: Async & Performance +# Chapter 6: Benchmarking & Tuning + +As the first four chapters of this book were all about performance as a coding pattern (asynchrony and concurrency), and Chapter 5 was about performance at the macro program architecture level, this chapter goes after the topic of performance at the micro level, focusing on single expressions/statements. + +One of the most common areas of curiosity -- indeed, some developers can get quite obsessed about it -- is in analyzing and testing various options for how to write a line or chunk of code, and which one is faster. + +We're going to look at some of these issues, but it's important to understand from the outset that this chapter is **not** about feeding the obsession of micro-performance tuning, like whether some given JS engine can run `++a` faster than `a++`. The more important goal of this chapter is to figure out what kinds of JS performance matter and which ones don't, *and how to tell the difference*. 
+ +But even before we get there, we need to explore how to most accurately and reliably test JS performance, because there's tons of misconceptions and myths that have flooded our collective cult knowledge base. We've got to sift through all that junk to find some clarity. + +## Benchmarking + +OK, time to start dispelling some misconceptions. I'd wager the vast majority of JS developers, if asked to benchmark the speed (execution time) of a certain operation, would initially go about it something like this: + +```js +var start = (new Date()).getTime(); // or `Date.now()` + +// do some operation + +var end = (new Date()).getTime(); + +console.log( "Duration:", (end - start) ); +``` + +Raise your hand if that's roughly what came to your mind. Yep, I thought so. There's a lot wrong with this approach, but don't feel bad; **we've all been there.** + +What did that measurement tell you, exactly? Understanding what it does and doesn't say about the execution time of the operation in question is key to learning how to appropriately benchmark performance in JavaScript. + +If the duration reported is `0`, you may be tempted to believe that it took less than a millisecond. But that's not very accurate. Some platforms don't have single millisecond precision, but instead only update the timer in larger increments. For example, older versions of Windows (and thus IE) had only 15ms precision, which means the operation has to take at least that long for anything other than `0` to be reported! + +Moreover, whatever duration is reported, the only thing you really know is that the operation took approximately that long on that exact single run. You have near-zero confidence that it will always run at that speed. You have no idea if the engine or system had some sort of interference at that exact moment, or whether at other times the operation could run faster. + +What if the duration reported is `4`? Are you more sure it took about four milliseconds? Nope. 
It might have taken less time, and there may have been some other delay in getting either `start` or `end` timestamps. + +More troublingly, you also don't know that the circumstances of this operation test aren't overly optimistic. It's possible that the JS engine figured out a way to optimize your isolated test case, but in a more real program such optimization would be diluted or impossible, such that the operation would run slower than your test. + +So... what do we know? Unfortunately, with those realizations stated, **we know very little.** Something of such low confidence isn't even remotely good enough to build your determinations on. Your "benchmark" is basically useless. And worse, it's dangerous in that it implies false confidence, not just to you but also to others who don't think critically about the conditions that led to those results. + +### Repetition + +"OK," you now say, "Just put a loop around it so the whole test takes longer." If you repeat an operation 100 times, and that whole loop reportedly takes a total of 137ms, then you can just divide by 100 and get an average duration of 1.37ms for each operation, right? + +Well, not exactly. + +A straight mathematical average by itself is definitely not sufficient for making judgments about performance which you plan to extrapolate to the breadth of your entire application. With a hundred iterations, even a couple of outliers (high or low) can skew the average, and then when you apply that conclusion repeatedly, you even further inflate the skew beyond credulity. + +Instead of just running for a fixed number of iterations, you can instead choose to run the loop of tests until a certain amount of time has passed. That might be more reliable, but how do you decide how long to run? You might guess that it should be some multiple of how long your operation should take to run once. Wrong. 
+ +Actually, the length of time to repeat across should be based on the accuracy of the timer you're using, specifically to minimize the chances of inaccuracy. The less precise your timer, the longer you need to run to make sure you've minimized the error percentage. A 15ms timer is pretty bad for accurate benchmarking; to minimize its uncertainty (aka "error rate") to less than 1%, you need to run each cycle of test iterations for 750ms. A 1ms timer only needs a cycle to run for 50ms to get the same confidence. + +But then, that's just a single sample. To be sure you're factoring out the skew, you'll want lots of samples to average across. You'll also want to understand something about just how slow the worst sample is, how fast the best sample is, how far apart those best and worst cases were, and so on. You'll want to know not just a number that tells you how fast something ran, but also to have some quantifiable measure of how trustable that number is. + +Also, you probably want to combine these different techniques (as well as others), so that you get the best balance of all the possible approaches. + +That's all bare minimum just to get started. If you've been approaching performance benchmarking with anything less serious than what I just glossed over, well... "you don't know: proper benchmarking." + +### Benchmark.js + +Any relevant and reliable benchmark should be based on statistically sound practices. I am not going to write a chapter on statistics here, so I'll hand wave around some terms: standard deviation, variance, margin of error. If you don't know what those terms really mean -- I took a stats class back in college and I'm still a little fuzzy on them -- you are not actually qualified to write your own benchmarking logic. + +Luckily, smart folks like John-David Dalton and Mathias Bynens do understand these concepts, and wrote a statistically sound benchmarking tool called Benchmark.js (http://benchmarkjs.com/). 
So I can end the suspense by simply saying: "just use that tool." + +I won't repeat their whole documentation for how Benchmark.js works; they have fantastic API Docs (http://benchmarkjs.com/docs) you should read. Also there are some great (http://calendar.perfplanet.com/2010/bulletproof-javascript-benchmarks/) writeups (http://monsur.hossa.in/2012/12/11/benchmarkjs.html) on more of the details and methodology. + +But just for quick illustration purposes, here's how you could use Benchmark.js to run a quick performance test: + +```js +function foo() { + // operation(s) to test +} + +var bench = new Benchmark( + "foo test", // test name + foo, // function to test (just contents) + { + // .. // optional extra options (see docs) + } +); + +bench.hz; // number of operations per second +bench.stats.moe; // margin of error +bench.stats.variance; // variance across samples +// .. +``` + +There's *lots* more to learn about using Benchmark.js besides this glance I'm including here. But the point is that it's handling all of the complexities of setting up a fair, reliable, and valid performance benchmark for a given piece of JavaScript code. If you're going to try to test and benchmark your code, this library is the first place you should turn. + +We're showing here the usage to test a single operation like X, but it's fairly common that you want to compare X to Y. This is easy to do by simply setting up two different tests in a "Suite" (a Benchmark.js organizational feature). Then, you run them head-to-head, and compare the statistics to conclude whether X or Y was faster. + +Benchmark.js can of course be used to test JavaScript in a browser (see the "jsPerf.com" section later in this chapter), but it can also run in non-browser environments (Node.js, etc.). + +One largely untapped potential use-case for Benchmark.js is to use it in your Dev or QA environments to run automated performance regression tests against critical path parts of your application's JavaScript. 
Similar to how you might run unit test suites before deployment, you can also compare the performance against previous benchmarks to monitor if you are improving or degrading application performance. + +#### Setup/Teardown + +In the previous code snippet, we glossed over the "extra options" `{ .. }` object. But there are two options we should discuss: `setup` and `teardown`. + +These two options let you define functions to be called before and after your test case runs. + +It's incredibly important to understand that your `setup` and `teardown` code **does not run for each test iteration**. The best way to think about it is that there's an outer loop (repeating cycles), and an inner loop (repeating test iterations). `setup` and `teardown` are run at the beginning and end of each *outer* loop (aka cycle) iteration, but not inside the inner loop. + +Why does this matter? Let's imagine you have a test case that looks like this: + +```js +a = a + "w"; +b = a.charAt( 1 ); +``` + +Then, you set up your test `setup` as follows: + +```js +var a = "x"; +``` + +Your temptation is probably to believe that `a` is starting out as `"x"` for each test iteration. + +But it's not! It's starting `a` at `"x"` for each test cycle, and then your repeated `+ "w"` concatenations will be making a larger and larger `a` value, even though you're only ever accessing the character `"w"` at the `1` position. + +Where this most commonly bites you is when you make side effect changes to something like the DOM, like appending a child element. You may think your parent element is set as empty each time, but it's actually getting lots of elements added, and that can significantly sway the results of your tests. + +## Context Is King + +Don't forget to check the context of a particular performance benchmark, especially a comparison between X and Y tasks. Just because your test reveals that X is faster than Y doesn't mean that the conclusion "X is faster than Y" is actually relevant. 
+ +For example, let's say a performance test reveals that X runs 10,000,000 operations per second, and Y runs at 8,000,000 operations per second. You could claim that Y is 20% slower than X (measured in operations per second), and you'd be mathematically correct, but your assertion doesn't hold as much water as you'd think. + +Let's think about the results more critically: 10,000,000 operations per second is 10,000 operations per millisecond, and 10 operations per microsecond. In other words, a single operation takes 0.1 microseconds, or 100 nanoseconds. It's hard to fathom just how small 100ns is, but for comparison, it's often cited that the human eye isn't generally capable of distinguishing anything less than 100ms, which is one million times slower than the 100ns speed of the X operation. + +Even recent scientific studies showing that maybe the brain can process as quickly as 13ms (about 8x faster than previously asserted) would mean that X is still running about 130,000 times faster than the human brain can perceive a distinct thing happening. **X is going really, really fast.** + +But more importantly, let's talk about the difference between X and Y, the 2,000,000 operations per second difference. If X takes 100ns, and Y takes 125ns, the difference is 25ns, which in the best case is still one 520-thousandth of the interval the human brain can perceive. + +What's my point? **None of this performance difference matters, at all!** + +But wait, what if this operation is going to happen a whole bunch of times in a row? Then the difference could add up, right? + +OK, so what we're asking then is, how likely is it that operation X is going to be run over and over again, one right after the other, and that this has to happen 520,000 times just to get a sliver of a hope the human brain could perceive it. More likely, it'd have to happen 5,000,000 to 10,000,000 times together in a tight loop to even approach relevance. 
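Conversions between "operations per second" and "time per operation" are easy to get backwards, so here is a minimal sketch of the arithmetic. The helper name `nsPerOp(..)` and the `PERCEPTIBLE_NS` constant are illustrative assumptions, not part of any library. Keep in mind that fewer operations per second means *more* time per operation.

```js
// Hypothetical helper (not from any library): convert a throughput
// in operations per second into nanoseconds per single operation.
function nsPerOp(opsPerSecond) {
	return 1e9 / opsPerSecond;
}

var X = nsPerOp( 10000000 );	// 100 (ns per operation)
var Y = nsPerOp( 8000000 );	// 125 (ns per operation)
var diff = Y - X;		// 25 (ns per operation)

// Assumed threshold: the ~13ms perception figure mentioned
// in the text, expressed in nanoseconds.
var PERCEPTIBLE_NS = 13 * 1e6;

// How many back-to-back repetitions are needed before the
// accumulated difference reaches that threshold?
var repsNeeded = PERCEPTIBLE_NS / diff;	// 520000
```

Even under these generous assumptions, the gap between the two operations only approaches something perceptible after hundreds of thousands of consecutive runs.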
+ +While the computer scientist in you might protest that this is possible, the louder voice of realism in you should sanity check just how likely or unlikely that really is. Even if it is relevant in rare occasions, it's irrelevant in most situations. + +The vast majority of your benchmark results on tiny operations -- like the `++x` vs `x++` myth -- **are just totally bogus** for supporting the conclusion that X should be favored over Y on a performance basis. + +### Engine Optimizations + +You simply cannot reliably extrapolate that if X was 10 microseconds faster than Y in your isolated test, that means X is always faster than Y and should always be used. That's not how performance works. It's vastly more complicated. + +For example, let's imagine (purely hypothetical) that you test some microperformance behavior such as comparing: + +```js +var twelve = "12"; +var foo = "foo"; + +// test 1 +var X1 = parseInt( twelve ); +var X2 = parseInt( foo ); + +// test 2 +var Y1 = Number( twelve ); +var Y2 = Number( foo ); +``` + +If you understand what `parseInt(..)` does compared to `Number(..)`, you might intuit that `parseInt(..)` potentially has "more work" to do, especially in the `foo` case. Or you might intuit that they should have the same amount of work to do in the `foo` case, as both should be able to stop at the first character `"f"`. + +Which intuition is correct? I honestly don't know. But I'll make the case it doesn't matter what your intuition is. What might the results be when you test it? Again, I'm making up a pure hypothetical here, I haven't actually tried, nor do I care. + +Let's pretend the test comes back that `X` and `Y` are statistically identical. Have you then confirmed your intuition about the `"f"` character thing? Nope. + +It's possible in our hypothetical that the engine might recognize that the variables `twelve` and `foo` are only being used in one place in each test, and so it might decide to inline those values. 
Then it may realize that `Number( "12" )` can just be replaced by `12`. And maybe it comes to the same conclusion with `parseInt(..)`, or maybe not. + +Or an engine's dead-code removal heuristic could kick in, and it could realize that the `X1`, `X2`, `Y1`, and `Y2` variables aren't being used, so declaring them is irrelevant, so it doesn't end up doing anything at all in either test. + +And all of that is reasoning based only on assumptions about a single test run. Modern engines are fantastically more complicated than what we're intuiting here. They do all sorts of tricks, like tracing and tracking how a piece of code behaves over a short period of time, or with a particularly constrained set of inputs. + +What if the engine optimizes a certain way because of the fixed input, but in your real program you give more varied input and the optimization decisions shake out differently (or not at all!)? Or what if the engine kicks in optimizations because it sees the code being run tens of thousands of times by the benchmarking utility, but in your real program it will only run a hundred times in near proximity, and under those conditions the engine determines the optimizations are not worth it? + +And all those optimizations we just hypothesized about might happen in our constrained test but maybe the engine wouldn't do them in a more complex program (for various reasons). Or it could be reversed -- the engine might not optimize such trivial code but may be more inclined to optimize it more aggressively when the system is already more taxed by a more sophisticated program. + +The point I'm trying to make is that you really don't know for sure exactly what's going on under the covers. All the guesses and hypotheses you can muster amount to hardly anything concrete for really making such decisions. + +Does that mean you can't really do any useful testing? **Definitely not!** + +What this boils down to is that testing *not real* code gives you *not real* results. 
In so much as is possible and practical, you should test actual real, non-trivial snippets of your code, and under as best of real conditions as you can actually hope to. Only then will the results you get have a chance to approximate reality. + +Microbenchmarks like `++x` vs `x++` are so incredibly likely to be bogus, we might as well just flatly assume them as such. + +## jsPerf.com + +While Benchmark.js is useful for testing the performance of your code in whatever JS environment you're running, it cannot be stressed enough that you need to compile test results from lots of different environments (desktop browsers, mobile devices, etc.) if you want to have any hope of reliable test conclusions. + +For example, Chrome on a high-end desktop machine is not likely to perform anywhere near the same as Chrome mobile on a smartphone. And a smartphone with a full battery charge is not likely to perform anywhere near the same as a smartphone with 2% battery life left, when the device is starting to power down the radio and processor. + +If you want to make assertions like "X is faster than Y" in any reasonable sense across more than just a single environment, you're going to need to actually test as many of those real world environments as possible. Just because Chrome executes some X operation faster than Y doesn't mean that all browsers do. And of course you also probably will want to cross-reference the results of multiple browser test runs with the demographics of your users. + +There's an awesome website for this purpose called jsPerf (http://jsperf.com). It uses the Benchmark.js library we talked about earlier to run statistically accurate and reliable tests, and makes the test on an openly available URL that you can pass around to others. + +Each time a test is run, the results are collected and persisted with the test, and the cumulative test results are graphed on the page for anyone to see. 
+ +When creating a test on the site, you start out with two test cases to fill in, but you can add as many as you need. You also have the ability to set up `setup` code that is run at the beginning of each test cycle and `teardown` code run at the end of each cycle. + +**Note:** A trick for doing just one test case (if you're benchmarking a single approach instead of a head-to-head) is to fill in the second test input boxes with placeholder text on first creation, then edit the test and leave the second test blank, which will delete it. You can always add more test cases later. + +You can define the initial page setup (importing libraries, defining utility helper functions, declaring variables, etc.). There are also options for defining setup and teardown behavior if needed -- consult the "Setup/Teardown" section in the Benchmark.js discussion earlier. + +### Sanity Check + +jsPerf is a fantastic resource, but there's an awful lot of tests published that when you analyze them are quite flawed or bogus, for any of a variety of reasons as outlined so far in this chapter. + +Consider: + +```js +// Case 1 +var x = []; +for (var i=0; i<10; i++) { + x[i] = "x"; +} + +// Case 2 +var x = []; +for (var i=0; i<10; i++) { + x[x.length] = "x"; +} + +// Case 3 +var x = []; +for (var i=0; i<10; i++) { + x.push( "x" ); +} +``` + +Some observations to ponder about this test scenario: + +* It's extremely common for devs to put their own loops into test cases, and they forget that Benchmark.js already does all the repetition you need. There's a really strong chance that the `for` loops in these cases are totally unnecessary noise. +* The declaring and initializing of `x` is included in each test case, possibly unnecessarily. Recall from earlier that if `x = []` were in the `setup` code, it wouldn't actually be run before each test iteration, but instead once at the beginning of each cycle. 
That means `x` would continue growing quite large, not just the size `10` implied by the `for` loops. + + So is the intent to make sure the tests are constrained only to how the JS engine behaves with very small arrays (size `10`)? That *could* be the intent, but if it is, you have to consider if that's not focusing far too much on nuanced internal implementation details. + + On the other hand, does the intent of the test embrace the context that the arrays will actually be growing quite large? Is the JS engines' behavior with larger arrays relevant and accurate when compared with the intended real world usage? + +* Is the intent to find out how much `x.length` or `x.push(..)` add to the performance of the operation to append to the `x` array? OK, that might be a valid thing to test. But then again, `push(..)` is a function call, so of course it's going to be slower than `[..]` access. Arguably, cases 1 and 2 are fairer than case 3. + + +Here's another example that illustrates a common apples-to-oranges flaw: + +```js +// Case 1 +var x = ["John","Albert","Sue","Frank","Bob"]; +x.sort(); + +// Case 2 +var x = ["John","Albert","Sue","Frank","Bob"]; +x.sort( function mySort(a,b){ + if (a < b) return -1; + if (a > b) return 1; + return 0; +} ); +``` + +Here, the obvious intent is to find out how much slower the custom `mySort(..)` comparator is than the built-in default comparator. But by specifying the function `mySort(..)` as inline function expression, you've created an unfair/bogus test. Here, the second case is not only testing a custom user JS function, **but it's also testing creating a new function expression for each iteration.** + +Would it surprise you to find out that if you run a similar test but update it to isolate only for creating an inline function expression versus using a pre-declared function, the inline function expression creation can be from 2% to 20% slower!? 
+ +Unless your intent with this test *is* to consider the inline function expression creation "cost," a better/fairer test would put `mySort(..)`'s declaration in the page setup -- don't put it in the test `setup` as that's unnecessary redeclaration for each cycle -- and simply reference it by name in the test case: `x.sort(mySort)`. + +Building on the previous example, another pitfall is in opaquely avoiding or adding "extra work" to one test case that creates an apples-to-oranges scenario: + +```js +// Case 1 +var x = [12,-14,0,3,18,0,2.9]; +x.sort(); + +// Case 2 +var x = [12,-14,0,3,18,0,2.9]; +x.sort( function mySort(a,b){ + return a - b; +} ); +``` + +Setting aside the previously mentioned inline function expression pitfall, the second case's `mySort(..)` works in this case because you have provided it numbers, but would have of course failed with strings. The first case doesn't throw an error, but it actually behaves differently and has a different outcome! It should be obvious, but: **a different outcome between two test cases almost certainly invalidates the entire test!** + +But beyond the different outcomes, in this case, the built in `sort(..)`'s comparator is actually doing "extra work" that `mySort()` does not, in that the built-in one coerces the compared values to strings and does lexicographic comparison. The first snippet results in `[-14, 0, 0, 12, 18, 2.9, 3]` while the second snippet results (likely more accurately based on intent) in `[-14, 0, 0, 2.9, 3, 12, 18]`. + +So that test is unfair because it's not actually doing the same task between the cases. Any results you get are bogus. + +These same pitfalls can even be much more subtle: + +```js +// Case 1 +var x = false; +var y = x ? 1 : 2; + +// Case 2 +var x; +var y = x ? 1 : 2; +``` + +Here, the intent might be to test the performance impact of the coercion to a Boolean that the `? 
:` operator will do if the `x` expression is not already a Boolean (see the *Types & Grammar* title of this book series). So, you're apparently OK with the fact that there is extra work to do the coercion in the second case. + +The subtle problem? You're setting `x`'s value in the first case and not setting it in the other, so you're actually doing work in the first case that you're not doing in the second. To eliminate any potential (albeit minor) skew, try: + +```js +// Case 1 +var x = false; +var y = x ? 1 : 2; + +// Case 2 +var x = undefined; +var y = x ? 1 : 2; +``` + +Now there's an assignment in both cases, so the thing you want to test -- the coercion of `x` or not -- has likely been more accurately isolated and tested. + +## Writing Good Tests + +Let me see if I can articulate the bigger point I'm trying to make here. + +Good test authoring requires careful analytical thinking about what differences exist between two test cases and whether the differences between them are *intentional* or *unintentional*. + +Intentional differences are of course normal and OK, but it's too easy to create unintentional differences that skew your results. You have to be really, really careful to avoid that skew. Moreover, you may intend a difference but it may not be obvious to other readers of your test what your intent was, so they may doubt (or trust!) your test incorrectly. How do you fix that? + +**Write better, clearer tests.** But also, take the time to document (using the jsPerf.com "Description" field and/or code comments) exactly what the intent of your test is, even to the nuanced detail. Call out the intentional differences, which will help others and your future self to better identify unintentional differences that could be skewing the test results. + +Isolate things which aren't relevant to your test by pre-declaring them in the page or test setup settings so they're outside the timed parts of the test. 
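One inexpensive way to catch unintentional differences is to confirm, before timing anything, that the test cases actually produce the same result. The sketch below is one possible approach, using the numeric sort example from earlier. `sameOutcome(..)`, `byNumber(..)`, and the three case functions are hypothetical helpers invented for illustration; they are not part of Benchmark.js or jsPerf.

```js
// Hypothetical helper: run two test cases once each and compare
// their results structurally. Not a Benchmark.js or jsPerf API.
function sameOutcome(caseA,caseB) {
	return JSON.stringify( caseA() ) === JSON.stringify( caseB() );
}

// Comparator declared once, outside the timed code, so that
// function creation is not part of what gets measured.
function byNumber(a,b) {
	return a - b;
}

// Case 1: default comparator (coerces values to strings)
function defaultSort() {
	return [12,-14,0,3,18,0,2.9].sort();
}

// Case 2: pre-declared numeric comparator
function numericSort() {
	return [12,-14,0,3,18,0,2.9].sort( byNumber );
}

// Case 3: same comparator logic, but as an inline function expression
function numericSortInline() {
	return [12,-14,0,3,18,0,2.9].sort( function(a,b){
		return a - b;
	} );
}

sameOutcome( defaultSort, numericSort );	// false
sameOutcome( numericSort, numericSortInline );	// true
```

A `false` here signals that the two cases are not doing the same task, so any timing comparison between them would be suspect. A `true` only says the outcomes match; it says nothing about other differences, such as the per-iteration function creation in the inline case, which still need to be called out as intentional or removed.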
+ +Instead of trying to narrow in on a tiny snippet of your real code and benchmarking just that piece out of context, tests and benchmarks are better when they include a larger (while still relevant) context. Those tests also tend to run slower, which means any differences you spot are more relevant in context. + +## Microperformance + +OK, until now we've been dancing around various microperformance issues and generally looking disfavorably upon obsessing about them. I want to take just a moment to address them directly. + +The first thing you need to get more comfortable with when thinking about performance benchmarking your code is that the code you write is not always the code the engine actually runs. We briefly looked at that topic back in Chapter 1 when we discussed statement reordering by the compiler, but here we're going to suggest the compiler can sometimes decide to run different code than you wrote, not just in different orders but different in substance. + +Let's consider this piece of code: + +```js +var foo = 41; + +(function(){ + (function(){ + (function(baz){ + var bar = foo + baz; + // .. + })(1); + })(); +})(); +``` + +You may think about the `foo` reference in the innermost function as needing to do a three-level scope lookup. We covered in the *Scope & Closures* title of this book series how lexical scope works, and the fact that the compiler generally caches such lookups so that referencing `foo` from different scopes doesn't really practically "cost" anything extra. + +But there's something deeper to consider. What if the compiler realizes that `foo` isn't referenced anywhere else but that one location, and it further notices that the value never is anything except the `41` as shown? + +Isn't it quite possible and acceptable that the JS compiler could decide to just remove the `foo` variable entirely, and *inline* the value, such as this: + +```js +(function(){ + (function(){ + (function(baz){ + var bar = 41 + baz; + // .. 
+ })(1); + })(); +})(); +``` + +**Note:** Of course, the compiler could probably also do a similar analysis and rewrite with the `baz` variable here, too. + +When you begin to think about your JS code as being a hint or suggestion to the engine of what to do, rather than a literal requirement, you realize that a lot of the obsession over discrete syntactic minutia is most likely unfounded. + +Another example: + +```js +function factorial(n) { + if (n < 2) return 1; + return n * factorial( n - 1 ); +} + +factorial( 5 ); // 120 +``` + +Ah, the good ol' fashioned "factorial" algorithm! You might assume that the JS engine will run that code mostly as is. And to be honest, it might -- I'm not really sure. + +But as an anecdote, the same code expressed in C and compiled with advanced optimizations would result in the compiler realizing that the call `factorial(5)` can just be replaced with the constant value `120`, eliminating the function and call entirely! + +Moreover, some engines have a practice called "unrolling recursion," where it can realize that the recursion you've expressed can actually be done "easier" (i.e., more optimally) with a loop. It's possible the preceding code could be *rewritten* by a JS engine to run as: + +```js +function factorial(n) { + if (n < 2) return 1; + + var res = 1; + for (var i=n; i>1; i--) { + res *= i; + } + return res; +} + +factorial( 5 ); // 120 +``` + +Now, let's imagine that in the earlier snippet you had been worried about whether `n * factorial(n-1)` or `n *= factorial(--n)` runs faster. Maybe you even did a performance benchmark to try to figure out which was better. But you miss the fact that in the bigger context, the engine may not run either line of code because it may unroll the recursion! + +Speaking of `--`, `--n` versus `n--` is often cited as one of those places where you can optimize by choosing the `--n` version, because theoretically it requires less effort down at the assembly level of processing. 
+ +That sort of obsession is basically nonsense in modern JavaScript. That's the kind of thing you should be letting the engine take care of. You should write the code that makes the most sense. Compare these three `for` loops: + +```js +// Option 1 +for (var i=0; i<10; i++) { + console.log( i ); +} + +// Option 2 +for (var i=0; i<10; ++i) { + console.log( i ); +} + +// Option 3 +for (var i=-1; ++i<10; ) { + console.log( i ); +} +``` + +Even if you have some theory where the second or third option is more performant than the first option by a tiny bit, which is dubious at best, the third loop is more confusing because you have to start with `-1` for `i` to account for the fact that `++i` pre-increment is used. And the difference between the first and second options is really quite irrelevant. + +It's entirely possible that a JS engine may see a place where `i++` is used and realize that it can safely replace it with the `++i` equivalent, which means your time spent deciding which one to pick was completely wasted and the outcome moot. + +Here's another common example of silly microperformance obsession: + +```js +var x = [ .. ]; + +// Option 1 +for (var i=0; i < x.length; i++) { + // .. +} + +// Option 2 +for (var i=0, len = x.length; i < len; i++) { + // .. +} +``` + +The theory here goes that you should cache the length of the `x` array in the variable `len`, because ostensibly it doesn't change, to avoid paying the price of `x.length` being consulted for each iteration of the loop. + +If you run performance benchmarks around `x.length` usage compared to caching it in a `len` variable, you'll find that while the theory sounds nice, in practice any measured differences are statistically completely irrelevant. + +In fact, in some engines like v8, it can be shown (http://mrale.ph/blog/2014/12/24/array-length-caching.html) that you could make things slightly worse by pre-caching the length instead of letting the engine figure it out for you. 
Don't try to outsmart your JavaScript engine, you'll probably lose when it comes to performance optimizations. + +### Not All Engines Are Alike + +The different JS engines in various browsers can all be "spec compliant" while having radically different ways of handling code. The JS specification doesn't require anything performance related -- well, except ES6's "Tail Call Optimization" covered later in this chapter. + +The engines are free to decide that one operation will receive its attention to optimize, perhaps trading off for lesser performance on another operation. It can be very tenuous to find an approach for an operation that always runs faster in all browsers. + +There's a movement among some in the JS dev community, especially those who work with Node.js, to analyze the specific internal implementation details of the v8 JavaScript engine and make decisions about writing JS code that is tailored to take best advantage of how v8 works. You can actually achieve a surprisingly high degree of performance optimization with such endeavors, so the payoff for the effort can be quite high. + +Some commonly cited examples (https://github.com/petkaantonov/bluebird/wiki/Optimization-killers) for v8: + +* Don't pass the `arguments` variable from one function to any other function, as such "leakage" slows down the function implementation. +* Isolate a `try..catch` in its own function. Browsers struggle with optimizing any function with a `try..catch` in it, so moving that construct to its own function means you contain the de-optimization harm while letting the surrounding code be optimizable. + +But rather than focus on those tips specifically, let's sanity check the v8-only optimization approach in a general sense. + +Are you genuinely writing code that only needs to run in one JS engine? Even if your code is entirely intended for Node.js *right now*, is the assumption that v8 will *always* be the used JS engine reliable? 
Is it possible that someday a few years from now, there's another server-side JS platform besides Node.js that you choose to run your code on? What if what you optimized for before is now a much slower way of doing that operation on the new engine? + +Or what if your code always stays running on v8 from here on out, but v8 decides at some point to change the way some set of operations works such that what used to be fast is now slow, and vice versa? + +These scenarios aren't just theoretical, either. It used to be that it was faster to put multiple string values into an array and then call `join("")` on the array to concatenate the values than to just use `+` concatenation directly with the values. The historical reason for this is nuanced, but it has to do with internal implementation details about how string values were stored and managed in memory. + +As a result, "best practice" advice at the time disseminated across the industry suggesting developers always use the array `join(..)` approach. And many followed. + +Except, somewhere along the way, the JS engines changed approaches for internally managing strings, and specifically put in optimizations for `+` concatenation. They didn't slow down `join(..)` per se, but they put more effort into helping `+` usage, as it was still quite a bit more widespread. + +**Note:** The practice of standardizing or optimizing some particular approach based mostly on its existing widespread usage is often called (metaphorically) "paving the cowpath." + +Once that new approach to handling strings and concatenation took hold, unfortunately all the code out in the wild that was using array `join(..)` to concatenate strings was then sub-optimal. + +Another example: at one time, the Opera browser differed from other browsers in how it handled the boxing/unboxing of primitive wrapper objects (see the *Types & Grammar* title of this book series). 
As such, their advice to developers was to use a `String` object instead of the primitive `string` value if properties like `length` or methods like `charAt(..)` needed to be accessed. This advice may have been correct for Opera at the time, but it was exactly the opposite of what worked best in other major contemporary browsers, as they had optimizations specifically for the `string` primitives and not their object wrapper counterparts. + +I think these various gotchas are at least possible, if not likely, for code even today. So I'm very cautious about making wide-ranging performance optimizations in my JS code based purely on engine implementation details, **especially if those details are only true of a single engine**. + +The reverse is also something to be wary of: you shouldn't necessarily change a piece of code to work around one engine's difficulty with running a piece of code in an acceptably performant way. + +Historically, IE has borne the brunt of many such frustrations, given that there have been plenty of scenarios in older IE versions where it struggled with some performance aspect that other major browsers of the time seemed not to have much trouble with. The string concatenation discussion we just had was actually a real concern back in the IE6 and IE7 days, when it was possible to get better performance out of `join(..)` than `+`. + +But it's troublesome to suggest that just one browser's trouble with performance is justification for using a code approach that quite possibly could be sub-optimal in all other browsers. Even if the browser in question has a large market share for your site's audience, it may be more practical to write the proper code and rely on the browser to update itself with better optimizations eventually. + +"There is nothing more permanent than a temporary hack." Chances are, the code you write now to work around some performance bug will outlive the performance bug in the browser itself. 
+ +In the days when a browser only updated once every five years, that was a tougher call to make. But as it stands now, browsers across the board are updating at a much more rapid pace (though obviously the mobile world still lags), and they're all competing to optimize web features better and better. + +If you run across a case where a browser *does* have a performance wart that others don't suffer from, make sure to report it to the browser's vendor through whatever means you have available. Most browsers have open public bug trackers suitable for this purpose. + +**Tip:** I'd only suggest working around a performance issue in a browser if it was a really drastic show-stopper, not just an annoyance or frustration. And I'd be very careful to check that the performance hack didn't have noticeable negative side effects in another browser. + +### Big Picture + +Instead of worrying about all these microperformance nuances, we should instead be looking at big-picture types of optimizations. + +How do you know what's big picture or not? You have to first understand if your code is running on a critical path or not. If it's not on the critical path, chances are your optimizations are not worth much. + +Ever heard the admonition, "that's premature optimization!"? It comes from a famous quote from Donald Knuth: "premature optimization is the root of all evil." Many developers cite this quote to suggest that most optimizations are "premature" and are thus a waste of effort. The truth is, as usual, more nuanced. + +Here is Knuth's quote, in context: + +> Programmers waste enormous amounts of time thinking about, or worrying about, the speed of **noncritical** parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that **critical** 3%. 
[emphasis added] + +(http://web.archive.org/web/20130731202547/http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf, Computing Surveys, Vol 6, No 4, December 1974) + +I believe it's a fair paraphrasing to say that Knuth *meant*: "non-critical path optimization is the root of all evil." So the key is to figure out if your code is on the critical path -- you should optimize it! -- or not. + +I'd even go so far as to say this: no amount of time spent optimizing critical paths is wasted, no matter how little is saved; but no amount of optimization on noncritical paths is justified, no matter how much is saved. + +If your code is on the critical path, such as a "hot" piece of code that's going to be run over and over again, or in UX critical places where users will notice, like an animation loop or CSS style updates, then you should spare no effort in trying to employ relevant, measurably significant optimizations. + +For example, consider a critical path animation loop that needs to coerce a string value to a number. There are of course multiple ways to do that (see the *Types & Grammar* title of this book series), but which one if any is the fastest? + +```js +var x = "42"; // need number `42` + +// Option 1: let implicit coercion automatically happen +var y = x / 2; + +// Option 2: use `parseInt(..)` +var y = parseInt( x, 0 ) / 2; + +// Option 3: use `Number(..)` +var y = Number( x ) / 2; + +// Option 4: use `+` unary operator +var y = +x / 2; + +// Option 5: use `|` bitwise OR operator +var y = (x | 0) / 2; +``` + +**Note:** I will leave it as an exercise to the reader to set up a test if you're interested in examining the minute differences in performance among these options. + +When considering these different options, as they say, "One of these things is not like the others." `parseInt(..)` does the job, but it also does a lot more -- it parses the string rather than just coercing. 
You can probably guess, correctly, that `parseInt(..)` is a slower option, and you should probably avoid it. + +Of course, if `x` can ever be a value that **needs parsing**, such as `"42px"` (like from a CSS style lookup), then `parseInt(..)` really is the only suitable option! + +`Number(..)` is also a function call. From a behavioral perspective, it's identical to the `+` unary operator option, but it may in fact be a little slower, requiring more machinery to execute the function. Of course, it's also possible that the JS engine recognizes this behavioral symmetry and just handles the inlining of `Number(..)`'s behavior (aka `+x`) for you! + +But remember, obsessing about `+x` versus `x | 0` is in most cases likely a waste of effort. This is a microperformance issue, and one that you shouldn't let dictate/degrade the readability of your program. + +While performance is very important in critical paths of your program, it's not the only factor. Among several options that are roughly similar in performance, readability should be another important concern. + +## Tail Call Optimization (TCO) + +As we briefly mentioned earlier, ES6 includes a specific requirement that ventures into the world of performance. It's related to a specific form of optimization that can occur with function calls: *tail call optimization*. + +Briefly, a "tail call" is a function call that appears at the "tail" of another function, such that after the call finishes, there's nothing left to do (except perhaps return its result value). + +For example, here's a non-recursive setup with tail calls: + +```js +function foo(x) { + return x; +} + +function bar(y) { + return foo( y + 1 ); // tail call +} + +function baz() { + return 1 + bar( 40 ); // not tail call +} + +baz(); // 42 +``` + +`foo(y+1)` is a tail call in `bar(..)` because after `foo(..)` finishes, `bar(..)` is also finished except in this case returning the result of the `foo(..)` call. 
However, `bar(40)` is *not* a tail call because after it completes, its result value must be added to `1` before `baz()` can return it. + +Without getting into too much nitty-gritty detail, calling a new function requires an extra amount of reserved memory to manage the call stack, called a "stack frame." So the preceding snippet would generally require a stack frame for each of `baz()`, `bar(..)`, and `foo(..)` all at the same time. + +However, if a TCO-capable engine can realize that the `foo(y+1)` call is in *tail position*, meaning `bar(..)` is basically complete, then when calling `foo(..)`, it doesn't need to create a new stack frame, but can instead reuse the existing stack frame from `bar(..)`. That's not only faster, but it also uses less memory. + +That sort of optimization isn't a big deal in a simple snippet, but it becomes a *much bigger deal* when dealing with recursion, especially if the recursion could have resulted in hundreds or thousands of stack frames. With TCO, the engine can perform all those calls with a single stack frame! + +Recursion is a hairy topic in JS because without TCO, engines have had to implement arbitrary (and different!) limits to how deep they will let the recursion stack get before they stop it, to prevent running out of memory. With TCO, recursive functions with *tail position* calls can essentially run unbounded, because there's never any extra usage of memory! + +Consider that recursive `factorial(..)` from before, but rewritten to make it TCO friendly: + +```js +function factorial(n) { + function fact(n,res) { + if (n < 2) return res; + + return fact( n - 1, n * res ); + } + + return fact( n, 1 ); +} + +factorial( 5 ); // 120 +``` + +This version of `factorial(..)` is still recursive, but it's also optimizable with TCO, because both inner `fact(..)` calls are in *tail position*. + +**Note:** It's important to note that TCO only applies if there's actually a tail call. 
If you write recursive functions without tail calls, the performance will still fall back to normal stack frame allocation, and the engines' limits on such recursive call stacks will still apply. Many recursive functions can be rewritten as we just showed with `factorial(..)`, but it takes careful attention to detail. + +One reason that ES6 requires engines to implement TCO rather than leaving it up to their discretion is that the *lack of TCO* actually tends to reduce the chances that certain algorithms will be implemented in JS using recursion, for fear of the call stack limits. + +If the lack of TCO in the engine would just gracefully degrade to slower performance in all cases, it probably wouldn't have been something that ES6 needed to *require*. But because the lack of TCO can actually make certain programs impractical, it's more an important feature of the language than just a hidden implementation detail. + +ES6 guarantees that from now on, JS developers will be able to rely on this optimization across all ES6+ compliant browsers. That's a win for JS performance! + +## Review + +Effectively benchmarking performance of a piece of code, especially to compare it to another option for that same code to see which approach is faster, requires careful attention to detail. + +Rather than rolling your own statistically valid benchmarking logic, just use the Benchmark.js library, which does that for you. But be careful about how you author tests, because it's far too easy to construct a test that seems valid but that's actually flawed -- even tiny differences can skew the results to be completely unreliable. + +It's important to get as many test results from as many different environments as possible to eliminate hardware/device bias. jsPerf.com is a fantastic website for crowdsourcing performance benchmark test runs. + +Many common performance tests unfortunately obsess about irrelevant microperformance details like `x++` versus `++x`. 
Writing good tests means understanding how to focus on big picture concerns, like optimizing on the critical path, and avoiding falling into traps like different JS engines' implementation details. + +Tail call optimization (TCO) is a required optimization as of ES6 that will make some recursive patterns practical in JS where they would have been impossible otherwise. TCO allows a function call in the *tail position* of another function to execute without needing any extra resources, which means the engine no longer needs to place arbitrary restrictions on call stack depth for recursive algorithms. diff --git a/async & performance/cover.jpg b/async & performance/cover.jpg new file mode 100644 index 0000000..1509ef9 Binary files /dev/null and b/async & performance/cover.jpg differ diff --git a/async & performance/foreword.md b/async & performance/foreword.md new file mode 100644 index 0000000..97f7478 --- /dev/null +++ b/async & performance/foreword.md @@ -0,0 +1,22 @@ +# You Don't Know JS: Async & Performance +# Foreword + +Over the years, my employer has trusted me enough to conduct interviews. If we're looking for someone with skills in JavaScript, my first line of questioning… actually that's not true, I first check if the candidate needs the bathroom and/or a drink, because comfort is important, but once I'm past the bit about the candidate's fluid in/out-take, I set about determining if the candidate knows JavaScript, or just jQuery. + +Not that there's anything wrong with jQuery. It lets you do a lot without really knowing JavaScript, and that's a feature not a bug. But if the job calls for advanced skills in JavaScript performance and maintainability, you need someone who knows how libraries such as jQuery are put together. You need to be able to harness the core of JavaScript the same way they do. 
+ +If I want to get a picture of someone's core JavaScript skill, I'm most interested in what they make of closures (you've read that book of this series already, right?) and how to get the most out of asynchronicity, which brings us to this book. + +For starters, you'll be taken through callbacks, the bread and butter of asynchronous programming. Of course, bread and butter does not make for a particularly satisfying meal, but the next course is full of tasty tasty promises! + +If you don't know promises, now is the time to learn. Promises are now the official way to provide async return values in both JavaScript and the DOM. All future async DOM APIs will use them, many already do, so be prepared! At the time of writing, Promises have shipped in most major browsers, with IE shipping soon. Once you've finished that, I hope you left room for the next course, Generators. + +Generators snuck their way into stable versions of Chrome and Firefox without too much pomp and ceremony, because, frankly, they're more complicated than they are interesting. Or, that's what I thought until I saw them combined with promises. There, they become an important tool in readability and maintenance. + +For dessert, well, I won't spoil the surprise, but prepare to gaze into the future of JavaScript! Features that give you more and more control over concurrency and asynchronicity. + +Well, I won't block your enjoyment of the book any longer, on with the show! If you've already read part of the book before reading this Foreword, give yourself 10 asynchronous points! You deserve them! + +Jake Archibald
+[jakearchibald.com](http://jakearchibald.com), [@jaffathecake](http://twitter.com/jaffathecake)
+Developer Advocate at Google Chrome diff --git a/async & performance/toc.md b/async & performance/toc.md new file mode 100644 index 0000000..1122607 --- /dev/null +++ b/async & performance/toc.md @@ -0,0 +1,51 @@ +# You Don't Know JS: Async & Performance + +## Table of Contents + +* Foreword +* Preface +* Chapter 1: Asynchrony: Now & Later + * A Program In Chunks + * Event Loop + * Parallel Threading + * Concurrency + * Jobs + * Statement Ordering +* Chapter 2: Callbacks + * Continuations + * Sequential Brain + * Trust Issues + * Trying To Save Callbacks +* Chapter 3: Promises + * What is a Promise? + * Thenable Duck-Typing + * Promise Trust + * Chain Flow + * Error Handling + * Promise Patterns + * Promise API Recap + * Promise Limitations +* Chapter 4: Generators + * Breaking Run-to-completion + * Generator'ing Values + * Iterating Generators Asynchronously + * Generators + Promises + * Generator Delegation + * Generator Concurrency + * Thunks + * Pre-ES6 Generators +* Chapter 5: Program Performance + * Web Workers + * SIMD + * asm.js +* Chapter 6: Benchmarking & Tuning + * Benchmarking + * Context Is King + * jsPerf.com + * Writing Good Tests + * Microperformance + * Tail Call Optimization (TCO) +* Appendix A: *asynquence* Library +* Appendix B: Advanced Async Patterns +* Appendix C: Acknowledgments + diff --git a/es6 & beyond/README.md b/es6 & beyond/README.md new file mode 100644 index 0000000..164357a --- /dev/null +++ b/es6 & beyond/README.md @@ -0,0 +1,23 @@ +# You Don't Know JS: ES6 & Beyond + + + +----- + +**[Purchase digital/print copy from O'Reilly](http://shop.oreilly.com/product/0636920033769.do)** + +----- + +[Table of Contents](toc.md) + +* [Foreword](foreword.md) (by [Rick Waldron](http://bocoup.com/weblog/author/rick-waldron/)) +* [Preface](../preface.md) +* [Chapter 1: ES? 
Now & Future](ch1.md) +* [Chapter 2: Syntax](ch2.md) +* [Chapter 3: Organization](ch3.md) +* [Chapter 4: Async Flow Control](ch4.md) +* [Chapter 5: Collections](ch5.md) +* [Chapter 6: API Additions](ch6.md) +* [Chapter 7: Meta Programming](ch7.md) +* [Chapter 8: Beyond ES6](ch8.md) +* [Appendix A: Thank You's!](apA.md) diff --git a/es6 & beyond/apA.md b/es6 & beyond/apA.md new file mode 100644 index 0000000..054353a --- /dev/null +++ b/es6 & beyond/apA.md @@ -0,0 +1,20 @@ +# You Don't Know JS: ES6 & Beyond +# Appendix A: Acknowledgments + +I have many people to thank for making this book title and the overall series happen. + +First, I must thank my wife Christen Simpson, and my two kids Ethan and Emily, for putting up with Dad always pecking away at the computer. Even when not writing books, my obsession with JavaScript glues my eyes to the screen far more than it should. That time I borrow from my family is the reason these books can so deeply and completely explain JavaScript to you, the reader. I owe my family everything. + +I'd like to thank my editors at O'Reilly, namely Simon St.Laurent and Brian MacDonald, as well as the rest of the editorial and marketing staff. They are fantastic to work with, and have been especially accommodating during this experiment into "open source" book writing, editing, and production. + +Thank you to the many folks who have participated in making this book series better by providing editorial suggestions and corrections, including Shelley Powers, Tim Ferro, Evan Borden, Forrest L. Norvell, Jennifer Davis, Jesse Harlin, and many others. A big thank you to Rick Waldron for writing the Foreword for this title. + +Thank you to the countless folks in the community, including members of the TC39 committee, who have shared so much knowledge with the rest of us, and especially tolerated my incessant questions and explorations with patience and detail. 
John-David Dalton, Juriy "kangax" Zaytsev, Mathias Bynens, Axel Rauschmayer, Nicholas Zakas, Angus Croll, Reginald Braithwaite, Dave Herman, Brendan Eich, Allen Wirfs-Brock, Bradley Meck, Domenic Denicola, David Walsh, Tim Disney, Peter van der Zee, Andrea Giammarchi, Kit Cambridge, Eric Elliott, André Bargull, Caitlin Potter, Brian Terlson, Ingvar Stepanyan, Chris Dickinson, Luke Hoban, and so many others, I can't even scratch the surface. + +The *You Don't Know JS* book series was born on Kickstarter, so I also wish to thank all my (nearly) 500 generous backers, without whom this book series could not have happened: + +> Jan Szpila, nokiko, Murali Krishnamoorthy, Ryan Joy, Craig Patchett, pdqtrader, Dale Fukami, ray hatfield, R0drigo Perez [Mx], Dan Petitt, Jack Franklin, Andrew Berry, Brian Grinstead, Rob Sutherland, Sergi Meseguer, Phillip Gourley, Mark Watson, Jeff Carouth, Alfredo Sumaran, Martin Sachse, Marcio Barrios, Dan, AimelyneM, Matt Sullivan, Delnatte Pierre-Antoine, Jake Smith, Eugen Tudorancea, Iris, David Trinh, simonstl, Ray Daly, Uros Gruber, Justin Myers, Shai Zonis, Mom & Dad, Devin Clark, Dennis Palmer, Brian Panahi Johnson, Josh Marshall, Marshall, Dennis Kerr, Matt Steele, Erik Slagter, Sacah, Justin Rainbow, Christian Nilsson, Delapouite, D.Pereira, Nicolas Hoizey, George V. 
Reilly, Dan Reeves, Bruno Laturner, Chad Jennings, Shane King, Jeremiah Lee Cohick, od3n, Stan Yamane, Marko Vucinic, Jim B, Stephen Collins, Ægir Þorsteinsson, Eric Pederson, Owain, Nathan Smith, Jeanetteurphy, Alexandre ELISÉ, Chris Peterson, Rik Watson, Luke Matthews, Justin Lowery, Morten Nielsen, Vernon Kesner, Chetan Shenoy, Paul Tregoing, Marc Grabanski, Dion Almaer, Andrew Sullivan, Keith Elsass, Tom Burke, Brian Ashenfelter, David Stuart, Karl Swedberg, Graeme, Brandon Hays, John Christopher, Gior, manoj reddy, Chad Smith, Jared Harbour, Minoru TODA, Chris Wigley, Daniel Mee, Mike, Handyface, Alex Jahraus, Carl Furrow, Rob Foulkrod, Max Shishkin, Leigh Penny Jr., Robert Ferguson, Mike van Hoenselaar, Hasse Schougaard, rajan venkataguru, Jeff Adams, Trae Robbins, Rolf Langenhuijzen, Jorge Antunes, Alex Koloskov, Hugh Greenish, Tim Jones, Jose Ochoa, Michael Brennan-White, Naga Harish Muvva, Barkóczi Dávid, Kitt Hodsden, Paul McGraw, Sascha Goldhofer, Andrew Metcalf, Markus Krogh, Michael Mathews, Matt Jared, Juanfran, Georgie Kirschner, Kenny Lee, Ted Zhang, Amit Pahwa, Inbal Sinai, Dan Raine, Schabse Laks, Michael Tervoort, Alexandre Abreu, Alan Joseph Williams, NicolasD, Cindy Wong, Reg Braithwaite, LocalPCGuy, Jon Friskics, Chris Merriman, John Pena, Jacob Katz, Sue Lockwood, Magnus Johansson, Jeremy Crapsey, Grzegorz Pawłowski, nico nuzzaci, Christine Wilks, Hans Bergren, charles montgomery, Ariel בר-לבב Fogel, Ivan Kolev, Daniel Campos, Hugh Wood, Christian Bradford, Frédéric Harper, Ionuţ Dan Popa, Jeff Trimble, Rupert Wood, Trey Carrico, Pancho Lopez, Joël kuijten, Tom A Marra, Jeff Jewiss, Jacob Rios, Paolo Di Stefano, Soledad Penades, Chris Gerber, Andrey Dolganov, Wil Moore III, Thomas Martineau, Kareem, Ben Thouret, Udi Nir, Morgan Laupies, jory carson-burson, Nathan L Smith, Eric Damon Walters, Derry Lozano-Hoyland, Geoffrey Wiseman, mkeehner, KatieK, Scott MacFarlane, Brian LaShomb, Adrien Mas, christopher ross, Ian Littman, Dan Atkinson, 
Elliot Jobe, Nick Dozier, Peter Wooley, John Hoover, dan, Martin A. Jackson, Héctor Fernando Hurtado, andy ennamorato, Paul Seltmann, Melissa Gore, Dave Pollard, Jack Smith, Philip Da Silva, Guy Israeli, @megalithic, Damian Crawford, Felix Gliesche, April Carter Grant, Heidi, jim tierney, Andrea Giammarchi, Nico Vignola, Don Jones, Chris Hartjes, Alex Howes, john gibbon, David J. Groom, BBox, Yu 'Dilys' Sun, Nate Steiner, Brandon Satrom, Brian Wyant, Wesley Hales, Ian Pouncey, Timothy Kevin Oxley, George Terezakis, sanjay raj, Jordan Harband, Marko McLion, Wolfgang Kaufmann, Pascal Peuckert, Dave Nugent, Markus Liebelt, Welling Guzman, Nick Cooley, Daniel Mesquita, Robert Syvarth, Chris Coyier, Rémy Bach, Adam Dougal, Alistair Duggin, David Loidolt, Ed Richer, Brian Chenault, GoldFire Studios, Carles Andrés, Carlos Cabo, Yuya Saito, roberto ricardo, Barnett Klane, Mike Moore, Kevin Marx, Justin Love, Joe Taylor, Paul Dijou, Michael Kohler, Rob Cassie, Mike Tierney, Cody Leroy Lindley, tofuji, Shimon Schwartz, Raymond, Luc De Brouwer, David Hayes, Rhys Brett-Bowen, Dmitry, Aziz Khoury, Dean, Scott Tolinski - Level Up, Clement Boirie, Djordje Lukic, Anton Kotenko, Rafael Corral, Philip Hurwitz, Jonathan Pidgeon, Jason Campbell, Joseph C., SwiftOne, Jan Hohner, Derick Bailey, getify, Daniel Cousineau, Chris Charlton, Eric Turner, David Turner, Joël Galeran, Dharma Vagabond, adam, Dirk van Bergen, dave ♥♫★ furf, Vedran Zakanj, Ryan McAllen, Natalie Patrice Tucker, Eric J. Bivona, Adam Spooner, Aaron Cavano, Kelly Packer, Eric J, Martin Drenovac, Emilis, Michael Pelikan, Scott F. 
Walter, Josh Freeman, Brandon Hudgeons, vijay chennupati, Bill Glennon, Robin R., Troy Forster, otaku_coder, Brad, Scott, Frederick Ostrander, Adam Brill, Seb Flippence, Michael Anderson, Jacob, Adam Randlett, Standard, Joshua Clanton, Sebastian Kouba, Chris Deck, SwordFire, Hannes Papenberg, Richard Woeber, hnzz, Rob Crowther, Jedidiah Broadbent, Sergey Chernyshev, Jay-Ar Jamon, Ben Combee, luciano bonachela, Mark Tomlinson, Kit Cambridge, Michael Melgares, Jacob Adams, Adrian Bruinhout, Bev Wieber, Scott Puleo, Thomas Herzog, April Leone, Daniel Mizieliński, Kees van Ginkel, Jon Abrams, Erwin Heiser, Avi Laviad, David newell, Jean-Francois Turcot, Niko Roberts, Erik Dana, Charles Neill, Aaron Holmes, Grzegorz Ziółkowski, Nathan Youngman, Timothy, Jacob Mather, Michael Allan, Mohit Seth, Ryan Ewing, Benjamin Van Treese, Marcelo Santos, Denis Wolf, Phil Keys, Chris Yung, Timo Tijhof, Martin Lekvall, Agendine, Greg Whitworth, Helen Humphrey, Dougal Campbell, Johannes Harth, Bruno Girin, Brian Hough, Darren Newton, Craig McPheat, Olivier Tille, Dennis Roethig, Mathias Bynens, Brendan Stromberger, sundeep, John Meyer, Ron Male, John F Croston III, gigante, Carl Bergenhem, B.J. 
May, Rebekah Tyler, Ted Foxberry, Jordan Reese, Terry Suitor, afeliz, Tom Kiefer, Darragh Duffy, Kevin Vanderbeken, Andy Pearson, Simon Mac Donald, Abid Din, Chris Joel, Tomas Theunissen, David Dick, Paul Grock, Brandon Wood, John Weis, dgrebb, Nick Jenkins, Chuck Lane, Johnny Megahan, marzsman, Tatu Tamminen, Geoffrey Knauth, Alexander Tarmolov, Jeremy Tymes, Chad Auld, Sean Parmelee, Rob Staenke, Dan Bender, Yannick derwa, Joshua Jones, Geert Plaisier, Tom LeZotte, Christen Simpson, Stefan Bruvik, Justin Falcone, Carlos Santana, Michael Weiss, Pablo Villoslada, Peter deHaan, Dimitris Iliopoulos, seyDoggy, Adam Jordens, Noah Kantrowitz, Amol M, Matthew Winnard, Dirk Ginader, Phinam Bui, David Rapson, Andrew Baxter, Florian Bougel, Michael George, Alban Escalier, Daniel Sellers, Sasha Rudan, John Green, Robert Kowalski, David I. Teixeira (@ditma, Charles Carpenter, Justin Yost, Sam S, Denis Ciccale, Kevin Sheurs, Yannick Croissant, Pau Fracés, Stephen McGowan, Shawn Searcy, Chris Ruppel, Kevin Lamping, Jessica Campbell, Christopher Schmitt, Sablons, Jonathan Reisdorf, Bunni Gek, Teddy Huff, Michael Mullany, Michael Fürstenberg, Carl Henderson, Rick Yoesting, Scott Nichols, Hernán Ciudad, Andrew Maier, Mike Stapp, Jesse Shawl, Sérgio Lopes, jsulak, Shawn Price, Joel Clermont, Chris Ridmann, Sean Timm, Jason Finch, Aiden Montgomery, Elijah Manor, Derek Gathright, Jesse Harlin, Dillon Curry, Courtney Myers, Diego Cadenas, Arne de Bree, João Paulo Dubas, James Taylor, Philipp Kraeutli, Mihai Păun, Sam Gharegozlou, joshjs, Matt Murchison, Eric Windham, Timo Behrmann, Andrew Hall, joshua price, Théophile Villard + +This book series is being produced in an open source fashion, including editing and production. We owe GitHub a debt of gratitude for making that sort of thing possible for the community! + +Thank you again to all the countless folks I didn't name but who I nonetheless owe thanks. 
May this book series be "owned" by all of us and serve to contribute to increasing awareness and understanding of the JavaScript language, to the benefit of all current and future community contributors. diff --git a/es6 & beyond/ch1.md b/es6 & beyond/ch1.md new file mode 100644 index 0000000..ee03de8 --- /dev/null +++ b/es6 & beyond/ch1.md @@ -0,0 +1,119 @@ +# You Don't Know JS: ES6 & Beyond +# Chapter 1: ES? Now & Future + +Before you dive into this book, you should have a solid working proficiency over JavaScript up to the most recent standard (at the time of this writing), which is commonly called *ES5* (technically ES 5.1). Here, we plan to talk squarely about the upcoming *ES6*, as well as cast our vision beyond to understand how JS will evolve moving forward. + +If you are still looking for confidence with JavaScript, I highly recommend you read the other titles in this series first: + +* *Up & Going*: Are you new to programming and JS? This is the roadmap you need to consult as you start your learning journey. +* *Scope & Closures*: Did you know that JS lexical scope is based on compiler (not interpreter!) semantics? Can you explain how closures are a direct result of lexical scope and functions as values? +* *this & Object Prototypes*: Can you recite the four simple rules for how `this` is bound? Have you been muddling through fake "classes" in JS instead of adopting the simpler "behavior delegation" design pattern? Ever heard of *objects linked to other objects* (OLOO)? +* *Types & Grammar*: Do you know the built-in types in JS, and more importantly, do you know how to properly and safely use coercion between types? How comfortable are you with the nuances of JS grammar/syntax? +* *Async & Performance*: Are you still using callbacks to manage your asynchrony? Can you explain what a promise is and why/how it solves "callback hell"? Do you know how to use generators to improve the legibility of async code? 
What exactly constitutes mature optimization of JS programs and individual operations? + +If you've already read all those titles and you feel pretty comfortable with the topics they cover, it's time we dive into the evolution of JS to explore all the changes coming not only soon but farther over the horizon. + +Unlike ES5, ES6 is not just a modest set of new APIs added to the language. It incorporates a whole slew of new syntactic forms, some of which may take quite a bit of getting used to. There's also a variety of new organization forms and new API helpers for various data types. + +ES6 is a radical jump forward for the language. Even if you think you know JS in ES5, ES6 is full of new stuff you *don't know yet*, so get ready! This book explores all the major themes of ES6 that you need to get up to speed on, and even gives you a glimpse of future features coming down the track that you should be aware of. + +**Warning:** All code in this book assumes an ES6+ environment. At the time of this writing, ES6 support varies quite a bit in browsers and JS environments (like Node.js), so your mileage may vary. + +## Versioning + +The JavaScript standard is referred to officially as "ECMAScript" (abbreviated "ES"), and up until just recently has been versioned entirely by ordinal number (i.e., "5" for "5th edition"). + +The earliest versions, ES1 and ES2, were not widely known or implemented. ES3 was the first widespread baseline for JavaScript, and constitutes the JavaScript standard for browsers like IE6-8 and older Android 2.x mobile browsers. For political reasons beyond what we'll cover here, the ill-fated ES4 never came about. + +In 2009, ES5 was officially finalized (later ES5.1 in 2011), and settled as the widespread standard for JS for the modern revolution and explosion of browsers, such as Firefox, Chrome, Opera, Safari, and many others. 
+ +Leading up to the expected *next* version of JS (slipped from 2013 to 2014 and then 2015), the obvious and common label in discourse has been ES6. + +However, late into the ES6 specification timeline, suggestions have surfaced that versioning may in the future switch to a year-based schema, such as ES2016 (aka ES7) to refer to whatever version of the specification is finalized before the end of 2016. Some disagree, but ES6 will likely maintain its dominant mindshare over the late-change substitute ES2015. However, ES2016 may in fact signal the new year-based schema. + +It has also been observed that the pace of JS evolution is much faster even than single-year versioning. As soon as an idea begins to progress through standards discussions, browsers start prototyping the feature, and early adopters start experimenting with the code. + +Usually well before there's an official stamp of approval, a feature is de facto standardized by virtue of this early engine/tooling prototyping. So it's also valid to consider the future of JS versioning to be per-feature rather than per-arbitrary-collection-of-major-features (as it is now) or even per-year (as it may become). + +The takeaway is that the version labels stop being as important, and JavaScript starts to be seen more as an evergreen, living standard. The best way to cope with this is to stop thinking about your code base as being "ES6-based," for instance, and instead consider it feature by feature for support. + +## Transpiling + +Made even worse by the rapid evolution of features, a problem arises for JS developers who at once may both strongly desire to use new features while at the same time being slapped with the reality that their sites/apps may need to support older browsers without such support. + +The way ES5 appears to have played out in the broader industry, the typical mindset was that code bases waited to adopt ES5 until most if not all pre-ES5 environments had fallen out of their support spectrum. 
As a result, many are just recently (at the time of this writing) starting to adopt things like `strict` mode, which landed in ES5 over five years ago. + +It's widely considered to be a harmful approach for the future of the JS ecosystem to wait around and trail the specification by so many years. All those responsible for evolving the language desire for developers to begin basing their code on the new features and patterns as soon as they stabilize in specification form and browsers have a chance to implement them. + +So how do we resolve this seeming contradiction? The answer is tooling, specifically a technique called *transpiling* (transformation + compiling). Roughly, the idea is to use a special tool to transform your ES6 code into equivalent (or close!) matches that work in ES5 environments. + +For example, consider shorthand property definitions (see "Object Literal Extensions" in Chapter 2). Here's the ES6 form: + +```js +var foo = [1,2,3]; + +var obj = { + foo // means `foo: foo` +}; + +obj.foo; // [1,2,3] +``` + +But (roughly) here's how that transpiles: + +```js +var foo = [1,2,3]; + +var obj = { + foo: foo +}; + +obj.foo; // [1,2,3] +``` + +This is a minor but pleasant transformation that lets us shorten the `foo: foo` in an object literal declaration to just `foo`, if the names are the same. + +Transpilers perform these transformations for you, usually in a build workflow step similar to how you perform linting, minification, and other similar operations. + +### Shims/Polyfills + +Not all new ES6 features need a transpiler. Polyfills (aka shims) are a pattern for defining equivalent behavior from a newer environment into an older environment, when possible. Syntax cannot be polyfilled, but APIs often can be. + +For example, `Object.is(..)` is a new utility for checking strict equality of two values but without the nuanced exceptions that `===` has for `NaN` and `-0` values. 
The polyfill for `Object.is(..)` is pretty easy: + +```js +if (!Object.is) { + Object.is = function(v1, v2) { + // test for `-0` + if (v1 === 0 && v2 === 0) { + return 1 / v1 === 1 / v2; + } + // test for `NaN` + if (v1 !== v1) { + return v2 !== v2; + } + // everything else + return v1 === v2; + }; +} +``` + +**Tip:** Pay attention to the outer `if` statement guard wrapped around the polyfill. This is an important detail, which means the snippet only defines its fallback behavior for older environments where the API in question isn't already defined; it would be very rare that you'd want to overwrite an existing API. + +There's a great collection of ES6 shims called "ES6 Shim" (https://github.com/paulmillr/es6-shim/) that you should definitely adopt as a standard part of any new JS project! + +It is assumed that JS will continue to evolve constantly, with browsers rolling out support for features continually rather than in large chunks. So the best strategy for keeping updated as it evolves is to just introduce polyfill shims into your code base, and a transpiler step into your build workflow, right now and get used to that new reality. + +If you decide to keep the status quo and just wait around for all browsers without a feature supported to go away before you start using the feature, you're always going to be way behind. You'll sadly be missing out on all the innovations designed to make writing JavaScript more effective, efficient, and robust. + +## Review + +ES6 (some may try to call it ES2015) is just landing as of the time of this writing, and it has lots of new stuff you need to learn! + +But it's even more important to shift your mindset to align with the new way that JavaScript is going to evolve. It's not just waiting around for years for some official document to get a vote of approval, as many have done in the past. 
+ +Now, JavaScript features land in browsers as they become ready, and it's up to you whether you'll get on the train early or whether you'll be playing costly catch-up games years from now. + +Whatever labels that future JavaScript adopts, it's going to move a lot quicker than it ever has before. Transpilers and shims/polyfills are important tools to keep you on the forefront of where the language is headed. + +If there's any narrative important to understand about the new reality for JavaScript, it's that all JS developers are strongly implored to move from the trailing edge of the curve to the leading edge. And learning ES6 is where that all starts! diff --git a/es6 & beyond/ch2.md b/es6 & beyond/ch2.md new file mode 100644 index 0000000..9f07cb2 --- /dev/null +++ b/es6 & beyond/ch2.md @@ -0,0 +1,2822 @@ +# You Don't Know JS: ES6 & Beyond +# Chapter 2: Syntax + +If you've been writing JS for any length of time, odds are the syntax is pretty familiar to you. There are certainly many quirks, but overall it's a fairly reasonable and straightforward syntax that draws many similarities from other languages. + +However, ES6 adds quite a few new syntactic forms that take some getting used to. In this chapter, we'll tour through them to find out what's in store. + +**Tip:** At the time of this writing, some of the features discussed in this book have been implemented in various browsers (Firefox, Chrome, etc.), but some have only been partially implemented and many others have not been implemented at all. Your experience may be mixed trying these examples directly. If so, try them out with transpilers, as most of these features are covered by those tools. ES6Fiddle (http://www.es6fiddle.net/) is a great, easy-to-use playground for trying out ES6, as is the online REPL for the Babel transpiler (http://babeljs.io/repl/). + +## Block-Scoped Declarations + +You're probably aware that the fundamental unit of variable scoping in JavaScript has always been the `function`. 
If you needed to create a block of scope, the most prevalent way to do so other than a regular function declaration was the immediately invoked function expression (IIFE). For example: + +```js +var a = 2; + +(function IIFE(){ + var a = 3; + console.log( a ); // 3 +})(); + +console.log( a ); // 2 +``` + +### `let` Declarations + +However, we can now create declarations that are bound to any block, called (unsurprisingly) *block scoping*. This means all we need is a pair of `{ .. }` to create a scope. Instead of using `var`, which always declares variables attached to the enclosing function (or global, if top level) scope, use `let`: + +```js +var a = 2; + +{ + let a = 3; + console.log( a ); // 3 +} + +console.log( a ); // 2 +``` + +It's not very common or idiomatic thus far in JS to use a standalone `{ .. }` block, but it's always been valid. And developers from other languages that have *block scoping* will readily recognize that pattern. + +I believe this is the best way to create block-scoped variables, with a dedicated `{ .. }` block. Moreover, you should always put the `let` declaration(s) at the very top of that block. If you have more than one to declare, I'd recommend using just one `let`. + +Stylistically, I even prefer to put the `let` on the same line as the opening `{`, to make it clearer that this block is only for the purpose of declaring the scope for those variables. + +```js +{ let a = 2, b, c; + // .. +} +``` + +Now, that's going to look strange and it's not likely going to match the recommendations given in most other ES6 literature. But I have reasons for my madness. + +There's another experimental (not standardized) form of the `let` declaration called the `let`-block, which looks like: + +```js +let (a = 2, b, c) { + // .. +} +``` + +That form is what I call *explicit* block scoping, whereas the `let ..` declaration form that mirrors `var` is more *implicit*, as it kind of hijacks whatever `{ .. }` pair it's found in. 
Generally developers find *explicit* mechanisms a bit more preferable than *implicit* mechanisms, and I claim this is one of those cases. + +If you compare the previous two snippet forms, they're very similar, and in my opinion both qualify stylistically as *explicit* block scoping. Unfortunately, the `let (..) { .. }` form, the most *explicit* of the options, was not adopted in ES6. That may be revisited post-ES6, but for now the former option is our best bet, I think. + +To reinforce the *implicit* nature of `let ..` declarations, consider these usages: + +```js +let a = 2; + +if (a > 1) { + let b = a * 3; + console.log( b ); // 6 + + for (let i = a; i <= b; i++) { + let j = i + 10; + console.log( j ); + } + // 12 13 14 15 16 + + let c = a + b; + console.log( c ); // 8 +} +``` + +Quick quiz without looking back at that snippet: which variable(s) exist only inside the `if` statement, and which variable(s) exist only inside the `for` loop? + +The answers: the `if` statement contains `b` and `c` block-scoped variables, and the `for` loop contains `i` and `j` block-scoped variables. + +Did you have to think about it for a moment? Does it surprise you that `i` isn't added to the enclosing `if` statement scope? That mental pause and questioning -- I call it a "mental tax" -- comes from the fact that this `let` mechanism is not only new to us, but it's also *implicit*. + +There's also hazard in the `let c = ..` declaration appearing so far down in the scope. Unlike traditional `var`-declared variables, which are attached to the entire enclosing function scope regardless of where they appear, `let` declarations attach to the block scope but are not initialized until they appear in the block. + +Accessing a `let`-declared variable earlier than its `let ..` declaration/initialization causes an error, whereas with `var` declarations the ordering doesn't matter (except stylistically). 
+ +Consider: + +```js +{ + console.log( a ); // undefined + console.log( b ); // ReferenceError! + + var a; + let b; +} +``` + +**Warning:** This `ReferenceError` from accessing too-early `let`-declared references is technically called a *Temporal Dead Zone (TDZ)* error -- you're accessing a variable that's been declared but not yet initialized. This will not be the only time we see TDZ errors -- they crop up in several places in ES6. Also, note that "initialized" doesn't require explicitly assigning a value in your code, as `let b;` is totally valid. A variable that's not given an assignment at declaration time is assumed to have been assigned the `undefined` value, so `let b;` is the same as `let b = undefined;`. Explicit assignment or not, you cannot access `b` until the `let b` statement is run. + +One last gotcha: `typeof` behaves differently with TDZ variables than it does with undeclared (or declared!) variables. For example: + +```js +{ + // `a` is not declared + if (typeof a === "undefined") { + console.log( "cool" ); + } + + // `b` is declared, but in its TDZ + if (typeof b === "undefined") { // ReferenceError! + // .. + } + + // .. + + let b; +} +``` + +The `a` is not declared, so `typeof` is the only safe way to check for its existence or not. But `typeof b` throws the TDZ error because farther down in the code there happens to be a `let b` declaration. Oops. + +Now it should be clearer why I insist that `let` declarations should all be at the top of their scope. That totally avoids the accidental errors of accessing too early. It also makes it more *explicit* when you look at the start of a block, any block, what variables it contains. + +Your blocks (`if` statements, `while` loops, etc.) don't have to share their original behavior with scoping behavior. + +This explicitness on your part, which is up to you to maintain with discipline, will save you lots of refactor headaches and footguns down the line. 
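As a rough illustration of that advice, here is a sketch of the earlier `typeof` snippet rearranged so the `let` sits at the very top of its block. The `checks` array is a hypothetical helper used only to record the results, and the snippet assumes no `a` is declared anywhere in the surrounding program:

```js
var checks = [];

{	let b;

	// `a` is not declared, so `typeof` is still the safe check
	checks.push( typeof a === "undefined" );

	// `b` is already initialized (to `undefined`), so no TDZ error
	checks.push( typeof b === "undefined" );
}

console.log( checks );			// [true,true]
```

With the declaration moved to the top, nothing in the block can run before `b` is initialized, so the TDZ hazard never arises.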
+ +**Note:** For more information on `let` and block scoping, see Chapter 3 of the *Scope & Closures* title of this series. + +#### `let` + `for` + +The only exception I'd make to the preference for the *explicit* form of `let` declaration blocking is a `let` that appears in the header of a `for` loop. The reason may seem nuanced, but I believe it to be one of the more important ES6 features. + +Consider: + +```js +var funcs = []; + +for (let i = 0; i < 5; i++) { + funcs.push( function(){ + console.log( i ); + } ); +} + +funcs[3](); // 3 +``` + +The `let i` in the `for` header declares an `i` not just for the `for` loop itself, but it redeclares a new `i` for each iteration of the loop. That means that closures created inside the loop iteration close over those per-iteration variables the way you'd expect. + +If you tried that same snippet but with `var i` in the `for` loop header, you'd get `5` instead of `3`, because there'd only be one `i` in the outer scope that was closed over, instead of a new `i` for each iteration's function to close over. + +You could also have accomplished the same thing slightly more verbosely: + +```js +var funcs = []; + +for (var i = 0; i < 5; i++) { + let j = i; + funcs.push( function(){ + console.log( j ); + } ); +} + +funcs[3](); // 3 +``` + +Here, we forcibly create a new `j` for each iteration, and then the closure works the same way. I prefer the former approach; that extra special capability is why I endorse the `for (let .. ) ..` form. It could be argued it's somewhat more *implicit*, but it's *explicit* enough, and useful enough, for my tastes. + +`let` also works the same way with `for..in` and `for..of` loops (see "`for..of` Loops"). + +### `const` Declarations + +There's one other form of block-scoped declaration to consider: the `const`, which creates *constants*. + +What exactly is a constant? It's a variable that's read-only after its initial value is set. 
Consider: + +```js +{ + const a = 2; + console.log( a ); // 2 + + a = 3; // TypeError! +} +``` + +You are not allowed to change the value the variable holds once it's been set, at declaration time. A `const` declaration must have an explicit initialization. If you wanted a *constant* with the `undefined` value, you'd have to declare `const a = undefined` to get it. + +Constants are not a restriction on the value itself, but on the variable's assignment of that value. In other words, the value is not frozen or immutable because of `const`, just the assignment of it. If the value is complex, such as an object or array, the contents of the value can still be modified: + +```js +{ + const a = [1,2,3]; + a.push( 4 ); + console.log( a ); // [1,2,3,4] + + a = 42; // TypeError! +} +``` + +The `a` variable doesn't actually hold a constant array; rather, it holds a constant reference to the array. The array itself is freely mutable. + +**Warning:** Assigning an object or array as a constant means that value will not be able to be garbage collected until that constant's lexical scope goes away, as the reference to the value can never be unset. That may be desirable, but be careful if it's not your intent! + +Essentially, `const` declarations enforce what we've stylistically signaled with our code for years, where we declared a variable name of all uppercase letters and assigned it some literal value that we took care never to change. There's no enforcement on a `var` assignment, but there is now with a `const` assignment, which can help you catch unintended changes. + +`const` *can* be used with variable declarations of `for`, `for..in`, and `for..of` loops (see "`for..of` Loops"). However, an error will be thrown if there's any attempt to reassign, such as the typical `i++` clause of a `for` loop. + +#### `const` Or Not + +There's some rumored assumptions that a `const` could be more optimizable by the JS engine in certain scenarios than a `let` or `var` would be. 
Theoretically, the engine more easily knows the variable's value/type will never change, so it can eliminate some possible tracking. + +Whether `const` really helps here or that is just our own fantasy and intuition, the much more important decision to make is whether you intend constant behavior or not. Remember: one of the most important roles for source code is to communicate clearly, not only to you, but to your future self and other code collaborators, what your intent is. + +Some developers prefer to start out every variable declaration as a `const` and then relax a declaration back to a `let` if it becomes necessary for its value to change in the code. This is an interesting perspective, but it's not clear that it genuinely improves the readability or reason-ability of code. + +It's not really a *protection*, as many believe, because any later developer who wants to change a value of a `const` can just blindly change `const` to `let` on the declaration. At best, it protects against accidental change. But again, other than our intuitions and sensibilities, there doesn't appear to be an objective and clear measure of what constitutes "accidents" or prevention thereof. Similar mindsets exist around type enforcement. + +My advice: to avoid potentially confusing code, only use `const` for variables that you're intentionally and obviously signaling will not change. In other words, don't *rely on* `const` for code behavior, but instead use it as a tool for signaling intent, when intent can be signaled clearly. + +### Block-scoped Functions + +Starting with ES6, function declarations that occur inside of blocks are now specified to be scoped to that block. Prior to ES6, the specification did not call for this, but many implementations did it anyway. So now the specification meets reality. + +Consider: + +```js +{ + foo(); // works! + + function foo() { + // .. + } +} + +foo(); // ReferenceError +``` + +The `foo()` function is declared inside the `{ .. 
}` block, and as of ES6 is block-scoped there. So it's not available outside that block. But also note that it is "hoisted" within the block, as opposed to `let` declarations, which suffer the TDZ error trap mentioned earlier. + +Block-scoping of function declarations could be a problem if you've ever written code like this before, and relied on the old legacy non-block-scoped behavior: + +```js +if (something) { + function foo() { + console.log( "1" ); + } +} +else { + function foo() { + console.log( "2" ); + } +} + +foo(); // ?? +``` + +In pre-ES6 environments, `foo()` would print `"2"` regardless of the value of `something`, because both function declarations were hoisted out of the blocks, and the second one always wins. + +In ES6, that last line throws a `ReferenceError`. + +## Spread/Rest + +ES6 introduces a new `...` operator that's typically referred to as the *spread* or *rest* operator, depending on where/how it's used. Let's take a look: + +```js +function foo(x,y,z) { + console.log( x, y, z ); +} + +foo( ...[1,2,3] ); // 1 2 3 +``` + +When `...` is used in front of an array (actually, any *iterable*, which we cover in Chapter 3), it acts to "spread" it out into its individual values. + +You'll typically see that usage as is shown in that previous snippet, when spreading out an array as a set of arguments to a function call. In this usage, `...` acts to give us a simpler syntactic replacement for the `apply(..)` method, which we would typically have used pre-ES6 as: + +```js +foo.apply( null, [1,2,3] ); // 1 2 3 +``` + +But `...` can be used to spread out/expand a value in other contexts as well, such as inside another array declaration: + +```js +var a = [2,3,4]; +var b = [ 1, ...a, 5 ]; + +console.log( b ); // [1,2,3,4,5] +``` + +In this usage, `...` is basically replacing `concat(..)`, as it behaves like `[1].concat( a, [5] )` here. 
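As a quick check of that equivalence, here is a minimal sketch comparing the two forms side by side. The names `viaSpread` and `viaConcat` are made up for illustration. Note that both approaches build a new array and leave `a` untouched:

```js
var a = [2,3,4];

// ES6 spread inside an array literal
var viaSpread = [ 1, ...a, 5 ];

// pre-ES6 equivalent using `concat(..)`
var viaConcat = [1].concat( a, [5] );

console.log( viaSpread );		// [1,2,3,4,5]
console.log( viaConcat );		// [1,2,3,4,5]
console.log( a );				// [2,3,4]
```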
+ +The other common usage of `...` can be seen as essentially the opposite; instead of spreading a value out, the `...` *gathers* a set of values together into an array. Consider: + +```js +function foo(x, y, ...z) { + console.log( x, y, z ); +} + +foo( 1, 2, 3, 4, 5 ); // 1 2 [3,4,5] +``` + +The `...z` in this snippet is essentially saying: "gather the *rest* of the arguments (if any) into an array called `z`." Because `x` was assigned `1`, and `y` was assigned `2`, the rest of the arguments `3`, `4`, and `5` were gathered into `z`. + +Of course, if you don't have any named parameters, the `...` gathers all arguments: + +```js +function foo(...args) { + console.log( args ); +} + +foo( 1, 2, 3, 4, 5); // [1,2,3,4,5] +``` + +**Note:** The `...args` in the `foo(..)` function declaration is usually called "rest parameters," because you're collecting the rest of the parameters. I prefer "gather," because it's more descriptive of what it does rather than what it contains. + +The best part about this usage is that it provides a very solid alternative to using the long-since-deprecated `arguments` array -- actually, it's not really an array, but an array-like object. Because `args` (or whatever you call it -- a lot of people prefer `r` or `rest`) is a real array, we can get rid of lots of silly pre-ES6 tricks we jumped through to make `arguments` into something we can treat as an array. 
+ +Consider: + +```js +// doing things the new ES6 way +function foo(...args) { + // `args` is already a real array + + // discard first element in `args` + args.shift(); + + // pass along all of `args` as arguments + // to `console.log(..)` + console.log( ...args ); +} + +// doing things the old-school pre-ES6 way +function bar() { + // turn `arguments` into a real array + var args = Array.prototype.slice.call( arguments ); + + // add some elements on the end + args.push( 4, 5 ); + + // filter out odd numbers + args = args.filter( function(v){ + return v % 2 == 0; + } ); + + // pass along all of `args` as arguments + // to `foo(..)` + foo.apply( null, args ); +} + +bar( 0, 1, 2, 3 ); // 2 4 +``` + +The `...args` in the `foo(..)` function declaration gathers arguments, and the `...args` in the `console.log(..)` call spreads them out. That's a good illustration of the symmetric but opposite uses of the `...` operator. + +Besides the `...` usage in a function declaration, there's another case where `...` is used for gathering values, and we'll look at it in the "Too Many, Too Few, Just Enough" section later in this chapter. + +## Default Parameter Values + +Perhaps one of the most common idioms in JavaScript relates to setting a default value for a function parameter. The way we've done this for years should look quite familiar: + +```js +function foo(x,y) { + x = x || 11; + y = y || 31; + + console.log( x + y ); +} + +foo(); // 42 +foo( 5, 6 ); // 11 +foo( 5 ); // 36 +foo( null, 6 ); // 17 +``` + +Of course, if you've used this pattern before, you know that it's both helpful and a little bit dangerous, if for example you need to be able to pass in what would otherwise be considered a falsy value for one of the parameters. Consider: + +```js +foo( 0, 42 ); // 53 <-- Oops, not 42 +``` + +Why? Because the `0` is falsy, and so the `x || 11` results in `11`, not the directly passed in `0`. 
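The same trap applies to every falsy value, not just `0`. As a small sketch (the `withDefault(..)` helper is hypothetical and exists only for this illustration):

```js
function withDefault(x) {
	// same idiom as `x = x || 11` above
	return x || 11;
}

var results = [ 0, "", NaN, null, undefined, false ].map( withDefault );

console.log( results );				// [11,11,11,11,11,11]
console.log( withDefault( 5 ) );	// 5
```

So any caller who legitimately wants to pass `0`, `""`, or `false` gets the default instead.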
+ +To fix this gotcha, some people will instead write the check more verbosely like this: + +```js +function foo(x,y) { + x = (x !== undefined) ? x : 11; + y = (y !== undefined) ? y : 31; + + console.log( x + y ); +} + +foo( 0, 42 ); // 42 +foo( undefined, 6 ); // 17 +``` + +Of course, that means that any value except `undefined` can be directly passed in. However, `undefined` will be assumed to signal, "I didn't pass this in." That works great unless you actually need to be able to pass `undefined` in. + +In that case, you could test to see if the argument is actually omitted, by it actually not being present in the `arguments` array, perhaps like this: + +```js +function foo(x,y) { + x = (0 in arguments) ? x : 11; + y = (1 in arguments) ? y : 31; + + console.log( x + y ); +} + +foo( 5 ); // 36 +foo( 5, undefined ); // NaN +``` + +But how would you omit the first `x` argument without the ability to pass in any kind of value (not even `undefined`) that signals "I'm omitting this argument"? + +`foo(,5)` is tempting, but it's invalid syntax. `foo.apply(null,[,5])` seems like it should do the trick, but `apply(..)`'s quirks here mean that the arguments are treated as `[undefined,5]`, which of course doesn't omit. + +If you investigate further, you'll find you can only omit arguments on the end (i.e., righthand side) by simply passing fewer arguments than "expected," but you cannot omit arguments in the middle or at the beginning of the arguments list. It's just not possible. + +There's a principle applied to JavaScript's design here that is important to remember: `undefined` means *missing*. That is, there's no difference between `undefined` and *missing*, at least as far as function arguments go. + +**Note:** There are, confusingly, other places in JS where this particular design principle doesn't apply, such as for arrays with empty slots. See the *Types & Grammar* title of this series for more information. 
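To make that exception a bit more concrete, here is a minimal sketch (the names `sparse` and `dense` are illustrative only) showing that an empty array slot is not quite the same thing as a slot holding `undefined`, even though reading either one gives `undefined`:

```js
var sparse = [ , 5 ];			// slot 0 is empty (missing)
var dense = [ undefined, 5 ];	// slot 0 holds `undefined`

console.log( sparse[0] === dense[0] );	// true
console.log( 0 in sparse );				// false
console.log( 0 in dense );				// true
```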
+ +With all this in mind, we can now examine a nice helpful syntax added as of ES6 to streamline the assignment of default values to missing arguments: + +```js +function foo(x = 11, y = 31) { + console.log( x + y ); +} + +foo(); // 42 +foo( 5, 6 ); // 11 +foo( 0, 42 ); // 42 + +foo( 5 ); // 36 +foo( 5, undefined ); // 36 <-- `undefined` is missing +foo( 5, null ); // 5 <-- null coerces to `0` + +foo( undefined, 6 ); // 17 <-- `undefined` is missing +foo( null, 6 ); // 6 <-- null coerces to `0` +``` + +Notice the results and how they imply both subtle differences and similarities to the earlier approaches. + +`x = 11` in a function declaration is more like `x !== undefined ? x : 11` than the much more common idiom `x || 11`, so you'll need to be careful in converting your pre-ES6 code to this ES6 default parameter value syntax. + +**Note:** A rest/gather parameter (see "Spread/Rest") cannot have a default value. So, while `function foo(...vals=[1,2,3]) {` might seem an intriguing capability, it's not valid syntax. You'll need to continue to apply that sort of logic manually if necessary. + +### Default Value Expressions + +Function default values can be more than just simple values like `31`; they can be any valid expression, even a function call: + +```js +function bar(val) { + console.log( "bar called!" ); + return y + val; +} + +function foo(x = y + 3, z = bar( x )) { + console.log( x, z ); +} + +var y = 5; +foo(); // "bar called" + // 8 13 +foo( 10 ); // "bar called" + // 10 15 +y = 6; +foo( undefined, 10 ); // 9 10 +``` + +As you can see, the default value expressions are lazily evaluated, meaning they're only run if and when they're needed -- that is, when a parameter's argument is omitted or is `undefined`. + +It's a subtle detail, but the formal parameters in a function declaration are in their own scope (think of it as a scope bubble wrapped around just the `( .. )` of the function declaration), not in the function body's scope. 
That means a reference to an identifier in a default value expression first matches the formal parameters' scope before looking to an outer scope. See the *Scope & Closures* title of this series for more information. + +Consider: + +```js +var w = 1, z = 2; + +function foo( x = w + 1, y = x + 1, z = z + 1 ) { + console.log( x, y, z ); +} + +foo(); // ReferenceError +``` + +The `w` in the `w + 1` default value expression looks for `w` in the formal parameters' scope, but does not find it, so the outer scope's `w` is used. Next, the `x` in the `x + 1` default value expression finds `x` in the formal parameters' scope, and luckily `x` has already been initialized, so the assignment to `y` works fine. + +However, the `z` in `z + 1` finds `z` as a not-yet-initialized-at-that-moment parameter variable, so it never tries to find the `z` from the outer scope. + +As we mentioned in the "`let` Declarations" section earlier in this chapter, ES6 has a TDZ, which prevents a variable from being accessed in its uninitialized state. As such, the `z + 1` default value expression throws a TDZ `ReferenceError`. + +Though it's not necessarily a good idea for code clarity, a default value expression can even be an inline function expression call -- commonly referred to as an immediately invoked function expression (IIFE): + +```js +function foo( x = + (function(v){ return v + 11; })( 31 ) +) { + console.log( x ); +} + +foo(); // 42 +``` + +There will very rarely be any cases where an IIFE (or any other executed inline function expression) will be appropriate for default value expressions. If you find yourself tempted to do this, take a step back and reevaluate! + +**Warning:** If the IIFE had tried to access the `x` identifier and had not declared its own `x`, this would also have been a TDZ error, just as discussed before. + +The default value expression in the previous snippet is an IIFE in the sense that it's a function that's executed right inline, via `(31)`. 
If we had left that part off, the default value assigned to `x` would have just been a function reference itself, perhaps like a default callback. There will probably be cases where that pattern will be quite useful, such as: + +```js +function ajax(url, cb = function(){}) { + // .. +} + +ajax( "http://some.url.1" ); +``` + +In this case, we essentially want to default `cb` to be a no-op empty function if not otherwise specified. The function expression is just a function reference, not a function call itself (no invoking `()` on the end of it), which accomplishes that goal. + +Since the early days of JS, there's been a little-known but useful quirk available to us: `Function.prototype` is itself an empty no-op function. So, the declaration could have been `cb = Function.prototype`, which would have saved creating the inline function expression. + +## Destructuring + +ES6 introduces a new syntactic feature called *destructuring*, which may be a little less confusing if you instead think of it as *structured assignment*. To understand this meaning, consider: + +```js +function foo() { + return [1,2,3]; +} + +var tmp = foo(), + a = tmp[0], b = tmp[1], c = tmp[2]; + +console.log( a, b, c ); // 1 2 3 +``` + +As you can see, we created a manual assignment of the values in the array that `foo()` returns to individual variables `a`, `b`, and `c`, and to do so we (unfortunately) needed the `tmp` variable. + +Similarly, we can do the following with objects: + +```js +function bar() { + return { + x: 4, + y: 5, + z: 6 + }; +} + +var tmp = bar(), + x = tmp.x, y = tmp.y, z = tmp.z; + +console.log( x, y, z ); // 4 5 6 +``` + +The `tmp.x` property value is assigned to the `x` variable, and likewise for `tmp.y` to `y` and `tmp.z` to `z`. + +Manually assigning indexed values from an array or properties from an object can be thought of as *structured assignment*. ES6 adds a dedicated syntax for *destructuring*, specifically *array destructuring* and *object destructuring*. 
This syntax eliminates the need for the `tmp` variable in the previous snippets, making them much cleaner. Consider: + +```js +var [ a, b, c ] = foo(); +var { x: x, y: y, z: z } = bar(); + +console.log( a, b, c ); // 1 2 3 +console.log( x, y, z ); // 4 5 6 +``` + +You're likely more accustomed to seeing syntax like `[a,b,c]` on the righthand side of an `=` assignment, as the value being assigned. + +Destructuring symmetrically flips that pattern, so that `[a,b,c]` on the lefthand side of the `=` assignment is treated as a kind of "pattern" for decomposing the righthand side array value into separate variable assignments. + +Similarly, `{ x: x, y: y, z: z }` specifies a "pattern" to decompose the object value from `bar()` into separate variable assignments. + +### Object Property Assignment Pattern + +Let's dig into that `{ x: x, .. }` syntax from the previous snippet. If the property name being matched is the same as the variable you want to declare, you can actually shorten the syntax: + +```js +var { x, y, z } = bar(); + +console.log( x, y, z ); // 4 5 6 +``` + +Pretty cool, right? + +But is `{ x, .. }` leaving off the `x: ` part or leaving off the `: x` part? We're actually leaving off the `x: ` part when we use the shorter syntax. That may not seem like an important detail, but you'll understand its importance in just a moment. + +If you can write the shorter form, why would you ever write out the longer form? Because that longer form actually allows you to assign a property to a different variable name, which can sometimes be quite useful: + +```js +var { x: bam, y: baz, z: bap } = bar(); + +console.log( bam, baz, bap ); // 4 5 6 +console.log( x, y, z ); // ReferenceError +``` + +There's a subtle but super-important quirk to understand about this variation of the object destructuring form. 
To illustrate why it can be a gotcha you need to be careful of, let's consider the "pattern" of how normal object literals are specified: + +```js +var X = 10, Y = 20; + +var o = { a: X, b: Y }; + +console.log( o.a, o.b ); // 10 20 +``` + +In `{ a: X, b: Y }`, we know that `a` is the object property, and `X` is the source value that gets assigned to it. In other words, the syntactic pattern is `target: source`, or more obviously, `property-alias: value`. We intuitively understand this because it's the same as `=` assignment, where the pattern is `target = source`. + +However, when you use object destructuring assignment -- that is, putting the `{ .. }` object literal-looking syntax on the lefthand side of the `=` operator -- you invert that `target: source` pattern. + +Recall: + +```js +var { x: bam, y: baz, z: bap } = bar(); +``` + +The syntactic pattern here is `source: target` (or `value: variable-alias`). `x: bam` means the `x` property is the source value and `bam` is the target variable to assign to. In other words, object literals are `target <-- source`, and object destructuring assignments are `source --> target`. See how that's flipped? + +There's another way to think about this syntax though, which may help ease the confusion. Consider: + +```js +var aa = 10, bb = 20; + +var o = { x: aa, y: bb }; +var { x: AA, y: BB } = o; + +console.log( AA, BB ); // 10 20 +``` + +In the `{ x: aa, y: bb }` line, the `x` and `y` represent the object properties. In the `{ x: AA, y: BB }` line, the `x` and the `y` *also* represent the object properties. + +Recall how earlier I asserted that `{ x, .. }` was leaving off the `x: ` part? In those two lines, if you erase the `x: ` and `y: ` parts in that snippet, you're left only with `aa, bb` and `AA, BB`, which in effect -- only conceptually, not actually -- are assignments from `aa` to `AA` and from `bb` to `BB`. 
+ +So, that symmetry may help to explain why the syntactic pattern was intentionally flipped for this ES6 feature. + +**Note:** I would have preferred the syntax to be `{ AA: x , BB: y }` for the destructuring assignment, as that would have preserved consistency of the more familiar `target: source` pattern for both usages. Alas, I'm having to train my brain for the inversion, as some readers may also have to do. + +### Not Just Declarations + +So far, we've used destructuring assignment with `var` declarations (of course, they could also use `let` and `const`), but destructuring is a general assignment operation, not just a declaration. + +Consider: + +```js +var a, b, c, x, y, z; + +[a,b,c] = foo(); +( { x, y, z } = bar() ); + +console.log( a, b, c ); // 1 2 3 +console.log( x, y, z ); // 4 5 6 +``` + +The variables can already be declared, and then the destructuring only does assignments, exactly as we've already seen. + +**Note:** For the object destructuring form specifically, when leaving off a `var`/`let`/`const` declarator, we had to surround the whole assignment expression in `( )`, because otherwise the `{ .. }` on the lefthand side as the first element in the statement is taken to be a block statement instead of an object. + +In fact, the assignment expressions (`a`, `y`, etc.) don't actually need to be just variable identifiers. Anything that's a valid assignment expression is allowed. For example: + +```js +var o = {}; + +[o.a, o.b, o.c] = foo(); +( { x: o.x, y: o.y, z: o.z } = bar() ); + +console.log( o.a, o.b, o.c ); // 1 2 3 +console.log( o.x, o.y, o.z ); // 4 5 6 +``` + +You can even use computed property expressions in the destructuring. Consider: + +```js +var which = "x", + o = {}; + +( { [which]: o[which] } = bar() ); + +console.log( o.x ); // 4 +``` + +The `[which]:` part is the computed property, which results in `x` -- the property to destructure from the object in question as the source of the assignment. 
The `o[which]` part is just a normal object key reference, which equates to `o.x` as the target of the assignment. + +You can use the general assignments to create object mappings/transformations, such as: + +```js +var o1 = { a: 1, b: 2, c: 3 }, + o2 = {}; + +( { a: o2.x, b: o2.y, c: o2.z } = o1 ); + +console.log( o2.x, o2.y, o2.z ); // 1 2 3 +``` + +Or you can map an object to an array, such as: + +```js +var o1 = { a: 1, b: 2, c: 3 }, + a2 = []; + +( { a: a2[0], b: a2[1], c: a2[2] } = o1 ); + +console.log( a2 ); // [1,2,3] +``` + +Or the other way around: + +```js +var a1 = [ 1, 2, 3 ], + o2 = {}; + +[ o2.a, o2.b, o2.c ] = a1; + +console.log( o2.a, o2.b, o2.c ); // 1 2 3 +``` + +Or you could reorder one array to another: + +```js +var a1 = [ 1, 2, 3 ], + a2 = []; + +[ a2[2], a2[0], a2[1] ] = a1; + +console.log( a2 ); // [2,3,1] +``` + +You can even solve the traditional "swap two variables" task without a temporary variable: + +```js +var x = 10, y = 20; + +[ y, x ] = [ x, y ]; + +console.log( x, y ); // 20 10 +``` + +**Warning:** Be careful: you shouldn't mix in declaration with assignment unless you want all of the assignment expressions *also* to be treated as declarations. Otherwise, you'll get syntax errors. That's why in the earlier example I had to do `var a2 = []` separately from the `[ a2[0], .. ] = ..` destructuring assignment. It wouldn't make any sense to try `var [ a2[0], .. ] = ..`, because `a2[0]` isn't a valid declaration identifier; it also obviously couldn't implicitly create a `var a2 = []` declaration to use. + +### Repeated Assignments + +The object destructuring form allows a source property (holding any value type) to be listed multiple times. For example: + +```js +var { a: X, a: Y } = { a: 1 }; + +X; // 1 +Y; // 1 +``` + +That also means you can both destructure a sub-object/array property and also capture the sub-object/array's value itself. 
Consider: + +```js +var { a: { x: X, x: Y }, a } = { a: { x: 1 } }; + +X; // 1 +Y; // 1 +a; // { x: 1 } + +( { a: X, a: Y, a: [ Z ] } = { a: [ 1 ] } ); + +X.push( 2 ); +Y[0] = 10; + +X; // [10,2] +Y; // [10,2] +Z; // 1 +``` + +A word of caution about destructuring: it may be tempting to list destructuring assignments all on a single line as has been done thus far in our discussion. However, it's a much better idea to spread destructuring assignment patterns over multiple lines, using proper indentation -- much like you would in JSON or with an object literal value -- for readability sake. + +```js +// harder to read: +var { a: { b: [ c, d ], e: { f } }, g } = obj; + +// better: +var { + a: { + b: [ c, d ], + e: { f } + }, + g +} = obj; +``` + +Remember: **the purpose of destructuring is not just less typing, but more declarative readability.** + +#### Destructuring Assignment Expressions + +The assignment expression with object or array destructuring has as its completion value the full righthand object/array value. Consider: + +```js +var o = { a:1, b:2, c:3 }, + a, b, c, p; + +p = { a, b, c } = o; + +console.log( a, b, c ); // 1 2 3 +p === o; // true +``` + +In the previous snippet, `p` was assigned the `o` object reference, not one of the `a`, `b`, or `c` values. The same is true of array destructuring: + +```js +var o = [1,2,3], + a, b, c, p; + +p = [ a, b, c ] = o; + +console.log( a, b, c ); // 1 2 3 +p === o; // true +``` + +By carrying the object/array value through as the completion, you can chain destructuring assignment expressions together: + +```js +var o = { a:1, b:2, c:3 }, + p = [4,5,6], + a, b, c, x, y, z; + +( {a} = {b,c} = o ); +[x,y] = [z] = p; + +console.log( a, b, c ); // 1 2 3 +console.log( x, y, z ); // 4 5 4 +``` + +### Too Many, Too Few, Just Enough + +With both array destructuring assignment and object destructuring assignment, you do not have to assign all the values that are present. 
For example: + +```js +var [,b] = foo(); +var { x, z } = bar(); + +console.log( b, x, z ); // 2 4 6 +``` + +The `1` and `3` values that came back from `foo()` are discarded, as is the `5` value from `bar()`. + +Similarly, if you try to assign more values than are present in the value you're destructuring/decomposing, you get graceful fallback to `undefined`, as you'd expect: + +```js +var [,,c,d] = foo(); +var { w, z } = bar(); + +console.log( c, z ); // 3 6 +console.log( d, w ); // undefined undefined +``` + +This behavior follows symmetrically from the earlier stated "`undefined` is missing" principle. + +We examined the `...` operator earlier in this chapter, and saw that it can sometimes be used to spread an array value out into its separate values, and sometimes it can be used to do the opposite: to gather a set of values together into an array. + +In addition to the gather/rest usage in function declarations, `...` can perform the same behavior in destructuring assignments. To illustrate, let's recall a snippet from earlier in this chapter: + +```js +var a = [2,3,4]; +var b = [ 1, ...a, 5 ]; + +console.log( b ); // [1,2,3,4,5] +``` + +Here we see that `...a` is spreading `a` out, because it appears in the array `[ .. ]` value position. If `...a` appears in an array destructuring position, it performs the gather behavior: + +```js +var a = [2,3,4]; +var [ b, ...c ] = a; + +console.log( b, c ); // 2 [3,4] +``` + +The `var [ .. ] = a` destructuring assignment spreads `a` out to be assigned to the pattern described inside the `[ .. ]`. The first part names `b` for the first value in `a` (`2`). But then `...c` gathers the rest of the values (`3` and `4`) into an array and calls it `c`. + +**Note:** We've seen how `...` works with arrays, but what about with objects? It's not an ES6 feature, but see Chapter 8 for discussion of a possible "beyond ES6" feature where `...` works with spreading or gathering objects. 
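To round out the array gather form, here's a small sketch that combines skipping a position with `...` in the same pattern. The `list` data and the variable names are made up purely for illustration. Notice that the gathered result is an empty array, not `undefined`, when nothing is left over:

```js
// hypothetical sample data, just for illustration
var list = [ 1, 2, 3, 4, 5 ];

// skip the second value, gather everything after the third
var [ first, , third, ...others ] = list;
console.log( first, third, others );	// 1 3 [4,5]

// gathering from the very start makes a (shallow) copy
var [ ...copy ] = list;
console.log( copy === list );			// false

// nothing left to gather: empty array, not `undefined`
var [ only, ...none ] = [ 42 ];
console.log( only, none );				// 42 []
```

The gather element has to be the last one in the pattern; putting anything after `...others` is a syntax error.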
+ +### Default Value Assignment + +Both forms of destructuring can offer a default value option for an assignment, using the `=` syntax similar to the default function argument values discussed earlier. + +Consider: + +```js +var [ a = 3, b = 6, c = 9, d = 12 ] = foo(); +var { x = 5, y = 10, z = 15, w = 20 } = bar(); + +console.log( a, b, c, d ); // 1 2 3 12 +console.log( x, y, z, w ); // 4 5 6 20 +``` + +You can combine the default value assignment with the alternative assignment expression syntax covered earlier. For example: + +```js +var { x, y, z, w: WW = 20 } = bar(); + +console.log( x, y, z, WW ); // 4 5 6 20 +``` + +Be careful about confusing yourself (or other developers who read your code) if you use an object or array as the default value in a destructuring. You can create some really hard to understand code: + +```js +var x = 200, y = 300, z = 100; +var o1 = { x: { y: 42 }, z: { y: z } }; + +( { y: x = { y: y } } = o1 ); +( { z: y = { y: z } } = o1 ); +( { x: z = { y: x } } = o1 ); +``` + +Can you tell from that snippet what values `x`, `y`, and `z` have at the end? Takes a moment of pondering, I would imagine. I'll end the suspense: + +```js +console.log( x.y, y.y, z.y ); // 300 100 42 +``` + +The takeaway here: destructuring is great and can be very useful, but it's also a sharp sword that can cause injury (to someone's brain) if used unwisely. + +### Nested Destructuring + +If the values you're destructuring have nested objects or arrays, you can destructure those nested values as well: + +```js +var a1 = [ 1, [2, 3, 4], 5 ]; +var o1 = { x: { y: { z: 6 } } }; + +var [ a, [ b, c, d ], e ] = a1; +var { x: { y: { z: w } } } = o1; + +console.log( a, b, c, d, e ); // 1 2 3 4 5 +console.log( w ); // 6 +``` + +Nested destructuring can be a simple way to flatten out object namespaces. For example: + +```js +var App = { + model: { + User: function(){ .. 
} + } +}; + +// instead of: +// var User = App.model.User; + +var { model: { User } } = App; +``` + +### Destructuring Parameters + +In the following snippet, can you spot the assignment? + +```js +function foo(x) { + console.log( x ); +} + +foo( 42 ); +``` + +The assignment is kinda hidden: `42` (the argument) is assigned to `x` (the parameter) when `foo(42)` is executed. If parameter/argument pairing is an assignment, then it stands to reason that it's an assignment that could be destructured, right? Of course! + +Consider array destructuring for parameters: + +```js +function foo( [ x, y ] ) { + console.log( x, y ); +} + +foo( [ 1, 2 ] ); // 1 2 +foo( [ 1 ] ); // 1 undefined +foo( [] ); // undefined undefined +``` + +Object destructuring for parameters works, too: + +```js +function foo( { x, y } ) { + console.log( x, y ); +} + +foo( { y: 1, x: 2 } ); // 2 1 +foo( { y: 42 } ); // undefined 42 +foo( {} ); // undefined undefined +``` + +This technique is an approximation of named arguments (a long requested feature for JS!), in that the properties on the object map to the destructured parameters of the same names. That also means that we get optional parameters (in any position) for free, as you can see leaving off the `x` "parameter" worked as we'd expect. + +Of course, all the previously discussed variations of destructuring are available to us with parameter destructuring, including nested destructuring, default values, and more. Destructuring also mixes fine with other ES6 function parameter capabilities, like default parameter values and rest/gather parameters. + +Consider these quick illustrations (certainly not exhaustive of the possible variations): + +```js +function f1([ x=2, y=3, z ]) { .. } +function f2([ x, y, ...z], w) { .. } +function f3([ x, y, ...z], ...w) { .. } + +function f4({ x: X, y }) { .. } +function f5({ x: X = 10, y = 20 }) { .. } +function f6({ x = 10 } = {}, { y } = { y: 10 }) { .. 
} +``` + +Let's take one example from this snippet and examine it, for illustration purposes: + +```js +function f3([ x, y, ...z], ...w) { + console.log( x, y, z, w ); +} + +f3( [] ); // undefined undefined [] [] +f3( [1,2,3,4], 5, 6 ); // 1 2 [3,4] [5,6] +``` + +There are two `...` operators in use here, and they're both gathering values in arrays (`z` and `w`), though `...z` gathers from the rest of the values left over in the first array argument, while `...w` gathers from the rest of the main arguments left over after the first. + +#### Destructuring Defaults + Parameter Defaults + +There's one subtle point you should be particularly careful to notice -- the difference in behavior between a destructuring default value and a function parameter default value. For example: + +```js +function f6({ x = 10 } = {}, { y } = { y: 10 }) { + console.log( x, y ); +} + +f6(); // 10 10 +``` + +At first, it would seem that we've declared a default value of `10` for both the `x` and `y` parameters, but in two different ways. However, these two different approaches will behave differently in certain cases, and the difference is awfully subtle. + +Consider: + +```js +f6( {}, {} ); // 10 undefined +``` + +Wait, why did that happen? It's pretty clear that named parameter `x` is defaulting to `10` if not passed as a property of that same name in the first argument's object. + +But what about `y` being `undefined`? The `{ y: 10 }` value is an object as a function parameter default value, not a destructuring default value. As such, it only applies if the second argument is not passed at all, or is passed as `undefined`. + +In the previous snippet, we *are* passing a second argument (`{}`), so the default `{ y: 10 }` value is not used, and the `{ y }` destructuring occurs against the passed in `{}` empty object value. + +Now, compare `{ y } = { y: 10 }` to `{ x = 10 } = {}`. 
+ +For the `x` form, if the first function argument is omitted or `undefined`, the `{}` empty object default applies. Then, whatever value is in the first argument position -- either the default `{}` or whatever you passed in -- is destructured with the `{ x = 10 }`, which checks to see if an `x` property is found, and if not found (or `undefined`), the `10` default value is applied to the `x` named parameter. + +Deep breath. Read back over those last few paragraphs a couple of times. Let's review via code: + +```js +function f6({ x = 10 } = {}, { y } = { y: 10 }) { + console.log( x, y ); +} + +f6(); // 10 10 +f6( undefined, undefined ); // 10 10 +f6( {}, undefined ); // 10 10 + +f6( {}, {} ); // 10 undefined +f6( undefined, {} ); // 10 undefined + +f6( { x: 2 }, { y: 3 } ); // 2 3 +``` + +It would generally seem that the defaulting behavior of the `x` parameter is probably the more desirable and sensible case compared to that of `y`. As such, it's important to understand why and how the `{ x = 10 } = {}` form is different from the `{ y } = { y: 10 }` form. + +If that's still a bit fuzzy, go back and read it again, and play with this yourself. Your future self will thank you for taking the time to get this very subtle gotcha straight. + +#### Nested Defaults: Destructured and Restructured + +Although it may at first be difficult to grasp, an interesting idiom emerges for setting defaults for a nested object's properties: using object destructuring along with what I'd call *restructuring*. 
+ +Consider a set of defaults in a nested object structure, like the following: + +```js +// taken from: http://es-discourse.com/t/partial-default-arguments/120/7 + +var defaults = { + options: { + remove: true, + enable: false, + instance: {} + }, + log: { + warn: true, + error: true + } +}; +``` + +Now, let's say that you have an object called `config`, which has some of these applied, but perhaps not all, and you'd like to set all the defaults into this object in the missing spots, but not override specific settings already present: + +```js +var config = { + options: { + remove: false, + instance: null + } +}; +``` + +You can of course do so manually, as you might have done in the past: + +```js +config.options = config.options || {}; +config.options.remove = (config.options.remove !== undefined) ? + config.options.remove : defaults.options.remove; +config.options.enable = (config.options.enable !== undefined) ? + config.options.enable : defaults.options.enable; +... +``` + +Yuck. + +Others may prefer the assign-overwrite approach to this task. You might be tempted to use the ES6 `Object.assign(..)` utility (see Chapter 6) to clone the properties first from `defaults` and then overwrite them with the cloned properties from `config`, like so: + +```js +config = Object.assign( {}, defaults, config ); +``` + +That looks way nicer, huh? But there's a major problem! `Object.assign(..)` is shallow, which means when it copies `defaults.options`, it just copies that object reference, not deep cloning that object's properties to a `config.options` object. `Object.assign(..)` would need to be applied (sort of "recursively") at all levels of your object's tree to get the deep cloning you're expecting. + +**Note:** Many JS utility libraries/frameworks provide their own option for deep cloning of an object, but those approaches and their gotchas are beyond our scope to discuss here. 
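To make the shallowness problem concrete, here's a quick sketch using trimmed-down versions of `defaults` and `config` (the `merged` name is just something I'm making up for the result):

```js
// trimmed-down versions of `defaults` and `config`,
// for illustration only
var defaults = {
	options: { remove: true, enable: false },
	log: { warn: true, error: true }
};

var config = {
	options: { remove: false }
};

// `merged` is a hypothetical name for the result
var merged = Object.assign( {}, defaults, config );

// `config.options` replaced `defaults.options` wholesale,
// so the `enable` default never made it across
console.log( merged.options.remove );	// false
console.log( merged.options.enable );	// undefined

// and nested objects are shared by reference, not cloned
console.log( merged.options === config.options );	// true
console.log( merged.log === defaults.log );			// true
```

So not only is the `enable` default lost, but any later mutation of `merged.log` would also silently change `defaults.log`.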
+ +So let's examine if ES6 object destructuring with defaults can help at all: + +```js +config.options = config.options || {}; +config.log = config.log || {}; +({ + options: { + remove: config.options.remove = defaults.options.remove, + enable: config.options.enable = defaults.options.enable, + instance: config.options.instance = defaults.options.instance + } = {}, + log: { + warn: config.log.warn = defaults.log.warn, + error: config.log.error = defaults.log.error + } = {} +} = config); +``` + +Not as nice as the false promise of `Object.assign(..)` (being that it's shallow only), but it's better than the manual approach by a fair bit, I think. It is still unfortunately verbose and repetitive, though. + +The previous snippet's approach works because I'm hacking the destructuring and defaults mechanism to do the property `=== undefined` checks and assignment decisions for me. It's a trick in that I'm destructuring `config` (see the `= config` at the end of the snippet), but I'm reassigning all the destructured values right back into `config`, with the `config.options.enable` assignment references. + +Still too much, though. Let's see if we can make anything better. + +The following trick works best if you know that all the various properties you're destructuring are uniquely named. You can still do it even if that's not the case, but it's not as nice -- you'll have to do the destructuring in stages, or create unique local variables as temporary aliases. + +If we fully destructure all the properties into top-level variables, we can then immediately restructure to reconstitute the original nested object structure. + +But all those temporary variables hanging around would pollute scope. 
So, let's use block scoping (see "Block-Scoped Declarations" earlier in this chapter) with a general `{ }` enclosing block: + +```js +// merge `defaults` into `config` +{ + // destructure (with default value assignments) + let { + options: { + remove = defaults.options.remove, + enable = defaults.options.enable, + instance = defaults.options.instance + } = {}, + log: { + warn = defaults.log.warn, + error = defaults.log.error + } = {} + } = config; + + // restructure + config = { + options: { remove, enable, instance }, + log: { warn, error } + }; +} +``` + +That seems a fair bit nicer, huh? + +**Note:** You could also accomplish the scope enclosure with an arrow IIFE instead of the general `{ }` block and `let` declarations. Your destructuring assignments/defaults would be in the parameter list and your restructuring would be the `return` statement in the function body. + +The `{ warn, error }` syntax in the restructuring part may look new to you; that's called "concise properties" and we cover it in the next section! + +## Object Literal Extensions + +ES6 adds a number of important convenience extensions to the humble `{ .. }` object literal. + +### Concise Properties + +You're certainly familiar with declaring object literals in this form: + +```js +var x = 2, y = 3, + o = { + x: x, + y: y + }; +``` + +If it's always felt redundant to say `x: x` all over, there's good news. If you need to define a property that is the same name as a lexical identifier, you can shorten it from `x: x` to `x`. Consider: + +```js +var x = 2, y = 3, + o = { + x, + y + }; +``` + +### Concise Methods + +In a similar spirit to concise properties we just examined, functions attached to properties in object literals also have a concise form, for convenience. + +The old way: + +```js +var o = { + x: function(){ + // .. + }, + y: function(){ + // .. + } +} +``` + +And as of ES6: + +```js +var o = { + x() { + // .. + }, + y() { + // .. + } +} +``` + +**Warning:** While `x() { .. 
}` seems to just be shorthand for `x: function(){ .. }`, concise methods have special behaviors that their older counterparts don't; specifically, the allowance for `super` (see "Object `super`" later in this chapter). + +Generators (see Chapter 4) also have a concise method form: + +```js +var o = { + *foo() { .. } +}; +``` + +#### Concisely Unnamed + +While that convenience shorthand is quite attractive, there's a subtle gotcha to be aware of. To illustrate, let's examine pre-ES6 code like the following, which you might try to refactor to use concise methods: + +```js +function runSomething(o) { + var x = Math.random(), + y = Math.random(); + + return o.something( x, y ); +} + +runSomething( { + something: function something(x,y) { + if (x > y) { + // recursively call with `x` + // and `y` swapped + return something( y, x ); + } + + return y - x; + } +} ); +``` + +This obviously silly code just generates two random numbers and subtracts the smaller from the bigger. But what's important here isn't what it does, but rather how it's defined. Let's focus on the object literal and function definition, as we see here: + +```js +runSomething( { + something: function something(x,y) { + // .. + } +} ); +``` + +Why do we say both `something:` and `function something`? Isn't that redundant? Actually, no, both are needed for different purposes. The property `something` is how we can call `o.something(..)`, sort of like its public name. But the second `something` is a lexical name to refer to the function from inside itself, for recursion purposes. + +Can you see why the line `return something(y,x)` needs the name `something` to refer to the function? There's no lexical name for the object, such that it could have said `return o.something(y,x)` or something of that sort. + +That's actually a pretty common practice when the object literal does have an identifying name, such as: + +```js +var controller = { + makeRequest: function(..){ + // .. 
+ controller.makeRequest(..); + } +}; +``` + +Is this a good idea? Perhaps, perhaps not. You're assuming that the name `controller` will always point to the object in question. But it very well may not -- the `makeRequest(..)` function doesn't control the outer code and so can't force that to be the case. This could come back to bite you. + +Others prefer to use `this` to define such things: + +```js +var controller = { + makeRequest: function(..){ + // .. + this.makeRequest(..); + } +}; +``` + +That looks fine, and should work if you always invoke the method as `controller.makeRequest(..)`. But you now have a `this` binding gotcha if you do something like: + +```js +btn.addEventListener( "click", controller.makeRequest, false ); +``` + +Of course, you can solve that by passing `controller.makeRequest.bind(controller)` as the handler reference to bind the event to. But yuck -- it isn't very appealing. + +Or what if your inner `this.makeRequest(..)` call needs to be made from a nested function? You'll have another `this` binding hazard, which people will often solve with the hacky `var self = this`, such as: + +```js +var controller = { + makeRequest: function(..){ + var self = this; + + btn.addEventListener( "click", function(){ + // .. + self.makeRequest(..); + }, false ); + } +}; +``` + +More yuck. + +**Note:** For more information on `this` binding rules and gotchas, see Chapters 1-2 of the *this & Object Prototypes* title of this series. + +OK, what does all this have to do with concise methods? Recall our `something(..)` method definition: + +```js +runSomething( { + something: function something(x,y) { + // .. + } +} ); +``` + +The second `something` here provides a super convenient lexical identifier that will always point to the function itself, giving us the perfect reference for recursion, event binding/unbinding, and so on -- no messing around with `this` or trying to use an untrustable object reference. + +Great! 
+ +So, now we try to refactor that function reference to this ES6 concise method form: + +```js +runSomething( { + something(x,y) { + if (x > y) { + return something( y, x ); + } + + return y - x; + } +} ); +``` + +Seems fine at first glance, except this code will break. The `return something(..)` call will not find a `something` identifier, so you'll get a `ReferenceError`. Oops. But why? + +The above ES6 snippet is interpreted as meaning: + +```js +runSomething( { + something: function(x,y){ + if (x > y) { + return something( y, x ); + } + + return y - x; + } +} ); +``` + +Look closely. Do you see the problem? The concise method definition implies `something: function(x,y)`. See how the second `something` we were relying on has been omitted? In other words, concise methods imply anonymous function expressions. + +Yeah, yuck. + +**Note:** You may be tempted to think that `=>` arrow functions are a good solution here, but they're equally insufficient, as they're also anonymous function expressions. We'll cover them in "Arrow Functions" later in this chapter. + +The partially redeeming news is that our `something(x,y)` concise method won't be totally anonymous. See "Function Names" in Chapter 7 for information about ES6 function name inference rules. That won't help us for our recursion, but it helps with debugging at least. + +So what are we left to conclude about concise methods? They're short and sweet, and a nice convenience. But you should only use them if you're never going to need them to do recursion or event binding/unbinding. Otherwise, stick to your old-school `something: function something(..)` method definitions. + +A lot of your methods are probably going to benefit from concise method definitions, so that's great news! Just be careful of the few where there's an un-naming hazard. 
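If you'd like to see the un-naming hazard in action, here's a runnable side-by-side sketch. It's a simplification of the earlier example: `runSomething(..)` passes fixed arguments instead of random ones so the recursive branch is always taken, and the `named` and `concise` object names are made up for the comparison:

```js
// simplified stand-in for `runSomething(..)`: fixed arguments
// (instead of random ones) so that `x > y` always triggers
// the recursive call
function runSomething(o) {
	return o.something( 2, 1 );
}

// old-school: the function expression has its own lexical name
var named = {
	something: function something(x,y) {
		if (x > y) return something( y, x );
		return y - x;
	}
};

// concise method: no lexical `something` identifier is created
var concise = {
	something(x,y) {
		if (x > y) return something( y, x );
		return y - x;
	}
};

console.log( runSomething( named ) );	// 1

var err;
try {
	runSomething( concise );
}
catch (e) {
	err = e;
}
console.log( err instanceof ReferenceError );	// true

// name inference still applies, which helps with debugging
console.log( concise.something.name );	// "something"
```

This assumes there's no other `something` identifier in an enclosing scope; if there were, the concise version would quietly call *that* instead, which is arguably worse than the error.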
+ +#### ES5 Getter/Setter + +Technically, ES5 defined getter/setter literal forms, but they didn't seem to get used much, mostly due to the lack of transpilers to handle that new syntax (the only major new syntax added in ES5, really). So while it's not a new ES6 feature, we'll briefly review that form, as it's probably going to be much more useful with ES6 going forward. + +Consider: + +```js +var o = { + __id: 10, + get id() { return this.__id++; }, + set id(v) { this.__id = v; } +} + +o.id; // 10 +o.id; // 11 +o.id = 20; +o.id; // 20 + +// and: +o.__id; // 21 +o.__id; // 21 -- still! +``` + +These getter and setter literal forms are also present in classes; see Chapter 3. + +**Warning:** It may not be obvious, but the setter literal must have exactly one declared parameter; omitting it or listing others is illegal syntax. The single required parameter *can* use destructuring and defaults (e.g., `set id({ id: v = 0 }) { .. }`), but the gather/rest `...` is not allowed (`set id(...v) { .. }`). + +### Computed Property Names + +You've probably been in a situation like the following snippet, where you have one or more property names that come from some sort of expression and thus can't be put into the object literal: + +```js +var prefix = "user_"; + +var o = { + baz: function(..){ .. } +}; + +o[ prefix + "foo" ] = function(..){ .. }; +o[ prefix + "bar" ] = function(..){ .. }; +.. +``` + +ES6 adds a syntax to the object literal definition which allows you to specify an expression that should be computed, whose result is the property name assigned. Consider: + +```js +var prefix = "user_"; + +var o = { + baz: function(..){ .. }, + [ prefix + "foo" ]: function(..){ .. }, + [ prefix + "bar" ]: function(..){ .. } + .. +}; +``` + +Any valid expression can appear inside the `[ .. ]` that sits in the property name position of the object literal definition. 
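Here's a runnable variation of that snippet, with the `..` placeholders swapped for trivial function bodies I've made up for illustration. It also shows a detail worth keeping in mind: the expression is evaluated once, when the object literal is evaluated, so the resulting property name doesn't track later changes to `prefix`:

```js
var prefix = "user_";

var o = {
	// trivial, made-up function bodies, just so this runs
	[ prefix + "foo" ]: function(){ return "foo!"; },
	[ prefix + "bar" ]: function(){ return "bar!"; }
};

console.log( Object.keys( o ) );	// ["user_foo","user_bar"]
console.log( o.user_foo() );		// "foo!"

// the key was computed once; changing `prefix` later
// has no effect on the existing object
prefix = "admin_";

console.log( "admin_foo" in o );	// false
console.log( "user_foo" in o );		// true
```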
+ +Probably the most common use of computed property names will be with `Symbol`s (which we cover in "Symbols" later in this chapter), such as: + +```js +var o = { + [Symbol.toStringTag]: "really cool thing", + .. +}; +``` + +`Symbol.toStringTag` is a special built-in value, which we evaluate with the `[ .. ]` syntax, so we can assign the `"really cool thing"` value to the special property name. + +Computed property names can also appear as the name of a concise method or a concise generator: + +```js +var o = { + ["f" + "oo"]() { .. }, // computed concise method + *["b" + "ar"]() { .. } // computed concise generator +}; +``` + +### Setting `[[Prototype]]` + +We won't cover prototypes in detail here, so for more information, see the *this & Object Prototypes* title of this series. + +Sometimes it will be helpful to assign the `[[Prototype]]` of an object at the same time you're declaring its object literal. The following has been a nonstandard extension in many JS engines for a while, but is standardized as of ES6: + +```js +var o1 = { + // .. +}; + +var o2 = { + __proto__: o1, + // .. +}; +``` + +`o2` is declared with a normal object literal, but it's also `[[Prototype]]`-linked to `o1`. The `__proto__` property name here can also be a string `"__proto__"`, but note that it *cannot* be the result of a computed property name (see the previous section). + +`__proto__` is controversial, to say the least. It's a decades-old proprietary extension to JS that is finally standardized, somewhat begrudgingly it seems, in ES6. Many developers feel it shouldn't ever be used. In fact, it's in "Annex B" of ES6, which is the section that lists things JS feels it has to standardize for compatibility reasons only. + +**Warning:** Though I'm narrowly endorsing `__proto__` as a key in an object literal definition, I definitely do not endorse using it in its object property form, like `o.__proto__`. 
That form is both a getter and setter (again for compatibility reasons), but there are definitely better options. See the *this & Object Prototypes* title of this series for more information. + +For setting the `[[Prototype]]` of an existing object, you can use the ES6 utility `Object.setPrototypeOf(..)`. Consider: + +```js +var o1 = { + // .. +}; + +var o2 = { + // .. +}; + +Object.setPrototypeOf( o2, o1 ); +``` + +**Note:** We'll discuss `Object` again in Chapter 6. "`Object.setPrototypeOf(..)` Static Function" provides additional details on `Object.setPrototypeOf(..)`. Also see "`Object.assign(..)` Static Function" for another form that relates `o2` prototypically to `o1`. + +### Object `super` + +`super` is typically thought of as being only related to classes. However, due to JS's classless-objects-with-prototypes nature, `super` is equally effective, and nearly the same in behavior, with plain objects' concise methods. + +Consider: + +```js +var o1 = { + foo() { + console.log( "o1:foo" ); + } +}; + +var o2 = { + foo() { + super.foo(); + console.log( "o2:foo" ); + } +}; + +Object.setPrototypeOf( o2, o1 ); + +o2.foo(); // o1:foo + // o2:foo +``` + +**Warning:** `super` is only allowed in concise methods, not regular function expression properties. It also is only allowed in `super.XXX` form (for property/method access), not in `super()` form. + +The `super` reference in the `o2.foo()` method is locked statically to `o2`, and specifically to the `[[Prototype]]` of `o2`. `super` here would basically be `Object.getPrototypeOf(o2)` -- resolves to `o1` of course -- which is how it finds and calls `o1.foo()`. + +For complete details on `super`, see "Classes" in Chapter 3. + +## Template Literals + +At the very outset of this section, I'm going to have to call out the name of this ES6 feature as being awfully... misleading, depending on your experiences with what the word *template* means. 
+ +Many developers think of templates as being reusable renderable pieces of text, such as the capability provided by most template engines (Mustache, Handlebars, etc.). ES6's use of the word *template* would imply something similar, like a way to declare inline template literals that can be re-rendered. However, that's not at all the right way to think about this feature. + +So, before we go on, I'm renaming to what it should have been called: *interpolated string literals* (or *interpoliterals* for short). + +You're already well aware of declaring string literals with `"` or `'` delimiters, and you also know that these are not *smart strings* (as some languages have), where the contents would be parsed for interpolation expressions. + +However, ES6 introduces a new type of string literal, using the `` ` `` backtick as the delimiter. These string literals allow basic string interpolation expressions to be embedded, which are then automatically parsed and evaluated. + +Here's the old pre-ES6 way: + +```js +var name = "Kyle"; + +var greeting = "Hello " + name + "!"; + +console.log( greeting ); // "Hello Kyle!" +console.log( typeof greeting ); // "string" +``` + +Now, consider the new ES6 way: + +```js +var name = "Kyle"; + +var greeting = `Hello ${name}!`; + +console.log( greeting ); // "Hello Kyle!" +console.log( typeof greeting ); // "string" +``` + +As you can see, we used the `` `..` `` around a series of characters, which are interpreted as a string literal, but any expressions of the form `${..}` are parsed and evaluated inline immediately. The fancy term for such parsing and evaluating is *interpolation* (much more accurate than templating). + +The result of the interpolated string literal expression is just a plain old normal string, assigned to the `greeting` variable. 
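Since the `${..}` expressions are evaluated right where the literal appears, the resulting string is a one-time snapshot. A small sketch of what that means in practice (the `person` variable and the `greet(..)` helper are hypothetical names I'm using just for this illustration):

```js
var person = "Kyle";

var greeting = `Hello ${person}!`;

// reassigning `person` later doesn't touch the
// already-built string
person = "reader";

console.log( greeting );			// "Hello Kyle!"

// for something re-renderable, wrap the literal in a
// function (`greet(..)` is a made-up helper, not a built-in)
function greet(who) {
	return `Hello ${who}!`;
}

console.log( greet( "Kyle" ) );		// "Hello Kyle!"
console.log( greet( "reader" ) );	// "Hello reader!"
```

Wrapping the literal in a function is just one simple way to get reuse; each call re-evaluates the literal with whatever is passed in.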
+ +**Warning:** `typeof greeting == "string"` illustrates why it's important not to think of these entities as special template values, as you cannot assign the unevaluated form of the literal to something and reuse it. The `` `..` `` string literal is more like an IIFE in the sense that it's automatically evaluated inline. The result of a `` `..` `` string literal is, simply, just a string. + +One really nice benefit of interpolated string literals is they are allowed to split across multiple lines: + +```js +var text = +`Now is the time for all good men +to come to the aid of their +country!`; + +console.log( text ); +// Now is the time for all good men +// to come to the aid of their +// country! +``` + +The line breaks (newlines) in the interpolated string literal were preserved in the string value. + +Unless appearing as explicit escape sequences in the literal value, the value of the `\r` carriage return character (code point `U+000D`) or the value of the `\r\n` carriage return + line feed sequence (code points `U+000D` and `U+000A`) are both normalized to a `\n` line feed character (code point `U+000A`). Don't worry though; this normalization is rare and would likely only happen if copy-pasting text into your JS file. + +### Interpolated Expressions + +Any valid expression is allowed to appear inside `${..}` in an interpolated string literal, including function calls, inline function expression calls, and even other interpolated string literals! + +Consider: + +```js +function upper(s) { + return s.toUpperCase(); +} + +var who = "reader"; + +var text = +`A very ${upper( "warm" )} welcome +to all of you ${upper( `${who}s` )}!`; + +console.log( text ); +// A very WARM welcome +// to all of you READERS! +``` + +Here, the inner `` `${who}s` `` interpolated string literal was a little bit nicer convenience for us when combining the `who` variable with the `"s"` string, as opposed to `who + "s"`. 
There will be cases where nesting interpolated string literals is helpful, but be wary if you find yourself doing that kind of thing often, or if you find yourself nesting several levels deep. + +If that's the case, the odds are good that your string value production could benefit from some abstractions. + +**Warning:** As a word of caution, be very careful about the readability of your code with such newfound power. Just like with default value expressions and destructuring assignment expressions, just because you *can* do something doesn't mean you *should* do it. Never go so overboard with new ES6 tricks that your code becomes more clever than you or your other team members. + +#### Expression Scope + +One quick note about the scope that is used to resolve variables in expressions. I mentioned earlier that an interpolated string literal is kind of like an IIFE, and it turns out thinking about it like that explains the scoping behavior as well. + +Consider: + +```js +function foo(str) { + var name = "foo"; + console.log( str ); +} + +function bar() { + var name = "bar"; + foo( `Hello from ${name}!` ); +} + +var name = "global"; + +bar(); // "Hello from bar!" +``` + +At the moment the `` `..` `` string literal is expressed, inside the `bar()` function, the scope available to it finds `bar()`'s `name` variable with value `"bar"`. Neither the global `name` nor `foo(..)`'s `name` matters. In other words, an interpolated string literal is just lexically scoped where it appears, not dynamically scoped in any way. + +### Tagged Template Literals + +Again, renaming the feature for sanity's sake: *tagged string literals*. + +To be honest, this is one of the cooler tricks that ES6 offers. It may seem a little strange, and perhaps not all that generally practical at first. But once you've spent some time with it, tagged string literals may just surprise you in their usefulness. 
+ +For example: + +```js +function foo(strings, ...values) { + console.log( strings ); + console.log( values ); +} + +var desc = "awesome"; + +foo`Everything is ${desc}!`; +// [ "Everything is ", "!"] +// [ "awesome" ] +``` + +Let's take a moment to consider what's happening in the previous snippet. First, the most jarring thing that jumps out is ``foo`Everything...`;``. That doesn't look like anything we've seen before. What is it? + +It's essentially a special kind of function call that doesn't need the `( .. )`. The *tag* -- the `foo` part before the `` `..` `` string literal -- is a function value that should be called. Actually, it can be any expression that results in a function, even a function call that returns another function, like: + +```js +function bar() { + return function foo(strings, ...values) { + console.log( strings ); + console.log( values ); + } +} + +var desc = "awesome"; + +bar()`Everything is ${desc}!`; +// [ "Everything is ", "!"] +// [ "awesome" ] +``` + +But what gets passed to the `foo(..)` function when invoked as a tag for a string literal? + +The first argument -- we called it `strings` -- is an array of all the plain strings (the stuff between any interpolated expressions). We get two values in the `strings` array: `"Everything is "` and `"!"`. + +For convenience sake in our example, we then gather up all subsequent arguments into an array called `values` using the `...` gather/rest operator (see the "Spread/Rest" section earlier in this chapter), though you could of course have left them as individual named parameters following the `strings` parameter. + +The argument(s) gathered into our `values` array are the results of the already-evaluated interpolation expressions found in the string literal. So obviously the only element in `values` in our example is `"awesome"`. 
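
One detail worth seeing in code: the `strings` array always has exactly one more entry than there are interpolated expressions, which means you get empty strings when an expression sits at the very start or end of the literal. A minimal sketch, where the `inspect(..)` tag is a hypothetical name for illustration:

```js
// hypothetical tag that just reports what it received
function inspect(strings, ...values) {
	return {
		strings: Array.from( strings ),
		values: values
	};
}

var a = 1, b = 2;

var parts = inspect`${a} + ${b}`;

console.log( parts.strings );		// [ "", " + ", "" ]
console.log( parts.values );		// [ 1, 2 ]
console.log(
	parts.strings.length === parts.values.length + 1
);									// true
```
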
+ +You can think of these two arrays as: the values in `values` are the separators if you were to splice them in between the values in `strings`, and then if you joined everything together, you'd get the complete interpolated string value. + +A tagged string literal is like a processing step after the interpolation expressions are evaluated but before the final string value is compiled, allowing you more control over generating the string from the literal. + +Typically, the string literal tag function (`foo(..)` in the previous snippets) should compute an appropriate string value and return it, so that you can use the tagged string literal as a value just like untagged string literals: + +```js +function tag(strings, ...values) { + return strings.reduce( function(s,v,idx){ + return s + (idx > 0 ? values[idx-1] : "") + v; + }, "" ); +} + +var desc = "awesome"; + +var text = tag`Everything is ${desc}!`; + +console.log( text ); // Everything is awesome! +``` + +In this snippet, `tag(..)` is a pass-through operation, in that it doesn't perform any special modifications, but just uses `reduce(..)` to loop over and splice/interleave `strings` and `values` together the same way an untagged string literal would have done. + +So what are some practical uses? There are many advanced ones that are beyond our scope to discuss here. But here's a simple idea that formats numbers as U.S. dollars (sort of like basic localization): + +```js +function dollabillsyall(strings, ...values) { + return strings.reduce( function(s,v,idx){ + if (idx > 0) { + if (typeof values[idx-1] == "number") { + // look, also using interpolated + // string literals! + s += `$${values[idx-1].toFixed( 2 )}`; + } + else { + s += values[idx-1]; + } + } + + return s + v; + }, "" ); +} + +var amt1 = 11.99, + amt2 = amt1 * 1.08, + name = "Kyle"; + +var text = dollabillsyall +`Thanks for your purchase, ${name}! 
Your +product cost was ${amt1}, which with tax +comes out to ${amt2}.`; + +console.log( text ); +// Thanks for your purchase, Kyle! Your +// product cost was $11.99, which with tax +// comes out to $12.95. +``` + +If a `number` value is encountered in the `values` array, we put `"$"` in front of it and format it to two decimal places with `toFixed(2)`. Otherwise, we let the value pass through untouched. + +#### Raw Strings + +In the previous snippets, our tag functions receive the first argument we called `strings`, which is an array. But there's an additional bit of data included: the raw unprocessed versions of all the strings. You can access those raw string values using the `.raw` property, like this: + +```js +function showraw(strings, ...values) { + console.log( strings ); + console.log( strings.raw ); +} + +showraw`Hello\nWorld`; +// [ "Hello +// World" ] +// [ "Hello\nWorld" ] +``` + +The raw version of the value preserves the raw escaped `\n` sequence (the `\` and the `n` are separate characters), while the processed version considers it a single newline character. However, the earlier mentioned line-ending normalization is applied to both values. + +ES6 comes with a built-in function that can be used as a string literal tag: `String.raw(..)`. It simply passes through the raw versions of the `strings` values (with any interpolated values spliced back in): + +```js +console.log( `Hello\nWorld` ); +// Hello +// World + +console.log( String.raw`Hello\nWorld` ); +// Hello\nWorld + +String.raw`Hello\nWorld`.length; +// 12 +``` + +Other uses for string literal tags include special processing for internationalization, localization, and more! + +## Arrow Functions + +We've touched on `this` binding complications with functions earlier in this chapter, and they're covered at length in the *this & Object Prototypes* title of this series. 
It's important to understand the frustrations that `this`-based programming with normal functions brings, because that is the primary motivation for the new ES6 `=>` arrow function feature. + +Let's first illustrate what an arrow function looks like, as compared to normal functions: + +```js +function foo(x,y) { + return x + y; +} + +// versus + +var foo = (x,y) => x + y; +``` + +The arrow function definition consists of a parameter list (of zero or more parameters, and surrounding `( .. )` if there's not exactly one parameter), followed by the `=>` marker, followed by a function body. + +So, in the previous snippet, the arrow function is just the `(x,y) => x + y` part, and that function reference happens to be assigned to the variable `foo`. + +The body only needs to be enclosed by `{ .. }` if there's more than one expression, or if the body consists of a non-expression statement. If there's only one expression, and you omit the surrounding `{ .. }`, there's an implied `return` in front of the expression, as illustrated in the previous snippet. + +Here's some other arrow function variations to consider: + +```js +var f1 = () => 12; +var f2 = x => x * 2; +var f3 = (x,y) => { + var z = x * 2 + y; + y++; + x *= 3; + return (x + y + z) / 2; +}; +``` + +Arrow functions are *always* function expressions; there is no arrow function declaration. It also should be clear that they are anonymous function expressions -- they have no named reference for the purposes of recursion or event binding/unbinding -- though "Function Names" in Chapter 7 will describe ES6's function name inference rules for debugging purposes. + +**Note:** All the capabilities of normal function parameters are available to arrow functions, including default values, destructuring, rest parameters, and so on. + +Arrow functions have a nice, shorter syntax, which makes them on the surface very attractive for writing terser code. 
Indeed, nearly all literature on ES6 (other than the titles in this series) seems to immediately and exclusively adopt the arrow function as "the new function." + +It is telling that nearly all examples in discussion of arrow functions are short single statement utilities, such as those passed as callbacks to various utilities. For example: + +```js +var a = [1,2,3,4,5]; + +a = a.map( v => v * 2 ); + +console.log( a ); // [2,4,6,8,10] +``` + +In those cases, where you have such inline function expressions, and they fit the pattern of computing a quick calculation in a single statement and returning that result, arrow functions indeed look to be an attractive and lightweight alternative to the more verbose `function` keyword and syntax. + +Most people tend to *ooh and aah* at nice terse examples like that, as I imagine you just did! + +However, I would caution you that it would seem to me somewhat a misapplication of this feature to use arrow function syntax with otherwise normal, multistatement functions, especially those that would otherwise be naturally expressed as function declarations. + +Recall the `dollabillsyall(..)` string literal tag function from earlier in this chapter -- let's change it to use `=>` syntax: + +```js +var dollabillsyall = (strings, ...values) => + strings.reduce( (s,v,idx) => { + if (idx > 0) { + if (typeof values[idx-1] == "number") { + // look, also using interpolated + // string literals! + s += `$${values[idx-1].toFixed( 2 )}`; + } + else { + s += values[idx-1]; + } + } + + return s + v; + }, "" ); +``` + +In this example, the only modifications I made were the removal of `function`, `return`, and some `{ .. }`, and then the insertion of `=>` and a `var`. Is this a significant improvement in the readability of the code? Meh. + +I'd actually argue that the lack of `return` and outer `{ .. 
}` partially obscures the fact that the `reduce(..)` call is the only statement in the `dollabillsyall(..)` function and that its result is the intended result of the call. Also, the trained eye that is so used to hunting for the word `function` in code to find scope boundaries now needs to look for the `=>` marker, which can definitely be harder to find in the thick of the code. + +While not a hard-and-fast rule, I'd say that the readability gains from `=>` arrow function conversion are inversely proportional to the length of the function being converted. The longer the function, the less `=>` helps; the shorter the function, the more `=>` can shine. + +I think it's probably more sensible and reasonable to adopt `=>` for the places in code where you do need short inline function expressions, but leave your normal-length main functions as is. + +### Not Just Shorter Syntax, But `this` + +Most of the popular attention toward `=>` has been on saving those precious keystrokes by dropping `function`, `return`, and `{ .. }` from your code. + +But there's a big detail we've skipped over so far. I said at the beginning of the section that `=>` functions are closely related to `this` binding behavior. In fact, `=>` arrow functions are *primarily designed* to alter `this` behavior in a specific way, solving a particular and common pain point with `this`-aware coding. + +The saving of keystrokes is a red herring, a misleading sideshow at best. + +Let's revisit another example from earlier in this chapter: + +```js +var controller = { + makeRequest: function(..){ + var self = this; + + btn.addEventListener( "click", function(){ + // .. + self.makeRequest(..); + }, false ); + } +}; +``` + +We used the `var self = this` hack, and then referenced `self.makeRequest(..)`, because inside the callback function we're passing to `addEventListener(..)`, the `this` binding will not be the same as it is in `makeRequest(..)` itself. 
In other words, because `this` bindings are dynamic, we fall back to the predictability of lexical scope via the `self` variable. + +Herein we finally can see the primary design characteristic of `=>` arrow functions. Inside arrow functions, the `this` binding is not dynamic, but is instead lexical. In the previous snippet, if we use an arrow function for the callback, `this` will be predictably what we wanted it to be. + +Consider: + +```js +var controller = { + makeRequest: function(..){ + btn.addEventListener( "click", () => { + // .. + this.makeRequest(..); + }, false ); + } +}; +``` + +Lexical `this` in the arrow function callback in the previous snippet now points to the same value as in the enclosing `makeRequest(..)` function. In other words, `=>` is a syntactic stand-in for `var self = this`. + +In cases where `var self = this` (or, alternatively, a function `.bind(this)` call) would normally be helpful, `=>` arrow functions are a nicer alternative operating on the same principle. Sounds great, right? + +Not quite so simple. + +If `=>` replaces `var self = this` or `.bind(this)` and it helps, guess what happens if you use `=>` with a `this`-aware function that *doesn't* need `var self = this` to work? You might be able to guess that it's going to mess things up. Yeah. + +Consider: + +```js +var controller = { + makeRequest: (..) => { + // .. + this.helper(..); + }, + helper: (..) => { + // .. + } +}; + +controller.makeRequest(..); +``` + +Although we invoke as `controller.makeRequest(..)`, the `this.helper` reference fails, because `this` here doesn't point to `controller` as it normally would. Where does it point? It lexically inherits `this` from the surrounding scope. In the previous snippet, that's the global scope, where `this` points to the global object. Ugh. 
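
The snippets above use `(..)` placeholders, so they can't be run as written. Here is a small runnable sketch of the same three situations; the property names (`id`, `viaArrow`, `viaFunction`, `viaArrowMethod`) are made up for illustration. What top-level `this` actually is depends on the environment (the global object in a browser script, `module.exports` in a Node CommonJS module, `undefined` in an ES module), but in none of those cases is it `controller`:

```js
var controller = {
	id: "ctrl",

	// normal method + arrow callback: the arrow inherits
	// `this` from `viaArrow(..)`, which is `controller` here
	viaArrow: function() {
		return [1].map( () => this.id )[0];
	},

	// normal method + plain function callback: the callback
	// gets its own `this`, which is not `controller`
	viaFunction: function() {
		return [1].map( function(){
			return this === controller;
		} )[0];
	},

	// arrow function as the "method": `this` comes from the
	// scope surrounding the object literal, not the call-site
	viaArrowMethod: () => this === controller
};

console.log( controller.viaArrow() );			// "ctrl"
console.log( controller.viaFunction() );		// false
console.log( controller.viaArrowMethod() );		// false
```
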
+ +In addition to lexical `this`, arrow functions also have lexical `arguments` -- they don't have their own `arguments` array but instead inherit from their parent -- as well as lexical `super` and `new.target` (see "Classes" in Chapter 3). + +So now we can conclude a more nuanced set of rules for when `=>` is appropriate and not: + +* If you have a short, single-statement inline function expression, where the only statement is a `return` of some computed value, *and* that function doesn't already make a `this` reference inside it, *and* there's no self-reference (recursion, event binding/unbinding), *and* you don't reasonably expect the function to ever be that way, you can probably safely refactor it to be an `=>` arrow function. +* If you have an inner function expression that's relying on a `var self = this` hack or a `.bind(this)` call on it in the enclosing function to ensure proper `this` binding, that inner function expression can probably safely become an `=>` arrow function. +* If you have an inner function expression that's relying on something like `var args = Array.prototype.slice.call(arguments)` in the enclosing function to make a lexical copy of `arguments`, that inner function expression can probably safely become an `=>` arrow function. +* For everything else -- normal function declarations, longer multistatement function expressions, functions that need a lexical name identifier self-reference (recursion, etc.), and any other function that doesn't fit the previous characteristics -- you should probably avoid `=>` function syntax. + +Bottom line: `=>` is about lexical binding of `this`, `arguments`, and `super`. These are intentional features designed to fix some common problems, not bugs, quirks, or mistakes in ES6. + +Don't believe any hype that `=>` is primarily, or even mostly, about fewer keystrokes. Whether you save keystrokes or waste them, you should know exactly what you are intentionally doing with every character typed. 
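
The lexical `arguments` behavior mentioned above is easy to overlook, so here is a minimal sketch of it (the function names are illustrative only). The arrow function sees the `arguments` of the enclosing normal function, not its own; a rest parameter is the way to collect what the arrow itself was called with:

```js
function outer() {
	// the arrow has no `arguments` of its own, so this
	// refers to the `arguments` of `outer(..)`
	var inner = () => arguments[0];

	return inner( "passed to inner" );
}

console.log( outer( "passed to outer" ) );	// "passed to outer"

// rest parameters collect an arrow's own arguments
var collect = (...args) => args;

console.log( collect( 1, 2, 3 ) );			// [ 1, 2, 3 ]
```
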
+ +**Tip:** If you have a function that for any of these articulated reasons is not a good match for an `=>` arrow function, but it's being declared as part of an object literal, recall from "Concise Methods" earlier in this chapter that there's another option for shorter function syntax. + +If you prefer a visual decision chart for how/why to pick an arrow function: + +[Figure: arrow function decision chart] + +## `for..of` Loops + +Joining the `for` and `for..in` loops from the JavaScript we're all familiar with, ES6 adds a `for..of` loop, which loops over the set of values produced by an *iterator*. + +The value you loop over with `for..of` must be an *iterable*, or it must be a value which can be coerced/boxed to an object (see the *Types & Grammar* title of this series) that is an iterable. An iterable is simply an object that is able to produce an iterator, which the loop then uses. + +Let's compare `for..of` to `for..in` to illustrate the difference: + +```js +var a = ["a","b","c","d","e"]; + +for (var idx in a) { + console.log( idx ); +} +// 0 1 2 3 4 + +for (var val of a) { + console.log( val ); +} +// "a" "b" "c" "d" "e" +``` + +As you can see, `for..in` loops over the keys/indexes in the `a` array, while `for..of` loops over the values in `a`. 
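
The difference becomes more visible once the array carries an extra, non-index property. In this small sketch (the `extra` property is added purely for illustration), `for..in` reports every enumerable key as a string, while `for..of` only sees what the array's iterator produces:

```js
var a = ["a","b","c"];
a.extra = "x";

var keys = [], vals = [];

for (var k in a) {
	keys.push( k );
}

for (var v of a) {
	vals.push( v );
}

console.log( keys );	// [ "0", "1", "2", "extra" ]
console.log( vals );	// [ "a", "b", "c" ]
```
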
+ +Here's the pre-ES6 version of the `for..of` from that previous snippet: + +```js +var a = ["a","b","c","d","e"], + k = Object.keys( a ); + +for (var val, i = 0; i < k.length; i++) { + val = a[ k[i] ]; + console.log( val ); +} +// "a" "b" "c" "d" "e" +``` + +And here's the ES6 but non-`for..of` equivalent, which also gives a glimpse at manually iterating an iterator (see "Iterators" in Chapter 3): + +```js +var a = ["a","b","c","d","e"]; + +for (var val, ret, it = a[Symbol.iterator](); + (ret = it.next()) && !ret.done; +) { + val = ret.value; + console.log( val ); +} +// "a" "b" "c" "d" "e" +``` + +Under the covers, the `for..of` loop asks the iterable for an iterator (using the built-in `Symbol.iterator`; see "Well-Known Symbols" in Chapter 7), then it repeatedly calls the iterator and assigns its produced value to the loop iteration variable. + +Standard built-in values in JavaScript that are by default iterables (or provide them) include: + +* Arrays +* Strings +* Generators (see Chapter 3) +* Collections / TypedArrays (see Chapter 5) + +**Warning:** Plain objects are not by default suitable for `for..of` looping. That's because they don't have a default iterator, which is intentional, not a mistake. However, we won't go any further into those nuanced reasonings here. In "Iterators" in Chapter 3, we'll see how to define iterators for our own objects, which lets `for..of` loop over any object to get a set of values we define. + +Here's how to loop over the characters in a primitive string: + +```js +for (var c of "hello") { + console.log( c ); +} +// "h" "e" "l" "l" "o" +``` + +The `"hello"` primitive string value is coerced/boxed to the `String` object wrapper equivalent, which is an iterable by default. + +In `for (XYZ of ABC)..`, the `XYZ` clause can either be an assignment expression or a declaration, identical to that same clause in `for` and `for..in` loops. 
So you can do stuff like this: + +```js +var o = {}; + +for (o.a of [1,2,3]) { + console.log( o.a ); +} +// 1 2 3 + +for ({x: o.a} of [ {x: 1}, {x: 2}, {x: 3} ]) { + console.log( o.a ); +} +// 1 2 3 +``` + +`for..of` loops can be prematurely stopped, just like other loops, with `break`, `continue`, `return` (if in a function), and thrown exceptions. In any of these cases, the iterator's `return(..)` function is automatically called (if one exists) to let the iterator perform cleanup tasks, if necessary. + +**Note:** See "Iterators" in Chapter 3 for more complete coverage on iterables and iterators. + +## Regular Expressions + +Let's face it: regular expressions haven't changed much in JS in a long time. So it's a great thing that they've finally learned a couple of new tricks in ES6. We'll briefly cover the additions here, but the overall topic of regular expressions is so dense that you'll need to turn to chapters/books dedicated to it (of which there are many!) if you need a refresher. + +### Unicode Flag + +We'll cover the topic of Unicode in more detail in "Unicode" later in this chapter. Here, we'll just look briefly at the new `u` flag for ES6+ regular expressions, which turns on Unicode matching for that expression. + +JavaScript strings are typically interpreted as sequences of 16-bit characters, which correspond to the characters in the *Basic Multilingual Plane (BMP)* (http://en.wikipedia.org/wiki/Plane_%28Unicode%29). But there are many UTF-16 characters that fall outside this range, and so strings may have these multibyte characters in them. + +Prior to ES6, regular expressions could only match based on BMP characters, which means that those extended characters were treated as two separate characters for matching purposes. This is often not ideal. + +So, as of ES6, the `u` flag tells a regular expression to process a string with the interpretation of Unicode (UTF-16) characters, such that such an extended character will be matched as a single entity. 
+ +**Warning:** Despite what the name implies, "UTF-16" doesn't strictly mean 16 bits. Modern Unicode uses 21 bits, and the numbers in names like UTF-8 and UTF-16 refer to the size of the units used to encode a character, not to a fixed number of bits per character. + +An example (straight from the ES6 specification): 𝄞 (the musical symbol G-clef) is Unicode code point U+1D11E (0x1D11E). + +If this character appears in a regular expression pattern (like `/𝄞/`), the standard BMP interpretation would be that it's two separate characters (0xD834 and 0xDD1E) to match with. But the new ES6 Unicode-aware mode means that `/𝄞/u` (or the escaped Unicode form `/\u{1D11E}/u`) will match `"𝄞"` in a string as a single matched character. + +You might be wondering why this matters. In non-Unicode BMP mode, the pattern is treated as two separate characters, but would still find the match in a string with the `"𝄞"` character in it, as you can see if you try: + +```js +/𝄞/.test( "𝄞-clef" ); // true +``` + +The length of the match is what matters. For example: + +```js +/^.-clef/ .test( "𝄞-clef" ); // false +/^.-clef/u.test( "𝄞-clef" ); // true +``` + +The `^.-clef` in the pattern says to match only a single character at the beginning before the normal `"-clef"` text. In standard BMP mode, the match fails (two characters), but with `u` Unicode mode flagged on, the match succeeds (one character). + +It's also important to note that `u` makes quantifiers like `+` and `*` apply to the entire Unicode code point as a single character, not just the *low surrogate* (aka rightmost half of the symbol) of the character. The same goes for Unicode characters appearing in character classes, like `/[💩-💫]/u`. + +**Note:** There are plenty more nitty-gritty details about `u` behavior in regular expressions, which Mathias Bynens (https://twitter.com/mathias) has written extensively about (https://mathiasbynens.be/notes/es6-unicode-regex). 
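
The point above about quantifiers can be checked with a short sketch. To avoid any source-encoding surprises, the G-clef is written here as its surrogate pair and the patterns are built with the `RegExp(..)` constructor; the variable names are illustrative only:

```js
// "𝄞" (U+1D11E) written as its surrogate pair
var clef = "\uD834\uDD1E",
	str = clef + clef + clef;

var bmp = new RegExp( "^" + clef + "+$" ),
	unicode = new RegExp( "^" + clef + "+$", "u" );

console.log( str.length );			// 6 -- counted in 16-bit units
console.log( bmp.test( str ) );		// false -- `+` repeats only the trailing half
console.log( unicode.test( str ) );	// true -- `+` repeats the whole symbol
```
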
+ +### Sticky Flag + +Another flag mode added to ES6 regular expressions is `y`, which is often called "sticky mode." *Sticky* essentially means the regular expression has a virtual anchor at its beginning that keeps it rooted to matching at only the position indicated by the regular expression's `lastIndex` property. + +To illustrate, let's consider two regular expressions, the first without sticky mode and the second with: + +```js +var re1 = /foo/, + str = "++foo++"; + +re1.lastIndex; // 0 +re1.test( str ); // true +re1.lastIndex; // 0 -- not updated + +re1.lastIndex = 4; +re1.test( str ); // true -- ignored `lastIndex` +re1.lastIndex; // 4 -- not updated +``` + +Three things to observe about this snippet: + +* `test(..)` doesn't pay any attention to `lastIndex`'s value, and always just performs its match from the beginning of the input string. +* Because our pattern does not have a `^` start-of-input anchor, the search for `"foo"` is free to move ahead through the whole string looking for a match. +* `lastIndex` is not updated by `test(..)`. + +Now, let's try a sticky mode regular expression: + +```js +var re2 = /foo/y, // <-- notice the `y` sticky flag + str = "++foo++"; + +re2.lastIndex; // 0 +re2.test( str ); // false -- "foo" not found at `0` +re2.lastIndex; // 0 + +re2.lastIndex = 2; +re2.test( str ); // true +re2.lastIndex; // 5 -- updated to after previous match + +re2.test( str ); // false +re2.lastIndex; // 0 -- reset after previous match failure +``` + +And so our new observations about sticky mode: + +* `test(..)` uses `lastIndex` as the exact and only position in `str` to look to make a match. There is no moving ahead to look for the match -- it's either there at the `lastIndex` position or not. +* If a match is made, `test(..)` updates `lastIndex` to point to the character immediately following the match. If a match fails, `test(..)` resets `lastIndex` back to `0`. 
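
Those two observations can be wrapped up in a tiny utility. This is only a sketch, and `matchesAt(..)` is a hypothetical helper name, not a built-in: it answers "does this pattern match at exactly this position?" by combining the `y` flag with a manually set `lastIndex`:

```js
// hypothetical helper: does `pattern` match exactly at `pos`?
function matchesAt(pattern, str, pos) {
	var re = new RegExp( pattern, "y" );
	re.lastIndex = pos;
	return re.test( str );
}

console.log( matchesAt( "foo", "++foo++", 0 ) );	// false
console.log( matchesAt( "foo", "++foo++", 2 ) );	// true
console.log( matchesAt( "foo", "++foo++", 3 ) );	// false
```
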
+ +Normal non-sticky patterns that aren't otherwise `^`-rooted to the start-of-input are free to move ahead in the input string looking for a match. But sticky mode restricts the pattern to matching just at the position of `lastIndex`. + +As I suggested at the beginning of this section, another way of looking at this is that `y` implies a virtual anchor at the beginning of the pattern that is relative (aka constrains the start of the match) to exactly the `lastIndex` position. + +**Warning:** In previous literature on the topic, it has alternatively been asserted that this behavior is like `y` implying a `^` (start-of-input) anchor in the pattern. This is inaccurate. We'll explain in further detail in "Anchored Sticky" later. + +#### Sticky Positioning + +It may seem strangely limiting that to use `y` for repeated matches, you have to manually ensure `lastIndex` is in the exact right position, as it has no move-ahead capability for matching. + +Here's one possible scenario: if you know that the match you care about is always going to be at a position that's a multiple of a number (e.g., `0`, `10`, `20`, etc.), you can just construct a limited pattern matching what you care about, but then manually set `lastIndex` to those fixed positions before each match. + +Consider: + +```js +// padded so that "far" starts at index 10 and "fad" at index 20 +var re = /f../y, + str = "foo       far       fad"; + +str.match( re ); // ["foo"] + +re.lastIndex = 10; +str.match( re ); // ["far"] + +re.lastIndex = 20; +str.match( re ); // ["fad"] +``` + +However, if you're parsing a string that isn't formatted in fixed positions like that, figuring out what to set `lastIndex` to before each match is likely going to be untenable. + +There's a saving nuance to consider here. `y` requires that `lastIndex` be in the exact position for a match to occur. But it doesn't strictly require that *you* manually set `lastIndex`. 
+ +Instead, you can construct your expressions in such a way that they capture in each main match everything before and after the thing you care about, up to right before the next thing you'll care to match. + +Because `lastIndex` will be set to the next character beyond the end of a match, if you've matched everything up to that point, `lastIndex` will always be in the correct position for the `y` pattern to start from the next time. + +**Warning:** If you can't predict the structure of the input string in a sufficiently patterned way like that, this technique may not be suitable and you may not be able to use `y`. + +Having structured string input is likely the most practical scenario where `y` will be capable of performing repeated matching throughout a string. Consider: + +```js +var re = /\d+\.\s(.*?)(?:\s|$)/y, + str = "1. foo 2. bar 3. baz"; + +str.match( re ); // [ "1. foo ", "foo" ] + +re.lastIndex; // 7 -- correct position! +str.match( re ); // [ "2. bar ", "bar" ] + +re.lastIndex; // 14 -- correct position! +str.match( re ); // ["3. baz", "baz"] +``` + +This works because I knew something ahead of time about the structure of the input string: there is always a numeral prefix like `"1. "` before the desired match (`"foo"`, etc.), and either a space after it, or the end of the string (`$` anchor). So the regular expression I constructed captures all of that in each main match, and then I use a matching group `( )` so that the stuff I really care about is separated out for convenience. + +After the first match (`"1. foo "`), the `lastIndex` is `7`, which is already the position needed to start the next match, for `"2. bar "`, and so on. + +If you're going to use `y` sticky mode for repeated matches, you'll probably want to look for opportunities to have `lastIndex` automatically positioned as we've just demonstrated. 
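
A common place where this automatic positioning pays off is tokenizing. The following is only a sketch under simple assumptions: `tokenize(..)` is a hypothetical function, and the token grammar (integers plus a few operator characters, separated by optional whitespace) is made up for illustration. Each match consumes the token *and* the whitespace around it, so `lastIndex` always lands where the next token should begin, and anything unexpected fails immediately instead of being silently skipped:

```js
// hypothetical tokenizer: integers and + - * / ( )
function tokenize(str) {
	var re = /\s*(\d+|[-+*\/()])\s*/y,
		tokens = [],
		pos, m;

	while (re.lastIndex < str.length) {
		// remember the position; a failed match resets `lastIndex`
		pos = re.lastIndex;
		m = re.exec( str );

		if (!m) {
			throw new SyntaxError( "Unexpected input near position " + pos );
		}

		tokens.push( m[1] );
	}

	return tokens;
}

console.log( tokenize( "12 + (3*4)" ) );
// [ "12", "+", "(", "3", "*", "4", ")" ]
```

With a `g` pattern instead of `y`, the `"$"` in an input like `"1 $ 2"` would simply be skipped over; with `y`, the match fails at that spot and the error is reported.
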
+ +#### Sticky Versus Global + +Some readers may be aware that you can emulate something like this `lastIndex`-relative matching with the `g` global match flag and the `exec(..)` method, as so: + +```js +var re = /o+./g, // <-- look, `g`! + str = "foot book more"; + +re.exec( str ); // ["oot"] +re.lastIndex; // 4 + +re.exec( str ); // ["ook"] +re.lastIndex; // 9 + +re.exec( str ); // ["or"] +re.lastIndex; // 13 + +re.exec( str ); // null -- no more matches! +re.lastIndex; // 0 -- starts over now! +``` + +While it's true that `g` pattern matches with `exec(..)` start their matching from `lastIndex`'s current value, and also update `lastIndex` after each match (or failure), this is not the same thing as `y`'s behavior. + +Notice in the previous snippet that `"ook"`, located at position `6`, was matched and found by the second `exec(..)` call, even though at the time, `lastIndex` was `4` (from the end of the previous match). Why? Because as we said earlier, non-sticky matches are free to move ahead in their matching. A sticky mode expression would have failed here, because it would not be allowed to move ahead. + +In addition to perhaps undesired move-ahead matching behavior, another downside to just using `g` instead of `y` is that `g` changes the behavior of some matching methods, like `str.match(re)`. + +Consider: + +```js +var re = /o+./g, // <-- look, `g`! + str = "foot book more"; + +str.match( re ); // ["oot","ook","or"] +``` + +See how all the matches were returned at once? Sometimes that's OK, but sometimes that's not what you want. + +The `y` sticky flag will give you one-at-a-time progressive matching with utilities like `test(..)` and `match(..)`. Just make sure the `lastIndex` is always in the right position for each match! + +#### Anchored Sticky + +As we warned earlier, it's inaccurate to think of sticky mode as implying a pattern starts with `^`. The `^` anchor has a distinct meaning in regular expressions, which is *not altered* by sticky mode. 
`^` is an anchor that *always* refers to the beginning of the input, and *is not* in any way relative to `lastIndex`. + +Besides poor/inaccurate documentation on this topic, the confusion is unfortunately strengthened further because an older pre-ES6 experiment with sticky mode in Firefox *did* make `^` relative to `lastIndex`, so that behavior has been around for years. + +ES6 elected not to do it that way. `^` in a pattern means start-of-input absolutely and only. + +As a consequence, a pattern like `/^foo/y` will always and only find a `"foo"` match at the beginning of a string, *if it's allowed to match there*. If `lastIndex` is not `0`, the match will fail. Consider: + +```js +var re = /^foo/y, + str = "foo"; + +re.test( str ); // true +re.test( str ); // false +re.lastIndex; // 0 -- reset after failure + +re.lastIndex = 1; +re.test( str ); // false -- failed for positioning +re.lastIndex; // 0 -- reset after failure +``` + +Bottom line: `y` plus `^` plus `lastIndex > 0` is an incompatible combination that will always cause a failed match. + +**Note:** While `y` does not alter the meaning of `^` in any way, the `m` multiline mode *does*, such that `^` means start-of-input *or* start of text after a newline. So, if you combine `y` and `m` flags together for a pattern, you can find multiple `^`-rooted matches in a string. But remember: because it's `y` sticky, you'll have to make sure `lastIndex` is pointing at the correct new line position (likely by matching to the end of the line) each subsequent time, or no subsequent matches will be made. 
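
Here is a minimal sketch of that `y` plus `m` combination (the pattern and input are made up for illustration). Each match consumes through the end of its line, including the newline, so `lastIndex` lands on the start of the next line, which is exactly where `^` is allowed to match in multiline mode:

```js
var re = /^(\w+)\n?/my,
	str = "foo\nbar\nbaz";

console.log( re.exec( str )[1] );	// "foo"
console.log( re.lastIndex );		// 4 -- start of the second line

console.log( re.exec( str )[1] );	// "bar"
console.log( re.lastIndex );		// 8 -- start of the third line

console.log( re.exec( str )[1] );	// "baz"

re.lastIndex = 5;					// middle of "bar": not a line start
console.log( re.exec( str ) );		// null
```
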
+ +### Regular Expression `flags` + +Prior to ES6, if you wanted to examine a regular expression object to see what flags it had applied, you needed to parse them out -- ironically, probably with another regular expression -- from the string produced by `toString()` (the `source` property itself holds only the pattern text, not the flags), such as: + +```js +var re = /foo/ig; + +re.toString(); // "/foo/ig" + +var flags = re.toString().match( /\/([gim]*)$/ )[1]; + +flags; // "ig" +``` + +As of ES6, you can now get these values directly, with the new `flags` property: + +```js +var re = /foo/ig; + +re.flags; // "gi" +``` + +It's a small nuance, but the ES6 specification calls for the expression's flags to be listed in this order: `"gimuy"`, regardless of what order the original pattern was specified with. That's the reason for the difference between `/ig` and `"gi"`. + +No, the order of flags specified or listed doesn't matter. + +Another tweak from ES6 is that the `RegExp(..)` constructor is now `flags`-aware if you pass it an existing regular expression: + +```js +var re1 = /foo*/y; +re1.source; // "foo*" +re1.flags; // "y" + +var re2 = new RegExp( re1 ); +re2.source; // "foo*" +re2.flags; // "y" + +var re3 = new RegExp( re1, "ig" ); +re3.source; // "foo*" +re3.flags; // "gi" +``` + +Prior to ES6, the `re3` construction would throw an error, but as of ES6 you can override the flags when duplicating. + +## Number Literal Extensions + +Prior to ES5, number literals looked like the following -- the octal form was not officially specified, only allowed as an extension that browsers had come to de facto agreement on: + +```js +var dec = 42, + oct = 052, + hex = 0x2a; +``` + +**Note:** Though you are specifying a number in different bases, the number's mathematical value is what is stored, and the default output interpretation is always base-10. The three variables in the previous snippet all have the `42` value stored in them. 
+ +To further illustrate that `052` was a nonstandard form extension, consider: + +```js +Number( "42" ); // 42 +Number( "052" ); // 52 +Number( "0x2a" ); // 42 +``` + +ES5 continued to permit the browser-extended octal form (including such inconsistencies), except that in strict mode, the octal literal (`052`) form is disallowed. This restriction was done mainly because many developers had the habit (from other languages) of seemingly innocuously prefixing otherwise base-10 numbers with `0`'s for code alignment purposes, and then running into the accidental fact that they'd changed the number value entirely! + +ES6 continues the legacy of changes/variations to how number literals outside base-10 numbers can be represented. There's now an official octal form, an amended hexadecimal form, and a brand-new binary form. For web compatibility reasons, the old octal `052` form will continue to be legal (though unspecified) in non-strict mode, but should really never be used anymore. + +Here are the new ES6 number literal forms: + +```js +var dec = 42, + oct = 0o52, // or `0O52` :( + hex = 0x2a, // or `0X2a` :/ + bin = 0b101010; // or `0B101010` :/ +``` + +The only decimal form allowed is base-10. Octal, hexadecimal, and binary are all integer forms. + +And the string representations of these forms are all able to be coerced/converted to their number equivalent: + +```js +Number( "42" ); // 42 +Number( "0o52" ); // 42 +Number( "0x2a" ); // 42 +Number( "0b101010" ); // 42 +``` + +Though not strictly new to ES6, it's a little-known fact that you can actually go the opposite direction of conversion (well, sort of): + +```js +var a = 42; + +a.toString(); // "42" -- also `a.toString( 10 )` +a.toString( 8 ); // "52" +a.toString( 16 ); // "2a" +a.toString( 2 ); // "101010" +``` + +In fact, you can represent a number this way in any base from `2` to `36`, though it'd be rare that you'd go outside the standard bases: 2, 8, 10, and 16. 
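As a quick illustration of a less common base, and of going back from such a string to a number, here's a small sketch using the long-standing `parseInt(..)` utility with an explicit radix. One wrinkle worth noticing: the new ES6 prefixes are understood by `Number(..)`, but `parseInt(..)` with a radix of `2` stops parsing at the `b`:

```js
var a = 42;

var b36 = a.toString( 36 );		// "16" -- 1*36 + 6

// parse a string of digits in a given radix
parseInt( b36, 36 );			// 42
parseInt( "101010", 2 );		// 42
parseInt( "52", 8 );			// 42
parseInt( "2a", 16 );			// 42

// prefixed forms: fine for `Number(..)`, not for `parseInt(.., 2)`
Number( "0b101010" );			// 42
parseInt( "0b101010", 2 );		// 0
```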
+ +## Unicode + +Let me just say that this section is not an exhaustive everything-you-ever-wanted-to-know-about-Unicode resource. I want to cover what you need to know that's *changing* for Unicode in ES6, but we won't go much deeper than that. Mathias Bynens (http://twitter.com/mathias) has written/spoken extensively and brilliantly about JS and Unicode (see https://mathiasbynens.be/notes/javascript-unicode and http://fluentconf.com/javascript-html-2015/public/content/2015/02/18-javascript-loves-unicode). + +The Unicode characters that range from `0x0000` to `0xFFFF` contain all the standard printed characters (in various languages) that you're likely to have seen or interacted with. This group of characters is called the *Basic Multilingual Plane (BMP)*. The BMP even contains fun symbols like this cool snowman: ☃ (U+2603). + +There are lots of other extended Unicode characters beyond this BMP set, which range up to `0x10FFFF`. These symbols are often referred to as *astral* symbols, as that's the name given to the set of 16 *planes* (e.g., layers/groupings) of characters beyond the BMP. Examples of astral symbols include 𝄞 (U+1D11E) and 💩 (U+1F4A9). + +Prior to ES6, JavaScript strings could specify Unicode characters using Unicode escaping, such as: + +```js +var snowman = "\u2603"; +console.log( snowman ); // "☃" +``` + +However, the `\uXXXX` Unicode escaping only supports four hexadecimal characters, so you can only represent the BMP set of characters in this way. 
To represent an astral character using Unicode escaping prior to ES6, you need to use a *surrogate pair* -- basically two specially calculated Unicode-escaped characters side by side, which JS interprets together as a single astral character: + +```js +var gclef = "\uD834\uDD1E"; +console.log( gclef ); // "𝄞" +``` + +As of ES6, we now have a new form for Unicode escaping (in strings and regular expressions), called Unicode *code point escaping*: + +```js +var gclef = "\u{1D11E}"; +console.log( gclef ); // "𝄞" +``` + +As you can see, the difference is the presence of the `{ }` in the escape sequence, which allows it to contain any number of hexadecimal characters. Because you only need six to represent the highest possible code point value in Unicode (i.e., 0x10FFFF), this is sufficient. + +### Unicode-Aware String Operations + +By default, JavaScript string operations and methods are not sensitive to astral symbols in string values. So, they treat each BMP character individually, even the two surrogate halves that make up an otherwise single astral character. Consider: + +```js +var snowman = "☃"; +snowman.length; // 1 + +var gclef = "𝄞"; +gclef.length; // 2 +``` + +So, how do we accurately calculate the length of such a string? In this scenario, the following trick will work: + +```js +var gclef = "𝄞"; + +[...gclef].length; // 1 +Array.from( gclef ).length; // 1 +``` + +Recall from the "`for..of` Loops" section earlier in this chapter that ES6 strings have built-in iterators. This iterator happens to be Unicode-aware, meaning it will automatically output an astral symbol as a single value. We take advantage of that using the `...` spread operator in an array literal, which creates an array of the string's symbols. Then we just inspect the length of that resultant array. ES6's `Array.from(..)` does basically the same thing as `[...XYZ]`, but we'll cover that utility in detail in Chapter 6. 
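If you're curious what "specially calculated" means for a surrogate pair, and how the string iterator treats that pair, here's a rough sketch. The `toSurrogates(..)` helper is hypothetical (not a built-in), and it assumes you only pass it code points above `0xFFFF`:

```js
// hypothetical helper: split an astral code point into its
// two surrogate halves (assumes `codePoint` > 0xFFFF)
function toSurrogates(codePoint) {
	var offset = codePoint - 0x10000;
	return [
		0xD800 + (offset >> 10),	// lead (high) surrogate
		0xDC00 + (offset & 0x3FF)	// trail (low) surrogate
	];
}

var pair = toSurrogates( 0x1D11E )
	.map( function hex(n){ return n.toString( 16 ); } );

pair;							// ["d834","dd1e"]

// the string iterator hands back the pair as one symbol
var str = "a\u{1D11E}b", symbols = [];

for (var sym of str) {
	symbols.push( sym );
}

str.length;						// 4
symbols.length;					// 3
symbols[1] === "\uD834\uDD1E";	// true
```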
+ +**Warning:** It should be noted that constructing and exhausting an iterator just to get the length of a string is quite expensive on performance, relatively speaking, compared to what a theoretically optimized native utility/property would do. + +Unfortunately, the full answer is not as simple or straightforward. In addition to the surrogate pairs (which the string iterator takes care of), there are special Unicode code points that behave in other special ways, which is much harder to account for. For example, there's a set of code points that modify the previous adjacent character, known as *Combining Diacritical Marks*. + +Consider these two string outputs: + +```js +console.log( s1 ); // "é" +console.log( s2 ); // "é" +``` + +They look the same, but they're not! Here's how we created `s1` and `s2`: + +```js +var s1 = "\xE9", + s2 = "e\u0301"; +``` + +As you can probably guess, our previous `length` trick doesn't work with `s2`: + +```js +[...s1].length; // 1 +[...s2].length; // 2 +``` + +So what can we do? In this case, we can perform a *Unicode normalization* on the value before inquiring about its length, using the ES6 `String#normalize(..)` utility (which we'll cover more in Chapter 6): + +```js +var s1 = "\xE9", + s2 = "e\u0301"; + +s1.normalize().length; // 1 +s2.normalize().length; // 1 + +s1 === s2; // false +s1 === s2.normalize(); // true +``` + +Essentially, `normalize(..)` takes a sequence like `"e\u0301"` and normalizes it to `"\xE9"`. Normalization can even combine multiple adjacent combining marks if there's a suitable Unicode character they combine to: + +```js +var s1 = "o\u0302\u0300", + s2 = s1.normalize(), + s3 = "ồ"; + +s1.length; // 3 +s2.length; // 1 +s3.length; // 1 + +s2 === s3; // true +``` + +Unfortunately, normalization isn't fully perfect here, either. 
If you have multiple combining marks modifying a single character, you may not get the length count you'd expect, because there may not be a single defined normalized character that represents the combination of all the marks. For example: + +```js +var s1 = "e\u0301\u0330"; + +console.log( s1 ); // "ḛ́" + +s1.normalize().length; // 2 +``` + +The further you go down this rabbit hole, the more you realize that it's difficult to get one precise definition for "length." What we see visually rendered as a single character -- more precisely called a *grapheme* -- doesn't always strictly relate to a single "character" in the program processing sense. + +**Tip:** If you want to see just how deep this rabbit hole goes, check out the "Grapheme Cluster Boundaries" algorithm (http://www.Unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries). + +### Character Positioning + +Similar to length complications, what does it actually mean to ask, "what is the character at position 2?" The naive pre-ES6 answer comes from `charAt(..)`, which will not respect the atomicity of an astral character, nor will it take into account combining marks. + +Consider: + +```js +var s1 = "abc\u0301d", + s2 = "ab\u0107d", + s3 = "ab\u{1d49e}d"; + +console.log( s1 ); // "abćd" +console.log( s2 ); // "abćd" +console.log( s3 ); // "ab𝒞d" + +s1.charAt( 2 ); // "c" +s2.charAt( 2 ); // "ć" +s3.charAt( 2 ); // "" <-- unprintable surrogate +s3.charAt( 3 ); // "" <-- unprintable surrogate +``` + +So, is ES6 giving us a Unicode-aware version of `charAt(..)`? Unfortunately, no. At the time of this writing, there's a proposal for such a utility that's under consideration for post-ES6. 
+ +But with what we explored in the previous section (and of course with the limitations noted thereof!), we can hack an ES6 answer: + +```js +var s1 = "abc\u0301d", + s2 = "ab\u0107d", + s3 = "ab\u{1d49e}d"; + +[...s1.normalize()][2]; // "ć" +[...s2.normalize()][2]; // "ć" +[...s3.normalize()][2]; // "𝒞" +``` + +**Warning:** Reminder of an earlier warning: constructing and exhausting an iterator each time you want to get at a single character is... very not ideal, performance wise. Let's hope we get a built-in and optimized utility for this soon, post-ES6. + +What about a Unicode-aware version of the `charCodeAt(..)` utility? ES6 gives us `codePointAt(..)`: + +```js +var s1 = "abc\u0301d", + s2 = "ab\u0107d", + s3 = "ab\u{1d49e}d"; + +s1.normalize().codePointAt( 2 ).toString( 16 ); +// "107" + +s2.normalize().codePointAt( 2 ).toString( 16 ); +// "107" + +s3.normalize().codePointAt( 2 ).toString( 16 ); +// "1d49e" +``` + +What about the other direction? A Unicode-aware version of `String.fromCharCode(..)` is ES6's `String.fromCodePoint(..)`: + +```js +String.fromCodePoint( 0x107 ); // "ć" + +String.fromCodePoint( 0x1d49e ); // "𝒞" +``` + +So wait, can we just combine `String.fromCodePoint(..)` and `codePointAt(..)` to get a better version of a Unicode-aware `charAt(..)` from earlier? Yep! + +```js +var s1 = "abc\u0301d", + s2 = "ab\u0107d", + s3 = "ab\u{1d49e}d"; + +String.fromCodePoint( s1.normalize().codePointAt( 2 ) ); +// "ć" + +String.fromCodePoint( s2.normalize().codePointAt( 2 ) ); +// "ć" + +String.fromCodePoint( s3.normalize().codePointAt( 2 ) ); +// "𝒞" +``` + +There's quite a few other string methods we haven't addressed here, including `toUpperCase()`, `toLowerCase()`, `substring(..)`, `indexOf(..)`, `slice(..)`, and a dozen others. None of these have been changed or augmented for full Unicode awareness, so you should be very careful -- probably just avoid them! -- when working with strings containing astral symbols. 
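One caveat about the `codePointAt(..)` approach deserves a sketch. The position you pass to `codePointAt(..)` counts code units (the same units `length` and `charAt(..)` use), not whole symbols. It worked in the previous snippets because position `2` happened to be where the symbol started. Once an astral symbol appears *before* the position you want, the two ways of counting drift apart. The `symbolAt(..)` helper here is hypothetical, just one way to count by symbols using the string iterator:

```js
var s = "a\u{1d49e}bc";

s.codePointAt( 1 ).toString( 16 );	// "1d49e"
s.codePointAt( 2 ).toString( 16 );	// "dc9e" -- trailing surrogate!
s.codePointAt( 3 ).toString( 16 );	// "62" -- that's "b"

// hypothetical helper: position counted in symbols, not code units
function symbolAt(str,pos) {
	var idx = 0;
	for (var sym of str) {
		if (idx++ === pos) return sym;
	}
	return "";
}

symbolAt( s, 1 ) === "\u{1d49e}";	// true
symbolAt( s, 2 );					// "b"
symbolAt( s, 10 );					// ""
```

This helper stops iterating as soon as it reaches the requested position, so it avoids building a whole array. It still knows nothing about combining marks, so you'd want to `normalize()` first if those matter to you.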
+ +There are also several string methods that use regular expressions for their behavior, like `replace(..)` and `match(..)`. Thankfully, ES6 brings Unicode awareness to regular expressions, as we covered in "Unicode Flag" earlier in this chapter. + +OK, there we have it! JavaScript's Unicode string support is significantly better than it was pre-ES6 (though still not perfect) with the various additions we've just covered. + +### Unicode Identifier Names + +Unicode can also be used in identifier names (variables, properties, etc.). Prior to ES6, you could do this with Unicode escapes, like: + +```js +var \u03A9 = 42; + +// same as: var Ω = 42; +``` + +As of ES6, you can also use the code point escape syntax explained earlier: + +```js +var \u{2B400} = 42; + +// same as: var 𫐀 = 42; +``` + +There's a complex set of rules around exactly which Unicode characters are allowed. Furthermore, some are allowed only if they're not the first character of the identifier name. + +**Note:** Mathias Bynens has a great post (https://mathiasbynens.be/notes/javascript-identifiers-es6) on all the nitty-gritty details. + +The reasons for using such unusual characters in identifier names are rather rare and academic. You typically won't be best served by writing code that relies on these esoteric capabilities. + +## Symbols + +With ES6, for the first time in quite a while, a new primitive type has been added to JavaScript: the `symbol`. Unlike the other primitive types, however, symbols don't have a literal form. + +Here's how you create a symbol: + +```js +var sym = Symbol( "some optional description" ); + +typeof sym; // "symbol" +``` + +Some things to note: + +* You cannot and should not use `new` with `Symbol(..)`. It's not a constructor, nor are you producing an object. +* The parameter passed to `Symbol(..)` is optional. If passed, it should be a string that gives a friendly description for the symbol's purpose. 
+* The `typeof` output is a new value (`"symbol"`) that is the primary way to identify a symbol. + +The description, if provided, is solely used for the stringification representation of the symbol: + +```js +sym.toString(); // "Symbol(some optional description)" +``` + +Similar to how primitive string values are not instances of `String`, symbols are also not instances of `Symbol`. If, for some reason, you want to construct a boxed wrapper object form of a symbol value, you can do the following: + +```js +sym instanceof Symbol; // false + +var symObj = Object( sym ); +symObj instanceof Symbol; // true + +symObj.valueOf() === sym; // true +``` + +**Note:** `symObj` in this snippet is interchangeable with `sym`; either form can be used in all places symbols are utilized. There's not much reason to use the boxed wrapper object form (`symObj`) instead of the primitive form (`sym`). Keeping with similar advice for other primitives, it's probably best to prefer `sym` over `symObj`. + +The internal value of a symbol itself -- referred to as its `name` -- is hidden from the code and cannot be obtained. You can think of this symbol value as an automatically generated, unique (within your application) string value. + +But if the value is hidden and unobtainable, what's the point of having a symbol at all? + +The main point of a symbol is to create a string-like value that can't collide with any other value. So, for example, consider using a symbol as a constant representing an event name: + +```js +const EVT_LOGIN = Symbol( "event.login" ); +``` + +You'd then use `EVT_LOGIN` in place of a generic string literal like `"event.login"`: + +```js +evthub.listen( EVT_LOGIN, function(data){ + // .. +} ); +``` + +The benefit here is that `EVT_LOGIN` holds a value that cannot be duplicated (accidentally or otherwise) by any other value, so it is impossible for there to be any confusion of which event is being dispatched or handled. 
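To see that uniqueness for yourself, here's a short sketch. The `handlers` object is just a stand-in for illustration. Two symbols created with the very same description are still different values, and so they act as different keys:

```js
var a = Symbol( "event.login" ),
	b = Symbol( "event.login" );

a === b;						// false
a.toString() === b.toString();	// true -- same description text

var handlers = {};
handlers[a] = "handler A";
handlers[b] = "handler B";

handlers[a];					// "handler A" -- not overwritten

// implicit string coercion of a symbol throws
var coerced;
try {
	coerced = "" + a;
}
catch (err) {
	coerced = err instanceof TypeError;
}

coerced;						// true
```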
+ +**Note:** Under the covers, the `evthub` utility assumed in the previous snippet would almost certainly be using the symbol value from the `EVT_LOGIN` argument directly as the property/key in some internal object (hash) that tracks event handlers. If `evthub` instead needed to use the symbol value as a real string, it would need to explicitly coerce with `String(..)` or `toString()`, as implicit string coercion of symbols is not allowed. + +You may use a symbol directly as a property name/key in an object, such as a special property that you want to treat as hidden or meta in usage. It's important to know that although you intend to treat it as such, it is not *actually* a hidden or untouchable property. + +Consider this module that implements the *singleton* pattern behavior -- that is, it only allows itself to be created once: + +```js +const INSTANCE = Symbol( "instance" ); + +function HappyFace() { + if (HappyFace[INSTANCE]) return HappyFace[INSTANCE]; + + function smile() { .. } + + return HappyFace[INSTANCE] = { + smile: smile + }; +} + +var me = HappyFace(), + you = HappyFace(); + +me === you; // true +``` + +The `INSTANCE` symbol value here is a special, almost hidden, meta-like property stored statically on the `HappyFace()` function object. + +It could alternatively have been a plain old property like `__instance`, and the behavior would have been identical. The usage of a symbol simply improves the metaprogramming style, keeping this `INSTANCE` property set apart from any other normal properties. + +### Symbol Registry + +One mild downside to using symbols as in the last few examples is that the `EVT_LOGIN` and `INSTANCE` variables had to be stored in an outer scope (perhaps even the global scope), or otherwise somehow stored in a publicly available location, so that all parts of the code that need to use the symbols can access them. 
+ +To aid in organizing code with access to these symbols, you can create symbol values with the *global symbol registry*. For example: + +```js +const EVT_LOGIN = Symbol.for( "event.login" ); + +console.log( EVT_LOGIN ); // Symbol(event.login) +``` + +And: + +```js +function HappyFace() { + const INSTANCE = Symbol.for( "instance" ); + + if (HappyFace[INSTANCE]) return HappyFace[INSTANCE]; + + // .. + + return HappyFace[INSTANCE] = { .. }; +} +``` + +`Symbol.for(..)` looks in the global symbol registry to see if a symbol is already stored with the provided description text, and returns it if so. If not, it creates one to return. In other words, the global symbol registry treats symbol values, by description text, as singletons themselves. + +But that also means that any part of your application can retrieve the symbol from the registry using `Symbol.for(..)`, as long as the matching description name is used. + +Ironically, symbols are basically intended to replace the use of *magic strings* (arbitrary string values given special meaning) in your application. But you precisely use *magic* description string values to uniquely identify/locate them in the global symbol registry! + +To avoid accidental collisions, you'll probably want to make your symbol descriptions quite unique. One easy way of doing that is to include prefix/context/namespacing information in them. + +For example, consider a utility such as the following: + +```js +function extractValues(str) { + var key = Symbol.for( "extractValues.parse" ), + re = extractValues[key] || + /[^=&]+?=([^&]+?)(?=&|$)/g, + values = [], match; + + while (match = re.exec( str )) { + values.push( match[1] ); + } + + return values; +} +``` + +We use the magic string value `"extractValues.parse"` because it's quite unlikely that any other symbol in the registry would ever collide with that description. 
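Here's a small sketch contrasting registry symbols with ordinary ones. The `"myapp.example"` description is made up for illustration:

```js
var s1 = Symbol.for( "myapp.example" ),
	s2 = Symbol.for( "myapp.example" ),
	s3 = Symbol( "myapp.example" );

s1 === s2;		// true -- same registry entry
s1 === s3;		// false -- `Symbol(..)` never consults the registry

typeof s1;		// "symbol"
```

In other words, only `Symbol.for(..)` reads from or writes to the registry. A plain `Symbol(..)` call with the same description text always creates a fresh, unregistered symbol.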
+ +If a user of this utility wants to override the parsing regular expression, they can also use the symbol registry: + +```js +extractValues[Symbol.for( "extractValues.parse" )] = + /..some pattern../g; + +extractValues( "..some string.." ); +``` + +Aside from the assistance the symbol registry provides in globally storing these values, everything we're seeing here could have been done by just actually using the magic string `"extractValues.parse"` as the key, rather than the symbol. The improvements exist at the metaprogramming level more than the functional level. + +You may have occasion to use a symbol value that has been stored in the registry to look up what description text (key) it's stored under. For example, you may need to signal to another part of your application how to locate a symbol in the registry because you cannot pass the symbol value itself. + +You can retrieve a registered symbol's description text (key) using `Symbol.keyFor(..)`: + +```js +var s = Symbol.for( "something cool" ); + +var desc = Symbol.keyFor( s ); +console.log( desc ); // "something cool" + +// get the symbol from the registry again +var s2 = Symbol.for( desc ); + +s2 === s; // true +``` + +### Symbols as Object Properties + +If a symbol is used as a property/key of an object, it's stored in a special way so that the property will not show up in a normal enumeration of the object's properties: + +```js +var o = { + foo: 42, + [ Symbol( "bar" ) ]: "hello world", + baz: true +}; + +Object.getOwnPropertyNames( o ); // [ "foo","baz" ] +``` + +To retrieve an object's symbol properties: + +```js +Object.getOwnPropertySymbols( o ); // [ Symbol(bar) ] +``` + +This makes it clear that a property symbol is not actually hidden or inaccessible, as you can always see it in the `Object.getOwnPropertySymbols(..)` list. + +#### Built-In Symbols + +ES6 comes with a number of predefined built-in symbols that expose various meta behaviors on JavaScript object values. 
However, these symbols are *not* registered in the global symbol registry, as one might expect. + +Instead, they're stored as properties on the `Symbol` function object. For example, in the "`for..of`" section earlier in this chapter, we introduced the `Symbol.iterator` value: + +```js +var a = [1,2,3]; + +a[Symbol.iterator]; // native function +``` + +The specification uses the `@@` prefix notation to refer to the built-in symbols, the most common ones being: `@@iterator`, `@@toStringTag`, `@@toPrimitive`. Several others are defined as well, though they probably won't be used as often. + +**Note:** See "Well Known Symbols" in Chapter 7 for detailed information about how these built-in symbols are used for meta programming purposes. + +## Review + +ES6 adds a heap of new syntax forms to JavaScript, so there's plenty to learn! + +Most of these are designed to ease the pain points of common programming idioms, such as setting default values to function parameters and gathering the "rest" of the parameters into an array. Destructuring is a powerful tool for more concisely expressing assignments of values from arrays and nested objects. + +While features like `=>` arrow functions appear to also be all about shorter and nicer-looking syntax, they actually have very specific behaviors that you should intentionally use only in appropriate situations. + +Expanded Unicode support, new tricks for regular expressions, and even a new primitive `symbol` type round out the syntactic evolution of ES6. diff --git a/es6 & beyond/ch3.md b/es6 & beyond/ch3.md new file mode 100644 index 0000000..db34e0e --- /dev/null +++ b/es6 & beyond/ch3.md @@ -0,0 +1,2048 @@ +# You Don't Know JS: ES6 & Beyond +# Chapter 3: Organization + +It's one thing to write JS code, but it's another to properly organize it. Utilizing common patterns for organization and reuse goes a long way to improving the readability and understandability of your code. 
Remember: code is at least as much about communicating to other developers as it is about feeding the computer instructions. + +ES6 has several important features that help significantly improve these patterns, including: iterators, generators, modules, and classes. + +## Iterators + +An *iterator* is a structured pattern for pulling information from a source in one-at-a-time fashion. This pattern has been around programming for a long time. And to be sure, JS developers have been ad hoc designing and implementing iterators in JS programs since before anyone can remember, so it's not at all a new topic. + +What ES6 has done is introduce an implicit standardized interface for iterators. Many of the built-in data structures in JavaScript will now expose an iterator implementing this standard. And you can also construct your own iterators adhering to the same standard, for maximal interoperability. + +Iterators are a way of organizing ordered, sequential, pull-based consumption of data. + +For example, you may implement a utility that produces a new unique identifier each time it's requested. Or you may produce an infinite series of values that rotate through a fixed list, in round-robin fashion. Or you could attach an iterator to a database query result to pull out new rows one at a time. + +Although they have not commonly been used in JS in such a manner, iterators can also be thought of as controlling behavior one step at a time. This can be illustrated quite clearly when considering generators (see "Generators" later in this chapter), though you can certainly do the same without generators. 
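To give a taste of the round-robin idea before we get into the details, here's a rough sketch. The `roundRobin(..)` factory is hypothetical, it assumes a non-empty list, and it relies on the `next()` interface described in the next section:

```js
// hypothetical factory; assumes `list` has at least one entry
function roundRobin(list) {
	var idx = 0;

	return {
		// make the iterator an iterable
		[Symbol.iterator]() { return this; },

		next() {
			var value = list[idx];
			idx = (idx + 1) % list.length;	// wrap around
			return { value: value, done: false };
		}
	};
}

var servers = roundRobin( [ "a", "b", "c" ] ),
	picks = [];

for (var i = 0; i < 5; i++) {
	picks.push( servers.next().value );
}

picks;		// ["a","b","c","a","b"]
```

Because this iterator never reports `done: true`, the consumer decides when to stop pulling values.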
+ +### Interfaces + +At the time of this writing, ES6 section 25.1.1.2 (https://people.mozilla.org/~jorendorff/es6-draft.html#sec-iterator-interface) details the `Iterator` interface as having the following requirement: + +``` +Iterator [required] + next() {method}: retrieves next IteratorResult +``` + +There are two optional members that some iterators are extended with: + +``` +Iterator [optional] + return() {method}: stops iterator and returns IteratorResult + throw() {method}: signals error and returns IteratorResult +``` + +The `IteratorResult` interface is specified as: + +``` +IteratorResult + value {property}: current iteration value or final return value + (optional if `undefined`) + done {property}: boolean, indicates completion status +``` + +**Note:** I call these interfaces implicit not because they're not explicitly called out in the specification -- they are! -- but because they're not exposed as direct objects accessible to code. JavaScript does not, in ES6, support any notion of "interfaces," so adherence for your own code is purely conventional. However, wherever JS expects an iterator -- a `for..of` loop, for instance -- what you provide must adhere to these interfaces or the code will fail. + +There's also an `Iterable` interface, which describes objects that must be able to produce iterators: + +``` +Iterable + @@iterator() {method}: produces an Iterator +``` + +If you recall from "Built-In Symbols" in Chapter 2, `@@iterator` is the special built-in symbol representing the method that can produce iterator(s) for the object. + +#### IteratorResult + +The `IteratorResult` interface specifies that the return value from any iterator operation will be an object of the form: + +```js +{ value: .. , done: true / false } +``` + +Built-in iterators will always return values of this form, but more properties are, of course, allowed to be present on the return value, as necessary. 
+ +For example, a custom iterator may add additional metadata to the result object (e.g., where the data came from, how long it took to retrieve, cache expiration length, frequency for the appropriate next request, etc.). + +**Note:** Technically, `value` is optional if it would otherwise be considered absent or unset, such as in the case of the value `undefined`. Because accessing `res.value` will produce `undefined` whether it's present with that value or absent entirely, the presence/absence of the property is more an implementation detail or an optimization (or both), rather than a functional issue. + +### `next()` Iteration + +Let's look at an array, which is an iterable, and the iterator it can produce to consume its values: + +```js +var arr = [1,2,3]; + +var it = arr[Symbol.iterator](); + +it.next(); // { value: 1, done: false } +it.next(); // { value: 2, done: false } +it.next(); // { value: 3, done: false } + +it.next(); // { value: undefined, done: true } +``` + +Each time the method located at `Symbol.iterator` (see Chapters 2 and 7) is invoked on this `arr` value, it will produce a new fresh iterator. Most structures will do the same, including all the built-in data structures in JS. + +However, a structure like an event queue consumer might only ever produce a single iterator (singleton pattern). Or a structure might only allow one unique iterator at a time, requiring the current one to be completed before a new one can be created. + +The `it` iterator in the previous snippet doesn't report `done: true` when you receive the `3` value. You have to call `next()` again, in essence going beyond the end of the array's values, to get the complete signal `done: true`. It may not be clear why until later in this section, but that design decision will typically be considered a best practice. 
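A quick sketch of what "new fresh iterator" means in practice: two iterators produced from the same array keep their own independent positions.

```js
var arr = [1,2,3];

var it1 = arr[Symbol.iterator](),
	it2 = arr[Symbol.iterator]();

it1 === it2;		// false

it1.next();			// { value: 1, done: false }
it1.next();			// { value: 2, done: false }

// `it2` hasn't moved at all
var res = it2.next();

res;				// { value: 1, done: false }
```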
+ +Primitive string values are also iterables by default: + +```js +var greeting = "hello world"; + +var it = greeting[Symbol.iterator](); + +it.next(); // { value: "h", done: false } +it.next(); // { value: "e", done: false } +.. +``` + +**Note:** Technically, the primitive value itself isn't iterable, but thanks to "boxing", `"hello world"` is coerced/converted to its `String` object wrapper form, which *is* an iterable. See the *Types & Grammar* title of this series for more information. + +ES6 also includes several new data structures, called collections (see Chapter 5). These collections are not only iterables themselves, but they also provide API method(s) to generate an iterator, such as: + +```js +var m = new Map(); +m.set( "foo", 42 ); +m.set( { cool: true }, "hello world" ); + +var it1 = m[Symbol.iterator](); +var it2 = m.entries(); + +it1.next(); // { value: [ "foo", 42 ], done: false } +it2.next(); // { value: [ "foo", 42 ], done: false } +.. +``` + +The `next(..)` method of an iterator can optionally take one or more arguments. The built-in iterators mostly do not exercise this capability, though a generator's iterator definitely does (see "Generators" later in this chapter). + +By general convention, including all the built-in iterators, calling `next(..)` on an iterator that's already been exhausted is not an error, but will simply continue to return the result `{ value: undefined, done: true }`. + +### Optional: `return(..)` and `throw(..)` + +The optional methods on the iterator interface -- `return(..)` and `throw(..)` -- are not implemented on most of the built-in iterators. However, they definitely do mean something in the context of generators, so see "Generators" for more specific information. + +`return(..)` is defined as sending a signal to an iterator that the consuming code is complete and will not be pulling any more values from it. 
This signal can be used to notify the producer (the iterator responding to `next(..)` calls) to perform any cleanup it may need to do, such as releasing/closing network, database, or file handle resources. + +If an iterator has a `return(..)` present and any condition occurs that can automatically be interpreted as abnormal or early termination of consuming the iterator, `return(..)` will automatically be called. You can call `return(..)` manually as well. + +`return(..)` will return an `IteratorResult` object just like `next(..)` does. In general, the optional value you send to `return(..)` would be sent back as `value` in this `IteratorResult`, though there are nuanced cases where that might not be true. + +`throw(..)` is used to signal an exception/error to an iterator, which possibly may be used differently by the iterator than the completion signal implied by `return(..)`. It does not necessarily imply a complete stop of the iterator as `return(..)` generally does. + +For example, with generator iterators, `throw(..)` actually injects a thrown exception into the generator's paused execution context, which can be caught with a `try..catch`. An uncaught `throw(..)` exception would end up abnormally aborting the generator's iterator. + +**Note:** By general convention, an iterator should not produce any more results after having called `return(..)` or `throw(..)`. + +### Iterator Loop + +As we covered in the "`for..of`" section in Chapter 2, the ES6 `for..of` loop directly consumes a conforming iterable. + +If an iterator is also an iterable, it can be used directly with the `for..of` loop. You make an iterator an iterable by giving it a `Symbol.iterator` method that simply returns the iterator itself: + +```js +var it = { + // make the `it` iterator an iterable + [Symbol.iterator]() { return this; }, + + next() { .. }, + .. 
+}; + +it[Symbol.iterator]() === it; // true +``` + +Now we can consume the `it` iterator with a `for..of` loop: + +```js +for (var v of it) { + console.log( v ); +} +``` + +To fully understand how such a loop works, recall the `for` equivalent of a `for..of` loop from Chapter 2: + +```js +for (var v, res; (res = it.next()) && !res.done; ) { + v = res.value; + console.log( v ); +} +``` + +If you look closely, you'll see that `it.next()` is called before each iteration, and then `res.done` is consulted. If `res.done` is `true`, the expression evaluates to `false` and the iteration doesn't occur. + +Recall earlier that we suggested iterators should in general not return `done: true` along with the final intended value from the iterator. Now you can see why. + +If an iterator returned `{ done: true, value: 42 }`, the `for..of` loop would completely discard the `42` value and it'd be lost. For this reason, assuming that your iterator may be consumed by patterns like the `for..of` loop or its manual `for` equivalent, you should probably wait to return `done: true` for signaling completion until after you've already returned all relevant iteration values. + +**Warning:** You can, of course, intentionally design your iterator to return some relevant `value` at the same time as returning `done: true`. But don't do this unless you've documented that as the case, and thus implicitly forced consumers of your iterator to use a different pattern for iteration than is implied by `for..of` or its manual equivalent we depicted. + +### Custom Iterators + +In addition to the standard built-in iterators, you can make your own! All it takes to make them interoperate with ES6's consumption facilities (e.g., the `for..of` loop and the `...` operator) is to adhere to the proper interface(s). 
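
Before building anything elaborate, it's worth seeing why the details of that interface matter. Here's a minimal sketch, using a hypothetical `countdown(..)` helper (my own illustration, not a built-in), whose iterator deliberately returns its last value together with `done: true`. As warned in the previous section, `for..of` and `...` never see that value:

```javascript
// hypothetical helper, for illustration only
function countdown(from) {
	var n = from;

	return {
		// make the iterator an iterable
		[Symbol.iterator]() { return this; },

		next() {
			var value = n--;
			// the last value (`1`) is sent along with `done: true`
			return { value: value, done: value <= 1 };
		}
	};
}

var seen = [];
for (var v of countdown( 3 )) {
	seen.push( v );
}
seen;					// [3,2]

var spread = [ ...countdown( 3 ) ];
spread;					// [3,2]

// only manual iteration can observe the final value
var it = countdown( 3 );
it.next();				// { value: 3, done: false }
it.next();				// { value: 2, done: false }
var last = it.next();	// { value: 1, done: true }
```

If `1` is supposed to be part of the sequence, the safer design is to return it with `done: false` and signal completion on the following `next()` call.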
+ +Let's try constructing an iterator that produces the infinite series of numbers in the Fibonacci sequence: + +```js +var Fib = { + [Symbol.iterator]() { + var n1 = 1, n2 = 1; + + return { + // make the iterator an iterable + [Symbol.iterator]() { return this; }, + + next() { + var current = n2; + n2 = n1; + n1 = n1 + current; + return { value: current, done: false }; + }, + + return(v) { + console.log( + "Fibonacci sequence abandoned." + ); + return { value: v, done: true }; + } + }; + } +}; + +for (var v of Fib) { + console.log( v ); + + if (v > 50) break; +} +// 1 1 2 3 5 8 13 21 34 55 +// Fibonacci sequence abandoned. +``` + +**Warning:** If we hadn't inserted the `break` condition, this `for..of` loop would have run forever, which is probably not the desired result in terms of breaking your program! + +The `Fib[Symbol.iterator]()` method when called returns the iterator object with `next()` and `return(..)` methods on it. State is maintained via `n1` and `n2` variables, which are kept by the closure. + +Let's *next* consider an iterator that is designed to run through a series (aka a queue) of actions, one item at a time: + +```js +var tasks = { + [Symbol.iterator]() { + var steps = this.actions.slice(); + + return { + // make the iterator an iterable + [Symbol.iterator]() { return this; }, + + next(...args) { + if (steps.length > 0) { + let res = steps.shift()( ...args ); + return { value: res, done: false }; + } + else { + return { done: true } + } + }, + + return(v) { + steps.length = 0; + return { value: v, done: true }; + } + }; + }, + actions: [] +}; +``` + +The iterator on `tasks` steps through functions found in the `actions` array property, if any, and executes them one at a time, passing in whatever arguments you pass to `next(..)`, and returning any return value to you in the standard `IteratorResult` object. 
+ +Here's how we could use this `tasks` queue: + +```js +tasks.actions.push( + function step1(x){ + console.log( "step 1:", x ); + return x * 2; + }, + function step2(x,y){ + console.log( "step 2:", x, y ); + return x + (y * 2); + }, + function step3(x,y,z){ + console.log( "step 3:", x, y, z ); + return (x * y) + z; + } +); + +var it = tasks[Symbol.iterator](); + +it.next( 10 ); // step 1: 10 + // { value: 20, done: false } + +it.next( 20, 50 ); // step 2: 20 50 + // { value: 120, done: false } + +it.next( 20, 50, 120 ); // step 3: 20 50 120 + // { value: 1120, done: false } + +it.next(); // { done: true } +``` + +This particular usage reinforces that iterators can be a pattern for organizing functionality, not just data. It's also reminiscent of what we'll see with generators in the next section. + +You could even get creative and define an iterator that represents meta operations on a single piece of data. For example, we could define an iterator for numbers that by default ranges from `0` up to (or down to, for negative numbers) the number in question. + +Consider: + +```js +if (!Number.prototype[Symbol.iterator]) { + Object.defineProperty( + Number.prototype, + Symbol.iterator, + { + writable: true, + configurable: true, + enumerable: false, + value: function iterator(){ + var i, inc, done = false, top = +this; + + // iterate positively or negatively? + inc = 1 * (top < 0 ? -1 : 1); + + return { + // make the iterator itself an iterable! + [Symbol.iterator](){ return this; }, + + next() { + if (!done) { + // initial iteration always 0 + if (i == null) { + i = 0; + } + // iterating positively + else if (top >= 0) { + i = Math.min(top,i + inc); + } + // iterating negatively + else { + i = Math.max(top,i + inc); + } + + // done after this iteration? + if (i == top) done = true; + + return { value: i, done: false }; + } + else { + return { done: true }; + } + } + }; + } + } + ); +} +``` + +Now, what tricks does this creativity afford us? 
+ +```js +for (var i of 3) { + console.log( i ); +} +// 0 1 2 3 + +[...-3]; // [0,-1,-2,-3] +``` + +Those are some fun tricks, though the practical utility is somewhat debatable. But then again, one might wonder why ES6 didn't just ship with such a minor but delightful feature easter egg!? + +I'd be remiss if I didn't at least remind you that extending native prototypes as I'm doing in the previous snippet is something you should only do with caution and awareness of potential hazards. + +In this case, the chances that you'll have a collision with other code or even a future JS feature are probably exceedingly low. But just beware of the slight possibility. And document what you're doing verbosely for posterity's sake. + +**Note:** I've expounded on this particular technique in this blog post (http://blog.getify.com/iterating-es6-numbers/) if you want more details. And this comment (http://blog.getify.com/iterating-es6-numbers/comment-page-1/#comment-535294) even suggests a similar trick but for making string character ranges. + +### Iterator Consumption + +We've already shown consuming an iterator item by item with the `for..of` loop. But there are other ES6 structures that can consume iterators. + +Let's consider the iterator attached to this array (though any iterator we choose would have the following behaviors): + +```js +var a = [1,2,3,4,5]; +``` + +The `...` spread operator fully exhausts an iterator. 
Consider: + +```js +function foo(x,y,z,w,p) { + console.log( x + y + z + w + p ); +} + +foo( ...a ); // 15 +``` + +`...` can also spread an iterator inside an array: + +```js +var b = [ 0, ...a, 6 ]; +b; // [0,1,2,3,4,5,6] +``` + +Array destructuring (see "Destructuring" in Chapter 2) can partially or completely (if paired with a `...` rest/gather operator) consume an iterator: + +```js +var it = a[Symbol.iterator](); + +var [x,y] = it; // take just the first two elements from `it` +var [z, ...w] = it; // take the third, then the rest all at once + +// is `it` fully exhausted? Yep. +it.next(); // { value: undefined, done: true } + +x; // 1 +y; // 2 +z; // 3 +w; // [4,5] +``` + +## Generators + +All functions run to completion, right? In other words, once a function starts running, it finishes before anything else can interrupt. + +At least that's how it's been for the whole history of JavaScript up to this point. As of ES6, a new somewhat exotic form of function is being introduced, called a generator. A generator can pause itself in mid-execution, and can be resumed either right away or at a later time. So it clearly does not hold the run-to-completion guarantee that normal functions do. + +Moreover, each pause/resume cycle in mid-execution is an opportunity for two-way message passing, where the generator can return a value, and the controlling code that resumes it can send a value back in. + +As with iterators in the previous section, there are multiple ways to think about what a generator is, or rather what it's most useful for. There's no one right answer, but we'll try to consider several angles. + +**Note:** See the *Async & Performance* title of this series for more information about generators, and also see Chapter 4 of this current title. + +### Syntax + +The generator function is declared with this new syntax: + +```js +function *foo() { + // .. +} +``` + +The position of the `*` is not functionally relevant. 
The same declaration could be written as any of the following: + +```js +function *foo() { .. } +function* foo() { .. } +function * foo() { .. } +function*foo() { .. } +.. +``` + +The *only* difference here is stylistic preference. Most other literature seems to prefer `function* foo(..) { .. }`. I prefer `function *foo(..) { .. }`, so that's how I'll present them for the rest of this title. + +My reason is purely didactic in nature. In this text, when referring to a generator function, I will use `*foo(..)`, as opposed to `foo(..)` for a normal function. I observe that `*foo(..)` more closely matches the `*` positioning of `function *foo(..) { .. }`. + +Moreover, as we saw in Chapter 2 with concise methods, there's a concise generator form in object literals: + +```js +var a = { + *foo() { .. } +}; +``` + +I would say that with concise generators, `*foo() { .. }` is rather more natural than `* foo() { .. }`. So that further argues for matching the consistency with `*foo()`. + +Consistency eases understanding and learning. + +#### Executing a Generator + +Though a generator is declared with `*`, you still execute it like a normal function: + +```js +foo(); +``` + +You can still pass it arguments, as in: + +```js +function *foo(x,y) { + // .. +} + +foo( 5, 10 ); +``` + +The major difference is that executing a generator, like `foo(5,10)`, doesn't actually run the code in the generator. Instead, it produces an iterator that will control the generator to execute its code. + +We'll come back to this later in "Iterator Control," but briefly: + +```js +function *foo() { + // .. +} + +var it = foo(); + +// to start/advance `*foo()`, call +// `it.next(..)` +``` + +#### `yield` + +Generators also have a new keyword you can use inside them, to signal the pause point: `yield`. 
Consider: + +```js +function *foo() { + var x = 10; + var y = 20; + + yield; + + var z = x + y; +} +``` + +In this `*foo()` generator, the operations on the first two lines would run at the beginning, then `yield` would pause the generator. If and when resumed, the last line of `*foo()` would run. `yield` can appear any number of times (or not at all, technically!) in a generator. + +You can even put `yield` inside a loop, and it can represent a repeated pause point. In fact, a loop that never completes just means a generator that never completes, which is completely valid, and sometimes entirely what you need. + +`yield` is not just a pause point. It's an expression that sends out a value when pausing the generator. Here's a `while..true` loop in a generator that for each iteration `yield`s a new random number: + +```js +function *foo() { + while (true) { + yield Math.random(); + } +} +``` + +The `yield ..` expression not only sends a value -- `yield` without a value is the same as `yield undefined` -- but also receives (e.g., is replaced by) the eventual resumption value. Consider: + +```js +function *foo() { + var x = yield 10; + console.log( x ); +} +``` + +This generator will first `yield` out the value `10` when pausing itself. When you resume the generator -- using the `it.next(..)` we referred to earlier -- whatever value (if any) you resume with will replace/complete the whole `yield 10` expression, meaning that value will be assigned to the `x` variable. + +A `yield ..` expression can appear anywhere a normal expression can. For example: + +```js +function *foo() { + var arr = [ yield 1, yield 2, yield 3 ]; + console.log( arr, yield 4 ); +} +``` + +`*foo()` here has four `yield ..` expressions. Each `yield` results in the generator pausing to wait for a resumption value that's then used in the various expression contexts. + +`yield` is not technically an operator, though when used like `yield 1` it sure looks like it. 
Because `yield` can be used all by itself as in `var x = yield;`, thinking of it as an operator can sometimes be confusing. + +Technically, `yield ..` is of the same "expression precedence" -- similar conceptually to operator precedence -- as an assignment expression like `a = 3`. That means `yield ..` can basically appear anywhere `a = 3` can validly appear. + +Let's illustrate the symmetry: + +```js +var a, b; + +a = 3; // valid +b = 2 + a = 3; // invalid +b = 2 + (a = 3); // valid + +yield 3; // valid +a = 2 + yield 3; // invalid +a = 2 + (yield 3); // valid +``` + +**Note:** If you think about it, it makes a sort of conceptual sense that a `yield ..` expression would behave similar to an assignment expression. When a paused `yield` expression is resumed, it's completed/replaced by the resumption value in a way that's not terribly dissimilar from being "assigned" that value. + +The takeaway: if you need `yield ..` to appear in a position where an assignment like `a = 3` would not itself be allowed, it needs to be wrapped in a `( )`. + +Because of the low precedence of the `yield` keyword, almost any expression after a `yield ..` will be computed first before being sent with `yield`. Only the `...` spread operator and the `,` comma operator have lower precedence, meaning they'd bind after the `yield` has been evaluated. + +So just like with multiple operators in normal statements, another case where `( )` might be needed is to override (elevate) the low precedence of `yield`, such as the difference between these expressions: + +```js +yield 2 + 3; // same as `yield (2 + 3)` + +(yield 2) + 3; // `yield 2` first, then `+ 3` +``` + +Just like `=` assignment, `yield` is also "right-associative," which means that multiple `yield` expressions in succession are treated as having been `( .. )` grouped from right to left. So, `yield yield yield 3` is treated as `yield (yield (yield 3))`. 
A "left-associative" interpretation like `((yield) yield) yield 3` would make no sense. + +Just like with operators, it's a good idea to use `( .. )` grouping, even if not strictly required, to disambiguate your intent if `yield` is combined with other operators or `yield`s. + +**Note:** See the *Types & Grammar* title of this series for more information about operator precedence and associativity. + +#### `yield *` + +In the same way that the `*` makes a `function` declaration into `function *` generator declaration, a `*` makes `yield` into `yield *`, which is a very different mechanism, called *yield delegation*. Grammatically, `yield *..` will behave the same as a `yield ..`, as discussed in the previous section. + +`yield * ..` requires an iterable; it then invokes that iterable's iterator, and delegates its own host generator's control to that iterator until it's exhausted. Consider: + +```js +function *foo() { + yield *[1,2,3]; +} +``` + +**Note:** As with the `*` position in a generator's declaration (discussed earlier), the `*` positioning in `yield *` expressions is stylistically up to you. Most other literature prefers `yield* ..`, but I prefer `yield *..`, for very symmetrical reasons as already discussed. + +The `[1,2,3]` value produces an iterator that will step through its values, so the `*foo()` generator will yield those values out as it's consumed. Another way to illustrate the behavior is in yield delegating to another generator: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} + +function *bar() { + yield *foo(); +} +``` + +The iterator produced when `*bar()` calls `*foo()` is delegated to via `yield *`, meaning whatever value(s) `*foo()` produces will be produced by `*bar()`. + +Whereas with `yield ..` the completion value of the expression comes from resuming the generator with `it.next(..)`, the completion value of the `yield *..` expression comes from the return value (if any) from the delegated-to iterator. 
+ +Built-in iterators generally don't have return values, as we covered at the end of the "Iterator Loop" section earlier in this chapter. But if you define your own custom iterator (or generator), you can design it to `return` a value, which `yield *..` would capture: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; + return 4; +} + +function *bar() { + var x = yield *foo(); + console.log( "x:", x ); +} + +for (var v of bar()) { + console.log( v ); +} +// 1 2 3 +// x: 4 +``` + +While the `1`, `2`, and `3` values are `yield`ed out of `*foo()` and then out of `*bar()`, the `4` value returned from `*foo()` is the completion value of the `yield *foo()` expression, which then gets assigned to `x`. + +Because `yield *` can call another generator (by way of delegating to its iterator), it can also perform a sort of generator recursion by calling itself: + +```js +function *foo(x) { + if (x < 3) { + x = yield *foo( x + 1 ); + } + return x * 2; +} + +foo( 1 ); +``` + +Calling `foo(1)` and then the iterator's `next()` to run it through its recursive steps produces a result of `24`. The first `*foo(..)` run has `x` at value `1`, which is `x < 3`. `x + 1` is passed recursively to `*foo(..)`, so `x` is then `2`. One more recursive call results in `x` of `3`. + +Now, because `x < 3` fails, the recursion stops, and `return 3 * 2` gives `6` back to the previous call's `yield *..` expression, which is then assigned to `x`. Another `return 6 * 2` returns `12` back to the previous call's `x`. Finally `12 * 2`, or `24`, is returned from the completed run of the `*foo(..)` generator. + +### Iterator Control + +Earlier, we briefly introduced the concept that generators are controlled by iterators. Let's fully dig into that now. + +Recall the recursive `*foo(..)` from the previous section. 
Here's how we'd run it: + +```js +function *foo(x) { + if (x < 3) { + x = yield *foo( x + 1 ); + } + return x * 2; +} + +var it = foo( 1 ); +it.next(); // { value: 24, done: true } +``` + +In this case, the generator doesn't really ever pause, as there's no `yield ..` expression. Instead, `yield *` just keeps the current iteration step going via the recursive call. So, just one call to the iterator's `next()` function fully runs the generator. + +Now let's consider a generator that will have multiple steps and thus multiple produced values: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} +``` + +We already know we can consume an iterator, even one attached to a generator like `*foo()`, with a `for..of` loop: + +```js +for (var v of foo()) { + console.log( v ); +} +// 1 2 3 +``` + +**Note:** The `for..of` loop requires an iterable. A generator function reference (like `foo`) by itself is not an iterable; you must execute it with `foo()` to get the iterator (which is also an iterable, as we explained earlier in this chapter). You could theoretically extend the `GeneratorPrototype` (the prototype of all generator functions) with a `Symbol.iterator` function that essentially just does `return this()`. That would make the `foo` reference itself an iterable, which means `for (var v of foo) { .. }` (notice no `()` on `foo`) will work. + +Let's instead iterate the generator manually: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} + +var it = foo(); + +it.next(); // { value: 1, done: false } +it.next(); // { value: 2, done: false } +it.next(); // { value: 3, done: false } + +it.next(); // { value: undefined, done: true } +``` + +If you look closely, there are three `yield` statements and four `next()` calls. That may seem like a strange mismatch. In fact, there will always be one more `next()` call than `yield` expression, assuming all are evaluated and the generator is fully run to completion. 
+ +But if you look at it from the opposite perspective (inside-out instead of outside-in), the matching between `yield` and `next()` makes more sense. + +Recall that the `yield ..` expression will be completed by the value you resume the generator with. That means the argument you pass to `next(..)` completes whatever `yield ..` expression is currently paused waiting for a completion. + +Let's illustrate this perspective this way: + +```js +function *foo() { + var x = yield 1; + var y = yield 2; + var z = yield 3; + console.log( x, y, z ); +} +``` + +In this snippet, each `yield ..` is sending a value out (`1`, `2`, `3`), but more directly, it's pausing the generator to wait for a value. In other words, it's almost like asking the question, "What value should I use here? I'll wait to hear back." + +Now, here's how we control `*foo()` to start it up: + +```js +var it = foo(); + +it.next(); // { value: 1, done: false } +``` + +That first `next()` call is starting up the generator from its initial paused state, and running it to the first `yield`. At the moment you call that first `next()`, there's no `yield ..` expression waiting for a completion. If you passed a value to that first `next()` call, it would currently just be thrown away, because no `yield` is waiting to receive such a value. + +**Note:** An early proposal for the "beyond ES6" timeframe *would* let you access a value passed to an initial `next(..)` call via a separate meta property (see Chapter 7) inside the generator. + +Now, let's answer the currently pending question, "What value should I assign to `x`?" We'll answer it by sending a value to the *next* `next(..)` call: + +```js +it.next( "foo" ); // { value: 2, done: false } +``` + +Now, the `x` will have the value `"foo"`, but we've also asked a new question, "What value should I assign to `y`?" And we answer: + +```js +it.next( "bar" ); // { value: 3, done: false } +``` + +Answer given, another question asked. 
Final answer: + +```js +it.next( "baz" ); // "foo" "bar" "baz" + // { value: undefined, done: true } +``` + +Now it should be clearer how each `yield ..` "question" is answered by the *next* `next(..)` call, and so the "extra" `next()` call we observed is always just the initial one that starts everything going. + +Let's put all those steps together: + +```js +var it = foo(); + +// start up the generator +it.next(); // { value: 1, done: false } + +// answer first question +it.next( "foo" ); // { value: 2, done: false } + +// answer second question +it.next( "bar" ); // { value: 3, done: false } + +// answer third question +it.next( "baz" ); // "foo" "bar" "baz" + // { value: undefined, done: true } +``` + +You can think of a generator as a producer of values, in which case each iteration is simply producing a value to be consumed. + +But in a more general sense, perhaps it's appropriate to think of generators as controlled, progressive code execution, much like the `tasks` queue example from the earlier "Custom Iterators" section. + +**Note:** That perspective is exactly the motivation for how we'll revisit generators in Chapter 4. Specifically, there's no reason that `next(..)` has to be called right away after the previous `next(..)` finishes. While the generator's inner execution context is paused, the rest of the program continues unblocked, including the ability for asynchronous actions to control when the generator is resumed. + +### Early Completion + +As we covered earlier in this chapter, the iterator attached to a generator supports the optional `return(..)` and `throw(..)` methods. Both of them have the effect of aborting a paused generator immediately. 
+ +Consider: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} + +var it = foo(); + +it.next(); // { value: 1, done: false } + +it.return( 42 ); // { value: 42, done: true } + +it.next(); // { value: undefined, done: true } +``` + +`return(x)` is kind of like forcing a `return x` to be processed at exactly that moment, such that you get the specified value right back. Once a generator is completed, either normally or early as shown, it no longer processes any code or returns any values. + +In addition to `return(..)` being callable manually, it's also called automatically by the ES6 constructs that consume iterators, such as the `for..of` loop and array destructuring, whenever they stop before the iterator is exhausted (for example, a `break` or an uncaught exception inside a `for..of` loop). When iteration runs all the way to completion, no `return(..)` call is needed, because the generator has already finished on its own. + +The purpose for this capability is so the generator can be notified if the controlling code is no longer going to iterate over it, so that it can perhaps do any cleanup tasks (freeing up resources, resetting status, etc.). Identical to a normal function cleanup pattern, the main way to accomplish this is to use a `finally` clause: + +```js +function *foo() { + try { + yield 1; + yield 2; + yield 3; + } + finally { + console.log( "cleanup!" ); + } +} + +for (var v of foo()) { + console.log( v ); +} +// 1 2 3 +// cleanup! + +var it = foo(); + +it.next(); // { value: 1, done: false } +it.return( 42 ); // cleanup! + // { value: 42, done: true } +``` + +**Warning:** Do not put a `yield` statement inside the `finally` clause! It's valid and legal, but it's a really terrible idea. It acts in a sense as deferring the completion of the `return(..)` call you made, as any `yield ..` expressions in the `finally` clause are respected to pause and send messages; you don't immediately get a completed generator as expected. There's basically no good reason to opt in to that crazy *bad part*, so avoid doing so! 
+ +In addition to the previous snippet showing how `return(..)` aborts the generator while still triggering the `finally` clause, it also demonstrates that a generator produces a whole new iterator each time it's called. In fact, you can use multiple iterators attached to the same generator concurrently: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} + +var it1 = foo(); +it1.next(); // { value: 1, done: false } +it1.next(); // { value: 2, done: false } + +var it2 = foo(); +it2.next(); // { value: 1, done: false } + +it1.next(); // { value: 3, done: false } + +it2.next(); // { value: 2, done: false } +it2.next(); // { value: 3, done: false } + +it2.next(); // { value: undefined, done: true } +it1.next(); // { value: undefined, done: true } +``` + +#### Early Abort + +Instead of calling `return(..)`, you can call `throw(..)`. Just like `return(x)` is essentially injecting a `return x` into the generator at its current pause point, calling `throw(x)` is essentially like injecting a `throw x` at the pause point. + +Other than the exception behavior (we cover what that means to `try` clauses in the next section), `throw(..)` produces the same sort of early completion that aborts the generator's run at its current pause point. For example: + +```js +function *foo() { + yield 1; + yield 2; + yield 3; +} + +var it = foo(); + +it.next(); // { value: 1, done: false } + +try { + it.throw( "Oops!" ); +} +catch (err) { + console.log( err ); // Oops! +} + +it.next(); // { value: undefined, done: true } +``` + +Because `throw(..)` basically injects a `throw ..` in replacement of the `yield 1` line of the generator, and nothing handles this exception, it immediately propagates back out to the calling code, which handles it with a `try..catch`. + +Unlike `return(..)`, the iterator's `throw(..)` method is never called automatically. 
+ +Of course, though not shown in the previous snippet, if a `try..finally` clause was waiting inside the generator when you call `throw(..)`, the `finally` clause would be given a chance to complete before the exception is propagated back to the calling code. + +### Error Handling + +As we've already hinted, error handling with generators can be expressed with `try..catch`, which works in both inbound and outbound directions: + +```js +function *foo() { + try { + yield 1; + } + catch (err) { + console.log( err ); + } + + yield 2; + + throw "Hello!"; +} + +var it = foo(); + +it.next(); // { value: 1, done: false } + +try { + it.throw( "Hi!" ); // Hi! + // { value: 2, done: false } + it.next(); + + console.log( "never gets here" ); +} +catch (err) { + console.log( err ); // Hello! +} +``` + +Errors can also propagate in both directions through `yield *` delegation: + +```js +function *foo() { + try { + yield 1; + } + catch (err) { + console.log( err ); + } + + yield 2; + + throw "foo: e2"; +} + +function *bar() { + try { + yield *foo(); + + console.log( "never gets here" ); + } + catch (err) { + console.log( err ); + } +} + +var it = bar(); + +try { + it.next(); // { value: 1, done: false } + + it.throw( "e1" ); // e1 + // { value: 2, done: false } + + it.next(); // foo: e2 + // { value: undefined, done: true } +} +catch (err) { + console.log( "never gets here" ); +} + +it.next(); // { value: undefined, done: true } +``` + +When `*foo()` calls `yield 1`, the `1` value passes through `*bar()` untouched, as we've already seen. + +But what's most interesting about this snippet is that when `*foo()` calls `throw "foo: e2"`, this error propagates to `*bar()` and is immediately caught by `*bar()`'s `try..catch` block. The error doesn't pass through `*bar()` like the `1` value did. 
+ +`*bar()`'s `catch` then does a normal output of `err` (`"foo: e2"`) and then `*bar()` finishes normally, which is why the `{ value: undefined, done: true }` iterator result comes back from `it.next()`. + +If `*bar()` didn't have a `try..catch` around the `yield *..` expression, the error would of course propagate all the way out, and on the way through it still would complete (abort) `*bar()`. + +### Transpiling a Generator + +Is it possible to represent a generator's capabilities prior to ES6? It turns out it is, and there are several great tools that do so, including most notably Facebook's Regenerator tool (https://facebook.github.io/regenerator/). + +But just to better understand generators, let's try our hand at manually converting. Basically, we're going to create a simple closure-based state machine. + +We'll keep our source generator really simple: + +```js +function *foo() { + var x = yield 42; + console.log( x ); +} +``` + +To start, we'll need a function called `foo()` that we can execute, which needs to return an iterator: + +```js +function foo() { + // .. + + return { + next: function(v) { + // .. + } + + // we'll skip `return(..)` and `throw(..)` + }; +} +``` + +Now, we need some inner variable to keep track of where we are in the steps of our "generator"'s logic. We'll call it `state`. There will be three states: `0` initially, `1` while waiting to fulfill the `yield` expression, and `2` once the generator is complete. + +Each time `next(..)` is called, we need to process the next step, and then increment `state`. For convenience, we'll put each step into a `case` clause of a `switch` statement, and we'll hold that in an inner function called `nextState(..)` that `next(..)` can call. Also, because `x` is a variable across the overall scope of the "generator," it needs to live outside the `nextState(..)` function. 
+ +Here it is all together (obviously somewhat simplified, to keep the conceptual illustration clearer): + +```js +function foo() { + function nextState(v) { + switch (state) { + case 0: + state++; + + // the `yield` expression + return 42; + case 1: + state++; + + // `yield` expression fulfilled + x = v; + console.log( x ); + + // the implicit `return` + return undefined; + + // no need to handle state `2` + } + } + + var state = 0, x; + + return { + next: function(v) { + var ret = nextState( v ); + + return { value: ret, done: (state == 2) }; + } + + // we'll skip `return(..)` and `throw(..)` + }; +} +``` + +And finally, let's test our pre-ES6 "generator": + +```js +var it = foo(); + +it.next(); // { value: 42, done: false } + +it.next( 10 ); // 10 + // { value: undefined, done: true } +``` + +Not bad, huh? Hopefully this exercise solidifies in your mind that generators are actually just simple syntax for state machine logic. That makes them widely applicable. + +### Generator Uses + +So, now that we much more deeply understand how generators work, what are they useful for? + +We've seen two major patterns: + +* *Producing a series of values:* This usage can be simple (e.g., random strings or incremented numbers), or it can represent more structured data access (e.g., iterating over rows returned from a database query). + + Either way, we use the iterator to control a generator so that some logic can be invoked for each call to `next(..)`. Normal iterators on data structures merely pull values without any controlling logic. +* *Queue of tasks to perform serially:* This usage often represents flow control for the steps in an algorithm, where each step requires retrieval of data from some external source. The fulfillment of each piece of data may be immediate, or may be asynchronously delayed. + + From the perspective of the code inside the generator, the details of sync or async at a `yield` point are entirely opaque. 
Moreover, these details are intentionally abstracted away, so as not to obscure the natural sequential expression of steps with implementation complications. Abstraction also means the implementations can be swapped/refactored, often without touching the code in the generator at all. + +When generators are viewed in light of these uses, they become a lot more than just a different or nicer syntax for a manual state machine. They are a powerful abstraction tool for organizing and controlling orderly production and consumption of data. + +## Modules + +I don't think it's an exaggeration to suggest that the single most important code organization pattern in all of JavaScript is, and always has been, the module. For myself, and I think for a large cross-section of the community, the module pattern drives the vast majority of code. + +### The Old Way + +The traditional module pattern is based on an outer function with inner variables and functions, and a returned "public API" with methods that have closure over the inner data and capabilities. It's often expressed like this: + +```js +function Hello(name) { + function greeting() { + console.log( "Hello " + name + "!" ); + } + + // public API + return { + greeting: greeting + }; +} + +var me = Hello( "Kyle" ); +me.greeting(); // Hello Kyle! +``` + +This `Hello(..)` module can produce multiple instances by being called multiple times. Sometimes, a module is only needed as a singleton (i.e., it just needs one instance), in which case a slight variation on the previous snippet, using an IIFE, is common: + +```js +var me = (function Hello(name){ + function greeting() { + console.log( "Hello " + name + "!" ); + } + + // public API + return { + greeting: greeting + }; +})( "Kyle" ); + +me.greeting(); // Hello Kyle! +``` + +This pattern is tried and tested. It's also flexible enough to have a wide assortment of variations for a number of different scenarios. 
+ +One of the most common is the Asynchronous Module Definition (AMD), and another is the Universal Module Definition (UMD). We won't cover the particulars of these patterns and techniques here, but they're explained extensively in many places online. + +### Moving Forward + +As of ES6, we no longer need to rely on the enclosing function and closure to provide us with module support. ES6 modules have first class syntactic and functional support. + +Before we get into the specific syntax, it's important to understand some fairly significant conceptual differences with ES6 modules compared to how you may have dealt with modules in the past: + +* ES6 uses file-based modules, meaning one module per file. At this time, there is no standardized way of combining multiple modules into a single file. + + That means that if you are going to load ES6 modules directly into a browser web application, you will be loading them individually, not as a large bundle in a single file as has been common in performance optimization efforts. + + It's expected that the contemporaneous advent of HTTP/2 will significantly mitigate any such performance concerns, as it operates on a persistent socket connection and thus can very efficiently load many smaller files in parallel and interleaved with one another. +* The API of an ES6 module is static. That is, you define statically what all the top-level exports are on your module's public API, and those cannot be amended later. + + Some uses are accustomed to being able to provide dynamic API definitions, where methods can be added/removed/replaced in response to runtime conditions. Either these uses will have to change to fit with ES6 static APIs, or they will have to restrain the dynamic changes to properties/methods of a second-level object. +* ES6 modules are singletons. That is, there's only one instance of the module, which maintains its state. 
Every time you import that module into another module, you get a reference to the one centralized instance. If you want to be able to produce multiple module instances, your module will need to provide some sort of factory to do it. +* The properties and methods you expose on a module's public API are not just normal assignments of values or references. They are actual bindings (almost like pointers) to the identifiers in your inner module definition. + + In pre-ES6 modules, if you put a property on your public API that holds a primitive value like a number or string, that property assignment was by value-copy, and any internal update of a corresponding variable would be separate and not affect the public copy on the API object. + + With ES6, exporting a local private variable, even if it currently holds a primitive string/number/etc, exports a binding to the variable. If the module changes the variable's value, the external import binding now resolves to that new value. +* Importing a module is the same thing as statically requesting it to load (if it hasn't already). If you're in a browser, that implies a blocking load over the network. If you're on a server (i.e., Node.js), it's a blocking load from the filesystem. + + However, don't panic about the performance implications. Because ES6 modules have static definitions, the import requirements can be statically scanned, and loads will happen preemptively, even before you've used the module. + + ES6 doesn't actually specify or handle the mechanics of how these load requests work. There's a separate notion of a Module Loader, where each hosting environment (browser, Node.js, etc.) provides a default Loader appropriate to the environment. The importing of a module uses a string value to represent where to get the module (URL, file path, etc.), but this value is opaque in your program and only meaningful to the Loader itself. 
+ + You can define your own custom Loader if you want more fine-grained control than the default Loader affords -- which is basically none, as it's totally hidden from your program's code. + +As you can see, ES6 modules will serve the overall use case of organizing code with encapsulation, controlling public APIs, and referencing dependency imports. But they have a very particular way of doing so, and that may or may not fit very closely with how you've already been doing modules for years. + +#### CommonJS + +There's a similar, but not fully compatible, module syntax called CommonJS, which is familiar to those in the Node.js ecosystem. + +For lack of a more tactful way to say this, in the long run, ES6 modules essentially are bound to supersede all previous formats and standards for modules, even CommonJS, as they are built on syntactic support in the language. This will, in time, inevitably win out as the superior approach, if for no other reason than ubiquity. + +We face a fairly long road to get to that point, though. There are literally hundreds of thousands of CommonJS style modules in the server-side JavaScript world, and 10 times that many modules of varying format standards (UMD, AMD, ad hoc) in the browser world. It will take many years for the transitions to make any significant progress. + +In the interim, module transpilers/converters will be an absolute necessity. You might as well just get used to that new reality. Whether you author in regular modules, AMD, UMD, CommonJS, or ES6, these tools will have to parse and convert to a format that is suitable for whatever environment your code will run in. + +For Node.js, that probably means (for now) that the target is CommonJS. For the browser, it's probably UMD or AMD. Expect lots of flux on this over the next few years as these tools mature and best practices emerge. 
+ +From here on out, my best advice on modules is this: whatever format you've been religiously attached to with strong affinity, also develop an appreciation for and understanding of ES6 modules, such as they are, and let your other module tendencies fade. They *are* the future of modules in JS, even if that reality is a bit of a ways off. + +### The New Way + +The two main new keywords that enable ES6 modules are `import` and `export`. There's lots of nuance to the syntax, so let's take a deeper look. + +**Warning:** An important detail that's easy to overlook: both `import` and `export` must always appear in the top-level scope of their respective usage. For example, you cannot put either an `import` or `export` inside an `if` conditional; they must appear outside of all blocks and functions. + +#### `export`ing API Members + +The `export` keyword is either put in front of a declaration, or used as an operator (of sorts) with a special list of bindings to export. Consider: + +```js +export function foo() { + // .. +} + +export var awesome = 42; + +var bar = [1,2,3]; +export { bar }; +``` + +Another way of expressing the same exports: + +```js +function foo() { + // .. +} + +var awesome = 42; +var bar = [1,2,3]; + +export { foo, awesome, bar }; +``` + +These are all called *named exports*, as you are in effect exporting the name bindings of the variables/functions/etc. + +Anything you don't *label* with `export` stays private inside the scope of the module. That is, although something like `var bar = ..` looks like it's declaring at the top-level global scope, the top-level scope is actually the module itself; there is no global scope in modules. + +**Note:** Modules *do* still have access to `window` and all the "globals" that hang off it, just not as lexical top-level scope. However, you really should stay away from the globals in your modules if at all possible. 
+ +You can also "rename" (aka alias) a module member during named export: + +```js +function foo() { .. } + +export { foo as bar }; +``` + +When this module is imported, only the `bar` member name is available to import; `foo` stays hidden inside the module. + +Module exports are not just normal assignments of values or references, as you're accustomed to with the `=` assignment operator. Actually, when you export something, you're exporting a binding (kinda like a pointer) to that thing (variable, etc.). + +Within your module, if you change the value of a variable you already exported a binding to, even if it's already been imported (see the next section), the imported binding will resolve to the current (updated) value. + +Consider: + +```js +var awesome = 42; +export { awesome }; + +// later +awesome = 100; +``` + +When this module is imported, regardless of whether that's before or after the `awesome = 100` setting, once that assignment has happened, the imported binding resolves to the `100` value, not `42`. + +That's because the binding is, in essence, a reference to, or a pointer to, the `awesome` variable itself, rather than a copy of its value. This is a mostly unprecedented concept for JS introduced with ES6 module bindings. + +Though you can clearly use `export` multiple times inside a module's definition, ES6 definitely prefers the approach that a module has a single export, which is known as a *default export*. In the words of some members of the TC39 committee, you're "rewarded with simpler `import` syntax" if you follow that pattern, and conversely "penalized" with more verbose syntax if you don't. + +A default export sets a particular exported binding to be the default when importing the module. The name of the binding is literally `default`. As you'll see later, when importing module bindings you can also rename them, as you commonly will with a default export. + +There can only be one `default` per module definition. 
We'll cover `import` in the next section, and you'll see how the `import` syntax is more concise if the module has a default export. + +There's a subtle nuance to default export syntax that you should pay close attention to. Compare these two snippets: + +```js +function foo(..) { + // .. +} + +export default foo; +``` + +And this one: + +```js +function foo(..) { + // .. +} + +export { foo as default }; +``` + +In the first snippet, you are exporting a binding to the function expression value at that moment, *not* to the identifier `foo`. In other words, `export default ..` takes an expression. If you later assign `foo` to a different value inside your module, the module import still reveals the function originally exported, not the new value. + +By the way, the first snippet could also have been written as: + +```js +export default function foo(..) { + // .. +} +``` + +**Warning:** Although the `function foo..` part here is technically a function expression, for the purposes of the internal scope of the module, it's treated like a function declaration, in that the `foo` name is bound in the module's top-level scope (often called "hoisting"). The same is true for `export default class Foo..`. However, while you *can* do `export var foo = ..`, you currently cannot do `export default var foo = ..` (or `let` or `const`), in a frustrating case of inconsistency. At the time of this writing, there's already discussion of adding that capability soon, post-ES6, for consistency's sake. + +Recall the second snippet again: + +```js +function foo(..) { + // .. +} + +export { foo as default }; +``` + +In this version of the module export, the default export binding is actually to the `foo` identifier rather than its value, so you get the previously described binding behavior (i.e., if you later change `foo`'s value, the value seen on the import side will also be updated). 
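The difference between the two forms can be approximated in plain, pre-module JavaScript. This is only an emulation under stated assumptions, not how an engine actually implements modules: `snapshotNS` and `liveNS` are hypothetical stand-ins for what an importer would see, a plain data property plays the role of `export default foo`, and a getter plays the role of `export { foo as default }`:

```js
function original() { return "original"; }
function replacement() { return "replacement"; }

// inside the pretend module
var foo = original;

// like `export default foo`: the value is captured at this moment
var snapshotNS = { default: foo };

// like `export { foo as default }`: each access reads the
// current value of the `foo` variable
var liveNS = {};
Object.defineProperty( liveNS, "default", {
    enumerable: true,
    get: function() { return foo; }
} );

// later, the pretend module reassigns its local variable
foo = replacement;

console.log( snapshotNS.default() );   // "original"
console.log( liveNS.default() );       // "replacement"
```

The snapshot keeps pointing at the function that existed at export time, whereas the getter-backed property follows the variable, which is roughly the distinction the two export forms draw.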
+ +Be very careful of this subtle gotcha in default export syntax, especially if your logic calls for export values to be updated. If you never plan to update a default export's value, `export default ..` is fine. If you do plan to update the value, you must use `export { .. as default }`. Either way, make sure to comment your code to explain your intent! + +Because there can only be one `default` per module, you may be tempted to design your module with one default export of an object with all your API methods on it, such as: + +```js +export default { + foo() { .. }, + bar() { .. }, + .. +}; +``` + +That pattern seems to map closely to how a lot of developers have already structured their pre-ES6 modules, so it seems like a natural approach. Unfortunately, it has some downsides and is officially discouraged. + +In particular, the JS engine cannot statically analyze the contents of a plain object, which means it cannot do some optimizations for static `import` performance. The advantage of having each member individually and explicitly exported is that the engine *can* do the static analysis and optimization. + +If your API has more than one member already, it seems like these principles -- one default export per module, and all API members as named exports -- are in conflict, doesn't it? But you *can* have a single default export as well as other named exports; they are not mutually exclusive. + +So, instead of this (discouraged) pattern: + +```js +export default function foo() { .. } + +foo.bar = function() { .. }; +foo.baz = function() { .. }; +``` + +You can do: + +```js +export default function foo() { .. } + +export function bar() { .. } +export function baz() { .. } +``` + +**Note:** In this previous snippet, I used the name `foo` for the function that `default` labels. That `foo` name, however, is ignored for the purposes of export -- `default` is actually the exported name. 
When you import this default binding, you can give it whatever name you want, as you'll see in the next section. + +Alternatively, some will prefer: + +```js +function foo() { .. } +function bar() { .. } +function baz() { .. } + +export { foo as default, bar, baz, .. }; +``` + +The effects of mixing default and named exports will be more clear when we cover `import` shortly. But essentially it means that the most concise default import form would only retrieve the `foo()` function. The user could additionally manually list `bar` and `baz` as named imports, if they want them. + +You can probably imagine how tedious that's going to be for consumers of your module if you have lots of named export bindings. There is a wildcard import form where you import all of a module's exports within a single namespace object, but there's no way to wildcard import to top-level bindings. + +Again, the ES6 module mechanism is intentionally designed to discourage modules with lots of exports; relatively speaking, it's desired that such approaches be a little more difficult, as a sort of social engineering to encourage simple module design in favor of large/complex module design. + +I would probably recommend you not mix default export with named exports, especially if you have a large API and refactoring to separate modules isn't practical or desired. In that case, just use all named exports, and document that consumers of your module should probably use the `import * as ..` (namespace import, discussed in the next section) approach to bring the whole API in at once on a single namespace. + +We mentioned this earlier, but let's come back to it in more detail. Other than the `export default ...` form that exports an expression value binding, all other export forms are exporting bindings to local identifiers. 
For those bindings, if you change the value of a variable inside a module after exporting, the external imported binding will access the updated value: + +```js +var foo = 42; +export { foo as default }; + +export var bar = "hello world"; + +foo = 10; +bar = "cool"; +``` + +When you import this module, the `default` and `bar` exports will be bound to the local variables `foo` and `bar`, meaning they will reveal the updated `10` and `"cool"` values. The values at time of export are irrelevant. The values at time of import are irrelevant. The bindings are live links, so all that matters is what the current value is when you access the binding. + +**Warning:** Two-way bindings are not allowed. If you import a `foo` from a module, and try to change the value of your imported `foo` variable, an error will be thrown! We'll revisit that in the next section. + +You can also re-export another module's exports, such as: + +```js +export { foo, bar } from "baz"; +export { foo as FOO, bar as BAR } from "baz"; +export * from "baz"; +``` + +Those forms are similar to just first importing from the `"baz"` module then listing its members explicitly for export from your module. However, in these forms, the members of the `"baz"` module are never imported to your module's local scope; they sort of pass through untouched. + +#### `import`ing API Members + +To import a module, unsurprisingly you use the `import` statement. Just as `export` has several nuanced variations, so does `import`, so spend plenty of time considering the following issues and experimenting with your options. + +If you want to import certain specific named members of a module's API into your top-level scope, you use this syntax: + +```js +import { foo, bar, baz } from "foo"; +``` + +**Warning:** The `{ .. }` syntax here may look like an object literal, or even an object destructuring syntax. However, its form is special just for modules, so be careful not to confuse it with other `{ .. }` patterns elsewhere. 
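To make that distinction a little more concrete, here is a small sketch contrasting object destructuring with the module import list. The `api` object is hypothetical, and the `import` line appears only inside a comment because it needs a real module to load:

```js
// hypothetical object standing in for some API
var api = {
    foo: function() { return "foo!"; },
    bar: 42
};

// object destructuring: renaming uses `:`, and the new
// variables receive copies of the current values
var { foo: theFooFunc, bar } = api;

api.bar = 100;

console.log( theFooFunc() );   // "foo!"
console.log( bar );            // 42 (the copy is not updated)

// the module form renames with `as` instead, and is only
// valid at the top level of a module:
//
//   import { foo as theFooFunc, bar } from "foo";
```

Beyond the `:` versus `as` difference, destructuring copies values at that moment, while imported names are bindings to the exporting module's identifiers.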
+ +The `"foo"` string is called a *module specifier*. Because the whole goal is statically analyzable syntax, the module specifier must be a string literal; it cannot be a variable holding the string value. + +From the perspective of your ES6 code and the JS engine itself, the contents of this string literal are completely opaque and meaningless. The module loader will interpret this string as an instruction of where to find the desired module, either as a URL path or a local filesystem path. + +The `foo`, `bar`, and `baz` identifiers listed must match named exports on the module's API (static analysis and error assertion apply). They are bound as top-level identifiers in your current scope: + +```js +import { foo } from "foo"; + +foo(); +``` + +You can rename the bound identifiers imported, as: + +```js +import { foo as theFooFunc } from "foo"; + +theFooFunc(); +``` + +If the module has just a default export that you want to import and bind to an identifier, you can opt to skip the `{ .. }` surrounding syntax for that binding. The `import` in this preferred case gets the nicest and most concise of the `import` syntax forms: + +```js +import foo from "foo"; + +// or: +import { default as foo } from "foo"; +``` + +**Note:** As explained in the previous section, the `default` keyword in a module's `export` specifies a named export where the name is actually `default`, as is illustrated by the second more verbose syntax option. The renaming from `default` to, in this case, `foo`, is explicit in the latter syntax and is identical yet implicit in the former syntax. + +You can also import a default export along with other named exports, if the module has such a definition. Recall this module definition from earlier: + +```js +export default function foo() { .. } + +export function bar() { .. } +export function baz() { .. 
} +``` + +To import that module's default export and its two named exports: + +```js +import FOOFN, { bar, baz as BAZ } from "foo"; + +FOOFN(); +bar(); +BAZ(); +``` + +The strongly suggested approach from ES6's module philosophy is that you only import the specific bindings from a module that you need. If a module provides 10 API methods, but you only need two of them, some believe it wasteful to bring in the entire set of API bindings. + +One benefit, besides code being more explicit, is that narrow imports make static analysis and error detection (accidentally using the wrong binding name, for instance) more robust. + +Of course, that's just the standard position influenced by ES6 design philosophy; there's nothing that requires adherence to that approach. + +Many developers would be quick to point out that such approaches can be more tedious, requiring you to regularly revisit and update your `import` statement(s) each time you realize you need something else from a module. The trade-off is in exchange for convenience. + +In that light, the preference might be to import everything from the module into a single namespace, rather than importing individual members, each directly into the scope. Fortunately, the `import` statement has a syntax variation that can support this style of module consumption, called *namespace import*. + +Consider a `"foo"` module exported as: + +```js +export function bar() { .. } +export var x = 42; +export function baz() { .. } +``` + +You can import that entire API to a single module namespace binding: + +```js +import * as foo from "foo"; + +foo.bar(); +foo.x; // 42 +foo.baz(); +``` + +**Note:** The `* as ..` clause requires the `*` wildcard. In other words, you cannot do something like `import { bar, x } as foo from "foo"` to bring in only part of the API but still bind to the `foo` namespace. I would have liked something like that, but for ES6 it's all or nothing with the namespace import. 
+ +If the module you're importing with `* as ..` has a default export, it is named `default` in the namespace specified. You can additionally name the default import outside of the namespace binding, as a top-level identifier. Consider a `"world"` module exported as: + +```js +export default function foo() { .. } +export function bar() { .. } +export function baz() { .. } +``` + +And this `import`: + +```js +import foofn, * as hello from "world"; + +foofn(); +hello.default(); +hello.bar(); +hello.baz(); +``` + +While this syntax is valid, it can be rather confusing that one method of the module (the default export) is bound at the top-level of your scope, whereas the rest of the named exports (and one called `default`) are bound as properties on a differently named (`hello`) identifier namespace. + +As I mentioned earlier, my suggestion would be to avoid designing your module exports in this way, to reduce the chances that your module's users will suffer these strange quirks. + +All imported bindings are immutable and/or read-only. Consider the previous import; all of these subsequent assignment attempts will throw `TypeError`s: + +```js +import foofn, * as hello from "world"; + +foofn = 42; // (runtime) TypeError! +hello.default = 42; // (runtime) TypeError! +hello.bar = 42; // (runtime) TypeError! +hello.baz = 42; // (runtime) TypeError! +``` + +Recall earlier in the "`export`ing API Members" section that we talked about how the `bar` and `baz` bindings are bound to the actual identifiers inside the `"world"` module. That means if the module changes those values, `hello.bar` and `hello.baz` now reference the updated values. + +But the immutable/read-only nature of your local imported bindings enforces that you cannot change them from the imported bindings, hence the `TypeError`s. 
That's pretty important, because without those protections, your changes would end up affecting all other consumers of the module (remember: singleton), which could create some very surprising side effects! + +Moreover, though a module *can* change its API members from the inside, you should be very cautious of intentionally designing your modules in that fashion. ES6 modules are *intended* to be static, so deviations from that principle should be rare and should be carefully and verbosely documented. + +**Warning:** There are module design philosophies where you actually intend to let a consumer change the value of a property on your API, or module APIs are designed to be "extended" by having other "plug-ins" add to the API namespace. As we just asserted, ES6 module APIs should be thought of and designed as static and unchangeable, which strongly restricts and discourages these alternative module design patterns. You can get around these limitations by exporting a plain object, which of course can then be changed at will. But be careful and think twice before going down that road. + +Declarations that occur as a result of an `import` are "hoisted" (see the *Scope & Closures* title of this series). Consider: + +```js +foo(); + +import { foo } from "foo"; +``` + +`foo()` can run because not only did the static resolution of the `import ..` statement figure out what `foo` is during compilation, but it also "hoisted" the declaration to the top of the module's scope, thus making it available throughout the module. + +Finally, the most basic form of the `import` looks like this: + +```js +import "foo"; +``` + +This form does not actually import any of the module's bindings into your scope. It loads (if not already loaded), compiles (if not already compiled), and evaluates (if not already run) the `"foo"` module. + +In general, that sort of import is probably not going to be terribly useful. 
There may be niche cases where a module's definition has side effects (such as assigning things to the `window`/global object). You could also envision using `import "foo"` as a sort of preload for a module that may be needed later. + +### Circular Module Dependency + +A imports B. B imports A. How does this actually work? + +I'll state off the bat that designing systems with intentional circular dependency is generally something I try to avoid. That having been said, I recognize there are reasons people do this and it can solve some sticky design situations. + +Let's consider how ES6 handles this. First, module `"A"`: + +```js +import bar from "B"; + +export default function foo(x) { + if (x > 10) return bar( x - 1 ); + return x * 2; +} +``` + +Now, module `"B"`: + +```js +import foo from "A"; + +export default function bar(y) { + if (y > 5) return foo( y / 2 ); + return y * 3; +} +``` + +These two functions, `foo(..)` and `bar(..)`, would work as standard function declarations if they were in the same scope, because the declarations are "hoisted" to the whole scope and thus available to each other regardless of authoring order. + +With modules, you have declarations in entirely different scopes, so ES6 has to do extra work to help make these circular references work. + +In a rough conceptual sense, this is how circular `import` dependencies are validated and resolved: + +* If the `"A"` module is loaded first, the first step is to scan the file and analyze all the exports, so it can register all those bindings available for import. Then it processes the `import .. from "B"`, which signals that it needs to go fetch `"B"`. +* Once the engine loads `"B"`, it does the same analysis of its export bindings. When it sees the `import .. from "A"`, it knows the API of `"A"` already, so it can verify the `import` is valid. Now that it knows the `"B"` API, it can also validate the `import .. from "B"` in the waiting `"A"` module. 
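Since `foo(..)` and `bar(..)` would work as ordinary function declarations in one scope, their mutual recursion can be traced without any module loader. The following single-file sketch is only that same-scope approximation, not a demonstration of module loading itself; the `trace` array is a hypothetical addition for illustration:

```js
var trace = [];

function foo(x) {
    trace.push( "foo(" + x + ")" );
    if (x > 10) return bar( x - 1 );
    return x * 2;
}

function bar(y) {
    trace.push( "bar(" + y + ")" );
    if (y > 5) return foo( y / 2 );
    return y * 3;
}

var result = foo( 25 );

console.log( trace.join( " -> " ) );
// foo(25) -> bar(24) -> foo(12) -> bar(11) -> foo(5.5)
console.log( result );   // 11
```

The chain bounces between the two functions until `foo(5.5)` falls through to `x * 2`.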
+ +In essence, the mutual imports, along with the static verification that's done to validate both `import` statements, virtually compose the two separate module scopes (via the bindings), such that `foo(..)` can call `bar(..)` and vice versa. This is equivalent to their having been declared in the same scope originally. + +Now let's try using the two modules together. First, we'll try `foo(..)`: + +```js +import foo from "A"; +foo( 25 ); // 11 +``` + +Or we can try `bar(..)`: + +```js +import bar from "B"; +bar( 25 ); // 11.5 +``` + +By the time either the `foo(25)` or `bar(25)` calls are executed, all the analysis/compilation of all modules has completed. That means `foo(..)` internally knows directly about `bar(..)` and `bar(..)` internally knows directly about `foo(..)`. + +If all we need is to interact with `foo(..)`, then we only need to import the `"A"` module. Likewise with `bar(..)` and the `"B"` module. + +Of course, we *can* import and use both of them if we want to: + +```js +import foo from "A"; +import bar from "B"; + +foo( 25 ); // 11 +bar( 25 ); // 11.5 +``` + +The static loading semantics of the `import` statement mean that an `"A"` and `"B"` that mutually depend on each other via `import` will ensure that both are loaded, parsed, and compiled before either of them runs. So their circular dependency is statically resolved and this works as you'd expect. + +### Module Loading + +We asserted at the beginning of this "Modules" section that the `import` statement uses a separate mechanism, provided by the hosting environment (browser, Node.js, etc.), to actually resolve the module specifier string into some useful instruction for finding and loading the desired module. That mechanism is the system *Module Loader*. + +The default module loader provided by the environment will interpret a module specifier as a URL if in the browser, and (generally) as a local filesystem path if on a server such as Node.js. 
The default behavior is to assume the loaded file is authored in the ES6 standard module format. + +Moreover, you will be able to load a module into the browser via an HTML tag, similar to how current script programs are loaded. At the time of this writing, it's not fully clear what form this tag will take. + +A page commonly has several `<script src=..></script>` elements that load separate files, and even a few inline-code `<script> .. </script>` elements as well. + +But do these separate files/code snippets constitute separate programs or are they collectively one JS program? + +The (perhaps surprising) reality is they act more like independent JS programs in most, but not all, respects. + +The one thing they *share* is the single `global` object (`window` in the browser), which means multiple files can append their code to that shared namespace and they can all interact. + +So, if one `script` element defines a global function `foo()`, when a second `script` later runs, it can access and call `foo()` just as if it had defined the function itself. + +But global variable scope *hoisting* (see the *Scope & Closures* title of this series) does not occur across these boundaries, so the following code would not work (because `foo()`'s declaration isn't yet declared), regardless of whether they are (as shown) inline `<script> .. </script>` elements or externally loaded `<script src=..></script>` files: + +```html +<script>foo();</script> + +<script> + function foo() { .. } +</script> +``` + +But either of these *would* work instead: + +```html +<script> + foo(); + function foo() { .. } +</script> +``` + +Or: + +```html +<script> + function foo() { .. } +</script> + +<script>foo();</script> +``` + +Also, if an error occurs in a `script` element (inline or external), as a separate standalone JS program it will fail and stop, but any subsequent `script`s will run (still with the shared `global`) unimpeded. 
+ +You can create `script` elements dynamically from your code, and inject them into the DOM of the page, and the code in them will behave basically as if loaded normally in a separate file: + +```js +var greeting = "Hello World"; + +var el = document.createElement( "script" ); + +el.text = "function foo(){ alert( greeting );\ + } setTimeout( foo, 1000 );"; + +document.body.appendChild( el ); +``` + +**Note:** Of course, if you tried the above snippet but set `el.src` to some file URL instead of setting `el.text` to the code contents, you'd be dynamically creating an externally loaded `<script src=..></script>` element. + +One difference between code in an inline code block and that same code in an external file is that in the inline code block, the sequence of characters `</script>` cannot appear together, as (regardless of where it appears) it would be interpreted as the end of the code block. So, beware of code like: + +```html +<script> + var code = "<script>alert( 'Hello World' )</script>"; +</script> +``` + +It looks harmless, but the `</script>` appearing inside the `string` literal will terminate the script block abnormally, causing an error. The most common workaround is: + +```js +"</sc" + "ript>"; +``` + +Also, beware that code inside an external file will be interpreted in the character set (UTF-8, ISO-8859-8, etc.) the file is served with (or the default), but that same code in an inline `script` element in your HTML page will be interpreted by the character set of the page (or its default). + +**Warning:** The `charset` attribute will not work on inline script elements. + +Another deprecated practice with inline `script` elements is including HTML-style or X(HT)ML-style comments around inline code, like: + +```html +<script> +<!-- +alert( "Hello" ); +//--> +</script> + +<script> +<!--//--><![CDATA[//><!-- +alert( "World" ); +//--><!]]> +</script> +``` + +Both of these are totally unnecessary now, so if you're still doing that, stop it! + +**Note:** Both `<!--` and `-->` (HTML-style comments) are actually specified as valid single-line comment delimiters (`var x = 2; <!-- valid comment` and `--> another valid line comment`) in JavaScript (see the "Web ECMAScript" section earlier), purely because of this old technique. But never use them. 
+ +## Reserved Words + +The ES5 spec defines a set of "reserved words" in Section 7.6.1 that cannot be used as standalone variable names. Technically, there are four categories: "keywords", "future reserved words", the `null` literal, and the `true` / `false` boolean literals. + +Keywords are the obvious ones like `function` and `switch`. Future reserved words include things like `enum`, though many of the rest of them (`class`, `extends`, etc.) are all now actually used by ES6; there are other strict-mode only reserved words like `interface`. + +StackOverflow user "art4theSould" creatively worked all these reserved words into a fun little poem (http://stackoverflow.com/questions/26255/reserved-keywords-in-javascript/12114140#12114140): + +> Let this long package float, +> Goto private class if short. +> While protected with debugger case, +> Continue volatile interface. +> Instanceof super synchronized throw, +> Extends final export throws. +> +> Try import double enum? +> - False, boolean, abstract function, +> Implements typeof transient break! +> Void static, default do, +> Switch int native new. +> Else, delete null public var +> In return for const, true, char +> …Finally catch byte. + +**Note:** This poem includes words that were reserved in ES3 (`byte`, `long`, etc.) that are no longer reserved as of ES5. + +Prior to ES5, the reserved words also could not be property names or keys in object literals, but that restriction no longer exists. + +So, this is not allowed: + +```js +var import = "42"; +``` + +But this is allowed: + +```js +var obj = { import: "42" }; +console.log( obj.import ); +``` + +You should be aware though that some older browser versions (mainly older IE) weren't completely consistent on applying these rules, so there are places where using reserved words in object property name locations can still cause issues. Carefully test all supported browser environments. 
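As a rough sketch of the rule above, you can let the engine itself tell you whether a snippet is syntactically valid by handing it to the `Function` constructor, which throws a `SyntaxError` for invalid source. The helper name `parses(..)` is ours, purely for illustration, and the sketch assumes an ES5+ engine that permits `new Function(..)` (some locked-down environments do not).

```js
// `parses(..)` is a hypothetical helper, not a built-in: it reports
// whether the engine accepts `src` as a valid function body.
function parses(src) {
	try {
		new Function( src );
		return true;
	}
	catch (err) {
		return false;
	}
}

// reserved word as a variable name: rejected
var asVariable = parses( "var import = '42';" );

// reserved word as a property name: accepted as of ES5
var asProperty = parses( "var obj = { import: '42' }; return obj.import;" );

console.log( asVariable );	// false
console.log( asProperty );	// true
```

Running the same two checks in each browser you support is one cheap way to spot the older-engine inconsistencies mentioned above.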
+ +## Implementation Limits + +The JavaScript spec does not place arbitrary limits on things such as the number of arguments to a function or the length of a string literal, but these limits exist nonetheless, because of implementation details in different engines. + +For example: + +```js +function addAll() { + var sum = 0; + for (var i=0; i < arguments.length; i++) { + sum += arguments[i]; + } + return sum; +} + +var nums = []; +for (var i=1; i < 100000; i++) { + nums.push(i); +} + +addAll( 2, 4, 6 ); // 12 +addAll.apply( null, nums ); // should be: 4999950000 +``` + +In some JS engines, you'll get the correct `4999950000` answer (the sum of `1` through `99999`), but in others (like Safari 6.x), you'll get the error: "RangeError: Maximum call stack size exceeded." + +Examples of other limits known to exist: + +* maximum number of characters allowed in a string literal (not just a string value) +* size (bytes) of data that can be sent in arguments to a function call (aka stack size) +* number of parameters in a function declaration +* maximum depth of non-optimized call stack (i.e., with recursion): how long a chain of function calls from one to the other can be +* number of seconds a JS program can run continuously blocking the browser +* maximum length allowed for a variable name +* ... + +It's not at all common to run into these limits, but you should be aware that limits can and do exist, and importantly that they vary between engines. + +## Review + +We know and can rely upon the fact that the JS language itself has one standard and is predictably implemented by all the modern browsers/engines. This is a very good thing! + +But JavaScript rarely runs in isolation. It runs in an environment mixed in with code from third-party libraries, and sometimes it even runs in engines/environments that differ from those found in browsers. + +Paying close attention to these issues improves the reliability and robustness of your code. 
diff --git a/types & grammar/apB.md b/types & grammar/apB.md new file mode 100644 index 0000000..94a0f5e --- /dev/null +++ b/types & grammar/apB.md @@ -0,0 +1,20 @@ +# You Don't Know JS: Types & Grammar +# Appendix B: Acknowledgments + +I have many people to thank for making this book title and the overall series happen. + +First, I must thank my wife Christen Simpson, and my two kids Ethan and Emily, for putting up with Dad always pecking away at the computer. Even when not writing books, my obsession with JavaScript glues my eyes to the screen far more than it should. That time I borrow from my family is the reason these books can so deeply and completely explain JavaScript to you, the reader. I owe my family everything. + +I'd like to thank my editors at O'Reilly, namely Simon St.Laurent and Brian MacDonald, as well as the rest of the editorial and marketing staff. They are fantastic to work with, and have been especially accommodating during this experiment into "open source" book writing, editing, and production. + +Thank you to the many folks who have participated in making this book series better by providing editorial suggestions and corrections, including Shelley Powers, Tim Ferro, Evan Borden, Forrest L. Norvell, Jennifer Davis, Jesse Harlin, and many others. A big thank you to David Walsh for writing the Foreword for this title. + +Thank you to the countless folks in the community, including members of the TC39 committee, who have shared so much knowledge with the rest of us, and especially tolerated my incessant questions and explorations with patience and detail. John-David Dalton, Juriy "kangax" Zaytsev, Mathias Bynens, Axel Rauschmayer, Nicholas Zakas, Angus Croll, Reginald Braithwaite, Dave Herman, Brendan Eich, Allen Wirfs-Brock, Bradley Meck, Domenic Denicola, David Walsh, Tim Disney, Peter van der Zee, Andrea Giammarchi, Kit Cambridge, Eric Elliott, and so many others, I can't even scratch the surface. 
+ +The *You Don't Know JS* book series was born on Kickstarter, so I also wish to thank all my (nearly) 500 generous backers, without whom this book series could not have happened: + +> Jan Szpila, nokiko, Murali Krishnamoorthy, Ryan Joy, Craig Patchett, pdqtrader, Dale Fukami, ray hatfield, R0drigo Perez [Mx], Dan Petitt, Jack Franklin, Andrew Berry, Brian Grinstead, Rob Sutherland, Sergi Meseguer, Phillip Gourley, Mark Watson, Jeff Carouth, Alfredo Sumaran, Martin Sachse, Marcio Barrios, Dan, AimelyneM, Matt Sullivan, Delnatte Pierre-Antoine, Jake Smith, Eugen Tudorancea, Iris, David Trinh, simonstl, Ray Daly, Uros Gruber, Justin Myers, Shai Zonis, Mom & Dad, Devin Clark, Dennis Palmer, Brian Panahi Johnson, Josh Marshall, Marshall, Dennis Kerr, Matt Steele, Erik Slagter, Sacah, Justin Rainbow, Christian Nilsson, Delapouite, D.Pereira, Nicolas Hoizey, George V. Reilly, Dan Reeves, Bruno Laturner, Chad Jennings, Shane King, Jeremiah Lee Cohick, od3n, Stan Yamane, Marko Vucinic, Jim B, Stephen Collins, Ægir Þorsteinsson, Eric Pederson, Owain, Nathan Smith, Jeanetteurphy, Alexandre ELISÉ, Chris Peterson, Rik Watson, Luke Matthews, Justin Lowery, Morten Nielsen, Vernon Kesner, Chetan Shenoy, Paul Tregoing, Marc Grabanski, Dion Almaer, Andrew Sullivan, Keith Elsass, Tom Burke, Brian Ashenfelter, David Stuart, Karl Swedberg, Graeme, Brandon Hays, John Christopher, Gior, manoj reddy, Chad Smith, Jared Harbour, Minoru TODA, Chris Wigley, Daniel Mee, Mike, Handyface, Alex Jahraus, Carl Furrow, Rob Foulkrod, Max Shishkin, Leigh Penny Jr., Robert Ferguson, Mike van Hoenselaar, Hasse Schougaard, rajan venkataguru, Jeff Adams, Trae Robbins, Rolf Langenhuijzen, Jorge Antunes, Alex Koloskov, Hugh Greenish, Tim Jones, Jose Ochoa, Michael Brennan-White, Naga Harish Muvva, Barkóczi Dávid, Kitt Hodsden, Paul McGraw, Sascha Goldhofer, Andrew Metcalf, Markus Krogh, Michael Mathews, Matt Jared, Juanfran, Georgie Kirschner, Kenny Lee, Ted Zhang, Amit Pahwa, Inbal Sinai, Dan Raine, 
Schabse Laks, Michael Tervoort, Alexandre Abreu, Alan Joseph Williams, NicolasD, Cindy Wong, Reg Braithwaite, LocalPCGuy, Jon Friskics, Chris Merriman, John Pena, Jacob Katz, Sue Lockwood, Magnus Johansson, Jeremy Crapsey, Grzegorz Pawłowski, nico nuzzaci, Christine Wilks, Hans Bergren, charles montgomery, Ariel בר-לבב Fogel, Ivan Kolev, Daniel Campos, Hugh Wood, Christian Bradford, Frédéric Harper, Ionuţ Dan Popa, Jeff Trimble, Rupert Wood, Trey Carrico, Pancho Lopez, Joël kuijten, Tom A Marra, Jeff Jewiss, Jacob Rios, Paolo Di Stefano, Soledad Penades, Chris Gerber, Andrey Dolganov, Wil Moore III, Thomas Martineau, Kareem, Ben Thouret, Udi Nir, Morgan Laupies, jory carson-burson, Nathan L Smith, Eric Damon Walters, Derry Lozano-Hoyland, Geoffrey Wiseman, mkeehner, KatieK, Scott MacFarlane, Brian LaShomb, Adrien Mas, christopher ross, Ian Littman, Dan Atkinson, Elliot Jobe, Nick Dozier, Peter Wooley, John Hoover, dan, Martin A. Jackson, Héctor Fernando Hurtado, andy ennamorato, Paul Seltmann, Melissa Gore, Dave Pollard, Jack Smith, Philip Da Silva, Guy Israeli, @megalithic, Damian Crawford, Felix Gliesche, April Carter Grant, Heidi, jim tierney, Andrea Giammarchi, Nico Vignola, Don Jones, Chris Hartjes, Alex Howes, john gibbon, David J. 
Groom, BBox, Yu 'Dilys' Sun, Nate Steiner, Brandon Satrom, Brian Wyant, Wesley Hales, Ian Pouncey, Timothy Kevin Oxley, George Terezakis, sanjay raj, Jordan Harband, Marko McLion, Wolfgang Kaufmann, Pascal Peuckert, Dave Nugent, Markus Liebelt, Welling Guzman, Nick Cooley, Daniel Mesquita, Robert Syvarth, Chris Coyier, Rémy Bach, Adam Dougal, Alistair Duggin, David Loidolt, Ed Richer, Brian Chenault, GoldFire Studios, Carles Andrés, Carlos Cabo, Yuya Saito, roberto ricardo, Barnett Klane, Mike Moore, Kevin Marx, Justin Love, Joe Taylor, Paul Dijou, Michael Kohler, Rob Cassie, Mike Tierney, Cody Leroy Lindley, tofuji, Shimon Schwartz, Raymond, Luc De Brouwer, David Hayes, Rhys Brett-Bowen, Dmitry, Aziz Khoury, Dean, Scott Tolinski - Level Up, Clement Boirie, Djordje Lukic, Anton Kotenko, Rafael Corral, Philip Hurwitz, Jonathan Pidgeon, Jason Campbell, Joseph C., SwiftOne, Jan Hohner, Derick Bailey, getify, Daniel Cousineau, Chris Charlton, Eric Turner, David Turner, Joël Galeran, Dharma Vagabond, adam, Dirk van Bergen, dave ♥♫★ furf, Vedran Zakanj, Ryan McAllen, Natalie Patrice Tucker, Eric J. Bivona, Adam Spooner, Aaron Cavano, Kelly Packer, Eric J, Martin Drenovac, Emilis, Michael Pelikan, Scott F. 
Walter, Josh Freeman, Brandon Hudgeons, vijay chennupati, Bill Glennon, Robin R., Troy Forster, otaku_coder, Brad, Scott, Frederick Ostrander, Adam Brill, Seb Flippence, Michael Anderson, Jacob, Adam Randlett, Standard, Joshua Clanton, Sebastian Kouba, Chris Deck, SwordFire, Hannes Papenberg, Richard Woeber, hnzz, Rob Crowther, Jedidiah Broadbent, Sergey Chernyshev, Jay-Ar Jamon, Ben Combee, luciano bonachela, Mark Tomlinson, Kit Cambridge, Michael Melgares, Jacob Adams, Adrian Bruinhout, Bev Wieber, Scott Puleo, Thomas Herzog, April Leone, Daniel Mizieliński, Kees van Ginkel, Jon Abrams, Erwin Heiser, Avi Laviad, David newell, Jean-Francois Turcot, Niko Roberts, Erik Dana, Charles Neill, Aaron Holmes, Grzegorz Ziółkowski, Nathan Youngman, Timothy, Jacob Mather, Michael Allan, Mohit Seth, Ryan Ewing, Benjamin Van Treese, Marcelo Santos, Denis Wolf, Phil Keys, Chris Yung, Timo Tijhof, Martin Lekvall, Agendine, Greg Whitworth, Helen Humphrey, Dougal Campbell, Johannes Harth, Bruno Girin, Brian Hough, Darren Newton, Craig McPheat, Olivier Tille, Dennis Roethig, Mathias Bynens, Brendan Stromberger, sundeep, John Meyer, Ron Male, John F Croston III, gigante, Carl Bergenhem, B.J. 
May, Rebekah Tyler, Ted Foxberry, Jordan Reese, Terry Suitor, afeliz, Tom Kiefer, Darragh Duffy, Kevin Vanderbeken, Andy Pearson, Simon Mac Donald, Abid Din, Chris Joel, Tomas Theunissen, David Dick, Paul Grock, Brandon Wood, John Weis, dgrebb, Nick Jenkins, Chuck Lane, Johnny Megahan, marzsman, Tatu Tamminen, Geoffrey Knauth, Alexander Tarmolov, Jeremy Tymes, Chad Auld, Sean Parmelee, Rob Staenke, Dan Bender, Yannick derwa, Joshua Jones, Geert Plaisier, Tom LeZotte, Christen Simpson, Stefan Bruvik, Justin Falcone, Carlos Santana, Michael Weiss, Pablo Villoslada, Peter deHaan, Dimitris Iliopoulos, seyDoggy, Adam Jordens, Noah Kantrowitz, Amol M, Matthew Winnard, Dirk Ginader, Phinam Bui, David Rapson, Andrew Baxter, Florian Bougel, Michael George, Alban Escalier, Daniel Sellers, Sasha Rudan, John Green, Robert Kowalski, David I. Teixeira (@ditma, Charles Carpenter, Justin Yost, Sam S, Denis Ciccale, Kevin Sheurs, Yannick Croissant, Pau Fracés, Stephen McGowan, Shawn Searcy, Chris Ruppel, Kevin Lamping, Jessica Campbell, Christopher Schmitt, Sablons, Jonathan Reisdorf, Bunni Gek, Teddy Huff, Michael Mullany, Michael Fürstenberg, Carl Henderson, Rick Yoesting, Scott Nichols, Hernán Ciudad, Andrew Maier, Mike Stapp, Jesse Shawl, Sérgio Lopes, jsulak, Shawn Price, Joel Clermont, Chris Ridmann, Sean Timm, Jason Finch, Aiden Montgomery, Elijah Manor, Derek Gathright, Jesse Harlin, Dillon Curry, Courtney Myers, Diego Cadenas, Arne de Bree, João Paulo Dubas, James Taylor, Philipp Kraeutli, Mihai Păun, Sam Gharegozlou, joshjs, Matt Murchison, Eric Windham, Timo Behrmann, Andrew Hall, joshua price, Théophile Villard + +This book series is being produced in an open source fashion, including editing and production. We owe GitHub a debt of gratitude for making that sort of thing possible for the community! + +Thank you again to all the countless folks I didn't name but who I nonetheless owe thanks. 
May this book series be "owned" by all of us and serve to contribute to increasing awareness and understanding of the JavaScript language, to the benefit of all current and future community contributors. diff --git a/types & grammar/ch1.md b/types & grammar/ch1.md new file mode 100644 index 0000000..1e0c0c6 --- /dev/null +++ b/types & grammar/ch1.md @@ -0,0 +1,300 @@ +# You Don't Know JS: Types & Grammar +# Chapter 1: Types + +Most developers would say that a dynamic language (like JS) does not have *types*. Let's see what the ES5.1 specification (http://www.ecma-international.org/ecma-262/5.1/) has to say on the topic: + +> Algorithms within this specification manipulate values each of which has an associated type. The possible value types are exactly those defined in this clause. Types are further sub classified into ECMAScript language types and specification types. +> +> An ECMAScript language type corresponds to values that are directly manipulated by an ECMAScript programmer using the ECMAScript language. The ECMAScript language types are Undefined, Null, Boolean, String, Number, and Object. + +Now, if you're a fan of strongly typed (statically typed) languages, you may object to this usage of the word "type." In those languages, "type" means a whole lot *more* than it does here in JS. + +Some people say JS shouldn't claim to have "types," and they should instead be called "tags" or perhaps "subtypes". + +Bah! We're going to use this rough definition (the same one that seems to drive the wording of the spec): a *type* is an intrinsic, built-in set of characteristics that uniquely identifies the behavior of a particular value and distinguishes it from other values, both to the engine **and to the developer**. + +In other words, if both the engine and the developer treat value `42` (the number) differently than they treat value `"42"` (the string), then those two values have different *types* -- `number` and `string`, respectively. 
When you use `42`, you are *intending* to do something numeric, like math. But when you use `"42"`, you are *intending* to do something string'ish, like outputting to the page, etc. **These two values have different types.** + +That's by no means a perfect definition. But it's good enough for this discussion. And it's consistent with how JS describes itself. + +## A Type By Any Other Name... + +Beyond academic definition disagreements, why does it matter if JavaScript has *types* or not? + +Having a proper understanding of each *type* and its intrinsic behavior is absolutely essential to understanding how to properly and accurately convert values to different types (see Coercion, Chapter 4). Nearly every JS program ever written will need to handle value coercion in some shape or form, so it's important you do so responsibly and with confidence. + +If you have the `number` value `42`, but you want to treat it like a `string`, such as pulling out the `"2"` as a character in position `1`, you obviously must first convert (coerce) the value from `number` to `string`. + +That seems simple enough. + +But there are many different ways that such coercion can happen. Some of these ways are explicit, easy to reason about, and reliable. But if you're not careful, coercion can happen in very strange and surprising ways. + +Coercion confusion is perhaps one of the most profound frustrations for JavaScript developers. It has often been criticized as being so *dangerous* as to be considered a flaw in the design of the language, to be shunned and avoided. + +Armed with a full understanding of JavaScript types, we're aiming to illustrate why coercion's *bad reputation* is largely overhyped and somewhat undeserved -- to flip your perspective so that you see coercion's power and usefulness. But first, we have to get a much better grip on values and types. 
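To make the `42` example from this section concrete, here is a minimal sketch of one explicit conversion. `String(..)` is only one of several ways to do it (Chapter 4 covers the alternatives), and the variable names are ours.

```js
var num = 42;

// explicitly coerce the `number` to a `string` first
var str = String( num );

typeof num;		// "number"
typeof str;		// "string"

// position `1` of "42" is the character "2"
var ch = str.charAt( 1 );

console.log( ch );	// 2
```

Note that `num` itself still holds the `number` value `42`; the conversion produced a separate `string` value.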
+ +## Built-in Types + +JavaScript defines seven built-in types: + +* `null` +* `undefined` +* `boolean` +* `number` +* `string` +* `object` +* `symbol` -- added in ES6! + +**Note:** All of these types except `object` are called "primitives". + +The `typeof` operator inspects the type of the given value, and always returns one of seven string values -- surprisingly, there's not an exact 1-to-1 match with the seven built-in types we just listed. + +```js +typeof undefined === "undefined"; // true +typeof true === "boolean"; // true +typeof 42 === "number"; // true +typeof "42" === "string"; // true +typeof { life: 42 } === "object"; // true + +// added in ES6! +typeof Symbol() === "symbol"; // true +``` + +These six listed types have values of the corresponding type and return a string value of the same name, as shown. `Symbol` is a new data type as of ES6, and will be covered in Chapter 3. + +As you may have noticed, I excluded `null` from the above listing. It's *special* -- special in the sense that it's buggy when combined with the `typeof` operator: + +```js +typeof null === "object"; // true +``` + +It would have been nice (and correct!) if it returned `"null"`, but this original bug in JS has persisted for nearly two decades, and will likely never be fixed because there's too much existing web content that relies on its buggy behavior that "fixing" the bug would *create* more "bugs" and break a lot of web software. + +If you want to test for a `null` value using its type, you need a compound condition: + +```js +var a = null; + +(!a && typeof a === "object"); // true +``` + +`null` is the only primitive value that is "falsy" (aka false-like; see Chapter 4) but that also returns `"object"` from the `typeof` check. + +So what's the seventh string value that `typeof` can return? + +```js +typeof function a(){ /* .. 
*/ } === "function"; // true +``` + +It's easy to think that `function` would be a top-level built-in type in JS, especially given this behavior of the `typeof` operator. However, if you read the spec, you'll see it's actually a "subtype" of object. Specifically, a function is referred to as a "callable object" -- an object that has an internal `[[Call]]` property that allows it to be invoked. + +The fact that functions are actually objects is quite useful. Most importantly, they can have properties. For example: + +```js +function a(b,c) { + /* .. */ +} +``` + +The function object has a `length` property set to the number of formal parameters it is declared with. + +```js +a.length; // 2 +``` + +Since you declared the function with two formal named parameters (`b` and `c`), the "length of the function" is `2`. + +What about arrays? They're native to JS, so are they a special type? + +```js +typeof [1,2,3] === "object"; // true +``` + +Nope, just objects. It's most appropriate to think of them also as a "subtype" of object (see Chapter 3), in this case with the additional characteristics of being numerically indexed (as opposed to just being string-keyed like plain objects) and maintaining an automatically updated `.length` property. + +## Values as Types + +In JavaScript, variables don't have types -- **values have types**. Variables can hold any value, at any time. + +Another way to think about JS types is that JS doesn't have "type enforcement," in that the engine doesn't insist that a *variable* always holds values of the *same initial type* that it starts out with. A variable can, in one assignment statement, hold a `string`, and in the next hold a `number`, and so on. + +The *value* `42` has an intrinsic type of `number`, and its *type* cannot be changed. Another value, like `"42"` with the `string` type, can be created *from* the `number` value `42` through a process called **coercion** (see Chapter 4). 
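As a small illustration of the point that values, not variables, carry types, the sketch below pushes several different kinds of value through a single variable and records what `typeof` reports each time. The variable names are ours, chosen only for this example.

```js
var v;
var seen = [];
var values = [ "42", 42, true, null, [1,2,3], function(){} ];

for (var i = 0; i < values.length; i++) {
	v = values[i];			// same variable, a different type of value each time
	seen.push( typeof v );
}

console.log( seen.join( ", " ) );
// string, number, boolean, object, object, function
```

The two `"object"` results come from `null` (the long-standing `typeof` bug) and from the array, which is a "subtype" of object.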
+ +If you use `typeof` against a variable, it's not asking "what's the type of the variable?" as it may seem, since JS variables have no types. Instead, it's asking "what's the type of the value *in* the variable?" + +```js +var a = 42; +typeof a; // "number" + +a = true; +typeof a; // "boolean" +``` + +The `typeof` operator always returns a string. So: + +```js +typeof typeof 42; // "string" +``` + +The first `typeof 42` returns `"number"`, and `typeof "number"` is `"string"`. + +### `undefined` vs "undeclared" + +Variables that have no value *currently*, actually have the `undefined` value. Calling `typeof` against such variables will return `"undefined"`: + +```js +var a; + +typeof a; // "undefined" + +var b = 42; +var c; + +// later +b = c; + +typeof b; // "undefined" +typeof c; // "undefined" +``` + +It's tempting for most developers to think of the word "undefined" and think of it as a synonym for "undeclared." However, in JS, these two concepts are quite different. + +An "undefined" variable is one that has been declared in the accessible scope, but *at the moment* has no other value in it. By contrast, an "undeclared" variable is one that has not been formally declared in the accessible scope. + +Consider: + +```js +var a; + +a; // undefined +b; // ReferenceError: b is not defined +``` + +An annoying confusion is the error message that browsers assign to this condition. As you can see, the message is "b is not defined," which is of course very easy and reasonable to confuse with "b is undefined." Yet again, "undefined" and "is not defined" are very different things. It'd be nice if the browsers said something like "b is not found" or "b is not declared," to reduce the confusion! + +There's also a special behavior associated with `typeof` as it relates to undeclared variables that even further reinforces the confusion. 
Consider: + +```js +var a; + +typeof a; // "undefined" + +typeof b; // "undefined" +``` + +The `typeof` operator returns `"undefined"` even for "undeclared" (or "not defined") variables. Notice that there was no error thrown when we executed `typeof b`, even though `b` is an undeclared variable. This is a special safety guard in the behavior of `typeof`. + +Similar to above, it would have been nice if `typeof` used with an undeclared variable returned "undeclared" instead of conflating the result value with the different "undefined" case. + +### `typeof` Undeclared + +Nevertheless, this safety guard is a useful feature when dealing with JavaScript in the browser, where multiple script files can load variables into the shared global namespace. + +**Note:** Many developers believe there should never be any variables in the global namespace, and that everything should be contained in modules and private/separate namespaces. This is great in theory but nearly impossible in practicality; still it's a good goal to strive toward! Fortunately, ES6 added first-class support for modules, which will eventually make that much more practical. + +As a simple example, imagine having a "debug mode" in your program that is controlled by a global variable (flag) called `DEBUG`. You'd want to check if that variable was declared before performing a debug task like logging a message to the console. A top-level global `var DEBUG = true` declaration would only be included in a "debug.js" file, which you only load into the browser when you're in development/testing, but not in production. + +However, you have to take care in how you check for the global `DEBUG` variable in the rest of your application code, so that you don't throw a `ReferenceError`. The safety guard on `typeof` is our friend in this case. + +```js +// oops, this would throw an error! 
+if (DEBUG) { + console.log( "Debugging is starting" ); +} + +// this is a safe existence check +if (typeof DEBUG !== "undefined") { + console.log( "Debugging is starting" ); +} +``` + +This sort of check is useful even if you're not dealing with user-defined variables (like `DEBUG`). If you are doing a feature check for a built-in API, you may also find it helpful to check without throwing an error: + +```js +if (typeof atob === "undefined") { + atob = function() { /*..*/ }; +} +``` + +**Note:** If you're defining a "polyfill" for a feature if it doesn't already exist, you probably want to avoid using `var` to make the `atob` declaration. If you declare `var atob` inside the `if` statement, this declaration is hoisted (see the *Scope & Closures* title of this series) to the top of the scope, even if the `if` condition doesn't pass (because the global `atob` already exists!). In some browsers and for some special types of global built-in variables (often called "host objects"), this duplicate declaration may throw an error. Omitting the `var` prevents this hoisted declaration. + +Another way of doing these checks against global variables but without the safety guard feature of `typeof` is to observe that all global variables are also properties of the global object, which in the browser is basically the `window` object. So, the above checks could have been done (quite safely) as: + +```js +if (window.DEBUG) { + // .. +} + +if (!window.atob) { + // .. +} +``` + +Unlike referencing undeclared variables, there is no `ReferenceError` thrown if you try to access an object property (even on the global `window` object) that doesn't exist. + +On the other hand, manually referencing the global variable with a `window` reference is something some developers prefer to avoid, especially if your code needs to run in multiple JS environments (not just browsers, but server-side node.js, for instance), where the global variable may not always be called `window`. 
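One way to reconcile the two approaches, sketched here under some assumptions, is to use the `typeof` safety guard to find whatever the global object happens to be called, and then do plain property checks against it. `getGlobal()` is a hypothetical helper name; `globalThis` is only available in engines that implement ES2020 or later, which is why the older names are tried as fallbacks.

```js
// `getGlobal()` is a hypothetical helper. Each `typeof` check is safe even
// where the identifier is undeclared, so no `ReferenceError` is thrown.
function getGlobal() {
	if (typeof globalThis !== "undefined") return globalThis;	// ES2020+
	if (typeof window !== "undefined") return window;			// browsers
	if (typeof global !== "undefined") return global;			// Node.js
	return undefined;
}

var root = getGlobal();

// property access on the global object never throws for a missing name
var debugging = !!(root && root.DEBUG);

console.log( debugging );	// false, unless a global `DEBUG` flag is set
```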
+ +Technically, this safety guard on `typeof` is useful even if you're not using global variables, though these circumstances are less common, and some developers may find this design approach less desirable. Imagine a utility function that you want others to copy-and-paste into their programs or modules, in which you want to check to see if the including program has defined a certain variable (so that you can use it) or not: + +```js +function doSomethingCool() { + var helper = + (typeof FeatureXYZ !== "undefined") ? + FeatureXYZ : + function() { /*.. default feature ..*/ }; + + var val = helper(); + // .. +} +``` + +`doSomethingCool()` tests for a variable called `FeatureXYZ`, and if found, uses it, but if not, uses its own. Now, if someone includes this utility in their module/program, it safely checks if they've defined `FeatureXYZ` or not: + +```js +// an IIFE (see "Immediately Invoked Function Expressions" +// discussion in the *Scope & Closures* title of this series) +(function(){ + function FeatureXYZ() { /*.. my XYZ feature ..*/ } + + // include `doSomethingCool(..)` + function doSomethingCool() { + var helper = + (typeof FeatureXYZ !== "undefined") ? + FeatureXYZ : + function() { /*.. default feature ..*/ }; + + var val = helper(); + // .. + } + + doSomethingCool(); +})(); +``` + +Here, `FeatureXYZ` is not at all a global variable, but we're still using the safety guard of `typeof` to make it safe to check for. And importantly, here there is *no* object we can use (like we did for global variables with `window.___`) to make the check, so `typeof` is quite helpful. + +Other developers would prefer a design pattern called "dependency injection," where instead of `doSomethingCool()` inspecting implicitly for `FeatureXYZ` to be defined outside/around it, it would need to have the dependency explicitly passed in, like: + +```js +function doSomethingCool(FeatureXYZ) { + var helper = FeatureXYZ || + function() { /*.. 
default feature ..*/ }; + + var val = helper(); + // .. +} +``` + +There are lots of options when designing such functionality. No one pattern here is "correct" or "wrong" -- there are various tradeoffs to each approach. But overall, it's nice that the `typeof` undeclared safety guard gives us more options. + +## Review + +JavaScript has seven built-in *types*: `null`, `undefined`, `boolean`, `number`, `string`, `object`, `symbol`. They can be identified by the `typeof` operator. + +Variables don't have types, but the values in them do. These types define intrinsic behavior of the values. + +Many developers will assume "undefined" and "undeclared" are roughly the same thing, but in JavaScript, they're quite different. `undefined` is a value that a declared variable can hold. "Undeclared" means a variable has never been declared. + +JavaScript unfortunately kind of conflates these two terms, not only in its error messages ("ReferenceError: a is not defined") but also in the return values of `typeof`, which is `"undefined"` for both cases. + +However, the safety guard (preventing an error) on `typeof` when used against an undeclared variable can be helpful in certain cases. diff --git a/types & grammar/ch2.md b/types & grammar/ch2.md new file mode 100644 index 0000000..a4803c5 --- /dev/null +++ b/types & grammar/ch2.md @@ -0,0 +1,985 @@ +# You Don't Know JS: Types & Grammar +# Chapter 2: Values + +`array`s, `string`s, and `number`s are the most basic building-blocks of any program, but JavaScript has some unique characteristics with these types that may either delight or confound you. + +Let's look at several of the built-in value types in JS, and explore how we can more fully understand and correctly leverage their behaviors. + +## Arrays + +As compared to other type-enforced languages, JavaScript `array`s are just containers for any type of value, from `string` to `number` to `object` to even another `array` (which is how you get multidimensional `array`s). 
+ +```js +var a = [ 1, "2", [3] ]; + +a.length; // 3 +a[0] === 1; // true +a[2][0] === 3; // true +``` + +You don't need to presize your `array`s (see "Arrays" in Chapter 3), you can just declare them and add values as you see fit: + +```js +var a = [ ]; + +a.length; // 0 + +a[0] = 1; +a[1] = "2"; +a[2] = [ 3 ]; + +a.length; // 3 +``` + +**Warning:** Using `delete` on an `array` value will remove that slot from the `array`, but even if you remove the final element, it does **not** update the `length` property, so be careful! We'll cover the `delete` operator itself in more detail in Chapter 5. + +Be careful about creating "sparse" `array`s (leaving or creating empty/missing slots): + +```js +var a = [ ]; + +a[0] = 1; +// no `a[1]` slot set here +a[2] = [ 3 ]; + +a[1]; // undefined + +a.length; // 3 +``` + +While that works, it can lead to some confusing behavior with the "empty slots" you leave in between. While the slot appears to have the `undefined` value in it, it will not behave the same as if the slot is explicitly set (`a[1] = undefined`). See "Arrays" in Chapter 3 for more information. + +`array`s are numerically indexed (as you'd expect), but the tricky thing is that they also are objects that can have `string` keys/properties added to them (but which don't count toward the `length` of the `array`): + +```js +var a = [ ]; + +a[0] = 1; +a["foobar"] = 2; + +a.length; // 1 +a["foobar"]; // 2 +a.foobar; // 2 +``` + +However, a gotcha to be aware of is that if a `string` value intended as a key can be coerced to a standard base-10 `number`, then it is assumed that you wanted to use it as a `number` index rather than as a `string` key! + +```js +var a = [ ]; + +a["13"] = 42; + +a.length; // 14 +``` + +Generally, it's not a great idea to add `string` keys/properties to `array`s. Use `object`s for holding values in keys/properties, and save `array`s for strictly numerically indexed values. 
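The two warnings above (`delete` leaving `length` untouched, and empty slots differing from an explicit `undefined`) can be checked directly. A minimal sketch, with variable names chosen only for illustration:

```js
// `delete` removes the slot but leaves `length` alone
var a = [ 1, 2, 3 ];
delete a[2];

a.length;       // 3
2 in a;         // false -- the slot itself is gone

// an empty slot vs. an explicit `undefined`
var sparse = [ ];
sparse[0] = 1;
sparse[2] = 3;

var dense = [ 1, undefined, 3 ];

1 in sparse;    // false
1 in dense;     // true

// `forEach(..)` skips empty slots but visits explicit `undefined`
var sparseVisited = [ ];
var denseVisited = [ ];

sparse.forEach( function(v,i){ sparseVisited.push( i ); } );
dense.forEach( function(v,i){ denseVisited.push( i ); } );

sparseVisited;  // [0,2]
denseVisited;   // [0,1,2]
```

The `in` operator is used here because reading `sparse[1]` and `dense[1]` both give `undefined`, which hides the difference.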
+ +### Array-Likes + +There will be occasions where you need to convert an `array`-like value (a numerically indexed collection of values) into a true `array`, usually so you can call array utilities (like `indexOf(..)`, `concat(..)`, `forEach(..)`, etc.) against the collection of values. + +For example, various DOM query operations return lists of DOM elements that are not true `array`s but are `array`-like enough for our conversion purposes. Another common example is when functions expose the `arguments` (`array`-like) object (as of ES6, discouraged in favor of rest parameters) to access the arguments as a list. + +One very common way to make such a conversion is to borrow the `slice(..)` utility against the value: + +```js +function foo() { + var arr = Array.prototype.slice.call( arguments ); + arr.push( "bam" ); + console.log( arr ); +} + +foo( "bar", "baz" ); // ["bar","baz","bam"] +``` + +If `slice()` is called without any other parameters, as it effectively is in the above snippet, the default values for its parameters have the effect of duplicating the `array` (or, in this case, `array`-like). + +As of ES6, there's also a built-in utility called `Array.from(..)` that can do the same task: + +```js +... +var arr = Array.from( arguments ); +... +``` + +**Note:** `Array.from(..)` has several powerful capabilities, and will be covered in detail in the *ES6 & Beyond* title of this series. + +## Strings + +It's a very common belief that `string`s are essentially just `array`s of characters. While the implementation under the covers may or may not use `array`s, it's important to realize that JavaScript `string`s are really not the same as `array`s of characters. The similarity is mostly just skin-deep. 
+ +For example, let's consider these two values: + +```js +var a = "foo"; +var b = ["f","o","o"]; +``` + +Strings do have a shallow resemblance to `array`s -- `array`-likes, as above -- for instance, both of them having a `length` property, an `indexOf(..)` method (`array` version only as of ES5), and a `concat(..)` method: + +```js +a.length; // 3 +b.length; // 3 + +a.indexOf( "o" ); // 1 +b.indexOf( "o" ); // 1 + +var c = a.concat( "bar" ); // "foobar" +var d = b.concat( ["b","a","r"] ); // ["f","o","o","b","a","r"] + +a === c; // false +b === d; // false + +a; // "foo" +b; // ["f","o","o"] +``` + +So, they're both basically just "arrays of characters", right? **Not exactly**: + +```js +a[1] = "O"; +b[1] = "O"; + +a; // "foo" +b; // ["f","O","o"] +``` + +JavaScript `string`s are immutable, while `array`s are quite mutable. Moreover, the `a[1]` character position access form was not always widely valid JavaScript. Older versions of IE did not allow that syntax (but now they do). Instead, the *correct* approach has been `a.charAt(1)`. + +A further consequence of immutable `string`s is that none of the `string` methods that alter its contents can modify in-place, but rather must create and return new `string`s. By contrast, many of the methods that change `array` contents actually *do* modify in-place. + +```js +c = a.toUpperCase(); +a === c; // false +a; // "foo" +c; // "FOO" + +b.push( "!" ); +b; // ["f","O","o","!"] +``` + +Also, many of the `array` methods that could be helpful when dealing with `string`s are not actually available for them, but we can "borrow" non-mutation `array` methods against our `string`: + +```js +a.join; // undefined +a.map; // undefined + +var c = Array.prototype.join.call( a, "-" ); +var d = Array.prototype.map.call( a, function(v){ + return v.toUpperCase() + "."; +} ).join( "" ); + +c; // "f-o-o" +d; // "F.O.O." +``` + +Let's take another example: reversing a `string` (incidentally, a common JavaScript interview trivia question!). 
`array`s have a `reverse()` in-place mutator method, but `string`s do not: + +```js +a.reverse; // undefined + +b.reverse(); // ["!","o","O","f"] +b; // ["!","o","O","f"] +``` + +Unfortunately, this "borrowing" doesn't work with `array` mutators, because `string`s are immutable and thus can't be modified in place: + +```js +Array.prototype.reverse.call( a ); +// still returns a String object wrapper (see Chapter 3) +// for "foo" :( +``` + +Another workaround (aka hack) is to convert the `string` into an `array`, perform the desired operation, then convert it back to a `string`. + +```js +var c = a + // split `a` into an array of characters + .split( "" ) + // reverse the array of characters + .reverse() + // join the array of characters back to a string + .join( "" ); + +c; // "oof" +``` + +If that feels ugly, it is. Nevertheless, *it works* for simple `string`s, so if you need something quick-n-dirty, often such an approach gets the job done. + +**Warning:** Be careful! This approach **doesn't work** for `string`s with complex (unicode) characters in them (astral symbols, multibyte characters, etc.). You need more sophisticated library utilities that are unicode-aware for such operations to be handled accurately. Consult Mathias Bynens' work on the subject: *Esrever* (https://github.com/mathiasbynens/esrever). + +The other way to look at this is: if you are more commonly doing tasks on your "strings" that treat them as basically *arrays of characters*, perhaps it's better to just actually store them as `array`s rather than as `string`s. You'll probably save yourself a lot of hassle of converting from `string` to `array` each time. You can always call `join("")` on the `array` *of characters* whenever you actually need the `string` representation. + +## Numbers + +JavaScript has just one numeric type: `number`. This type includes both "integer" values and fractional decimal numbers. 
I say "integer" in quotes because it's long been a criticism of JS that there are not true integers, as there are in other languages. That may change at some point in the future, but for now, we just have `number`s for everything. + +So, in JS, an "integer" is just a value that has no fractional decimal value. That is, `42.0` is as much an "integer" as `42`. + +Like most modern languages, including practically all scripting languages, the implementation of JavaScript's `number`s is based on the "IEEE 754" standard, often called "floating-point." JavaScript specifically uses the "double precision" format (aka "64-bit binary") of the standard. + +There are many great write-ups on the Web about the nitty-gritty details of how binary floating-point numbers are stored in memory, and the implications of those choices. Because understanding bit patterns in memory is not strictly necessary to understand how to correctly use `number`s in JS, we'll leave it as an exercise for the interested reader if you'd like to dig further into IEEE 754 details. + +### Numeric Syntax + +Number literals are expressed in JavaScript generally as base-10 decimal literals. For example: + +```js +var a = 42; +var b = 42.3; +``` + +The leading portion of a decimal value, if `0`, is optional: + +```js +var a = 0.42; +var b = .42; +``` + +Similarly, the trailing portion (the fractional) of a decimal value after the `.`, if `0`, is optional: + +```js +var a = 42.0; +var b = 42.; +``` + +**Warning:** `42.` is pretty uncommon, and probably not a great idea if you're trying to avoid confusion when other people read your code. But it is, nevertheless, valid. + +By default, most `number`s will be outputted as base-10 decimals, with trailing fractional `0`s removed. 
So: + +```js +var a = 42.300; +var b = 42.0; + +a; // 42.3 +b; // 42 +``` + +Very large or very small `number`s will by default be outputted in exponent form, the same as the output of the `toExponential()` method, like: + +```js +var a = 5E10; +a; // 50000000000 +a.toExponential(); // "5e+10" + +var b = a * a; +b; // 2.5e+21 + +var c = 1 / a; +c; // 2e-11 +``` + +Because `number` values can be boxed with the `Number` object wrapper (see Chapter 3), `number` values can access methods that are built into the `Number.prototype` (see Chapter 3). For example, the `toFixed(..)` method allows you to specify how many fractional decimal places you'd like the value to be represented with: + +```js +var a = 42.59; + +a.toFixed( 0 ); // "43" +a.toFixed( 1 ); // "42.6" +a.toFixed( 2 ); // "42.59" +a.toFixed( 3 ); // "42.590" +a.toFixed( 4 ); // "42.5900" +``` + +Notice that the output is actually a `string` representation of the `number`, and that the value is `0`-padded on the right-hand side if you ask for more decimals than the value holds. + +`toPrecision(..)` is similar, but specifies how many *significant digits* should be used to represent the value: + +```js +var a = 42.59; + +a.toPrecision( 1 ); // "4e+1" +a.toPrecision( 2 ); // "43" +a.toPrecision( 3 ); // "42.6" +a.toPrecision( 4 ); // "42.59" +a.toPrecision( 5 ); // "42.590" +a.toPrecision( 6 ); // "42.5900" +``` + +You don't have to use a variable with the value in it to access these methods; you can access these methods directly on `number` literals. But you have to be careful with the `.` operator. Since `.` is a valid numeric character, it will first be interpreted as part of the `number` literal, if possible, instead of being interpreted as a property accessor. 
+ +```js +// invalid syntax: +42.toFixed( 3 ); // SyntaxError + +// these are all valid: +(42).toFixed( 3 ); // "42.000" +0.42.toFixed( 3 ); // "0.420" +42..toFixed( 3 ); // "42.000" +``` + +`42.toFixed(3)` is invalid syntax, because the `.` is swallowed up as part of the `42.` literal (which is valid -- see above!), and so then there's no `.` property operator present to make the `.toFixed` access. + +`42..toFixed(3)` works because the first `.` is part of the `number` and the second `.` is the property operator. But it probably looks strange, and indeed it's very rare to see something like that in actual JavaScript code. In fact, it's pretty uncommon to access methods directly on any of the primitive values. Uncommon doesn't mean *bad* or *wrong*. + +**Note:** There are libraries that extend the built-in `Number.prototype` (see Chapter 3) to provide extra operations on/with `number`s, and so in those cases, it's perfectly valid to use something like `10..makeItRain()` to set off a 10-second money raining animation, or something else silly like that. + +This is also technically valid (notice the space): + +```js +42 .toFixed(3); // "42.000" +``` + +However, with the `number` literal specifically, **this is particularly confusing coding style** and will serve no other purpose but to confuse other developers (and your future self). Avoid it. + +`number`s can also be specified in exponent form, which is common when representing larger `number`s, such as: + +```js +var onethousand = 1E3; // means 1 * 10^3 +var onemilliononehundredthousand = 1.1E6; // means 1.1 * 10^6 +``` + +`number` literals can also be expressed in other bases, like binary, octal, and hexadecimal. + +These formats work in current versions of JavaScript: + +```js +0xf3; // hexadecimal for: 243 +0Xf3; // ditto + +0363; // octal for: 243 +``` + +**Note:** Starting with ES6 + `strict` mode, the `0363` form of octal literals is no longer allowed (see below for the new form). 
The `0363` form is still allowed in non-`strict` mode, but you should stop using it anyway, to be future-friendly (and because you should be using `strict` mode by now!). + +As of ES6, the following new forms are also valid: + +```js +0o363; // octal for: 243 +0O363; // ditto + +0b11110011; // binary for: 243 +0B11110011; // ditto +``` + +Please do your fellow developers a favor: never use the `0O363` form. `0` next to capital `O` is just asking for confusion. Always use the lowercase prefixes `0x`, `0b`, and `0o`. + +### Small Decimal Values + +The most (in)famous side effect of using binary floating-point numbers (which, remember, is true of **all** languages that use IEEE 754 -- not *just* JavaScript as many assume/pretend) is: + +```js +0.1 + 0.2 === 0.3; // false +``` + +Mathematically, we know that statement should be `true`. Why is it `false`? + +Simply put, the representations for `0.1` and `0.2` in binary floating-point are not exact, so when they are added, the result is not exactly `0.3`. It's **really** close: `0.30000000000000004`, but if your comparison fails, "close" is irrelevant. + +**Note:** Should JavaScript switch to a different `number` implementation that has exact representations for all values? Some think so. There have been many alternatives presented over the years. None of them have been accepted yet, and perhaps none ever will be. As easy as it may seem to just wave a hand and say, "fix that bug already!", it's not nearly that easy. If it were, it most definitely would have been changed a long time ago. + +Now, the question is, if some `number`s can't be *trusted* to be exact, does that mean we can't use `number`s at all? **Of course not.** + +There are some applications where you need to be more careful, especially when dealing with fractional decimal values. There are also plenty of (maybe most?) applications that only deal with whole numbers ("integers"), and moreover, only deal with numbers in the millions or trillions at maximum. 
These applications have been, and always will be, **perfectly safe** to use numeric operations in JS. + +What if we *did* need to compare two `number`s, like `0.1 + 0.2` to `0.3`, knowing that the simple equality test fails? + +The most commonly accepted practice is to use a tiny "rounding error" value as the *tolerance* for comparison. This tiny value is often called "machine epsilon," which is commonly `2^-52` (`2.220446049250313e-16`) for the kind of `number`s in JavaScript. + +As of ES6, `Number.EPSILON` is predefined with this tolerance value, so you'd want to use it, but you can safely polyfill the definition for pre-ES6: + +```js +if (!Number.EPSILON) { + Number.EPSILON = Math.pow(2,-52); +} +``` + +We can use this `Number.EPSILON` to compare two `number`s for "equality" (within the rounding error tolerance): + +```js +function numbersCloseEnoughToEqual(n1,n2) { + return Math.abs( n1 - n2 ) < Number.EPSILON; +} + +var a = 0.1 + 0.2; +var b = 0.3; + +numbersCloseEnoughToEqual( a, b ); // true +numbersCloseEnoughToEqual( 0.0000001, 0.0000002 ); // false +``` + +The maximum floating-point value that can be represented is roughly `1.798e+308` (which is really, really, really huge!), predefined for you as `Number.MAX_VALUE`. On the small end, `Number.MIN_VALUE` is roughly `5e-324`, which isn't negative but is really close to zero! + +### Safe Integer Ranges + +Because of how `number`s are represented, there is a range of "safe" values for the whole `number` "integers", and it's significantly less than `Number.MAX_VALUE`. + +The maximum integer that can "safely" be represented (that is, there's a guarantee that the requested value is actually representable unambiguously) is `2^53 - 1`, which is `9007199254740991`. If you insert your commas, you'll see that this is just over 9 quadrillion. So that's pretty darn big for `number`s to range up to. + +This value is actually automatically predefined in ES6, as `Number.MAX_SAFE_INTEGER`. 
Unsurprisingly, there's a minimum value, `-9007199254740991`, and it's defined in ES6 as `Number.MIN_SAFE_INTEGER`. + +The main way that JS programs are confronted with dealing with such large numbers is when dealing with 64-bit IDs from databases, etc. 64-bit numbers cannot be represented accurately with the `number` type, so must be stored in (and transmitted to/from) JavaScript using `string` representation. + +Numeric operations on such large ID `number` values (besides comparison, which will be fine with `string`s) aren't all that common, thankfully. But if you *do* need to perform math on these very large values, for now you'll need to use a *big number* utility. Big numbers may get official support in a future version of JavaScript. + +### Testing for Integers + +To test if a value is an integer, you can use the ES6-specified `Number.isInteger(..)`: + +```js +Number.isInteger( 42 ); // true +Number.isInteger( 42.000 ); // true +Number.isInteger( 42.3 ); // false +``` + +To polyfill `Number.isInteger(..)` for pre-ES6: + +```js +if (!Number.isInteger) { + Number.isInteger = function(num) { + return typeof num == "number" && num % 1 == 0; + }; +} +``` + +To test if a value is a *safe integer*, use the ES6-specified `Number.isSafeInteger(..)`: + +```js +Number.isSafeInteger( Number.MAX_SAFE_INTEGER ); // true +Number.isSafeInteger( Math.pow( 2, 53 ) ); // false +Number.isSafeInteger( Math.pow( 2, 53 ) - 1 ); // true +``` + +To polyfill `Number.isSafeInteger(..)` in pre-ES6 browsers: + +```js +if (!Number.isSafeInteger) { + Number.isSafeInteger = function(num) { + return Number.isInteger( num ) && + Math.abs( num ) <= Number.MAX_SAFE_INTEGER; + }; +} +``` + +### 32-bit (Signed) Integers + +While integers can range up to roughly 9 quadrillion safely (53 bits), there are some numeric operations (like the bitwise operators) that are only defined for 32-bit `number`s, so the "safe range" for `number`s used in that way must be much smaller. 
+ +The range then is `Math.pow(-2,31)` (`-2147483648`, about -2.1 billion) up to `Math.pow(2,31)-1` (`2147483647`, about +2.1 billion). + +To force a `number` value in `a` to a 32-bit signed integer value, use `a | 0`. This works because the `|` bitwise operator only works for 32-bit integer values (meaning it can only pay attention to 32 bits and any other bits will be lost). Then, "or'ing" with zero is essentially a no-op bitwise speaking. + +**Note:** Certain special values (which we will cover in the next section) such as `NaN` and `Infinity` are not "32-bit safe," in that those values when passed to a bitwise operator will pass through the abstract operation `ToInt32` (see Chapter 4) and become simply the `+0` value for the purpose of that bitwise operation. + +## Special Values + +There are several special values spread across the various types that the *alert* JS developer needs to be aware of, and use properly. + +### The Non-value Values + +For the `undefined` type, there is one and only one value: `undefined`. For the `null` type, there is one and only one value: `null`. So for both of them, the label is both its type and its value. + +Both `undefined` and `null` are often taken to be interchangeable as either "empty" values or "non" values. Other developers prefer to distinguish between them with nuance. For example: + +* `null` is an empty value +* `undefined` is a missing value + +Or: + +* `undefined` hasn't had a value yet +* `null` had a value and doesn't anymore + +Regardless of how you choose to "define" and use these two values, `null` is a special keyword, not an identifier, and thus you cannot treat it as a variable to assign to (why would you!?). However, `undefined` *is* (unfortunately) an identifier. Uh oh. + +### Undefined + +In non-`strict` mode, it's actually possible (though incredibly ill-advised!) to assign a value to the globally provided `undefined` identifier: + +```js +function foo() { + undefined = 2; // really bad idea! 
+} + +foo(); +``` + +```js +function foo() { + "use strict"; + undefined = 2; // TypeError! +} + +foo(); +``` + +In both non-`strict` mode and `strict` mode, however, you can create a local variable of the name `undefined`. But again, this is a terrible idea! + +```js +function foo() { + "use strict"; + var undefined = 2; + console.log( undefined ); // 2 +} + +foo(); +``` + +**Friends don't let friends override `undefined`.** Ever. + +#### `void` Operator + +While `undefined` is a built-in identifier that holds (unless modified -- see above!) the built-in `undefined` value, another way to get this value is the `void` operator. + +The expression `void ___` "voids" out any value, so that the result of the expression is always the `undefined` value. It doesn't modify the existing value; it just ensures that no value comes back from the operator expression. + +```js +var a = 42; + +console.log( void a, a ); // undefined 42 +``` + +By convention (mostly from C-language programming), to represent the `undefined` value stand-alone by using `void`, you'd use `void 0` (though clearly even `void true` or any other `void` expression does the same thing). There's no practical difference between `void 0`, `void 1`, and `undefined`. + +But the `void` operator can be useful in a few other circumstances, if you need to ensure that an expression has no result value (even if it has side effects). + +For example: + +```js +function doSomething() { + // note: `APP.ready` is provided by our application + if (!APP.ready) { + // try again later + return void setTimeout( doSomething, 100 ); + } + + var result; + + // do some other stuff + return result; +} + +// were we able to do it right away? 
+if (doSomething()) { + // handle next tasks right away +} +``` + +Here, the `setTimeout(..)` function returns a numeric value (the unique identifier of the timer interval, if you wanted to cancel it), but we want to `void` that out so that the return value of our function doesn't give a false-positive with the `if` statement. + +Many devs prefer to just do these actions separately, which works the same but doesn't use the `void` operator: + +```js +if (!APP.ready) { + // try again later + setTimeout( doSomething, 100 ); + return; +} +``` + +In general, if there's ever a place where a value exists (from some expression) and you'd find it useful for the value to be `undefined` instead, use the `void` operator. That probably won't be terribly common in your programs, but in the rare cases you do need it, it can be quite helpful. + +### Special Numbers + +The `number` type includes several special values. We'll take a look at each in detail. + +#### The Not Number, Number + +Any mathematic operation you perform without both operands being `number`s (or values that can be interpreted as regular `number`s in base 10 or base 16) will result in the operation failing to produce a valid `number`, in which case you will get the `NaN` value. + +`NaN` literally stands for "not a `number`", though this label/description is very poor and misleading, as we'll see shortly. It would be much more accurate to think of `NaN` as being "invalid number," "failed number," or even "bad number," than to think of it as "not a number." + +For example: + +```js +var a = 2 / "foo"; // NaN + +typeof a === "number"; // true +``` + +In other words: "the type of not-a-number is 'number'!" Hooray for confusing names and semantics. + +`NaN` is a kind of "sentinel value" (an otherwise normal value that's assigned a special meaning) that represents a special kind of error condition within the `number` set. 
The error condition is, in essence: "I tried to perform a mathematic operation but failed, so here's the failed `number` result instead." + +So, if you have a value in some variable and want to test to see if it's this special failed-number `NaN`, you might think you could directly compare to `NaN` itself, as you can with any other value, like `null` or `undefined`. Nope. + +```js +var a = 2 / "foo"; + +a == NaN; // false +a === NaN; // false +``` + +`NaN` is a very special value in that it's never equal to another `NaN` value (i.e., it's never equal to itself). It's the only value, in fact, that is not reflexive (without the Identity characteristic `x === x`). So, `NaN !== NaN`. A bit strange, huh? + +So how *do* we test for it, if we can't compare to `NaN` (since that comparison would always fail)? + +```js +var a = 2 / "foo"; + +isNaN( a ); // true +``` + +Easy enough, right? We use the built-in global utility called `isNaN(..)` and it tells us if the value is `NaN` or not. Problem solved! + +Not so fast. + +The `isNaN(..)` utility has a fatal flaw. It appears it tried to take the meaning of `NaN` ("Not a Number") too literally -- that its job is basically: "test if the thing passed in is either not a `number` or is a `number`." But that's not quite accurate. + +```js +var a = 2 / "foo"; +var b = "foo"; + +a; // NaN +b; // "foo" + +window.isNaN( a ); // true +window.isNaN( b ); // true -- ouch! +``` + +Clearly, `"foo"` is literally *not a `number`*, but it's definitely not the `NaN` value either! This bug has been in JS since the very beginning (over 19 years of *ouch*). + +As of ES6, finally a replacement utility has been provided: `Number.isNaN(..)`. 
A simple polyfill for it so that you can safely check `NaN` values *now* even in pre-ES6 browsers is: + +```js +if (!Number.isNaN) { + Number.isNaN = function(n) { + return ( + typeof n === "number" && + window.isNaN( n ) + ); + }; +} + +var a = 2 / "foo"; +var b = "foo"; + +Number.isNaN( a ); // true +Number.isNaN( b ); // false -- phew! +``` + +Actually, we can implement a `Number.isNaN(..)` polyfill even easier, by taking advantage of that peculiar fact that `NaN` isn't equal to itself. `NaN` is the *only* value in the whole language where that's true; every other value is always **equal to itself**. + +So: + +```js +if (!Number.isNaN) { + Number.isNaN = function(n) { + return n !== n; + }; +} +``` + +Weird, huh? But it works! + +`NaN`s are probably a reality in a lot of real-world JS programs, either on purpose or by accident. It's a really good idea to use a reliable test, like `Number.isNaN(..)` as provided (or polyfilled), to recognize them properly. + +If you're currently using just `isNaN(..)` in a program, the sad reality is your program *has a bug*, even if you haven't been bitten by it yet! + +#### Infinities + +Developers from traditional compiled languages like C are probably used to seeing either a compiler error or runtime exception, like "Divide by zero," for an operation like: + +```js +var a = 1 / 0; +``` + +However, in JS, this operation is well-defined and results in the value `Infinity` (aka `Number.POSITIVE_INFINITY`). Unsurprisingly: + +```js +var a = 1 / 0; // Infinity +var b = -1 / 0; // -Infinity +``` + +As you can see, `-Infinity` (aka `Number.NEGATIVE_INFINITY`) results from a divide-by-zero where either (but not both!) of the divide operands is negative. + +JS uses finite numeric representations (IEEE 754 floating-point, which we covered earlier), so contrary to pure mathematics, it seems it *is* possible to overflow even with an operation like addition or subtraction, in which case you'd get `Infinity` or `-Infinity`. 
+ +For example: + +```js +var a = Number.MAX_VALUE; // 1.7976931348623157e+308 +a + a; // Infinity +a + Math.pow( 2, 970 ); // Infinity +a + Math.pow( 2, 969 ); // 1.7976931348623157e+308 +``` + +According to the specification, if an operation like addition results in a value that's too big to represent, the IEEE 754 "round-to-nearest" mode specifies what the result should be. So, in a crude sense, `Number.MAX_VALUE + Math.pow( 2, 969 )` is closer to `Number.MAX_VALUE` than to `Infinity`, so it "rounds down," whereas `Number.MAX_VALUE + Math.pow( 2, 970 )` is closer to `Infinity` so it "rounds up". + +If you think too much about that, it's going to make your head hurt. So don't. Seriously, stop! + +Once you overflow to either one of the *infinities*, however, there's no going back. In other words, in an almost poetic sense, you can go from finite to infinite but not from infinite back to finite. + +It's almost philosophical to ask: "What is infinity divided by infinity?" Our naive brains would likely say "1" or maybe "infinity." Turns out neither is true. Both mathematically and in JavaScript, `Infinity / Infinity` is not a defined operation. In JS, this results in `NaN`. + +But what about any positive finite `number` divided by `Infinity`? That's easy! `0`. And what about a negative finite `number` divided by `Infinity`? Keep reading! + +#### Zeros + +While it may confuse the mathematics-minded reader, JavaScript has both a normal zero `0` (otherwise known as a positive zero `+0`) *and* a negative zero `-0`. Before we explain why the `-0` exists, we should examine how JS handles it, because it can be quite confusing. + +Besides being specified literally as `-0`, negative zero also results from certain mathematical operations. For example: + +```js +var a = 0 / -3; // -0 +var b = 0 * -3; // -0 +``` + +Addition and subtraction of ordinary nonzero values cannot result in a negative zero (`1 - 1` is `0`, not `-0`); they produce `-0` only in edge cases where the operands are already zeros, such as `-0 + -0` or `-0 - 0`. 
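To see which arithmetic operations actually yield a negative zero, one option is to divide `1` by the result and compare against `-Infinity`. A rough sketch (the helper name `yieldsNegZero` is made up for this illustration); note that addition and subtraction only get there when the operands are already zeros of the right sign:

```js
// hypothetical helper: true only for `-0`
function yieldsNegZero(n) {
    return n === 0 && 1 / n === -Infinity;
}

yieldsNegZero( 0 / -3 );    // true
yieldsNegZero( 0 * -3 );    // true

yieldsNegZero( 1 - 1 );     // false
yieldsNegZero( -1 + 1 );    // false
yieldsNegZero( 0 - 0 );     // false

// edge cases: zeros in, negative zero out
yieldsNegZero( -0 + -0 );   // true
yieldsNegZero( -0 - 0 );    // true
```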
+ +A negative zero when examined in the developer console will usually reveal `-0`, though that was not the common case until fairly recently, so some older browsers you encounter may still report it as `0`. + +However, if you try to stringify a negative zero value, it will always be reported as `"0"`, according to the spec. + +```js +var a = 0 / -3; + +// (some browser) consoles at least get it right +a; // -0 + +// but the spec insists on lying to you! +a.toString(); // "0" +a + ""; // "0" +String( a ); // "0" + +// strangely, even JSON gets in on the deception +JSON.stringify( a ); // "0" +``` + +Interestingly, the reverse operations (going from `string` to `number`) don't lie: + +```js ++"-0"; // -0 +Number( "-0" ); // -0 +JSON.parse( "-0" ); // -0 +``` + +**Warning:** The `JSON.stringify( -0 )` behavior of `"0"` is particularly strange when you observe that it's inconsistent with the reverse: `JSON.parse( "-0" )` reports `-0` as you'd correctly expect. + +In addition to stringification of negative zero being deceptive to hide its true value, the comparison operators are also (intentionally) configured to *lie*. + +```js +var a = 0; +var b = 0 / -3; + +a == b; // true +-0 == 0; // true + +a === b; // true +-0 === 0; // true + +0 > -0; // false +a > b; // false +``` + +Clearly, if you want to distinguish a `-0` from a `0` in your code, you can't just rely on what the developer console outputs, so you're going to have to be a bit more clever: + +```js +function isNegZero(n) { + n = Number( n ); + return (n === 0) && (1 / n === -Infinity); +} + +isNegZero( -0 ); // true +isNegZero( 0 / -3 ); // true +isNegZero( 0 ); // false +``` + +Now, why do we need a negative zero, besides academic trivia? + +There are certain applications where developers use the magnitude of a value to represent one piece of information (like speed of movement per animation frame) and the sign of that `number` to represent another piece of information (like the direction of that movement). 
+ +In those applications, as one example, if a variable arrives at zero and it loses its sign, then you would lose the information of what direction it was moving in before it arrived at zero. Preserving the sign of the zero prevents potentially unwanted information loss. + +### Special Equality + +As we saw above, the `NaN` value and the `-0` value have special behavior when it comes to equality comparison. `NaN` is never equal to itself, so you have to use ES6's `Number.isNaN(..)` (or a polyfill). Similarly, `-0` lies and pretends that it's equal (even `===` strict equal -- see Chapter 4) to regular positive `0`, so you have to use the somewhat hackish `isNegZero(..)` utility we suggested above. + +As of ES6, there's a new utility that can be used to test two values for absolute equality, without any of these exceptions. It's called `Object.is(..)`: + +```js +var a = 2 / "foo"; +var b = -3 * 0; + +Object.is( a, NaN ); // true +Object.is( b, -0 ); // true + +Object.is( b, 0 ); // false +``` + +There's a pretty simple polyfill for `Object.is(..)` for pre-ES6 environments: + +```js +if (!Object.is) { + Object.is = function(v1, v2) { + // test for `-0` + if (v1 === 0 && v2 === 0) { + return 1 / v1 === 1 / v2; + } + // test for `NaN` + if (v1 !== v1) { + return v2 !== v2; + } + // everything else + return v1 === v2; + }; +} +``` + +`Object.is(..)` probably shouldn't be used in cases where `==` or `===` are known to be *safe* (see Chapter 4 "Coercion"), as the operators are likely much more efficient and certainly are more idiomatic/common. `Object.is(..)` is mostly for these special cases of equality. + +## Value vs. Reference + +In many other languages, values can either be assigned/passed by value-copy or by reference-copy depending on the syntax you use. 
+ +For example, in C++ if you want to pass a `number` variable into a function and have that variable's value updated, you can declare the function parameter like `int& myNum`, and when you pass in a variable like `x`, `myNum` will be a **reference to `x`**; references are like a special form of pointers, where you obtain a pointer to another variable (like an *alias*). If you don't declare a reference parameter, the value passed in will *always* be copied, even if it's a complex object. + +In JavaScript, there are no pointers, and references work a bit differently. You cannot have a reference from one JS variable to another variable. That's just not possible. + +A reference in JS points at a (shared) **value**, so if you have 10 different references, they are all always distinct references to a single shared value; **none of them are references/pointers to each other.** + +Moreover, in JavaScript, there are no syntactic hints that control value vs. reference assignment/passing. Instead, the *type* of the value *solely* controls whether that value will be assigned by value-copy or by reference-copy. + +Let's illustrate: + +```js +var a = 2; +var b = a; // `b` is always a copy of the value in `a` +b++; +a; // 2 +b; // 3 + +var c = [1,2,3]; +var d = c; // `d` is a reference to the shared `[1,2,3]` value +d.push( 4 ); +c; // [1,2,3,4] +d; // [1,2,3,4] +``` + +Simple values (aka scalar primitives) are *always* assigned/passed by value-copy: `null`, `undefined`, `string`, `number`, `boolean`, and ES6's `symbol`. + +Compound values -- `object`s (including `array`s, and all boxed object wrappers -- see Chapter 3) and `function`s -- *always* create a copy of the reference on assignment or passing. + +In the above snippet, because `2` is a scalar primitive, `a` holds one initial copy of that value, and `b` is assigned another *copy* of the value. When changing `b`, you are in no way changing the value in `a`. 
+ +But **both `c` and `d`** are separate references to the same shared value `[1,2,3]`, which is a compound value. It's important to note that neither `c` nor `d` more "owns" the `[1,2,3]` value -- both are just equal peer references to the value. So, when using either reference to modify (`.push(4)`) the actual shared `array` value itself, it's affecting just the one shared value, and both references will reference the newly modified value `[1,2,3,4]`. + +Since references point to the values themselves and not to the variables, you cannot use one reference to change where another reference is pointed: + +```js +var a = [1,2,3]; +var b = a; +a; // [1,2,3] +b; // [1,2,3] + +// later +b = [4,5,6]; +a; // [1,2,3] +b; // [4,5,6] +``` + +When we make the assignment `b = [4,5,6]`, we are doing absolutely nothing to affect *where* `a` is still referencing (`[1,2,3]`). To do that, `b` would have to be a pointer to `a` rather than a reference to the `array` -- but no such capability exists in JS! + +The most common way such confusion happens is with function parameters: + +```js +function foo(x) { + x.push( 4 ); + x; // [1,2,3,4] + + // later + x = [4,5,6]; + x.push( 7 ); + x; // [4,5,6,7] +} + +var a = [1,2,3]; + +foo( a ); + +a; // [1,2,3,4] not [4,5,6,7] +``` + +When we pass in the argument `a`, it assigns a copy of the `a` reference to `x`. `x` and `a` are separate references pointing at the same `[1,2,3]` value. Now, inside the function, we can use that reference to mutate the value itself (`push(4)`). But when we make the assignment `x = [4,5,6]`, this is in no way affecting where the initial reference `a` is pointing -- still points at the (now modified) `[1,2,3,4]` value. + +There is no way to use the `x` reference to change where `a` is pointing. We could only modify the contents of the shared value that both `a` and `x` are pointing to. 
+ +To accomplish changing `a` to have the `[4,5,6,7]` value contents, you can't create a new `array` and assign -- you must modify the existing `array` value: + +```js +function foo(x) { + x.push( 4 ); + x; // [1,2,3,4] + + // later + x.length = 0; // empty existing array in-place + x.push( 4, 5, 6, 7 ); + x; // [4,5,6,7] +} + +var a = [1,2,3]; + +foo( a ); + +a; // [4,5,6,7] not [1,2,3,4] +``` + +As you can see, `x.length = 0` and `x.push(4,5,6,7)` were not creating a new `array`, but modifying the existing shared `array`. So of course, `a` references the new `[4,5,6,7]` contents. + +Remember: you cannot directly control/override value-copy vs. reference -- those semantics are controlled entirely by the type of the underlying value. + +To effectively pass a compound value (like an `array`) by value-copy, you need to manually make a copy of it, so that the reference passed doesn't still point to the original. For example: + +```js +foo( a.slice() ); +``` + +`slice(..)` with no parameters by default makes an entirely new (shallow) copy of the `array`. So, we pass in a reference only to the copied `array`, and thus `foo(..)` cannot affect the contents of `a`. + +To do the reverse -- pass a scalar primitive value in a way where its value updates can be seen, kinda like a reference -- you have to wrap the value in another compound value (`object`, `array`, etc) that *can* be passed by reference-copy: + +```js +function foo(wrapper) { + wrapper.a = 42; +} + +var obj = { + a: 2 +}; + +foo( obj ); + +obj.a; // 42 +``` + +Here, `obj` acts as a wrapper for the scalar primitive property `a`. When passed to `foo(..)`, a copy of the `obj` reference is passed in and set to the `wrapper` parameter. We now can use the `wrapper` reference to access the shared object, and update its property. After the function finishes, `obj.a` will see the updated value `42`. 
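One caveat about the `a.slice()` approach is worth a quick sketch: the copy is shallow, so only the outer `array` is duplicated, and any compound values stored inside it are still shared by reference-copy. The `addAndTweak(..)` function and the `original` array below are hypothetical names used purely for illustration.

```js
// Hypothetical helper (illustration only): mutates whatever it is given.
function addAndTweak(list) {
    list.push( "new" );          // affects only the array `list` points at
    list[0].label = "changed";   // affects an object shared with the original
}

var original = [ { label: "first" }, { label: "second" } ];

// pass a shallow copy, as with `foo( a.slice() )`
addAndTweak( original.slice() );

original.length;      // 2 -- the copy absorbed the `push(..)`
original[0].label;    // "changed" -- the nested object was never copied
```

If the nested values also need protecting, each level has to be copied explicitly; `slice()` alone only guards the outermost `array`.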
+ +It may occur to you that if you wanted to pass in a reference to a scalar primitive value like `2`, you could just box the value in its `Number` object wrapper (see Chapter 3). + +It *is* true a copy of the reference to this `Number` object *will* be passed to the function, but unfortunately, having a reference to the shared object is not going to give you the ability to modify the shared primitive value, like you may expect: + +```js +function foo(x) { + x = x + 1; + x; // 3 +} + +var a = 2; +var b = new Number( a ); // or equivalently `Object(a)` + +foo( b ); +console.log( b ); // 2, not 3 +``` + +The problem is that the underlying scalar primitive value is *not mutable* (same goes for `String` and `Boolean`). If a `Number` object holds the scalar primitive value `2`, that exact `Number` object can never be changed to hold another value; you can only create a whole new `Number` object with a different value. + +When `x` is used in the expression `x + 1`, the underlying scalar primitive value `2` is unboxed (extracted) from the `Number` object automatically, so the line `x = x + 1` very subtly changes `x` from being a shared reference to the `Number` object, to just holding the scalar primitive value `3` as a result of the addition operation `2 + 1`. Therefore, `b` on the outside still references the original unmodified/immutable `Number` object holding the value `2`. + +You *can* add properties on top of the `Number` object (just not change its inner primitive value), so you could exchange information indirectly via those additional properties. + +This is not all that common, however; it probably would not be considered a good practice by most developers. + +Instead of using the wrapper object `Number` in this way, it's probably much better to use the manual object wrapper (`obj`) approach in the earlier snippet. 
That's not to say that there are no clever uses for the boxed object wrappers like `Number` -- just that you should probably prefer the scalar primitive value form in most cases. + +References are quite powerful, but sometimes they get in your way, and sometimes you need them where they don't exist. The only control you have over reference vs. value-copy behavior is the type of the value itself, so you must indirectly influence the assignment/passing behavior by which value types you choose to use. + +## Review + +In JavaScript, `array`s are simply numerically indexed collections of any value-type. `string`s are somewhat "`array`-like", but they have distinct behaviors and care must be taken if you want to treat them as `array`s. Numbers in JavaScript include both "integers" and floating-point values. + +Several special values are defined within the primitive types. + +The `null` type has just one value: `null`, and likewise the `undefined` type has just the `undefined` value. `undefined` is basically the default value in any variable or property if no other value is present. The `void` operator lets you create the `undefined` value from any other value. + +`number`s include several special values, like `NaN` (supposedly "Not a Number", but really more appropriately "invalid number"); `+Infinity` and `-Infinity`; and `-0`. + +Simple scalar primitives (`string`s, `number`s, etc.) are assigned/passed by value-copy, but compound values (`object`s, etc.) are assigned/passed by reference-copy. References are not like references/pointers in other languages -- they're never pointed at other variables/references, only at the underlying values. 
diff --git a/types & grammar/ch3.md b/types & grammar/ch3.md new file mode 100644 index 0000000..15bff1c --- /dev/null +++ b/types & grammar/ch3.md @@ -0,0 +1,487 @@ +# You Don't Know JS: Types & Grammar +# Chapter 3: Natives + +Several times in Chapters 1 and 2, we alluded to various built-ins, usually called "natives," like `String` and `Number`. Let's examine those in detail now. + +Here's a list of the most commonly used natives: + +* `String()` +* `Number()` +* `Boolean()` +* `Array()` +* `Object()` +* `Function()` +* `RegExp()` +* `Date()` +* `Error()` +* `Symbol()` -- added in ES6! + +As you can see, these natives are actually built-in functions. + +If you're coming to JS from a language like Java, JavaScript's `String()` will look like the `String(..)` constructor you're used to for creating string values. So, you'll quickly observe that you can do things like: + +```js +var s = new String( "Hello World!" ); + +console.log( s.toString() ); // "Hello World!" +``` + +It *is* true that each of these natives can be used as a native constructor. But what's being constructed may be different than you think. + +```js +var a = new String( "abc" ); + +typeof a; // "object" ... not "String" + +a instanceof String; // true + +Object.prototype.toString.call( a ); // "[object String]" +``` + +The result of the constructor form of value creation (`new String("abc")`) is an object wrapper around the primitive (`"abc"`) value. + +Importantly, `typeof` shows that these objects are not their own special *types*, but more appropriately they are subtypes of the `object` type. + +This object wrapper can further be observed with: + +```js +console.log( a ); +``` + +The output of that statement varies depending on your browser, as developer consoles are free to choose however they feel it's appropriate to serialize the object for developer inspection. 
+ +**Note:** At the time of writing, the latest Chrome prints something like this: `String {0: "a", 1: "b", 2: "c", length: 3, [[PrimitiveValue]]: "abc"}`. But older versions of Chrome used to just print this: `String {0: "a", 1: "b", 2: "c"}`. The latest Firefox currently prints `String ["a","b","c"]`, but used to print `"abc"` in italics, which was clickable to open the object inspector. Of course, these results are subject to rapid change and your experience may vary. + +The point is, `new String("abc")` creates a string wrapper object around `"abc"`, not just the primitive `"abc"` value itself. + +## Internal `[[Class]]` + +Values that are `typeof` `"object"` (such as an array) are additionally tagged with an internal `[[Class]]` property (think of this more as an internal *class*ification rather than related to classes from traditional class-oriented coding). This property cannot be accessed directly, but can generally be revealed indirectly by borrowing the default `Object.prototype.toString(..)` method called against the value. For example: + +```js +Object.prototype.toString.call( [1,2,3] ); // "[object Array]" + +Object.prototype.toString.call( /regex-literal/i ); // "[object RegExp]" +``` + +So, for the array in this example, the internal `[[Class]]` value is `"Array"`, and for the regular expression, it's `"RegExp"`. In most cases, this internal `[[Class]]` value corresponds to the built-in native constructor (see below) that's related to the value, but that's not always the case. + +What about primitive values? First, `null` and `undefined`: + +```js +Object.prototype.toString.call( null ); // "[object Null]" +Object.prototype.toString.call( undefined ); // "[object Undefined]" +``` + +You'll note that there are no `Null()` or `Undefined()` native constructors, but nevertheless the `"Null"` and `"Undefined"` are the internal `[[Class]]` values exposed. 
+ +But for the other simple primitives like `string`, `number`, and `boolean`, another behavior actually kicks in, which is usually called "boxing" (see "Boxing Wrappers" section next): + +```js +Object.prototype.toString.call( "abc" ); // "[object String]" +Object.prototype.toString.call( 42 ); // "[object Number]" +Object.prototype.toString.call( true ); // "[object Boolean]" +``` + +In this snippet, each of the simple primitives are automatically boxed by their respective object wrappers, which is why `"String"`, `"Number"`, and `"Boolean"` are revealed as the respective internal `[[Class]]` values. + +**Note:** The behavior of `toString()` and `[[Class]]` as illustrated here has changed a bit from ES5 to ES6, but we cover those details in the *ES6 & Beyond* title of this series. + +## Boxing Wrappers + +These object wrappers serve a very important purpose. Primitive values don't have properties or methods, so to access `.length` or `.toString()` you need an object wrapper around the value. Thankfully, JS will automatically *box* (aka wrap) the primitive value to fulfill such accesses. + +```js +var a = "abc"; + +a.length; // 3 +a.toUpperCase(); // "ABC" +``` + +So, if you're going to be accessing these properties/methods on your string values regularly, like a `i < a.length` condition in a `for` loop for instance, it might seem to make sense to just have the object form of the value from the start, so the JS engine doesn't need to implicitly create it for you. + +But it turns out that's a bad idea. Browsers long ago performance-optimized the common cases like `.length`, which means your program will *actually go slower* if you try to "preoptimize" by directly using the object form (which isn't on the optimized path). + +In general, there's basically no reason to use the object form directly. It's better to just let the boxing happen implicitly where necessary. 
In other words, never do things like `new String("abc")`, `new Number(42)`, etc -- always prefer using the literal primitive values `"abc"` and `42`. + +### Object Wrapper Gotchas + +There are some gotchas with using the object wrappers directly that you should be aware of if you *do* choose to ever use them. + +For example, consider `Boolean` wrapped values: + +```js +var a = new Boolean( false ); + +if (!a) { + console.log( "Oops" ); // never runs +} +``` + +The problem is that you've created an object wrapper around the `false` value, but objects themselves are "truthy" (see Chapter 4), so using the object behaves oppositely to using the underlying `false` value itself, which is quite contrary to normal expectation. + +If you want to manually box a primitive value, you can use the `Object(..)` function (no `new` keyword): + +```js +var a = "abc"; +var b = new String( a ); +var c = Object( a ); + +typeof a; // "string" +typeof b; // "object" +typeof c; // "object" + +b instanceof String; // true +c instanceof String; // true + +Object.prototype.toString.call( b ); // "[object String]" +Object.prototype.toString.call( c ); // "[object String]" +``` + +Again, using the boxed object wrapper directly (like `b` and `c` above) is usually discouraged, but there may be some rare occasions you'll run into where they may be useful. + +## Unboxing + +If you have an object wrapper and you want to get the underlying primitive value out, you can use the `valueOf()` method: + +```js +var a = new String( "abc" ); +var b = new Number( 42 ); +var c = new Boolean( true ); + +a.valueOf(); // "abc" +b.valueOf(); // 42 +c.valueOf(); // true +``` + +Unboxing can also happen implicitly, when using an object wrapper value in a way that requires the primitive value. 
This process (coercion) will be covered in more detail in Chapter 4, but briefly: + +```js +var a = new String( "abc" ); +var b = a + ""; // `b` has the unboxed primitive value "abc" + +typeof a; // "object" +typeof b; // "string" +``` + +## Natives as Constructors + +For `array`, `object`, `function`, and regular-expression values, it's almost universally preferred that you use the literal form for creating the values, but the literal form creates the same sort of object as the constructor form does (that is, there is no nonwrapped value). + +Just as we've seen above with the other natives, these constructor forms should generally be avoided, unless you really know you need them, mostly because they introduce exceptions and gotchas that you probably don't really *want* to deal with. + +### `Array(..)` + +```js +var a = new Array( 1, 2, 3 ); +a; // [1, 2, 3] + +var b = [1, 2, 3]; +b; // [1, 2, 3] +``` + +**Note:** The `Array(..)` constructor does not require the `new` keyword in front of it. If you omit it, it will behave as if you have used it anyway. So `Array(1,2,3)` is the same outcome as `new Array(1,2,3)`. + +The `Array` constructor has a special form where if only one `number` argument is passed, instead of providing that value as *contents* of the array, it's taken as a length to "presize the array" (well, sorta). + +This is a terrible idea. Firstly, you can trip over that form accidentally, as it's easy to forget. + +But more importantly, there's no such thing as actually presizing the array. Instead, what you're creating is an otherwise empty array, but setting the `length` property of the array to the numeric value specified. + +An array that has no explicit values in its slots, but has a `length` property that *implies* the slots exist, is a weird exotic type of data structure in JS with some very strange and confusing behavior. 
The capability to create such a value comes purely from old, deprecated, historical functionalities ("array-like objects" like the `arguments` object). + +**Note:** An array with at least one "empty slot" in it is often called a "sparse array." + +It doesn't help matters that this is yet another example where browser developer consoles vary on how they represent such an object, which breeds more confusion. + +For example: + +```js +var a = new Array( 3 ); + +a.length; // 3 +a; +``` + +The serialization of `a` in Chrome is (at the time of writing): `[ undefined x 3 ]`. **This is really unfortunate.** It implies that there are three `undefined` values in the slots of this array, when in fact the slots do not exist (so-called "empty slots" -- also a bad name!). + +To visualize the difference, try this: + +```js +var a = new Array( 3 ); +var b = [ undefined, undefined, undefined ]; +var c = []; +c.length = 3; + +a; +b; +c; +``` + +**Note:** As you can see with `c` in this example, empty slots in an array can happen after creation of the array. Changing the `length` of an array to go beyond its number of actually-defined slot values, you implicitly introduce empty slots. In fact, you could even call `delete b[1]` in the above snippet, and it would introduce an empty slot into the middle of `b`. + +For `b` (in Chrome, currently), you'll find `[ undefined, undefined, undefined ]` as the serialization, as opposed to `[ undefined x 3 ]` for `a` and `c`. Confused? Yeah, so is everyone else. + +Worse than that, at the time of writing, Firefox reports `[ , , , ]` for `a` and `c`. Did you catch why that's so confusing? Look closely. Three commas implies four slots, not three slots like we'd expect. + +**What!?** Firefox puts an extra `,` on the end of their serialization here because as of ES5, trailing commas in lists (array values, property lists, etc.) are allowed (and thus dropped and ignored). 
So if you were to type in a `[ , , , ]` value into your program or the console, you'd actually get the underlying value that's like `[ , , ]` (that is, an array with three empty slots). This choice, while confusing if reading the developer console, is defended as instead making copy-n-paste behavior accurate. + +If you're shaking your head or rolling your eyes about now, you're not alone! Shrugs. + +Unfortunately, it gets worse. More than just confusing console output, `a` and `b` from the above code snippet actually behave the same in some cases **but differently in others**: + +```js +a.join( "-" ); // "--" +b.join( "-" ); // "--" + +a.map(function(v,i){ return i; }); // [ undefined x 3 ] +b.map(function(v,i){ return i; }); // [ 0, 1, 2 ] +``` + +**Ugh.** + +The `a.map(..)` call *fails* because the slots don't actually exist, so `map(..)` has nothing to iterate over. `join(..)` works differently. Basically, we can think of it implemented sort of like this: + +```js +function fakeJoin(arr,connector) { + var str = ""; + for (var i = 0; i < arr.length; i++) { + if (i > 0) { + str += connector; + } + if (arr[i] !== undefined) { + str += arr[i]; + } + } + return str; +} + +var a = new Array( 3 ); +fakeJoin( a, "-" ); // "--" +``` + +As you can see, `join(..)` works by just *assuming* the slots exist and looping up to the `length` value. Whatever `map(..)` does internally, it (apparently) doesn't make such an assumption, so the result from the strange "empty slots" array is unexpected and likely to cause failure. + +So, if you wanted to *actually* create an array of actual `undefined` values (not just "empty slots"), how could you do it (besides manually)? + +```js +var a = Array.apply( null, { length: 3 } ); +a; // [ undefined, undefined, undefined ] +``` + +Confused? Yeah. Here's roughly how it works. + +`apply(..)` is a utility available to all functions, which calls the function it's used with but in a special way. 
+ +The first argument is a `this` object binding (covered in the *this & Object Prototypes* title of this series), which we don't care about here, so we set it to `null`. The second argument is supposed to be an array (or something *like* an array -- aka an "array-like object"). The contents of this "array" are "spread" out as arguments to the function in question. + +So, `Array.apply(..)` is calling the `Array(..)` function and spreading out the values (of the `{ length: 3 }` object value) as its arguments. + +Inside of `apply(..)`, we can envision there's another `for` loop (kinda like `join(..)` from above) that goes from `0` up to, but not including, `length` (`3` in our case). + +For each index, it retrieves that key from the object. So if the array-object parameter was named `arr` internally inside of the `apply(..)` function, the property access would effectively be `arr[0]`, `arr[1]`, and `arr[2]`. Of course, none of those properties exist on the `{ length: 3 }` object value, so all three of those property accesses would return the value `undefined`. + +In other words, it ends up calling `Array(..)` basically like this: `Array(undefined,undefined,undefined)`, which is how we end up with an array filled with `undefined` values, and not just those (crazy) empty slots. + +While `Array.apply( null, { length: 3 } )` is a strange and verbose way to create an array filled with `undefined` values, it's **vastly** better and more reliable than what you get with the footgun'ish `Array(3)` empty slots. + +Bottom line: **never ever, under any circumstances**, should you intentionally create and use these exotic empty-slot arrays. Just don't do it. They're nuts. 
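To make the difference between empty slots and real `undefined` values concrete without relying on console output, here is a minimal sketch. The variable names `sparse`, `filled`, and `mapped` are illustrative only; the `in` operator and `Object.keys(..)` are standard ways to ask whether an index actually exists as a property on the array.

```js
var sparse = new Array( 3 );                       // three empty slots
var filled = Array.apply( null, { length: 3 } );   // three real `undefined` values

// both report the same `length`
sparse.length;    // 3
filled.length;    // 3

// but only `filled` actually has index properties
0 in sparse;      // false
0 in filled;      // true

Object.keys( sparse ).length;    // 0
Object.keys( filled ).length;    // 3

// so `map(..)` has something to iterate over
var mapped = filled.map(function(v,i){ return i; });
mapped;           // [ 0, 1, 2 ]
```

Checking with `in` sidesteps the browser-specific serializations described earlier, since it inspects the array itself rather than how a console chooses to print it.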
+ +### `Object(..)`, `Function(..)`, and `RegExp(..)` + +The `Object(..)`/`Function(..)`/`RegExp(..)` constructors are also generally optional (and thus should usually be avoided unless specifically called for): + +```js +var c = new Object(); +c.foo = "bar"; +c; // { foo: "bar" } + +var d = { foo: "bar" }; +d; // { foo: "bar" } + +var e = new Function( "a", "return a * 2;" ); +var f = function(a) { return a * 2; }; +function g(a) { return a * 2; } + +var h = new RegExp( "^a*b+", "g" ); +var i = /^a*b+/g; +``` + +There's practically no reason to ever use the `new Object()` constructor form, especially since it forces you to add properties one-by-one instead of many at once in the object literal form. + +The `Function` constructor is helpful only in the rarest of cases, where you need to dynamically define a function's parameters and/or its function body. **Do not just treat `Function(..)` as an alternate form of `eval(..)`.** You will almost never need to dynamically define a function in this way. + +Regular expressions defined in the literal form (`/^a*b+/g`) are strongly preferred, not just for ease of syntax but for performance reasons -- the JS engine precompiles and caches them before code execution. Unlike the other constructor forms we've seen so far, `RegExp(..)` has some reasonable utility: to dynamically define the pattern for a regular expression. + +```js +var name = "Kyle"; +var namePattern = new RegExp( "\\b(?:" + name + ")+\\b", "ig" ); + +var matches = someText.match( namePattern ); +``` + +This kind of scenario legitimately occurs in JS programs from time to time, so you'd need to use the `new RegExp("pattern","flags")` form. + +### `Date(..)` and `Error(..)` + +The `Date(..)` and `Error(..)` native constructors are much more useful than the other natives, because there is no literal form for either. + +To create a date object value, you must use `new Date()`. 
The `Date(..)` constructor accepts optional arguments to specify the date/time to use, but if omitted, the current date/time is assumed. + +By far the most common reason you construct a date object is to get the current timestamp value (a signed integer number of milliseconds since Jan 1, 1970). You can do this by calling `getTime()` on a date object instance. + +But an even easier way is to just call the static helper function defined as of ES5: `Date.now()`. And to polyfill that for pre-ES5 is pretty easy: + +```js +if (!Date.now) { + Date.now = function(){ + return (new Date()).getTime(); + }; +} +``` + +**Note:** If you call `Date()` without `new`, you'll get back a string representation of the date/time at that moment. The exact form of this representation is not specified in the language spec, though browsers tend to agree on something close to: `"Fri Jul 18 2014 00:31:02 GMT-0500 (CDT)"`. + +The `Error(..)` constructor (much like `Array()` above) behaves the same with the `new` keyword present or omitted. + +The main reason you'd want to create an error object is that it captures the current execution stack context into the object (in most JS engines, revealed as a read-only `.stack` property once constructed). This stack context includes the function call-stack and the line-number where the error object was created, which makes debugging that error much easier. + +You would typically use such an error object with the `throw` operator: + +```js +function foo(x) { + if (!x) { + throw new Error( "x wasn't provided" ); + } + // .. +} +``` + +Error object instances generally have at least a `message` property, and sometimes other properties (which you should treat as read-only), like `type`. However, other than inspecting the above-mentioned `stack` property, it's usually best to just call `toString()` on the error object (either explicitly, or implicitly through coercion -- see Chapter 4) to get a friendly-formatted error message. 
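As a rough sketch of inspecting a thrown error object, the snippet below repeats the `foo(..)` function from above (with a `return` added so it is complete) and catches what it throws. The `caught` variable is just an illustrative name; the `.stack` property is left out because its contents vary by engine.

```js
function foo(x) {
    if (!x) {
        throw new Error( "x wasn't provided" );
    }
    return x;
}

var caught;

try {
    foo();    // no argument, so the error is thrown
}
catch (err) {
    caught = err;
}

caught instanceof Error;    // true
caught.message;             // "x wasn't provided"
caught.toString();          // "Error: x wasn't provided"
String( caught );           // "Error: x wasn't provided"
```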
+ +**Tip:** Technically, in addition to the general `Error(..)` native, there are several other specific-error-type natives: `EvalError(..)`, `RangeError(..)`, `ReferenceError(..)`, `SyntaxError(..)`, `TypeError(..)`, and `URIError(..)`. But it's very rare to manually use these specific error natives. They are automatically used if your program actually suffers from a real exception (such as referencing an undeclared variable and getting a `ReferenceError` error). + +### `Symbol(..)` + +New as of ES6, an additional primitive value type has been added, called "Symbol". Symbols are special "unique" (not strictly guaranteed!) values that can be used as properties on objects with little fear of any collision. They're primarily designed for special built-in behaviors of ES6 constructs, but you can also define your own symbols. + +Symbols can be used as property names, but you cannot see or access the actual value of a symbol from your program, nor from the developer console. If you evaluate a symbol in the developer console, what's shown looks like `Symbol(Symbol.iterator)`, for example. + +There are several predefined symbols in ES6, accessed as static properties of the `Symbol` function object, like `Symbol.iterator`, `Symbol.toStringTag`, etc. To use them, do something like: + +```js +obj[Symbol.iterator] = function(){ /*..*/ }; +``` + +To define your own custom symbols, use the `Symbol(..)` native. The `Symbol(..)` native "constructor" is unique in that you're not allowed to use `new` with it, as doing so will throw an error. 
+ +```js +var mysym = Symbol( "my own symbol" ); +mysym; // Symbol(my own symbol) +mysym.toString(); // "Symbol(my own symbol)" +typeof mysym; // "symbol" + +var a = { }; +a[mysym] = "foobar"; + +Object.getOwnPropertySymbols( a ); +// [ Symbol(my own symbol) ] +``` + +While symbols are not actually private (`Object.getOwnPropertySymbols(..)` reflects on the object and reveals the symbols quite publicly), using them for private or special properties is likely their primary use-case. For most developers, they may take the place of property names with `_` underscore prefixes, which are almost always by convention signals to say, "hey, this is a private/special/internal property, so leave it alone!" + +**Note:** `Symbol`s are *not* `object`s, they are simple scalar primitives. + +### Native Prototypes + +Each of the built-in native constructors has its own `.prototype` object -- `Array.prototype`, `String.prototype`, etc. + +These objects contain behavior unique to their particular object subtype. + +For example, all string objects, and by extension (via boxing) `string` primitives, have access to default behavior as methods defined on the `String.prototype` object. + +**Note:** By documentation convention, `String.prototype.XYZ` is shortened to `String#XYZ`, and likewise for all the other `.prototype`s. + +* `String#indexOf(..)`: find the position in the string of another substring +* `String#charAt(..)`: access the character at a position in the string +* `String#substr(..)`, `String#substring(..)`, and `String#slice(..)`: extract a portion of the string as a new string +* `String#toUpperCase()` and `String#toLowerCase()`: create a new string that's converted to either uppercase or lowercase +* `String#trim()`: create a new string that's stripped of any trailing or leading whitespace + +None of the methods modify the string *in place*. Modifications (like case conversion or trimming) create a new value from the existing value. 
+ +By virtue of prototype delegation (see the *this & Object Prototypes* title in this series), any string value can access these methods: + +```js +var a = " abc "; + +a.indexOf( "c" ); // 3 +a.toUpperCase(); // " ABC " +a.trim(); // "abc" +``` + +The other constructor prototypes contain behaviors appropriate to their types, such as `Number#toFixed(..)` (stringifying a number with a fixed number of decimal digits) and `Array#concat(..)` (merging arrays). All functions have access to `apply(..)`, `call(..)`, and `bind(..)` because `Function.prototype` defines them. + +But, some of the native prototypes aren't *just* plain objects: + +```js +typeof Function.prototype; // "function" +Function.prototype(); // it's an empty function! + +RegExp.prototype.toString(); // "/(?:)/" -- empty regex +"abc".match( RegExp.prototype ); // [""] +``` + +A particularly bad idea, you can even modify these native prototypes (not just adding properties as you're probably familiar with): + +```js +Array.isArray( Array.prototype ); // true +Array.prototype.push( 1, 2, 3 ); // 3 +Array.prototype; // [1,2,3] + +// don't leave it that way, though, or expect weirdness! +// reset the `Array.prototype` to empty +Array.prototype.length = 0; +``` + +As you can see, `Function.prototype` is a function, `RegExp.prototype` is a regular expression, and `Array.prototype` is an array. Interesting and cool, huh? + +#### Prototypes As Defaults + +`Function.prototype` being an empty function, `RegExp.prototype` being an "empty" (e.g., non-matching) regex, and `Array.prototype` being an empty array, make them all nice "default" values to assign to variables if those variables wouldn't already have had a value of the proper type. 
+ +For example: + +```js +function isThisCool(vals,fn,rx) { + vals = vals || Array.prototype; + fn = fn || Function.prototype; + rx = rx || RegExp.prototype; + + return rx.test( + vals.map( fn ).join( "" ) + ); +} + +isThisCool(); // true + +isThisCool( + ["a","b","c"], + function(v){ return v.toUpperCase(); }, + /D/ +); // false +``` + +**Note:** As of ES6, we don't need to use the `vals = vals || ..` default value syntax trick (see Chapter 4) anymore, because default values can be set for parameters via native syntax in the function declaration (see Chapter 5). + +One minor side-benefit of this approach is that the `.prototype`s are already created and built-in, thus created *only once*. By contrast, using `[]`, `function(){}`, and `/(?:)/` values themselves for those defaults would (likely, depending on engine implementations) be recreating those values (and probably garbage-collecting them later) for *each call* of `isThisCool(..)`. That could be memory/CPU wasteful. + +Also, be very careful not to use `Array.prototype` as a default value **that will subsequently be modified**. In this example, `vals` is used read-only, but if you were to instead make in-place changes to `vals`, you would actually be modifying `Array.prototype` itself, which would lead to the gotchas mentioned earlier! + +**Note:** While we're pointing out these native prototypes and some usefulness, be cautious of relying on them and even more wary of modifying them in any way. See Appendix A "Native Prototypes" for more discussion. + +## Review + +JavaScript provides object wrappers around primitive values, known as natives (`String`, `Number`, `Boolean`, etc.). These object wrappers give the values access to behaviors appropriate for each object subtype (`String#trim()` and `Array#concat(..)`). 
+ +If you have a simple scalar primitive value like `"abc"` and you access its `length` property or some `String.prototype` method, JS automatically "boxes" the value (wraps it in its respective object wrapper) so that the property/method accesses can be fulfilled. diff --git a/types & grammar/ch4.md b/types & grammar/ch4.md new file mode 100644 index 0000000..6e19e25 --- /dev/null +++ b/types & grammar/ch4.md @@ -0,0 +1,1919 @@ +# You Don't Know JS: Types & Grammar +# Chapter 4: Coercion + +Now that we much more fully understand JavaScript's types and values, we turn our attention to a very controversial topic: coercion. + +As we mentioned in Chapter 1, the debates over whether coercion is a useful feature or a flaw in the design of the language (or somewhere in between!) have raged since day one. If you've read other popular books on JS, you know that the overwhelmingly prevalent *message* out there is that coercion is magical, evil, confusing, and just downright a bad idea. + +In the same overall spirit of this book series, rather than running away from coercion because everyone else does, or because you get bitten by some quirk, I think you should run toward that which you don't understand and seek to *get it* more fully. + +Our goal is to fully explore the pros and cons (yes, there *are* pros!) of coercion, so that you can make an informed decision on its appropriateness in your program. + +## Converting Values + +Converting a value from one type to another is often called "type casting," when done explicitly, and "coercion" when done implicitly (forced by the rules of how a value is used). + +**Note:** It may not be obvious, but JavaScript coercions always result in one of the scalar primitive (see Chapter 2) values, like `string`, `number`, or `boolean`. There is no coercion that results in a complex value like `object` or `function`. 
Chapter 3 covers "boxing," which wraps scalar primitive values in their `object` counterparts, but this is not really coercion in an accurate sense. + +Another way these terms are often distinguished is as follows: "type casting" (or "type conversion") occurs in statically typed languages at compile time, while "type coercion" is a runtime conversion for dynamically typed languages. + +However, in JavaScript, most people refer to all these types of conversions as *coercion*, so the way I prefer to distinguish is to say "implicit coercion" vs. "explicit coercion." + +The difference should be obvious: "explicit coercion" is when it is obvious from looking at the code that a type conversion is intentionally occurring, whereas "implicit coercion" is when the type conversion will occur as a less obvious side effect of some other intentional operation. + +For example, consider these two approaches to coercion: + +```js +var a = 42; + +var b = a + ""; // implicit coercion + +var c = String( a ); // explicit coercion +``` + +For `b`, the coercion that occurs happens implicitly, because the `+` operator combined with one of the operands being a `string` value (`""`) will insist on the operation being a `string` concatenation (adding two strings together), which *as a (hidden) side effect* will force the `42` value in `a` to be coerced to its `string` equivalent: `"42"`. + +By contrast, the `String(..)` function makes it pretty obvious that it's explicitly taking the value in `a` and coercing it to a `string` representation. + +Both approaches accomplish the same effect: `"42"` comes from `42`. But it's the *how* that is at the heart of the heated debates over JavaScript coercion. + +**Note:** Technically, there's some nuanced behavioral difference here beyond the stylistic difference. We cover that in more detail later in the chapter, in the "Implicitly: Strings <--> Numbers" section. 
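As a small preview of that nuance, here is a minimal sketch using a contrived object (the odd `valueOf()` and `toString()` return values are purely illustrative). The `+ ""` form consults `valueOf()` first, whereas `String(..)` goes straight to `toString()`:

```js
var a = {
	valueOf: function(){ return 42; },
	toString: function(){ return 4; }
};

a + "";			// "42" -- `valueOf()` is consulted first
String( a );		// "4"  -- `toString()` is consulted first
```

For ordinary values like the `42` above, the two forms give the same result, so the difference only shows up with objects that define both methods.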
+ +The terms "explicit" and "implicit," or "obvious" and "hidden side effect," are *relative*. + +If you know exactly what `a + ""` is doing and you're intentionally doing that to coerce to a `string`, you might feel the operation is sufficiently "explicit." Conversely, if you've never seen the `String(..)` function used for `string` coercion, its behavior might seem hidden enough as to feel "implicit" to you. + +But we're having this discussion of "explicit" vs. "implicit" based on the likely opinions of an *average, reasonably informed, but not expert or JS specification devotee* developer. To whatever extent you do or do not find yourself fitting neatly in that bucket, you will need to adjust your perspective on our observations here accordingly. + +Just remember: it's often rare that we write our code and are the only ones who ever read it. Even if you're an expert on all the ins and outs of JS, consider how a less experienced teammate of yours will feel when they read your code. Will it be "explicit" or "implicit" to them in the same way it is for you? + +## Abstract Value Operations + +Before we can explore *explicit* vs *implicit* coercion, we need to learn the basic rules that govern how values *become* either a `string`, `number`, or `boolean`. The ES5 spec in section 9 defines several "abstract operations" (fancy spec-speak for "internal-only operation") with the rules of value conversion. We will specifically pay attention to: `ToString`, `ToNumber`, and `ToBoolean`, and to a lesser extent, `ToPrimitive`. + +### `ToString` + +When any non-`string` value is coerced to a `string` representation, the conversion is handled by the `ToString` abstract operation in section 9.8 of the specification. + +Built-in primitive values have natural stringification: `null` becomes `"null"`, `undefined` becomes `"undefined"` and `true` becomes `"true"`. 
`number`s are generally expressed in the natural way you'd expect, but as we discussed in Chapter 2, very small or very large `numbers` are represented in exponent form: + +```js +// multiplying `1.07` by `1000`, seven times over +var a = 1.07 * 1000 * 1000 * 1000 * 1000 * 1000 * 1000 * 1000; + +// seven times three digits => 21 digits +a.toString(); // "1.07e21" +``` + +For regular objects, unless you specify your own, the default `toString()` (located in `Object.prototype.toString()`) will return the *internal `[[Class]]`* (see Chapter 3), like for instance `"[object Object]"`. + +But as shown earlier, if an object has its own `toString()` method on it, and you use that object in a `string`-like way, its `toString()` will automatically be called, and the `string` result of that call will be used instead. + +**Note:** The way an object is coerced to a `string` technically goes through the `ToPrimitive` abstract operation (ES5 spec, section 9.1), but those nuanced details are covered in more detail in the `ToNumber` section later in this chapter, so we will skip over them here. + +Arrays have an overridden default `toString()` that stringifies as the (string) concatenation of all its values (each stringified themselves), with `","` in between each value: + +```js +var a = [1,2,3]; + +a.toString(); // "1,2,3" +``` + +Again, `toString()` can either be called explicitly, or it will automatically be called if a non-`string` is used in a `string` context. + +#### JSON Stringification + +Another task that seems awfully related to `ToString` is when you use the `JSON.stringify(..)` utility to serialize a value to a JSON-compatible `string` value. + +It's important to note that this stringification is not exactly the same thing as coercion. But since it's related to the `ToString` rules above, we'll take a slight diversion to cover JSON stringification behaviors here. 
+ +For most simple values, JSON stringification behaves basically the same as `toString()` conversions, except that the serialization result is *always a `string`*: + +```js +JSON.stringify( 42 ); // "42" +JSON.stringify( "42" ); // ""42"" (a string with a quoted string value in it) +JSON.stringify( null ); // "null" +JSON.stringify( true ); // "true" +``` + +Any *JSON-safe* value can be stringified by `JSON.stringify(..)`. But what is *JSON-safe*? Any value that can be represented validly in a JSON representation. + +It may be easier to consider values that are **not** JSON-safe. Some examples: `undefined`s, `function`s, (ES6+) `symbol`s, and `object`s with circular references (where property references in an object structure create a never-ending cycle through each other). These are all illegal values for a standard JSON structure, mostly because they aren't portable to other languages that consume JSON values. + +The `JSON.stringify(..)` utility will automatically omit `undefined`, `function`, and `symbol` values when it comes across them. If such a value is found in an `array`, that value is replaced by `null` (so that the array position information isn't altered). If found as a property of an `object`, that property will simply be excluded. + +Consider: + +```js +JSON.stringify( undefined ); // undefined +JSON.stringify( function(){} ); // undefined + +JSON.stringify( [1,undefined,function(){},4] ); // "[1,null,null,4]" +JSON.stringify( { a:2, b:function(){} } ); // "{"a":2}" +``` + +But if you try to `JSON.stringify(..)` an `object` with circular reference(s) in it, an error will be thrown. + +JSON stringification has the special behavior that if an `object` value has a `toJSON()` method defined, this method will be called first to get a value to use for serialization. 
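The built-in `Date` type is a handy illustration of this, since ES5 defines a `toJSON()` method on `Date.prototype`. A minimal sketch (the `meeting` object and its property names are made up for illustration):

```js
var meeting = {
	title: "standup",
	when: new Date( 0 )	// the Unix epoch
};

meeting.when.toJSON();	// "1970-01-01T00:00:00.000Z"

JSON.stringify( meeting );
// "{"title":"standup","when":"1970-01-01T00:00:00.000Z"}"
```

Notice that the `Date` object never shows up in the output as an object; `JSON.stringify(..)` serializes whatever `toJSON()` handed back.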
+ +If you intend to JSON stringify an object that may contain illegal JSON value(s), or if you just have values in the `object` that aren't appropriate for the serialization, you should define a `toJSON()` method for it that returns a *JSON-safe* version of the `object`. + +For example: + +```js +var o = { }; + +var a = { + b: 42, + c: o, + d: function(){} +}; + +// create a circular reference inside `a` +o.e = a; + +// would throw an error on the circular reference +// JSON.stringify( a ); + +// define a custom JSON value serialization +a.toJSON = function() { + // only include the `b` property for serialization + return { b: this.b }; +}; + +JSON.stringify( a ); // "{"b":42}" +``` + +It's a very common misconception that `toJSON()` should return a JSON stringification representation. That's probably incorrect, unless you're wanting to actually stringify the `string` itself (usually not!). `toJSON()` should return the actual regular value (of whatever type) that's appropriate, and `JSON.stringify(..)` itself will handle the stringification. + +In other words, `toJSON()` should be interpreted as "to a JSON-safe value suitable for stringification," not "to a JSON string" as many developers mistakenly assume. + +Consider: + +```js +var a = { + val: [1,2,3], + + // probably correct! + toJSON: function(){ + return this.val.slice( 1 ); + } +}; + +var b = { + val: [1,2,3], + + // probably incorrect! + toJSON: function(){ + return "[" + + this.val.slice( 1 ).join() + + "]"; + } +}; + +JSON.stringify( a ); // "[2,3]" + +JSON.stringify( b ); // ""[2,3]"" +``` + +In the second call, we stringified the returned `string` rather than the `array` itself, which was probably not what we wanted to do. + +While we're talking about `JSON.stringify(..)`, let's discuss some lesser-known functionalities that can still be very useful. + +An optional second argument can be passed to `JSON.stringify(..)` that is called *replacer*. This argument can either be an `array` or a `function`. 
It's used to customize the recursive serialization of an `object` by providing a filtering mechanism for which properties should and should not be included, in a similar way to how `toJSON()` can prepare a value for serialization. + +If *replacer* is an `array`, it should be an `array` of `string`s, each of which will specify a property name that is allowed to be included in the serialization of the `object`. If a property exists that isn't in this list, it will be skipped. + +If *replacer* is a `function`, it will be called once for the `object` itself, and then once for each property in the `object`, and each time is passed two arguments, *key* and *value*. To skip a *key* in the serialization, return `undefined`. Otherwise, return the *value* provided. + +```js +var a = { + b: 42, + c: "42", + d: [1,2,3] +}; + +JSON.stringify( a, ["b","c"] ); // "{"b":42,"c":"42"}" + +JSON.stringify( a, function(k,v){ + if (k !== "c") return v; +} ); +// "{"b":42,"d":[1,2,3]}" +``` + +**Note:** In the `function` *replacer* case, the key argument `k` is the empty string `""` for the first call (where the `a` object itself is being passed in). The `if` statement **filters out** the property named `"c"`. Stringification is recursive, so the `[1,2,3]` array has each of its values (`1`, `2`, and `3`) passed as `v` to *replacer*, with indexes (`0`, `1`, and `2`) as `k`. + +A third optional argument can also be passed to `JSON.stringify(..)`, called *space*, which is used as indentation for prettier human-friendly output. *space* can be a positive integer to indicate how many space characters should be used at each indentation level. Or, *space* can be a `string`, in which case up to the first ten characters of its value will be used for each indentation level. 
+ +```js +var a = { + b: 42, + c: "42", + d: [1,2,3] +}; + +JSON.stringify( a, null, 3 ); +// "{ +// "b": 42, +// "c": "42", +// "d": [ +// 1, +// 2, +// 3 +// ] +// }" + +JSON.stringify( a, null, "-----" ); +// "{ +// -----"b": 42, +// -----"c": "42", +// -----"d": [ +// ----------1, +// ----------2, +// ----------3 +// -----] +// }" +``` + +Remember, `JSON.stringify(..)` is not directly a form of coercion. We covered it here, however, for two reasons that relate its behavior to `ToString` coercion: + +1. `string`, `number`, `boolean`, and `null` values all stringify for JSON basically the same as how they coerce to `string` values via the rules of the `ToString` abstract operation. +2. If you pass an `object` value to `JSON.stringify(..)`, and that `object` has a `toJSON()` method on it, `toJSON()` is automatically called to (sort of) "coerce" the value to be *JSON-safe* before stringification. + +### `ToNumber` + +If any non-`number` value is used in a way that requires it to be a `number`, such as a mathematical operation, the ES5 spec defines the `ToNumber` abstract operation in section 9.3. + +For example, `true` becomes `1` and `false` becomes `0`. `undefined` becomes `NaN`, but (curiously) `null` becomes `0`. + +`ToNumber` for a `string` value essentially works for the most part like the rules/syntax for numeric literals (see Chapter 3). If it fails, the result is `NaN` (instead of a syntax error as with `number` literals). One example difference is that `0`-prefixed octal numbers are not handled as octals (just as normal base-10 decimals) in this operation, though such octals are valid as `number` literals (see Chapter 2). + +**Note:** The differences between `number` literal grammar and `ToNumber` on a `string` value are subtle and highly nuanced, and thus will not be covered further here. Consult section 9.3.1 of the ES5 spec for more information. 
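To make those `ToNumber` rules a bit more concrete, here's a short sketch. It leans on the `Number(..)` function (covered in detail later in this chapter) as a way to trigger `ToNumber`; the sample inputs are just illustrative:

```js
Number( true );		// 1
Number( false );	// 0
Number( null );		// 0
Number( undefined );	// NaN

Number( "42" );		// 42
Number( " 42 " );	// 42 -- surrounding whitespace is ignored
Number( "" );		// 0
Number( "010" );	// 10 -- not treated as octal
Number( "0x1A" );	// 26 -- hex form is recognized
Number( "42px" );	// NaN -- a failed conversion, not a syntax error
```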
+ +Objects (and arrays) will first be converted to their primitive value equivalent, and the resulting value (if a primitive but not already a `number`) is coerced to a `number` according to the `ToNumber` rules just mentioned. + +To convert to this primitive value equivalent, the `ToPrimitive` abstract operation (ES5 spec, section 9.1) will consult the value (using the internal `DefaultValue` operation -- ES5 spec, section 8.12.8) in question to see if it has a `valueOf()` method. If `valueOf()` is available and it returns a primitive value, *that* value is used for the coercion. If not, but `toString()` is available, it will provide the value for the coercion. + +If neither operation can provide a primitive value, a `TypeError` is thrown. + +As of ES5, you can create such a noncoercible object -- one without `valueOf()` and `toString()` -- if it has a `null` value for its `[[Prototype]]`, typically created with `Object.create(null)`. See the *this & Object Prototypes* title of this series for more information on `[[Prototype]]`s. + +**Note:** We cover how to coerce to `number`s later in this chapter in detail, but for this next code snippet, just assume the `Number(..)` function does so. + +Consider: + +```js +var a = { + valueOf: function(){ + return "42"; + } +}; + +var b = { + toString: function(){ + return "42"; + } +}; + +var c = [4,2]; +c.toString = function(){ + return this.join( "" ); // "42" +}; + +Number( a ); // 42 +Number( b ); // 42 +Number( c ); // 42 +Number( "" ); // 0 +Number( [] ); // 0 +Number( [ "abc" ] ); // NaN +``` + +### `ToBoolean` + +Next, let's have a little chat about how `boolean`s behave in JS. There's **lots of confusion and misconception** floating out there around this topic, so pay close attention! + +First and foremost, JS has actual keywords `true` and `false`, and they behave exactly as you'd expect of `boolean` values. It's a common misconception that the values `1` and `0` are identical to `true`/`false`. 
While that may be true in other languages, in JS the `number`s are `number`s and the `boolean`s are `boolean`s. You can coerce `1` to `true` (and vice versa) or `0` to `false` (and vice versa). But they're not the same. + +#### Falsy Values + +But that's not the end of the story. We need to discuss how values other than the two `boolean`s behave whenever you coerce *to* their `boolean` equivalent. + +All of JavaScript's values can be divided into two categories: + +1. values that will become `false` if coerced to `boolean` +2. everything else (which will obviously become `true`) + +I'm not just being facetious. The JS spec defines a specific, narrow list of values that will coerce to `false` when coerced to a `boolean` value. + +How do we know what the list of values is? In the ES5 spec, section 9.2 defines a `ToBoolean` abstract operation, which says exactly what happens for all the possible values when you try to coerce them "to boolean." + +From that table, we get the following as the so-called "falsy" values list: + +* `undefined` +* `null` +* `false` +* `+0`, `-0`, and `NaN` +* `""` + +That's it. If a value is on that list, it's a "falsy" value, and it will coerce to `false` if you force a `boolean` coercion on it. + +By logical conclusion, if a value is *not* on that list, it must be on *another list*, which we call the "truthy" values list. But JS doesn't really define a "truthy" list per se. It gives some examples, such as saying explicitly that all objects are truthy, but mostly the spec just implies: **anything not explicitly on the falsy list is therefore truthy.** + +#### Falsy Objects + +Wait a minute, that section title even sounds contradictory. I literally *just said* the spec calls all objects truthy, right? There should be no such thing as a "falsy object." + +What could that possibly even mean? + +You might be tempted to think it means an object wrapper (see Chapter 3) around a falsy value (such as `""`, `0` or `false`). 
But don't fall into that *trap*. + +**Note:** That's a subtle specification joke some of you may get. + +Consider: + +```js +var a = new Boolean( false ); +var b = new Number( 0 ); +var c = new String( "" ); +``` + +We know all three values here are objects (see Chapter 3) wrapped around obviously falsy values. But do these objects behave as `true` or as `false`? That's easy to answer: + +```js +var d = Boolean( a && b && c ); + +d; // true +``` + +So, all three behave as `true`, as that's the only way `d` could end up as `true`. + +**Tip:** Notice the `Boolean( .. )` wrapped around the `a && b && c` expression -- you might wonder why that's there. We'll come back to that later in this chapter, so make a mental note of it. For a sneak-peek (trivia-wise), try for yourself what `d` will be if you just do `d = a && b && c` without the `Boolean( .. )` call! + +So, if "falsy objects" are **not just objects wrapped around falsy values**, what the heck are they? + +The tricky part is that they can show up in your JS program, but they're not actually part of JavaScript itself. + +**What!?** + +There are certain cases where browsers have created their own sort of *exotic* values behavior, namely this idea of "falsy objects," on top of regular JS semantics. + +A "falsy object" is a value that looks and acts like a normal object (properties, etc.), but when you coerce it to a `boolean`, it coerces to a `false` value. + +**Why!?** + +The most well-known case is `document.all`: an array-like (object) provided to your JS program *by the DOM* (not the JS engine itself), which exposes elements in your page to your JS program. It *used* to behave like a normal object--it would act truthy. But not anymore. + +`document.all` itself was never really "standard" and has long since been deprecated/abandoned. + +"Can't they just remove it, then?" Sorry, nice try. Wish they could. But there's far too many legacy JS code bases out there that rely on using it. + +So, why make it act falsy? 
Because coercions of `document.all` to `boolean` (like in `if` statements) were almost always used as a means of detecting old, nonstandard IE. + +IE has long since come up to standards compliance, and in many cases is pushing the web forward as much or more than any other browser. But all that old `if (document.all) { /* it's IE */ }` code is still out there, and much of it is probably never going away. All this legacy code is still assuming it's running in decade-old IE, which just leads to a bad browsing experience for IE users. + +So, we can't remove `document.all` completely, but IE doesn't want `if (document.all) { .. }` code to work anymore, so that users in modern IE get new, standards-compliant code logic. + +"What should we do?" **"I've got it! Let's bastardize the JS type system and pretend that `document.all` is falsy!"** + +Ugh. That sucks. It's a crazy gotcha that most JS developers don't understand. But the alternative (doing nothing about the above no-win problems) sucks *just a little bit more*. + +So... that's what we've got: crazy, nonstandard "falsy objects" added to JavaScript by the browsers. Yay! + +#### Truthy Values + +Back to the truthy list. What exactly are the truthy values? Remember: **a value is truthy if it's not on the falsy list.** + +Consider: + +```js +var a = "false"; +var b = "0"; +var c = "''"; + +var d = Boolean( a && b && c ); + +d; +``` + +What value do you expect `d` to have here? It's gotta be either `true` or `false`. + +It's `true`. Why? Because despite the contents of those `string` values looking like falsy values, the `string` values themselves are all truthy, because `""` is the only `string` value on the falsy list. + +What about these? + +```js +var a = []; // empty array -- truthy or falsy? +var b = {}; // empty object -- truthy or falsy? +var c = function(){}; // empty function -- truthy or falsy? + +var d = Boolean( a && b && c ); + +d; +``` + +Yep, you guessed it, `d` is still `true` here. Why? 
Same reason as before. Despite what it may seem like, `[]`, `{}`, and `function(){}` are *not* on the falsy list, and thus are truthy values. + +In other words, the truthy list is infinitely long. It's impossible to make such a list. You can only make a finite falsy list and consult *it*. + +Take five minutes, write the falsy list on a post-it note for your computer monitor, or memorize it if you prefer. Either way, you'll easily be able to construct a virtual truthy list whenever you need it by simply asking if it's on the falsy list or not. + +The importance of truthy and falsy is in understanding how a value will behave if you coerce it (either explicitly or implicitly) to a `boolean` value. Now that you have those two lists in mind, we can dive into coercion examples themselves. + +## Explicit Coercion + +*Explicit* coercion refers to type conversions that are obvious and explicit. There's a wide range of type conversion usage that clearly falls under the *explicit* coercion category for most developers. + +The goal here is to identify patterns in our code where we can make it clear and obvious that we're converting a value from one type to another, so as to not leave potholes for future developers to trip into. The more explicit we are, the more likely someone later will be able to read our code and understand without undue effort what our intent was. + +It would be hard to find any salient disagreements with *explicit* coercion, as it most closely aligns with how the commonly accepted practice of type conversion works in statically typed languages. As such, we'll take for granted (for now) that *explicit* coercion can be agreed upon to not be evil or controversial. We'll revisit this later, though. + +### Explicitly: Strings <--> Numbers + +We'll start with the simplest and perhaps most common coercion operation: coercing values between `string` and `number` representation. 
+ +To coerce between `string`s and `number`s, we use the built-in `String(..)` and `Number(..)` functions (which we referred to as "native constructors" in Chapter 3), but **very importantly**, we do not use the `new` keyword in front of them. As such, we're not creating object wrappers. + +Instead, we're actually *explicitly coercing* between the two types: + +```js +var a = 42; +var b = String( a ); + +var c = "3.14"; +var d = Number( c ); + +b; // "42" +d; // 3.14 +``` + +`String(..)` coerces from any other value to a primitive `string` value, using the rules of the `ToString` operation discussed earlier. `Number(..)` coerces from any other value to a primitive `number` value, using the rules of the `ToNumber` operation discussed earlier. + +I call this *explicit* coercion because in general, it's pretty obvious to most developers that the end result of these operations is the applicable type conversion. + +In fact, this usage actually looks a lot like it does in some other statically typed languages. + +For example, in C/C++, you can say either `(int)x` or `int(x)`, and both will convert the value in `x` to an integer. Both forms are valid, but many prefer the latter, which kinda looks like a function call. In JavaScript, when you say `Number(x)`, it looks awfully similar. Does it matter that it's *actually* a function call in JS? Not really. + +Besides `String(..)` and `Number(..)`, there are other ways to "explicitly" convert these values between `string` and `number`: + +```js +var a = 42; +var b = a.toString(); + +var c = "3.14"; +var d = +c; + +b; // "42" +d; // 3.14 +``` + +Calling `a.toString()` is ostensibly explicit (pretty clear that "toString" means "to a string"), but there's some hidden implicitness here. `toString()` cannot be called on a *primitive* value like `42`. So JS automatically "boxes" (see Chapter 3) `42` in an object wrapper, so that `toString()` can be called against the object. In other words, you might call it "explicitly implicit." 
+ +`+c` here is showing the *unary operator* form (operator with only one operand) of the `+` operator. Instead of performing mathematic addition (or string concatenation -- see below), the unary `+` explicitly coerces its operand (`c`) to a `number` value. + +Is `+c` *explicit* coercion? Depends on your experience and perspective. If you know (which you do, now!) that unary `+` is explicitly intended for `number` coercion, then it's pretty explicit and obvious. However, if you've never seen it before, it can seem awfully confusing, implicit, with hidden side effects, etc. + +**Note:** The generally accepted perspective in the open-source JS community is that unary `+` is an accepted form of *explicit* coercion. + +Even if you really like the `+c` form, there are definitely places where it can look awfully confusing. Consider: + +```js +var c = "3.14"; +var d = 5+ +c; + +d; // 8.14 +``` + +The unary `-` operator also coerces like `+` does, but it also flips the sign of the number. However, you cannot put two `--` next to each other to unflip the sign, as that's parsed as the decrement operator. Instead, you would need to do: `- -"3.14"` with a space in between, and that would result in coercion to `3.14`. + +You can probably dream up all sorts of hideous combinations of binary operators (like `+` for addition) next to the unary form of an operator. Here's another crazy example: + +```js +1 + - + + + - + 1; // 2 +``` + +You should strongly consider avoiding unary `+` (or `-`) coercion when it's immediately adjacent to other operators. While the above works, it would almost universally be considered a bad idea. Even `d = +c` (or `d =+ c` for that matter!) can far too easily be confused for `d += c`, which is entirely different! + +**Note:** Another extremely confusing place for unary `+` to be used adjacent to another operator would be the `++` increment operator and `--` decrement operator. For example: `a +++b`, `a + ++b`, and `a + + +b`. 
See "Expression Side-Effects" in Chapter 5 for more about `++`. + +Remember, we're trying to be explicit and **reduce** confusion, not make it much worse! + +#### `Date` To `number` + +Another common usage of the unary `+` operator is to coerce a `Date` object into a `number`, because the result is the unix timestamp (milliseconds elapsed since 1 January 1970 00:00:00 UTC) representation of the date/time value: + +```js +var d = new Date( "Mon, 18 Aug 2014 08:53:06 CDT" ); + ++d; // 1408369986000 +``` + +The most common usage of this idiom is to get the current *now* moment as a timestamp, such as: + +```js +var timestamp = +new Date(); +``` + +**Note:** Some developers are aware of a peculiar syntactic "trick" in JavaScript, which is that the `()` set on a constructor call (a function called with `new`) is *optional* if there are no arguments to pass. So you may run across the `var timestamp = +new Date;` form. However, not all developers agree that omitting the `()` improves readability, as it's an uncommon syntax exception that only applies to the `new fn()` call form and not the regular `fn()` call form. + +But coercion is not the only way to get the timestamp out of a `Date` object. A noncoercion approach is perhaps even preferable, as it's even more explicit: + +```js +var timestamp = new Date().getTime(); +// var timestamp = (new Date()).getTime(); +// var timestamp = (new Date).getTime(); +``` + +But an *even more* preferable noncoercion option is to use the ES5 added `Date.now()` static function: + +```js +var timestamp = Date.now(); +``` + +And if you want to polyfill `Date.now()` into older browsers, it's pretty simple: + +```js +if (!Date.now) { + Date.now = function() { + return +new Date(); + }; +} +``` + +I'd recommend skipping the coercion forms related to dates. Use `Date.now()` for current *now* timestamps, and `new Date( .. ).getTime()` for getting a timestamp of a specific *non-now* date/time that you need to specify. 
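As a quick sanity check on that advice, here's a small sketch showing that the coercion and noncoercion forms agree. It assumes an ISO-formatted UTC string for the same moment as the earlier example (08:53:06 CDT is 13:53:06 UTC), so the result doesn't depend on how an engine parses the `CDT` abbreviation:

```js
var specific = new Date( "2014-08-18T13:53:06Z" );

var viaCoercion = +specific;
var viaGetTime = specific.getTime();

viaCoercion;                  // 1408369986000
viaCoercion === viaGetTime;   // true

// for the current moment, no `Date` object is needed at all
var now = Date.now();
typeof now;                   // "number"
```

Both forms produce the same `number`; the only real difference is how obvious the intent is to the next reader.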
+ +#### The Curious Case of the `~` + +One coercive JS operator that is often overlooked and usually very confused is the tilde `~` operator (aka "bitwise NOT"). Many of those who even understand what it does will often times still want to avoid it. But sticking to the spirit of our approach in this book and series, let's dig into it to find out if `~` has anything useful to give us. + +In the "32-bit (Signed) Integers" section of Chapter 2, we covered how bitwise operators in JS are defined only for 32-bit operations, which means they force their operands to conform to 32-bit value representations. The rules for how this happens are controlled by the `ToInt32` abstract operation (ES5 spec, section 9.5). + +`ToInt32` first does a `ToNumber` coercion, which means if the value is `"123"`, it's going to first become `123` before the `ToInt32` rules are applied. + +While not *technically* coercion itself (since the type doesn't change!), using bitwise operators (like `|` or `~`) with certain special `number` values produces a coercive effect that results in a different `number` value. + +For example, let's first consider the `|` "bitwise OR" operator used in the otherwise no-op idiom `0 | x`, which (as Chapter 2 showed) essentially only does the `ToInt32` conversion: + +```js +0 | -0; // 0 +0 | NaN; // 0 +0 | Infinity; // 0 +0 | -Infinity; // 0 +``` + +These special numbers aren't 32-bit representable (since they come from the 64-bit IEEE 754 standard -- see Chapter 2), so `ToInt32` just specifies `0` as the result from these values. + +It's debatable if `0 | __` is an *explicit* form of this coercive `ToInt32` operation or if it's more *implicit*. From the spec perspective, it's unquestionably *explicit*, but if you don't understand bitwise operations at this level, it can seem a bit more *implicitly* magical. Nevertheless, consistent with other assertions in this chapter, we will call it *explicit*. + +So, let's turn our attention back to `~`. 
The `~` operator first "coerces" to a 32-bit `number` value, and then performs a bitwise negation (flipping each bit's parity). + +**Note:** This is very similar to how `!` not only coerces its value to `boolean` but also flips its parity (see discussion of the "unary `!`" later). + +But... what!? Why do we care about bits being flipped? That's some pretty specialized, nuanced stuff. It's pretty rare for JS developers to need to reason about individual bits. + +Another way of thinking about the definition of `~` comes from old-school computer science/discrete mathematics: `~` takes the one's complement of a value stored in two's-complement form. Great, thanks, that's totally clearer! + +Let's try again: `~x` is roughly the same as `-(x+1)`. That's weird, but slightly easier to reason about. So: + +```js +~42; // -(42+1) ==> -43 +``` + +You're probably still wondering what the heck all this `~` stuff is about, or why it really matters for a coercion discussion. Let's quickly get to the point. + +Consider `-(x+1)`. What's the only value that you can perform that operation on that will produce a `0` (or `-0` technically!) result? `-1`. In other words, `~` used with a range of `number` values will produce a falsy (easily coercible to `false`) `0` value for the `-1` input value, and any other truthy `number` otherwise. + +Why is that relevant? + +`-1` is commonly called a "sentinel value," which basically means a value that's given an arbitrary semantic meaning within the greater set of values of its same type (`number`s). The C-language uses `-1` sentinel values for many functions that return `>= 0` values for "success" and `-1` for "failure." + +JavaScript adopted this precedent when defining the `string` operation `indexOf(..)`, which searches for a substring and if found returns its zero-based index position, or `-1` if not found. + +It's pretty common to try to use `indexOf(..)` not just as an operation to get the position, but as a `boolean` check of presence/absence of a substring in another `string`. 
Here's how developers usually perform such checks: + +```js +var a = "Hello World"; + +if (a.indexOf( "lo" ) >= 0) { // true + // found it! +} +if (a.indexOf( "lo" ) != -1) { // true + // found it! +} + +if (a.indexOf( "ol" ) < 0) { // true + // not found! +} +if (a.indexOf( "ol" ) == -1) { // true + // not found! +} +``` + +I find it kind of gross to look at `>= 0` or `== -1`. It's basically a "leaky abstraction," in that it's leaking underlying implementation behavior -- the usage of sentinel `-1` for "failure" -- into my code. I would prefer to hide such a detail. + +And now, finally, we see why `~` could help us! Using `~` with `indexOf()` "coerces" (actually just transforms) the value **to be appropriately `boolean`-coercible**: + +```js +var a = "Hello World"; + +~a.indexOf( "lo" ); // -4 <-- truthy! + +if (~a.indexOf( "lo" )) { // true + // found it! +} + +~a.indexOf( "ol" ); // 0 <-- falsy! +!~a.indexOf( "ol" ); // true + +if (!~a.indexOf( "ol" )) { // true + // not found! +} +``` + +`~` takes the return value of `indexOf(..)` and transforms it: for the "failure" `-1` we get the falsy `0`, and every other value is truthy. + +**Note:** The `-(x+1)` pseudo-algorithm for `~` would imply that `~-1` is `-0`, but actually it produces `0` because the underlying operation is actually bitwise, not mathematical. + +Technically, `if (~a.indexOf(..))` is still relying on *implicit* coercion of its resultant `0` to `false` or nonzero to `true`. But overall, `~` still feels to me more like an *explicit* coercion mechanism, as long as you know what it's intended to do in this idiom. + +I find this to be cleaner code than the previous `>= 0` / `== -1` clutter. + +##### Truncating Bits + +There's one more place `~` may show up in code you run across: some developers use the double tilde `~~` to truncate the decimal part of a `number` (i.e., "coerce" it to a whole number "integer"). It's commonly (though mistakenly) said that this gives the same result as calling `Math.floor(..)`. 
+ +How `~~` works is that the first `~` applies the `ToInt32` "coercion" and does the bitwise flip, and then the second `~` does another bitwise flip, flipping all the bits back to the original state. The end result is just the `ToInt32` "coercion" (aka truncation). + +**Note:** The bitwise double-flip of `~~` is very similar to the parity double-negate `!!` behavior, explained in the "Explicitly: * --> Boolean" section later. + +However, `~~` needs some caution/clarification. First, it only works reliably on 32-bit values. But more importantly, it doesn't work the same on negative numbers as `Math.floor(..)` does! + +```js +Math.floor( -49.6 ); // -50 +~~-49.6; // -49 +``` + +Setting the `Math.floor(..)` difference aside, `~~x` can truncate to a (32-bit) integer. But so does `x | 0`, and seemingly with (slightly) *less effort*. + +So, why might you choose `~~x` over `x | 0`, then? Operator precedence (see Chapter 5): + +```js +~~1E20 / 10; // 166199296 + +1E20 | 0 / 10; // 1661992960 +(1E20 | 0) / 10; // 166199296 +``` + +Just as with all other advice here, use `~` and `~~` as explicit mechanisms for "coercion" and value transformation only if everyone who reads/writes such code is properly aware of how these operators work! + +### Explicitly: Parsing Numeric Strings + +A similar outcome to coercing a `string` to a `number` can be achieved by parsing a `number` out of a `string`'s character contents. There are, however, distinct differences between this parsing and the type conversion we examined above. + +Consider: + +```js +var a = "42"; +var b = "42px"; + +Number( a ); // 42 +parseInt( a ); // 42 + +Number( b ); // NaN +parseInt( b ); // 42 +``` + +Parsing a numeric value out of a string is *tolerant* of non-numeric characters -- it just stops parsing left-to-right when encountered -- whereas coercion is *not tolerant* and fails resulting in the `NaN` value. + +Parsing should not be seen as a substitute for coercion. 
These two tasks, while similar, have different purposes. Parse a `string` as a `number` when you don't know/care what other non-numeric characters there may be on the right-hand side. Coerce a `string` (to a `number`) when the only acceptable values are numeric and something like `"42px"` should be rejected as a `number`. + +**Tip:** `parseInt(..)` has a twin, `parseFloat(..)`, which (as it sounds) pulls out a floating-point number from a string. + +Don't forget that `parseInt(..)` operates on `string` values. It makes absolutely no sense to pass a `number` value to `parseInt(..)`. Nor would it make sense to pass any other type of value, like `true`, `function(){..}` or `[1,2,3]`. + +If you pass a non-`string`, the value you pass will automatically be coerced to a `string` first (see "`ToString`" earlier), which would clearly be a kind of hidden *implicit* coercion. It's a really bad idea to rely upon such a behavior in your program, so never use `parseInt(..)` with a non-`string` value. + +Prior to ES5, another gotcha existed with `parseInt(..)`, which was the source of many JS programs' bugs. If you didn't pass a second argument to indicate which numeric base (aka radix) to use for interpreting the numeric `string` contents, `parseInt(..)` would look at the beginning character(s) to make a guess. + +If the first two characters were `"0x"` or `"0X"`, the guess (by convention) was that you wanted to interpret the `string` as a hexadecimal (base-16) `number`. Otherwise, if the first character was `"0"`, the guess (again, by convention) was that you wanted to interpret the `string` as an octal (base-8) `number`. + +Hexadecimal `string`s (with the leading `0x` or `0X`) aren't terribly easy to get mixed up. But the octal number guessing proved devilishly common. 
For example: + +```js +var hour = parseInt( selectedHour.value ); +var minute = parseInt( selectedMinute.value ); + +console.log( "The time you selected was: " + hour + ":" + minute); +``` + +Seems harmless, right? Try selecting `08` for the hour and `09` for the minute. You'll get `0:0`. Why? Because neither `8` nor `9` is a valid character in octal base-8. + +The pre-ES5 fix was simple, but so easy to forget: **always pass `10` as the second argument**. This was totally safe: + +```js +var hour = parseInt( selectedHour.value, 10 ); +var minute = parseInt( selectedMinute.value, 10 ); +``` + +As of ES5, `parseInt(..)` no longer guesses octal. Unless you say otherwise, it assumes base-10 (or base-16 for `"0x"` prefixes). That's much nicer. Just be careful if your code has to run in pre-ES5 environments, in which case you still need to pass `10` for the radix. + +#### Parsing Non-Strings + +One somewhat infamous example of `parseInt(..)`'s behavior is highlighted in a sarcastic joke post a few years ago, poking fun at this JS behavior: + +```js +parseInt( 1/0, 19 ); // 18 +``` + +The assumptive (but totally invalid) assertion was, "If I pass in Infinity, and parse an integer out of that, I should get Infinity back, not 18." Surely, JS must be crazy for this outcome, right? + +Though this example is obviously contrived and unreal, let's indulge the madness for a moment and examine whether JS really is that crazy. + +First off, the most obvious sin committed here is to pass a non-`string` to `parseInt(..)`. That's a no-no. Do it and you're asking for trouble. But even if you do, JS politely coerces what you pass in into a `string` that it can try to parse. + +Some would argue that this is unreasonable behavior, and that `parseInt(..)` should refuse to operate on a non-`string` value. Should it perhaps throw an error? That would be very Java-like, frankly. 
I shudder at thinking JS should start throwing errors all over the place so that `try..catch` is needed around almost every line. + +Should it return `NaN`? Maybe. But... what about: + +```js +parseInt( new String( "42") ); +``` + +Should that fail, too? It's a non-`string` value. If you want that `String` object wrapper to be unboxed to `"42"`, then is it really so unusual for `42` to first become `"42"` so that `42` can be parsed back out? + +I would argue that this half-*explicit*, half-*implicit* coercion that can occur can often be a very helpful thing. For example: + +```js +var a = { + num: 21, + toString: function() { return String( this.num * 2 ); } +}; + +parseInt( a ); // 42 +``` + +The fact that `parseInt(..)` forcibly coerces its value to a `string` to perform the parse on is quite sensible. If you pass in garbage, and you get garbage back out, don't blame the trash can -- it just did its job faithfully. + +So, if you pass in a value like `Infinity` (the result of `1 / 0` obviously), what sort of `string` representation would make the most sense for its coercion? Only two reasonable choices come to mind: `"Infinity"` and `"∞"`. JS chose `"Infinity"`. I'm glad it did. + +I think it's a good thing that **all values** in JS have some sort of default `string` representation, so that they aren't mysterious black boxes that we can't debug and reason about. + +Now, what about base-19? Obviously, completely bogus and contrived. No real JS programs use base-19. It's absurd. But again, let's indulge the ridiculousness. In base-19, the valid numeric characters are `0` - `9` and `a` - `i` (case insensitive). + +So, back to our `parseInt( 1/0, 19 )` example. It's essentially `parseInt( "Infinity", 19 )`. How does it parse? The first character is `"I"`, which is value `18` in the silly base-19. The second character `"n"` is not in the valid set of numeric characters, and as such the parsing simply politely stops, just like when it ran across `"p"` in `"42px"`. 
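The steps just described can be reproduced one at a time. This is only a sketch of the reasoning (the `str` variable is mine), not a description of how `parseInt(..)` is implemented internally:

```js
var str = String( 1/0 );    // the string coercion that `parseInt(..)` performs first
str;                        // "Infinity"

parseInt( "I", 19 );        // 18   ("i" is the last valid base-19 digit)
parseInt( "n", 19 );        // NaN  ("n" is not a base-19 digit)

parseInt( str, 19 );        // 18   (parsing stops at the "n")
parseInt( 1/0, 19 ) === parseInt( "Infinity", 19 );   // true
```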
+ +The result? `18`. Exactly like it sensibly should be. The behaviors involved to get us there, and not to an error or to `Infinity` itself, are **very important** to JS, and should not be so easily discarded. + +Other examples of this behavior with `parseInt(..)` that may be surprising but are quite sensible include: + +```js +parseInt( 0.000008 ); // 0 ("0" from "0.000008") +parseInt( 0.0000008 ); // 8 ("8" from "8e-7") +parseInt( false, 16 ); // 250 ("fa" from "false") +parseInt( parseInt, 16 ); // 15 ("f" from "function..") + +parseInt( "0x10" ); // 16 +parseInt( "103", 2 ); // 2 +``` + +`parseInt(..)` is actually pretty predictable and consistent in its behavior. If you use it correctly, you'll get sensible results. If you use it incorrectly, the crazy results you get are not the fault of JavaScript. + +### Explicitly: * --> Boolean + +Now, let's examine coercing from any non-`boolean` value to a `boolean`. + +Just like with `String(..)` and `Number(..)` above, `Boolean(..)` (without the `new`, of course!) is an explicit way of forcing the `ToBoolean` coercion: + +```js +var a = "0"; +var b = []; +var c = {}; + +var d = ""; +var e = 0; +var f = null; +var g; + +Boolean( a ); // true +Boolean( b ); // true +Boolean( c ); // true + +Boolean( d ); // false +Boolean( e ); // false +Boolean( f ); // false +Boolean( g ); // false +``` + +While `Boolean(..)` is clearly explicit, it's not at all common or idiomatic. + +Just like the unary `+` operator coerces a value to a `number` (see above), the unary `!` negate operator explicitly coerces a value to a `boolean`. The *problem* is that it also flips the value from truthy to falsy or vice versa. 
So, the most common way JS developers explicitly coerce to `boolean` is to use the `!!` double-negate operator, because the second `!` will flip the parity back to the original: + +```js +var a = "0"; +var b = []; +var c = {}; + +var d = ""; +var e = 0; +var f = null; +var g; + +!!a; // true +!!b; // true +!!c; // true + +!!d; // false +!!e; // false +!!f; // false +!!g; // false +``` + +Any of these `ToBoolean` coercions would happen *implicitly* without the `Boolean(..)` or `!!`, if used in a `boolean` context such as an `if (..) ..` statement. But the goal here is to explicitly force the value to a `boolean` to make it clearer that the `ToBoolean` coercion is intended. + +Another example use-case for explicit `ToBoolean` coercion is if you want to force a `true`/`false` value coercion in the JSON serialization of a data structure: + +```js +var a = [ + 1, + function(){ /*..*/ }, + 2, + function(){ /*..*/ } +]; + +JSON.stringify( a ); // "[1,null,2,null]" + +JSON.stringify( a, function(key,val){ + if (typeof val == "function") { + // force `ToBoolean` coercion of the function + return !!val; + } + else { + return val; + } +} ); +// "[1,true,2,true]" +``` + +If you come to JavaScript from Java, you may recognize this idiom: + +```js +var a = 42; + +var b = a ? true : false; +``` + +The `? :` ternary operator will test `a` for truthiness, and based on that test will either assign `true` or `false` to `b`, accordingly. + +On its surface, this idiom looks like a form of *explicit* `ToBoolean`-type coercion, since it's obvious that only either `true` or `false` come out of the operation. + +However, there's a hidden *implicit* coercion, in that the `a` expression has to first be coerced to `boolean` to perform the truthiness test. I'd call this idiom "explicitly implicit." Furthermore, I'd suggest **you should avoid this idiom completely** in JavaScript. It offers no real benefit, and worse, masquerades as something it's not. 
+ +`Boolean(a)` and `!!a` are far better as *explicit* coercion options. + +## Implicit Coercion + +*Implicit* coercion refers to type conversions that are hidden, with non-obvious side-effects that implicitly occur from other actions. In other words, *implicit coercions* are any type conversions that aren't obvious (to you). + +While it's clear what the goal of *explicit* coercion is (making code explicit and more understandable), it might be *too* obvious that *implicit* coercion has the opposite goal: making code harder to understand. + +Taken at face value, I believe that's where much of the ire towards coercion comes from. The majority of complaints about "JavaScript coercion" are actually aimed (whether they realize it or not) at *implicit* coercion. + +**Note:** Douglas Crockford, author of *"JavaScript: The Good Parts"*, has claimed in many conference talks and writings that JavaScript coercion should be avoided. But what he seems to mean is that *implicit* coercion is bad (in his opinion). However, if you read his own code, you'll find plenty of examples of coercion, both *implicit* and *explicit*! In truth, his angst seems to primarily be directed at the `==` operation, but as you'll see in this chapter, that's only part of the coercion mechanism. + +So, **is implicit coercion** evil? Is it dangerous? Is it a flaw in JavaScript's design? Should we avoid it at all costs? + +I bet most of you readers are inclined to enthusiastically cheer, "Yes!" + +**Not so fast.** Hear me out. + +Let's take a different perspective on what *implicit* coercion is, and can be, than just that it's "the opposite of the good explicit kind of coercion." That's far too narrow and misses an important nuance. + +Let's define the goal of *implicit* coercion as: to reduce verbosity, boilerplate, and/or unnecessary implementation detail that clutters up our code with noise that distracts from the more important intent. 
+ +### Simplifying Implicitly + +Before we even get to JavaScript, let me suggest something pseudo-code'ish from some theoretical strongly typed language to illustrate: + +```js +SomeType x = SomeType( AnotherType( y ) ) +``` + +In this example, I have some arbitrary type of value in `y` that I want to convert to the `SomeType` type. The problem is, this language can't go directly from whatever `y` currently is to `SomeType`. It needs an intermediate step, where it first converts to `AnotherType`, and then from `AnotherType` to `SomeType`. + +Now, what if that language (or definition you could create yourself with the language) *did* just let you say: + +```js +SomeType x = SomeType( y ) +``` + +Wouldn't you generally agree that we simplified the type conversion here to reduce the unnecessary "noise" of the intermediate conversion step? I mean, is it *really* all that important, right here at this point in the code, to see and deal with the fact that `y` goes to `AnotherType` first before then going to `SomeType`? + +Some would argue, at least in some circumstances, yes. But I think an equal argument can be made that in many other circumstances, the simplification **actually aids in the readability of the code** by abstracting or hiding away such details, either in the language itself or in our own abstractions. + +Undoubtedly, behind the scenes, somewhere, the intermediate conversion step is still happening. But if that detail is hidden from view here, we can just reason about getting `y` to type `SomeType` as a generic operation and hide the messy details. + +While not a perfect analogy, what I'm going to argue throughout the rest of this chapter is that JS *implicit* coercion can be thought of as providing a similar aid to your code. + +But, **and this is very important**, that is not an unbounded, absolute statement. 
There are definitely plenty of *evils* lurking around *implicit* coercion, that will harm your code much more than any potential readability improvements. Clearly, we have to learn how to avoid such constructs so we don't poison our code with all manner of bugs. + +Many developers believe that if a mechanism can do some useful thing **A** but can also be abused or misused to do some awful thing **Z**, then we should throw out that mechanism altogether, just to be safe. + +My encouragement to you is: don't settle for that. Don't "throw the baby out with the bathwater." Don't assume *implicit* coercion is all bad because all you think you've ever seen is its "bad parts." I think there are "good parts" here, and I want to help and inspire more of you to find and embrace them! + +### Implicitly: Strings <--> Numbers + +Earlier in this chapter, we explored *explicitly* coercing between `string` and `number` values. Now, let's explore the same task but with *implicit* coercion approaches. But before we do, we have to examine some nuances of operations that will *implicitly* force coercion. + +The `+` operator is overloaded to serve the purposes of both `number` addition and `string` concatenation. So how does JS know which type of operation you want to use? Consider: + +```js +var a = "42"; +var b = "0"; + +var c = 42; +var d = 0; + +a + b; // "420" +c + d; // 42 +``` + +What's different that causes `"420"` vs `42`? It's a common misconception that the difference is whether one or both of the operands is a `string`, as that means `+` will assume `string` concatenation. While that's partially true, it's more complicated than that. + +Consider: + +```js +var a = [1,2]; +var b = [3,4]; + +a + b; // "1,23,4" +``` + +Neither of these operands is a `string`, but clearly they were both coerced to `string`s and then the `string` concatenation kicked in. So what's really going on? 
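Before getting into spec details, it can help to poke at the pieces by hand. The sketch below calls `valueOf()` and `toString()` directly on the arrays. That's only an approximation of what `+` does internally, but it shows where the `string`s come from:

```js
var a = [1,2];
var b = [3,4];

// `valueOf()` on an array just hands back the array itself,
// which is not a primitive
a.valueOf() === a;            // true

// so the fallback is `toString()`
a.toString();                 // "1,2"
b.toString();                 // "3,4"

a.toString() + b.toString();  // "1,23,4"
a + b;                        // "1,23,4"
```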
+ +(**Warning:** deeply nitty gritty spec-speak coming, so skip the next two paragraphs if that intimidates you!) + +----- + +According to ES5 spec section 11.6.1, the `+` algorithm (when an `object` value is an operand) will concatenate if either operand is either already a `string`, or if the following steps produce a `string` representation. So, when `+` receives an `object` (including `array`) for either operand, it first calls the `ToPrimitive` abstract operation (section 9.1) on the value, which then calls the `[[DefaultValue]]` algorithm (section 8.12.8) with a context hint of `number`. + +If you're paying close attention, you'll notice that this operation is now identical to how the `ToNumber` abstract operation handles `object`s (see the "`ToNumber`" section earlier). The `valueOf()` operation on the `array` will fail to produce a simple primitive, so it then falls to a `toString()` representation. The two `array`s thus become `"1,2"` and `"3,4"`, respectively. Now, `+` concatenates the two `string`s as you'd normally expect: `"1,23,4"`. + +----- + +Let's set aside those messy details and go back to an earlier, simplified explanation: if either operand to `+` is a `string` (or becomes one with the above steps!), the operation will be `string` concatenation. Otherwise, it's always numeric addition. + +**Note:** A commonly cited coercion gotcha is `[] + {}` vs. `{} + []`, as those two expressions result, respectively, in `"[object Object]"` and `0`. There's more to it, though, and we cover those details in "Blocks" in Chapter 5. + +What's that mean for *implicit* coercion? + +You can coerce a `number` to a `string` simply by "adding" the `number` and the `""` empty `string`: + +```js +var a = 42; +var b = a + ""; + +b; // "42" +``` + +**Tip:** Numeric addition with the `+` operator is commutative, which means `2 + 3` is the same as `3 + 2`. 
String concatenation with `+` is obviously not generally commutative, **but** with the specific case of `""`, it's effectively commutative, as `a + ""` and `"" + a` will produce the same result. + +It's extremely common/idiomatic to (*implicitly*) coerce `number` to `string` with a `+ ""` operation. In fact, interestingly, even some of the most vocal critics of *implicit* coercion still use that approach in their own code, instead of one of its *explicit* alternatives. + +**I think this is a great example** of a useful form of *implicit* coercion, despite how frequently the mechanism gets criticized! + +Comparing this *implicit* coercion of `a + ""` to our earlier example of `String(a)` *explicit* coercion, there's one additional quirk to be aware of. Because of how the `ToPrimitive` abstract operation works, `a + ""` invokes `valueOf()` on the `a` value, whose return value is then finally converted to a `string` via the internal `ToString` abstract operation. But `String(a)` just invokes `toString()` directly. + +Both approaches ultimately result in a `string`, but if you're using an `object` instead of a regular primitive `number` value, you may not necessarily get the *same* `string` value! + +Consider: + +```js +var a = { + valueOf: function() { return 42; }, + toString: function() { return 4; } +}; + +a + ""; // "42" + +String( a ); // "4" +``` + +Generally, this sort of gotcha won't bite you unless you're really trying to create confusing data structures and operations, but you should be careful if you're defining both your own `valueOf()` and `toString()` methods for some `object`, as how you coerce the value could affect the outcome. + +What about the other direction? How can we *implicitly coerce* from `string` to `number`? + +```js +var a = "3.14"; +var b = a - 0; + +b; // 3.14 +``` + +The `-` operator is defined only for numeric subtraction, so `a - 0` forces `a`'s value to be coerced to a `number`. 
While far less common, `a * 1` or `a / 1` would accomplish the same result, as those operators are also only defined for numeric operations. + +What about `object` values with the `-` operator? Similar story as for `+` above: + +```js +var a = [3]; +var b = [1]; + +a - b; // 2 +``` + +Both `array` values have to become `number`s, but they end up first being coerced to `string`s (using the expected `toString()` serialization), and then are coerced to `number`s, so that the `-` subtraction can be performed. + +So, is *implicit* coercion of `string` and `number` values the ugly evil you've always heard horror stories about? I don't personally think so. + +Compare `b = String(a)` (*explicit*) to `b = a + ""` (*implicit*). I think cases can be made for both approaches being useful in your code. Certainly `b = a + ""` is quite a bit more common in JS programs, proving its own utility regardless of *feelings* about the merits or hazards of *implicit* coercion in general. + +### Implicitly: Booleans --> Numbers + +I think a case where *implicit* coercion can really shine is in simplifying certain types of complicated `boolean` logic into simple numeric addition. Of course, this is not a general-purpose technique, but a specific solution for specific cases. + +Consider: + +```js +function onlyOne(a,b,c) { + return !!((a && !b && !c) || + (!a && b && !c) || (!a && !b && c)); +} + +var a = true; +var b = false; + +onlyOne( a, b, b ); // true +onlyOne( b, a, b ); // true + +onlyOne( a, b, a ); // false +``` + +This `onlyOne(..)` utility should only return `true` if exactly one of the arguments is `true` / truthy. It's using *implicit* coercion on the truthy checks and *explicit* coercion on the others, including the final return value. + +But what if we needed that utility to be able to handle four, five, or twenty flags in the same way? It's pretty difficult to imagine implementing code that would handle all those permutations of comparisons. 
+ +But here's where coercing the `boolean` values to `number`s (`0` or `1`, obviously) can greatly help: + +```js +function onlyOne() { + var sum = 0; + for (var i=0; i < arguments.length; i++) { + // skip falsy values. same as treating + // them as 0's, but avoids NaN's. + if (arguments[i]) { + sum += arguments[i]; + } + } + return sum == 1; +} + +var a = true; +var b = false; + +onlyOne( b, a ); // true +onlyOne( b, a, b, b, b ); // true + +onlyOne( b, b ); // false +onlyOne( b, a, b, b, b, a ); // false +``` + +**Note:** Of course, instead of the `for` loop in `onlyOne(..)`, you could more tersely use the ES5 `reduce(..)` utility, but I didn't want to obscure the concepts. + +What we're doing here is relying on the `1` for `true`/truthy coercions, and numerically adding them all up. `sum += arguments[i]` uses *implicit* coercion to make that happen. If one and only one value in the `arguments` list is `true`, then the numeric sum will be `1`, otherwise the sum will not be `1` and thus the desired condition is not met. + +We could of course do this with *explicit* coercion instead: + +```js +function onlyOne() { + var sum = 0; + for (var i=0; i < arguments.length; i++) { + sum += Number( !!arguments[i] ); + } + return sum === 1; +} +``` + +We first use `!!arguments[i]` to force the coercion of the value to `true` or `false`. That's so you could pass non-`boolean` values in, like `onlyOne( "42", 0 )`, and it would still work as expected (otherwise you'd end up with `string` concatenation and the logic would be incorrect). + +Once we're sure it's a `boolean`, we do another *explicit* coercion with `Number(..)` to make sure the value is `0` or `1`. + +Is the *explicit* coercion form of this utility "better"? It does avoid the `NaN` trap as explained in the code comments. But, ultimately, it depends on your needs. 
I personally think the former version, relying on *implicit* coercion is more elegant (if you won't be passing `undefined` or `NaN`), and the *explicit* version is needlessly more verbose. + +But as with almost everything we're discussing here, it's a judgment call. + +**Note:** Regardless of *implicit* or *explicit* approaches, you could easily make `onlyTwo(..)` or `onlyFive(..)` variations by simply changing the final comparison from `1`, to `2` or `5`, respectively. That's drastically easier than adding a bunch of `&&` and `||` expressions. So, generally, coercion is very helpful in this case. + +### Implicitly: * --> Boolean + +Now, let's turn our attention to *implicit* coercion to `boolean` values, as it's by far the most common and also by far the most potentially troublesome. + +Remember, *implicit* coercion is what kicks in when you use a value in such a way that it forces the value to be converted. For numeric and `string` operations, it's fairly easy to see how the coercions can occur. + +But, what sort of expression operations require/force (*implicitly*) a `boolean` coercion? + +1. The test expression in an `if (..)` statement. +2. The test expression (second clause) in a `for ( .. ; .. ; .. )` header. +3. The test expression in `while (..)` and `do..while(..)` loops. +4. The test expression (first clause) in `? :` ternary expressions. +5. The left-hand operand (which serves as a test expression -- see below!) to the `||` ("logical or") and `&&` ("logical and") operators. + +Any value used in these contexts that is not already a `boolean` will be *implicitly* coerced to a `boolean` using the rules of the `ToBoolean` abstract operation covered earlier in this chapter. + +Let's look at some examples: + +```js +var a = 42; +var b = "abc"; +var c; +var d = null; + +if (a) { + console.log( "yep" ); // yep +} + +while (c) { + console.log( "nope, never runs" ); +} + +c = d ? 
a : b; +c; // "abc" + +if ((a && d) || c) { + console.log( "yep" ); // yep +} +``` + +In all these contexts, the non-`boolean` values are *implicitly coerced* to their `boolean` equivalents to make the test decisions. + +### Operators `||` and `&&` + +It's quite likely that you have seen the `||` ("logical or") and `&&` ("logical and") operators in most or all other languages you've used. So it'd be natural to assume that they work basically the same in JavaScript as in other similar languages. + +There's some very little known, but very important, nuance here. + +In fact, I would argue these operators shouldn't even be called "logical ___ operators", as that name is incomplete in describing what they do. If I were to give them a more accurate (if more clumsy) name, I'd call them "selector operators," or more completely, "operand selector operators." + +Why? Because they don't actually result in a *logic* value (aka `boolean`) in JavaScript, as they do in some other languages. + +So what *do* they result in? They result in the value of one (and only one) of their two operands. In other words, **they select one of the two operand's values**. + +Quoting the ES5 spec from section 11.11: + +> The value produced by a && or || operator is not necessarily of type Boolean. The value produced will always be the value of one of the two operand expressions. + +Let's illustrate: + +```js +var a = 42; +var b = "abc"; +var c = null; + +a || b; // 42 +a && b; // "abc" + +c || b; // "abc" +c && b; // null +``` + +**Wait, what!?** Think about that. In languages like C and PHP, those expressions result in `true` or `false`, but in JS (and Python and Ruby, for that matter!), the result comes from the values themselves. + +Both `||` and `&&` operators perform a `boolean` test on the **first operand** (`a` or `c`). If the operand is not already `boolean` (as it's not, here), a normal `ToBoolean` coercion occurs, so that the test can be performed. 
+ +For the `||` operator, if the test is `true`, the `||` expression results in the value of the *first operand* (`a` or `c`). If the test is `false`, the `||` expression results in the value of the *second operand* (`b`). + +Inversely, for the `&&` operator, if the test is `true`, the `&&` expression results in the value of the *second operand* (`b`). If the test is `false`, the `&&` expression results in the value of the *first operand* (`a` or `c`). + +The result of a `||` or `&&` expression is always the underlying value of one of the operands, **not** the (possibly coerced) result of the test. In `c && b`, `c` is `null`, and thus falsy. But the `&&` expression itself results in `null` (the value in `c`), not in the coerced `false` used in the test. + +Do you see how these operators act as "operand selectors", now? + +Another way of thinking about these operators: + +```js +a || b; +// roughly equivalent to: +a ? a : b; + +a && b; +// roughly equivalent to: +a ? b : a; +``` + +**Note:** I call `a || b` "roughly equivalent" to `a ? a : b` because the outcome is identical, but there's a nuanced difference. In `a ? a : b`, if `a` was a more complex expression (like for instance one that might have side effects like calling a `function`, etc.), then the `a` expression would possibly be evaluated twice (if the first evaluation was truthy). By contrast, for `a || b`, the `a` expression is evaluated only once, and that value is used both for the coercive test as well as the result value (if appropriate). The same nuance applies to the `a && b` and `a ? b : a` expressions. + +An extremely common and helpful usage of this behavior, which there's a good chance you may have used before and not fully understood, is: + +```js +function foo(a,b) { + a = a || "hello"; + b = b || "world"; + + console.log( a + " " + b ); +} + +foo(); // "hello world" +foo( "yeah", "yeah!" ); // "yeah yeah!" 
+``` + +The `a = a || "hello"` idiom (sometimes said to be JavaScript's version of the C# "null coalescing operator") acts to test `a` and if it has no value (or only an undesired falsy value), provides a backup default value (`"hello"`). + +**Be careful**, though! + +```js +foo( "That's it!", "" ); // "That's it! world" <-- Oops! +``` + +See the problem? `""` as the second argument is a falsy value (see `ToBoolean` earlier in this chapter), so the `b = b || "world"` test fails, and the `"world"` default value is substituted, even though the intent probably was to have the explicitly passed `""` be the value assigned to `b`. + +This `||` idiom is extremely common, and quite helpful, but you have to use it only in cases where *all falsy values* should be skipped. Otherwise, you'll need to be more explicit in your test, and probably use a `? :` ternary instead. + +This *default value assignment* idiom is so common (and useful!) that even those who publicly and vehemently decry JavaScript coercion often use it in their own code! + +What about `&&`? + +There's another idiom that is quite a bit less commonly authored manually, but which is used by JS minifiers frequently. The `&&` operator "selects" the second operand if and only if the first operand tests as truthy, and this usage is sometimes called the "guard operator" (also see "Short Circuited" in Chapter 5) -- the first expression test "guards" the second expression: + +```js +function foo() { + console.log( a ); +} + +var a = 42; + +a && foo(); // 42 +``` + +`foo()` gets called only because `a` tests as truthy. If that test failed, this `a && foo()` expression statement would just silently stop -- this is known as "short circuiting" -- and never call `foo()`. + +Again, it's not nearly as common for people to author such things. Usually, they'd do `if (a) { foo(); }` instead. But JS minifiers choose `a && foo()` because it's much shorter. 
So, now, if you ever have to decipher such code, you'll know what it's doing and why. + +OK, so `||` and `&&` have some neat tricks up their sleeve, as long as you're willing to allow the *implicit* coercion into the mix. + +**Note:** Both the `a = b || "something"` and `a && b()` idioms rely on short circuiting behavior, which we cover in more detail in Chapter 5. + +The fact that these operators don't actually result in `true` and `false` is possibly messing with your head a little bit by now. You're probably wondering how all your `if` statements and `for` loops have been working, if they've included compound logical expressions like `a && (b || c)`. + +Don't worry! The sky is not falling. Your code is (probably) just fine. It's just that you probably never realized before that there was an *implicit* coercion to `boolean` going on **after** the compound expression was evaluated. + +Consider: + +```js +var a = 42; +var b = null; +var c = "foo"; + +if (a && (b || c)) { + console.log( "yep" ); +} +``` + +This code still works the way you always thought it did, except for one subtle extra detail. The `a && (b || c)` expression *actually* results in `"foo"`, not `true`. So, the `if` statement *then* forces the `"foo"` value to coerce to a `boolean`, which of course will be `true`. + +See? No reason to panic. Your code is probably still safe. But now you know more about how it does what it does. + +And now you also realize that such code is using *implicit* coercion. If you're in the "avoid (implicit) coercion camp" still, you're going to need to go back and make all of those tests *explicit*: + +```js +if (!!a && (!!b || !!c)) { + console.log( "yep" ); +} +``` + +Good luck with that! ... Sorry, just teasing. + +### Symbol Coercion + +Up to this point, there's been almost no observable outcome difference between *explicit* and *implicit* coercion -- only the readability of code has been at stake. 
+ +But ES6 Symbols introduce a gotcha into the coercion system that we need to discuss briefly. For reasons that go well beyond the scope of what we'll discuss in this book, *explicit* coercion of a `symbol` to a `string` is allowed, but *implicit* coercion of the same is disallowed and throws an error. + +Consider: + +```js +var s1 = Symbol( "cool" ); +String( s1 ); // "Symbol(cool)" + +var s2 = Symbol( "not cool" ); +s2 + ""; // TypeError +``` + +`symbol` values cannot coerce to `number` at all (throws an error either way), but strangely they can both *explicitly* and *implicitly* coerce to `boolean` (always `true`). + +Consistency is always easier to learn, and exceptions are never fun to deal with, but we just need to be careful around the new ES6 `symbol` values and how we coerce them. + +The good news: it's probably going to be exceedingly rare for you to need to coerce a `symbol` value. The way they're typically used (see Chapter 3) will probably not call for coercion on a normal basis. + +## Loose Equals vs. Strict Equals + +Loose equals is the `==` operator, and strict equals is the `===` operator. Both operators are used for comparing two values for "equality," but the "loose" vs. "strict" indicates a **very important** difference in behavior between the two, specifically in how they decide "equality." + +A very common misconception about these two operators is: "`==` checks values for equality and `===` checks both values and types for equality." While that sounds nice and reasonable, it's inaccurate. Countless well-respected JavaScript books and blogs have said exactly that, but unfortunately they're all *wrong*. + +The correct description is: "`==` allows coercion in the equality comparison and `===` disallows coercion." + +### Equality Performance + +Stop and think about the difference between the first (inaccurate) explanation and this second (accurate) one. 
+ +In the first explanation, it seems obvious that `===` is *doing more work* than `==`, because it has to *also* check the type. In the second explanation, `==` is the one *doing more work* because it has to follow through the steps of coercion if the types are different. + +Don't fall into the trap, as many have, of thinking this has anything to do with performance, though, as if `==` is going to be slower than `===` in any relevant way. While it's measurable that coercion does take *a little bit* of processing time, it's mere microseconds (yes, that's millionths of a second!). + +If you're comparing two values of the same types, `==` and `===` use the identical algorithm, and so other than minor differences in engine implementation, they should do the same work. + +If you're comparing two values of different types, the performance isn't the important factor. What you should be asking yourself is: when comparing these two values, do I want coercion or not? + +If you want coercion, use `==` loose equality, but if you don't want coercion, use `===` strict equality. + +**Note:** The implication here then is that both `==` and `===` check the types of their operands. The difference is in how they respond if the types don't match. + +### Abstract Equality + +The `==` operator's behavior is defined as "The Abstract Equality Comparison Algorithm" in section 11.9.3 of the ES5 spec. What's listed there is a comprehensive but simple algorithm that explicitly states every possible combination of types, and how the coercions (if necessary) should happen for each combination. + +**Warning:** When (*implicit*) coercion is maligned as being too complicated and too flawed to be a *useful good part*, it is these rules of "abstract equality" that are being condemned. Generally, they are said to be too complex and too unintuitive for developers to practically learn and use, and that they are prone more to causing bugs in JS programs than to enabling greater code readability. 
I believe this is a flawed premise -- that you readers are competent developers who write (and read and understand!) algorithms (aka code) all day long. So, what follows is a plain exposition of the "abstract equality" in simple terms. But I implore you to also read the ES5 spec section 11.9.3. I think you'll be surprised at just how reasonable it is. + +Basically, the first clause (11.9.3.1) says, if the two values being compared are of the same type, they are simply and naturally compared via Identity as you'd expect. For example, `42` is only equal to `42`, and `"abc"` is only equal to `"abc"`. + +Some minor exceptions to normal expectation to be aware of: + +* `NaN` is never equal to itself (see Chapter 2) +* `+0` and `-0` are equal to each other (see Chapter 2) + +The final provision in clause 11.9.3.1 is for `==` loose equality comparison with `object`s (including `function`s and `array`s). Two such values are only *equal* if they are both references to *the exact same value*. No coercion occurs here. + +**Note:** The `===` strict equality comparison is defined identically to 11.9.3.1, including the provision about two `object` values. It's a very little known fact that **`==` and `===` behave identically** in the case where two `object`s are being compared! + +The rest of the algorithm in 11.9.3 specifies that if you use `==` loose equality to compare two values of different types, one or both of the values will need to be *implicitly* coerced. This coercion happens so that both values eventually end up as the same type, which can then directly be compared for equality using simple value Identity. + +**Note:** The `!=` loose not-equality operation is defined exactly as you'd expect, in that it's literally the `==` operation comparison performed in its entirety, then the negation of the result. The same goes for the `!==` strict not-equality operation. 
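To make the same-type rules above concrete, here's a small sketch. The variable names (`obj`, `sameRef`, `lookalike`, and the result holders) are mine, purely for illustration:

```js
var obj = { a: 42 };
var sameRef = obj;          // second reference to the same object
var lookalike = { a: 42 };  // different object, same contents

var nanLoose = (NaN == NaN);          // false
var nanStrict = (NaN === NaN);        // false
var zerosLoose = (+0 == -0);          // true

var refLoose = (obj == sameRef);      // true
var refStrict = (obj === sameRef);    // true
var copyLoose = (obj == lookalike);   // false
var copyStrict = (obj === lookalike); // false

// `!=` is the negation of the full `==` comparison
var notEqual = (obj != lookalike);    // true
```

Notice that `==` and `===` agree on every line here, because the types match on both sides. They only part ways once the types differ.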
+ +#### Comparing: `string`s to `number`s + +To illustrate `==` coercion, let's first build off the `string` and `number` examples earlier in this chapter: + +```js +var a = 42; +var b = "42"; + +a === b; // false +a == b; // true +``` + +As we'd expect, `a === b` fails, because no coercion is allowed, and indeed the `42` and `"42"` values are different. + +However, the second comparison `a == b` uses loose equality, which means that if the types happen to be different, the comparison algorithm will perform *implicit* coercion on one or both values. + +But exactly what kind of coercion happens here? Does the `a` value of `42` become a `string`, or does the `b` value of `"42"` become a `number`? + +In the ES5 spec, clauses 11.9.3.4-5 say: + +> 4. If Type(x) is Number and Type(y) is String, +> return the result of the comparison x == ToNumber(y). +> 5. If Type(x) is String and Type(y) is Number, +> return the result of the comparison ToNumber(x) == y. + +**Warning:** The spec uses `Number` and `String` as the formal names for the types, while this book prefers `number` and `string` for the primitive types. Do not let the capitalization of `Number` in the spec confuse you for the `Number()` native function. For our purposes, the capitalization of the type name is irrelevant -- they have basically the same meaning. + +Clearly, the spec says the `"42"` value is coerced to a `number` for the comparison. The *how* of that coercion has already been covered earlier, specifically with the `ToNumber` abstract operation. In this case, it's quite obvious then that the resulting two `42` values are equal. + +#### Comparing: anything to `boolean` + +One of the biggest gotchas with the *implicit* coercion of `==` loose equality pops up when you try to compare a value directly to `true` or `false`. + +Consider: + +```js +var a = "42"; +var b = true; + +a == b; // false +``` + +Wait, what happened here!? We know that `"42"` is a truthy value (see earlier in this chapter). 
So, how come it's not `==` loose equal to `true`? + +The reason is both simple and deceptively tricky. It's so easy to misunderstand, many JS developers never pay close enough attention to fully grasp it. + +Let's again quote the spec, clauses 11.9.3.6-7: + +> 6. If Type(x) is Boolean, +> return the result of the comparison ToNumber(x) == y. +> 7. If Type(y) is Boolean, +> return the result of the comparison x == ToNumber(y). + +Let's break that down. First: + +```js +var x = true; +var y = "42"; + +x == y; // false +``` + +The `Type(x)` is indeed `Boolean`, so it performs `ToNumber(x)`, which coerces `true` to `1`. Now, `1 == "42"` is evaluated. The types are still different, so (essentially recursively) we reconsult the algorithm, which just as above will coerce `"42"` to `42`, and `1 == 42` is clearly `false`. + +Reverse it, and we still get the same outcome: + +```js +var x = "42"; +var y = false; + +x == y; // false +``` + +The `Type(y)` is `Boolean` this time, so `ToNumber(y)` yields `0`. `"42" == 0` recursively becomes `42 == 0`, which is of course `false`. + +In other words, **the value `"42"` is neither `== true` nor `== false`.** At first, that statement might seem crazy. How can a value be neither truthy nor falsy? + +But that's the problem! You're asking the wrong question, entirely. It's not your fault, really. Your brain is tricking you. + +`"42"` is indeed truthy, but `"42" == true` **is not performing a boolean test/coercion** at all, no matter what your brain says. `"42"` *is not* being coerced to a `boolean` (`true`), but instead `true` is being coerced to a `1`, and then `"42"` is being coerced to `42`. + +Whether we like it or not, `ToBoolean` is not even involved here, so the truthiness or falsiness of `"42"` is irrelevant to the `==` operation! + +What *is* relevant is to understand how the `==` comparison algorithm behaves with all the different type combinations. 
As it regards a `boolean` value on either side of the `==`, a `boolean` always coerces to a `number` *first*. + +If that seems strange to you, you're not alone. I personally would recommend to never, ever, under any circumstances, use `== true` or `== false`. Ever. + +But remember, I'm only talking about `==` here. `=== true` and `=== false` wouldn't allow the coercion, so they're safe from this hidden `ToNumber` coercion. + +Consider: + +```js +var a = "42"; + +// bad (will fail!): +if (a == true) { + // .. +} + +// also bad (will fail!): +if (a === true) { + // .. +} + +// good enough (works implicitly): +if (a) { + // .. +} + +// better (works explicitly): +if (!!a) { + // .. +} + +// also great (works explicitly): +if (Boolean( a )) { + // .. +} +``` + +If you avoid ever using `== true` or `== false` (aka loose equality with `boolean`s) in your code, you'll never have to worry about this truthiness/falsiness mental gotcha. + +#### Comparing: `null`s to `undefined`s + +Another example of *implicit* coercion can be seen with `==` loose equality between `null` and `undefined` values. Yet again quoting the ES5 spec, clauses 11.9.3.2-3: + +> 2. If x is null and y is undefined, return true. +> 3. If x is undefined and y is null, return true. + +`null` and `undefined`, when compared with `==` loose equality, equate to (aka coerce to) each other (as well as themselves, obviously), and no other values in the entire language. + +What this means is that `null` and `undefined` can be treated as indistinguishable for comparison purposes, if you use the `==` loose equality operator to allow their mutual *implicit* coercion. + +```js +var a = null; +var b; + +a == b; // true +a == null; // true +b == null; // true + +a == false; // false +b == false; // false +a == ""; // false +b == ""; // false +a == 0; // false +b == 0; // false +``` + +The coercion between `null` and `undefined` is safe and predictable, and no other values can give false positives in such a check. 
I recommend using this coercion to allow `null` and `undefined` to be indistinguishable and thus treated as the same value. + +For example: + +```js +var a = doSomething(); + +if (a == null) { + // .. +} +``` + +The `a == null` check will pass only if `doSomething()` returns either `null` or `undefined`, and will fail with any other value, even other falsy values like `0`, `false`, and `""`. + +The *explicit* form of the check, which disallows any such coercion, is (I think) unnecessarily much uglier (and perhaps a tiny bit less performant!): + +```js +var a = doSomething(); + +if (a === undefined || a === null) { + // .. +} +``` + +In my opinion, the form `a == null` is yet another example where *implicit* coercion improves code readability, but does so in a reliably safe way. + +#### Comparing: `object`s to non-`object`s + +If an `object`/`function`/`array` is compared to a simple scalar primitive (`string`, `number`, or `boolean`), the ES5 spec says in clauses 11.9.3.8-9: + +> 8. If Type(x) is either String or Number and Type(y) is Object, +> return the result of the comparison x == ToPrimitive(y). +> 9. If Type(x) is Object and Type(y) is either String or Number, +> return the result of the comparison ToPrimitive(x) == y. + +**Note:** You may notice that these clauses only mention `String` and `Number`, but not `Boolean`. That's because, as quoted earlier, clauses 11.9.3.6-7 take care of coercing any `Boolean` operand presented to a `Number` first. + +Consider: + +```js +var a = 42; +var b = [ 42 ]; + +a == b; // true +``` + +The `[ 42 ]` value has its `ToPrimitive` abstract operation called (see the "Abstract Value Operations" section earlier), which results in the `"42"` value. From there, it's just `42 == "42"`, which as we've already covered becomes `42 == 42`, so `a` and `b` are found to be coercively equal. 
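The steps just described can be traced by hand. This is only a sketch: `valueOf()`, `toString()`, and `Number(..)` are used here as explicit stand-ins for the `ToPrimitive` and `ToNumber` abstract operations, which you can't call directly, and the `step*` names are made up for illustration:

```js
var a = 42;
var b = [ 42 ];

// ToPrimitive( b ): `valueOf()` on an array returns the array
// itself (not a primitive), so `toString()` is consulted next
var stepValueOf = (b.valueOf() === b);  // true
var stepToString = b.toString();        // "42"

// now it's `42 == "42"`, so the string goes through ToNumber
var stepToNumber = Number( stepToString );  // 42

var outcome = (a == b);                 // true
```

Each intermediate value lines up with one clause of the algorithm, which is a handy way to reason through any `==` result that surprises you.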
+ +**Tip:** All the quirks of the `ToPrimitive` abstract operation that we discussed earlier in this chapter (`toString()`, `valueOf()`) apply here as you'd expect. This can be quite useful if you have a complex data structure that you want to define a custom `valueOf()` method on, to provide a simple value for equality comparison purposes. + +In Chapter 3, we covered "unboxing," where an `object` wrapper around a primitive value (like from `new String("abc")`, for instance) is unwrapped, and the underlying primitive value (`"abc"`) is returned. This behavior is related to the `ToPrimitive` coercion in the `==` algorithm: + +```js +var a = "abc"; +var b = Object( a ); // same as `new String( a )` + +a === b; // false +a == b; // true +``` + +`a == b` is `true` because `b` is coerced (aka "unboxed," unwrapped) via `ToPrimitive` to its underlying `"abc"` simple scalar primitive value, which is the same as the value in `a`. + +There are some values where this is not the case, though, because of other overriding rules in the `==` algorithm. Consider: + +```js +var a = null; +var b = Object( a ); // same as `Object()` +a == b; // false + +var c = undefined; +var d = Object( c ); // same as `Object()` +c == d; // false + +var e = NaN; +var f = Object( e ); // same as `new Number( e )` +e == f; // false +``` + +The `null` and `undefined` values cannot be boxed -- they have no object wrapper equivalent -- so `Object(null)` is just like `Object()` in that both just produce a normal object. + +`NaN` can be boxed to its `Number` object wrapper equivalent, but when `==` causes an unboxing, the `NaN == NaN` comparison fails because `NaN` is never equal to itself (see Chapter 2). + +### Edge Cases + +Now that we've thoroughly examined how the *implicit* coercion of `==` loose equality works (in both sensible and surprising ways), let's try to call out the worst, craziest corner cases so we can see what we need to avoid to not get bitten with coercion bugs. 
+ +First, let's examine how modifying the built-in native prototypes can produce crazy results: + +#### A Number By Any Other Value Would... + +```js +Number.prototype.valueOf = function() { + return 3; +}; + +new Number( 2 ) == 3; // true +``` + +**Warning:** `2 == 3` would not have fallen into this trap, because neither `2` nor `3` would have invoked the built-in `Number.prototype.valueOf()` method because both are already primitive `number` values and can be compared directly. However, `new Number(2)` must go through the `ToPrimitive` coercion, and thus invoke `valueOf()`. + +Evil, huh? Of course it is. No one should ever do such a thing. The fact that you *can* do this is sometimes used as a criticism of coercion and `==`. But that's misdirected frustration. JavaScript is not *bad* because you can do such things, a developer is *bad* **if they do such things**. Don't fall into the "my programming language should protect me from myself" fallacy. + +Next, let's consider another tricky example, which takes the evil from the previous example to another level: + +```js +if (a == 2 && a == 3) { + // .. +} +``` + +You might think this would be impossible, because `a` could never be equal to both `2` and `3` *at the same time*. But "at the same time" is inaccurate, since the first expression `a == 2` happens strictly *before* `a == 3`. + +So, what if we make `a.valueOf()` have side effects each time it's called, such that the first time it returns `2` and the second time it's called it returns `3`? Pretty easy: + +```js +var i = 2; + +Number.prototype.valueOf = function() { + return i++; +}; + +var a = new Number( 42 ); + +if (a == 2 && a == 3) { + console.log( "Yep, this happened." ); +} +``` + +Again, these are evil tricks. Don't do them. But also don't use them as complaints against coercion. Potential abuses of a mechanism are not sufficient evidence to condemn the mechanism. Just avoid these crazy tricks, and stick only with valid and proper usage of coercion. 
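By contrast, the use suggested in the earlier tip (a custom `valueOf()` on your own data structure) touches nothing built in. Here's a minimal sketch, assuming a hypothetical `Temperature` constructor that I've invented just for illustration:

```js
function Temperature(degrees) {
	this.degrees = degrees;
}

// defined on our own prototype only; `Number.prototype` is untouched
Temperature.prototype.valueOf = function() {
	return this.degrees;
};

var t = new Temperature( 21 );

var looseMatch = (t == 21);     // true  (ToPrimitive calls our `valueOf()`)
var strictMatch = (t === 21);   // false (no coercion allowed)

// built-in wrappers still behave normally
var builtIn = (new Number( 2 ) == 2);   // true
```

The coercion machinery is the same one the evil examples exploit. The difference is that the custom behavior is scoped to values you own, and it has no side effects.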
+ +#### False-y Comparisons + +The most common complaint against *implicit* coercion in `==` comparisons comes from how falsy values behave surprisingly when compared to each other. + +To illustrate, let's look at a list of the corner-cases around falsy value comparisons, to see which ones are reasonable and which are troublesome: + +```js +"0" == null; // false +"0" == undefined; // false +"0" == false; // true -- UH OH! +"0" == NaN; // false +"0" == 0; // true +"0" == ""; // false + +false == null; // false +false == undefined; // false +false == NaN; // false +false == 0; // true -- UH OH! +false == ""; // true -- UH OH! +false == []; // true -- UH OH! +false == {}; // false + +"" == null; // false +"" == undefined; // false +"" == NaN; // false +"" == 0; // true -- UH OH! +"" == []; // true -- UH OH! +"" == {}; // false + +0 == null; // false +0 == undefined; // false +0 == NaN; // false +0 == []; // true -- UH OH! +0 == {}; // false +``` + +In this list of 24 comparisons, 17 of them are quite reasonable and predictable. For example, we know that `""` and `NaN` are not at all equatable values, and indeed they don't coerce to be loose equals, whereas `"0"` and `0` are reasonably equatable and *do* coerce as loose equals. + +However, seven of the comparisons are marked with "UH OH!" because as false positives, they are much more likely gotchas that could trip you up. `""` and `0` are definitely distinctly different values, and it's rare you'd want to treat them as equatable, so their mutual coercion is troublesome. Note that there aren't any false negatives here. + +#### The Crazy Ones + +We don't have to stop there, though. We can keep looking for even more troublesome coercions: + +```js +[] == ![]; // true +``` + +Oooo, that seems at a higher level of crazy, right!? Your brain may likely trick you that you're comparing a truthy to a falsy value, so the `true` result is surprising, as we *know* a value can never be truthy and falsy at the same time! 
+ +But that's not what's actually happening. Let's break it down. What do we know about the `!` unary operator? It explicitly coerces to a `boolean` using the `ToBoolean` rules (and it also flips the parity). So before `[] == ![]` is even processed, it's actually already translated to `[] == false`. We already saw that form in our above list (`false == []`), so its surprise result is *not new* to us. + +How about other corner cases? + +```js +2 == [2]; // true +"" == [null]; // true +``` + +As we said earlier in our `ToNumber` discussion, the right-hand side `[2]` and `[null]` values will go through a `ToPrimitive` coercion so they can be more readily compared to the simple primitives (`2` and `""`, respectively) on the left-hand side. Since the `valueOf()` for `array` values just returns the `array` itself, coercion falls to stringifying the `array`. + +`[2]` will become `"2"`, which then is `ToNumber` coerced to `2` for the right-hand side value in the first comparison. `[null]` just straight becomes `""`. + +So, `2 == 2` and `"" == ""` are completely understandable. + +If your instinct is to still dislike these results, your frustration is not actually with coercion like you probably think it is. It's actually a complaint against the default `array` values' `ToPrimitive` behavior of coercing to a `string` value. More likely, you'd just wish that `[2].toString()` didn't return `"2"`, or that `[null].toString()` didn't return `""`. + +But what exactly *should* these `string` coercions result in? I can't really think of any other appropriate `string` coercion of `[2]` than `"2"`, except perhaps `"[2]"` -- but that could be very strange in other contexts! + +You could rightly make the case that since `String(null)` becomes `"null"`, then `String([null])` should also become `"null"`. That's a reasonable assertion. So, that's the real culprit. + +*Implicit* coercion itself isn't the evil here. Even an *explicit* coercion of `[null]` to a `string` results in `""`. 
What's at odds is whether it's sensible at all for `array` values to stringify to the equivalent of their contents, and exactly how that happens. So, direct your frustration at the rules for `String( [..] )`, because that's where the craziness stems from. Perhaps there should be no stringification coercion of `array`s at all? But that would have lots of other downsides in other parts of the language. + +Another famously cited gotcha: + +```js +0 == "\n"; // true +``` + +As we discussed earlier with empty `""`, `"\n"` (or `" "` or any other whitespace combination) is coerced via `ToNumber`, and the result is `0`. What other `number` value would you expect whitespace to coerce to? Does it bother you that *explicit* `Number(" ")` yields `0`? + +Really the only other reasonable `number` value that empty strings or whitespace strings could coerce to is the `NaN`. But would that *really* be better? The comparison `" " == NaN` would of course fail, but it's unclear that we'd have really *fixed* any of the underlying concerns. + +The chances that a real-world JS program fails because `0 == "\n"` are awfully rare, and such corner cases are easy to avoid. + +Type conversions **always** have corner cases, in any language -- nothing specific to coercion. The issues here are about second-guessing a certain set of corner cases (and perhaps rightly so!?), but that's not a salient argument against the overall coercion mechanism. + +Bottom line: almost any crazy coercion between *normal values* that you're likely to run into (aside from intentionally tricky `valueOf()` or `toString()` hacks as earlier) will boil down to the short seven-item list of gotcha coercions we've identified above. 
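The whitespace cases discussed above are easy to check for yourself. A quick sketch, using the *explicit* `Number(..)` as a stand-in for the `ToNumber` operation that `==` applies (the variable names are just for illustration):

```js
var fromEmpty = Number( "" );         // 0
var fromSpace = Number( " " );        // 0
var fromNewline = Number( "\n" );     // 0
var fromMixed = Number( " \t\n " );   // 0

var looseNewline = (0 == "\n");       // true
var strictNewline = (0 === "\n");     // false
```

The *explicit* and *implicit* forms agree, so whatever you think of the `0` result, it comes from the `ToNumber` rules for strings, not from `==` itself.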
+ +To contrast against these 24 likely suspects for coercion gotchas, consider another list like this: + +```js +42 == "43"; // false +"foo" == 42; // false +"true" == true; // false + +42 == "42"; // true +"foo" == [ "foo" ]; // true +``` + +In these nonfalsy, noncorner cases (and there are literally an infinite number of comparisons we could put on this list), the coercion results are totally safe, reasonable, and explainable. + +#### Sanity Check + +OK, we've definitely found some crazy stuff when we've looked deeply into *implicit* coercion. No wonder that most developers claim coercion is evil and should be avoided, right!? + +But let's take a step back and do a sanity check. + +By way of magnitude comparison, we have *a list* of seven troublesome gotcha coercions, but we have *another list* of (at least 17, but actually infinite) coercions that are totally sane and explainable. + +If you're looking for a textbook example of "throwing the baby out with the bathwater," this is it: discarding the entirety of coercion (the infinitely large list of safe and useful behaviors) because of a list of literally just seven gotchas. + +The more prudent reaction would be to ask, "how can I use the countless *good parts* of coercion, but avoid the few *bad parts*?" + +Let's look again at the *bad* list: + +```js +"0" == false; // true -- UH OH! +false == 0; // true -- UH OH! +false == ""; // true -- UH OH! +false == []; // true -- UH OH! +"" == 0; // true -- UH OH! +"" == []; // true -- UH OH! +0 == []; // true -- UH OH! +``` + +Four of the seven items on this list involve `== false` comparison, which we said earlier you should **always, always** avoid. That's a pretty easy rule to remember. + +Now the list is down to three. + +```js +"" == 0; // true -- UH OH! +"" == []; // true -- UH OH! +0 == []; // true -- UH OH! +``` + +Are these reasonable coercions you'd do in a normal JavaScript program? Under what conditions would they really happen? 
+ +I don't think it's terribly likely that you'd literally use `== []` in a `boolean` test in your program, at least not if you know what you're doing. You'd probably instead be doing `== ""` or `== 0`, like: + +```js +function doSomething(a) { + if (a == "") { + // .. + } +} +``` + +You'd have an oops if you accidentally called `doSomething(0)` or `doSomething([])`. Another scenario: + +```js +function doSomething(a,b) { + if (a == b) { + // .. + } +} +``` + +Again, this could break if you did something like `doSomething("",0)` or `doSomething([],"")`. + +So, while the situations *can* exist where these coercions will bite you, and you'll want to be careful around them, they're probably not super common on the whole of your code base. + +#### Safely Using Implicit Coercion + +The most important advice I can give you: examine your program and reason about what values can show up on either side of an `==` comparison. To effectively avoid issues with such comparisons, here's some heuristic rules to follow: + +1. If either side of the comparison can have `true` or `false` values, don't ever, EVER use `==`. +2. If either side of the comparison can have `[]`, `""`, or `0` values, seriously consider not using `==`. + +In these scenarios, it's almost certainly better to use `===` instead of `==`, to avoid unwanted coercion. Follow those two simple rules and pretty much all the coercion gotchas that could reasonably hurt you will effectively be avoided. + +**Being more explicit/verbose in these cases will save you from a lot of headaches.** + +The question of `==` vs. `===` is really appropriately framed as: should you allow coercion for a comparison or not? + +There's lots of cases where such coercion can be helpful, allowing you to more tersely express some comparison logic (like with `null` and `undefined`, for example). + +In the overall scheme of things, there's relatively few cases where *implicit* coercion is truly dangerous. 
But in those places, for safety's sake, definitely use `===`. + +**Tip:** Another place where coercion is guaranteed *not* to bite you is with the `typeof` operator. `typeof` is always going to return you one of seven strings (see Chapter 1), and none of them are the empty `""` string. As such, there's no case where checking the type of some value is going to run afoul of *implicit* coercion. `typeof x == "function"` is 100% as safe and reliable as `typeof x === "function"`. Literally, the spec says the algorithm will be identical in this situation. So, don't just blindly use `===` everywhere simply because that's what your code tools tell you to do, or (worst of all) because you've been told in some book to **not think about it**. You own the quality of your code. + +Is *implicit* coercion evil and dangerous? In a few cases, yes, but overwhelmingly, no. + +Be a responsible and mature developer. Learn how to use the power of coercion (both *explicit* and *implicit*) effectively and safely. And teach those around you to do the same. + +Here's a handy table made by Alex Dorey (@dorey on GitHub) to visualize a variety of comparisons: + + + +Source: https://github.com/dorey/JavaScript-Equality-Table + +## Abstract Relational Comparison + +While this part of *implicit* coercion often gets a lot less attention, it's important nonetheless to think about what happens with `a < b` comparisons (similar to how we just examined `a == b` in depth). + +The "Abstract Relational Comparison" algorithm in ES5 section 11.8.5 essentially divides itself into two parts: what to do if the comparison involves both `string` values (second half), or anything else (first half). + +**Note:** The algorithm is only defined for `a < b`. So, `a > b` is handled as `b < a`. 
+ +The algorithm first calls `ToPrimitive` coercion on both values, and if the return result of either call is not a `string`, then both values are coerced to `number` values using the `ToNumber` operation rules, and compared numerically. + +For example: + +```js +var a = 42; +var b = [ "43" ]; + +a < b; // true +b < a; // false +``` + +**Note:** Similar caveats for `-0` and `NaN` apply here as they did in the `==` algorithm discussed earlier. + +However, if both values are `string`s for the `<` comparison, simple lexicographic (natural alphabetic) comparison on the characters is performed: + +```js +var a = [ "42" ]; +var b = [ "043" ]; + +a < b; // false +``` + +`a` and `b` are *not* coerced to `number`s, because both of them end up as `string`s after the `ToPrimitive` coercion on the two `array`s. So, `"42"` is compared character by character to `"043"`, starting with the first characters `"4"` and `"0"`, respectively. Since `"0"` is lexicographically *less than* `"4"`, the comparison returns `false`. + +The exact same behavior and reasoning goes for: + +```js +var a = [ 4, 2 ]; +var b = [ 0, 4, 3 ]; + +a < b; // false +``` + +Here, `a` becomes `"4,2"` and `b` becomes `"0,4,3"`, and those lexicographically compare identically to the previous snippet. + +What about: + +```js +var a = { b: 42 }; +var b = { b: 43 }; + +a < b; // ?? +``` + +`a < b` is also `false`, because `a` becomes `[object Object]` and `b` becomes `[object Object]`, and so clearly `a` is not lexicographically less than `b`. + +But strangely: + +```js +var a = { b: 42 }; +var b = { b: 43 }; + +a < b; // false +a == b; // false +a > b; // false + +a <= b; // true +a >= b; // true +``` + +Why is `a == b` not `true`? They're the same `string` value (`"[object Object]"`), so it seems they should be equal, right? Nope. Recall the previous discussion about how `==` works with `object` references. 
+ +But then how are `a <= b` and `a >= b` resulting in `true`, if `a < b` **and** `a == b` **and** `a > b` are all `false`? + +Because the spec says for `a <= b`, it will actually evaluate `b < a` first, and then negate that result. Since `b < a` is *also* `false`, the result of `a <= b` is `true`. + +That's probably awfully contrary to how you might have explained what `<=` does up to now, which would likely have been the literal: "less than *or* equal to." JS more accurately considers `<=` as "not greater than" (`!(a > b)`, which JS treats as `!(b < a)`). Moreover, `a >= b` is explained by first considering it as `b <= a`, and then applying the same reasoning. + +Unfortunately, there is no "strict relational comparison" as there is for equality. In other words, there's no way to prevent *implicit* coercion from occurring with relational comparisons like `a < b`, other than to ensure that `a` and `b` are of the same type explicitly before making the comparison. + +Use the same reasoning from our earlier `==` vs. `===` sanity check discussion. If coercion is helpful and reasonably safe, like in a `42 < "43"` comparison, **use it**. On the other hand, if you need to be safe about a relational comparison, *explicitly coerce* the values first, before using `<` (or its counterparts). + +```js +var a = [ 42 ]; +var b = "043"; + +a < b; // false -- string comparison! +Number( a ) < Number( b ); // true -- number comparison! +``` + +## Review + +In this chapter, we turned our attention to how JavaScript type conversions happen, called **coercion**, which can be characterized as either *explicit* or *implicit*. + +Coercion gets a bad rap, but it's actually quite useful in many cases. An important task for the responsible JS developer is to take the time to learn all the ins and outs of coercion to decide which parts will help improve their code, and which parts they really should avoid. 
+ +*Explicit* coercion is code where it is obvious that the intent is to convert a value from one type to another. The benefit is improvement in readability and maintainability of code by reducing confusion. + +*Implicit* coercion is coercion that is "hidden" as a side-effect of some other operation, where it's not as obvious that the type conversion will occur. While it may seem that *implicit* coercion is the opposite of *explicit* and is thus bad (and indeed, many think so!), actually *implicit* coercion is also about improving the readability of code. + +Especially for *implicit*, coercion must be used responsibly and consciously. Know why you're writing the code you're writing, and how it works. Strive to write code that others will easily be able to learn from and understand as well. diff --git a/types & grammar/ch5.md b/types & grammar/ch5.md new file mode 100644 index 0000000..7aa277f --- /dev/null +++ b/types & grammar/ch5.md @@ -0,0 +1,1387 @@ +# You Don't Know JS: Types & Grammar +# Chapter 5: Grammar + +The last major topic we want to tackle is how JavaScript's language syntax works (aka its grammar). You may think you know how to write JS, but there's an awful lot of nuance to various parts of the language grammar that lead to confusion and misconception, so we want to dive into those parts and clear some things up. + +**Note:** The term "grammar" may be a little less familiar to readers than the term "syntax." In many ways, they are similar terms, describing the *rules* for how the language works. There are nuanced differences, but they mostly don't matter for our discussion here. The grammar for JavaScript is a structured way to describe how the syntax (operators, keywords, etc.) fits together into well-formed, valid programs. In other words, discussing syntax without grammar would leave out a lot of the important details. 
So our focus here in this chapter is most accurately described as *grammar*, even though the raw syntax of the language is what developers directly interact with. + +## Statements & Expressions + +It's fairly common for developers to assume that the term "statement" and "expression" are roughly equivalent. But here we need to distinguish between the two, because there are some very important differences in our JS programs. + +To draw the distinction, let's borrow from terminology you may be more familiar with: the English language. + +A "sentence" is one complete formation of words that expresses a thought. It's comprised of one or more "phrases," each of which can be connected with punctuation marks or conjunction words ("and," "or," etc). A phrase can itself be made up of smaller phrases. Some phrases are incomplete and don't accomplish much by themselves, while other phrases can stand on their own. These rules are collectively called the *grammar* of the English language. + +And so it goes with JavaScript grammar. Statements are sentences, expressions are phrases, and operators are conjunctions/punctuation. + +Every expression in JS can be evaluated down to a single, specific value result. For example: + +```js +var a = 3 * 6; +var b = a; +b; +``` + +In this snippet, `3 * 6` is an expression (evaluates to the value `18`). But `a` on the second line is also an expression, as is `b` on the third line. The `a` and `b` expressions both evaluate to the values stored in those variables at that moment, which also happens to be `18`. + +Moreover, each of the three lines is a statement containing expressions. `var a = 3 * 6` and `var b = a` are called "declaration statements" because they each declare a variable (and optionally assign a value to it). The `a = 3 * 6` and `b = a` assignments (minus the `var`s) are called assignment expressions. 
+ +The third line contains just the expression `b`, but it's also a statement all by itself (though not a terribly interesting one!). This is generally referred to as an "expression statement." + +### Statement Completion Values + +It's a fairly little known fact that statements all have completion values (even if that value is just `undefined`). + +How would you even go about seeing the completion value of a statement? + +The most obvious answer is to type the statement into your browser's developer console, because when you execute it, the console by default reports the completion value of the most recent statement it executed. + +Let's consider `var b = a`. What's the completion value of that statement? + +The `b = a` assignment expression results in the value that was assigned (`18` above), but the `var` statement itself results in `undefined`. Why? Because `var` statements are defined that way in the spec. If you put `var a = 42;` into your console, you'll see `undefined` reported back instead of `42`. + +**Note:** Technically, it's a little more complex than that. In the ES5 spec, section 12.2 "Variable Statement," the `VariableDeclaration` algorithm actually *does* return a value (a `string` containing the name of the variable declared -- weird, huh!?), but that value is basically swallowed up (except for use by the `for..in` loop) by the `VariableStatement` algorithm, which forces an empty (aka `undefined`) completion value. + +In fact, if you've done much code experimenting in your console (or in a JavaScript environment REPL -- read/evaluate/print/loop tool), you've probably seen `undefined` reported after many different statements, and perhaps never realized why or what that was. Put simply, the console is just reporting the statement's completion value. + +But what the console prints out for the completion value isn't something we can use inside our program. So how can we capture the completion value? + +That's a much more complicated task. 
Before we explain *how*, let's explore *why* you would want to do that. + +We need to consider other types of statement completion values. For example, any regular `{ .. }` block has a completion value of the completion value of its last contained statement/expression. + +Consider: + +```js +var b; + +if (true) { + b = 4 + 38; +} +``` + +If you typed that into your console/REPL, you'd probably see `42` reported, since `42` is the completion value of the `if` block, which took on the completion value of its last assignment expression statement `b = 4 + 38`. + +In other words, the completion value of a block is like an *implicit return* of the last statement value in the block. + +**Note:** This is conceptually familiar in languages like CoffeeScript, which have implicit `return` values from `function`s that are the same as the last statement value in the function. + +But there's an obvious problem. This kind of code doesn't work: + +```js +var a, b; + +a = if (true) { + b = 4 + 38; +}; +``` + +We can't capture the completion value of a statement and assign it into another variable in any easy syntactic/grammatical way (at least not yet!). + +So, what can we do? + +**Warning:** For demo purposes only -- don't actually do the following in your real code! + +We could use the much maligned `eval(..)` (sometimes pronounced "evil") function to capture this completion value. + +```js +var a, b; + +a = eval( "if (true) { b = 4 + 38; }" ); + +a; // 42 +``` + +Yeeeaaahhhh. That's terribly ugly. But it works! And it illustrates the point that statement completion values are a real thing that can be captured not just in our console but in our programs. + +There's a proposal for ES7 called "do expression." Here's how it might work: + +```js +var a, b; + +a = do { + if (true) { + b = 4 + 38; + } +}; + +a; // 42 +``` + +The `do { .. 
}` expression executes a block (with one or many statements in it), and the final statement completion value inside the block becomes the completion value *of* the `do` expression, which can then be assigned to `a` as shown. + +The general idea is to be able to treat statements as expressions -- they can show up inside other statements -- without needing to wrap them in an inline function expression and perform an explicit `return ..`. + +For now, statement completion values are not much more than trivia. But they're probably going to take on more significance as JS evolves, and hopefully `do { .. }` expressions will reduce the temptation to use stuff like `eval(..)`. + +**Warning:** Repeating my earlier admonition: avoid `eval(..)`. Seriously. See the *Scope & Closures* title of this series for more explanation. + +### Expression Side Effects + +Most expressions don't have side effects. For example: + +```js +var a = 2; +var b = a + 3; +``` + +The expression `a + 3` did not *itself* have a side effect, like for instance changing `a`. It had a result, which is `5`, and that result was assigned to `b` in the statement `b = a + 3`. + +The most common example of an expression with (possible) side effects is a function call expression: + +```js +function foo() { + a = a + 1; +} + +var a = 1; +foo(); // result: `undefined`, side effect: changed `a` +``` + +There are other side-effecting expressions, though. For example: + +```js +var a = 42; +var b = a++; +``` + +The expression `a++` has two separate behaviors. *First*, it returns the current value of `a`, which is `42` (which then gets assigned to `b`). But *next*, it changes the value of `a` itself, incrementing it by one. + +```js +var a = 42; +var b = a++; + +a; // 43 +b; // 42 +``` + +Many developers would mistakenly believe that `b` has value `43` just like `a` does. But the confusion comes from not fully considering the *when* of the side effects of the `++` operator. 
+ +The `++` increment operator and the `--` decrement operator are both unary operators (see Chapter 4), which can be used in either a postfix ("after") position or prefix ("before") position. + +```js +var a = 42; + +a++; // 42 +a; // 43 + +++a; // 44 +a; // 44 +``` + +When `++` is used in the prefix position as `++a`, its side effect (incrementing `a`) happens *before* the value is returned from the expression, rather than *after* as with `a++`. + +**Note:** Would you think `++a++` was legal syntax? If you try it, you'll get a `ReferenceError` error, but why? Because side-effecting operators **require a variable reference** to target their side effects to. For `++a++`, the `a++` part is evaluated first (because of operator precedence -- see below), which gives back the value of `a` _before_ the increment. But then it tries to evaluate `++42`, which (if you try it) gives the same `ReferenceError` error, since `++` can't have a side effect directly on a value like `42`. + +It is sometimes mistakenly thought that you can encapsulate the *after* side effect of `a++` by wrapping it in a `( )` pair, like: + +```js +var a = 42; +var b = (a++); + +a; // 43 +b; // 42 +``` + +Unfortunately, `( )` itself doesn't define a new wrapped expression that would be evaluated *after* the *after side effect* of the `a++` expression, as we might have hoped. In fact, even if it did, `a++` returns `42` first, and unless you have another expression that reevaluates `a` after the side effect of `++`, you're not going to get `43` from that expression, so `b` will not be assigned `43`. + +There's an option, though: the `,` statement-series comma operator. This operator allows you to string together multiple standalone expression statements into a single statement: + +```js +var a = 42, b; +b = ( a++, a ); + +a; // 43 +b; // 43 +``` + +**Note:** The `( .. )` around `a++, a` is required here. The reason is operator precedence, which we'll cover later in this chapter. 
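To see why the parentheses matter, here is a small sketch contrasting the two forms. The variable names `x` and `y` are only for illustration; the point is that `=` binds more tightly than `,`, so without the `( .. )` the assignment happens *before* the comma operator gets involved:

```js
var a = 42, b;
b = ( a++, a );		// comma expression's result (the second `a`) is assigned

a;	// 43
b;	// 43

var x = 42, y;
y = x++, x;		// parsed as `(y = x++), x`

x;	// 43
y;	// 42
```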
+ +The expression `a++, a` means that the second `a` statement expression gets evaluated *after* the *after side effects* of the first `a++` statement expression, which means it returns the `43` value for assignment to `b`. + +Another example of a side-effecting operator is `delete`. As we showed in Chapter 2, `delete` is used to remove a property from an `object` or a slot from an `array`. But it's usually just called as a standalone statement: + +```js +var obj = { + a: 42 +}; + +obj.a; // 42 +delete obj.a; // true +obj.a; // undefined +``` + +The result value of the `delete` operator is `true` if the requested operation is valid/allowable, or `false` otherwise. But the side effect of the operator is that it removes the property (or array slot). + +**Note:** What do we mean by valid/allowable? Nonexistent properties, or properties that exist and are configurable (see Chapter 3 of the *this & Object Prototypes* title of this series) will return `true` from the `delete` operator. Otherwise, the result will be `false` or an error. + +One last example of a side-effecting operator, which may at once be both obvious and nonobvious, is the `=` assignment operator. + +Consider: + +```js +var a; + +a = 42; // 42 +a; // 42 +``` + +It may not seem like `=` in `a = 42` is a side-effecting operator for the expression. But if we examine the result value of the `a = 42` statement, it's the value that was just assigned (`42`), so the assignment of that same value into `a` is essentially a side effect. + +**Tip:** The same reasoning about side effects goes for the compound-assignment operators like `+=`, `-=`, etc. For example, `a = b += 2` is processed first as `b += 2` (which is `b = b + 2`), and the result of *that* `=` assignment is then assigned to `a`. 
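A minimal sketch of that compound-assignment case, assuming `b` starts out as `40` (the starting value is just an illustration):

```js
var a, b = 40;

a = b += 2;		// `b += 2` runs first and results in `42`

a;	// 42
b;	// 42
```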
+ +This behavior that an assignment expression (or statement) results in the assigned value is primarily useful for chained assignments, such as: + +```js +var a, b, c; + +a = b = c = 42; +``` + +Here, `c = 42` is evaluated to `42` (with the side effect of assigning `42` to `c`), then `b = 42` is evaluated to `42` (with the side effect of assigning `42` to `b`), and finally `a = 42` is evaluated (with the side effect of assigning `42` to `a`). + +**Warning:** A common mistake developers make with chained assignments is like `var a = b = 42`. While this looks like the same thing, it's not. If that statement were to happen without there also being a separate `var b` (somewhere in the scope) to formally declare `b`, then `var a = b = 42` would not declare `b` directly. Depending on `strict` mode, that would either throw an error or create an accidental global (see the *Scope & Closures* title of this series). + +Another scenario to consider: + +```js +function vowels(str) { + var matches; + + if (str) { + // pull out all the vowels + matches = str.match( /[aeiou]/g ); + + if (matches) { + return matches; + } + } +} + +vowels( "Hello World" ); // ["e","o","o"] +``` + +This works, and many developers prefer such. But using an idiom where we take advantage of the assignment side effect, we can simplify by combining the two `if` statements into one: + +```js +function vowels(str) { + var matches; + + // pull out all the vowels + if (str && (matches = str.match( /[aeiou]/g ))) { + return matches; + } +} + +vowels( "Hello World" ); // ["e","o","o"] +``` + +**Note:** The `( .. )` around `matches = str.match..` is required. The reason is operator precedence, which we'll cover in the "Operator Precedence" section later in this chapter. + +I prefer this shorter style, as I think it makes it clearer that the two conditionals are in fact related rather than separate. But as with most stylistic choices in JS, it's purely opinion which one is *better*. 
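The same assignment-inside-a-condition idiom is commonly seen in loops, too. As a sketch (the `vowelPositions(..)` helper is hypothetical, made up only for this illustration), a global regular expression's `exec(..)` can be called repeatedly, with each result assigned and tested in the `while` condition:

```js
function vowelPositions(str) {
	var re = /[aeiou]/g, match, positions = [];

	// `match = re.exec(..)` results in the assigned value, which is
	// `null` (falsy) once there are no more matches
	while ((match = re.exec( str ))) {
		positions.push( match.index );
	}

	return positions;
}

vowelPositions( "Hello World" );	// [1,4,7]
```

Whether that reads better than a separate assignment followed by a test is, again, a matter of taste.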
+ +### Contextual Rules + +There are quite a few places in the JavaScript grammar rules where the same syntax means different things depending on where/how it's used. This kind of thing can, in isolation, cause quite a bit of confusion. + +We won't exhaustively list all such cases here, but just call out a few of the common ones. + +#### `{ .. }` Curly Braces + +There's two main places (and more coming as JS evolves!) that a pair of `{ .. }` curly braces will show up in your code. Let's take a look at each of them. + +##### Object Literals + +First, as an `object` literal: + +```js +// assume there's a `bar()` function defined + +var a = { + foo: bar() +}; +``` + +How do we know this is an `object` literal? Because the `{ .. }` pair is a value that's getting assigned to `a`. + +**Note:** The `a` reference is called an "l-value" (aka left-hand value) since it's the target of an assignment. The `{ .. }` pair is an "r-value" (aka right-hand value) since it's used *just* as a value (in this case as the source of an assignment). + +##### Labels + +What happens if we remove the `var a =` part of the above snippet? + +```js +// assume there's a `bar()` function defined + +{ + foo: bar() +} +``` + +A lot of developers assume that the `{ .. }` pair is just a standalone `object` literal that doesn't get assigned anywhere. But it's actually entirely different. + +Here, `{ .. }` is just a regular code block. It's not very idiomatic in JavaScript (much more so in other languages!) to have a standalone `{ .. }` block like that, but it's perfectly valid JS grammar. It can be especially helpful when combined with `let` block-scoping declarations (see the *Scope & Closures* title in this series). + +The `{ .. }` code block here is functionally pretty much identical to the code block being attached to some statement, like a `for`/`while` loop, `if` conditional, etc. + +But if it's a normal block of code, what's that bizarre looking `foo: bar()` syntax, and how is that legal? 
+ +It's because of a little known (and, frankly, discouraged) feature in JavaScript called "labeled statements." `foo` is a label for the statement `bar()` (which has omitted its trailing `;` -- see "Automatic Semicolons" later in this chapter). But what's the point of a labeled statement? + +If JavaScript had a `goto` statement, you'd theoretically be able to say `goto foo` and have execution jump to that location in code. `goto`s are usually considered terrible coding idioms as they make code much harder to understand (aka "spaghetti code"), so it's a *very good thing* that JavaScript doesn't have a general `goto`. + +However, JS *does* support a limited, special form of `goto`: labeled jumps. Both the `continue` and `break` statements can optionally accept a specified label, in which case the program flow "jumps" kind of like a `goto`. Consider: + +```js +// `foo` labeled-loop +foo: for (var i=0; i<4; i++) { + for (var j=0; j<4; j++) { + // whenever the loops meet, continue outer loop + if (j == i) { + // jump to the next iteration of + // the `foo` labeled-loop + continue foo; + } + + // skip odd multiples + if ((j * i) % 2 == 1) { + // normal (non-labeled) `continue` of inner loop + continue; + } + + console.log( i, j ); + } +} +// 1 0 +// 2 0 +// 2 1 +// 3 0 +// 3 2 +``` + +**Note:** `continue foo` does not mean "go to the 'foo' labeled position to continue", but rather, "continue the loop that is labeled 'foo' with its next iteration." So, it's not *really* an arbitrary `goto`. + +As you can see, we skipped over the odd-multiple `3 1` iteration, but the labeled-loop jump also skipped iterations `1 1` and `2 2`. + +Perhaps a slightly more useful form of the labeled jump is with `break __` from inside an inner loop where you want to break out of the outer loop. 
Without a labeled `break`, this same logic could sometimes be rather awkward to write: + +```js +// `foo` labeled-loop +foo: for (var i=0; i<4; i++) { + for (var j=0; j<4; j++) { + if ((i * j) >= 3) { + console.log( "stopping!", i, j ); + // break out of the `foo` labeled loop + break foo; + } + + console.log( i, j ); + } +} +// 0 0 +// 0 1 +// 0 2 +// 0 3 +// 1 0 +// 1 1 +// 1 2 +// stopping! 1 3 +``` + +**Note:** `break foo` does not mean "go to the 'foo' labeled position to continue," but rather, "break out of the loop/block that is labeled 'foo' and continue *after* it." Not exactly a `goto` in the traditional sense, huh? + +The nonlabeled `break` alternative to the above would probably need to involve one or more functions, shared scope variable access, etc. It would quite likely be more confusing than labeled `break`, so here using a labeled `break` is perhaps the better option. + +A label can apply to a non-loop block, but only `break` can reference such a non-loop label. You can do a labeled `break ___` out of any labeled block, but you cannot `continue ___` a non-loop label, nor can you do a non-labeled `break` out of a block. + +```js +function foo() { + // `bar` labeled-block + bar: { + console.log( "Hello" ); + break bar; + console.log( "never runs" ); + } + console.log( "World" ); +} + +foo(); +// Hello +// World +``` + +Labeled loops/blocks are extremely uncommon, and often frowned upon. It's best to avoid them if possible; for example using function calls instead of the loop jumps. But there are perhaps some limited cases where they might be useful. If you're going to use a labeled jump, make sure to document what you're doing with plenty of comments! + +It's a very common belief that JSON is a proper subset of JS, so a string of JSON (like `{"a":42}` -- notice the quotes around the property name as JSON requires!) is thought to be a valid JavaScript program. **Not true!** Try putting `{"a":42}` into your JS console, and you'll get an error. 
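If you'd rather observe this from a program than from a console, here's a rough sketch. It's for demonstration only: it uses the `Function(..)` constructor to parse a string as code, which has the same drawbacks as `eval(..)`, and the variable names are just illustrative.

```js
var json = '{"a":42}';

// as data, `JSON.parse(..)` accepts it
var parsed = JSON.parse( json );
parsed.a;		// 42

// as JS code (here, a function body), it fails to parse
var threw = false;
try {
	new Function( json );
}
catch (err) {
	threw = err instanceof SyntaxError;
}
threw;			// true

// wrapped in `( .. )`, the braces are read as an object literal
var viaParens = new Function( "return (" + json + ");" )();
viaParens.a;	// 42
```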
+ +That's because statement labels cannot have quotes around them, so `"a"` is not a valid label, and thus `:` can't come right after it. + +So, JSON is truly a subset of JS syntax, but JSON is not valid JS grammar by itself. + +One extremely common misconception along these lines is that if you were to load a JS file into a `