A study on refactoring chained promises into async/await
I wrote some code a while ago in a hurry. It works well and outputs the correct result, but I honestly hate the style.
See here:
const fs = require("fs").promises;
const { resolve } = require("path");

const collectFiles = async (dir, collected = []) => {
  const files = await fs.readdir(dir)
    // Resolve each file name to a full path.
    .then((list) => (
      list.map(file => resolve(dir, file))
    ))
    // Pair each path with its fs.Stats object.
    .then((list) => (
      Promise.all(
        list.map(async (file) => {
          const stat = await fs.stat(file);
          return {
            file,
            stat,
          };
        })
      )
    ))
    // Recurse into subdirectories, accumulating into `collected`.
    .then((list) => (
      Promise.all(
        list.map(async (item) => {
          if (item.stat.isDirectory()) {
            list.push(
              (await collectFiles(item.file, collected))
            );
          }
          return item;
        })
      )
    ))
    // Keep only regular ".md" files and map back to plain paths.
    .then((list) => (
      list
        .filter(({ stat }) => stat.isFile())
        .filter(({ file }) => file.endsWith(".md"))
        .map(({ file }) => file)
    ));
  collected.push(...files);
  return collected;
}
The code works and it is concise; however, it combines async/await with a promise chain.
Why does this irritate me?
The promise chain uses anonymous functions that have no context, which makes it hard to understand: at each phase you must read every line of code before it makes sense.
As the chain grows, it only becomes harder to understand.
Testing this code is painful, but to save some time we can run it against the file system and assert the desired result. Not ideal, but it will do for now.
If you need to dig into the example, check out the GitHub repo.
This is a technical article, so reading the code from the repo will help you follow along; I recommend it.
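If you would rather not clone the repo, you can recreate the example tree with a throwaway script. Here is a minimal sketch (the file contents are arbitrary; only the paths matter for the test below):
const fs = require("fs").promises;

const setup = async () => {
  // Build the directory tree the test below expects.
  await fs.mkdir("./example/subfolder", { recursive: true });
  await fs.writeFile("./example/one.md", "# one\n");
  await fs.writeFile("./example/two.md", "# two\n");
  await fs.writeFile("./example/subfolder/super-secret.md", "# shh\n");
};

setup();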
First, write a quick-and-dirty test function before we change anything:
const assert = require("assert");
// Assuming utilEquals is Node's util.isDeepStrictEqual.
const { isDeepStrictEqual: utilEquals } = require("util");
const { resolve: resolvePath } = require("path");
const { log } = console;

const test = async () => {
  let files = [];
  const directory = "./example";
  const expected = [
    "example/subfolder/super-secret.md",
    "example/one.md",
    "example/two.md"
  ].map(file => resolvePath(file));
  log(`Collecting files from: ${directory}`);
  try {
    files = await collectFiles(directory);
    log(`Result:`, files);
    assert(utilEquals(expected, files));
  } catch (err) {
    log(err);
  }
};
test();
Running this code will produce the following output:
$ node ./scratch.js
Collecting files from: ./example
Result: [
'/home/beast/Projects/collect-files/example/subfolder/super-secret.md',
'/home/beast/Projects/collect-files/example/one.md',
'/home/beast/Projects/collect-files/example/two.md'
]
We can move each step within the promise chain into its own named function.
const resolveFiles = (dir) => (list) => (
  list.map(filename => resolvePath(dir, filename))
);

const recurseSubDirectory = (collected) => (list) => (
  Promise.all(
    list.map(async (item) => {
      if (item.stat.isDirectory()) {
        list.push(
          (await collectFiles(item.file, collected))
        );
      }
      return item;
    })
  )
);

const mapFileStats = (list) => (
  Promise.all(
    list.map(async (file) => {
      const stat = await fs.stat(file);
      return {
        file,
        stat,
      };
    })
  )
);

const onlyFiles = (list) => (
  list.filter(({ stat }) => stat.isFile())
);

const onlyFileType = (list) => (
  list.filter(({ file }) => file.endsWith(".md"))
);

const mapFile = (list) => list.map(({ file }) => file);

const collectFiles = async (dir, collected = []) => {
  const files = await fs.readdir(dir)
    .then(resolveFiles(dir))
    .then(mapFileStats)
    .then(recurseSubDirectory(collected))
    .then(onlyFiles)
    .then(onlyFileType)
    .then(mapFile);
  collected.push(...files);
  return collected;
}
Much better: the collectFiles function now has clear context, and each step can be tested individually.
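For example, here is a minimal sketch of a unit test for the onlyFileType step; it needs no file system at all (assert is Node's built-in module):
const assert = require("assert");

const input = [
  { file: "/tmp/one.md" },
  { file: "/tmp/two.txt" },
];

// Only the ".md" entry should survive the filter.
assert.deepStrictEqual(
  onlyFileType(input),
  [{ file: "/tmp/one.md" }]
);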
But this code still bothers me; it combines async/await with the promise chain. What else can we do?
Now remove the promise chain:
const collectFiles = async (dir, collected = []) => {
  let files = await fs.readdir(dir);
  files = resolveFiles(dir)(files);
  files = await mapFileStats(files);
  files = await recurseSubDirectory(collected)(files);
  files = onlyFiles(files);
  files = onlyFileType(files);
  files = mapFile(files);
  collected.push(...files);
  return collected;
}
Okay, that is not too bad, but it looks like a reduction.
Do we really have to reassign the files variable at each step?
Experience has taught me that when you find yourself repeatedly reassigning the same variable, you can often write a reducer to solve the same problem.
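The idea is easiest to see with plain synchronous functions; a minimal sketch (the function names here are made up for illustration):
const addOne = (n) => n + 1;
const double = (n) => n * 2;

// Equivalent to double(addOne(1)).
const result = [addOne, double].reduce(
  (value, fn) => fn(value),
  1
);

console.log(result); // 4
The same shape works for our pipeline; the only wrinkle is that some of our actions are async.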
Adding the reducer:
const collectFiles = async (dir, collected = []) => {
  const inputDirectory = await fs.readdir(
    resolvePath(dir)
  );
  const actions = [
    resolveFiles(dir),
    mapFileStats,
    recurseSubDirectory(collected),
    onlyFiles,
    onlyFileType,
    mapFile,
  ];
  const files = await actions.reduce(
    async (list, action) => (
      action(await list)
    ),
    inputDirectory
  );
  collected.push(...files);
  return collected;
}
Now let's walk through these changes to understand them.
We get a list of files from the input directory using fs.readdir; it is async, so we await the result.
Once we have the list of files, we need a series of actions to produce the desired output, so we define an array of the actions to be taken:
const actions = [
  resolveFiles(dir),
  mapFileStats,
  recurseSubDirectory(collected),
  onlyFiles,
  onlyFileType,
  mapFile,
];
As you can see, these are just references to the same functions from the previous phase.
Now we can run these actions through a reducer to obtain the result:
const files = await actions.reduce(
  async (list, action) => (
    action(await list)
  ),
  inputDirectory
);
The result is a Promise, so we await it right at the beginning:
const files = await actions.reduce ...
Next we define the reducer function:
async (list, action) => (
  action(await list)
),
In some cases the incoming list parameter will be a Promise (the return value of the previous async action), so we flag the function as async and await the list value.
action(await list)
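This works even on the first pass, when list is still the plain inputDirectory array rather than a Promise, because await simply passes non-promise values through unchanged. A quick sketch:
const demo = async () => {
  const plain = await [1, 2, 3]; // awaiting a plain value is a no-op
  const wrapped = await Promise.resolve([1, 2, 3]); // awaiting a Promise unwraps it
  console.log(plain, wrapped); // both log [ 1, 2, 3 ]
};

demo();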
In JavaScript, Promise is a lower-level abstraction than async/await; try to avoid promise chains.
If you cannot work your way out of promises, at least break the callbacks out into named functions to give your code context and meaning.
Now let's clean up some of the actions; we can do better!
First things first: rename dir to directory. Abbreviations are awful and lose context; for example, is dir a directory or a direction?
I also hate the dirent abbreviation, but it is a built-in Node.js convention, so to make it easier to cross-reference with their documentation I will leave it as is.
const filterFiles = (extension) => (list) => (
  list.filter(dirent =>
    (
      !extension &&
      dirent.isFile()
    ) ||
    (
      dirent.isFile() &&
      extension &&
      dirent.name.endsWith(extension)
    )
  )
);

const recurseSubDirectory = (directory, collected) => (list) => (
  Promise.all(
    list.map(async (dirent) => {
      if (dirent.isDirectory()) {
        list.push(
          (await collectFiles(
            resolvePath(directory, dirent.name),
            collected
          ))
        );
      }
      return dirent;
    })
  )
);

const mapFile = (directory) => (list) => (
  list.map(({ name }) => resolvePath(directory, name))
);

const collectFiles = async (directory, collected = []) => {
  const inputDirectory = await fs.readdir(
    resolvePath(directory),
    {
      withFileTypes: true
    }
  );
  const actions = [
    recurseSubDirectory(directory, collected),
    filterFiles(".md"),
    mapFile(directory),
  ];
  const files = await actions.reduce(
    async (list, action) => (
      action(await list)
    ),
    inputDirectory
  );
  collected.push(...files);
  return collected;
}
We can remove the need to call fs.stat by calling fs.readdir with the withFileTypes: true option.
This returns fs.Dirent objects instead of plain file name strings, and a Dirent already has the isDirectory() and isFile() functions.
const inputDirectory = await fs.readdir(
  resolvePath(directory),
  {
    withFileTypes: true
  }
);
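To see what that call returns, here is a small standalone sketch (the directory name is assumed from the earlier example):
const fs = require("fs").promises;

const show = async () => {
  const entries = await fs.readdir("./example", { withFileTypes: true });
  for (const dirent of entries) {
    // Each entry is an fs.Dirent with a name plus type predicates.
    console.log(dirent.name, dirent.isFile(), dirent.isDirectory());
  }
};

show();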
It would also be better to recurse into subdirectories as early as possible, so update the actions array.
const actions = [
  recurseSubDirectory(directory, collected), // first action called in the list
  filterFiles(".md"), // combined filter function to reduce looping over the array
  mapFile(directory),
];
Finally, we need to fix a bug introduced by our changes: recurseSubDirectory now needs to resolve the input directory path.
(await collectFiles(
  resolvePath(directory, dirent.name),
  collected
))
Lastly, the two file-filtering functions can be combined into one:
const filterFiles = (extension) => (list) => (
  list.filter(dirent =>
    (
      !extension &&
      dirent.isFile()
    ) ||
    (
      dirent.isFile() &&
      extension &&
      dirent.name.endsWith(extension)
    )
  )
);
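As a quick sanity check of both branches, here is a sketch using hypothetical Dirent-like stubs:
const fileStub = (name) => ({
  name,
  isFile: () => true,
  isDirectory: () => false,
});

const entries = [fileStub("notes.md"), fileStub("notes.txt")];

// With an extension, only matching files survive.
console.log(filterFiles(".md")(entries).map(({ name }) => name)); // [ 'notes.md' ]

// Without an extension, every file survives.
console.log(filterFiles()(entries).map(({ name }) => name)); // [ 'notes.md', 'notes.txt' ]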
If you have made it this far, thank you for following along.
Refactoring code is difficult and takes many iterations; it is a good habit to do it every day in small chunks so that you minimize risk and keep iterating.
It is only through reading and rewriting code that we learn and find better ways to achieve the same results.
Always write a test case before making any changes; this is the law of refactoring without risk.
Remember that when you change a function's arguments you introduce a breaking change; search your code base to assess the impact this might have.
The final version can be found here.