Automating Git with Node.js: Let’s Learn About Spawning Commands

In my previous article, I introduced commit-ai, a tool that automates the creation of commit messages. But how does it actually work? A key part of the “magic” is its ability to interact with your git repository to collect information about changed files. In this article, we’ll explore how you can do the same in your own Node.js projects.

Why Interact with Git in Node.js?

There are many reasons why you might want to interact with Git from within a Node.js application. You might be building a developer tool, a custom deployment script, or a CI/CD pipeline. In the case of commit-ai, we want to get the diff of the staged changes to generate a relevant commit message.

While there are some great libraries out there that provide a more abstract way of interacting with Git, sometimes the most straightforward approach is to simply run the git command-line tool and process its output. This is the approach I took in commit-ai.

Spawning Git Commands with `child_process`

Node.js provides a built-in child_process module that allows you to run external commands. The spawnSync function is a great choice for this, as it allows us to run a command and wait for it to complete before continuing.

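Here’s a basic example that runs git diff --name-only to get a list of changed files. Without further options, git diff compares the working tree against the index, so this lists files with unstaged changes; add --cached to list staged files instead: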

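import {spawnSync} from 'child_process';

const {stdout, stderr, status, error} = spawnSync('git', ['diff', '--name-only']);

// error is set when git could not be started at all (for example, it is not
// installed); stdout and stderr are null in that case
if (error) {
    console.error(`could not run git: ${error.message}`);
    process.exit(1);
}

// a non-zero exit code means git itself reported a failure
if (status !== 0) {
    console.error(`git command failed with exit code ${status}: ${stderr.toString()}`);
    process.exit(1);
}

// one path per line; filter(Boolean) drops the empty string after the final newline
const changedFiles = stdout.toString().split('\n').filter(Boolean);

console.log(changedFiles);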

In this example, we’re calling spawnSync with the git command and an array of arguments. We then check the exit code of the command to see if it was successful. If it was, we can process the output from stdout.

Async Counterparts in `child_process`

By the way: the synchronous functions in Node.js’s child_process module, such as spawnSync, have asynchronous counterparts (e.g., spawn). While spawnSync is fine for simple scripts where blocking until the command finishes is acceptable, the asynchronous versions are better suited to applications that must not block the event loop while a process runs, such as a web server, or to scripts that run several commands in parallel.
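As a minimal sketch of the asynchronous variant, the following wraps spawn in a Promise and runs two git commands concurrently. The helper names run and listUnstagedAndStaged are hypothetical and not part of commit-ai; the --cached flag restricts the diff to staged changes.

```typescript
import {spawn} from 'node:child_process';

// Hypothetical helper (not part of commit-ai): runs a command without
// blocking the event loop and resolves with its collected stdout.
const run = (command: string, args: string[], cwd?: string): Promise<string> =>
    new Promise<string>((resolve, reject) => {
        const child = spawn(command, args, {cwd});
        let stdout = '';
        let stderr = '';

        child.stdout.setEncoding('utf8');
        child.stderr.setEncoding('utf8');
        child.stdout.on('data', (chunk: string) => { stdout += chunk; });
        child.stderr.on('data', (chunk: string) => { stderr += chunk; });

        // 'error' fires when the process could not be started (e.g. command not found).
        child.on('error', reject);

        // 'close' fires once the process has ended and its streams are closed.
        child.on('close', (code) => {
            if (code === 0) {
                resolve(stdout);
            } else {
                reject(new Error(`${command} exited with code ${code}: ${stderr}`));
            }
        });
    });

// Runs both git commands at the same time and waits for the two results.
const listUnstagedAndStaged = async (cwd?: string) => {
    const [unstaged, staged] = await Promise.all([
        run('git', ['diff', '--name-only'], cwd),
        run('git', ['diff', '--name-only', '--cached'], cwd),
    ]);
    return {
        unstaged: unstaged.split('\n').filter(Boolean),
        staged: staged.split('\n').filter(Boolean),
    };
};
```

Calling await listUnstagedAndStaged() inside a repository would return both lists; if either command fails, the returned Promise rejects with the error from run.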

Processing: From Raw Output to Useful Data

As you can see, the output from the git command is just raw text: a Buffer by default, which becomes a string once you call toString(). To make it useful, we need to parse it. In the case of git diff --name-only, the output is a list of file paths, separated by newlines. We can easily turn this into an array of strings with stdout.toString().split('\n').filter(Boolean).

This is exactly what I do in commit-ai to get a list of changed files. I then use this list to read the content of each file and feed that into the prompt.

Here is the implementation from the commit-ai repository:

import {spawnSync} from "child_process";
import {GitError} from "./git-error.js";

export const getChangedFiles = (cwd?: string): string[] => {
    const options = cwd ? {cwd} : {};

    const {stdout, stderr} = spawnSync("git", ["diff", "--name-only"], {
        encoding: "utf8",
        ...options,
    });

    if (!stdout) {
        throw new GitError(stderr.toString()); // feed stderr into my Error object 
    }

    return stdout
        .toString()
        .split("\n")
        .filter((l) => !l.startsWith("warning"))
        .filter((s) => s.trim().length > 0);
};
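Splitting on newlines works for ordinary paths, but by default git quotes paths that contain unusual characters (this is controlled by the core.quotePath setting). As a sketch of one alternative, which is not what commit-ai does, the -z option of git diff makes git output each path verbatim and terminate it with a NUL byte. The names parseNulSeparated and getChangedFilesExact are hypothetical.

```typescript
import {spawnSync} from 'node:child_process';

// Hypothetical helper (not from commit-ai): turns the NUL-separated output of
// `git diff --name-only -z` into an array of paths.
const parseNulSeparated = (output: string): string[] =>
    output.split('\0').filter((path) => path.length > 0);

// Sketch of a getChangedFiles variant that asks git for NUL-terminated paths.
const getChangedFilesExact = (cwd?: string): string[] => {
    const {stdout, stderr, status, error} = spawnSync('git', ['diff', '--name-only', '-z'], {
        encoding: 'utf8',
        cwd,
    });

    // error is set when git could not be started at all.
    if (error) {
        throw error;
    }
    // A non-zero exit code means git reported a failure on stderr.
    if (status !== 0) {
        throw new Error(`git diff failed with exit code ${status}: ${stderr}`);
    }

    return parseNulSeparated(stdout);
};
```

Because the parsing lives in its own function, it can be checked without a repository by passing it a hand-written string.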

Error Handling

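When you’re running external commands, it’s important to handle errors gracefully. spawnSync will not throw an error if the command fails, so you need to check the status property of the returned object. If it’s not 0, you can get more information about the error from stderr. If the command could not be started at all (for example, because git is not installed), status is null and the error property of the returned object describes what went wrong. Keep in mind that an empty stdout alone is not a sign of failure: git diff --name-only prints nothing when there are no changes.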

In commit-ai, I actually created a custom GitError class to make it easier to handle git-related errors.
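The checks described in this section could be sketched as follows. The real GitError class from commit-ai is not shown in this article, so the class below is a hypothetical stand-in, and ensureSuccess and runGit are illustrative names.

```typescript
import {spawnSync} from 'node:child_process';

// Hypothetical stand-in for commit-ai's GitError; the real class may differ.
class GitError extends Error {
    exitCode: number | null;

    constructor(message: string, exitCode: number | null = null) {
        super(message);
        this.name = 'GitError';
        this.exitCode = exitCode;
    }
}

// The parts of the spawnSync result that the check below relies on.
type GitResult = {
    error?: Error | undefined;
    status: number | null;
    stdout: string;
    stderr: string;
};

// Returns stdout on success and throws a GitError otherwise.
const ensureSuccess = (result: GitResult): string => {
    // Set when the process could not be started, e.g. git is not installed.
    if (result.error) {
        throw new GitError(`could not run git: ${result.error.message}`);
    }
    // Anything other than 0 (including null, if git was killed by a signal)
    // counts as a failure.
    if (result.status !== 0) {
        const message = result.stderr.trim() || `git exited with code ${result.status}`;
        throw new GitError(message, result.status);
    }
    return result.stdout;
};

// Illustrative wrapper: runs git synchronously and validates the result.
const runGit = (args: string[], cwd?: string): string =>
    ensureSuccess(spawnSync('git', args, {encoding: 'utf8', cwd}));
```

With this in place, runGit(['diff', '--name-only']) either returns git’s output or throws a GitError that carries the exit code, so callers can tell "no changes" (an empty string) apart from a failure.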

Conclusion

Interacting with Git from Node.js by spawning commands is a great approach for my use case. You should still evaluate whether a Git library would serve you better than running commands directly. In my case, running a few commands and parsing their output through a simple pipeline gave me all the results I was hoping for.

If you’re interested in any of the code examples shown above or in my project “commit-ai”, check it out on GitHub!