Qafoo GmbH - passion for software quality

Help you and your team benefit from new perspectives on cutting-edge quality engineering techniques and tools through the Qafoo team weblog.

By Kore Nordmann, first published at Thu, 12 Dec 2013 11:23:29 +0100

Download our free e-book "Crafting Quality Software" with a selection of the finest blog posts as PDF or EPub.

Crafting Quality Software

You can also buy a printed version of the book on Amazon or on epubli.

Grasp - Structural JavaScript Search and Replace

About a week ago I came across a new JavaScript-related utility called grasp. Grasp is a command-line utility to search and replace content in JavaScript files. It has a certain similarity to tools like grep or sed.

Of course there are quite a lot of tools like this out there. Why is another one needed? What is the difference between this utility and its counterparts?

The answer lies in its specialization on JavaScript. Even though the tool's author, George Zahariev, plans to support languages other than JavaScript in the future, there is a reason for this limitation: grasp does not treat the source code as a plain text file, but understands its structure, its building blocks. Therefore, it can be much more powerful than a plain text-based solution in a lot of situations.

Installing grasp

grasp is written in JavaScript and uses node as its execution environment. Like every good nodejs citizen, it is distributed via the node package manager (npm). Its installation is therefore completely painless once you have a running nodejs on your system:

$ npm install -g grasp

Once npm has finished its work, you should be able to simply execute the grasp command, and the tool will greet you with its usage information and help output.

The power of grasp

Let's take a first look at what grasp can do for us. As it operates on files, specifically JavaScript files, an example is needed. Let's use this one for now:

var isBlogPost = true;

if (author == "Jakob" && isBlogPost) {
    qafooBlog.publish(post);
}

if (author == "Toby"
    && isBlogPost) {
    qafooBlog.publish(post);
}

Before we take a look at a more real-world example, let's start with this quite easy one, as it is capable of demonstrating the basic features of grasp very nicely.

Why structure matters

The given example shows quite clearly why structural analysis of source code may matter for such a tool. Even though the two if statements are mostly the same, they are written differently: one condition sits on a single line, while the other is split across two. A JavaScript engine does not care about this newline, as it knows about the syntactical structure of a JavaScript file. A simple text-based search algorithm, however, will be fooled by the newline and might miss what we originally searched for.

Assuming our aim is to find all the occurrences of the isBlogPost identifier, our goal can be reached using grep or the search function of any IDE quite easily:

$ grep -n 'isBlogPost' blog_if_example.js
1:var isBlogPost = true;
3:if (author == "Jakob" && isBlogPost) {
8:    && isBlogPost) {

That worked like a charm. So let's get a little bit more restrictive. Say we want to find all if statements which use the isBlogPost identifier inside their test condition. This can be attempted with a regular expression that searches for a line where an if is followed by a pair of parentheses which contain isBlogPost somewhere between them:

$ grep -n 'if (.*isBlogPost.*)' blog_if_example.js
3:if (author == "Jakob" && isBlogPost) {

Oh wait. We missed an occurrence! Unfortunately, our pattern only matches within a single line, while the second if statement is split across two lines. It is of course possible to modify the regular expression to accommodate this, but that is neither useful nor pragmatic, as there are a lot of other cases which can't be covered using the regexp approach.
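The effect can be reproduced in a few lines of JavaScript. This is only a sketch: it embeds the example file as a string (the indentation is assumed) and applies the same pattern line by line, which is how grep works by default. The variable names are made up for this illustration.

```javascript
// The example file, embedded as a string so the sketch is self-contained.
const source = [
    'var isBlogPost = true;',
    '',
    'if (author == "Jakob" && isBlogPost) {',
    '    qafooBlog.publish(post);',
    '}',
    '',
    'if (author == "Toby"',
    '    && isBlogPost) {',
    '    qafooBlog.publish(post);',
    '}',
].join("\n");

// Same pattern as in the grep call; JavaScript needs the parentheses escaped.
const pattern = /if \(.*isBlogPost.*\)/;

// Test every line on its own and keep the 1-based numbers of matching lines.
const matchingLines = source
    .split("\n")
    .map((text, index) => ({ number: index + 1, text }))
    .filter((line) => pattern.test(line.text))
    .map((line) => line.number);

// Crude count of if statements, for comparison only.
const ifCount = source.match(/\bif \(/g).length;

console.log(matchingLines); // [ 3 ]
console.log(ifCount);       // 2
```

Two if statements exist, but the line-based pattern reports only the one on line 3.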

grasp the code

As grasp has a syntactical understanding of the source code, like a JavaScript engine, it does not care about the multiline if statement. It just sees an if with a test condition, which consists of a LogicalExpression (the &&), which in turn consists of a BinaryExpression (the author == ...) and an Identifier (the isBlogPost).

What grasp sees, and therefore works on, is the abstract syntax tree (AST) of the given source code. It is a tree-based structure which contains all the needed information, but without all the different ways of writing it down:

if
└── test
    └── LogicalExpression (&&)
        ├── BinaryExpression (==)
        │   ├── Identifier (author)
        │   └── Literal ("Jakob")
        └── Identifier (isBlogPost)
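To make the tree a bit more tangible, here is the same structure written down as a plain JavaScript object. This is a hand-written sketch, not grasp output: the property names follow the usual type/test/left/right layout of JavaScript ASTs, the body of the if is left out, and the helper identifierNames is made up for this illustration.

```javascript
// Hand-written, simplified AST of the first if statement (body omitted).
const ifStatement = {
    type: "IfStatement",
    test: {
        type: "LogicalExpression",
        operator: "&&",
        left: {
            type: "BinaryExpression",
            operator: "==",
            left: { type: "Identifier", name: "author" },
            right: { type: "Literal", value: "Jakob" },
        },
        right: { type: "Identifier", name: "isBlogPost" },
    },
};

// Collect the names of all Identifier nodes at or below the given node.
function identifierNames(node) {
    if (node === null || typeof node !== "object") return [];
    const own = node.type === "Identifier" ? [node.name] : [];
    const nested = Object.values(node).flatMap(identifierNames);
    return own.concat(nested);
}

const names = identifierNames(ifStatement.test);
console.log(names); // [ 'author', 'isBlogPost' ]
```

Note that nothing in this structure records whether the condition was written on one line or two.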

Okay, we now have an idea of how grasp views our source code. Let's use this information to complete the task from before: selecting all if statements that contain the isBlogPost identifier in their test condition:

$ grasp 'if.test! #isBlogPost' blog_if_example.js
3:if (author == "Jakob" && isBlogPost) {
7-8:(multiline):
if (author == "Toby"
    && isBlogPost) {

Hey, grasp was able to correctly provide us with what we searched for. But what exactly did we give grasp as input to achieve our goal?

CSS for ASTs

grasp provides us with a CSS-like selector language to specify what we want to search for inside the AST of the given source file. Once you have developed a feeling for which nodes are placed where in the AST of your input file, it is actually quite easy to write and grasp ;). A full list of all nodes, their child nodes and possible attributes is available on grasp's JavaScript-Syntax page.

Let's take a closer look at the selector used to extract the usages of isBlogPost in if statements, if.test! #isBlogPost, and especially at how I came up with it:

  • I wanted to select something from within an if. Therefore, I looked up the IfStatement inside the linked JavaScript-Syntax-Page. I found out that its shortname is if. So I used that. Current selector: if

  • Further reading of the IfStatement-paragraph revealed that besides others it has an attribute called test, which houses the condition of the statement. grasp allows the access of attributes using a simple dot notation. Current selector: if.test.

  • Everything which classifies as an Identifier can be represented using a hash followed by the identifier's name, much like in CSS. As I am searching for the isBlogPost identifier somewhere below the test attribute, I simply separate the two parts with a whitespace, exactly as in a normal CSS selector. Current selector: if.test #isBlogPost.

  • Like in CSS, the element matched is the last one inside the given chain of selectors, in this case the isBlogPost identifier. We however want to extract the whole condition which contains this identifier. Luckily for us, grasp allows us to mark the part of the given expression we really want to extract using an exclamation mark. As we want to extract the test condition of every match, we put the exclamation mark right behind it. Current selector: if.test! #isBlogPost.

As you can see, constructing grasp-selectors progressively is not that hard. Just think about what you want to select and build it up, one part at a time. The full spectrum of the CSS-like selector syntax is available from the grasp squery documentation.
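The difference the exclamation mark makes can be imitated with a small tree walk. The following is only a rough sketch of the idea and not how grasp is implemented: the AST is hand-written, the identifier isDraft and all helper names are invented for this illustration, and the descendant matching is an approximation of the selector semantics.

```javascript
// Two hand-written if statements; only the first one mentions isBlogPost.
const id = (name) => ({ type: "Identifier", name });
const program = {
    type: "Program",
    body: [
        {
            type: "IfStatement",
            test: {
                type: "LogicalExpression",
                operator: "&&",
                left: id("isDraft"),
                right: id("isBlogPost"),
            },
        },
        { type: "IfStatement", test: id("isDraft") },
    ],
};

// All nodes strictly below the given node, in depth-first order.
function descendants(node) {
    return Object.values(node)
        .filter((value) => value !== null && typeof value === "object")
        .flatMap((value) =>
            typeof value.type === "string"
                ? [value, ...descendants(value)]
                : descendants(value)
        );
}

const isTarget = (node) => node.type === "Identifier" && node.name === "isBlogPost";
const ifs = descendants(program).filter((node) => node.type === "IfStatement");

// Roughly "if.test #isBlogPost": the result is the identifier itself.
const identifiers = ifs.flatMap((node) => descendants(node.test).filter(isTarget));

// Roughly "if.test! #isBlogPost": the result is the test containing the identifier.
const tests = ifs
    .map((node) => node.test)
    .filter((test) => descendants(test).some(isTarget));

console.log(identifiers.map((node) => node.type)); // [ 'Identifier' ]
console.log(tests.map((node) => node.type));       // [ 'LogicalExpression' ]
```

Both variants find the same place in the code; they only differ in which node of the chain is handed back.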

Taking a look at the AST behind the curtain

Once you start constructing more complex expressions, you might wonder what the AST you just matched looks like. This information can come in handy when deciding how to narrow down your selection to what you are really searching for. grasp provides a JSON output mode, which is triggered using the -j argument. It outputs a JSON representation of the matched AST nodes instead of the usual match. Using a JSON prettifier to look at the generated structures is highly recommended ;):

$ grasp -j 'if.test #isBlogPost' blog_if_example.js | prettyson
[{
    type: 'Identifier',
    start: 49,
    end: 59,
    loc: {
        start: { line: 3, column: 25 },
        end: { line: 3, column: 35 }
    },
    name: 'isBlogPost'
}, {
    type: 'Identifier',
    start: 126,
    end: 136,
    loc: {
        start: { line: 8, column: 7 },
        end: { line: 8, column: 17 }
    },
    name: 'isBlogPost'
}]
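Such output is easy to process further. The sketch below assumes the two matches shown above have already been parsed into a JavaScript array (for example with JSON.parse) and condenses each one into a single line. Judging from the example, start and end are character offsets into the file, while loc carries line and column positions. The summary format is made up for this illustration.

```javascript
// The two matches from the output above, as already parsed data.
const matches = [
    {
        type: "Identifier",
        start: 49,
        end: 59,
        loc: { start: { line: 3, column: 25 }, end: { line: 3, column: 35 } },
        name: "isBlogPost",
    },
    {
        type: "Identifier",
        start: 126,
        end: 136,
        loc: { start: { line: 8, column: 7 }, end: { line: 8, column: 17 } },
        name: "isBlogPost",
    },
];

// One line per match: position, node type, name and length in characters.
const summary = matches.map(
    (match) =>
        `${match.loc.start.line}:${match.loc.start.column} ` +
        `${match.type} ${match.name} (${match.end - match.start} chars)`
);

console.log(summary.join("\n"));
// 3:25 Identifier isBlogPost (10 chars)
// 8:7 Identifier isBlogPost (10 chars)
```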

A little bit more real world

I promised before to show a little bit more of a real world example later on. Here we are now, knowing about the ideas of grasp as well as the basics of its selector syntax. It is time. Assume we have the following file as input:

require.config({
    baseUrl: "/",
    paths: {
        // Some aliases
        "translation": "my/deeply/nested/translation/module",
        "window": "my/global/window/wrapper",

        // External libs
        jquery: "vendor/some/component/manager/jquery/jquery",
        ui: "vendor/some/component/manager/jquery-ui/jquery-ui",
        underscore: "vendor/some/component/manager/underscore/underscore",

        // AMD Loaders
        text: "base/requirejs/loaders/text"
    },
    shim: {
        jquery: {
            exports: "jQuery"
        },
        ui: {
            deps: ["jquery"]
        },
        underscore: {
            exports: "_"
        }
    }
});

A quite common require.js configuration, defining some paths, some shims and a baseUrl. While working with something similar to this I realized I had a problem. Even though during development everything worked like a charm, a certain version of r.js - the require.js optimizer - didn't want to read the configuration to create a build. I found out that it was quite particular about the way keys were defined in the configuration objects. All of the keys needed to be strings, encapsulated in quotation marks. As the file was quite large, much larger than this example, I decided to use grasp to help me out:

$ grasp 'prop[key!=String] > @key' require.config.js --replace '""{{}}""'

While this looks a little bit scary at first, it did exactly what I wanted: replacing every non-string key in the configuration object with its counterpart wrapped in double quotes. Let's take the selector apart to see what is really going on here:

  • First, every object property (short: prop) is selected.

  • A look into the syntax documentation or the grasp --help prop output tells us that properties have a key and a value attribute. key seems to be exactly what we need. Using the square-bracket notation, a condition is added which restricts the selection to those prop elements whose key attribute is not a String.

  • If we only wanted to extract all properties having a non-string key, we would already be done. We however intend to replace all of those keys. Therefore we need to select the key attribute itself and not the whole prop fulfilling the given condition. As in CSS, the greater-than operator > tells the engine to select a direct child, while @key tells grasp that the requested child is the key attribute of the selected element.

  • The selector is ready. What is still missing is the replacement for the matched structures, which is specified using the --replace option. In this example it is a double pair of curly braces wrapped in double quotes. The {{}} represents the contents of the structure to be replaced, in our case the key. Due to an open bug in grasp we need to provide two double quotes instead of one on each side of the key, as one pair is eaten by grasp.
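What this search and replace does can be sketched by hand in plain JavaScript. The key nodes below are hand-built stand-ins for what a parser would report, and keyNode, isStringKey and quoteKeys are names invented for this illustration. It mirrors the intended replacement "{{}}" and does not reproduce the quoting workaround or grasp's actual implementation.

```javascript
// A small object literal with one quoted and two unquoted keys.
const source = '({ "window": "w", jquery: "j", ui: "u" })';

// Build a stand-in key node. The offsets are looked up with indexOf to
// keep the sketch short; a real parser reports them directly.
function keyNode(raw, node) {
    const start = source.indexOf(raw);
    return { ...node, start, end: start + raw.length };
}

const keys = [
    keyNode('"window"', { type: "Literal", value: "window" }),
    keyNode("jquery", { type: "Identifier", name: "jquery" }),
    keyNode("ui", { type: "Identifier", name: "ui" }),
];

// Counterpart of the [key!=String] condition: skip keys that are string literals.
const isStringKey = (key) => key.type === "Literal" && typeof key.value === "string";

// Wrap every non-string key in double quotes. Working from the end of the
// text towards the start keeps the remaining offsets valid.
function quoteKeys(text, keyNodes) {
    return keyNodes
        .filter((key) => !isStringKey(key))
        .sort((a, b) => b.start - a.start)
        .reduce(
            (out, key) =>
                out.slice(0, key.start) +
                '"' + out.slice(key.start, key.end) + '"' +
                out.slice(key.end),
            text
        );
}

const quotedSource = quoteKeys(source, keys);
console.log(quotedSource); // ({ "window": "w", "jquery": "j", "ui": "u" })
```

The already quoted key is left alone, just as the [key!=String] condition excludes it from the selection.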

Wrapping it up

As this post demonstrates, there is quite some potential in a search-and-replace system which incorporates knowledge about the programming language it operates on. Even though it is still quite young, grasp already has a good chance of becoming a new utility in my JavaScript tool belt. It is definitely worth looking into. I hope you have as much fun playing with it as I had.
