Unveiling the Mysterious JavaScript AST: How to Use the Four-Piece Babel AST Toolkit


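Author: Zhou Mingliang, JD Retail

Preface

In the previous article we introduced some basic concepts and applications:

Parsers
Abstract syntax tree (AST)
What AST is used for in JS
AST in practice

With that first impression and some routine code-transformation practice behind us, let's now look in detail at how to transform code with an AST.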

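How to Use the Four-Piece Babel AST Toolkit

Many tools can parse code into an AST, as mentioned earlier. For JS, the community has settled on a shared naming convention for AST nodes; the main difference between tools is the level of detail, and some provide extra methods or properties of their own.

So pick whichever tool you prefer. Here we go with our old friend babel.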

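Getting to Know Babel

In an era when front-end frameworks keep appearing, most of you probably know that babel exists. If you have not heard of it yet, its documentation is a good place to dig deeper.

After all, you can find traces of it in almost every framework.

Babel JS official website
Babel JS GitHub

As one of the most widely used JS compilers, it converts code written in ECMAScript 2015+ syntax into backwards-compatible JavaScript so that it can run in current and older browsers or other environments.

It can do this downward-compatible conversion because it parses code and then transforms it. Next, let's see how to use the four core packages that @babel/core builds on: @babel/parser, @babel/traverse, @babel/types and @babel/generator.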

1. @babel/parser

@babel/parser is the core code parser. It performs lexical analysis and syntax analysis and turns the source into the AST form described earlier.

Suppose we need to read the code in the index.tsx file of a React project. We can use the following code:

    const fs = require("fs");
    const { parse } = require("@babel/parser");

    // Read the file; passing "utf8" returns a string rather than a Buffer
    const fileCode = fs.readFileSync("./code/app/index.tsx", "utf8");

    // Parse the source into an AST object
    const codeAST = parse(fileCode, {
      // parse in strict mode and allow module declarations
      sourceType: "module",
      plugins: [
        // enable jsx and typescript syntax
        "jsx",
        "typescript",
      ],
    });

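Of course, we are not limited to React code; Vue syntax can be read too, although it has its own parser, for example @vue/compiler-dom.

In addition, by passing different options we can parse all kinds of code. If we are only reading a plain .js file, no plugins are needed: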

    const codeAST = parse(fileCode, {
      // parse in strict mode and allow module declarations
      sourceType: "module",
    });

With the code above we get a standard AST object. It was analyzed in detail in the previous article, so we will not go through it again here. For example:

    // Source code
    const me = "我"
    function write() {
      console.log("文章")
    }

    // Resulting AST object (position fields such as start/end/loc omitted)
    const codeAST = {
      "type": "File",
      "errors": [],
      "program": {
        "type": "Program",
        "sourceType": "module",
        "interpreter": null,
        "body": [
          {
            "type": "VariableDeclaration",
            "declarations": [
              {
                "type": "VariableDeclarator",
                "id": { "type": "Identifier", "name": "me" },
                "init": {
                  "type": "StringLiteral",
                  "extra": { "rawValue": "我", "raw": "\"我\"" },
                  "value": "我"
                }
              }
            ],
            "kind": "const"
          },
          {
            "type": "FunctionDeclaration",
            "id": { "type": "Identifier", "name": "write" },
            "generator": false,
            "async": false,
            "params": [],
            "body": {
              "type": "BlockStatement",
              "body": [
                {
                  "type": "ExpressionStatement",
                  "expression": {
                    "type": "CallExpression",
                    "callee": {
                      "type": "MemberExpression",
                      "object": { "type": "Identifier", "name": "console" },
                      "computed": false,
                      "property": { "type": "Identifier", "name": "log" }
                    },
                    "arguments": [
                      {
                        "type": "StringLiteral",
                        "extra": { "rawValue": "文章", "raw": "\"文章\"" },
                        "value": "文章"
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }

2. @babel/traverse

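Once we have a standard AST object and want to work on it, we need to traverse the tree. This is where @babel/traverse comes in.

For example, after obtaining the AST we can traverse it like this:

    const { default: traverse } = require("@babel/traverse");

    // Called when a node is entered
    const onEnter = (pt) => {
      // work to do on entering the current node
      console.log(pt);
    };
    // Called when a node is exited
    const onExit = (pe) => {
      // work to do on leaving the current node
    };
    traverse(codeAST, { enter: onEnter, exit: onExit });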

So what does pt look like for the first node we visit?

    // Some irrelevant values omitted
    <ref *1> NodePath {
      contexts: [
        TraversalContext {
          queue: [Array],
          priorityQueue: [],
          ...
        }
      ],
      state: undefined,
      opts: {
        enter: [ [Function: onEnter] ],
        exit: [ [Function: onExit] ],
        _exploded: true,
        _verified: true
      },
      _traverseFlags: 0,
      skipKeys: null,
      parentPath: null,
      container: Node {
        type: 'File',
        errors: [],
        program: Node {
          type: 'Program',
          sourceType: 'module',
          interpreter: null,
          body: [Array],
          directives: []
        },
        comments: []
      },
      listKey: undefined,
      key: 'program',
      node: Node {
        type: 'Program',
        sourceType: 'module',
        interpreter: null,
        body: [ [Node], [Node] ],
        directives: []
      },
      type: 'Program',
      parent: Node {
        type: 'File',
        errors: [],
        program: Node {
          type: 'Program',
          sourceType: 'module',
          interpreter: null,
          body: [Array],
          directives: []
        },
        comments: []
      },
      hub: undefined,
      data: null,
      context: TraversalContext {
        queue: [ [Circular *1] ],
        priorityQueue: [],
        ...
      },
      scope: Scope {
        uid: 0,
        path: [Circular *1],
        block: Node {
          type: 'Program',
          sourceType: 'module',
          interpreter: null,
          body: [Array],
          directives: []
        },
        ...
      }
    }

Surprised by how much a single visit contains? It is too long, so let's trim it down to the key parts:

    // Visit 1
    <ref *1> NodePath {
      listKey: undefined,
      key: 'program',
      node: Node {
        type: 'Program',
        sourceType: 'module',
        interpreter: null,
        body: [ [Node], [Node] ],
        directives: []
      },
      type: 'Program',
    }

We can see that the traversal starts directly at the program node. The corresponding part of the AST is:

    program: {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [ [Node], [Node] ],
    },

Next, as we keep printing, we can see that the traversal moves on to the first node in program.body and then down into its children:

    // Visit 2
    <ref *2> NodePath {
      listKey: 'body',
      key: 0,
      node: Node {
        type: 'VariableDeclaration',
        declarations: [ [Node] ],
        kind: 'const'
      },
      type: 'VariableDeclaration',
    }

    // Visit 3
    <ref *1> NodePath {
      listKey: 'declarations',
      key: 0,
      node: Node {
        type: 'VariableDeclarator',
        id: Node {
          type: 'Identifier',
          name: 'me'
        },
        init: Node {
          type: 'StringLiteral',
          extra: [Object],
          value: '我'
        }
      },
      type: 'VariableDeclarator',
    }

    // Visit 4
    <ref *1> NodePath {
      listKey: undefined,
      key: 'id',
      node: Node {
        type: 'Identifier',
        name: 'me'
      },
      type: 'Identifier',
    }

    // Visit 5
    <ref *1> NodePath {
      listKey: undefined,
      key: 'init',
      node: Node {
        type: 'StringLiteral',
        extra: { rawValue: '我', raw: '"我"' },
        value: '我'
      },
      type: 'StringLiteral',
    }

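The NodePath properties worth remembering are:

node: the current node
parentPath: the path of the parent node
scope: the scope
parent: the parent node
type: the type of the current node

Now the pattern is clear: the traversal takes the current node, visits its child nodes one level at a time (depth first), and continues until every node in the AST has been visited.

Be sure to distinguish the two types NodePath and Node. In the example above, pt is a NodePath, and pt.node is the Node.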

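Also note that besides the enter hook there is an exit hook. Every node that is entered is also exited, so we get two chances to look at each node.

When the traversal is about to finish and the node we were looking for was never found, we can still do extra work at that point, such as adding the missing node. The complete visiting order looks like this:

enter > Program
- enter > node.body[0]
  - enter > node.declarations[0]
    - enter > node.id
    - exit < node.id
    - enter > node.init
    - exit < node.init
  - exit < node.declarations[0]
- exit < node.body[0]
- enter > node.body[1]
  - ...
  - ...
- exit < node.body[1]
exit < Program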
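The enter/exit order above can be reproduced without any library. The sketch below is a hand-written depth-first walker over a trimmed AST object; `walk` and `isNode` are names invented for this illustration, and this is not how @babel/traverse is implemented internally (it follows a fixed list of child keys per node type instead of scanning every property).

```javascript
// Hand-written, trimmed AST for: const me = "我"
const tinyAST = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "me" },
          init: { type: "StringLiteral", value: "我" },
        },
      ],
    },
  ],
};

// Hypothetical helper: anything with a string `type` counts as a node
const isNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// Depth-first walk: enter a node, walk its child nodes, then exit it
function walk(node, visitor, depth = 0) {
  visitor.enter(node, depth);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (isNode(child)) walk(child, visitor, depth + 1);
    }
  }
  visitor.exit(node, depth);
}

const log = [];
walk(tinyAST, {
  enter: (node, depth) => log.push("  ".repeat(depth) + "enter > " + node.type),
  exit: (node, depth) => log.push("  ".repeat(depth) + "exit < " + node.type),
});
console.log(log.join("\n"));
```

Each node's exit is logged only after all of its children have been entered and exited, which is the nesting shown in the list.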

3. @babel/types

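With the groundwork laid, we have parsed the code into an AST and, by traversing it, reached the nodes we care about. Now we can start transforming. @babel/types provides a set of type-checking methods, plus builder methods that create AST nodes from plain values.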

For example, suppose we want to transform the code like this:

    // Before
    const me = "我"
    function write() {
      console.log("文章")
    }

    // After
    let you = "你"
    function write() {
      console.log("文章")
    }

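First, let's work out what actually changed:

The declaration kind changes from const to let
The variable name changes from me to you
The variable value changes from "我" to "你"

That gives us two ways to do the replacement:

Option 1: replace the whole thing, that is, swap the entire program.body[0] node for a new node.
Option 2: replace piece by piece, that is, change the individual nodes and fields: program.body[0].kind, program.body[0].declarations[0].id and program.body[0].declarations[0].init.

With @babel/types both are straightforward. Let's compare them:

    const bbt = require("@babel/types");
    const { default: traverse } = require("@babel/traverse");

    // Option 1: replace the whole node
    const replaceWhole = (p) => {
      // The kind check keeps the new `let` declaration from matching again
      // after replaceWith() requeues it for traversal
      if (bbt.isVariableDeclaration(p.node, { kind: "const" }) && p.listKey === "body") {
        p.replaceWith(
          bbt.variableDeclaration("let", [
            bbt.variableDeclarator(bbt.identifier("you"), bbt.stringLiteral("你")),
          ]),
        );
      }
    };

    // Option 2: replace the pieces one by one
    const replaceParts = (p) => {
      if (bbt.isVariableDeclaration(p.node) && p.listKey === "body") {
        p.node.kind = "let"; // change the declaration kind
      }
      if (bbt.isIdentifier(p.node) && p.node.name === "me") {
        p.node.name = "you"; // rename the variable
      }
      if (bbt.isStringLiteral(p.node) && p.node.value === "我") {
        p.node.value = "你"; // change the string content
      }
    };

    // Pass one of the two handlers
    traverse(codeAST, { enter: replaceWhole });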

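As we can see, we can either replace a whole node or change individual property values; both give the expected result.

We also do not have to handle every node. We can visit only certain node types, for example VariableDeclaration, by defining the visitor like this:

    traverse(codeAST, {
      VariableDeclaration: function (p) {
        // Only runs for nodes of type VariableDeclaration
        p.node.kind = "let";
      },
    });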

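@babel/types provides a large number of methods; see the official website for the full list. For the methods available on the paths that @babel/traverse passes to a visitor, see the TypeScript definitions in the babel__traverse/index.d.ts file.

One commonly used method: p.stop() ends the traversal early. There are other methods for adding, removing, modifying and querying nodes that you can explore at your own pace. It is a tree after all, so we can operate on a node's siblings, parent and children.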

4. @babel/generator

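Once the transformation is done, we need to turn the AST back into code, and for that we use @babel/generator. Taking things apart without putting them back together is what a husky does [doge]. Being able to disassemble and reassemble is what a complete engineer should do.

Enough talk, here is the code:

    const fs = require("fs-extra");
    const { default: generate } = require("@babel/generator");

    // Generate code from the AST
    const codeIns = generate(codeAST, { retainLines: true, jsescOption: { minimal: true } });

    // Write the result to a file
    fs.writeFileSync("./code/app/index.js", codeIns.code);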

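There are quite a few options; refer to the documentation and configure them according to your actual needs.

One option deserves a special mention: jsescOption: { minimal: true }. Its main purpose here is to keep Chinese text as it is instead of letting it be converted into unicode escape sequences.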

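Babel AST in Practice

Hehe, having made it this far, you should be ready to get hands-on!

What? Not yet? Then go through steps 1 to 4 once more. Try things slowly and change things slowly; once you discover the fun in it, transforming an AST becomes simple and is really nothing difficult.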

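Here is an exercise to try afterwards:

    // Before
    const me = "我"
    function write() {
      console.log("文章")
    }

    // After
    const you = "你"
    function write() {
      console.log("文章")
    }
    console.log(you, write())

Give it a try and see how to achieve this transformation with some simple AST operations! Writing articles is not easy, so please remember to like, bookmark and share~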

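AST has a very wide range of applications. As a reminder, here is what it can be used for:

Code transformation, such as ES6 to ES5, TypeScript to JS, Taro's multi-platform compilation, CSS preprocessors, and so on.
Template compilation, such as React JSX syntax, Vue template syntax, and so on.
Code preprocessing, such as syntax checking (ESLint), code formatting (Prettier), code obfuscation and minification (uglifyjs), and so on.
Low-code platforms, where components are dragged into place and the generated code is modified through its AST before being run.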

Coming Up Next

"Unveiling the Mysterious JavaScript AST: Writing a Simple JavaScript Compiler by Hand"
