textlint

TxtAST Interface

TxtAST defines the AST (Abstract Syntax Tree) that textlint processes.

What is AST?

An abstract syntax tree is a tree representation of the abstract syntactic structure of text.

A textlint plugin parses text into an AST. The AST is a tree structure that consists of Txt{{Type}}Node nodes such as TxtParagraphNode. Each node has common properties such as type, raw, loc, range, and parent, which are defined in the TxtNode interface.

Each node type also defines its own properties.

textlint ast-explorer

AST explorer for textlint is useful for understanding AST.

TxtNode

TxtNode is an abstract node.

/**
 * Basic TxtNode
 * A real TxtNode implementation probably has more properties.
 */
interface TxtNode {
    type: string;
    raw: string;
    range: TxtNodeRange;
    loc: TxtNodeLineLocation;
    // parent is runtime information.
    // It is not needed in the AST itself.
    // For example, the top-level root node (`Document`) has no parent.
    parent?: TxtNode;

    [index: string]: any;
}


/**
 * Location
 */
interface TxtNodeLineLocation {
    start: TxtNodePosition;
    end: TxtNodePosition;
}

/**
 * The line of a position starts with 1.
 * The column of a position starts with 0.
 * This is for compatibility with JavaScript AST.
 * https://gist.github.com/azu/8866b2cb9b7a933e01fe
 */
interface TxtNodePosition {
    line: number; // start with 1
    column: number; // start with 0
}

/**
 * Range indexes start with 0
 */
export type TxtNodeRange = readonly [startIndex: number, endIndex: number];

TxtNode must have these properties.

  • type: type of the node
  • raw: raw value of the node
    • if you want to get the raw value, please use getSource(<node>) instead.
  • loc: location object
  • range: location info array like [startIndex, endIndex]
  • parent: (optional) parent node of this node.
    • It is attached at runtime
    • Parser authors can ignore this property
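
As an illustration of the required properties listed above, the following sketch checks whether an arbitrary value looks like a TxtNode. The interfaces are local copies of the ones shown earlier, and hasRequiredProperties is a hypothetical helper written for this example, not a textlint API. It only checks the required properties and makes no claim about node-specific ones.

```typescript
// Local copies of the interfaces above, so that this sketch is self-contained.
type TxtNodeRange = readonly [number, number];
interface TxtNodePosition {
    line: number; // starts with 1
    column: number; // starts with 0
}
interface TxtNodeLineLocation {
    start: TxtNodePosition;
    end: TxtNodePosition;
}
interface TxtNode {
    type: string;
    raw: string;
    range: TxtNodeRange;
    loc: TxtNodeLineLocation;
    parent?: TxtNode;
    [index: string]: any;
}

// Hypothetical helper (not part of textlint): checks only the required properties.
function hasRequiredProperties(value: unknown): value is TxtNode {
    if (typeof value !== "object" || value === null) {
        return false;
    }
    const node = value as { [index: string]: any };
    // A position needs a 1-based line and a 0-based column.
    const isPosition = (pos: any): boolean =>
        typeof pos === "object" &&
        pos !== null &&
        typeof pos.line === "number" &&
        pos.line >= 1 &&
        typeof pos.column === "number" &&
        pos.column >= 0;
    return (
        typeof node.type === "string" &&
        typeof node.raw === "string" &&
        Array.isArray(node.range) &&
        node.range.length === 2 &&
        node.range.every((index: unknown) => typeof index === "number") &&
        typeof node.loc === "object" &&
        node.loc !== null &&
        isPosition(node.loc.start) &&
        isPosition(node.loc.end)
    );
}

// A Str node for "text" inside the Markdown source "*text*".
const strNode: TxtNode = {
    type: "Str",
    raw: "text",
    value: "text",
    range: [1, 5],
    loc: { start: { line: 1, column: 1 }, end: { line: 1, column: 5 } }
};

console.log(hasRequiredProperties(strNode)); // true
console.log(hasRequiredProperties({ type: "Str", raw: "text" })); // false (no range, no loc)
```

parent is deliberately left out of the check because it is optional and attached at runtime.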

TxtTextNode

TxtTextNode is an abstract node that inherits from the TxtNode interface.

/**
 * Text Node.
 * Text Node has inline value.
 * For example, `Str` Node is a TxtTextNode.
 */
interface TxtTextNode extends TxtNode {
    value: string;
}

TxtTextNode must have these properties.

  • value: the value of inline node.

Example: Str node is a TxtTextNode.

TxtParentNode

TxtParentNode is an abstract node that inherits from the TxtNode interface.

/**
 * Parent Node.
 * Parent Node has children that consist of TxtNode or TxtTextNode
 */
interface TxtParentNode extends TxtNode {
    children: Array<TxtNode | TxtTextNode>;
}

TxtParentNode must have these properties.

  • children: child nodes of this node.

Example: Paragraph node is a TxtParentNode.
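
The difference between TxtParentNode and TxtTextNode matters most when walking a tree: parent nodes are descended into through children, and text nodes contribute their value. The sketch below assumes trimmed-down local interfaces (range and loc are omitted to keep it short), and isParentNode and collectStrValues are hypothetical helpers written for this example.

```typescript
// Trimmed-down local interfaces: range and loc are omitted to keep the sketch short.
interface TxtNode {
    type: string;
    raw: string;
    [index: string]: any;
}
interface TxtTextNode extends TxtNode {
    value: string;
}
interface TxtParentNode extends TxtNode {
    children: Array<TxtNode | TxtTextNode>;
}

// Hypothetical type guard: a node is treated as a parent if it has a children array.
function isParentNode(node: TxtNode): node is TxtParentNode {
    return Array.isArray(node.children);
}

// Hypothetical helper: depth-first walk that collects the value of every Str node.
function collectStrValues(node: TxtNode): string[] {
    if (node.type === "Str" && typeof node.value === "string") {
        return [node.value];
    }
    if (!isParentNode(node)) {
        return [];
    }
    const values: string[] = [];
    for (const child of node.children) {
        values.push(...collectStrValues(child));
    }
    return values;
}

// Hand-written tree for the Markdown source "Hello *world*".
const paragraph: TxtParentNode = {
    type: "Paragraph",
    raw: "Hello *world*",
    children: [
        { type: "Str", raw: "Hello ", value: "Hello " },
        {
            type: "Emphasis",
            raw: "*world*",
            children: [{ type: "Str", raw: "world", value: "world" }]
        }
    ]
};

console.log(collectStrValues(paragraph)); // ["Hello ", "world"]
```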

type

type is the type of a TxtNode.

All types are defined in @textlint/ast-node-types. You can use the ASTNodeTypes values as follows:

import { ASTNodeTypes } from "@textlint/ast-node-types";

console.log(ASTNodeTypes.Str); // "Str"

In TypeScript, you can get the node type for a type name with TypeofTxtNode.

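// In TypeScript
import { ASTNodeTypes } from "@textlint/ast-node-types";
import type { TypeofTxtNode } from "@textlint/ast-node-types";

// TypeofTxtNode is a type, so it is used in a type alias, not assigned to a value
type StrNode = TypeofTxtNode<ASTNodeTypes.Str>; // TxtStrNode, which is a TxtTextNode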

All node types

These types are defined in @textlint/ast-node-types.

Type name | Node type | Description
ASTNodeTypes.Document | TxtDocumentNode(TxtParentNode) | Root Node
ASTNodeTypes.DocumentExit | TxtDocumentNode(TxtParentNode) |
ASTNodeTypes.Paragraph | TxtParagraphNode(TxtParentNode) | Paragraph Node
ASTNodeTypes.ParagraphExit | TxtParagraphNode(TxtParentNode) |
ASTNodeTypes.BlockQuote | TxtBlockQuoteNode(TxtParentNode) | > Block Quote Node
ASTNodeTypes.BlockQuoteExit | TxtBlockQuoteNode(TxtParentNode) |
ASTNodeTypes.List | TxtListNode(TxtParentNode) | List Node
ASTNodeTypes.ListExit | TxtListNode(TxtParentNode) |
ASTNodeTypes.ListItem | TxtListItemNode(TxtParentNode) | List (each) item Node
ASTNodeTypes.ListItemExit | TxtListItemNode(TxtParentNode) |
ASTNodeTypes.Header | TxtHeaderNode(TxtParentNode) | # Header Node
ASTNodeTypes.HeaderExit | TxtHeaderNode(TxtParentNode) |
ASTNodeTypes.CodeBlock | TxtCodeBlockNode(TxtParentNode) | Code Block Node
ASTNodeTypes.CodeBlockExit | TxtCodeBlockNode(TxtParentNode) |
ASTNodeTypes.HtmlBlock | TxtHtmlBlockNode(TxtParentNode) | HTML Block Node
ASTNodeTypes.HtmlBlockExit | TxtHtmlBlockNode(TxtParentNode) |
ASTNodeTypes.Link | TxtLinkNode(TxtParentNode) | Link Node
ASTNodeTypes.LinkExit | TxtLinkNode(TxtParentNode) |
ASTNodeTypes.Delete | TxtDeleteNode(TxtParentNode) | Delete Node(~Str~)
ASTNodeTypes.DeleteExit | TxtDeleteNode(TxtParentNode) |
ASTNodeTypes.Emphasis | TxtEmphasisNode(TxtParentNode) | Emphasis(*Str*)
ASTNodeTypes.EmphasisExit | TxtEmphasisNode(TxtParentNode) |
ASTNodeTypes.Strong | TxtStrongNode(TxtParentNode) | Strong Node(**Str**)
ASTNodeTypes.StrongExit | TxtStrongNode(TxtParentNode) |
ASTNodeTypes.Break | TxtBreakNode | Hard Break Node(Str<space><space>)
ASTNodeTypes.BreakExit | TxtBreakNode |
ASTNodeTypes.Image | TxtImageNode | Image Node
ASTNodeTypes.ImageExit | TxtImageNode |
ASTNodeTypes.HorizontalRule | TxtHorizontalRuleNode | Horizontal Node(---)
ASTNodeTypes.HorizontalRuleExit | TxtHorizontalRuleNode |
ASTNodeTypes.Comment | TxtCommentNode | Comment Node
ASTNodeTypes.CommentExit | TxtCommentNode |
ASTNodeTypes.Str | TxtStrNode | Str Node
ASTNodeTypes.StrExit | TxtStrNode |
ASTNodeTypes.Code | TxtCodeNode | Inline Code Node
ASTNodeTypes.CodeExit | TxtCodeNode |
ASTNodeTypes.Html | TxtHtmlNode | Inline HTML Node
ASTNodeTypes.HtmlExit | TxtHtmlNode |
ASTNodeTypes.Table | TxtTableNode | Table node. textlint 13+
ASTNodeTypes.TableExit | TxtTableNode |
ASTNodeTypes.TableRow | TxtTableRowNode | Table row node. textlint 13+
ASTNodeTypes.TableRowExit | TxtTableRowNode |
ASTNodeTypes.TableCell | TxtTableCellNode | Table cell node. textlint 13+
ASTNodeTypes.TableCellExit | TxtTableCellNode |

Some nodes have additional properties. For example, TxtHeaderNode has a depth property.

export interface TxtHeaderNode extends TxtParentNode {
    type: "Header";
    depth: 1 | 2 | 3 | 4 | 5 | 6;
    children: PhrasingContent[];
}

For more details, see @textlint/ast-node-types.

  • @textlint/ast-node-types/src/NodeType.ts.

These types are based on HTML tags and Markdown syntax. A plugin may define other node types that are not in @textlint/ast-node-types; in a rule, you can refer to such a type as a plain string.

// A rule can treat "Example" node type
export default () => {
    return {
        ["Example"](node) {
            // do something
        }
    };
};

Minimal node property

TxtAST allows node properties to be extended. However, the following nodes should have at least these properties.

Header

  • depth: level of header
    • Example: <h1> is depth:1, <h2> is depth:2...

Link

  • url: link url

Image

  • url: image url
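
A parser author could check these minimal properties with something like the sketch below. checkMinimalProperties is a hypothetical helper, not a textlint API, and the interface is a trimmed-down local stand-in for TxtNode. The 1 to 6 range for depth is taken from the TxtHeaderNode definition shown earlier.

```typescript
// Trimmed-down local stand-in for TxtNode; only what this sketch needs.
interface TxtNode {
    type: string;
    [index: string]: any;
}

// Hypothetical helper (not a textlint API): returns a list of problems, empty if none.
function checkMinimalProperties(node: TxtNode): string[] {
    const problems: string[] = [];
    switch (node.type) {
        case "Header": {
            const depth = node.depth;
            const isValidDepth =
                typeof depth === "number" &&
                Math.floor(depth) === depth &&
                depth >= 1 &&
                depth <= 6;
            if (!isValidDepth) {
                problems.push("Header should have depth between 1 and 6");
            }
            break;
        }
        case "Link":
        case "Image":
            if (typeof node.url !== "string") {
                problems.push(node.type + " should have a url string");
            }
            break;
        default:
            // Other node types have no minimal extra properties in this section.
            break;
    }
    return problems;
}

console.log(checkMinimalProperties({ type: "Header", depth: 2 })); // []
console.log(checkMinimalProperties({ type: "Image" })); // ["Image should have a url string"]
```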

Built-in Parser

textlint has built-in parsers.

Package | Description
@textlint/markdown-to-ast | markdown parser
@textlint/text-to-ast | plain text parser

If you need another node type, please create a new issue.

Package

The TxtNode interface is defined in packages/ast-node-types.

If you want to use this interface from TypeScript, packages/ast-node-types is useful.

Online Parsing Demo

ast-explorer fork

AST explorer for textlint is useful for understanding AST.

A minimal (recommended) rule looks like the following code:

/**
 * @param {RuleContext} context
 */
export default function(context) {
    const { Syntax } = context;
    // root object
    return {
        [Syntax.Document](node) {
        },
        [Syntax.Paragraph](node) {
        },
        [Syntax.Str](node) {
        }
    };
}

loc

loc is the location info object.

{
  "loc": {
    "start": {
      "line": 2,
      "column": 4
    },
    "end": {
      "line": 2,
      "column": 10
    }
  }
}

  • The line of a location starts with 1 (1-indexed).
  • The column of a location starts with 0 (0-indexed).
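
To make the indexing concrete, the following sketch derives a loc-style position from a 0-based character index, which is the same kind of index that range uses. positionFromIndex is a hypothetical helper written for this example, and it assumes lines are separated by "\n" only. The sample text is chosen so that the result matches the loc object shown above.

```typescript
interface TxtNodePosition {
    line: number; // starts with 1
    column: number; // starts with 0
}

// Hypothetical helper: converts a 0-based character index into a position.
// Assumes "\n" is the only line separator.
function positionFromIndex(text: string, index: number): TxtNodePosition {
    let line = 1;
    let lineStart = 0; // index of the first character of the current line
    for (let i = 0; i < index && i < text.length; i++) {
        if (text[i] === "\n") {
            line++;
            lineStart = i + 1;
        }
    }
    return { line, column: index - lineStart };
}

// "*text*" occupies indexes 15..21 of this two-line text.
const sampleText = "first line\nfoo *text*";

console.log(positionFromIndex(sampleText, 15)); // { line: 2, column: 4 }
console.log(positionFromIndex(sampleText, 21)); // { line: 2, column: 10 }
```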

This is for compatibility with JavaScript AST.

  • Why do line of location in JavaScript AST(ESTree) start with 1 and not 0?

Important Note:

Text -> AST TxtNode(0-based columns here) -> textlint -> TextLintMessage(1-based columns)

TxtNode has 0-based columns, but TextLintMessage, the result of linting, has 1-based columns.

In other words, textlint rules handle TxtNode, while formatters handle TextLintMessage.
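
The column shift between the two can be sketched as follows. toMessagePosition is a hypothetical helper for illustration only; textlint performs this conversion internally, and the returned object is a simplified stand-in, not the full TextLintMessage shape.

```typescript
interface TxtNodePosition {
    line: number; // starts with 1
    column: number; // starts with 0
}

// Simplified stand-in for the position part of a lint message (assumption).
interface MessagePosition {
    line: number; // starts with 1
    column: number; // starts with 1
}

// Hypothetical helper: the line is unchanged, the column moves from 0-based to 1-based.
function toMessagePosition(position: TxtNodePosition): MessagePosition {
    return { line: position.line, column: position.column + 1 };
}

console.log(toMessagePosition({ line: 2, column: 4 })); // { line: 2, column: 5 }
```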

Example

Input: *text*

Output: The AST by AST explorer for textlint + Markdown

{
  "type": "Document",
  "children": [
    {
      "type": "Paragraph",
      "children": [
        {
          "type": "Emphasis",
          "children": [
            {
              "type": "Str",
              "value": "text",
              "loc": {
                "start": {
                  "line": 1,
                  "column": 1
                },
                "end": {
                  "line": 1,
                  "column": 5
                }
              },
              "range": [
                1,
                5
              ],
              "raw": "text"
            }
          ],
          "loc": {
            "start": {
              "line": 1,
              "column": 0
            },
            "end": {
              "line": 1,
              "column": 6
            }
          },
          "range": [
            0,
            6
          ],
          "raw": "*text*"
        }
      ],
      "loc": {
        "start": {
          "line": 1,
          "column": 0
        },
        "end": {
          "line": 1,
          "column": 6
        }
      },
      "range": [
        0,
        6
      ],
      "raw": "*text*"
    }
  ],
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 1,
      "column": 6
    }
  },
  "range": [
    0,
    6
  ],
  "raw": "*text*"
}

Illustration

          *   text   *
          |   |__|   |
          |   value  |
          |__________|
               raw

  • Document is a TxtParentNode and its type is Document
    • has children, but does not have value
  • Paragraph is a TxtParentNode and its type is Paragraph
    • has children, but does not have value
  • Emphasis is a TxtParentNode and its type is Emphasis
    • has children, but does not have value
  • "text" is a TxtTextNode and its type is Str
    • has value
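
In the example AST, each node's raw is the slice of the input text selected by its range. The sketch below reproduces that for the input *text* using the range values from the JSON output; sliceByRange is a hypothetical helper, and the relation is shown for this example only.

```typescript
// Input text and ranges copied from the example AST above.
const source = "*text*";
const emphasisRange: readonly [number, number] = [0, 6];
const strRange: readonly [number, number] = [1, 5];

// Hypothetical helper: slices the source text with a node's range.
function sliceByRange(text: string, range: readonly [number, number]): string {
    return text.slice(range[0], range[1]);
}

console.log(sliceByRange(source, emphasisRange)); // *text*  (raw of Emphasis)
console.log(sliceByRange(source, strRange)); // text  (raw and value of Str)
```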

Unist

TxtAST has minimal compatibility with unist: Universal Syntax Tree.

We have discussed Unist in Compliances tests for TxtNode #141.

For testing Processor plugin

You can use @textlint/ast-tester for testing your processor plugin's parser.

  • textlint/@textlint/ast-tester: Compliance tests for textlint's AST

import { test, isTxtAST } from "@textlint/ast-tester";
// your parser implementation
import yourParse from "your-parser";
// recommended: test many patterns
const AST = yourParse("This is text");

// Validate AST
test(AST); // throws an Error if the AST is invalid

isTxtAST(AST); // true or false

Warning: The current test function does not check node-specific properties. For example, TxtHeaderNode has a depth property, but the test function does not check it.

  • Issue: ast-tester should validate individual Node type · Issue #1009 · textlint/textlint
Copyright © 2023 textlint organization