# Quickstart
This parser is highly compatible with MySQL syntax. You can use it as a library to parse SQL text into an abstract syntax tree (AST) and traverse the AST nodes.
In this example, you will build a project that extracts all the column names from a SQL statement.
## Prerequisites
- [Golang](https://golang.org/dl/) version 1.13 or above. Follow the instructions on the official [installation page](https://golang.org/doc/install), then verify the installed version with `go version`.
## Create a Project
```bash
mkdir colx
cd colx
go mod init colx
touch main.go
```
## Import Dependencies
First, use `go get` to fetch the dependencies by git hash. The git hashes are available on the [release page](https://github.com/pingcap/tidb/releases). Take `v7.5.0` as an example:
```bash
go get -v github.com/pingcap/tidb/pkg/parser@069631e
```
> **NOTE**
>
> The parser has been part of the TiDB repo since v5.3.0, so you can only choose v5.3.0 or later from this repo.
>
> You may want to use the advanced API on expressions (a kind of AST node), such as numbers, string literals, booleans, and nulls. In that case, it is strongly recommended to use the `types` package in the TiDB repo, which you can fetch with the following command:
>
> ```bash
> go get -v github.com/pingcap/tidb/pkg/types/parser_driver@069631e
> ```
> and import it in your Go source code:
> ```go
> import _ "github.com/pingcap/tidb/pkg/types/parser_driver"
> ```
Your directory should contain the following three files:
```
.
├── go.mod
├── go.sum
└── main.go
```
Now, open `main.go` with your favorite editor, and start coding!
## Parse SQL text
To convert SQL text into an AST, you need to:
1. Use the [`parser.New()`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser?tab=doc#New) function to instantiate a parser, and
2. Invoke the method [`Parse(sql, charset, collation)`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser?tab=doc#Parser.Parse) on the parser. The example below calls `ParseSQL(sql)` instead, passing only the SQL text.
```go
package main

import (
	"fmt"

	"github.com/pingcap/tidb/pkg/parser"
	"github.com/pingcap/tidb/pkg/parser/ast"
	_ "github.com/pingcap/tidb/pkg/parser/test_driver"
)

// parse returns the AST root of the first statement in sql.
func parse(sql string) (*ast.StmtNode, error) {
	p := parser.New()

	stmtNodes, _, err := p.ParseSQL(sql)
	if err != nil {
		return nil, err
	}
	// Guard against input that contains no statement at all.
	if len(stmtNodes) == 0 {
		return nil, fmt.Errorf("no statement found")
	}

	return &stmtNodes[0], nil
}

func main() {
	astNode, err := parse("SELECT a, b FROM t")
	if err != nil {
		fmt.Printf("parse error: %v\n", err.Error())
		return
	}
	fmt.Printf("%v\n", *astNode)
}
```
Test the parser by running the following command:
```bash
go run main.go
```
If the parser runs properly, you should get a result like this (the pointer addresses will differ between runs):
```
&{{{{SELECT a, b FROM t}}} {[]} 0xc0000a1980 false 0xc00000e7a0 <nil> 0xc0000a19b0 <nil> <nil> [] <nil> <nil> none [] false false 0 <nil>}
```
> **NOTE**
>
> Here are a few things you might want to know:
> - To use the parser, a `parser_driver` is required. It decides how to parse the basic data types in SQL.
>
>   You can use [`github.com/pingcap/tidb/pkg/parser/test_driver`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser/test_driver) as the `parser_driver` for testing. Again, if you need advanced features, use the `parser_driver` in TiDB (run `go get -v github.com/pingcap/tidb/pkg/types/parser_driver@069631e` and import it).
> - The instantiated parser object is neither goroutine-safe nor lightweight. It is better to keep it in a single goroutine and reuse it where possible.
> - Warning: the `parser.result` object is reused without being reset or copied. This can cause unexpected behavior or errors if the object is used across multiple parsing operations or concurrently in multiple goroutines. To avoid these issues, copy the `parser.result` object before calling `parser.Parse()` again or before using it in another goroutine, or create a new `parser` object for each parsing operation.
## Traverse AST Nodes
Now you have the AST root of a SQL statement. It is time to extract the column names by traversing the tree.
The parser implements the interface [`ast.Node`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser/ast?tab=doc#Node) for each kind of AST node, such as `SelectStmt`, `TableName`, and `ColumnName`. [`ast.Node`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser/ast?tab=doc#Node) provides a method `Accept(v Visitor) (node Node, ok bool)` that lets any struct implementing [`ast.Visitor`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser/ast?tab=doc#Visitor) traverse the node and its children.
[`ast.Visitor`](https://pkg.go.dev/github.com/pingcap/tidb/pkg/parser/ast?tab=doc#Visitor) is defined as follows:
```go
type Visitor interface {
	Enter(n Node) (node Node, skipChildren bool)
	Leave(n Node) (node Node, ok bool)
}
```
Now you can define your own visitor, `colX` (column extractor):
```go
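// colX collects the name of every column it visits.
type colX struct {
	colNames []string
}

// Enter records column names and always descends into child nodes.
func (v *colX) Enter(in ast.Node) (ast.Node, bool) {
	if name, ok := in.(*ast.ColumnName); ok {
		v.colNames = append(v.colNames, name.Name.O)
	}
	return in, false
}

// Leave returns the node unchanged and lets the traversal continue.
func (v *colX) Leave(in ast.Node) (ast.Node, bool) {
	return in, true
}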
```
Finally, wrap `colX` in a simple function:
```go
func extract(rootNode *ast.StmtNode) []string {
	v := &colX{}
	(*rootNode).Accept(v)
	return v.colNames
}
```
And slightly modify the main function (remember to add `"os"` to the import list):
```go
func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: colx 'SQL statement'")
		return
	}
	sql := os.Args[1]
	astNode, err := parse(sql)
	if err != nil {
		fmt.Printf("parse error: %v\n", err.Error())
		return
	}
	fmt.Printf("%v\n", extract(astNode))
}
```
Test your program:
```bash
go run main.go 'select a, b from t'
```
```
[a b]
```
You can also try different SQL statements as input. For example:
```console
$ go run main.go 'SELECT a, b FROM t GROUP BY (a, b) HAVING a > c ORDER BY b'
[a b a b a c b]
$ go run main.go 'SELECT a, b FROM t/invalid_str'
parse error: line 1 column 19 near "/invalid_str"
```
Enjoy!