Merge pull request #24 from nixys/feat/17
feat(#17): Custom rules for types
borisershov authored Jun 19, 2024
2 parents 215a9db + 9f18c70 commit cc48b6b
Showing 13 changed files with 965 additions and 439 deletions.
43 changes: 25 additions & 18 deletions README.md
@@ -270,6 +270,24 @@ Filters description for specified table.
| `value` | String | Yes | - | The value used to replace every cell in the specified column. Depending on `type`, this value may be either a `Go template` or a `command`. See below for details |
| `unique` | Bool | No | `false` | If `true`, checks that each generated cell value is unique within the column |

**Go template**

To anonymize database fields, you may use a Go template with functions from the [Sprig template library](https://masterminds.github.io/sprig/). Rules may also use the values of other columns in the same row (the values before substitution).

Additional filter functions:
- `null`: set a field value to `NULL`
- `isNull`: compare a field value with `NULL`

**Command**

To anonymize database fields, you may use commands (scripts or binaries) with any logic you need. Commands have the following properties:
- The command's `stdout` is used as the new value for the anonymized field
- The command must return a zero exit code; otherwise nxs-data-anonymizer fails with an error (in this case `stderr` is used as the error text)
- Environment variables with the row data are available within the command:
  - `ENVVARTABLE`: contains the name of the filtered table
  - `ENVVARCURCOLUMN`: contains the current column name
  - `ENVVARCOLUMN_{COLUMN_NAME}`: contains the values (before substitution) of all columns in the current row
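
The command contract described above can be sketched as a small Go program. This is only an illustration under stated assumptions: the `id` column, the exact spelling of its environment variable (`ENVVARCOLUMN_id`), and the `maskValue` helper are hypothetical and do not come from the project.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// maskValue builds a replacement value from the row data exposed through
// environment variables. The lookup function is injected so the logic can be
// exercised without touching the real process environment.
func maskValue(lookup func(string) (string, bool)) (string, error) {
	table, ok := lookup("ENVVARTABLE")
	if !ok {
		return "", fmt.Errorf("ENVVARTABLE is not set")
	}
	column, ok := lookup("ENVVARCURCOLUMN")
	if !ok {
		return "", fmt.Errorf("ENVVARCURCOLUMN is not set")
	}
	// Assumption: the row has an `id` column and its value is exposed
	// under this exact variable name.
	id, ok := lookup("ENVVARCOLUMN_id")
	if !ok {
		return "", fmt.Errorf("ENVVARCOLUMN_id is not set")
	}
	return strings.ToLower(fmt.Sprintf("%s_%s_%s", table, column, id)), nil
}

func main() {
	v, err := maskValue(os.LookupEnv)
	if err != nil {
		// A non-zero exit code signals failure; stderr carries the error text.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// stdout becomes the new value of the anonymized field.
	fmt.Print(v)
}
```

Deriving the value from the row's own data keeps the output deterministic for a given row, which may or may not be desirable depending on the anonymization goal.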

##### Security settings

| Option | Type | Required | Default value | Description |
@@ -283,7 +301,7 @@ Filters description for specified table.
| Option | Type | Required | Default value | Description |
|--- | :---: | :---: | :---: |--- |
| `tables` | String | No | `pass` | Security policy for tables. If the value `skip` is used, all tables not described in the config are skipped during anonymization |
| `columns` | String | No | `pass` | Security policy for columns. If the value `randomize` is used, all columns not described in the config are randomized (with respect to types) during anonymization |
| `columns` | String | No | `pass` | Security policy for columns. If the value `randomize` is used, all columns not described in the config are randomized (with default rules in accordance with their types) during anonymization |

_For the values used to masquerade columns in accordance with their types, see below._

@@ -356,25 +374,14 @@ _Values to masquerade a columns in accordance with the types see below._
| Option | Type | Required | Default value | Description |
|--- | :---: | :---: | :---: |--- |
| `columns` | Map of Filters | No | - | Default filters for columns (in any table). These filters are applied to columns with the given names that have no filters described |
| `types` | Slice of [Types](#types-settings) | No | - | Custom filters for types (in any table). With these filter rules you may override the default filters for types |

###### Types settings

| Option | Type | Required | Default value | Description |
|--- | :---: | :---: | :---: |--- |
| `regex` | String | Yes | - | Regular expression matched against the column data type (as given in the `CREATE TABLE` section) |
| `rule` | [Columns](#columns-settings) | Yes | - | Rule applied to columns whose data type matches the specified regular expression |
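
To illustrate how a `regex`/`rule` pair might select a filter by column data type, here is a minimal sketch using Go's standard `regexp` package. The `typeRule` type, the `findTypeRule` and `exampleRules` helpers, the sample patterns, and the first-match-wins ordering are all assumptions made for this example; they are not taken from the project's implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// typeRule pairs a compiled regular expression with the replacement value
// of a rule. Hypothetical type, for illustration only.
type typeRule struct {
	Regex *regexp.Regexp
	Value string
}

// findTypeRule returns the first rule whose regex matches the column data
// type as written in CREATE TABLE. First-match-wins is an assumption.
func findTypeRule(rules []typeRule, dataType string) (typeRule, bool) {
	for _, r := range rules {
		if r.Regex.MatchString(dataType) {
			return r, true
		}
	}
	return typeRule{}, false
}

// exampleRules builds a small, made-up rule set.
func exampleRules() []typeRule {
	return []typeRule{
		{Regex: regexp.MustCompile(`(?i)^varchar\(\d+\)$`), Value: "{{ randAlphaNum 12 }}"},
		{Regex: regexp.MustCompile(`(?i)^(big|small)?int`), Value: "0"},
	}
}

func main() {
	rules := exampleRules()
	for _, dt := range []string{"varchar(64)", "int(11)", "datetime"} {
		if r, ok := findTypeRule(rules, dt); ok {
			fmt.Printf("%s: custom rule with value %q\n", dt, r.Value)
		} else {
			fmt.Printf("%s: no custom rule, default applies\n", dt)
		}
	}
}
```

Anchoring the patterns with `^` and `$` avoids accidental matches on types that merely contain the same substring.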

#### Example

12 changes: 11 additions & 1 deletion ctx/conf.go
@@ -36,7 +36,7 @@ type columnFilterConf struct {
type securityConf struct {
Policy securityPolicyConf `conf:"policy"`
Exceptions securityExceptionsConf `conf:"exceptions"`
Defaults filterConf `conf:"defaults"`
Defaults securityDefaultsConf `conf:"defaults"`
}

type securityPolicyConf struct {
@@ -49,6 +49,16 @@ type securityExceptionsConf struct {
Columns []string `conf:"columns"`
}

type securityDefaultsConf struct {
Columns map[string]columnFilterConf `conf:"columns"`
Types []securityDefaultsTypeConf `conf:"types"`
}

type securityDefaultsTypeConf struct {
Regex string `conf:"regex" conf_extraopts:"required"`
Rule columnFilterConf `conf:"rule" conf_extraopts:"required"`
}

type mysqlConf struct {
Host string `conf:"host" conf_extraopts:"required"`
Port int `conf:"port" conf_extraopts:"required"`
165 changes: 102 additions & 63 deletions ctx/context.go
@@ -6,8 +6,10 @@ import (
"os"
"time"

"github.com/nixys/nxs-data-anonymizer/interfaces"
mysql_anonymize "github.com/nixys/nxs-data-anonymizer/modules/anonymizers/mysql"
pgsql_anonymize "github.com/nixys/nxs-data-anonymizer/modules/anonymizers/pgsql"
progressreader "github.com/nixys/nxs-data-anonymizer/modules/progress_reader"

"github.com/nixys/nxs-data-anonymizer/ds/mysql"
"github.com/nixys/nxs-data-anonymizer/misc"
@@ -19,13 +21,12 @@ import (

// Ctx defines application custom context
type Ctx struct {
Log *logrus.Logger
Input io.Reader
Output io.Writer
Rules relfilter.Rules
Progress progressCtx
Security SecurityCtx
DB DBCtx
Log *logrus.Logger
Output io.Writer
Progress progressCtx
DB DBCtx
Anonymizer interfaces.Anonymizer
PR *progressreader.ProgressReader
}

type DBCtx struct {
@@ -61,6 +62,8 @@ type SecurityCtx struct {
// Init initiates application custom context
func AppCtxInit() (any, error) {

var ir io.Reader

c := &Ctx{}

args, err := ArgsRead()
@@ -81,9 +84,9 @@ func AppCtxInit() (any, error) {
}

if args.Input == nil {
c.Input = os.Stdin
ir = os.Stdin
} else {
c.Input, err = os.Open(*args.Input)
ir, err = os.Open(*args.Input)
if err != nil {
c.Log.WithFields(logrus.Fields{
"details": err,
@@ -109,7 +112,7 @@ func AppCtxInit() (any, error) {
Type: args.DBType,
}

// Connect to MySQL if necessary
// DEPRECATED: Connect to MySQL if necessary
if conf.MySQL != nil {
m, err := mysql.Connect(mysql.Settings{
Host: conf.MySQL.Host,
@@ -134,56 +137,103 @@ func AppCtxInit() (any, error) {
}
}

c.Rules.Tables = make(map[string]relfilter.TableRules)

if misc.SecurityPolicyColumnsTypeFromString(conf.Security.Policy.Columns) == misc.SecurityPolicyColumnsRandomize {
switch args.DBType {
case DBTypeMySQL:
c.Rules.RandomizeTypes = mysql_anonymize.RandomizeTypesDefault
case DBTypePgSQL:
c.Rules.RandomizeTypes = pgsql_anonymize.RandomizeTypesDefault
}
}

for t, f := range conf.Filters {

c.Rules.Tables[t] = relfilter.TableRules{
Columns: func() map[string]relfilter.ColumnRule {
cc := make(map[string]relfilter.ColumnRule)
for c, cf := range f.Columns {
cc[c] = relfilter.ColumnRule{
Type: misc.ValueTypeFromString(cf.Type),
Value: cf.Value,
Unique: cf.Unique,
}
c.PR = progressreader.Init(ir)

tr := func() map[string]map[string]relfilter.ColumnRuleOpts {
tables := make(map[string]map[string]relfilter.ColumnRuleOpts)
for t, cs := range conf.Filters {
columns := make(map[string]relfilter.ColumnRuleOpts)
for c, f := range cs.Columns {
columns[c] = relfilter.ColumnRuleOpts{
Type: misc.ValueType(f.Type),
Value: f.Value,
Unique: f.Unique,
}
return cc
}(),
}
tables[t] = columns
}
}
return tables
}()

c.Rules.Defaults = relfilter.TableRules{
Columns: func() map[string]relfilter.ColumnRule {
cc := make(map[string]relfilter.ColumnRule)
for c, cf := range conf.Security.Defaults.Columns {
cc[c] = relfilter.ColumnRule{
Type: misc.ValueTypeFromString(cf.Type),
Value: cf.Value,
Unique: cf.Unique,
}
dr := func() map[string]relfilter.ColumnRuleOpts {
cc := make(map[string]relfilter.ColumnRuleOpts)
for c, cf := range conf.Security.Defaults.Columns {
cc[c] = relfilter.ColumnRuleOpts{
Type: misc.ValueTypeFromString(cf.Type),
Value: cf.Value,
Unique: cf.Unique,
}
return cc
}(),
}
}
return cc
}()

c.Rules.ExceptionColumns = func() map[string]any {
v := make(map[string]any)
for _, e := range conf.Security.Exceptions.Columns {
v[e] = nil
trc := func() []relfilter.TypeRuleOpts {
cc := []relfilter.TypeRuleOpts{}
for _, t := range conf.Security.Defaults.Types {
cc = append(
cc,
relfilter.TypeRuleOpts{
Selector: t.Regex,
Rule: relfilter.ColumnRuleOpts{
Type: misc.ValueTypeFromString(t.Rule.Type),
Value: t.Rule.Value,
Unique: t.Rule.Unique,
},
},
)
}
return v
return cc
}()

switch args.DBType {
case DBTypeMySQL:
c.Anonymizer, err = mysql_anonymize.Init(
c.PR,
mysql_anonymize.InitOpts{
Security: mysql_anonymize.SecurityOpts{
TablesPolicy: misc.SecurityPolicyTablesType(conf.Security.Policy.Tables),
ColumnsPolicy: misc.SecurityPolicyColumnsTypeFromString(conf.Security.Policy.Columns),
TableExceptions: conf.Security.Exceptions.Tables,
},
Rules: mysql_anonymize.RulesOpts{
TableRules: tr,
DefaultRules: dr,
ExceptionColumns: conf.Security.Exceptions.Columns,
TypeRuleCustom: trc,
},
},
)
if err != nil {
c.Log.WithFields(logrus.Fields{
"details": err,
}).Errorf("ctx init")
return nil, err
}
case DBTypePgSQL:
c.Anonymizer, err = pgsql_anonymize.Init(
c.PR,
pgsql_anonymize.InitOpts{
Security: pgsql_anonymize.SecurityOpts{
TablesPolicy: misc.SecurityPolicyTablesType(conf.Security.Policy.Tables),
ColumnsPolicy: misc.SecurityPolicyColumnsTypeFromString(conf.Security.Policy.Columns),
TableExceptions: conf.Security.Exceptions.Tables,
},
Rules: pgsql_anonymize.RulesOpts{
TableRules: tr,
DefaultRules: dr,
ExceptionColumns: conf.Security.Exceptions.Columns,
TypeRuleCustom: trc,
},
},
)
if err != nil {
c.Log.WithFields(logrus.Fields{
"details": err,
}).Errorf("ctx init")
return nil, err
}
}

// Progress settings
c.Progress.Humanize = conf.Progress.Humanize

@@ -195,17 +245,6 @@ func AppCtxInit() (any, error) {
return nil, err
}

c.Security = SecurityCtx{
TablePolicy: misc.SecurityPolicyTablesTypeFromString(conf.Security.Policy.Tables),
TableExceptions: func() map[string]any {
v := make(map[string]any)
for _, e := range conf.Security.Exceptions.Tables {
v[e] = nil
}
return v
}(),
}

return c, nil
}

10 changes: 10 additions & 0 deletions interfaces/anonymizer.go
@@ -0,0 +1,10 @@
package interfaces

import (
	"context"
	"io"
)

type Anonymizer interface {
	Run(context.Context, io.Writer) error
}