- What do I do about the years of prior data that I have accumulated?
- How do I efficiently process new data as it comes in from my bank?
The prior data comes in many different sizes and styles. For the immediate future, I will focus solely on the years 2016-2019. I can add additional years after getting things to work for 2016-2019.
New data (transactions) come in from my bank through CSV and QFX downloads. I will download and save both the CSV and QFX data, but work exclusively with the CSV files at this point.
There is a Download Transactions button that loads a modal window with settings for
- Account
- From Date
- To Date
- Download To (i.e., format)
{{{noindent}}} So I can easily pick the account, start date, end date, and format, and download new transactions into a separate file. The problem is that because this is a manual operation, it becomes difficult to keep track of which dates have already been downloaded and processed, or to reprocess some or all dates, without accumulating duplicate data. An acceptable solution is to concatenate newly downloaded data into files according to account and year, then remove duplicate items. This would be accomplished with a Perl script that concatenates the files, sorts the lines, unique’ifies them, and then saves and backs up the new and original files.
A potential workflow would therefore be:
- Download raw data from a bank by account, start date, and end date, which
  will be stored in the ~/Downloads directory as export.csv. A script can be
  executed to review the data to determine what the year is. However, there is
  no indication in either the filename or the data from which account the data
  was downloaded. Therefore, any script that is run must be told what account
  this data is associated with, as for example here: catuniq [--dest <path>] <acct>
  Invoking this command will look for a file named export.csv in the
  ~/Downloads directory, determine what year its transactions are from,
  catenate the file onto the proper destination file according to year and
  account, and then sort and unique’ify the new file. It will create the
  necessary destination directory and file to allow the command to run. The
  file onto which the export is catenated can either be indicated on the
  command line by means of the --dest option, or located through the
  environment variable $WORKUSB if the --dest option is omitted. Backups are
  stored in $WORKBAK with the Unix time appended to the end.
- Process the raw data to remove extraneous and unnecessary data while
  reformatting the data into the proper form and columns for efficiently
  working with the financial data, e.g., producing accounting and tax reports,
  budgets, etc. This processed data will be stored in an SQLite3 database,
  avoiding duplicating data at all times.
- Export data from the SQLite3 database, and convert it into a form usable by the Ledger3 program, again always avoiding duplicating transactions.
- Do as much as possible with the Ledger3 data automatically to create proper double entry transactions to avoid time-consuming manual work.
This workflow requires a way to download any amount of data and process it with
the click of a button without worrying about duplicating transactions. Ledger3
is supposed to be able to detect duplicates and avoid reloading them with its
convert
command. Therefore it would be convenient to be able to download any
amount of new data from a bank and have it combined with old data without
duplicating transactions as well.
Then, the workflow would process the downloaded data into a form usable in both
an SQLite3 database and Ledger3, again without having to worry about
duplicating data. Ledger3 uses a hashing function to produce a unique hash for
each entry, similar to how git
works. This may be a possible solution for
the raw bank data as well.
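As a rough illustration of that idea, here is a minimal sketch (not part of the project code) showing how each raw CSV line could be fingerprinted with a SHA-1 hash so that re-downloaded transactions are skipped. The file contents and the dedupeLines helper are hypothetical; the project script below actually uses sort and uniq semantics in Perl instead.

// dedupe-sketch.js -- hypothetical: skip raw CSV lines whose hash has been seen before
const crypto = require('crypto');

function dedupeLines(lines, seen = new Set()) {
  const fresh = [];
  for (const line of lines) {
    // hash the whole raw line, similar in spirit to Ledger3's per-entry UUID
    const hash = crypto.createHash('sha1').update(line.trim()).digest('hex');
    if (!seen.has(hash)) {
      seen.add(hash);
      fresh.push(line);
    }
  }
  return fresh;
}

// usage: prime the set with previously stored lines, then feed the new download
const existing = ['1/4/2016,DEBIT,USPS PO BOXES,-66.0000'];
const download = ['1/4/2016,DEBIT,USPS PO BOXES,-66.0000', '1/5/2016,CREDIT,DEPOSIT,100.0000'];
const seen = new Set();
dedupeLines(existing, seen);
console.log(dedupeLines(download, seen)); // only the 1/5/2016 line survives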
package CATUNIQ;
our $VERSION = "0.1.2";
eval $VERSION;
# 2019-07-31T12:30
use strict;
use warnings; no warnings 'experimental';
use v5.16;
use File::Spec; # Portably perform operations on file names
use File::Path; # Create or remove directory trees
use File::Slurp; # Simple and Efficient Reading/Writing/Modifying of Complete Files
use File::Copy; # Copy files or filehandles
use List::Util 'uniqstr'; # A selection of general-utility list subroutines
use OptArgs2; # Integrated argument and option processing
use Data::Printer; # colored pretty-print of Perl data structures and objects
my @valid_accts = (qw'
6815
6831
6151'
);
my @valid_years = (qw'
2016
2017
2018
2019'
);
# ESTABLISH THE CL OPTIONS AND ARGUMENTS
opt help => (
isa => 'Flag',
comment => 'Help',
alias => 'h',
ishelp => 1,
);
arg acct => (
isa => 'Str',
comment => 'The name of the account to which the file is related; e.g. "usb_6815" or "usb_6831"',
required=> 1,
);
opt dest => (
isa => 'Str',
alias => 'd',
comment => 'path to the destination file upon which the new data will be catenated.',
default => exists $ENV{WORKUSB} ? $ENV{WORKUSB} : undef,
);
opt restore => (
comment => 'Restore a backed-up file related to <ACCT>',
isa => 'Flag',
alias => 'r',
);
# PROCESS THE CL OPTIONS AND ARGUMENTS
my $opts = optargs;
# VERIFY FILES
# Verify $WORKBAK exists
exists $ENV{WORKBAK} || die("STOP: \$WORKBAK is not defined");
# Verify correct form of 'acct', e.g., 'usb_6815'
my ($acct) = $opts->{acct} =~ /usb_(\d{4})/ or die("STOP: incorrect acct form: $opts->{acct}");
$acct ~~ @valid_accts or die("STOP: acct $acct is not a member of @valid_accts");
# verify a $dest has been supplied
die ("STOP: you did not supply a '-dest' option and \$WORKUSB is not defined.") unless exists $opts->{dest};
my $dest = File::Spec->catdir($opts->{dest}, $opts->{acct}); # e.g., $WORKUSB/usb_6815
if ($opts->{restore}) {
say "Running restore.";
restore();
exit;
}
# Find and verify 'export.csv'
my $export = File::Spec->catfile($ENV{HOME}, 'Downloads', 'export.csv');
-e -r -w $export or die("STOP: $export must exist, be readable, and be writable.");
# Find year from within export.csv
my @lines = read_file($export);
chomp(my $header = shift @lines); # remove the header line from $export
my ($year) = $lines[0] =~ /([[:digit:]]{4})/
or die("Cannot obtain a year from $export\n");
# verify $dest_year dir exists or create it, including subdirectories
my $dest_year = File::Spec->catdir($dest, $year); # e.g., $WORKUSB/usb_6815/2019
File::Path::make_path($dest_year, {verbose => 1}) unless ( -d $dest_year );
my $acct_year = "$opts->{acct}--${year}.csv"; # e.g., usb_6815--2019.csv
my $dest_file = File::Spec->catfile($dest, $year, $acct_year); # e.g., $WORKUSB/usb_6815/2019/usb_6815--2019.csv
# Backup original $dest_file to $WORKBAK before appending to
my $dest_bak = File::Spec->catfile($ENV{WORKBAK}, "$acct_year." . time());
copy($dest_file, $dest_bak);
# APPEND $export onto $dest_file
append_file($dest_file, @lines)
or die("STOP: append of $dest_file and \@lines failed.\n");
# UNIQUE new $dest_file
@lines = uniqstr sort map { # first change the date to yyyy-mm-dd for proper sorting
if (/([[:digit:]]{1,2})\/([[:digit:]]{1,2})\/([[:digit:]]{4})/) {
my $date = sprintf("%4d-%02d-%02d",$3,$1,$2); # e.g., 1/4/2016 => 2016-01-04
s/$&/$date/;
}
$_;
} read_file($dest_file);
unshift @lines, pop @lines; # header ends up last after the sort; put it back to beginning
# Save new $dest_file
write_file($dest_file, @lines);
# Backup export.csv to $WORKBAK
move($export, File::Spec->catfile($ENV{WORKBAK}, "export.${acct_year}." . time()));
say "SUCCESS: $export catuniq'ed onto $dest_file.";
sub restore {
use POSIX qw(strftime);
my $acct = $opts->{acct}; # e.g. usb_6815
my $dt = qr/^(\d{4}-\d{2}-\d{2} T \d{2}:\d{2}:\d{2})/;
chdir $ENV{WORKBAK};
opendir (my $dh, '.') || die "Can't open $ENV{WORKBAK}: $!";
my @baks =
sort { # sort by most recent backup first
my ($at) = $a->{t} =~ $dt; # just sort by datetime
my ($bt) = $b->{t} =~ $dt;
$bt cmp $at;
}
map { # change Unix time to POSIX ISO datetime %Y-%m-%dT%H:%M:%S
my ($acct, $time) = /^(.*.csv).(\d+)$/;
$time = substr $time, 0, 10; # remove milliseconds from those times that have them
my $t = (strftime "%F T %T", localtime($time)) . sprintf(" --- %s", $acct);
{t => $t, o => $_}; # map to POSIX time, and original filename as a hashref
}
grep {/$acct.*.csv/ }
readdir($dh);
foreach (@baks) {
state $c = 0;
printf("[%2d] %s\n", $c++, $_->{t});
}
print "Pick a number: ";
chomp (my $num = <STDIN>);
say "You chose $baks[$num]->{t} ($baks[$num]->{o})";
print "restore (y/n)? ";
exit unless <STDIN> =~ /y/i;
my ($file) = $baks[$num]->{t} =~ /--- (.*)$/; # i.e., 'usb_6815--2019.csv'
my ($year) = $file =~ /(\d{4})\.csv$/; # i.e., '2019'
my $restore_path = File::Spec->catfile($dest,$year,$file); # full path to file to be restored
my $bak_path = File::Spec->catfile($ENV{WORKBAK}, $baks[$num]->{o}); # full path to backed-up file
# back up the file to be restored, just in case; use same directory
move( ${restore_path}, ${restore_path}.'.bak') or die "Backup of ${restore_path} failed: $!";
# note that the backed-up file will be deleted
move( ${bak_path}, ${restore_path}) or die "Restore of $baks[$num]->{o} failed: $!";
say "Successfully restored $baks[$num]->{o} to $restore_path";
}
ln -f $PWD/scripts/catuniq.pl $WORKBIN/catuniq
{{{heading(Header Lines)}}}
The header lines are:
"Date","Transaction","Name","Memo","Amount"
{{{heading(Data Columns)}}}
{{{subheading(Date)}}}
A sample date is:
1/4/2016
This should be transformed into:
2016-01-04
{{{subheading(Transaction)}}}
A Transaction
is one of
DEBIT
CREDIT
This should be transformed into:
debit | credit
{{{subheading(Name)}}}
A sample name
entry is:
DEBIT PURCHASE -VISA USPS PO BOXES 66800-3447779 DC
This should be transformed into:
visa purchase usps po boxes
{{{subheading(Memo)}}}
A sample memo
entry is:
Download from usbank.com. USPS PO BOXES 66800-3447779 DC
The Download from usbank.com
should be removed in all cases. The extraneous
numbers should be removed whenever possible.
{{{subheading(Amount)}}}
- -66.0000
This should be transformed into:
$-66.00
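As a minimal sketch of these per-field transformations (assuming the column names shown above; the cleanRecord helper is hypothetical, and the project's full transform function appears later in this document). Note that the payee clean-up to something like "visa purchase usps po boxes" requires the fuller set of regular expressions developed in the Transform Function section.

// clean-sketch.js -- hypothetical per-field clean-up of one raw bank record
function cleanRecord(rec) {
  // "1/4/2016" -> "2016-01-04"
  const [m, d, y] = rec.Date.split('/').map(Number);
  const date = `${y}-${String(m).padStart(2, '0')}-${String(d).padStart(2, '0')}`;
  // "DEBIT" -> "debit"
  const trans = rec.Transaction.toLowerCase();
  // "-66.0000" -> "$-66.00"
  const amount = `$${Number(rec.Amount).toFixed(2)}`;
  // strip the "Download from usbank.com." boilerplate from the memo
  const memo = rec.Memo.replace(/^Download from usbank\.com\.\s*/, '');
  // payee clean-up here is only lower-casing; the regex work comes later
  return { date, trans, amount, memo, payee: rec.Name.toLowerCase() };
}

console.log(cleanRecord({
  Date: '1/4/2016',
  Transaction: 'DEBIT',
  Name: 'DEBIT PURCHASE -VISA USPS PO BOXES 66800-3447779 DC',
  Memo: 'Download from usbank.com. USPS PO BOXES 66800-3447779 DC',
  Amount: '-66.0000',
}));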
The minimum SQLite tables that should be created are:
- usb
- includes business (6815), trust (6831), personal (6151) data
- cases
- 190301, etc.
- people
- John Doe, Mary Jane, etc.
- businesses
- Law Office of …, etc.
- checks
- holds check information parsed from
worklog.<year>.otl
files
More can be created as needed.
The columns that should be created for the usb bank tables are:
- rowid (implicit creation)
- acct in the form of usb_6815|usb_6831|usb_6151
- date in the form of yyyy-mm-dd
- trans containing either CREDIT | DEBIT
- checkno containing a check number, if present (check is a reserved word and throws an error)
- txfr containing a direction arrow (< or >) and a bank account (usb_6151)
- payee
- category
- memo
- desc1
- desc2
- caseno containing a related case number (case is apparently a reserved word and throws an error)
- amount in the form \pm##,###.##
rowid | acct | date | trans | checkno | txfr | payee | category | memo | desc1 | desc2 | caseno | amount |
---|---|---|---|---|---|---|---|---|---|---|---|---|
primary key | usb_6815 | yyyy-mm-dd | credit | #### | < usb 6151 | text | text | text | text | text | integer | ##,###.## |
implicit | usb_6831 | not null | debit | null | > usb 6831 | not null | null | null | null | null | null | not null |
creation | usb_6151 | null | | | | | | | | | | |
usb (
rowid INTEGER PRIMARY KEY NOT NULL,
acct TEXT NOT NULL,
date TEXT NOT NULL,
trans TEXT NOT NULL,
checkno TEXT,
txfr TEXT,
payee TEXT NOT NULL,
category TEXT,
note TEXT,
desc1 TEXT,
desc2 TEXT,
caseno TEXT,
amount REAL NOT NULL,
OrigPayee TEXT NOT NULL,
OrigMemo TEXT NOT NULL )
The database column names are identical to the SQL column names except for
rowid
, which is missing since it is auto-generated by SQLite3.
const DB_TABLES = {
usb: 'usb',
checks: 'checks',
};
const DB_ACCTS = {
'6815': 'Business',
'6831': 'Trust',
'6151': 'Personal',
};
const DB_YEARS = [
'2016',
'2017',
'2018',
]
const DB_COLS = [
'acct',
'date',
'trans',
'checkno',
'txfr',
'payee',
'category',
'note',
'desc1',
'desc2',
'caseno',
'amount',
'OrigPayee',
'OrigMemo',
];
const EXPORT_DB_COLS = [
'rowid',
'acct',
'date',
'trans',
'checkno',
'txfr',
'payee',
'category',
'note',
'caseno',
'amount',
];
The column names for the checks table are:
- rowid
- acct, e.g., 6815, 6831, 6151
- checkno, e.g., 1001, 1002
- date
- payee
- subject
- purpose
- caseno, e.g., 190301, 190205
- amount
rowid | acct | checkno | date | payee | subject | purpose | caseno | amount |
---|---|---|---|---|---|---|---|---|
INTEGER | TEXT | INTEGER | TEXT | TEXT | TEXT | TEXT | TEXT | REAL |
PRIMARY KEY | NOT NULL | NOT NULL | NOT NULL | NOT NULL | NOT NULL | | NOT NULL | NOT NULL |
checks (
acct TEXT NOT NULL,
checkno TEXT NOT NULL,
date TEXT NOT NULL,
payee TEXT NOT NULL,
subject TEXT NOT NULL,
purpose TEXT,
caseno TEXT NOT NULL,
amount REAL NOT NULL
)
const CHECKS_COLS = [
'acct',
'checkno',
'date',
'payee',
'subject',
'purpose',
'caseno',
'amount'
];
- csv-sqlite3 --csv <usb-acct> <year> [--attach <db>] [--checks]
  - <usb-acct> is one of: 6815|6831|6151
  - <year> is one of 2016|2017|2018
  - --attach <db> is optional; the default is workfin.sqlite, otherwise <db>.sqlite
  - --checks parses worklog.<year>.otl files for check data and places it into <db> in a checks table
- Transform downloaded data and save it in a new CSV file and a new sqlite3
database. The source CSV files are found at:
$WORKFIN/src/usb/usb_{6815|6831|6151}/{2016|2017|2018}/usb_6815--2016.csv
- The transformed CSV files are found at:
$WORKFIN/csv/usb_{6815|6831|6151}__{2016|2017|2018}.csv
- The new SQLite3 database is found at:
$WORKFIN/db/workfin.sqlite
- The exported database file in CSV format will be found at:
$WORKFIN/csv/workfin.csv
- csv-sqlite3 --delete [<db>]
  Delete the SQLite3 database file <db>.sqlite, or workfin.sqlite if left blank; also delete the CSV files in csv/;
- csv-sqlite3 --log-level <level>
  Set a log level (default is warn)
{{{heading(External Dependencies)}}}
This project’s external dependencies are the following Node.js modules:
command-line-args
- https://github.com/75lb/command-line-args#readme
command-line-usage
- https://github.com/75lb/command-line-usage
csv
- https://csv.js.org/
sqlite3
- https://github.com/mapbox/node-sqlite3/wiki
accounting
- http://openexchangerates.github.io/accounting.js/
{{{heading(Node.js Dependencies)}}}
This project’s Node.js internal dependencies are the following:
File System (fs)
- File System for working with files
Path (path)
- Utilities for working with file and directory paths
Utilities (util)
- Utilities for inspecting objects
Assert (assert)
- Assertion functions for verifying invariants
Create the CSV-SQLite3 project and install the Node.js package dependencies:
command-line-args
command-line-usage
csv
sqlite3
accounting
yarn init --yes --private
sed -i '' -e 's/\"version\": \"1.0.0\",/\"version\": \"0.0.73\",/' package.json
yarn add command-line-args command-line-usage csv sqlite3 accounting
yarn add ssh://[email protected]:wlharvey4/wlparser.git#prod
cat package.json
Create the base file index.js
and import the project’s dependencies.
/* index.js */
const fs = require('fs');
const path = require('path');
const util = require('util');
const assert = require('assert').strict;
const cl_args = require('command-line-args');
const cl_usage = require('command-line-usage');
const csv = require('csv');
const sqlite3 = require('sqlite3').verbose(); // remove 'verbose' in production
const accounting = require('accounting');
<<define-database-tables-and-columns>>
This section generates a usage message activated by the --help option. It
uses the option_defs object created in the code in the next section.
const sections = [
{
header: 'CSV-SQLite3',
content: 'Processes raw usb csv files into a form usable by SQLite3'
},
{
header: 'Options',
optionList: option_defs,
},
{
content: `Project directory: {underline ${process.env.WORKNODE}/CSV-SQLite3}`
}
];
const usage = cl_usage(sections);
console.log(usage);
Options include giving the name of a database to attach to using --attach
<db>
. In the absence of this option, a default database will be used. A
database can be deleted here as well using the option --delete <db>
, with a
backup being saved in the WORKBAK
directory with the Unix time suffixed to
the end. If the option {{{option(–checks)}}} is included, parse the
worklog.<year>.otl
worklog file for check information and save the same in a
checks
table in <db>
.
{{{subheading(Usage)}}}
Obtain usage information using the {{{option(–help)}}} option:
csv-sqlite3 --help | -h
{{{subheading(Choose the CSV File)}}}
For non-delete commands, identify the CSV file to transform via the --csv
option:
csv-sqlite3 --csv | -c 6815|6831 2004...2019
{{{subheading(Attach and Delete a Database)}}}
The database is attachable (it will be created automatically if it does not exist), and deleteable:
csv-sqlite3 --attach <db> | -a <db-name>
csv-sqlite3 --delete <db> | -d <db-name>
{{{subheading(Export the SQLite3 data to CSV)}}}
Export SQLite3 data into a CSV file of the same name:
csv-sqlite3 --export <db> | -e <db-name>
{{{subheading(Set a Log Level)}}}
csv-sqlite3 --log-level <value:0..10> | -l <value:0..10>
<<option-defs-variable>>
const option_defs = [
{ name: 'help', alias: 'h', type: Boolean, description: 'Prints this usage message.' },
{ name: 'attach', alias: 'a', type: String, description: 'Attach to an existing or new database file.' },
{ name: 'delete', alias: 'd', type: String, description: 'Delete an existing database file and related CSV files.' },
{ name: 'csv', alias: 'c', type: String, description: 'Process a CSV file [6815|6831] yyyy', multiple: true },
{ name: 'export', alias: 'e', type: String, description: 'Export sqlite3 data into a csv file (default \'workfin.csv\')' },
{ name: 'checks', type: Boolean, description: 'Find checks in worklog.<year>.otl and save to sqlite3 database checks file.' },
{ name: 'log-level', alias: 'l', type: Number, description: 'Set a log level 0..10' },
];
const options = cl_args(option_defs);
console.log(options);
if (options.help) {
<<csv-sqlite3-usage>>
process.exit(0);
}
let LOG_LEVEL = process.env.LOG_LEVEL || 1;
if (options['log-level'] >= 0) {
if (typeof options['log-level'] === 'number' && options['log-level'] <= 10)
LOG_LEVEL = options['log-level'];
else {
console.error(`Incorrect log-level: ${options['log-level']}; must be between 0 and 10`);
}
}
console.log(`Log-level set at: ${LOG_LEVEL}`);
The user can attach to a database file (either a specified file or the default
file, defined as $WORKFIN/db/workfin.sqlite
), or delete a specified database file
and the associated CSV file exported from the database data. All deleted files
are backed up to a backup directory that needs to be defined as a shell
environment variable: WORKBAK
. The backed-up files are appended with the
current date in UNIX time (seconds).
The attached database will be referenced as db
.
{{{heading(Verbose Mode)}}}
During development, call the verbose()
method on the sqlite3
object to
enable better stack traces. In production, remove this call for improved
performance.
if ( !process.env.WORKDB ) { // $WORKFIN/db
console.error('You must define a shell variable named WORKDB as a base directory for the .sqlite database file.');
process.exit(1);
}
if ( !process.env.WORKCSV ) { // $WORKFIN/csv
console.error('You must define a shell variable named WORKCSV as a base directory for the .csv files.');
process.exit(1);
}
if ( !process.env.WORKLEDGER ) { // $WORKFIN/ledger
console.error('You must define a shell variable named WORKLEDGER as a base directory for the .ledger files.');
process.exit(1);
}
if ( !process.env.WORKBAK) { // $WORK/workbak
console.error('You must define a shell variable named WORKBAK as a backup directory for storing deleted files.');
process.exit(1);
}
const WORKDB = process.env.WORKDB; // base directory for .sqlite db files
if (!fs.existsSync(WORKDB)) { fs.mkdirSync(WORKDB); }
const WORKCSV = process.env.WORKCSV; // base directory for .csv files
if (!fs.existsSync(WORKCSV)) { fs.mkdirSync(WORKCSV); }
const WORKLEDGER = process.env.WORKLEDGER; // base directory for .ledger files
const WORKBAK = process.env.WORKBAK; // base directory for storing deleted files
const DB_DEFAULT = 'workfin'; // default sqlite db name
const db_file = options.attach ? options.attach : // use provided option for attaching
options.delete ? options.delete : // use provided option for deletion
DB_DEFAULT; // if no provided option, use the default name
const db_path = path.format({
dir: WORKDB,
name: db_file,
ext: '.sqlite'
});
console.log(`db_path: ${db_path}`);
const csv_path = path.format({
dir: WORKCSV,
name: db_file,
ext: '.csv'
});
console.log(`csv_path: ${csv_path}`);
/*---DELETE--*/
if (options.hasOwnProperty('delete')) {
const WORKBAK_DB = path.format({
dir: WORKBAK,
name: 'db'
});
if (!fs.existsSync(WORKBAK_DB)) {
fs.mkdirSync(WORKBAK_DB, {recursive: true});
}
const WORKBAK_CSV= path.format({
dir: WORKBAK,
name: 'csv'
});
if (!fs.existsSync(WORKBAK_CSV)) {
fs.mkdirSync(WORKBAK_CSV, {recursive: true});
}
const WORKBAK_LEDGER = path.format({
dir: WORKBAK,
name: 'ledger'
});
if (!fs.existsSync(WORKBAK_LEDGER)) {
fs.mkdirSync(WORKBAK_LEDGER, {recursive: true});
}
// Backup workfin.sqlite, workfin.csv
const db_path_bak = path.format({
dir: WORKBAK_DB,
name: db_file,
ext: `.sqlite.${Date.now()}`
});
const csv_path_bak = path.format({
dir: WORKBAK_CSV,
name: db_file,
ext: `.csv.${Date.now()}`
});
try {
fs.renameSync(db_path, db_path_bak);
console.error(`Renamed ${db_path} to ${db_path_bak}`);
fs.renameSync(csv_path, csv_path_bak);
console.error(`Renamed ${csv_path} to ${csv_path_bak}`);
} catch (err) {
if (err.code === 'ENOENT')
console.log(`file ${db_path} and/or ${csv_path} did not exist; ignoring.`);
else {
throw err;
}
}
// Backup all .csv files
try {
const files = fs.readdirSync(WORKCSV);
files.forEach(file => {
const db_csv_path_file = path.format({
dir: WORKCSV,
name: file
});
const db_csv_path_bak = path.format({
dir: WORKBAK_CSV,
name: file,
ext: `.${Date.now()}`
});
fs.renameSync(db_csv_path_file, db_csv_path_bak);
console.log(`Renamed ${db_csv_path_file} to ${db_csv_path_bak}`);
});
} catch (err) {
if (err.code === 'ENOENT') {
console.log(`${WORKCSV} or a file within it does not exist; ignoring.`);
} else {
throw err;
}
}
/* Ledger */
try {
const files = fs.readdirSync(WORKLEDGER);
files.forEach(file => {
if (!/zero/.test(file)) { // don't backup the zero ledger file
const ledger_file = path.format({
dir: WORKLEDGER,
name: file
});
const ledger_file_bak = path.format({
dir: WORKBAK_LEDGER,
name: file,
ext: `.${Date.now()}`
});
fs.renameSync(ledger_file, ledger_file_bak);
console.log(`Renamed ${ledger_file} to ${ledger_file_bak}`);
}
});
} catch (err) {
if (err.code === 'ENOENT') {
console.log(`${WORKLEDGER} or a file within it does not exist; ignoring.`);
} else {
throw err;
}
}
process.exit(0);
}
/*--ATTACH--*/
// attach in all situations except --delete
console.log('attaching...');
const db = new sqlite3.Database(db_path, err => {
if (err) {
return console.error(`Error opening database file ${db_path}: ${err.message})`);
}
console.log(`Successfully attached to database file ${db_path}`);
});
db.serialize();
db.run(`CREATE TABLE IF NOT EXISTS
<<define-usb-table-schema>>`);
// year is needed for checks, so define it here
const [acct, year] = options.csv || []; // guard against a missing --csv option
if (!year) {
console.error('ERROR: year is not defined.');
process.exit(1);
}
const wl_year = parseInt(year, 10);
After the Sqlite3 database is exported and converted by Ledger, there are numerous individual entries that need to be converted but for which there is no real data available to help. An example would be checks. The check information is located in the worklog, so one solution is to parse the worklog to obtain check information, then parse the Ledger file to update it.
I created a package called wlparser
that can be used to find particular
elements in the worklog.<year>.otl
files. Among other things, this package
is set up to find and return all checks for a particular year. At this point,
the findChecks
program requires an argument for a <year>
, and will then
gather all checks and place them into the SQLite3 database in a table called
checks
. It can be run any number of times for any year and will not add
duplicate entries.
When this csv-sqlite3
program converts the CSV files into the ledger
format, it can do a search for a particular check number in this SQLite3
checks
table and incorporate that data into the ledger file.
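A minimal sketch of that lookup (assuming the checks table defined earlier and the node-sqlite3 db handle opened by this program; the lookupCheck helper is hypothetical):

// hypothetical helper: pull stored check details for a check number found in the CSV data
function lookupCheck(db, checkno, cb) {
  db.get(
    'SELECT payee, subject, caseno, amount FROM checks WHERE checkno = ?',
    [checkno],
    (err, row) => cb(err, row) // row is undefined when the check is not in the table
  );
}

// usage while converting a record whose payee is "(1001) check"
lookupCheck(db, '1001', (err, check) => {
  if (err) return console.error(err.message);
  if (check) console.log(`check 1001 was to ${check.payee} for case ${check.caseno}`);
});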
Use the commandline program find-checks <year>
to find and store checks from
the worklog yearly file identified by <year>
option (i.e.,
worklog.<year>.otl
) and save it into the SQLite3 database <db>
in a table
called checks
.
/* find-checks.js */
/* USAGE:
* find-checks <year> [db]
*/
const DEFAULT_DB = 'workfin';
const TABLE = 'checks';
const EXT = '.sqlite';
// make sure WORKDB is defined
if (typeof process.env.WORKDB === 'undefined') {
console.error('Must define the environment variable WORKDB');
process.exit(1);
}
const WORKDB = process.env.WORKDB;
// make sure a <year> argument is included
if (process.argv.length < 3) {
console.error('Must include a <year> argument: "find-checks <year>"');
process.exit(1);
}
// make sure the <year> argument is a number
const wlyear = parseInt(process.argv[2],10);
if (isNaN(wlyear)) {
console.error(`The <year> argument: "${process.argv[2]}" must be a year, e.g., "2016"`);
process.exit(1);
}
// second optional argument is the name of the database, without extension
// if no second argument, use default db of $WORKDB/workfin.sqlite
const path = require('path');
const db_path = path.format({
dir: WORKDB,
name: `${process.argv[3] || DEFAULT_DB}`,
ext: EXT
});
// Everything is a go; load the wlparser, wlchecks, sqlite3 modules
const {WLChecks} = require('wlparser');
const wlchecks = new WLChecks(wlyear);
const sqlite3 = require('sqlite3').verbose(); // remove verbose() for production code
const CHECKS_COLS = [
'acct',
'checkno',
'date',
'payee',
'subject',
'purpose',
'caseno',
'amount'
];
let statement;
// Load the sqlite3 database
const db = new sqlite3.Database(db_path, err => {
if (err) {
console.error(`Database Error: ${err}`);
process.exit(1);
}
console.log(`Successfully opened database at ${db_path}`);
});
db.serialize();
statement = `CREATE TABLE IF NOT EXISTS
<<define-checks-table-schema>>`;
db.run(statement);
let cols = CHECKS_COLS.join(','); // create string of column names for INSERT statement
let values = CHECKS_COLS.map(col => `$${col}`).join(', '); // create string of placeholders for INSERT statement
statement = `INSERT INTO ${TABLE} (${cols}) VALUES (${values})`;
let all_checks = []; // used to filter out already-entered checks
wlchecks.on('check', data => {
delete data.type; // simply don't need this property
if (!all_checks.includes(data.checkno)) { // filter out already-entered checks
const new_data = {};
for (const k in data) { // create the named parameters of form 'new_data = {$checkno: 1234}'
new_data[`$${k}`] = data[k];
}
db.run(statement, new_data, (err) => { // add the check data to the sqlite database
if (err) console.error(`ERROR: ${err}`);
});
}
}).on('checked', () => {
db.close();
}).on('error', err => {
console.error(`ERROR: ${err}`);
process.exit(1);
});
// load all of the previously-entered checks into the array 'all_checks'
db.all(`SELECT checkno FROM ${TABLE} WHERE date LIKE '${wlyear}%'`, [], (err, check_data) => {
if (check_data) {
all_checks = check_data.map(row => row.checkno);
}
wlchecks.findChecks(); // start the stream running
});
{{{heading(Create Symbolic Link into WORKBIN)}}}
This code creates a symbolic link into the WORKBIN
directory for the command
line program find-checks
so that it can be run as a script. Note that the
link must be symbolic in order for the node_modules/wlparser/lib/*
files to
be found when find-checks
is run from the WORKBIN
directory.
ln -sf $PWD/find-checks.js $WORKBIN/find-checks
Once data has been downloaded from the bank’s web site and imported into the sqlite3 database, it must be converted into a Ledger file. This is accomplished using the {{{option(–export)}}} option.
To export the SQLite3 database data to CSV and Ledger files, use the {{{option(–export)}}} option:
--export [<csv-file>=workfin.csv] --csv <acct> <year>
The csv filename is optional, with the default being the same as the db file
(i.e., workfin
), with the extension .csv
, (i.e., workfin.csv
) in the
WORKCSV
directory. In order to facilitate matching an account to reconcile
against, the --export
option must be accompanied by the designation of a bank
account, e.g., 6815
and a year, e.g., 2016
. These will filter the data
that is exported from the sqlite
data file and will become the account to
reconcile against.
The export is accomplished by executing in a child process the command line program:
sqlite3 -header -csv db sql-statements
The child process runs the {{{command(sqlite3)}}} command, and connects the
STDOUT
stream to the target CSV file. Since the entire database contents for
a given year and account are exported, the output will truncate the target CSV
file upon opening it for writing. The program will halt after the export.
Upon an export of the SQLite3 data to a CSV file, the program will also send
the exported data through the Ledger {{{command(convert)}}} command and into
the workfin.ledger
data file.[fn::Refer to the Ledger manual at Sec. 7.2.1.2
for the {{{command(convert)}}} command.] This file is located in the ledger/
directory below the workfin
directory.
The {{{command(convert)}}} command uses a ledger file filled with ledger
directives to associate payees with accounts. If this directives file does not
exist, then it will be created.
/*--EXPORT--*/
if (options.hasOwnProperty('export')) {
const { spawnSync } = require('child_process');
const export_csv = options['export'] || db_file;
console.log(`exporting to ${export_csv}...`);
const export_csv_dir = WORKCSV;
if (!fs.existsSync(export_csv_dir)) {
fs.mkdirSync(export_csv_dir);
console.log(`Created ${export_csv_dir}`);
}
const export_csv_path = path.format({
dir: export_csv_dir,
name: export_csv,
ext: '.csv'
});
// --export must be accompanied by --csv <acct> <year> of the proper values
if (!options.hasOwnProperty('csv')) {
console.error('Export must be accompanied by a bank account (e.g., 6815), and a year (e.g., 2016)');
process.exit(1);
}
const _acct = options.csv[0],
_year = options.csv[1];
if (!(Object.keys(DB_ACCTS).includes(_acct) && DB_YEARS.includes(_year))) {
console.error(`Invalid values for acct: ${_acct} or year: ${_year}`);
process.exit(1);
}
const usb_acct = `usb_${_acct}`;
//'as' - Open file for appending in synchronous mode. The file is created if it does not exist.
let fd = fs.openSync(export_csv_path,'as');
const size = fs.statSync(export_csv_path).size;
const header = size === 0 ? 'header' : 'noheader';
console.log(`export_csv_path: ${export_csv_path}`);
const sql = `
SELECT ${EXPORT_DB_COLS.join(',')}
FROM usb
WHERE acct = '${usb_acct}' and date like '${_year}%';`;
console.log(`sql: ${sql}`);
let ret = spawnSync(
'sqlite3',
[
db_path,
'-csv',
`-${header}`,
sql,
],
{
encoding: 'utf-8',
stdio: [0,fd,2]
}
);
if (ret.error) {
console.log(`status: ${ret.status}\tsignal: ${ret.signal}`);
console.log(`error: ${ret.error}`);
}
console.log('done exporting');
fs.closeSync(fd);
/* CONVERT CSV TO LEDGER */
const ledger_dir = WORKLEDGER;
const ledger_path = path.format({
dir: ledger_dir,
name: export_csv,
ext: '.exported.ledger'
});
const zero_file = path.format({
dir: ledger_dir,
name: 'zero',
ext: '.ledger'
});
if (!fs.existsSync(ledger_dir)) {
fs.mkdirSync(ledger_dir);
}
//const l_file = fs.existsSync(ledger_path) ? ledger_path : zero_file;
const l_file = zero_file;
console.log(`converting: ${export_csv_path} to ledger_path: ${ledger_path}`);
fd = fs.openSync(ledger_path, 'as'); // 'as' - Open file for appending in synchronous mode.
// The file is created if it does not exist.
ret = spawnSync(
'ledger',
[
'convert',
`${export_csv_path}`,
'--invert',
'--input-date-format=%Y-%m-%d',
`--account=Assets:${DB_ACCTS[_acct]}`,
'--rich-data',
`--file=${l_file}`,
`--now=${(new Date()).toISOString().split('T')[0]}`,
],
{
encoding: 'utf-8',
stdio: [0,fd,2],
}
);
if (ret.error) {
console.log(`status: ${ret.status}\tsignal: ${ret.signal}`);
console.log(`error: ${ret.error}`);
}
fs.closeSync(fd);
process.exit(0);
}
/*--DON'T CONTINUE UNLESS --csv OPTION USED--*/
if (!options.hasOwnProperty('csv'))
process.exit(0);
The Zero Ledger File is a ledger file with an opening balance, a list of
accounts, and directives that associate payees with accounts. It is used by
the ledger {{{command(convert)}}} command to prepare a ledger file, create an
initial set of accounts, and parse a CSV file into the ledger format.
{{{heading(List of Accounts)}}}
- Expenses—where money goes
- Assets—where money sits
- Income—where money comes from
- Liabilities—where money is owed
- Equity—where value is
Beneath these top levels, there can be any level of detail required.
{{{heading(Allowable Accounts)}}}
Here are defined some allowable accounts:
account Expenses
account Assets
account Income
account Liabilities
account Equity
Use the {{{option(--strict)}}} option to show incompatible accounts.
{{{heading(Opening Balances)}}}
The first entry is a set of opening balances. It will look like this:
2016/01/01 * Opening Balance
Expenses $0
Assets:USB:Personal 6151 $0
Assets:USB:Business 6815 $0
Assets:USB:Trust 6831 $0
Assets:USB:Savings $0
Income $0
Liabilities $0
Equity:Opening Balance
{{{heading(Directives and Subdirectives)}}}
The Zero file uses two directives, each of which uses a sub-directive, of the form:
payee <PAYEE>
    alias </PAYEE_REGEX/>
account <FULL:ACCOUNT>
    payee </PAYEE_REGEX/>
In the above, the first line rewrites the payee
field to establish a
legitimate payee. The alias
is a regex; anything that matches this directive
will be turned into the associated payee
. The second line uses an account
and a payee
directive to specify the proper account
. Anything that matches
the payee
regex will be assigned the account.
payee USPS
alias usps
account Expenses:Office:Postage
payee ^(USPS)$
payee Staples
alias staples
payee Ikea
alias ikea
payee Portland Art Museum
alias portland art
payee The Energy Bar
alias energy bar
account Expenses:Office:Supplies
payee ^(Staples|Portland Art Museum|Ikea|The Energy Bar)$
payee City of Portland
alias city of portland
account Expenses:Business:Travel
payee ^(City of Portland)$
payee RingCentral
alias ringcentral
payee AT&T
alias (at&?t)
payee CenturyLink
alias (centurylink|ctl)
payee NameBright
alias namebright
account Expenses:Business:Communication
payee ^(RingCentral|AT\&T|CenturyLink|NameBright)$
payee AVVO
alias avvo
account Expenses:Business:Advertising
payee AVVO
payee National Law Foundation
alias national law fou
payee Coursera
alias coursera
payee EdX Inc.
alias edx
account Expenses:Professional:CLE
payee ^(National Law Foundation|Coursera|EdX Inc.)$
payee State Bar of CA
alias state bar of ca
account Expenses:Professional:License
payee ^(State Bar of CA)$
payee Costco Gas
alias costco gas
account Expenses:Office:Transportation
payee ^(Costco Gas)$
payee American Express
alias amex
payee Citi
alias citi
account CC:Payment
payee ^(American Express|Citi)$
payee Apple Store
alias apple
payee Radio Shack
alias radioshack
account Expenses:Office:Supplies
payee ^(Apple Store|Radio Shack)$
payee State Bar WA
alias interest paid this period
account Trust:LTAB
payee State Bar WA
payee State Bar WA
alias ltab
account Trust:LTAB
payee State Bar WA
account Assets:Personal
payee usb_6151
{{{heading(An include File)}}}
Finally, include a file with an Accounts:Payable
Category:
include accounts_payable.ledger
The accounts_payable.ledger
file contains any outstanding accounts that
should be included to make the inputted data correct, such as a set of
outstanding invoices:
2016/01/18 * Clark County Indigent Defense ; Invoice No.s 092--099
Assets:Accounts Receivable $1852.50
Income:120703 -$ 82.50 ; Invoice No 092
Income:140707 -$ 525.00 ; Invoice No 093
Income:140709 -$ 397.50 ; Invoice No 094
Income:150701 -$ 15.00 ; Invoice No 095
Income:150704 -$ 742.50 ; Invoice No 096
Income:150705 -$ 7.50 ; Invoice No 097
Income:150706 -$ 7.50 ; Invoice No 098
Income:150707 -$ 75.00 ; Invoice No.099
{{{noindent}}} all of which will be used in this project. The pattern is to open a CSV file, parse the CSV string into records, and pipe those records through the transformer to be massaged into shape. From there the new data is saved in another CSV file and also in an SQLite3 database using the sqlite3 module.
The processing of a CSV file, therefore, involves the following steps and Node.js modules (a compact sketch of the full pipeline follows this list):
- Find the correct CSV file (using FileSystem) and open it as a Readable Stream;
- Open a new CSV file to hold the new transformed data as a Writable Stream;
- Open an SQLite3 database to hold the new transformed data;
- Read the CSV records from the file as a string (using StreamReader);
- Parse the string into JS records (using CSV-Parse);
- Transform the JS records into usable data (using CSV-Transform);
- Save the new data in the new CSV file (using StreamWriter);
- Save the new data in an SQLite3 database (using SQLite3)
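Here is a compact sketch of how those pieces can be wired together with the csv package's stream interfaces; the file paths and the pass-through transform callback are placeholders, and the project's actual wiring, with logging and the SQLite inserts, follows in the next sections.

// pipeline sketch: read raw CSV -> parse -> transform -> stringify -> write new CSV
const fs = require('fs');
const csv = require('csv');

const parser = csv.parse({ columns: true });          // first line becomes the column names
const transformer = csv.transform(record => record);  // placeholder; real clean-up goes here
const stringifier = csv.stringify({ header: true });

fs.createReadStream('usb_6815--2016.csv')              // hypothetical source file
  .pipe(parser)
  .pipe(transformer)
  .pipe(stringifier)
  .pipe(fs.createWriteStream('usb_6815__2016.csv'));   // hypothetical destination file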
This section receives the transformed records from the Transform function and
writes them to new csv files. The new csv files will be located in the
WORKCSV
directory, e.g., WORK/workfin/csv
. A file will be called, for
example, usb_6815__2016.csv
. Notice that this file name uses two
underscores, whereas the source files use two dashes; in all other respects,
they are the same.
const stringifier = csv.stringify({
header: true,
columns: DB_COLS,
});
const usb_acct = `usb_${acct}`;
const usb_acct_year = `${usb_acct}__${year}.csv`;
const csv_path_file = path.join(WORKCSV, usb_acct_year);
console.log(`CSV PATH FILE: ${csv_path_file}`);
let csv_stringifier;
try {
csv_stringifier = fs.createWriteStream(csv_path_file);
console.log(`WRITE STREAM: ${csv_path_file} has been successfully opened.`);
} catch (err) {
console.error(err.message);
process.exit(1);
}
stringifier.on('readable', function() {
console.log('stringifier is now readable');
let row;
while (row = this.read()) {
console.log(`stringifier row: ${row}`);
csv_stringifier.write(row);
}
});
stringifier.on('error', function(err) {
console.error(err.message);
});
stringifier.on('finish', function() {
console.log('stringifier is done writing to csv_stringifier');
csv_stringifier.end('stringifier called csv_stringifier\'s "end" method');
});
stringifier.on('close', function() {
console.log('stringifier is now closed');
});
csv_stringifier.on('close', function() {
console.log('csv_stringifier is now closed');
});
The Transformer is a Node.js Transform Stream. This means it is capable of
both reading and writing data. In this project, records are written to it in
the CSV Parser section, and then read back here via its transformer.read()
method. The transformer object is created by passing a callback function to
csv.transform(); the callback's purpose is to transform each record that is
read. This is the heart of this project. The transform function is implemented
in the following section, and returns completely transformed CSV bank records.
These transformed records are then written to both a new CSV file and the
SQLite3 database.
After it transforms the data, the transformer receives the new data via a
readable
event, where it can process the data.
The transformed data will also be saved into the SQLite3 database via an
INSERT
statement executed by the db.run()
method.
The Transform Function receives a record and massages it into shape. The following regular expressions were created based upon inspection of the raw data as it came from the bank for years 2016, 2017, and 2018. It does a decent job of creating readable payees and memos, as well as txfrs (transfers), but it has not been set up to do anything for check payees, categories or related records, for example.
const transform_function = function (record) {
const DEBIT = 'debit';
const CREDIT = 'credit';
const CHECK = 'check';
const CASH = 'cash';
const DEPOSIT = 'deposit';
const UNKNOWN = 'unknown';
const TRANS = 'transfer';
const USBANK = 'usbank';
let tofrom = ''; // 'to' or 'from', set below when describing transfers
// Add new columns: acct, checkno, txfr, caseno, desc1, desc2, category
record.acct = usb_acct;
record.checkno = null; // check no.
record.txfr = null; // direction and acct #
record.caseno = null; // related case foreign key
record.desc1 = null; // noun
record.desc2 = null; // adjective
record.category= null; // categorization of the transaction
// Format date as yyyy-mm-dd; delete original Date
record.date = new Date(record['Date']).toISOString().split('T')[0];
delete record['Date'];
// Change Transaction to trans; delete original Transaction
record.trans = record['Transaction'].toLowerCase();
delete record['Transaction'];
// Change Amount to amount as Currency type; delete original Amount
record.amount = accounting.formatMoney(record['Amount']);
delete record['Amount'];
// Change Name to payee; keep original Name as OrigName; delete Name
record.payee = record['Name'].toLowerCase().trimRight();
record.OrigPayee = record['Name'];
delete record['Name'];
// Clean up Memo by removing Download message; return as note; keep Memo as OrigMemo
let re = new RegExp('Download from usbank.com.\\s*');
record.note = record['Memo'].replace(re,'').toLowerCase();
record.OrigMemo = record['Memo'];
delete record['Memo'];
// Add check no. to checkno column
if (record.payee === CHECK) {
const checkno = record.trans.replace(/^0*/,'');
record.checkno = checkno;
record.trans = DEBIT;
record.payee = `(${record.checkno}) check`;
record.note += `Purchase by check no. ${checkno}`;
record.desc1 = 'purchase';
record.desc2 = 'check';
}
if (record.payee.match(/(returned) (item)/)) {
record.desc1 = RegExp.$2;
record.desc2 = RegExp.$1;
record.payee = USBANK;
record.note = `${record.desc2} ${record.desc1}`;
}
if (record.payee.match(/(internet|mobile) (banking) transfer (deposit|withdrawal) (\d{4})\s*$/)) {
record.desc1 = RegExp.$3;
record.desc2 = RegExp.$1;
record.txfr = `${(RegExp.$3 === 'deposit') ? '<' : '>'} usb_${RegExp.$4}`;
tofrom = (record.trans === 'debit') ? 'to' : 'from';
record.payee = (record.trans === 'debit') ? `usb_${RegExp.$4}` : `usb_${options.csv[0]}`;
record.note = `${record.desc2} ${record.desc1}: ${TRANS} ${tofrom} ${record.note}`;
if (/>/.test(record.txfr)) {
record.payee = `Transfer to ${record.payee} from ${record.acct}`;
} else {
record.payee = `Transfer to ${record.payee} from usb_${RegExp.$4}`;
}
}
if (record.payee.match(/debit (purchase)\s*-?\s*(visa)? /)) {
record.desc1 = RegExp.$1;
record.desc2 = RegExp.$2;
record.payee = record.payee.replace(RegExp.lastMatch,'');
record.note = `${record.desc2} ${record.desc1} ${record.note}`.trimLeft();
}
// Removed ELECTRONIC WITHDRAWAL for payment to State Bar of CA
if (record.payee.match(/^.*(state bar of ca)/)) {
record.payee = RegExp.$1;
}
// web authorized payment
// atm|electronic|mobile check|rdc deposit|withdrawal <name>
if (record.payee.match(/(web authorized) (pmt) |(atm|electronic|mobile)?\s*(check|rdc)?\s*(deposit|withdrawal)\s*(.*)?/)) {
tofrom = '';
record.desc1 = RegExp.$2 ? RegExp.$2 : RegExp.$4 ? RegExp.$4 : RegExp.$5 ? RegExp.$5 : 'undefined';
record.desc2 = RegExp.$1 ? RegExp.$1 : RegExp.$3 ? RegExp.$3 : 'undefined';
if (RegExp.$3 === 'atm' || RegExp.$3 === 'electronic' || RegExp.$3 === 'mobile' || RegExp.$5 === DEPOSIT) {
record.payee = (RegExp.$5 === 'deposit') ? `usb_${options.csv[0]}` : CASH;
} else {
record.payee = record.payee.replace(RegExp.lastMatch,'');
}
if (record.note.match(/paypal/) && record.trans === CREDIT) {
record.txfr = `< ${RegExp.lastMatch}`;
tofrom = ' from';
}
record.note = `${record.desc2} ${record.desc1}${tofrom} ${record.note}`.trimRight();
}
if (record.payee.match(/(zelle instant) (pmt) (from (\w+\s\w+))\s(.*)$/)) {
record.desc1 = RegExp.$2;
record.desc2 = RegExp.$1;
record.note = `${record.desc2} ${record.desc1} ${RegExp.$3}`;
record.payee = `usb_${options.csv[0]}`;
}
if (record.payee.match(/(overdraft|international) (paid|processing) (fee)/)) {
record.desc1 = RegExp.$3;
record.desc2 = `${RegExp.$1} ${RegExp.$2}`;
record.payee = USBANK;
record.note = `${record.desc2} ${record.desc1} to ${record.payee}`;
}
record.payee = record.payee.replace(/\s*portland\s{2,}or$|\s*vancouver\s{2,}wa.*$/,'');
record.note = record.note.replace(/\s*portland\s{2,}or$|\s*vancouver\s{2,}wa.*$/,'');
record.payee = record.payee.replace(/\s\d{3}\w+\s{2,}or$/,''); // Nike Company 019Beaverton OR
record.note = record.note.replace(/\s\d{3}\w+\s{2,}or$/,'');
record.payee = record.payee.replace(/\s*[-\d]{5,}\s*\w{2}$/,''); // '650-4724100 CA' & ' 855-576-4493WA' & ' 800-3333330 MA'
record.note = record.note.replace(/\s*[-\d]{5,}\s*\w{2}$/,'');
record.payee = record.payee.replace(/(\s\w*https)?www.*$/,''); // WWW.ATT.COM TX; UDEMY ONLINE COUHTTPSWWW.UDECA
record.note = record.note.replace(/(\s\w*https)?www.*$/,'');
record.payee = record.payee.replace(/\s*\w+\.com\s+\w{2}$/, '');
record.note = record.note.replace( /\s*\w+\.com\s+\w{2}$/, '');
record.payee = record.payee.replace(/aws.amazon.cWA/i,''); // serviaws.amazon.cWA
record.note = record.note.replace(/aws.amazon.cWA/i,'');
if (record.payee.match(/(bostype \/ wes bo)(hamilton\s+on)/)) { // WES BOHAMILTON ON
record.payee = 'Wes Bos';
record.note = record.note.replace(RegExp.$1,'Wes Bos');
record.note = record.note.replace(RegExp.$2, '');
}
record.payee = record.payee.replace(/\s{2,}/g,' ');
record.note = record.note.replace(/\s{2,}/g,' ');
/*
'DEBIT PURCHASE -VISA SQ *PHIL 877-417-4551WA'
You paid Phil $159 for Atreus keyboard kit and shipping
It is for a credit card processor that goes by the brand name
Square Up. Merchants can run credit card transactions through
their iPhone or iPads using the Square Up services. Mine was for
a taxi ride. https://800notes.com/Phone.aspx/1-877-417-4551
*/
record.payee = record.payee.replace(/sq/, 'square');
record.note = record.note.replace(/sq/, 'square');
return record;
}
const transformer = csv.transform(transform_function);
/* TRANSFORMER reads records through its TRANSFORM_FUNCTION */
/* -------------------------------------------------------- */
transformer.on('readable', function() {
let record;
while ((record = transformer.read())) {
console.log(`Transformer record:\n${util.inspect(record)}`);
/* STRINGIFIER WRITE Records */
/* ------------------------- */
stringifier.write(record);
/* DB RUN---INSERT RECORDS */
/* ----------------------- */
const tab_name = DB_TABLES['usb'];
const col_names = DB_COLS.join(',');
const col_phs = DB_COLS.map(c => '?').join(',');
const col_values= DB_COLS.map(c => record[c]);
let sql = `INSERT INTO ${ tab_name }( ${ col_names } )
VALUES ( ${ col_phs } )`;
console.log(`sql: ${ sql }`);
console.log(`col_values: ${ col_values }`);
db.run(sql, col_values, (err) => {
if (err) {
console.error(err.message);
console.error(`ERROR sql: ${ sql }`);
console.error(`ERROR values: ${ col_values }`);
process.exit(1);
}
});
}
});
transformer.on('error', function(err) {
console.error(err.message);
});
transformer.on('finish', function() {
console.log('Transformer finished writing records.');
});
transformer.on('end', function() {
console.log('Transformer end reached.');
stringifier.end();
});
This section implements the csv parser. By default, it does little other than
read a large string of data and parse it into an array of records. By giving
it the option columns = true
, however, the parser will use the first line as
a list of column headings, and produce an array of objects where the keys are
column names, and the values are column entries. Each record is written to the
stream transformer using its WRITE
method.
<<csv-transformer-write-method>>
const parser = csv.parse({columns: true});
const records = [];
parser.on('readable', function() {
console.log('Parser beginning to read records.');
let record;
/* PARSE A RECORD AND WRITE TO THE TRANSFORMER */
while ((record = parser.read())) {
console.log(`parser record:\n${util.inspect(record)}`);
transformer.write(record);
}
});
parser.on('error', function(err) {
console.error(err.message);
});
parser.on('end', function() {
console.log('Parser finished reading records.');
});
parser.on('finish', function () {
console.log('Parser finished writing records.');
console.log('Parser calling transformer end');
transformer.end();
});
This section implements the Stream Reader that reads the CSV file in as a large
string of data and sends it to the csv parser via the parser’s write
method.
CSV financial files are found in the directories $WORKUSB/usb_[6815|6831]/yyyy,
where yyyy can be 2004-2019, and on. Given [6815|6831] and a year
[2004|2005...2019], the file path will be
$WORKUSB/usb_6815/yyyy/usb_6815--yyyy.csv. This code makes sure the file exists
and the user has proper permissions to read it before proceeding.
if (options.csv) {
// const acct = options.csv[0],
// year = options.csv[1];
// const usb_acct = `usb_${acct}`;
if (!process.env.WORKUSB) {
console.error('You must assign a path to the shell variable WORKUSB');
process.exit(1);
}
const acct_year_path = `${process.env.WORKUSB}/${usb_acct}/${year}`;
const acct_year_csv_file = `${usb_acct}--${year}.csv`;
const acct_year_csv_file_path = `${acct_year_path}/${acct_year_csv_file}`;
if (!fs.existsSync(acct_year_csv_file_path) || !(fs.accessSync(acct_year_csv_file_path, fs.constants.R_OK) === undefined)) {
console.error(`Cannot find or access the CSV file at '${acct_year_csv_file_path}'.`);
process.exit(1);
}
console.log(`Successfully found the CSV file: '${acct_year_csv_file_path}'`);
/* CREATE THE STREAM HERE */
const csv_file_stream = fs.createReadStream(acct_year_csv_file_path, {encoding: 'utf8'});
/* Set up streaming events 'READABLE', 'ERROR', and 'END' */
csv_file_stream.on('readable', function () {
let record;
/* READ THE RECORDS */
while ((record = this.read())) {
console.log(`readable record: ${record}`);
/* WRITE A RECORD TO THE PARSER */
parser.write(record);
}
});
csv_file_stream.on('error', function(err) {
console.error(err.message);
});
csv_file_stream.on('end', function () {
console.log('Reader finished reading data.');
parser.end(); // all data has been read; signal the parser that no more writes are coming
});
}
var sqlite3 = require('sqlite3').verbose();
var db = new sqlite3.Database(':memory:');
db.serialize(function() {
db.run("CREATE TABLE lorem (info TEXT)");
var stmt = db.prepare("INSERT INTO lorem VALUES (?)");
for (var i = 0; i < 10; i++) {
stmt.run("Ipsum " + i);
}
stmt.finalize();
db.each("SELECT rowid AS id, info FROM lorem", function(err, row) {
console.log(row.id + ": " + row.info);
});
});
db.close();
- Straightforward query and parameter binding interface
- Full Buffer/Blob support
- Extensive debugging support
- Query serialization API
- Extension support
- Big test suite
- Written in modern C++ and tested for memory leaks
- Bundles Sqlite3 3.26.0 as a fallback if the installing system doesn’t include SQLite
node-sqlite3
has built-in function call serialization and automatically waits
before executing a blocking action until no other action is pending. This
means that it’s safe to start calling functions on the database object even if
it is not yet fully opened. The Database#close()
function will wait until
all pending queries are completed before closing the database.
node-sqlite3
provides two functions to help control the execution flow of
statements. The default mode is to execute statements in parallel. However,
the Database#close
method will always run in exclusive mode, meaning it
waits until all previous queries have completed and node-sqlite3
will not run
any other queries while a close
is pending.
Calling Database#serialize()
within nested functions is safe:
// Queries scheduled here will run in parallel.
db.serialize(function() {
// Queries scheduled here will be serialized.
db.serialize(function() {
// Queries scheduled here will still be serialized.
});
// Queries scheduled here will still be serialized.
});
// Queries scheduled here will run in parallel again.
Note that queries scheduled not directly in the callback function are not necessarily serialized:
db.serialize(function() {
// These two queries will run sequentially.
db.run("CREATE TABLE foo (num)");
db.run("INSERT INTO foo VALUES (?)", 1, function() {
// These queries will run in parallel and the second query will probably
// fail because the table might not exist yet.
db.run("CREATE TABLE bar (num)");
db.run("INSERT INTO bar VALUES (?)", 1);
});
});
If you call it without a function parameter, the execution mode setting is
sticky and won’t change until the next call to Database#parallelize
.
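For example, a minimal sketch using the db object from the example above:

db.serialize();                          // no callback: serialized mode stays in effect
db.run("CREATE TABLE baz (num)");        // runs before the INSERT below
db.run("INSERT INTO baz VALUES (?)", 1);
db.parallelize();                        // switch back to parallel execution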
Calling Database#parallelize()
within nested functions is safe:
db.serialize(function() {
// Queries scheduled here will be serialized.
db.parallelize(function() {
// Queries scheduled here will run in parallel.
});
// Queries scheduled here will be serialized again.
});
If you call it without a function parameter, the execution mode setting is
sticky and won’t change until the next call to Database#serialize
.
- Ledger manual sec. 7.2.1.2
The {{{command(convert)}}} command parses a comma separated value (csv) file and prints Ledger transactions. Importing csv files is a lot of work, but is very amenable to scripting.
$ ledger convert download.csv --input-date-format "%m/%d/%Y"
{{{heading(Fields Descriptors)}}}
Your bank’s csv files will have fields in different orders from other banks, so there must be a way to tell Ledger what to expect. Ledger expects the first line to contain a description of the fields on each line of the file.
- Insert a line at the beginning of the csv file that describes the fields to Ledger.
- The fields ledger can recognize contain these case-insensitive strings:
- date
- posted
- code
- payee
- desc / description
- amount
- cost
- total
- note
- the Input Date Format option tells ledger how to interpret the dates: --input-date-format DATE_FORMAT
  - e.g., "%m/%d/%Y"
- Metadata
- If there are columns in the bank data you would like to keep in your ledger data, besides the primary fields described above, you can name them in the field descriptor list and Ledger will include them in the transaction as meta data if it doesn’t recognize the field name.
- CSV
- original line from the csv file
- Imported
- date data imported
- UUID
- unique checksum; if an entry with the same ‘UUID’ tag is already
included in the normal ledger file (specified via {{{option(–file
FILE (-f))}}} or via the environment variable
LEDGER_FILE
) this entry will not be printed again.
{{{heading(Convert Command Options)}}}
The convert command accepts the following options:
- --invert :: inverts the amount field
- --auto-match :: automatically matches an account from the Ledger journal for every CSV line
- --account STR :: use to specify the account to balance against
- --rich-data :: stores additional tag/value pairs
- --file (-f) :: the normal ledger file (could also be specified via the environment variable LEDGER_FILE)
- --input-date-format :: tells ledger how to interpret the dates
{{{heading(Command Directives)}}}
You can also use {{{command(convert)}}} with payee
and account
directives.
- Directives
  a. directive :: Command directives must occur at the beginning of a line.
  b. account directive :: Pre-declare valid account names. This only has an effect if {{{option(--strict)}}} or {{{option(--pedantic)}}} is used. The account directive supports several optional sub-directives, if they immediately follow the account directive and if they begin with whitespace:
     account Expenses:Food
       note This account is all about the chicken!
       alias food
       payee ^(KFC|Popeyes)$
       check commodity == "$"
       assert commodity == "$"
       eval print("Hello!")
       default
  c. payee sub-directive :: The payee sub-directive of the account directive, which can occur multiple times, provides regexes that identify the account if that payee is encountered and an account within its transaction ends in the name "Unknown".
     2012-02-27 KFC
       Expenses:Unknown    $10.00  ; Read now as "Expenses:Food"
       Assets:Cash
  d. alias sub-directive :: The alias sub-directive of the account directive, which can occur multiple times, allows the alias to be used in place of the full account name anywhere that account names are allowed.
  e. payee directive :: The payee directive supports two optional sub-directives, if they immediately follow the payee directive and, if on a successive line, begin with whitespace:
     payee KFC
       alias KENTUCKY FRIED CHICKEN
       uuid 2a2e21d434356f886c84371eebac6e44f1337fda
     The alias sub-directive of the payee directive provides a regex which, if it matches a parsed payee, causes the declared payee name to be substituted.
  f. alias directive :: Define an alias for an account name. The aliases are only in effect for transactions read in after the alias is defined and are affected by account directives that precede them. The alias sub-directive, which can occur multiple times, allows the alias to be used in place of the full account name anywhere that account names are allowed.
     alias Dining=Expenses:Entertainment:Dining
     alias Checking=Assets:Credit Union:Joint Checking Account
     2011/11/28 YummyPalace
       Dining    $10.00
       Checking
- First, you can use the payee directive and its alias sub-directive to rewrite the payee field based on some rules:
  payee Aldi
      alias ^ALDI SUED SAGT DANKE
- Then you can use the account directive and its payee sub-directive to specify the account:
  account Aufwand:Einkauf:Lebensmittel
      payee ^(Aldi|Alnatura|Kaufland|REWE)$
{{{subheading(Directive Example)}}}
payee Aldi
    alias ^ALDI SUED SAGT DANKE
account Aufwand:Einkauf:Lebensmittel
    payee ^(Aldi|Alnatura|Kaufland|REWE)$
Note that it may be necessary for the output of {{{command(ledger convert)}}}
to be passed through {{{command(ledger print)}}} a second time if you want to
match on the new payee
field. During the {{{command(ledger convert)}}} run,
only the original payee
name as specified in the csv data seems to be used.
In order to keep the package.json version number (see
create-CSV-SQLite3-project) in sync with this document's version number, I have
created a little script to update its version based upon the version macro's
current value. This macro is defined at the very top of this Org source file
just below the title and date. I update this version number after every
modification to this source file and before committing the change. This little
script will then be run whenever an installation occurs.
This script can be run by invoking the Makefile target update-version
. This
will checkout the dev
branch, tangle the update-version.sh
script into the
scripts/
directory, run it, amend the most recent commit to include the
updated version number, push the amended commit to Github, and finally create a
prod
branch (for production
), install all of the files and documentation,
and commit and push the prod
branch to Github. At this point, the package is
ready to be cloned from Github and contains the most recent version number, the
dependencies installed, and run.
This program is a little {{{command(sed)}}} script that modifies this Org
source file in-place (after creating and storing a backup of the source) by
copying the version number found in the macro and updating the version number
of the package.json
file. It runs very quickly.
sed -i .bak -E -e '
/\#\+macro: version Version/bx
/,\/\\"version\\":/by
b
:x
h
s/^(.*macro: version Version )(.*)$/\2/
x
b
:y
H
x
s/\n//
s/^([[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+)(.*)([[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+)/\2\1/
' CSV-SQLite3.org
SOURCE = CSV-SQLite3
ORG = $(SOURCE).org
TEXI = $(SOURCE).texi
INFO = $(SOURCE).info
PDF = $(SOURCE).pdf
DOCS = docs
SCRIPTS=scripts
.PHONY: clean clean-world clean-prod
.PHONY: tangle weave texi info pdf
.PHONY: install install-docs install-info install-pdf open-pdf docs-dir
.PHONY: update-dev update-prod checkout-dev checkout-prod
.PHONY: update-version tangle-update-version run-update-version
texi: $(TEXI)
$(TEXI): $(ORG)
emacs -Q --batch $(ORG) \
--eval '(setq org-export-use-babel nil)' \
--eval '(org-texinfo-export-to-texinfo)'
tangle: $(ORG)
emacs -Q --batch $(ORG) \
--eval '(org-babel-tangle-file "$(ORG)")'
info weave install-info: $(DOCS)/$(INFO)
$(DOCS)/$(INFO): $(TEXI) | docs-dir
makeinfo --output=$(DOCS)/ $(TEXI)
install: package.json
package.json: $(ORG) | docs-dir
emacs -Q --batch $(ORG) \
--eval '(require '\''ob-shell)' \
--eval '(require '\''ob-js)' \
--eval '(setq org-confirm-babel-evaluate nil)' \
--eval '(org-texinfo-export-to-info)'
mv $(INFO) $(DOCS)/
make install-pdf
install-docs: install-info install-pdf
pdf install-pdf: $(DOCS)/$(PDF)
$(DOCS)/$(PDF): $(TEXI) | docs-dir
pdftexi2dvi -q -c $(TEXI)
mv $(PDF) $(DOCS)/
open-pdf: $(DOCS)/$(PDF)
open $(DOCS)/$(PDF)
docs-dir: $(DOCS)
$(DOCS):
mkdir -vp docs
update-version: update-dev update-prod
checkout-dev:
git checkout dev
update-dev: checkout-dev run-update-version
git add -u
git commit --amend -C HEAD
git push origin +dev
checkout-prod: clean-world checkout-dev
git checkout -B prod
update-prod: checkout-prod install clean-prod
git add -A .
git commit -m "Branch:prod"
git push origin +prod
run-update-version: tangle-update-version
./$(SCRIPTS)/update-version.sh
mv -v $(ORG).bak $(WORKBAK)/$(ORG).$(shell date "+%s")
tangle-update-version: $(SCRIPTS)/update-version.sh
$(SCRIPTS)/update-version.sh: $(ORG)
emacs -Q --batch $(ORG) \
--eval '(search-forward ":tangle scripts/update-version.sh")' \
--eval '(org-babel-tangle '\''(4))'
clean:
-rm *~
clean-world: clean
-rm *.{texi,info,pdf,js,json,lock,log,bak}
-rm -rf LogReader
-rm -rf node_modules $(SCRIPTS) $(DOCS)
clean-prod: clean
-rm *.{texi,org} Makefile LogReader
-rm -rf node_modules
© {{{date(%Y)}}} {{{author}}}