« mtg-scraper2 » is a lightweight and tiny web scraper for magic the gathering, it allows you to easily gather data from different source of mtg events, based on Node.js & TypeScript.
While magic the gathering is quite of a mature game and a lot of resources for anything you can think of is available, I've found the part where you can scrap and aggregate data a bit underwhelming. My goal for MTGScraper is to provide a tool to scrap and digest data related to MTG from anywhere and to create whatever you want on top of it with more control over how you would do it by assembling functions and behaviors for your pipeline.
Given that JavaScript is quite popular and that it is quite easy to jump into, I felt that it was logical to use Node.js and its ecosystem to create the library. Though, overtime I wish to make the library usable outside of Node.js, using APIs that are shared across most popular JavaScript runtime as much as possible.
Also, TypeScript is overall easier to use in my opinion, providing more correctness, even with a build step, it is also probably easier to get started with for newcomers which makes it better to build the library into instead of pure JavaScript directly.
Platforms or data source where the data can be used from
- Magic Online
- Manatraders
- Magic Arena
- Magic.gg
npm i mtg-scraper2
In itself, « mtg-scraper2 » is a collection of functions and tools to build your own data pipeline, using Node.js as an execution environment.
The easiest way to start with it, is to create a tiny script to generate data of a given time. At the moment, only Magic online is available as a data source.
Given that we are using Node.js, you can easily create a long-running task that can update periodically your data. Or you can create a one-shot script that can be periodically called by a CRON job. Also, you can easily use it in serverless ecosystem such a Cloud functions, Cloud run, and much more.
/**
* Using TypeScript
*/
import { setInterval as every } from 'node:timers/promises';
import { MTGOTournamentScraper, MTGOTournamentParser } from 'mtg-scraper2';
/**
* You do not need to use IIFE with you're not in a CommonJS ecosystem
*/
(async () => {
const anHour = 60 * 60 * 1000;
for await (const _ of every(anHour)) {
const today = new Date();
try {
const recentTournaments = await MTGOTournamentScraper(today);
for (const tournament of recentTournaments) {
try {
// Check if the tournament has already been scraped
// Or scrape-parse it and save the date somewhere
} catch (err) {
console.error(`Issue when parsing ${ tournament }.\n${err}`);
}
}
} catch (err) {
console.error(`Cannot scrap recent tournaments for ${ today.toUTCString() }`);
}
console.log(`Task ran at ${ today.toUTCString() } without any issue.`);
}
})();
import {
MTGOTournamentParser,
type ResultMTGOPayload,
type Format,
type Platform,
type Level
} from 'mtg-scraper2';
const obj = await MTGOTournamentParser('link-to-a-mtgo-tournament') as {
tournamentLink: string;
daybreakLink: string;
payload: ResultMTGOPayload;
format: Format;
level: Level;
platform: Platform;
date: Date;
eventID: string;
};
await saveMyObjectToADatabaseOrFileOrWhatever(obj);
/**
* You get back an object that you can directly stringify if you wish, or
* cut the object in chunks to put into a DB such as MySQL or PostgreSQL
*/
import {
MTGOTournamentScraper
} from 'mtg-scraper2';
const today = new Date();
const arr = await MTGOTournamentScraper(today) as Array<string>;
/**
* You get back an array of links (string) with links being
* all tournaments of MTGO of the current month and year.
*/
for (const tournament of arr) {
/*** Do something with the tournament string (the link) */
}
import {
checkURLLevelOfPlay
} from 'mtg-scraper2';
const link = 'https://www.mtgo.com/decklist/pauper-challenge-32-2024-01-2712609085';
const result = checkURLLevelOfPlay(link);
console.log(result); // "challenge"
If the level is unknown, the function will throw.
import {
checkURLFormat
} from 'mtg-scraper2';
const link = 'https://www.mtgo.com/decklist/pauper-challenge-32-2024-01-2712609085';
const result = checkURLFormat(link);
console.log(result); // "pauper"
If the format is unknown, the function will throw.
import {
checkURLPlatform
} from 'mtg-scraper2';
const link = 'https://www.mtgo.com/decklist/pauper-challenge-32-2024-01-2712609085';
const result = checkURLPlatform(link);
console.log(result); // "mtgo"
If the platform is unknown, the function will throw.
import {
checkFormat
} from 'mtg-scraper2';
console.log(checkFormat('pauper')); // true
console.log(checkFormat('jikjik')); // false
console.log(checkFormat('modern')); // true
Available formats;
- standard
- pioneer
- limited
- pauper
- modern
- legacy
- vintage
import {
checkPlatform
} from 'mtg-scraper2';
console.log(checkPlatform('mtgo')); // true
console.log(checkPlatform('www.mtgo')); // false
console.log(checkPlatform('lololol')); // false
Available platforms;
- mtgo
import {
checkLevelOfPlay
} from 'mtg-scraper2';
console.log(checkLevelOfPlay('challenge')); // true
console.log(checkLevelOfPlay('www.kkiioo')); // false
console.log(checkLevelOfPlay('lololol')); // false
console.log(checkLevelOfPlay('last-chance')); // true
Available levels;
- league
- preliminary
- challenge
- premier
- showcase-challenge
- showcase-qualifier
- showcase-open
- eternal-weekend
- super-qualifier
- last-chance
import {
getDateFromLink
} from 'mtg-scraper2';
const link = 'https://www.mtgo.com/decklist/pauper-challenge-32-2024-01-2712609085';
const result = getDateFromLink(link);
console.log(result); // { timeInMS: number; year: string; month: string; day: string; }
To generate names of archetype on the fly, you can use the filterer"helper
function. There is three shape for the lists, depending on where it comes from,
and a single kind of archetype signature. (You can find their definition in
types)
import { filterer, type Archetype } from 'mtg-scraper2';
const archetype: Archetype = { name: 'weird name bro', /*** the rest ... */ };
const list: unknown = { /*** ... */ };
const archetypeName = filterer(archetype, list); // It returns null if no archetype was found
console.log(archetypeName); // "weird name bro"
An example archetype in JSON format;
{
"name": "Rakdos Midrange",
"matches": {
"including_cards": [
{ "name": "Bloodtithe Harvester" },
{ "name": "Fatal Push" },
{ "name": "Thoughtseize" },
{ "name": "Fable of the Mirror-Breaker" },
{ "name": "Blood Crypt" }
]
}
}
By default, you need 3 matching card for the filterer function to validate an archetype. But the function signature allows a third parameter where you can change the value of that.
import { filterer, type Archetype } from 'mtg-scraper2';
const archetype: Archetype = { name: 'weird name bro', /*** the rest ... */ };
const list: unknown = { /*** ... */ };
const archetypeName = filterer(archetype, list, 5);
/**
* Now, the archetype need at least 5 "including_cards" and the
* list should have the five to be named the same way.
*/
The following types and schemas are shaped based upon the output of MTGO events.
Using Zod schema, you can validate an unknown object with zodRawTournamentBracketMTGO
to obtain the following interface.
import { zodRawTournamentBracketMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawTournamentBracketMTGO.parse(obj); // throw if not valid
The shape of a Bracket scraped from MTGO.
export interface TournamentBracketMTGO {
index: number;
matches: Array<{
players: Array<{
loginid: number;
player: string;
seeding: number;
wins: number;
losses: number;
winner: boolean;
}>;
}>;
}
Using Zod schema, you can validate an unknown object with zodRawTournamentStandingMTGO
to obtain the following interface.
import { zodRawTournamentStandingMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawTournamentStandingMTGO.parse(obj); // throw if not valid
The shape of a Standing scraped from MTGO.
export interface TournamentStandingMTGO {
tournamentid: string;
loginid: string;
login_name: string;
rank: number;
score: number;
opponentmatchwinpercentage: number;
gamewinpercentage: number;
opponentgamewinpercentage: number;
eliminated: boolean;
}
Using Zod schema, you can validate an unknown object with zodRawTournamentDecklistMTGO
to obtain the following interface.
import { zodRawTournamentDecklistMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawTournamentDecklistMTGO.parse(obj); // throw if not valid
The shape of a Decklist scraped from MTGO.
export interface TournamentDecklistMTGO {
loginid: string;
tournamentid: string;
decktournamentid: string;
player: string;
main_deck: Array<TournamentCardMTGO>;
sideboard_deck: Array<TournamentCardMTGO>
}
Using Zod schema, you can validate an unknown object with zodRawTournamentCardMTGO
to obtain the following interface.
import { zodRawTournamentCardMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawTournamentCardMTGO.parse(obj); // throw if not valid
The shape of a Tournament Card scraped from MTGO.
export interface TournamentCardMTGO {
decktournamentid: string;
docid: string;
qty: number;
sideboard: boolean;
card_attributes: {
digitalobjectcatalogid?: string;
card_name: string;
cost?: number;
rarity?: string;
color?: string;
cardset?: string;
card_type?: string;
colors?: string;
}
}
Using Zod schema, you can validate an unknown object with zodRawTournamentMTGO
to obtain the following interface.
import { zodRawTournamentMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawTournamentMTGO.parse(obj); // throw if not valid
The shape of a Tournament scraped from MTGO.
export interface TournamentMTGO {
event_id: string;
description: string;
starttime: Date;
format: string;
type: string;
site_name: string;
decklists: Array<TournamentDecklistMTGO>;
standings: Array<TournamentStandingMTGO>;
brackets: Array<TournamentBracketMTGO>;
final_rank?: Array<{
tournamentid: string;
loginid: string;
rank: number;
roundnumber: number;
}>;
winloss: Array<{
tournamentid: string;
loginid: string;
losses: number;
wins: number;
}>;
player_count: {
tournamentid: string;
players: number;
queued_players: number;
};
}
Using Zod schema, you can validate an unknown object with zodRawLeagueCardMTGO
to obtain the following interface.
import { zodRawLeagueCardMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawLeagueCardMTGO.parse(obj); // throw if not valid
The shape of a League card scraped from MTGO.
export interface LeagueCardMTGO {
leaguedeckid: string;
loginplayeventcourseid: string;
docid: string;
qty: number;
sideboard: boolean;
card_attributes: {
digitalobjectcatalogid?: string;
card_name: string;
cost?: number;
rarity?: string;
color?: string;
cardset?: string;
card_type?: string;
colors?: string;
}
}
Using Zod schema, you can validate an unknown object with zodRawLeagueMTGO
to obtain the following interface.
import { zodRawLeagueMTGO } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawLeagueMTGO.parse(obj); // throw if not valid
The shape of a League scraped from MTGO.
export interface LeagueMTGO {
playeventid: string;
name: string;
publish_date: string;
instance_id: string;
site_name: string;
decklists: Array<{
loginplayeventcourseid: string;
loginid: string;
instance_id: string;
player: string;
main_deck: Array<LeagueCardMTGO>;
sideboard_deck: Array<LeagueCardMTGO>;
wins: {
loginplayeventcourseid: string;
wins: number;
losses: number;
}
}>;
}
Using Zod schema, you can validate an unknown object with zodRawResultMTGOPayload
to obtain the following interface.
import { zodRawResultMTGOPayload } from 'mtg-scraper2';
const obj: unknown = { /*** ... */ };
const objShaped = zodRawResultMTGOPayload.parse(obj); // throw if not valid
The shape of a full payload of a scraped MTGO tournament (the output of MTGOTournamentParser)
export interface ResultMTGOPayload {
returned: number; // 0 or 1
league_cover_page_list?: Array<LeagueMTGO>; // The league result is at index 0
tournament_cover_page_list?: Array<TournamentMTGO>; // The tournament result is at index 0
}
If the tournament is a league, it is always in the league_cover_page_list
, otherwise
any other kind of mtgo event will be in tournament_cover_page_list
.
export type Platform = 'mtgo';
export type Level =
'league' |
'preliminary' |
'challenge' |
'premier' |
'showcase-challenge' |
'showcase-qualifier' |
'showcase-open' |
'eternal-weekend' |
'super-qualifier' |
'last-chance';
export type Format =
'vintage' |
'legacy' |
'modern' |
'pioneer' |
'standard' |
'pauper' |
'limited';